Jarrod Johnson
153956b2cd
Further refine collective startup behavior
2021-04-08 18:26:59 -04:00
Jarrod Johnson
64d5081be3
Fix collective retry logic
...
It erroneously treated a thread as a bool; it needs
to check for None to know whether one is scheduled.
2021-04-08 15:56:33 -04:00
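A minimal sketch of the distinction this fix describes. The names `retry_thread` and `schedule_retry` are hypothetical and not taken from the confluent source, and the standard-library `threading.Timer` stands in for whatever scheduler confluent actually uses. The idea is that "nothing scheduled" is represented by `None`, so the check compares against `None` explicitly. Some thread-like objects define their own truthiness (a greenlet, for instance, is falsy until it has started), which makes a plain `if retry_thread:` unreliable.

```python
import threading

# Hypothetical module-level slot; None means no retry is scheduled.
retry_thread = None


def schedule_retry(action, delay=0.01):
    """Schedule action once; return False if a retry is already pending."""
    global retry_thread
    # Explicit None check rather than `if retry_thread:`, so the result
    # does not depend on how the thread object defines truthiness.
    if retry_thread is not None:
        return False
    retry_thread = threading.Timer(delay, action)
    retry_thread.daemon = True
    retry_thread.start()
    return True
```

In this sketch the slot is never reset; a real implementation would presumably set it back to `None` once the retry has run.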
Jarrod Johnson
8d16b412ae
Further refine collective start process
...
Serialize assimilation, do not induce activity that may have been
aborted by an earlier chain.
Further, accelerate initial startup by making potential timeouts
occur concurrently, rather than sequentially.
2021-04-08 13:44:20 -04:00
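One way to picture "timeouts occur concurrently, rather than sequentially" is sketched below. It uses the standard-library `concurrent.futures` purely for illustration; the commit does not say which concurrency mechanism is used, and `try_peer` and `try_all_peers` are hypothetical names. With N unreachable peers, the worst case drops from N timeouts to roughly one.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def try_peer(peer, timeout):
    """Stand-in for a connection attempt that runs into its timeout."""
    time.sleep(timeout)
    return peer, False


def try_all_peers(peers, timeout=0.2):
    """Attempt every peer at once so the waits overlap."""
    if not peers:
        return {}
    with ThreadPoolExecutor(max_workers=len(peers)) as pool:
        # map() submits all attempts up front; results come back in order.
        results = pool.map(lambda p: try_peer(p, timeout), peers)
        return dict(results)
```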
Jarrod Johnson
c5ec34d5a5
Actually hook in new assimilation behavior
...
Additionally, use collective member name as
tiebreaker if txcount and follower count are
identical.
2021-04-07 13:15:36 -04:00
Jarrod Johnson
6a4086679c
Rework collective assimilation logic
...
Followers will only depart if their current leader
is assimilated.
Leaders with quorum will refuse assimilation and instruct
the member trying to assimilate to join them instead.
Leaders without quorum will either follow the assimilation leader
or refuse, depending on which has the higher transaction count, and in
a tie, which has the larger set of followers.
2021-04-07 13:05:02 -04:00
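The leader-side rules described above could be sketched roughly as follows. The function name, its parameters, and the return values are hypothetical, and how an exact tie is resolved is an assumption of this sketch, since this commit message does not say.

```python
def leader_response(has_quorum, my_txcount, my_followers,
                    their_txcount, their_followers):
    """Decide how a leader answers an assimilation request.

    Returns 'refuse' (the requester should join this leader instead) or
    'follow' (this leader yields to the requester).
    """
    if has_quorum:
        return 'refuse'
    # Without quorum: higher transaction count wins, then more followers.
    # Tuple comparison checks txcount first and followers only on a tie.
    mine = (my_txcount, my_followers)
    theirs = (their_txcount, their_followers)
    if theirs > mine:
        return 'follow'
    # Assumption: an exact tie is treated as a refusal in this sketch.
    return 'refuse'
```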
Jarrod Johnson
3d2422ded3
Coerce iterator to list for length check
2021-04-07 08:40:18 -04:00
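For context, `len()` raises `TypeError` on a bare iterator or generator, so a length check needs the items materialized first. A minimal illustration, with `member_count` as a hypothetical helper name:

```python
def member_count(members):
    """Return the number of items in any iterable, including iterators."""
    # list() drains the iterable; a one-shot iterator cannot be reused
    # afterwards, so keep the list if the items are needed again.
    return len(list(members))
```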
Jarrod Johnson
7d0f47bbcb
Correct syntax for measuring collective size
2021-04-07 08:31:29 -04:00
Jarrod Johnson
9a779f2dd2
Improve collective startup behavior
...
Have confluent focus on establishing quorum before
initiating headless console and discovery activity.
2021-04-07 07:58:10 -04:00
Jarrod Johnson
4724d9e45b
Eliminate one source of stale data leaking
...
If nodes are swapped, then clearing the attributes
did not clear some mappings.
2021-04-06 18:19:13 -04:00
Jarrod Johnson
8404ddf3a2
Fix import of sortutil
2021-04-06 10:43:37 -04:00
Jarrod Johnson
872a13589c
Sort collective members
...
Improve consistency of output by sorting the output.
2021-04-06 09:35:34 -04:00
Jarrod Johnson
670fc87e1d
Address several collective issues
...
When a stream has been deleted from cfgstreams, continue exception
handling, since that is the desired result.
For connections to a manager, institute a 15 second socket level timeout.
This avoids an abandoned server conversation from locking a collective member startup.
When scheduling the failover check, first block any redundant attempts to schedule.
Wrap the collective startup in an exception
handler, so that a retry is
more reliably guaranteed.
2021-04-05 16:39:41 -04:00
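A socket-level timeout of the kind mentioned above can be set with the standard `socket` module, as in this sketch. The helper name `connect_to_manager` and the constant are hypothetical, and the real connection presumably also involves TLS, which is left out here.

```python
import socket

MANAGER_TIMEOUT = 15  # seconds, the value given in the commit message


def connect_to_manager(host, port):
    """Open a TCP connection whose blocking calls give up after 15 seconds."""
    # The timeout covers the connect attempt and stays set on the returned
    # socket, so later send/recv calls raise a timeout error instead of
    # blocking forever on an abandoned conversation.
    return socket.create_connection((host, port), timeout=MANAGER_TIMEOUT)
```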
Jarrod Johnson
14d13749ad
Guarantee consoleserver init before use
...
During a restart, a client may aggressively trigger
console reconnect before the consoleserver starts.
Make sure that the daemon is running and globals are
ready before the API could possibly ask for console.
2021-04-02 13:57:45 -04:00
Jarrod Johnson
d9051e80d3
Fix console buffer interaction
...
In some environments, the read method on the pipe object fails
to work. os.read should be the same, but seems to work better and
is happy to perform the opportunistically large reads I want.
2021-04-02 12:29:41 -04:00
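A small sketch of reading from a pipe with `os.read` on the raw file descriptor. The helper name and buffer size are hypothetical; the point is that a large size can be requested and whatever is currently available comes back.

```python
import os


def read_available(fd, maxbytes=65536):
    """Return whatever is currently readable on fd, up to maxbytes."""
    # os.read returns as soon as any data is available; it does not wait
    # to fill the whole buffer the way a buffered file object's read(n) may.
    return os.read(fd, maxbytes)
```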
Jarrod Johnson
2d43fac1b5
Add a contingency for temporary read error
2021-04-02 11:54:38 -04:00
Jarrod Johnson
4a0d419f85
Improve behavior on python2 systems
2021-04-02 11:47:14 -04:00
Jarrod Johnson
b482410072
Have server pull in vtbufferd
2021-04-01 17:31:52 -04:00
Jarrod Johnson
beab6a3c02
Migrate VT buffering to C
...
C implementation to improve memory and cpu utilization.
Forked off to further move the work off the main process.
Still needs attribute rendition and packaging before merging to main
branch.
2021-03-31 17:28:26 -04:00
Jarrod Johnson
5e0ebce300
Add logs for offering boot
...
Make it easier to debug a failure
to boot due to misconfiguration.
2021-03-29 16:58:03 -04:00
Jarrod Johnson
85c4ec5654
Skip fqdn in cert generation
...
There are scenarios where getfqdn can induce a hang.
The certificate having FQDN isn't that useful anyway,
since confluent never uses it and external use of it
may need a more carefully crafted certificate to have
a good chance of matching it anyway.
Also, the chances a user would import our cert as a
CA to something like a browser are low.
2021-03-29 14:29:42 -04:00
Jarrod Johnson
ca0c592044
Commence work on syncfileclient
...
This will be used to wait for deployer
to finish, then execute handlers
for 'MERGE' entries.
2021-03-25 16:55:56 -04:00
Jarrod Johnson
829d1316b2
Remove 'APPEND:'
...
If possible, we want to stick to 'MERGE', since the handlers
are designed to make that
idempotent for repeat.
APPEND: would not be idempotent...
2021-03-25 12:37:21 -04:00
Jarrod Johnson
c2c1c85651
Increase verbosity of syncfiles
2021-03-25 08:59:16 -04:00
Jarrod Johnson
c5833f1417
Reserve some characters for special syntax
...
May need to modify some behaviors in future, provide
a healthy supply of reserved characters toward that end.
2021-03-25 08:57:48 -04:00
Jarrod Johnson
07ae3593c3
Syncfiles fixes
2021-03-24 17:33:26 -04:00
Jarrod Johnson
8ab35b11cd
Add -a notation about syncfiles relevance
2021-03-24 16:01:48 -04:00
Jarrod Johnson
35ef6170ba
Implement syncfiles server side
2021-03-24 16:00:54 -04:00
Jarrod Johnson
d650f11255
Begin work on syncfiles concept for confluent
2021-03-23 17:32:45 -04:00
Jarrod Johnson
7c5dd85e74
Copy in ansible to genesis profiles
...
Make it clearer that there could be ansible support in genesis
2021-03-19 13:09:21 -04:00
Jarrod Johnson
df97e808c6
Wire up client retrieval of remoteconfig
...
remoteconfig can now watch for completion and return
data to client
2021-03-17 17:46:27 -04:00
Jarrod Johnson
98d14344ce
Have runansible complete execution
...
It now completes runs and stores results for retrieval by deploying node.
2021-03-17 16:54:37 -04:00
Jarrod Johnson
4818bd57bb
Advance state of runansible
...
Begin work to wire the results back to the supervisor and ultimately
back to the node affected.
2021-03-16 17:19:23 -04:00
Jarrod Johnson
7763327a63
Skip confluent import on exec
...
In main context, we don't need
sshutil, and python path is not
cooperative, so just skip it.
2021-03-16 16:23:27 -04:00
Jarrod Johnson
dac957364f
Merge branch 'master' into ansibleplay
2021-03-16 14:54:53 -04:00
Jarrod Johnson
7fdb4dccca
Fix network configuration changes through collective
2021-03-16 14:53:55 -04:00
Jarrod Johnson
a3c8c305c1
Add ansible supervisory code
2021-03-16 14:19:44 -04:00
Jarrod Johnson
f2cb6ea535
Merge branch 'master' into ansibleplay
2021-03-16 10:46:54 -04:00
Jarrod Johnson
5a81279b9d
Fix servicedata fetch
...
In the new framework with
filehandle passing, the provided
callback must accept a data
argument. get_diags can't
actually use it, but does have
to accept a token value.
2021-03-15 16:45:43 -04:00
Jarrod Johnson
7026fdfa3a
WIP change to enhance selfservice api for ansible
...
Begin work on extending selfservice to allow nodes
to request the deployer run ansible plays.
2021-03-12 12:23:58 -05:00
Jarrod Johnson
e38dbc4470
Pull in the automation key into default profiles
2021-03-10 15:41:01 -05:00
Jarrod Johnson
a26624a614
Use ssh-agent to store keys
...
Also add the 'automation' key for ansible to
take advantage of.
2021-03-10 15:41:01 -05:00
Jarrod Johnson
6a88f35fc2
Fix compatibility with some switch configurations
...
While some implementations mess up portid and need portdescr instead, others are
just the opposite.
Tolerate a match by either description or name.
2021-03-10 13:40:47 -05:00
Jarrod Johnson
7a27fba94b
Workaround non-cisco switch crash
...
Querying Cisco MIB on certain
firmware levels of non-cisco switches
causes a crash. Tolerate and
wait a bit to give SNMP a chance to restart.
2021-03-10 09:51:05 -05:00
Jarrod Johnson
2a87c32800
Handle malformed json data more gracefully
2021-03-08 02:55:58 -05:00
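One common way to tolerate malformed JSON is to catch the decode error and fall back to a sentinel. This is a sketch under that assumption; the commit does not say what the actual fallback is, and `parse_json_or_none` is a hypothetical name.

```python
import json


def parse_json_or_none(raw):
    """Return the decoded object, or None if raw is not valid JSON."""
    try:
        return json.loads(raw)
    except ValueError:
        # json.JSONDecodeError subclasses ValueError on Python 3; Python 2
        # raises ValueError directly, so this covers both.
        return None
```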
Jarrod Johnson
2e63ff3aca
Close other places that may yield false negatives
...
Have checks for neightable be preceded by an attempt to refresh,
to mitigate false negatives.
2021-03-05 13:14:41 -05:00
Jarrod Johnson
e3c17491e5
Attempt refresh of neigh table on miss
...
When an address is new it may not be in the last
captured neighbor table. Induce refresh before deciding
that neighbor is unavailable.
2021-03-05 13:08:25 -05:00
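The refresh-on-miss pattern described here might look like the sketch below. `NeighborCache` and its `loader` callable are hypothetical; the real code reads the kernel neighbor table, which is not modeled here.

```python
class NeighborCache:
    """Hypothetical cache of address -> hardware address mappings."""

    def __init__(self, loader):
        self._loader = loader      # callable returning a fresh dict
        self._table = loader()

    def lookup(self, address):
        if address not in self._table:
            # Miss: refresh once before concluding the neighbor is absent.
            self._table = self._loader()
        return self._table.get(address)
```

A lookup that still misses after the refresh returns `None`, which the caller can treat as "neighbor unavailable".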
Jarrod Johnson
5a5b3f9927
Correct incorrect function call on image import
2021-03-04 16:55:26 -05:00
Jarrod Johnson
865d18d367
Back off progress update on file copy
...
The output was too frenzied, cool it down to fewer
updates.
2021-03-04 11:01:17 -05:00
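Backing off progress output can be done by rate-limiting the updates, as in this sketch. The class name, the interval, and the choice to always pass through the first update and the final 100% are all assumptions, not details from the commit.

```python
import time


class ThrottledProgress:
    """Pass along at most one update per `interval` seconds (hypothetical)."""

    def __init__(self, emit, interval=0.5, clock=time.monotonic):
        self._emit = emit
        self._interval = interval
        self._clock = clock
        self._last = None

    def update(self, percent):
        now = self._clock()
        # Always emit the first update and the final 100%.
        due = self._last is None or now - self._last >= self._interval
        if due or percent >= 100:
            self._last = now
            self._emit(percent)
```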
Jarrod Johnson
b27d07f304
Fix nic index map with bonding
...
The assumption that every entry in /sys/class/net is an interface is
incorrect. When encountering entries that are not interfaces, do not
mess up the call.
2021-03-04 10:47:28 -05:00
Jarrod Johnson
b1f1f15ba8
Fix filecopy and enhance
...
shutil was copying from wrong
index. Stop using shutil,
implement the copy directly, and
instrument the percentage.
2021-03-04 09:29:31 -05:00
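A hand-rolled copy loop that reports percentage could be sketched as follows. The function name, the `report` callback, and the chunk size are hypothetical, and this version always copies from the start of `src`, whereas the real code evidently copies from a specific index.

```python
def copy_with_progress(src, dst, report, chunksize=1024 * 1024):
    """Copy file object src to dst in chunks, reporting percent complete."""
    src.seek(0, 2)            # seek to the end to learn the total size
    total = src.tell()
    src.seek(0)
    copied = 0
    while True:
        chunk = src.read(chunksize)
        if not chunk:
            break
        dst.write(chunk)
        copied += len(chunk)
        # total is nonzero here because at least one chunk was read.
        report(100.0 * copied / total)
```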