mirror of https://github.com/xcat2/confluent.git synced 2025-02-05 21:42:24 +00:00

1721 Commits

Author SHA1 Message Date
Jarrod Johnson
153956b2cd Further refine collective startup behavior 2021-04-08 18:26:59 -04:00
Jarrod Johnson
64d5081be3 Fix collective retry logic
It erroneously treated a thread as a bool; it needs
to check for None to know whether it is scheduled.
2021-04-08 15:56:33 -04:00
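
To illustrate the kind of bug described in 64d5081be3, here is a minimal sketch. It assumes a task handle whose truthiness means "currently running" rather than "exists"; the `PendingTask` class and both function names are hypothetical and are not taken from the confluent source.

```python
class PendingTask:
    """Hypothetical stand-in for a scheduled task handle.

    Its truthiness reflects 'currently running', not 'has been scheduled'.
    """

    def __init__(self):
        self.running = False

    def __bool__(self):
        return self.running


def needs_scheduling_buggy(handle):
    # Wrong: a scheduled but not yet running handle is falsy,
    # so this would schedule a second retry.
    return not handle


def needs_scheduling(handle):
    # Right: only the absence of a handle means nothing is scheduled.
    return handle is None
```

With a truthiness test, a handle that is scheduled but idle looks the same as no handle at all; the identity check against None avoids that ambiguity.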
Jarrod Johnson
8d16b412ae Further refine collective start process
Serialize assimilation, do not induce activity that may have been
aborted by an earlier chain.

Further, accelerate initial startup by making potential timeouts
occur concurrently, rather than sequentially.
2021-04-08 13:44:20 -04:00
Jarrod Johnson
c5ec34d5a5 Actually hook in new assimilation behavior
Additionally, use collective member name as
tiebreaker if txcount and follower count are
identical.
2021-04-07 13:15:36 -04:00
Jarrod Johnson
6a4086679c Rework collective assimilation logic
Followers will only depart if their current leader
is assimilated.

Leaders with quorum will refuse assimilation and instruct
the member trying to assimilate to join them.

Leaders without quorum will either follow the assimilation leader
or refuse, depending on which has the highest transaction count and,
in a tie, which has the larger set of followers.
2021-04-07 13:05:02 -04:00
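
The leader-side rules in 6a4086679c can be sketched roughly as follows. This is an illustration under stated assumptions, not the confluent implementation: `LeaderState`, its field names, and `respond_to_assimilation` are hypothetical, and a full tie is resolved by refusing because this commit message does not say what happens in that case.

```python
from dataclasses import dataclass


@dataclass
class LeaderState:
    # All field names are hypothetical, chosen for illustration only.
    name: str
    txcount: int
    followers: int
    has_quorum: bool


def respond_to_assimilation(local, proposer):
    """Decide how a leader answers another leader's assimilation attempt.

    Returns 'refuse' (the local leader stays put) or 'follow'
    (the local leader joins the proposer).
    """
    if local.has_quorum:
        # A leader with quorum never yields.
        return 'refuse'
    # Tuple comparison: transaction count first, follower count on a tie.
    if (proposer.txcount, proposer.followers) > (local.txcount, local.followers):
        return 'follow'
    # Assumption: a full tie falls through to a refusal in this sketch.
    return 'refuse'
```

Comparing `(txcount, followers)` tuples keeps the priority order explicit: the follower count only matters when the transaction counts are equal.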
Jarrod Johnson
3d2422ded3 Coerce iterator to list for length check 2021-04-07 08:40:18 -04:00
Jarrod Johnson
7d0f47bbcb Correct syntax for measuring collective size 2021-04-07 08:31:29 -04:00
Jarrod Johnson
9a779f2dd2 Improve collective startup behavior
Have confluent focus on establishing quorum before
initiating headless console and discovery activity.
2021-04-07 07:58:10 -04:00
Jarrod Johnson
4724d9e45b Eliminate one source of stale data leaking
If nodes are swapped, then clearing the attributes
did not clear some mappings.
2021-04-06 18:19:13 -04:00
Jarrod Johnson
8404ddf3a2 Fix import of sortutil 2021-04-06 10:43:37 -04:00
Jarrod Johnson
872a13589c Sort collective members
Improve consistency of output by sorting the output.
2021-04-06 09:35:34 -04:00
Jarrod Johnson
670fc87e1d Address several collective issues
When a stream has been deleted from cfgstreams, continue exception
handling, since that is the desired result.

For connections to a manager, institute a 15 second socket level timeout.
This avoids an abandoned server conversation from locking a collective member startup.

When scheduling the failover check, first block any redundant attempts to schedule.

Wrap the collective startup in an exception
handler, so that a retry is
more reliably guaranteed.
2021-04-05 16:39:41 -04:00
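
As a rough sketch of the socket-level timeout mentioned in 670fc87e1d, the standard library's `settimeout` makes a blocked receive give up instead of waiting forever. Only the 15 second figure comes from the commit message; the function names and the choice to return None on timeout are assumptions for illustration.

```python
import socket

MANAGER_TIMEOUT = 15.0  # seconds, the value named in the commit message


def apply_conversation_timeout(sock, timeout=MANAGER_TIMEOUT):
    """Bound every blocking operation on sock to `timeout` seconds."""
    sock.settimeout(timeout)
    return sock


def read_reply(sock, size=4096):
    """Return received bytes, or None if the peer stayed silent too long."""
    try:
        return sock.recv(size)
    except socket.timeout:
        # The conversation was abandoned; let the caller retry or move on.
        return None
```

Without a timeout, a peer that accepts the connection and then goes quiet would block `recv` indefinitely, which matches the startup hang the commit describes.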
Jarrod Johnson
14d13749ad Guarantee consoleserver init before use
During a restart, a client may aggressively trigger
console reconnect before the consoleserver starts.

Make sure that the daemon is running and globals
are ready before the API could possibly ask for a console.
2021-04-02 13:57:45 -04:00
Jarrod Johnson
d9051e80d3 Fix console buffer interaction
In some environments, the read method on the pipe object fails
to work. os.read should be the same, but seems to work better and
is happy to perform the opportunistically large reads I want.
2021-04-02 12:29:41 -04:00
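
A minimal sketch of the `os.read` approach from d9051e80d3, assuming a plain pipe file descriptor. The helper name and the 64 KiB ceiling are hypothetical; the commit does not state the actual read size.

```python
import os


def read_available(fd, maxbytes=65536):
    """Read whatever is currently buffered on fd, up to maxbytes.

    os.read returns as soon as some data is available, so asking for a
    large amount does not wait for the full amount to arrive. On a
    blocking descriptor it still blocks while the pipe is empty and the
    writer is open, and it returns b'' at end of file.
    """
    return os.read(fd, maxbytes)
```

Requesting a generous upper bound lets one call drain a burst of console output, while a quiet console still yields small reads promptly.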
Jarrod Johnson
2d43fac1b5 Add a contingency for temporary read error 2021-04-02 11:54:38 -04:00
Jarrod Johnson
4a0d419f85 Improve behavior on python2 systems 2021-04-02 11:47:14 -04:00
Jarrod Johnson
b482410072 Have server pull in vtbufferd 2021-04-01 17:31:52 -04:00
Jarrod Johnson
beab6a3c02 Migrate VT buffering to C
C implementation to improve memory and cpu utilization.

Forked off to further move the work off the main process.

Still needs attribute rendition and packaging before merging to main
branch.
2021-03-31 17:28:26 -04:00
Jarrod Johnson
5e0ebce300 Add logs for offering boot
Make it easier to debug a failure
to boot due to misconfiguration.
2021-03-29 16:58:03 -04:00
Jarrod Johnson
85c4ec5654 Skip fqdn in cert generation
There are scenarios where getfqdn can induce a hang.
The certificate having FQDN isn't that useful anyway,
since confluent never uses it and external use of it
may need a more carefully crafted certificate to have
a good chance of matching it anyway.

Also, the chances a user would import our cert as a
CA to something like a browser are low.
2021-03-29 14:29:42 -04:00
Jarrod Johnson
ca0c592044 Commence work on syncfileclient
This will be used to wait for deployer
to finish, then execute handlers
for 'MERGE' entries.
2021-03-25 16:55:56 -04:00
Jarrod Johnson
829d1316b2 Remove 'APPEND:'
If possible, we want to stick to 'MERGE', since the handlers
are designed to make that
idempotent for repeat.

APPEND: would not be idempotent...
2021-03-25 12:37:21 -04:00
Jarrod Johnson
c2c1c85651 Increase verbosity of syncfiles 2021-03-25 08:59:16 -04:00
Jarrod Johnson
c5833f1417 Reserve some characters for special syntax
May need to modify some behaviors in the future; provide
a healthy supply of reserved characters toward that end.
2021-03-25 08:57:48 -04:00
Jarrod Johnson
07ae3593c3 Syncfiles fixes 2021-03-24 17:33:26 -04:00
Jarrod Johnson
8ab35b11cd Add -a notation about syncfiles relevance 2021-03-24 16:01:48 -04:00
Jarrod Johnson
35ef6170ba Implement syncfiles server side 2021-03-24 16:00:54 -04:00
Jarrod Johnson
d650f11255 Begin work on syncfiles concept for confluent 2021-03-23 17:32:45 -04:00
Jarrod Johnson
7c5dd85e74 Copy in ansible to genesis profiles
Make it clearer that there could be ansible support in genesis
2021-03-19 13:09:21 -04:00
Jarrod Johnson
df97e808c6 Wire up client retrieval of remoteconfig
remoteconfig can now watch for completion and return
data to client
2021-03-17 17:46:27 -04:00
Jarrod Johnson
98d14344ce Have runansible complete execution
It now completes runs and stores results for retrieval by deploying node.
2021-03-17 16:54:37 -04:00
Jarrod Johnson
4818bd57bb Advance state of runansible
Begin work to wire the results back to the supervisor and ultimately
back to the node affected.
2021-03-16 17:19:23 -04:00
Jarrod Johnson
7763327a63 Skip confluent import on exec
In main context, we don't need
sshutil, and python path is not
cooperative, so just skip it.
2021-03-16 16:23:27 -04:00
Jarrod Johnson
dac957364f Merge branch 'master' into ansibleplay 2021-03-16 14:54:53 -04:00
Jarrod Johnson
7fdb4dccca Fix network configuration changes through collective 2021-03-16 14:53:55 -04:00
Jarrod Johnson
a3c8c305c1 Add ansible supervisory code 2021-03-16 14:19:44 -04:00
Jarrod Johnson
f2cb6ea535 Merge branch 'master' into ansibleplay 2021-03-16 10:46:54 -04:00
Jarrod Johnson
5a81279b9d Fix servicedata fetch
In the new framework with
filehandle passing, the provided
callback must accept a data
argument.  get_diags can't
actually use it, but does have
to accept a token value.
2021-03-15 16:45:43 -04:00
Jarrod Johnson
7026fdfa3a WIP change to enhance selfservice api for ansible
Begin work on extending selfservice to allow nodes
to request the deployer run ansible plays.
2021-03-12 12:23:58 -05:00
Jarrod Johnson
e38dbc4470 Pull in the automation key into default profiles 2021-03-10 15:41:01 -05:00
Jarrod Johnson
a26624a614 Use ssh-agent to store keys
Also add the 'automation' key for ansible to
take advantage of.
2021-03-10 15:41:01 -05:00
Jarrod Johnson
6a88f35fc2 Fix compatibility with some switch configurations
While some implementations mess up portid and need portdescr instead, others are
just the opposite.

Tolerate a match by either description or name.
2021-03-10 13:40:47 -05:00
Jarrod Johnson
7a27fba94b Work around non-cisco switch crash
Querying Cisco MIB on certain
firmware levels of non-cisco switches
causes a crash.  Tolerate and
wait a bit to give SNMP a chance to restart.
2021-03-10 09:51:05 -05:00
Jarrod Johnson
2a87c32800 Handle malformed json data more gracefully 2021-03-08 02:55:58 -05:00
Jarrod Johnson
2e63ff3aca Close other places that may produce false negatives
Have checks for neightable be preceded by an attempt to refresh,
to mitigate false negatives.
2021-03-05 13:14:41 -05:00
Jarrod Johnson
e3c17491e5 Attempt refresh of neigh table on miss
When an address is new it may not be in the last
captured neighbor table. Induce refresh before deciding
that neighbor is unavailable.
2021-03-05 13:08:25 -05:00
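
The refresh-on-miss idea in e3c17491e5 can be sketched as below. The function name, the dict-based cache, and the `refresh` callable are all hypothetical; how confluent actually captures the neighbor table is not shown in the commit message.

```python
def lookup_neighbor(addr, table, refresh):
    """Look up addr in a cached neighbor table, refreshing once on a miss.

    table   -- dict mapping address to hardware address (cached snapshot)
    refresh -- callable returning a freshly captured dict of the same shape
    """
    if addr not in table:
        # The snapshot may predate this address; recapture before giving up.
        table.clear()
        table.update(refresh())
    return table.get(addr)
```

Hits cost nothing extra, and only a miss pays for the refresh, so a newly seen address is not reported as unavailable just because the snapshot is stale.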
Jarrod Johnson
5a5b3f9927 Correct incorrect function call on image import 2021-03-04 16:55:26 -05:00
Jarrod Johnson
865d18d367 Back off progress update on file copy
The output was too frenzied, cool it down to fewer
updates.
2021-03-04 11:01:17 -05:00
Jarrod Johnson
b27d07f304 Fix nic index map with bonding
The assumption that every entry in /sys/class/net is an interface is
incorrect. When encountering entries that are not interfaces, do not
let them break the call.
2021-03-04 10:47:28 -05:00
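
For context on b27d07f304: with the bonding driver loaded, /sys/class/net contains a plain file named bonding_masters next to the per-interface directories. A hedged sketch of a tolerant scan follows; the function name and the name-to-index direction of the map are assumptions, not the confluent code.

```python
import os


def nic_index_map(base='/sys/class/net'):
    """Map interface name to ifindex, skipping non-interface entries."""
    result = {}
    for name in os.listdir(base):
        idxfile = os.path.join(base, name, 'ifindex')
        try:
            with open(idxfile) as f:
                result[name] = int(f.read().strip())
        except (OSError, ValueError):
            # Entries such as bonding_masters are plain files with no
            # ifindex attribute; ignore them rather than fail.
            continue
    return result
```

Taking `base` as a parameter is only a convenience so the function can be exercised against a temporary directory.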
Jarrod Johnson
b1f1f15ba8 Fix filecopy and enhance
shutil was copying from the wrong
index. Stop using shutil, bring
the copy in-house, and instrument the
percentage.
2021-03-04 09:29:31 -05:00
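
A chunked copy that reports a percentage, in the spirit of b1f1f15ba8, might look like the sketch below. The function name, the `report` callback, and the 1 MiB chunk size are hypothetical choices for illustration.

```python
import os


def copy_with_progress(src, dst, report, chunksize=1024 * 1024):
    """Copy src to dst in chunks, calling report(percent) after each chunk.

    Returns the number of bytes copied.
    """
    total = os.path.getsize(src)
    copied = 0
    with open(src, 'rb') as fin, open(dst, 'wb') as fout:
        while True:
            chunk = fin.read(chunksize)
            if not chunk:
                break
            fout.write(chunk)
            copied += len(chunk)
            # total is nonzero here because at least one chunk was read.
            report(100.0 * copied / total)
    return copied
```

A caller that finds this too chatty can have `report` drop updates, for example by only printing when the whole-number percentage changes.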