Jarrod Johnson
3105b9b1f9
Significantly rework the collective startup behavior
...
Make the tracking bools enforce a lock to reduce confusion.
Treat an initializing peer as failed, to avoid getting too fixated
on an uncertain target.
Make sure that no more than one follower is tried at a time by
killing before starting a new one, and syncing up the configmanager
state.
Decline to act on an assimilation request if we are trying to connect
and also if the current leader asks us to connect and we already are.
Avoid calling get_leader while connecting, as that can cause a member
to decide to become a leader while trying to connect, by swapping
the reactions to the connect request.
Avoid trying to assimilate existing followers.
Fix some logging.
2018-10-12 11:45:23 -04:00
Jarrod Johnson
f525c25ba6
Provide more verbose collective logging
...
This helps understand the flow in practice of collective behavior.
2018-10-11 15:15:11 -04:00
Jarrod Johnson
3012de1fe4
Prioritize deletion of transactioncount
...
If the invalidation is incomplete, make sure that transactioncount
is invalidated first to avoid it being able to propagate through
a collective.
2018-10-11 09:16:57 -04:00
Jarrod Johnson
be930fc076
Add missing subsystem marker from a collective log
2018-10-10 16:30:28 -04:00
Jarrod Johnson
2d0199a4e9
Wrap bdb deletion in same lock that sync itself uses
...
If os.remove happens at a bad time, it causes an unfortunate behavior
in dbm. Serialize this sort of operation to avoid the bad behavior.
2018-10-10 15:24:55 -04:00
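As a rough illustration of the serialization described in the entry above, the sketch below takes one lock for both the write path and the removal path. The names `_synclock`, `write_store`, and `remove_store` are hypothetical and are not taken from the confluent source.

```python
import os
import threading

# Hypothetical lock shared by the sync path and the deletion path.
_synclock = threading.RLock()


def write_store(path, data):
    # Stand-in for the sync operation: it holds the lock while the file is written.
    with _synclock:
        with open(path, 'w') as handle:
            handle.write(data)


def remove_store(path):
    # Deletion takes the same lock, so it cannot run in the middle of a write.
    with _synclock:
        try:
            os.remove(path)
        except FileNotFoundError:
            pass  # Already gone; nothing to do.
```

Because both functions acquire the same lock, a removal requested during a write waits until the write has finished.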
Jarrod Johnson
6b70a4322a
Fix rollback
...
The fix for the stale data broke clear rollback.
Restore the behavior and make self._cfgstore a somewhat slower property
for now.
2018-10-10 15:22:20 -04:00
Jarrod Johnson
6a784e3a1c
Ensure sync is complete prior to leaving configmanager sync
...
The initialization lock is meant to avoid collective and generic
initialization stepping on each other. This is somewhat reduced in
efficacy if one has a sync running while the other is changing relevant
data.
2018-10-10 14:49:33 -04:00
Jarrod Johnson
3b2b96a4cf
Force fullsync if dead sync thread likely
...
If the sync thread died previously, force the next sync to be full.
2018-10-10 14:32:13 -04:00
Jarrod Johnson
32ddb33de3
Fix error when trying to do fullsync without globals yet
...
If globals is missing, then do not break the sync trying to handle it
2018-10-10 13:11:15 -04:00
Jarrod Johnson
b77ed8dbff
Fix config sync on dead writer
...
The sync thread can die without clearing syncrunning. Make sure that
the thread is alive *and* that the thread has not indicated
intent to give up.
2018-10-10 13:07:27 -04:00
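The check described in the entry above can be sketched as follows. This is a minimal illustration only; `SyncState`, `giving_up`, and `sync_in_progress` are assumed names, not the actual configmanager attributes.

```python
import threading


class SyncState:
    """Hypothetical holder for the sync thread and its 'giving up' flag."""

    def __init__(self):
        self.thread = None
        self.giving_up = False


def sync_in_progress(state):
    # A stale "running" flag is not trusted on its own: the thread must exist,
    # still be alive, and must not have signalled that it is about to exit.
    thread = state.thread
    return thread is not None and thread.is_alive() and not state.giving_up
```

A caller that gets `False` here would start a fresh sync thread rather than wait on one that has already died.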
Jarrod Johnson
d5c093a30d
Provide fallback for unexpected reply in collective show
2018-10-10 09:46:01 -04:00
Jarrod Johnson
cf9d2a43e8
Revert "Provide fallback for unexpected reply in collective show"
...
This reverts commit 2f566fb81ddfdd14b3b623ee6d1ff48d67e636b4.
2018-10-10 09:44:06 -04:00
Jarrod Johnson
2f566fb81d
Provide fallback for unexpected reply in collective show
2018-10-10 09:41:25 -04:00
Jarrod Johnson
1ee418392f
Provide better error behavior on missing collective.manager
...
collective.manager was a blanket response; make it per node and
triggered only by the bad nodes, not the rest.
2018-10-09 15:44:17 -04:00
Jarrod Johnson
2a7eeb6e08
Fix missing argument in calling a function
2018-10-09 15:21:11 -04:00
Jarrod Johnson
5c83c78a90
Add warning on incompatible ssh key with SLES12
2018-10-09 14:44:06 -04:00
Jarrod Johnson
6a466b0100
Avoid proxy consoles generating proxy consoles
...
When the client is a proxy term, disable ability to produce
proxy terminals. This was wreaking havoc with client
count with ghosts and triggering output multiplication.
2018-10-09 13:21:02 -04:00
Jarrod Johnson
5f46899358
Prevent clear_configuration from invalidating existing ConfigManager
...
Clear out the existing dictionary instead of replacing it.
This prevents configmanager objects from being stuck.
2018-10-08 16:51:58 -04:00
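The difference between replacing a dictionary and clearing it, as described in the entry above, can be shown with plain Python. The variable names are illustrative and do not come from the confluent code.

```python
# An existing object (manager_view) holds a reference to the shared dictionary.
shared = {'nodes': {'n1': {}}}
manager_view = shared

# Rebinding the name creates a new dictionary; the old reference is left behind.
shared = {}
stale = manager_view is not shared and 'nodes' in manager_view

# Start over, this time clearing in place.
shared = {'nodes': {'n1': {}}}
manager_view = shared
shared.clear()
in_sync = manager_view is shared and manager_view == {}
```

After rebinding, `manager_view` still shows the old contents, which is the "stuck" situation. After `clear()`, every holder of the reference sees the emptied dictionary.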
Jarrod Johnson
73c06fd25e
Fix display of error on join of collective
2018-10-08 09:54:03 -04:00
Jarrod Johnson
8d9a082739
Provide better exceptions and propagate them to client on snmp
...
When doing snmp, messages would always go to log only, even if the
user was at the confetty cli. Give user access to knowing the error
impacting the query.
2018-10-04 14:59:25 -04:00
Jarrod Johnson
32602fbba3
Provide interactive handling of key mismatch in ssh sessions
...
Before, ssh would close without so much as a warning. Fix this by
dealing with the key data.
2018-10-04 10:23:55 -04:00
Jarrod Johnson
2f616d4586
Better error when collective.manager is set to something invalid
...
If the collective.manager field does not correspond to any collective
manager, give a useful error rather than unexpected error.
2018-10-03 16:23:20 -04:00
Jarrod Johnson
d86e1fc4eb
Give the cfg init a lock
...
Move collective manager and configmanager to share a configinitlock,
so that bad timings during internal initialization and collective
activity cannot interfere and produce corrupt database.
This became an issue with the fix for 'everything' disappearing.
2018-10-02 10:17:44 -04:00
Jarrod Johnson
78a1741e0e
Fix usage of check_quorum()
...
It is not a boolean, it is exception driven.
2018-10-01 16:02:16 -04:00
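To illustrate what "exception driven" means for a caller, here is a minimal sketch. The signature of `check_quorum`, the `QuorumError` class, and the majority rule are all assumptions made for this example; the real function in confluent may differ.

```python
class QuorumError(Exception):
    """Hypothetical exception raised when quorum is not met."""


def check_quorum(active, total):
    # Assumed rule for this sketch: a strict majority of members must be active.
    # The function returns None on success, so its result must not be tested
    # as if it were a boolean.
    if active * 2 <= total:
        raise QuorumError('only %d of %d members active' % (active, total))


def has_quorum(active, total):
    # Correct usage: translate the exception into a boolean when one is needed.
    try:
        check_quorum(active, total)
    except QuorumError:
        return False
    return True
```

Writing `if check_quorum(...)` would always take the false branch on success, because the return value is `None`.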
Jarrod Johnson
4329c1d388
Have collective start bail out if leader
...
Leader should not relinquish if quorum, so don't bother in such
a case.
2018-10-01 15:50:49 -04:00
Jarrod Johnson
b0b5493ff7
Cancel retry if we become leader
...
If an instance is first to start, its retry should be canceled
when other members prod it to become leader.
2018-10-01 15:29:18 -04:00
Jarrod Johnson
326f56219b
Fix /networking/macs/by-mac
...
The module apimacmap was not correctly scoped.
2018-10-01 14:40:02 -04:00
Jarrod Johnson
e098c0ba91
Fix missing tenant argument on user management function
...
The tenant was omitted preventing those particular rpc calls from
working correctly.
2018-10-01 14:04:03 -04:00
Jarrod Johnson
61e7c90ad1
Do not restart on intentional kill
...
Additionally, add some output to help filter events log
2018-10-01 10:32:55 -04:00
Jarrod Johnson
e57cdf9a7b
Add more collective event log handling
...
More detail to analyze how the collective membership is handled.
2018-09-27 15:15:05 -04:00
Jarrod Johnson
10ce7a9de9
Add more logging to collective process
2018-09-27 10:51:06 -04:00
Jarrod Johnson
0724ad812b
Add logging to the assimilation phase of collective
...
When attempting assimilation, provide logging about the attempt.
2018-09-27 10:51:01 -04:00
Jarrod Johnson
a3b0b0240d
Abort assimilation attempt on non-member cleanly
...
If a confluent instance has forgotten the collective, more cleanly
handle the situation, and abort the assimilation rather than assuming
the peer should be leader, unless txcount specifically is called out
as the reason.
2018-09-27 10:50:54 -04:00
Jarrod Johnson
18bebde337
Disable gssapi in paramiko
...
It is just plain broken; work around it by tanking calls to gssapi prior
to pulling in paramiko.
2018-09-21 13:46:07 -04:00
Jarrod Johnson
f601032a66
Fix everything group missing if nodegroup created before node
...
everything group was not making it to disk unless a node is created
first. Correctly mark the need for disk sync to fix.
2018-09-14 16:50:20 -04:00
Jarrod Johnson
db5f861dc5
Fix introduced typo in error message
2018-09-10 14:25:04 -04:00
Jarrod Johnson
d04be19ae5
Preferentially use a 'name' subfield as 'name'
...
Pyghmi now may suggest a more useful name. The component name
is unique, but 'name' can indicate the common name of things with
multiple instances.
2018-09-07 14:37:02 -04:00
Jarrod Johnson
e7be24d478
Revert "Fix non-unique name for similar inventory items."
...
This reverts commit 47a53a51e4bb625e28232cb8f792f2053b2e0b64.
2018-09-07 11:44:01 -04:00
Jarrod Johnson
34b7abcb2d
Change systemd unit to not have PIDFile
...
systemctl restart *always* prints a worrying message
with pidfile.
2018-09-07 11:27:43 -04:00
Jarrod Johnson
47a53a51e4
Fix non-unique name for similar inventory items.
2018-09-07 11:16:09 -04:00
Jarrod Johnson
b3bf6929df
Add replacement logic for another generic variant
...
In IMM, PCeGen3 x8 and similar is also possible.
2018-09-06 16:16:26 -04:00
Jarrod Johnson
2a8d61ecf6
Enrich the less than useful 'Adapter' inventory items
...
We can provide DNS provided info about such generic items to
make them look more fleshed out.
2018-09-06 16:10:48 -04:00
Jarrod Johnson
cf3e9037ab
Provide 'discovery.passwordrules'
...
This provides an ability to designate the desired rules that
are applied in the wake of automatic discovery. The most popular
would be 'expiry=no,loginfailures=0'
2018-09-05 15:50:36 -04:00
Jarrod Johnson
03135543a6
Add 'switchuser' and 'switchpass' aliases
2018-09-05 13:51:19 -04:00
Jarrod Johnson
f92b1ed4a3
Implement ability to prompt for nodeattrib options.
...
For certain attributes, notably passwords, it is sometimes desirable
to prompt interactively to help facilitate keeping such data out of
bash_history, screen sharing, and ps output. -e enables this if the
user is aware of how to use 'read', -p is a quicker way to enable this.
2018-09-04 09:38:01 -04:00
Jarrod Johnson
ba18b9936f
Fix mistakes in previous commit
2018-08-29 15:15:34 -04:00
Jarrod Johnson
3b7ecd0095
Add ability to clear system configuration
...
This provides a method to request the system firmware be restored to
factory defaults.
2018-08-29 14:49:19 -04:00
Jarrod Johnson
8352007570
Limit to one active scan at a time
...
Additionally, provide read access to rescan for discovery.
2018-08-28 11:25:48 -04:00
Jarrod Johnson
f7965d235a
Improve /networking/macs API behavior
...
For the 'by-mac', only remove the structure when it is ready for API
view without changing internal view.
For the 'by-switch', do the update per switch and after it's done.
Provide ability to check scan status through reading
/networking/macs/rescan
2018-08-28 11:10:32 -04:00
Jarrod Johnson
6aec9534e7
Fixes for nodesupport
2018-08-23 16:56:40 -04:00