Previously, if a username or password was bad, retry would not occur.
Correct this such that every so often or right when someone connects,
the target is checked to see if the user/password is now considered good.
Previously, it would register 2**x attribute watchers by mistake. Exponential
growth of threads trying to talk to one BMC is evidently a bad thing. Fix
this by correctly tracking and cancelling previous attribute watchers.
Additionally, mask a harmless exception brought on by the death of orphaned
pyghmi console objects by having them yell into the endless void rather
than trip on an exception.
The ipmi plugin, at least, is not yet quite right. Need to
continue debugging having a console session open, then changing
the bmc to a bad address, then changing it back. I fixed some of
the easier exceptions, but it is clearly still getting quite confused
to the point where 3 or 4 cycles guarantees the console can not easily heal.
Require user indicate 'console.method' rather than trying to guess.
Notably, console.method might not be desired in a configuration
that wishes only to use remote video.
The strategy was going to allow for a distinct IPMI account for automation
from other protocols. However, this is pretty complicated to explain to
people. The thought before was that the HTTPS/SSH type access could use
a passphrase that is easy to remember whilst ipmi accounts would tend
to be randomized. Instead, have the software managed authentication
info be used across all protocols and avail endpoint of user management
to add human-friendly accounts if needed (disabling IPMI/SNMP by default
in such cases).
Implement 'everything' group behavior
precheck group and node settings
do not create groups or nodes by default
Have httpapi preserve original query in case the plugin modifies it for accurate API
explorer output
Firmware fixes obsolete the need. The bad behavior on older firmware
is sufficiently tolerable that code to workaround that could have bad
side effects can reasonably be abolished.
To do performance optimization in this sort of application, this is
about as well as I have been able to manage in python. I will say perl with
NYTProf seems to be significantly better for data, but this is servicable.
I tried yappi, but it goes wildly inaccurate with this codebase. Because of
the eventlet plumbing, cProfile is still pretty misleading. Best strategy
seems to be review cumulative time with a healthy grain of salt around the
top items until you get down to info that makes sense. For example, trampoline
unfairly gets a great deal of the 'blame' by taking on nearly all the activity.
internal time seems to miss a great deal of important information.
Previously, the state would be seen as 'connected' and then 'disconnected' in event of
connection failing. Rework things such that the console session stays in 'connecting' state
until timeout or success occurs and don't send disconnect, instead raising an exception.
This makes the connection action a bit more intuitive to the user, who would assume a 'connected'
console means the endpoint was reachable. This may not always be possible in a console plugin,
but it's a nice pattern when possible. If a console plugin cannot tell when 'connected' happens, then
the previous behavior of this plugin makes sense as a 'best effort': return 'connected', send
disconnect event when the console turns out to be bad. For example, executable consoles are most
likely going to follow this pattern. An option could be for an executable to have a certain
signature to print to show 'connected' though...
This change causes cfg change notifications to more accurately reflect atomic
expectactions. If multiple fields are changed on multiple nodes that a watcher may
have registered, they will now get that data in one chunk instead of many.
Add ability for code to add watchers on nodes and their attributes. This is likely to
be reworked internally to better aggregate requests, but the code interface
is potentially complete.
It has been expressed that plural form for collection names are preferred. Additionally, tab
completion is nicer if names do not share so much leading characters.
The facility was incorrectly reassembling the text records in reverse order. With this change,
the set buffer from log function seems to be working as intended.
Before, it would delve back to set state if recent entries indicated to not
assert the tracked states. Correct that behavior so only last entry counts.
ESXi requires a distinctly different keypad mode and 'shift in' character set. Track
requests for those states. Reset on 'null' character, which seems to only be emitted by UEFI
so far. Ideally, things change such that we can remove that workaround.
Since the log analysis merely needs to know if a connect/disconnect is redundant,
only report 0, 1, or '2' connections to indicate 2 or greater. log analysis
then would want to seek out a connect with eventdata of '1' and disconnect with
eventdata of '0' and mostly ignore the '2' info. Desire for more data
could be done by actually counting the connects and disconnects, this is
just to provide a fast path to finding the 'first connection' and 'last disconnect'
signatures.