Previously, if a username or password was bad, no retry would occur.
Correct this so that periodically, or as soon as someone connects,
the target is checked to see if the user/password is now considered good.
Previously, it would register 2**x attribute watchers by mistake. Exponential
growth of threads trying to talk to one BMC is evidently a bad thing. Fix
this by correctly tracking and cancelling previous attribute watchers.
Additionally, mask a harmless exception brought on by the death of orphaned
pyghmi console objects by having them yell into the endless void rather
than trip on an exception.
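
A minimal sketch of the watcher tracking described above, assuming a
configuration store that hands back an id for each registered watcher.
FakeConfig, Session, watch and unwatch are hypothetical names for
illustration only, not the actual configuration API.

```python
class FakeConfig:
    """Hypothetical stand-in for a config store that supports watchers."""

    def __init__(self):
        self.watchers = {}
        self._next_id = 0

    def watch(self, callback):
        # Register a callback and return an id that can cancel it later.
        self._next_id += 1
        self.watchers[self._next_id] = callback
        return self._next_id

    def unwatch(self, watcher_id):
        # Cancelling an unknown id is treated as a no-op.
        self.watchers.pop(watcher_id, None)


class Session:
    """Hypothetical session that re-registers its watcher on each refresh."""

    def __init__(self, config):
        self._config = config
        self._watcher = None

    def refresh(self):
        # Cancel the watcher from the previous cycle before adding a new
        # one. Skipping this step is what lets watchers pile up.
        if self._watcher is not None:
            self._config.unwatch(self._watcher)
        self._watcher = self._config.watch(self.refresh)


config = FakeConfig()
session = Session(config)
for _ in range(5):
    session.refresh()
```

However many times refresh() runs, only one watcher stays registered.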
The ipmi plugin, at least, is not yet quite right. Need to continue
debugging the case of having a console session open, then changing
the bmc to a bad address, then changing it back. I fixed some of
the easier exceptions, but it is clearly still getting quite confused,
to the point where 3 or 4 cycles guarantee the console cannot easily heal.
Require the user to indicate 'console.method' rather than trying to guess.
Notably, console.method might not be desired in a configuration
that wishes only to use remote video.
The strategy was going to allow for an IPMI account for automation that
is distinct from the accounts used by other protocols. However, this is
pretty complicated to explain to people. The thought before was that
HTTPS/SSH type access could use a passphrase that is easy to remember
whilst ipmi accounts would tend to be randomized. Instead, have the
software-managed authentication info be used across all protocols and
make use of the endpoint's user management to add human-friendly
accounts if needed (disabling IPMI/SNMP by default in such cases).
Implement 'everything' group behavior
precheck group and node settings
do not create groups or nodes by default
Have httpapi preserve the original query, in case the plugin modifies it,
so that API explorer output stays accurate
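
A rough sketch of the idea, assuming the query is a plain dict and the
plugin is a callable that may mutate it. The dispatch and plugin names
are hypothetical and do not reflect the actual httpapi code.

```python
import copy


def dispatch(query, plugin):
    """Run a plugin while keeping an untouched copy of the request."""
    # Deep copy first, so nested values the plugin mutates are preserved.
    original = copy.deepcopy(query)
    result = plugin(query)
    # Report the query as the client sent it, not as the plugin left it.
    return {'request': original, 'result': result}


def mutating_plugin(query):
    # Hypothetical plugin that rewrites its input while working.
    query['nodes'].append('internal-only')
    query.pop('verbose', None)
    return len(query['nodes'])


response = dispatch({'nodes': ['n1'], 'verbose': True}, mutating_plugin)
```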
Firmware fixes obsolete the need. The bad behavior on older firmware
is sufficiently tolerable that code to work around it, which could have
bad side effects, can reasonably be abolished.
To do performance optimization in this sort of application, this is
about as good as I have been able to manage in python. I will say perl with
NYTProf seems to be significantly better for data, but this is serviceable.
I tried yappi, but it goes wildly inaccurate with this codebase. Because of
the eventlet plumbing, cProfile is still pretty misleading. Best strategy
seems to be to review cumulative time with a healthy grain of salt around the
top items until you get down to info that makes sense. For example, trampoline
unfairly gets a great deal of the 'blame' by taking on nearly all the activity.
Internal time seems to miss a great deal of important information.
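
For reference, a small sketch of pulling a cumulative-time report out of
cProfile using only the standard library. busy_work and profile_call are
hypothetical names. In the real application the top entries would be
eventlet internals rather than anything this tidy.

```python
import cProfile
import io
import pstats


def busy_work(n):
    """Hypothetical workload standing in for the code being profiled."""
    return sum(i * i for i in range(n))


def profile_call(func, *args, limit=10):
    """Profile one call and return a report sorted by cumulative time."""
    profiler = cProfile.Profile()
    profiler.runcall(func, *args)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # 'cumulative' counts time in callees as well, which is why wrappers
    # such as a trampoline can dominate the top of the list.
    stats.sort_stats('cumulative').print_stats(limit)
    return out.getvalue()


report = profile_call(busy_work, 100000)
print(report)
```

Passing 'tottime' to sort_stats gives the internal-time view instead.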
Previously, the state would be seen as 'connected' and then 'disconnected' in the event of
a connection failing. Rework things such that the console session stays in 'connecting' state
until timeout or success occurs, and raise an exception rather than sending a disconnect.
This makes the connection action a bit more intuitive to the user, who would assume a 'connected'
console means the endpoint was reachable. This may not always be possible in a console plugin,
but it's a nice pattern when possible. If a console plugin cannot tell when 'connected' happens, then
the previous behavior of this plugin makes sense as a 'best effort': return 'connected', send
disconnect event when the console turns out to be bad. For example, executable consoles are most
likely going to follow this pattern. One option, though, could be for an executable to print
a certain signature to show 'connected'.
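
A sketch of the 'connecting' pattern described above, assuming a
plugin-supplied probe that returns True once the endpoint answers.
ConsoleSession, ConsoleConnectError and try_connect are hypothetical
names, not the actual console plugin interface.

```python
import time


class ConsoleConnectError(Exception):
    """Raised when the endpoint is not reachable before the timeout."""


class ConsoleSession:
    """Hypothetical session that reports 'connected' only on success."""

    def __init__(self, try_connect, timeout=5.0, interval=0.01):
        self.state = 'connecting'
        self._try_connect = try_connect
        self._timeout = timeout
        self._interval = interval

    def connect(self):
        deadline = time.monotonic() + self._timeout
        while time.monotonic() < deadline:
            if self._try_connect():
                self.state = 'connected'
                return
            time.sleep(self._interval)
        # 'connected' was never reported, so there is no disconnect event
        # to send. The failure surfaces as an exception instead.
        raise ConsoleConnectError('endpoint not reachable before timeout')


# Probe that succeeds on the third attempt.
attempts = iter([False, False, True])
good = ConsoleSession(lambda: next(attempts))
good.connect()

# Probe that never succeeds.
bad = ConsoleSession(lambda: False, timeout=0.05)
try:
    bad.connect()
    failed = False
except ConsoleConnectError:
    failed = True
```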