Rather than use the string 'Unspecified', use the
None object to denote an event that has no associated
component.
Change-Id: I8b8431257a01434ec09aaa940fab5bbce2164ae6
The specification actually only has the lower 5 bits refer to the slot
number. Mask out the upper three bits to be accurate.
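
A minimal sketch of the masking, assuming the slot number arrives
packed into a single byte. The names SLOT_MASK and slot_number are
illustrative, not taken from the code:

```python
SLOT_MASK = 0x1F  # 0b00011111: keep only the lower 5 bits


def slot_number(raw_byte):
    """Return the slot number held in the low 5 bits of raw_byte."""
    return raw_byte & SLOT_MASK
```

With this mask, a raw value of 0xE3 yields slot 3 because the upper
three bits are discarded.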
Change-Id: I654f7a96ad1ae2fd18430ca73aa33b30ce80a8e2
This class of non-redundant simply means the resource is inherently
non-redundant per its current configuration. Other sorts of non-redundant
are available to indicate when this is a consequence of failure.
Change-Id: I0326d13159ccffd2ad790fc8d14fefec17cc2671
Provide a facility for OEM plugins to modify events on the way
out of the event handling functions.
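
One way such a hook could look, as a sketch only. The class names,
the process_event method, and the event keys below are hypothetical
and chosen for illustration:

```python
class OemHandler(object):
    """Hypothetical base handler: the default hook changes nothing."""

    def process_event(self, event):
        return event


class ExampleOemHandler(OemHandler):
    """Hypothetical vendor plugin that refines one kind of event."""

    def process_event(self, event):
        # 'oem_code' and 'description' are made-up keys for this sketch
        if event.get('oem_code') == 1:
            event['description'] = 'Vendor specific condition'
        return event


def finish_event(event, oem):
    # Give the OEM plugin the last word before the event is returned
    return oem.process_event(event)
```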
Change-Id: I24f31f7035dcba2c1f30810a6282c87ee987ddad
For the text description and severity, use the negated values
added previously to make it more clear what an event represents.
Change-Id: I9eb47363d5f190bd0fd5bec58bc49fa12a9a3ed0
IPMI carries a connotation of what an 'entity' is. While
sticking strictly to the 'sensor' nomenclature that actually
maps would be misleading, 'entity' is also confusing. Rename
the element to 'component' to express what it tends to represent
without being confused with the IPMI concept of 'entity'.
Change-Id: I592e1e6b9588bf381588c7d9d3e6531eb277d23c
Some callers of the code would like to use an enumeration
rather than using the strings verbatim or as a key.
Accommodate this by adding fields to carry that
data for such callers.
Change-Id: I75442a12448cdee1e7c0cb99501d6f229cc0c2b2
Implementing events suggested that the severity and deassertion
of events warrant a distinct set of enumerations. Additionally,
the structure made it clear that many of the strings indicated by
the sensor offsets and by their sensor_type were redundant with each
other, and the language was changed to eliminate that redundancy.
Change-Id: I1c2f733c4ffb1e7ac2ad8b1004bd560487117b66
If a BMC returns 0 data, then assume end of data has been
reached. Not all implementations accurately advertise FRU
size, so this is a way to detect and break out.
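
A sketch of the loop, reading "0 data" as an empty read (an
assumption). read_chunk is a hypothetical callable that stands in
for the actual FRU read command:

```python
def read_fru(read_chunk, advertised_size, chunk_size=16):
    """Read FRU data, stopping early if a read comes back empty.

    read_chunk(offset, count) is a hypothetical callable returning
    up to count bytes starting at offset.
    """
    data = bytearray()
    offset = 0
    while offset < advertised_size:
        want = min(chunk_size, advertised_size - offset)
        chunk = read_chunk(offset, want)
        if not chunk:
            # Advertised size was too large; treat this as end of data
            break
        data.extend(chunk)
        offset += len(chunk)
    return bytes(data)
```

If a device advertises 256 bytes but only holds 40, the loop ends
after the first empty read instead of requesting the remainder.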
Change-Id: I03d0393563f8527e16830098b649ae940ee7ee9e
When the change to refactor the event and event_data happened,
the time correction code was not correctly modified. Rectify
the mistake.
Change-Id: I8317539761456f1c5621745989fbd8ebfc973455
This is generally used to indicate the benign configuration
choice of disabling some present device. For example, a
server with a built in network device disabled in favor of
an add-in option.
Change-Id: I1cf9f4feccf7ec022cca7ce7909093b16807244c
It would be good for some consumers to have the broad event
description separated from the event specific data. Facilitate
this with language that makes more clear the piece that refers
broadly to a class of things that can go wrong versus the
potentially very specific data associated with that event.
Change-Id: I983dc1cdf59ae8bc08db3ff3f55192e82f693dbe
BMCs retain historical event data in the SEL.
Implement code to read through the SEL. It
also passes the processed data to the OEM
framework for further processing since
OEMs may define a number of events.
Note that it is not necessarily the OEM of
the system that defines the OEM decode of the
event. A timestamped OEM event may contain a
different OEM id. This permits things like
the system, the OS, agents, et al. to use the
SEL to store various things.
Change-Id: Ibfb07146b1dfa0ce06df863e805b5a30f17d2f18
While not strictly in the FRU area, it is often desirable
to have the system UUID available. The intent is for the
UUID to match what dmidecode would return. If a manufacturer
does it right, that UUID will be unique. For ThinkServers,
override with the UUID from the OEM FRU fields rather than
using the get system UUID result.
Change-Id: Ie9a1b7e8fee2cb40ab679cbf2df04db61fd4e42f
The 6 bit ascii decode was not correctly assembling
the third character in every chunk. It was incorrectly
masking away the most significant bit before shifting.
Correct the mask to only mask the appropriate bits.
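
For reference, a sketch of the packed 6-bit ASCII decode, where
every 3 bytes carry 4 characters. The function name is illustrative
and this is not the code from the change itself:

```python
def unpack_6bit_ascii(raw):
    """Decode packed 6-bit ASCII: every 3 bytes hold 4 characters."""
    data = bytearray(raw)
    nchars = (len(data) * 8) // 6  # complete characters available
    data.extend(b'\x00' * (-len(data) % 3))  # pad to whole chunks
    chars = []
    for i in range(0, len(data), 3):
        b0, b1, b2 = data[i], data[i + 1], data[i + 2]
        codes = (
            b0 & 0x3F,                        # low 6 bits of byte 0
            (b0 >> 6) | ((b1 & 0x0F) << 2),   # top 2 of b0, low 4 of b1
            (b1 >> 4) | ((b2 & 0x03) << 4),   # top 4 of b1, low 2 of b2
            b2 >> 2,                          # top 6 bits of byte 2
        )
        # 6-bit codes are offset from the ASCII space character
        chars.extend(chr(0x20 + code) for code in codes)
    return ''.join(chars[:nchars])
```

The third character takes all four upper bits of the second byte,
so no bit of that nibble may be masked off before shifting.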
Change-Id: Ib55ce934d2834d53879e64cc44bcf12bef0eef1c
Often, a vendor will pad their data fields with spaces.
Compensate through use of strip. Similarly, some devices
elect to use spaces rather than ASCII zeroes; on Lenovo
devices, recognize those as not-present fields as well.
Change-Id: I3e1d1ffd5dae4d4febc727e7193fa6652050b267
It may be desirable for calling code to specifically call
out a single component. Add the 'get_inventory_of_component' to
make that possible.
Also refine the OEM processors such that they should pass through
'None' to make some upper level code more straightforward.
Change-Id: Ic662d6c330af24fb8ed7a9cf2f8bcfd6202c1337
Some FRUs have malformed data in the extra fields.
Tolerate this by giving up when out of data to
feed the parser, and returning what was parsed
to that point.
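
A sketch of that tolerance, assuming each field starts with a
type/length byte whose low 6 bits give the length and that 0xC1
ends the field list. Field type handling is omitted and the names
are illustrative:

```python
END_OF_FIELDS = 0xC1


def parse_fields(raw):
    """Return the fields parsed before the data ends or runs short."""
    data = bytearray(raw)
    fields = []
    offset = 0
    while offset < len(data):
        header = data[offset]
        if header == END_OF_FIELDS:
            break
        length = header & 0x3F  # low 6 bits carry the field length
        start = offset + 1
        end = start + length
        if end > len(data):
            # Malformed: the field claims more bytes than remain.
            # Give up and keep what was parsed so far.
            break
        fields.append(bytes(data[start:end]))
        offset = end
    return fields
```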
Change-Id: I9404c579e9020dd1afe668138eefba8266f1437b
While the base IPMI specification is quite comprehensive,
there are various points where OEM enhancement is possible.
This can run the gamut from entirely distinct function
(e.g. remote graphics) to additional 'sensors' to providing
more interesting decode of 'extra' fields in FRU to decoding
otherwise indecipherable SEL events.
Change-Id: Iaf670f336f225d0ea00e1803eebb84104a78e8b3
Implement parsing of FRU data. This phase omits Multirecord
area. Provide access to either just the names or names
and extended information.
Change-Id: I8b0adf649769a880bf40cbe973864e889f1a6959
Set the user session limit explicitly to lift
any restrictions an implementation may default
to. Some systems consider this byte mandatory
though the specification says it is optional.
(From Steve Weber)
Change-Id: I95f4743ded702a436be019c902487813f916bd27
eventlet only cares about multiple readers. Multiple threads doing send
do not bother it. As such, just call the sendto directly rather
than going through the hoop of an io_apply and the associated
event creation and wait and general confusion of jumbling
up the IO worker thread. This seems to buy about 10%
performance gain in the ~100 server scenario doing get_health.
Change-Id: Ia671f201a43f32589324b37aadf79f21548aef35
At least one BMC with one firmware sends junk data
at the end of its RAKP2 and RAKP4 messages. Tolerate
by ignoring that data, since it is harmless to ignore.
Change-Id: I9417f26649c1be527fd9de7b648121f49452031b
Get more performance improvements by moving more of the
serialized effort into the IO thread to avoid churn. This
also simplifies the issue with select being called without
recvfrom, allowing removal of the ignoresockets mechanism.
Also rework wait to avoid having to build lists that no one
ever consumes and move work out of the eternal loop
that only should happen at startup. This has shaved an
additional 25% off of wallclock time in a single-processor
context for a given workload.
Change-Id: If321a69fabfb3ee55599ecfe3d24fbacd33388b5
A particular non-redundant state value has been observed
to more commonly describe a system that simply isn't
redundant by nature. Rely on more ominous states and
sensors to convey truly problematic conditions.
Change-Id: I601fb6d358df626d9b12050c1f4a201121a7b264
A large chunk of generic discrete codes had been
skipped. Rectify the omission and classify the
events. Some debate could be had around 'non-redundant',
but the intent seems to be a way for a nominally redundant
component to describe a suboptimal state.
Change-Id: I48bbef96b7b6c952bcc940f5bb950962d07507d9
The string formatting is corrected for some 'bmc' errors.
It has been suggested to restructure things to not complain,
but so far when investigated, the systems actually had
a defect in putting the wrong sensor number in. As a
diagnostic aid, this seems to be useful. If a legitimate
application of duplicate SDR record for sensor is pointed
out, then we can restructure.
Change-Id: I58c2ffc1108cbb157f1398b420ea3a24bc4f05e8
The packet queue assembly in consumer threads
caused a lot of thrashing and overhead. Moving
this into the IO thread saves about 25% of the CPU
overhead associated with 100 consumer threads.
Change-Id: I44ba6888a68f58297469e0630c24c12d9246c706
When used in a threaded application, there was a chance for one wait
iteration to slurp away packets from another instance. When this
would happen, a wait may mistakenly act as if no packets were
received and expire a session timeout. Fix this by having arbitrarily
many instances feed and consume from the same queue. This way if any
instance of the wait function pulls a packet, all consumers are made
aware of the packet. This dramatically improves performance when
dealing with very long conversations (SDR+sensors) with hundreds of
nodes across hundreds of threads.
Change-Id: I3f51097fb41197445a447cbdaddc8c1c29d4a873
A caller may end up making requests that would get gathered
into a single session object. However due to being aggressive,
a second session object supersedes the original without the
original getting a chance to complete login and satisfy the
second caller. Address this by reusing a 'logging' session
and then having __init__ follow the state of the other session
object already in progress. Without this a caller can end
up provoking a fight and having the BMC refuse to continue
to entertain the shenanigans in short order.
Change-Id: I47acbc0c974900ff50c02d470b5a79a3fed9bb73
Right now, this only supports power command, though
it could certainly be extended to include console
and boot device settings.
Change-Id: I94d438a1ea11519196efa6a819309af9d3219424
delay_xmit was broken due to deferral of the retry calculation.
It relied upon the deadline set in the retry to govern the
initial transmit of data. By trying to make it delay
to a more realistic time by not starting the clock
until the packet was transmitted, the delay_xmit
which never transmitted was in bad shape. This was already
broken since retry=False would have had the same effect, but
that is a rare scenario. Fix by having delay_xmit specifically
handle the assignment separately from the retry logic.
Change-Id: Ia251215f14f8a5808e8a8f5b3f53fbef40721709
Python interpreter exit is very chaotic with a daemon thread.
Address this by making the worker thread non-daemon, and
hooking join() to do the exit instead. Do a particular
dance to accommodate a caller that may replace Python's library
after we are imported by defining the class at the same time
the threading library was originally referenced.
Change-Id: I8ebd2d4e89b4e11e352e440775fd236599c024a0
When handling multiple sessions hitting a timeout expiry,
there was a chance that during recursion a session would
get redundantly scheduled for retry/timeout. Address this
by clearing out the scheduled sessions prior to acting on
any of the sessions. Additionally, only start the timeout
clock after successfully placing the payload on the wire,
rather than including local delays against timeout expiry.
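
A sketch of the clear-before-acting idea, assuming a dict that maps
each session to its deadline. The names run_expired, pending, and
on_timeout are hypothetical:

```python
def run_expired(pending, now, on_timeout):
    """Expire due sessions without scheduling any of them twice.

    pending maps session -> deadline. Due sessions are removed from
    pending before any callback runs, so a callback that re-enters
    this function cannot pick up the same session again.
    """
    due = [sess for sess, deadline in pending.items() if deadline <= now]
    for sess in due:
        del pending[sess]
    for sess in due:
        on_timeout(sess)
    return due
```

Even if on_timeout recurses into run_expired, each expired session
is handled exactly once.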
Change-Id: I2f58f0afcb13943654489630f7e8164913633a49
While 'timeout' is not something defined in the IPMI spec
(it would make no sense), assign it an impossible value
so that calling code will experience a timeout condition
as if it were a 'normal' IPMI error.
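
A sketch of the idea. IPMI completion codes occupy one byte, so any
value above 0xFF cannot collide with a real one. The value 0xFFFF
and the helper names here are assumptions for illustration:

```python
TIMEOUT_CODE = 0xFFFF  # assumed value; too large for a one-byte code


def make_timeout_response():
    # Shaped like an ordinary error response so callers need no
    # separate code path for timeouts
    return {'error': 'timeout', 'code': TIMEOUT_CODE}


def describe(response):
    code = response.get('code', 0)
    if code == 0:
        return 'ok'
    if code == TIMEOUT_CODE:
        return 'timeout'
    return 'ipmi error 0x%02x' % code
```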
Change-Id: I8165497704148b79bc7996229f6f889b011e6d56
If SOL data comes in while trying to issue a command, the
caller risked being bothered with an unrelated exception
due to a failure to handle ack while waiting to send raw command.
If an exception occurs, silently mark the console object as closed
and move on.
Change-Id: I894d4f6596cf0546e8fe76ced0e309175198651d
In some cases, an exception warrants a different handling
depending on the code of the error. Have the exception
carry the code up for calling code to make a decision.
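
A sketch of an exception that carries the code. The class name, the
ipmicode attribute, and the helper functions are illustrative
assumptions. 0xC0 is the standard 'node busy' completion code:

```python
class IpmiError(Exception):
    """Hypothetical exception that keeps the IPMI completion code."""

    def __init__(self, text='', code=0):
        super(IpmiError, self).__init__(text)
        self.ipmicode = code


def run_command(fail_code):
    # Stand-in for a real command that fails with the given code
    raise IpmiError('command failed', code=fail_code)


def handle(fail_code):
    try:
        run_command(fail_code)
    except IpmiError as exc:
        if exc.ipmicode == 0xC0:
            return 'retry'  # busy: the caller may try again later
        raise  # anything else is passed up unchanged
```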
Change-Id: Ie7e134d3a7d99eaa1ec3a82c6f39e82dc4422ca7
In some cases, it is useful for management software to
read just one or a few specific sensors.
This new function gives management software a convenient
way to get sensors without requesting a full sweep.
Change-Id: Ibbb7ab0d76b7aea934be804c6ee79c34f6a6c568
Provide a function to enumerate available sensors
without actually reading them. This paves the way
for management software to selectively query sensors
of interest depending on circumstance.
Change-Id: If63be5bc83996da10ee1fbb330395648340090bf