mirror of https://opendev.org/x/pyghmi synced 2025-01-15 12:17:44 +00:00

225 Commits

Author SHA1 Message Date
Jarrod Johnson
349a4aa2eb Rename description field and split data out
It would be good for some consumers to have the broad event
description separated from the event specific data.  Facilitate
this with language that makes more clear the piece that refers
broadly to a class of things that can go wrong versus the
potentially very specific data associated with that event.

Change-Id: I983dc1cdf59ae8bc08db3ff3f55192e82f693dbe
2015-05-06 15:45:11 -04:00
Jarrod Johnson
0967ac6025 Implement event log retrieval from BMCs
BMCs retain historical event data in the SEL.
Implement code to read through the SEL.  It
also passes the processed data to the OEM
framework for further processing since
OEMs may define a number of events.
Note that it is not necessarily the OEM of
the system that defines the OEM decode of the
event.  A timestamped OEM event may contain a
different OEM id.  This permits things like
the system, the OS, agents, et al. to use the
SEL to store various things.

Change-Id: Ibfb07146b1dfa0ce06df863e805b5a30f17d2f18
2015-05-06 10:44:33 -04:00
Jarrod Johnson
54b90439e7 Add system UUID to inventory
While not strictly in the FRU area, it is often desirable
to have the system UUID available.  The intent is for the
UUID to match what dmidecode would return.  If a manufacturer
does it right, that UUID will be unique.  For ThinkServers,
override with the UUID from the OEM FRU fields rather than
using the get system UUID result.

Change-Id: Ie9a1b7e8fee2cb40ab679cbf2df04db61fd4e42f
0.7.1
2015-04-29 10:54:39 -04:00
Jarrod Johnson
d022c58e61 Fix parsing of 6bit ascii
The 6 bit ascii decode was not correctly assembling
the third character in every chunk.  It was incorrectly
masking away the most significant bit before shifting.
Correct the mask to only mask the appropriate bits.

Change-Id: Ib55ce934d2834d53879e64cc44bcf12bef0eef1c
0.7.0
2015-04-28 14:15:05 -04:00
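For context on d022c58e61: 6-bit packed ASCII in FRU data stores four characters in every three bytes, so the third character straddles the second and third byte and needs both of its high bits kept. The following is a minimal sketch of that unpacking, not pyghmi's actual code. The name `decode_6bit_ascii` is hypothetical, and the input is assumed to be a `bytes` or `bytearray` object.

```python
def decode_6bit_ascii(data):
    """Unpack 6-bit packed ASCII (hypothetical helper, not pyghmi's API)."""
    chars = []
    # Process complete 3-byte chunks only; a trailing partial chunk is ignored.
    for i in range(0, len(data) - len(data) % 3, 3):
        b0, b1, b2 = data[i], data[i + 1], data[i + 2]
        c0 = b0 & 0x3F                        # low six bits of byte 0
        c1 = (b0 >> 6) | ((b1 & 0x0F) << 2)   # 2 bits of byte 0 + 4 of byte 1
        c2 = (b1 >> 4) | ((b2 & 0x03) << 4)   # 4 bits of byte 1 + 2 of byte 2
        c3 = b2 >> 2                          # high six bits of byte 2
        for c in (c0, c1, c2, c3):
            # Each 6-bit value is an offset from the ASCII space character.
            chars.append(chr(c + 0x20))
    return "".join(chars)
```

With this sketch, `decode_6bit_ascii(bytes([0x29, 0xDC, 0xA6]))` yields `"IPMI"`. Masking byte 2 with anything narrower than `0x03` would lose the top bit of the third character, which is the kind of error the commit describes.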
Jarrod Johnson
f223ed7849 Clean up strings from FRU
Vendors often pad their data fields with spaces.
Compensate through use of strip.  Similarly, some Lenovo
devices fill fields with spaces rather than ascii zeroes;
recognize those as not-present fields as well.

Change-Id: I3e1d1ffd5dae4d4febc727e7193fa6652050b267
2015-04-28 13:58:39 -04:00
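The cleanup described in f223ed7849 might look roughly like the sketch below. It is an illustration only: `clean_fru_string` is a hypothetical name, and representing a not-present field as `None` is an assumption.

```python
def clean_fru_string(value):
    """Strip padding and map blank fields to None (hypothetical helper)."""
    stripped = value.strip()
    # A field made up entirely of spaces is treated as not present.
    return stripped if stripped else None
```

For example, `clean_fru_string("  ABC123  ")` gives `"ABC123"`, while a field containing only spaces gives `None`.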
Jarrod Johnson
33de9a451f Add function to fetch a specific item's inventory
It may be desirable for calling code to specifically call
out a single component.  Add the 'get_inventory_of_component' to
make that possible.

Also refine the OEM processors such that they should pass through
'None' to make some upper level code more straightforward.

Change-Id: Ic662d6c330af24fb8ed7a9cf2f8bcfd6202c1337
2015-04-27 13:43:24 -04:00
Jarrod Johnson
35c3a326a8 Tolerate errors in 'extra' fields
Some FRUs have malformed data in the extra fields.
Tolerate this by giving up when out of data to
feed the parser, and returning what was parsed
to that point.

Change-Id: I9404c579e9020dd1afe668138eefba8266f1437b
2015-04-27 10:44:35 -04:00
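The tolerance described in 35c3a326a8 can be sketched as follows. This is not pyghmi's parser: `parse_extra_fields` is a hypothetical helper that treats every field as raw bytes, and it assumes the FRU type/length byte layout (low six bits are the length, `0xC1` ends the list).

```python
def parse_extra_fields(data):
    """Parse type/length-prefixed fields, keeping whatever parsed cleanly.

    Hypothetical helper for illustration; not pyghmi's actual parser.
    """
    fields = []
    offset = 0
    while offset < len(data):
        type_length = data[offset]
        if type_length == 0xC1:  # end-of-fields marker
            break
        length = type_length & 0x3F  # low six bits hold the field length
        start = offset + 1
        end = start + length
        if end > len(data):
            # Malformed: the field claims more bytes than remain.  Give up
            # and return what was parsed to this point.
            break
        fields.append(bytes(data[start:end]))
        offset = end
    return fields
```

Given `bytes([0xC2, 0x41, 0x42, 0xC5, 0x43])`, the second field claims five bytes but only one remains, so the sketch returns `[b"AB"]` rather than raising.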
Jarrod Johnson
e5bfd786cb Create framework for OEM extensions
While the base IPMI specification is quite comprehensive,
there are various points where OEM enhancement is possible.
This can run the gamut from entirely distinct function
(e.g. remote graphics) to additional 'sensors' to providing
more interesting decode of 'extra' fields in FRU to decoding
otherwise indecipherable SEL events.

Change-Id: Iaf670f336f225d0ea00e1803eebb84104a78e8b3
2015-04-24 15:52:07 -04:00
Jarrod Johnson
c60b684d23 Implement FRU inventory
Implement parsing of FRU data.  This phase omits Multirecord
area.  Provide access to either just the names or names
and extended information.

Change-Id: I8b0adf649769a880bf40cbe973864e889f1a6959
2015-04-23 13:56:22 -04:00
Jarrod Johnson
77593758f7 Try setting optional byte in set user access
Set the user session limit explicitly to lift
any restrictions an implementation may default
to.  Some systems consider this byte mandatory
though the specification says it is optional.

(From Steve Weber)

Change-Id: I95f4743ded702a436be019c902487813f916bd27
0.6.27
2015-04-01 09:12:39 -04:00
Jenkins
81fd13f3b3 Merge "Reduce cost of packet transmit" 0.6.26 2015-02-18 20:47:03 +00:00
Jarrod Johnson
cdef6531ca Reduce cost of packet transmit
eventlet only cares about multiple readers.  Multiple threads doing send
do not bother it.  As such, just call the sendto directly rather
than going through the hoop of an io_apply and the associated
event creation and wait and general confusion of jumbling
up the IO worker thread.  This seems to buy about 10%
performance gain in the ~100 server scenario doing get_health.

Change-Id: Ia671f201a43f32589324b37aadf79f21548aef35
2015-02-18 15:32:06 -05:00
Jarrod Johnson
cc0559d9b9 Ignore packet overrun in RAKP2 and RAKP4
At least one BMC with one firmware sends junk data
at the end of its RAKP2 and RAKP4 messages.  Tolerate
this by ignoring that data, since it is harmless to ignore.

Change-Id: I9417f26649c1be527fd9de7b648121f49452031b
2015-02-18 14:10:41 -05:00
Jarrod Johnson
f0d3050a79 Streamline and simplify IO Polling
Get more performance improvements by moving more of the
serialized effort into the IO thread to avoid churn.  This
also simplifies the issue with select being called without
recvfrom, allowing removal of the ignoresockets mechanism.
Also rework wait to avoid having to build lists that no one
ever consumes and move work out of the eternal loop
that only should happen at startup.  This has shaved an
additional 25% off of wallclock time in a single-processor
context for a given workload.

Change-Id: If321a69fabfb3ee55599ecfe3d24fbacd33388b5
0.6.25
2015-02-18 10:15:09 -05:00
Jarrod Johnson
b779379511 Reduce severity of a non-redundant state
A particular non-redundant state value has been observed
to more commonly describe a system that simply isn't
redundant by nature.  Rely on more ominous states and
sensors to convey truly problematic conditions.

Change-Id: I601fb6d358df626d9b12050c1f4a201121a7b264
0.6.24
2015-02-17 14:05:47 -05:00
Jenkins
a0a922309e Merge "Add missing generic discrete codes" 2015-02-17 18:58:09 +00:00
Jarrod Johnson
77aad5f728 Add missing generic discrete codes
A large chunk of generic discrete codes had been
skipped.  Rectify the omission and classify the
events.  Some debate could be had around 'non-redundant',
but the intent seems to be a way for a nominally redundant
component to describe a suboptimal state.

Change-Id: I48bbef96b7b6c952bcc940f5bb950962d07507d9
2015-02-17 13:40:14 -05:00
Jarrod Johnson
b4f265b2d7 Fix exceptions on sdr read
The string formatting is corrected for some 'bmc' errors.
It has been suggested to restructure things to not complain,
but so far when investigated, the systems actually had
a defect in putting the wrong sensor number in.  As a
diagnostic aid, this seems to be useful.  If a legitimate
application of duplicate SDR record for sensor is pointed
out, then we can restructure.

Change-Id: I58c2ffc1108cbb157f1398b420ea3a24bc4f05e8
2015-02-17 13:13:20 -05:00
Jarrod Johnson
af02266b0b Move packet queue into IO thread
The packet queue assembly in consumer threads
made up a lot of thrashing and overhead.  Moving
this into the IO thread saves about 25% of the CPU
overhead associated with 100 consumer threads.

Change-Id: I44ba6888a68f58297469e0630c24c12d9246c706
2015-02-17 11:06:08 -05:00
Jarrod Johnson
a32f1080b1 Fix needless retries due to misdirected packets
When used in a threaded application, there was a chance for one wait
iteration to slurp away packets from another instance.  When this
would happen, a wait may mistakenly act as if no packets were
received and expire a session timeout. Fix this by having arbitrarily
many instances feed and consume from the same queue.  This way if any
instance of the wait function pulls a packet, all consumers are made
aware of the packet. This dramatically improves performance when
dealing with very long conversations (SDR+sensors) with hundreds of
nodes across hundreds of threads.

Change-Id: I3f51097fb41197445a447cbdaddc8c1c29d4a873
2015-02-16 19:58:13 -05:00
Jarrod Johnson
871984bde0 Handle concurrent session requests
A caller may end up making requests that would get gathered
into a single session object.  However due to being aggressive,
a second session object supersedes the original without the
original getting a chance to complete login and satisfy the
second caller.  Address this by reusing a 'logging' session
and then having __init__ follow the state of the other session
object already in progress.  Without this a caller can end
up provoking a fight and having the BMC refuse to continue
to entertain the shenanigans in short order.

Change-Id: I47acbc0c974900ff50c02d470b5a79a3fed9bb73
2015-02-16 13:41:42 -05:00
Peter Martini
fe31004d5e Added a BMC (IPMI) frontend for virsh
Right now, this only supports power command, though
it could certainly be extended to include console
and boot device settings.

Change-Id: I94d438a1ea11519196efa6a819309af9d3219424
2015-02-11 16:53:27 -08:00
Peter Martini
f44881a135 Add a "--port" option to fakebmc
Change-Id: Id609034c1a23cc335ccd478df113b17e04e33cc5
2015-02-11 12:08:03 -08:00
Jenkins
9f33418d2b Merge "Work toward Python 3.4 support and testing" 2015-02-11 19:33:03 +00:00
Jenkins
6edb5f5246 Merge "Raise IpmiException on an error setting/getting the boot device" 2015-02-11 19:31:35 +00:00
Jarrod Johnson
13acb7e650 Correct delay_xmit behavior
delay_xmit was broken due to deferral of the retry calculation.
It relied upon the deadline set in the retry to govern the
initial transmit of data.  By trying to make it delay
to a more realistic time by not starting the clock
until the packet was transmitted, the delay_xmit
which never transmitted was in bad shape.  This was already
broken since retry=False would have had the same effect, but
that is a rare scenario.  Fix by having delay_xmit specifically
handle the assignment separate of retry logic.

Change-Id: Ia251215f14f8a5808e8a8f5b3f53fbef40721709
0.6.23
2015-02-01 09:38:20 -05:00
Jarrod Johnson
d4a4689f0e Rework IO Worker thread behavior
python interpreter exit is very chaotic with a daemon thread.
Address this by making the worker thread non daemon, and
hooking join() to do the exit instead.  Do a particular
dance to accommodate a caller that may replace python's library
after we are imported by defining the class at the same time
the threading library was originally referenced.

Change-Id: I8ebd2d4e89b4e11e352e440775fd236599c024a0
0.6.22
2015-01-30 14:04:50 -05:00
Jarrod Johnon
31c797c221 Correct redundant timedout calls in recursion
When handling multiple sessions hitting a timeout expiry,
there was a chance that during recursion a session would
get redundantly scheduled for retry/timeout.  Address this
by clearing out the scheduled sessions prior to acting on
any of the sessions.  Additionally, only start the timeout
clock after successfully placing the payload on the wire,
rather than including local delays against timeout expiry.

Change-Id: I2f58f0afcb13943654489630f7e8164913633a49
0.6.21
2015-01-22 16:57:05 -05:00
Jarrod Johnon
f4590dad58 Assign code to timeout behavior
While 'timeout' is not something defined in the IPMI spec
(it would make no sense), assign it an impossible value
so that calling code will experience timeout condition
as if it were a 'normal' ipmi error.

Change-Id: I8165497704148b79bc7996229f6f889b011e6d56
2015-01-22 11:18:27 -05:00
Jarrod Johnon
c74b7ef0ee Gracefully handle error while acking SOL
If SOL data comes in while trying to issue a command, the
caller risked being bothered with an unrelated exception
due to a failure to handle ack while waiting to send raw command.
If an exception occurs, silently mark the console object as closed
and move on.

Change-Id: I894d4f6596cf0546e8fe76ced0e309175198651d
2015-01-22 09:24:03 -05:00
steverweber
7c90bc9efa add more commands
add more ipmi commands

Change-Id: I6990c42ad084c56bd789b4a6d89e926a1a1443d8
2015-01-20 09:18:00 -05:00
Jarrod Johnon
f93b90f2a5 Enhance IpmiException to carry IPMI codenumber
In some cases, an exception warrants a different handling
depending on the code of the error.  Have the exception
carry the code up for calling code to make a decision.

Change-Id: Ie7e134d3a7d99eaa1ec3a82c6f39e82dc4422ca7
2015-01-15 14:35:57 -05:00
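As an illustration of f93b90f2a5, the sketch below shows an exception that carries a numeric code and a caller that branches on it. The names `SketchIpmiError`, `describe_failure`, and the attribute `code` are hypothetical and are not taken from pyghmi. The timeout sentinel `0xFFFF` is likewise an assumption standing in for the "impossible value" mentioned in f4590dad58.

```python
TIMEOUT_CODE = 0xFFFF  # assumed sentinel; real completion codes fit in one byte


class SketchIpmiError(Exception):
    """Exception that carries a numeric code (hypothetical, for illustration)."""

    def __init__(self, text="", code=0):
        super().__init__(text)
        self.code = code


def describe_failure(exc):
    """Let calling code choose a reaction based on the carried code."""
    if exc.code == TIMEOUT_CODE:
        return "retry later"
    if exc.code == 0xC1:  # IPMI completion code for an invalid command
        return "unsupported"
    return "fail"
```

A caller would catch the exception and pass it to something like `describe_failure`, so that a timeout is handled through the same path as any other coded error.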
Jarrod Johnon
acd193ce6b Allow request for single sensor by name
In some cases, it is useful for management software to
read just one or a few specific sensors.
This new function allows management software a convenient
way to get sensors without requesting a full sweep.

Change-Id: Ibbb7ab0d76b7aea934be804c6ee79c34f6a6c568
2015-01-15 12:53:55 -05:00
Jenkins
455d6955d6 Merge "Expose sensor description data" 2015-01-14 21:20:00 +00:00
Jenkins
10e6fb5481 Merge "Implement server side IPMI protocol" 2015-01-14 21:19:41 +00:00
Jarrod Johnon
bc80931e59 Expose sensor description data
Provide a function to enumerate available sensors
without actually reading them.  This paves the way
for management software to selectively query sensors
of interest depending on circumstance.

Change-Id: If63be5bc83996da10ee1fbb330395648340090bf
2015-01-14 13:15:09 -05:00
Jarrod Johnon
3c5c0e2dc2 Implement server side IPMI protocol
Provide a framework for a utility to listen and respond to
IPMI protocol messages.  Also provide an example 'fakebmc'
to give a general idea of how to create an IPMI device.

Change-Id: I240b233ff161bc3672795b3ac3bf609e4c8c98bb
2015-01-12 10:21:32 -05:00
Jenkins
d33d03f0cf Merge "Check for IPMIPASSWORD env var in pyghmiutil" 2015-01-05 21:32:06 +00:00
Peter Martini
889ec1c96f Check for IPMIPASSWORD env var in pyghmiutil
The environment variable IPMIPASSWORD is fetched without checking to see
if it's present, which throws an exception instead of printing a usage
message if it's not set.  This patch fixes that.

Change-Id: I2b15efc4ea4b617e08f096de4130004150d64133
2014-12-31 02:59:28 -05:00
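The failure mode fixed in 889ec1c96f is the usual one for `os.environ[...]`, which raises `KeyError` when the variable is unset. A minimal sketch of the safer pattern follows. `get_ipmi_password` and its usage text are hypothetical and do not reproduce the actual pyghmiutil code.

```python
import os


def get_ipmi_password(environ=None):
    """Return IPMIPASSWORD, or exit with a usage message when it is unset.

    Hypothetical helper; the usage text is invented for illustration.
    """
    environ = os.environ if environ is None else environ
    password = environ.get("IPMIPASSWORD")  # .get() avoids a KeyError
    if password is None:
        raise SystemExit("Usage: set the IPMIPASSWORD environment variable")
    return password
```

Accepting the mapping as a parameter is only a convenience for testing; called with no argument, the sketch reads the real process environment.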
Jim Rollenhagen
c0691104fb Run pep8 on files in bin/
Files in bin/ don't have a .py extension, and so don't get picked
up by flake8. Add them to the flake8 command to have them checked.

Also fix an existing error in bin/pyghmicons.

Change-Id: I4db9b8c4e13c7c7f652acaa12add125f0e0458cd
2014-12-31 00:15:01 +00:00
Jarrod Johnon
467c2e52b8 Provide access to chassis identify
Enable command method to turn on identify with full access
to duration and indefinite control.  Note that there is no method
to retrieve it, as the specification does not provide that capability.

Change-Id: I3478101ed4db15232842b508a42aeb9ec9285434
2014-11-25 11:26:27 -05:00
Jarrod Johnon
7eb6fab348 Implement retrieval of uefi flag in boot devs
While pyghmi supported the setting of the parameter,
it did not support the retrieval of the data.  Remedy
this senseless asymmetry.

Change-Id: Ib26412e012d08b39df7b6997a1f929d4277219f9
2014-11-14 14:49:12 -05:00
Jarrod Johnon
06714089fe Recover from kill() while in command
If a thread has kill() called after incommand was set, but before it actually
completed, then no fixup action will occur.  Correct this by shrugging
off incommand if it was set more than the maximum possible timeout
before so that things can try to recover.  An attempt was made with a
Lock and 'with', but kill() did not see the lock actually release, so
resort to expiring criteria instead.

Change-Id: I67d47a1533bf2e46db4534d9c1465bea08de6f64
2014-09-29 16:56:44 -04:00
Jarrod Johnon
2a44ee68c7 Remove overly aggressive packet processing
With the socket pooling, we no longer have to be as aggressive about
trying to get packets out of the socket buffer.  However, in a very
busy system of several hundred nodes, this aggressive processing
tends to produce larger python stacks and can exceed the default
1,000 limit.  We may have to resort to increasing the limit one day,
but for now this case can be avoided.

Change-Id: I83bdaa2d8ad464727a69e6ece064f51cd4318822
2014-09-23 15:53:21 -04:00
Jarrod Johnon
2d5eebebd8 Handle custom keepalive modifications on the fly
It is possible for activity under 'raw_command' to modify custom
keepalive registry.  Tolerate the structure changing in the loop gracefully.

Change-Id: I99c99b52718dff518c303819e7a24085cc6fb97a
2014-09-16 09:46:40 -04:00
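One common way to tolerate a registry changing mid-loop, as 2d5eebebd8 describes, is to iterate over a snapshot of the keys. The sketch below assumes the registry is a dict mapping a key to a zero-argument callable. That layout and the name `run_keepalives` are hypothetical and may differ from what pyghmi does.

```python
def run_keepalives(registry):
    """Invoke each callback even if callbacks mutate the registry.

    `registry` maps a key to a zero-argument callable (hypothetical layout).
    """
    invoked = []
    # Iterate over a snapshot of the keys; changing a dict's size while
    # iterating over it directly raises RuntimeError.
    for key in list(registry):
        callback = registry.get(key)
        if callback is None:
            continue  # removed by an earlier callback in this pass
        callback()
        invoked.append(key)
    return invoked
```

Entries added during the pass are left for the next pass, and entries removed during the pass are skipped rather than called.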
Jarrod Johnon
eace546448 Avoid exception on close
If close is called and the remote BMC session no longer works,
do not pass up worrisome trace to a caller, which is calling close()
to try to make sure things are clean and here there is just some part
that was already done.

Change-Id: Ib0c770b57eb0f204bcde6fc786e8f064f02ece1a
2014-09-15 11:35:55 -04:00
Jarrod Johnon
e4827408f3 Avoid recursing between keepalive and raw_command
On June 10 2014 one condition was addressed that caused infinite
recursion.  Then it was an invalid timer that could fire in the midst of
a command.  The case where this could validly occur was overlooked.
Address this by deferring invocation of keepalives until after command
exits.  If incommand indicated activity advances timeout in non
custom keepalive case, then the keepalive timer will actually be correctly advanced.

Change-Id: Iebe0241c1f928c4187f167f3ffa407f8c6f7fa84
2014-09-15 10:03:08 -04:00
Jeremy Stanley
471f9f8755 Work toward Python 3.4 support and testing
Change-Id: Ibde15015cf665ca95e217381566baccee233ab37
2014-09-03 19:07:25 +00:00
Jarrod Johnson
737643c33c Fix IO worker tolerance of errors
If an _io_apply call encounters an exception, the IO worker would fall apart
entirely.  Wrap the key entry points in try clauses to allow the thread to keep
working and the dependent IPMI objects to have their waiters acknowledged.  It
is still considered a grave bug for this to ever occur, but at least the
application would carry on.

Change-Id: I61b0797025b25c6d9d3e86a5110603a6fc2d67fb
2014-08-27 11:01:21 -04:00
Jenkins
9bf8258edf Merge "Force non-numeric for compact sensor records" 2014-08-08 12:52:55 +00:00