pyghmi

mirror of https://opendev.org/x/pyghmi synced 2025-01-15 20:27:45 +00:00

Author	SHA1	Message	Date
Jarrod Johnson	cc0559d9b9	Ignore packet overrun in RAKP2 and RAKP4 At least one BMC with one firmware sends junk data at the end of their RAKP2 and RAKP4 messages. Tolerate by ignoring that data, since it is harmless to ignore Change-Id: I9417f26649c1be527fd9de7b648121f49452031b	2015-02-18 14:10:41 -05:00
Jarrod Johnson	f0d3050a79	Streamline and simplify IO Polling Get more performance improvements by moving more of the serialized effort into the IO thread to avoid churn. This also simplifies the issue with select being called without recvfrom, allowing removal of the ignoresockets mechanism. Also rework wait to avoid having to build lists that no one ever consumes and move work out of the eternal loop that only should happen at startup. This has shaved an additional 25% off of wallclock time in a single-processor context for a given workload. Change-Id: If321a69fabfb3ee55599ecfe3d24fbacd33388b5 0.6.25	2015-02-18 10:15:09 -05:00
Jarrod Johnson	b779379511	Reduce severity of a non-redundant state A particular non-redundant state value has been observed to more commonly describe a system that simply isn't redundant by nature. Rely on more ominous states and sensors to convey truly problematic conditions. Change-Id: I601fb6d358df626d9b12050c1f4a201121a7b264 0.6.24	2015-02-17 14:05:47 -05:00
Jenkins	a0a922309e	Merge "Add missing generic discrete codes"	2015-02-17 18:58:09 +00:00
Jarrod Johnson	77aad5f728	Add missing generic discrete codes A large chunk of generic discrete codes had been skipped. Rectify the omission and classify the events. Some debate could be had around 'non-redundant', but the intent seems to be a way for a nominally redundant component to describe a suboptimal state. Change-Id: I48bbef96b7b6c952bcc940f5bb950962d07507d9	2015-02-17 13:40:14 -05:00
Jarrod Johnson	b4f265b2d7	Fix exceptions on sdr read The string formatting is corrected for some 'bmc' errors. It has been suggested to restructure things to not complain, but so far when investigated, the systems actually had a defect in putting the wrong sensor number in. As a diagnostic aid, this seems to be useful. If a legitimate application of duplicate SDR record for sensor is pointed out, then we can restructure. Change-Id: I58c2ffc1108cbb157f1398b420ea3a24bc4f05e8	2015-02-17 13:13:20 -05:00
Jarrod Johnson	af02266b0b	Move packet queue into IO thread The packet queue assembly in consumer threads made up a lot of thrashing and overhead. Moving this into the IO thread saves about 25% of the CPU overhead associated with 100 consumer threads. Change-Id: I44ba6888a68f58297469e0630c24c12d9246c706	2015-02-17 11:06:08 -05:00
Jarrod Johnson	a32f1080b1	Fix needless retries due to misdirected packets When used in a threaded application, there was a chance for one wait iteration to slurp away packets from another instance. When this would happen, a wait may mistakenly act as if no packets were received and expire a session timeout. Fix this by having arbitrarily many instances feed and consume from the same queue. This way if any instance of the wait function pulls a packet, all consumers are made aware of the packet. This dramatically improves performance when dealing with very long conversations (SDR+sensors) with hundreds of nodes across hundreds of threads. Change-Id: I3f51097fb41197445a447cbdaddc8c1c29d4a873	2015-02-16 19:58:13 -05:00
Jarrod Johnson	871984bde0	Handle concurrent session requests A caller may end up making requests that would get gathered into a single session object. However due to being aggressive, a second session object supersedes the original without the original getting a chance to complete login and satisfy the second caller. Address this by reusing a 'logging' session and then having __init__ follow the state of the other session object already in progress. Without this a caller can end up provoking a fight and having the BMC refuse to continue to entertain the shenanigans in short order. Change-Id: I47acbc0c974900ff50c02d470b5a79a3fed9bb73	2015-02-16 13:41:42 -05:00
Peter Martini	fe31004d5e	Added a BMC (IPMI) frontend for virsh Right now, this only supports power command, though it could certainly be extended to include console and boot device settings. Change-Id: I94d438a1ea11519196efa6a819309af9d3219424	2015-02-11 16:53:27 -08:00
Peter Martini	f44881a135	Add a "--port" option to fakebmc Change-Id: Id609034c1a23cc335ccd478df113b17e04e33cc5	2015-02-11 12:08:03 -08:00
Jenkins	9f33418d2b	Merge "Work toward Python 3.4 support and testing"	2015-02-11 19:33:03 +00:00
Jenkins	6edb5f5246	Merge "Raise IpmiException on an error setting/getting the boot device"	2015-02-11 19:31:35 +00:00
Jarrod Johnson	13acb7e650	Correct delay_xmit behavior delay_xmit was broken due to deferral of the retry calculation. It relied upon the deadline set in the retry to govern the initial transmit of data. By trying to make it delay to a more realistic time by not starting the clock until the packet was transmitted, the delay_xmit which never transmitted was in bad shape. This was already broken since retry=False would have had the same effect, but that is a rare scenario. Fix by having delay_xmit specifically handle the assignment separate of retry logic. Change-Id: Ia251215f14f8a5808e8a8f5b3f53fbef40721709 0.6.23	2015-02-01 09:38:20 -05:00
Jarrod Johnson	d4a4689f0e	Rework IO Worker thread behavior python interpreter exit is very chaotic with a daemon thread. Address this by making the worker thread non-daemon, and hooking join() to do the exit instead. Do a particular dance to accommodate a caller that may replace python's library after we are imported by defining the class at the same time the threading library was originally referenced. Change-Id: I8ebd2d4e89b4e11e352e440775fd236599c024a0 0.6.22	2015-01-30 14:04:50 -05:00
Jarrod Johnon	31c797c221	Correct redundant timedout calls in recursion When handling multiple sessions hitting a timeout expiry, there was a chance that during recursion a session would get redundantly scheduled for retry/timeout. Address this by clearing out the scheduled sessions prior to acting on any of the sessions. Additionally, only start the timeout clock after successfully placing the payload on the wire, rather than including local delays against timeout expiry. Change-Id: I2f58f0afcb13943654489630f7e8164913633a49 0.6.21	2015-01-22 16:57:05 -05:00
Jarrod Johnon	f4590dad58	Assign code to timeout behavior While 'timeout' is not something defined in the IPMI spec (it would make no sense), assign it an impossible value so that calling code will experience timeout condition as if it were a 'normal' ipmi error. Change-Id: I8165497704148b79bc7996229f6f889b011e6d56	2015-01-22 11:18:27 -05:00
Jarrod Johnon	c74b7ef0ee	Gracefully handle error while acking SOL If SOL data comes in while trying to issue a command, the caller had the risk of being bothered with an unrelated exception due to a failure to handle ack while waiting to send raw command. If an exception occurs, silently mark the console object as closed and move on. Change-Id: I894d4f6596cf0546e8fe76ced0e309175198651d	2015-01-22 09:24:03 -05:00
steverweber	7c90bc9efa	add more commands add more ipmi commands Change-Id: I6990c42ad084c56bd789b4a6d89e926a1a1443d8	2015-01-20 09:18:00 -05:00
Jarrod Johnon	f93b90f2a5	Enhance IpmiException to carry IPMI codenumber In some cases, an exception warrants a different handling depending on the code of the error. Have the exception carry the code up for calling code to make a decision. Change-Id: Ie7e134d3a7d99eaa1ec3a82c6f39e82dc4422ca7	2015-01-15 14:35:57 -05:00
Jarrod Johnon	acd193ce6b	Allow request for single sensor by name In some cases, it is useful for management software to specifically read one or a few sensors specifically. This new function allows management software a convenient way to get sensors without requesting a full sweep. Change-Id: Ibbb7ab0d76b7aea934be804c6ee79c34f6a6c568	2015-01-15 12:53:55 -05:00
Jenkins	455d6955d6	Merge "Expose sensor description data"	2015-01-14 21:20:00 +00:00
Jenkins	10e6fb5481	Merge "Implement server side IPMI protocol"	2015-01-14 21:19:41 +00:00
Jarrod Johnon	bc80931e59	Expose sensor description data Provide a function to enumerate available sensors without actually reading them. This paves the way for management software to selectively query sensors of interest depending on circumstance. Change-Id: If63be5bc83996da10ee1fbb330395648340090bf	2015-01-14 13:15:09 -05:00
Jarrod Johnon	3c5c0e2dc2	Implement server side IPMI protocol Provide framework for a utility to listen and respond to ipmi protocol messages. Also provide an example 'fakebmc' to give a general idea of how to create an ipmi device. Change-Id: I240b233ff161bc3672795b3ac3bf609e4c8c98bb	2015-01-12 10:21:32 -05:00
Jenkins	d33d03f0cf	Merge "Check for IPMIPASSWORD env var in pyghmiutil"	2015-01-05 21:32:06 +00:00
Peter Martini	889ec1c96f	Check for IPMIPASSWORD env var in pyghmiutil The environment variable IPMIPASSWORD is fetched without checking to see if it's present, which throws an exception instead of printing a usage message if it's not set. This patch fixes that. Change-Id: I2b15efc4ea4b617e08f096de4130004150d64133	2014-12-31 02:59:28 -05:00
Jim Rollenhagen	c0691104fb	Run pep8 on files in bin/ Files in bin/ don't have a .py extension, and so don't get picked up by flake8. Add them to the flake8 command to have them checked. Also fix an existing error in bin/pyghmicons. Change-Id: I4db9b8c4e13c7c7f652acaa12add125f0e0458cd	2014-12-31 00:15:01 +00:00
Jarrod Johnon	467c2e52b8	Provide access to chassis identify Enable command method to turn on identify with full access to duration and indefinite control. Note that there is no method to retrieve it, as the specification does not provide that capability. Change-Id: I3478101ed4db15232842b508a42aeb9ec9285434	2014-11-25 11:26:27 -05:00
Jarrod Johnon	7eb6fab348	Implement retrieval of uefi flag in boot devs While pyghmi supported the setting of the parameter, it did not support the retrieval for the data. Remedy this senseless asymmetry. Change-Id: Ib26412e012d08b39df7b6997a1f929d4277219f9	2014-11-14 14:49:12 -05:00
Jarrod Johnon	06714089fe	Recover from kill() while in command If a thread has kill() after incommand was set, but before it actually completed, then no fixup action will occur. Correct this by shrugging off incommand if it was set more than the maximum possible timeout before so that things can try to recover. An attempt was made with a Lock and 'with', but kill() did not see the lock actually release, so resort to expiring criteria instead. Change-Id: I67d47a1533bf2e46db4534d9c1465bea08de6f64	2014-09-29 16:56:44 -04:00
Jarrod Johnon	2a44ee68c7	Remove overly aggressive packet processing With the socket pooling, we no longer have to be as aggressive about trying to get packets out of the socket buffer. However, in a very busy system of several hundred nodes, this aggressive processing tends to produce larger python stacks and can exceed the default 1,000 limit. We may have to resort to increasing the limit one day, but for now this case can be avoided. Change-Id: I83bdaa2d8ad464727a69e6ece064f51cd4318822	2014-09-23 15:53:21 -04:00
Jarrod Johnon	2d5eebebd8	Handle custom keepalive modifications on the fly It is possible for activity under 'raw_command' to modify custom keepalive registry. Tolerate the structure changing in the loop gracefully. Change-Id: I99c99b52718dff518c303819e7a24085cc6fb97a	2014-09-16 09:46:40 -04:00
Jarrod Johnon	eace546448	Avoid exception on close If close is called and the remote BMC session no longer works, do not pass up worrisome trace to a caller, which is calling close() to try to make sure things are clean and here there is just some part that was already done. Change-Id: Ib0c770b57eb0f204bcde6fc786e8f064f02ece1a	2014-09-15 11:35:55 -04:00
Jarrod Johnon	e4827408f3	Avoid recursing between keepalive and raw_command On June 10 2014 one condition was addressed that caused infinite recursion. Then it was an invalid timer that could fire in the midst of a command. The case where this could validly occur was overlooked. Address this by deferring invocation of keepalives until after command exits. If incommand indicated activity advances timeout in non custom keepalive case, then the keepalive timer will actually be correctly advanced. Change-Id: Iebe0241c1f928c4187f167f3ffa407f8c6f7fa84	2014-09-15 10:03:08 -04:00
Jeremy Stanley	471f9f8755	Work toward Python 3.4 support and testing Change-Id: Ibde15015cf665ca95e217381566baccee233ab37	2014-09-03 19:07:25 +00:00
Jarrod Johnson	737643c33c	Fix IO worker tolerance of errors If an _io_apply encounters an exception, the io worker entirely would fall apart. Encompass the key entry points in try clauses to allow the thread to keep working and the dependent IPMI object to have their waiter acknowledged. It's still considered a grave bug for this to ever occur, but at least the application would carry on. Change-Id: I61b0797025b25c6d9d3e86a5110603a6fc2d67fb	2014-08-27 11:01:21 -04:00
Jenkins	9bf8258edf	Merge "Force non-numeric for compact sensor records"	2014-08-08 12:52:55 +00:00
Jenkins	0ad769d9d4	Merge "Handle non-linear and unrecognized linearizations"	2014-08-06 13:58:35 +00:00
Jarrod Johnson	7a3096e07b	Force non-numeric for compact sensor records In the IPMI spec, compact sensors have the numeric format reserved and mandate an implementation set it to '3'. This mandate seems to have been ignored by some implementations. Force the value to be 3 for all compact sensor records and assume the reserved bits may never be used in a compact sensor. Change-Id: I88f5d7b533869809f213ab0c5379b276af50cd23	2014-08-06 09:49:28 -04:00
Jarrod Johnson	817998fe0a	Change to name-only lookups in RAKP In IPMI2, there are two modes the BMC can regard the Role parameter. It can either consider it a 'max privilege' or 'match privilege'. It defaulted to 'match privilege' in order to enhance compatibility with some earlier BMC implementations that misinterpreted the specification in a way that allowed 'match privilege' to work but 'max privilege' to break without a specific workaround for that BMC. That BMC family is pretty much out of service and if the same issue arises later, we can put in auto-detect and workaround pretty cheaply. With this in mind, change the mode to look account up by name only since that is how 99% of ipmitool invocations are done and it also is a more straightforward model. Change-Id: Ibf82b70e1b85e4e05c93365a684e21c434b4d5b4	2014-08-05 16:00:12 -04:00
Jarrod Johnson	7ceb22b8c4	Handle non-linear and unrecognized linearizations Since we cannot hope to linearize a linearizable value without understanding the formula (OEM or future spec), treat all unrecognized linearizations as non-linearizable and rely upon get sensor reading factors to determine the value. Add the capability to actually get the sensor reading factors and then pass the resultant data through the same decode_formula that would have been used had the factors been retrieved through the SDR record. Change-Id: I4c3a6bbbd6c68f7a0d19c2a7a221eb5fb57c99de	2014-08-04 11:04:44 -04:00
Jarrod Johnson	94433a2ab5	Add 'persistent' to return dict of get_bootdev This enables a caller to understand more about the nature of the boot device parameter Change-Id: Iec2e9b5be06a6340fbc1e3461532f749c972558e	2014-07-31 10:22:53 -04:00
Lucas Alvares Gomes	14573871eb	Raise IpmiException on an error setting/getting the boot device Currently if an error happens when setting/getting the boot device, pyghmi returns a dictionary with the error. Instead I think it should raise an exception just like other methods (e.g. set_power, get_power) do. Change-Id: Ifaebd5ff578ff670950e2ebe584c235d20fce145	2014-07-23 10:32:26 +01:00
Lucas Alvares Gomes	ee4cd47588	Use get() to avoid KeyError on get_bootdev() The method get() returns a value for the given key. If key is not available then returns default value None. I think that was the intention of the initial code. Change-Id: I974258822d54f7ac09bc4197eb4ec249784012e7	2014-07-23 10:22:01 +01:00
Jarrod Johnson	8740687e0f	Reduce severity of generic discrete assert to 'Ok' In practice, generic discrete sensors have not indicated good or bad health. They have most commonly been used to indicate something like a particular option being available or user disabled. This does mean that something trying to use an utterly generic discrete sensor will not trigger a health issue, but hopefully those cases leverage more informative events that do have clear 'health' connotations. There remains the chance that a sensor will rely upon the vocabulary of the text in SDR and that just cannot be avoided. Change-Id: I777b2f1300301291ca5a3aa7a6b18de1de6f9d1a	2014-07-15 15:55:14 -04:00
Jarrod Johnson	4e04b07f6e	Tolerate more privilege degradation scenarios There is some inconsistency in the way BMCs may balk at pursuing a privilege level beyond the user requesting. Add code to cope with two scenarios: -RAKP2 returning 0xd -set session privilege level returning 0x80 or 0x81 Change-Id: I500e5bbdf88b569b1f1c3f8476033be080770871	2014-06-20 15:10:12 -04:00
Jarrod Johnson	e167ee6a47	Fix concurrent raw_command calls to Session If two contexts call raw_command concurrently, there was a scenario where the first to transmit has its result overwritten by the next to send and corrupts the results of the first command. One scenario where this was encountered was when a get health call was being serviced at the same moment SOL attempted to open a console, causing one of the get sensor readings to complain that 'SOL was already active'. Address it by storing away lastresponse in a more context specific place before deasserting 'incommand' and remove instances that deasserted it earlier. Change-Id: I504da3f54562a4b65b8f4e9e20c19aed9d21a09f	2014-06-13 09:38:59 -04:00
Jarrod Johnson	3bb4e3ac09	Don't defer custom keepalive expiry on all payloads pyghmi was using any activity to defer any keepalive. If a caller has a custom keepalive, only keepalive activity should advance the keepalive expiry. Modify code to defer keepalive only if it is the generic keepalive. Change-Id: I852ad7a5de65af60fb8e11580bd2ef32896b71f6	2014-06-10 10:45:06 -04:00
Jarrod Johnson	26e8e4fdf0	Fix infinitely recursing custom keepalives Custom keepalives are called regardless of whether a command is issued or not. The rationale being that custom keepalives are checking for something specific rather than just assuring session state. Notably, SOL uses a custom keepalive to see if the payload is still active. The resultant problem was that if keepalive expired just at the time something was in the midst of a command, a session would infinitely recurse into its own keepalive. The issue was that the keepalive expiry incorrectly omitted _monotonic_time, causing expiry to always be far in the past. It normally did not break because if not incommand, send_payload was setting an appropriate value after the incorrect setting. Change-Id: Ie86e49890a6ac96ddf07206fb1b8558161c00a20	2014-06-10 09:56:36 -04:00


213 Commits