In at least one ARM UEFI, the PXE support is non-compliant,
and absolutely requires the filename field to be populated.
That said, the same firmware fails to let iPXE function, so
we have other issues with this firmware.
Very preliminary testing indicates that PXE-compliant firmware
ignores the filename in the offer and proceeds to proxyDHCP.
The non-compliant firmware ignores the PXEClient guidance
and just consumes the filename.
In theory, this just means that compliant firmware inflicts
a bit of redundant effort on our part, as we must work out the filename
to offer twice (once for DHCP, then again for proxyDHCP). It may help
non-compliant firmware proceed in more limited circumstances.
Check whether deployment can reasonably proceed using IPv4 or IPv6,
whether the obstacle is the management node or the node IP addresses,
and give a warning with some suggestions of what to check.
Also, add nodeinventory <node> -s as an example resolution for a missing
uuid.
This opens the door for normalized common sensors
for clients that care about the semantics but
cannot keep track of inconsistent sensor names from
implementation to implementation.
Unfortunately, Apache can get a bit odd over how it
reports a non-viable open socket for keepalive, which
can happen in certain timing windows.
Disable the keepalive feature and take some performance penalty in
browsers for the sake of more consistent return behavior and
fewer idle greenthreads doing nothing.
By default, the Python yaml module uses its 'pure python' implementation,
which is torturously slow.
As a test, a yaml dump of a 17,000 element list took 70 seconds in the default configuration.
Opting into the C functions brings that time down to 10 seconds, a
nice and easy improvement for generic yaml.
For dumping a simple dumb list (e.g. the nodelist for ssh), a special
case yaml-looking result is generated, which hits 0.4 seconds on that same
test. So this special case is added to nodelist, which can be very long
and very much in demand at the same time.
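As a rough illustration of the two paths described above, here is a minimal
sketch. The function names are hypothetical and not taken from the actual
change. It assumes PyYAML for the generic path, where CSafeDumper is only
present if PyYAML was built against libyaml, and it assumes every list item
is a plain token (such as a node name) that needs no quoting.

```python
def dump_simple_list(items):
    """Render a flat list of plain strings as YAML-looking text.

    Assumes each item is a simple token needing no quoting or escaping;
    anything else should go through a real YAML dumper.
    """
    return ''.join('- {0}\n'.format(item) for item in items)


def dump_generic(data):
    """Dump arbitrary data, preferring the C dumper when it is available."""
    import yaml  # PyYAML, imported lazily so the fast path has no dependency
    dumper = getattr(yaml, 'CSafeDumper', yaml.SafeDumper)
    return yaml.dump(data, Dumper=dumper, default_flow_style=False)


nodes = ['n{0}'.format(i) for i in range(1, 4)]
print(dump_simple_list(nodes), end='')
```

The fast path skips all type inspection, which is why it is only safe for
lists known in advance to hold simple names.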
This will take care of padding when
padding is consistent across a range.
However, we still have a problem with a progression like:
01
02
...
98
099
100
Here, numbers in the middle of the progression unexpectedly start being zero padded before any extra leading digit is needed.
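To make the distinction concrete, here is a small sketch of one way to test
whether a set of numeric strings follows a single padding convention. The
function name and the exact rule (all values the same width, or no value
with a leading zero) are assumptions for illustration, not the rule used by
the actual code.

```python
def padding_is_consistent(numbers):
    """Return True if the numeric strings share one padding convention.

    Either every value has the same width (zero padded to that width),
    or no value carries a leading zero at all (unpadded).
    """
    widths = {len(n) for n in numbers}
    if len(widths) <= 1:
        return True
    return not any(len(n) > 1 and n.startswith('0') for n in numbers)


print(padding_is_consistent(['01', '02', '98']))
print(padding_is_consistent(['98', '099', '100']))
```

Under this rule the 01..98 portion is consistent on its own, while the
98, 099, 100 portion is the problematic case.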
For one, shorten the DNS timeout; if the DNS server is completely out, give up quickly.
For another, if a host has a large number of net.X.hostnames, doing the lookups
sequentially was intolerably slow.
Have each network be evaluated in its own greenthread so that
the DNS latency is incurred concurrently.
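The concurrency idea can be sketched as follows. This is not the greenthread
code from the change: it swaps in the standard library's ThreadPoolExecutor
to stay self-contained, the function names are hypothetical, and it does not
show the shortened DNS timeout.

```python
import socket
from concurrent.futures import ThreadPoolExecutor


def lookup(hostname):
    """Resolve one hostname, returning None on failure rather than raising."""
    try:
        return socket.getaddrinfo(hostname, 0)[0][4][0]
    except (OSError, UnicodeError):
        return None


def resolve_all(hostnames, resolver=lookup):
    """Resolve every hostname concurrently, so total time tracks the slowest."""
    hostnames = list(hostnames)
    if not hostnames:
        return {}
    with ThreadPoolExecutor(max_workers=len(hostnames)) as pool:
        return dict(zip(hostnames, pool.map(resolver, hostnames)))
```

The resolver is a parameter so that a slow or failing DNS server can be
simulated without touching the network.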
Since the multi-iterator ambition is out,
ditch the expensive set wrangling step.
Now the procedure is:
-Collapse nodes into groups, where possible
-Separately for groups and nodes:
  -Sort the elements
  -Chunk the elements based on 'non-numerical' situation matching
  -Analyze the iterators to apply [] to shorten the name
-Multi-iterator will cause a discontinuity, and a new ',' delimited name gets constructed
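The sort, chunk and shorten steps above can be sketched roughly as follows.
This is a simplified illustration, not the actual implementation: it only
considers a single trailing number per name, skips the group collapsing
step, and the ':' range separator and all function names are assumptions.

```python
import re

_TRAILING = re.compile(r'^(.*?)(\d+)$')


def _padded(digits):
    return len(digits) > 1 and digits.startswith('0')


def _continues(prev, cur):
    """True if cur extends the run ending at prev (same prefix, next value)."""
    if cur[0] != prev[0] or cur[1] != prev[1] + 1:
        return False
    same_width = len(cur[2]) == len(prev[2])
    return same_width or not (_padded(cur[2]) or _padded(prev[2]))


def _emit(run):
    prefix, _, first = run[0]
    if len(run) == 1:
        return prefix + first
    return '{0}[{1}:{2}]'.format(prefix, first, run[-1][2])


def abbreviate(names):
    """Shorten names sharing a prefix and consecutive trailing numbers."""
    plain, numbered = [], []
    for name in set(names):
        match = _TRAILING.match(name)
        if match:
            prefix, digits = match.groups()
            numbered.append((prefix, int(digits), digits))
        else:
            plain.append(name)
    pieces = sorted(plain)
    run = []
    for entry in sorted(numbered):
        # Any discontinuity closes the current run and starts a new name
        if run and not _continues(run[-1], entry):
            pieces.append(_emit(run))
            run = []
        run.append(entry)
    if run:
        pieces.append(_emit(run))
    return ','.join(pieces)


print(abbreviate(['n3', 'n1', 'n2', 'n5', 'r1']))
```

Each break in prefix, numeric sequence or padding style ends a run, which is
where a new ',' delimited element begins.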
There are too many cases that can go wrong.
Note that with this lower ambition, it would be possible to
significantly streamline the implementation.
Notably, the 'find discontinuities' approach was selected to *try* to
support multiple iterators, but since that didn't pan out, a more
straightforward numerical strategy can be used from the outset.
The info is hard to put together client side, but
supremely easy server side.
Provide a nice call to get the layout for a noderange, similar to
(but better than) the current GUI code.
Now the GUI can get a nice canned JSON description of the layout.
Consult collective.manager to decide whether to skip consideration
of a node, if that node shouldn't be managed anyway.
This should avoid "cross-island" behavior for such
environments.
Setting attributes can be a touch expensive. Since there's a high risk
of this being old news, check that discovery hasn't already set values
before trying to set them again.
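A minimal sketch of the check-before-set idea, using plain dictionaries. The
function name and the attribute names are purely illustrative; the real code
works against the configuration manager rather than a dict.

```python
def pending_updates(current, desired):
    """Return only the attributes whose stored value differs from desired."""
    return {key: value for key, value in desired.items()
            if current.get(key) != value}


# Illustrative attribute names only
current = {'attr.a': 'x', 'attr.b': 'y'}
desired = {'attr.a': 'x', 'attr.b': 'z'}
updates = pending_updates(current, desired)
if updates:
    print('would set', sorted(updates))
```

When discovery has already recorded everything, the result is empty and the
expensive set operation can be skipped entirely.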
The ssh module was pausing input for the
entire websocket while doing the simple 'write' operation.
Change to background the actual
logon processing,
rather than blocking what should be a fairly trivial write operation.
For one, remove 'non-voting' members from being leaders.
A large number of leader candidates creates long delays in
converging on a valid organization. Further, some treat 'non-voting'
members more roughly, inducing the worst-case convergence scenario of an
unclean shutdown of the leader.
Convergence now happens fairly quickly for collectives with a large
number of non-voting members.
During initial DB transfer, the leader would be tied up unreasonably
long handling the jsonification of a large configuration. Offload to a worker
process to allow the leader to continue operation while this intensive, rare
operation occurs.
Reliably run a reassimilation procedure for the lifetime of the leader.
This allows orphaned members to be prompted to join the correct leader.
Serialize the onboarding of a connecting member, and have redundancy more gracefully
paused. This avoids excessive waiting on the lock and gives more deterministic timing
with respect to timeout expectations by the connecting system.