mirror of https://github.com/xcat2/xNBA.git, synced 2025-11-04 05:12:33 +00:00
The compressor achieves an average compression rate of 60% of the
original size, which is on par with "gzip". It seems that you cannot
do much better when compressing compiled binaries. This means that
the break-even point for using compressed images is reached once the
uncompressed size approaches 1.5kB. We can stuff more than 12kB into
an 8kB EPROM and more than 25kB into a 16kB EPROM. As there is only
32kB of RAM for both the uncompressed image and its BSS area, this
means that 32kB EPROMs will hardly ever be required.

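To make the arithmetic above concrete, here is a small sketch in C.
The 60% ratio and the EPROM sizes come from the text; the ~600-byte
decompressor overhead is an assumption chosen so that the stated
1.5kB break-even point works out, not a figure from the source.

```c
/* Hypothetical ROM-footprint calculation.  Images compress to roughly
 * 60% of their original size, and the decompressor stub adds a fixed
 * overhead (assumed ~600 bytes here).  Compression pays off once
 * 0.6 * size + 600 < size, i.e. once size exceeds about 1.5kB. */
#define STUB_BYTES 600 /* assumed size of the decompressor stub */

/* Bytes occupied in the EPROM by a compressed image. */
unsigned rom_footprint(unsigned uncompressed)
{
    return uncompressed * 6 / 10 + STUB_BYTES;
}
```

Under these assumptions a 12kB image occupies 12288 * 0.6 + 600 =
7972 bytes, which indeed fits an 8kB (8192-byte) EPROM.
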
The compression algorithm uses a 4kB ring buffer for buffering the
uncompressed data. Before compression starts, the ring buffer is
filled with spaces (ASCII character 0x20). The algorithm tries to
find repeated input sequences of a maximum length of 60 bytes. All
256 different input bytes plus the 58 (60 minus a threshold of 2)
possible repeat lengths form a set of 314 symbols. These symbols are
adaptively Huffman encoded. The algorithm starts out with a Huffman
tree that assigns equal code lengths to each of the 314 symbols
(slightly favoring the repeat symbols over symbols for regular input
characters), but it will be changed whenever the frequency of any of
the symbols changes. Frequency counts are kept in 16-bit words until
the total number of compressed codes reaches 2^15. Then, all
frequency counts are halved (rounding to the bigger number).
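The halving step can be sketched as follows; the 314-symbol table
and the 2^15 trigger come from the text, while the array layout and
the function name are assumptions for illustration, not the actual
implementation.

```c
#define NSYMBOLS 314        /* 256 literals + 58 repeat lengths */
#define MAX_FREQ (1u << 15) /* halve once this many codes were output */

/* Halve all frequency counts, rounding to the bigger number so that
 * a symbol that has been seen never drops back to a zero count.
 * (freq + 1) / 2 in integer arithmetic is exactly ceil(freq / 2). */
void halve_freqs(unsigned short freq[NSYMBOLS])
{
    int i;
    for (i = 0; i < NSYMBOLS; i++)
        freq[i] = (unsigned short)((freq[i] + 1) / 2);
}
```
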
						|
characters (symbols 0..255) the Huffman code  is written to the output
 | 
						|
stream.  For repeated characters the  Huffmann code, which denotes the
 | 
						|
length of the repeated character sequence, is written out and then the
 | 
						|
index in the ring buffer is computed.   From this index, the algorithm
 | 
						|
computes  the offset   relative to  the current  index  into  the ring
 | 
						|
buffer. Thus,  for typical input data,  one would expect that short to
 | 
						|
medium range offsets are more frequent  than extremely short or medium
 | 
						|
range to long range offsets. Thus the  12bit (for a 4kB buffer) offset
 | 
						|
value  is statically Huffman encoded  using a precomputed Huffman tree
 | 
						|
that favors  those  offset  values    that  are deemed to   be    more
 | 
						|
frequent. The  Huffman encoded offset  is  written to the output  data
 | 
						|
stream,  directly  following the code  that   determines the length of
 | 
						|
repeated characters.
 | 
						|
 | 
						|
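The ring-buffer initialization and the relative-offset computation
described above can be sketched like this; the buffer size and the
space fill come from the text, while the function names and the
masking trick are illustrative assumptions.

```c
#include <string.h>

#define RING_SIZE 4096 /* 4kB ring buffer -> 12-bit offsets */

unsigned char ring[RING_SIZE];

/* Before compression starts, the ring buffer is filled with spaces
 * (0x20), so matches against runs of spaces are possible from the
 * very first input byte. */
void ring_init(void)
{
    memset(ring, 0x20, RING_SIZE);
}

/* Offset of a match position relative to the current write index,
 * wrapped into the 12-bit range.  This is the value that is then
 * statically Huffman encoded with the precomputed tree. */
unsigned ring_offset(unsigned current, unsigned match_pos)
{
    return (current - match_pos) & (RING_SIZE - 1);
}
```

Masking with RING_SIZE - 1 works because the buffer size is a power
of two, so the subtraction wraps around the ring correctly even when
the match position lies "ahead" of the current index.
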
This algorithm, as implemented in the C example code, looks very
good, and its operating parameters are already well optimized. This
also explains why it achieves compression ratios comparable with
"gzip". Depending on the input data, it sometimes excels
considerably beyond what "gzip -9" does, but this phenomenon does
not appear to be typical. There are some flaws in the algorithm,
such as the limited buffer sizes, the adaptive Huffman tree, which
takes very long to change if the input characters experience a
sudden change in distribution, and the static Huffman tree for
encoding offsets into the buffer. The slow changes of the adaptive
Huffman tree are partially counteracted by artificially keeping
16-bit precision for the frequency counts, but this does not come
into play until 32kB of compressed data has been output, so it does
not have any impact on our use for "etherboot", because the boot
PROM does not support uncompressed data of more than 32kB
(c.f. doc/spec.doc).

Nonetheless, these problems do not seem to affect compression of
compiled programs very much. Mixing object code with English text
would not work too well, though, and the algorithm should be reset
in between. Actually, we might gain a little improvement if text and
data segments were compressed individually, but I have not
experimented with this option yet.