Release Notes: Testing on a 128-core HP SuperDome showed that a known bottleneck in the multiple-workers decompressor is significant on many-core machines: whenever there were fewer input blocks than cores, the work was distributed unevenly. Hence, the splitter-to-workers queue of "scan and decompress" tasks was replaced with two queues: a low-priority, splitter-to-workers queue of "scan" tasks, and a high-priority, workers-to-workers queue of "decompress" tasks. Alas, this also increased the number of context switches. The new worker broadcast conditions were formally proven in the comments.
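The two-queue arrangement can be illustrated with a minimal sketch. It is written in Python for brevity and is not lbzip2's actual code: the class name `TwoLevelQueue` and its methods are hypothetical, and the sketch uses a plain `notify()` rather than the broadcast conditions mentioned in the note. It only shows the priority rule, namely that a worker takes a pending "decompress" task before any "scan" task.

```python
import threading
from collections import deque


class TwoLevelQueue:
    """Hypothetical sketch: workers prefer "decompress" over "scan" tasks."""

    def __init__(self):
        self._cond = threading.Condition()
        self._scan = deque()        # low priority, filled by the splitter
        self._decompress = deque()  # high priority, filled by the workers

    def put_scan(self, task):
        with self._cond:
            self._scan.append(task)
            self._cond.notify()

    def put_decompress(self, task):
        with self._cond:
            self._decompress.append(task)
            self._cond.notify()

    def get(self):
        """Block until a task is available, then return (kind, task)."""
        with self._cond:
            while not self._scan and not self._decompress:
                self._cond.wait()
            # High-priority work is always served first.
            if self._decompress:
                return ("decompress", self._decompress.popleft())
            return ("scan", self._scan.popleft())
```

Because scanning one input block can yield several "decompress" tasks that any worker may pick up, the work spreads across cores even when there are fewer input blocks than workers.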
Release Notes: Now the Makefiles, with the help of the standard getconf utility, select a programming environment, if there is one, in which large files are supported.
Release Notes: In the multiple-workers decompressor, the tail pointer of the splitter-to-workers queue proved to be private to the splitter. Accordingly, said pointer was eliminated as a shared resource, simplifying the code.
Release Notes: A manual page that uses the man macro package was added. The README file describes the compressed output and why there is only one exit status for all types of errors. A more portable Makefile was created. A little code was cleaned up.
Release Notes: A class of valid bz2 files that the multiple-workers decompressor (MWD) may reject is now described in the Bugs section of the README. By concatenating empty bzip2 streams (each 14 bytes long) and optionally inserting such a sequence before, after, or between non-empty bzip2 streams, the size of the input block that unavoidably contains an entire bzip2 block header can be increased without bound. This invalidates the assumption on which the MWD is based. However, neither bzip2 nor lbzip2 creates such files, and sets of bz2 files that defeat the MWD when concatenated should be rare.
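The kind of file described in this note can be reproduced with Python's standard `bz2` module, as a rough illustration. The helper name `pad_with_empty_streams` and the payload are made up for the example; whether a given lbzip2 version accepts or rejects the result is not tested here.

```python
import bz2

# Compressing zero bytes yields a stream with a header and trailer but no block.
EMPTY_STREAM = bz2.compress(b"")
payload = bz2.compress(b"hello, world\n")


def pad_with_empty_streams(stream: bytes, before: int, after: int) -> bytes:
    """Surround a bzip2 stream with runs of empty bzip2 streams."""
    return EMPTY_STREAM * before + stream + EMPTY_STREAM * after


# About 14 KB of empty streams precede the first real block header.
padded = pad_with_empty_streams(payload, before=1000, after=3)

# The result is still a valid multi-stream bz2 file for a serial decoder.
restored = bz2.decompress(padded)
```

Raising `before` pushes the first block header arbitrarily far into the input, which is the situation the note describes.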
Release Notes: When decompressing with a single worker thread, lbzip2 was previously 45% slower than standard bzip2. The new, dedicated single-worker decompressor is only 3% slower, and provides input and output buffering, which is useful in pipelines and on network file systems. Hence, using lbzip2 incurs virtually no performance penalty over bzip2 even on a single-core machine. A script was added to help automated testing. Some thread notification conditions have been cleaned up. This release compresses an empty file to a valid bzip2 stream instead of producing empty output.
Release Notes: The decompressor was redesigned: all CPU-bound operations were moved into the worker threads, so that now, besides the muxer, the splitter is purely I/O-bound too. Lbzip2 supports tracing its memory allocation with the new "-t" option. Both the compressor and decompressor were retested.
Release Notes: Decompression was extracted from the split-work-multiplex skeleton into a separate module. Compression was added. The project has been renamed to lbzip2. The reordering of processed sub-blocks now happens entirely in the multiplexer, reducing the time complexity of the critical section shared by the workers and the muxer from O(log n) to O(1). The command line conforms to the utility syntax guidelines. Lbzip2 queries the number of online processors if sysconf() supports it. Block serial numbers have a fixed 64-bit width. The README file was updated. The development status has been advanced to Beta.
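One way such a reordering scheme could be arranged is sketched below in Python. This is an assumption for illustration, not a description of lbzip2's actual data structures: the class `Reorderer` and its methods are hypothetical. Workers hand over finished sub-blocks with a constant-time append under the lock, and the muxer does the O(log n) ordering work on a private heap outside the lock.

```python
import heapq
import threading


class Reorderer:
    """Hypothetical sketch: O(1) hand-off under the lock, ordering in the muxer."""

    def __init__(self):
        self._lock = threading.Lock()
        self._inbox = []   # shared with the workers, unordered
        self._heap = []    # private to the muxer
        self._next = 0     # serial number that must be written next

    def submit(self, serial, data):
        """Called by a worker: a constant-time append inside the lock."""
        with self._lock:
            self._inbox.append((serial, data))

    def drain(self):
        """Called by the muxer: return the sub-blocks that are ready, in order."""
        with self._lock:
            batch, self._inbox = self._inbox, []  # constant-time swap
        for item in batch:                         # O(log n) each, outside the lock
            heapq.heappush(self._heap, item)
        ready = []
        while self._heap and self._heap[0][0] == self._next:
            ready.append(heapq.heappop(self._heap)[1])
            self._next += 1
        return ready
```

A sub-block that arrives early simply waits in the muxer's heap until all of its predecessors have been emitted.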
Release Notes: After running lbunzip2 on the bz2 test material of CERT-FI 20469, a bug was fixed where a worker (decompressor) thread could get into an infinite loop, spinning until finally outrunning the multiplexer thread, then consuming all available memory and exiting.
Release Notes: Version 0.01 didn't throttle the decompressor threads when the multiplexer thread was blocked on the write() system call, so memory consumption could grow without bound. This is fixed now. Some performance testing was done on five multicore machines (Alpha, Athlon, Itanium, Sparc, and Xeon). The README file was rewritten.
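A common way to throttle producers in this situation is a fixed pool of output slots. The Python sketch below shows that general technique under stated assumptions; it is not taken from lbzip2, and the class `OutputSlots` and its `limit` parameter are hypothetical. A worker must obtain a slot before producing a block, and the muxer returns the slot once write() has consumed the block, so at most `limit` blocks are ever pending.

```python
import threading


class OutputSlots:
    """Hypothetical throttle: at most `limit` decompressed blocks may be pending."""

    def __init__(self, limit):
        self._free = threading.Semaphore(limit)

    def acquire(self, timeout=None):
        """Worker side: wait for a free slot; return False if the wait timed out."""
        return self._free.acquire(timeout=timeout)

    def release(self):
        """Muxer side: call after write() has consumed one block."""
        self._free.release()
```

While the muxer is blocked in write(), the workers stall in `acquire()` as soon as all slots are taken, which bounds memory use by roughly `limit` times the maximum block size.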