All releases tagged Code cleanup


Release Notes: The code has been reorganized into modules. Some iteration constructs have been converted to Python iterators and generators. All text processing internally is now handled as Unicode. Analyzers are back as generators of tokens. The changes to the code to make it more Pythonic appear to have resulted in trading time for space: preliminary tests indicate about a 5% speedup on one dataset in exchange for a 20% increase in memory usage.


Release Notes: Some minor changes were made for Python 2.3, although a couple of warnings about bit operations remain. This release breaks some code: field.Keyword() must now be used instead of field.Field.Keyword(). If you are using the Indexer wrapper, searches are now more accurate because the query is tokenized first.