The Universal Text Recognizer and Converter (Utrac) is a command-line tool and a C library that recognizes the encoding of an input file (UTF-8, ISO-8859-1, CP437, etc.) and its end-of-line type (CR, LF, or CRLF). It features automatic recognition (which depends on the file and on the system's locale, and is reliable in most cases), assistance for verification or manual recognition, and conversion to another charset and/or end-of-line type.
Tags: Software Development, Internationalization, Libraries, Text Processing, Linguistic, Utilities
Release Notes: The command-line tool is fully usable. The library documentation was updated. Two tutorials have been written, one for the tool and one for the library. Supported character sets include ASCII, UTF-8, ISO-8859-xx, CP125x, CP4/7/8xx, MacXxxxx, and KOI8-x.
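
To illustrate what recognizing an end-of-line type (CR, LF, or CRLF) can involve, here is a minimal sketch in C. It is not Utrac's code or API: the names `eol_type` and `detect_eol` are hypothetical, and the rule used (count each kind of line ending and pick the most frequent, preferring CRLF and then LF on ties) is an assumption made for this example.

```c
#include <stddef.h>

/* Hypothetical type and function names, for illustration only;
 * they are not part of the Utrac library. */
typedef enum { EOL_NONE, EOL_LF, EOL_CR, EOL_CRLF } eol_type;

/* Count each kind of line ending in a buffer and return the most frequent. */
static eol_type detect_eol(const unsigned char *buf, size_t len)
{
    size_t lf = 0, cr = 0, crlf = 0;

    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '\r') {
            if (i + 1 < len && buf[i + 1] == '\n') {
                crlf++;
                i++;        /* skip the LF that belongs to this CRLF pair */
            } else {
                cr++;       /* lone CR (classic Mac OS style) */
            }
        } else if (buf[i] == '\n') {
            lf++;           /* lone LF (Unix style) */
        }
    }

    if (lf == 0 && cr == 0 && crlf == 0)
        return EOL_NONE;    /* no line ending found in the buffer */
    if (crlf >= lf && crlf >= cr)
        return EOL_CRLF;    /* ties are resolved in favor of CRLF */
    if (lf >= cr)
        return EOL_LF;
    return EOL_CR;
}
```

A caller would read part of the file in binary mode and pass the bytes to `detect_eol`. Files with mixed line endings are where assisted verification, as mentioned in the description, becomes useful, since a simple majority count can hide the minority style.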