
Unicode Utilities

The Unicode Utilities are a set of programs for manipulating and analyzing Unicode text. uniname prints, for each character, any combination of its character offset, its byte offset, its hex code value, its encoding, the glyph itself, and its name. unidesc reports the character ranges to which different portions of the text belong. unihist generates a histogram of the characters in its input. ExplicateUTF8 determines whether a sequence of bytes is a valid UTF-8 encoding and explains the result. unirev reverses UTF-8 strings. unifuzz generates test input for checking other programs' Unicode handling.
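As a rough illustration of the per-character fields listed for uniname, the following Python sketch computes the same kinds of values for a string. It is not the tool's implementation or output format; the function name `describe` and the row layout are assumptions made for this example, and the names come from whatever Unicode version the local Python build ships.

```python
import unicodedata

def describe(text):
    """Return one row per character:
    (char offset, byte offset, code point, UTF-8 bytes, glyph, name).

    Hypothetical helper for illustration only; not part of the Unicode Utilities.
    """
    rows = []
    byte_offset = 0
    for char_offset, ch in enumerate(text):
        encoded = ch.encode("utf-8")
        rows.append((
            char_offset,                        # position counted in characters
            byte_offset,                        # position counted in UTF-8 bytes
            f"U+{ord(ch):04X}",                 # hex code value
            encoded.hex(" ").upper(),           # UTF-8 encoding as hex bytes
            ch,                                 # the glyph itself
            unicodedata.name(ch, "<no name>"),  # character name, if one is assigned
        ))
        byte_offset += len(encoded)
    return rows

if __name__ == "__main__":
    for row in describe("aé€"):
        print(*row, sep="\t")
```

The character offset and the byte offset diverge as soon as the input contains a character that needs more than one byte in UTF-8, which is why it can be useful to report both.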

Tags
Implementation

Recent releases

  •  18 Feb 2009 12:13

    Release Notes: This release updates character data to Unicode version 5.1 and fixes a bug in the validation option of uniname as well as a couple of other minor bugs.

  •  04 Apr 2008 07:09

    Release Notes: This release adds a new utility, unifuzz, which generates test input for programs expecting Unicode. In addition to generating random sequences of characters, unifuzz can generate a character from each range, tokens of various potentially problematic characters and sequences, very long lines, strings with embedded nulls, and ill-formed UTF-8.

  •  30 Jun 2007 07:34

    Release Notes: This release adds an option to unidesc that causes it to list the ranges detected after reading all input rather than listing them as they are encountered. uniname now has an option that causes it to ignore characters within the Basic Multilingual Plane.

  •  30 Jan 2007 14:31

    Release Notes: This release adds the utility unirev, a filter that reverses UTF-8 strings character-by-character. The package name has changed.

  •  12 Jan 2007 11:40

    Release Notes: Uniname and unidesc now report the unofficial ranges within the Private Use Areas registered with the ConScript Unicode Registry.
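The release notes mention reversing UTF-8 strings character-by-character and generating ill-formed UTF-8. The Python sketch below shows why the two are related: reversing by code point keeps the encoding well-formed, while reversing the raw bytes usually does not. The function names are hypothetical and this is not how unirev or unifuzz are implemented; in particular, a code-point reversal like this one separates combining marks from their base characters, and the notes do not say how unirev treats that case.

```python
def reverse_utf8(data: bytes) -> bytes:
    """Reverse a UTF-8 byte string code point by code point (illustrative helper)."""
    return data.decode("utf-8")[::-1].encode("utf-8")

def is_well_formed_utf8(data: bytes) -> bool:
    """Report whether the bytes decode as strict UTF-8 (illustrative helper)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True

if __name__ == "__main__":
    sample = "añ€".encode("utf-8")           # bytes: 61 C3 B1 E2 82 AC
    by_char = reverse_utf8(sample)           # reversed by code point
    by_byte = sample[::-1]                   # reversed byte by byte
    print(is_well_formed_utf8(by_char))
    print(is_well_formed_utf8(by_byte))
```

Byte-reversed multi-byte text is one simple source of ill-formed input of the kind a program expecting Unicode should reject or handle gracefully.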
