unac


unac is a C library and command-line tool that removes accents from a string; for instance, the string été becomes ete. The command reads its input from standard input or from a string given as an argument. Both the library function and the command take the charset of the input as an argument. The input is converted to UTF-16 using iconv(3), the accents are stripped, and the result is converted back to the original charset. On GNU/Linux, the iconv -l command lists all supported charsets. Perl, PHP3, and PHP4 interfaces are currently available.
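The convert, strip, convert-back pipeline described above can be sketched directly on top of iconv(3). This is a minimal illustration, not unac's own code: the names sketch_unaccent, sketch_convert, and sketch_unac are hypothetical, the mapping table covers only a handful of Latin-1 letters, and it assumes the platform's iconv knows the "UTF-16BE" charset name.

```c
#include <assert.h>
#include <iconv.h>
#include <stdlib.h>

/* Hypothetical, deliberately tiny mapping for a few Latin-1 letters.
   Anything not listed here is returned unchanged. */
static unsigned sketch_unaccent(unsigned c)
{
    switch (c) {
    case 0x00E0: case 0x00E1: case 0x00E2: case 0x00E4: return 'a'; /* à á â ä */
    case 0x00E7: return 'c';                                        /* ç */
    case 0x00E8: case 0x00E9: case 0x00EA: case 0x00EB: return 'e'; /* è é ê ë */
    case 0x00F9: case 0x00FA: case 0x00FB: case 0x00FC: return 'u'; /* ù ú û ü */
    default: return c;
    }
}

/* Run one complete iconv conversion into a caller-supplied buffer.
   Returns the number of bytes written, or (size_t)-1 on any error. */
static size_t sketch_convert(const char *to, const char *from,
                             const char *in, size_t in_len,
                             char *out, size_t out_cap)
{
    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t)-1)
        return (size_t)-1;
    char *src = (char *)in;
    char *dst = out;
    size_t src_left = in_len, dst_left = out_cap;
    size_t rc = iconv(cd, &src, &src_left, &dst, &dst_left);
    if (rc != (size_t)-1)
        rc = iconv(cd, NULL, NULL, &dst, &dst_left); /* flush any shift state */
    iconv_close(cd);
    return rc == (size_t)-1 ? (size_t)-1 : out_cap - dst_left;
}

/* Returns a malloc'ed, NUL-terminated copy of `in` (encoded in `charset`)
   with the accents known to sketch_unaccent removed, or NULL on error. */
char *sketch_unac(const char *charset, const char *in, size_t in_len)
{
    assert(charset != NULL && in != NULL);

    /* Step 1: input charset -> UTF-16BE (generous buffer estimate). */
    size_t cap16 = 4 * in_len + 4;
    char *utf16 = malloc(cap16);
    if (utf16 == NULL)
        return NULL;
    size_t n16 = sketch_convert("UTF-16BE", charset, in, in_len, utf16, cap16);
    if (n16 == (size_t)-1) {
        free(utf16);
        return NULL;
    }

    /* Step 2: rewrite each 16-bit unit in place. */
    for (size_t i = 0; i + 1 < n16; i += 2) {
        unsigned c = ((unsigned)(unsigned char)utf16[i] << 8)
                   | (unsigned char)utf16[i + 1];
        c = sketch_unaccent(c);
        utf16[i] = (char)(c >> 8);
        utf16[i + 1] = (char)(c & 0xFF);
    }

    /* Step 3: UTF-16BE -> original charset. */
    size_t out_cap = 2 * n16 + 16;
    char *out = malloc(out_cap + 1);
    size_t n = out ? sketch_convert(charset, "UTF-16BE", utf16, n16, out, out_cap)
                   : (size_t)-1;
    free(utf16);
    if (n == (size_t)-1) {
        free(out);
        return NULL;
    }
    out[n] = '\0';
    return out;
}
```

Under these assumptions, sketch_unac("ISO-8859-1", "\xe9t\xe9", 3) yields a buffer holding ete that the caller must free. Unlike the library as described in its release notes, this sketch simply returns NULL when a character cannot be converted, instead of substituting a space.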


Recent releases

  •  03 Sep 2002 06:09

Release Notes: A Unicode 3.2 bug was fixed, debug information was added, and the information level from the regression test was improved. The manual pages were also rewritten for clarity and content.

  •  06 Jul 2002 14:02

Release Notes: This release upgrades from Unicode 3.0.1 to Unicode 3.2 and updates the autotools files.

  •  19 Jul 2001 14:51

Release Notes: This release has better detection of the iconv library using the AM_ICONV macro. Autotools files have been upgraded, and there are minor documentation upgrades.

  •  30 Jan 2001 06:14

Release Notes: When unac_string finds an illegal sequence while converting, it now replaces it with a space. For instance, the ISO-8859-1 character ¼ is converted to 1 4 (one, space, four) because the fraction slash character (U+2044) in its decomposition does not exist in ISO-8859-1. The new unac_version function returns the version number.

  •  30 Jan 2001 06:14

Release Notes: This release adds support for systems that do not define UTF-16BE but only UTF-16, which is then assumed to be big-endian; as a result, it works with both glibc-2.1.3 and glibc-2.1.94. It also fixes an occasional allocation bug, allocates the returned buffer even when the input is an empty string, and adds more regression tests.

