Catdoc is an MS Word file decoding tool. It does not attempt to analyze file formatting and only extracts the readable text, but it can handle all versions of Word and convert character encodings. It can also read RTF files and convert Excel and PowerPoint files. A Tcl/Tk graphical viewer is included.
| Tags | Text Processing |
|---|---|
| Licenses | GPL |
| Operating Systems | Linux (32- and 64-bit), BSD |
| Implementation | C |
Last announcement
Catdoc has a new maintainer, Nick Bane, and is now hosted as part of the Alioth project supported by Debian.
New VCS content will appear in due ...
Recent releases


Release Notes: This release fixes codepage and charset bugs, handles negative numbers correctly on 64-bit architectures, and fixes a Macintosh MS1904 date bug in xlsparse.
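
For readers unfamiliar with the Macintosh 1904 date system mentioned in this release note, the following is a minimal Python sketch of how spreadsheet date serials map to calendar dates under the two Excel date systems. It is not catdoc's own code (catdoc is written in C), and the names `serial_to_date` and `use_1904` are hypothetical, chosen only for illustration.

```python
from datetime import date, timedelta

# 1900 system: 1899-12-30 is the usual base because Excel serials count a
# nonexistent 1900-02-29, so this base is only correct for serials >= 61.
EPOCH_1900 = date(1899, 12, 30)
# 1904 system (historically the Macintosh default): serial 0 is 1904-01-01.
EPOCH_1904 = date(1904, 1, 1)


def serial_to_date(serial, use_1904=False):
    """Convert a whole-day spreadsheet serial to a calendar date.

    Hypothetical helper for illustration; the fractional (time-of-day)
    part of the serial is ignored.
    """
    base = EPOCH_1904 if use_1904 else EPOCH_1900
    return base + timedelta(days=int(serial))


# The same calendar day has serials that differ by 1462 between the systems.
print(serial_to_date(1462))                # 1900 system
print(serial_to_date(0, use_1904=True))    # 1904 system
```

Reading a workbook with the wrong system shifts every date by 1462 days (about four years), which is why a parser has to honor the workbook's date-system setting.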


Release Notes: A catppt utility for viewing PowerPoint files was added. Processing of Mac charsets and dates was improved.


Release Notes: Many bugs in the RTF parser and xls2csv have been fixed. Two features have been added: a customizable page separator for multi-page spreadsheets, and a command-line switch to specify the maximum precision of floating-point numbers (the default is now to output as many digits as there are). A bug with reading pre-OLE Word/Write files and text files (Debian bug #255625) has been fixed.