232 projects tagged "Text Processing"

Asymptote (updated 20 May 2013)

Pop 740.43
Vit 1,111.82

Asymptote is a powerful descriptive 2D and 3D vector graphics language for technical drawing, inspired by MetaPost but with an improved C++-like syntax. It provides for figures the same high-quality level of typesetting that LaTeX does for scientific text. Asymptote is a programming language as opposed to just a graphics program. It can exploit the best features of script (command-driven) and graphical user interface (GUI) methods. High-level graphics commands are implemented in the language itself, allowing them to be easily tailored to specific applications.

Recoll (updated 14 May 2013)

Pop 322.43
Vit 129.08

Recoll is a personal full-text desktop search tool based on Xapian. It is easy to use and administer, feature-rich, and has a Qt-based GUI. Text, HTML, PDF, PostScript, MS Word, OpenOffice, WordPerfect, KWord, AbiWord, maildir, and mailbox mail folder formats are supported, along with their compressed versions and quite a few others. Powerful query facilities are provided. Multiple character sets are supported, and internal processing and storage use Unicode UTF-8. Stemming is performed at query time, and the stemming language can be switched after indexing.

Sanzang (updated 12 May 2013)

Pop 84.89
Vit 1.98

Sanzang is a compact and simple cross-platform machine translation system. It is especially useful for translating from the CJK languages (Chinese, Japanese, and Korean), and it is very suitable for working with ancient and otherwise difficult texts. Unlike most other machine translation systems, Sanzang is small and approachable. Any user can develop his or her own translation rules, and these rules are simply stored in a text file and applied at runtime.
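The idea of keeping translation rules in a plain text file and applying them at runtime can be illustrated with a minimal Python sketch. This is not Sanzang's own code or file format: the `source|target` line layout, the `#` comment convention, the longest-match-first strategy, and the names `parse_rules`, `translate`, and `RULES_TEXT` are all assumptions made for illustration.

```python
def parse_rules(text, delimiter="|"):
    """Parse lines of the (assumed) form 'source|target' into rule pairs."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        source, _, target = line.partition(delimiter)
        if source and target:
            rules.append((source, target))
    # Try longer source terms first so phrases win over their substrings.
    rules.sort(key=lambda rule: len(rule[0]), reverse=True)
    return rules


def translate(text, rules):
    """Scan left to right, replacing the longest matching term at each position."""
    out = []
    i = 0
    while i < len(text):
        for source, target in rules:
            if text.startswith(source, i):
                out.append(" " + target + " ")
                i += len(source)
                break
        else:
            out.append(text[i])  # no rule matched: keep the character as-is
            i += 1
    # Collapse the padding spaces added around each replacement.
    return " ".join("".join(out).split())


# Hypothetical rule table; in practice this text would be read from a file.
RULES_TEXT = """
# source|target
三藏|Tripitaka
法師|Dharma master
三|three
"""

if __name__ == "__main__":
    rules = parse_rules(RULES_TEXT)
    print(translate("三藏法師", rules))
```

Because the rules live in ordinary text, changing a translation means editing one line and rerunning; no code needs to be touched.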

GNU Parallel (updated 30 Apr 2013)

Pop 544.86
Vit 50.79

GNU parallel is a shell tool for executing jobs in parallel, either locally or on remote computers. A job is typically a single command or a small script that has to be run for each line of the input. The typical input is a list of files, hosts, users, URLs, or tables. If you use xargs today, you will find GNU parallel very easy to use, as it is written to have the same options as xargs. If you write loops in shell, you will find that GNU parallel may be able to replace most of them and make them run faster by running several jobs in parallel. If you use ppss or pexec, you will find that GNU parallel often makes the command easier to read. GNU parallel makes sure the output from the commands is the same as the output you would get had you run the commands sequentially. This makes it possible to use output from GNU parallel as input for other programs.
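The pattern described here, one job per input line with output that reads as if the jobs had run sequentially, can be sketched in Python. This is only an analogy for the concept, not GNU parallel itself and not its implementation; the helper names `run_job` and `run_parallel`, the sample job (a child Python process that upper-cases its argument), and the input list are hypothetical.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_job(line):
    """Run one command for one input line and return its captured output."""
    # Hypothetical job: a child process that upper-cases its argument.
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1].upper())", line],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def run_parallel(lines, jobs=4):
    """Run run_job for every input line, up to `jobs` at a time."""
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        # map() returns results in input order, whatever order jobs finish in,
        # so the combined output matches a sequential run.
        return list(pool.map(run_job, lines))


if __name__ == "__main__":
    for output in run_parallel(["a.txt", "b.txt", "c.txt"]):
        sys.stdout.write(output)
```

Each job's output is captured whole and emitted only after the job finishes, so lines from different jobs never interleave and the result can safely be piped into another program.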

poppler (updated 14 Apr 2013)

Pop 536.63
Vit 69.41

Poppler is a PDF rendering library derived from xpdf. It has been enhanced to utilize modern libraries, and new features have been added. It also provides basic command line utilities.

SILVERCODERS DocToText (updated 08 Mar 2013)

Pop 205.33
Vit 16.47

SILVERCODERS DocToText is a utility that converts documents in many formats to plain text. It includes a console application and a C/C++ library, which allows text extraction to be embedded in other applications. It supports MS Office binary formats (MS Word (DOC), MS Excel (XLS), MS PowerPoint (PPT), and Rich Text Format (RTF)), OpenDocument formats (text documents (ODT), spreadsheets (ODS), and presentations (ODP)), Office Open XML formats (MS Word (DOCX), MS Excel (XLSX), and MS PowerPoint (PPTX)), and HyperText Markup Language (HTML). DocToText can extract text not only from the document body but also from annotations (comments) embedded in ODT, DOC, DOCX, or RTF files, and it can read metadata such as author, last modification date, or number of pages. It can be used as a fast console viewer, and it is able to convert corrupted OpenDocument and Office Open XML documents, so it can recover text even when other recovery methods have failed.

gjots (updated 24 Feb 2013)

Pop 368.42
Vit 38.62

gjots lets you organize text notes in a convenient, hierarchical way. It can be used for notes, jottings, bits and pieces, recipes, and even PINs and passwords, using encryption. It can also be used to "mind-map" larger compositions such as manuals, Web pages, and articles. It is a bit like the KDE program "kjots", but uses the GTK library and supports a hierarchy of folders. Files can be output to HTML with an automatic table of contents or to DocBook XML. Encryption is supported with ccrypt(1), gpg(1), and openssl(1), so that musings can be kept private.

oXygen XML Editor (updated 15 Feb 2013)

Pop 416.04
Vit 44.59

oXygen is an XML editor that supports any XML document and works with XML Schemas, DTDs, Relax NG schemas, and NRL schemas. It has powerful transformation support that allows you to edit XSLT and XSL-FO documents and to obtain documents in the desired output format (such as HTML, PS, or PDF) with just one click. It also includes a complete Subversion client, support for flattening XML Schemas, an XML Schema instance generator, integration with the X-Hive/DB, MarkLogic, and TigerLogic XML databases, editing actions on the diagram, and a rename refactoring action.

oXygen XML Developer (updated 14 Feb 2013)

Pop 46.26
Vit 9.78

oXygen XML Developer is an oXygen distribution specially tuned for XML development, providing XML editing, XML conversion, XML Schema development, XSLT/XQuery/XPath execution and debugging, SOAP and WSDL testing, native XML and relational database support, and XML instance generation.

TAMS Analyzer (updated 12 Feb 2013)

Pop 285.10
Vit 66.26

TAMS (Text Analysis Markup System) Analyzer is a qualitative or ethnographic coding and data extraction-analysis system.
