2131 projects tagged "Text Processing"

No download No website Updated 11 Nov 2011 ExternalSort

Pop 16.94
Vit 30.82

ExternalSort is a PHP class that sorts large files, much like the Unix sort command. It reads the input in small buckets, each associated with a temporary file, so that it stays within the configured PHP memory limit. The buckets are sorted individually and then merged to produce the final sorted output. The class also provides command line options, so it can be run as a command from a shell.
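The bucket-and-merge approach described above can be sketched as follows. This is a minimal illustration in Python, not the ExternalSort class itself (which is written in PHP); the function name `external_sort` and the `lines_per_bucket` parameter are assumptions made for the example.

```python
import heapq
import os
import tempfile
from itertools import islice


def external_sort(src_path, dst_path, lines_per_bucket=100_000):
    """Sort the lines of src_path into dst_path using bounded memory.

    Hypothetical helper: names and defaults are illustrative only.
    """
    bucket_paths = []
    try:
        with open(src_path, "r", encoding="utf-8") as src:
            while True:
                # Read one bucket that is small enough to sort in memory.
                bucket = list(islice(src, lines_per_bucket))
                if not bucket:
                    break
                # The last line of a file may lack a trailing newline.
                bucket = [ln if ln.endswith("\n") else ln + "\n" for ln in bucket]
                bucket.sort()
                # Write the sorted bucket to its own temporary file.
                fd, path = tempfile.mkstemp(suffix=".bucket", text=True)
                with os.fdopen(fd, "w", encoding="utf-8") as tmp:
                    tmp.writelines(bucket)
                bucket_paths.append(path)

        # k-way merge: heapq.merge reads the sorted buckets lazily,
        # so only one line per bucket is held in memory at a time.
        files = [open(p, "r", encoding="utf-8") for p in bucket_paths]
        try:
            with open(dst_path, "w", encoding="utf-8") as dst:
                dst.writelines(heapq.merge(*files))
        finally:
            for f in files:
                f.close()
    finally:
        # Remove the temporary bucket files whether or not the sort succeeded.
        for p in bucket_paths:
            os.remove(p)
```

Memory use is governed by `lines_per_bucket` during the first phase and by the number of buckets during the merge, which is the trade-off any bucket size setting has to balance.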

Download Website Updated 26 Nov 2011 UverseWiki

Pop 16.55
Vit 30.62

UverseWiki is a modular open source PHP framework designed for text processing. Unlike most existing solutions, it is not based on regular expressions; instead, it uses a recursive descent parser to build a document object model (DOM). Once parsing has finished and the DOM has been produced, the original source is discarded and all operations are performed on the document tree: nodes can be altered, serialized, or rendered into a particular format (such as HTML or RTF). The wiki syntax is language-neutral, and all processing is carried out in UTF-8.
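To illustrate the parse-then-operate-on-the-tree idea, here is a minimal Python sketch of a recursive descent parser for a made-up two-marker syntax (`*bold*` and `_italic_`). The syntax, the `Node` class, and the function names are all assumptions for the example and do not reflect UverseWiki's actual markup or API.

```python
from dataclasses import dataclass, field
from html import escape


@dataclass
class Node:
    kind: str                      # "doc", "bold", "italic", or "text"
    text: str = ""
    children: list = field(default_factory=list)


MARKERS = {"*": "bold", "_": "italic"}   # hypothetical toy syntax
TAGS = {"bold": "b", "italic": "i"}


def parse(source):
    """Build a document tree; the source string is not needed afterwards."""
    pos = 0

    def parse_inline(open_markers):
        nonlocal pos
        nodes, buf = [], []
        while pos < len(source):
            ch = source[pos]
            if ch in open_markers:
                break  # closes an enclosing span; the caller consumes it
            if ch in MARKERS:
                if buf:
                    nodes.append(Node("text", "".join(buf)))
                    buf = []
                pos += 1  # consume the opening marker
                children = parse_inline(open_markers + ch)  # recurse
                if pos < len(source) and source[pos] == ch:
                    pos += 1  # consume the matching closing marker
                nodes.append(Node(MARKERS[ch], children=children))
            else:
                buf.append(ch)
                pos += 1
        if buf:
            nodes.append(Node("text", "".join(buf)))
        return nodes

    return Node("doc", children=parse_inline(""))


def render_html(node):
    """Render the tree to HTML; other renderers could walk the same tree."""
    if node.kind == "text":
        return escape(node.text)
    inner = "".join(render_html(child) for child in node.children)
    if node.kind == "doc":
        return inner
    tag = TAGS[node.kind]
    return f"<{tag}>{inner}</{tag}>"
```

Because rendering works only on the tree, a node can be altered before output: changing `doc.children[1].kind` from `"bold"` to `"italic"` changes the rendered tag without touching the source text.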

No download Website Updated 25 Jul 2012 ogeditor

Pop 67.04
Vit 1.46

ogEditor is a Web-based WYSIWYG HTML editor with a built-in file manager. It features a Tag Selector which lets you view and edit a tag's attributes and internal styles while working in the Design view of an HTML page. Tag Selector displays the entire chain of tags which apply to the current selection or to the cursor position. When any of the tags is selected, its corresponding element will be highlighted in the Design view, and the selected element's attributes and internal styles are also displayed and can be edited in the Property editor window.

No download Website Updated 14 Jun 2012 FuzzyIndex

Pop 13.30
Vit 27.09

FuzzyIndex indexes text for fuzzy searching using PHP and SQLite. It can process a list of text strings and build a database that indexes snippets of those strings and the locations where they appear. The class can also search for given keywords and return the locations of the indexed strings where the best-matching text appears. The index is stored in SQLite, but the class can be extended to use a different database type. Snippets are extracted from the indexed text using heuristics, which are implemented as separate, interchangeable classes.
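A snippet index of this kind can be sketched in Python with the standard `sqlite3` module. The listing does not say which heuristics FuzzyIndex uses, so character trigrams are assumed here as the snippet type; the table layout and function names are likewise hypothetical.

```python
import sqlite3


def trigrams(text):
    """Assumed snippet heuristic: the set of 3-character windows."""
    padded = f"  {text.lower()} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def build_index(strings):
    """Store every (snippet, string position) pair in an SQLite table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE snippets (gram TEXT, string_id INTEGER)")
    db.execute("CREATE INDEX idx_gram ON snippets (gram)")
    rows = [(g, i) for i, s in enumerate(strings) for g in trigrams(s)]
    db.executemany("INSERT INTO snippets VALUES (?, ?)", rows)
    return db


def search(db, keyword, limit=3):
    """Return positions of the strings sharing the most snippets with keyword."""
    grams = sorted(trigrams(keyword))
    placeholders = ",".join("?" * len(grams))
    sql = (
        "SELECT string_id, COUNT(*) AS hits FROM snippets "
        f"WHERE gram IN ({placeholders}) "
        "GROUP BY string_id ORDER BY hits DESC, string_id LIMIT ?"
    )
    return [row[0] for row in db.execute(sql, (*grams, limit))]
```

With this scheme a misspelled keyword such as `parsr` still shares several trigrams with `parser`, so a string containing `parser` is ranked first even though no exact match exists.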

Download Website Updated 08 Oct 2013 libunibreak

Pop 33.23
Vit 2.31

libunibreak is an implementation of the line breaking and word breaking algorithms as described in Unicode Standard Annex 14 and Unicode Standard Annex 29. It is a superset of, and supersedes, liblinebreak. It is designed to be used in a generic text renderer. FBReader is one real-world example.

Download Website Updated 04 Jun 2013 documentr

Pop 108.59
Vit 4.92

documentr is a Web-based tool for editing and presenting software documentation. It allows you to easily maintain documentation for multiple products and product branches. Edits can easily be copied between branches, with merge conflicts being handled gracefully. It uses Markdown as its markup language, along with some extensions, and has a role-based permission system.

No download No website Updated 02 Apr 2014 Template Data Interface (TDI)

Pop 100.14
Vit 5.45

Template Data Interface (TDI, /ʹtedɪ/) is a markup templating system written in Python with (optional but recommended) speedup code written in C. Unlike most templating systems, TDI does not invent its own language to provide functionality. Instead, you simply mark the nodes you want to manipulate within the template document. The template is parsed, and the marked nodes are presented to your Python code, where they can be modified in any way you want.

No download Website Updated 14 Feb 2013 oXygen XML Developer

Pop 26.38
Vit 22.11

Oxygen XML Developer is an Oxygen distribution specially tuned for XML development. It provides XML editing, XML conversion, XML Schema development, XSLT/XQuery/XPath execution and debugging, SOAP and WSDL testing, native XML and relational database support, and XML instance generation.

Download Website Updated 04 Mar 2014 Sanzang

Pop 126.58
Vit 7.39

Sanzang is a compact and simple cross-platform machine translation system. It is especially useful for translating from the CJK languages (Chinese, Japanese, and Korean), and it is very suitable for working with ancient and otherwise difficult texts. Unlike most other machine translation systems, Sanzang is small and approachable. Any user can develop his or her own translation rules, and these rules are simply stored in a text file and applied at runtime.
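The idea of user-written rules kept in a text file and applied at runtime can be sketched in Python as below. The `source|target` line format, the longest-match-first strategy, and the function names are assumptions for illustration; they are not taken from Sanzang's own file format or implementation.

```python
def load_rules(lines):
    """Parse hypothetical 'source|target' rules, one per line."""
    rules = []
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines, comments, and lines without a separator.
        if not line.strip() or line.startswith("#") or "|" not in line:
            continue
        source, target = line.split("|", 1)
        if source:
            rules.append((source, target))
    # Longer source terms take priority over shorter ones.
    rules.sort(key=lambda rule: len(rule[0]), reverse=True)
    return rules


def translate(text, rules):
    """Scan left to right, replacing the longest matching source term."""
    out = []
    i = 0
    while i < len(text):
        for source, target in rules:
            if text.startswith(source, i):
                out.append(target)
                i += len(source)
                break
        else:
            # No rule matched at this position: copy the character through.
            out.append(text[i])
            i += 1
    return "".join(out)
```

In practice `load_rules` would be given an open file, for example `load_rules(open("rules.txt", encoding="utf-8"))`, where `rules.txt` is a placeholder name. Scanning the input once, rather than running each rule over the whole text, keeps the output of one rule from being rewritten by another.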

Download Website Updated 10 Oct 2013 text2html

Pop 43.34
Vit 1.01

text2html is a small tool which converts plain text files into HTML documents. It works with Unix pipes, generates hyperlinks from URLs (with the Regexp::Common Perl module), and parses bold text if you redirect input from another program.
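The URL-to-hyperlink step can be sketched in Python as follows. text2html itself relies on the Regexp::Common Perl module; the simplified pattern, the `<pre>` wrapper, and the function name below are assumptions for the example and will match fewer URL forms than that module does.

```python
import re
from html import escape

# Simplified, assumed pattern: http(s) URLs up to whitespace or < > ".
URL_RE = re.compile(r"(https?://[^\s<>\"]+)")


def text_to_html(text):
    """Escape plain text for HTML and turn URLs into hyperlinks."""
    out = []
    # With one capture group, re.split alternates plain text and URLs.
    for i, part in enumerate(URL_RE.split(text)):
        if i % 2 == 1:
            url = escape(part, quote=True)
            out.append(f'<a href="{url}">{url}</a>')
        else:
            out.append(escape(part))
    return "<pre>" + "".join(out) + "</pre>"
```

To use it as a filter in a Unix pipeline, the function could be fed from standard input, for example `print(text_to_html(sys.stdin.read()))` after `import sys`.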
