2137 projects tagged "Text Processing"

Download Website Updated 12 May 2013 Sanzang

Pop 86.93
Vit 1.87

Sanzang is a compact, simple, cross-platform machine translation system. It is especially useful for translating from the CJK languages (Chinese, Japanese, and Korean), and it is well suited to ancient and otherwise difficult texts. Unlike most other machine translation systems, Sanzang is small and approachable. Users can develop their own translation rules, which are stored in a plain text file and applied at runtime.
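The idea of rules kept in a text file and applied at runtime can be sketched as follows. This is not Sanzang's code or its actual table format: the `source|target` line layout, the function names `load_rules` and `translate`, and the greedy longest-match strategy are all assumptions made for illustration.

```python
def load_rules(table_text):
    """Parse 'source|target' lines into a dict (hypothetical rule format)."""
    rules = {}
    for line in table_text.splitlines():
        line = line.strip()
        if not line or "|" not in line:
            continue  # skip blank or malformed lines
        source, target = line.split("|", 1)
        rules[source] = target
    return rules


def translate(text, rules):
    """Scan left to right, always applying the longest matching rule."""
    max_len = max((len(source) for source in rules), default=0)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then progressively shorter ones.
        for size in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + size]
            if chunk in rules:
                tokens.append(rules[chunk])
                i += size
                break
        else:
            # No rule matched here: pass the character through unchanged.
            tokens.append(text[i])
            i += 1
    return " ".join(tokens)


RULES = load_rules("三藏|Tripitaka\n法師|Dharma master\n三|three")
print(translate("三藏法師", RULES))
```

Because longer sources are tried first, the two-character rule for 三藏 wins over the single-character rule for 三. Unmatched characters are emitted as separate tokens, which keeps the sketch short but would need refinement for real text.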

No download Website Updated 14 Feb 2013 oXygen XML Developer

Pop 43.61
Vit 9.93

Oxygen XML Developer is an Oxygen distribution specially tuned for XML development, providing XML editing, XML conversion, XML Schema development, XSLT/XQuery/XPath execution and debugging, SOAP and WSDL testing, native XML and relational database support, and XML instance generation.

Download No website Updated 11 Feb 2013 Comment Remover

Pop 74.30
Vit 2.64

Comment Remover removes all comments from various types of source code, including, but not limited to, Assembly, Pascal, Java, C, C++, D, Python, and Lua.

No download Website Updated 12 May 2013 Template Data Interface (TDI)

Pop 55.50
Vit 6.83

Template Data Interface (TDI, /ʹtedɪ/) is a markup templating system written in Python with (optional but recommended) speedup code written in C. Unlike most templating systems, TDI does not invent its own language to provide functionality. Instead, you simply mark the nodes you want to manipulate within the template document. The template is parsed, and the marked nodes are presented to your Python code, where they can be modified in any way you want.

Download Website Updated 01 Apr 2013 documentr

Pop 233.11
Vit 8.10

documentr is a Web-based tool for editing and presenting software documentation. It lets you maintain documentation for multiple products and product branches. Edits can be copied between branches, with merge conflicts handled gracefully. It uses Markdown as its markup language, along with some extensions, and has a role-based permission system.

Download Website Updated 12 Aug 2012 libunibreak

Pop 23.09
Vit 1.00

libunibreak is an implementation of the line breaking and word breaking algorithms described in Unicode Standard Annex 14 and Unicode Standard Annex 29. It is a superset of, and supersedes, liblinebreak. It is designed to be used in a generic text renderer. FBReader is one real-world example of its use.

No download Website Updated 14 Jun 2012 FuzzyIndex

Pop 17.58
Vit 18.54

FuzzyIndex indexes text for performing fuzzy searches using PHP and SQLite. It can process a list of text strings and build a database which indexes snippets of those strings and the locations where they appear. The class can also search for given keywords and return the locations of the indexed strings where the best-matching text appears. It uses SQLite to store the indexed text database, but the class can be extended to use a different database type. It uses certain heuristics to extract the snippets from the indexed text. These heuristics are implemented as separate classes which can be used interchangeably.
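A snippet index of this kind can be sketched in a few lines. The listing does not say which heuristics FuzzyIndex uses, so the character trigrams below are an assumed stand-in, and the sketch is in Python with the standard `sqlite3` module rather than PHP. The names `trigrams`, `build_index`, and `search`, and the table layout, are hypothetical.

```python
import sqlite3
from collections import Counter


def trigrams(text):
    """Assumed snippet heuristic: the set of lowercase character trigrams."""
    text = text.lower()
    return {text[i:i + 3] for i in range(len(text) - 2)}


def build_index(strings):
    """Store (snippet, location) pairs in an in-memory SQLite database."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE snippets (snippet TEXT, location INTEGER)")
    db.execute("CREATE INDEX idx_snippet ON snippets (snippet)")
    for location, text in enumerate(strings):
        db.executemany(
            "INSERT INTO snippets VALUES (?, ?)",
            [(gram, location) for gram in trigrams(text)],
        )
    db.commit()
    return db


def search(db, keyword):
    """Return locations ranked by how many snippets they share with keyword."""
    scores = Counter()
    for gram in trigrams(keyword):
        rows = db.execute(
            "SELECT location FROM snippets WHERE snippet = ?", (gram,)
        )
        for (location,) in rows:
            scores[location] += 1
    return [location for location, _ in scores.most_common()]


STRINGS = ["machine translation", "line breaking", "external sorting"]
DB = build_index(STRINGS)
print(search(DB, "translaton"))  # misspelled on purpose
```

A misspelled keyword still shares most of its trigrams with the intended string, which is what makes the lookup fuzzy. Swapping `trigrams` for another function is the sketch's equivalent of interchangeable heuristic classes.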

No download Website Updated 25 Jul 2012 ogeditor

Pop 58.53
Vit 1.51

ogEditor is a Web-based WYSIWYG HTML editor with a built-in file manager. It features a Tag Selector which lets you view and edit a tag's attributes and internal styles while working in the Design view of an HTML page. Tag Selector displays the entire chain of tags which apply to the current selection or to the cursor position. When any of the tags is selected, its corresponding element will be highlighted in the Design view, and the selected element's attributes and internal styles are also displayed and can be edited in the Property editor window.

Download Website Updated 26 Nov 2011 UverseWiki

Pop 19.47
Vit 23.40

UverseWiki is a modular open source PHP framework designed for text processing. Unlike most existing solutions, it is not based on regular expressions but instead uses a recursive descent parser to build a document object model. Once parsing has finished and the DOM has been produced, the original source is discarded and all operations are performed on the document tree: nodes can be altered, serialized, or rendered into a particular format (such as HTML or RTF). The wiki syntax is language-neutral and the processing itself is carried out in UTF-8.
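The parse-then-operate-on-the-tree approach can be illustrated with a toy recursive descent parser. The markup here (`*bold*` and `_italic_`) is invented for the example and is not UverseWiki's syntax; `Node`, `Parser`, and `render_html` are hypothetical names, and the sketch is in Python rather than PHP.

```python
import html
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str                      # "doc", "bold", "italic", or "text"
    text: str = ""
    children: list = field(default_factory=list)


MARKERS = {"*": "bold", "_": "italic"}   # invented toy syntax
TAGS = {"bold": "b", "italic": "i"}


class Parser:
    def __init__(self, source):
        self.source = source
        self.pos = 0

    def parse(self):
        return Node("doc", children=self.parse_inlines(stop=None))

    def parse_inlines(self, stop):
        """inlines := (span | text)*, ending at the stop marker or at EOF."""
        nodes = []
        while self.pos < len(self.source):
            ch = self.source[self.pos]
            if ch == stop:
                break
            if ch in MARKERS:
                nodes.append(self.parse_span(ch))
            else:
                nodes.append(self.parse_text())
        return nodes

    def parse_span(self, marker):
        """span := marker inlines marker (an unclosed span ends at EOF)."""
        self.pos += 1                          # consume the opening marker
        children = self.parse_inlines(stop=marker)
        if self.pos < len(self.source):
            self.pos += 1                      # consume the closing marker
        return Node(MARKERS[marker], children=children)

    def parse_text(self):
        start = self.pos
        while self.pos < len(self.source) and self.source[self.pos] not in MARKERS:
            self.pos += 1
        return Node("text", text=self.source[start:self.pos])


def render_html(node):
    """Render the tree only; the original source string is never consulted."""
    if node.kind == "text":
        return html.escape(node.text)
    inner = "".join(render_html(child) for child in node.children)
    if node.kind == "doc":
        return inner
    return f"<{TAGS[node.kind]}>{inner}</{TAGS[node.kind]}>"


tree = Parser("plain *bold _both_* end").parse()
print(render_html(tree))
```

Each grammar rule maps to one method, and nesting falls out of the mutual recursion between `parse_inlines` and `parse_span`. A second renderer for another format would walk the same tree without reparsing.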

No download No website Updated 11 Nov 2011 ExternalSort

Pop 19.16
Vit 23.66

ExternalSort is a class that can sort large files, much like the Unix sort command. It reads the file to be sorted in small buckets, each associated with a temporary file, so as not to exceed the configured PHP memory limits. The buckets are sorted individually and then merged to produce the final sorted output. The class provides command line interface options so it can be executed as a command from a shell.
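The bucket-then-merge scheme described above is the classic external merge sort. A minimal Python sketch follows; it is not the ExternalSort class itself, and the function names and the `bucket_lines` parameter (a line count standing in for a memory limit) are assumptions.

```python
import heapq
import os
import tempfile


def _write_run(lines):
    """Sort one bucket in memory and spill it to a temporary file."""
    lines.sort()
    run = tempfile.NamedTemporaryFile("w+", delete=False, encoding="utf-8")
    run.writelines(lines)
    run.close()
    return run.name


def external_sort(in_path, out_path, bucket_lines=100_000):
    """Sort a text file line by line using bounded memory."""
    run_paths = []
    with open(in_path, encoding="utf-8") as src:
        bucket = []
        for line in src:
            if not line.endswith("\n"):
                line += "\n"       # normalize a final line without newline
            bucket.append(line)
            if len(bucket) >= bucket_lines:
                run_paths.append(_write_run(bucket))
                bucket = []
        if bucket:
            run_paths.append(_write_run(bucket))

    # Merge phase: heapq.merge reads the sorted runs lazily, so only one
    # line per run needs to be held in memory at a time.
    runs = [open(path, encoding="utf-8") for path in run_paths]
    try:
        with open(out_path, "w", encoding="utf-8") as dst:
            dst.writelines(heapq.merge(*runs))
    finally:
        for run in runs:
            run.close()
        for path in run_paths:
            os.remove(path)
```

A call such as `external_sort("big.txt", "sorted.txt", bucket_lines=50_000)` (file names hypothetical) sorts by plain string comparison. Very large inputs produce many runs, and a production version would merge them in several passes to limit the number of open files.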
