The 'expp' tool (the Epeios XML preprocessor) reads an XML file and transforms it into another XML file. It simplifies the writing of XML files by supporting macros, the definition and testing of variables, the inclusion of files, and more. This is done by writing predefined tags, which belong to a given namespace, directly in the source XML file; the 'expp' tool then recognizes and handles these tags. The tool is also available as a Java native component.
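The general idea of namespace-owned preprocessing tags can be illustrated with a minimal sketch. It is not expp's actual syntax: the namespace URI, the `define` and `expand` tag names, and the `name` and `select` attributes are all hypothetical and chosen only for this example. Nested macro expansion is deliberately left out.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical namespace and tag names, invented for this sketch only.
PP_NS = "http://example.org/preprocessor"
DEFINE = "{%s}define" % PP_NS
EXPAND = "{%s}expand" % PP_NS

SOURCE = """<config xmlns:pp="http://example.org/preprocessor">
  <pp:define name="host"><server>localhost</server></pp:define>
  <primary><pp:expand select="host"/></primary>
  <backup><pp:expand select="host"/></backup>
</config>"""


def preprocess(xml_text):
    """Return a tree in which every preprocessor tag has been handled."""
    root = ET.fromstring(xml_text)
    macros = {}

    def walk(parent):
        # Iterate over a snapshot because the child list is modified below.
        for child in list(parent):
            if child.tag == DEFINE:
                # Remember the macro body and drop the definition itself.
                macros[child.get("name")] = list(child)
                parent.remove(child)
            elif child.tag == EXPAND:
                # Replace the tag with deep copies of the macro body.
                index = list(parent).index(child)
                parent.remove(child)
                body = macros[child.get("select")]
                for offset, node in enumerate(body):
                    parent.insert(index + offset, copy.deepcopy(node))
            else:
                walk(child)

    walk(root)
    return root


result = preprocess(SOURCE)
print(ET.tostring(result, encoding="unicode"))
```

The output tree contains only ordinary elements: the macro definition is gone and each expansion point holds its own copy of the macro body.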
oXygen is an XML editor that supports any XML document, and works with XML Schemas, DTDs, RELAX NG schemas, and NRL schemas. It has powerful transformation support that allows you to edit XSLT and XSL-FO documents and to obtain documents in the desired output format (such as HTML, PS, or PDF) with just one click. It also includes a complete Subversion client, support for flattening XML Schemas, an XML Schema instance generator, integration with the X-Hive/DB, MarkLogic, and TigerLogic XML databases, editing actions on the schema diagram, and a rename refactoring action.
Oxygen XML Developer is an Oxygen distribution specially tuned for XML development, providing XML editing, XML conversion, XML Schema development, XSLT/XQuery/XPath execution and debugging, SOAP and WSDL testing, native XML and relational database support, and XML instance generation.
Xapian is a search engine library, scalable to collections containing hundreds of millions of documents. It is written in C++ with bindings for Perl, Python, PHP, Java, Tcl, C#, Ruby, and Lua. It is a highly adaptable toolkit that allows developers to easily add advanced indexing and search facilities to their own applications. It supports the Probabilistic Information Retrieval model as well as a rich set of boolean query operators. Omega is a Web search application built upon the Xapian library. It can index a Web server's document tree (including HTML, PDF, OpenOffice, MS Word/Excel/PowerPoint/Works, WordPerfect, RTF, PS, etc.), or data exported from arbitrary sources (e.g. SQL databases).
EditLive! is a cross-platform, browser-based Web content editor with a Word-like WYSIWYG interface. Key features include a live spell checker and advanced table and nested list support. It produces content that complies with Section 508 and W3C accessibility guidelines, as well as key XHTML and CSS standards. It is designed for use with Web content management, knowledge management, and e-learning applications. Integrations are also available for major CMS platforms and business solutions, including IBM Workplace Web Content Management, Vignette, EMC (Documentum), Percussion, FileNet, Open Text, Ektron, Ingeniux, Stellent, vCampus, and Schoolwires.
Jaxe is a Java XML editor with a graphical document-oriented interface. It is configurable with an XML schema and a configuration file. It supports validation at element insertion, and is customisable with Java modules. XSL transforms can be used to export documents in XML, HTML, or PDF. Sample configurations include XML schemas, XHTML Strict, DocBook, and DITA.
edit-on Pro is a cross-platform, in-browser WYSIWYG editor, delivered as a Java applet, that enables XHTML content authoring with XML markup. The editor is compact and powerful, and requires no special libraries or client plugins. It includes CSS support, table editing, a spelling checker, and multi-language support, and features an API that allows full customisation and seamless integration into Web-based applications. It complements content management systems, e-learning, knowledge management systems, and CRM. A free trial version is available for download and includes complete developer samples and a comprehensive integration manual.
The Okapi project’s main purpose is to architect a set of building blocks for the creation of larger open source localization and translation tools. However, many Okapi components are generic enough to be of interest to the text mining, natural language processing, and text retrieval communities. Okapi’s many text filters (HTML, Properties, XML (ITS XPath-based rules), OpenXML, ODF, Regex, etc.) provide a straightforward way to access the text of multiple document formats. Its document events and pipeline can be integrated with other frameworks such as UIMA, LingPipe, OpenPipeline, OpenNLP, GATE, and Lucene. The advantage of Okapi’s text filters is that they not only extract the text but also preserve all non-textual formatting. It is possible to decompose a document into events, process them via the pipeline, and then rebuild the input document without loss. Structural information can be added to Okapi document events so that tables, lists, links, titles, etc. are grouped together and treated as a unit. This is useful when context based on a “universal” document structure is needed. The Okapi event model supports user-configurable annotations, similar to UIMA, but simpler and more restricted in scope. Users can annotate spans of text or add new resources such as translation memory matches, terminology, token types, or part-of-speech information.
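The decompose, process, and rebuild cycle described above can be sketched in a few lines. This is a conceptual illustration only and does not use Okapi's API: the `Event` class, the `decompose`, `run_pipeline`, and `rebuild` functions, and the regex-based tag splitting are assumptions made for the example, and a real filter would parse the format properly.

```python
import re
from dataclasses import dataclass


@dataclass
class Event:
    kind: str     # "text" for translatable content, "markup" for formatting
    content: str  # the exact source characters covered by this event


# Naive tag matcher; the capture group makes re.split keep the tags.
TAG = re.compile(r"(<[^>]+>)")


def decompose(document):
    """Split a document into text and markup events, keeping every character."""
    events = []
    for piece in TAG.split(document):
        if not piece:
            continue
        kind = "markup" if TAG.fullmatch(piece) else "text"
        events.append(Event(kind, piece))
    return events


def run_pipeline(events, steps):
    """Apply each step, in order, to every event."""
    for step in steps:
        events = [step(event) for event in events]
    return events


def rebuild(events):
    """Concatenate event contents to reconstruct the document."""
    return "".join(event.content for event in events)


def uppercase_text(event):
    """Example step: change text events only, leave markup untouched."""
    if event.kind == "text":
        return Event(event.kind, event.content.upper())
    return event


source = '<p class="intro">Hello <b>world</b>!</p>'
events = decompose(source)
round_trip = rebuild(run_pipeline(events, []))
processed = rebuild(run_pipeline(events, [uppercase_text]))
print(round_trip == source)
print(processed)
```

Because markup travels through the pipeline as events of its own, a step that only touches text events leaves the formatting intact, and an empty pipeline reproduces the input exactly.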
jSmaTeP assists in the use of Java for processing import and export data by configuring a data structure rather than by programming it. The structure of the import data is specified in an XML file. jSmaTeP then generates a value object representing exactly one row or record in the import file based on a given XML data configuration. This means that if the import or export format changes, only the XML data configuration needs to be changed to match it.
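The configuration-driven approach can be sketched as follows. jSmaTeP itself is a Java tool, and this Python sketch does not reproduce its configuration schema or API: the `record` and `field` elements, their attributes, and the `build_parser` function are hypothetical names invented for the illustration.

```python
import xml.etree.ElementTree as ET
from dataclasses import make_dataclass

# Hypothetical configuration format, invented for this sketch.
CONFIG_XML = """
<record name="Customer" delimiter=";">
  <field name="id" type="int"/>
  <field name="name" type="str"/>
  <field name="balance" type="float"/>
</record>
"""

# Maps the type names used in the configuration to converter functions.
CONVERTERS = {"int": int, "str": str, "float": float}


def build_parser(config_xml):
    """Build a value-object class and a line parser from the XML configuration."""
    root = ET.fromstring(config_xml)
    delimiter = root.get("delimiter", ",")
    fields = [
        (field.get("name"), CONVERTERS[field.get("type", "str")])
        for field in root.findall("field")
    ]
    # One generated class represents exactly one record of the import file.
    record_type = make_dataclass(root.get("name"), fields)

    def parse_line(line):
        parts = line.rstrip("\n").split(delimiter)
        if len(parts) != len(fields):
            raise ValueError(
                "expected %d fields, got %d" % (len(fields), len(parts))
            )
        values = [convert(part) for (_, convert), part in zip(fields, parts)]
        return record_type(*values)

    return parse_line


parse_customer = build_parser(CONFIG_XML)
row = parse_customer("42;Ada Lovelace;10.5")
print(row)
```

If the import format gains a column or switches delimiters, only `CONFIG_XML` changes; the parsing code stays the same.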