libextractor is a library used to extract meta-data from files of arbitrary type. It is designed to use helper-libraries to perform the actual extraction, and to be trivially extendable by linking against external extractors for additional file types. The goal is to provide developers of file-sharing networks, file managers, and WWW-indexing bots with a universal library to obtain meta-data about files. It includes a shell-command and bindings for Java (JNI) and Python.
TYPO3 CMS is a Web Content Management System which features automatic creation of navigational menus, headlines, and other dynamic graphical elements, automatic conversion and scaling of images, the ability to present different templates based on variables such as client browser or country code, support for multiple templates on a site, and a built-in password-protection option. Pages can be scheduled to appear on a certain date, to be hidden on a certain date, or just to be temporarily hidden. TYPO3 supports searching SQL databases, and redesigning an entire website at once is just a matter of creating a single new template.
Nuxeo Platform provides a framework and set of components to address document management and collaboration needs, including metadata/taxonomies, versioning, lifecycle management, workflow, relations, searching, reporting, transformation, auditing, and retention. Its flexible extension system, based on OSGi, allows developers to quickly configure and extend the platform by creating new components. Its default Web user interface, based on the JSF standard, uses AJAX to create a pleasant user experience. It can also be accessed by a rich client interface through the use of Web services, for instance using the Eclipse-based Nuxeo RCP rich client platform.
HTTrack is an easy-to-use offline browser utility. It allows you to download a Web site from the Internet to a local directory, recursively building all directories and getting HTML, images, and other files from the server to your computer. HTTrack arranges the original site's relative link structure. Simply open a page of the mirrored Web site in your browser, and you can browse the site from link to link, as if you were viewing it online. HTTrack can also update an existing mirrored site and resume interrupted downloads. WebHTTrack is a Web-based GUI for HTTrack.
direx - directory dino allows you to run your own Web site directory. It is easy to set up, and runs correctly out of the box. Experienced webmasters can also customize the HTML layout. Static HTML pages are created for low server load and good search engine position. It features unlimited categories, an automatic check for duplicate URL entries, searchability, and sendmail support.
PDFTextStream is a PDF text and metadata extraction library available for Java and .NET. It supports all versions of the PDF document specification (including v1.7, used by Acrobat 8, 9, and X), extraction of text encoded using double-byte character sets (including Chinese, Japanese, and Korean), decryption of documents encrypted using 40-bit, 128-bit, 256-bit, and variable bit length ciphers, and extraction of all document metadata provided by PDF documents (including form data, bookmarks, and annotations). Easy integration with Jakarta Lucene is included, as well as interactive form update capability.
PowerSeek SQL allows you to create, manage, and run your own search engine and directory portal with total control and ease. It is user-friendly in every aspect and built for the most demanding uses and customization needs. It comes with an extensive admin panel, the ability to sell link listings, SEO-friendly URLs, link reviews/ratings, a content-sensitive banner rotator, a spam filter, a broken link checker, custom data fields, mailers, crawlers, pre-designed template sets, a reciprocal link checker, image/video/file uploading, RSS feeds, optional PPC functionality, and much more. It can be used for Yellow Pages, real estate, and travel directories, complex product catalogs, image galleries, and more.