GNU Ocrad is an OCR (Optical Character Recognition) program and library based on a feature extraction method. It reads images in pbm (bitmap), pgm (greyscale), or ppm (color) formats and produces text in byte (8-bit) or UTF-8 formats. It also includes a layout analyzer that is able to separate the columns or blocks of text normally found on printed pages. Ocrad can be used as a stand-alone console application, or as a backend to other programs.
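As a rough illustration of the stand-alone console use described above, the following Python sketch writes a tiny image in the plain-text PBM format and builds a command line for the `ocrad` program. The helper names (`write_pbm`, `ocrad_command`, `recognize`) are hypothetical, and the `--format=utf8` option is an assumption that should be checked against `ocrad --help` for the installed version.

```python
import shutil
import subprocess
from pathlib import Path


def write_pbm(path, rows):
    """Write a plain-text PBM file (magic number P1): 1 = black, 0 = white.

    Intended for tiny test images only; the plain PBM format asks for
    lines of at most 70 characters, which this sketch does not enforce.
    """
    height = len(rows)
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("all rows must have the same width")
    lines = ["P1", f"{width} {height}"]
    lines += [" ".join(str(bit) for bit in row) for row in rows]
    Path(path).write_text("\n".join(lines) + "\n", encoding="ascii")


def ocrad_command(image_path, utf8=True):
    """Build an argument list for the ocrad command-line tool."""
    cmd = ["ocrad"]
    if utf8:
        # Assumed option for UTF-8 output; verify with `ocrad --help`.
        cmd.append("--format=utf8")
    cmd.append(str(image_path))
    return cmd


def recognize(image_path):
    """Run ocrad if it is installed and return its output, else None."""
    if shutil.which("ocrad") is None:
        return None
    result = subprocess.run(
        ocrad_command(image_path),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Building the argument list separately from running it keeps the sketch usable on machines where Ocrad is not installed; `recognize` simply returns `None` in that case.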
OCRFeeder is a document layout analysis and optical character recognition application. It can automatically outline the contents of a document image, distinguish between graphics and text, and perform OCR on the text. It can export to several formats, primarily ODT. OCRFeeder has a GTK+ graphical user interface that allows the user to control the application and, for example, edit and correct the automatic recognition. It can also be used from the command line for automation.
Paperless Office is a document management and electronic filing system. It is similar to PaperPort, but adds many new features: automatic document classification, synchronization with your filing cabinet, date extraction, semantic Web integration, and natural language processing that covers extracting to-do lists from documents, spam detection, and urgency classification, along with planning, scheduling, and execution features. It also supports workflows: you can set due dates and interdependencies for documents and tasks.
MALODOS helps you scan, store, and easily retrieve all your personal documents. Its storage format is open and documented, so your document archive remains accessible even without MALODOS. The documents themselves are stored as standard PDF files, while their metadata (such as title, tags, and description) is stored in a separate SQLite database in an open format. With MALODOS, you can also manage existing files in PDF, JPEG, TIFF, and other formats, so you can still use the documents that you've already scanned. You can connect it to any external OCR program to enable full-text search.
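The general idea of keeping documents as plain PDF files and their metadata in a separate SQLite database can be sketched in a few lines of Python. This is only an illustration of that layout: the table names, column names, and helper functions below are hypothetical and are not the actual MALODOS schema.

```python
import sqlite3


def open_catalog(db_path=":memory:"):
    """Open (or create) a metadata catalog; the schema is hypothetical."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS documents (
               id INTEGER PRIMARY KEY,
               pdf_path TEXT NOT NULL UNIQUE,
               title TEXT NOT NULL,
               description TEXT DEFAULT ''
           )"""
    )
    conn.execute(
        """CREATE TABLE IF NOT EXISTS tags (
               document_id INTEGER NOT NULL REFERENCES documents(id),
               tag TEXT NOT NULL,
               PRIMARY KEY (document_id, tag)
           )"""
    )
    return conn


def add_document(conn, pdf_path, title, description="", tags=()):
    """Record metadata for a PDF that already exists on disk."""
    cur = conn.execute(
        "INSERT INTO documents (pdf_path, title, description) VALUES (?, ?, ?)",
        (pdf_path, title, description),
    )
    doc_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO tags (document_id, tag) VALUES (?, ?)",
        [(doc_id, tag) for tag in tags],
    )
    conn.commit()
    return doc_id


def find_by_tag(conn, tag):
    """Return the paths of all PDFs carrying the given tag, sorted by path."""
    rows = conn.execute(
        "SELECT d.pdf_path FROM documents d "
        "JOIN tags t ON t.document_id = d.id "
        "WHERE t.tag = ? ORDER BY d.pdf_path",
        (tag,),
    )
    return [row[0] for row in rows]
```

Because the database holds only paths and descriptive fields, the PDF files stay readable with any viewer, and the catalog itself can be inspected with any SQLite client.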
Mayan EDMS is a document manager Web application with custom metadata indexing, file serving integration, and OCR capabilities. It features user-defined metadata fields, dynamic default values for metadata, lookup support for metadata, filesystem integration by means of metadata indexing directories, user-defined document UUID generation, local file or server-side staging file uploads, batch uploading of many documents with the same metadata, user-defined document checksum algorithms, previews for many image formats, including PDF, document OCR and searching, automatic grouping of documents by metadata, permissions and roles support, multi-page document support, page transformations, distributed OCR processing, and support for multiple languages.