18 projects tagged "OCR"

No download Website Updated 20 Mar 2014 Pyocr

Pop 147.92
Vit 4.72

Pyocr is a simple Python wrapper for OCR engines (Tesseract, Cuneiform, etc.). It supports Python 2.7 and Python 3.x, and requires Pillow.

Download Website Updated 05 Mar 2014 Paperwork

Pop 220.20
Vit 2.58

Paperwork is a GUI for making paper documents easily searchable using OCR. The basic idea behind Paperwork is "scan & forget": you should be able to scan a new document and forget about it until the day you need it again.

Download Website Updated 16 Jun 2013 GlyphViewer

Pop 18.06
Vit 17.68

GlyphViewer is a desktop application that allows users to build translations from text in images and export them into different image formats or even HTML. Users can apply OCR to recognize text in images in languages such as English, German, Chinese, Arabic, and Japanese, among many others. A unique feature of the application is its support for Ancient Egyptian hieroglyphs.

No download No website Updated 09 Mar 2014 Character Recognition

Pop 202.29
Vit 9.88

Character Recognition is an Android app that lets the user take a photo (or pick an existing image file on the device) and then apply the Tesseract OCR engine to extract the text in it. It currently supports English text only; support for other languages is to be added in the future.

No download Website Updated 14 Oct 2013 getxbook

Pop 111.00
Vit 5.70

getxbook is a collection of tools to download books from websites. There are tools to download from Google Books' "book preview", Amazon's "look inside the book", and Barnes and Noble's "book viewer". There is an optional GUI written in Tcl/Tk, and some shell scripts using OCR to create plain text or searchable PDFs and DjVu files from the downloaded books.

Download No website Updated 26 Jul 2013 Aspose.OCR for .NET

Pop 44.38
Vit 3.30

Aspose.OCR for .NET is a character recognition component that lets developers add OCR functionality to their ASP.NET Web applications, Web services, and other applications. It provides a simple set of classes for controlling character recognition tasks and supports the BMP and TIFF image formats.

Download Website Updated 19 Jun 2012 MALODOS

Pop 73.82
Vit 3.64

MALODOS helps you scan, store, and easily retrieve all your personal documents. Its storage format is open and documented, so your document archive remains accessible even without MALODOS. The documents themselves are stored as standard PDF files, while their metadata (such as title, tags, and description) is stored in a separate SQLite database in an open format. With MALODOS, you can also manage existing files in PDF, JPEG, TIFF, and other formats, so you can still use the documents that you've already scanned. You can connect any external OCR program to enable a full-text search feature.
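The split described above (PDF files on disk, metadata in a separate SQLite database) can be illustrated with a minimal Python sketch. The table and column names here are hypothetical and are not taken from the MALODOS schema; the sketch only shows why such an archive stays readable with any SQLite client.

```python
import sqlite3


def create_archive(conn):
    """Create a hypothetical metadata schema (not the real MALODOS one)."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS documents (
               id INTEGER PRIMARY KEY,
               pdf_path TEXT NOT NULL,      -- the PDF itself stays on disk
               title TEXT NOT NULL,
               description TEXT DEFAULT ''
           )"""
    )
    conn.execute(
        """CREATE TABLE IF NOT EXISTS tags (
               document_id INTEGER NOT NULL REFERENCES documents(id),
               tag TEXT NOT NULL
           )"""
    )


def add_document(conn, pdf_path, title, description="", tags=()):
    """Record metadata for one stored PDF and return its row id."""
    cur = conn.execute(
        "INSERT INTO documents (pdf_path, title, description) VALUES (?, ?, ?)",
        (pdf_path, title, description),
    )
    doc_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO tags (document_id, tag) VALUES (?, ?)",
        [(doc_id, tag) for tag in tags],
    )
    conn.commit()
    return doc_id


def find_by_tag(conn, tag):
    """Return (title, pdf_path) pairs for every document carrying the tag."""
    rows = conn.execute(
        "SELECT d.title, d.pdf_path FROM documents d "
        "JOIN tags t ON t.document_id = d.id "
        "WHERE t.tag = ? ORDER BY d.id",
        (tag,),
    )
    return rows.fetchall()
```

Because the metadata lives in an ordinary SQLite file, a query like `find_by_tag` needs nothing beyond a standard SQLite library.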

No download No website Updated 04 Mar 2011 OCR2DATA

Pop 19.75
Vit 33.87

OCR2DATA is a full stack for document digitization, analysis, and OCR. It provides external connectivity by way of an API, standard document exchange formats, and a database.

No download Website Updated 23 Feb 2011 Mayan EDMS

Pop 38.94
Vit 33.98

Mayan EDMS is a document management Web application with custom metadata indexing, file serving integration, and OCR capabilities. It features user-defined metadata fields, dynamic default values for metadata, lookup support for metadata, filesystem integration by means of metadata indexing directories, user-defined document UUID generation, local file or server-side staging file uploads, batch uploading of many documents with the same metadata, user-defined document checksum algorithms, previews for many file formats including PDF, document OCR and searching, automatic grouping of documents by metadata, permissions and roles support, multi-page document support, page transformations, distributed OCR processing, and support for multiple languages.
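As an illustration of what a configurable document checksum can look like, here is a minimal Python sketch. It does not use Mayan EDMS code or settings; the function name `document_checksum` and its parameters are hypothetical, and the algorithm is simply chosen by name through the standard `hashlib` module.

```python
import hashlib


def document_checksum(stream, algorithm="sha256", chunk_size=65536):
    """Hash a binary file-like object with the named hashlib algorithm.

    Hypothetical helper for illustration only. The stream is read in
    chunks so that large scanned documents need not fit in memory.
    """
    digest = hashlib.new(algorithm)
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()
```

Opening a stored file with `open(path, "rb")` and passing it to `document_checksum` with, say, `algorithm="md5"` would switch algorithms without changing any other code.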

Download Website Updated 04 Nov 2012 tesseract-ocr

Pop 152.20
Vit 2.70

tesseract-ocr is an OCR engine originally developed by Hewlett-Packard and now sponsored by Google. It is highly accurate and reads a binary, grayscale, or color image and outputs text.
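The image-in, text-out flow can be sketched by calling the `tesseract` command-line program from Python. This is a minimal sketch under stated assumptions: the `tesseract` binary is installed and on the PATH, the requested language data is available, and the installed version accepts `stdout` as the output base (older releases only write to a file). The helper names are hypothetical.

```python
import subprocess


def build_tesseract_command(image_path, lang="eng"):
    """Build the argument list: tesseract <image> <outputbase> -l <lang>."""
    # "stdout" as the output base asks tesseract to print the recognized
    # text instead of writing <outputbase>.txt (assumed to be supported).
    return ["tesseract", image_path, "stdout", "-l", lang]


def image_to_text(image_path, lang="eng"):
    """Run tesseract on one image and return the recognized text."""
    result = subprocess.run(
        build_tesseract_command(image_path, lang),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if tesseract fails
    )
    return result.stdout
```

Calling `image_to_text("scan.png")` would return whatever text the engine recognizes in that file; the file name is only a placeholder.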
