PDF OCR X

PDF OCR X is a simple drag-and-drop utility that converts PDFs and images into text documents. It uses OCR (optical character recognition) to extract the text of a PDF or image, which is particularly useful for PDFs and images created with the scan-to-PDF function of a scanner or photocopier. OCR is performed by the Tesseract engine, and the utility currently supports over 20 languages.

Recent releases

  •  15 Nov 2013 21:40

    Release Notes: Fixes an issue with handling some PDFs with rotation set.

  •  06 Nov 2013 06:11

    Release Notes: This release fixes an issue with missing characters in Searchable PDF output mode with Cyrillic languages (e.g., Ukrainian, Bulgarian, Russian).

  •  28 Oct 2013 05:40

    Release Notes: This release fixes an issue with crashing on some Chinese, Japanese, and Korean documents.

  •  25 Oct 2013 19:40

    Release Notes: Support for Snow Leopard (10.6.8) was re-added.

  •  25 Oct 2013 17:02

    Release Notes: An issue that caused crashes on some installs of Mountain Lion was fixed.
