PDFBox is a Java library for manipulating PDF documents and extracting content from existing ones.
| Tags | Information Management, Document Repositories, Internet, Web, Indexing/Search, Site Management, multimedia, Graphics, Viewers, Software Development, Libraries, Java Libraries, Text Processing, Filters, fonts, General, Indexing, Utilities |
|---|---|
| Licenses | BSD Revised |
| Operating Systems | OS Independent |
| Implementation | Java |
Recent releases


Release Notes: This release contains significant bug fixes and an overhaul of the encryption framework.


Release Notes: The display of PDF documents has been improved. Support for printing PDFs has been added, and many outstanding bugs have been fixed.


Release Notes: A NullPointerException that occurred when an image had no filters applied has been fixed, along with an issue where extra spaces were added during text extraction for Type3 fonts. Meta-information is now extracted and updated as XML. Text in and between bookmarks is now imported. XFDFImport should now fail with non-XFDF documents.


Release Notes: Major enhancements include PDF 1.5 support, text extraction speed improvements, and a .NET version.


Release Notes: This release fixes a large number of outstanding bugs.