
GNU libextractor

libextractor is a library for extracting meta-data from files of arbitrary type. It is designed to use helper libraries to perform the actual extraction, and to be trivially extensible by linking against external extractors for additional file types. The goal is to provide developers of file-sharing networks, file managers, and WWW-indexing bots with a universal library for obtaining meta-data about files. It includes a shell command and bindings for Java (JNI) and Python.
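The plugin idea described above can be illustrated with a small, self-contained Python sketch. This is not libextractor's API or its Python binding: the names `register`, `extract`, `png_extractor`, and `pdf_extractor` are hypothetical, and the two toy plugins only look at magic bytes. It merely shows the general shape of a design in which each file type is handled by its own extractor and a common entry point collects whatever the extractors find.

```python
# Illustrative sketch only: NOT libextractor's API. All names are hypothetical.
from typing import Callable, List, Tuple

MetaData = List[Tuple[str, str]]          # (keyword type, value) pairs
Extractor = Callable[[bytes], MetaData]   # one plugin per file format

PLUGINS: List[Extractor] = []


def register(plugin: Extractor) -> Extractor:
    """Add a plugin to the global list; usable as a decorator."""
    PLUGINS.append(plugin)
    return plugin


@register
def png_extractor(data: bytes) -> MetaData:
    # Recognize PNG files by their 8-byte signature.
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return [("mimetype", "image/png")]
    return []


@register
def pdf_extractor(data: bytes) -> MetaData:
    # Recognize PDF files by the "%PDF-" header and read the version after it.
    if data.startswith(b"%PDF-"):
        version = data[5:8].decode("ascii", errors="replace")
        return [("mimetype", "application/pdf"), ("format version", version)]
    return []


def extract(data: bytes) -> MetaData:
    """Run every registered plugin and collect whatever each one finds."""
    results: MetaData = []
    for plugin in PLUGINS:
        try:
            results.extend(plugin(data))
        except Exception:
            # A misbehaving plugin should not abort the whole extraction.
            continue
    return results
```

Supporting another file type in this sketch means writing one more function and decorating it with `register`; the caller of `extract` does not change.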

Recent releases

Release Notes: This release adds support for Matroska, fixes some minor bugs (leaks on error-handling paths), and does some minor code cleanup (fixing compiler warnings about dead code).

  •  14 Jun 2010 11:23

Release Notes: This release fixes various minor bugs, in particular better handling of malloc failures and more robust handling of malformed inputs in various plugins.

  •  14 Mar 2010 09:42

    Release Notes: This release fixes a problem with LE not finding its plugins under certain conditions. It also fixes an IPC issue under FreeBSD which caused some plugins to not work.

    Release Notes: This release adds out-of-process execution for plugins and improves the quality and quantity of the extracted meta-data for many formats. It breaks API compatibility.

    •  05 Jul 2009 08:35

    Release Notes: This release adds support for librpm 4.7 and uses an external version of libexiv2 for improved and more up-to-date EXIV2 support.

    Recent comments

    02 Feb 2008 05:00 grothoff

    Re: online demo not working
    There are two PDF plugins, one that is quite simplistic and another one based on code from xpdf (which has a bad security track record). Depending on which one I happen to enable on the website (options to configure), you get more or less information for PDF files.

    > When I upload dmca.pdf all it gives me is mimetype. Am I missing something?

    24 Jan 2008 15:40 baloney

    online demo not working
    When I upload dmca.pdf all it gives me is mimetype. Am I missing something?

    14 Aug 2005 21:25 grothoff

    Re: Also Requires gobject-2.0
    Note that as of 0.5.3 LE still needs gobject-2.0 but the
    ordinary shared version will do fine now.

    27 Jan 2005 10:15 grothoff

    Re: Also Requires gobject-2.0
    Well, gobject-2.0 is part of glib, so it is listed as a
    dependency. What is more tricky is that we need the
    static, relocatable version of the library -- but try to specify
    that on freshmeat :-).

    27 Jan 2005 10:07 dforce

    Also Requires gobject-2.0
    Can't seem to get the OLE2 libraries to compile, make complains:

    /usr/lib/gcc-lib/i686-pc-linux-gnu/3.3.3/../../../../i686-pc-linux-gnu/bin/ld: cannot find -lgobject-2.0

    Oh, and you may want to include these dependencies within either the README or INSTALL files.
