webStraktor


webStraktor is a programmable World Wide Web data extraction client. It features a scripting language that facilitates the collection, extraction, and storage of information available on the Web, including images. The scripting language borrows elements of regular expression and XPath syntax. The standard webStraktor output format is XML-based and can be encoded in ASCII, UTF-8, or ISO-8859-1 (Latin-1). The client adheres to the Robots Exclusion Protocol and can be configured to operate anonymously by connecting through proxy servers. It also provides exhaustive logging and tracing information.
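
The workflow described above (check the Robots Exclusion Protocol, extract data with a pattern, store the result as XML) can be illustrated with a minimal Python sketch. This is not webStraktor's scripting language, whose syntax is not shown here; the robots.txt rules, the page content, the `example-bot` agent name, the `example.com` host, and the `<extract>`/`<link>` output elements are all hypothetical and are held in memory so that the sketch runs without network access.

```python
import re
import xml.etree.ElementTree as ET
from urllib.robotparser import RobotFileParser

# Hypothetical inputs; a real client would fetch these over HTTP.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

PAGE_HTML = """\
<html><body>
<a href="/docs/a.html">Alpha</a>
<a href="/private/b.html">Beta</a>
</body></html>
"""


def allowed(robots_txt, agent, url):
    """Return True if the robots.txt rules permit `agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


def extract_links(html):
    """Return (href, text) pairs for simple anchors using a regular expression."""
    return re.findall(r'<a\s+href="([^"]+)">([^<]*)</a>', html)


def to_xml(base, links, agent="example-bot"):
    """Build a UTF-8 XML document that lists only the links robots.txt permits."""
    root = ET.Element("extract")
    for href, text in links:
        url = base + href
        if not allowed(ROBOTS_TXT, agent, url):
            continue  # skip anything the site has disallowed
        item = ET.SubElement(root, "link", href=url)
        item.text = text
    return ET.tostring(root, encoding="utf-8")


result = to_xml("http://example.com", extract_links(PAGE_HTML))
print(result.decode("utf-8"))
```

In this sketch the link under `/private/` is dropped because the sample robots.txt disallows it, and only the permitted link is written to the XML output.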

Recent releases

  •  21 Apr 2014 15:20

    Release Notes: Initial Freecode announcement.

