Yioop! is a PHP search engine. It can be configured either as a general purpose search engine for the whole Web or to provide search results for a set of URLs or domains. Yioop! can crawl pages or can directly index archives such as ARC and WARC. It supports indexing several file formats such as HTML, Atom, PDF, DOC, PPT, RTF, RSS, XML, SVG, PNG, JPG, BMP, GIF, and sitemaps. The Yioop! crawler can be deployed on one or many machines. It supports having one or more crawl scheduler processes, as well as multiple fetchers and mirrors. Crawling respects robots.txt, including Crawl-delay. Yioop! crawls are stored in a Web archive format that is easy to move around, so crawling can be done on one machine and the results deployed elsewhere. Yioop! supports mixing of crawls. Yioop! comes with a search front end that can be localized as desired using a GUI. This GUI supports RTL languages. Management of crawls can also be done using this GUI. Yioop! can be configured in a straightforward manner to make use of file caching or memcache if available.
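As an illustration of what respecting robots.txt and Crawl-delay involves, here is a minimal sketch using Python's standard urllib.robotparser module. It is not Yioop!'s own code (Yioop! is written in PHP); the robots.txt content, the "ExampleBot" agent name, the example.com URLs, and the fetch_policy helper are all assumptions made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the sketch needs no network access.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 5
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def fetch_policy(agent, url):
    """Return (allowed, delay_in_seconds) for one candidate URL."""
    return parser.can_fetch(agent, url), parser.crawl_delay(agent)


# A polite crawler would skip disallowed URLs and wait `delay` seconds
# between requests to the same host.
allowed, delay = fetch_policy("ExampleBot", "https://example.com/index.html")
blocked, _ = fetch_policy("ExampleBot", "https://example.com/private/notes.html")
print(allowed, blocked, delay)
```

A real crawler would also need to fetch and cache robots.txt per host and schedule requests around the delay; the sketch only shows the policy lookup.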
giis-ext4 (gET iT i sAY) is a file recovery tool for Ext4 filesystems. Once installed, current files and newly created files can be recovered. It allows users to recover all deleted files, recover files owned by a specific user, and recover files of a specific type, such as text or PNG. It uses the ext2fs library and SQLite.
I, Librarian is a PDF manager or PDF organizer that allows individual researchers or a group of researchers to create an annotated collection of PDF articles. Users may build the virtual library collaboratively, thus sharing the workload of literature mining. It enables smart browsing and fast searching in reference data and PDF files, and includes an advanced tool for mining scientific literature from PubMed, PubMed Central, NASA ADS, arXiv, IEEE Xplore, and HighWire Press.
blogstrap.py is a simple, no-frills blog content management system powered by Twitter's Bootstrap and web.py. It features most things you would expect from a simple blogging platform. You can browse posts by category or subcategory, see recent posts, and mark favorites. You can perform basic searches. It includes an About page. A basic tag system is implemented (popular tags are counted and shown). A simple comment system is available. A robust administrative interface is included where you can create and edit posts. You can upload images and include them on a Credits page, where you can properly attribute the original author. Comments can be set to on, off, or manual approval (moderated). Security has been a top priority since the beginning. blogstrap.py has low resource usage and runs quickly on top of Lighttpd.
FluxBB is a fast, light, user-friendly forum application for Web sites. It was originally based on PunBB and supports UTF-8, XHTML, and internationalization. It leaves out some of the less-than-essential features of other forums without sacrificing essential functionality or usability.
Wolf CMS simplifies content management by offering an elegant user interface, flexible templating per page, simple user management and permissions, and the tools necessary for file management. It is a fork of Frog CMS, which was itself a PHP migration of the Ruby-on-Rails app Radiant CMS. Wolf is now forging its own development path, although a family resemblance with these two systems can still be seen.
SQLet allows you to execute SQL directly on multiple text files, right from the Linux command line. In a single command, you can read in text files (with or without header lines) and perform arbitrary select statements, including joins over several files. SQLet can thus replace awk or grep in some instances.
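The general idea of running SQL joins over delimited text can be sketched with Python's standard csv and sqlite3 modules. This is not SQLet's implementation or its command-line syntax; the load_table helper, the table names, and the sample data are hypothetical, and the text is inlined as strings so the sketch is self-contained.

```python
import csv
import io
import sqlite3


def load_table(conn, name, text, delimiter=","):
    """Load delimited text whose first line is a header into a new table."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    header, data = rows[0], rows[1:]
    cols = ", ".join(f'"{c}"' for c in header)
    marks = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{name}" ({cols})')
    conn.executemany(f'INSERT INTO "{name}" VALUES ({marks})', data)


# Hypothetical contents of two text files with header lines.
USERS = "id,name\n1,alice\n2,bob\n"
LOGINS = "user_id,host\n1,alpha\n1,beta\n2,gamma\n"

conn = sqlite3.connect(":memory:")
load_table(conn, "users", USERS)
load_table(conn, "logins", LOGINS)

# Join the two "files" and count logins per user.
result = conn.execute(
    "SELECT u.name, COUNT(*) FROM users u "
    "JOIN logins l ON l.user_id = u.id "
    "GROUP BY u.name ORDER BY u.name"
).fetchall()
print(result)
```

Doing the same with awk or grep would require hand-written matching of keys across the two files, which is the kind of task the join expresses directly.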