Yioop! is a PHP search engine. It can be configured either as a general purpose search engine for the whole Web or to provide search results for a specific set of URLs or domains. Yioop! can crawl pages or directly index archives such as ARC and WARC. It supports indexing several file formats, including HTML, Atom, PDF, DOC, PPT, RTF, RSS, XML, SVG, PNG, JPG, BMP, GIF, and sitemaps. The Yioop! crawler can be deployed on one or many machines. It supports one or more crawl scheduler processes, as well as multiple fetchers and mirrors. Crawling respects robots.txt, including Crawl-delay. Yioop! crawls are stored in a Web archive format that is easy to move around, so crawling can be done on one machine and the results deployed elsewhere. Yioop! supports mixing of crawls. Yioop! comes with a search front end that can be localized as desired using a GUI. This GUI supports RTL languages, and crawls can also be managed through it. Yioop! can be configured in a straightforward manner to make use of file caching or memcache if available.
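To illustrate what respecting robots.txt and Crawl-delay involves, here is a minimal sketch using Python's standard `urllib.robotparser`. It is not Yioop!'s own code (Yioop! is written in PHP); the robots.txt content, the `ExampleBot` agent name, and the example.com URLs are assumptions made up for the illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for this example.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5
"""


def load_rules(robots_text):
    """Parse robots.txt text into a RobotFileParser."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)

# A polite crawler checks each URL before fetching it...
allowed_public = rules.can_fetch("ExampleBot", "http://example.com/index.html")
allowed_private = rules.can_fetch("ExampleBot", "http://example.com/private/x.html")

# ...and waits this many seconds between requests to the same host
# (None when the file sets no Crawl-delay for the agent).
delay_seconds = rules.crawl_delay("ExampleBot")
```

A crawler built this way would skip any URL for which `can_fetch` returns `False` and sleep for `delay_seconds` between successive requests to the same host.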
Elefant is a full-featured, but refreshingly simple CMS and PHP Web framework. It features an intuitive, streamlined admin interface, a tightly integrated WYSIWYG editor, dynamically embeddable content objects for building dynamic Web sites without touching code, and an extremely fast, secure, and flexible framework for add-ons and themes. The core CMS includes page editing, a blogging engine, site navigation, file and user management, automatic version control, a tool for translators and multilingual site management, and an in-browser theme/layout editor. It is also extensively documented and has a small but friendly and active developer community.
I, Librarian is a PDF manager or PDF organizer that allows individual researchers or a group of researchers to create an annotated collection of PDF articles. Users may build the virtual library collaboratively, thus sharing the workload of literature mining. It enables smart browsing and fast searching in reference data and PDF files, and includes an advanced tool for mining scientific literature from PubMed, PubMed Central, NASA ADS, arXiv, IEEE Xplore, and HighWire Press.
selfoss is a multipurpose RSS reader, live stream, mashup, and aggregation Web application. You can register RSS feeds, and this Web-based PHP application will continuously fetch new RSS feed items and show them in a stream. You can also add other sources, such as deviantART, Twitter, or Tumblr users. Attaching new sources is very easy, and you can add any source you want (e.g. an IMAP email account, log files, etc.). selfoss also allows you to collect all your postings from different communities (e.g. Twitter, your blog, etc.) and show them in one place. It features a Web-based RSS reader, a universal aggregator, mobile support (Android, iOS, and iPad), and support for MySQL, SQLite, and MongoDB databases. It is easily extensible through an open plugin system (write your own data connectors). It is a lightweight PHP application taking up less than 2 MB.
WaldScan is a PHP class which recursively scans a given directory for a list of selected file types. It is useful for Web pages that serve files over HTTP/HTTPS, for CLI programs that batch-process files, and for cron jobs that cache file data for faster access.
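The general technique of recursively scanning a directory for selected file types can be sketched as follows. This is an illustration in Python, not WaldScan's actual PHP API; the function name `scan_for_types` and its parameters are hypothetical.

```python
import os


def scan_for_types(root, extensions):
    """Return sorted paths of files under root whose extension is wanted.

    extensions may be given with or without a leading dot ("pdf" or ".pdf");
    matching is case-insensitive.
    """
    wanted = {"." + ext.lower().lstrip(".") for ext in extensions}
    matches = []
    # os.walk descends into every subdirectory of root.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if os.path.splitext(name)[1].lower() in wanted:
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)
```

Calling `scan_for_types("/var/www/files", ["pdf", "jpg"])`, for instance, would return every PDF and JPG file below that (assumed) directory; a cron job could cache the resulting list for faster access.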
TinyIB is a lightweight PHP image board which emulates the functionality of 4chan. If you use MySQL or SQLite, you can use it to create an efficient setup able to handle large amounts of traffic. If you don't use a database, it can store posts as text files for a portable setup capable of running on virtually any PHP host.