57 projects tagged "Indexing/Search"
Harvest is a system to collect information and make it searchable using a Web interface. It can collect information using HTTP, FTP, NNTP, and local files. Supported formats include HTML, DVI, PS, fulltext, mail, man pages, news, troff, WordPerfect, C sources, and many more. Adding support for new formats is easy due to Harvest's modular design.
HTTrack is an easy-to-use offline browser utility. It allows you to download a Web site from the Internet to a local directory, recursively building all directories and getting HTML, images, and other files from the server to your computer. HTTrack arranges the mirrored copy to follow the original site's relative link structure. Simply open a page of the mirrored Web site in your browser, and you can browse the site from link to link, as if you were viewing it online. HTTrack can also update an existing mirrored site and resume interrupted downloads. WebHTTrack is a Web-based GUI for HTTrack.
PHP Content Management System (phpCMS) lets you use a single template for your whole Web site. It allows you to provide dynamic menus with unlimited levels, and to use templates and sub-templates without a database. It is search engine-friendly and proxy-friendly, as the pages it generates cannot be distinguished from static HTML pages. With an optional module, PHP code can be added to any template or content file. It supports caching of parsed pages and gzip compression.
Artekopia Netjuke is a cross-platform Web-based audio streaming jukebox powered by PHP 4 and a growing choice of databases. MP3, Ogg Vorbis, ASF/WMA, and other music file formats are supported. Artekopia Netjuke aims to enable small organizations or communities to run a private "mp3.com-like" Web site for accessing the music they legitimately own, distribute, or are granted access to. It supports most audio players, language packs, optional file downloads, media protection schemes, multi-level security, shared and private playlists, random playlists, images, etc. It also features an unusual, easy-to-use installer module to get you started in minutes.
PHPX is a Web portal system, blog, Content Management System (CMS), forum, and more. It is designed to let anyone build feature-rich, interactive websites without any programming knowledge. Key features include fully integrated forums, downloads, an image gallery with slideshow and auto-thumbnailing, a support ticket system, a GUI for Web page content management, news with topics and instances, and more. It allows you to fully customize the look of your site.
libextractor is a library for extracting meta-data from files of arbitrary type. It is designed to use helper libraries to perform the actual extraction, and to be trivially extensible by linking against external extractors for additional file types. The goal is to provide developers of file-sharing networks, file managers, and WWW-indexing bots with a universal library to obtain meta-data about files. It includes a shell command and bindings for Java (JNI) and Python.
Turbo Seek lets you create and run a directory and search engine with ease. It comes with a visually friendly admin control panel that provides everything needed to create, customize, and run a fully functional search engine. It supports unlimited sub-categories, and includes a crawler, link checker, site rankings, site reviews, the ability to customize all text and layouts, relevance to search keywords, and more.