ASPseek

ASPseek is an Internet search engine written in C++ using the STL. It consists of an indexing robot, a search daemon, and a search frontend (CGI or Apache module). It can index up to a few million URLs and supports searching for words and phrases, wildcards, and Boolean queries. Search results can be limited to a given time period, site, or Web space (a set of sites) and sorted by relevance (PageRanks are used) or by date. It is optimized for multiple sites (threaded index, async DNS lookups, grouping of results by site, and Web spaces), but can be used to search a single site as well. It can work with multiple languages and encodings at once (including multi-byte encodings such as Chinese) thanks to an optional Unicode storage mode. Other features include stopwords and ispell support, a charset and language guesser, HTML templates for search results, excerpts, and highlighting of query words.

Recent releases

  •  22 Jul 2002 12:50

    Release Notes: A bug in reverse citation merging that was introduced in 1.2.9 was fixed. Incorrect merging of redirects was fixed. Incorrect merging of the reverse citation index was fixed for the case where URL A refers to B, which redirects to C, A also refers to C, and one of the references from A disappears. A bug where the ResultsPerPage parameter from s.htm overrode the PS cookie was fixed. Small improvements and fixes were made to init.d/aspseek, aspseek-mysql-postinstall, the spec file, and the manual pages. The DBLibDir directive was added to aspseek.conf and searchd.conf.

  •  03 Jul 2002 11:20

    Release Notes: This version added support for the HTTP POST method in s.cgi, the IncrementHopsOnRedirect and RedirectLoopLimit options, and sed-like \1 to \9 sequences in the Replace command in aspseek.conf. Tag parsing was improved in the index to handle omitted quotes, and the option to limit results by a range of dates was fixed. Several rare core dumps in searchd and s.cgi and two rare memory leaks in searchd were fixed. FreeBSD portability fixes were made, along with fixes and improvements in the results cache and man pages.

  •  19 Feb 2002 18:22

    Release Notes: This release adds a generic client library (libaspseek) and a module for the Apache server, new man pages (index(1) and aspseek-sql(5)), fixes for 64-bit platforms such as Alpha and for gcc3/ISO C++ conformance, faster citation merging, fixes for incorrect (in some cases) processing of the -u, -t, and -s options in "index", a fix for a bug where a file was not closed in "searchd", a fix for a bug in excerpt processing, a mechanism to reuse URL IDs (which saves memory in "searchd"), and a fix for a bug that caused "orphaned" URLs in the database.

  •  07 Dec 2001 14:19

    Release Notes: The package was made more portable, and it should now compile on non-Linux systems. The code is compilable with gcc3. Linking on certain Linux platforms was fixed. Checking of deleted entries in the inverted index was fixed. Processing of clones was fixed. The random number generation facility in s.cgi was fixed. CheckOnly and CheckOnlyNoMatch behavior was fixed (GET was used instead of HEAD). More langmaps for Czech charsets were added, and another (smaller) list of Italian stopwords was written.

  •  14 Nov 2001 18:45

    Release Notes: This release implements a buddy-like heap and a buffered file in "index" for better memory usage and faster processing. It fixes improper processing of clones in "index", a few core dumps in "searchd" and "index", a few memory leaks in "index", and the -P and -A flags in "index". The amount of stack used by "index" has been reduced. Concurrency between searchd threads that have many DB connections has been improved. Options have been added to "index" to delete URLs from an inverted index and to re-create broken citation files from the DB. The I/O performance of "searchd" has been optimized.
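
The 03 Jul 2002 release notes above mention sed-like \1 to \9 sequences in the Replace command. As a rough illustration of what such backreferences do, here is a minimal C++ sketch built on the standard library's `std::regex_replace` with the `format_sed` flag. It is not ASPseek's code: the helper name `apply_sed_replace`, the ECMAScript pattern syntax, and the example URLs are all assumptions made for this sketch, and the actual pattern syntax accepted by aspseek.conf may differ.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not part of ASPseek): rewrite `input` using a
// regular expression and a sed-style replacement string.
// With format_sed, "\1" to "\9" in `replacement` expand to the matching
// capture groups and "&" expands to the whole match.
std::string apply_sed_replace(const std::string& input,
                              const std::string& pattern,
                              const std::string& replacement) {
    // Assumption: the pattern uses std::regex's default ECMAScript grammar.
    const std::regex re(pattern);
    // Text that does not match the pattern is copied through unchanged.
    return std::regex_replace(input, re, replacement,
                              std::regex_constants::format_sed);
}
```

For example, calling `apply_sed_replace` with the pattern `^http://www\.([a-z]+)\.com/(.*)$` and the replacement `http://\1.org/\2` rewrites a matching URL by reusing its host name and path, while an input that does not match the pattern is returned as-is.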

Recent comments

  01 Feb 2006 16:47 newjobdirect

  Aspseek is discontinued
  It's obvious that SWSoft does not develop and support this great project any more. Sadly... However, there is another version in which a lot of bugs were fixed and some features were added, like mysql 4 and gcc 3.2 support. As far as I know, this is the last and most stable version (we ran it successfully on our site last year). You can download it from http://www.newjobdirect.co.uk/aspseek/. Search Man (http://www.newjobdirect.co.uk/)
