YaCy

YaCy is a search engine that anyone can use to index the Internet (WWW and FTP) or to create a search portal for others (Internet or intranet). The scale of YaCy is limited only by the number of users, and it can index billions of web pages. In P2P mode it is fully decentralized: all users of the search engine network are equal, and no one can censor the content of the distributed index.

Recent releases

  •  11 Apr 2013 05:52

    Release Notes: This release includes deeper Solr integration. The default search process has undergone a full re-design. Ranking was strongly enhanced. Indexing speed was improved. Some memory leaks were fixed, and overall memory usage should be lower. Many small bugs were fixed.

  •  08 Nov 2012 23:58

    Release Notes: YaCy now has an embedded Solr 4.0.0 with the standard Solr XML search interface integrated. This is the primary indexing engine now. There is now an enhanced crawler with live link structure visualization. This release adds a Host Browser to explore the file structure of crawled hosts. It shows loaded pages, pages with errors, and pending files in the same way a file browser would show the contents of a host.

  •  25 Feb 2010 15:51

    Release Notes: A torrent file parser was added. It is now possible to import special content (MediaWiki crawls, Wikipedia dumps, PHPBB3 dumps, and OAI-PMH sources). The interface has a new modern look. A "Search Portal Integration" toolbox helps to integrate a YaCy search on your own Web pages using a search widget.

  •  14 Jan 2009 18:58

    Release Notes: The full international character set and all UTF-8 characters are now supported for indexing and search. Support has been added for site:, inurl:, and filetype: operator search. A public API has been added to the search results, the indexing, and link structure in XML and JSON syntax.

  •  05 Oct 2008 08:07

    Release Notes: This is a quick release, with a lot of security fixes and bugfixes.

Recent comments

      08 Mar 2006 07:08 Orbiter Thumbs up

      Re: YAcY is a badly behaved robot
      Neither is true:

      1) YaCy has respected robots.txt since mid-2005, and it never ignored robots.txt on purpose. That was simply when support was first implemented.

      2) There is no referrer spam. YaCy shows that the page was indexed by a YaCy peer. Since the corresponding web page is then referenced not only by this peer but by all peers, there must be a central address where the owner of a referred page can see that it was referenced by a non-centralized web crawler. This is a unique problem that centralized crawlers do not have. In this case YaCy is just being honest and refers to the YaCy project page. This feature was removed in YaCy 0.43 because too many people had been confused by this referrer.

      06 Mar 2006 15:42 Low012 Thumbs up

      Re: YAcY is a badly behaved robot

      > 1. YAcY doesn't ask for robots.txt, let alone follow it.

      > 2. YAcY posts the yacy web address as the HTTP Refer[r]er header similar to spam bots.

      These issues have been resolved for some time now.

      27 Feb 2006 17:43 pgregg

      YAcY is a badly behaved robot
      1. YAcY doesn't ask for robots.txt, let alone follow it.

      2. YAcY posts the yacy web address as the HTTP Refer[r]er header similar to spam bots. Well behaved bots may put their url into the Agent header.

      I only came across this project whilst researching against HTTP Referrer spammers, nice idea - shame about the implementation.
