Comments for YaCy

08 Mar 2006 07:08 Orbiter

Re: YAcY is a badly behaved robot
Neither claim is true:

1) YaCy has respected robots.txt since mid-2005, and it never ignored robots.txt on purpose. That was simply when robots.txt support was first implemented.

2) There is no referrer spam. The referrer showed that the page was indexed by a YaCy peer. Because an indexed page is then referenced not just by that one peer but by all peers, there has to be a central address that tells the site owner the page was referenced by a decentralized web crawler. This is a problem that centralized crawlers do not have. In this case YaCy was simply being honest and referred to the YaCy project page. The feature was removed in YaCy 0.43 because too many people were confused by this referrer.

06 Mar 2006 15:42 Low012

Re: YAcY is a badly behaved robot


> 1. YAcY doesn't ask for robots.txt, let alone follow it.
> 2. YAcY posts the yacy web address as the HTTP Refer[r]er header similar to spam bots.

These issues have been resolved for some time now.

27 Feb 2006 17:43 pgregg

YAcY is a badly behaved robot
1. YAcY doesn't ask for robots.txt, let alone follow it.

2. YAcY posts the yacy web address as the HTTP Refer[r]er header, similar to spam bots. Well-behaved bots may put their URL into the User-Agent header.

I only came across this project whilst researching HTTP referrer spammers. Nice idea, shame about the implementation.
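
For illustration, the two behaviours discussed in this comment (honouring robots.txt, and identifying the bot through the User-Agent header instead of the Referer header) could be sketched roughly as follows. This is only a minimal sketch using Python's standard library, not YaCy's actual code; the bot name, the bot URL, and the helper names `allowed` and `build_request` are hypothetical placeholders.

```python
from urllib import request, robotparser

# Hypothetical bot identity: placeholder values, not YaCy's real ones.
BOT_NAME = "examplebot"
BOT_AGENT = "examplebot/0.1 (+http://example.org/bot.html)"


def allowed(robots_txt: str, url: str) -> bool:
    """Return True if the given robots.txt text permits BOT_NAME to fetch url."""
    parser = robotparser.RobotFileParser()
    # parse() takes the robots.txt content as a list of lines,
    # so no network access is needed for this sketch.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(BOT_NAME, url)


def build_request(url: str) -> request.Request:
    """Build a request that identifies the bot via User-Agent only.

    No Referer header is set, so the bot's project URL does not
    end up in the site's referrer statistics.
    """
    return request.Request(url, headers={"User-Agent": BOT_AGENT})


# Example rules a site owner might publish (assumed for this sketch).
rules = "User-agent: *\nDisallow: /private/\n"

print(allowed(rules, "http://example.org/private/page.html"))
print(allowed(rules, "http://example.org/index.html"))
```

A crawler built along these lines would call `allowed` before each fetch and skip any URL for which it returns False.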
