©Copyright 2008. All Rights Reserved.

Search Engine visits and other Robots

Robots visit web sites for all kinds of reasons. Search engine robots index sites by "crawling" their content, which is very useful for getting your content in front of people. Google, Yahoo and MSN, the most popular search engines, each have their own robot, and the content it finds on your site is what gets listed in search results.

Search engine robots create referrals and traffic for your site. Unfortunately, not all robots that come to your site are so well behaved.

Badly behaved robots come in several kinds. Spam bots, for example, are in the business of harvesting your site for email addresses.

You can check your log files for the IP addresses of robots. Instead of just letting them find you, return the favour and look them up. A number of sites keep lists of robots by IP address. One of these, www.robotstxt.org, lets you look up robots by IP address. Another, jafsoft.com, has some useful information on search engines. If you'd like a list of the major search engines by IP, you can find one at iplists.com. If you would like to keep up to date on crawler activity, including hacking attempts on your site, CrawlTrack is a free, open-source program released under the GPL license.
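If you want to pull robot IP addresses out of your own logs, here is a minimal sketch in Python. It assumes your server writes the common "combined" log format, and the list of user-agent hints is only illustrative, so adjust both to your setup. The sample log entries are made up and use reserved documentation IP addresses. The reverse_lookup helper asks DNS for the host name behind an IP, which is one way of "looking them up".

```python
import re
import socket
from collections import Counter

# Assumed "combined" log format:
# ip ident user [time] "request" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Substrings that often appear in crawler user-agent strings (illustrative only).
ROBOT_HINTS = ("bot", "crawl", "spider", "slurp")


def robot_hits(lines):
    """Count requests per (IP, user-agent) for agents that look like robots."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        agent = match.group("agent")
        if any(hint in agent.lower() for hint in ROBOT_HINTS):
            counts[(match.group("ip"), agent)] += 1
    return counts


def reverse_lookup(ip):
    """Return the host name for an IP via reverse DNS, or None if unavailable."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


# Made-up sample entries for illustration.
SAMPLE_LOG = [
    '192.0.2.10 - - [10/Oct/2008:13:55:36 +0000] "GET /robots.txt HTTP/1.1" 200 120 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '192.0.2.10 - - [10/Oct/2008:13:55:40 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '198.51.100.7 - - [10/Oct/2008:14:02:11 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1)"',
]

for (ip, agent), hits in robot_hits(SAMPLE_LOG).most_common():
    print(ip, hits, agent)
```

Keep in mind that a user-agent string is whatever the visitor chooses to send, so a bad robot can pretend to be a good one. Running reverse_lookup on the IP of a visitor that claims to be a major search engine is a quick sanity check on that claim.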

Legitimate robots identify themselves, usually with a web address where you can learn more about them, and will go to your robots.txt file before crawling. What's this? It is a plain text file, commonly placed at the root of a web site, that tells robots which directories and URLs on your site they should stay out of. For example, you may have a page that is still under construction and feel it would be premature to have it indexed by Google. You can put a Disallow directive for that page into your robots.txt and the Google bot will refrain from crawling it.
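To see what such a directive looks like and how a well-behaved robot reads it, here is a small sketch using Python's standard urllib.robotparser module. The robots.txt contents, the paths and the "ExampleBot" name are all made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt; the paths are invented for this example.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /under-construction.html

User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Ask the same question a polite robot asks before fetching a page.
for agent, path in [
    ("Googlebot", "/under-construction.html"),
    ("Googlebot", "/index.html"),
    ("ExampleBot", "/private/notes.html"),
]:
    verdict = "allowed" if parser.can_fetch(agent, path) else "disallowed"
    print(agent, path, verdict)
```

The first group of rules applies only to the Google bot, and the second applies to every robot that has no group of its own. Nothing enforces these rules: a robot follows them only if it chooses to.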

One of the strangest tracking experiences is to find hits on files and folders that don't exist! These are the bad robots: dark energies created by those who must have stunted lives. Probably humiliated in their childhood by dysfunctional parenting, the creators of such robots must turn to destructive fantasies in order to get back some sense of power and victory. We can only hope they will seek help and one day see the light. Everyone, it seems, gets to see these robots come and go on their site. Check out this forum at webmasterworld.com for one example.
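Hits on files that don't exist show up in your logs as requests answered with status 404. As a rough sketch, again assuming the common Apache-style log layout, the following Python groups those requests by IP address and keeps only the visitors that asked for several different missing paths. The threshold of two, the sample entries and the requested paths are all made up for illustration.

```python
import re
from collections import defaultdict

# Assumed log layout: ip ident user [time] "METHOD path protocol" status size ...
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"\S+ (?P<path>\S+)[^"]*" (?P<status>\d{3}) '
)


def missing_file_probes(lines, threshold=2):
    """Return {ip: sorted missing paths} for IPs with at least
    `threshold` distinct requests that ended in a 404."""
    probes = defaultdict(set)
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("status") == "404":
            probes[match.group("ip")].add(match.group("path"))
    return {
        ip: sorted(paths)
        for ip, paths in probes.items()
        if len(paths) >= threshold
    }


# Made-up sample entries using reserved documentation IP addresses.
SAMPLE_LOG = [
    '203.0.113.5 - - [10/Oct/2008:03:12:01 +0000] "GET /admin/login.php HTTP/1.1" 404 209 "-" "-"',
    '203.0.113.5 - - [10/Oct/2008:03:12:02 +0000] "GET /phpmyadmin/ HTTP/1.1" 404 209 "-" "-"',
    '198.51.100.7 - - [10/Oct/2008:09:30:15 +0000] "GET /old-page.html HTTP/1.1" 404 209 "-" "Mozilla/5.0"',
    '198.51.100.7 - - [10/Oct/2008:09:30:20 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
]

for ip, paths in missing_file_probes(SAMPLE_LOG).items():
    print(ip, paths)
```

A single 404 usually just means a stale link or a mistyped address, which is why the sketch ignores visitors with only one missing path.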

Security enhancement with site tracking software
With good robots, you can go to their original websites and access more information about their identity; they have nothing to hide. The bad ones will be hard to trace because they are unscrupulous and often ignore your robots.txt directives. Site tracking software allows you to see which crawlers contravene the directives you have set for them. Once you have their IP address or domain name you can look them up. You can also search Google for information about these unscrupulous bots and, if they are widely prevalent, you will find a forum or two citing similar experiences, as in the webmasterworld.com discussion mentioned above.
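If you don't have tracking software installed, you can get a rough idea of the same check by hand. The sketch below is not how any particular tracking program works; it simply replays your log against your robots.txt rules using Python's standard urllib.robotparser and reports requests for paths the rules disallow. The log layout, the robots.txt contents, the sample entries and the "EmailHarvester" user agent are all assumptions made up for illustration.

```python
import re
from urllib.robotparser import RobotFileParser

# Assumed "combined" log layout:
# ip ident user [time] "METHOD path protocol" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "\S+ (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def robots_txt_violations(log_lines, robots_txt):
    """Return (ip, agent, path) for each logged request that the given
    robots.txt rules disallow for that user agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    violations = []
    for line in log_lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        ip, path, agent = match.group("ip", "path", "agent")
        if not parser.can_fetch(agent, path):
            violations.append((ip, agent, path))
    return violations


# Hypothetical rules and made-up log entries.
ROBOTS_TXT = "User-agent: *\nDisallow: /private/\n"
SAMPLE_LOG = [
    '203.0.113.5 - - [10/Oct/2008:03:12:01 +0000] "GET /private/members.html HTTP/1.1" 200 734 "-" "EmailHarvester/1.0"',
    '192.0.2.10 - - [10/Oct/2008:13:55:40 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
]

for ip, agent, path in robots_txt_violations(SAMPLE_LOG, ROBOTS_TXT):
    print(ip, agent, path)
```

Note that this flags every visitor to a disallowed path, including people using ordinary browsers, who never read robots.txt. It works best for paths that no visible page links to, so that only a robot ignoring your directives is likely to turn up there.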

How do I detect and deal with Spam bots?

Find out some useful strategies at this site: turnstep.com. CrawlTrack, mentioned already, may also cover this issue.