# wp-login.php boondoggle



## downtime (Oct 21, 2002)

I do hosting as a sideline for my clients, and there are a few issues I would like to address. When I look at the logs on the hosting, which I do often, I see requests for robots.txt, favicon.ico, etc. I put an empty robots.txt in all roots, and I make sure there's a favicon anywhere they request it. For WP sites, fine, it's there, but not in the root, so I put the same favicon for the site in the root to satisfy the browsers, so they don't request it on every page they hit, and to give the server a little break. It would be cool if WP et al. would at least copy its favicon to root, no? Too hard to code, I guess.

I constantly see requests for wp-login.php, wp-admin.php, admin.php, login.php, etc., even on the HTML sites, and of course they return 404s. Even though they get a 404, they keep coming back endlessly for the same file. A lot like Google and MSN coming back again and again looking for the same old file that was deleted a year and a half ago... write some code byatches, let it go, but I digress.

I see WP advice that says if I put "this" code into my .htaccess file, it will turn the bots with no referrer back on themselves. What I'd like to do is put dummy files in the root of these HTML sites that would do the same thing. If I and another thousand+ sites waste the time of these bots, that could only be a good thing. Is there anything out there that I could copy to the site root that would burn the time/resources of The Browsing Dead and make them eat themselves? A cure, or at least a spike to the head in technogeek? The time-honored boot to the head, a swirlie in every pot. A basic package for Drupal, WP, Joomla, all the forums, all the open source software that is so awesome and could use a little support to ease the same burden they've put on every site on the net? Am I dreaming? Couldn't all sites, regardless of code, server, or location, put up a series of files to make the bots eat their own shorts? Someone's gotta come up with something.
Anyone want to take a stab at the code? And by the way, Merry Christmas!


----------



## colinsp (Sep 5, 2007)

Most people want the legit bots to crawl their sites so they get listed in the search engines.

Robots.txt should not be empty; you need to set some rules in there to prevent bots from crawling certain areas of your site.
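For anyone who wants to see what such rules do in practice, here is a rough sketch using Python's standard `urllib.robotparser`. The `Disallow` paths and the `ExampleBot` name are made up for illustration, so swap in whatever applies to your own site. It also shows that an empty file simply allows everything.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules; the paths are placeholders, not from any real site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /private/
"""


def build_parser(robots_text):
    """Parse robots.txt content given as a string."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


def is_allowed(parser, path, agent="ExampleBot"):
    """Return True if a well-behaved bot may fetch the given path."""
    return parser.can_fetch(agent, path)


rules = build_parser(ROBOTS_TXT)
empty = build_parser("")

print(is_allowed(rules, "/index.html"))            # allowed by the rules
print(is_allowed(rules, "/wp-admin/options.php"))  # blocked by Disallow
print(is_allowed(empty, "/wp-admin/options.php"))  # empty file blocks nothing
```

Keep in mind this only describes what polite bots do. The bad ones never read the file, or read it and ignore it.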


----------



## downtime (Oct 21, 2002)

Hi Colin. Yes, you're right about robots.txt. But the purpose of putting an empty robots file in the root of a site is to cut down a little bit on the server overhead. Every bot that comes is supposed to request robots.txt. If the file isn't there, then the server has to use the resources to check, find the file isn't available, return a 404 to the bot, and write the 404 to the error log. If the file is there, the first time the bot requests the file, on a CentOS server running Apache, let's say it returns a 20 byte file (the empty one) to the bot. The server is done. No 404, no writing to the error log with the time, date, IP address of the requester, the name of the file requested, and the path. When the bot returns and asks the server for the robots file, the server only tells the bot that the file hasn't changed (a 304 Not Modified), and the transfer is 0 bytes. Reduced overhead.

So if I added to robots.txt all the juicy details of where I want bots to stay out of, all the good bots will follow the rules. But if I create folders with the structures of WP, Joomla, Drupal, etc., and in those folders I put files that the bad bots are looking for, and the contents of those files are a nasty surprise, bot begone. Even if I could write a script to add the bad bots' IPs to a "deny from" in the .htaccess file, that would be a pretty good start. It would be more of a user-level implementation of the way CSF uses iptables under WHM. It would be something the average site owner could use. At the very least, any bot that accessed the folders it was instructed not to visit in robots.txt would be added to a list that you could use for the IP Deny Manager.

They talk about cooling server farms, and many other ways of making things greener, but no one is trying to get the little things right. Just sayin'.
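To make the "deny from" idea a bit more concrete, here is a minimal sketch in Python. It assumes the access log is in Apache's common/combined format, and the trap paths, the sample log lines, and the function names are all my own placeholders, not an existing tool. It only builds the text of the deny block; writing it into .htaccess is left out on purpose.

```python
import re

# Assumed trap paths: files the bots probe for that do not exist on a plain HTML site.
TRAP_PATHS = {"/wp-login.php", "/wp-admin.php", "/admin.php", "/login.php"}

# Start of an Apache common/combined log line:
# client IP, two fields, [timestamp], then "METHOD target ..."
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(?:[A-Z]+) (\S+)')

# Made-up sample entries using documentation-only IP ranges.
SAMPLE_LOG = [
    '203.0.113.7 - - [10/Dec/2012:06:25:01 -0500] "GET /wp-login.php HTTP/1.1" 404 209',
    '198.51.100.4 - - [10/Dec/2012:06:25:03 -0500] "GET /index.html HTTP/1.1" 200 5120',
    '203.0.113.9 - - [10/Dec/2012:06:25:09 -0500] "POST /admin.php?x=1 HTTP/1.1" 404 209',
    '203.0.113.7 - - [10/Dec/2012:06:26:44 -0500] "GET /login.php HTTP/1.1" 404 209',
]


def offending_ips(log_lines, trap_paths=TRAP_PATHS):
    """Return the sorted, de-duplicated IPs that requested any trap path."""
    ips = set()
    for line in log_lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        ip, target = match.groups()
        path = target.split("?", 1)[0]  # ignore any query string
        if path in trap_paths:
            ips.add(ip)
    return sorted(ips)


def deny_block(ips):
    """Build Apache 2.2-style access rules, one 'Deny from' line per IP."""
    lines = ["Order Allow,Deny", "Allow from all"]
    lines += [f"Deny from {ip}" for ip in ips]
    return "\n".join(lines) + "\n"


print(deny_block(offending_ips(SAMPLE_LOG)))
```

The `Order`/`Deny from` directives are the Apache 2.2 style; on Apache 2.4 they only work with mod_access_compat loaded. The same list of IPs could just as well be pasted into the IP Deny Manager by hand.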


----------

