Tuesday’s Tool: Robots.txt Generator
Yes, I know it’s Thursday. I thought of the idea yesterday, and Thursday’s Tool didn’t sound as good as Tuesday’s Tool.
Google announced that they are adding a robots.txt generator to their Webmaster Tools lineup.
While I do like that Google is raising awareness of certain elements of a website the average user may not know about (like robots.txt files and XML sitemaps), I am also wary about just how much Google KNOWS about me.
I am uneasy that I have to log in and register a site that I want to create a robots.txt file for. A post for another day, however.
A robots.txt file is a little file that tells search engine crawlers where they can and cannot go on your site. That applies to any site, not just a blog.
It can be as simple as allowing everyone in, or it can put up stop signs at certain pages or areas you don’t want followed and indexed.
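To make the "allow everyone in" versus "stop signs" idea concrete, here is a minimal sketch using Python’s standard urllib.robotparser module. The example.com URLs and the helper name can_fetch are my own placeholders for illustration, not anything from Google’s tool.

```python
from urllib.robotparser import RobotFileParser

def can_fetch(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if `agent` may crawl `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)

# An empty Disallow line means nothing is blocked: everyone is allowed in.
ALLOW_ALL = """\
User-agent: *
Disallow:
"""

# A single stop sign in front of the admin area (hypothetical example).
BLOCK_ADMIN = """\
User-agent: *
Disallow: /wp-admin/
"""

print(can_fetch(ALLOW_ALL, "http://example.com/wp-admin/index.php"))    # True
print(can_fetch(BLOCK_ADMIN, "http://example.com/wp-admin/index.php"))  # False
print(can_fetch(BLOCK_ADMIN, "http://example.com/2008/03/some-post/"))  # True
```

Note that this standard-library parser only does simple prefix matching, so it is fine for plain paths like /wp-admin/ but it does not understand the * and $ wildcards.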
Do I Need a Robots.txt File On My Server?
No. Your site will be crawled and indexed without one, and all the pages will be accessible. A robots.txt file lets you shape the way search engines crawl your site and BLOCK specific content.
Why Would You Want to Stop Search Engines From Crawling A Site?
The nature of a blog is to publish content. So when you create a post, it will be slotted into several different spots. The content will be sorted into archives and categories and by those zillions of little tags you have created. The result: duplicate content.
Duplicate content is to be avoided: some of the content will be dropped from the index, dumped into the supplemental results, or end up stealing page rank.
I will also block, or disallow, sections of my site: there is no need for Google to index my admin section, JavaScript, or CSS. Those sections might be taking a piece of the page rank pie that you want your content to share.
Google’s Robots.txt Generator
You can create a robots.txt file in Google’s Webmaster Tools. It is actually a bit tricky to find: you need to log in and click on your verified site (and frankly, creating your own robots.txt file would be faster). It lets you decide basic allows/disallows.
I find it just as easy and less intrusive to make my own robots.txt file. Basically, create it in Notepad and upload it to the root directory of your WordPress site, so that it is reachable at /robots.txt.
This is a good robots.txt to use on a WordPress blog; it is my generic robots.txt file that I tweak as necessary.
I will say that to avoid duplicate content I do prefer to use dofollow and nofollow plugins.
Copy and edit the text below, or download it:
User-agent: *
Disallow: /*.js
Disallow: /*.png
Disallow: /*trackback
Disallow: /*.css
Disallow: /*/feed/$
Disallow: /*/feed/rss/$
Disallow: /*/trackback/$
Disallow: /tag/
Disallow: /wp-content/plugins/
Disallow: /wp-content/cache/
Disallow: /wp-content/themes/
Disallow: /wp-admin/
Disallow: /*?*
Disallow: /*?
Allow: /wp-content/uploads
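If you want to see what the wildcard lines above actually catch before uploading, here is a rough sketch of a pattern checker. It assumes Google-style semantics (* matches any run of characters, a trailing $ anchors the end of the URL, everything else is a prefix match); other crawlers may treat these characters differently. The function names and the sample paths are made up for illustration, and the sketch only reports which Disallow patterns match, not how a crawler weighs Allow against Disallow.

```python
import re

def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate one robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with ".*" where a "*" stood.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))

def matching_rules(path: str, rules: list[str]) -> list[str]:
    """Return the rules whose pattern matches the given URL path."""
    return [rule for rule in rules if pattern_to_regex(rule).match(path)]

# A few of the Disallow patterns from the file above.
DISALLOW = ["/*.js", "/*/feed/$", "/tag/", "/wp-admin/", "/*?"]

print(matching_rules("/wp-content/themes/x/script.js", DISALLOW))  # ['/*.js']
print(matching_rules("/category/news/feed/", DISALLOW))            # ['/*/feed/$']
print(matching_rules("/?p=123", DISALLOW))                         # ['/*?']
print(matching_rules("/2008/03/some-post/", DISALLOW))             # []
```

An empty list means none of the sampled patterns would stop a crawler from fetching that path, which is what you want for your regular posts.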
Other Robots.txt Generators:
http://www.seochat.com/seo-tools/robots-generator/
http://www.mcanerin.com/EN/search-engine/robots-txt.asp