Wednesday, July 18, 2007

Blogger Adds Robots.txt

If you use Google Sitemaps for your Blogger blog and you are seeing a sudden increase in the number of links being blocked by robots.txt, don't panic: it is not anything you did wrong. Blogger has recently started adding a robots.txt file by default. As you can see from the rules below, all pages under the /search directory are disallowed, meaning that pages under /search will not be crawled by major search engines (e.g. Google, Yahoo and Live.com) and should not be included in their search result pages.

User-agent: *
Disallow: /search
Sitemap: http://gspy.blogspot.com/feeds/posts/default?orderby=updated
This is, in fact, good news for Blogger users, as Google treats most of these blocked pages as duplicate content and lists them as 'Supplemental Results'. Furthermore, the more duplicate content your site has, the less weight Google gives to your site's content.
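To see which URLs the rules shown earlier actually block, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt text is copied from this post; the two example URLs passed to the hypothetical helper `is_allowed` are made up for illustration and are not real pages.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt rules quoted in this post.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Sitemap: http://gspy.blogspot.com/feeds/posts/default?orderby=updated
"""


def is_allowed(url, user_agent="*"):
    """Return True if the rules above let `user_agent` fetch `url`."""
    parser = RobotFileParser()
    # Parse the rules from a string instead of downloading them.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical URLs, for illustration only.
label_page = "http://gspy.blogspot.com/search/label/seo"
post_page = "http://gspy.blogspot.com/2007/07/example-post.html"

print(label_page, "allowed:", is_allowed(label_page))
print(post_page, "allowed:", is_allowed(post_page))
```

Because `Disallow: /search` is a path-prefix rule, only label and search listing pages are affected; individual post URLs remain crawlable.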

A further improvement to this great feature would be to allow Blogger users to customize their own robots.txt so they could prevent undesired content from appearing in search results.

Note: If you are using Blogger and have recently redirected your feed to FeedBurner, make sure you change your sitemap URL in Google Sitemaps to http://yourblog.blogspot.com/rss.xml?orderby=updated instead of just http://yourblog.blogspot.com/rss.xml; otherwise an error will occur in Google Sitemaps.
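If you manage several blogs, the sitemap URL from the note above can be built programmatically. This is only a sketch: `sitemap_url` is a hypothetical helper name, and `yourblog.blogspot.com` is the placeholder address used in the note.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def sitemap_url(feed_url):
    """Return `feed_url` with the orderby=updated query parameter set."""
    parts = urlsplit(feed_url)
    # Keep any existing query parameters and set (or overwrite) orderby.
    query = dict(parse_qsl(parts.query))
    query["orderby"] = "updated"
    return urlunsplit(parts._replace(query=urlencode(query)))


print(sitemap_url("http://yourblog.blogspot.com/rss.xml"))
```

Calling the helper on a URL that already carries `orderby=updated` returns it unchanged, so it is safe to apply more than once.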

27 comments:

Devi Mahapatra said...

hi Keith
thanks 4 the precious post.
i have been using blogger + google_webmaster for a long time . but i was not aware that /rss.xml can be uploaded as site map. i am now using /rss.xml?orderby=updated
as my sitemap. thanks a lot.

ivilla

forex trader said...

It's a pity these robots rules do not work with other SEs.

Keith Chan said...

@ devi You're welcome.
@ forex The robots.txt rule applies to all SEs like Yahoo and MSN.

Mike Dayoub said...

This is bad news for a Blogger.com user who wants all posts to be crawled and doesn't care what weight Google gives them.

Do I have any way to get the User-agent: *
Disallow: /search

removed from my robots file?

Keith said...

@ Mike Unfortunately, there isn't an option for Blogger users to modify the robots.txt file of their blogs. However, I don't see the point of Google crawling duplicated content, because 99% of the /search pages are duplicates of your existing posts.

Mike Dayoub said...

well my posts are actually not duplicated. much of the content is the same (fire incidents) but the actual details (when, where, why) is what I want to be searchable on Google.

Chef Mom said...

I have two blogger blogs. One has minimal traffic and one has had much better results. I just checked in Adsense today, and my "minimal traffic" blog is making me some chump change, but the blog with all the traffic is showing as ZERO page impressions (although I've had 10,000+ visitors). I then used the Google Tools which told me that 47 pages of content are being blocked by robots.txt. I understand the whole "duplicate content" vs "original content", but why would nothing at all be showing up?

cindydanda said...

Sorry, I am confused. I am a user of both Blogger and Google Webmaster but I don't know how to access that robots.txt. I am having this problem because I really need to remove a page that Google has cached. I tried everything but it still hasn't worked. Now I want to try to add a robots.txt to my Blogger blog but I don't know how to do it. Help?

Keith said...

@ cindydanda Currently, Blogger users aren't able to edit their blog's robots.txt. To remove a cached page from Google, follow the instructions listed here: http://www.google.com/support/webmasters/bin/answer.py?answer=61062

सारंग पतकी said...

Hello Keith,
I am seeing my robots.txt as

User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow: /search
Disallow: /

My main page itself is blocked by the robots.txt so nothing comes in search result when I put my link for searching in google.

How to get rid of this? Any idea?

Jane Air said...

Hi,
thanks for this.
I saw there were errors in my sitemap, now it's ok.
so, thanks to you ;)

gagi said...

Nice work on the robots.txt. I searched the net a lot to find any info on why Google says I block some pages.

PALS said...

The only solution to this is create another blog per post you have. Meaning if you have 50 posts, you should have 50 blogs with 1 post each. But still i think this does not work.

Shall we all transfer to wordpress now? Lets have a massive bloggers transfer to wordpress.

Thank you.

Technical Details said...


Thanks for your tips. I was worried about why Google Sitemaps shows certain URLs of my blog as blocked by robots.txt.
Visit my blog for tips and tricks about blogger and any computer user

Mitchie said...

I've had problems editing the robots.txt. I did some research about it and found that it couldn't be edited in blogger unless you have your own server like wordpress or page.ph. Anyway, i was just adding a sitemap to get indexed by google and I did it. I don't need a sitemap anymore.

Book said...

Thank you very much. But robots.txt is not changing. Blogger doesn't give me any permission to edit it.

Gunawan said...

thanks for your tutorial bro, my this tutorial i share to my blog

Ann Donnelly said...

Thanks -- I've been to a few other pages on the topic and none actually had the correct answer. Good one!

Livingstrong said...

Thank you so much for this great information. It was very helpful and I'm subscribing right now!

Emmanuel said...

thanks for the information.

Tim Freeman said...

Thanks for the information. I looked at it today and noticed it and wondered why. I guess I got lucky finding your site first.

New subscriber to your blog!

Shane Montgomery said...

I have verified my site with Google and changed all Blogger settings to allow publishing. Yet Google will not index the site due to the automatic robots file:

User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow: /search

What is the solution. I do not have duplicate posts, do not want 75 blogs for the 75 postings, and must not be the only one having this problem.

Epenschmiede said...

Quite helpful. Thanks a lot.

hendri susanto said...

Thanks for this information, it is very useful for me as I am learning Blogspot.

Vidit said...

I got errors when I added rss.xml as sitemap, but adding rss.xml?orderby=updated solved my problem. Thanks.

Tanzy said...

Hello Keith,

Thank you for your great article about sitemaps, I have now added orderby=updated to my google sitemaps.

Yesterday I submitted my site to Bing and added the meta tag validation, and today I see that my website, which was being listed in the first and sometimes fifth position on Google for various keywords, is not visible at all, even after the second page.

Keith, do you have any idea? I suppose it's because of the Bing meta tag code? As soon as I realised this I removed the meta tag from the blog. Please advise. I am panicking. What could be the reason?

Any ideas ??

Thanks for reading!!

Vikingdread said...

@Tanzy: I have exactly the same problem. I added the Bing Meta Tag and now my blog no longer shows up in Google search results.

Removing the Bing Meta Tag is not resolving the problem. Haven't got a clue what to do about it.
