Add Custom Robots.txt File in Blogger

Are you a blogger without much technical background who wants to improve your blog's rankings and audience, and a blogger friend told you that editing your Robots.txt file can bring in more visitors? Or maybe you don't want search engine spiders to crawl some of your pages? Or you do have a technical background but don't want to risk making changes without an expert's word on the topic? In any case, this is the right place for you. In this tutorial, you will see how to add a custom Robots.txt file in Blogger in a few easy steps.

But before we open and start working on Robots.txt, let's have a brief overview of its significance:
Warning! Use with caution. Incorrect use of these features can result in your blog being ignored by search engines.

What is Robots.txt?

Blogger automatically generates a Robots.txt file for every blog you create. The purpose of this file is to tell incoming robots (the spiders and crawlers sent by search engines such as Google and Yahoo) which parts of your blog they may crawl and which they should stay out of. As a blogger, you want certain pages of your site to be crawled and indexed by search engines, while others, such as a label page, a demo page or any other irrelevant page, you might prefer to keep out.

How do they see Robots.txt?

Robots.txt is the first thing these spiders look at when they reach your site. It works like a flight attendant who directs you to your seat and keeps checking that you don't enter private areas. Well-behaved spiders crawl only the files that Robots.txt permits and leave the rest alone.

Where is Robots.txt located?

You can easily view your Robots.txt file either in your browser, by adding /robots.txt to your blog address (for example http://myblog.blogspot.com/robots.txt), or by signing in to your blog, choosing Settings > Search preferences > Crawlers and indexing and selecting Edit next to Custom robots.txt.
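
As a small illustration, the same address can be derived from any URL of the blog in Python. This is only a sketch: the helper name robots_txt_url is made up for this example, and myblog.blogspot.com is the placeholder address used throughout this tutorial.

```python
from urllib.parse import urljoin


def robots_txt_url(blog_url):
    """Return the robots.txt address for a blog (hypothetical helper)."""
    # The leading slash makes urljoin replace the whole path, so any page
    # of the blog maps to the same robots.txt at the site root.
    return urljoin(blog_url, "/robots.txt")


print(robots_txt_url("http://myblog.blogspot.com/2014/07/my-post.html"))
```

Whichever post or page address you pass in, the result points at /robots.txt directly under the blog address.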


What does Robots.txt look like?

If you haven't touched your robots.txt file yet, it should look something like this:
User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow: /search
Allow: /

Sitemap: http://myblog.blogspot.com/feeds/posts/default?orderby=UPDATED
Don't worry if the code in your file isn't colored or split over separate lines in the same way; the formatting here is only meant to make each part easier to follow.

User-agent: Mediapartners-Google
Mediapartners-Google is Google's AdSense robot, which crawls your site to work out which ads are relevant to your blog. The empty Disallow: line beneath it means that nothing is off limits to this robot. If you disallow posts or pages for it, AdSense will not be able to serve relevant ads on them. If you are not using Google AdSense on your site, simply remove both of these lines.

User-agent: *
Those of you with a little programming experience will have guessed the meaning of the '*' character (a wildcard). For everyone else: it specifies that this section (and the lines beneath it) applies to all incoming spiders, robots and crawlers that are not named in a section of their own.

Disallow: /search
The Disallow keyword lists what robots should not do on your blog. Adding /search after it tells robots not to crawl your blog's search pages and search results. Since label pages live under /search, a URL like http://myblog.blogspot.com/search/label/mylabel will not be crawled, and so will normally stay out of search results.

Allow: /
The Allow keyword lists what robots may do on your blog. The '/' matches every address on your blog, starting with the homepage, so robots may crawl anything that is not explicitly disallowed.
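
The User-agent, Disallow and Allow lines can be tried out without touching your blog. The sketch below feeds the default rules shown earlier to Python's standard urllib.robotparser module and asks which addresses a robot may fetch. The helper may_crawl and the sample paths are made up for this example. Note also that Python's parser applies the first rule that matches, while Google picks the most specific one; for this particular file both approaches agree.

```python
from urllib.robotparser import RobotFileParser

# The default rules shown earlier (the Sitemap line is left out because
# it does not affect what may be crawled).
DEFAULT_RULES = """\
User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow: /search
Allow: /
"""

parser = RobotFileParser()
parser.parse(DEFAULT_RULES.splitlines())

BLOG = "http://myblog.blogspot.com"


def may_crawl(agent, path):
    """Ask the parsed rules whether `agent` may fetch `path` (hypothetical helper)."""
    return parser.can_fetch(agent, BLOG + path)


# Ordinary crawlers fall under "User-agent: *".
print(may_crawl("Googlebot", "/search/label/mylabel"))
print(may_crawl("Googlebot", "/2014/07/my-post.html"))
# The AdSense robot has its own section with an empty Disallow.
print(may_crawl("Mediapartners-Google", "/search/label/mylabel"))
```

With these rules the three calls print False, True and True: label pages are closed to ordinary crawlers, posts are open, and the AdSense robot may go anywhere.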

Sitemap:
The Sitemap keyword points to your blog's sitemap; the line given here uses the blog's feed, ordered by update time, so robots learn about every new post. By specifying it with a link, we make crawling more efficient: incoming robots find a path to all of our post links, so that none of our published posts is left out from the SEO perspective.

However, by default this feed lists only the 25 most recent posts, so if you want robots to discover more of them, replace the sitemap link with this one:
Sitemap: http://myblog.blogspot.com/atom.xml?redirect=false&start-index=1&max-results=500
And if you have more than 500 published posts, then you can use two sitemaps like below:
Sitemap: http://myblog.blogspot.com/atom.xml?redirect=false&start-index=1&max-results=500
Sitemap: http://myblog.blogspot.com/atom.xml?redirect=false&start-index=501&max-results=500
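
For larger blogs the same pattern simply repeats. Here is a minimal Python sketch that writes out the Sitemap lines for any number of posts, assuming that start-index counts from 1 and that each block holds at most 500 posts so the blocks do not overlap. The function name sitemap_lines and the post count of 1200 are made up for this example.

```python
def sitemap_lines(blog_url, total_posts, page_size=500):
    """Build one Sitemap line per block of `page_size` posts (hypothetical helper)."""
    lines = []
    # start-index is assumed to be 1-based, so blocks begin at 1, 501, 1001, ...
    for start in range(1, total_posts + 1, page_size):
        lines.append(
            f"Sitemap: {blog_url}/atom.xml?redirect=false"
            f"&start-index={start}&max-results={page_size}"
        )
    return lines


for line in sitemap_lines("http://myblog.blogspot.com", 1200):
    print(line)
```

For a blog with 1200 posts this prints three lines, starting at post 1, 501 and 1001.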

How to prevent posts/pages from being indexed and crawled?

In case you haven't figured it out yourself yet, here is how to stop spiders from crawling and indexing particular posts or pages:

Disallow Particular Post
Disallow: /yyyy/mm/post-url.html
The /yyyy/mm part is the year and month in which your blog post was published, and /post-url.html is the page you don't want crawled. To keep a post from being crawled and indexed, simply copy the URL of the post you want to exclude and remove the blog address from the beginning.

Disallow Particular Page

To disallow a particular page, you can use the same method as above. Just copy the page URL and remove your blog address from it, so that it will look something like this:
Disallow: /p/page-url.html
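
Removing the blog address by hand is easy to get wrong, so here is a small Python sketch that does it for both posts and pages. The function name disallow_line is made up for this example, and the URLs are the placeholders used above with an assumed year and month filled in.

```python
from urllib.parse import urlparse


def disallow_line(full_url):
    """Turn a full post or page URL into a Disallow rule (hypothetical helper)."""
    # Keep only the path, i.e. everything after the blog address.
    path = urlparse(full_url).path
    return f"Disallow: {path}"


print(disallow_line("http://myblog.blogspot.com/2014/07/post-url.html"))
print(disallow_line("http://myblog.blogspot.com/p/page-url.html"))
```

The two printed lines are Disallow: /2014/07/post-url.html and Disallow: /p/page-url.html, ready to be pasted under User-agent: *.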

Adding Custom Robots.txt to Blogger

Now let's see how exactly you can add Custom Robots.txt file in Blogger:

1. Sign in to your Blogger account and click on your blog.
2. Go to Settings > Search preferences > Crawlers and indexing.


3. Select 'Edit' next to Custom robots.txt and check the 'Yes' check box.
4. Paste your code or make changes as per your needs.


5. Once you are done, press the Save Changes button.
6. And congratulations, you are done!

How to check that your changes to Robots.txt were saved?

As explained above, simply type your blog address in the URL bar of your browser and add /robots.txt at the end, as in this example:
http://helplogger.blogspot.com/robots.txt
Once you open the robots.txt file, you should see the code that you saved as your custom robots.txt.
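
If you prefer not to compare the file by eye, the check can be scripted. The following Python sketch is only an illustration: the function names fetch_robots_txt and missing_rules are made up, the download needs network access, and the demonstration at the bottom uses sample text instead of a real blog.

```python
from urllib.request import urlopen


def fetch_robots_txt(blog_url):
    """Download the live robots.txt of a blog (needs network access)."""
    with urlopen(blog_url.rstrip("/") + "/robots.txt") as response:
        return response.read().decode("utf-8")


def missing_rules(live_text, expected_rules):
    """Return the expected rules that do not appear in the live file."""
    live_lines = {line.strip() for line in live_text.splitlines()}
    return [rule for rule in expected_rules if rule.strip() not in live_lines]


# Offline demonstration with sample text standing in for the downloaded file.
sample = "User-agent: *\nDisallow: /search\nAllow: /\n"
print(missing_rules(sample, ["Disallow: /search", "Disallow: /p/page-url.html"]))
```

The demonstration reports Disallow: /p/page-url.html as missing, because the sample text does not contain that line. For a real check, pass the result of fetch_robots_txt with your own blog address as live_text; an empty list means every rule you expected is live.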

Final Words:

Are we through then, bloggers? Are you done adding the custom Robots.txt in Blogger? It was easy once you knew what those code words meant. If you didn't get it the first time, just go through the tutorial again and before long you will be customizing your friends' robots.txt files.

In any case, from the point of view of SEO and site ratings, it's worth making those small changes to your robots.txt file, so don't be a sloth. Learning is fun, as long as it's free, isn't it?

8 comments:

  1. Thank's a lot for your post, but I have a question, it's probably in domaing blogspot.com to .com or .net ?? or not it's possible?

  2. Nice Post Admin ... Thnkx nice Article ... :! :D :)

    http://w2bz.blogspot.com/

  3. Learning is fun, i like your final words.... !

  4. I think , we should remove "Disallow :/search" , so google will index tags in blogger .

  5. Anonymous, 7/22/2014

    showed page cant found message while looking for robots.txt file e.g. http://myblog.blogspot.com/ even though changed the url for my blog??

  6. i have written this Allow: / Sitemap: http://myblog.blogspot.com/atom.xml?redirect=false&start-index=500&max-results=1000
    is it okay admin thanks http://blackoxs.com

  7. "If you haven't touched your robots.txt file yet, it should look something like this.."

    Well, all I found in Template > Custom robots.txt > Edit > Enable was a completely empty box! Nothing was written. I've never touched the file because, till I came across your post today, I was too afraid I would mess up things. How come it's empty?

    At any rate, should I leave it as it is? Or overlook it and just copy what was supposed to be there from your post, ie. "User-agent: Mediapartners-Google.. etc"?

    Thank you in advance.

    Replies
    1. Hi Demi,

      The box is supposed to be empty so that you can override the default settings. Just copy paste the code above inside that empty box and add your own settings.
