The robots.txt file controls how search engines crawl a website, so it plays an important role in the search engine optimization of Blogger blogs. In this article, we will look at the best way to set up the robots.txt file for a Blogger blog.
What is the function of the robots.txt file?
With the help of the robots.txt file, we tell search engines what pages should and shouldn’t be crawled. Thus, it allows us to control the activity of search engine bots.
In the robots.txt file, we use the User-agent, Allow, Disallow, and Sitemap directives to name the search engine bots a rule applies to, the paths they may crawl, the paths they may not crawl, and the location of the sitemap.
Normally, the rules are written once for all search engine crawlers. To tune them for Blogger, however, you first need to understand the robots.txt file that Blogger generates by default.
Best Robots.txt File for Blogger Blog
To create the perfect custom robots.txt file for a Blogger (BlogSpot) blog, we first have to understand how a Blogger blog is structured. For this, let us examine the default robots.txt file.
By default, this file looks like this:
- The first line of this file declares the bot that the rules apply to. Here it is the Google AdSense crawler (Mediapartners-Google), and nothing is disallowed for it. That means AdSense ads can appear on the entire site.
- The next user agent is *, which matches all search engine bots. They are not allowed to crawl /search pages, so both search result pages and label pages are disallowed (they share the same URL structure).
- The Allow directive specifies that every page not covered by a Disallow rule may be crawled.
- The next line contains the post sitemap for the Blogger blog.
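The four points above can be checked with Python's standard urllib.robotparser module. The sketch below is only an illustration: the file text is reconstructed from the description above (the exact default file may differ), and example.blogspot.com and the sample paths are placeholder assumptions.

```python
from urllib.robotparser import RobotFileParser

# Default Blogger robots.txt, reconstructed from the description above (assumption).
DEFAULT_ROBOTS_TXT = """\
User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow: /search
Allow: /

Sitemap: https://example.blogspot.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(DEFAULT_ROBOTS_TXT.splitlines())

BASE = "https://example.blogspot.com"  # placeholder blog address

# Label and search pages live under /search, so ordinary bots are kept out.
print(parser.can_fetch("Googlebot", BASE + "/search/label/SEO"))             # False
# A post URL is not under /search, so it may be crawled.
print(parser.can_fetch("Googlebot", BASE + "/2023/05/my-post.html"))         # True
# The AdSense crawler has an empty Disallow, so nothing is blocked for it.
print(parser.can_fetch("Mediapartners-Google", BASE + "/search/label/SEO"))  # True
# Sitemap lines are collected separately (Python 3.8 or newer).
print(parser.site_maps())
```

Note that urllib.robotparser only does plain prefix matching, which is enough for this default file because it contains no wildcards.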
This is an almost perfect file for controlling search engine bots and telling them which pages to crawl and which to skip. Note that allowing a page to be crawled does not guarantee that it will be indexed.
But this file also allows archive pages to be crawled and indexed, which can cause duplicate content issues. That means it fills the index with low-value pages from the Blogger blog.
We have to prevent the duplicate content problem caused by the archive. That can be achieved by preventing bots from crawling the archive pages. For this, we add a Disallow: /20* rule to the robots.txt file. But on its own this rule also stops the crawling of posts, because Blogger post URLs begin with the year (for example /2023/05/post-title.html). To avoid this, we add an Allow: /*.html rule, which lets bots crawl posts and pages.
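To see how the /20* and /*.html rules interact, here is a minimal sketch of a wildcard matcher. It assumes the precedence that Google documents for robots.txt (the longest matching pattern wins, and Allow wins a tie); Python's built-in urllib.robotparser does not expand * wildcards, so the matching is written by hand. The names RULES, pattern_to_regex, and is_allowed, as well as the sample paths, are made up for this illustration.

```python
import re

# Rules for "User-agent: *" as described in the text (an assumed reconstruction).
RULES = [
    ("disallow", "/search"),
    ("disallow", "/20*"),
    ("allow", "/*.html"),
]

def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regular expression."""
    anchored = pattern.endswith("$")  # a trailing "$" pins the end of the path
    if anchored:
        pattern = pattern[:-1]
    # "*" matches any run of characters; everything else is literal.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_allowed(path, rules=RULES):
    """Return True if `path` may be crawled under longest-match precedence."""
    best_length, allowed = -1, True  # no matching rule means the path is allowed
    for directive, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            is_allow = directive == "allow"
            # Longer patterns win; on a tie, Allow beats Disallow.
            if len(pattern) > best_length or (len(pattern) == best_length and is_allow):
                best_length, allowed = len(pattern), is_allow
    return allowed

print(is_allowed("/2023/05/my-post.html"))  # True: /*.html is longer than /20*
print(is_allowed("/2023/05/"))              # False: archive page, only /20* matches
print(is_allowed("/search/label/SEO"))      # False: label page under /search
print(is_allowed("/p/about.html"))          # True: static page ending in .html
```

The post URL matches both /20* and /*.html, and the longer Allow pattern decides, which is why posts stay crawlable while the bare archive URLs are blocked.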
The default sitemap includes posts, not pages. So you must also add the sitemap for pages, which is located at https://example.blogspot.com/sitemap-pages.xml, or at https://www.example.com/sitemap-pages.xml for a custom domain.
So the perfect new robots.txt file for Blogger blogs would look like this:
Sitemap: https://www.example.com/sitemap-pages.xml
You must replace www.example.com with your Blogger domain or your custom domain. For example, assuming your custom domain is www.iashindu.com, the sitemap will be at https://www.iashindu.com/sitemap.xml. You can also check the current robots.txt file at https://www.example.com/robots.txt.
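If you maintain more than one blog, the domain substitution can be scripted. The sketch below is an assumption-laden illustration: the file body is reconstructed from the rules discussed in this article (AdSense bot unrestricted, /search and /20* disallowed, /*.html allowed, post and page sitemaps), and build_robots_txt is a hypothetical helper name, not part of Blogger.

```python
def build_robots_txt(domain):
    """Render the custom robots.txt discussed above for the given domain.

    The rule set is a reconstruction from the article text, not an official
    Blogger template.
    """
    return (
        "User-agent: Mediapartners-Google\n"
        "Disallow:\n"
        "\n"
        "User-agent: *\n"
        "Disallow: /search\n"
        "Disallow: /20*\n"
        "Allow: /*.html\n"
        "\n"
        f"Sitemap: https://{domain}/sitemap.xml\n"
        f"Sitemap: https://{domain}/sitemap-pages.xml\n"
    )

# Example with the custom domain used in the text.
print(build_robots_txt("www.iashindu.com"))
```

The printed text can then be pasted into Blogger as the custom robots.txt.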
The settings in the file above are best practice for robots.txt as well as for SEO. They save your site’s crawl budget and help your Blogger blog appear in search results. Along with that, you have to write SEO-friendly content to appear in search results.
For the best possible results, combine the custom robots.txt file with the advanced robots meta tag settings. This combination is one of the best practices for boosting the SEO of a Blogger blog.
How to edit the robots.txt file of a Blogger blog?
The robots.txt file is always located at the root level of a website. But Blogger gives you no root access, so how do you edit this robots.txt file?
Blogger exposes root-level files such as robots.txt and ads.txt in its settings. You must log in to your Blogger account and edit the robots.txt file there.
- Go to the Blogger dashboard and click on the Settings option.
- Scroll down to the Crawlers and indexing section.
- Enable custom robots.txt with the toggle button.
- Click on Custom robots.txt. A window will open; paste the robots.txt file into it and update.
After updating the custom robots.txt file, test it by visiting https://www.example.com/robots.txt, replacing www.example.com with the address of your own domain.
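The same check can be done from a script. This is a minimal sketch using only the Python standard library; fetch_robots_txt and missing_lines are hypothetical helper names, and the expected lines are whatever you pasted into Blogger.

```python
from urllib.request import urlopen

def fetch_robots_txt(domain):
    """Download the live robots.txt of `domain` (requires network access)."""
    with urlopen(f"https://{domain}/robots.txt", timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")

def missing_lines(live_text, expected_lines):
    """Return the expected lines that do not appear in the live file."""
    live = {line.strip() for line in live_text.splitlines()}
    return [line for line in expected_lines if line.strip() not in live]

# Offline example with a sample file; for a real check, pass
# fetch_robots_txt("www.example.com") as the first argument instead.
sample = "User-agent: *\nDisallow: /search\nAllow: /\n"
print(missing_lines(sample, ["Disallow: /search", "Disallow: /20*"]))
# ['Disallow: /20*']
```

An empty list means every rule you expected is present in the published file.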