How to Tell Bots What to Do with the Robots Meta Tag
The robots meta tag provides instructions to web crawlers (robots). It is an alternative to a robots.txt file and is implemented on a page-by-page basis by adding a meta tag to the head of an HTML document.
You won't find the robots meta tag mentioned in HTML5 because it isn't formally part of the specification. The tag was proposed during a W3C workshop in 1996, and its use was explained in Appendix B of HTML 4.01 in December 1999. However, the robots meta tag has never been officially added to the HTML specification.
That doesn't mean that search engines don't care about the tag. While unscrupulous web crawlers can (and do) ignore the tag, search engine web crawlers look at the robots meta tag for instructions on how to treat the contents of a webpage.
The basic syntax of the meta robots tag is quite simple:
<meta name="robots" content="instructions go here">
To use it, add the tag to the head element of every webpage that needs to provide instructions for web crawlers. Unlike the robots.txt file, which provides sitewide instructions, the robots meta tag only applies to the page on which it appears.
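As a rough illustration of where the tag goes, here is a minimal page skeleton. The title, the body text, and the noindex value are placeholders chosen for this sketch; the available values are covered in the sections that follow.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Example page (placeholder title)</title>
  <!-- The robots meta tag sits inside head, alongside other meta tags. -->
  <!-- "noindex" is only an illustrative value; use the instructions this page needs. -->
  <meta name="robots" content="noindex">
</head>
<body>
  <p>Page content goes here.</p>
</body>
</html>
```

Because the tag only applies to the page that contains it, each page that needs instructions carries its own copy.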
Robots.txt vs Robots Meta Tag
On the surface, robots.txt seems strategically superior to the robots meta tag because robots.txt keeps the web crawler instructions for an entire site in one place.
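Unfortunately, search engines don't care about convenience. When it comes to keeping URLs from being indexed, using disallow in robots.txt is not as effective as using noindex in a robots meta tag. A disallow rule only stops a crawler from fetching a URL, so the URL can still end up in the index if other pages link to it, whereas noindex tells the search engine to keep the page out of its results.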
Don't believe us? Then believe SEO experts Moz. They've presented a compelling case for choosing the robots meta tag over robots.txt.
Telling Robots What to Do
In the syntax example we used "instructions go here" as a placeholder. There are several different values that can be used in its place.
Index and Noindex
One thing you can tell a web crawler is whether or not you want the page indexed.
<!-- index this page, the default behavior -->
<meta name="robots" content="index">
<!-- don't index this page -->
<meta name="robots" content="noindex">
By default, web crawlers assume they have a green light to index every webpage. So if you want a page indexed, just omit the robots meta tag. However, if you don't want a page indexed, add the tag with a noindex value.
Follow and Nofollow
By default, web crawlers follow every link on a webpage and index those linked pages (unless they have instructions that prevent indexing). So adding a robots meta tag with the follow value is optional. However, use nofollow if you don't want web crawlers to follow the links on a page.
<!-- follow all links, the default behavior -->
<meta name="robots" content="follow">
<!-- don't follow any links -->
<meta name="robots" content="nofollow">
Instructions can piggyback on each other as well. So if you want to add both nofollow and noindex to a webpage, you can add both at once like this:
<meta name="robots" content="noindex, nofollow">
All or None-Thing
The following combinations are common:
- index, follow
- noindex, nofollow
You can use the shorthand all and none instructions to make this happen.
<!-- equivalent to content="follow, index", the default behavior -->
<meta name="robots" content="all">
<!-- equivalent to content="nofollow, noindex" -->
<meta name="robots" content="none">
Specialized Meta Robots Tags
Use name="robots" and you'll be addressing every web crawler that's listening. However, you can also target one web crawler at a time.
<!-- instructions for google's web crawler -->
<meta name="googlebot" content="instructions go here">
<!-- instructions for yahoo's web crawler -->
<meta name="slurp" content="instructions go here">
<!-- instructions for bing and msn web crawlers -->
<meta name="bingbot" content="instructions go here">
<!-- instructions for ask's web crawler -->
<meta name="teoma" content="instructions go here">
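A general tag and a crawler-specific tag can sit side by side in the same head. The sketch below assumes you want every crawler to skip the links on a page while additionally asking Google's crawler not to index it; the combination and the placeholder title are purely illustrative. How a crawler resolves overlapping instructions is up to that search engine, so check its documentation before relying on a particular outcome.

```html
<head>
  <title>Example page (placeholder title)</title>
  <!-- Addressed to every crawler that honors the robots meta tag. -->
  <meta name="robots" content="nofollow">
  <!-- Additional instruction addressed only to Google's crawler. -->
  <meta name="googlebot" content="noindex">
</head>
```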
There are additional keywords you can use to deliver targeted instructions. However, it's worth noting that not all search engines pay attention to all of these commands.
content="noimageindex": instructs web crawlers not to index any images that appear on a webpage. However, if those images appear on other webpages they will still be indexed. To prevent indexing of the images themselves, add robots instructions to the HTTP header (X-Robots-Tag) delivered with the image file.
content="noarchive": instructs web crawlers to index the webpage without caching a complete copy of the webpage.
content="nosnippet": instructs search engines not to display a snippet when the page is shown in the search results and also prevents caching of the page.
content="noodp": instructs search engines not to use the page description from ODP as the snippet in search results.
content="noydir": instructs Yahoo not to use the page description from the Yahoo Directory as the snippet in search results.
content="notranslate": instructs Google not to offer to translate the web page.
content="unavailable_after: [RFC 850 date/time]": instructs Google that the page should not appear in search engine results after a specific date and time.
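Several of these keywords can share one content attribute, separated by commas in the same way as the earlier noindex, nofollow example. The following sketch assumes a page that should not be cached and whose images should not be indexed, plus a Google-specific expiry. The date is a made-up value, and its exact format is an assumption to verify against Google's current documentation.

```html
<head>
  <title>Example page (placeholder title)</title>
  <!-- For all crawlers: don't keep a cached copy, don't index images on this page. -->
  <meta name="robots" content="noarchive, noimageindex">
  <!-- For Google only: stop showing the page in results after this hypothetical date. -->
  <meta name="googlebot" content="unavailable_after: 25-Aug-2030 15:00:00 EST">
</head>
```

Since not every search engine honors every keyword, it is worth testing how each crawler you care about treats a combination like this.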
Most robots meta tags are pretty simple. However, if you're planning on providing complex instructions, here are a few resources you can use to learn more about these tags.
- Robots.txt Ultimate Guide: learn how to use a robots.txt file as an alternative to the robots meta tag.
- The Ultimate Guide to the Meta Robots Tag: this post includes a table that lists the commands different search engine robots pay attention to.
- Robots Meta Tag and X-Robots: learn how Googlebot handles robot instructions.
- About the Robots <META> tag: an official overview of the robots meta tag.
Adding Robots Meta Tags to Your Website
If you want to add robots meta tags to your site you can copy the tags presented in this article and paste them into your site's HTML. In addition, there are tools you can use to generate custom instructions and to automatically add tags to webpages generated by content management systems.
- Advanced Meta Tag Generator and Google Search Results Preview: create meta tags, including robot tags, with this tool and see a preview of how Google will display your site with those rules in effect.
- Free Meta Tag Generator: create meta tags, including robot tags, in plain HTML. This tool does create an extra name="generator" tag that you will probably not want to use.
- WordPress Plugin, Meta Tag Manager: easily add a wide range of meta tags to individual pages.
- WordPress Plugin, GA Meta Tags: easily set meta tags on a site-wide basis with this simple plugin.
- Joomla Plugin, Easy Frontend SEO: easily control robot meta tags.
- Joomla Plugin, Tag Meta: a simple meta tag extension.
- Joomla Plugin, Meta Robots: create tags with a radio-button interface.
- Drupal Module, Custom Meta: create any type of meta tag with a simple form.
- Drupal Module, Meta Tags Node Type: add meta tags on a per-node basis.
Please, Control Your Robots
If you want to control how web crawlers index your site, the robots meta tag is a good mechanism. It's easy to understand, easy to implement, and can have a powerful impact on how search engines index your website.
Use it well and the robots meta tag will ensure search engines treat your site just the way you want them to. Use it poorly and search engines may forget all about you.
Just make sure you use it with care.
Further Reading and Resources
We have more guides, tutorials, and infographics related to coding and website development:
- Composing Good HTML: this is a solid introduction to writing well-formed HTML and using HTML validator software.
- CSS3 — Intro, Guides & Resources: this is a great place to start learning webpage layout.
- ASP.NET Resources: this guide will get you going with Microsoft's .NET framework for creating webpages.
HTML for Beginners — Ultimate Guide
If you really want to learn HTML, we've created a book-length article, HTML for Beginners — Ultimate Guide. And it really is the ultimate guide; it will take you from the very beginning to mastery.