Robots.txt Generator

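Generate a robots.txt file to control how search engines crawl your site. Select a user agent, add Allow or Disallow rules for specific paths, include your sitemap URL, and set a crawl delay. Then copy the output into a file named robots.txt at your site's root. Note that Crawl-delay is a non-standard directive: some crawlers honor it, but Googlebot ignores it.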
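As a rough illustration of what the generated output looks like, here is a minimal Python sketch that assembles a robots.txt file from the same inputs. The function names build_group and build_robots_txt are hypothetical helpers made up for this example, not part of this tool or of any library.

```python
def build_group(user_agent="*", allow=(), disallow=(), crawl_delay=None):
    """Return one User-agent group as text (hypothetical helper)."""
    lines = [f"User-agent: {user_agent}"]
    lines += [f"Allow: {path}" for path in allow]
    lines += [f"Disallow: {path}" for path in disallow]
    if crawl_delay is not None:
        # Crawl-delay is non-standard; not every crawler honors it.
        lines.append(f"Crawl-delay: {crawl_delay}")
    return "\n".join(lines)


def build_robots_txt(groups, sitemaps=()):
    """Join groups with blank lines and append Sitemap lines at the end."""
    parts = list(groups)
    if sitemaps:
        parts.append("\n".join(f"Sitemap: {url}" for url in sitemaps))
    return "\n\n".join(parts) + "\n"


if __name__ == "__main__":
    text = build_robots_txt(
        [build_group(disallow=["/admin/"], crawl_delay=10)],
        sitemaps=["https://example.com/sitemap.xml"],
    )
    print(text, end="")
```

Sitemap lines are independent of any User-agent group, which is why the sketch writes them once at the end rather than inside each group.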

FAQ

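robots.txt must sit at the root of the host it applies to, for example https://example.com/robots.txt. Crawlers request exactly that location, so a file in a subdirectory such as /blog/robots.txt is ignored. Each subdomain needs its own file: the rules at https://example.com/robots.txt do not cover https://blog.example.com.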
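To make the location rule concrete, the following Python sketch derives the robots.txt URL that governs a given page. The helper name robots_txt_url is an assumption for illustration; it only uses the standard urllib.parse module.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url):
    """Return the robots.txt URL for the host that serves page_url."""
    parts = urlsplit(page_url)
    # Keep scheme and host (including any port); drop path, query, fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    print(robots_txt_url("https://example.com/blog/post?id=1"))
    print(robots_txt_url("https://blog.example.com/archive/"))
```

The two calls return different URLs because the pages live on different hosts, which is why a subdomain needs its own file.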

Major search engines (Google, Bing, etc.) respect robots.txt. However, malicious bots, scrapers, and spambots routinely ignore it. robots.txt is a polite request, not a security mechanism. Use proper authentication for sensitive content.

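Disallow in robots.txt prevents crawling, but the URL can still appear in search results (for example, if other pages link to it). Noindex, set with a robots meta tag or an X-Robots-Tag HTTP header, keeps the page out of search results entirely. A crawler has to fetch the page to see a noindex directive, so do not combine the two: a page that is disallowed in robots.txt never has its noindex read. Use Disallow for crawl budget management; use Noindex to hide content from search.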
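The two noindex forms mentioned above are the meta tag `<meta name="robots" content="noindex">` and the response header `X-Robots-Tag: noindex`. The Python sketch below checks a response for either one. The function is_noindex is a hypothetical helper, and it is deliberately simplified: it ignores crawler-specific variants such as `<meta name="googlebot" ...>` or a user agent prefix in the header value.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for directive in attr.get("content", "").split(","):
                self.directives.add(directive.strip().lower())


def is_noindex(headers, html):
    """Return True if the header or the meta tag carries noindex."""
    header_directives = set()
    for name, value in headers.items():
        if name.lower() == "x-robots-tag":
            header_directives.update(d.strip().lower() for d in value.split(","))
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in header_directives or "noindex" in finder.directives


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
    print(is_noindex({}, page))
    print(is_noindex({"X-Robots-Tag": "noindex"}, ""))
```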

Google Search Console has a robots.txt report under Settings > Crawling, which replaced the older robots.txt Tester tool. It shows the robots.txt files Google found for your site, when each was last fetched, and any errors or warnings in them. To check whether a specific URL is blocked, use the URL Inspection tool. Validate your rules before deploying to avoid accidentally blocking your entire site.
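Rules can also be checked locally before deploying. The sketch below uses Python's standard urllib.robotparser module; the helper name blocked_urls and the sample rules are assumptions for illustration. Python's parser applies the first matching rule and does not implement the * and $ path wildcards, so its verdicts can differ from Google's on complex files. Treat it as a sanity check, not as a substitute for Search Console.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, urls, user_agent="*"):
    """Return the URLs that the given rules block for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


if __name__ == "__main__":
    rules = "User-agent: *\nDisallow: /admin/\n"
    important = [
        "https://example.com/",
        "https://example.com/products/widget",
        "https://example.com/admin/login",
    ]
    blocked = blocked_urls(rules, important)
    print("Blocked:", blocked)
    # A stray "Disallow: /" blocks every URL in the list.
    if len(blocked) == len(important):
        print("Warning: these rules block the entire site")
```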

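Yes, you can set different rules for different crawlers. Each User-agent: group defines rules for a specific crawler. For example, you could allow Googlebot full access while restricting Bingbot. Use * as the user agent for rules that apply to all crawlers not specifically named. A crawler follows only the group that matches it most specifically, so a crawler with its own group does not also inherit the * rules.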
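A small Python sketch of per-crawler groups, again using the standard urllib.robotparser module. The file contents, paths, and the name SomeOtherBot are invented for the example. An empty Disallow: line means nothing is disallowed for that group.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: Googlebot unrestricted, Bingbot kept out of
# /private/, every other crawler kept out of /tmp/.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow:

User-agent: Bingbot
Disallow: /private/

User-agent: *
Disallow: /tmp/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

if __name__ == "__main__":
    for agent in ("Googlebot", "Bingbot", "SomeOtherBot"):
        for path in ("/private/page", "/tmp/cache"):
            allowed = parser.can_fetch(agent, "https://example.com" + path)
            verdict = "allowed" if allowed else "blocked"
            print(f"{agent:13} {path:14} {verdict}")
```

In this sketch Bingbot may fetch /tmp/cache even though the * group disallows it, because Bingbot has its own group and only that group applies to it.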