SlapMyWeb

Robots.txt tester

Paste your robots.txt and instantly check whether a specific URL is crawlable by Googlebot, Bingbot, or any other user agent. The tester handles wildcard patterns and Allow/Disallow precedence.

Example

Allowed · Googlebot · /admin/

(Googlebot uses only its own User-agent group, which does not block /admin/, so the site-wide Disallow: /admin/ rule under User-agent: * does not apply to it.)

Parsed rules:

User-agent: *
Disallow: /admin/
Disallow: /private/
Allow: /private/public/

User-agent: Googlebot
Disallow: /no-google/

What Is a Robots.txt Tester?

A robots.txt tester is an SEO debugging tool that lets you verify whether search engine crawlers such as Googlebot and Bingbot are allowed or blocked from accessing specific URLs on your website. The robots.txt file sits at your domain root and controls which pages search engines can crawl. A single misconfigured rule can accidentally block your entire site from Google, or expose private admin pages to crawlers.

This tool parses your robots.txt content, evaluates Allow and Disallow directives with proper precedence, handles wildcard patterns (* and the $ end-of-string anchor), and tests against multiple user agents. It follows the same matching logic that real search engine crawlers use, so you can catch crawl-blocking issues before they hurt your rankings.

Whether you are launching a new site, migrating URLs, or debugging indexing problems, testing your robots.txt is one of the first steps in any technical SEO audit.
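If you want to run a similar check from a script, Python's standard library ships a robots.txt parser. Note that it differs from Google's semantics: it applies rules in file order (first match wins) and does not understand the * or $ wildcards, so its verdicts can diverge from Google's longest-match logic on files that rely on those features. A minimal sketch:

```python
from urllib import robotparser

ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Allow: /admin/help
"""

rp = robotparser.RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

print(rp.can_fetch("Googlebot", "/admin/settings"))  # False: blocked by Disallow
print(rp.can_fetch("Googlebot", "/public/page"))     # True: no rule matches
# Google's longest-match rule would allow /admin/help, but the stdlib
# parser hits "Disallow: /admin/" first and reports it as blocked.
print(rp.can_fetch("Googlebot", "/admin/help"))      # False (stdlib behavior)
```

This divergence on /admin/help is exactly why a tester that implements longest-match precedence is useful.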

How to Use the Robots.txt Tester

  1. Paste your robots.txt content

     Copy your robots.txt file content and paste it into the editor on the left. You can paste the entire file, including multiple User-agent blocks and Allow, Disallow, and Sitemap directives.

  2. Enter a URL path and select a user agent

     Type the URL path you want to test (e.g., /admin/settings) and choose a crawler from the dropdown: Googlebot, Bingbot, DuckDuckBot, Yandex, Baiduspider, or the wildcard * agent.

  3. Check the result instantly

     The tool immediately shows whether the path is Allowed or Blocked for that user agent, which specific rule matched, and a full breakdown of all parsed rules, color-coded by type.
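The check performed in step 3 can be sketched in a few lines of Python. The helper names below (pattern_to_regex, is_allowed) are illustrative, not part of any library; the sketch assumes the Google-style semantics described on this page: * matches any character run, a trailing $ anchors the end of the URL, the longest matching pattern wins, and Allow beats Disallow on ties.

```python
import re

def pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern into a regex.
    '*' matches any character sequence; a trailing '$' anchors the end."""
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = "".join(".*" if ch == "*" else re.escape(ch) for ch in body)
    return re.compile("^" + regex + ("$" if anchored else ""))

def is_allowed(rules, path: str) -> bool:
    """rules: iterable of (directive, pattern) pairs from one User-agent group.
    Longest matching pattern wins; Allow beats Disallow on equal length."""
    best = None  # (pattern length, is_allow)
    for directive, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), directive.lower() == "allow")
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]

rules = [
    ("Disallow", "/admin/"),
    ("Disallow", "/private/"),
    ("Allow", "/private/public/"),
    ("Disallow", "/*.pdf$"),
]
print(is_allowed(rules, "/private/public/faq"))  # True  (longer Allow wins)
print(is_allowed(rules, "/private/notes"))       # False
print(is_allowed(rules, "/guide.pdf"))           # False (wildcard + $ anchor)
print(is_allowed(rules, "/pdf-guide/"))          # True  ($ keeps it unblocked)
```

Because a path with no matching rule falls through to True, an empty or missing group allows everything, which matches crawler behavior.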

Features

  • Tests Allow vs Disallow precedence using longest-match-wins logic, matching real crawler behavior
  • Supports wildcard patterns including * (match anything) and $ (end-of-URL anchor)
  • Pre-configured user agents for Googlebot, Bingbot, DuckDuckBot, Yandex, and Baiduspider
  • Parses and displays all rules organized by User-agent block with color-coded Allow/Disallow labels
  • Shows the exact matched rule so you can trace why a URL is blocked or allowed
  • Runs entirely in your browser -- your robots.txt content never leaves your device

Frequently Asked Questions

How does Allow/Disallow precedence work in robots.txt?
When both an Allow and Disallow rule match a URL, the most specific (longest) rule wins. If they are equal length, Allow takes precedence. For example, "Disallow: /private/" blocks /private/page but "Allow: /private/public/" specifically permits that subfolder.
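The precedence rule can be demonstrated in a few lines (a sketch that handles plain prefix rules only, no wildcards; the function name is illustrative):

```python
rules = [("Disallow", "/private/"), ("Allow", "/private/public/")]

def verdict(path: str) -> bool:
    # Longest matching pattern wins; Allow wins an exact-length tie
    # because (length, True) sorts above (length, False).
    matches = [(len(p), d.lower() == "allow") for d, p in rules if path.startswith(p)]
    return max(matches)[1] if matches else True

print(verdict("/private/page"))        # False: only Disallow matches
print(verdict("/private/public/doc"))  # True: the longer Allow wins
```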
Does robots.txt actually prevent pages from being indexed?
No. Robots.txt only controls crawling, not indexing. Google can still index a URL it has not crawled if other pages link to it. To truly prevent indexing, use a "noindex" meta robots tag or X-Robots-Tag HTTP header instead of or in addition to robots.txt.
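For reference, the two controls mentioned above look like this (a sketch; the meta tag goes in each page's head, while the header is set in your server or CDN configuration and also works for non-HTML files such as PDFs):

```html
<!-- In the page's <head> -->
<meta name="robots" content="noindex">
```

```http
X-Robots-Tag: noindex
```

Remember that Googlebot must be able to crawl the page to see either signal, so do not also block the URL in robots.txt.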
What does the * wildcard mean in robots.txt?
The asterisk (*) matches any sequence of characters in a URL path. For example, "Disallow: /*.pdf$" blocks all URLs ending in .pdf. The dollar sign ($) anchors the pattern to the end of the URL so /pdf-guide/ would not be blocked.
Should I block Googlebot from crawling CSS and JavaScript files?
No. Google needs to render your pages to evaluate content quality and mobile-friendliness. Blocking CSS/JS files via robots.txt prevents Googlebot from rendering your pages properly, which can significantly hurt your rankings.
What happens if my robots.txt has no User-agent: * block?
If a crawler finds neither its own User-agent listed nor a wildcard (*) block, it assumes all URLs are allowed. It is best practice to always include a User-agent: * section as a catch-all for unlisted crawlers.
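The group-selection fallback can be sketched as follows. This is a simplification and the function name is hypothetical: real crawlers match product tokens more precisely (e.g., Googlebot-Image prefers a googlebot-image group, then googlebot, then *), whereas this sketch just checks whether a token appears in the user-agent string.

```python
def select_group(groups: dict, user_agent: str):
    """groups maps a lowercased User-agent token to its rule list.
    Pick the most specific matching token, fall back to '*'; if
    neither exists, return None, meaning everything is crawlable."""
    ua = user_agent.lower()
    # Assumption for illustration: a token matches when it appears in the UA.
    hits = [t for t in groups if t != "*" and t in ua]
    if hits:
        return groups[max(hits, key=len)]
    return groups.get("*")

groups = {
    "*": [("Disallow", "/admin/")],
    "googlebot": [("Disallow", "/no-google/")],
}
print(select_group(groups, "Googlebot"))           # Googlebot's own group
print(select_group(groups, "Bingbot"))             # falls back to '*'
print(select_group({"googlebot": []}, "Bingbot"))  # None: all URLs allowed
```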
How often does Google re-fetch my robots.txt file?
Google typically caches robots.txt for up to 24 hours, though this can vary. If you make urgent changes, you can request a recrawl from the robots.txt report in Google Search Console, or wait for the cache to expire naturally.