This Shopify robots.txt guide covers what the file actually controls, the defaults Shopify ships, how to add custom rules through the new robots.txt.liquid template, and the AI bot blocks every store should consider in 2026. The short version: you can now edit it, you should block the bots that scrape without sending traffic, and you should never use it to deindex pages.
For years Shopify locked the robots.txt file. Then in mid-2021 they shipped robots.txt.liquid, which means you can finally add, remove, or change directives without leaving the platform. Most stores still run the defaults and miss the chance to control crawling.
The 2026 reality is that AI crawlers now make up a meaningful chunk of bot traffic, and most of them do not send referral visits. Knowing which ones to allow and which to block is the new robots.txt skill.
In this post
- What robots.txt actually does
- Shopify’s default robots.txt
- Custom robots.txt on Shopify
- Blocking AI bots
- Allowing Google and Bing
- Common directives
- Testing your robots.txt
- Robots vs noindex
- FAQ
What robots.txt actually does
Robots.txt is a plain text file at the root of your domain that tells crawlers which paths they may or may not request. It is a politeness protocol, not a security wall. Well-behaved bots respect it. Bad actors ignore it.
Two things robots.txt does well: stop crawlers from wasting your bandwidth on low-value paths, and stop search engines from spending crawl budget on URLs that should not be indexed. Two things it does badly: hide pages from search results (use noindex), and protect sensitive data (use auth).
You can pull and inspect any store’s live file with our free Sitemap Checker, which also surfaces the sitemap entries declared inside it.
Shopify’s default robots.txt
Out of the box Shopify generates a sensible robots.txt for every store. It blocks crawlers from cart, checkout, account, and search result pages, and it declares the sitemap. Here is the gist of what ships:
User-agent: *
Disallow: /admin
Disallow: /cart
Disallow: /orders
Disallow: /checkout
Disallow: /account
Disallow: /collections/*sort_by*
Disallow: /*/collections/*sort_by*
Disallow: /collections/*+*
Disallow: /collections/*%2B*
Disallow: /search
Sitemap: https://example.myshopify.com/sitemap.xml
The defaults handle 80% of stores. The remaining 20% care about parameter handling, sub-collection bloat, AI bot blocking, or letting specific tools through. That is when you reach for the template override.
Custom robots.txt on Shopify
Editing robots.txt on Shopify is a theme task, not a settings task. Open your live theme, go to Edit code, and create a new template called robots.txt.liquid under Templates. Shopify will use that file to generate /robots.txt from then on.
The cleanest pattern is to start from Shopify’s default output and add to it instead of replacing it wholesale. Use the robots object Shopify exposes inside the template:
{% for group in robots.default_groups %}
{{- group.user_agent }}
{% for rule in group.rules -%}
{{ rule }}
{% endfor -%}
{%- if group.sitemap != blank %}
{{ group.sitemap }}
{% endif %}
{% endfor %}
User-agent: GPTBot
Disallow: /
User-agent: ClaudeBot
Disallow: /
Save the template, then visit yourdomain.com/robots.txt to confirm the changes are live. If you need to generate the rules without writing Liquid, the Robots.txt Generator outputs a ready-to-paste block.
Blocking AI bots
The big question for 2026 is whether to let AI crawlers index your catalog. There is no single right answer. Block everything and you give up referral traffic from ChatGPT, Perplexity, and Gemini. Allow everything and you let model trainers scrape your copy without compensation.
Here is the practical split most stores land on: allow the bots that send users to your site, block the ones that only train models.
- Allow: OAI-SearchBot, PerplexityBot, ChatGPT-User, Claude-User (these fetch pages for live answers and can drive referral clicks).
- Block: GPTBot, anthropic-ai, CCBot, Google-Extended, FacebookBot, Bytespider (these collect content for model training and rarely send traffic). Note that Google-Extended is a training control token rather than a separate crawler; blocking it opts your content out of Gemini training without affecting Google Search.
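Translated into directives, that split looks roughly like this (a sketch; trim the list to the user agents your logs actually show):

```text
# Training-only crawlers: block
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Bytespider
Disallow: /

# Live-retrieval bots: allow
User-agent: OAI-SearchBot
Allow: /

User-agent: PerplexityBot
Allow: /
```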
Audit which bots are hitting your store first. Our free AI Bot Checker shows the top AI crawlers requesting any URL, so you know which user agents are worth a directive before you write the rules.
Stores that depend on AI search visibility should pair the allow list with proper structured data. The Shopify AEO guide walks through what to feed those bots once you let them in.
Allowing Google and Bing
The biggest robots.txt mistake is accidentally blocking Google. It usually happens when a store copies a “block everything” rule from a staging site, forgets to remove it, and watches traffic die over the next two weeks.
The safe pattern is an explicit allow for the search engines you care about, even if you have a permissive default. Belt and braces:
User-agent: Googlebot
Allow: /
User-agent: Googlebot-Image
Allow: /
User-agent: Bingbot
Allow: /
Googlebot-Image deserves its own line because image search drives a meaningful share of clicks for stores with strong product photography. Block it and you lose that channel entirely.
Common directives
Four directives cover almost everything you will write:
- User-agent: which bot the rules apply to. * means all.
- Disallow: a path the bot may not request. An empty value means nothing is disallowed.
- Allow: an exception inside a Disallow. Use sparingly.
- Sitemap: absolute URL to your sitemap. You can list multiple.
Wildcards are supported by Google and Bing but not guaranteed for every bot. A rule like Disallow: /*?*sort= catches every URL with a sort parameter. Use wildcards to clean up faceted navigation paths that bloat your crawl budget.
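For example, a couple of wildcard patterns for faceted collection URLs (the parameter names here are illustrative; match them to what your theme actually appends):

```text
# Any collection URL with a sort parameter in the query string
Disallow: /collections/*?*sort_by=

# Filtered collection views, e.g. /collections/shoes?filter.v.option.color=red
Disallow: /collections/*?*filter.
```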
Testing your robots.txt
Never push a robots.txt change without testing it. The cost of a typo is two weeks of lost traffic.
- Open Google Search Console and use the URL Inspection tool on a sample URL. It tells you whether the page is allowed or blocked by robots.txt.
- Run a few important URLs through our Sitemap Checker to confirm they are still reachable.
- Check the live /robots.txt in a private browser window after every save.
- Watch GSC coverage for the next seven days. Sudden spikes in “Blocked by robots.txt” mean a rule is too aggressive.
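If you want a repeatable check outside the browser, Python’s standard-library robot parser can replay your rules against a list of URLs before you ship them. A minimal sketch (the rules string and URLs are placeholders, and note this parser only handles plain path prefixes, not Googlebot-style * wildcards):

```python
from urllib.robotparser import RobotFileParser

# Paste the prefix rules you are about to ship
rules = """\
User-agent: *
Disallow: /checkout
Disallow: /cart
Disallow: /search
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# URLs that must stay crawlable vs. paths you meant to block
for url in ("https://example.com/products/red-shirt",
            "https://example.com/checkout"):
    print(url, "->", "allowed" if rp.can_fetch("*", url) else "blocked")
```

Run it on your most important product and collection URLs; any unexpected “blocked” is the typo you want to catch before Google does.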
Robots vs noindex
Robots.txt and noindex solve different problems and are often confused. Disallowing a URL in robots.txt tells crawlers not to fetch it. Noindex tells crawlers not to show it in search results.
Here is the trap: if you disallow a URL in robots.txt, Google cannot crawl it, which means Google cannot see the noindex tag, which means the URL can still appear in results as a bare link with no snippet. To deindex a page properly, allow the crawl and add a noindex meta tag.
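On Shopify, that noindex tag usually lives in theme.liquid’s head. A sketch, assuming you want internal search result pages out of the index (swap the template check for whatever pages you are targeting):

```liquid
{%- if template contains 'search' -%}
  <meta name="robots" content="noindex">
{%- endif -%}
```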
Use robots.txt to save crawl budget. Use noindex to control what shows up in search. They are not interchangeable.
Stores migrating from another platform often inherit a confused mix of both. The WooCommerce to Shopify migration guide walks through the cleanup steps, and the 2026 SEO checklist covers the audit you should run after.
If you are also rethinking how variants are split into separate products for indexing, Rubik Combined Listings handles the structural piece on the Shopify side.
FAQ
Can I edit robots.txt on Shopify?
Yes. Create a robots.txt.liquid template in your theme code editor. Shopify will use it to generate the live file at /robots.txt.
Should I block GPTBot?
If you do not want OpenAI training models on your content, yes. GPTBot is the training crawler. ChatGPT-User and OAI-SearchBot are the live retrieval bots that can send referral traffic, so consider keeping those allowed.
Will blocking a page in robots.txt remove it from Google?
No. Pages can still appear as bare URLs in results because Google knows they exist from links elsewhere. Use a noindex meta tag to deindex.
Where do I find Shopify’s default robots.txt?
Visit yourdomain.com/robots.txt in any browser. It is generated automatically until you create a robots.txt.liquid template.
Does Shopify support wildcards in robots.txt?
Yes. The default file already uses wildcards for sort and filter parameters, and you can add your own in robots.txt.liquid.
How long until Google notices a robots.txt change?
Google refetches robots.txt roughly every 24 hours. You can force a re-read in Google Search Console under the robots.txt report.
Related reading
- Shopify SEO checklist for 2026
- Shopify answer engine optimization guide
- WooCommerce to Shopify migration guide
- Shopify image optimization for speed
- Shopify collection page swatches (Rubik)
Next step: open your theme code, create robots.txt.liquid, paste in the default loop plus your AI bot blocks, and verify the live file before lunch.