Shopify robots txt guide: what to allow, what to block, and why AI bots matter

Your Shopify robots.txt file is the first thing every web crawler reads before crawling your store. Google, Bing, Ahrefs, GPTBot, ClaudeBot, PerplexityBot, and many more all “knock” on your site before indexing it, and your robots.txt file tells them whether they are let in. Make a small mistake in this file and you could be wasting your store’s crawl budget on filter pages, or unknowingly excluding yourself from the AI search results that are now generating significant traffic.
This guide covers what Shopify ships by default, what you actually need to change, how to customize the file through your robots.txt.liquid template, and why, in 2026, blocking AI crawlers is the equivalent of dropping your phone and wondering why nobody rings you.
In this post
- What robots.txt does
- Shopify default rules
- How to customize
- What to allow vs block
- AI bots and why they matter
- Testing and auditing
- Common mistakes
- FAQ
What robots.txt actually does
The robots.txt file is a plain text file at the root of your site (https://yourstore.com/robots.txt). Its purpose is to tell web crawlers which directories and files they may or may not fetch. It does NOT hide content from search engines. It does NOT control what gets indexed. It only tells crawlers what they are allowed to request in the first place. That distinction is widely misunderstood, and it is critical to how search engine crawling works.
If you want to remove a page from Google’s index, use the noindex meta tag or the URL removal tool in Search Console. robots.txt only tells the Google crawler not to visit a page. A page blocked this way can still show up in search results if other sites link to it.
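For example, a common Shopify pattern for keeping internal search pages out of the index is a conditional meta tag in your layout file. This is a sketch; the exact template check depends on your theme:

```liquid
{%- comment -%} In layout/theme.liquid, inside <head> {%- endcomment -%}
{%- if template contains 'search' -%}
  <meta name="robots" content="noindex">
{%- endif -%}
```

Unlike a robots.txt Disallow, this lets Google crawl the page, see the tag, and drop it from the index.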
What Shopify ships by default
By default, every Shopify store ships with a pre-filled robots.txt that blocks crawlers from checkout, cart, admin, policies, internal search results, gift card pages, and the URL parameters that cause faceted duplication. Product and collection pages remain crawlable, and the file points to your sitemap. For roughly 90% of Shopify stores, these defaults are optimal.
Check yours now. Open yourstore.com/robots.txt in a browser. See the defaults. Then decide whether you need to change anything. Most stores don’t need to change any of the defaults. Some stores absolutely do.
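If you want to sanity-check rules programmatically rather than by eye, Python’s built-in urllib.robotparser can evaluate a robots.txt against sample URLs. This is a sketch using simplified prefix-only rules, since robotparser does not understand the `*` wildcards Shopify’s real file uses:

```python
from urllib.robotparser import RobotFileParser

# Simplified Shopify-style rules (prefix matching only; the real default
# also uses * wildcards, which urllib.robotparser cannot evaluate).
sample = """\
User-agent: *
Disallow: /checkout
Disallow: /cart
Disallow: /search
Disallow: /admin
"""

rp = RobotFileParser()
rp.parse(sample.splitlines())

for path in ("/products/blue-widget", "/checkout", "/search"):
    url = "https://yourstore.com" + path
    print(path, "->", "allowed" if rp.can_fetch("Googlebot", url) else "blocked")
# /products/blue-widget -> allowed
# /checkout -> blocked
# /search -> blocked
```

Swap in the contents of your live robots.txt and the URLs you care about to verify nothing important is blocked.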
How to customize the file on Shopify
As of 2021, Shopify lets you edit robots.txt directly through your store theme, via a template file called robots.txt.liquid. Here is how to do it.
- Online Store, Themes, Edit code.
- Under Templates, click Add a new template.
- Pick robots.txt from the dropdown.
- Edit using Liquid rules to add, remove, or replace directives.
- Save. Changes propagate within a few minutes.
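A minimal robots.txt.liquid that keeps Shopify’s defaults and appends one extra rule might look like this. It is a sketch based on Shopify’s documented `robots.default_groups` object; the extra `Disallow: /*?ref=*` line is a hypothetical example, not a recommendation for every store:

```liquid
{%- comment -%} templates/robots.txt.liquid {%- endcomment -%}
{% for group in robots.default_groups %}
  {{- group.user_agent }}

  {%- for rule in group.rules -%}
    {{ rule }}
  {%- endfor -%}

  {%- if group.user_agent.value == '*' -%}
    {{ 'Disallow: /*?ref=*' }}
  {%- endif -%}

  {%- if group.sitemap != blank -%}
    {{ group.sitemap }}
  {%- endif -%}
{% endfor %}
```

Because the loop renders the default groups first, your store keeps receiving Shopify’s updates to the base rules.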
Do not set this whole template manually. Use Liquid to add functionality on top of the default template Shopify provides. Shopify updates that default over time; if you hardcode the entire file, you can leak pages that should be blocked while also missing every update Shopify makes to the default.
Not comfortable editing Liquid? Free robots.txt generators can draft the directives for you; you then copy the output into the Liquid template. The Shopify-aware ones understand paths like cart, checkout, policies, and search, and avoid the broken patterns that a general-purpose robots.txt generator can produce.
What to allow, what to block
| Path | Rule | Why |
|---|---|---|
| /products/* | Allow | Revenue pages, always crawlable |
| /collections/* | Allow | Category pages rank |
| /pages/* | Allow | About, contact, content pages |
| /blogs/* | Allow | Content marketing |
| /cart | Block | No SEO value |
| /checkout | Block | Private |
| /search | Block | Infinite URL variants |
| /*?sort_by=* | Block | Duplicate content |
| /*?filter=* | Block | Faceted duplication |
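Expressed as plain directives, the table above corresponds to a stanza roughly like the following. This is illustrative only; Shopify’s default already covers most of it, so extend the Liquid template rather than pasting this wholesale:

```text
User-agent: *
Allow: /products/
Allow: /collections/
Disallow: /cart
Disallow: /checkout
Disallow: /search
Disallow: /*?sort_by=*
Disallow: /*?filter=*

Sitemap: https://yourstore.com/sitemap.xml
```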
AI bots and why they matter now
In 2023 the standard advice was to block AI bots so your content wasn’t scraped for future model training. In 2026 that advice is passé, and it is actively costing stores money.
ChatGPT, Claude, Perplexity, Sidekick, and Gemini now answer shopping-related questions. When a user asks something like “what are the best combined listings apps for Shopify?”, they get a straightforward answer with citations that drive clicks. Some of those clicks go to ordinary stores, and those answers are built from pages fetched by crawlers like GPTBot, ClaudeBot, PerplexityBot, and Google-Extended.
If you block them in robots.txt, your store is invisible to them. You are not in the training data. You are not in the live retrieval index. You will not be cited. Meanwhile your competitor who left them allowed is getting quoted in every ChatGPT answer in your category. That is the GEO gap. It is widening fast.
Use this free AI bot checker to see what AI crawlers your robots.txt does and does not allow. GPTBot, ClaudeBot, PerplexityBot should not be blocked. More on AEO and AI readiness here.
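To check by hand, open your robots.txt and look for stanzas like the following. If they are present, those AI crawlers are blocked, and the lines should be removed from your robots.txt.liquid:

```text
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /
```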
Strong opinion: blocking AI bots is stupid, stupid, stupid. You are not losing any sales by allowing an LLM to read your product page to generate an answer. That LLM activity only adds value: the answers it produces give prospects free distribution of your sales pitch. Seize the ball and run with it.
Testing and auditing
Three checks to run monthly:
- Google Search Console robots.txt tester. Catches syntax errors.
- The AI bot checker. Confirms which LLM crawlers are allowed.
- Server log grep for user-agent. See which bots actually showed up.
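The log check can be as simple as counting bot user agents. A sketch, assuming combined-format access logs; the sample lines and log path are hypothetical, so adjust the file location and bot names to your setup:

```python
from collections import Counter

# Hypothetical sample lines; in practice, read your real access log,
# e.g. open("/var/log/nginx/access.log").
log_lines = [
    '1.2.3.4 - - [01/Feb/2026] "GET /products/x HTTP/1.1" 200 "-" "Mozilla/5.0 ... GPTBot/1.2"',
    '5.6.7.8 - - [01/Feb/2026] "GET /collections/all HTTP/1.1" 200 "-" "Mozilla/5.0 ... ClaudeBot/1.0"',
    '9.9.9.9 - - [01/Feb/2026] "GET /products/y HTTP/1.1" 200 "-" "Mozilla/5.0 ... GPTBot/1.2"',
]

BOTS = ("GPTBot", "ClaudeBot", "PerplexityBot", "Googlebot")
hits = Counter(bot for line in log_lines for bot in BOTS if bot in line)
print(hits)  # which crawlers actually showed up, and how often
```

If a bot you allow in robots.txt never appears in the logs, that is worth investigating too.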
Common mistakes
The classic disaster is a Disallow: / directive left over from a staging theme. In a matter of weeks it can deindex an entire store, and it usually only comes to your attention at 11pm the day traffic collapses. Check your production robots.txt the moment you push a new theme.
Other frequent problems: blocking /collections/ because someone confused it with /search. Blocking CSS and JS paths that Google needs to render the page. Forgetting the sitemap line.
Related tools on this site
- Robots.txt generator: draft Shopify-safe directives
- AI bot checker: see which crawlers are allowed
- All free Shopify tools
See the live demo store, watch the tutorial video, or read the getting started guide.
FAQ
Can I edit robots.txt on Shopify?
Yes. Shopify added a robots.txt template in 2021; you customize the directives by adding a robots.txt.liquid template under your theme code. The defaults Shopify provides are sensible for most stores.
Does Shopify block AI bots by default?
No. Shopify allows all standard user agents by default, including AI crawlers. If AI bots are blocked on your store, someone changed the file manually through robots.txt.liquid.
Should I block GPTBot and ClaudeBot?
No. Allow them. AI search engines now drive real traffic through citations. Blocking them will remove your store from ChatGPT, Claude, and Perplexity answers in your category.
What happens if I accidentally block everything?
Google will start deindexing your store within days, and organic traffic will collapse until it is fixed. Fix the file immediately and request recrawling in Search Console.
Should I block faceted filter URLs?
Yes. URL paths with sort_by, filter, and similar parameters generate effectively infinite duplicate URLs. Blocking them saves crawl budget and avoids sending duplicate content signals.
Does robots.txt remove pages from Google?
Not reliably. robots.txt only controls crawling, not indexing. To keep a page out of the index, use a noindex meta tag.
How do I check which AI bots are crawling my store?
Use the AI bot checker tool on your site to see which known LLM crawlers are currently allowed or blocked.