What is robots.txt?
Definition, formula, India benchmarks, and the operator-grade nuance behind it.
robots.txt is a plain-text file at the root of a domain that tells web crawlers which paths they can access. Compliant crawlers typically request it before fetching anything else on the site. robots.txt does not prevent indexing (use a noindex meta tag for that); it controls crawl behavior.
robots.txt controls crawler access.
It doesn't prevent indexing; use a noindex meta tag for that.
Frameleads robots.txt explicitly allows 21 LLM bots (already shipped).
robots.txt is a text file at /robots.txt with User-agent and Allow/Disallow directives controlling crawler access.
robots.txt: User-agent: <bot> + Allow/Disallow: <path>
The operator's read on robots.txt
robots.txt is the gatekeeper for crawler access. A common pattern is to disallow /api/ and /_next/ so bots don't waste requests on endpoints and build artifacts, and to allow / for everything else. Per-bot rules let you allow LLM crawlers (GPTBot, ClaudeBot, PerplexityBot) while controlling lower-value bots. Important: robots.txt is publicly visible, so anyone can read it. Don't list sensitive paths there; protect them with authentication and noindex instead.
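The pattern described above can be sanity-checked locally before it ships. The sketch below uses Python's standard-library urllib.robotparser to parse a sample file and ask whether a given bot may fetch a given path. The domain example.com, the bot name ExampleLowValueBot, and the sample paths are placeholders for illustration, not Frameleads' actual file. This parser applies rules in file order, which is why the Disallow lines come before Allow: /; real crawlers may resolve conflicts differently (for example, by longest match), so treat the result as a rough check only.

```python
from urllib.robotparser import RobotFileParser

# Sample robots.txt following the pattern described in the text.
# example.com and ExampleLowValueBot are hypothetical placeholders.
ROBOTS_TXT = """\
User-agent: GPTBot
User-agent: ClaudeBot
User-agent: PerplexityBot
Disallow: /api/
Disallow: /_next/
Allow: /

User-agent: ExampleLowValueBot
Disallow: /

User-agent: *
Disallow: /api/
Disallow: /_next/
Allow: /

Sitemap: https://example.com/sitemap.xml
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, user_agent: str, path: str) -> bool:
    """Return True if user_agent may fetch path on the placeholder host."""
    return parser.can_fetch(user_agent, "https://example.com" + path)


if __name__ == "__main__":
    parser = build_parser(ROBOTS_TXT)
    checks = [
        ("GPTBot", "/blog/post"),
        ("GPTBot", "/api/users"),
        ("ExampleLowValueBot", "/"),
        ("SomeOtherBot", "/_next/static/app.js"),
    ]
    for agent, path in checks:
        print(agent, path, is_allowed(parser, agent, path))
```

Parsing from a string keeps the check offline, so it can run against a draft file before the real one is deployed.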
India 2026 benchmarks — robots.txt
- Frameleads robots.txt allows: 21 LLM/AI crawlers explicitly
- Disallow patterns: /api/, /_next/ (build artifacts)
- Sitemap reference: required (helps crawlers find sitemap)
- Crawl-delay: rarely used in 2026 (modern crawlers self-throttle)
- Per-bot directives: most effective for LLM-bot routing
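The items in this list translate into a small, mechanical file. As a rough sketch, the hypothetical helper render_robots_txt below assembles one from a list of explicitly allowed bots, the disallow patterns, and a sitemap URL. The three bot names come from this article; the full list of 21 is not given here, so the list is an illustrative subset, and the sitemap URL is a placeholder.

```python
from typing import Iterable, List


def render_robots_txt(
    allowed_bots: Iterable[str],
    disallow_paths: Iterable[str],
    sitemap_url: str,
) -> str:
    """Build robots.txt text with one group per named bot plus a '*' fallback.

    Hypothetical helper for illustration; not Frameleads' actual tooling.
    """
    disallow_paths = list(disallow_paths)
    lines: List[str] = []

    # One explicit group per allowed bot, then a catch-all group.
    for bot in list(allowed_bots) + ["*"]:
        lines.append(f"User-agent: {bot}")
        # Disallow lines go first so order-sensitive parsers see them first.
        lines.extend(f"Disallow: {path}" for path in disallow_paths)
        lines.append("Allow: /")
        lines.append("")  # blank line separates groups

    # Sitemap reference so crawlers can find the sitemap.
    lines.append(f"Sitemap: {sitemap_url}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(
        render_robots_txt(
            allowed_bots=["GPTBot", "ClaudeBot", "PerplexityBot"],  # subset only
            disallow_paths=["/api/", "/_next/"],
            sitemap_url="https://example.com/sitemap.xml",  # placeholder URL
        )
    )
```

Generating the file from a list makes it easier to add a new bot user-agent in one place instead of hand-editing repeated groups.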
Common mistakes to avoid
- Putting sensitive paths in robots.txt (publicly visible).
- Confusing robots.txt with noindex (different mechanisms).
- Adding Disallow: / accidentally (blocks crawling of the entire site).
- Not updating the file after new bot user-agents are added.
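Some of these mistakes can be caught mechanically. The sketch below is a hypothetical lint function, not an existing tool: it flags a site-wide Disallow: / under User-agent: *, Disallow paths that look sensitive, and a missing Sitemap line. The SENSITIVE_HINTS keywords are an assumption chosen for illustration, and the parsing is deliberately simplified.

```python
from typing import List

# Hypothetical keywords suggesting a path should not be advertised publicly.
SENSITIVE_HINTS = ("admin", "secret", "backup", "private")


def lint_robots_txt(text: str) -> List[str]:
    """Return warnings for a few common robots.txt mistakes (simplified)."""
    warnings: List[str] = []
    current_agents: List[str] = []
    in_rules = False  # True once the current group has Allow/Disallow lines
    has_sitemap = False

    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()

        if field == "user-agent":
            if in_rules:  # a User-agent line after rules starts a new group
                current_agents = []
                in_rules = False
            current_agents.append(value)
        elif field in ("allow", "disallow"):
            in_rules = True
            if field == "disallow":
                if value == "/" and "*" in current_agents:
                    warnings.append(
                        "Disallow: / under User-agent: * blocks crawling of the whole site"
                    )
                if any(hint in value.lower() for hint in SENSITIVE_HINTS):
                    warnings.append(
                        f"'{value}' looks sensitive; robots.txt is publicly visible"
                    )
        elif field == "sitemap":
            has_sitemap = True

    if not has_sitemap:
        warnings.append("No Sitemap line found")
    return warnings


if __name__ == "__main__":
    sample = "User-agent: *\nDisallow: /\nDisallow: /admin-backup/\n"
    for warning in lint_robots_txt(sample):
        print(warning)
```

A check like this fits naturally into a pre-deploy step, though it cannot tell whether a flagged path is actually sensitive; that still needs a human read.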
Frequently asked questions
What does a typical robots.txt setup look like in India?
robots.txt has no numeric value to benchmark; the India 2026 reference points are configuration choices. Frameleads' robots.txt explicitly allows 21 LLM/AI crawlers, disallows /api/ and /_next/ (build artifacts), and includes a sitemap reference so crawlers can find the sitemap. The right configuration for your business depends on which paths your site exposes and which crawlers you want to reach.
What are the most common mistakes when configuring robots.txt?
Three mistakes recur most often: putting sensitive paths in robots.txt (it is publicly visible); confusing robots.txt with noindex (they are different mechanisms); and adding Disallow: / accidentally (which blocks crawling of the entire site). The simplest defense is to document what each rule is for and to re-check the file whenever the site structure or the list of bot user-agents changes.
How does robots.txt relate to sitemaps and noindex?
robots.txt is most useful in context. Pair it with a sitemap reference, which helps crawlers find your pages, and with noindex, which controls indexing, to build a complete picture. robots.txt alone can mislead: it controls crawling, not indexing, so the relationship between the three matters more than any single file.
Should I customize robots.txt or accept a standard configuration?
It depends on your stage. Early-stage sites often need to fix structural issues first, such as an accidental Disallow: / or a missing sitemap reference, before tuning per-bot rules. Established sites can improve through marginal changes, such as adding new bot user-agents as they appear. Frameleads' Growth System maps which lever moves which metric in your specific category.
How robots.txt behaves per industry
robots.txt applies to every site, but the acquisition benchmarks around it vary by category. Drill into the industry-specific version below for the deep view, including CAC and CPC ranges.
- robots.txt for Real Estate Developers: CAC 3,500–35,000 ₹ · CPC 40–280 ₹
- robots.txt for D2C Brands: CAC 250–2,200 ₹ · CPC 8–60 ₹
- robots.txt for B2B SaaS Startups: CAC 15,000–3,00,000 ₹ · CPC 50–1,200 ₹
- robots.txt for Healthcare Clinics & Hospitals: CAC 500–15,000 ₹ · CPC 15–250 ₹
- robots.txt for Education & EdTech: CAC 400–4,500 ₹ · CPC 12–160 ₹
- robots.txt for Financial Services: CAC 1,500–20,000 ₹ · CPC 30–950 ₹
- robots.txt for Professional Services: CAC 800–12,000 ₹ · CPC 20–500 ₹
- robots.txt for Restaurants, Cafes & Cloud Kitchens: CAC 150–2,500 ₹ · CPC 8–120 ₹
- robots.txt for Fashion & Apparel D2C: CAC 200–1,200 ₹ · CPC 10–55 ₹
- robots.txt for Gyms, Studios & Fitness Apps: CAC 250–1,800 ₹ · CPC 12–80 ₹
- robots.txt for Automotive Dealers & OEMs: CAC 600–4,500 ₹ · CPC 18–120 ₹
- robots.txt for Manufacturing & MSMEs: CAC 3,000–35,000 ₹ · CPC 25–220 ₹