Key Takeaways
  • An AI visibility audit reveals whether ChatGPT, Perplexity, Claude, and Gemini cite your site when customers ask questions about your industry. Done manually, the nine-step process takes 4 to 6 hours.
  • Start with technical access: 18.9% of websites block AI crawlers such as GPTBot, PerplexityBot, ClaudeBot, Google-Extended, and CCBot, and 92% have no llms.txt file.
  • The core of the audit is sending 25 conversational customer queries to all four major engines, logging every cited URL, and ranking domains by citation frequency.
  • Gap analysis and competitor page audits turn the data into a prioritized action plan: technical fixes first, existing-content improvements second, new content third.
  • AI visibility is volatile (only 30% of brands stay consistently visible), so repeat the audit at least monthly.

An AI visibility audit is a structured process that reveals whether AI search engines like ChatGPT, Perplexity, Claude, and Gemini are citing your website when users ask questions about your industry. Running one involves checking your technical setup, generating the queries your customers actually ask, sending those queries to every major AI engine, recording which URLs get cited, and then analyzing the gaps between you and your competitors. The full manual process takes 4 to 6 hours. This guide walks through every step so you can do it yourself, or understand exactly what automated tools like GetCited handle on your behalf.

If you have been focused on traditional SEO and haven't checked whether AI engines even know your site exists, this audit will likely surface some uncomfortable truths. Our data from over 200 audits shows that most websites are either invisible to AI search or cited so inconsistently that their visibility is essentially random. The good news is that an AI visibility audit gives you a concrete, prioritized action plan instead of guesswork.

Let's walk through the entire process.

Why You Need an AI Search Audit Right Now

Before we get into the steps, a quick word on timing. The window for treating AI visibility as optional has closed.

ChatGPT now has over 800 million weekly active users. Perplexity processes 780 million monthly queries. Google AI Overviews appear in up to 60% of search results. Your customers are already searching through AI engines, whether you have optimized for them or not.

The problem is that AI search works nothing like traditional search. There is no "page one" to rank on. There is no list of ten blue links. AI engines synthesize a single answer from the sources they trust most, cite a handful of URLs, and that is the entire result. If your site is not among those cited sources, you are not just ranking low. You do not exist in that answer.

An AI visibility audit tells you exactly where you stand. It identifies the queries where your competitors are getting cited and you are not, the technical issues that might be blocking AI crawlers from reading your content, and the specific content gaps you need to fill. Without this information, any optimization effort is a shot in the dark.

The GetCited Framework: Five Pillars of AI Visibility

Before diving into the tactical steps, it helps to understand the strategic framework that an AI visibility audit is built around. At GetCited, we break AI visibility into five pillars. Every step of the audit maps back to one or more of these.

1. Open the Door

AI engines can only cite content they can access. This pillar is about removing technical barriers: making sure your robots.txt allows AI crawlers, your pages load correctly for bots, and your server sends the right signals. If the door is closed, nothing else matters.

2. Introduce Yourself

Once AI crawlers can access your site, they need to understand who you are and what you cover. This is where files like llms.txt come in. Think of llms.txt as a cover letter for your website, written specifically for language models. It tells AI engines what your site is about, what your most important pages are, and how your content is organized.

3. Speak the Language

AI engines parse content differently than humans browse it. This pillar focuses on structuring your content so it is machine-readable: using clear headings, schema markup, direct answers in opening paragraphs, and a high fact-to-word ratio. Content that "speaks the language" of AI engines is dramatically more likely to be cited.

4. Answer the Questions

This is the content strategy pillar. AI engines are answering questions from real people. If your content directly answers those questions better than any competitor's content does, you earn the citation. This pillar is about identifying the exact queries your audience is asking and creating content that addresses them with authority and specificity.

5. Measure and Improve

AI visibility is not something you set and forget. It is volatile. Our research shows that only 30% of brands maintain consistent visibility from one AI-generated answer to the next. This pillar is about ongoing monitoring, regular audits, and iterative improvement based on real data.

Every step of the audit walkthrough below connects back to these five pillars. Steps 1 and 2 address "Open the Door" and "Introduce Yourself." Steps 3 through 7 address "Answer the Questions" and "Measure and Improve." Steps 8 and 9 address "Speak the Language" and tie everything together into an action plan.

Now let's get into the actual process.

Step 1: Check Your robots.txt for AI Crawler Access

This is the first thing to check because it is the most fundamental. If your robots.txt file is blocking AI crawlers, your site is invisible to AI search, period. No amount of content optimization will fix that.

Go to yourdomain.com/robots.txt and read through the file. You are looking for User-agent directives that target known AI crawlers. The main ones to check for are:

  • **GPTBot** (OpenAI/ChatGPT)
  • **PerplexityBot** (Perplexity)
  • **ClaudeBot** or **anthropic-ai** (Anthropic/Claude)
  • **Google-Extended** (Gemini/Google AI)
  • **CCBot** (Common Crawl, used by many AI training pipelines)

If you see Disallow: / next to any of these user agents, that crawler is being blocked from your entire site. Some sites use a blanket User-agent: * with Disallow: / and then selectively allow only specific crawlers like Googlebot, which blocks every AI crawler by default.

How common is this problem? Our audit data shows that 18.9% of websites are actively blocking AI crawlers. Nearly one in five. And many of those blocks are unintentional, the result of overly broad crawler restrictions or security configurations that were set up before AI search existed.
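
If you manage more than one site, this check is easy to script. Here is a minimal sketch using Python's standard-library robots.txt parser; the sample rules are illustrative, and parsing from a string rather than fetching the live file keeps the sketch testable offline:

```python
from urllib.robotparser import RobotFileParser

# AI crawler user agents to test
AI_AGENTS = ["GPTBot", "PerplexityBot", "ClaudeBot", "anthropic-ai",
             "Google-Extended", "CCBot"]

def check_ai_access(robots_txt: str, site: str = "https://example.com") -> dict:
    """Given the raw text of a robots.txt file, report whether each
    AI crawler may fetch the site's homepage."""
    rp = RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return {agent: rp.can_fetch(agent, site + "/") for agent in AI_AGENTS}

# A blanket block that only lets Googlebot through -- this shuts out
# every AI crawler by default:
sample = """\
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
"""
print(check_ai_access(sample))  # every AI agent maps to False
```

Fetching the live file is one `urllib.request.urlopen` call away; the parsing and per-agent logic stay the same.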

What to record: For each AI crawler, note whether it is allowed, blocked, or not mentioned (which means it falls under whatever the default User-agent: * rule says). Also check whether your site is blocking crawl access to specific directories that contain your most important content.

Time estimate: 10 to 15 minutes.

Step 2: Check for an llms.txt File

Navigate to yourdomain.com/llms.txt. If you get a 404, you do not have one. And you are in the majority. Our data shows that 92% of websites do not have an llms.txt file.

The llms.txt file is a relatively new standard, proposed as a way for websites to communicate directly with large language models. It serves a similar function to robots.txt but instead of telling crawlers what not to access, it tells language models what your site is about and which pages are most important.

A well-structured llms.txt file typically includes:

  • A top-level heading with your site or brand name
  • A short blockquote summary of what the site covers and who it serves
  • Sections linking to your most important pages, each with a one-line description

Not having an llms.txt file will not get you penalized. But having one gives you an advantage over the 92% of sites that do not. It is a direct signal to AI engines about your site's identity and priorities.
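
Because llms.txt is a plain Markdown file, creating one takes minutes. Here is a minimal sketch following the proposed format; the site name, summary, and URLs are placeholders, not content to copy verbatim:

```markdown
# Example Co

> Example Co makes scheduling software for home-services businesses.
> This site covers product documentation, pricing, and industry guides.

## Key Pages

- [Product Overview](https://www.example.com/product): What the platform does and who it is for
- [Pricing](https://www.example.com/pricing): Current plans and what each includes
- [Guides](https://www.example.com/guides): In-depth articles on scheduling and dispatch

## Optional

- [Blog](https://www.example.com/blog): News and shorter posts
```

Save it as llms.txt at your domain root, the same place robots.txt lives.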

What to record: Whether llms.txt exists, and if it does, whether it is well-structured and up to date. If it does not exist, flag this as an action item for later.

Time estimate: 5 minutes.

Step 3: Generate 25 Customer Queries for Your Industry

This is where the audit shifts from technical checks to competitive analysis, and it is the step that requires the most thought.

You need to come up with 25 queries that represent the actual questions your potential customers are asking AI engines about your industry, products, or services. These are not keyword phrases. They are full, conversational questions, because that is how people interact with AI search.

Here is how to approach it. Break your queries into categories:

Awareness queries (8-10 queries): These are broad, top-of-funnel questions where someone is learning about your category. Examples: "What is the best project management software for remote teams?" or "How does AI-powered accounting work?"

Consideration queries (8-10 queries): These are comparison and evaluation questions. Examples: "Is Notion better than Asana for small businesses?" or "Which CRM platforms are best for real estate agents in 2026?"

Decision queries (5-7 queries): These are bottom-of-funnel queries where someone is close to buying. Examples: "Is [your product] worth it?" or "Reviews of [your product] vs [competitor]."

A few principles for writing good audit queries:

  1. Use natural language. Write them the way a real person would type them into ChatGPT, not the way someone would type a keyword into Google.
  2. Include branded and unbranded queries. Some should mention your brand. Most should not.
  3. Vary the specificity. Mix broad category queries with narrow, specific ones. Our research shows that query specificity dramatically affects which domains get cited.
  4. Include queries where you expect to appear. If you have strong content on a topic, include queries about that topic. The audit should test both your strengths and your blind spots.

If you are struggling to come up with 25 queries, look at your customer support tickets, your sales call notes, your site's search logs, and the "People Also Ask" boxes in Google search results for your target terms. These are all windows into what your audience actually wants to know.

What to record: Your full list of 25 queries, organized by category.

Time estimate: 30 to 45 minutes.

Step 4: Send Each Query to All Four Major AI Engines

Now the labor-intensive part begins. Take each of your 25 queries and submit it to:

  1. ChatGPT (chat.openai.com)
  2. Perplexity (perplexity.ai)
  3. Claude (claude.ai)
  4. Google Gemini (gemini.google.com)

That is 100 individual queries. Yes, this takes a while. This is the step that eats most of the 4 to 6 hours the manual audit requires.

A few important notes on methodology:

Use the same exact query text across all four engines. Do not paraphrase or adjust. You want an apples-to-apples comparison of how each engine responds to the identical prompt.

Use a clean session for each query. AI engines use conversation context, so previous questions in a session can influence the next answer. Start a new chat for each query, or at minimum for each batch.

Use the free or standard tier for each engine. Premium tiers may have different model versions and retrieval behaviors. For a consistent audit, stick with the versions your customers are most likely using.

Do this within a single day if possible. AI engine responses change over time. Running all 100 queries within the same window reduces the noise from temporal variation.

For each response, you need to identify every URL or domain that the engine cites, references, or links to. Different engines display citations differently:

  • Perplexity numbers its citations inline and lists the sources alongside the answer.
  • ChatGPT links sources inline or in a row of source chips beneath the answer when it searches the web.
  • Gemini surfaces supporting links beneath the relevant passages of its answer.
  • Claude cites inline when it searches the web, and sometimes answers with no links at all.

What to record: For each query on each engine, log every URL cited. Include the full URL, not just the domain. You will need both the domain-level data and the page-level data later.

Time estimate: 2 to 3 hours.

Step 5: Record Every URL Cited

This step is really about organization. You should now have 100 responses (25 queries multiplied by 4 engines), each containing anywhere from 0 to 10+ cited URLs.

Create a spreadsheet with the following columns:

  • Query, plus its category (awareness, consideration, or decision)
  • Engine (ChatGPT, Perplexity, Claude, or Gemini)
  • Cited URL, recorded in full
  • Domain
  • Citation position within the response (first, second, and so on)

Some responses will have no citations at all, particularly from Claude, which sometimes provides information without linking to specific sources. Record these as zero-citation responses. They are data points too, because they tell you which queries produce answers that AI engines generate from training data alone rather than from live retrieval.

You will likely end up with somewhere between 200 and 500 individual citation records, depending on how citation-heavy the responses are.

What to record: A complete citation log with every URL from every response.

Time estimate: Included in Step 4 if you are recording as you go. If you are doing it after the fact from saved responses, add 30 to 45 minutes.

Step 6: Rank Domains by Citation Frequency

Now you analyze the data. Aggregate your citation records by domain and count how many times each domain was cited across all 100 queries.

Sort the list from most cited to least. You will likely see a steep power curve: a small number of domains will account for a disproportionate share of all citations. This is consistent with what we have found in our research at GetCited. The top-performing domains in any category tend to dominate AI citations, while most domains barely register.
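
If your citation log lives in a spreadsheet export, the aggregation takes a few lines of Python. A sketch with hypothetical log rows (the queries and URLs are placeholders):

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical slice of the Step 5 citation log; each row is one cited URL.
log = [
    {"query": "best crm for real estate", "engine": "perplexity",
     "url": "https://example.com/crm-guide", "position": 1},
    {"query": "best crm for real estate", "engine": "chatgpt",
     "url": "https://example.com/crm-guide", "position": 3},
    {"query": "is notion better than asana", "engine": "gemini",
     "url": "https://competitor.example/comparison", "position": 1},
]

def rank_domains(rows: list) -> list:
    """Aggregate citation records by domain, most-cited first."""
    counts = Counter(urlparse(row["url"]).netloc for row in rows)
    return counts.most_common()

print(rank_domains(log))  # [('example.com', 2), ('competitor.example', 1)]
```

Keeping the full URL in each row means the same log also supports the page-level analysis in Step 8.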

For each domain in your top 20, also calculate:

  • Citation rate: the percentage of your 25 queries for which the domain was cited at least once
  • Engine coverage: how many of the four engines cited it
  • Category distribution: whether its citations cluster in awareness, consideration, or decision queries

Where does your domain rank in this list? If you are in the top five, you have strong AI visibility in your category. If you are in the top ten, you have a presence but room to improve. If you are below the top twenty or not on the list at all, you have significant work to do.

What to record: A ranked list of domains by total citations, plus the citation rate, engine coverage, and category distribution for each.

Time estimate: 30 to 45 minutes.

Step 7: Identify Gaps Where You Are Missing

This is the most strategically valuable step. Go back through your 25 queries and identify which ones produced citations to your domain and which did not.

The queries where you are absent are your visibility gaps. These are the conversations your potential customers are having with AI where your brand does not exist.

Organize your gaps into categories:

Total gaps: Queries where you were not cited by any of the four engines. These are your biggest blind spots.

Partial gaps: Queries where you were cited by one or two engines but not the others. These represent opportunities to strengthen existing visibility.

Competitive gaps: Queries where specific competitors are being cited but you are not. Note which competitors appear and which of their pages are being cited.

For each gap, ask yourself a critical question: do you have existing content that should be answering this query? If yes, the problem is likely content quality, structure, or technical visibility. If no, the problem is that you simply have not created the content yet.

This distinction matters because the fix is different. Existing content that is not getting cited needs to be restructured and improved. Missing content needs to be created from scratch.
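
Given the per-query citation data from Step 5, the gap classification above is mechanical. A sketch with hypothetical data (the queries and engine sets are placeholders):

```python
ENGINES = {"chatgpt", "perplexity", "claude", "gemini"}

# For each query, which engines cited your domain (hypothetical data)
engines_citing = {
    "what is the best crm for a small real estate agency": set(),
    "is notion better than asana for small businesses": {"perplexity"},
    "top project management tools for remote teams": ENGINES.copy(),
}

def classify_gaps(citing_by_query: dict) -> dict:
    """Sort queries into total gaps (no engine cites you), partial gaps
    (some engines do), and covered queries (all four do)."""
    gaps = {"total": [], "partial": [], "covered": []}
    for query, engines in citing_by_query.items():
        if not engines:
            gaps["total"].append(query)
        elif engines < ENGINES:  # proper subset: cited by some, not all
            gaps["partial"].append(query)
        else:
            gaps["covered"].append(query)
    return gaps

print(classify_gaps(engines_citing))
```

Competitive gaps fall out of the same log: filter the total and partial gap queries for rows citing a competitor's domain.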

What to record: A gap analysis showing every query where you are absent, which competitors fill those gaps, and whether you have existing content that should be performing.

Time estimate: 20 to 30 minutes.

Step 8: Audit the Top-Cited Competitor Pages

Now you know which competitors are winning the citations you are not getting. The next step is to understand why.

For each of your top 5 to 10 competitors (by citation frequency), pull up the specific pages that are being cited and analyze them. Here is what to look for:

Word count. How long is the content? AI engines tend to favor comprehensive pages. If the top-cited pages in your category are 3,000+ words and your competing page is 800 words, that gap alone could explain your invisibility.

Heading structure. Map out the H1, H2, and H3 tags on each page. Are they using question-based headings that match the queries people ask? Well-structured headings give AI engines clear signals about what each section covers, making it easier to extract and cite specific information.

Schema markup. Check whether the page uses structured data. You can do this with Google's Rich Results Test tool or by viewing the page source and searching for application/ld+json. Common schema types that support AI visibility include Article, FAQ, HowTo, and Organization. Pages with rich schema markup give AI engines more context about the content, which can influence citation decisions.

Opening paragraph structure. Does the page answer its primary question within the first 200 words? Our research shows that content with direct, upfront answers is significantly more likely to earn AI citations. If competitors are leading with clear answers and your page buries the key information under three paragraphs of introduction, that is a fixable problem.

Data density. Count the number of specific facts, statistics, and concrete claims on the page. Pages with a high fact-to-word ratio (at least one fact per 80 words) are 4.2x more likely to be cited by AI engines. If your competitor's page is packed with specific data and yours is mostly general advice, the AI engine will pick the data-rich source every time.
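
Counting facts ultimately takes human judgment, but a crude numeric-token count can flag obviously thin pages. The one-fact-per-80-words threshold comes from the paragraph above; the proxy itself is an assumption, since not every fact contains a number:

```python
import re

def fact_density_proxy(text: str) -> float:
    """Crude proxy for fact density: numeric tokens (counts, percentages,
    years) per 80 words. Real facts need human judgment; this only
    flags obviously thin pages."""
    words = len(text.split())
    numeric = len(re.findall(r"\d[\d.,%]*", text))
    return numeric / max(words, 1) * 80

sample = ("ChatGPT has over 800 million weekly users. Perplexity handles "
          "780 million monthly queries. AI Overviews appear in 60% of results.")
print(fact_density_proxy(sample))  # 3 numeric tokens across 20 words
```

A score well below 1.0 on a competitor comparison is a signal to add specific data, not a verdict on its own.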

Internal and external links. Note how the page links to other content on the same domain (internal links) and to external authoritative sources. Strong internal linking helps AI crawlers discover and understand the full scope of a site's content. External links to credible sources signal that the content is well-researched.

Freshness signals. Check the publication date and last-modified date. AI engines heavily favor recent content. If a competitor's page was updated last week and yours was published 18 months ago and never touched, the recency advantage alone could be decisive.

Create a comparison matrix that puts your page side by side with the top-cited competitor pages across all of these dimensions. The areas where you fall short become specific items on your action plan.

What to record: A competitive comparison matrix covering word count, heading structure, schema, opening paragraph quality, data density, linking, and freshness for your pages versus the top competitors.

Time estimate: 45 minutes to 1 hour.

Step 9: Generate Your Action Plan

Everything you have gathered in Steps 1 through 8 now gets synthesized into a prioritized action plan. This is the deliverable that makes the audit worth the effort.

Organize your action items into three tiers:

Tier 1: Technical Fixes (Do This Week)

These are the items from Steps 1 and 2 that are blocking or limiting your AI visibility at a foundational level.

Tier 2: Content Improvements (Do This Month)

These are improvements to existing content, identified in Steps 7 and 8.

Tier 3: New Content Creation (Do Over the Next 60 Days)

These are the total gaps from Step 7, queries where you have no existing content at all.

Prioritization

Not all action items are equally urgent. Here is a simple prioritization framework:

  1. Technical blocks first. If AI crawlers cannot access your content, fix that before anything else.
  2. High-value existing content second. Pages that should be earning citations but are not, especially for decision-stage queries, are the next priority.
  3. New content for gap queries third. Start with the queries that have the most direct revenue impact.

What to record: A prioritized action plan with owners, deadlines, and success metrics for each item.

Time estimate: 30 to 45 minutes.

The Time Problem (And How to Solve It)

If you have been keeping a running tally of the time estimates, you already know the total: the manual version of this audit takes roughly 4 to 6 hours. And that is for someone who knows exactly what they are doing.

The most time-consuming part by far is Step 4: sending 25 queries to four different AI engines and recording every citation. That step alone accounts for 2 to 3 hours of manual work. It is also the step that is most prone to human error, because copying URLs accurately from 100 different AI responses is tedious and mistakes compound.

Then there is the repeatability problem. AI visibility is volatile. Running this audit once gives you a snapshot, but that snapshot starts decaying almost immediately. Our data shows that only 30% of brands maintain consistent AI visibility from one answer to the next. To get a reliable picture, you need to run audits regularly, ideally monthly.

Nobody has 4 to 6 hours every month to spend on manual AI audits. That is exactly why GetCited exists.

GetCited automates all nine steps of this process. You enter your domain and your industry. The platform generates relevant customer queries, sends them to Perplexity, ChatGPT, Claude, and Google Gemini simultaneously, records every citation, ranks competing domains, identifies your gaps, analyzes top-cited competitor pages, and generates a prioritized action plan. The report arrives in your inbox ready to act on.

What takes you 4 to 6 hours of manual work, GetCited handles in minutes. And because the platform runs audits on a recurring schedule, you get ongoing visibility tracking instead of a one-time snapshot that goes stale.

Common Mistakes When Running an AI Visibility Audit

Having seen hundreds of these audits, both manual and automated, there are patterns in what goes wrong.

Using keyword phrases instead of natural queries. If your 25 queries read like SEO keywords ("best CRM software 2026") instead of how people actually talk to AI ("I run a small real estate agency with 12 agents. What CRM should I be using?"), your audit will not reflect real-world AI search behavior. Make your queries conversational and specific.

Only checking one AI engine. Checking just ChatGPT and calling it an audit is like checking just Google and ignoring Bing, Yahoo, and DuckDuckGo in the SEO era. Except the gap between AI engines is even larger. Each engine has its own citation personality. Perplexity cites more sources. Claude favors authoritative domains. Gemini leans on Google's existing index. An audit that only covers one engine gives you, at best, 25% of the picture.

Ignoring the technical layer. Jumping straight to content analysis without checking robots.txt and llms.txt first is like optimizing a storefront that is locked. If 18.9% of sites are blocking AI crawlers, there is a real chance yours is one of them. Always start with the technical checks.

Not recording citation position. Being the first source cited in an AI response is not the same as being the fifth. The first citation tends to get the most click-through traffic and carries the most implied authority. If your competitor is consistently cited first and you are cited fourth, the gap is larger than the raw count suggests.

Running the audit once and considering it done. AI visibility is a moving target. A single audit is useful, but it is a snapshot of one moment in time. The brands that win at AI visibility are the ones that monitor it continuously and adapt their strategy as the landscape shifts.

What Good AI Visibility Looks Like

To give you a benchmark, here is what a strong AI visibility profile looks like based on our research:

  • Ranking within the top five cited domains for your category (the Step 6 list)
  • Citations from all four engines, not just one or two
  • Presence across awareness, consideration, and decision queries alike
  • No robots.txt blocks on AI crawlers, and an llms.txt file in place
  • Key pages refreshed every 7 to 14 days, leading with direct answers and a high fact-to-word ratio

Most sites we audit fall short on several of these metrics. That is not a failure. That is information. The audit tells you exactly where you stand and exactly what to improve.

How AI Visibility Audits Differ from Traditional SEO Audits

If you have been through a traditional SEO audit, some of this process will feel familiar. But there are critical differences worth understanding.

Different success metrics. SEO audits measure rankings, click-through rates, and organic traffic. AI visibility audits measure citation frequency, engine coverage, and citation position. You can rank on page one of Google and still be completely invisible to ChatGPT.

Different competitive landscape. In traditional search, you are competing against sites that target the same keywords. In AI search, you are competing against every source the AI engine considers authoritative on a topic. That often includes sites you have never thought of as competitors: Wikipedia, Reddit, review sites, industry publications, and government resources.

Different content requirements. SEO rewards length, keyword density, and backlinks. AI visibility rewards directness, data density, and structural clarity. A 5,000-word SEO-optimized guide that takes four paragraphs to get to the point will often lose to a 2,000-word page that answers the question in the first sentence.

Different update frequency. SEO rankings shift over weeks and months. AI citations can change from one query to the next. The monitoring cadence for AI visibility needs to be significantly faster than what most teams are used to from SEO.

Different technical requirements. SEO audits focus on page speed, mobile-friendliness, and crawlability for traditional bots. AI audits add layers like AI crawler access, llms.txt, and the freshness signals that AI engines specifically look for.

This does not mean SEO does not matter anymore. Traditional search and AI search overlap in many ways, and strong SEO foundations support AI visibility. But if your only audit process is a traditional SEO audit, you have a blind spot that is growing larger every month.

Building AI Visibility Into Your Ongoing Workflow

An audit is a starting point. The real value comes from building AI visibility into your regular content and marketing workflow.

Here is what that looks like in practice:

Monthly audits. Run a full AI visibility audit at least once a month. Track your citation rate, engine coverage, and gap queries over time. Look for trends, not just snapshots.

Content update cycles. Based on your audit findings, update your most important content every 7 to 14 days. This does not mean rewriting from scratch. It means refreshing data, adding new examples, and making sure the information is current.

New content aligned to gaps. Every audit will surface new gap queries. Feed those gaps into your content calendar and prioritize the ones with the highest business impact.

Technical monitoring. Make robots.txt checks part of your regular technical maintenance. A CMS update or security configuration change could accidentally block AI crawlers without anyone noticing.

Competitive tracking. Watch what your top competitors are doing. When new domains start appearing in your audit results, investigate their content to understand why AI engines are favoring them.

The brands that treat AI visibility as an ongoing discipline rather than a one-time project are the ones that build and maintain strong citation profiles over time. It is the same principle that made SEO successful for the companies that committed to it early: consistent effort, informed by data, compounding over time.

Frequently Asked Questions

How long does a full AI visibility audit take?

The manual process takes approximately 4 to 6 hours if you follow all nine steps. The most time-consuming part is Step 4, which involves sending 25 queries to four AI engines and recording every citation. That step alone accounts for 2 to 3 hours. Automated platforms like GetCited compress this entire process to minutes by handling the query generation, multi-engine submission, citation tracking, competitive analysis, and action plan generation automatically.

How often should I run an AI visibility audit?

Monthly at minimum. AI visibility is significantly more volatile than traditional search rankings. Our research shows that only 30% of brands maintain consistent visibility from one AI answer to the next, which means a single audit gives you a snapshot that starts degrading almost immediately. Monthly audits let you track trends, catch new competitors entering your space, and measure the impact of your optimization efforts.

Do I need to audit all four AI engines, or is one enough?

You need all four. Each AI engine has a distinct citation personality. Perplexity cites the most sources and is the most accessible to smaller domains. Claude favors established, authoritative sources. Gemini leans heavily on Google's existing search index. ChatGPT falls somewhere in the middle. A domain might be cited frequently by Perplexity and completely absent from Claude. If you only audit one engine, you are working with an incomplete and potentially misleading picture of your actual AI visibility.

What is the most common issue AI visibility audits uncover?

The most common finding is that sites simply do not have content that directly answers the questions their customers are asking AI engines. The second most common is technical: 18.9% of sites are blocking AI crawlers through their robots.txt configuration, often unintentionally. And 92% of sites do not have an llms.txt file. These three issues account for the vast majority of AI visibility problems we see across audits.

Can I improve my AI visibility without a full audit?

You can make quick wins like unblocking AI crawlers in robots.txt, creating an llms.txt file, and restructuring your opening paragraphs to lead with direct answers. These improvements are broadly applicable and almost always help. But without an audit, you are guessing about which queries matter most, which competitors are winning your citations, and where your biggest gaps are. The audit turns guesswork into a data-driven strategy. That is the difference between making incremental improvements and making the right improvements.