
Bots now make up most of the traffic on the web. On June 3, 2026, Cloudflare CEO Matthew Prince shared Cloudflare Radar data showing automated requests at 57.5% of web traffic, ahead of humans at 42.5%, the first time bots have led in the history of the web and roughly eighteen months earlier than he had forecast. So the dashboard question is no longer whether AI crawlers hit your site. It is which ones, how often, and what you should do about each one.
Per-crawler visit counts are the rawest signal you have. The number of times GPTBot, ClaudeBot, or PerplexityBot requests a page is a fact recorded in your server logs, not a model's guess. But a raw count by itself is close to useless: it only means something once you know the purpose of the bot behind it. Sort the bots by what they are there to do, and the same numbers turn into decisions.
A count means nothing until you sort the bot by purpose
Different crawlers want different things. A training bot may pull your whole site once to feed a model. A search bot may revisit a handful of priority pages every few days to keep an answer fresh. A live assistant fetches one page the moment a user asks a question that touches it. A scraper just takes. If you dump all of that into a single "bots" line on a chart, you have a big number and no idea what it means.
citAEOtion reads the server-level record and sorts every known AI crawler into four categories:
- AI Training - bots pulling your content into model training pipelines, like GPTBot, ClaudeBot, and Meta-ExternalAgent.
- AI Search - bots indexing you to answer searches inside AI engines.
- AI Assistant - bots fetching you live to answer a user's question right now, like PerplexityBot.
- Data Scraper - everything else taking your content, attribution optional.
That sort is the whole game. A training count tells you whether your content is shaping the model. A search count tells you whether you are in line to be cited. An assistant count tells you a real person just asked something your page answered. A scraper count tells you who is burning your bandwidth for nothing. The classification comes from reading the actual bots in your logs, not from asking a model to describe you and scoring the guess. See how the sorting works if you want the mechanics.
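To make the sort concrete, here is a minimal sketch of purpose classification by User-Agent token. The category names come from the list above, but the lookup table is an assumption for illustration, not citAEOtion's actual rule set. In particular, OAI-SearchBot as the search example and the "Unclassified" fallback are choices made here, and a real Data Scraper category would need its own token list.

```python
# Illustrative mapping only. The categories come from the article; the
# table itself is an assumption, not citAEOtion's real classification rules.
PURPOSE_BY_TOKEN = {
    "GPTBot": "AI Training",
    "ClaudeBot": "AI Training",
    "Meta-ExternalAgent": "AI Training",
    "OAI-SearchBot": "AI Search",      # assumed example of a search crawler
    "PerplexityBot": "AI Assistant",
}


def classify(user_agent: str) -> str:
    """Return the purpose category for a raw User-Agent string."""
    ua = user_agent.lower()
    for token, purpose in PURPOSE_BY_TOKEN.items():
        # Case-insensitive substring match on the bot's name token.
        if token.lower() in ua:
            return purpose
    # Unknown agents (including ordinary browsers) are left unlabeled
    # here rather than being assumed to be scrapers.
    return "Unclassified"


print(classify("Mozilla/5.0 (compatible; GPTBot/1.1; +https://openai.com/gptbot)"))
```

Matching on a substring keeps the sketch short, but User-Agent strings can be spoofed, so a production classifier would likely also verify the requesting IP against the ranges each vendor publishes.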
The metrics that map to a decision
A useful dashboard does not hand you a vanity score. It hands you a small set of numbers, each one tied to a thing you can change. Here is what each metric tells you and what to do with it.
Per-crawler visit counts
This is the base layer: how many times each named bot requested your site over a time window. Sorted by category, it answers the first real question. Is OpenAI training on you while Perplexity ignores you? Is one scraper responsible for most of your bot load? The decision it drives is where to spend effort. If the training bots are heavy and the assistant bots are absent, your content is being learned but not yet pulled into live answers, and that gap is the thing to work on.
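If you want to see what that base layer looks like in practice, the sketch below counts visits per bot from access log lines. It assumes the common Combined Log Format, where the User-Agent is the last quoted field on each line. The sample lines, IP addresses, and paths are made up for illustration.

```python
import re
from collections import Counter

# Assumes Combined Log Format: the User-Agent is the last quoted field.
UA_RE = re.compile(r'"([^"]*)"\s*$')
BOTS = ("GPTBot", "ClaudeBot", "PerplexityBot")  # bots named in the article


def count_visits(lines):
    """Count requests per named bot across an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = UA_RE.search(line)
        if not match:
            continue  # skip lines that do not end in a quoted field
        ua = match.group(1).lower()
        for bot in BOTS:
            if bot.lower() in ua:
                counts[bot] += 1
                break
    return counts


# Hypothetical sample lines; in practice you would pass an open log file.
sample = [
    '203.0.113.5 - - [01/Jun/2026:10:00:00 +0000] "GET /pricing HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; GPTBot/1.1)"',
    '203.0.113.6 - - [01/Jun/2026:10:01:00 +0000] "GET /blog HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
    '203.0.113.5 - - [01/Jun/2026:10:02:00 +0000] "GET /about HTTP/1.1" 200 900 "-" "Mozilla/5.0 (compatible; GPTBot/1.1)"',
    '198.51.100.9 - - [01/Jun/2026:10:03:00 +0000] "GET / HTTP/1.1" 200 700 "-" "Mozilla/5.0 (Windows NT 10.0) Chrome/120.0"',
]
print(count_visits(sample))
```

The browser request in the sample is ignored on purpose, since only named bots are counted.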
Page-level hits
Counts per URL show which pages each bot actually takes. This is where coverage gaps surface. If your best article never shows up in PerplexityBot's hits, that page is invisible to the engine you care about, and no amount of prompt-checking would have told you. The decision: get that page crawlable and linked so the bots that matter reach it. Page-level data is also how you catch a scraper hammering one page, like your pricing, on a loop.
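A coverage check like the one described above can be sketched in a few lines. This example assumes you have already reduced your logs to (bot, path) pairs. The function names, paths, and priority list are hypothetical.

```python
from collections import defaultdict


def pages_by_bot(hits):
    """Turn (bot, path) pairs into {bot: {path: hit_count}}."""
    table = defaultdict(lambda: defaultdict(int))
    for bot, path in hits:
        table[bot][path] += 1
    return table


def coverage_gaps(table, bot, priority_pages):
    """Return the priority pages that the given bot never requested."""
    seen = table.get(bot, {})
    return [page for page in priority_pages if page not in seen]


# Hypothetical data: PerplexityBot keeps fetching /pricing but has
# never reached the article you care about most.
hits = [
    ("GPTBot", "/blog/best-article"),
    ("GPTBot", "/pricing"),
    ("PerplexityBot", "/pricing"),
    ("PerplexityBot", "/pricing"),
    ("PerplexityBot", "/pricing"),
]
table = pages_by_bot(hits)
print(coverage_gaps(table, "PerplexityBot", ["/blog/best-article", "/pricing"]))
print(max(table["PerplexityBot"].items(), key=lambda item: item[1]))
```

The first print lists the pages a bot is missing, and the second shows its most-requested page, which is one simple way to spot a bot looping on a single URL.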
Response codes
Every bot request gets a status back, and the spread of 2xx, 3xx, 4xx, and 5xx per crawler is a health check most people never run. If GPTBot is collecting 404s, it is hitting dead or expired URLs and wasting its visit on nothing. If a crawler is catching 429 rate-limit blocks, you may be throttling the exact bot you wanted reading you. The decision is concrete: fix the broken links, or loosen the rule on the bot you meant to welcome.
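Here is one way that health check might look, assuming the log has been reduced to (bot, status_code) pairs. The 25% threshold and the wording of the notes are arbitrary choices for illustration, not recommended values.

```python
from collections import Counter, defaultdict


def status_spread(records):
    """Group (bot, status) pairs into {bot: Counter of '2xx', '4xx', ...}."""
    spread = defaultdict(Counter)
    for bot, status in records:
        spread[bot][f"{status // 100}xx"] += 1
    return spread


def flag_problems(records, threshold=0.25):
    """Flag bots whose share of 404 or 429 responses reaches the threshold."""
    totals = Counter()
    by_code = defaultdict(Counter)
    for bot, status in records:
        totals[bot] += 1
        by_code[bot][status] += 1
    problems = {}
    for bot, total in totals.items():
        notes = []
        if by_code[bot][404] / total >= threshold:
            notes.append("fix broken links")
        if by_code[bot][429] / total >= threshold:
            notes.append("review rate limit")
        if notes:
            problems[bot] = notes
    return problems


# Hypothetical records for three bots.
records = [
    ("GPTBot", 200), ("GPTBot", 404), ("GPTBot", 404), ("GPTBot", 200),
    ("ClaudeBot", 429), ("ClaudeBot", 200),
    ("PerplexityBot", 200), ("PerplexityBot", 301),
]
print(dict(status_spread(records)))
print(flag_problems(records))
```

Keeping the spread and the flags separate means the chart can show every status class while the alerts stay limited to the two cases that map to a clear fix.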
Trend over time
A single day's count is noise. The slope is the signal. A climb in training hits means a model is working through your fresh content. A sudden drop can mean a robots.txt change or a server rule quietly shut a bot out. Watching the line, per category, is how you tell a real shift from a blip, and how you prove a change you made actually moved something.
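Reading the slope rather than the day can be sketched as a plain least-squares fit over daily counts, plus a crude check for a sudden drop. Both functions are illustrative assumptions. The 0.5 ratio is an arbitrary cutoff, and the daily numbers are made up.

```python
def slope(daily_counts):
    """Least-squares slope of daily hit counts (hits per day, per day)."""
    n = len(daily_counts)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(daily_counts) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(daily_counts))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den


def sudden_drop(daily_counts, ratio=0.5):
    """True if the latest day falls below `ratio` times the earlier average."""
    if len(daily_counts) < 2:
        return False
    *earlier, latest = daily_counts
    baseline = sum(earlier) / len(earlier)
    return baseline > 0 and latest < ratio * baseline


# Hypothetical daily training-bot hits over one week each.
climbing = [10, 12, 15, 19, 22, 26, 31]
cut_off = [40, 42, 38, 41, 39, 40, 3]
print(slope(climbing) > 0)   # steady climb
print(sudden_drop(cut_off))  # possible robots.txt or server rule change
```

Run per category, a positive slope supports "a model is working through your content", while a flagged drop is a prompt to check recent robots.txt and server rule changes.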
Training-versus-citation mix
This is the metric that predicts the future, and it is the one a prompt tool can never produce. The training bots move first. When a model trains on your content, the search and assistant citations come later. The scale is not small: on Vercel's network, GPTBot alone made roughly 569 million requests in a single month, and ClaudeBot another 370 million, and those are training-weighted crawlers working the open web for fresh content right now. So in your own mix, the training hits are the leading indicator. Feed the training bots clean, open, well-structured pages today and you are buying citations down the line. A prompt tool will only tell you that you are not showing up yet. It will never tell you that GPTBot just crawled forty of your pages last week, which is the thing that predicts whether you show up next.
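The mix itself is simple arithmetic once the counts are sorted by purpose. The sketch below assumes "citation" traffic means AI Search plus AI Assistant hits, which is one reading of the split described above. The function name and the example counts are hypothetical.

```python
def training_citation_mix(category_counts):
    """Return training and citation shares from per-category hit counts.

    Assumes citation-side traffic = AI Search + AI Assistant.
    Data Scraper hits are left out of the ratio.
    """
    training = category_counts.get("AI Training", 0)
    citation = (
        category_counts.get("AI Search", 0)
        + category_counts.get("AI Assistant", 0)
    )
    total = training + citation
    if total == 0:
        return None  # no relevant bot traffic in this window
    return {
        "training_share": training / total,
        "citation_share": citation / total,
    }


# Hypothetical week: heavy training crawl, little live fetching yet.
week = {"AI Training": 30, "AI Search": 5, "AI Assistant": 5, "Data Scraper": 12}
print(training_citation_mix(week))
```

A high training share with a low citation share is the "learned but not yet cited" pattern, and tracking the two shares week over week shows whether the citation side is starting to catch up.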
Why server logs, not browser analytics
You cannot get any of this from a standard analytics tag. Tools like Google Analytics depend on JavaScript running in a browser, and most AI crawlers do not execute JavaScript. Vercel's own measurement found that these crawlers do not run a page's JavaScript at all, which is also why a large share of ChatGPT and Claude fetches land on pages those bots cannot fully read. A browser-side tag simply never sees the visit. The bot requested the page, the server answered, and your analytics dashboard shows nothing.
That is the gap citAEOtion closes. It reads the server-level crawler activity, classifies every bot by purpose, shows the page-level hits and the response codes, and tracks the trend so you are working from the record instead of an estimate. It installs as a WordPress plugin in about five minutes. The thesis is plain: the GA of AI, full data, no BS.
Read the dashboard for the question you are asking
The same data answers different questions for different people. If you run infrastructure, you are reading response codes and per-crawler load to decide what to allow, throttle, or block, and you want that decision grounded in data about that exact bot rather than a blunt rule that might take out real users along with the scrapers. If you own the content, you are reading coverage and the training-versus-citation mix to decide what to publish and how to frame it. If you are reporting up, you are reading the trend lines to show whether AI visibility is moving in the right direction.
What ties those views together is one loop. Your bot mix tells you whether you are being cited or merely consumed. That tells you whether your framing is landing. You change the page, then you watch the search and assistant hits climb to prove the change worked. Measure, learn, reframe, win, on evidence instead of vibes.
From counts to becoming the answer
Per-crawler visit counts are not the destination. They are the floor you build on. Sorted by purpose, paired with page-level hits and response codes and read over time, they tell you exactly how AI systems are treating your content right now and where the next move is. The goal is not to rank well in a model's guess. It is to become the answer, and to know you got there because the real crawler record says so.
Ready to see your own numbers sorted by purpose? Start tracking your AI crawler dashboard metrics, or see how it works first.
Frequently asked questions
What are AI crawler dashboard metrics?
AI crawler dashboard metrics are the numbers your server records about AI bot activity: per-crawler visit counts, page-level hits, response codes, and the trend over time. Read on their own they are just totals. Sorted by the purpose of each bot (training, search, assistant, or scraper), they tell you whether your content is being learned, cited, fetched live, or simply taken.
What do per-crawler visit counts actually tell you?
A per-crawler count tells you how often a specific bot, like GPTBot or PerplexityBot, requested your site over a window. The count only becomes useful once the bot is classified by purpose. A high training count means a model is learning from you. A high assistant count means real users are pulling your pages into live answers. Page-level counts show which pages each bot reaches and which it ignores.
Can I track AI crawler metrics in Google Analytics?
Not reliably. Google Analytics runs on JavaScript in the browser, and most AI crawlers do not execute JavaScript, so the visits never register in a standard analytics tag. Accurate per-crawler counts come from server-level data. citAEOtion reads that record directly as a WordPress plugin, which is why it sees crawler activity a browser-side tool cannot.
Which metric predicts whether AI engines will cite me?
The training-versus-citation mix. Training crawlers move first, and search and assistant citations follow once a model has learned your content. Watching training hits rise on your priority pages is the leading indicator that citations are coming. A prompt-based tool only reports whether you show up today and cannot see the crawl that decides tomorrow.
What should a high error rate in crawler data tell me to do?
A spike in 4xx or 5xx responses for a given bot means it is hitting broken links, expired URLs, or blocked endpoints, so it is spending visits on nothing and may drop your content. Check which URLs return errors for that crawler and fix them. If the errors are 429 rate-limit blocks, you may be throttling a bot you actually want reading you, and the fix is to loosen the rule on that specific agent.