# What is an AI Presence Score?

> The AI Presence Score is a 0-100 metric that measures how well AI models represent your company to buyers. It's built from seven structured signals, not a single model judgment. Here's how it works, what the numbers mean, and why the decomposition matters more than the headline.

- Author: Max Wiesner
- Published: 2026-04-10
- Canonical: https://knitknot.ai/learn/what-is-ai-presence-score/
- Publisher: KnitKnot, the AI Presence Management platform (https://knitknot.ai)

---

## What is an AI Presence Score?

The AI Presence Score is a 0-100 composite metric that quantifies how well AI models represent your company when buyers ask evaluation questions. Higher is better. A score of 80 means AI is generally accurate, favorable, and grounded in your content. A score of 35 means AI is getting things wrong, recommending competitors, or citing sources that work against you.

The score is useful as a headline. But the headline isn't the product. The product is the decomposition: the seven structured signals that add up to the number and tell you exactly what's working, what's broken, and what to fix first.

We built it this way because [the alternative didn't work](/blog/we-stopped-asking-ai-who-wins). When we started, we asked GPT to rate each AI response on a 0-100 scale. It gave us numbers. We put them in charts. But when a company asked why they scored 52, we couldn't answer. The model had compressed recommendation quality, factual accuracy, sentiment, source influence, and feature coverage into a single judgment with no trace. Two runs on the same response came back with different numbers. The score was a vibe check, not a measurement.

So we stopped asking for one number and started extracting seven.

## What are the seven signals behind the score?

Coverage, recommendation outcome, feature comparisons, claim accuracy, sentiment, source influence, and confidence markers. Each signal captures a different dimension of how AI treated your company in a specific response, and each is extracted by an LLM judge doing semantic evaluation of the response, not keyword matching. The weighted composite across signals is the AI Presence Score for that evaluation.

**Coverage** measures how prominently you appear in the response, on a five-level scale: primary, substantial, peripheral, incidental, or absent. A glowing assessment buried in a list entry is not the same as being the subject of the response. Absent coverage zeroes out everything else, because a recommendation nobody reads has no impact.

**Recommendation outcome** is a win, loss, tie, or not-compared verdict for each head-to-head matchup in the response. The verdict is composed deterministically from the judge's structured signals (who was recommended, on what basis, with what framing), not from a single holistic "who won" judgment. The canonical win rate is wins divided by decided outcomes, with ties counted in the denominator. We use that one definition everywhere the win rate appears.
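As a rough illustration, the win rate can be sketched in a few lines. This is one reading of the definition, not the production code: it assumes "decided outcomes" means wins, losses, and ties, and that not-compared matchups are left out of the tally. The function name and verdict labels are hypothetical.

```python
from collections import Counter

def win_rate(verdicts):
    """Wins divided by decided outcomes (wins + losses + ties).

    Assumption: 'not_compared' matchups are excluded from the denominator.
    Returns None when nothing was decided, to avoid dividing by zero.
    """
    counts = Counter(verdicts)
    decided = counts["win"] + counts["loss"] + counts["tie"]
    return counts["win"] / decided if decided else None

# Hypothetical verdicts for five matchups.
verdicts = ["win", "win", "loss", "tie", "not_compared"]
print(win_rate(verdicts))  # 2 wins / 4 decided outcomes = 0.5
```

Counting ties in the denominator means a tie lowers the win rate without counting as a loss.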

**Feature comparisons** produce a per-feature verdict. For every feature the AI compared, did you win, lose, or tie against the named competitor? Feature win rates aggregate from these verdicts, so a feature-level loss is always traceable to the specific responses where the AI handed that feature to the competitor.
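A minimal sketch of that aggregation, assuming each verdict is stored as a row carrying the response it came from. The row shape, feature names, and response IDs are hypothetical, and counting ties in the denominator is an assumption carried over from the overall win rate.

```python
from collections import defaultdict

# Hypothetical per-feature verdict rows: (response_id, feature, verdict).
rows = [
    ("r1", "sso", "win"),
    ("r2", "sso", "loss"),
    ("r3", "sso", "win"),
    ("r2", "pricing", "loss"),
    ("r4", "pricing", "tie"),
]

def feature_report(rows):
    """Aggregate verdicts per feature and keep the response IDs behind each loss."""
    by_feature = defaultdict(lambda: {"win": [], "loss": [], "tie": []})
    for response_id, feature, verdict in rows:
        by_feature[feature][verdict].append(response_id)

    report = {}
    for feature, buckets in by_feature.items():
        total = sum(len(ids) for ids in buckets.values())
        report[feature] = {
            "win_rate": len(buckets["win"]) / total,
            "lost_in": buckets["loss"],  # the drill-down: which responses to read
        }
    return report

report = feature_report(rows)
print(report["sso"]["lost_in"])  # ['r2']
```

Keeping the response IDs next to the rate is what makes a feature-level loss traceable rather than just countable.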

**Claim accuracy** tracks misrepresentations: factual claims the AI made about you that contradict your actual product, pricing, or positioning. Each one carries a proof receipt, the exact knowledge-base source that contradicts what the AI said.

**Sentiment** captures the overall framing on a 0-100 scale. This catches something the other signals don't. The AI can recommend you, get every claim right, and still frame you dismissively. The difference between "Acme is a solid choice" and "You could try Acme, I guess" is a sentiment signal, not a factual one.

**Source influence** measures whose content the AI's cited sources belong to: your domains, competitor domains, or third parties. If the AI built its answer from five competitor blog posts and zero of your pages, the response was synthesized from competitor content. Ownership classification is explicit and per-workspace, so a subdomain you own counts for you.
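Ownership classification can be sketched as a hostname lookup against per-workspace domain lists. The domains, set names, and function below are hypothetical. The only behavior taken from the description above is that a subdomain you own counts for you.

```python
from urllib.parse import urlparse

# Hypothetical ownership lists; each workspace would configure its own.
OWNED = {"acme.example"}
COMPETITOR = {"rival.example"}

def classify_source(url, owned=OWNED, competitor=COMPETITOR):
    """Return 'owned', 'competitor', or 'third_party' for a cited URL."""
    host = (urlparse(url).hostname or "").lower()

    def belongs_to(domains):
        # Exact match or any subdomain, so docs.acme.example counts for acme.example.
        return any(host == d or host.endswith("." + d) for d in domains)

    if belongs_to(owned):
        return "owned"
    if belongs_to(competitor):
        return "competitor"
    return "third_party"

cited = [
    "https://docs.acme.example/pricing",
    "https://rival.example/blog/vs-acme",
    "https://reviews.example.org/acme-vs-rival",
]
print([classify_source(u) for u in cited])  # ['owned', 'competitor', 'third_party']
```

Matching on a leading dot keeps a lookalike host such as `notacme.example` from being counted as yours.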

**Confidence markers** record the AI's conviction level: certain, tentative, or uncertain. A buyer who reads "Acme definitely does not support SOC 2" walks away with a different impression than one who reads "I'm not entirely sure about Acme's SOC 2 status." [Confident misinformation is more damaging](/blog/confident-lies-are-worse-than-hedged-ones) than hedged truth is reassuring, and the score weights it that way.

Two more things happen at scoring time. Every response runs through a universal visibility floor (mention extraction and competitor/feature classification) plus specialty extractors selected per response, such as the competitive-comparison extractor. And the derived results are written down as facts at scoring time, so the same inputs produce the same score every time. No temperature, no prompt sensitivity, no "run it again and hope for the best."
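To make the "weighted composite" concrete, here is a minimal sketch. The weights, the 0-100 scale for each signal, and the points assigned to each coverage level are all hypothetical; the real values are not published here. The only rule taken from the text is that absent coverage zeroes out the score.

```python
# Hypothetical weights in percentage points; they sum to 100.
WEIGHTS = {
    "coverage": 15,
    "recommendation": 20,
    "features": 15,
    "claim_accuracy": 20,
    "sentiment": 10,
    "source_influence": 10,
    "confidence": 10,
}

# Hypothetical mapping from the five coverage levels to a 0-100 value.
COVERAGE_POINTS = {
    "primary": 100,
    "substantial": 75,
    "peripheral": 50,
    "incidental": 25,
    "absent": 0,
}

def presence_score(coverage, signals):
    """Weighted composite on a 0-100 scale; absent coverage zeroes everything."""
    if coverage == "absent":
        return 0.0
    values = dict(signals, coverage=COVERAGE_POINTS[coverage])
    return sum(WEIGHTS[name] * values[name] for name in WEIGHTS) / 100

# Hypothetical signal values for one response, each already on a 0-100 scale.
signals = {
    "recommendation": 100,
    "features": 80,
    "claim_accuracy": 60,
    "sentiment": 70,
    "source_influence": 40,
    "confidence": 90,
}
print(presence_score("substantial", signals))  # 75.25
print(presence_score("absent", signals))       # 0.0
```

Because the function is plain arithmetic over stored signal values, the same inputs give the same score on every run.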

## How does the score differ from coverage and visibility rate?

They are three distinct metrics at three grains, and they are not synonyms. **Coverage** is a per-response category: how prominently you appeared in one answer (primary through absent). **Visibility rate** is an aggregate: the share of scored responses where coverage isn't absent, computed on organic prompts only. Prompts that name your company are excluded from the calculation, because showing up in a question about yourself proves nothing. **AI Presence Score** is the 0-100 composite across all seven signals. A company can have a high visibility rate and a mediocre score: AI mentions them everywhere but gets the facts wrong when it does.
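The visibility rate definition translates almost directly into code. This is a minimal sketch, and the field names are hypothetical.

```python
# Hypothetical scored responses. 'prompt_names_company' marks non-organic prompts.
responses = [
    {"prompt_names_company": False, "coverage": "primary"},
    {"prompt_names_company": False, "coverage": "absent"},
    {"prompt_names_company": False, "coverage": "incidental"},
    {"prompt_names_company": True, "coverage": "primary"},  # excluded: not organic
    {"prompt_names_company": False, "coverage": "absent"},
]

def visibility_rate(responses):
    """Share of organic responses whose coverage is anything other than absent."""
    organic = [r for r in responses if not r["prompt_names_company"]]
    if not organic:
        return None  # no organic prompts, so the rate is undefined
    visible = sum(1 for r in organic if r["coverage"] != "absent")
    return visible / len(organic)

print(visibility_rate(responses))  # 2 visible / 4 organic = 0.5
```

Had the prompt that names the company been included, the rate would rise to 0.6, which is exactly the inflation the organic-only rule avoids.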

## What do AI Presence Score ranges mean?

Roughly: 80+ means AI is working for you, 60-79 means present with specific issues, and below 50 means AI responses are hurting your competitive position. We've computed scores across our full benchmark dataset, which includes 11,600 head-to-head evaluations across 136 competitors. The distribution tells you where a given score falls relative to that dataset.

<div style="margin: 2em 0; border: 1px solid hsl(var(--border)); border-radius: 8px; overflow: hidden;">
<table style="width: 100%; border-collapse: collapse; font-size: 14px;">
<thead>
<tr style="background: hsl(var(--muted) / 0.5);">
<th style="text-align: left; padding: 10px 16px; font-weight: 600; font-size: 11px; letter-spacing: 0.06em; color: hsl(var(--muted-foreground)); border-bottom: 1px solid hsl(var(--border));">Score range</th>
<th style="text-align: center; padding: 10px 16px; font-weight: 600; font-size: 11px; letter-spacing: 0.06em; color: hsl(var(--muted-foreground)); border-bottom: 1px solid hsl(var(--border));">% of evaluations</th>
<th style="text-align: left; padding: 10px 16px; font-weight: 600; font-size: 11px; letter-spacing: 0.06em; color: hsl(var(--muted-foreground)); border-bottom: 1px solid hsl(var(--border));">What it typically means</th>
</tr>
</thead>
<tbody>
<tr>
<td style="padding: 10px 16px; border-bottom: 1px solid hsl(var(--border) / 0.5); font-weight: 500;">80-100</td>
<td style="padding: 10px 16px; border-bottom: 1px solid hsl(var(--border) / 0.5); text-align: center; font-family: var(--font-mono, monospace); color: #3DB0C4;">38.9%</td>
<td style="padding: 10px 16px; border-bottom: 1px solid hsl(var(--border) / 0.5); color: hsl(var(--muted-foreground));">AI is recommending you, claims are accurate, sources are balanced. The response is working for you.</td>
</tr>
<tr>
<td style="padding: 10px 16px; border-bottom: 1px solid hsl(var(--border) / 0.5); font-weight: 500;">60-79</td>
<td style="padding: 10px 16px; border-bottom: 1px solid hsl(var(--border) / 0.5); text-align: center; font-family: var(--font-mono, monospace);">22.1%</td>
<td style="padding: 10px 16px; border-bottom: 1px solid hsl(var(--border) / 0.5); color: hsl(var(--muted-foreground));">Mentioned with mostly accurate information. May have a weak recommendation, a missing feature, or a source imbalance dragging the score down.</td>
</tr>
<tr>
<td style="padding: 10px 16px; border-bottom: 1px solid hsl(var(--border) / 0.5); font-weight: 500;">40-59</td>
<td style="padding: 10px 16px; border-bottom: 1px solid hsl(var(--border) / 0.5); text-align: center; font-family: var(--font-mono, monospace); color: #F67202;">17.3%</td>
<td style="padding: 10px 16px; border-bottom: 1px solid hsl(var(--border) / 0.5); color: hsl(var(--muted-foreground));">Problematic. Usually a combination of factors: competitor recommended, one or two factual errors, source mix skewed toward competitor content. Actionable fixes exist.</td>
</tr>
<tr>
<td style="padding: 10px 16px; border-bottom: 1px solid hsl(var(--border) / 0.5); font-weight: 500;">20-39</td>
<td style="padding: 10px 16px; border-bottom: 1px solid hsl(var(--border) / 0.5); text-align: center; font-family: var(--font-mono, monospace); color: #FC5043;">11.1%</td>
<td style="padding: 10px 16px; border-bottom: 1px solid hsl(var(--border) / 0.5); color: hsl(var(--muted-foreground));">AI is actively working against you. Multiple errors, competitor framing, or peripheral/absent coverage. Every buyer who gets this response is being steered away.</td>
</tr>
<tr>
<td style="padding: 10px 16px; font-weight: 500;">0-19</td>
<td style="padding: 10px 16px; text-align: center; font-family: var(--font-mono, monospace); color: #FC5043;">10.6%</td>
<td style="padding: 10px 16px; color: hsl(var(--muted-foreground));">The AI either doesn't know you exist or has fundamentally wrong information. Absent from the response, or present with critical misrepresentations.</td>
</tr>
</tbody>
</table>
</div>

The distribution leans toward the high end: 61% of evaluations score 60 or above. But the low tail is heavy. 28.6% of evaluations score below 50. That is more than 1 in 4 evaluations where the AI's response hurts the company being evaluated. That's not an edge case. It's a structural fraction of buyer research.
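These figures can be cross-checked against the table with a few lines of arithmetic. One caveat: the table does not split the 40-59 bucket, so the 40-49 share below is only implied by the quoted 28.6%, not published.

```python
# Bucket shares copied from the table above, in percent of evaluations.
buckets = {"80-100": 38.9, "60-79": 22.1, "40-59": 17.3, "20-39": 11.1, "0-19": 10.6}

total = round(sum(buckets.values()), 1)                          # 100.0
at_or_above_60 = round(buckets["80-100"] + buckets["60-79"], 1)  # 61.0
below_40 = round(buckets["20-39"] + buckets["0-19"], 1)          # 21.7

# The text quotes 28.6% below 50, which implies this share for scores of 40-49.
implied_40_to_49 = round(28.6 - below_40, 1)                     # 6.9

print(total, at_or_above_60, below_40, implied_40_to_49)
```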

## Why do scores vary by AI engine?

Each engine has different training data, a different search index, and different citation preferences, so the same company gets different scores from different models. [This is consistent across our dataset](/blog/why-ai-recommends-your-competitor): Gemini produces the highest average AI Presence Score (67.5), Perplexity the lowest (62.7). The spread is 4.8 points, which sounds small until you realize it compounds across hundreds of buyer evaluations.

The per-engine score matters because it tells you where your worst problem is. A company with a blended score of 65 might have a Gemini score of 72 and a Perplexity score of 55. The blended number suggests "room for improvement." The per-engine number says "Perplexity is an emergency." Per-engine trend snapshots are written with every run, using the same formulas as the dashboards, so the trend chart and the headline number can never disagree.
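A small sketch of why the per-engine view is more actionable. The Gemini and Perplexity values come from the example above. The ChatGPT value is hypothetical, chosen so that an unweighted mean lands on 65. Whether the real blended score is an unweighted mean is also an assumption.

```python
# Gemini and Perplexity are from the example in the text; ChatGPT is hypothetical.
engine_scores = {"gemini": 72, "perplexity": 55, "chatgpt": 68}

# Assumption: the blended score is a simple unweighted mean across engines.
blended = sum(engine_scores.values()) / len(engine_scores)
worst_engine = min(engine_scores, key=engine_scores.get)
gap = blended - engine_scores[worst_engine]

print(blended)       # 65.0
print(worst_engine)  # perplexity
print(gap)           # 10.0
```

The blended 65 hides a 10-point gap on one engine, which is where the first fix should go.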

## Why does the decomposition matter more than the headline number?

Because the decomposition is the diagnosis. A score of 42 means nothing by itself. A score of 42 where feature comparisons are strong but claim accuracy is weak tells you: "AI thinks your features are strong but it's recommending the competitor because it has outdated facts about your product. Fix the facts, and the recommendation probably flips."

The decomposition also makes it possible to diff across runs. Score dropped 8 points this month? The signal breakdown can tell you it was a recommendation swing, not a sentiment change. Score went up but you didn't do anything? The breakdown might show that claim accuracy improved, for example because a misrepresentation got corrected in the model's training data. Every movement is traceable to a specific signal.
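A diff across runs can be sketched as a per-signal subtraction. The point contributions below are hypothetical, and they assume each signal's weighted contribution to the score is stored for every run.

```python
# Hypothetical per-signal point contributions for two monthly runs.
previous = {
    "coverage": 12, "recommendation": 18, "features": 10, "claim_accuracy": 12,
    "sentiment": 7, "source_influence": 6, "confidence": 5,
}
current = dict(previous, recommendation=10)  # only the recommendation signal moved

def signal_diff(prev, curr):
    """Per-signal change between two runs, largest movement first."""
    deltas = {name: curr[name] - prev[name] for name in prev}
    return sorted(deltas.items(), key=lambda item: abs(item[1]), reverse=True)

score_change = sum(current.values()) - sum(previous.values())
print(score_change)                           # -8
print(signal_diff(previous, current)[0])      # ('recommendation', -8)
```

Here the whole 8-point drop sits in one signal, so the next step is to read the responses where the recommendation flipped.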

And every number is auditable. Each headline metric drills down to the exact underlying AI responses, and the drill-down list is the same row set the number was computed from. If someone asks why they scored 42, the answer should never be "the model felt that way." The answer should be: here are the three claims that were wrong (with the contradicting sources attached), here are the features you lost on, here is the competitor page that shaped the recommendation. Three different problems, three different fixes, none of them "make your product better."

## How does the score relate to win/loss outcomes?

The AI Presence Score and the competitive outcome (win/loss/tie) measure different things. The score captures the full quality of the AI's representation: accuracy, sentiment, sources, coverage. The competitive outcome captures whether you got the recommendation.

They correlate, but they're not the same. A company can win the recommendation and still have a mediocre score if the AI got several facts wrong along the way. That's a fragile win: the next query might flip the recommendation if the facts get slightly worse.

Conversely, a company can lose the recommendation while scoring well on features and accuracy but poorly on source balance. That means the AI knows your product is strong but is citing competitor content as its primary source. The fix is publishing the comparison page that shifts the source balance, not changing the product.

The decomposition tells you which scenario you're in. The overall score doesn't.

## Frequently asked questions

### How is the AI Presence Score different from an AI visibility score?

Visibility measures whether you appear; the AI Presence Score measures the quality of what AI says when you do. The precise visibility metric is a visibility rate: the share of responses where your coverage isn't absent, computed on organic prompts only (prompts that name your company are excluded). The AI Presence Score adds recommendation outcome, feature comparisons, claim accuracy, sentiment, source influence, and confidence. A high visibility rate with a low AI Presence Score means AI mentions you often but gets things wrong when it does.

### Why seven signals instead of one holistic judgment?

Because one judgment hides the diagnosis. When a model says you scored 52, you can't tell whether the problem is pricing, features, sources, or sentiment. The seven signals tell you exactly what's wrong and in what order to fix it. Different error types have different fixes, and the decomposition matches errors to remediation.

### Does the score change over time?

Yes. AI models update their training data and search indices at different cadences. A pricing correction on your website might improve your ChatGPT score within weeks but take months to affect Claude. We track scores over time through periodic benchmarks so companies can see whether fixes are landing and which models are updating fastest.

### What's a good AI Presence Score?

Based on the distribution across our benchmark dataset: 80+ puts you in the top 39% where AI is actively working for you. 60-79 is the middle ground where you're present but have specific issues to address. Below 50 (which 28.6% of evaluations fall into) means AI is producing responses that hurt your competitive position.

### Can I improve my score without changing my product?

Almost always. The fixes are content-level, not product-level: publishing a comparison page, updating pricing, adding structured data, or creating documentation that directly answers buyer evaluation questions. The score measures how AI represents you, and representation is a function of the source material available to the AI, not the quality of the underlying product.
