Do AI assistants recommend themselves when asked for the best AI app?

No. In a 34-question test across ChatGPT, Claude, Gemini and Mistral with web search on, the models cross-recommended each other rather than favoring themselves. ChatGPT named itself in 14 of its answers, while Gemini named ChatGPT 17 times and Claude named it 16, so ChatGPT rated itself lower than its rivals rated it. Mistral did not name its own Le Chat a single time. The only self-preference left was on Google's AI Overviews, which led the flagship query with Gemini.

What does AI self-preference mean for GEO and getting recommended?

The bias that decides recommendations is consensus, not model vanity. Grounded assistants retrieve the web and converge on the names that are written about most across earned media, listicles and community posts. To be recommended you need a retrievable footprint on those sources, not a clever pitch inside one model. A product with no third-party coverage is invisible to every assistant, including the one built by its own maker.

Does grounding remove stale AI recommendations?

Largely, yes. With web search on, Bard, which Google renamed to Gemini in early 2024, drew a single stray mention across the whole test and nothing from the other engines. Without grounding, models keep recommending products that no longer exist. In a separate budgeting test, an app that shut down in 2024 still got recommended by models that were not searching.

Do AI assistants recommend themselves? I tested it.

I run an index that asks ChatGPT, Claude, Gemini and others to name the best AI app. With search on, none of the big models crown themselves. The real bias hides in one quieter place, and the small players are simply invisible.

Conversational Ads June 3, 2026

The fear about an AI ranking other AI is self-dealing: ask ChatGPT for the best AI app and it crowns ChatGPT. I run a small index that tests exactly this kind of question across assistants, so I checked. The fear is wrong, and what replaces it is more useful to know than the headline.

The setup: 34 buyer-style questions, from “best AI app 2026” to “best AI app for writing,” asked of ChatGPT, Claude, Gemini and Mistral on their APIs with web search on, plus Google’s AI Overviews captured from live search. US English, June 2026. Four of the assistants I query are also apps on the board, so the test is recursive by design. I only read self-preference from the open questions, the ones that name no brand, so a model talking about itself in a “ChatGPT vs Gemini” prompt does not count.
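The open-question filter described above can be sketched in a few lines. This is a minimal illustration, not the index's actual code: the `BRANDS` list and the `is_open_question` helper are assumptions made up for the example, and a real filter would likely need aliases and looser matching.

```python
import re

# Hypothetical brand list for illustration; a real index may track more names.
BRANDS = ["ChatGPT", "Claude", "Gemini", "Mistral", "Le Chat", "Perplexity", "Copilot"]

def is_open_question(prompt: str, brands=BRANDS) -> bool:
    """Return True when the prompt names no tracked brand."""
    for brand in brands:
        # Whole-word, case-insensitive match on the brand name.
        if re.search(rf"\b{re.escape(brand)}\b", prompt, flags=re.IGNORECASE):
            return False
    return True

prompts = ["best AI app 2026", "best AI app for writing", "ChatGPT vs Gemini"]
open_prompts = [p for p in prompts if is_open_question(p)]
print(open_prompts)  # ['best AI app 2026', 'best AI app for writing']
```

Only the prompts that pass this filter would feed a self-preference count, so a head-to-head prompt such as "ChatGPT vs Gemini" is excluded before any tallying.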

The big models do not crown themselves

On the open questions, the models behaved like a panel that already agrees. ChatGPT, Claude and Gemini clustered at the top, scoring 46, 44 and 40 respectively on the share of answers that name them. The gap to Perplexity (35) and Microsoft Copilot (28) is clear, and everything else sits in single digits.

The revealing part is how each model treats itself. ChatGPT named ChatGPT in 14 of its answers. Gemini named ChatGPT 17 times, Claude named it 16. So ChatGPT rated itself lower than its rivals rated it. Claude and Gemini land inside the noise of how their rivals rate them. And Mistral, asked the same open questions with search on, did not name its own Le Chat once. None of the models favored its own product. They favored the same short list of names, including their competitors.
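One way to put a number on this is a self-preference gap: an engine's count for its own product minus the average count its rivals give that product. The sketch below uses only the three ChatGPT figures quoted in the text. The `self_preference_gap` helper is a hypothetical name, and it assumes each count is a number of answers over the same question set. Mistral's count for ChatGPT is not given, so it is left out.

```python
from statistics import mean

# Answers naming ChatGPT, keyed by the engine that produced them (figures from the text).
chatgpt_mentions = {"ChatGPT": 14, "Gemini": 17, "Claude": 16}

def self_preference_gap(mentions_by_engine: dict, own_engine: str) -> float:
    """Own mention count minus the mean count from rival engines.

    A positive value means the engine names its own product more often
    than its rivals do; a negative value means less often.
    """
    own = mentions_by_engine[own_engine]
    rivals = [n for engine, n in mentions_by_engine.items() if engine != own_engine]
    return own - mean(rivals)

gap = self_preference_gap(chatgpt_mentions, "ChatGPT")
print(gap)  # -2.5
```

A negative gap is the opposite of self-dealing: on these figures ChatGPT names itself about 2.5 answers fewer than the average of Gemini and Claude.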

The bias that moves recommendations is consensus, not vanity. The grounded models retrieve the web and converge on whoever is written about most.

That is the GEO point in one line. A grounded assistant is not reaching for its own brand. It is reaching for whatever the web says, and the web says ChatGPT, Claude, Gemini. If you want to be named, you need to be in the material the model retrieves: earned coverage, the “best X app” listicles, the community threads. Presence on those sources is the lever, not anything you can say inside a single model.

The one bias that survives is Google’s

Self-preference did show up in one place, and it is the place most people actually see. Not the Gemini API, which behaved like the other majors, but Google’s AI Overviews, the box on top of search results. Asked “best AI app 2026,” that box led with Gemini and did not mention ChatGPT at all. Spread across more questions it evened out, but on the headline query the most-seen surface put its own product first.

The model and the surface are not the same thing. A Google API call and the AI Overview box can lean different ways on the same question, because the box is a packaged product Google controls end to end. When you plan for AI visibility, audit the surface a user lands on, not only the model behind it. The bias you need to account for can live in the wrapper.
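A surface audit can start as simply as comparing which brand each surface names first for the same query. The sketch below is an assumption-heavy illustration: `lead_brand` is a hypothetical helper, the surface labels are made up, and the captured answers are invented placeholder text, not output from the test.

```python
def lead_brand(answer: str, brands):
    """Return the tracked brand that appears earliest in the answer, or None."""
    lowered = answer.lower()
    positions = {b: lowered.find(b.lower()) for b in brands}
    found = {b: pos for b, pos in positions.items() if pos != -1}
    return min(found, key=found.get) if found else None

BRANDS = ["ChatGPT", "Claude", "Gemini"]

# Hypothetical captures for one query, invented for this example only.
captures = {
    "model_api": "ChatGPT, Claude and Gemini are the usual picks.",
    "search_overview": "Gemini is a strong choice, alongside Claude.",
}

leads = {surface: lead_brand(text, BRANDS) for surface, text in captures.items()}
print(leads)  # {'model_api': 'ChatGPT', 'search_overview': 'Gemini'}
```

When the lead brand differs between two surfaces on the same query, that difference is the thing to investigate, whichever direction it runs.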

No footprint, no mention

The flip side of consensus is brutal for anything new. Mistral’s Le Chat scored zero, named by no engine including Mistral itself. Character.AI, Replika, Pi, Poe and Grok all sit at one or two mentions despite real user bases. A product the open web has not written about cannot be retrieved, so it cannot be recommended, and your own platform talking about you does not move any other assistant.

This is the hard floor under every GEO plan. Before positioning or phrasing matters, you have to exist somewhere the model reads: reviews, comparisons, forum answers, a few ranked listicles that name you. Without that, you are not ranked low. You are absent.

Grounding kills the ghost

I seeded the test with Bard, the assistant Google renamed to Gemini in early 2024, as a check for stale recommendations. With search on, the assistants got it right almost everywhere: Bard drew one stray mention across the whole test and nothing from the other engines. Web grounding corrects the stale names that a model's training memory preserves. The same check in a budgeting test caught an app that shut down in 2024 still being recommended by models that were not searching.
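The stale-name check is easy to reproduce on your own captured answers. This is a minimal sketch under stated assumptions: the `RETIRED` map and `stale_mentions` helper are hypothetical names, the Bard to Gemini entry comes from the text, and the sample answers are invented for the example.

```python
import re

# Retired name -> current name. Bard -> Gemini is from the text; add your own entries.
RETIRED = {"Bard": "Gemini"}

def stale_mentions(answers, retired=RETIRED):
    """Count the answers that mention each retired name as a whole word."""
    counts = {name: 0 for name in retired}
    for text in answers:
        for name in retired:
            # Case-sensitive whole-word match, so "Bombard" does not count as "Bard".
            if re.search(rf"\b{re.escape(name)}\b", text):
                counts[name] += 1
    return counts

# Hypothetical answers, invented for this example only.
sample = [
    "Top picks: ChatGPT, Claude and Gemini.",
    "Try Google Bard for quick drafts.",
    "Bombard your notes with tags in Notion AI.",
]
print(stale_mentions(sample))  # {'Bard': 1}
```

Running the same count on grounded and ungrounded answers to the same questions shows how much of a model's staleness comes from training memory rather than from the live web.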

For anyone tracking their own brand, this is the difference between a model’s frozen memory and what it pulls live. The live answer is the one that counts now, and it rewards a current footprint over an old one. If your recent coverage is thin, the assistant is working from whatever it last saw.

The famous names lose the specific question

One more result reads as an opening. The big assistants win the generic question and lose the narrow one. Ask for the best app for coding and they name Cursor and GitHub Copilot, none of the consumer chatbots. For writing, Grammarly and Notion AI come up alongside them. For research, Perplexity and a wall of academic tools. The household names own “best AI app” and fade the moment a job is attached.

That fracture is where a smaller product can win. You are unlikely to displace ChatGPT on “best AI app.” You can be the answer to “best AI app for [your specific job],” because that is where the consensus is thin and a focused, well-documented footprint goes further. The same logic runs through conversational ASO and the intent router: the assistant routes specific intents, and specificity is contestable in a way the generic crown is not.

TL;DR

Asked for the best AI app, the models do not pick themselves. ChatGPT rated itself lowest, and Mistral never named its own Le Chat. They cross-recommend the same consensus names.
The recommendation bias is consensus, not vanity. Being named means being in what the grounded model retrieves: earned media, listicles, community threads.
The one self-preference that survives is the platform surface. Google’s AI Overviews led “best AI app” with Gemini and skipped ChatGPT. Audit the wrapper, not only the model.
No third-party footprint, no mention. Le Chat scored zero across every engine. Existence on retrievable sources comes before any positioning.
Grounding kills stale picks. Renamed Bard nearly vanished with search on. Keep a current footprint.
Generic queries belong to the famous names. Job-specific queries are open. Win the narrow question.