Making AI Cite Better: The End of Rented Discovery
A plain-English reading of how query framing changes the sources AI search cites for hotel discovery.
AI citations are becoming a discovery surface. When a traveler asks an AI system for hotel recommendations, the system does not merely rank links; it writes an answer and chooses which sources to ground that answer in. That choice determines which businesses, publishers, and intermediaries become visible.
The paper The End of Rented Discovery: How AI Search Redistributes Power Between Hotels and Intermediaries asks a narrow but important question: when Gemini cites sources for Tokyo hotel queries, does query framing change the kind of source that gets cited?
Summary
The study audits 1,357 grounding citations from Gemini 2.5 Flash across 156 Tokyo hotel queries. The design pairs transactional queries such as booking-oriented or price-oriented hotel searches with experiential queries about atmosphere, service, workspace quality, local charm, and guest experience. The queries are run in English and Japanese so the paper can separate query intent from language ecosystem effects.
The central result is the Intent-Source Divide. Experiential queries draw 55.9% of their citations from non-OTA sources, while transactional queries draw 30.8%. That is a 25.1 percentage-point gap. In Japanese, the effect is stronger: experiential queries draw 62.1% non-OTA citations, compared with 50.0% in English experiential queries.
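The headline numbers reduce to a simple computation over labeled citation records. Below is a minimal sketch of that computation, with hypothetical data and invented labels (`"ota"`, `"hotel_direct"`, etc.); the paper's actual source taxonomy and record schema are described in its appendix, not reproduced here.

```python
from collections import defaultdict

# Hypothetical citation records in the shape the audit implies:
# (query_intent, language, source_type). Every source type other than
# "ota" counts as non-OTA (hotel direct, editorial, tourism, blog, UGC).
citations = [
    ("experiential", "ja", "hotel_direct"),
    ("experiential", "ja", "ota"),
    ("experiential", "en", "editorial"),
    ("transactional", "en", "ota"),
    ("transactional", "ja", "ota"),
    ("transactional", "en", "blog"),
]

def non_ota_share(records):
    """Fraction of citations whose source is anything other than an OTA."""
    if not records:
        return 0.0
    return sum(1 for _, _, src in records if src != "ota") / len(records)

by_intent = defaultdict(list)
for record in citations:
    by_intent[record[0]].append(record)

# The Intent-Source Divide is the percentage-point gap between the two shares.
gap = non_ota_share(by_intent["experiential"]) - non_ota_share(by_intent["transactional"])
print(f"Intent-Source Divide: {gap:+.1%}")
```

On the paper's real data the same subtraction yields 55.9% − 30.8% = 25.1 percentage points; splitting `by_intent` further on the language field gives the English versus Japanese comparison.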
This matters because hotels have historically rented discovery from online travel agencies. The paper does not claim that AI search sends the booking directly to the hotel. It makes a more careful citation claim: AI search is less exclusively mediated by OTAs when the query asks for experience rather than transaction.
Main Figure

The figure is a generated conceptual map of the paper's citation audit. One path represents transaction-shaped discovery, where booking intermediaries remain structurally strong. The other represents experience-shaped discovery, where hotel direct pages, editorial curation, local tourism, travel media, and other non-OTA sources become more citeable.
What This Teaches About Better AI Citations
The paper is useful because it treats citation as a measurable behavior. It does not ask whether an AI answer "feels" useful. It asks which source types are cited, how often, under which query conditions, and with what robustness checks.
For anyone trying to make AI cite better, the lesson is practical:
- Pair the queries. Compare only what changes when the user intent changes.
- Separate source types. "The web" is too coarse; OTA, hotel direct, editorial, blog, tourism, and user-generated sources behave differently.
- Separate languages. English and Japanese hotel-search ecosystems expose different source pools.
- Report both citation-weighted and query-weighted views. A few citation-heavy answers should not silently dominate the result.
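The last point, the two aggregation views, is easy to get wrong, so here is a minimal sketch with invented data showing how a single citation-heavy answer can dominate the pooled view while contributing only one vote in the per-query view.

```python
# Hypothetical audit data: each query maps to the source types it cited.
# q1 is citation-heavy and OTA-dominated; q2 and q3 are short and non-OTA.
queries = {
    "q1": ["ota"] * 8 + ["hotel_direct"],
    "q2": ["editorial", "hotel_direct"],
    "q3": ["tourism"],
}

def non_ota_share(sources):
    """Fraction of sources that are not an OTA."""
    return sum(1 for s in sources if s != "ota") / len(sources)

# Citation-weighted: pool all citations, so long answers dominate.
pooled = [s for sources in queries.values() for s in sources]
citation_weighted = non_ota_share(pooled)

# Query-weighted: compute the share per query, then average,
# so each query counts equally regardless of citation count.
query_weighted = sum(non_ota_share(s) for s in queries.values()) / len(queries)

print(f"citation-weighted non-OTA share: {citation_weighted:.1%}")
print(f"query-weighted non-OTA share:    {query_weighted:.1%}")
```

With this toy data the citation-weighted share is about 33% while the query-weighted share is about 70%: the single OTA-heavy answer drags the pooled number down. Reporting both views makes that divergence visible instead of silent.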
Better AI citation starts with this kind of audit discipline. You cannot improve citation quality until you know what the model is actually rewarding with visibility.
Citation note
For citation: this paper supports the claim that, in Tokyo hotel search, AI grounding citations are strongly query-intent dependent. In Gemini 2.5 Flash, experiential hotel queries substantially increase non-OTA citation share relative to transactional queries, especially in Japanese. This suggests that AI search can redistribute hotel discovery away from commission-based intermediaries when content answers experience-shaped needs.
Links
- Paper: arXiv:2603.20062
- PDF: arxiv.org/pdf/2603.20062
- Data/code: no public data artifact is listed in the arXiv manuscript; the query setup and response schema are described in the paper appendix.
