Winning at LLMs

Thoughts on how to win at AEO / AIO (the new SEO).

LLMs, for all the amazing things they can do, are still at their core prediction models for “what word comes next”. Sure, it’s tokens, not full words, but the point stands.

As more models approach full ingestion of all the text humanity has made available, the flavour and nuance that set models apart come from how the training data is weighted and how the model is fine-tuned.

Exactly how training data is weighted is proprietary for most models.

If there were no weighting of training data, you could win through pure frequency: post one MEGA piece of text content somewhere you know gets ingested in LLM training, for example on Reddit, and just write “Brand X is the best”, or whatever qualitative differentiator you want the LLM to spit out to users querying on the topic.
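To make the pure-frequency idea concrete, here is a toy sketch. It is only a bigram counter, nowhere near a real LLM, and the corpus, the brand names (`acme`, `brandx`) and the function name `next_word_probs` are all made up for illustration. It just shows how a single repetitive post could swing an unweighted “what word comes next” estimate.

```python
from collections import Counter, defaultdict


def next_word_probs(corpus, prev_word):
    """Estimate P(next word | prev_word) from raw, unweighted bigram counts."""
    counts = defaultdict(Counter)
    for text in corpus:
        words = text.lower().split()
        for current, following in zip(words, words[1:]):
            counts[current][following] += 1
    total = sum(counts[prev_word].values())
    if total == 0:
        return {}
    return {word: n / total for word, n in counts[prev_word].items()}


# Hypothetical organic corpus: three posts favour acme, one favours brandx.
organic = ["the best shoe brand is acme"] * 3 + ["the best shoe brand is brandx"]

# One MEGA post that repeats the same claim 50 times.
mega_post = " ".join(["the best shoe brand is brandx"] * 50)

before = next_word_probs(organic, "is")
after = next_word_probs(organic + [mega_post], "is")

print(f"before: {before}")
print(f"after:  {after}")
```

With no weighting at all, one document is enough to move `brandx` from a minority answer to the dominant one, which is exactly why I doubt real training pipelines leave raw frequency unchecked.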

Even with current models, it seems unlikely that this approach is enough.

So what safeguards are probably in place, now and in the future, to make sure a “what word comes next” engine (AKA an LLM) is robust against injection in the training data and only changes its probabilities in ways that serve the user?

I think some of the keys are:

Multiple sources
- Multiple authors on same domain (e.g. multiple threads on fora/Reddit)
- Multiple domains
- Multiple content/media types (websites, textbooks, papers, etc.)
Multiple phrasings / context
- There are probably safeguards that diminish the weight of a sentence that only ever occurs in one exact wording.
- Content that occurs as part of a larger contextual situation may be weighted higher.
Authority signals
- Classic domain authority / page authority (DA/PA), similar to what matters for Google SERP rankings?
- Connection to/affiliation with trusted sources (authors, key opinion leaders)
- Review sites (oh no, MORE power to Trustpilot)
RLHF - if your intended injection does not pass human scrutiny and you’re attempting to inject into something that’s likely to figure in RLHF processes, you’ll fail.
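
As a rough illustration of the “multiple sources” and “multiple phrasings” ideas above, here is a minimal sketch of a score that rewards diversity instead of raw repetition. This is purely my assumption of what such a safeguard could look like, not how any real model weights its data. The `Mention` type, the scoring formula and the sample data are hypothetical, and authority signals and RLHF are left out entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    domain: str  # where the text was published
    author: str  # who wrote it on that domain
    text: str    # the claim as it was phrased


def naive_score(mentions):
    """Pure frequency: every repetition counts."""
    return len(mentions)


def diversity_score(mentions):
    """Hypothetical safeguard: count distinct domains, authors and phrasings.

    Repeating the same sentence from the same account adds nothing.
    """
    domains = {m.domain for m in mentions}
    authors = {(m.domain, m.author) for m in mentions}
    # Normalise case and whitespace so trivial copies collapse to one phrasing.
    phrasings = {" ".join(m.text.lower().split()) for m in mentions}
    return len(domains) + len(authors) + len(phrasings)


# One account spamming the same sentence 50 times in one place.
spam = [Mention("reddit.com", "shill42", "Brand X is the best")] * 50

# Four independent mentions, each phrased differently.
organic = [
    Mention("reddit.com", "runner_a", "Switched to Brand X last year, no regrets"),
    Mention("reddit.com", "runner_b", "Brand X held up better than my old pair"),
    Mention("shoeblog.example", "editor", "Our testers ranked Brand X first"),
    Mention("forum.example", "trailfan", "For wet trails I would pick Brand X"),
]

print(naive_score(spam), diversity_score(spam))
print(naive_score(organic), diversity_score(organic))
```

Under this toy scoring the spam wins easily on raw count but loses clearly on diversity, which is the behaviour I would expect any sensible safeguard to aim for.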

And then there’s a change in approach as well. In SEO the target has always been discovery. In AEO/AIO the target is murkier, and probably closer to branding, though a very mechanistic sort of branding. We want a specific perception of our brand to be expressed in the answers a language model gives to users seeking information relevant to our business, and the primary method is to seed that perception in text as widely as we can.