
The context of the LLM chat

The context window is the amount of text or code a Large Language Model (LLM) can consider when generating a response. In simple terms, the context is the history of the conversation: the messages you've sent and the model's previous replies, which help the LLM understand what you're talking about. Without context, every message would be treated in isolation, like talking to someone with no memory.
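To make this concrete, here is a minimal sketch of how a chat client turns history into context. It is not tied to any vendor, and `call_llm` is a hypothetical stand-in for a real chat-completion call, but OpenAI- and Anthropic-style chat APIs follow the same shape: the client re-sends the full message list on every request, and that list is all the model ever sees.

```python
# Minimal sketch: chat "memory" is just the client re-sending history.
# `call_llm` is a hypothetical stand-in for a real chat-completion call.

def call_llm(history: list[dict]) -> str:
    # A real client would make an HTTP request to the model here.
    # The key point: `history` is the ONLY context the model receives.
    return f"(model reply based on {len(history)} messages of context)"

messages: list[dict] = []

def ask(question: str) -> str:
    messages.append({"role": "user", "content": question})
    reply = call_llm(messages)  # the whole conversation goes along
    messages.append({"role": "assistant", "content": reply})
    return reply

print(ask("My project uses Python 3.12."))    # 1 message of context
print(ask("Which Python version do I use?"))  # 3 messages: the model can
                                              # only answer because the
                                              # earlier turn is re-sent
```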

Claude 3.5 Sonnet has a context window of 200,000 tokens. This large window lets the model take a great deal of information into account when generating responses, which is a real advantage for tasks that involve analyzing large codebases or documents, or keeping a long conversation coherent.

GPT-4o via the API offers a context window of 128,000 tokens. While smaller than Claude 3.5 Sonnet's, it is still a substantial improvement over earlier models and allows large amounts of text or code to be processed.
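As a rough illustration of working with these limits, the sketch below checks whether a piece of text fits in a 128,000-token window using the tiktoken tokenizer. This is an assumption for illustration: exact limits, tokenizers, and reply headroom vary by model and vendor.

```python
# pip install tiktoken
# Rough sketch of checking whether text fits a model's context window.
import tiktoken

CONTEXT_WINDOW = 128_000  # e.g. GPT-4o via the API, per the figures above

# Assumption: the o200k_base encoding approximates GPT-4o's tokenizer.
enc = tiktoken.get_encoding("o200k_base")

def fits_in_window(text: str, reserved_for_reply: int = 4_000) -> bool:
    """True if `text` plus headroom for the model's reply fits the window."""
    n_tokens = len(enc.encode(text))
    return n_tokens + reserved_for_reply <= CONTEXT_WINDOW

print(fits_in_window("def hello():\n    print('hi')"))  # True
```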

If something is not in the context of the specific LLM chat, the model does not remember it and will not use it. The main limitations are:

  1. No Long-Term Memory (Unless Explicitly Designed)
    By default, models don't remember you between chats. Some implementations (like ChatGPT with memory enabled) allow limited long-term memory.
  2. Context Loss Over Time
    As the conversation grows and the context window fills up, older messages may be truncated, summarized, or "forgotten" (see the sketch after this list). This can lead to repetition, inconsistency, or loss of key background details.
  3. Misinterpretation of Ambiguity
    If the context is vague, the model may make assumptions or misunderstand your intent. Precision matters, especially in longer interactions.
  4. No True Understanding or State Awareness
    LLMs do not have awareness, emotion, or a persistent sense of "self." All responses are generated based on pattern recognition, not conscious memory or understanding.
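The truncation described in point 2 can be pictured with a small sketch: when the conversation outgrows the budget, the oldest turns are dropped first while any system prompt is kept. Counting words instead of real tokens is a deliberate simplification here; a production client would use the model's tokenizer.

```python
# Sketch of sliding-window truncation: keep the system prompt, drop the
# oldest turns until the remaining history fits the budget. Word counts
# stand in for real token counts to keep the example self-contained.

def truncate_history(messages: list[dict], budget: int = 1_000) -> list[dict]:
    system = [m for m in messages if m["role"] == "system"]
    turns = [m for m in messages if m["role"] != "system"]

    def size(msgs: list[dict]) -> int:
        return sum(len(m["content"].split()) for m in msgs)

    # Drop oldest turns first until system prompt + remaining turns fit.
    while turns and size(system) + size(turns) > budget:
        turns.pop(0)  # these are the "forgotten" messages: gone for good
    return system + turns

history = [{"role": "system", "content": "You are a coding assistant."}]
history += [{"role": "user", "content": f"message {i} " * 50} for i in range(100)]
trimmed = truncate_history(history, budget=500)
print(len(history), "->", len(trimmed))  # the oldest turns have been dropped
```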

In upcoming posts we will discuss tactics for filling the context.