ConversationSummaryMemory

1. What is ConversationSummaryMemory?

ConversationSummaryMemory stores a running summary of the conversation instead of storing raw messages.

After each turn, the LLM updates the summary to include new important information.


2. Why does it exist?

Buffer-based memories:

  • Grow forever

  • Cost tokens

  • Include noise

Summary memory solves this by:

  • Compressing old conversation

  • Keeping only what matters

  • Maintaining long-term context cheaply

In short:

Remember the conversation as a summary, not a transcript.


3. Real-world analogy

Think of:

  • ❌ Chat log → word-by-word recording

  • ✅ Summary → meeting minutes

You don’t remember every sentence, only the important points.


4. Minimal working example (Gemini)
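The section below is a minimal sketch, assuming LangChain's legacy memory API (`langchain` plus the `langchain-google-genai` integration package) and a `GOOGLE_API_KEY` set in the environment; exact import paths vary by version, and the model name is just an example.

```python
# Sketch: ConversationSummaryMemory with Gemini via LangChain.
# Assumes: pip install langchain langchain-google-genai, and GOOGLE_API_KEY set.
from langchain.chains import ConversationChain
from langchain.memory import ConversationSummaryMemory
from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash", temperature=0)

# The same LLM is used to rewrite the running summary after each turn.
memory = ConversationSummaryMemory(llm=llm)

chain = ConversationChain(llm=llm, memory=memory)

chain.predict(input="Hi, I'm Alice. I'm building a chatbot in Python.")
chain.predict(input="It should answer questions about my company's docs.")

# The memory holds a summary, not the raw transcript:
print(memory.buffer)
```

Running this prints a short LLM-written summary of the two turns rather than the messages themselves.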


5. What does it store internally?

Example summary:

"The human is a Python developer named Alice who is building a chatbot to answer questions about her company's documentation. The AI has offered to help." (illustrative; the actual wording is produced by the LLM)

Notice:

  • No raw messages

  • Just important facts


6. How does it work internally?

After each turn:

  1. The previous summary is retrieved

  2. The new human and AI messages are appended to it

  3. The LLM rewrites the summary to fold in the new information

So the summary evolves over time.
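The loop above can be sketched in plain Python. The `fake_summarize` function below is a deterministic stand-in for the real LLM call (an assumption for offline runnability, not LangChain's actual prompt), but the update loop has the same shape:

```python
def update_summary(summarize, summary, human_msg, ai_msg):
    """One turn of summary memory: rewrite the old summary to fold in the new messages."""
    prompt = (
        "Current summary:\n" + summary +
        "\nNew lines:\nHuman: " + human_msg +
        "\nAI: " + ai_msg +
        "\nNew summary:"
    )
    return summarize(prompt)

def fake_summarize(prompt):
    # Hypothetical stand-in for an LLM: just keeps the conversation lines.
    # A real LLM would compress and rephrase them instead.
    lines = [l for l in prompt.splitlines() if l.startswith(("Human:", "AI:"))]
    return " ".join(lines)

summary = ""
summary = update_summary(fake_summarize, summary, "I'm Alice, a Python dev.", "Nice to meet you!")
summary = update_summary(fake_summarize, summary, "I'm building a chatbot.", "Great, tell me more!")
print(summary)  # a single evolving summary string, not a message list
```

Note the key design point: each turn replaces the summary wholesale, which is why details can drift or be dropped over many turns.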


7. Key characteristics

Feature              Summary Memory
-------              --------------
Stores               Summary text
Token usage          Low
Long conversations   Handled well
Exact wording        Lost
Fact accuracy        Medium


8. Comparison with other memories

Memory Type    Best at
-----------    -------
Buffer         Short chats
Window         Recent context
Token buffer   Cost control
Entity         Facts
KG             Relationships
Summary        Long conversations


9. Common mistakes

❌ Expecting exact quotes

❌ Using it for precise instructions

❌ Assuming summaries never drift

Summaries can lose detail over time.


10. When should you use it?

Use ConversationSummaryMemory when:

  • Conversations are long

  • You want long-term context

  • Exact wording is not important

Avoid when:

  • You need step-by-step instructions

  • You need recent verbatim context


11. One-line mental model

ConversationSummaryMemory = rolling conversation summary
