1. What is it?
VectorStoreRetrieverMemory stores conversation history as embeddings in a vector database and retrieves only the most relevant pieces of past information when needed.
It is semantic memory, not chronological memory.
2. Why does it exist?
Other memory types fall short when:
Conversations are very long
Old but important facts must be recalled
Exact wording doesn't matter, but meaning does
Vector retriever memory provides:
Long-term memory
Semantic recall
Scalable memory storage
In short:
Remember by meaning, not by order.
3. Real-world analogy
Think of:
Buffer memory → chat log
Summary memory → meeting notes
Vector memory → search engine for your past conversations
You “search” your memory by meaning.
4. Minimal working example (Gemini + FAISS)
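Note two details before running this: GoogleGenerativeAIEmbeddings is imported from langchain_google_genai, and FAISS cannot be initialized from an empty text list, so the index is created explicitly. The dimension 768 matches models/embedding-001; adjust it if you swap embedding models.

```python
import os

import faiss
from langchain.chains import ConversationChain
from langchain.memory import VectorStoreRetrieverMemory
from langchain_community.docstore.in_memory import InMemoryDocstore
from langchain_community.vectorstores import FAISS
from langchain_google_genai import ChatGoogleGenerativeAI, GoogleGenerativeAIEmbeddings

llm = ChatGoogleGenerativeAI(
    model="gemini-2.5-flash",
    google_api_key=os.getenv("GEMINI_API_KEY"),
)

embeddings = GoogleGenerativeAIEmbeddings(
    model="models/embedding-001",
    google_api_key=os.getenv("GEMINI_API_KEY"),
)

# FAISS cannot be built from an empty text list, so create the index
# explicitly. 768 is the output dimension of models/embedding-001.
index = faiss.IndexFlatL2(768)
vectorstore = FAISS(
    embedding_function=embeddings,
    index=index,
    docstore=InMemoryDocstore({}),
    index_to_docstore_id={},
)

# Inject only the 2 most relevant past memories per turn
retriever = vectorstore.as_retriever(search_kwargs={"k": 2})
memory = VectorStoreRetrieverMemory(retriever=retriever)

conversation = ConversationChain(llm=llm, memory=memory, verbose=True)

conversation.invoke("My name is John and I live in Toronto")
conversation.invoke("I work at Google")
conversation.invoke("Where do I live?")
```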
5. What gets stored?
Each interaction is:
Captured as text
Converted to an embedding
Saved in the vector DB
Example stored chunks:
"My name is John and I live in Toronto"
"I work at Google"
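Under the hood, the chain calls the memory's save_context after every turn. You can also write memories directly, without running the chain (a minimal sketch, reusing the memory object from the example above):

```python
# Each exchange is embedded and written to the vector store as one document
memory.save_context(
    {"input": "My name is John and I live in Toronto"},
    {"output": "Nice to meet you, John!"},
)
memory.save_context({"input": "I work at Google"}, {"output": "Got it."})
```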
6. How does retrieval work?
When a new question comes in:
The question is embedded
The vector DB finds similar past messages
The top-k results are injected into the prompt
Only relevant memory is used.
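You can inspect this step directly by calling load_memory_variables with a candidate question (continuing the example above; the printed text is illustrative):

```python
# The question is embedded, matched against stored memories,
# and only the top-k snippets come back under the "history" key.
relevant = memory.load_memory_variables({"input": "Where do I live?"})
print(relevant["history"])
# e.g. "input: My name is John and I live in Toronto\noutput: ..."
```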
7. Key parameters
k → how many past memories to retrieve per query
Chunk size → the granularity of each stored memory
Vector DB → the backing store (FAISS, Chroma, Pinecone)
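The first and last of these show up in a line or two of code; chunk size is controlled by how you split text before saving it. A sketch, assuming the embeddings object from the example above (Chroma requires the chromadb package):

```python
from langchain_community.vectorstores import Chroma

# k: inject the 4 most relevant memories instead of 2
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})

# Vector DB: swap FAISS for Chroma; the retriever interface is the same
chroma_store = Chroma(collection_name="chat_memory", embedding_function=embeddings)
retriever = chroma_store.as_retriever(search_kwargs={"k": 4})
```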
8. Comparison with other memories
Buffer → chronological recall
Summary → compressed recall
Entity → fact-based recall
KG → relationship-based recall
Vector → semantic recall
9. Common mistakes
❌ Assuming it remembers everything
❌ Not chunking memory properly
❌ Using it without relevance filtering
Vector memory retrieves similar matches, not exact ones.
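One way to add relevance filtering is a similarity-score threshold on the retriever, so weak matches are dropped instead of always taking the top k (a sketch; 0.5 is an arbitrary starting threshold to tune):

```python
# Only memories whose relevance score clears the threshold are injected
retriever = vectorstore.as_retriever(
    search_type="similarity_score_threshold",
    search_kwargs={"score_threshold": 0.5, "k": 4},
)
memory = VectorStoreRetrieverMemory(retriever=retriever)
```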
10. When should you use it?
Use VectorStoreRetrieverMemory when:
You need long-term memory
Conversations are large
Semantic recall matters
Avoid it when:
Chats are short
Precise step ordering matters
11. One-line mental model
VectorStoreRetrieverMemory = semantic search over past conversations