Generative AI Course
Memory: Short-Term and Long-Term Context in GenAI Systems
Without memory, a GenAI system lives only in the present moment.
It forgets past interactions, repeats questions, and fails to build continuity.
Memory exists to make AI systems persistent, contextual, and intelligent over time.
Why Memory Is Required in Real Applications
Most real-world workflows span multiple steps and sessions.
Examples include:
- Multi-turn conversations
- User preferences
- Ongoing tasks
- Long-running agents
Without memory, each interaction resets the system.
Types of Memory in GenAI Systems
Memory is usually split into two categories:
- Short-term memory – temporary context
- Long-term memory – persistent knowledge
Each serves a different purpose.
Short-Term Memory (Context Window)
Short-term memory lives inside the model’s context window.
It includes:
- Recent messages
- Retrieved documents
- Agent reasoning steps
Once the context limit is exceeded, information is lost.
How Engineers Think About Short-Term Memory
Before coding, engineers decide:
- What information must stay?
- What can be summarized?
- What can be discarded?
Short-term memory is expensive because it consumes tokens.
Simple Short-Term Memory Example
This example stores recent messages for context.
conversation_memory = []
def add_message(role, content):
conversation_memory.append({
"role": role,
"content": content
})
def get_context(limit=10):
return conversation_memory[-limit:]
Only the most recent interactions are passed to the model.
What Happens When Memory Grows Too Large
Problems include:
- Token overflow
- Higher cost
- Slower responses
This is why summarization and pruning are critical.
Summarizing Short-Term Memory
Instead of storing raw conversations, systems compress them.
def summarize_memory(messages):
prompt = "Summarize the following conversation:\n" + str(messages)
return llm(prompt)
The summary replaces detailed history while preserving intent.
Long-Term Memory: Persistent Knowledge
Long-term memory stores information beyond a single session.
This includes:
- User preferences
- Past decisions
- Important facts
- Learned behavior
Long-term memory is usually external to the model.
How Long-Term Memory Is Stored
Common storage options:
- Databases
- Vector stores
- Knowledge graphs
The model retrieves memory when needed, not continuously.
Vector-Based Long-Term Memory Example
Important interactions are embedded and stored.
def store_memory(text, embedding):
vector_db.add({
"content": text,
"embedding": embedding
})
Later, relevant memories are retrieved by similarity.
How Agents Use Memory
Agents rely heavily on memory to:
- Track progress
- Avoid repeating actions
- Adapt behavior over time
Memory transforms agents from reactive to adaptive.
Memory Safety Considerations
Memory introduces risk.
- Storing incorrect facts
- Leaking sensitive data
- Reinforcing bad behavior
Production systems apply validation and expiration rules.
How Learners Should Practice Memory Systems
To truly understand memory:
- Build a short-term memory buffer
- Add summarization
- Store selected facts long-term
- Manually inspect retrieval
Memory design is more important than storage technology.
Practice
What does short-term memory mainly store?
What type of memory survives across sessions?
What technique reduces token usage in memory?
Quick Quiz
Which memory type lives inside the context window?
Long-term memory is commonly stored using?
Memory allows agents to:
Recap: Memory enables GenAI systems to retain context, learn from the past, and operate coherently across time.
Next up: Evaluation — measuring quality, safety, and reliability of GenAI systems.