GenAI Lesson 50 – Memory | Dataplexa

Memory: Short-Term and Long-Term Context in GenAI Systems

Without memory, a GenAI system lives only in the present moment.

It forgets past interactions, repeats questions, and fails to build continuity.

Memory exists to make AI systems persistent, contextual, and intelligent over time.

Why Memory Is Required in Real Applications

Most real-world workflows span multiple steps and sessions.

Examples include:

Multi-turn conversations
User preferences
Ongoing tasks
Long-running agents

Without memory, each interaction resets the system.

Types of Memory in GenAI Systems

Memory is usually split into two categories:

Short-term memory – temporary context
Long-term memory – persistent knowledge

Each serves a different purpose.

Short-Term Memory (Context Window)

Short-term memory lives inside the model’s context window.

It includes:

Recent messages
Retrieved documents
Agent reasoning steps

Once the context limit is exceeded, information is lost.

How Engineers Think About Short-Term Memory

Before coding, engineers decide:

What information must stay?
What can be summarized?
What can be discarded?

Short-term memory is expensive because it consumes tokens.

Simple Short-Term Memory Example

This example stores recent messages for context.


conversation_memory = []

def add_message(role, content):
    conversation_memory.append({
        "role": role,
        "content": content
    })

def get_context(limit=10):
    return conversation_memory[-limit:]

Only the most recent interactions are passed to the model.

What Happens When Memory Grows Too Large

Problems include:

Token overflow
Higher cost
Slower responses

This is why summarization and pruning are critical.

Summarizing Short-Term Memory

Instead of storing raw conversations, systems compress them.


def summarize_memory(messages):
    prompt = "Summarize the following conversation:\n" + str(messages)
    return llm(prompt)

The summary replaces detailed history while preserving intent.

Long-Term Memory: Persistent Knowledge

Long-term memory stores information beyond a single session.

This includes:

User preferences
Past decisions
Important facts
Learned behavior

Long-term memory is usually external to the model.

How Long-Term Memory Is Stored

Common storage options:

Databases
Vector stores
Knowledge graphs

The model retrieves memory when needed, not continuously.

Vector-Based Long-Term Memory Example

Important interactions are embedded and stored.


def store_memory(text, embedding):
    vector_db.add({
        "content": text,
        "embedding": embedding
    })

Later, relevant memories are retrieved by similarity.

How Agents Use Memory

Agents rely heavily on memory to:

Track progress
Avoid repeating actions
Adapt behavior over time

Memory transforms agents from reactive to adaptive.

Memory Safety Considerations

Memory introduces risk.

Storing incorrect facts
Leaking sensitive data
Reinforcing bad behavior

Production systems apply validation and expiration rules.

How Learners Should Practice Memory Systems

To truly understand memory:

Build a short-term memory buffer
Add summarization
Store selected facts long-term
Manually inspect retrieval

Memory design is more important than storage technology.

Practice

What does short-term memory mainly store?

What type of memory survives across sessions?

What technique reduces token usage in memory?

Quick Quiz

Which memory type lives inside the context window?

Short-term memory
Long-term memory
External memory

Long-term memory is commonly stored using?

Vector databases
Tokens
Cache only

Memory allows agents to:

Adapt over time
Compress models
Tokenize faster

Recap: Memory enables GenAI systems to retain context, learn from the past, and operate coherently across time.

Next up: Evaluation — measuring quality, safety, and reliability of GenAI systems.

← Previous Course Index Next →

Generative AI Course