# Memory System
The conversation engine uses a three-tier memory system managed by `conversation-memory.service.ts`. It lets the AI maintain context across conversations, remember customer preferences, and personalize responses.
## Architecture

### Three Memory Tiers
#### Short-Term Memory
| Property | Value |
|---|---|
| Scope | Current conversation thread |
| Contents | Recent messages, current intent, active context, topic |
| Lifetime | Duration of the conversation thread |
| Storage | ConversationMemory with tier = SHORT_TERM |
| Expires | When thread is resolved or after 24h of inactivity |
Purpose: Keeps track of what's happening right now in the conversation. Prevents the AI from repeating questions or losing context mid-thread.
#### Medium-Term Memory
| Property | Value |
|---|---|
| Scope | Per audience profile (across threads) |
| Contents | Recent interaction history, topic preferences, resolved issues, sentiment trends |
| Lifetime | 30 days, refreshed on each new interaction |
| Storage | ConversationMemory with tier = MEDIUM_TERM |
| Expires | 30 days after last interaction |
Purpose: Remembers recent context about a customer across multiple conversations. Allows the AI to reference "last time you contacted us about X" or know about recent issues.
#### Long-Term Memory
| Property | Value |
|---|---|
| Scope | Per audience profile (persistent) |
| Contents | Customer preferences, purchase history, key facts, communication style, VIP status |
| Lifetime | Permanent (until manually deleted) |
| Storage | ConversationMemory with tier = LONG_TERM |
| Expires | Never (unless workspace is deleted) |
Purpose: Stores persistent knowledge about a customer. Enables deep personalization -- the AI knows their preferred language, past purchases, communication preferences, etc.
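Since each tier stores a flexible JSON payload, the shape of that payload differs per tier. A rough sketch of what each tier's `data` might look like, as TypeScript types (field names here are illustrative assumptions, not the service's actual contract):

```typescript
// Hypothetical payload shapes for each tier's `data` JSON column.
// Field names are illustrative, not the actual service contract.
interface ShortTermData {
  threadId: string;
  lastIntent?: string; // most recent classified intent
  lastReply?: string;  // text of the last generated reply
  timestamp: string;   // ISO-8601
}

interface MediumTermData {
  topic: string;
  resolution?: string;
  sentiment?: 'positive' | 'neutral' | 'negative';
  resolvedAt?: string;
}

interface LongTermData {
  preferredLanguage?: string;
  communicationStyle?: string;
  purchaseHistory?: { orderId: string; date: string }[];
  vip?: boolean;
}

// Example long-term record payload
const example: LongTermData = {
  preferredLanguage: 'en',
  vip: true,
  purchaseHistory: [{ orderId: '12345', date: '2024-05-07' }],
};
```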
## Memory Retrieval
When generating a reply, the context builder pulls from all three tiers:
```ts
// conversation-memory.service.ts
async function getContext(audienceNodeId: string, threadId: string) {
  const [shortTerm, mediumTerm, longTerm] = await Promise.all([
    prisma.conversationMemory.findMany({
      where: { audienceNodeId, tier: 'SHORT_TERM', threadId },
      orderBy: { createdAt: 'desc' },
      take: 10,
    }),
    prisma.conversationMemory.findMany({
      where: { audienceNodeId, tier: 'MEDIUM_TERM' },
      orderBy: { createdAt: 'desc' },
      take: 5,
    }),
    prisma.conversationMemory.findMany({
      where: { audienceNodeId, tier: 'LONG_TERM' },
    }),
  ]);

  return { shortTerm, mediumTerm, longTerm };
}
```
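The retrieved tiers are then folded into the prompt by the context builder. One plausible way to flatten them into a text preamble, sketched as a pure helper (the function name and output format are assumptions, not the builder's actual implementation):

```typescript
// Hypothetical: flatten retrieved memory rows into a prompt preamble.
// Assumes each row carries a `data` object, as in the schema above.
type MemoryRow = { data: Record<string, unknown> };

function formatMemoryContext(
  shortTerm: MemoryRow[],
  mediumTerm: MemoryRow[],
  longTerm: MemoryRow[],
): string {
  const render = (label: string, rows: MemoryRow[]) =>
    rows.length === 0
      ? ''
      : `${label}:\n` + rows.map((r) => `- ${JSON.stringify(r.data)}`).join('\n');

  // Stable-to-volatile ordering: persistent facts first, live thread last.
  return [
    render('Customer facts (long-term)', longTerm),
    render('Recent history (medium-term)', mediumTerm),
    render('Current thread (short-term)', shortTerm),
  ]
    .filter(Boolean)
    .join('\n\n');
}
```

Empty tiers are omitted entirely, so a first-time customer with no history produces a minimal preamble rather than empty section headers.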
## Memory Updates
Memory is updated at key points in the conversation lifecycle:
| Event | Memory Updates |
|---|---|
| New message received | Short-term: add message to thread context |
| Reply sent (auto or manual) | Short-term: update conversation state |
| Thread resolved | Medium-term: store resolution summary; short-term: clear |
| Purchase detected | Long-term: update purchase history |
| Preference expressed | Long-term: store preference ("prefers English", "likes discounts") |
| Intent classified | Short-term: store current intent; medium-term: update interaction pattern |
```ts
// After generating a reply
await conversationMemory.store(audienceNodeId, 'SHORT_TERM', {
  threadId,
  lastIntent: classification.intent,
  lastReply: generatedReply.text,
  timestamp: new Date(),
});

// After resolving a thread
await conversationMemory.store(audienceNodeId, 'MEDIUM_TERM', {
  topic: thread.topic,
  resolution: thread.resolution,
  sentiment: thread.overallSentiment,
  resolvedAt: new Date(),
});
```
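At write time, `expiresAt` follows directly from the tier and the lifetimes in the tables above (24 hours for short-term, 30 days for medium-term, never for long-term). A minimal sketch of that derivation (the helper name is an assumption):

```typescript
// Sketch: derive `expiresAt` from the tier at write time, matching the
// documented lifetimes: 24h short-term, 30 days medium-term, never long-term.
type MemoryTier = 'SHORT_TERM' | 'MEDIUM_TERM' | 'LONG_TERM';

const TTL_MS: Record<MemoryTier, number | null> = {
  SHORT_TERM: 24 * 60 * 60 * 1000,
  MEDIUM_TERM: 30 * 24 * 60 * 60 * 1000,
  LONG_TERM: null, // never expires
};

function expiresAtFor(tier: MemoryTier, now: Date = new Date()): Date | null {
  const ttl = TTL_MS[tier];
  return ttl === null ? null : new Date(now.getTime() + ttl);
}
```

Because medium-term memory is refreshed on each new interaction, each `store` call recomputes `expiresAt` from the current time, sliding the 30-day window forward.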
## How Memory Improves Responses
| Scenario | Without Memory | With Memory |
|---|---|---|
| Customer asks about order status | "Can you provide your order number?" | "Your order #12345 from last Tuesday is being shipped today." |
| Returning customer with complaint | Generic apology | "I'm sorry you're experiencing another issue. Last time we resolved your shipping concern -- is this related?" |
| Customer mentions preference | Ignored | "Since you prefer text updates, I'll make sure we text you when your order ships." |
| Repeat question in same thread | Answers from scratch | "As I mentioned earlier, the return window is 30 days." |
## Database Model
```prisma
model ConversationMemory {
  id             String       @id @default(cuid())
  audienceNodeId String
  audienceNode   AudienceNode @relation(fields: [audienceNodeId], references: [id], onDelete: Cascade)
  tier           MemoryTier   // SHORT_TERM, MEDIUM_TERM, LONG_TERM
  threadId       String?      // Only for SHORT_TERM
  data           Json         // Flexible JSON storage
  expiresAt      DateTime?    // Null for LONG_TERM
  createdAt      DateTime     @default(now())
  updatedAt      DateTime     @updatedAt

  @@index([audienceNodeId, tier])
  @@index([expiresAt])
}

enum MemoryTier {
  SHORT_TERM
  MEDIUM_TERM
  LONG_TERM
}
```
## Memory Cleanup
Expired memories are cleaned up periodically:
| Tier | Cleanup Strategy |
|---|---|
| Short-term | Deleted when thread is resolved or after 24h inactivity |
| Medium-term | Deleted when expiresAt passes (30 days after last interaction) |
| Long-term | Never automatically deleted |
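Because long-term rows carry a null `expiresAt`, a single sweep over the `expiresAt` index covers all tiers. A sketch of the expiry rule, with the periodic Prisma delete shown as an illustrative comment (the cron wiring is an assumption):

```typescript
// Pure predicate capturing the expiry rule: null means "never expires",
// anything at or past `now` is eligible for deletion.
function isExpired(expiresAt: Date | null, now: Date): boolean {
  return expiresAt !== null && expiresAt.getTime() <= now.getTime();
}

// Hypothetical periodic job body (illustrative, not the actual scheduler):
// await prisma.conversationMemory.deleteMany({
//   where: { expiresAt: { lte: new Date() } }, // never matches LONG_TERM rows (expiresAt is null)
// });
```

The `@@index([expiresAt])` in the schema exists precisely so this sweep stays an index range scan rather than a full table scan.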
## Related

- Conversation Engine -- full ICE pipeline
- Gemini API -- AI model used for memory-informed responses
- Embeddings -- vector embeddings for audience clustering
- Schema Overview -- ConversationMemory model details