
How we built a contextual RAG for meeting notes and Slack

How we built a contextual RAG to understand meeting notes and Slack. Here's our step-by-step guide.

Paul Debahy
Jul 17, 2025 · 7 min read

TL;DR:

  • Standard RAG systems fail for product teams because they can't parse the multi-topic, unstructured nature of meeting notes and Slack.
  • A contextual RAG pipeline solves this by first enriching text with semantic labels (e.g., 'risk', 'decision') and domain anchors (e.g., 'Launch Name', 'Jira ID').
  • This structured approach enables precise, filtered retrieval that dramatically reduces AI hallucinations and unlocks specific use cases like automated risk summaries and progress reports.

At Luna AI, we’ve been building AI features like automated sprint summaries, mostly powered by structured Jira data and some unstructured content like comments and ticket descriptions.

But here’s the truth:

→ Most product decisions don’t live in Jira. They live in chaotic meeting notes and Slack threads.

These sources capture the real-time “why” behind product work: the tradeoffs, blockers, risks, and decisions that never make it into Jira fields. But they are messy: long, inconsistent, and unstructured. 

To make sense of this chaos, we built a simple contextual RAG pipeline designed specifically for product teams. Unlike generic RAG approaches that treat all text equally, ours understands:

  • What type of information we’re dealing with (e.g., a risk, a decision, an update).
  • Where it came from (e.g., which launch, sprint, or Jira issue).
  • Who it involves.

This post walks through our implementation, from the initial chunking challenges to the contextual metadata that now powers our retrieval and AI use cases.

Building contextual RAG

Why context is everything in RAG

Meeting notes contain the DNA of product decisions, but they're structured like stream-of-consciousness rather than databases. A typical 30-minute standup note might cover five different features, three blockers, two decisions, and a dozen action items, all woven together in natural language.

Here’s a real excerpt from our Sports Launch meeting:

“Launch timeline confirmed for March 15th. Authentication team reports 2-week delay due to iOS regression affecting login flow. Alex flagged in #ios-eng that crash rate jumped to 3.2% on 16.2 devices. Decision: we're pushing auth to v2, launching with guest mode only. Sarah will update roadmap. Discussed A/B test results showing 12% conversion lift with new onboarding flow.”

Now imagine asking your AI system:

“What are the key risks for Sports Launch?”

A standard RAG system might:

  • Return this whole paragraph.
  • Miss that the risk is the iOS regression.
  • Include unrelated info like A/B test results or team assignments.
  • Or worse, return a chunk starting mid-sentence: “…auth to v2, launching with guest mode only…”

You’re left guessing:

  • What’s the actual risk?
  • Who flagged it?
  • Which launch is this even about?

Popular guides to building a simple RAG, such as those from HuggingFace or Mistral, focus on mechanics like:

  • Chunking text by 500 tokens or characters.
  • Embedding everything blindly.
  • Retrieving the top 5 vectors based on similarity.
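
In code, that naive pipeline looks roughly like this. A minimal sketch: the vectors are assumed to come from whatever embedding model you use, and the 500-character chunks and top-5 cutoff are the usual tutorial defaults, not a recommendation.

# Naive RAG sketch: fixed-size chunks, blind embeddings, top-k similarity.
import numpy as np

def chunk_by_size(text: str, size: int = 500) -> list[str]:
    # Split purely by character count, ignoring sentence and topic boundaries.
    return [text[i:i + size] for i in range(0, len(text), size)]

def retrieve_top_k(query_vec: np.ndarray, chunk_vecs: np.ndarray, k: int = 5) -> list[int]:
    # Rank every chunk by cosine similarity: no filtering, no structure.
    sims = chunk_vecs @ query_vec / (
        np.linalg.norm(chunk_vecs, axis=1) * np.linalg.norm(query_vec)
    )
    return np.argsort(-sims)[:k].tolist()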

But that falls apart when your data looks like:

  • Free-form meeting notes with multiple projects and decisions.
  • Slack threads full of mentions, noise, and context switching.

The core problems with naive RAG:

  • Chunks contain multiple topics (e.g., launch updates + hiring plans + design blockers).
  • No awareness of semantic roles (is this sentence a decision or just chatter?).
  • No domain anchors (like launch names, Jira IDs, or Sprint labels).
  • Final LLMs often hallucinate or misinterpret due to the lack of structure.

→ Anthropic’s recent post on Contextual Retrieval makes this point clearly: useful retrieval depends on structured, contextual metadata.

What we did at Luna AI

To fix this, we built our own contextual RAG pipeline, tailored to how product and engineering teams actually communicate.

Every sentence in a meeting note or Slack message is:

  • Labeled with anchors:
    • By 'anchor,' we mean the core entity a piece of information relates to, such as a specific launch_title, jira_id, sprint_id, or user. This connects unstructured text to your structured company data.
  • Enriched with semantic types:
    • risk, decision, status, dependency, etc.
  • Indexed for targeted retrieval:
    • Ask about risks → only return risk-labeled sentences
    • Ask about Launch B → filter for that anchor
    • Need a sprint update? → Return only relevant sentences

This approach ensures that we retrieve precise, structured, and relevant context, so downstream LLMs don’t get confused or hallucinate. It also keeps prompts small and sharp.

We chose this structured approach to retrieval for three key reasons: 

  1. To achieve clear precision by retrieving only relevant sentences, not entire documents. 
  2. To drastically reduce LLM hallucinations by providing clear, unambiguous context. 
  3. To unlock powerful, real-world use cases like automated risk detection that naive RAG simply cannot support.

Building contextual RAG at Luna AI, step by step

The following are the high-level steps to implement a contextual RAG. These are the foundations; build on them and make them your own. The goal is to optimise the output for:

  1. Precision
  2. Recall
  3. Quality 

Step 1: Pre-processing with LLM-driven segmentation & labeling

Instead of naive chunking by character count, we use an LLM to break notes and Slack messages into semantically meaningful segments.

Each segment is enriched with:

  • Anchor: what it relates to, e.g., a Launch, Jira Epic, Sprint, or Team.
  • Label: what type of information it is, e.g., a risk, decision, status, or dependency.
  • Metadata: meeting title, date, author, thread ID, etc.

Example segment:

{
  "text": "iOS regression from auth team is delaying rollout",
  "segment_index": 3,
  "anchor": "Launch: Sports Expansion",
  "label": "risk",
  "source": "weekly_meeting_notes_2025_07_01",
  "timestamp": "2025-07-01"
}

This turns messy paragraphs into structured building blocks that are searchable, filterable, and semantically tagged.
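
Here's a minimal sketch of that segmentation step, assuming an OpenAI-style chat completions client. The prompt, the model name, and the exact label and anchor vocabulary are illustrative, not our production versions:

# Step 1 sketch: ask an LLM to segment a note and label each segment.
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set; any LLM client works here

SEGMENT_PROMPT = """Split the note below into self-contained segments.
Return JSON of the form {{"segments": [...]}}, where each segment has keys:
text, segment_index, anchor (the launch/sprint/Jira ID it relates to, or null),
and label (one of: risk, decision, status, dependency, other).

Note:
{note}"""

def segment_and_label(note: str, source: str, timestamp: str) -> list[dict]:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[{"role": "user", "content": SEGMENT_PROMPT.format(note=note)}],
        response_format={"type": "json_object"},
    )
    segments = json.loads(resp.choices[0].message.content)["segments"]
    for seg in segments:  # attach provenance metadata to every segment
        seg["source"] = source
        seg["timestamp"] = timestamp
    return segments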

Step 2: Contextual chunking (group by anchor)

Next, we group semantically related segments by anchor. This lets us build tight, topic-specific chunks for retrieval:

  • All risks related to a launch.
  • All updates related to a sprint.
  • All decisions tied to a Jira Epic.

We keep chunks under ~300–500 tokens to ensure compact, relevant prompts for downstream LLMs. You can also implement a dynamic token limit and use a re-ranker that prioritises only high-precision chunks.

Why this matters:
Instead of dumping entire documents into the context window, we pass only what’s relevant to the query’s intent and topic.
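
A sketch of that grouping step, using a crude characters-per-token estimate in place of a real tokenizer (the 400-token budget sits inside the ~300–500 range above):

# Step 2 sketch: group labeled segments by (anchor, label), then pack them
# into chunks under a token budget so downstream prompts stay compact.
from collections import defaultdict

MAX_TOKENS = 400

def estimate_tokens(text: str) -> int:
    # Rough heuristic (~4 characters per token); use a real tokenizer in practice.
    return len(text) // 4

def pack(anchor: str, label: str, segs: list[dict]) -> dict:
    return {
        "anchor": anchor,
        "label": label,
        "text": " ".join(s["text"] for s in segs),
        "timestamp": max(s.get("timestamp", "") for s in segs),  # newest segment wins
    }

def build_chunks(segments: list[dict]) -> list[dict]:
    grouped = defaultdict(list)
    for seg in segments:
        grouped[(seg["anchor"], seg["label"])].append(seg)

    chunks = []
    for (anchor, label), segs in grouped.items():
        current, budget = [], 0
        for seg in segs:
            cost = estimate_tokens(seg["text"])
            if current and budget + cost > MAX_TOKENS:
                chunks.append(pack(anchor, label, current))  # close the full chunk
                current, budget = [], 0
            current.append(seg)
            budget += cost
        if current:
            chunks.append(pack(anchor, label, current))
    return chunks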

Step 3: Embed + store with metadata

We store these structured chunks in a vector database (e.g., Weaviate, Qdrant, or Postgres + pgvector), along with rich metadata for filtering.

Each chunk includes:

  • The vector embedding.
  • Anchors: launch_id, jira_id, etc.
  • Labels: risk, decision, etc.
  • Source: Slack thread, meeting doc, timestamp.

This lets us do both semantic search (via embeddings) and precise filtering (via metadata).
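
A sketch of the indexing step using Qdrant (one of the stores mentioned above) and a small sentence-transformers model; the collection name and model are illustrative choices:

# Step 3 sketch: embed each chunk and store it with its metadata payload.
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # small 384-dim embedding model
client = QdrantClient(":memory:")  # in-memory for the sketch; use a server in production

client.create_collection(
    collection_name="product_context",
    vectors_config=VectorParams(size=384, distance=Distance.COSINE),
)

def index_chunks(chunks: list[dict]) -> None:
    points = [
        PointStruct(
            id=i,
            vector=model.encode(chunk["text"]).tolist(),
            payload=chunk,  # anchor, label, text, timestamp: everything we filter on
        )
        for i, chunk in enumerate(chunks)
    ]
    client.upsert(collection_name="product_context", points=points)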

Step 4: Contextual retrieval at query time

When a user asks a question (e.g. “What are the top risks for Launch B?”), we:

  1. Classify the query
    • Is it about a launch, a sprint, a user, or a Key Result?
  2. Retrieve relevant chunks
    • Filter by metadata (e.g., anchor = Launch B)
    • Use vector similarity to pull the most relevant segments
  3. Aggregate & summarize
    • Combine all relevant segments of a given type (e.g., label = risk)
    • Feed them into the final summarization or answer generation prompt

This produces focused, high-quality summaries, instead of generic or hallucinated responses.
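
Continuing the Step 3 sketch (same client and model), a filtered retrieval might look like this; query classification is elided, so assume it has already produced the anchor and label filters:

# Step 4 sketch: metadata filtering first, vector similarity second.
from qdrant_client.models import FieldCondition, Filter, MatchValue

def retrieve(query: str, anchor: str | None = None,
             label: str | None = None, k: int = 10) -> list[dict]:
    must = []
    if anchor:
        must.append(FieldCondition(key="anchor", match=MatchValue(value=anchor)))
    if label:
        must.append(FieldCondition(key="label", match=MatchValue(value=label)))

    hits = client.search(
        collection_name="product_context",
        query_vector=model.encode(query).tolist(),
        query_filter=Filter(must=must) if must else None,  # metadata filter
        limit=k,
    )
    return [hit.payload for hit in hits]

# e.g. "What are the top risks for Launch B?"
risk_chunks = retrieve("top risks", anchor="Launch B", label="risk")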

Note: short-term vs. long-term memory

Another design principle: we distinguish between short-term memory and long-term memory, a critical step for turning noisy inputs into a reliable source of truth.

  • Short-term: transient context from recent meeting notes and Slack conversations.
  • Long-term: structured data in Luna—Launches, OKRs, Jira issues, teams, etc.

To bridge the gap between unstructured insight and structured reality, we run equivalence checks during preprocessing:

  • Does the “Sports Launch” mentioned in a note match an actual Launch object in Luna?
  • Is “Alex” from the Slack thread the same user assigned to a Jira Epic?
  • Are dates, tags, or decisions already reflected in a Key Result?

This grounding ensures we don’t treat vague, ambiguous references as truth and lets us build a consistent product graph that ties everything together.
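
As a toy illustration of such an equivalence check, here's fuzzy matching of a mentioned launch name against known Launch objects; a production version would also use embeddings, aliases, and user/ID mappings. The launch names here are made up:

# Equivalence-check sketch: ground a mentioned launch name against the
# structured Launch objects we already know about.
from difflib import get_close_matches

KNOWN_LAUNCHES = ["Sports Expansion", "Payments v2", "Guest Mode"]

def resolve_launch(mention: str) -> str | None:
    # Best fuzzy match above a similarity cutoff, else None, so vague
    # references are flagged for review rather than treated as truth.
    matches = get_close_matches(mention, KNOWN_LAUNCHES, n=1, cutoff=0.6)
    return matches[0] if matches else None

print(resolve_launch("Sports Launch"))  # "Sports Expansion" (close enough to match)
print(resolve_launch("Atlas"))          # None (unresolved; needs human confirmation)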

How this compares to Anthropic’s contextual retrieval

Anthropic’s approach (see their blog) retrieves entire documents and injects them into Claude’s 200K-token context window.

Where it shines:

  • You have clean, structured docs (e.g., contracts, manuals)
  • You can fit the whole source into the prompt

Where we differ:

  • Our notes and Slack threads are messy and multi-topic
  • We need to extract structure first
  • Our goal isn’t Q&A but summary + signal detection

Why this matters for Product & Engineering leaders

Product and engineering teams don’t need another AI tool that just summarizes random text. They need a system that understands the structure of their work, and reflects how teams actually operate.

By grounding unstructured inputs (like meeting notes and Slack) in structured context (like launches, epics, and risks), our contextual RAG unlocks powerful use cases. 

Real use cases contextual RAG powers

  • Auto-generated risk summaries:
    • Instantly surface blockers and dependencies from across dozens of Slack threads and docs.
  • Weekly progress updates:
    • Get accurate summaries for each Launch, Epic, or Key Result, even when no one manually wrote one.
  • Slack detection of decisions and blockers:
    • Automatically capture key decisions from conversations and associate them with the right project.
  • Quarterly OKR traceability reports:
    • Understand which launches contributed to a Key Result, and what signals show progress (or lack of it).

A smarter way to answer hard questions

You can now ask:

“What are the top 3 risks flagged across launches in the last 2 weeks?”

And instead of guessing, the system will:

  • Retrieve only segments labeled as risks
  • Filter to those tagged in the last 14 days
  • Group them by Launch anchor
  • Return a clean, structured answer—just like a teammate would
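
Putting the earlier sketches together for that exact question (date filtering is done in Python here for simplicity, and timestamps are assumed to be ISO dates carried through from Step 1):

# End-to-end sketch: risks across launches from the last 14 days,
# grouped by Launch anchor, ready for the summarization prompt.
from collections import defaultdict
from datetime import date, timedelta

cutoff = (date.today() - timedelta(days=14)).isoformat()

def recent_risks_by_launch() -> dict[str, list[str]]:
    by_launch = defaultdict(list)
    for chunk in retrieve("risks and blockers", label="risk", k=50):
        # ISO dates compare correctly as strings, so this keeps recent chunks.
        if chunk.get("timestamp", "") >= cutoff:
            by_launch[chunk["anchor"]].append(chunk["text"])
    return by_launch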

The bigger vision

Contextual RAG isn’t just a technical pattern; it’s a foundation for something much bigger:

  • A dynamic memory layer that connects all your product context
  • A source of truth that evolves with your team
  • AI teammates that actually understand how your org works

This is how we move beyond chatbots and into real operational intelligence: AI that helps teams work faster, align better, and ship smarter.

