LLM Internals

Lesson 6 of 6

The Augmented Signal

Concept:

Production AI systems don't send your raw question to the LLM. They enhance it: a system prompt sets persona and rules, retrieved documents provide context (RAG = Retrieval Augmented Generation), and your question comes last. The user types 'What's our refund policy?' but the real prompt includes the entire policy document. RAG uses embeddings (lesson 02) to find relevant docs, then injects them into the prompt. This is how AI chatbots answer about YOUR data without retraining the model.
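The augmentation step described above can be sketched in a few lines of Python. This is a minimal illustration, not any particular vendor's API; the function name and message format are assumptions (the role/content message shape shown here matches the example later in this lesson).

```python
def build_augmented_prompt(user_question: str, retrieved_docs: list[str]) -> list[dict]:
    """Assemble what is actually sent to the LLM:
    system prompt (persona + rules + retrieved context) first, user question last."""
    context = "\n\n".join(retrieved_docs)
    system = (
        "You are a customer support assistant. "
        "Answer based ONLY on the provided context.\n\n"
        f"Context:\n{context}"
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user_question},
    ]

messages = build_augmented_prompt(
    "What's our refund policy?",
    ["TechCorp Refund Policy: Premium accounts get a 90-day refund."],
)
```

Note the ordering: the retrieved document rides inside the system message, and the user's original question arrives last, untouched.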
Science Officer Chen: Commander, I have a confession. Those transmissions you've been sending to ARIA... they weren't going through directly.
Commander Vega: What? I thought I was talking to ARIA directly!
Science Officer Chen: You were — but the ship's computer was augmenting your signal first. Before your message reached ARIA, the system added a mission briefing — the system prompt — setting ARIA's behavior and rules.
Commander Vega: So ARIA was getting instructions I didn't see?
Science Officer Chen: Standard protocol. But it gets more powerful. In production systems, before your question is sent, the system searches our intelligence database — using those embeddings from lesson 02 — and finds the most relevant documents. Then it injects them into the signal.
Commander Vega: RAG — Retrieval Augmented Generation. I've seen this in the technical briefs.
Science Officer Chen: Exactly. The user asks 'What's our refund policy?' The system retrieves the actual policy document, adds it to the system prompt, THEN sends everything to ARIA. ARIA answers from the document, not from its general training. Far less hallucination. No guessing from stale training data.
Commander Vega: This is how every AI assistant works behind the scenes.
Science Officer Chen: Every single one. ChatGPT, Claude, enterprise chatbots — they all augment the signal. Now you know the full picture: tokens, vectors, API calls, enhanced prompts. Let me give you a context document. Build the full augmented transmission yourself.
Example Code:
# What the user types:
"What is the refund policy for premium accounts?"

# What actually gets sent to the LLM:
{
  "model": "kimi-k2.5",
  "messages": [
    {
      "role": "system",
      "content": "You are a customer support assistant for TechCorp. Answer based ONLY on the provided context. If the answer is not in the context, say 'I don't have that information.'\n\nContext:\nTechCorp Refund Policy (2024):\n- Standard accounts: 30-day refund, no questions asked.\n- Premium accounts: 90-day refund with full feature credit.\n- Enterprise accounts: Custom terms, contact sales.\n- All refunds processed within 5 business days."
    },
    {
      "role": "user",
      "content": "What is the refund policy for premium accounts?"
    }
  ],
  "temperature": 0.3,
  "max_tokens": 200
}
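The request above can also be assembled programmatically before serializing it to JSON. A sketch, with the model name, parameters, and a shortened context copied from the example (how the string is then POSTed depends on the provider's API, which this lesson doesn't cover):

```python
import json

# Mirror the example request in code; values copied from the JSON above.
system_content = (
    "You are a customer support assistant for TechCorp. "
    "Answer based ONLY on the provided context. "
    "If the answer is not in the context, say 'I don't have that information.'\n\n"
    "Context:\nTechCorp Refund Policy (2024):\n"
    "- Premium accounts: 90-day refund with full feature credit."
)

request = {
    "model": "kimi-k2.5",
    "messages": [
        {"role": "system", "content": system_content},
        {"role": "user", "content": "What is the refund policy for premium accounts?"},
    ],
    "temperature": 0.3,  # low temperature: stay close to the provided context
    "max_tokens": 200,
}

body = json.dumps(request)  # this string is what actually goes over the wire
```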

Your Assignment

Build an enhanced prompt that a real company chatbot would use. Create a JSON request with: (1) a 'system' message that includes a persona AND the following context: 'TechCorp Refund Policy: Standard accounts get 30-day refund. Premium accounts get 90-day refund with full feature credit. Enterprise accounts have custom terms.' (2) a 'user' message asking about refunds. Include 'model', 'temperature', and 'max_tokens'.
