Code as a Source: We Tried for 2 Years. Now It's Finally Here.

50-80% of user questions can be answered from source code alone. We've known this for over two years - we just couldn't do anything about it. Until now.

by

Emil Sorensen

Overview

We've been trying to crack code ingestion at Kapa for over two years. 75% of our customers maintain public code repositories. More than 30 explicitly asked us to ingest them. The signal was impossible to ignore.

But combining docs and code in a single RAG system is brutally hard. Code is massive - a single repo can contain 10x more content than all your documentation combined. Naive chunking destroys context. And reranking code against prose requires models that genuinely understand both.

For two years, models weren't good enough. We tried. And tried. And tried again.

In the last few months, something clicked. Our research team saw the opening and absolutely cooked. Today we're shipping Code as a Source.

What we built

Code-aware chunking that parses your codebase down to every function, class, and method definition, integrated directly into our agentic RAG pipeline, which understands the structure of your entire codebase. It still returns a cited answer, with time to first token under 2.5 seconds.
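To make "code-aware chunking" concrete, here is a minimal sketch of the general idea, not Kapa's actual implementation. It assumes Python source only and uses the standard library `ast` module to split a file into one chunk per class, function, and method, keeping the line range of each. The names `CodeChunk` and `chunk_python_source` are hypothetical and exist only for this illustration.

```python
import ast
from dataclasses import dataclass


@dataclass
class CodeChunk:
    """One definition-level chunk (hypothetical type, for illustration only)."""
    name: str        # qualified name, e.g. "Client.retry"
    kind: str        # "class" or "function"
    start_line: int  # 1-indexed, inclusive
    end_line: int    # 1-indexed, inclusive
    text: str        # the source lines of the definition


def chunk_python_source(source: str) -> list[CodeChunk]:
    """Split Python source into one chunk per class, function, and method."""
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks: list[CodeChunk] = []

    def visit(node: ast.AST, prefix: str) -> None:
        for child in ast.iter_child_nodes(node):
            if isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
                kind = "class" if isinstance(child, ast.ClassDef) else "function"
                qualified = f"{prefix}{child.name}"
                # lineno/end_lineno are 1-indexed; slice the original lines.
                text = "\n".join(lines[child.lineno - 1 : child.end_lineno])
                chunks.append(
                    CodeChunk(qualified, kind, child.lineno, child.end_lineno, text)
                )
                # Recurse so methods and nested definitions get their own chunks.
                visit(child, qualified + ".")
            else:
                # Keep descending: definitions can sit inside if/try/with blocks.
                visit(child, prefix)

    visit(tree, "")
    return chunks
```

Unlike fixed-size chunking, each chunk here is a complete definition with a qualified name and a line range, so a retrieved chunk keeps the context needed to cite it. A production system would need a parser per language and a strategy for very large definitions; this sketch covers neither.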

This isn't "send your repo to an LLM and wait." This is production-grade retrieval across docs and code simultaneously, at speed. When Kapa references code, it cites the specific file and line numbers, linking directly to the source on GitHub.
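A citation that points at a file and line range maps naturally onto GitHub's URL format for line anchors (`#L10` for a single line, `#L10-L24` for a range). The sketch below shows how such a link could be built; it is an illustration rather than Kapa's code, and the function name `github_line_link` and the example owner and repo are hypothetical.

```python
from typing import Optional
from urllib.parse import quote


def github_line_link(
    owner: str,
    repo: str,
    ref: str,
    path: str,
    start_line: int,
    end_line: Optional[int] = None,
) -> str:
    """Build a GitHub blob URL that highlights a line or a line range."""
    # Percent-encode the ref and path but keep "/" so nested paths stay intact.
    base = (
        f"https://github.com/{owner}/{repo}/blob/"
        f"{quote(ref, safe='/')}/{quote(path, safe='/')}"
    )
    if end_line is None or end_line == start_line:
        return f"{base}#L{start_line}"
    return f"{base}#L{start_line}-L{end_line}"


# Hypothetical repository, used only to show the call shape.
link = github_line_link("example-org", "example-repo", "main", "src/client.py", 10, 24)
```

Passing a commit SHA as `ref` instead of a branch name keeps the link pointing at the same lines even after the file changes.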

What code unlocks

We tested this across 10+ customer projects. 50-80% of production questions could be answered from code alone. Not because docs aren't valuable - but because code contains information documentation was never designed to capture.

Code is always the most up-to-date source of truth. It captures nuance that docs intentionally simplify. And it unlocks entirely new question categories - "how does the retry logic work?", "what parameters does this function accept?"

What this unlocks next

This is what really excites us. Agents grounded in your actual codebase. And using code and docs together to automatically detect where your documentation has gaps.

For technical writing teams, this is huge. If your AI assistant can pull implementation details from code, your writers can shift to the work humans are great at - tutorials, conceptual guides, architecture decisions. Documentation tells you the "why." Code tells you the "what." Now your AI assistant speaks both languages.

Get started

Start with your SDKs, client libraries, and example repos. Exclude tests and generated files. Expand from there. Private repos work too with a GitHub personal access token. Kapa tracks your repos automatically, pulling updates every hour.
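As a rough illustration of the "exclude tests and generated files" advice, the sketch below filters a list of repository paths before ingestion. It is not Kapa's configuration format: the directory names, the glob patterns, and the `should_ingest` helper are all assumptions chosen for the example, and you would adapt them to your own repo layout.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Hypothetical exclusion rules; adjust to your own repository conventions.
EXCLUDE_DIRS = {"tests", "test", "__tests__", "node_modules", "vendor", "dist", "build"}
EXCLUDE_FILE_PATTERNS = [
    "test_*.py", "*_test.go", "*.spec.ts",  # test files
    "*_pb2.py", "*.pb.go", "*.min.js",      # generated or minified files
]


def should_ingest(path: str) -> bool:
    """Return True if a repo-relative path looks like hand-written, non-test code."""
    p = PurePosixPath(path)
    # Skip anything that lives under an excluded directory.
    if any(part in EXCLUDE_DIRS for part in p.parts[:-1]):
        return False
    # Skip files whose name matches a test or generated-file pattern.
    return not any(fnmatchcase(p.name, pattern) for pattern in EXCLUDE_FILE_PATTERNS)


paths = [
    "src/client.py",
    "tests/test_client.py",
    "src/api_pb2.py",
    "pkg/retry_test.go",
    "examples/quickstart.py",
]
kept = [p for p in paths if should_ingest(p)]
```

Here `kept` holds only `src/client.py` and `examples/quickstart.py`, which matches the suggestion to start with SDK and example code and expand from there.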

Head to Sources > Add new source > GitHub Code. Full setup guide and best practices here.
