Secure by design. Built for enterprise.
Kapa.ai was built with security, privacy, and compliance as top priorities from the start, using industry-leading practices to safeguard your data.
Kapa.ai is reviewed and trusted by 200+ companies
Meets the highest industry standards.
We have achieved SOC 2 Type II certification, demonstrating our commitment to maintaining high standards of security, availability, processing integrity, confidentiality, and privacy of customer data.
Reporting security concerns
If you believe you've identified a security vulnerability, please report it to security@kapa.ai. You can also use the same address for general security inquiries.
Frequently asked questions
What types of data does Kapa.ai collect?
We collect two main types of data:
Knowledge source data: This includes documentation, support tickets, tutorials, etc. that you connect to Kapa to create the bot's knowledge base. It can be part of an "External" project (public sources) or an "Internal" project (private/internal sources).
Question/answer data: This refers to the questions users ask the bot and the answers it provides. This is used to provide analytics to the team managing the kapa project (e.g., to see common questions and identify knowledge gaps).
Can I use private/internal data as a knowledge source?
Yes. In fact, using private/internal knowledge sources like Zendesk, Confluence, internal docs, etc. is a common way to use kapa.ai to help internal support teams. Specifically, there are two types of projects in kapa:
External Project (default): Uses publicly available data sources.
Internal Project: Uses private/internal data sources.
You choose the right mix of data to meet your operational and privacy goals. We invest in security and privacy, which includes maintaining SOC 2 Type II certification.
How is knowledge data stored and processed?
Knowledge data, whether from "External" or "Internal" projects, is processed as follows:
Ingested using an integration (e.g., web crawling, GitHub files loader)
Stored in a US-based Google Cloud-hosted PostgreSQL database
Transformed into LLM-friendly formats
Uploaded to a vector database (Weaviate) hosted on Google Cloud
Given to LLMs to synthesize answers
All data is encrypted in transit and at rest.
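As a rough illustration of the flow above (ingest, transform, index, retrieve, hand to an LLM), here is a minimal Python sketch. It is not Kapa.ai's actual implementation: the in-memory list stands in for both the PostgreSQL store and the Weaviate vector database, the word-count "embedding" is a toy substitute for a real embedding model, and every name in it (chunk, embed, InMemoryIndex, build_prompt, the sample file paths) is hypothetical. Encryption and the LLM call itself are out of scope.

```python
import math
import re
from collections import Counter


def chunk(text, max_words=40):
    """Split a document into word-bounded pieces (the 'LLM-friendly format' step)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text):
    """Toy embedding: lowercase word counts. A real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class InMemoryIndex:
    """Hypothetical stand-in for the document store plus vector database."""

    def __init__(self):
        self.entries = []  # list of (source, chunk_text, vector)

    def add_document(self, source, text):
        # Ingest: chunk the document and store each piece with its vector.
        for piece in chunk(text):
            self.entries.append((source, piece, embed(piece)))

    def search(self, question, k=2):
        # Retrieve the k chunks most similar to the question.
        q = embed(question)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[2]), reverse=True)
        return [(source, piece) for source, piece, _ in ranked[:k]]


def build_prompt(question, hits):
    """Assemble the text that would be given to an LLM to synthesize an answer."""
    context = "\n".join(f"[{source}] {piece}" for source, piece in hits)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


index = InMemoryIndex()
index.add_document("docs/install.md", "Install the CLI with pip and run the setup command.")
index.add_document("docs/billing.md", "Invoices are issued monthly and can be downloaded as PDF.")

question = "How do I install the CLI?"
hits = index.search(question, k=1)
prompt = build_prompt(question, hits)
print(prompt)
```

In a production setup, each stand-in would be replaced by the corresponding managed service, and the assembled prompt would be sent to one of the LLM providers.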
What LLM providers do you use, and how do you ensure they don't train on our data?
We use a stack of multiple LLM calls from various providers to optimize the answer for each individual question. Our current external providers include Cohere Inc., OpenAI LLC, Voyage Inc., and Anthropic PBC. We have vendor agreements in place with all model providers to prevent them from training on any data we share with them.
How long do you retain data?
Both knowledge data and question/answer data are retained for the duration of the service to provide ongoing question-answering capabilities and support our analytics platform. Enterprise customers can customize retention periods and request data deletion. If you need your data deleted, please contact us at security@kapa.ai.
Can we sign a DPA?
Yes, we offer Data Processing Agreements for enterprise customers with specific requirements. If you need a DPA or have any questions about our data processing practices, please contact us at security@kapa.ai. We're committed to working with our customers to meet their compliance and data protection needs.



