Glossary

Definitions for technical terms used throughout Octopus.

BYO Keys (Bring Your Own Keys)
A configuration option that lets you use your own Anthropic or OpenAI API keys. AI token usage is billed directly to your provider account instead of consuming Octopus credits.
Codebase Indexing
The process of cloning a repository, splitting its files into overlapping text chunks, generating vector embeddings for each chunk, and storing them in Qdrant. This enables fast, semantic search over your code during reviews and chat.
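The chunking step can be sketched as a sliding window with overlap; the chunk size and overlap values below are illustrative, not the values Octopus actually uses:

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into overlapping chunks. Each chunk shares `overlap`
    characters with the previous one, so code that straddles a chunk
    boundary still appears whole in at least one chunk."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks
```

Each chunk is then embedded and stored, so a search hit returns a chunk with enough surrounding context to be useful on its own.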
Context Window
The maximum amount of text an LLM can process in a single request. Octopus uses vector search to select the most relevant code chunks so that reviews fit within the model's context window.
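Fitting a review into the window amounts to a greedy fill against a token budget; the token counts and budget here are toy numbers, not Octopus's real limits:

```python
def select_chunks(ranked_chunks: list[tuple[str, int]], budget: int) -> list[str]:
    """Take chunks in relevance order until the token budget is spent.
    ranked_chunks: (text, token_count) pairs, most relevant first."""
    selected, used = [], 0
    for text, tokens in ranked_chunks:
        if used + tokens > budget:
            continue  # this chunk would overflow the window; try smaller ones
        selected.append(text)
        used += tokens
    return selected
```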
Credits
The billing unit in Octopus. Each review, indexing operation, and chat message consumes credits based on token usage. Organizations receive free credits to get started and can purchase more or use BYO keys.
Diff
The set of changes between two versions of code — typically the difference between a pull request branch and its base branch. Octopus fetches the diff to understand exactly what changed in a PR.
Embeddings
Numerical vector representations of text generated by an AI model (default: OpenAI text-embedding-3-large). Embeddings capture semantic meaning, allowing Octopus to find code that is conceptually related to a query even if the exact words differ.
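Similarity between two embeddings is typically measured with cosine similarity. A toy illustration with 3-dimensional vectors (real embedding models produce vectors with thousands of dimensions, and the numbers below are made up):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings: "parse JSON" and "decode JSON" point roughly
# the same way even though the words differ; "send email" points elsewhere.
parse_json = [0.9, 0.1, 0.0]
decode_json = [0.8, 0.2, 0.1]
send_email = [0.0, 0.1, 0.9]
```

This is why a search for "decode JSON" can surface a function named `parse_json`: the vectors are close even though the tokens are not.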
Knowledge Base
A collection of documents you upload to Octopus — coding standards, architecture docs, style guides — that the AI references during reviews. This helps Octopus give feedback consistent with your team's specific conventions.
LLM (Large Language Model)
An AI model trained on large amounts of text, such as Anthropic Claude or OpenAI GPT. Octopus sends code context and diffs to an LLM, which performs the actual analysis and generates review findings.
.octopusignore
A configuration file placed at the root of your repository (same syntax as .gitignore) that tells Octopus which files and directories to skip during indexing and review — e.g., build outputs, vendor code, or generated files.
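An illustrative .octopusignore — the entries are examples of what teams commonly exclude, not defaults shipped with Octopus:

```
# Build outputs and dependencies
dist/
node_modules/
vendor/

# Generated files
*.min.js
*_pb2.py
```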
Qdrant
An open-source vector database that Octopus uses to store and search code embeddings. It enables fast similarity search so the AI can retrieve the most relevant code chunks when reviewing a pull request or answering a chat question.
Reranking
A second-pass relevance scoring step (using Cohere Rerank) applied after the initial vector search. It re-orders retrieved code chunks by how well they match the query, improving the quality of context sent to the LLM.
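The two-pass shape of retrieve-then-rerank can be sketched as follows, with a simple word-overlap scorer standing in where the real pipeline calls the Cohere Rerank model:

```python
def rerank(query: str, candidates: list[str], score_fn) -> list[str]:
    """Second-pass reordering: score each candidate against the query
    and return them best-first. `score_fn` stands in for the reranker."""
    scored = [(score_fn(query, c), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored]

# Stand-in scorer: fraction of query words that appear in the candidate.
def word_overlap(query: str, text: str) -> float:
    q = set(query.lower().split())
    return len(q & set(text.lower().split())) / len(q)
```

The initial vector search optimizes for recall over the whole index; the reranker spends more effort per candidate on a much smaller set, so the chunks that finally reach the LLM are ordered by a stronger relevance signal.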
Severity Levels
A rating system Octopus assigns to each review finding: 🔴 Critical (likely bug or security issue), 🟠 Major (significant concern), 🟡 Minor (style or minor issue), 🔵 Suggestion (improvement idea), 💡 Tip (informational note). Helps teams prioritize what to fix first.
Spend Limit
A configurable monthly budget cap per organization. Octopus checks the limit before every expensive operation (review, indexing, chat) and stops processing if the organization is over budget — preventing surprise bills.
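The pre-operation check described above amounts to a simple guard before each expensive call; the names and currency-free numbers here are illustrative:

```python
class SpendLimitExceeded(Exception):
    """Raised when an organization has hit its monthly budget cap."""

def check_spend_limit(monthly_spend: float, spend_limit: float) -> None:
    """Run before every review, indexing, or chat operation;
    raises instead of starting work if the org is over budget."""
    if monthly_spend >= spend_limit:
        raise SpendLimitExceeded(
            f"Monthly spend {monthly_spend:.2f} has reached the limit {spend_limit:.2f}"
        )
```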
Webhook
An HTTP callback sent by GitHub or Bitbucket to Octopus when an event occurs — typically a pull request being opened or updated. This is how Octopus knows when to start a review automatically.

Have a question not covered here? Check the FAQ or open an issue on GitHub.