rag-code-mcp

Tool Output & Indexing Schema (v2)

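This document describes the canonical v2 data model used by the code MCP server. The goal is that any other developer can quickly understand how code is indexed (section 1), what the tools return to the AI (section 2), how each tool maps input to output (section 3), and how semantic and structural lookups work together (section 4).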


1. Indexing / semantic search – CodeChunk

Indexing into the vector DB is done exclusively with the CodeChunk structure (see internal/codetypes/types.go).

// CodeChunk is the canonical v2 format for indexing/search. It represents a
// semantically meaningful piece of code (usually a function, method, type or
// interface declaration) that is stored in vector search.
type CodeChunk struct {
    // Symbol metadata
    Type     string // function | method | type | interface | file
    Name     string // Symbol name (or file base name for Type=file)
    Package  string // Package/module name
    Language string // go | php | python | typescript etc

    // Source location
    FilePath  string
    URI       string
    StartLine int
    EndLine   int

    // Selection range (for precise navigation to symbol name)
    SelectionStartLine int
    SelectionEndLine   int

    // Content
    Signature string
    Docstring string
    Code      string

    // Extra metadata
    Metadata map[string]any
}
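To make the field semantics concrete, here is a minimal sketch that builds a chunk for a Go function and sanity-checks it before indexing. `validateChunk`, the sample values, and the assumption that line numbers are 1-based are illustrative only; none of them are taken from internal/codetypes/types.go.

```go
package main

import (
	"errors"
	"fmt"
)

// CodeChunk is a trimmed copy of the struct above, keeping only the
// fields this sketch touches.
type CodeChunk struct {
	Type      string
	Name      string
	Package   string
	Language  string
	FilePath  string
	StartLine int
	EndLine   int
	Signature string
	Code      string
}

// allowedTypes mirrors the values listed in the comment on CodeChunk.Type.
var allowedTypes = map[string]bool{
	"function": true, "method": true, "type": true, "interface": true, "file": true,
}

// validateChunk is a hypothetical helper, not part of the documented API.
// It assumes 1-based line numbers, which the schema does not state.
func validateChunk(c CodeChunk) error {
	if !allowedTypes[c.Type] {
		return fmt.Errorf("unknown chunk type %q", c.Type)
	}
	if c.Name == "" {
		return errors.New("chunk name is empty")
	}
	if c.StartLine < 1 || c.EndLine < c.StartLine {
		return fmt.Errorf("invalid line range %d-%d", c.StartLine, c.EndLine)
	}
	return nil
}

func main() {
	// Sample values are made up for illustration.
	chunk := CodeChunk{
		Type:      "function",
		Name:      "Add",
		Package:   "mathutil",
		Language:  "go",
		FilePath:  "internal/mathutil/add.go",
		StartLine: 10,
		EndLine:   12,
		Signature: "func Add(a, b int) int",
		Code:      "func Add(a, b int) int {\n\treturn a + b\n}",
	}
	if err := validateChunk(chunk); err != nil {
		fmt.Println("rejected:", err)
		return
	}
	fmt.Printf("ok: %s %s.%s\n", chunk.Type, chunk.Package, chunk.Name)
}
```

A check like this would sit in front of the vector DB write, so that malformed chunks are rejected before they reach search results.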

Principles:


2. Output for the AI – descriptor schema

All tools that support output_format: "json" must serialize one of the structures defined in internal/codetypes/symbol_schema.go.

2.1. SymbolLocation

type SymbolLocation struct {
    FilePath  string `json:"file_path,omitempty"`
    URI       string `json:"uri,omitempty"`
    StartLine int    `json:"start_line,omitempty"`
    EndLine   int    `json:"end_line,omitempty"`
}

Every descriptor below embeds it as the location for precise navigation.
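A short sketch of how this struct serializes with the standard `encoding/json` package. Because every field carries `omitempty`, zero-valued fields disappear from the output. The helper `encodeLocation` and the sample path are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SymbolLocation is copied from the schema above.
type SymbolLocation struct {
	FilePath  string `json:"file_path,omitempty"`
	URI       string `json:"uri,omitempty"`
	StartLine int    `json:"start_line,omitempty"`
	EndLine   int    `json:"end_line,omitempty"`
}

// encodeLocation is a hypothetical helper that returns the JSON form
// of a location, or an empty string if marshaling fails.
func encodeLocation(loc SymbolLocation) string {
	b, err := json.Marshal(loc)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	full := SymbolLocation{FilePath: "internal/mathutil/add.go", StartLine: 10, EndLine: 12}
	fmt.Println(encodeLocation(full))
	// prints {"file_path":"internal/mathutil/add.go","start_line":10,"end_line":12}

	fmt.Println(encodeLocation(SymbolLocation{}))
	// prints {}
}
```

One consequence worth keeping in mind: a `StartLine` of 0 is indistinguishable from "not set" on the wire, so consumers should treat a missing `start_line` as unknown rather than as line 0.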

2.2. ClassDescriptor – type/class/model

type ClassDescriptor struct {
    Language string `json:"language"`
    Kind     string `json:"kind"` // class | interface | trait | struct | type | model
    Name     string `json:"name"`

    Namespace string `json:"namespace,omitempty"`
    Package   string `json:"package,omitempty"`
    FullName  string `json:"full_name,omitempty"`

    Signature   string         `json:"signature,omitempty"`
    Description string         `json:"description,omitempty"`
    Location    SymbolLocation `json:"location,omitempty"`

    Fields    []FieldDescriptor    `json:"fields,omitempty"`
    Methods   []FunctionDescriptor `json:"methods,omitempty"`
    Relations []RelationDescriptor `json:"relations,omitempty"`

    // Data-model specific (ORM / Eloquent)
    Table      string            `json:"table,omitempty"`
    Fillable   []string          `json:"fillable,omitempty"`
    Hidden     []string          `json:"hidden,omitempty"`
    Visible    []string          `json:"visible,omitempty"`
    Appends    []string          `json:"appends,omitempty"`
    Casts      map[string]string `json:"casts,omitempty"`
    Scopes     []string          `json:"scopes,omitempty"`
    Attributes []string          `json:"attributes,omitempty"`

    Tags     []string       `json:"tags,omitempty"`
    Metadata map[string]any `json:"metadata,omitempty"`
}
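The following sketch shows what a descriptor for an ORM model might look like once serialized. The struct is a trimmed copy, and the `User` model, its namespace, and `encodeClass` are invented for illustration. It also demonstrates a detail of `encoding/json`: `omitempty` has no effect on a non-pointer struct field, so `location` is emitted as `{}` even when no location is set.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SymbolLocation is copied from section 2.1.
type SymbolLocation struct {
	FilePath  string `json:"file_path,omitempty"`
	URI       string `json:"uri,omitempty"`
	StartLine int    `json:"start_line,omitempty"`
	EndLine   int    `json:"end_line,omitempty"`
}

// ClassDescriptor is a trimmed copy of the struct above, keeping only
// the fields this sketch populates.
type ClassDescriptor struct {
	Language  string            `json:"language"`
	Kind      string            `json:"kind"`
	Name      string            `json:"name"`
	Namespace string            `json:"namespace,omitempty"`
	Location  SymbolLocation    `json:"location,omitempty"`
	Table     string            `json:"table,omitempty"`
	Fillable  []string          `json:"fillable,omitempty"`
	Casts     map[string]string `json:"casts,omitempty"`
}

// encodeClass is a hypothetical helper returning compact JSON.
func encodeClass(d ClassDescriptor) (string, error) {
	b, err := json.Marshal(d)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// A made-up Eloquent-style model; no Location is set on purpose.
	user := ClassDescriptor{
		Language:  "php",
		Kind:      "model",
		Name:      "User",
		Namespace: `App\Models`,
		Table:     "users",
		Fillable:  []string{"name", "email"},
		Casts:     map[string]string{"email_verified_at": "datetime"},
	}
	out, err := encodeClass(user)
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	// The output contains "location":{} despite the omitempty tag.
	fmt.Println(out)
}
```

If an empty `location` object is undesirable, two common options are a pointer field (`*SymbolLocation`) or, on Go 1.24 and later, the `omitzero` tag option. Whether either fits this schema is a decision for the maintainers.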

Used by:

2.3. FunctionDescriptor – function/method

type FunctionDescriptor struct {
    Language    string         `json:"language"`
    Kind        string         `json:"kind"` // function | method | scope | accessor | mutator | constructor
    Name        string         `json:"name"`
    Namespace   string         `json:"namespace,omitempty"`
    Receiver    string         `json:"receiver,omitempty"`
    Signature   string         `json:"signature,omitempty"`
    Description string         `json:"description,omitempty"`
    Location    SymbolLocation `json:"location,omitempty"`

    Parameters []ParamDescriptor  `json:"parameters,omitempty"`
    Returns    []ReturnDescriptor `json:"returns,omitempty"`

    Visibility string `json:"visibility,omitempty"`
    IsStatic   bool   `json:"is_static,omitempty"`
    IsAbstract bool   `json:"is_abstract,omitempty"`
    IsFinal    bool   `json:"is_final,omitempty"`

    Code     string         `json:"code,omitempty"`
    Tags     []string       `json:"tags,omitempty"`
    Metadata map[string]any `json:"metadata,omitempty"`
}
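From the consumer side, a client can decode a tool's JSON output straight into this struct. The sketch below uses a trimmed copy of the struct (`Parameters` and `Returns` are left out because `ParamDescriptor` and `ReturnDescriptor` are not shown in this document). The sample payload, `decodeFunction`, and `qualifiedName` are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SymbolLocation is copied from section 2.1.
type SymbolLocation struct {
	FilePath  string `json:"file_path,omitempty"`
	URI       string `json:"uri,omitempty"`
	StartLine int    `json:"start_line,omitempty"`
	EndLine   int    `json:"end_line,omitempty"`
}

// FunctionDescriptor is a trimmed copy of the struct above.
type FunctionDescriptor struct {
	Language  string         `json:"language"`
	Kind      string         `json:"kind"`
	Name      string         `json:"name"`
	Receiver  string         `json:"receiver,omitempty"`
	Signature string         `json:"signature,omitempty"`
	Location  SymbolLocation `json:"location,omitempty"`
}

// sample is an invented payload shaped like the schema, not real tool output.
const sample = `{
  "language": "go",
  "kind": "method",
  "name": "Search",
  "receiver": "*Indexer",
  "signature": "func (i *Indexer) Search(query string) ([]CodeChunk, error)",
  "location": {"file_path": "internal/index/indexer.go", "start_line": 42, "end_line": 60}
}`

// decodeFunction parses a JSON payload into a FunctionDescriptor.
func decodeFunction(data []byte) (FunctionDescriptor, error) {
	var fd FunctionDescriptor
	err := json.Unmarshal(data, &fd)
	return fd, err
}

// qualifiedName prefixes the receiver when the descriptor is a method.
func qualifiedName(fd FunctionDescriptor) string {
	if fd.Receiver != "" {
		return "(" + fd.Receiver + ")." + fd.Name
	}
	return fd.Name
}

func main() {
	fd, err := decodeFunction([]byte(sample))
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Printf("%s at %s:%d-%d\n",
		qualifiedName(fd), fd.Location.FilePath, fd.Location.StartLine, fd.Location.EndLine)
	// prints (*Indexer).Search at internal/index/indexer.go:42-60
}
```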

Used by:

2.4. SymbolDescriptor – summary for listings/search

type SymbolDescriptor struct {
    Language  string `json:"language"`
    Kind      string `json:"kind"` // class | interface | trait | function | method | constant | enum | file
    Name      string `json:"name"`
    Namespace string `json:"namespace,omitempty"`
    Package   string `json:"package,omitempty"`

    Signature   string         `json:"signature,omitempty"`
    Description string         `json:"description,omitempty"`
    Location    SymbolLocation `json:"location,omitempty"`

    Tags     []string       `json:"tags,omitempty"`
    Metadata map[string]any `json:"metadata,omitempty"`
}
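Since `SymbolDescriptor` carries the same identifying fields as `CodeChunk` minus the code body, a search hit can be reduced to a summary with a plain field copy. The sketch below assumes one possible mapping (`Type` to `Kind`, `Docstring` to `Description`). `toSymbolDescriptor` is a hypothetical name, and the real conversion in the server may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// CodeChunk is a trimmed copy of the struct from section 1.
type CodeChunk struct {
	Type      string
	Name      string
	Package   string
	Language  string
	FilePath  string
	URI       string
	StartLine int
	EndLine   int
	Signature string
	Docstring string
	Code      string
}

// SymbolLocation is copied from section 2.1.
type SymbolLocation struct {
	FilePath  string `json:"file_path,omitempty"`
	URI       string `json:"uri,omitempty"`
	StartLine int    `json:"start_line,omitempty"`
	EndLine   int    `json:"end_line,omitempty"`
}

// SymbolDescriptor is a trimmed copy of the struct above.
type SymbolDescriptor struct {
	Language    string         `json:"language"`
	Kind        string         `json:"kind"`
	Name        string         `json:"name"`
	Package     string         `json:"package,omitempty"`
	Signature   string         `json:"signature,omitempty"`
	Description string         `json:"description,omitempty"`
	Location    SymbolLocation `json:"location,omitempty"`
}

// toSymbolDescriptor is a hypothetical converter. It passes Type through
// as Kind unchanged; note that the chunk value "type" is not among the
// kinds listed for SymbolDescriptor, so a real converter may translate it.
func toSymbolDescriptor(c CodeChunk) SymbolDescriptor {
	return SymbolDescriptor{
		Language:    c.Language,
		Kind:        c.Type,
		Name:        c.Name,
		Package:     c.Package,
		Signature:   c.Signature,
		Description: c.Docstring,
		Location: SymbolLocation{
			FilePath:  c.FilePath,
			URI:       c.URI,
			StartLine: c.StartLine,
			EndLine:   c.EndLine,
		},
	}
}

func main() {
	// Sample values are made up for illustration.
	chunk := CodeChunk{
		Type: "function", Name: "Add", Package: "mathutil", Language: "go",
		FilePath: "internal/mathutil/add.go", StartLine: 10, EndLine: 12,
		Signature: "func Add(a, b int) int", Docstring: "Add returns a + b.",
		Code: "func Add(a, b int) int {\n\treturn a + b\n}",
	}
	out, err := json.MarshalIndent(toSymbolDescriptor(chunk), "", "  ")
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(string(out)) // the Code field is intentionally absent
}
```

Dropping `Code` at this step is what keeps listing and search responses small; callers that need the body can follow `location` or request the full function descriptor.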

Used by:


3. Mapping: tool → input → output

3.1. find_type_definition

3.2. get_function_details

3.3. list_package_exports


4. Semantic vs structural – how they work together

This way, the AI spends very few tokens on raw code text and instead works from a clear, standardized map of symbols defined by the schema above.