This document describes the canonical v2 data model used by the code MCP server. The goal is that any other developer can quickly understand how code is indexed and which structures the tools return.

CodeChunk

Indexing into the vector DB is done exclusively with the CodeChunk structure
(see internal/codetypes/types.go).

// CodeChunk is the canonical v2 format for indexing/search. It represents a
// semantically meaningful piece of code (usually a function, method, type or
// interface declaration) that is stored in vector search.
type CodeChunk struct {
	// Symbol metadata
	Type     string // function | method | type | interface | file
	Name     string // Symbol name (or file base name for Type=file)
	Package  string // Package/module name
	Language string // go | php | python | typescript etc

	// Source location
	FilePath  string
	URI       string
	StartLine int
	EndLine   int

	// Selection range (for precise navigation to symbol name)
	SelectionStartLine int
	SelectionEndLine   int

	// Content
	Signature string
	Docstring string
	Code      string

	// Extra metadata
	Metadata map[string]any
}
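
To make the field layout concrete, here is a minimal sketch that fills in a chunk for an imaginary Go function. The struct is a trimmed local copy of the one above. The `sampleChunk` and `summary` helpers and all sample values are hypothetical and do not come from the server's code.

```go
package main

import "fmt"

// CodeChunk is a trimmed copy of the struct above so the example compiles
// on its own; the real definition lives in internal/codetypes/types.go.
type CodeChunk struct {
	Type, Name, Package, Language string
	FilePath                      string
	StartLine, EndLine            int
	SelectionStartLine            int
	SelectionEndLine              int
	Signature, Docstring, Code    string
	Metadata                      map[string]any
}

// sampleChunk returns a hand-written chunk for an imaginary Go function.
// Every value here is made up for illustration.
func sampleChunk() CodeChunk {
	return CodeChunk{
		Type:               "function",
		Name:               "Hello",
		Package:            "greet",
		Language:           "go",
		FilePath:           "greet/hello.go",
		StartLine:          10,
		EndLine:            12,
		SelectionStartLine: 10,
		SelectionEndLine:   10,
		Signature:          "func Hello(name string) string",
		Docstring:          "Hello returns a greeting for name.",
		Code:               "func Hello(name string) string {\n\treturn \"Hello, \" + name\n}",
	}
}

// summary is a hypothetical helper that renders a one-line label for a chunk.
func summary(c CodeChunk) string {
	return fmt.Sprintf("%s %s.%s (%s:%d-%d)",
		c.Type, c.Package, c.Name, c.FilePath, c.StartLine, c.EndLine)
}

func main() {
	fmt.Println(summary(sampleChunk())) // function greet.Hello (greet/hello.go:10-12)
}
```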

Principles:

- CodeChunk is the only format written to the vector DB for code.
- Signature + Docstring + (sometimes) Code is the text that is embedded.
- Metadata can contain language-specific information, for example:
  - type_info (serialized as JSON from golang.TypeInfo),
  - eloquent_model, populated by the Laravel analyzer for Eloquent models (table, fillable, relationships, scopes, attributes, etc.).
- Tools do not read Metadata directly. They only access it through a layer that builds the descriptors defined below.

All tools that support output_format: "json" must serialize one of the
structures defined in internal/codetypes/symbol_schema.go.
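
The embedding rule from the principles above (Signature + Docstring + sometimes Code) could look roughly like this. This is only a sketch: the `embedText` helper, the `includeCode` flag, and the newline separator are assumptions, since the document does not say when Code is included or how the parts are joined.

```go
package main

import (
	"fmt"
	"strings"
)

// CodeChunk holds only the content fields needed for this example.
type CodeChunk struct {
	Signature string
	Docstring string
	Code      string
}

// embedText is a hypothetical helper: it joins Signature, Docstring and,
// when includeCode is true, Code into the text handed to the embedder.
// Parts that are empty or whitespace-only are skipped.
func embedText(c CodeChunk, includeCode bool) string {
	parts := []string{c.Signature, c.Docstring}
	if includeCode {
		parts = append(parts, c.Code)
	}
	kept := make([]string, 0, len(parts))
	for _, p := range parts {
		if strings.TrimSpace(p) != "" {
			kept = append(kept, p)
		}
	}
	return strings.Join(kept, "\n")
}

func main() {
	c := CodeChunk{
		Signature: "func Hello(name string) string",
		Docstring: "Hello returns a greeting for name.",
		Code:      "func Hello(name string) string { return \"Hello, \" + name }",
	}
	fmt.Println(embedText(c, false))
}
```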

SymbolLocation

type SymbolLocation struct {
	FilePath  string `json:"file_path,omitempty"`
	URI       string `json:"uri,omitempty"`
	StartLine int    `json:"start_line,omitempty"`
	EndLine   int    `json:"end_line,omitempty"`
}

Used everywhere as the location for precise navigation.
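
As a quick illustration of the JSON shape, the sketch below marshals a SymbolLocation with the standard encoding/json package. The `encodeLocation` helper and the sample values are hypothetical. Note that omitempty drops the empty URI.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SymbolLocation mirrors the struct shown above.
type SymbolLocation struct {
	FilePath  string `json:"file_path,omitempty"`
	URI       string `json:"uri,omitempty"`
	StartLine int    `json:"start_line,omitempty"`
	EndLine   int    `json:"end_line,omitempty"`
}

// encodeLocation is a hypothetical helper that returns the JSON form
// of a location as a string.
func encodeLocation(loc SymbolLocation) (string, error) {
	b, err := json.Marshal(loc)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	loc := SymbolLocation{FilePath: "greet/hello.go", StartLine: 10, EndLine: 12}
	out, err := encodeLocation(loc)
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(out) // {"file_path":"greet/hello.go","start_line":10,"end_line":12}
}
```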

ClassDescriptor – type/class/model

type ClassDescriptor struct {
	Language    string               `json:"language"`
	Kind        string               `json:"kind"` // class | interface | trait | struct | type | model
	Name        string               `json:"name"`
	Namespace   string               `json:"namespace,omitempty"`
	Package     string               `json:"package,omitempty"`
	FullName    string               `json:"full_name,omitempty"`
	Signature   string               `json:"signature,omitempty"`
	Description string               `json:"description,omitempty"`
	Location    SymbolLocation       `json:"location,omitempty"`
	Fields      []FieldDescriptor    `json:"fields,omitempty"`
	Methods     []FunctionDescriptor `json:"methods,omitempty"`
	Relations   []RelationDescriptor `json:"relations,omitempty"`
	// Data-model specific (ORM / Eloquent)
	Table      string            `json:"table,omitempty"`
	Fillable   []string          `json:"fillable,omitempty"`
	Hidden     []string          `json:"hidden,omitempty"`
	Visible    []string          `json:"visible,omitempty"`
	Appends    []string          `json:"appends,omitempty"`
	Casts      map[string]string `json:"casts,omitempty"`
	Scopes     []string          `json:"scopes,omitempty"`
	Attributes []string          `json:"attributes,omitempty"`
	Tags       []string          `json:"tags,omitempty"`
	Metadata   map[string]any    `json:"metadata,omitempty"`
}

Used by:

- find_type_definition with output_format: "json".
- Fields and Methods are populated from TypeInfo when available.

FunctionDescriptor – function/method

type FunctionDescriptor struct {
	Language    string             `json:"language"`
	Kind        string             `json:"kind"` // function | method | scope | accessor | mutator | constructor
	Name        string             `json:"name"`
	Namespace   string             `json:"namespace,omitempty"`
	Receiver    string             `json:"receiver,omitempty"`
	Signature   string             `json:"signature,omitempty"`
	Description string             `json:"description,omitempty"`
	Location    SymbolLocation     `json:"location,omitempty"`
	Parameters  []ParamDescriptor  `json:"parameters,omitempty"`
	Returns     []ReturnDescriptor `json:"returns,omitempty"`
	Visibility  string             `json:"visibility,omitempty"`
	IsStatic    bool               `json:"is_static,omitempty"`
	IsAbstract  bool               `json:"is_abstract,omitempty"`
	IsFinal     bool               `json:"is_final,omitempty"`
	Code        string             `json:"code,omitempty"`
	Tags        []string           `json:"tags,omitempty"`
	Metadata    map[string]any     `json:"metadata,omitempty"`
}
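
From the consumer side, a client could decode such a payload as sketched below. The struct is a trimmed copy of the one above. The payload is a made-up example of what a tool response for an Eloquent scope might contain, not captured server output, and `decodeFunction` is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// FunctionDescriptor is a trimmed copy of the struct above, keeping only
// the fields this example reads.
type FunctionDescriptor struct {
	Language   string `json:"language"`
	Kind       string `json:"kind"`
	Name       string `json:"name"`
	Signature  string `json:"signature,omitempty"`
	Visibility string `json:"visibility,omitempty"`
	IsStatic   bool   `json:"is_static,omitempty"`
}

// samplePayload is an invented JSON body for illustration. is_static is
// absent, as it would be when omitempty drops a false value.
const samplePayload = `{
  "language": "php",
  "kind": "scope",
  "name": "scopeActive",
  "signature": "public function scopeActive($query)",
  "visibility": "public"
}`

// decodeFunction is a hypothetical helper that parses a JSON payload
// into a FunctionDescriptor.
func decodeFunction(data []byte) (FunctionDescriptor, error) {
	var fd FunctionDescriptor
	err := json.Unmarshal(data, &fd)
	return fd, err
}

func main() {
	fd, err := decodeFunction([]byte(samplePayload))
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Printf("%s %s (visibility: %s, static: %t)\n",
		fd.Kind, fd.Name, fd.Visibility, fd.IsStatic)
}
```

Because the boolean flags use omitempty, a missing `is_static` key decodes to false, which is indistinguishable from an explicit false.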

Used by:

- get_function_details with output_format: "json".
- Populated fields include:
  - visibility, is_static, is_abstract, is_final,
  - parameters (with types from PHPDoc / type-hints),
  - returns (including Eloquent types, e.g. BelongsToMany<App\Role>),
  - kind ("scope", "accessor", "mutator" for Eloquent special methods).

SymbolDescriptor – summary for listings/search

type SymbolDescriptor struct {
	Language    string         `json:"language"`
	Kind        string         `json:"kind"` // class | interface | trait | function | method | constant | enum | file
	Name        string         `json:"name"`
	Namespace   string         `json:"namespace,omitempty"`
	Package     string         `json:"package,omitempty"`
	Signature   string         `json:"signature,omitempty"`
	Description string         `json:"description,omitempty"`
	Location    SymbolLocation `json:"location,omitempty"`
	Tags        []string       `json:"tags,omitempty"`
	Metadata    map[string]any `json:"metadata,omitempty"`
}
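
The layer between indexed chunks and tool output could be sketched as a plain mapping function. Everything here is an assumption made for illustration: `toSymbolDescriptor` is a hypothetical name, and copying Type to Kind and Docstring to Description is a guess at the mapping (the two value sets differ, so the real layer presumably translates some values).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// CodeChunk is a trimmed copy of the indexing struct.
type CodeChunk struct {
	Type, Name, Package, Language string
	FilePath                      string
	StartLine, EndLine            int
	Signature, Docstring          string
}

// SymbolLocation mirrors the struct defined earlier in this document.
type SymbolLocation struct {
	FilePath  string `json:"file_path,omitempty"`
	URI       string `json:"uri,omitempty"`
	StartLine int    `json:"start_line,omitempty"`
	EndLine   int    `json:"end_line,omitempty"`
}

// SymbolDescriptor is a trimmed copy of the struct above.
type SymbolDescriptor struct {
	Language    string         `json:"language"`
	Kind        string         `json:"kind"`
	Name        string         `json:"name"`
	Package     string         `json:"package,omitempty"`
	Signature   string         `json:"signature,omitempty"`
	Description string         `json:"description,omitempty"`
	Location    SymbolLocation `json:"location,omitempty"`
}

// toSymbolDescriptor is a hypothetical mapping from a chunk to its
// listing summary. Type is copied to Kind unchanged, which is only
// valid for values both sets share (such as "function" or "method").
func toSymbolDescriptor(c CodeChunk) SymbolDescriptor {
	return SymbolDescriptor{
		Language:    c.Language,
		Kind:        c.Type,
		Name:        c.Name,
		Package:     c.Package,
		Signature:   c.Signature,
		Description: c.Docstring,
		Location: SymbolLocation{
			FilePath:  c.FilePath,
			StartLine: c.StartLine,
			EndLine:   c.EndLine,
		},
	}
}

func main() {
	chunks := []CodeChunk{{
		Type: "function", Name: "Hello", Package: "greet", Language: "go",
		FilePath: "greet/hello.go", StartLine: 10, EndLine: 12,
		Signature: "func Hello(name string) string",
		Docstring: "Hello returns a greeting for name.",
	}}
	out := make([]SymbolDescriptor, 0, len(chunks))
	for _, c := range chunks {
		out = append(out, toSymbolDescriptor(c))
	}
	b, err := json.MarshalIndent(out, "", "  ")
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(string(b))
}
```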

Used by:

- list_package_exports with output_format: "json" (Go + PHP).

find_type_definition

Parameters:

- type_name (required),
- package / namespace (optional but recommended),
- output_format: "markdown" (default) or "json".

Output:

- markdown – human-friendly view, optimized for reading in a terminal.
- json – a ClassDescriptor instance.

get_function_details

Parameters:

- function_name (required),
- package / namespace,
- class_name for PHP methods (implicitly derived from the chunk when possible),
- output_format.

Output:

- markdown – human-friendly view.
- json – a FunctionDescriptor instance.

list_package_exports

Parameters:

- package / namespace (required),
- symbol_type (filter; optional),
- output_format.

Output:

- markdown – structured list grouped by kind (function/type/class/etc.).
- json – []SymbolDescriptor.

In short:

- indexing uses CodeChunk + embeddings,
- hybrid_search / search_code should return []SymbolDescriptor + small snippets of code,
- find_type_definition(json) → ClassDescriptor,
- get_function_details(json) → FunctionDescriptor,
- list_package_exports(json) → []SymbolDescriptor.

This way, the AI uses very few tokens on raw code text and instead has a clear, standardized map of symbols via the schema above.
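
The parameter rules for find_type_definition can be sketched as a small validation step. The `FindTypeDefinitionArgs` struct and its `normalize` method are hypothetical and not part of the server. Only the parameter names, the required flag on type_name, and the "markdown" default come from this document.

```go
package main

import (
	"errors"
	"fmt"
)

// FindTypeDefinitionArgs is a hypothetical container for the tool's
// parameters as listed in this document.
type FindTypeDefinitionArgs struct {
	TypeName     string `json:"type_name"`
	Package      string `json:"package,omitempty"`
	OutputFormat string `json:"output_format,omitempty"`
}

// normalize applies the documented default ("markdown") and rejects a
// missing type_name or an unknown output format.
func (a FindTypeDefinitionArgs) normalize() (FindTypeDefinitionArgs, error) {
	if a.TypeName == "" {
		return a, errors.New("type_name is required")
	}
	switch a.OutputFormat {
	case "":
		a.OutputFormat = "markdown"
	case "markdown", "json":
		// Already a supported format; nothing to change.
	default:
		return a, fmt.Errorf("unsupported output_format %q", a.OutputFormat)
	}
	return a, nil
}

func main() {
	args, err := FindTypeDefinitionArgs{TypeName: "User"}.normalize()
	fmt.Println(args.OutputFormat, err) // markdown <nil>
}
```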