Hybrid Search

Search Pipeline

When search_memories is called, the query goes through a multi-stage pipeline:
Query text
  ├── Embedding model → 384d vector
  │     └── sqlite-vec cosine search → semantic results
  ├── FTS5 MATCH → BM25 ranked results
  └── Merge candidates
        └── 4-component hybrid scoring
              └── Sorted results
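The "Merge candidates" step in the diagram can be sketched as below. This is a minimal illustration, not the actual implementation: the function name merge_candidates, the dict-of-scores input shape, and the choice to give a memory 0.0 for a component it was not found by are all assumptions.

```python
from typing import Dict


def merge_candidates(
    semantic: Dict[str, float], bm25: Dict[str, float]
) -> Dict[str, Dict[str, float]]:
    """Union two result sets keyed by memory id.

    Assumption: a memory returned by only one search gets 0.0
    for the component it is missing.
    """
    merged: Dict[str, Dict[str, float]] = {}
    for memory_id in semantic.keys() | bm25.keys():
        merged[memory_id] = {
            "semantic": semantic.get(memory_id, 0.0),
            "bm25": bm25.get(memory_id, 0.0),
        }
    return merged


# Hypothetical ids and scores, for illustration only.
merged = merge_candidates({"a": 0.9, "b": 0.5}, {"b": 0.7, "c": 0.2})
```

Here "b" appears in both result sets and keeps both scores, while "a" and "c" each carry a single non-zero component into hybrid scoring.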

Semantic Search

Uses sqlite-vec to run a cosine distance search against the vec_memories table and returns memories ordered by vector similarity. The distance is converted to a 0-1 similarity score:
similarity = 1 / (1 + distance)
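As a minimal sketch of the conversion above (the function name distance_to_similarity and the input check are assumptions, not part of the documented API):

```python
def distance_to_similarity(distance: float) -> float:
    """Map a non-negative distance to a similarity score in (0, 1]."""
    if distance < 0:
        raise ValueError("distance must be non-negative")
    return 1.0 / (1.0 + distance)


identical = distance_to_similarity(0.0)   # 1.0
orthogonal = distance_to_similarity(1.0)  # 0.5
```

Cosine distance lies between 0 and 2, so under this formula the similarity falls between 1/3 and 1.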

Full-Text Search (BM25)

Uses FTS5 with the Porter stemming tokenizer. The FTS index covers the content and tags fields. BM25 ranks are normalized to 0-1:
score = 1 / (1 + |rank|)
FTS5 ranks are negative (lower = more relevant), so abs() is applied before normalization.
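The normalization can be sketched as follows. The function name normalize_bm25_rank is hypothetical; the body applies the formula exactly as stated above.

```python
def normalize_bm25_rank(rank: float) -> float:
    """Apply score = 1 / (1 + |rank|) to a raw FTS5 rank value."""
    return 1.0 / (1.0 + abs(rank))


score_a = normalize_bm25_rank(-1.0)  # 0.5
score_b = normalize_bm25_rank(-3.0)  # 0.25
```

Taken literally, the formula maps a rank of larger magnitude to a smaller score, which is worth keeping in mind when comparing normalized scores with raw FTS5 ordering.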

Hybrid Scoring

Each candidate memory receives a weighted score from four components:
score = semantic × w_s + bm25 × w_b + recency × w_r + salience × w_a

Default Weights

Component   Weight   Description
Semantic    0.4      Vector similarity
BM25        0.3      Full-text relevance
Recency     0.2      How recently accessed
Salience    0.1      Importance decayed over time
Weights can be overridden per-query:
{
  "query": "authentication",
  "weights": { "semantic": 0.6, "bm25": 0.1, "recency": 0.2, "salience": 0.1 }
}
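
The weighted sum and a per-query override can be sketched as below. The names DEFAULT_WEIGHTS and hybrid_score are hypothetical, and merging a partial override with the defaults is an assumption; the source only shows a full override.

```python
from typing import Dict, Optional

# Default weights from the table above.
DEFAULT_WEIGHTS: Dict[str, float] = {
    "semantic": 0.4,
    "bm25": 0.3,
    "recency": 0.2,
    "salience": 0.1,
}


def hybrid_score(
    components: Dict[str, float], weights: Optional[Dict[str, float]] = None
) -> float:
    """Weighted sum of the four component scores.

    Assumption: keys missing from `weights` fall back to the defaults.
    """
    effective = {**DEFAULT_WEIGHTS, **(weights or {})}
    return sum(components[name] * effective[name] for name in DEFAULT_WEIGHTS)


# Hypothetical component scores for one candidate memory.
components = {"semantic": 0.8, "bm25": 0.5, "recency": 0.5, "salience": 0.2}
default_score = hybrid_score(components)  # about 0.59
override_score = hybrid_score(
    components, {"semantic": 0.6, "bm25": 0.1, "recency": 0.2, "salience": 0.1}
)  # about 0.65
```

With the override from the example query, the same candidate scores higher because its semantic component is its strongest signal.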

Recency Score

Exponential decay with a configurable half-life (default: 24 hours):
recency = exp(-ln(2) × ageInHours / halfLife)
A memory accessed 24 hours ago scores ~0.5. One accessed just now scores ~1.0.
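A minimal sketch of the recency formula, assuming the half-life is expressed in hours (the function and parameter names are hypothetical):

```python
import math


def recency_score(age_in_hours: float, half_life_hours: float = 24.0) -> float:
    """Exponential decay that halves every `half_life_hours`."""
    return math.exp(-math.log(2) * age_in_hours / half_life_hours)


just_now = recency_score(0.0)   # 1.0
one_day = recency_score(24.0)   # about 0.5
two_days = recency_score(48.0)  # about 0.25
```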

Salience Score

Combines the memory’s importance value with time decay:
salience = importance × exp(-rate × ageInHours)
The decay rate defaults to 0.01 per hour.
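The salience formula can be sketched the same way. The function name salience_score is an assumption; the default rate is the one stated above.

```python
import math


def salience_score(
    importance: float, age_in_hours: float, rate: float = 0.01
) -> float:
    """Importance scaled by exponential time decay."""
    return importance * math.exp(-rate * age_in_hours)


fresh = salience_score(0.8, 0.0)         # 0.8
after_100h = salience_score(1.0, 100.0)  # exp(-1), about 0.37
```

A rate of 0.01 per hour corresponds to a half-life of ln(2) / 0.01, roughly 69 hours, so salience decays more slowly than recency at the default settings.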

Filtering

Before scoring, candidates are filtered:
  • Layer filter — Only include memories from specified layers
  • Scope filter — Filter by project or global scope
  • TTL filter — Exclude memories past their expiresAt
  • Soft-delete filter — Always exclude deleted memories
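
The four filters above can be sketched as a single predicate. The Memory record shape, its field names other than the expiry timestamp, and the layer and scope values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional, Set


@dataclass
class Memory:
    # Hypothetical record shape; not the project's actual schema.
    id: str
    layer: str
    scope: str
    expires_at: Optional[datetime] = None
    deleted: bool = False


def passes_filters(
    memory: Memory,
    now: datetime,
    layers: Optional[Set[str]] = None,
    scope: Optional[str] = None,
) -> bool:
    """Return True if the memory should go on to scoring."""
    if memory.deleted:  # soft-delete filter, always applied
        return False
    if memory.expires_at is not None and memory.expires_at <= now:  # TTL filter
        return False
    if layers is not None and memory.layer not in layers:  # layer filter
        return False
    if scope is not None and memory.scope != scope:  # scope filter
        return False
    return True


now = datetime(2024, 1, 2, tzinfo=timezone.utc)
expired = datetime(2024, 1, 1, tzinfo=timezone.utc)
memories = [
    Memory("a", "long", "project"),
    Memory("b", "long", "project", expires_at=expired),
    Memory("c", "long", "project", deleted=True),
    Memory("d", "short", "project"),
    Memory("e", "long", "global"),
]
kept = [m.id for m in memories if passes_filters(m, now, {"long"}, "project")]
```

Only "a" survives: "b" is past its expiry, "c" is soft-deleted, "d" is in the wrong layer, and "e" is in the wrong scope.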