Commit b87e302

author
Your Name
committed
feat: rebase onto upstream main — port FTS5 BM25 search, Interface registration, embeddings, cross-repo infrastructure
Rebased our PR #162 features onto upstream's latest main (commit 1d30971) which includes MinHash SIMILAR_TO edges, CBM_CACHE_DIR, and major refactoring.

Ported features (building clean on upstream's refactored codebase):

1. FTS5 BM25 search infrastructure:
   - Contentless FTS5 virtual table (nodes_fts) with camelCase token splitting
   - cbm_camel_split() SQLite function: updateCloudClient → 'update Cloud Client'
   - FTS5 backfill in both full pipeline and incremental pipeline
   - Incremental reindex now preserves FTS5 (was wiping to 0 rows)

2. Interface registration in symbol registry:
   - Added 'Interface' to label filter in process_def() (pass_definitions.c)
   - Added 'Interface' to label filter in register_and_link_def() (pass_parallel.c)
   - Fixes: C# class Foo : IBar now creates INHERITS → Interface edges

3. C# base_list extraction:
   - Added 'base_list' to fallback base_types[] in extract_base_classes()

4. Embeddings infrastructure (opt-in via CBM_EMBEDDING_URL):
   - embeddings table in SQLite schema
   - cbm_cosine_sim() SQLite function for vector search
   - embedding.c/h: HTTP client, text generation, RRF merge, pipeline integration
   - Auto-generates embeddings during indexing when configured

5. Cross-repo infrastructure:
   - cross_repo.c/h: unified _cross_repo.db builder, cross-repo search, channel matching, trace helper

Not yet ported (follow-up commits):
- MCP tool changes (search_graph query param, generate_embeddings tool, cross-repo tools, get_impact tool)
- Process detection (cbm_store_detect_processes)
- Channel detection (cbm_store_detect_channels)
- C# delegate event subscription (extract_calls.c)
- WRITES expansion (extract_semantic.c)

All upstream features preserved: MinHash SIMILAR_TO, pass_similarity, CBM_CACHE_DIR, TS_FIELD() macro, extracted helpers.
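The camelCase splitting named in item 1 can be sketched as below. This is an illustration only: the body of cbm_camel_split() is not part of the diff shown here, so the helper name `camel_split`, its signature, and the boundary rule (a space before an uppercase letter that follows a lowercase letter or digit) are all assumptions chosen to reproduce the one example in the message.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, NOT the commit's cbm_camel_split().
 * Copies `in` to `out`, inserting a space before each uppercase letter
 * that directly follows a lowercase letter or a digit.
 * Returns the number of bytes written (excluding the NUL terminator),
 * or 0 if `out` is too small to hold the result. */
static size_t camel_split(const char *in, char *out, size_t out_cap) {
    size_t n = 0;
    if (out_cap == 0) return 0;
    for (size_t i = 0; in[i] != '\0'; i++) {
        unsigned char c = (unsigned char)in[i];
        unsigned char prev = i > 0 ? (unsigned char)in[i - 1] : 0;
        int boundary = i > 0 && isupper(c) && (islower(prev) || isdigit(prev));
        size_t needed = boundary ? 2 : 1;
        /* Always leave room for the trailing NUL. */
        if (n + needed >= out_cap) {
            out[0] = '\0';
            return 0;
        }
        if (boundary) out[n++] = ' ';
        out[n++] = (char)c;
    }
    out[n] = '\0';
    return n;
}
```

Under this rule, runs of capitals such as HTTPClient are left unsplit; whether the real function handles acronyms differently cannot be told from the commit message.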
1 parent 1d30971 commit b87e302

12 files changed: +1923 -7 lines changed

Makefile.cbm

Lines changed: 3 additions & 2 deletions
@@ -145,7 +145,7 @@ PREPROCESSOR_SRC = $(CBM_DIR)/preprocessor.cpp
 SQLITE_WRITER_SRC = $(CBM_DIR)/sqlite_writer.c
 
 # Store module (new)
-STORE_SRCS = src/store/store.c
+STORE_SRCS = src/store/store.c src/store/cross_repo.c
 
 # Cypher module (new)
 CYPHER_SRCS = src/cypher/cypher.c
@@ -186,7 +186,8 @@ PIPELINE_SRCS = \
 	src/pipeline/pass_compile_commands.c \
 	src/pipeline/pass_infrascan.c \
 	src/pipeline/pass_k8s.c \
-	src/pipeline/pass_similarity.c
+	src/pipeline/pass_similarity.c \
+	src/pipeline/embedding.c
 
 # SimHash / MinHash module
 SIMHASH_SRCS = src/simhash/minhash.c
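
The commit message lists an "RRF merge" among the contents of the newly built src/pipeline/embedding.c. That file's source is not shown in this diff, so the following is only a minimal sketch of reciprocal rank fusion in general: each ranked list contributes 1 / (k + rank) to an item's fused score. The function names, the use of array indices as item ids, and the constant k = 60 are assumptions, not taken from the commit.

```c
#include <stddef.h>

/* Conventional RRF constant; the value used by the commit is not shown. */
#define RRF_K 60.0

/* Hypothetical helper: adds one ranked list's RRF contribution to `scores`.
 * ranked[i] is the id of the item at rank i + 1 (best first).
 * Ids outside [0, n_items) are ignored. */
static void rrf_accumulate(const size_t *ranked, size_t n_ranked,
                           double *scores, size_t n_items) {
    for (size_t i = 0; i < n_ranked; i++) {
        if (ranked[i] < n_items) {
            scores[ranked[i]] += 1.0 / (RRF_K + (double)(i + 1));
        }
    }
}

/* Hypothetical helper: returns the id with the highest fused score
 * (the lowest id wins ties). Requires n_items >= 1. */
static size_t rrf_best(const double *scores, size_t n_items) {
    size_t best = 0;
    for (size_t i = 1; i < n_items; i++) {
        if (scores[i] > scores[best]) best = i;
    }
    return best;
}
```

A caller would zero-initialise `scores`, call `rrf_accumulate` once for the BM25 ranking and once for the vector ranking, then sort or scan the scores. An item that appears near the top of both lists outranks one that tops only a single list.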

internal/cbm/extract_defs.c

Lines changed: 1 addition & 0 deletions
@@ -955,6 +955,7 @@ static const char **extract_base_classes(CBMArena *a, TSNode node, const char *s
         "implements_clause",
         "argument_list",
         "inheritance_specifier",
+        "base_list", /* C# class Foo : IBar */
         NULL};
     return find_base_from_children(a, node, source, base_types);
 }
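
The hunk above only appends an entry to a NULL-terminated array; the body of find_base_from_children() is not shown. As a rough sketch of why one extra string is enough, a scan over such a list might look like the following. The helper `is_base_clause_type` is hypothetical and stands in for whatever comparison the real function performs on each child node's type.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not from the repository: returns 1 if `node_type`
 * equals any entry of the NULL-terminated `types` array, else 0. */
static int is_base_clause_type(const char *node_type,
                               const char *const *types) {
    for (size_t i = 0; types[i] != NULL; i++) {
        if (strcmp(node_type, types[i]) == 0) return 1;
    }
    return 0;
}
```

With a scan of this shape, a C# `base_list` child is skipped until the string is present in the array, which is consistent with the commit's note that `class Foo : IBar` previously produced no INHERITS edge.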
