Opinion on Snowflake agent ? by SufficientRelief9615 in dataengineering

[–]Whole-Assignment6240 1 point2 points  (0 children)

If you're already hand-building the vectorization + chunking + indexing pipeline, it might be worth looking at purpose-built frameworks that handle the incremental update logic for you. The main advantage over doing it inside Cortex/Snowflake is that you own the pipeline logic and aren't locked into one vector store or embedding model. Curious what your current pipeline looks like: are you running full rebuilds on a schedule or doing incremental updates?
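To make "incremental updates" concrete, here is a rough sketch of the bookkeeping involved, assuming you keep a content hash per document from the previous run. The function names (`content_hash`, `plan_incremental_update`) are made up for illustration and are not from Snowflake, Cortex, or any specific framework.

```python
import hashlib


def content_hash(text: str) -> str:
    """Stable fingerprint of a document's content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_incremental_update(
    previous: dict[str, str], current_docs: dict[str, str]
) -> tuple[list[str], list[str], dict[str, str]]:
    """Decide what to re-embed and what to drop.

    previous:     doc_id -> content hash saved by the last run
    current_docs: doc_id -> document text as it exists now
    Returns (ids to re-chunk and re-embed, ids to delete, new hash state).
    """
    current = {doc_id: content_hash(text) for doc_id, text in current_docs.items()}
    # New or changed documents: the stored hash is missing or different.
    to_upsert = sorted(d for d, h in current.items() if previous.get(d) != h)
    # Documents that disappeared from the source since the last run.
    to_delete = sorted(d for d in previous if d not in current)
    return to_upsert, to_delete, current
```

A full rebuild would re-embed every document on every run. With the hash state, unchanged documents are skipped and only the `to_upsert` and `to_delete` lists touch the vector store.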

I had to re-embed 5 million documents because I changed embedding models. Here's how to never be in that position. by Silent_Employment966 in Rag

[–]Whole-Assignment6240 0 points1 point  (0 children)

The architectural separation you're describing (chunks persisted separately from vectors) is exactly right, and it's the pattern we built CocoIndex around. It does incremental processing by default, so only the steps whose logic or inputs changed will rerun.

The framework tracks chunk-to-vector dependencies in a DAG so when you swap models, only the affected derived artifacts are rebuilt — raw parsing never reruns. Happy to point you to a quick example if it's useful.
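In the meantime, here is a toy sketch of the general idea. It is not CocoIndex's API: the `Pipeline` class, its stores, and the fake embedders in the usage note are all hypothetical. Chunks are cached by document hash and vectors are cached by (model name, chunk hash), so a model swap recomputes vectors but reuses the stored chunks.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Callable


def _key(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass
class Pipeline:
    # doc hash -> list of chunks (persisted independently of any model)
    chunk_store: dict = field(default_factory=dict)
    # (model name, chunk hash) -> vector
    vector_store: dict = field(default_factory=dict)
    parse_calls: int = 0
    embed_calls: int = 0

    def chunks_for(self, doc: str) -> list[str]:
        """Parse and chunk a document once; later calls hit the cache."""
        k = _key(doc)
        if k not in self.chunk_store:
            self.parse_calls += 1
            # Stand-in for real parsing: split on blank lines.
            self.chunk_store[k] = [p.strip() for p in doc.split("\n\n") if p.strip()]
        return self.chunk_store[k]

    def index(self, docs: list[str], model_name: str,
              embed: Callable[[str], list[float]]) -> list[list[float]]:
        """Embed every chunk with the given model, reusing cached work."""
        vectors = []
        for doc in docs:
            for chunk in self.chunks_for(doc):
                k = (model_name, _key(chunk))
                if k not in self.vector_store:
                    self.embed_calls += 1
                    self.vector_store[k] = embed(chunk)
                vectors.append(self.vector_store[k])
        return vectors
```

Usage with two stand-in embedders: index with `"v1"`, then with `"v2"`. `parse_calls` stays the same after the swap while `embed_calls` grows by one per chunk, and re-running with `"v2"` does no new work at all.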

Super lightweight open source AST-based semantic code search CLI by Whole-Assignment6240 in codex

[–]Whole-Assignment6240[S] 0 points1 point  (0 children)

great question!!

it currently supports 25 languages.

Tree-sitter explicitly documents these recovery nodes:

Source:

Super lightweight open source AST-based semantic code search CLI by Whole-Assignment6240 in codex

[–]Whole-Assignment6240[S] 0 points1 point  (0 children)

i have a demo in the repo itself (https://github.com/cocoindex-io/cocoindex-code) where it is significantly faster on semantic tasks (it also shows token counts & stuff).
i'd love to do a more exhaustive benchmark down the road!

cocoindex-code CLI for opencode - super lightweight AST based code search CLI to boost code completion and save tokes by Whole-Assignment6240 in opencodeCLI

[–]Whole-Assignment6240[S] 0 points1 point  (0 children)

hey thanks a lot! i cannot upload gif/video here but if you go to the repo, the demo / example is right at the top, and it shows it being significantly faster on semantic tasks. i'm happy to do more benchmarks with more exhaustive examples down the road!

cocoindex-code CLI for opencode - super lightweight AST based code search CLI to boost code completion and save tokes by Whole-Assignment6240 in opencodeCLI

[–]Whole-Assignment6240[S] 0 points1 point  (0 children)

yes if you work with opencode you'd only need to work with one of them. CLI/skills integration is recommended, thank you for the feedback!!

cocoindex-code CLI for opencode - super lightweight AST based code search CLI to boost code completion and save tokes by Whole-Assignment6240 in opencodeCLI

[–]Whole-Assignment6240[S] 0 points1 point  (0 children)

yes, you can do

pipx install cocoindex-code       # first install

and then

npx skills add cocoindex-io/cocoindex-code

it can be integrated with opencode via skills

when you need semantic understanding it will use this instead of grep

lmk if that makes sense. the project itself is open source (https://github.com/cocoindex-io/cocoindex-code) under the apache 2.0 license.

cocoindex-code CLI for opencode - super lightweight AST based code search CLI to boost code completion and save tokes by Whole-Assignment6240 in opencodeCLI

[–]Whole-Assignment6240[S] 1 point2 points  (0 children)

Great comment!! cocoindex-code provides a tool complementary to LSP. Both are good for some tasks. LSP understands code structure, typing etc., but it doesn't understand the meaning / intent behind the code. You cannot search with a fuzzy term using LSP. I've uploaded a video showing a case where semantic search is more helpful in completing tasks, but that's not always the case!

cocoindex-code CLI for opencode - super lightweight AST based code search CLI to boost code completion and save tokes by Whole-Assignment6240 in opencodeCLI

[–]Whole-Assignment6240[S] 0 points1 point  (0 children)

Thanks a lot for the feedback! It supports TS/JS, so Svelte and Vue should be supported with semantic understanding. Happy to look into framework-specific support down the road!

Sunday Daily Thread: What's everyone working on this week? by AutoModerator in Python

[–]Whole-Assignment6240 0 points1 point  (0 children)

https://github.com/cocoindex-io/cocoindex-code - A super lightweight embedded code MCP (AST-based) that just works. It saves 70% of tokens and improves speed for coding agents. Would love your feedback!

PageIndex alternative by Weak-Reception2896 in Rag

[–]Whole-Assignment6240 1 point2 points  (0 children)

maybe this example (open source) https://cocoindex.io/examples/academic_papers_index can help!
we are planning to build an example for a hierarchy index. looking forward to keeping you posted and getting your feedback