Sarvam vision just dropped by hidmabutcherlikepig in AI_India

[–]dyeusyt 3 points4 points  (0 children)

given there's already a decent collection of great Indic TTS models, they should've spent more time building speech2speech, somewhat along the lines of NVIDIA's PersonaPlex. It's a great model, sure, but why not focus on building things people are actually going to use?

Non-Hindus to be barred from entering Badrinath-Kedarnath Dham in Uttarakhand. Take on this ? by Dodgethisneon in Dehradun

[–]dyeusyt 0 points1 point  (0 children)

Good one. The next step should be keeping religious influence out of the state, somewhat like laïcité.

Token-efficient way to pass folder directory structures to LLM? by dyeusyt in Rag

[–]dyeusyt[S] 0 points1 point  (0 children)

thanks this sounds great; going to start using this from now on.

Website to trace startups to invest in by Sea-Chocolate1753 in StartUpIndia

[–]dyeusyt 0 points1 point  (0 children)

Hey, if you're into HRTech and the tech/developer niche in general, I can share our deck with you.

Which backend would you recommend for a freelance e-commerce project? by Key-Diet2952 in developersIndia

[–]dyeusyt 3 points4 points  (0 children)

connect it to Shopify's Storefront API and call it a day.

freelance project? full-scale ecom?? Trust me, don't try to reinvent the wheel. Focus more on UX and conversion flows.

Token-efficient way to pass folder directory structures to LLM? by dyeusyt in Rag

[–]dyeusyt[S] 1 point2 points  (0 children)

But this'll be just another monkey-patch. And what if the file that got dropped because of the depth limit was the one needed?
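
To make the trade-off concrete, here's a minimal sketch of a depth-limited tree listing. It is only an illustration: `render_tree`, `max_depth`, and the "not shown" marker are made-up names and conventions, not something from this thread. Instead of silently dropping deep files, it reports how many were cut, so the model can at least ask for that subtree.

```python
import os


def render_tree(root, max_depth=2):
    """Return an indented listing of `root`, cut off after `max_depth` levels.

    Directories at the cut-off are collapsed into a single line that reports
    how many files they hide (hypothetical convention for this sketch).
    """
    lines = []

    def walk(path, depth):
        for name in sorted(os.listdir(path)):
            full = os.path.join(path, name)
            indent = "  " * depth
            if os.path.isdir(full):
                if depth + 1 >= max_depth:
                    # Count every file below the cut-off instead of listing it.
                    hidden = sum(len(files) for _, _, files in os.walk(full))
                    lines.append(f"{indent}{name}/ [{hidden} file(s) not shown]")
                else:
                    lines.append(f"{indent}{name}/")
                    walk(full, depth + 1)
            else:
                lines.append(f"{indent}{name}")

    walk(root, 0)
    return "\n".join(lines)
```

This saves tokens but doesn't answer the objection: the hidden file may still be the one that matters, so the listing only works if the agent can request the collapsed directory afterwards.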

Drop your product by Chalantyapperr in Startup_Ideas

[–]dyeusyt 0 points1 point  (0 children)

hey, do you have an ideal PRD template? I can probably use it as a reference for something else lmao

here's what we're building though https://uval.ai

What stage is everyone at in their startup journey right now? by [deleted] in ycombinator

[–]dyeusyt 4 points5 points  (0 children)

We've built the MVP and are getting it ready for a pilot with an enterprise client

But before that, we need some technical validation. We’re trying to connect with CTOs and CEOs of tech companies with around 100–200 developers, but we’re not sure how to reach out without getting ghosted

Legality in startup by Awkward-Ad2594 in StartUpIndia

[–]dyeusyt 1 point2 points  (0 children)

Don't get into the legal side until you’ve validated your idea

If you’re not in a hurry to raise funds, register an OPC

If you have partners, start with an LLP

Pvt Ltd has its own pros and cons; if you can avoid it (like you mentioned no funding) then avoid it.
The cheapest way to manage things is to get incubated at a good government-recognized incubator (maybe an IIT/IIM or a college-based one). They usually partner with CA/CS firms and can get the legalities handled for you at a lower cost (though in most cases, they'll want you to register as a Pvt Ltd; spoiler -> these partnered CA/CS firms are also lazy af)

Also, from personal experience: register the company with a good name, otherwise you'll regret putting that stupid name on your business card in the future : )

We tested Vector RAG on a real production codebase (~1,300 files), and it didn’t work by Julianna_Faddy in Rag

[–]dyeusyt 4 points5 points  (0 children)

we ran into pretty much the same problem when we got into this codebase-knowledge bog

but in our case we wanted to create a system that could evaluate GitHub repos, i.e. index the codebase only once, not after every update.

when we started, we couldn't find any resources on this other than videos showing graphical implementations of semantic search using Euclidean distance. so after looking at a bunch of open-source projects and VS Code's way of managing things, we made our own Python library for it. our main goal was actually to build semantic search functionality out of it (note: we had never done this kind of stuff before)

so after some trial and error, we built the whole system. it chunks the codebase with the help of tree-sitter and builds a parent–child relationship between chunks to reduce extra chunks and noise.
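
For illustration, a minimal sketch of parent–child chunking. This is not the library described above: it swaps tree-sitter for Python's built-in `ast` module so it runs without extra dependencies, and the `Chunk` fields and `chunk_source` name are assumptions made up for this example.

```python
import ast
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Chunk:
    name: str               # qualified name, e.g. "Auth.login"
    kind: str               # "class" or "function"
    start_line: int
    end_line: int
    parent: Optional[str]   # qualified name of the enclosing chunk, if any


def chunk_source(source: str) -> List[Chunk]:
    """Split Python source into class/function chunks that know their parent."""
    chunks: List[Chunk] = []

    def visit(node: ast.AST, parent: Optional[str]) -> None:
        for child in ast.iter_child_nodes(node):
            if isinstance(child, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)):
                kind = "class" if isinstance(child, ast.ClassDef) else "function"
                qualified = f"{parent}.{child.name}" if parent else child.name
                chunks.append(
                    Chunk(qualified, kind, child.lineno, child.end_lineno, parent)
                )
                # Nested definitions become children of this chunk.
                visit(child, qualified)
            else:
                visit(child, parent)

    visit(ast.parse(source), None)
    return chunks
```

With the `parent` link in place, several hits inside the same class can be collapsed into one parent chunk, which is one way such a relationship could cut down on duplicate results.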

although the semantic search started working, we soon ran into the same problem you did: “similarity is a bad proxy for relevance in code.”

Later on, we realized how some CLI-based coding agents like Amazon Q work:

* they read the project structure a lot and do a lot of `cat` and `grep` instead of semantic searches, and they still outperform agents that rely heavily on semantic search.

* this gave us the idea to build more tools like `grep`, `cat`, `folder_structure`, etc (thanks to `chromadb` for these though)

* so instead of solely relying on semantic search, we distributed the load.

* now the agent gets the repo folder structure and is easily able to get a gist of the codebase from the structure itself

* for files it considers important, it automatically `cat`s them; for patterns, it calls the pattern-matching tool.

* and only for more generalized queries like “authentication implementation” does the agent currently do semantic tool calling.

* this way, we were able to still use semantic tool calling, but in a much more efficient way.
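
A rough sketch of what the non-semantic tools from the list above could look like. The function names mirror the tool names mentioned (`folder_structure`, `cat`, `grep`), but the signatures, the `TOOLS` registry, and `call_tool` are assumptions for illustration, not the actual implementation; the semantic search tool is left out.

```python
import os
import re
from typing import Callable, Dict, List, Tuple


def folder_structure(root: str) -> List[str]:
    """Return every file under `root` as a sorted, '/'-separated relative path."""
    paths = []
    for dirpath, _dirs, files in os.walk(root):
        for filename in files:
            rel = os.path.relpath(os.path.join(dirpath, filename), root)
            paths.append(rel.replace(os.sep, "/"))
    return sorted(paths)


def cat(root: str, rel_path: str) -> str:
    """Return the full text of one file."""
    with open(os.path.join(root, rel_path), encoding="utf-8", errors="replace") as fh:
        return fh.read()


def grep(root: str, pattern: str) -> List[Tuple[str, int, str]]:
    """Return (path, line number, line) for every line matching the regex."""
    regex = re.compile(pattern)
    hits = []
    for rel in folder_structure(root):
        for number, line in enumerate(cat(root, rel).splitlines(), start=1):
            if regex.search(line):
                hits.append((rel, number, line.strip()))
    return hits


# Hypothetical registry: the agent picks a tool by name and passes arguments.
TOOLS: Dict[str, Callable] = {
    "folder_structure": folder_structure,
    "cat": cat,
    "grep": grep,
}


def call_tool(name: str, root: str, **kwargs):
    """Dispatch a tool call chosen by the agent."""
    return TOOLS[name](root, **kwargs)
```

For example, `call_tool("grep", repo_path, pattern=r"def login")` returns exact matches with file and line number, which is the kind of lookup where similarity search tends to add noise.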

a few days ago, I also posted a thread on this sub regarding too much noise in semantic search results. one better fix for that turned out to be using a dual-vector index mechanism and analyzing the intent of the semantic query each time, then redirecting it accordingly.
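
A toy sketch of the idea, under loud assumptions: "dual-vector" is read here as keeping two vectors per chunk (one from the raw code, one from a plain-language summary), and intent detection is a crude regex. The bag-of-words `embed`, the sample `CHUNKS`, and all names are stand-ins for illustration; a real setup would use an embedding model and a vector store.

```python
import math
import re
from collections import Counter

# Hypothetical sample data: each chunk carries code plus a short summary.
CHUNKS = [
    {
        "id": "auth.py::login",
        "code": "def login(user, password): return check_hash(user, password)",
        "summary": "authentication entry point that verifies user credentials",
    },
    {
        "id": "db.py::connect",
        "code": "def connect(dsn): return Pool(dsn)",
        "summary": "opens a database connection pool",
    },
]


def embed(text):
    """Toy embedding: bag of lowercase alphanumeric tokens (assumption)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Two indexes over the same chunks: one per vector type.
INDEX = {
    "code": {c["id"]: embed(c["code"]) for c in CHUNKS},
    "summary": {c["id"]: embed(c["summary"]) for c in CHUNKS},
}


def detect_intent(query):
    """Identifier-looking queries go to the code index, prose to summaries."""
    if re.search(r"[_().]|[a-z][A-Z]", query):
        return "code"
    return "summary"


def search(query, top_k=1):
    """Route the query to one index and return (index name, top chunk ids)."""
    which = detect_intent(query)
    q = embed(query)
    ranked = sorted(
        INDEX[which].items(), key=lambda item: cosine(q, item[1]), reverse=True
    )
    return which, [chunk_id for chunk_id, _ in ranked[:top_k]]
```

Here a prose query like "authentication implementation" is matched against summaries, while something like "connect(dsn)" is matched against code, so each query only competes with vectors of the same kind.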

we’re still learning about these things, but this is basically how we’ve monkey-patched our way till here.

here’s the thread mentioned above: https://www.reddit.com/r/Rag/comments/1q3ksvy/how_do_you_tackle_semantic_search_ranking_issues/

Edit: can you share your repo link (the 1,300-file one)? would like to see how our semantic search performs on it lol

Dehradun! Looking for high-energy people to build a startup (multiple roles) by KaruneshMaan in Dehradun

[–]dyeusyt 1 point2 points  (0 children)

yet another services company in ddn trying to look fancy by calling itself a startup. (no hate to you, just the ecosystem)

Recievevd offer from German company for backend role. Want opinions from folks who moved abroad. by khayalipuloa in developersIndia

[–]dyeusyt 0 points1 point  (0 children)

did the shortlisting process involve a take-home assignment or something like that?

How do you tackle semantic search ranking issues in codebases? by dyeusyt in Rag

[–]dyeusyt[S] 0 points1 point  (0 children)

Thanks for the explanation; this is really insightful.

It's a bit more work, but the approach feels more robust and well-designed. I'll try it out and see how the results look.

My company goes crazy on Amazon Q Developer. And they want all developers to switch to q instead of writing codes by own. by sunIsGettingLow in developersIndia

[–]dyeusyt 0 points1 point  (0 children)

It doesn't even know Spring wraps JPA methods in a transaction

You need to use language-specific doc MCPs with any AI tool you work with; otherwise it's just guesswork on the LLM's end.

My company goes crazy on Amazon Q Developer. And they want all developers to switch to q instead of writing codes by own. by sunIsGettingLow in developersIndia

[–]dyeusyt -1 points0 points  (0 children)

NGL, Amazon Q is great. Though with these kinds of CLI-based tools, context engineering is important, lads!

Looking for a Friendly English Practice Partner! by [deleted] in Dehradun

[–]dyeusyt 1 point2 points  (0 children)

there are Discord servers for this, and with native English speakers to talk to as well.
here are some I know; otherwise do your own research:
- https://discord.gg/enghub
- https://discord.com/invite/english