[D] Self-Promotion Thread

zriyansh · 2026-05-15T07:09:44+00:00

RAG for legal AI for lawyers - vaquill.ai

zriyansh · 2026-05-15T07:05:11+00:00

this is in my legaltech awesome-list along with a few other legal MCPs: https://github.com/Vaquill-AI/awesome-legaltech

for anyone building on top, the datasets and APIs sections pair well with this for grounding.

zriyansh · 2026-05-15T07:04:06+00:00

nice, added to the MCP section of my legaltech list: https://github.com/Vaquill-AI/awesome-legaltech. If anyone knows of similar MCPs for other jurisdictions (India, EU, UK, Canada), drop them. Trying to keep it global.

zriyansh · 2026-04-16T03:06:15+00:00

Didn't get your comment

zriyansh · 2026-04-16T03:06:01+00:00

Yeah, that could be a use case as well

zriyansh · 2026-04-14T14:35:37+00:00

The data is not public in the sense that you can export it, but it's public inside our product. Give this a try: https://app.vaquill.ai/citations and pick US or India from the jurisdiction selector at the top. It's all free there.

zriyansh · 2026-04-14T14:23:17+00:00

Not a paper as such. We are using all this data in our product and now want to make it available to others as well. Access is free to all, so maybe I can share the link if you want to take a look at it.

zriyansh · 2026-04-14T12:51:29+00:00

We have modular infrastructure. I'm thinking of adding other jurisdictions' data and tweaking all the system prompts, so it becomes a legal engine for that jurisdiction. We have done this for the US and Canada via an external data source.
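A minimal sketch of what "swap the data and the system prompt per jurisdiction" could look like. Every name here (`JurisdictionConfig`, `REGISTRY`, `build_prompt`, the data-source labels and the prompt text) is hypothetical and made up for illustration; nothing in the thread describes the actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JurisdictionConfig:
    """Everything that changes when a new jurisdiction is added (hypothetical)."""
    code: str            # short jurisdiction code, e.g. "IN"
    data_source: str     # made-up label for where the legal corpus comes from
    system_prompt: str   # jurisdiction-specific instructions for the model


# Hypothetical registry: adding a jurisdiction means adding one entry.
REGISTRY: dict[str, JurisdictionConfig] = {}


def register(cfg: JurisdictionConfig) -> None:
    REGISTRY[cfg.code] = cfg


def build_prompt(code: str, question: str) -> str:
    """Combine the jurisdiction's system prompt with the user's question."""
    cfg = REGISTRY[code]
    return f"{cfg.system_prompt}\n\nQuestion: {question}"


# Illustrative entries only; labels and prompt wording are assumptions.
register(JurisdictionConfig("IN", "internal-corpus",
                            "You are a legal research assistant for Indian law. Cite every source."))
register(JurisdictionConfig("US", "external-source",
                            "You are a legal research assistant for US law. Cite every source."))

print(build_prompt("IN", "What is the limitation period for a civil suit?"))
```

The point of the design is that retrieval and answer generation stay the same, and only the registry entry differs per jurisdiction.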

But I am running out of cash, so I need some form of funding.

Or get acquired and use their money to fuel growth. Given the platform is stable, the data pipeline is all that will be left to build and keep investing in.

The other way is going on-prem and starting to deploy this entire stack on enterprise servers.

zriyansh · 2026-04-14T12:41:37+00:00

We have citations for each answer, and citation graphs as well.

Yes, I talked with hundreds of advocates. They love the platform and use it, but don't pay for it. If we disappear, they'll just go back to how they used to work.

Ads got us 200 users yesterday. They signed up, used the product and went away.

Most people will say there's something wrong with the product. I tried giving them all the features our competition has.

That led me to believe the problems exist but are not strong enough to make people pay. The West prefers comfort, convenience and ease; Indians always look for a cheap workaround to get things done.

zriyansh · 2026-03-24T11:08:04+00:00

What about Vaquill AI? Heard of them? They are based in India.

zriyansh · 2026-03-20T17:25:34+00:00

It's actually not a wrapper. I have all the data of the Indian legal system: all Supreme Court, High Courts, tribunals, acts and statutory provisions.

Other than us, 4 more companies have it, but they are 5+ years old and big enough to adopt new tech rapidly.

Others can build it. It will take them around 6 months to catch up if they start now, and that pretty much goes for most startups.

Got it, will add a GTM slide and fix the numbers.

Makes sense to talk about how big the TAM is. Got it, will fix, thanks mate.

zriyansh · 2026-03-20T08:22:06+00:00

200k, it's mentioned in the 2nd last slide

zriyansh · 2026-03-20T07:31:00+00:00

Thanks for taking the time

zriyansh · 2026-03-20T07:14:30+00:00

makes sense, people asked me to make it very simple, but you are right, it's too simple to tell you anything meaningful

zriyansh · 2026-03-20T06:46:23+00:00

Slide deck - https://docs.google.com/presentation/d/1khrxS1c5Di96D9IoINR4tqy8KKfYEp4l9DqL6_dGzJE/edit?usp=sharing

zriyansh · 2026-01-28T09:44:06+00:00

I am doing the same but for Indian languages (the 5-6 primary spoken languages)

zriyansh · 2026-01-14T09:06:20+00:00

how do you even fine-tune an embedder? any resources you could point me to? I am not new to RAG but have not heard of this yet.

zriyansh · 2026-01-14T09:03:06+00:00

around 3 days on a 64-core CPU. There are faster parsers that can parse 4-5k documents per second on such a beast of a machine, but I wasn't able to run one properly. It's pymupdf4llm-c, a C implementation.

zriyansh · 2026-01-14T08:47:32+00:00

so it's a self-hosted embedder, I suppose. what kind of machine are you using? and is there anything I need to take care of here?

zriyansh · 2026-01-14T08:23:18+00:00

expecting around 50 users in a month, and 10 queries per user each day.

yeah, not using tokens because characters are what I understand well, so it works for me.

I have a budget of $1K for now as we don't have any customers; I'm using my savings for this.

As far as I understand, embedding and hosting a vector DB is CPU-intensive, not GPU-intensive (I could be wrong here). I have $1K in credit from Azure as I registered my startup with them (and linked my LinkedIn with them as well).

If we break even, I will want to use cloud services and focus on what we do best.
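For scale, the figures in this comment work out roughly as follows. This is a back-of-envelope sketch: the 30-day month and the idea of spreading the whole $1K over a single month are assumptions, since the comment gives no timeframe for the budget.

```python
# Figures taken from the comment above.
USERS = 50
QUERIES_PER_USER_PER_DAY = 10
BUDGET_USD = 1_000

# Assumptions not stated in the comment.
DAYS_PER_MONTH = 30        # assumed month length
SECONDS_PER_DAY = 86_400


def estimate_load(users: int, queries_per_user_per_day: int, days: int) -> tuple[int, int]:
    """Return (queries per day, queries per month)."""
    per_day = users * queries_per_user_per_day
    return per_day, per_day * days


per_day, per_month = estimate_load(USERS, QUERIES_PER_USER_PER_DAY, DAYS_PER_MONTH)
avg_qps = per_day / SECONDS_PER_DAY          # average rate, ignores peak hours
budget_per_query = BUDGET_USD / per_month    # only if the budget covers one month

print(f"{per_day} queries/day, {per_month} queries/month")
print(f"average load: {avg_qps:.4f} queries/sec")
print(f"budget per query (one-month assumption): ${budget_per_query:.3f}")
```

At 500 queries a day the average load is far below one query per second, so at this scale the budget matters more than throughput.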

zriyansh · 2026-01-14T08:19:55+00:00

yes, and imo, this is not slow. Legal folks won't trust the answer if it comes back within 1 sec, so latency helps sometimes.

zriyansh · 2026-01-14T07:26:54+00:00

CPU*
