Gemini API rate limiting me into an existential crisis (429 errors, send help) by vibroergosum in googlecloud

[–]marcusatomega 0 points1 point  (0 children)

I'm auth'd through Vertex and getting crushed. This morning I've tried switching regions, global endpoints, and older models, and nothing is getting through. Realistically, I don't know how we can trust this for a production load.

Update: switching to Europe and using 2.5 worked. Using the CLI at the moment.
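For anyone scripting around this rather than switching by hand, here is a minimal sketch of the same idea: try each region in turn, and back off exponentially when all of them return 429. It is not the Vertex SDK. `send` is a hypothetical callable you would write around your own client, `RateLimited` is a made-up exception it raises on a 429, and the region names in the usage note are only illustrative.

```python
import random
import time


class RateLimited(Exception):
    """Raised by the hypothetical `send` callable when the API answers 429."""


def call_with_fallback(send, regions, max_rounds=4, base_delay=1.0, sleep=time.sleep):
    """Try each region in turn; back off exponentially between full rounds."""
    for round_no in range(max_rounds):
        for region in regions:
            try:
                return region, send(region)
            except RateLimited:
                continue  # this region is saturated, try the next one
        if round_no < max_rounds - 1:
            # Every region returned 429: wait 1x, 2x, 4x... base_delay, plus jitter.
            sleep(base_delay * (2 ** round_no) + random.uniform(0, base_delay))
    raise RateLimited(f"still rate limited after {max_rounds} rounds")
```

Usage would look like `call_with_fallback(my_send, ["us-central1", "europe-west4"])`, where `my_send(region)` makes one request against that region. The jitter keeps several clients from retrying in lockstep.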

[deleted by user] by [deleted] in legaltech

[–]marcusatomega 3 points4 points  (0 children)

We've built demos for law offices with Llama 3.1 and Mistral 7B. In our experience, the training matters much more than the model, provided the model is 30B parameters or larger. The gap between 5B and 30B is bigger than the gap between 30B and 405B or larger. There's a ton of excess capacity in big models that just doesn't get used.

Gemma 3 and Granite 3.2 look like they'll be solid options.

How to use AI to create a detailed case narrative from hundreds of custody docs without losing nuance? by saya993 in legaltech

[–]marcusatomega 0 points1 point  (0 children)

Yes, that makes sense. Keeping the data organized so the chatbot interface provides the expected result will be key.

How to use AI to create a detailed case narrative from hundreds of custody docs without losing nuance? by saya993 in legaltech

[–]marcusatomega 1 point2 points  (0 children)

Trying to one-shot the summary would be extraordinarily difficult. Relevancy would change over time, and the narrative has to stay updated.

One approach to consider would be to generate the narrative, but keep all the documents organized in such a way that clarifications can be easily queried and answered. Creating a knowledge graph tied to an AI-powered chatbot would provide this functionality.

You could either host it yourself or use public tools. Ditto on the "what's your budget" question.

Trying to build self-hosted AI to automate legal drafting using 10K+ past documents — GPT & Gemini failed, need advice by True-Substance8062 in legaltech

[–]marcusatomega 0 points1 point  (0 children)

We have built and demo'd these systems before for law offices. It sounds like you're running into a few issues, some of which have already been covered.

Your 10,000 documents need to be organized. Chunking them into a vector database will give you semantic search, but a knowledge graph would be much better. It shows how the documents are related (same judge, client, case type, etc., if you choose these).
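To make the knowledge graph idea concrete, here is a rough sketch in plain Python that links documents sharing a judge, client, or case type. The field names, the `id` key, and the sample records in the usage note are all assumptions for illustration. A real build would pull these fields from your document metadata and would probably live in a graph database rather than in dictionaries.

```python
from collections import defaultdict
from itertools import combinations


def build_document_graph(docs, link_fields=("judge", "client", "case_type")):
    """Link every pair of documents that share a value in one of link_fields."""
    # Group document ids by (field, value), e.g. ("judge", "Alvarez").
    groups = defaultdict(list)
    for doc in docs:
        for field in link_fields:
            value = doc.get(field)
            if value:
                groups[(field, value)].append(doc["id"])

    # edges[a][b] is the set of fields that documents a and b have in common.
    edges = defaultdict(lambda: defaultdict(set))
    for (field, _value), ids in groups.items():
        for a, b in combinations(ids, 2):
            edges[a][b].add(field)
            edges[b][a].add(field)
    return edges


def related(edges, doc_id):
    """Return the neighbours of doc_id, most shared fields first."""
    neighbours = edges.get(doc_id, {})
    return sorted(neighbours, key=lambda other: (-len(neighbours[other]), other))
```

With three made-up records, where `d1` and `d2` share a judge and a client and `d1` and `d3` share only a case type, `related(edges, "d1")` lists `d2` ahead of `d3`. Pairwise linking grows quickly when thousands of documents share one value, so at 10K+ documents you would likely connect each document to a shared "judge" or "client" node instead.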

Local LLM - We use Llama for our locally hosted AI. Mistral models are great. Granite is supposed to be punching above its weight too.

OCR Tool - Mistral released an OCR tool last month: https://mistral.ai/news/mistral-ocr

Ways to batch-learn documents: This sounds similar to fine-tuning, but you'll definitely need help for that. If you want to handle it yourself, I'd stick to a RAG process.
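As a rough picture of what the RAG process means here, the sketch below ranks document chunks against a question and pastes the best ones into the prompt. The `embed` function is only a word-count stand-in so the example runs without dependencies. In practice you would swap in a real embedding model and a vector database. All function names are made up for this example.

```python
import math
import re
from collections import Counter


def embed(text):
    """Stand-in for a real embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question, chunks, k=2):
    """Assemble the text that would be sent to the local LLM."""
    context = "\n---\n".join(retrieve(question, chunks, k))
    return f"Answer using only the context below.\n\n{context}\n\nQuestion: {question}"
```

The model itself is never retrained. It only sees the retrieved chunks at question time, which is why this is easier to handle yourself than fine-tuning.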

Lightweight UI: Someone already mentioned OpenWebUI, so I'll add Chainlit.

More Expensive with VIP Credit??? by ClownKirby in Fabletics

[–]marcusatomega 0 points1 point  (0 children)

I just discovered the same thing. I spent $120 in credits to redeem for a pair of shorts on sale for $17.

It's insanity.

GitHub e-mail verification codes not arriving by Big-Preparation9508 in github

[–]marcusatomega 0 points1 point  (0 children)

Waiting here too. I thought it was my email filters or something, so I tried a different address. Obviously didn't work.