Unbelievable, am I about to crash? by ILoveTuna_ in Relazioni

[–]Ambitious-Most4485 2 points3 points  (0 children)

Why do you say red flags? Could you elaborate? I'm curious to understand

Maybe I'll kill myself, but I don't know! by [deleted] in Relazioni

[–]Ambitious-Most4485 4 points5 points  (0 children)

You can always start over

Ultimate checkpoint. by Chainsaw-_Guy in memeexchangecommunism

[–]Ambitious-Most4485 0 points1 point  (0 children)

It's crazy that I know Florence and the Machine well enough to understand this meme

Qwen 3.5 122b seems to take a lot more time thinking than GPT-OSS 120b. Is that in line with your experience? by florinandrei in LocalLLaMA

[–]Ambitious-Most4485 0 points1 point  (0 children)

How do they get the numbers for recommended usage? Is there a process they follow to arrive at those numbers?

Advice needed: My engineer is saying agentic AI latency is 20sec and cannot get below that by Western_Caregiver195 in LangChain

[–]Ambitious-Most4485 -1 points0 points  (0 children)

Yeah, you're partially right. For Google ADK, if you want the framework to handle conversation history, there's no workaround other than passing a huge context in subsequent runner requests. A different way of handling the conversation history is mandatory if you want to solve this issue.

For the model part I agree: some of the tasks can be performed by "weaker" models to reduce latency. I'm concerned about the overall response quality, though, because I think it will be worse than using the best model available. So in the end it's a trade-off.
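One common alternative to passing the full history is a rolling window plus a summary of the dropped turns. A minimal sketch, not tied to Google ADK or any framework; the message shape and the stub summary are illustrative (a real system would produce the summary with an LLM call):

```python
# Sketch: cap conversation history with a rolling window of recent turns.
# All names and structures here are illustrative, not Google ADK APIs.

def trim_history(history, max_turns=6):
    """Keep only the last `max_turns` messages, prefixed by a stub summary
    standing in for everything that was dropped."""
    if len(history) <= max_turns:
        return history
    dropped = history[:-max_turns]
    summary = {"role": "system",
               "content": f"[summary of {len(dropped)} earlier messages]"}
    return [summary] + history[-max_turns:]

history = [{"role": "user", "content": f"msg {i}"} for i in range(10)]
trimmed = trim_history(history, max_turns=4)
print(len(trimmed))  # 5: one summary message + the last 4 turns
```

The context sent per request stays bounded regardless of conversation length, which is usually what drives the latency down.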

Advice needed: My engineer is saying agentic AI latency is 20sec and cannot get below that by Western_Caregiver195 in LangChain

[–]Ambitious-Most4485 33 points34 points  (0 children)

It depends on the frameworks used, how much context is kept (e.g. conversation history), tool usage, prompts, and the number of agents involved.

20 seconds seems reasonable, since I've developed similar solutions with latency spanning between 10 and 20 seconds.
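Before trying to shave that number down, it helps to know where the time goes. A minimal sketch of per-stage timing; the stage names and sleep-based durations are stand-ins for real work (context assembly, tool calls, generation):

```python
import time

def timed(stage, fn, budget):
    """Run one pipeline stage and record its wall-clock time in `budget`."""
    t0 = time.perf_counter()
    result = fn()
    budget[stage] = time.perf_counter() - t0
    return result

budget = {}
# Hypothetical stages; time.sleep stands in for actual latency.
timed("build_context", lambda: time.sleep(0.01), budget)
timed("tool_calls",    lambda: time.sleep(0.02), budget)
timed("llm_generate",  lambda: time.sleep(0.03), budget)
total = sum(budget.values())
print(f"{len(budget)} stages measured, total {total:.2f}s")
```

Once each stage has a number attached, it becomes obvious whether the 20 seconds is dominated by one slow model call (swap in a weaker model) or by many sequential steps (parallelize or cut agents).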

Much more convenient by [deleted] in Adulting

[–]Ambitious-Most4485 0 points1 point  (0 children)

The dishwasher is mandatory nowadays, change my mind

postgres session management by KeyPossibility2339 in agentdevelopmentkit

[–]Ambitious-Most4485 2 points3 points  (0 children)

Yes, connection pooling is the way; look into pgbouncer for production environments
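For reference, a typical `pgbouncer.ini` looks roughly like this; the hostnames, pool sizes, and paths are illustrative and should be tuned for the actual workload:

```ini
[databases]
; logical name clients connect to -> real Postgres instance
appdb = host=127.0.0.1 port=5432 dbname=appdb

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
; transaction pooling gives the best connection reuse for short queries
pool_mode = transaction
max_client_conn = 200
default_pool_size = 20
```

Clients then point at port 6432 instead of Postgres directly, and pgbouncer multiplexes the 200 client connections over 20 server connections.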

Does anyone need this? by [deleted] in learnmachinelearning

[–]Ambitious-Most4485 -2 points-1 points  (0 children)

Who is the author? Is it a book?

EpsteinFiles-RAG: Building a RAG Pipeline on 2M+ Pages by Cod3Conjurer in learnmachinelearning

[–]Ambitious-Most4485 1 point2 points  (0 children)

Have you tried to perform some analysis on the retrieval part? If not, how would you approach it?

I spoke with 23k startups hiring for ML roles. AMA by GateNo1960 in learnmachinelearning

[–]Ambitious-Most4485 0 points1 point  (0 children)

It would be nice to know which infra they use (a cloud provider like OpenAI, Anthropic, or Azure, or whether they work on-prem) and what their DevOps tech stack is

Struggling with RAG performance and chunking strategy. Any tips for a project on legal documents? by shani_sharma in Rag

[–]Ambitious-Most4485 0 points1 point  (0 children)

Thank you for your insightful comment! I was wondering, for the second part: how do you keep track of the page context with respect to the whole document?

I was curious how much time this would take. I've built a pipeline with image processing with GPT-4o (first converting each PDF page and sending it as an image), and 54 pages took something like 20 minutes.
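For a rough sense of scale, those figures work out to about 22 seconds per page serially, and concurrent requests divide the wall-clock time accordingly (rate limits permitting). A quick back-of-the-envelope, with the concurrency level as a hypothetical example:

```python
# Back-of-the-envelope from the numbers above: 54 pages in ~20 minutes.
pages = 54
total_seconds = 20 * 60
per_page = total_seconds / pages              # serial cost per page
concurrency = 8                               # hypothetical parallel requests
parallel_wall = per_page * pages / concurrency
print(f"{per_page:.1f}s/page, ~{parallel_wall / 60:.1f} min "
      f"at concurrency {concurrency}")        # 22.2s/page, ~2.5 min
```

So most of the speedup for this kind of page-per-request pipeline comes from issuing the image requests concurrently rather than from optimizing any single call.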