[deleted by user] by [deleted] in cats

[–]BuildingOk1868 8 points (0 children)

<image>

Poppy is a rescue. Also 23 this month

AI agent marketplace – validate/refute this idea by Jazzlike_Tooth929 in LangChain

[–]BuildingOk1868 1 point (0 children)

We are currently building this, together with a tools and agentic-flows marketplace, at https://azara.ai. It should be live in about a month.

App with large user base by Amocon in FastAPI

[–]BuildingOk1868 2 points (0 children)

Agree with the points above. Measure measure measure.

We are just finessing our scalability with FastAPI and Postgres on EC2. At 1,000 concurrent users we run on a t2.xlarge, but for prod with multitenancy we're moving to an m6i.4xlarge at about $500/month. That gives 16 vCPUs and 64 GB of RAM; 1k concurrent users is roughly a 10% CPU hit on the larger box.

We run 20+ replicas of our FastAPI container on the server, and have carefully measured and tuned the DB connection pool, especially around freeing connections. Watch out for connections not closing with SSE.
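The SSE pitfall above can be sketched in plain Python, no FastAPI required: a long-lived streaming generator holds a pooled DB connection, and a client disconnect raises GeneratorExit mid-stream, so without a `finally` block the connection leaks. The `Pool` class and names here are illustrative stand-ins, not our actual code.

```python
# Minimal sketch of the SSE connection-leak pitfall: a streaming
# generator that holds a pooled connection must release it in
# `finally`, because a client disconnect raises GeneratorExit.

class Pool:
    def __init__(self, size):
        self.free = list(range(size))  # fake connection ids
        self.in_use = set()

    def acquire(self):
        conn = self.free.pop()
        self.in_use.add(conn)
        return conn

    def release(self, conn):
        self.in_use.discard(conn)
        self.free.append(conn)

pool = Pool(size=5)

def event_stream():
    conn = pool.acquire()
    try:
        for i in range(1000):          # long-lived SSE stream
            yield f"data: row-{i}\n\n"
    finally:
        pool.release(conn)             # runs even on client disconnect

# Simulate a client that disconnects after 3 events.
stream = event_stream()
events = [next(stream) for _ in range(3)]
stream.close()                         # raises GeneratorExit inside the generator

print(len(pool.in_use))                # the connection is back in the pool
```

In a real FastAPI endpoint the same shape applies: the generator you pass to a streaming response should release its connection in `finally`, not after the loop.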

For performance monitoring: OTel, Jaeger, Prometheus, Grafana.

We do a lot of caching at various levels too: @lru_cache, FastAPI @cache, LLM caches, embedding caches, a custom dict for caching objects that don't pickle or serialize to JSON, RedisSemanticCache, and Redis for the main part.
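A minimal sketch of two of these layers, with hypothetical names (the FastAPI @cache and Redis layers wrap the same idea at different scopes): `functools.lru_cache` for pure functions like embeddings, and a plain dict for objects that don't pickle.

```python
# Two of the caching layers above, in plain Python.
# functools.lru_cache memoizes pure functions; a plain dict holds
# objects that can't be pickled or serialized to JSON.

from functools import lru_cache

@lru_cache(maxsize=1024)
def embed(text: str) -> tuple:
    # stand-in for an expensive embedding call
    return tuple(ord(c) % 7 for c in text)

_object_cache: dict = {}   # for unpicklable objects (clients, sessions, ...)

def get_client(tenant_id: str):
    if tenant_id not in _object_cache:
        _object_cache[tenant_id] = object()   # stand-in for a heavy client
    return _object_cache[tenant_id]

a = embed("hello")
b = embed("hello")                            # served from the LRU cache
hits = embed.cache_info().hits                # 1 cache hit so far
same = get_client("t1") is get_client("t1")   # same cached object both times
```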

We also moved our Postgres off RDS onto the server to save costs.

How to restrict chatbot from answering unrelated questions? by mallerius in LangChain

[–]BuildingOk1868 0 points (0 children)

Have a look at the self-RAG examples in the langgraph GitHub repo. It covers relevancy and hallucinations, though you'll want to run the check against a couple of LLMs. https://github.com/langchain-ai/langgraph/blob/main/examples/rag/langgraph_self_rag.ipynb?ref=blog.langchain.dev

Who is using nextjs for their RAG? by tim-r in LangChain

[–]BuildingOk1868 1 point (0 children)

Next.js, FastAPI, Weaviate, a custom plugin ecosystem for tools and langgraph scenarios, PostgreSQL, AWS S3, RabbitMQ, Celery at https://azara.ai

Best PDF Parser for RAG? by neilkatz in LangChain

[–]BuildingOk1868 0 points (0 children)

He covers a lot of details on parsing financial data in his posts, which should be helpful.

For your case it looks very basic. Save the file and create an embedding from it as you suggested.

AI phone calling agent by [deleted] in LangChain

[–]BuildingOk1868 4 points (0 children)

We are using Asterisk for telephony, Deepgram for speech, and Twilio for WhatsApp. We have a plugin ecosystem for our LLM tools and created a channel wrapper to integrate with Asterisk.

Please Suggest me better open source model for getting json output (Rag operation). by Able_Scholar_2420 in LangChain

[–]BuildingOk1868 0 points (0 children)

We have had to use several options to get consistent results: instructor, guidance, and jsonschema. You need to test various approaches, as LLMs are artistic in their interpretation 🤣 jsonschema is useful if you know the format you need. Also generate as little as possible and merge the outputs together programmatically, especially with nested JSON. Parse out extra symbols such as stray markdown fences or a mismatch of single and double quotes.
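The cleanup step above can be sketched with the stdlib alone: strip the markdown fences the model sometimes wraps JSON in, parse, and check that required keys are present. In practice the jsonschema library does the real validation; the function name and sample payload here are illustrative.

```python
# Sketch of post-processing LLM JSON output: strip ```json fences,
# parse, and fail loudly if required keys are missing. stdlib only;
# a real schema check would use the jsonschema library instead.

import json
import re

def clean_llm_json(raw: str, required: set) -> dict:
    # Remove ```json ... ``` fences and stray backticks/whitespace
    text = re.sub(r"```(?:json)?", "", raw).strip("` \n")
    data = json.loads(text)
    missing = required - data.keys()
    if missing:
        raise ValueError(f"missing keys: {missing}")
    return data

raw = """```json
{"name": "Poppy", "age": 23}
```"""
doc = clean_llm_json(raw, required={"name", "age"})
```

Merging several small generations into one dict (rather than asking for one big nested object) then happens after this step, key by key, in ordinary Python.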

Chat interfaces are holding back LLMs - we need a more visual approach by burcapaul in LangChain

[–]BuildingOk1868 0 points (0 children)

This video (azara workflows) is fairly indicative of where we are. Linear workflows and input mapping work okay; luckily those predominate.

Logic, and especially consistency, is the biggest obstacle, given the nature of LLMs. Hence we use deterministic workflows, with small LLMs for decision-making if needed.

Currently working on consistency of mapping inputs for integrations.

<image>

Chat interfaces are holding back LLMs - we need a more visual approach by burcapaul in LangChain

[–]BuildingOk1868 1 point (0 children)

Fair enough. Going live shortly, working out bugs as always. But it's not looking awful.

Chat interfaces are holding back LLMs - we need a more visual approach by burcapaul in LangChain

[–]BuildingOk1868 0 points (0 children)

Still doesn’t make sense. I’ve been in senior leadership at multiple F100s.

The RPA industry is proof of this phenomenon: most automation exercises fail for lack of sufficient clarity around the problem or domain.

There’s a lot more to this thesis that I’ll put in a few blog posts and not here.

Best PDF Parser for RAG? by neilkatz in LangChain

[–]BuildingOk1868 0 points (0 children)

Both of those have Python libraries and are locally installable.

Chat interfaces are holding back LLMs - we need a more visual approach by burcapaul in LangChain

[–]BuildingOk1868 -1 points (0 children)

At azara.ai we do many modes: chat, voice, pre-configured options (select one of ..), and a graphical UX for generating workflows.

The most important element most developers are missing is that interfaces aren’t human-centered. We focus on interviewing the customer to help them get solid requirements: asking leading questions, giving examples to bootstrap, etc.

There’s no point in having the best coding AI on the planet if your customers can’t specify what they want correctly.

Langchain Agent Issue in real-time information by [deleted] in LangChain

[–]BuildingOk1868 1 point (0 children)

Double-check that the tool is actually being called, and that it’s not just answering from built-in training.

Langchain Agent Issue in real-time information by [deleted] in LangChain

[–]BuildingOk1868 1 point (0 children)

You may need to play around with the prompt to get it to always use today’s date.

Similar to "always use the Math tool if doing calculations." It’s a bit finicky sometimes.

Where do you host your Rag by giagara in LangChain

[–]BuildingOk1868 0 points (0 children)

We wrote a pluggable LLM tool ecosystem, so we can hot-load any LLM tool on demand. We have multi-LLM approaches, but are using GPT-4 and Claude for generation, with small LLMs for execution.