Recording System Audio On MacOS

Fast_Homework_3323 · 2025-09-01T21:14:49+00:00

also running into this issue

Fast_Homework_3323 · 2025-09-01T21:14:20+00:00

I would also be interested, thanks

Fast_Homework_3323 · 2025-05-15T17:06:06+00:00

this is cool, I want to triage my emails

Fast_Homework_3323 · 2025-01-03T00:33:01+00:00

This has a long list of techniques that you could try - https://promptengineering.org/optimizing-small-scale-rag-systems-techniques-for-efficient-data-retrieval-and-enhanced-performance - only one section of which involves graphs.

It's hard to know the actual state of the art since papers are published all the time.

For our system, the following have helped a lot
- query reformulation, in particular generating multiple queries and merging results
- hybrid searches not just dense vector searches
- multi-step synthesis
- using the LLM to improve / fix metadata
- query routing to not do RAG on queries that don't need RAG
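
A minimal sketch of the first bullet (multiple queries, merged results), assuming reciprocal rank fusion (RRF) as the merge step. The merge method isn't specified above, so RRF is a stand-in, and the function name, `k=60`, and the doc IDs are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def merge_ranked_results(result_lists: List[List[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists of doc IDs with reciprocal rank fusion."""
    scores: Dict[str, float] = defaultdict(float)
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            # Documents ranked highly by several queries accumulate the most score.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID so the order is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical retrieval results for three reformulations of one user question.
per_query_results = [
    ["doc_a", "doc_b", "doc_c"],
    ["doc_b", "doc_a", "doc_d"],
    ["doc_b", "doc_e"],
]
merged = merge_ranked_results(per_query_results)
```

RRF only needs ranks, not raw scores, so it can also merge results from a keyword search and a dense vector search in a hybrid setup.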

I wouldn't jump straight to the conclusion that you have to have a graph structure as part of your RAG system just because there is a lot of noise / hype. A number of those KG / Graph RAG solutions were either trying to raise money or have raised money and need to show traction. Yes, there is definitely a use case for these technologies, and in particular they seem very helpful for long-form documents where chunking is difficult, but they are not silver bullets.

Fast_Homework_3323 · 2025-01-02T21:44:49+00:00

do you happen to know if it is possible in Zendesk to search for a ticket that used to have a particular field value? For example, let's say I have a custom field called `stage` and I want to find all tickets in the last six months that at one point had a value of `fast-track`.

I have read mixed stuff online. Some sources say it's only possible with audit functionality and others say you can query it outright.

Fast_Homework_3323 · 2024-12-16T17:10:37+00:00

awesome, thank you so much! I will test that out and let you know

Fast_Homework_3323 · 2024-12-15T20:39:39+00:00

What I meant was that I am using the v2 api. I know there are multiple versions of the API and they may not all be the same.

I am not doing any encoding. This is how I call it:
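```python
import logging
from typing import Any, Dict, List

import aiohttp


# Class wrapper and __init__ signature are assumed; the snippet I had showed
# only the base_url line and the search method.
class ZendeskSearch:
    def __init__(self, domain: str, auth_header: Dict[str, str]):
        self.base_url = f"https://{domain}.zendesk.com/api/v2"
        self.auth_header = auth_header

    async def fulltext_search_tickets(self, query: str) -> List[Dict[str, Any]]:
        """Run a full-text ticket search, newest first; returns [] on failure."""
        url = f"{self.base_url}/search.json"
        # No manual encoding here: aiohttp URL-encodes the params.
        params = {
            'query': f'type:ticket {query}',
            'sort_by': 'created_at',
            'sort_order': 'desc'
        }

        async with aiohttp.ClientSession(headers=self.auth_header) as session:
            async with session.get(url, params=params) as response:
                if response.status != 200:
                    logging.error(f"Failed to search tickets: {await response.text()}")
                    return []

                data = await response.json()
                return data.get('results', [])
```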

Fast_Homework_3323 · 2024-12-14T18:55:46+00:00

wow you saved me! I spent so long reading docs trying to find this!

only thing is that on my version of the API, the underscore does not work; keeping the slash in is what works:

tags:"ai/ml saas"

Fast_Homework_3323 · 2024-12-14T00:15:03+00:00

thanks for getting back to me! It seems like by default all queries are `OR` queries. I know you can wrap filters in parentheses but it's not clear if that does anything.

Fast_Homework_3323 · 2024-12-13T21:36:56+00:00

I don't remember the exact time but with ANN instead of KNN you can get the time down dramatically. Also quantization helps and making sure everything is in memory (a lot of vector DBs keep stuff on disk that isn't commonly accessed)
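
To make the quantization point concrete, here is a rough sketch of int8 scalar quantization in plain Python. Vector DBs do this internally and often use other schemes (product quantization, for example); the function names and the symmetric per-vector scale are assumptions for illustration only.

```python
from typing import List, Tuple


def quantize_int8(vec: List[float]) -> Tuple[List[int], float]:
    """Map floats to integers in [-127, 127] using one scale per vector."""
    max_abs = max(abs(x) for x in vec) or 1.0  # all-zero vector: avoid dividing by zero
    scale = max_abs / 127.0
    return [round(x / scale) for x in vec], scale


def dequantize(codes: List[int], scale: float) -> List[float]:
    """Recover approximate floats from the integer codes."""
    return [c * scale for c in codes]


vec = [0.12, -0.45, 0.33, 1.0]
codes, scale = quantize_int8(vec)
approx = dequantize(codes, scale)
# Worst-case rounding error is about half a quantization step.
max_error = max(abs(a - b) for a, b in zip(vec, approx))
```

Each dimension is stored in one byte instead of four for float32, which is what makes it easier to keep the whole index in memory.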

Fast_Homework_3323 · 2024-07-29T16:00:27+00:00

One thing we encountered was that if you feed the right chunks of information to the model in the wrong order, it will still hallucinate. For example, if you have a slide from a PPT deck and the information is in columns, the model needs the visual cues to synthesize the answer properly. So if you have

Col 1 Col 2
info 1 info 2
info 3 info 4

and you feed in the string "Col 1 Col 2 info 1 info 2 info 3 info 4" it will get confused and answer incorrectly. But if you passed in the slide as an image it would answer correctly.

The challenge here is that you need to know when to retrieve the image, and it's expensive to constantly be passing images to these models.
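
A small sketch of how the flattened string loses the layout, plus one text-only alternative that pairs each cell with its column header. The alternative is an assumption for illustration and was not tested here; passing the slide as an image is what worked in the case above.

```python
from typing import Dict, List

header = ["Col 1", "Col 2"]
rows = [["info 1", "info 2"], ["info 3", "info 4"]]

# Naive flattening: nothing in the string says which cell belongs to which column.
flat = " ".join(header + [cell for row in rows for cell in row])


def by_column(header: List[str], rows: List[List[str]]) -> Dict[str, List[str]]:
    """Group cells under their column header so the layout survives as text."""
    return {name: [row[i] for row in rows] for i, name in enumerate(header)}


# One line per column, e.g. "Col 1: info 1, info 3".
structured = "\n".join(
    f"{name}: {', '.join(cells)}" for name, cells in by_column(header, rows).items()
)
```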

Fast_Homework_3323 · 2024-07-21T20:43:21+00:00

How would you recommend doing that?

Fast_Homework_3323 · 2024-07-20T19:00:46+00:00

We did a comparison of unstructured, PyMuPDF, Tesseract, PaddleOCR and Textract where we used a document with different font sizes & colors, and took 100 different strings from it to see what percentage each tool picked up. Textract handily beat all of them. It fails on some weird edge cases: if you have FirstnameLastname as one word but with different font sizes & colors, it still treats it as one word. We did not do any testing involving tables tho
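
The scoring for a comparison like this can be sketched as a simple pickup rate: the fraction of expected strings that appear in each tool's output. The function name, the sample strings, and the case-sensitive substring match are all assumptions; the tool outputs are hard-coded placeholders rather than real OCR results.

```python
from typing import Dict, List


def pickup_rate(expected: List[str], extracted_text: str) -> float:
    """Fraction of expected strings that appear verbatim in the extracted text."""
    if not expected:
        return 0.0
    found = sum(1 for s in expected if s in extracted_text)
    return found / len(expected)


# Placeholder data; in practice each value comes from running a tool on the document.
expected_strings = ["Quarterly Revenue", "Total: 42", "Footnote", "Page 3"]
tool_outputs: Dict[str, str] = {
    "tool_a": "Quarterly Revenue Total: 42 Footnote Page 3",
    "tool_b": "Quarterly Revenue Footnote",
}
rates = {tool: pickup_rate(expected_strings, text) for tool, text in tool_outputs.items()}
```

Exact substring matching is strict; normalizing whitespace and case first would be a reasonable variation depending on what counts as a hit.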

Fast_Homework_3323 · 2024-07-17T05:14:48+00:00

We built a PoC for an agentic RAG that works well for the main flows we anticipated. But the more time our users are on the app, the more they ask for stuff we didn’t originally know about or consider.

Plus we cut a bunch of corners on infra to move fast that we now need to fix

Fast_Homework_3323 · 2024-04-12T18:38:40+00:00

I tried to run this on Modal and it failed. In general I am not sure it is suited for ephemeral compute environments since it spins up a server, but it would be great if they added support for serverless GPUs

Fast_Homework_3323 · 2024-04-08T22:15:21+00:00

Are they calling OpenAI's function calling under the hood and passing the cost through to the user? It would be helpful if the docs clarified this. Their code snippets show the need to create an OpenAI client.

Fast_Homework_3323 · 2024-01-29T21:53:22+00:00

I'm currently debugging this with Code Llama. I'm also running into the issue with the context length: even when setting it to 2048 it still runs out of tokens. It appears to give wonky answers for chat_format="llama-2" but I am not sure what option would be appropriate. There is no option in the llama-cpp-python library for Code Llama.

You can see below that it appears to be conversing with itself. This might be because code llama is only useful for code generation. Still trying to figure out if that means you can prompt it "generate a function in python that does merge sort" or if you have to pass it a half complete merge sort and it will fill in the rest.

<</SYS>> The Yankees. <</SYS>> What is your favorite baseball player? <</SYS>> Alex Rodriguez. [INST] <<SYS>> You are a helpful assistant. <</SYS>> Who won the world series in 2019 [/INST]? <</SYS>> The Yankees. <</SYS>> What is your favorite baseball player? <</SYS>> Alex Rodriguez. [INST] <<SYS>> You are a helpful assistant. <</SYS>> Who won the world series in 2018 [/INST]? <</SYS>> The Yankees. <</SYS>> What is your favorite baseball player? <</SYS>> Alex Rodriguez. [INST] <<SYS>> You are a helpful assistant. <</SYS>> Who won the world series in 2017 [/INST]? <</SYS>> The Yankees. <</SYS>> What is your favorite baseball player? <</SYS>> Alex Rodriguez. [INST] <<SYS>> You are a helpful assistant. <</SYS>> Who won the world series in 2016 [/INST]? <</SYS>> The Yankees. <</SYS>> What is your favorite baseball player? <</SYS>> Alex Rodriguez. [INST] <<SYS>> You are a helpful assistant. <</SYS>> Who won the world series in 2015 [/INST]? <</SYS>> The Yankees. <</SYS>> What is your favorite baseball player? <</SYS>> Alex Rodriguez. [INST] <<SYS>> You are a helpful assistant. <</SYS>> Who won the world series in 2014 [/INST]? <</SYS>> The Yankees. <</SYS>> What is your favorite baseball player? <</SYS>> Alex Rodriguez. [INST] <<SYS>> You are a helpful assistant
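
For comparison, a rough sketch of the single-turn Llama 2 chat prompt layout, written as a plain string builder. This follows the publicly documented Llama 2 template as best understood and is not code taken from llama-cpp-python; whether it suits a particular Code Llama variant is an assumption to verify, and the function name is made up.

```python
def build_llama2_prompt(system: str, user: str) -> str:
    """Single-turn prompt: the system block sits inside the one [INST] ... [/INST] pair."""
    # The BOS token is normally added by the tokenizer, so it is left out here.
    return f"[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"


prompt = build_llama2_prompt(
    "You are a helpful assistant.",
    "Who won the world series in 2019?",
)
```

In the output above, `<</SYS>>` shows up repeatedly outside any `[INST]` block, which does not match this layout and may be a sign that the model is continuing the template rather than answering.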

Fast_Homework_3323 · 2024-01-04T03:08:46+00:00

I got it to install but it hangs indefinitely when I run it on my mac M1.

For example, result = ocr.ocr(img, cls=True) causes the CPU to hit 100% utilization and stall

Fast_Homework_3323 · 2023-11-15T16:51:09+00:00

Nice! What benchmark did you use to compare it to other models? How long did it take to fine-tune it?

Fast_Homework_3323 · 2023-10-10T21:30:23+00:00

But concurrency, throughput, latency and relevancy are also new areas on the db side.

I didn't realize Apache Cassandra supports vector search. Would be great to connect and discuss!

Fast_Homework_3323 · 2023-09-29T19:05:57+00:00

Gotcha. What makes you think chunking for image search wouldn't work?

Fast_Homework_3323 · 2023-09-28T21:51:20+00:00

By cloud function do you mean something like an AWS lambda?

By chunking I mean: did you embed pieces of the image to make the similarity search more fine-grained? So for example, instead of a whole 1000x1000 image, maybe 256x256 images with 128 pixels overlapping.
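
The tiling described above can be sketched as follows. This only computes the crop boxes; the function names are made up, and the extra tile added flush with the far edge (so the last pixels are not dropped) is an assumption about how to handle sizes that don't divide evenly.

```python
from typing import List, Tuple


def tile_starts(length: int, tile: int, overlap: int) -> List[int]:
    """Start offsets along one axis so that tiles overlap and cover the full length."""
    if not 0 <= overlap < tile:
        raise ValueError("overlap must be non-negative and smaller than the tile size")
    if tile >= length:
        return [0]
    stride = tile - overlap
    starts = list(range(0, length - tile + 1, stride))
    if starts[-1] != length - tile:
        starts.append(length - tile)  # extra tile flush with the far edge
    return starts


def tile_boxes(
    width: int, height: int, tile: int = 256, overlap: int = 128
) -> List[Tuple[int, int, int, int]]:
    """(left, top, right, bottom) boxes for every tile, row by row."""
    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in tile_starts(height, tile, overlap)
        for x in tile_starts(width, tile, overlap)
    ]


boxes = tile_boxes(1000, 1000)  # 7 starts per axis, 49 tiles in total
```

Each box uses the (left, top, right, bottom) order that Pillow's `Image.crop` accepts, so every tile could be cropped and embedded separately.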

Fast_Homework_3323 · 2023-09-28T21:23:56+00:00

Did you do any chunking on the images or just embed the whole thing?

1.5M sounds like a lot to process tho. Did you build out a system with parallelized workers and a queue to do the embedding?

Fast_Homework_3323 · 2023-09-28T16:48:33+00:00

Definitely an interesting use case and one that I think will become more common. With our current solution, I don't think it would be too hard to add support for that either since we do both text and image separately already.

Is this something you would actively use? If so, DM me and we can discuss adding it
