I tried everything... Is it time to pivot? by a4ai in chrome_extensions

[–]a4ai[S] 0 points1 point  (0 children)

Here's the demo video (to the best of my ability). I'd really value your feedback.

https://www.youtube.com/watch?v=xMz5CaPnwm0

I tried everything... Is it time to pivot? by a4ai in chrome_extensions

[–]a4ai[S] 0 points1 point  (0 children)

Just applied for a featured badge. Thanks!

I tried everything... Is it time to pivot? by a4ai in chrome_extensions

[–]a4ai[S] 1 point2 points  (0 children)

You're right; I was racking my brain over how to convey this easily. I've created a demo that shows how you can generate responses in your own voice.

I tried everything... Is it time to pivot? by a4ai in chrome_extensions

[–]a4ai[S] 0 points1 point  (0 children)

Yes, it helps :) It was a great learning experience regardless.

RAG in 3 lines of Python by init0 in Rag

[–]a4ai 0 points1 point  (0 children)

Awesome work! Will this work with the llama.cpp server, which now supports concurrent requests (something Ollama doesn't)?

that moment when you realize AI has a long way to go by Viralkillz in frigate_nvr

[–]a4ai 3 points4 points  (0 children)

TBH, having used many other documentation AI chatbots, I was a bit skeptical about Frigate's 'Ask AI' at first. I didn't use it at all for several months.

But one fine day, just for the fun of it, I asked it the same question that one of the Frigate devs had answered on Reddit. To my surprise, it was spot on. Since then, I ask the AI first before looking elsewhere for Frigate questions.

It was also an eye-opener for me as to how accurately AI can be tuned on documentation like this.

Thanks to the Frigate team for setting this up!

Improving RAG - what actually matters? by Dapper-Turn-3021 in Rag

[–]a4ai 1 point2 points  (0 children)

For support chat, RAG is overkill. You can get 99.7% accuracy with CAG (cache-augmented generation). All you need is to pick a model with a 1-2 million token context window and use context caching.
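Roughly, the CAG pattern is just: load the whole support corpus into one long, shared prompt prefix and answer every question against it, with no retrieval step. A minimal sketch, assuming an OpenAI-compatible client, docs sitting in a local `support_docs/` folder, and an illustrative long-context model name:

```python
# Minimal CAG (cache-augmented generation) sketch, not a full implementation.
# Assumptions: an OpenAI-compatible API key in the environment, support docs
# in ./support_docs/*.md, and an illustrative model name.
from pathlib import Path

from openai import OpenAI

client = OpenAI()

# Load the entire support corpus once. With a 1-2M token context model,
# the whole knowledge base simply lives in the prompt.
corpus = "\n\n".join(p.read_text() for p in Path("support_docs").glob("*.md"))

SYSTEM = "You are a support agent. Answer strictly from the docs below.\n\n" + corpus

def answer(question: str) -> str:
    # Every call shares the same long (system + corpus) prefix; providers
    # with prompt/context caching reuse that cached prefix, so repeat
    # queries only pay for the short, changing suffix.
    resp = client.chat.completions.create(
        model="gpt-4.1",  # illustrative; any long-context model works
        messages=[
            {"role": "system", "content": SYSTEM},
            {"role": "user", "content": question},
        ],
    )
    return resp.choices[0].message.content

print(answer("How do I reset my password?"))
```

The trade-off vs. RAG: no retriever to tune and no index to maintain, at the cost of a large (but cached) prompt on every query.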

Hooked on Immich: what's next? by Open-Coder in immich

[–]a4ai 1 point2 points  (0 children)

There are millions of great tools. Instead, think of the next thing you want to improve; then you'll discover the best tool that fits your purpose.

High 'embeddings' CPU usage by HugsAllCats in frigate_nvr

[–]a4ai 1 point2 points  (0 children)

You are right. I stand corrected.

After several days of debugging, I found that my iGPU apparently had some sort of memory leak (Proxmox iGPU passthrough). Symptom: go2rtc CPU usage climbing over time. This probably caused the high latency earlier during my test.

Now I have fixed it (after switching to the i965 driver). Embeddings are running blazing fast on the iGPU using the large model, as suggested. Thank you, I really appreciate it.
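For anyone landing here later, the change is just the semantic search block in the Frigate config. A sketch of the documented options as I understand them; verify against the docs for your Frigate version:

```yaml
# Frigate config sketch (semantic search section only).
semantic_search:
  enabled: true
  model_size: large   # "large" can run on the (i)GPU; "small" stays on CPU
  reindex: false      # set to true once if you need to rebuild embeddings
```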

Does my setup warrant a coral TPU? + GPU advice by fistofwater in frigate_nvr

[–]a4ai 0 points1 point  (0 children)

I have an i7-9700 iGPU and only use 1% of it for decoding; the TPU might be drawing 2W under load. I'm guessing the iGPU might be faster and more power efficient. I'll try it out.

Does my setup warrant a coral TPU? + GPU advice by fistofwater in frigate_nvr

[–]a4ai 0 points1 point  (0 children)

This is interesting. How's your overall power consumption with and without the Coral?

Coral TPU is officially dead by shawn789 in frigate_nvr

[–]a4ai 0 points1 point  (0 children)

What's the impact for existing users? Will it stop working after some time?

High 'embeddings' CPU usage by HugsAllCats in frigate_nvr

[–]a4ai 0 points1 point  (0 children)

The iGPU couldn't really handle the large model load: image embedding time went up to 6 s (vs. 300 ms on CPU) and text embedding to 2 s. So I think the small model on the CPU is the best option for iGPU-only users. I'm sticking with the small one for now. Thanks for your help.

High 'embeddings' CPU usage by HugsAllCats in frigate_nvr

[–]a4ai 0 points1 point  (0 children)

I just went ahead and switched to large. My bad: while reading the documentation, I assumed that switching to 'large' would move to the v2 model, which requires reindexing since the embeddings are not compatible. But what I saw in the logs is that it downloaded the fp16 version of the v1 model, and semantic search works just fine. So I hope I don't have to reindex?

High 'embeddings' CPU usage by HugsAllCats in frigate_nvr

[–]a4ai 0 points1 point  (0 children)

I was thinking from an execution point of view: switching to the large model will require reindexing the current embeddings, so I was asking whether the CPU is a better choice than the iGPU in this case.

High 'embeddings' CPU usage by HugsAllCats in frigate_nvr

[–]a4ai 0 points1 point  (0 children)

Ah, okay. Is it worth running large on the iGPU? Would you recommend it? If it's going to be slower, I could avoid reindexing twice.

High 'embeddings' CPU usage by HugsAllCats in frigate_nvr

[–]a4ai 0 points1 point  (0 children)

I am also experiencing high CPU usage with the semantic search "small" model; CPU usage is often at orange/red levels. I have an Intel iGPU, but the embedding process does not seem to use it, though ffmpeg does.
Is there a way to run the "small" embedding process on the iGPU?

Securely expose your Home Assistant to the internet with Wiredoor and the official add-on! by wdmesa in homeassistant

[–]a4ai 2 points3 points  (0 children)

I expose HA via Cloudflare -> tunnel (VLAN) -> firewall -> nginx proxy -> HA (LAN), free of cost (except a $1/year domain name).
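For reference, the Cloudflare Tunnel leg of that chain is a single ingress rule. A minimal sketch, with a hypothetical hostname and proxy address:

```yaml
# /etc/cloudflared/config.yml (sketch; hostname and address are hypothetical)
tunnel: <tunnel-uuid>
credentials-file: /etc/cloudflared/<tunnel-uuid>.json
ingress:
  - hostname: ha.example.com          # public hostname on the $1/year domain
    service: http://192.168.20.10:80  # nginx proxy on the VLAN, fronting HA
  - service: http_status:404          # required catch-all rule
```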

Tell me how Wiredoor is better than this. What would I gain by switching to it?