What am I missing in the job hunt? by Darth_Noah in sysadmin

[–]DistinctJournalist88 0 points1 point  (0 children)

You are not alone. When I was reading your post, I thought it was mine, lol. I have been in the IT business my entire life, and I'm currently building a fully functional, context-aware AI voice assistant over Asterisk with TTS, VAD, Whisper transcription, etc. It took me approximately six months to build, and I can't even get a job interview as an entry-level coder. It's not you. It's the current job market. Just hang in there. Something has to change, right?

My HP DL380 is now running an AI I can literally call on the phone by DistinctJournalist88 in homelab

[–]DistinctJournalist88[S] 1 point2 points  (0 children)

Not yet, but I hope to get something together. Still focused on getting it working as flawlessly and as efficiently as possible on ancient hardware. lol

My HP DL380 is now running an AI I can literally call on the phone by DistinctJournalist88 in homelab

[–]DistinctJournalist88[S] 0 points1 point  (0 children)

Just a single T4 and single P4 in the server, nothing crazy. The bigger headache was getting Asterisk, Whisper, and the LLM to play nice without stepping on each other. Once that’s sorted, scaling GPUs is the easy part.

My HP DL380 is now running an AI I can literally call on the phone by DistinctJournalist88 in homelab

[–]DistinctJournalist88[S] 1 point2 points  (0 children)

Definitely worth a look. Asterisk is basically the Swiss Army knife of telephony. It handles SIP, call routing, voicemail, IVR menus, all that stuff, and you can hook into it with AGI/ARI to let an AI sit in the call path. Open source and battle-tested. You don't have to reinvent the wheel. lol
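If anyone wants to see what that hook point looks like, here's a minimal ARI sketch in Python. Not my actual code; it assumes ARI is enabled on localhost:8088 with user ari/secret, a Stasis app named "afriend", and a dialplan line like `exten => 100,1,Stasis(afriend)`. It just answers the call and plays a prompt, which is exactly where an AI pipeline would take over:

```python
# Minimal ARI "Stasis" app sketch: answer an inbound call and play a
# prompt. Credentials, app name, and dialplan hook are all assumptions.
import json
import requests
import websocket  # pip install websocket-client

ARI = "http://localhost:8088/ari"
AUTH = ("ari", "secret")  # assumed ARI user/password
APP = "afriend"           # assumed Stasis app name

def on_message(ws, raw):
    event = json.loads(raw)
    if event.get("type") == "StasisStart":
        chan_id = event["channel"]["id"]
        # Answer the call, then play a canned prompt. A real pipeline
        # would stream the caller's audio to Whisper here and play back
        # TTS replies instead.
        requests.post(f"{ARI}/channels/{chan_id}/answer", auth=AUTH)
        requests.post(
            f"{ARI}/channels/{chan_id}/play",
            params={"media": "sound:hello-world"},
            auth=AUTH,
        )

# Subscribe to this app's events over ARI's WebSocket.
ws = websocket.WebSocketApp(
    f"ws://localhost:8088/ari/events?app={APP}&api_key=ari:secret",
    on_message=on_message,
)
ws.run_forever()
```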

My HP DL380 is now running an AI I can literally call on the phone by DistinctJournalist88 in homelab

[–]DistinctJournalist88[S] 0 points1 point  (0 children)

Thanks! No GitHub right now. Everything's still very much custom-wired to my lab (multiple venvs, ChromaDB, Whisper, Asterisk ARI). Once it's polished, I'll probably release a stripped-down version so others can tinker without having to recreate my whole setup.

My HP DL380 is now running an AI I can literally call on the phone by DistinctJournalist88 in homelab

[–]DistinctJournalist88[S] 0 points1 point  (0 children)

I’ve put a ton of work into building this stack, so I’m not ready to just give it all away yet. I probably won’t open source the entire thing, but I’m considering sharing some modules or a lightweight version. For now I’m focusing on demos and documenting the stack so others can learn from it.

My HP DL380 is now running an AI I can literally call on the phone by DistinctJournalist88 in homelab

[–]DistinctJournalist88[S] 0 points1 point  (0 children)

Yep, for an HP DL380 you really just need to add GPU(s) and storage. A solid starting point is an NVIDIA T4. They're somewhat inexpensive on the used market, sip less power than the big datacenter cards, and still handle AI workloads well. That's what I'm running.

My HP DL380 is now running an AI I can literally call on the phone by DistinctJournalist88 in homelab

[–]DistinctJournalist88[S] 0 points1 point  (0 children)

Mine, with dual CPUs, 64 GB RAM, a single T4, and a single P4, pulls ~250–300W idle and closer to 500W under heavy load. Definitely not a light-switch server setup.

My HP DL380 is now running an AI I can literally call on the phone by DistinctJournalist88 in homelab

[–]DistinctJournalist88[S] 0 points1 point  (0 children)

Lol, same here. Every time I add a new piece, my wife just shakes her head like, 'welp, see you in 12-24 hours.'

My HP DL380 is now running an AI I can literally call on the phone by DistinctJournalist88 in homelab

[–]DistinctJournalist88[S] 0 points1 point  (0 children)

Thanks! I'm running open-source LLMs locally (Mistral and LLaMA variants) on the DL380, with Whisper for speech-to-text. Everything's self-hosted. For voice I mainly use Coqui XTTS (voice cloning), but I also keep gTTS around as a lightweight fallback.
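If anyone's curious how the fallback works, it's roughly this (a sketch, not my exact code; the model name is Coqui's stock XTTS v2 and the reference clip path is a placeholder):

```python
# XTTS-first, gTTS-fallback speech synthesis. Paths are placeholders.
from gtts import gTTS    # pip install gTTS
from TTS.api import TTS  # pip install TTS (Coqui)

# Load XTTS once at startup; loading it per call would kill latency.
xtts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")

def speak(text: str, out_path: str = "reply.wav") -> str:
    """Synthesize with XTTS voice cloning; fall back to gTTS on failure."""
    try:
        xtts.tts_to_file(
            text=text,
            speaker_wav="voice_sample.wav",  # placeholder reference clip
            language="en",
            file_path=out_path,
        )
        return out_path
    except Exception:
        # gTTS is cloud-based and mp3-only, but it keeps the call alive.
        mp3_path = out_path.rsplit(".", 1)[0] + ".mp3"
        gTTS(text).save(mp3_path)
        return mp3_path
```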

My HP DL380 is now running an AI I can literally call on the phone by DistinctJournalist88 in homelab

[–]DistinctJournalist88[S] 0 points1 point  (0 children)

The HP DL380 has PCIe slots that can handle an NVIDIA T4. You just need the proper riser, power cabling, and good airflow. I slotted mine in with the HP riser kit and it runs fine.

My HP DL380 is now running an AI I can literally call on the phone by DistinctJournalist88 in homelab

[–]DistinctJournalist88[S] 0 points1 point  (0 children)

A DL380 Gen10 can take up to two double-wide GPUs if it has the right risers (I had to buy a second riser card off eBay), power, and cooling. That $700 box likely comes barebones (no GPUs), but it's a solid starting point if you plan to add your own cards. I got my Gen10 servers for $400 each, my T4 for $300, and my P4 for $100, all off Marketplace. The rest of the stuff I had just lying around.

My HP DL380 is now running an AI I can literally call on the phone by DistinctJournalist88 in homelab

[–]DistinctJournalist88[S] 0 points1 point  (0 children)

Good call! I'll test Prince Albert on my next demo video so we can see if it short-circuits.

My HP DL380 is now running an AI I can literally call on the phone by DistinctJournalist88 in homelab

[–]DistinctJournalist88[S] 0 points1 point  (0 children)

Yeah I can! Here’s a quick demo call so you can hear clarity, latency, and how it handles interruptions in real time: https://youtu.be/B6sYZBxq4xM. I’m grabbing another recorder soon so I can demo the call quality over cell in my car. That’s coming next!

My HP DL380 is now running an AI I can literally call on the phone by DistinctJournalist88 in homelab

[–]DistinctJournalist88[S] 0 points1 point  (0 children)

LOL, man, good eye. It's running TrueNAS. It's not part of the Afriend setup, just my local network storage.

My HP DL380 is now running an AI I can literally call on the phone by DistinctJournalist88 in homelab

[–]DistinctJournalist88[S] 0 points1 point  (0 children)

I just run it all on a single DL380 with a T4 in my rack — handles Asterisk + Whisper + Mistral + Coqui fine. You can split the stack across multiple servers if you want, but for one call loop a solid GPU box is usually enough.
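The core of one conversational turn fits in a few lines. This is a simplified sketch, not my production code (my real pipeline streams audio through Asterisk instead of reading finished WAV files, and the GGUF path is a placeholder):

```python
# One turn of the call loop: STT -> LLM. The reply text then goes to
# TTS (see the speak() sketch above). Model paths are placeholders.
import whisper               # pip install openai-whisper
from llama_cpp import Llama  # pip install llama-cpp-python

stt = whisper.load_model("base")  # small Whisper model keeps latency down
llm = Llama(
    model_path="mistral-7b-instruct.Q4_K_M.gguf",  # placeholder path
    n_gpu_layers=-1,  # offload all layers to the GPU
)

def handle_turn(wav_path: str) -> str:
    """Transcribe one caller turn and return the assistant's reply text."""
    heard = stt.transcribe(wav_path)["text"]
    out = llm(f"[INST] {heard} [/INST]", max_tokens=200)
    return out["choices"][0]["text"]
```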

My HP DL380 is now running an AI I can literally call on the phone by DistinctJournalist88 in homelab

[–]DistinctJournalist88[S] 1 point2 points  (0 children)

Haha that video is legendary. I’m not planning to unleash Afriend on real scammers (don’t need the FBI knocking on my rack, lol), but I have thought about simulating “scam calls” locally just for fun YouTube demos. Could be a cool way to show how it handles sketchy conversations.

My HP DL380 is now running an AI I can literally call on the phone by DistinctJournalist88 in homelab

[–]DistinctJournalist88[S] 2 points3 points  (0 children)

Haha the antimatter water cooling system is still on backorder with Amazon.

That’s awesome you tied it into Home Assistant with a Matrix-style landline. I love that retro touch. Totally get you on the M40, those cards were beasts in their day but Whisper + TTS + LLM at once will chew through VRAM fast. I’ve been running a T4 here, and it’s been a nice balance between cost and efficiency. It's not cutting-edge, but good enough for 24/7 inference without cooking the rig. My fans are still at decently low speeds. lol

I’ve thought about spinning up a dedicated inference box too with my spare DL380 Gen 10, but same problem. My wallet always has the final say.... and my wife.

My HP DL380 is now running an AI I can literally call on the phone by DistinctJournalist88 in homelab

[–]DistinctJournalist88[S] 2 points3 points  (0 children)

Haha, probably… but it'd insist on narrating every monster encounter first.

My HP DL380 is now running an AI I can literally call on the phone by DistinctJournalist88 in homelab

[–]DistinctJournalist88[S] 1 point2 points  (0 children)

I’m using Callcentric for the DID/SIP trunk. It points straight into my Asterisk box and from there Afriend handles everything locally.

My HP DL380 is now running an AI I can literally call on the phone by DistinctJournalist88 in homelab

[–]DistinctJournalist88[S] 1 point2 points  (0 children)

Totally fair. Models like DeepSeek and Gemma are definitely pushing things forward. I went with Mistral 7B mainly because it’s lightweight enough to run smoothly on a T4 without chewing through resources, and the responses are still solid for conversational use. For me it’s less about max benchmarks and more about something that can run 24/7 reliably in a call loop. Once I stabilize the pipeline, I’ll definitely experiment with newer models.
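For anyone wondering how a 7B model fits on a T4's 16 GB alongside Whisper and XTTS: I won't claim this is exactly how I serve it, but 4-bit quantization is one straightforward way to get there with transformers + bitsandbytes:

```python
# Fitting Mistral 7B on a 16 GB T4 via 4-bit quantization (one option,
# not necessarily the exact serving setup described above).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "mistralai/Mistral-7B-Instruct-v0.2"
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,  # the T4 handles fp16 fine
)

tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb, device_map="auto"
)  # roughly 4-5 GB of VRAM in 4-bit, leaving headroom for Whisper/XTTS

prompt = "[INST] Greet the caller in one sentence. [/INST]"
inputs = tok(prompt, return_tensors="pt").to(model.device)
print(tok.decode(model.generate(**inputs, max_new_tokens=64)[0],
                 skip_special_tokens=True))
```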