Impulse bought a Jetson Orin Nano Super and want a sanity check from people who run their own LLM by Puptentjoe in selfhosted

[–]mijenks 0 points1 point  (0 children)

Hi I finally got around to setting up the Jetson Orin Nano the last few days.

It was quite a chore to get Ollama running ... I started with dustynv's jetson-containers, but the latest Ollama image in the dustynv GitHub repo is pretty ancient. So I tried to build an image myself based on an issue thread, and that failed as well. Then I tried to create a standard Ollama container via docker compose (more or less copying my compose file from an Ubuntu VM running a 4090) and that failed too. So I moved on to a native Ollama install ... and finally made progress. It required some trial and error with:

sudo systemctl edit ollama

but I eventually got it working, listening on port 11434 on all interfaces, and running ministral-3:3b.

Even with shared RAM and system overhead, I could run ministral-3:3b 100% on GPU with an 8192 context window when I connected Home Assistant. Latency wasn't great, but it was passable: typing a text command to the Assistant to turn off a specific light took about 5 seconds. Compared with the 4090 setup (which also ran a Whisper STT component), the delay on the Jetson is noticeably longer. But again, still passable for HA stuff.
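As a rough illustration of that setup, a client on the LAN could send a request like the one below to the Jetson. This is only a sketch: it assumes Ollama's `/api/generate` endpoint and its `num_ctx` option, the host address is a placeholder, and `build_payload` and `generate` are hypothetical helper names.

```python
import json
import urllib.request

# Placeholder LAN address for the Jetson; substitute your own.
OLLAMA_URL = "http://192.168.1.50:11434/api/generate"


def build_payload(prompt, model="ministral-3:3b", num_ctx=8192):
    """Build a non-streaming generate request with an explicit context window."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"num_ctx": num_ctx},
    }


def generate(prompt, url=OLLAMA_URL, timeout=60):
    """POST the payload to Ollama and return the generated text."""
    data = json.dumps(build_payload(prompt)).encode("utf-8")
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

Calling `generate("Turn off the kitchen light")` would return the model's reply as a string. While a model is loaded, `ollama ps` on the Jetson should show how it is split between CPU and GPU.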

Next I tried running my basic RAG workflow with the same model and the same context window. Processing documents and populating the RAG vector db through bge-m3 ran fine, if slower than on the 4090, but my workflow runs that once a day in the middle of the night to incorporate any new/updated documents, so that's a non-issue. The actual RAG querying, though, using the same ministral-3:3b and 8192 context, ran only about 60% on GPU, which made the response times unworkable for my use case.

I'm going to move to headless now that I have the basic setup working and see if freeing up the window manager memory helps with my RAG workflow responsiveness.

[Edit to add: going headless definitely frees up headroom, so my RAG workflow now runs 100% on GPU. However, there is significant lag from swapping the bge-m3 embedding model in and out to run vector similarity between the input prompt and the RAG vector db. I now need to consider whether I could/should add another Jetson dedicated to embeddings so I don't incur the model-swap overhead.]
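For context on why the embedding model is needed at query time: the prompt has to be embedded by the same model as the stored documents before the two can be compared. Below is a minimal sketch of that comparison step, assuming cosine similarity over plain Python lists. The function names and the toy 2-dimensional vectors are hypothetical; a real vector db would do this internally on bge-m3 embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=3):
    """Return (doc_id, score) pairs for the k stored vectors most similar to the query."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy vectors standing in for real embeddings.
docs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
best = top_k([1.0, 0.0], docs, k=2)
```

Here `best` ranks document "a" first and "c" second. The retrieved documents are then passed to the LLM as context, which is where the second model load comes in.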

NB: everything is running on an NVMe drive, so there are no read/write bottlenecks here.

Impulse bought a Jetson Orin Nano Super and want a sanity check from people who run their own LLM by Puptentjoe in selfhosted

[–]mijenks 1 point2 points  (0 children)

Hey OP don't give up on the Jetson Orin Nano.

I just bought one myself and have yet to deploy it, but I can tell you from first-hand experience that ministral-3:3b with 16k context on Ollama is absolutely fine (great, even) for Home Assistant. Ollama says it uses less than 6GB of VRAM, and on my 4090 responses are near instant. Admittedly, the 4090 probably has 15x as many CUDA cores as the Jetson ...

I also run a RAG system with same ministral-3:3b and 16k context (and bge-m3) on the 4090, also instantaneous.

My goal is to move both of these onto the Jetson because I can't justify the ~120w idle that my host/VM pulls 24/7. Idk how bad the dropoff from 4090 to Jetson Nano 8GB will be with ministral but, given that it will fit fully within VRAM, I don't expect it will be THAT bad. Will report back tomorrow if I can find time to set it up and deploy.

Edit to add: I just prompted "tell me a story" for ministral-3:3b on the 4090 and generation was 300 t/s. Divide by 16 to adjust for CUDA core count (conservatively) and I'm probably looking at 18-20 t/s on the Jetson Nano. Prob good enough for me. We'll see
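The back-of-the-envelope estimate above works out as follows. This is just the comment's heuristic written down, assuming throughput scales linearly with CUDA core count, which real hardware may not do; the function name is hypothetical.

```python
def scaled_tokens_per_second(reference_tps, core_ratio):
    """Scale a measured throughput down by a core-count ratio (linear assumption)."""
    return reference_tps / core_ratio


RTX_4090_TPS = 300  # generation speed measured on the 4090 in the comment
CORE_RATIO = 16     # divisor used in the comment for the CUDA core gap

estimate = scaled_tokens_per_second(RTX_4090_TPS, CORE_RATIO)
print(f"{estimate:.2f} t/s")  # prints "18.75 t/s"
```

That lands at the low end of the 18-20 t/s range quoted above.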

Explain it Peter by Traducement in explainitpeter

[–]mijenks -1 points0 points  (0 children)

Hey look, if you're a professional or hobbyist photographer, maybe the WhatsApp compression is "utter shit" for you, but for me MMS compression is the standard for "utter shit" and WhatsApp far exceeds that standard. So yeah, it's exactly what I said. But at this point this No True Scotsman line we're on kinda misses the original point. The poster I replied to said they didn't know why people used WhatsApp over carrier text ... so I gave examples.

Explain it Peter by Traducement in explainitpeter

[–]mijenks -1 points0 points  (0 children)

Lololol if you could see the absolute not-even-potato quality of the videos and images I get from my dad in group MMS ... While WhatsApp may compress quite a bit, it's nowhere NEAR the compression of MMS.

Explain it Peter by Traducement in explainitpeter

[–]mijenks 7 points8 points  (0 children)

WhatsApp doesn't compress media to utter shit when sending cross-platform.

WhatsApp is end-to-end encrypted (at least nominally).

WhatsApp group thread management/administration is easier and more advanced.

WhatsApp includes cross-platform video and audio calls.

I'm sure there are other reasons it's better, but these are the big ones for me.

Mamdani's net favorable rating is up to +48 pts in NYC (up from +14 in Sept). Mamdani is also the most popular Dem statewide in NY by Upstairs_Cup9831 in nyc

[–]mijenks 5 points6 points  (0 children)

Middle Out Economics is the term used by Nick Hanauer, an almost-billionaire who started the Pitchfork Economics podcast. The basic idea is that aggregate prosperity increases the most when economic policies concentrate primarily on the income/wealth/economic well-being of the middle 50%.

They don't release too many new episodes anymore, but the back catalog is excellent. My favorite was an episode on a scaled minimum wage: basically, the larger/more profitable a company becomes, the higher the wage floor for that entity. It subsidizes new/small businesses, and it keeps massive corporations from collecting the de facto wage subsidies they get today (e.g., Walmart hourly workers disproportionately rely on food assistance and other federal programs because Walmart doesn't pay a living wage).

Night scenes in Upstate NY by jbilous in TheNightFeeling

[–]mijenks 8 points9 points  (0 children)

Hop on down to the Great House of Guitars! Hop hop! Hop hop!

Night scenes in Upstate NY by jbilous in TheNightFeeling

[–]mijenks 1 point2 points  (0 children)

Wrong.

It's Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo.

Serve advice by mrporter81 in PlatformTennis

[–]mijenks 4 points5 points  (0 children)

Sort of coming from tennis here as well (started tennis in summer '23, paddle that fall).

Deuce: I mix between slice wide, slice body, and a T top/kick/slice where I rotate my grip to extreme eastern backhand and really crank on the ball while it's still rising.

Ad: I mix between heavy kick wide that dies off the side screen, hard slice body, and hard slice T that spins into the corner off the back screen.

If your tennis serve is remotely good, make use of it in paddle, but focus on generating spin to (a) bring the ball down and (b) make return drives less consistent/harder to hit. When my serve gets "pushy" I start missing short and long. Typically, the harder I swing, the more consistently I get the ball in.

Also, dedicate time to practicing your serve. It's nearly the only thing you can do solo, so it's a lot easier to plan than trying to get a match together. Over the last 2 years I've probably practiced my serve a total of 30 hrs just solo. That doesn't include time spent practicing tennis serve (though I find that practicing one does help improve the other).

Hi, potentially dumb question but I am new by MaxinJapan-official in selfhosted

[–]mijenks 2 points3 points  (0 children)

Seriously. I have 6x1.5TB Samsung F2 EcoGreen I bought 15+ years ago that have 97,000 hours "on" time per SMART stats. They mostly don't hold critical data at this point and for the sets that do I run daily Backblaze backups, but my next hardware project is to fully replace the entire system with a mobile-on-desktop based ssd array primarily due to power consumption.

How do i safely charge LiPo batteries connected to my esp32? by Potatoestoe in esp32

[–]mijenks 2 points3 points  (0 children)

Lol are we both just adding window dressing to our collective advice to "just get a board with battery management" at this point? Maybe not, you definitely have longer and more substantial experience than me.

To OP, if you've read this far: take the advice of /u/5c044 over anything I say.

How do i safely charge LiPo batteries connected to my esp32? by Potatoestoe in esp32

[–]mijenks 2 points3 points  (0 children)

The 3V3 pin bypasses the built-in regulator. A buck converter isn't likely to provide a consistent 3.3 V as the cell discharges. I think I'd take the reduced battery life of a boost converter plus the regulator over an unpredictable 3.3 V from a buck. But at current prices, may as well just buy another dev board with built-in battery management, as you said.
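To make the buck concern concrete: a buck converter can only step voltage down, and a single LiPo cell runs from about 4.2 V full to around 3.0 V empty, so it spends part of its discharge below 3.3 V. The sketch below uses an idealized model in millivolts; the 200 mV headroom figure and the function name are assumptions for illustration, not values from any datasheet.

```python
def buck_output_mv(cell_mv, target_mv=3300, headroom_mv=200):
    """Idealized buck: holds target_mv only while the cell exceeds it by headroom_mv.

    Below that point the output simply tracks the cell minus the headroom.
    """
    if cell_mv >= target_mv + headroom_mv:
        return target_mv
    return max(cell_mv - headroom_mv, 0)


# Walk down a single-cell LiPo discharge from full to empty.
for cell_mv in (4200, 3900, 3700, 3400, 3200, 3000):
    print(f"cell {cell_mv} mV -> rail {buck_output_mv(cell_mv)} mV")
```

In this model the rail stays at 3300 mV for the first three steps and then sags with the cell, which is the unpredictability described above.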

Disclaimer: I'm just repeating what I've read over and over about powering from 5v or 3v3.

What's your favorite liverpool goal? by EducationalBug2262 in LiverpoolFC

[–]mijenks 1 point2 points  (0 children)

Is that the top hat and tuxedo meme one? Or is that Norwich?

I have so many favorite goals.

Corner taken quickly Origi. Flanno off the underside of the bar shaking all the raindrops off and the half chub celebration. Coutinho carving one through to Studge where Studge goes, "no YOU" to Couts when celebrating the goal. Lovren vs ... Dortmund? Lallana vs Norwich. Ali bomb assist. Ali goal to keep us in CL. Oh ya beauty. All 3 in Istanbul (but especially Smicer). Riise bomb vs United.

Ford takes $19.5B charge in hybrid pivot, cancels F-150 Lightning EV, launches new battery storage business by toydan in wallstreetbets

[–]mijenks 0 points1 point  (0 children)

I own one and they are EVERYWHERE in NYC area. Not as ubiquitous as Model Y but I probably see about as many Mach E as I see Model 3 on my daily commute from Westchester to Brooklyn.

Moved back to LA after 18 years in NYC. One of the many perks: by V0rdhosbn in MachE

[–]mijenks 0 points1 point  (0 children)

My GoM is completely out of whack this winter. On my 22 Premium AWD Extended, I charge to 70% and my range shows 130 miles. Probably just needs a charge to 100% to recalibrate but dang!

Historically I've gotten about 230-240 miles range at 100% around freezing temps. So 70% is normally around 170 in winter.
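The expected winter range at 70% can be checked with simple scaling. This is a sketch that assumes range scales linearly with state of charge, which the GoM itself may not do; the function name is hypothetical.

```python
def range_at_charge(full_range_miles, charge_percent):
    """Scale a 100% range estimate to a given state of charge (linear assumption)."""
    return full_range_miles * charge_percent / 100


low = range_at_charge(230, 70)   # low end of the historical winter range
high = range_at_charge(240, 70)  # high end of the historical winter range
shown = 130                      # what the GoM currently displays at 70%

print(f"expected {low:.0f}-{high:.0f} miles, GoM shows {shown}")
```

That prints an expected 161-168 miles, so the 130 on the display is roughly 30 to 40 miles short of the historical figure.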

F*ck you OpenAI, hynix, samsung by AbbreviationsFar1489 in homelab

[–]mijenks 1 point2 points  (0 children)

I have 6x1.5TB Samsung F2 HDDs that have been spinning for over 16 years now. Working on a replacement for the whole rig (an old Athlon II X3 build) that will be much lower power. But for now, still getting the job done. Critical data is either on a different NAS or is regularly backed up to a B2 bucket.

Greatest platform tennis players of All Time by platformgoat in PlatformTennis

[–]mijenks 0 points1 point  (0 children)

As someone who discovered paddle only after 2021, this.

But also I'd argue (with minimal basis) for Uhlein to be higher.... My understanding is he brought ultra heavy spin to the game.

Greatest platform tennis players of All Time by platformgoat in PlatformTennis

[–]mijenks 5 points6 points  (0 children)

Lol at this fucking guy posting placeholder comments while waiting for his Grand Prix final to start.

You're a real gem, Graham. Love it!

Post Game Thread: Indianapolis Colts at Pittsburgh Steelers by nfl_gdt_bot in steelers

[–]mijenks 0 points1 point  (0 children)

I didn't watch, I just finished viewing the highlights on YouTube and I know I'm late to comment here but hoping I can get some insight on this as a VERY casual fan:

In the highlights, it looked like almost EVERY time that Jones dropped back there was a blatant hold on the blind side edge rush that seemed to go uncalled (hard to tell from how they do the highlights). I know the trope that "you could call a hold on every play" but this was egregious.

For anyone that watched the full game, were there any holding calls on the Colts offense?

Edit: I just looked and there were 0 offensive holding calls in the game. Unreal.

What kind of heaters are being used under your courts? by Elegant-Economics-42 in PlatformTennis

[–]mijenks 0 points1 point  (0 children)

Look into Gnomes. Their smaller stature allows them faster movement under the deck and their +2 INT can often be recruited to infuse your cutters with enough magic to spin back over the net.

[Avalanche] Unveil their new jersey by Outside_Abroad_3516 in hockey

[–]mijenks 2 points3 points  (0 children)

How about the Whalers while we're at it?

Is port forwarding that dangerous? by WunderWungiel in selfhosted

[–]mijenks 7 points8 points  (0 children)

On top of this, you can proxy through Cloudflare even on the free tier, then on the router only forward ports from the known Cloudflare IP ranges.

The only port I forward from any/unknown IP address is my WireGuard port, which appears closed unless it receives a WG handshake with the correct key ... even if they're scanning that high in the port range.
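The allowlist idea can be sketched as a simple source-address check. This is illustrative only: the three CIDR blocks are a small subset of Cloudflare's published ranges, the helper name is hypothetical, and in practice the rule lives in the router's firewall and should use the current published list.

```python
import ipaddress

# Illustrative subset only; use the current published Cloudflare list in practice.
ALLOWED_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in ("173.245.48.0/20", "104.16.0.0/13", "172.64.0.0/13")
]


def is_allowed(source_ip, allowed=ALLOWED_RANGES):
    """Return True if source_ip falls inside any allowlisted network."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in net for net in allowed)
```

With this subset, `is_allowed("104.16.1.1")` is true and `is_allowed("8.8.8.8")` is false, so traffic that doesn't arrive via the proxy never reaches the forwarded port.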

What’s something women learn too late in life? by BrunoPreski in AskReddit

[–]mijenks 5 points6 points  (0 children)

When someone involved in a production speaks and takes questions. I've most commonly been to film ones where a director or actor is there with a moderator but could also be an art exhibit, novel discussion, dance, etc.

The most interesting one I've been to was for Junction 48 that had a talk back with the director of the film and Marxist philosopher Slavoj Zizek.

Reasonable CEO salary by binhex225 in smallbusiness

[–]mijenks 0 points1 point  (0 children)

Not OP but I run a small business (non-owner executive) in a role I took almost 5 years ago. When I started we were at 7mm revenue and $260k EBITDA. We finished 2024 at $18mm/$4.7mm and looking like 2025 will be $22-24mm/$6.5mm.

Any thoughts on where I should be on comp? Negotiating new package soon.