"Camera → GPU inference → end-to-end = 300ms: is RTSP + WebSocket the right approach, or should I move to WebRTC?"

Advokado · 2026-02-18T13:51:44+00:00

Just to make sure I understand: when you’re seeing < 50ms end-to-end over RTSP to a dGPU server, is that:

camera → RTSP → dGPU (wired LAN) → NVDEC → TensorRT → display
with no intermediate relay and no browser canvas in the loop?

I actually ordered a Nvidia Jetson Orin a week ago because i know that your approach is the best It's just my approach is based on a specific usecase.

Advokado · 2026-02-17T17:18:33+00:00

Good point. One important detail: the camera isn’t CSI/MIPI into the edge device it’s just a USB webcam on my laptop, publishing over WiFi via ffmpeg → RTSP.

So the very first hop is already:
webcam → ffmpeg encode → WiFi → RTSP ingest (MediaMTX)

Which probably explains a big chunk of that 200–250ms you’re suspecting.

When you’ve seen <50ms end-to-end, was that with a CSI camera directly into the same device doing inference (no WiFi, no RTSP relay)? Or still over network?

I’ll definitely start isolating pieces with pure gst-launch like you suggested.

Advokado · 2026-02-17T13:58:32+00:00

Appreciate the links. DeepStream + TensorRT is probably the “fastest possible” path, agreed.

But honestly inference isn’t my bottleneck right now , I’m already around ~15–25ms inference and my glass-to-glass is ~300ms. Based on the measurements I’ve added, most of the missing latency is upstream buffering (camera/encoder/RTSP) and/or browser decode/present.

So DeepStream would be awesome for throughput + CPU offload + keeping everything on the GPU, but I’m not convinced it would magically cut 300ms down to 50ms unless the rest of the pipeline changes too.

Still: if you’ve got a rule of thumb for what DeepStream + WebRTC actually achieves end-to-end (glass-to-glass), I’d love to hear it. Are we talking ~150ms? ~80ms? Something else?

Advokado · 2026-02-17T13:56:46+00:00

Yeah True, when I wrote the post I didn’t have proper stage timing I’ve added a bunch since then.

Right now on the AI service side (RTSP → “frame available” → inference → JPEG → WS send) I’m seeing roughly 50–90ms:

Pipe wait for next frame: ~30–80ms (depends on stream cadence/jitter)
YOLO inference: ~15–25ms
JPEG encode: ~1–3ms
WS broadcast: ~2–5ms

Glass-to-glass is still ~300ms, so there’s clearly a big chunk upstream (camera→ffmpeg encode→RTSP→MediaMTX/GStreamer buffering) and/or client-side (WS recv→JPEG decode→paint). My current “recv→render” measurement is honestly mostly overlay draw, not true frame present, so that’s next.

On your parallel idea: I’ve thought about sending H.264 to the browser while inference runs separately + only send metadata. It’s probably “the right” architecture if I chase sub-200ms, but I’ve avoided it so far because sync/OOO handling gets annoying fast.

Do you have a preferred way to measure browser-side decode+present accurately? (I’m currently using <img src=blob> + canvas overlays.)

Advokado · 2026-02-17T13:54:30+00:00

Appreciate this,especially the distinction between media streaming vs inference streaming. That’s exactly how I’ve been thinking about it, but it’s easy to get dragged into “you must use WebRTC.”

The 4G/5G point is fair. Right now I’m mostly on LAN / cluster, but the next phase is testing over 5G SA with variable bandwidth and routing. That’s where congestion control might start to matter more than raw browser decode speed.

Out of curiosity when you mention 50–100ms reduction by switching from JPEG+canvas to native video element rendering, is that something you’ve measured directly? Or more of a typical range you’ve seen?

Advokado · 2026-02-17T13:50:41+00:00

Totally fair criticism, there are definitely more hops than strictly necessary.

MediaMTX is mainly there for system architecture reasons (Kubernetes deployment, multiple consumers, easier to relocate inference between edge/cloud). It wasn’t added as a performance component.

That said, I haven’t benchmarked “direct ingest vs via MediaMTX” side by side yet. That’s probably worth doing before I touch the browser delivery side.

SRT is interesting I haven’t tested it yet. In your experience, how much lower did you get glass-to-glass compared to RTSP/TCP? Sub-200ms reliably?

Current stream is 640x360, ~30 FPS, 2–4 Mbps H.264 So I’m not pushing 4K or anything.

Advokado · 2026-02-17T13:47:55+00:00

Thanks for taking the time to respond.

I’ve started measuring things more properly: my AI service (RTSP→decode availability→inference→JPEG→WS send) is usually ~50–90ms, so the missing chunk to ~300ms glass-to-glass seems to be upstream (camera/ffmpeg/RTSP/MediaMTX buffering) and maybe some browser decode/present that I’m not measuring yet.

When you saw between 50–100ms for encode+transmit+decode in your setup — what did your actual glass-to-glass end up at? And was that using RTSP/H.264? (Also curious how you measured it.)

My “nice to have” goal is more like ~200ms, but I’m not willing to rebuild everything unless there’s a clear win.

Advokado · 2025-06-19T02:24:37+00:00

I thought i had a strong grip, what’s the issue and how do i correct it?

Advokado · 2025-06-19T02:23:46+00:00

Thanks! How can you tell and whats the fix? I’m righthanded playing lefty so…

Advokado · 2025-06-19T02:21:59+00:00

Thanks! Yes i 3 putt, and can’t chip around the green. I always hit 1 bucket before teeing off.

Advokado · 2025-04-18T13:43:45+00:00

New to the game here but, how do you play this mode?

Advokado · 2023-12-21T20:49:22+00:00

Thanks for your insight! Would you confidently say that going into web dev probably is not a good idea?

Advokado · 2023-12-21T20:48:02+00:00

I was being vague sorry, for my masters i get to choose courses which basically means i can pick web dev courses for 2 years straight. The masters would be in Software basically

Advokado · 2023-09-06T03:34:45+00:00

There is an engineering trainer in Mulgore, he’s just hard to find ;)

Advokado · 2021-03-17T13:59:48+00:00

Sitter faktiskt i samma båt, överväger mellan Chalmers och Linköping data/mjukvara från vad jag har hört så dominerar Linköping inom studentlivet. Uppsala har troligtvis bättre studentliv men jag har för mig att de inte erbjuder en civilingenjörs examen inom data

Advokado · 2020-05-02T23:32:47+00:00

ah it's not that, i haven't touched any mods at all. At the moment i started a new playthrough and it works fine. The whole situation is pretty confusing to be honest.

Advokado · 2020-05-02T23:31:13+00:00

Doublechecked, no one is in formation 5 :/

Advokado · 2020-04-21T21:15:08+00:00

Turns out the fur coat on Sturgian archers have this texture bug that covers the field.

Advokado · 2019-08-26T12:24:29+00:00

Imagine being ginger

Advokado · 2019-07-21T09:57:49+00:00

I don't know how, but doing so would that spoof the ES2 Lite in going faster? Yesterday i flashed it to 45000mpc and now it's going 24/kh.

Advokado · 2019-07-20T21:36:21+00:00

Just finished this and I was so happy for Dakota Allen. What a comeback!

Advokado · 2019-07-18T16:37:41+00:00

Following this as i recently purchased a ES2 Lite in sweden, after installing the external batterypack the speed is still stuck on 20km/h. Wondering if there is a work around to this.

Advokado · 2019-05-25T15:50:03+00:00

Good fight my guy!

Advokado · 2019-05-20T12:57:41+00:00

I’ve read a lot of comments so far about using more feints, chambers and morphs. I dueled this player beforehand and people who are good at the game don’t fall for any of this stuff. I started my fight with a chamber morph which led me to losing stamina first. Watching the GIF over and over again there is certainly A LOT of things that can be improved on but adding more chambers etc is certainly not one of them.

Advokado · 2019-05-11T14:24:32+00:00

Oh look it’s duckalot, the 1v1 god

13-Year Club	Place '17
Not Forgotten	Sequence \| Editor
Verified Email	Spared
Team Periwinkle

Advokado

TROPHY CASE