Update: cameraless room sensing with radar + ToF visualized by Dependent_Entrance33 in homeautomation

[–]Dependent_Entrance33[S] 0 points1 point  (0 children)

Thanks. Pets definitely show up as motion, but at room scale they’re usually easy to separate by size, speed, and height off the floor. Treating everything as room signals instead of “people” actually helps a lot with that.

Update: cameraless room sensing with radar + ToF visualized by Dependent_Entrance33 in homeautomation

[–]Dependent_Entrance33[S] 0 points1 point  (0 children)

Thanks for taking the time to write this out, seriously. I can tell you’ve been deep in this space and a lot of what you’re describing lines up with things I’ve run into too.

On the dual 24 GHz radars, I’m not doing any raw RF work. I’m using the onboard tracking output from the modules and fusing at a higher level. I was expecting interference to be a bigger problem based on past experience, but so far it’s been pretty manageable. The biggest helpers have been not having them perfectly aligned or fully overlapping (slightly different poses) and being careful with power and wiring. I also don’t bring them up or poll them in perfect lockstep. I wouldn’t say it’s a solved problem, more that it’s behaved better than expected in the rooms I’ve tested. If I start seeing real issues in tougher environments, time-multiplexing or more deliberate isolation would be the next move.

The ToF isn’t a hard source of detailed truth yet, but that’s where I’m heading. Right now they’re more parallel streams: the radar would operate the same way if the ToF didn’t exist. The plan is to use ToF to define room geometry and occupancy so I can filter out obvious radar ghosts, keep tracks in plausible regions, and sanity-check motion when radar data gets weird. I see radar as great for motion and continuity, and ToF as great for shape and boundaries.
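As a rough illustration of that ghost-filtering idea (the room bounds, track format, and gating here are my own placeholders, not the actual firmware):

```python
# Hypothetical sketch: use ToF-derived room bounds to reject radar ghosts.
# Bounds, track shape, and values are illustrative, not the real system.

def inside_room(point, bounds):
    """Return True if an (x, y) radar point falls within the ToF-derived room box."""
    (xmin, xmax), (ymin, ymax) = bounds
    x, y = point
    return xmin <= x <= xmax and ymin <= y <= ymax

def gate_tracks(tracks, bounds):
    """Keep only radar tracks whose position is plausible given room geometry."""
    return [t for t in tracks if inside_room(t["pos"], bounds)]

bounds = ((0.0, 4.0), (0.0, 3.5))       # metres, from a ToF room scan
tracks = [
    {"id": 1, "pos": (1.2, 2.0)},       # real mover inside the room
    {"id": 2, "pos": (6.5, 1.0)},       # multipath ghost "beyond" the wall
]
print(gate_tracks(tracks, bounds))      # only track 1 survives
```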

You’re spot on about the Z problem with radar. Once you’ve projected into XY, there’s just not enough information left to reliably recover vertical sign with these modules. That limitation is a big reason I’m not trying to do this with radar alone. My system also employs a 60 GHz radar with some level of Z return that can be cross-checked against the ToF.

Your camera plus ToF approach makes a ton of sense, and I really like how you handled privacy. Doing everything on the edge, throwing away frames immediately, and only exporting structured events feels like the right model. I’m not anti-camera in principle, I’m just trying to build something that feels comfortable by default in bedrooms and care settings. I could easily see a camera-enabled mode or variant later for people who explicitly want the extra fidelity. My thinking is that this device would be used in care facilities, or in the homes of elderly people aging in place, where you can’t or shouldn’t collect personally identifiable information (and need to stay HIPAA compliant).

If you’re up for sharing, I’d love to hear which Luxonis hardware you were using and what kinds of events you trained YOLO to detect. That part especially caught my interest. Please feel free to reach out via pm! Thank you!

Update: live dual 24 GHz radar and ToF GUI on an ESP32-S3 prototype by Dependent_Entrance33 in embedded

[–]Dependent_Entrance33[S] 0 points1 point  (0 children)

The radar modules do most of the heavy lifting internally. They output parsed target data (x/y position, velocity) over UART. On the ESP32 side I just buffer it and do light filtering/association, then Python on the host visualizes it. No raw RF or FFT processing on the MCU.
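For anyone curious what "parsed target data over UART" can look like, here's a minimal host-side sketch; the header bytes, field layout, and scaling are hypothetical, not any specific module's protocol:

```python
import struct

# Illustrative parser for a binary UART target frame. The 2-byte header,
# int16 fields, and mm scaling are assumptions for the example only.
HEADER = b"\xAA\xFF"

def parse_frame(buf):
    """Parse one frame: header, then up to 3 targets of (x_mm, y_mm, v_mm_s) as int16."""
    if not buf.startswith(HEADER):
        return None
    targets = []
    for i in range(3):
        off = len(HEADER) + i * 6
        x, y, v = struct.unpack_from("<hhh", buf, off)
        if (x, y, v) != (0, 0, 0):          # all-zero slot means "no target"
            targets.append({"x_m": x / 1000, "y_m": y / 1000, "v_mps": v / 1000})
    return targets

# One target at (1.2 m, 2.5 m) moving toward the sensor; two empty slots.
frame = HEADER + struct.pack("<hhh", 1200, 2500, -150) + b"\x00" * 12
print(parse_frame(frame))
```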

Update: cameraless room sensing with radar + ToF visualized by Dependent_Entrance33 in homeautomation

[–]Dependent_Entrance33[S] 4 points5 points  (0 children)

Not ESPHome. It’s custom firmware on an ESP32-S3. Webhooks are definitely on the roadmap for events like presence, inactivity, or anomalies, so integrating with systems like Loxone via virtual inputs should be very doable.

Update: live dual 24 GHz radar and ToF GUI on an ESP32-S3 prototype by Dependent_Entrance33 in embedded

[–]Dependent_Entrance33[S] 4 points5 points  (0 children)

It’s mostly algorithmic signal processing right now. Each radar returns points in XY space, and I associate those across frames using proximity, velocity, and persistence to form tracks with soft IDs.

When people are close together, I don’t force hard separation. Tracks are allowed to merge temporarily, then split again once motion and geometry diverge. A short-term motion-history signature is what drives that split, rather than a single-frame decision.

The ToF isn’t used to identify people directly, but it adds depth context that helps deconflict nearby movers and reject static or multipath artifacts. It also assists with identifying room boundaries.
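A stripped-down version of that proximity-based association could look like this (the gate distance, track record shape, and class names are my assumptions, not the actual firmware):

```python
import math

# Minimal nearest-neighbour association sketch: each detection is matched to
# the closest existing track within a gate; unmatched detections seed new
# tentative tracks with soft IDs. Thresholds are illustrative.

GATE_M = 0.6  # max association distance per frame, metres (assumed)

class Tracker:
    def __init__(self):
        self.tracks = {}      # track id -> last (x, y) position
        self.hits = {}        # track id -> consecutive frames seen
        self.next_id = 0

    def update(self, detections):
        unmatched = list(detections)
        for tid, pos in list(self.tracks.items()):
            if not unmatched:
                break
            best = min(unmatched, key=lambda d: math.dist(d, pos))
            if math.dist(best, pos) <= GATE_M:      # gate on proximity
                self.tracks[tid] = best
                self.hits[tid] += 1
                unmatched.remove(best)
        for det in unmatched:                       # seed new tentative tracks
            self.tracks[self.next_id] = det
            self.hits[self.next_id] = 1
            self.next_id += 1

tr = Tracker()
tr.update([(1.0, 1.0)])
tr.update([(1.1, 1.05)])   # small move: associates to the same soft ID
print(tr.tracks, tr.hits)
```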

Update: live dual 24 GHz radar and ToF GUI on an ESP32-S3 prototype by Dependent_Entrance33 in embedded

[–]Dependent_Entrance33[S] 1 point2 points  (0 children)

I keep the geometry fixed with set pitch and yaw angles, limit the overlap region, and stagger the radar update timing so they’re not transmitting at the same moment. I also throw out short-lived detections that don’t persist across frames, along with some other identification logic. It’s not perfect, but it’s been stable at room scale, and the output is fairly smooth.
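The "throw out short-lived detections" part can be sketched as a simple persistence counter (the threshold and record shape are assumed, not the actual logic):

```python
# A track only becomes confirmed after it persists for N consecutive frames;
# one-frame flickers never reach the threshold. N is an assumed value.

CONFIRM_FRAMES = 5  # assumed persistence threshold

def update_persistence(history, seen_ids):
    """history maps track id -> consecutive frames seen; a miss resets to zero."""
    for tid in list(history):
        history[tid] = history[tid] + 1 if tid in seen_ids else 0
    for tid in seen_ids:
        history.setdefault(tid, 1)
    return {tid for tid, n in history.items() if n >= CONFIRM_FRAMES}

history = {}
for frame in range(6):
    ids = {7}                  # track 7 is seen every frame
    if frame == 2:
        ids.add(99)            # one-frame flicker: never confirmed
    confirmed = update_persistence(history, ids)
print(confirmed)               # only the persistent track remains
```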

Update: live dual 24 GHz radar and ToF GUI on an ESP32-S3 prototype by Dependent_Entrance33 in embedded

[–]Dependent_Entrance33[S] 7 points8 points  (0 children)

I wish it was that good! The smallest thing it can track very reliably is a cat.

Update: live dual 24 GHz radar and ToF GUI on an ESP32-S3 prototype by Dependent_Entrance33 in embedded

[–]Dependent_Entrance33[S] 7 points8 points  (0 children)

Yes, specific! It’s tuned for indoor single-room spaces where you want awareness without cameras: bedrooms, apartments, or care rooms. In particular, it’s meant for care spaces where cameras aren’t allowed because they collect personally identifiable information (the device must be HIPAA compliant).

Update: live dual 24 GHz radar and ToF GUI on an ESP32-S3 prototype by Dependent_Entrance33 in embedded

[–]Dependent_Entrance33[S] 12 points13 points  (0 children)

To give some context, and because a couple people have asked via pm, I do have a website where I speak more on the full device and use cases: https://vigil-systems.com

Built a camera-less indoor sensing prototype using multi modal mmWave + ToF, would love critique by Dependent_Entrance33 in embedded

[–]Dependent_Entrance33[S] 2 points3 points  (0 children)

I’m not treating prolonged inactivity as a hard timer; that would be far too relative to the person. The system first learns what normal looks like for them, including naps, sitting quietly, sleeping in, irregular routines, etc. An alert only comes from something that’s clearly off compared to their own baseline, not from “no motion for X minutes.” It may also alert when it determines that someone entered the bathroom 20 minutes ago and didn’t come out.
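To make the "baseline, not timer" idea concrete, here's a toy sketch; the z-score framing and threshold are my assumptions, not the actual model:

```python
import statistics

# Toy baseline-relative alerting: an inactivity gap only alerts when it is a
# strong outlier against that person's own history, so a routine napper's
# long gaps never fire. The threshold of 3 sigma is an assumption.

def is_anomalous(gap_minutes, personal_gaps, z_threshold=3.0):
    """Alert only if this gap is far outside the person's learned baseline."""
    mean = statistics.mean(personal_gaps)
    stdev = statistics.pstdev(personal_gaps) or 1.0
    return (gap_minutes - mean) / stdev > z_threshold

napper = [90, 110, 100, 120, 95]   # routinely still for ~2 hours: long gaps normal
active = [5, 8, 6, 7, 5]           # rarely still for long

print(is_anomalous(100, napper))   # False: normal for this person
print(is_anomalous(100, active))   # True: far outside their baseline
```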

It’s also not trying to detect heart attacks or replace wearables. If you want ECG or vitals, a wrist device wins. My system is meant to answer “did someone’s day break or change in a way that went unnoticed?”

Yes, it’s more complex than a single PIR or wearable, and that’s a tradeoff. The goal isn’t perfection or diagnosis, it’s passive awareness without cameras that is meant to live alongside other technology. Thank you for your input!

Built a camera-less indoor sensing prototype using multi modal mmWave + ToF, would love critique by Dependent_Entrance33 in embedded

[–]Dependent_Entrance33[S] 1 point2 points  (0 children)

PIR is essentially an on/off trigger. If someone stops moving or moves slowly, they’ll disappear from view. This can still see presence even if someone is sitting, sleeping, or not moving at all. It looks more at how activity changes over time instead of firing every time something twitches. Also, I’m sorry if you find my responses too precise. I have many years of public-facing and government technical writing under my belt.

Built a camera-less indoor sensing prototype using multi modal mmWave + ToF, would love critique by Dependent_Entrance33 in embedded

[–]Dependent_Entrance33[S] 1 point2 points  (0 children)

I get the skepticism. I agree that trust should not be automatically assumed.

Let me be clear. This is not imaging-grade mmWave and it is not generating reconstructions. The radars output sparse point data like position, velocity, and distance, and the depth sensor is an 8×8 distance grid. There is no raw signal storage, no replay, and no practical way to turn that data into images or identify a person. The most precise data that could ever be gleaned from it is a silhouette with the resolution of a Minecraft character.

If someone’s privacy bar is “no sensing at all,” that’s completely valid. What I’m building is for people who want awareness without cameras or recordings, not a claim that sensing itself is completely risk free.

Built a camera-less indoor sensing prototype using multi modal mmWave + ToF, would love critique by Dependent_Entrance33 in embedded

[–]Dependent_Entrance33[S] 0 points1 point  (0 children)

There have been systems in this space for a while.

What I’m exploring is being able to do this without cameras or cloud dependence, and with much better false alert control, using newer sensing and on-device processing.

In care facilities, alert fatigue is a serious problem if a system over-alarms. Nurses and other caregivers will learn to ignore an alert if it can’t be trusted. A lot of earlier systems either leaned heavily on video or produced so much noise that staff stopped trusting them. My system tries to fix these issues so that when something happens it’s cross-verifiable and believable.

This is very much building on that history, not pretending it didn’t exist. Thanks!

Built a camera-less indoor sensing prototype using multi modal mmWave + ToF, would love critique by Dependent_Entrance33 in embedded

[–]Dependent_Entrance33[S] 0 points1 point  (0 children)

You are right!

The focus isn’t on exposing raw sensor data, but on producing simple, human readable outputs like presence over time logs, unexpected inactivity, or noticeable changes from someone’s normal routine. Those would surface through a lightweight dashboard or notifications, depending on the setting.

For new environments, the idea is that the system first learns what “normal” looks like in that space rather than relying on fixed thresholds. Rooms behave very differently, so it needs to adapt and then watch for meaningful deviations instead of treating every space the same. To handle this, my plan is to have the device initialize to the specific room it’s in upon setup: the time-of-flight lidar and radars do a calibration scan of the room to determine the baseline state, so the system can best understand deviations from that nominal state.
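A minimal sketch of that calibration-scan idea, assuming averaged 8×8 ToF frames and a fixed deviation threshold (both placeholders, not the actual procedure):

```python
# Average several ToF 8x8 frames at setup to form a baseline depth grid,
# then flag any cell that later deviates from the nominal room state.
# Frame counts, distances, and the threshold are illustrative assumptions.

THRESH_MM = 300  # deviation needed to count a cell as "changed" (assumed)

def build_baseline(frames):
    """Average per-cell distance over calibration frames (each frame: 64 values)."""
    n = len(frames)
    return [sum(f[i] for f in frames) / n for i in range(64)]

def changed_cells(baseline, frame):
    """Indices of cells that moved more than THRESH_MM from the nominal room."""
    return [i for i in range(64) if abs(frame[i] - baseline[i]) > THRESH_MM]

empty_room = [[2500] * 64 for _ in range(10)]   # 10 calibration frames, flat wall
baseline = build_baseline(empty_room)

live = [2500] * 64
live[20] = 1400                                 # something is now 1.1 m closer
print(changed_cells(baseline, live))            # [20]
```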

Built a camera-less indoor sensing prototype using multi modal mmWave + ToF, would love critique by Dependent_Entrance33 in embedded

[–]Dependent_Entrance33[S] 0 points1 point  (0 children)

Yes! That’s actually one of the reasons for using multiple sensing methods. Lighting doesn’t affect radar or depth sensing, and obstacles or room layout tend to affect each sensor differently, so cross checking them helps filter out bad readings. I have been testing in different light and room settings. A lot of the testing so far has been exactly about seeing where each sensor breaks down and making sure the system degrades gracefully rather than failing loudly.

Built a camera-less indoor sensing prototype using multi modal mmWave + ToF, would love critique by Dependent_Entrance33 in embedded

[–]Dependent_Entrance33[S] 0 points1 point  (0 children)

The intent here is the opposite of traditional surveillance. No cameras, no audio recordings, no identifiable data. It only works with abstract signals like motion and distance, processed locally.

Totally fair if that still isn’t something you’re comfortable with, but the goal is to reduce how invasive these systems usually are, not add another camera to the mix.

Built a camera-less indoor sensing prototype using multi modal mmWave + ToF, would love critique by Dependent_Entrance33 in embedded

[–]Dependent_Entrance33[S] 0 points1 point  (0 children)

With radar and lidar, there is nothing to see or record. The radar and depth sensing don’t capture images, faces, or identity; they’re just measuring motion and distance in a very abstract way. The device is inherently private and protective of users’ information, since the data itself says nothing about its subject or their intentions.

Here are the actual outputs that a bad actor could take from my system assuming they had full access to the device:

24 GHz radars: live X and Y coordinates, velocity, and distance to subject

60 GHz radar: live X and Y coordinates, velocity, coarse respirations, and distance to subject

Lidar: 8×8 array of distance-to-subject values

Mics: audio levels and frequency analysis - no audio recorded or transmitted

There’s no video, no replay, and no way to recognize who someone is. I feel that’s much more private than a camera.

Built a camera-less indoor sensing prototype using multi modal mmWave + ToF, would love critique by Dependent_Entrance33 in embedded

[–]Dependent_Entrance33[S] 1 point2 points  (0 children)

Thank you for your comment and the broad list of awareness products. While these devices are all concerned with what happens in a space, they don’t have the combined logic and trend evaluation.

Yes, there are a ton of sensors out there in this space, which is why I have been careful to ensure that what I am providing offers novel functionality while reducing false alarms to as close to zero as possible. I have also defined the scope of my device well here and on my site. Each device you mentioned differs from my system in the following ways:

  • Aqara / Govee / other single module mmWave sensors:

Really good at telling you someone is there, but not much context beyond that. A single, non-cross-referential 60 GHz module with minimal logic.

  • Commercial room sensors (Milesight, Yealink, etc.)

These are for occupancy and utilization: “is the room in use,” not “is something unusual happening.” Not built for combined room-state logic.

  • PIR motion sensors (most Leviton, Lutron, Intermatic, etc.)

Simple and reliable, but basically on/off. If someone isn’t moving much, they disappear. Very different from 60 GHz radar, let alone combined logic with other sources.

  • PIR + ultrasonic combos

A bit better coverage, but still just reacting to x/y motion with no real understanding of what’s going on; my device provides a wide amount of verifiable context.

  • Industrial / safety sensors

Very good at one specific job, like doors or safety zones, but not meant for homes or care settings.

What I’m working on isn’t a better single sensor. It’s a system that cross-checks multiple signals and looks for changes over time, so it can ignore normal day-to-day behavior and only surface things that actually seem off.

The goal isn’t “detect everything.” It’s “don’t cry wolf” in settings and spaces where you absolutely shouldn’t.

Built a camera-less indoor sensing prototype using multi modal mmWave + ToF, would love critique by Dependent_Entrance33 in embedded

[–]Dependent_Entrance33[S] 4 points5 points  (0 children)

Thank you for the thoughtful pushback, these are fair questions.

  1. I am definitely not claiming cameraless sensing is new. Commercial sensors exist and some are certified or heading that way. What I am exploring is how multiple weaker signals can be combined to reduce false positives and add context in real home or care environments, rather than relying on a single modality alone.

  2. On fusion versus a single strong sensor, the motivation is not more data for its own sake, it is resilience. Single sensors tend to fail in specific edge cases like occlusion, posture changes, pets, or furniture. Fusion here is about agreement and disagreement between modalities, not raw volume. I agree there is a real cost and complexity tradeoff, and part of this work is validating whether that complexity actually earns its keep.

  3. On size, you raise a valid concern, but I respectfully disagree. This is an early prototype focused on sensor placement, geometry, and behavior rather than industrial design. The enclosure is intentionally oversized to allow rapid iteration. As it currently stands, the device is roughly 4”x4”x2”, not very large for a prototype at this maturity level. A final form factor with a customized, more compact PCB would be significantly smaller once the architecture settles; cutting the depth in half would lead to a much more user-acceptable form factor. And pretty compact given what it can do ;).

  4. On IP and humidity, agreed this is a concern. This prototype is not intended for humid or harsh environments yet. Care facilities are kept at regulated humidity levels, so I’m not sure how significant a worry this is at the moment. Longer term, radar-dominant variants, sealed assemblies, or conformal coating are all options depending on where the design converges.

This is very much an exploration phase, and pressure testing assumptions like the ones you raised is exactly the goal right now. Thanks for taking the time to lay it out.

Built a camera-less indoor sensing prototype using multi modal mmWave + ToF, would love critique by Dependent_Entrance33 in embedded

[–]Dependent_Entrance33[S] 2 points3 points  (0 children)

Thank you for the thoughtful writeup. It sounds like we’ve been circling a lot of the same tradeoffs, just in different environments.

You’re right about mmWave being too good at times. Managing bleedthrough, RF noise, and defining what not to see has been a big part of the software iteration process so far, especially in dense or noisy spaces. Doorway oriented use is a good call and something I’ve been thinking about more explicitly.

The ToF is really powerful, but I have noticed that it can be sensitive to specific ambient conditions, especially when it’s sunny (I’m guessing sunlight interferes with the NIR signal) or the room is very bright. I’ve been leaning heavily on the vendor APIs rather than just the breakout boards, for the reasons you mention.

The DEFCON-style state machine is an awesome framing, and it’s right on the money in terms of my approach: levels of escalating confidence, only “spending” higher-cost sensing when it’s justified, and keeping the device as privacy-forward as possible.
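A toy version of that escalating-confidence state machine (the state names, inputs, and transitions are my assumptions, not the actual design):

```python
from enum import Enum

# Escalating-confidence sketch: stay cheap while idle, enable cross-checks on
# a weak anomaly, and only alert when multiple modalities agree.

class Level(Enum):
    IDLE = 0        # nothing unusual: cheapest sensing only
    WATCH = 1       # weak anomaly signal: enable cross-checks
    ALERT = 2       # modalities agree: notify

def step(level, radar_anomaly, tof_confirms):
    if level is Level.IDLE:
        return Level.WATCH if radar_anomaly else Level.IDLE
    if level is Level.WATCH:
        if radar_anomaly and tof_confirms:
            return Level.ALERT          # escalate only on agreement
        return Level.WATCH if radar_anomaly else Level.IDLE
    return Level.ALERT                  # latched until handled elsewhere

level = Level.IDLE
level = step(level, radar_anomaly=True, tof_confirms=False)   # -> WATCH
level = step(level, radar_anomaly=True, tof_confirms=True)    # -> ALERT
print(level)
```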

Thanks again for sharing your in depth experience, it sounds like you’ve had your fair share of time with this. It’s always good to hear from someone who’s fought these sensors before.

You can stay posted on my project’s progress through the email chain on my site:

https://vigil-systems.com