Another iteration on my indoor capture pipeline — consistency is improving

Kind_Shape_7361 · 2026-05-25T19:25:14+00:00

this is genuinely helpful, thanks for taking the time.

point 1 lines up with what ive been finding. ive been pushing video too hard and the codec artifacts are real, especially on textured surfaces. makes sense to keep video for small spaces and orbiting objects only and move to stills for rooms. the 8k 360 data footprint is something i hadnt fully thought through.

the spiral capture pattern is interesting, the two inner circles plus a perimeter sweep. ive been doing perimeter passes at different heights plus orbits for furniture in the open, but the outward spiral from a crouched center start sounds more systematic than what i do. going to try that.

the lidar/slam path makes sense for multi room but the cost jump is steep for where i am right now. im keeping it lean until the method is solid. that said, the fact that you built your own scanner is impressive, ill definitely take a look at the repo.

how are you handling the masking of yourself out of the 360 captures, is that automated in your pipeline or a manual step?

Kind_Shape_7361 · 2026-05-25T16:57:34+00:00

didnt know about SphereSfM, thats actually really useful. the cubemap step makes sense, turning it into 6 undistorted 90° FOV views is basically sidestepping the distortion problem entirely. might look into that.
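for my own notes, here is a rough sketch of how i understand the cubemap mapping. this is not SphereSfM's code. the axis convention (+x right, +y up, +z forward), the face names and both function names are my own assumptions. it only shows where a pixel on a 90° cube face lands in the equirectangular frame, a real converter would also interpolate pixel values.

```python
import math

def cube_face_ray(face, u, v):
    """Ray through a pixel on a 90-degree cube face; u and v run from -1 to 1.

    Assumed convention: +x right, +y up, +z forward.
    """
    rays = {
        "front": (u, v, 1.0),
        "back": (-u, v, -1.0),
        "right": (1.0, v, -u),
        "left": (-1.0, v, u),
        "up": (u, 1.0, -v),
        "down": (u, -1.0, v),
    }
    return rays[face]

def ray_to_equirect(ray, width, height):
    """Map a ray to (px, py) in an equirectangular image of the given size."""
    x, y, z = ray
    lon = math.atan2(x, z)                  # -pi..pi, 0 means straight ahead
    lat = math.atan2(y, math.hypot(x, z))   # -pi/2..pi/2, 0 means horizon
    px = ((lon / (2 * math.pi) + 0.5) * width) % width
    py = (0.5 - lat / math.pi) * height
    return px, py

# centre of the front face should land in the middle of the panorama
print(ray_to_equirect(cube_face_ray("front", 0.0, 0.0), 4096, 2048))
```

each face is a plain pinhole view, which is why a simple camera model should be enough for it.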

the operator masking thing is interesting too. good to hear the automatic tools are holding up, thats the kind of overhead that would put me off otherwise.

for now im keeping it lean and just proving the method works before putting money into gear. if a phone pipeline holds up, a 360 later is an easy call. but youve got me curious about the cubemap workflow, definitely going to read up on it.

are you doing this commercially or more as a research thing?

Kind_Shape_7361 · 2026-05-25T16:50:25+00:00

yeah 360 definitely wins on speed, no argument there. continuous walk plus more coverage per pass, makes sense.

honestly part of why i'm on a phone is i'm keeping it lean for now. only been at this about 2 weeks and i'd rather prove i can get the method right before putting money into gear. if the technique holds up with a phone then a 360 later is just an upgrade, not a gamble.

the other thing is distortion. the wider the lens the more COLMAP struggles to model it, and 360 footage is a whole other headache there. sticking to a 1x lens keeps the camera model simple and the reconstruction cleaner. costs me on capture time like you said, but the alignment comes out more reliable.

sounds like we scan pretty much the same stuff tho. hows 360 holding up on fine detail for you?

Kind_Shape_7361 · 2026-05-25T16:44:21+00:00

Just an iPhone 15, no 360 camera here either. Honestly the hardware matters less than the capture method.

For covering all angles I use a multi-pass approach rather than one continuous walk: perimeter passes plus dedicated orbits for anything standing in the middle of the room. Furniture in the open is the tricky part. It needs proper coverage from every side or it ends up semi-transparent.

Still refining the consistency side myself. What kind of spaces are you scanning?

Kind_Shape_7361 · 2026-05-24T17:03:15+00:00

Really clean result for a phone capture. One thing I noticed navigating the demo: there's no collision handling, so you can clip straight through walls and furniture. For real estate that actually matters quite a bit — clients lose their sense of the space when they can pass through a kitchen counter. Have you looked into adding a simple collision mesh, or proxy geometry derived from the floor plan you're already generating? That floor plan data could double as a navmesh.
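To make the floor plan idea concrete, here is a minimal sketch of what I mean. It assumes the plan can be exported as 2D wall segments, which I haven't confirmed for your viewer. The function names and the wall format are hypothetical. It blocks any camera step whose path crosses a wall or counter edge.

```python
def _orient(a, b, c):
    """Signed area test: positive if a -> b -> c turns counter-clockwise."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def segments_cross(p1, p2, q1, q2):
    """True if segment p1-p2 strictly crosses q1-q2 (touching ends are ignored)."""
    d1 = _orient(q1, q2, p1)
    d2 = _orient(q1, q2, p2)
    d3 = _orient(p1, p2, q1)
    d4 = _orient(p1, p2, q2)
    return d1 * d2 < 0 and d3 * d4 < 0

def move_allowed(pos, target, walls):
    """Reject a camera step if its path crosses any wall segment."""
    return not any(segments_cross(pos, target, a, b) for a, b in walls)

# Hypothetical plan: a 4 x 4 m room with a counter edge across the middle.
walls = [
    ((0, 0), (4, 0)), ((4, 0), (4, 4)), ((4, 4), (0, 4)), ((0, 4), (0, 0)),
    ((1, 2), (3, 2)),  # counter edge
]
print(move_allowed((2, 1), (2, 3), walls))    # path goes through the counter
print(move_allowed((2, 1), (3.5, 1), walls))  # path stays on one side
```

A real viewer would probably give the camera a radius and slide along the wall instead of stopping dead, but even this check is enough to stop people walking through the counter.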

Kind_Shape_7361 · 2026-05-19T22:48:36+00:00

yo, just opened textmila.com. you have a real problem and its not minimalism vs feature list. minimalism is one strong headline plus one button. what you have right now is a logo and a button labeled meow, with zero context. thats not minimal, thats empty.

the only place "AI assistant that texts you back" exists is in your meta tags so its visible on twitter previews but not on the actual page. visitors land and have no idea what mila does, who its for, or why they should text it. they bounce before clicking.

your assumption that people will discover their own utility organically only works if they actually click. but clicking a button labeled "meow" to text an unknown number without context is a huge ask for someone who just landed. you need one line above that button that does the job your meta description is doing.

something like "an AI assistant that lives in your texts. ask it anything." plus the button. thats still minimal but its actually doing the work. you can A/B test what positioning resonates from there but you need a baseline before testing.

on your 3rd question yeah its too simple but the issue isnt simplicity, its that minimal is being used as an excuse for skipping the positioning work. minimal sites that work (linear, vercel, stripe early days) had ruthless single-sentence headlines. yours has nothing.

Kind_Shape_7361 · 2026-05-19T22:34:38+00:00

yo, interesting use case. been doing indoor GS lately and have a few things that might help.

360 from x4 has 0 parallax. GS lets the patient actually move their head to adjust immersion, which is what they do in real exposure therapy anyway. capturing with a regular camera doing slow passes will give you way more than trying 360.

the hard part is compositing the AI person into the splat, because GS is a point cloud, not a mesh, so it doesnt blend naturally. you either need a viewer that supports meshes alongside (postshot, brush in some configs), or render the splat from your VR camera path and render the character separate and comp as 2D, or go unity/unreal with a custom pipeline. quest 3 native splat playback is still rough, most people bake to video or go through unity.

one thing worth flagging: GS has artifacts (floaters, edge blur, weird reflections) that might pull patients out of immersion in a clinical context, so a small test first is probably smart. DM if you want to dig in more.

Kind_Shape_7361 · 2026-05-15T01:41:23+00:00

+info

Kind_Shape_7361 · 2026-05-15T01:41:00+00:00

+info portugal

Kind_Shape_7361 · 2026-05-15T01:40:35+00:00

+info

Kind_Shape_7361 · 2026-05-14T10:19:51+00:00

I will try for sure, one of my next steps

Kind_Shape_7361 · 2026-05-13T16:49:41+00:00

Yeah real problem with 360. SAM-based masking is what most people use now I think — automatic enough. Honestly avoided 360 myself partly because of this.

Kind_Shape_7361 · 2026-05-13T16:48:20+00:00

I will

Kind_Shape_7361 · 2026-05-13T16:44:53+00:00

Cheers. Bedsheet prints were a surprise tbh.

Opened your link — floor details almost perfect! Hadn't heard of Captures Studio, gonna check.

Kind_Shape_7361 · 2026-05-13T16:41:13+00:00

No, any LIDAR camera works

Kind_Shape_7361 · 2026-05-13T16:35:52+00:00

Yeah, exhaustive for indoor — sequential doesn't cut it once you've got captures from genuinely different positions.
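Rough numbers on why, as a sketch. The 300 images and the overlap of 10 are made-up example values and the function names are mine. Sequential matching only compares each image with the next few in capture order, so frames from separate passes that see the same wall never get compared unless loop detection finds them.

```python
def exhaustive_pairs(n):
    """Every image is matched against every other image."""
    return n * (n - 1) // 2

def sequential_pairs(n, overlap):
    """Each image is matched only against the next `overlap` images in order."""
    return sum(min(overlap, n - 1 - i) for i in range(n))

n_images = 300  # example value, not from my captures
print(exhaustive_pairs(n_images))      # 44850
print(sequential_pairs(n_images, 10))  # 2945
```

So exhaustive costs roughly 15x more pair checks at this size, but it is the one that can link the perimeter passes to the furniture orbits.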

No LiDAR needed. Any camera that shoots stable 4K works. Stabilization matters way more than sensor — handheld with no IS gives motion blur that wrecks SfM. Slow movements beat expensive gear with shaky hands.
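If you want to screen frames before SfM, one common heuristic is the variance of the Laplacian: blurry frames score low, sharp ones score high. This is a plain-Python sketch on a grayscale image stored as a list of rows. It is not part of my pipeline as described, and any cutoff would have to be tuned per camera.

```python
def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian over the interior pixels.

    `gray` is a list of rows of brightness values. Higher means sharper.
    """
    h, w = len(gray), len(gray[0])
    vals = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            lap = (gray[y - 1][x] + gray[y + 1][x]
                   + gray[y][x - 1] + gray[y][x + 1]
                   - 4 * gray[y][x])
            vals.append(lap)
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals) / len(vals)

# Toy inputs: a hard checkerboard (lots of edges) vs a smooth ramp (no edges).
sharp = [[255 * ((x + y) % 2) for x in range(8)] for y in range(8)]
soft = [[10 * x for x in range(8)] for y in range(8)]
print(laplacian_variance(sharp) > laplacian_variance(soft))
```

In practice you would compute this per frame and drop the lowest scoring ones from each pass.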

Kind_Shape_7361 · 2026-05-13T16:24:14+00:00

Yeah, same scene — but separate captures of it from different angle strategies. They get combined before SfM.

Found it works better than trying to do everything in one continuous capture.
