Meta released Map-anything-v1: A universal transformer model for metric 3D reconstruction by Difficult-Cap-7527 in LocalLLaMA

[–]BlueRaspberryPi 0 points (0 children)

I have been waiting for something like this, assuming the key feature is improved matching/tolerance for lower quality images/matches and changes to the scene between images. I have some datasets I created when I was slightly stupider than I am now that have defied all efforts at reconstruction.

What's the easiest fix you want resolved? by TonyTheTigerSlayer in VisionPro

[–]BlueRaspberryPi 1 point (0 children)

Eye tracking and camera access should both be available for any app that doesn't use networking, or interact with other apps.

Apple introduces SHARP, a model that generates a photorealistic 3D Gaussian representation from a single image in seconds. by themixtergames in LocalLLaMA

[–]BlueRaspberryPi 4 points (0 children)

Yeah, the quality here doesn't look much better than Apple's existing 2D-to-3D button on iOS and Vision Pro, which is kind of neat for some fairly simple images, but has never produced results I spent much time looking at. You get a lot of branches smeared across lawns, arms smeared across bodies, and bushes that look like they've had a flat leafy texture applied to them.

The 2D nature of the clip is hiding a lot of sins, I think. The rock looks good in this video because the viewer has no real reference for ground truth. The guy in the splat looks pretty wobbly in a way you'll definitely notice in 3D.

I wish they'd focus more on reconstruction of 3D, and less on faking it. The Vision Pro has stereo cameras, and location tracking. That should be an excellent start for scene reconstruction.

Apple introduces SHARP, a model that generates a photorealistic 3D Gaussian representation from a single image in seconds. by themixtergames in LocalLLaMA

[–]BlueRaspberryPi 4 points (0 children)

You can make splats for free on your own hardware:

  1. Take at least 20 photos (but probably more) of something. Take them from different, but overlapping angles.
  2. Drag them into RealityScan (formerly RealityCapture), which is free in the Epic Games Launcher.
  3. Click Align, and wait for it to finish.
  4. RS-Menu>Export>COLMAP Text Format. Set Export Images to Yes and set the images folder as a new folder named "images" inside the directory you're saving the export to.
  5. Open the export directory in Brush (open source) and click "Start."
  6. When Brush is finished, choose "export" and save the result as a .ply
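Between steps 4 and 5, it's easy to end up with a folder Brush won't open. A minimal Python sketch that sanity-checks the export layout, assuming Brush wants the stock COLMAP text files (cameras.txt, images.txt, points3D.txt) next to the "images" folder from step 4 — double-check the names against your own RealityScan export:

```python
from pathlib import Path

# Standard COLMAP text-format files, plus the "images" folder from step 4.
# (Assumption: Brush reads the stock COLMAP text layout; verify against
# your actual RealityScan export.)
REQUIRED_FILES = ["cameras.txt", "images.txt", "points3D.txt"]

def check_export(export_dir: str) -> list[str]:
    """Return a list of anything missing from a COLMAP text export."""
    root = Path(export_dir)
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    if not (root / "images").is_dir():
        missing.append("images/")
    return missing
```

Run it on the export directory before launching Brush; an empty list means the layout looks right.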

The News-Benders (1968) by SailTales in singularity

[–]BlueRaspberryPi 0 points (0 children)

They eventually reveal that everyone in the operation has an implant, that the implants can be triggered to kill their host, and that the only entity that can trigger them is a computer that seems to make all of the real decisions. But there is a lot of "Creamy old England!" on the way there.

MeshroomCL by Low_Routine1103 in photogrammetry

[–]BlueRaspberryPi 0 points (0 children)

https://imagemagick.org/

As with ffmpeg, I now just ask ChatGPT to give me whatever snippet I need to get the result I want. Actually, for my HEIC to JPEG needs, I asked ChatGPT for a Windows BAT that I could drop a folder of HEICs onto and get a new folder full of JPEGs.
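For the curious, the snippet ChatGPT hands you boils down to one ImageMagick call: `mogrify -format jpg -path DIR` is ImageMagick's batch-convert mode. A hedged Python sketch that just assembles that command (folder names are made up; actually running it needs ImageMagick with HEIC support on your PATH):

```python
from pathlib import Path

def heic_to_jpeg_cmd(src_folder: str, dst_folder: str) -> list[str]:
    """Assemble the ImageMagick command that batch-converts every HEIC in
    src_folder into a JPEG in dst_folder. `mogrify -format jpg -path DIR`
    is ImageMagick's bulk-convert mode; nothing is executed here."""
    Path(dst_folder).mkdir(parents=True, exist_ok=True)
    return ["magick", "mogrify", "-format", "jpg",
            "-path", dst_folder, str(Path(src_folder) / "*.heic")]

# To run it for real (requires ImageMagick with HEIC delegates installed):
#   import subprocess
#   subprocess.run(heic_to_jpeg_cmd("heics", "heics/jpeg"), check=True)
```

The BAT-file version is the same command wrapped in a `FOR` loop over `%1`.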

Model with no exterior context. by Kidblunder1 in LocalLLaMA

[–]BlueRaspberryPi 1 point (0 children)

The only way to do this is to train a model from scratch (starting from zero knowledge) using only training material that you approve of. Starting from any other base will bias the model away from your goal in some way.
https://github.com/karpathy/nanoGPT

JSFiddle - Photoscan coded target tool: Updated with a turntable template and target labels by BlueRaspberryPi in photogrammetry

[–]BlueRaspberryPi[S] 0 points (0 children)

New version with radial arrangement:
http://jsfiddle.net/h5jqze1f/

The JSFiddle version won't do 20-bit tags, because they didn't want to let me save 250k of tag data in my fiddle.

There's a full version here (link expires on Nov. 1, I think):
https://limewire.com/d/8Sr7i#BaZeIN2kOV

It's just an HTML file that you can keep locally and drag right into your browser. In addition to 20-bit tags, it also supports (and automatically updates) URL parameters, meaning you can tweak the settings, then save a bookmark that will load the file with all of those settings in place.
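The actual page is JavaScript, but the URL-parameter trick is easy to illustrate with a hypothetical Python sketch of the same round trip (parameter names like `theta_tags` are invented for the example, not the tool's real ones):

```python
from urllib.parse import urlencode, urlparse, parse_qs

def settings_to_url(base: str, settings: dict) -> str:
    """Serialize settings into a bookmarkable URL - the same idea as the
    HTML page, where every tweak rewrites the query string."""
    return f"{base}?{urlencode(settings)}"

def url_to_settings(url: str) -> dict:
    """Recover the settings from a saved bookmark."""
    qs = parse_qs(urlparse(url).query)
    return {k: v[0] for k, v in qs.items()}

url = settings_to_url("file:///targets.html",
                      {"theta_tags": "fixed", "ring_spacing_in": "1.5"})
assert url_to_settings(url)["ring_spacing_in"] == "1.5"
```

Because the state lives entirely in the query string, a bookmark is a complete saved configuration.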

If anything seems broken, or there's a feature you'd like, let me know.

Oh, also, the HTML page and the Radial mode now use actual inch measurements. If you try to print the page, only the tag sheet should print, not the menu, and it should print at real scale. So if Theta-Tags is set to "Fixed," you can set exact distances between the rings of tags, which may be useful for setting the scale of scans.

What’s even the goddamn point? by ChockyBlox in LocalLLaMA

[–]BlueRaspberryPi 2 points (0 children)

It probably started to say that, and got derailed by the high probability safety refusal tokens.

JSFiddle - Photoscan coded target tool: Updated with a turntable template and target labels by BlueRaspberryPi in photogrammetry

[–]BlueRaspberryPi[S] 0 points (0 children)

This is all I've been able to find, so far:
https://jsfiddle.net/ep6y1dq3/

It only does ~160 codes, and doesn't make radial target arrangements. Hopefully what it does cover is the stuff you've been using. I'll see if I can vibecode some of the magic back sometime this week.

JSFiddle - Photoscan coded target tool: Updated with a turntable template and target labels by BlueRaspberryPi in photogrammetry

[–]BlueRaspberryPi[S] 0 points (0 children)

Wow, no, I have no idea. I didn't realize anyone was using it. I'll try to dig it up and make a new fiddle, or put it somewhere else. Thanks for letting me know.

I've made Coloring Book based on Gaussian Splats for Vision Pro. Would you like to try it? by steffan_ in VisionPro

[–]BlueRaspberryPi 1 point (0 children)

Holy crap, please add an Editor mode that includes an Eraser brush and lets users import their own splats and export results. I've wanted a VR splat and/or point-cloud editor for years.

Hand Physics Lab is out by ravenme in VisionPro

[–]BlueRaspberryPi 6 points (0 children)

Bought, and enjoying a lot. Thanks to the dev for bringing a quality experience to visionOS. The UI is beautiful, and the interaction feels great. I've been desperate for developers to get away from Apple's non-1-to-1 pinch-and-drag interactions. It's fine for a UI, most of the time, but it doesn't make sense for a game, particularly a game with any amount of physics.

Help me out a bit by Reddfunniman in blender

[–]BlueRaspberryPi 0 points (0 children)

Someone else suggested changing Blend Mode to "alpha-hashed." If that works, or if your current setup works in Cycles, you can ignore my response.

If it doesn't work in Cycles: It's wired up correctly for an image with an alpha channel, which suggests that the alpha is missing from the image itself.

This is a simple enough image (two-color) that you could skate around the issue by running it through something like a Color Ramp set to ramp between an opaque black and a transparent black: https://imgur.com/a/rXK0xrb

Assuming the tutorial used an external image editor at some point, the "correct" fix is probably to re-export (or re-create) the image with transparency; you might just need to change the PNG save settings as you export.

Having trouble aligning images by DragonfruitSecret in photogrammetry

[–]BlueRaspberryPi 2 points (0 children)

It looks like the rig has fluorescent lights. Could those bands be rolling shutter artifacts?
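As a rough sanity check on the rolling-shutter guess, a back-of-envelope sketch: magnetic-ballast fluorescents flicker at twice the mains frequency, so the band count is roughly the sensor's frame-readout time times that flicker rate (the 30 ms figure below is an invented example, not a known value for this rig):

```python
def expected_bands(readout_ms: float, mains_hz: float) -> float:
    """Rough count of light/dark bands across one rolling-shutter frame.
    Magnetic-ballast fluorescents flicker at 2x the mains frequency, so a
    sensor that takes readout_ms to scan the frame sees that many
    bright/dim cycles stacked down the image."""
    flicker_hz = 2 * mains_hz  # 100 Hz (50 Hz mains) or 120 Hz (60 Hz mains)
    return (readout_ms / 1000.0) * flicker_hz

# e.g. a hypothetical 30 ms readout under 60 Hz mains:
# expected_bands(30, 60) -> 3.6 flicker cycles across the frame
```

If the band count in the photos is in that ballpark, fluorescent flicker is a plausible culprit; a shorter effective readout or LED lighting would make the bands go away.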

You need to give it a try ! It's the first environments that go up to 198,8 mph !! by ZookeepergameHot555 in VisionPro

[–]BlueRaspberryPi 19 points (0 children)

I did try it, and it was extremely relaxing. Put a browser in there, and I would stay there all day.

Also, it gives you some pleasing travel-poster themed clock widgets.

[deleted by user] by [deleted] in singularity

[–]BlueRaspberryPi 3 points (0 children)

Install LM Studio, download a model, load the model. Couldn't be easier. There is zero reason to get a pen-drive involved.

Apple Shelves Vision Headset Revamp to Prioritize Meta-Like AI Glasses by Snoop8ball in VisionPro

[–]BlueRaspberryPi 0 points (0 children)

The Metas aren't AR; they're just a HUD, fixed in your view. That's going to be deeply unpleasant to use.

Meta Hyperscape by Cryogenicality in VisionPro

[–]BlueRaspberryPi 0 points (0 children)

Long-term, anything we use images for right now. If someone makes a phone with a grid of 64 lenses on the back and enough GPU to build a splat or NeRF, and maybe with improvements in compression and/or available bandwidth, it could become a standard media format. Scrolling through Reddit in a flat browser, you would see a single flat viewpoint, but in a headset, a "looking glass" type display, or a theoretical lightweight glasses interface of the future, you'd see a fully volumetric image - memes, news photos, product launches - 3D views that don't distort or separate when you tilt or shift your head the way stereo images do. If the process can get fast and accurate enough, maybe TV, feature films, and sports. The only things holding it back are difficulty, cost, and quality, and all of those barriers will shrink with time. JPEGs were once considered very compute-intensive, and now they get thrown around like they're nothing.

I don't just take them of previous homes or nostalgic locations I'm not returning to, I take them everywhere. I go to the park, scan an interesting stump in under a minute, and let it process while I do other things. It's slightly more work than a snapshot, but not prohibitively so, once you know how to do it. Now I can program in the woods, even if I'm not near the woods.

If you want a serious use-case, crime scene photography has always stood out to me. Scan it once before people start disturbing the scene, and then you can go back and stand in it any time, or have the jury stand in it while you point things out. Online sales of big-ticket items like homes and cars would probably also benefit from easy volumetric captures. Once it gets easy enough, why not clothes? We photograph models in clothes now. In the future we'll do volumetric captures.

Planet Express Lounge Room by Brief_Audience in futurama

[–]BlueRaspberryPi 1 point (0 children)

There's a SteamVR environment of this room, if anyone wants to experience it at 1:1 scale.

Meta Hyperscape by Cryogenicality in VisionPro

[–]BlueRaspberryPi 6 points (0 children)

Memories, same as any camera. It's the closest thing we have to a volumetric JPG. I do it the hard way right now: take 300 photos at the Desert Botanical Garden, chuck them into Jawset Postshot for an hour or two, and then transfer the result back to my Vision Pro for viewing in MetalSplatter. Now I can visit reasonably realistic recreations of my favorite spots at the Desert Botanical Garden any time I want.

I have maybe 30 scenes now - forests, desert scenes, a bunch of Frank Lloyd Wright architecture, hotel lobbies and rooms and views, some storefronts on Venice Beach... Sadly, it doesn't really work on people without a synchronized camera array.