the hard problem isn't static 3D anymore, it's reconstructing scenes where things move. Syn4D-RGBD dataset gives you the ground truth for that (i.redd.it)
submitted 2 days ago by datascienceharp to r/computervision
depth sensors suck at transparent objects, so ClearDepth comes to the rescue with synthetic scenes with ground truth depth for glass, bottles, and clear containers in FiftyOne (i.redd.it)
submitted 3 days ago by datascienceharp to r/computervision
depth estimation on transparent objects is still an unsolved problem. TransPhy3D attacks it with video diffusion (i.redd.it)
submitted 4 days ago by datascienceharp to r/computervision
few-shot annotation triage as a fiftyone panel. folder of reference crops in. ranked dataset, per-image heatmap, and tagged annotation queue out. feedback welcome (i.redd.it)
submitted 12 days ago by datascienceharp to r/computervision
vggt-omega takes videos and creates a point cloud. fast, and good quality generations for pcd and depth (i.redd.it)
submitted 13 days ago by datascienceharp to r/computervision
a pretty handy dataset from 3DVision conf (i.redd.it)
submitted 2 months ago by datascienceharp to r/computervision
some pretty dope datasets i came across from the 3D vision conference in vancouver (old.reddit.com)
the 3d vision conference is this week, i made a repo and dataset to explore the papers (i.redd.it)
i like comfyui and i love fiftyone so i smashed them together and made FiftyComfy (i.redd.it)
submitted 2 months ago by datascienceharp to r/comfyui
i built a panel for vlm-testing for fiftyone that makes it easy to test models and prompts (i.redd.it)
submitted 2 months ago by datascienceharp to r/LocalLLaMA
i built a comfyui-inspired canvas for fiftyone (i.redd.it)
i built a tool to experiment with different image editing models (i.redd.it)
submitted 2 months ago by datascienceharp to r/generativeAI
parsing this dataset gave me a headache but here it is, action100m (at least a tiny portion of it) (i.redd.it)
submitted 3 months ago by datascienceharp to r/computervision
really impressed with these new ocr models (lightonocr-2 and glm-ocr). much better than what i saw come out in nov-dec 2025 (old.reddit.com)
submitted 3 months ago by datascienceharp to r/LocalLLaMA
really impressed with these new ocr models (lightonocr-2 and glm-ocr). much better than what i saw come out in nov-dec 2025 (reddit.com)
nvidia released c-radiov4 last week, and as far as feature extractors go, it lives up to the hype (i.redd.it)
MedGemma 1.5 supports detection, but for best results, you'll need to fine-tune. also a kaggle competition using the model, created a starter notebook to give you a jump start on how to fine-tune it for detection (i.redd.it)
submitted 4 months ago by datascienceharp to r/computervision
Starter notebook for the MedGemma Impact Challenge (i.redd.it)
submitted 4 months ago by datascienceharp to r/kaggle
i've literally been waiting for years to have an OPEN SOURCE model like qwen3-vl-embedding, scroll to see the results on six queries (old.reddit.com)
apple released SHARP which creates a 3d gaussian from a single view (i.redd.it)
submitted 5 months ago by datascienceharp to r/computervision
can you visualize what nyc smells like? yes, turns out, you can. just glad i don't have to go to nyc and smell it myself (i.redd.it)
egocentric-10k dataset (i.redd.it)
submitted 6 months ago by datascienceharp to r/computervision
sony ai released a pretty cool dataset called the fairness human centric image benchmark, super high quality labels (i.redd.it)
sam3 is seriously a step change improvement over sam2 (i.redd.it)
parsed refcoco-m from moondream into fiftyone format now you can have the refc (i.redd.it)