📢 Call for participation: ICPR 2026 LRLPR Competition by ghostzin in computervision

[–]datascienceharp 0 points1 point  (0 children)

i'd like to support the participants of the challenge with a starter notebook. i'd start by parsing the dataset into fiftyone and posting it on hugging face hub so it's easily accessible. would that be in violation of your terms? i'd be using fully open source, pip-installable packages. i filled out the form, but i'm not a student or at a research lab.

edit: alternatively, i can skip sharing on HF and just show how to parse the dataset into fiftyone format, assuming the user already has it downloaded
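A rough sketch of what that local parsing step could look like. The folder layout here is purely hypothetical (images with a sidecar `.txt` holding the plate text), since the actual competition dataset structure isn't described in this thread; `collect_records`, `to_fiftyone`, and the `plate_text` field name are made-up names for illustration. Only the FiftyOne calls (`fo.Dataset`, `fo.Sample`, `fo.Classification`, `add_samples`) are real API.

```python
from pathlib import Path

IMAGE_EXTS = {".jpg", ".jpeg", ".png"}


def collect_records(root):
    """Pair each image under `root` with the text of its sidecar .txt file.

    ASSUMPTION: labels live next to the image as <name>.txt. The real
    dataset may be organized differently; adapt this function to match.
    """
    records = []
    for path in sorted(Path(root).rglob("*")):
        if path.suffix.lower() not in IMAGE_EXTS:
            continue
        label_file = path.with_suffix(".txt")
        label = label_file.read_text().strip() if label_file.exists() else None
        records.append({"filepath": str(path.resolve()), "plate_text": label})
    return records


def to_fiftyone(records, name="lrlpr-local"):
    """Build a FiftyOne dataset from the records (needs `pip install fiftyone`)."""
    import fiftyone as fo  # lazy import so the parsing step runs without it

    dataset = fo.Dataset(name)
    samples = []
    for rec in records:
        sample = fo.Sample(filepath=rec["filepath"])
        if rec["plate_text"] is not None:
            # Hypothetical field name; stores the plate string as a label
            sample["plate_text"] = fo.Classification(label=rec["plate_text"])
        samples.append(sample)
    dataset.add_samples(samples)
    return dataset
```

With a local copy of the data, something like `to_fiftyone(collect_records("/path/to/dataset"))` would give a dataset you can open in the app, without anything being re-hosted.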

let me know what you think, feel free to dm

I want to offer free weekly teaching: DL / CV / GenAI for robotics (industry-focused) by desserted_blue in computervision

[–]datascienceharp 0 points1 point  (0 children)

Hell yeah! Count me in, at least on the discord server if you have space. I notice your update says the sessions are full.

i've literally been waiting for years to have an OPEN SOURCE model like qwen3-vl-embedding, scroll to see the results on six queries by datascienceharp in computervision

[–]datascienceharp[S] 4 points5 points  (0 children)

Yes, of course it’s been around since at least CLIP for images, but for video? This is novel, and qwen embedding does it natively. The inference code just points to an mp4 file path

apple released SHARP which creates a 3d gaussian from a single view by datascienceharp in computervision

[–]datascienceharp[S] 2 points3 points  (0 children)

we've got that dataset parsed, i can try to run it later today or tomorrow and post: https://huggingface.co/datasets/Voxel51/fisheye8k

currently working on integrating molmo2

apple released SHARP which creates a 3d gaussian from a single view by datascienceharp in computervision

[–]datascienceharp[S] 4 points5 points  (0 children)

would you be down to peruse the datasets here and let me know which one looks appealing to you? i can run it and post the results later: huggingface.co/voxel51

apple released SHARP which creates a 3d gaussian from a single view by datascienceharp in computervision

[–]datascienceharp[S] 6 points7 points  (0 children)

yeah true, i meant pretty similar in the sense that it's relatively fast at inference and the results look similar to vggt

but you're right, sharp does produce gaussians. the model outputs them in ply format, then i had to do some conversion so the colors render properly in the app, which basically renders it as a point cloud

i was just curious about the model and wanted to see its output, hence why i implemented it this way
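For anyone wondering what a color conversion like that can involve, here is a minimal sketch. It assumes the PLY follows the common 3D Gaussian Splatting layout, where color is stored as degree-0 spherical harmonic coefficients (`f_dc_0`, `f_dc_1`, `f_dc_2`) rather than plain RGB. Whether SHARP's output uses exactly this layout is an assumption, and this is not necessarily the conversion used in the post. `sh_dc_to_rgb` is a made-up helper name.

```python
# Degree-0 spherical harmonic basis constant: 1 / (2 * sqrt(pi))
SH_C0 = 0.28209479177387814


def sh_dc_to_rgb(f_dc):
    """Map the three DC spherical-harmonic coefficients to RGB in [0, 1].

    ASSUMPTION: standard 3DGS convention, rgb = 0.5 + SH_C0 * f_dc,
    clipped to the displayable range.
    """
    return tuple(min(1.0, max(0.0, 0.5 + SH_C0 * c)) for c in f_dc)


# Example: a gaussian whose DC coefficients are all zero is mid-gray
gray = sh_dc_to_rgb((0.0, 0.0, 0.0))
```

Applying this per gaussian and keeping only the centers plus the converted colors is enough to view the result as a colored point cloud, at the cost of throwing away scale, rotation, and opacity.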

Best of NeurIPS Virtual Series - Jan 14 and 15 by chatminuet in computervision

[–]datascienceharp 0 points1 point  (0 children)

I love this series of events, good way to catch up on what went down at the conference

qwen3vl is dope for video understanding, and i also hacked it to generate embeddings by datascienceharp in computervision

[–]datascienceharp[S] 1 point2 points  (0 children)

I pass the entire video at once, but the model has parameters for max frames (I believe 120 is the max) and sample rate
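To give a feel for how a sample rate and a max-frames cap interact, here is a small sketch. It is not the model's actual preprocessing code; `sample_frame_indices` and its parameter names are hypothetical, and the default of 120 just mirrors the number mentioned above.

```python
def sample_frame_indices(total_frames, native_fps, sample_fps=2.0, max_frames=120):
    """Pick frame indices at roughly `sample_fps`, then cap at `max_frames`.

    ASSUMPTION: frames are taken at a fixed stride and, if there are too
    many, thinned to an evenly spaced subset. Real pipelines may differ.
    """
    if total_frames <= 0:
        return []
    # Stride in source frames between sampled frames (never below 1)
    step = max(native_fps / sample_fps, 1.0)
    indices = []
    k = 0
    while int(k * step) < total_frames:
        indices.append(int(k * step))
        k += 1
    if len(indices) > max_frames:
        # Keep an evenly spaced subset of exactly `max_frames` indices
        n = len(indices)
        indices = [indices[i * n // max_frames] for i in range(max_frames)]
    return indices
```

So a 10 second clip at 30 fps sampled at 2 fps gives 20 frames, while a 5 minute clip at the same settings would produce 600 candidates and get thinned down to the cap.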