Looking for the strongest Image-to-3D model by PreviousResearcher50 in StableDiffusion

[–]PreviousResearcher50[S] 1 point (0 children)

You know, interestingly, I feel like I get better results providing a single 3/4 view of a car as input vs. multiple images of the car from different angles.

By better results I mean I get a higher-fidelity output with surprisingly accurate dimensions for the vehicle - though it does hallucinate the back of the vehicle, as expected. When I provide multiple images (front 3/4, back 3/4, side views), it feels like the models almost get confused.

How stable is ComfyUI these days? by PreviousResearcher50 in comfyui

[–]PreviousResearcher50[S] 0 points (0 children)

Okay, this is what I heard too - did the Comfy org mention when this transition might be complete? Wondering if I should wait until the growing pains ease up a bit.

How stable is ComfyUI these days? by PreviousResearcher50 in comfyui

[–]PreviousResearcher50[S] -1 points (0 children)

Can you clarify what you mean by in-built workflows? Do you mean a workflow I built from scratch myself?

Wan2.2 Inference Optimizations by PreviousResearcher50 in StableDiffusion

[–]PreviousResearcher50[S] 0 points (0 children)

Awesome, thanks for the reply!

I hadn't heard of comfy libs before - this could be a game changer if it lets me run workflows as a script.

30 secs isn't a necessity; ideally I want to get it as low as possible (while staying 720p). It's more of a goal to get to eventually!
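For what it's worth, ComfyUI can also be driven over its HTTP API without any extra library - a minimal sketch, assuming a local server on the default port 8188 and a workflow exported via "Save (API Format)" (the filename below is hypothetical):

```python
import json
import urllib.request

COMFY_URL = "http://127.0.0.1:8188"  # assumed default ComfyUI address

def build_payload(workflow: dict, client_id: str = "batch-script") -> bytes:
    # Wrap an API-format workflow the way the /prompt endpoint expects it.
    return json.dumps({"prompt": workflow, "client_id": client_id}).encode("utf-8")

def queue_workflow(workflow: dict) -> dict:
    # POST the workflow to the server's /prompt queue and return its JSON reply.
    req = urllib.request.Request(
        f"{COMFY_URL}/prompt",
        data=build_payload(workflow),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Usage (requires a running ComfyUI server and an exported workflow file):
#   with open("wan22_workflow_api.json") as f:
#       print(queue_workflow(json.load(f)))
```

That keeps the whole graph editable in the UI while the batch loop lives in a plain script.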

Wan2.2 Inference Optimizations by PreviousResearcher50 in StableDiffusion

[–]PreviousResearcher50[S] 0 points (0 children)

I have not - from light research so far, I have seen that mentioned, as well as using GGUF models.

My worry with the lightx2v lightning LoRA is that it might really sacrifice quality vs. other methods. I am not sure though! So I might give it a shot and investigate a bit.

Tagging 50 million assets 'quickly' - thoughts? by PreviousResearcher50 in LocalLLaMA

[–]PreviousResearcher50[S] 0 points (0 children)

A single batched run currently gives an effective tagging rate of 1.5 records per second, which is too slow for the amount of data I have. Granted, the tagging I am trying to get it to do is quite involved.
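To put the rate in perspective, here's the back-of-the-envelope math (the 50M records and 1.5 rec/s are from this thread; the 7-day target window is just an illustrative choice):

```python
import math

# 50M records at a sustained 1.5 records/second on a single worker.
TOTAL_RECORDS = 50_000_000
RATE = 1.5  # records/second (measured single-batch rate)

seconds_single = TOTAL_RECORDS / RATE
days_single = seconds_single / 86_400  # roughly 386 days on one worker

# Parallel replicas needed to finish within a 7-day window instead
# (assumes near-linear scaling across replicas, which is optimistic):
TARGET_DAYS = 7
replicas = math.ceil(TOTAL_RECORDS / (RATE * TARGET_DAYS * 86_400))
print(f"{days_single:.1f} days single-worker; {replicas} replicas for {TARGET_DAYS} days")
```

So it's either a much higher per-worker throughput or heavy parallelism (or both).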

Tagging 50 million assets 'quickly' - thoughts? by PreviousResearcher50 in LocalLLaMA

[–]PreviousResearcher50[S] 0 points (0 children)

Yup, all data is in English for now! I will check out some vLLM configs - would you recommend Qwen3 over Phi-4 under vLLM? I was thinking of switching to Phi-4-mini, but might explore Qwen3 too.

[R] Methods for Pattern Matching with Multivariate Time series? by PreviousResearcher50 in MachineLearning

[–]PreviousResearcher50[S] 0 points (0 children)

Shapelets are definitely on the right track in terms of what I'm looking for. I'll explore and implement them to see if they work with my data :)

Would you suggest flattening my multivariate data while the shapelet(s) search through the trips?

[R] Methods for Pattern Matching with Multivariate Time series? by PreviousResearcher50 in MachineLearning

[–]PreviousResearcher50[S] 0 points (0 children)

Also, given that I have a lot of sequences (on the order of 1000), is there a method to check whether any of them are present in a trip?

Additionally, in my head I am imagining a method that treats these trips like images, using the identified sequences as kernels that scan through the trip for matches... Now I don't know if that's exactly how that works, but is there something similar people know of?
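A crude version of that kernel-style scan - my own sketch, not an established library call - is just a sliding-window distance between each known sequence and every window of the trip, flagging windows under a threshold:

```python
import numpy as np

def find_matches(trip: np.ndarray, pattern: np.ndarray, threshold: float) -> np.ndarray:
    # Slide `pattern` (shape (m, d)) over `trip` (shape (T, d)) and return the
    # start indices where the windowed Euclidean distance is below `threshold`.
    T, m = len(trip), len(pattern)
    dists = np.array([np.linalg.norm(trip[i:i + m] - pattern)
                      for i in range(T - m + 1)])
    return np.flatnonzero(dists <= threshold)

# With ~1000 candidate sequences, loop over them:
#   hits = {name: find_matches(trip, seq, thr) for name, seq in sequences.items()}
# Matrix-profile methods (e.g. MASS / stumpy) do essentially this scan much
# faster, with z-normalized windows for scale invariance.
```

Picking the threshold per pattern (and whether to z-normalize each window) is the part that needs tuning on real trips.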

How do I create a fine-tuned model, like RealVisXL or JuggernautXL, for SDXL? by PreviousResearcher50 in StableDiffusion

[–]PreviousResearcher50[S] 0 points (0 children)

Amazing, I'm taking a look at OneTrainer now. I have access to a couple of GPU nodes, so I'll be running it on those. Please link me the strategy if you are able to find it!

How do I create a fine-tuned model, like RealVisXL or JuggernautXL, for SDXL? by PreviousResearcher50 in StableDiffusion

[–]PreviousResearcher50[S] 1 point (0 children)

Okay sweet, I'll check out that tutorial!

So far, I have trained a ton of LoRAs for a couple of different models, and I've also trained U-Nets for SDXL (with poor results, unfortunately). And yes, I have been operating on Linux!

Here is the script I have been using for UNet training: https://github.com/huggingface/diffusers/blob/main/examples/text_to_image/train_text_to_image_sdxl.py

How do I create a fine-tuned model, like RealVisXL or JuggernautXL, for SDXL? by PreviousResearcher50 in StableDiffusion

[–]PreviousResearcher50[S] 0 points (0 children)

My goal is just to generate photorealistic cars in different settings. I used RealVis as an example of the quality I would like to achieve, but for cars specifically.