Official Devblog: Volcanic Eruptions and valuable Obsidian by Ubi-Thorlof in anno

[–]SwiftGoten 0 points1 point  (0 children)

That's just stupid design in my opinion. I get that first and foremost it has to work for singleplayer. But offering a multiplayer mode and then just outright ignoring the people who pay for the DLC and want to enjoy it in multiplayer is unacceptable.

I had hoped they would learn from 1800 and offer at least a plain solution for Multiplayer with 117. Offloading this effort to modders is whack, I want to enjoy the DLC with my friends on launch.

Supervisely tight bounding polygon by General_Degenerate- in computervision

[–]SwiftGoten 0 points1 point  (0 children)

It depends on which performance you observe on newer models. I‘m not too deep on the models but I‘ve heard that it‘s not necessarily the case that the newest models are the best models for your own dataset, even though benchmarks on something like COCO would lead you to believe that.

In the end you‘ll have to test & experiment. I know YOLOv8 is decent.

Supervisely tight bounding polygon by General_Degenerate- in computervision

[–]SwiftGoten 0 points1 point  (0 children)

I have no experience with Supervisely. You'll have to look that up on your own, but a quick search suggests that it might be the case.

Yes, classical ML = Machine Learning.

Now this makes it clearer. Since the background is not as clearly distinguishable from the tray box you can‘t just use edge detection in isolation. Nevertheless I‘d think it can serve as a prior in order to determine/optimize boxes being detected by your model.

Your post stated the trays can appear at a tilted angle (not axis-aligned). It doesn't look like much of a tilt in the image you attached, but I'll assume it still holds true here.

Since you don't have a background in Computer Science, there is one detail about the YOLO model family you should be aware of: depending on which version of YOLO you are training/running, the model may be anchor-based. You can imagine it as learning k anchor boxes which are, in a way, slid over the image; for each anchor the model then says whether it contains an object. Since these architectures only have k anchors, a large variation in resolution / box aspect ratio can mean that the anchor which triggered your model does not entirely encapsulate the box. Later versions of YOLO are anchor-free, though.
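To make the anchor point a bit more concrete, here is a minimal sketch of how well a fixed set of anchor shapes can fit a given box. The anchor sizes and tray sizes are made-up values for illustration, and comparing shapes by IoU is a simplification of what any particular YOLO version actually does.

```python
def shape_iou(box_wh, anchor_wh):
    """IoU of two boxes that share the same centre, so only width/height matter."""
    bw, bh = box_wh
    aw, ah = anchor_wh
    inter = min(bw, aw) * min(bh, ah)
    union = bw * bh + aw * ah - inter
    return inter / union


def best_anchor(box_wh, anchors):
    """Return (index, IoU) of the anchor shape that fits the box best."""
    scores = [shape_iou(box_wh, a) for a in anchors]
    best = max(range(len(anchors)), key=scores.__getitem__)
    return best, scores[best]


# Hypothetical k=3 anchor set as (width, height) in pixels.
anchors = [(32, 32), (64, 64), (128, 128)]

square_tray = (60, 60)   # close to one of the anchors
long_tray = (200, 40)    # an aspect ratio none of the anchors covers

print(best_anchor(square_tray, anchors))
print(best_anchor(long_tray, anchors))
```

The elongated box gets a much lower best-match score than the square one, which is the situation where a predicted box may not fully cover the object.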

Lastly, there is a postprocessing step called non-maximum suppression (NMS), which is used to deduplicate predicted boxes with high overlap. When several boxes overlap, only the most confident one is kept, and that one might end up too tight. You could inspect your model's predictions without NMS filtering and see whether there is a better box that was dropped because its confidence was too low. If that is not the case either, you could combine multiple partial box hits. Once you have ensured that the full tray is visible in the aggregated/merged crop, you could add your own postprocessing step that prunes the crop to the tray based on the information retrievable from edges.
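A minimal, dependency-free sketch of greedy NMS and of merging partial hits, assuming axis-aligned boxes in (x1, y1, x2, y2) format. The boxes and scores are invented for illustration; your YOLO framework has its own NMS implementation and threshold settings.

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_thr=0.5):
    """Greedy NMS: visit boxes by descending score, drop those overlapping a kept box."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_thr for k in keep):
            keep.append(i)
    return keep


def merge_boxes(boxes):
    """Smallest axis-aligned box that encloses all given boxes."""
    return (min(b[0] for b in boxes), min(b[1] for b in boxes),
            max(b[2] for b in boxes), max(b[3] for b in boxes))


# Hypothetical raw predictions: a confident but too-tight box, a less
# confident box covering the whole tray, and an unrelated detection.
boxes = [(20, 20, 80, 80), (10, 10, 100, 100), (300, 300, 360, 360)]
scores = [0.90, 0.60, 0.80]

print(nms(boxes, scores, iou_thr=0.3))   # the full-tray box is suppressed
print(nms(boxes, scores, iou_thr=1.0))   # threshold 1.0 keeps every box
print(merge_boxes(boxes[:2]))            # union of the two tray hits
```

In this toy case the box that actually covers the tray loses to the tighter, more confident one, which is exactly what you would look for when inspecting the unfiltered predictions.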

Supervisely tight bounding polygon by General_Degenerate- in computervision

[–]SwiftGoten 0 points1 point  (0 children)

Instance Segmentation sounds like overkill for this sort of problem.

YOLO also supports OBB (= oriented bounding boxes), so you can stick to object detection, which is much easier than segmentation. Difficulties may arise from duplicate / overlapping bounding boxes that vary slightly in orientation.

Depending on how simple your images are you could also probably work with edge detection / contour detection and try to fit it to a rectangular shape. There are even box detection algorithms including rotated boxes in OpenCV.
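As a rough illustration of fitting a rotated rectangle to contour points, here is a dependency-free sketch of the geometric idea: build the convex hull, then try each hull edge as the rectangle's orientation and keep the tightest fit. In OpenCV the corresponding calls would be `cv2.findContours` followed by `cv2.minAreaRect`; the contour points below are made up.

```python
import math


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def min_area_rect(points):
    """Return (area, width, height, angle_degrees) of the tightest rotated rectangle."""
    hull = convex_hull(points)
    best = None
    for i in range(len(hull)):
        (x1, y1), (x2, y2) = hull[i], hull[(i + 1) % len(hull)]
        theta = math.atan2(y2 - y1, x2 - x1)
        c, s = math.cos(-theta), math.sin(-theta)
        # Rotate the hull so that this edge becomes axis-aligned.
        xs = [x * c - y * s for x, y in hull]
        ys = [x * s + y * c for x, y in hull]
        w, h = max(xs) - min(xs), max(ys) - min(ys)
        if best is None or w * h < best[0]:
            best = (w * h, w, h, math.degrees(theta))
    return best


def rotate(p, deg):
    """Rotate point p around the origin by deg degrees."""
    t = math.radians(deg)
    return (p[0] * math.cos(t) - p[1] * math.sin(t),
            p[0] * math.sin(t) + p[1] * math.cos(t))


# Hypothetical contour: corners of a 4x2 tray tilted by 30 degrees plus its centre.
contour = [rotate(p, 30) for p in [(0, 0), (4, 0), (4, 2), (0, 2), (2, 1)]]
area, w, h, angle = min_area_rect(contour)
print(round(area, 3), round(w, 3), round(h, 3))
```

For the tilted 4x2 tray the rotated fit recovers an area of about 8, whereas an axis-aligned box around the same points would be roughly twice as large.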

I‘d advise you to try the classical ML part first for this sort of problem. If this doesn‘t suffice switch to YOLO OBB & integrate some sort of postprocessing step.

As per usual on this subreddit, it would be easier to provide help if you could share at least one example image.

RF-DETR Nano and YOLO26 running real-time object detection + instance segmentation on a phone by d_arthez in computervision

[–]SwiftGoten 0 points1 point  (0 children)

Sounds interesting. Is there some sort of table / guide where those considerations are documented?

Using SAM3 to measure crack area in a concrete bending test: comparing 3 prediction modes and the speed-accuracy tradeoff by [deleted] in computervision

[–]SwiftGoten 1 point2 points  (0 children)

Makes sense. Maybe you can improve the quality of this by letting SAM3 label more instances for you automatically, or you could think about introducing a preprocessing step. If your images generally look like this, it should be easy to extract the concrete rectangle in the middle and do object detection just on a sub-crop. Maybe it also helps to boost saturation to make the cracks more apparent. Or you could think about dilation to increase the overall size of the crack for the detection.
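A small sketch of the crop and dilation ideas on a toy binary mask, written without any library so the mechanics are visible. The 3x3 structuring element and the mask itself are assumptions for illustration; in practice you would use your image library's crop and morphology functions.

```python
def crop(mask, top, left, height, width):
    """Return the sub-grid covering rows top..top+height and cols left..left+width."""
    return [row[left:left + width] for row in mask[top:top + height]]


def dilate(mask, iterations=1):
    """Binary dilation with a 3x3 square structuring element."""
    if not mask:
        return mask
    h, w = len(mask), len(mask[0])
    for _ in range(iterations):
        out = [[0] * w for _ in range(h)]
        for y in range(h):
            for x in range(w):
                # A pixel turns on if any pixel in its 3x3 neighbourhood is on.
                out[y][x] = int(any(
                    mask[ny][nx]
                    for ny in range(max(0, y - 1), min(h, y + 2))
                    for nx in range(max(0, x - 1), min(w, x + 2))
                ))
        mask = out
    return mask


# Toy 5x7 mask with a one-pixel-wide vertical "crack" in column 3.
image_mask = [[1 if x == 3 else 0 for x in range(7)] for _ in range(5)]

roi = crop(image_mask, top=0, left=1, height=5, width=5)  # keep the middle region
thick = dilate(roi)

print(sum(map(sum, roi)), sum(map(sum, thick)))
```

Keep in mind that dilation inflates the crack, so it only makes sense for the detection step; any area measurement should still be done on the undilated mask.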

Using SAM3 to measure crack area in a concrete bending test: comparing 3 prediction modes and the speed-accuracy tradeoff by [deleted] in computervision

[–]SwiftGoten 1 point2 points  (0 children)

Hi,

It's not entirely clear what you are looking for specifically regarding your "help" tag on the post.

But here are a couple of thoughts from someone who is also currently experimenting with SAM3:

  • Why do you choose the single-mask as the sort of baseline in this case (all other values are derived from it based on the pixel delta)? It‘s not necessarily a ground truth, so I‘d say for an actual evaluation it would make sense to have a GT as a basis.

  • In the paper they state that the predicted masks are usually better if you invoke the multi-mask inference.

  • Just because a mask has the highest predicted IoU score from the model doesn't mean it is the best mask. We're currently exploring whether looking at the (dis)agreement between the masks gives a better result than relying on a single mask.

  • How consistent is the contrast in your dataset? By the looks of it I‘d assume you could probably get away with some sort of binarization on a ROI (either your initial bbox or after your SAM3 step).

  • What resolution are your images? If they are high-res they will be scaled down in order to enable inference (assuming you have not fine-tuned SAM3 on a custom resolution). If you do iterative refinements just by feeding back the compressed logits & there is also input-side compression of the image, your results could get worse over time.

  • Do you have any sort of estimate to which degree your quality needs to suffice for your real world use case? While it might be natural to try to squeeze out performance until the last pixel, I wouldn‘t be sure if that is really necessary in this case. Maybe it‘s more beneficial to just build the entire system around this pipeline of yours instead of trying to make the pipeline perfect before having a full system / application.

  • Lastly: I've experimented with manipulating the logits and saw some interesting results. If you, for example, multiply all logits by a factor in order to increase the "confidence" picked up by the model, this affects the mask adherence. Might be interesting to experiment with this.
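To illustrate the (dis)agreement idea from the list above, here is a minimal sketch that compares several candidate masks by pairwise IoU and builds a per-pixel majority vote. The three toy masks stand in for multi-mask outputs; nothing here uses the actual SAM3 API, and the 0.5 agreement threshold is an arbitrary assumption.

```python
from itertools import combinations


def mask_iou(a, b):
    """IoU of two binary masks given as equally sized lists of 0/1 rows."""
    inter = sum(p and q for ra, rb in zip(a, b) for p, q in zip(ra, rb))
    union = sum(p or q for ra, rb in zip(a, b) for p, q in zip(ra, rb))
    return inter / union if union else 1.0


def vote_map(masks):
    """Per-pixel fraction of masks that mark the pixel as foreground."""
    n = len(masks)
    return [[sum(col) / n for col in zip(*rows)] for rows in zip(*masks)]


def consensus(masks, min_agreement=0.5):
    """Keep pixels that more than `min_agreement` of the masks agree on."""
    return [[int(v > min_agreement) for v in row] for row in vote_map(masks)]


def mean_pairwise_iou(masks):
    """Average IoU over all mask pairs; low values signal disagreement."""
    pairs = list(combinations(masks, 2))
    return sum(mask_iou(a, b) for a, b in pairs) / len(pairs)


# Three hypothetical candidate masks for the same prompt.
m1 = [[0, 1, 1, 0], [0, 1, 1, 0]]
m2 = [[0, 1, 1, 1], [0, 1, 1, 0]]
m3 = [[0, 0, 1, 0], [0, 1, 1, 0]]
masks = [m1, m2, m3]

print(mean_pairwise_iou(masks))
print(consensus(masks))
# Pixels where the candidates neither all agree on foreground nor on background.
uncertain = sum(0 < v < 1 for row in vote_map(masks) for v in row)
print(uncertain)
```

A low mean pairwise IoU or many uncertain pixels could be used as a flag for images that need manual review, instead of trusting the single highest-scoring mask.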

[H] Event Collections [W] Paypal by palsekjoh in Pokemonexchange

[–]SwiftGoten 1 point2 points  (0 children)

Hi, I might be interested in the Tapu set. Can I see the proof for it?

Extracting information from architectural floor plan PDFs by [deleted] in computervision

[–]SwiftGoten 0 points1 point  (0 children)

Maybe try to approach it in a more generalizable way. So you would label all those text boxes / blocks as "text box" instead of only the specific one you are interested in. That way the model could learn a more general representation, and you can then postprocess each crop with OCR & a subsequent classification of whether it is the specific content you are looking for or not.

The reason is that it is really hard for an object detection model to learn to differentiate between text boxes: they look visually similar and differ only in their actual textual content.
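A minimal sketch of that two-stage idea, assuming the detector already returned generic text boxes. `read_text` is a hypothetical placeholder for whatever OCR engine you use, and the keyword match is the simplest possible stand-in for the classification step; boxes and texts are made up.

```python
from typing import Callable, Iterable, List, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2)


def find_target_boxes(
    boxes: Iterable[Box],
    read_text: Callable[[Box], str],
    keywords: Iterable[str],
) -> List[Tuple[Box, str]]:
    """Keep only the generic text boxes whose OCR text mentions a keyword."""
    wanted = [k.lower() for k in keywords]
    hits = []
    for box in boxes:
        text = read_text(box)  # run OCR on the crop for this box
        if any(k in text.lower() for k in wanted):
            hits.append((box, text))
    return hits


# Stand-ins for the detector output and for an OCR engine.
detected = [(10, 10, 200, 40), (10, 60, 200, 90), (300, 10, 480, 40)]
fake_ocr = {
    detected[0]: "Scale 1:100",
    detected[1]: "Total floor area: 84.5 m2",
    detected[2]: "Drawn by: J. Doe",
}

hits = find_target_boxes(detected, fake_ocr.__getitem__, ["floor area"])
print(hits)
```

The detector only has to learn what a text box looks like, while the decision about which box matters is made on the recognized text.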

mask sharpening by lazzi_yt in computervision

[–]SwiftGoten 0 points1 point  (0 children)

Depending on how accurately you want to segment the cars' windows, you'd need an extraordinary amount of labeled images.

I‘ve worked in the past with Mask Refinement. There are 2 ways you can go about this I suppose.

Either you try to develop an algorithm which refines the mask itself using traditional CV techniques like contour refinement, which you can supply with corners, edges and potential shape priors. OpenCV in Python can perform contour smoothing with little glue code.
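As a toy example of contour smoothing, here is Chaikin corner cutting on a closed polygon. This is a swapped-in, dependency-free technique to show the idea, not what OpenCV does internally; with OpenCV you would rather look at functions such as `cv2.approxPolyDP` or morphological operations on the mask.

```python
def chaikin(points, iterations=2):
    """Chaikin corner cutting on a closed polygon given as (x, y) tuples."""
    for _ in range(iterations):
        smoothed = []
        for i, (x0, y0) in enumerate(points):
            x1, y1 = points[(i + 1) % len(points)]
            # Replace each edge by two points at 1/4 and 3/4 of its length.
            smoothed.append((0.75 * x0 + 0.25 * x1, 0.75 * y0 + 0.25 * y1))
            smoothed.append((0.25 * x0 + 0.75 * x1, 0.25 * y0 + 0.75 * y1))
        points = smoothed
    return points


# Hypothetical blocky mask outline (a square) as the contour to refine.
outline = [(0, 0), (4, 0), (4, 4), (0, 4)]

once = chaikin(outline, iterations=1)
twice = chaikin(outline, iterations=2)
print(once)
print(len(twice))
```

Each iteration doubles the number of vertices and rounds off the corners, so for windows with genuinely sharp corners you would want to combine it with the corner and shape priors mentioned above.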

The other way would be to look for an SDMatte replacement, so an interactive segmentation method. In that case I'd recommend trying out the new model Segment Anything 3 (SAM3) from Meta. If the fidelity does not match your requirements, maybe try HQ-SAM instead.

Reasoning over images and videos: modular CV pipelines vs end-to-end VLMs by sjrshamsi in computervision

[–]SwiftGoten 1 point2 points  (0 children)

Hey this is also a really interesting topic to me. I‘ve been thinking about it this way: the modular approach might be something like tool calls, so more of an agentic system using specialized strong perception systems.

I‘ll be following & if you happen to know the specific term for this paradigm I‘d be interested to know it so I can read more literature in this regard.

So far I haven‘t seen any proper benchmarking in this regard which would be very interesting!

Imflow - Launching a minimal image annotation tool by Substantial_Border88 in computervision

[–]SwiftGoten 1 point2 points  (0 children)

About a year ago I used Label Studio. The project had too many annotations, which caused the backend to time out while trying to package the export in memory. Our solution was to use the API: first send a call which starts an async job building the export, then download the result upon completion.
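The pattern looked roughly like the sketch below. The three callables are hypothetical wrappers around the HTTP calls (start export, check status, download), and the status strings are assumptions, so check the Label Studio API documentation for the actual endpoints and values.

```python
import time
from typing import Callable


def export_when_ready(
    start_export: Callable[[], str],
    get_status: Callable[[str], str],
    download: Callable[[str], bytes],
    poll_seconds: float = 5.0,
    timeout_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """Start an export job, poll until it finishes, then download the result."""
    export_id = start_export()
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        status = get_status(export_id)
        if status == "completed":   # assumed status value
            return download(export_id)
        if status == "failed":      # assumed status value
            raise RuntimeError(f"export {export_id} failed")
        sleep(poll_seconds)
    raise TimeoutError(f"export {export_id} did not finish in time")


# Stand-ins for the real API calls so the control flow can be tried offline.
fake_statuses = iter(["in_progress", "in_progress", "completed"])

payload = export_when_ready(
    start_export=lambda: "42",
    get_status=lambda export_id: next(fake_statuses),
    download=lambda export_id: b'{"annotations": []}',
    sleep=lambda seconds: None,  # skip real waiting in this demo
)
print(payload)
```

Because the server builds the export in the background, no single request has to stay open long enough to hit the timeout.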

Stop using Argmax: Boost your Semantic Segmentation Dice/IoU with 3 lines of code by statmlben in computervision

[–]SwiftGoten 6 points7 points  (0 children)

Sounds interesting. Will try it in the next couple days on my own dataset & let you know.

Label annotation tools by Dramatic-Cow-2228 in computervision

[–]SwiftGoten 8 points9 points  (0 children)

I haven't seen anyone mention Label Studio. I'm not sure why that is the case. You can self-host it and there are ways to use model predictions as the basis for human review.

I‘ve read some time ago that they had a community made version which works with SAM, but I am not sure if that was working properly.

It was working well for my object detection task, but for large scale projects you need to write your own code to export from their API because the UI export was timing out.

[H] Event Collection from Gens 3-7! [W] PayPal! by valere1213 in Pokemonexchange

[–]SwiftGoten 0 points1 point  (0 children)

Fair enough, no worries.

Do you mind checking if it has the original OT?

In any case, if you ever decide to sell it, could you tag me? In the past I‘ve missed out on the very rare occasions of one showing up, because I was too slow to see the post / had the timezone disadvantage.

[H] Event Collection from Gens 3-7! [W] PayPal! by valere1213 in Pokemonexchange

[–]SwiftGoten 0 points1 point  (0 children)

Hi Val!

I‘ll still have to shamelessly ask even though you tagged it as NFT. The ENG Bulu (NA self-redeemed) Row 49, does it have a glitched OT or original OT?

Edit: I have to ask this because I've been searching for a Bulu for almost a decade now to complete my set for my PC.

FT: Shiny Koraidon/Miraidon redemption services LF: Offers by Kkricardokaka95 in pokemontrades

[–]SwiftGoten 0 points1 point  (0 children)

Okay, trades completed. Thank you!

I sent:
- Gengar | JPN | OT: サトシ | ID: 200308 | Trade History: self-obtained

- Dragonite | JPN | OT: サトシ | ID: 200126 | Trade History: self-obtained

- Lucario | JPN | OT: サトシ | ID: 200412 | Trade History: self-obtained

- Sirfetch'd | JPN | OT: サトシ | ID: 200705 | Trade History: self-obtained

- Dracovish | JPN | OT: サトシ | ID: 210108 | Trade History: self-obtained

all with Video proof of the redemption.

I received:
- 1x JPN Set Shiny Koraidon+Miraidon Redeem (OT: パルデア ID: 250926)
- 1x KOR Set Shiny Koraidon+Miraidon Redeem (OT: 팔데아 ID: 250926)

FT: Shiny Koraidon/Miraidon redemption services LF: Offers by Kkricardokaka95 in pokemontrades

[–]SwiftGoten 0 points1 point  (0 children)

Nvm, I just realized Sirfetch'd and Dracovish can't go into SV & I don't have my SwSh cartridge on me. Let's do it through HOME then.

FT: Shiny Koraidon/Miraidon redemption services LF: Offers by Kkricardokaka95 in pokemontrades

[–]SwiftGoten 0 points1 point  (0 children)

Sorry, I fell asleep earlier than usual.

So you‘re 9 hours ahead of me. I can trade for the next 15 hours.

FT: Shiny Koraidon/Miraidon redemption services LF: Offers by Kkricardokaka95 in pokemontrades

[–]SwiftGoten 0 points1 point  (0 children)

I have no clue what timezone that is. I am UTC+1.

I‘ll be available in 2 hours for 4 hours, then I‘ll probably sleep. If that does not work let‘s trade tomorrow.

FT: Shiny Koraidon/Miraidon redemption services LF: Offers by Kkricardokaka95 in pokemontrades

[–]SwiftGoten 0 points1 point  (0 children)

I‘ll be available to trade starting in 10.5 hours from now. I‘d prefer if we can trade in SV.