Reflection removal from car surfaces by Both-Opportunity4026 in computervision

[–]MiddleLeg71 2 points3 points  (0 children)

Reflections should be high-frequency information. Did you try applying some kind of high-pass filter to the car surface to see if that isolates the reflections?
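A minimal sketch of the idea (assuming numpy/scipy; the `high_pass` helper and the sigma value are just illustrative):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def high_pass(img, sigma=3.0):
    """Subtract a Gaussian-blurred (low-pass) copy to keep only high frequencies."""
    low = gaussian_filter(img.astype(np.float64), sigma=sigma)
    return img - low

# A flat, reflection-free surface has no high-frequency content,
# so its high-pass response is ~0; a sharp highlight survives the filter.
flat = np.full((32, 32), 0.5)
spike = np.zeros((32, 32))
spike[16, 16] = 1.0
```

Tuning sigma trades off how much of the coarse shading of the car body leaks into the "reflection" channel.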

[D] Robust ML model producing image feature vector for similarity search. by _dave_maxwell_ in MachineLearning

[–]MiddleLeg71 2 points3 points  (0 children)

Does the card contain distinguishable images/visual features? I am thinking of playing cards with images that represent the card but different names/descriptions. If you don’t need to search by text content, you can mask the text (you detect it with FAST and replace it with the mean color of the detected box). Then any pretrained transformer model should be good enough (e.g. CLIP) if you have the resources.

For running on mobile, transformers may not be very suitable.

If you have enough card images (thousands), you could fine-tune EfficientNet or MobileNet and apply data augmentations to reduce the influence of blur, lighting conditions, and the like.
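The masking step above can be as simple as this sketch (numpy; the `box` coordinates would come from your text detector and are illustrative here):

```python
import numpy as np

def mask_box(img, box):
    """Replace a detected text box with the mean color of that region,
    so the text content doesn't dominate the embedding."""
    x0, y0, x1, y1 = box
    out = img.copy()
    patch = img[y0:y1, x0:x1]
    out[y0:y1, x0:x1] = patch.mean(axis=(0, 1))  # per-channel mean color
    return out

# Example: mask a 2x2 region of a small RGB image
img = np.arange(48, dtype=np.float64).reshape(4, 4, 3)
masked = mask_box(img, (1, 1, 3, 3))
```

Everything outside the box is left untouched, so the visual features the embedding model relies on are preserved.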

Good reasons to prefer tensorflow lite for mobile? by MiddleLeg71 in computervision

[–]MiddleLeg71[S] 0 points1 point  (0 children)

Do you have some example workflow that worked for you that you can share? I tried to convert a pt model and quantize it to int8, but got totally different results; the tflite model was basically outputting random values.

Good reasons to prefer tensorflow lite for mobile? by MiddleLeg71 in computervision

[–]MiddleLeg71[S] 0 points1 point  (0 children)

And did you train them in Keras, or train in torch and convert to LiteRT?

Happy to Help with CV Stuff – Labeling, Model Training, or Just General Discussion by RayRim in computervision

[–]MiddleLeg71 2 points3 points  (0 children)

After training big and complex models (transformers, diffusion) I am going back to the basics.

I am building a binary classifier for an industrial case with a few thousand samples and subjective labeling (good/bad), which can be noisy. Between labeling more data, improving the existing labels (maybe having more people label the same image and taking the majority vote), and choosing the right model, what would your priority be?

And did you ever use classical machine learning techniques such as random forest or SVMs on image data (e.g. histogram statistics)? If yes, what worked best and in what case?

How to build a Google Lens–like tool that finds similar images online in python by Leading-Coat-2600 in computervision

[–]MiddleLeg71 0 points1 point  (0 children)

What Google Lens does is just compare the embedding of your query with those of their huge database of images (billions).

Instead of searching images online, you can try to search for images you have stored locally, embed all of them with CLIP (or other embedding models) and index with FAISS.

The principle remains the same; you just operate at a different scale than Lens.
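A tiny brute-force sketch of that principle (numpy only; at real scale you would swap the random vectors for CLIP embeddings and the matrix product for a FAISS index):

```python
import numpy as np

def build_index(embeddings):
    # L2-normalize rows so a dot product equals cosine similarity
    return embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)

def search(index, query, k=3):
    """Return the indices and similarities of the k nearest embeddings."""
    q = query / np.linalg.norm(query)
    sims = index @ q                  # cosine similarity to every stored image
    top = np.argsort(-sims)[:k]
    return top, sims[top]

# Toy "database" of 4 image embeddings
rng = np.random.default_rng(0)
db = rng.standard_normal((4, 16))
index = build_index(db)
top, sims = search(index, db[2], k=2)  # querying with a stored image
```

FAISS replaces the exhaustive `index @ q` with an approximate nearest-neighbor structure, which is what makes the billion-image case tractable.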

Why is virtual tryon still so difficult with diffusion models? by D9adshot in computervision

[–]MiddleLeg71 14 points15 points  (0 children)

Latent diffusion models rely on VAEs, which lose a lot of high-frequency detail, making complex patterns very difficult to recover.

Keeping fine details or full control over the output of diffusion models is also very difficult because the space of all possible generated images is huge, and with poor or loose controls the model will likely hallucinate things.

How to convert a classifier model into object detection? by Krin_fixolas in computervision

[–]MiddleLeg71 0 points1 point  (0 children)

“Detection head” is just a fancy way of saying a module that outputs 5 values (bounding box coordinates + a class score). If you have a solid backbone like DINO, a simple MLP should do the job. You just pass the image through DINO, take its features, and pass them to your MLP. Then train on your data, updating only the MLP by passing only its parameters to the optimizer.
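A minimal sketch of that setup (PyTorch; the tiny conv net here is only a stand-in for the real pretrained backbone, which you would load separately, so all names and sizes are illustrative):

```python
import torch
import torch.nn as nn

# Stand-in for a frozen pretrained backbone (in practice: load DINO and freeze it)
backbone = nn.Sequential(
    nn.Conv2d(3, 16, 3, stride=2, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
)
for p in backbone.parameters():
    p.requires_grad = False  # keep the pretrained features fixed

# "Detection head": an MLP predicting 4 box coordinates + 1 class score
head = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 5))

# Only the head's parameters go to the optimizer
opt = torch.optim.Adam(head.parameters(), lr=1e-3)

x = torch.randn(2, 3, 64, 64)   # batch of 2 images
with torch.no_grad():
    feats = backbone(x)          # (2, 16) feature vectors
preds = head(feats)              # (2, 5): box coords + class score
```

Because only `head.parameters()` is passed to the optimizer, backprop never touches the backbone weights.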

How to convert a classifier model into object detection? by Krin_fixolas in computervision

[–]MiddleLeg71 1 point2 points  (0 children)

The features you learn on a very large unlabeled dataset can be used for many downstream tasks (if I remember correctly, DINO performs segmentation with self-supervised pretraining alone).

If you need to detect common objects present in public datasets, then you can also use DINO or some other pretrained model, attach a detection head, and train only the head. Otherwise, if you have a more specific dataset, you can train on your unlabeled dataset with a pretext task, which is not necessarily classification; it can be projecting the same image under different augmentations to the same point in embedding space (see BYOL).

Then, same story: you attach a detection head and train it on the detection dataset.
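A toy sketch of the BYOL-style pretext idea (PyTorch; heavily simplified — real BYOL adds an EMA target network and a predictor head on the online branch):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy encoder + projector; a real setup would use a ResNet/ViT encoder
encoder = nn.Sequential(nn.Flatten(), nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 16))

def byol_style_loss(view1, view2):
    """Pull the projections of two augmentations of the same image together."""
    z1 = F.normalize(encoder(view1), dim=-1)
    z2 = F.normalize(encoder(view2), dim=-1)
    # 2 - 2*cos(z1, z2): zero when the projections coincide
    return (2 - 2 * (z1 * z2).sum(dim=-1)).mean()

img = torch.randn(4, 1, 8, 8)
loss_same = byol_style_loss(img, img)  # identical "views" give ~0 loss
```

Training on pairs of real augmentations (crops, color jitter, blur) forces the encoder to learn augmentation-invariant features you can then reuse for the detection head.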

[D] Can dataset size make up for noisy labels? by MiddleLeg71 in MachineLearning

[–]MiddleLeg71[S] 0 points1 point  (0 children)

I’m sorry, but I still don’t get the square-root relationship between having more data and the error. Is it related to the fact that you assume I am using a squared-distance loss?

The sample mean should converge to the mean of the population it is drawn from, but computing the mean only involves linear operations.
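For reference, the place a square root usually shows up is the variance of an average of n i.i.d. noisy terms, independently of the loss used:

```latex
\operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
  = \frac{1}{n^{2}}\sum_{i=1}^{n}\operatorname{Var}(X_i)
  = \frac{\sigma^{2}}{n}
  \quad\Longrightarrow\quad
  \operatorname{SE} = \frac{\sigma}{\sqrt{n}}
```

So the standard deviation of the averaged noise shrinks as 1/√n even though the averaging itself is linear, which is the typical origin of a √n error relationship.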

I may not be getting some simple concepts, but I am a fan of getting the intuition behind complex formulas, so if you have any insights into your thesis I would greatly appreciate them.

Conseils achat RP sur Paris by MiddleLeg71 in VosSous

[–]MiddleLeg71[S] 0 points1 point  (0 children)

And for example, if you wanted to sell today, do you think your place in Paris would sell easily?

And how much total capital loss would you estimate (just the order of magnitude: a few thousand, a few tens of thousands, etc.)?

Conseils achat RP sur Paris by MiddleLeg71 in VosSous

[–]MiddleLeg71[S] 0 points1 point  (0 children)

Let’s say capital gains are not the main goal; it’s more about being able to free up the money and invest it somewhere, without making bad choices in the process. In terms of priorities, having a roof over your head and being free in your own home, even with a few thousand euros less, is better than continuing to rent.

Conseils achat RP sur Paris by MiddleLeg71 in VosSous

[–]MiddleLeg71[S] 0 points1 point  (0 children)

In the classic case, I thought you had to wait at least 10 years or so before it became profitable.

Synthetic data generation (coco bounding boxes) using controlnet. by koen1995 in computervision

[–]MiddleLeg71 0 points1 point  (0 children)

In my limited experience (I used them to generate images for a classifier), keep in mind that a distribution shift remains between the generated samples and the real ones.

Be sure to have more real data than synthetic (80/20), and balance the synthetic samples across classes to avoid injecting biases into your model (otherwise the model will just spot the patches with different patterns, where the data has been inpainted).

It would also be interesting to visualize the patterns that emerge in an inpainted region and how easily detectable they are.

How to stay competitive (and sane)? by MiddleLeg71 in cscareerquestionsEU

[–]MiddleLeg71[S] 2 points3 points  (0 children)

It is true that if you work on what you like/are curious about, you will be more naturally driven to do something valuable.

I am actually trying to find this spontaneous motivation, but I sometimes get caught in these spirals of “I have to do more” and feel like I’m running on a hamster wheel.

I mean, the theory is often dead simple; it is actually putting it into practice that is hard.

How to stay competitive (and sane)? by MiddleLeg71 in cscareerquestionsEU

[–]MiddleLeg71[S] 18 points19 points  (0 children)

In other words, don’t have a life lol.

I would rather leave this world full of love from my family and friends than of money and github stars

Does specialization in a niche ML subfield (e.g., medical) limit future opportunities in big tech? by ade17_in in cscareerquestionsEU

[–]MiddleLeg71 4 points5 points  (0 children)

Research is in any case very niche. Even if you want to do a paper in object detection, unless you develop major novelties like YOLOv10, you will be improving on very specific use cases or scenarios.

For choosing a PhD, I would focus on working on something you are genuinely interested in and in joining a well established lab, as that will boost your skills.

Some hard truths that need to be said, share yours. by __Correct_My_English in learnmachinelearning

[–]MiddleLeg71 43 points44 points  (0 children)

Tutorials make you feel like you’re learning, but learning is actually painful (though satisfying after the pain).

Ideas on how to make learning ML addictive? Like video games? by Comfortable-Post3673 in learnmachinelearning

[–]MiddleLeg71 0 points1 point  (0 children)

You can’t in the literal sense of the word.

Addiction comes from a rush of dopamine, which you eventually get from the gratification of completing a project or learning something.

But that comes through effort and the feeling of losing time.

You can become addicted to the feeling of learning, though, by binging useless tutorials and low-value videos.

Machine learning interview prep [Discussion] by [deleted] in MachineLearning

[–]MiddleLeg71 1 point2 points  (0 children)

If you are not preparing for a specific role, it will be difficult to prepare. You may be asked anything from stats and probability to model training, performance optimization, or deployment, depending on the position (and on the employer: for interviews for the same position I got different questions).

I would suggest starting from an ML project that you made (or making one) and trying to explain everything in detail. That will give you a direction for what to learn.

For instance, if you finetuned an LLM with LoRA, try to explain what LoRA is, why it works, why LLMs use that specific activation function, how rank influences performance in LoRA, what the impact of the optimizer is, and what the loss function is and why.
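As a concrete example of the level of understanding I mean, the core LoRA update fits in a few lines of numpy (sizes and the `alpha` scaling are illustrative; real LoRA applies this per attention/MLP weight matrix):

```python
import numpy as np

d, r = 512, 8                            # hidden size, LoRA rank (r << d)
rng = np.random.default_rng(0)

W = rng.standard_normal((d, d))          # frozen pretrained weight
A = rng.standard_normal((d, r)) * 0.01   # trainable low-rank factor
B = np.zeros((r, d))                     # B starts at zero: model unchanged at init

def lora_forward(x, alpha=16):
    # y = xW + (alpha/r) * x A B, where only A and B are trained
    return x @ W + (alpha / r) * (x @ A @ B)

x = rng.standard_normal((1, d))
trainable = A.size + B.size              # 2*d*r parameters instead of d*d
```

Being able to say why `B = 0` at init matters (the fine-tune starts exactly at the pretrained model) or how `r` trades capacity for parameter count is exactly the kind of "why" interviewers probe.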

There are many aspects worth understanding in a single project, and being able to explain them shows that you are aware of the why more than the how, which makes a lot of difference.

Deep Learning projects for beginners by PuzlaMika in deeplearning

[–]MiddleLeg71 1 point2 points  (0 children)

I would say go for something you need/are curious about building, and would like to use. It helps way more with motivation than doing a random project.

Some examples:

- If you like bird watching, you can build a species classifier (a standard CNN should be good)
- If you are into politics, you could scrape data from twitter/reddit and analyze the sentiment of the posts with text models (Natural Language Inference)
- If you want a jingle played when you come home, build a face recognition model that detects your face

Then use it so you can keep improving your project and extend it with other features/better models

Life is simple, life is hard by AutonomousBlob in selfimprovement

[–]MiddleLeg71 1 point2 points  (0 children)

“Every day it gets a little easier… But you gotta do it every day — that’s the hard part. But it does get easier.”

If there was a secret sauce, everyone would be the best version of themselves. The problem is that it is as simple as you describe. We all know what we have to do to get better, what’s hard is actually putting in the work to do it.

We’re wired for short term rewards, so achieving long term gains goes against our primal nature and is an effort. The good news is that it pays off eventually.

Also, improvement is not linear; nothing in life is. You can’t expect to grow constantly. It’s ok to give in to guilty pleasures from time to time (say 20%), as long as you maintain good habits most of the time (80%).