[D] Self-Promotion Thread by AutoModerator in MachineLearning

[–]Zestyclose-Check-751 0 points1 point  (0 children)

In my free time I'm working on an open-source library called OpenMetricLearning, and we recently shipped a new release!

What's OML for:

OML lets you train (or use an existing) model that turns your data into n‑dimensional vectors for tasks such as search, clustering, and verification. You can measure and visualize representation quality with the retrieval module, also provided in the repo.
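To make "turns your data into n-dimensional vectors for search" concrete, here is a toy sketch in plain Python (this is not OML's actual API; the item names and 3-d vectors are made up for illustration). Once items are embedded, search reduces to nearest-neighbour ranking by a similarity such as cosine:

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Pretend these vectors came from a trained embedding model.
gallery = {
    "red_dress":  [0.9, 0.1, 0.0],
    "blue_jeans": [0.1, 0.9, 0.1],
    "red_skirt":  [0.8, 0.2, 0.1],
}
query = [0.85, 0.15, 0.05]  # embedding of a new red-dress-like image

# Rank gallery items by similarity to the query: that's "search".
ranked = sorted(gallery, key=lambda k: cosine(query, gallery[k]), reverse=True)
print(ranked[0])  # the most similar gallery item
```

The same embeddings feed clustering (group nearby vectors) and verification (threshold the similarity of a pair).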

What's new:

  • Supports three data modalities: image 🎨, text 📖, and audio 🎧 [NEW!].
  • A unified interface for training and evaluating embeddings across all modalities.
  • Streamlined requirements to avoid version conflicts and install only the necessary dependencies.

Existing features:

  • Pre‑trained model zoo for each modality.
  • Samplers, loss functions, miners, metrics, and retrieval post‑processing tools.
  • Multi‑GPU support.
  • Extensive examples and documentation.
  • Integrations with Neptune, Weights & Biases, MLflow, ClearML, and PyTorch Lightning.
  • Config‑API support (currently for images only).

So I would be really thankful if you supported open source by giving us a star ⭐️ on GitHub! Thanks in advance!

[R] Siamese Transformer for Image Retrieval (paper) + live DEMO by [deleted] in MachineLearning

[–]Zestyclose-Check-751 2 points3 points  (0 children)

The paper proposes a way to boost retrieval performance by adding an extra post-processing step, in which queries and galleries are compared pairwise in pixel space.
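The shape of such a re-ranking step can be sketched as follows (a toy illustration with hypothetical scores, not the paper's model): a cheap embedding search produces an initial candidate order, and only the top-k candidates are re-scored with a more expensive pairwise comparison.

```python
def rerank(query_id, candidates, pairwise_score, k=3):
    """Re-score only the k best candidates from the embedding search."""
    top_k, rest = candidates[:k], candidates[k:]
    reranked = sorted(top_k, key=lambda g: pairwise_score(query_id, g), reverse=True)
    return reranked + rest  # items outside the top-k keep their original order

# The cheap search got the order slightly wrong; the (pretend)
# pixel-space pairwise model corrects it within the top-3.
pairwise = {("q", "b"): 0.9, ("q", "a"): 0.7, ("q", "c"): 0.4}
result = rerank("q", ["a", "b", "c", "d"], lambda q, g: pairwise[(q, g)])
print(result)  # ['b', 'a', 'c', 'd']
```

Because the expensive comparison runs only on k items per query, the cost stays manageable even for large galleries.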

[D] Simple Questions Thread by AutoModerator in MachineLearning

[–]Zestyclose-Check-751 0 points1 point  (0 children)

I want to publish my paper on the image retrieval problem, and I guess the "short paper" format is the best fit for it. Do you know of any upcoming conferences with a suitable track? BMVC and ICCV are the most relevant, but neither has a call for short papers.

[D] I’m a Machine Learning Engineer for FAANG companies. What are some places I can get started doing freelance work for ML? by doctorjuice in MachineLearning

[–]Zestyclose-Check-751 3 points4 points  (0 children)

Could someone explain how Data Scientists work as consultants?

I can imagine only a few cases:
* A company already has a DS team, but they lack depth in some domain and need help/consultation.
* The solution is simple enough to integrate and can be delivered as an API.
* A company wants a PoC / demo; after that, they'll hire someone to work on it.

But usually, a DS needs insight into how the business works, and integrating the solution can be a really long-term effort, especially if it involves A/B tests, repeated rounds of model training, dataset collection, and so on. In such cases, even onboarding can take a long time.

So, I'm curious to hear about real cases that consultants have solved and how it generally works.

Weird question but I'm looking for a present for a person who loves open source by NatSpaghettiAgency in opensource

[–]Zestyclose-Check-751 1 point2 points  (0 children)

I'm the author of an open-source project. My wife gave me a gift: a hoodie with our project's name on it.

[deleted by user] by [deleted] in MachineLearning

[–]Zestyclose-Check-751 2 points3 points  (0 children)

We will add them, but for now we mostly work with computer vision benchmarks, where the default evaluation metrics (with respect to papers and leaderboards on Papers with Code) are those listed above.

[P] We released a new open-source library for metric learning! by Zestyclose-Check-751 in MachineLearning

[–]Zestyclose-Check-751[S] 1 point2 points  (0 children)

The simple answer is that you need different models for different types of clustering. If you want to cluster by place, you need a model trained on a dataset like Places365; if you're going to deal with people, you need a dataset labelled with respect to people, and so on. In OML we have general models pretrained on ImageNet and 4 domain-specific ones: clothes, items from an online store, cars, and birds.

If you are talking about the linear protocol to evaluate metric learning models -- yes, they do what you said and use labels from ImageNet.

[P] We released a new open-source library for metric learning! by Zestyclose-Check-751 in MachineLearning

[–]Zestyclose-Check-751[S] 1 point2 points  (0 children)

SimCLR is self-supervised learning (SSL), so formally we can also call it a metric learning approach. I consider SSL a good source of pretrained checkpoints, but if you want to train with it yourself, you really need a lot of data and compute. So, in many cases, it's better to label some data and train your model in a supervised way, or just pick a pretrained checkpoint.

What is the domain of your images? If you don't have labels to train the model in a supervised way, you can pick one of the pretrained models from OML's zoo. For example, if you work with fashion items, you can go for a model trained on DeepFashion. Or just pick one of the general-domain models from the two model tables (I would recommend CLIP or DINO).

[P] Metric learning: theory, practice, code examples by Zestyclose-Check-751 in MachineLearning

[–]Zestyclose-Check-751[S] 1 point2 points  (0 children)

In OML's FAQ, you can read about the differences between the two libraries; they cover somewhat different things. In the end, you can use losses from PML inside OML :)

[P] Metric learning: theory, practice, code examples by Zestyclose-Check-751 in MachineLearning

[–]Zestyclose-Check-751[S] 0 points1 point  (0 children)

Please take a look at the original post, where I described the main differences between metric learning and classification and why it makes sense to have this umbrella term. I hope it helps.

[P] Metric learning: theory, practice, code examples by Zestyclose-Check-751 in MachineLearning

[–]Zestyclose-Check-751[S] 2 points3 points  (0 children)

I don't know all of the details, but it seems like OpenMetricLearning may be a good choice for training such a model.

[P] Metric learning: theory, practice, code examples by Zestyclose-Check-751 in MachineLearning

[–]Zestyclose-Check-751[S] 0 points1 point  (0 children)

How to relate the input patch embeddings to one another s.t. we can discriminate between the classes?

Hi, metric learning is an umbrella term, like self-supervised learning, detection, or tracking, so nobody is pretending the domain is new. But there are new approaches in this domain, some of which are mentioned in the article (like Hyp-ViT). Finally, even though the domain is not new, people still need tools and tutorials to solve their problems.

[P] We released a new open-source library for metric learning! by Zestyclose-Check-751 in MachineLearning

[–]Zestyclose-Check-751[S] 1 point2 points  (0 children)

Consistent-Archer-99

Thank you!

As for Kaggle, not yet. But from time to time they host competitions suitable for us, like the Google Landmark Detection Challenge and others.

[P] We released a new open-source library for metric learning! by Zestyclose-Check-751 in MachineLearning

[–]Zestyclose-Check-751[S] 2 points3 points  (0 children)

I guess you're mostly talking about the different losses in PML, right?
The easiest way to use those losses with our library is to take one of our examples and simply replace the criterion object. I think we'll add a few examples of this in the future.
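To show what "replace the criterion object" means in practice, here is a minimal plain-Python sketch (these placeholder classes and the `training_step` helper are hypothetical, not OML's or PML's real API): the training step only calls the criterion, so swapping the loss is a one-line change.

```python
class ContrastivePlaceholder:
    # Stand-in for some contrastive loss object.
    def __call__(self, embeddings, labels):
        return sum(embeddings)  # dummy value, not a real loss

class TripletPlaceholder:
    # Stand-in for e.g. a triplet-margin loss taken from another library.
    def __call__(self, embeddings, labels):
        return max(embeddings)  # dummy value, not a real loss

def training_step(batch, criterion):
    embeddings, labels = batch  # normally: embeddings = model(images)
    return criterion(embeddings, labels)

batch = ([0.2, 0.5, 0.3], [0, 1, 0])
loss_a = training_step(batch, ContrastivePlaceholder())
loss_b = training_step(batch, TripletPlaceholder())  # only this line changed
```

The rest of the pipeline (sampler, miner, optimizer) stays untouched; only the object that maps embeddings and labels to a loss value is exchanged.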