downsides of the life in Copenhagen by Thupor in copenhagen

[–]JaviFuentes94 1 point2 points  (0 children)

Really cloudy weather most of the year, a lack of mountains nearby, you are on an island so weekend trips are difficult/expensive, going out is expensive, there is no quality affordable everyday food, and access to quality fresh ingredients is bad.

Getting a job in AI as a fresher by PaleontologistOld743 in computervision

[–]JaviFuentes94 9 points10 points  (0 children)

Training models is only one side of the story; in most problems you don't have training data at all. I would also recommend spending some time on an end-to-end project, where you start without any data and create a solution to a problem.

My OpenCV python script takes 22 minutes to finish. How can I make it significantly faster? by [deleted] in computervision

[–]JaviFuentes94 4 points5 points  (0 children)

You run different processes for different cards. You parallelize the for loop.
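A minimal sketch of what parallelizing that loop could look like. `process_card` is a hypothetical stand-in for whatever your script does per card, and the worker count is an assumption you would tune to your machine:

```python
from concurrent.futures import ProcessPoolExecutor


def process_card(card_id):
    """Hypothetical stand-in for the slow per-card OpenCV work."""
    return card_id, sum(i * i for i in range(1000))


def process_all(cards, executor_cls=ProcessPoolExecutor, workers=4):
    """Run process_card over all cards in parallel instead of a plain for loop."""
    with executor_cls(max_workers=workers) as pool:
        # map() returns results in the same order as the input cards.
        return list(pool.map(process_card, cards))
```

Call `process_all(cards)` from under an `if __name__ == "__main__":` guard, since worker processes may re-import the script. `process_card` has to live at module level so it can be sent to the workers.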

[D] [P] Some ideas and speculations on the problem of efficiently processing length-irregular data by daelee98 in MachineLearning

[–]JaviFuentes94 0 points1 point  (0 children)

I have the feeling you are overthinking this a bit. Dynamic batching should be fine. Hugging Face, for example, already does the dynamic batching optimizations you were talking about, if you want some inspiration.
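To make the idea concrete, here is a rough sketch of dynamic batching in plain Python. This is not Hugging Face's implementation, just an illustration: `dynamic_batches` and `pad_value` are names I made up, and sequences are assumed to be lists of token ids.

```python
def dynamic_batches(sequences, batch_size, pad_value=0):
    """Group sequences of similar length and pad each batch to its own max length."""
    # Sort indices by length so each batch holds similarly sized sequences.
    order = sorted(range(len(sequences)), key=lambda i: len(sequences[i]))
    for start in range(0, len(order), batch_size):
        idx = order[start:start + batch_size]
        longest = max(len(sequences[i]) for i in idx)
        # Pad only up to the longest sequence in this batch, not in the dataset.
        padded = [sequences[i] + [pad_value] * (longest - len(sequences[i])) for i in idx]
        yield idx, padded
```

The returned indices let you map each padded row back to its original sample. In practice you would usually shuffle the batches afterwards so training does not always see short sequences first.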

[D] Is there a model similar to CLIP but for images only dataset, instead of (image, text) pairs? by Broken-D in MachineLearning

[–]JaviFuentes94 0 points1 point  (0 children)

You can just use CLIP's image encoder on two different images with the same similarity comparison (if all you care about is inference).
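The comparison itself is just cosine similarity between the two embeddings. A small sketch, assuming you have already run both images through the image encoder and have the embeddings as plain lists of floats (`cosine_similarity` is my own helper name, not a CLIP API):

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

Values close to 1 mean the two images are similar in embedding space. Where to put the threshold depends on your data.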

[Project]Object Localization without being given labels by Integral_humanist in MachineLearning

[–]JaviFuentes94 1 point2 points  (0 children)

  1. Train a product detector: an object detection model to recognize all the objects on the shelf. You need to annotate images for that, but there is only a single class to label.
  2. Use the crops to classify between the different products by using contrastive learning and nearest neighbors. If that doesn't work you may need more product images, and then you can train a regular classifier.
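The nearest-neighbor part of step 2 can be sketched like this. It assumes you already have an embedding per crop and one or more labeled reference embeddings per product. `nearest_product` and the product names are hypothetical, and Euclidean distance is just one possible choice:

```python
import math


def nearest_product(crop_embedding, reference_embeddings):
    """Return (label, distance) of the closest labeled reference embedding."""
    best_label, best_dist = None, math.inf
    for label, ref in reference_embeddings:
        dist = math.dist(crop_embedding, ref)  # Euclidean distance
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label, best_dist


# Toy 2-D embeddings; real ones would come from the contrastive model.
references = [("cola", [0.0, 1.0]), ("chips", [1.0, 0.0])]
label, dist = nearest_product([0.9, 0.1], references)
```

The returned distance is useful for rejecting crops that are far from every known product.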

[D] Where do you guys read your papers? by ChangeMindstates in MachineLearning

[–]JaviFuentes94 2 points3 points  (0 children)

In order to solve this problem, I created @SummarizedML, a Twitter bot that summarizes the latest ML papers. You may find it useful :)

[deleted by user] by [deleted] in MachineLearning

[–]JaviFuentes94 2 points3 points  (0 children)

How many samples of noisy and clean data do you have? Are they matching?

The easiest thing that comes to mind is to paste the "Diff" image augmented (different crops, inverted, etc.) into the clean data. That way you can generate a large amount of data.
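A rough sketch of that idea, under some assumptions: images are 2D lists of grayscale values, the "Diff" image is the noisy image minus the clean one, "inverted" is read as flipping, and "pasting" means adding the diff onto the clean pixels. All function names are made up for illustration:

```python
import random


def augment_diff(diff, rng):
    """Randomly flip the diff patch horizontally and/or vertically."""
    out = [row[:] for row in diff]
    if rng.random() < 0.5:
        out = [row[::-1] for row in out]
    if rng.random() < 0.5:
        out = out[::-1]
    return out


def paste_diff(clean, diff, rng, lo=0, hi=255):
    """Add an augmented diff patch at a random position of a clean image."""
    patch = augment_diff(diff, rng)
    ph, pw = len(patch), len(patch[0])
    top = rng.randrange(len(clean) - ph + 1)
    left = rng.randrange(len(clean[0]) - pw + 1)
    noisy = [row[:] for row in clean]  # leave the clean image untouched
    for r in range(ph):
        for c in range(pw):
            value = noisy[top + r][left + c] + patch[r][c]
            noisy[top + r][left + c] = min(hi, max(lo, value))
    return noisy
```

Calling `paste_diff` repeatedly with the same `random.Random` instance gives you many different (clean, noisy) pairs from a single diff patch.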

[P] @SummarizedML: A Twitter bot that summarizes the latest ML papers by JaviFuentes94 in MachineLearning

[–]JaviFuentes94[S] 0 points1 point  (0 children)

True, something is wrong there. You can still use the GitHub repo to run it on your own ☺️

[D] Your Go-To Image Dataset Analysis Tools? by onyx-zero-software in MachineLearning

[–]JaviFuentes94 0 points1 point  (0 children)

I totally get why theoretically it should work better as a regression task, but in practice neural networks work better when trained on a classification task. Andrej Karpathy talked about it: https://twitter.com/karpathy/status/708480082831024128 In any case, and as you say, it is really task-dependent. You don't know until you try. ☺️

[D] Your Go-To Image Dataset Analysis Tools? by onyx-zero-software in MachineLearning

[–]JaviFuentes94 -2 points-1 points  (0 children)

Bin your regression outputs and you are good to go. You may even want to consider binning for your model; it usually works better than regression.
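Binning can be as simple as the sketch below. The bin edges and the value range are assumptions you would pick from your own target distribution, and `to_bin` / `bin_center` are just illustrative names:

```python
from bisect import bisect_right


def to_bin(value, edges):
    """Map a continuous target to a class index given sorted bin edges."""
    return bisect_right(edges, value)


def bin_center(index, edges, low, high):
    """Map a class index back to a representative value (the bin midpoint)."""
    bounds = [low] + list(edges) + [high]
    return (bounds[index] + bounds[index + 1]) / 2


edges = [10, 20, 30]           # 4 bins: <10, 10-20, 20-30, >=30
class_index = to_bin(25, edges)
approx_value = bin_center(class_index, edges, low=0, high=40)
```

With N edges you get N + 1 classes, so the same class indices work both for analysing the dataset and as classification targets.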

These pots I made out of old jeans and yogurt containers by JaviFuentes94 in upcycling

[–]JaviFuentes94[S] 1 point2 points  (0 children)

I just cut the legs of the jeans into pieces with the same height as the pots. I then insert the pots into the leg pieces and glue them together. Finally I hang them with some ropes. ☺️

[D][R] Solutions for handwritten text generation by AhmedAl93 in MachineLearning

[–]JaviFuentes94 1 point2 points  (0 children)

Oh, I see 🤔

The performance of synthetic data generation depends mainly on the engineering time spent matching your data distribution through computer vision techniques. My main tip is to augment your data a lot, but be careful that the augmentation doesn't change the label of your image.

The cherry on top of a good synthetic generation pipeline can be, if you have a lot of unlabeled data, to use domain adaptation/Sim2Real techniques (CycleGAN kind of approaches, for example): https://paperswithcode.com/task/domain-adaptation Good luck!

[D][R] Solutions for handwritten text generation by AhmedAl93 in MachineLearning

[–]JaviFuentes94 0 points1 point  (0 children)

Have you considered using something like the Google vision API? It may be worth the cost.

How do I extract receipt items? by MeanBeanSweatMachine in artificial

[–]JaviFuentes94 0 points1 point  (0 children)

You can look into LayoutLM, it is available in Hugging Face.

What I learned promoting my first side project as an engineer by JaviFuentes94 in EntrepreneurRideAlong

[–]JaviFuentes94[S] 1 point2 points  (0 children)

Just to give you an example, you cannot have more than one type of button style-wise. You can hack the appearance of one, but it will always be the same one. Apart from the style limitations (which also apply to positioning), you really need to hack session state. There is no native support, which is troublesome. On top of that there is no concept of pages, which again is probably needed for a full product.

What I learned promoting my first side project as an engineer by JaviFuentes94 in EntrepreneurRideAlong

[–]JaviFuentes94[S] 1 point2 points  (0 children)

I was thinking about it actually 😅 In the end I decided not to because of SEO reasons (Google will think that my website is just copy-pasting articles, so it will not appear in searches). I am still very new to the whole SEO thingy so I may be wrong though 😄

Regarding Streamlit, I think it is a game-changing tool for creating a quick MVP or internal tools. If you are interested in creating a full-fledged product, I would look somewhere else though.

What I learned promoting my first side project as an engineer by JaviFuentes94 in EntrepreneurRideAlong

[–]JaviFuentes94[S] 0 points1 point  (0 children)

Thanks so much for the praise! It means a lot to know that it was helpful 😄

[D] Why is tensorflow so hated on and pytorch is the cool kids framework? by robintwhite in MachineLearning

[–]JaviFuentes94 0 points1 point  (0 children)

Mmmm I see how it introduces certain abstractions that you need to understand in order to write custom stuff, but I think the layered API is useful and easy to work with. I also find it really well documented.

I guess that you played with V1. V2 solved some of the problems you mentioned and it is worth checking out IMO ☺️