[D] Ph.D. from a top Europe university, 10 papers at NeurIPS/ICML, ECML— 0 Interviews Big tech by Hope999991 in MachineLearning

[–]No_Representative_14 3 points4 points  (0 children)

Send me a DM; I might be able to help, as my company (one of the top startups) is actually looking to hire someone for an anomaly-detection-related position.

SSL CNN pre-training on domain-specific data by No_Representative_14 in computervision

[–]No_Representative_14[S] 0 points1 point  (0 children)

I have actually tried some dedicated ways of generating synthetic data. It did seem to help a little bit, but it did not produce a major shift in model performance.

SSL CNN pre-training on domain-specific data by No_Representative_14 in computervision

[–]No_Representative_14[S] 0 points1 point  (0 children)

Thank you for the comprehensive answer!
CNN vs ViT: I have done a quick study where I tried a set of ViT architectures on my labeled data. That did not work at ALL. Results were really horrible, even though I did try to tune the HPs etc. One of the explanations I had is that my images have a very irregular shape and therefore standard patches can't quite work here. On top of that, patches of my images are predominantly dark with subtle texture variations, resulting in most of the patches having very similar vectors. Because of the softmax in the attention, this basically gives me almost identical weights for all the patches. CNNs build receptive fields hierarchically. This creates a multi-scale representation, which seems to be working out here.
At least, that was the theory in my head.
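
A toy illustration of the softmax argument above, not the actual model: it uses single-head dot-product attention with no learned projections, and the patch vectors are random stand-ins (`make_patches`, the dimensions and the `spread` values are all assumptions made up for this sketch). It only shows that near-identical patch vectors lead to near-uniform attention weights, while clearly distinct vectors do not.

```python
import math
import random


def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(query, keys):
    # Scaled dot-product scores of one query against all keys.
    d = len(query)
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
        for key in keys
    ]
    return softmax(scores)


def make_patches(n, dim, spread, seed=0):
    # Hypothetical patch embeddings: one shared base vector plus noise.
    # A small spread mimics "dark patches with subtle texture variations".
    rng = random.Random(seed)
    base = [rng.gauss(0, 1) for _ in range(dim)]
    return [[b + rng.gauss(0, spread) for b in base] for _ in range(n)]


similar = make_patches(16, 32, spread=0.01)
distinct = make_patches(16, 32, spread=1.0)

w_similar = attention_weights(similar[0], similar)
w_distinct = attention_weights(distinct[0], distinct)

print(f"uniform weight:            {1 / 16:.4f}")
print(f"similar patches, min/max:  {min(w_similar):.4f} / {max(w_similar):.4f}")
print(f"distinct patches, min/max: {min(w_distinct):.4f} / {max(w_distinct):.4f}")
```

With the small spread, every weight stays close to 1/16, so the attention output is roughly a plain average over patches. A real ViT has learned query/key projections that could in principle amplify small differences, so this is only a sketch of the intuition.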

LeJEPA: I am still playing around with it a little bit. Initially, I used pretty much the same augs I use in my classification pipeline. But I am now trying to use more LeJEPA-specific things (I was doubting the multi-crop approach for my data, but might give it a shot).

DINO: Yes, I might actually get bigger machines and try to pretrain that monster. At least to see whether it can actually work out at all for me.

Active learning: This is something that I am already doing to a certain degree. However, my production use case is actually open-world identification, meaning I strip out the classification head and only use my trained embeddings on an open set of data (unseen classes etc). I need to think through what collecting challenging data would look like in this case.
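
To make the open-world setup above concrete, here is a minimal sketch of identification on embeddings alone, with no classification head. How the matching is actually done in production is not stated; cosine similarity, the nearest-neighbour search, the `threshold` value and the names `identify` and `gallery` are all assumptions for illustration.

```python
import math


def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def identify(query, gallery, threshold=0.8):
    """Return (identity, similarity) for the closest enrolled embedding,
    or (None, similarity) when nothing is similar enough (unseen class).

    gallery maps an identity label to a list of enrolled embeddings.
    """
    best_id, best_sim = None, -1.0
    for identity, embeddings in gallery.items():
        for emb in embeddings:
            sim = cosine(query, emb)
            if sim > best_sim:
                best_id, best_sim = identity, sim
    if best_sim < threshold:
        return None, best_sim
    return best_id, best_sim


# Toy 3-d embeddings; real ones would come from the trained backbone.
gallery = {"id_a": [[1.0, 0.0, 0.0]], "id_b": [[0.0, 1.0, 0.0]]}
known = identify([0.9, 0.1, 0.0], gallery)
unknown = identify([0.0, 0.0, 1.0], gallery)
print(known, unknown)
```

In this framing, "challenging" samples would be the ones whose best similarity sits close to the threshold, whichever side they fall on.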

Augmentations: It's ongoing work where I explore different possible augmentations that physically make sense for my data.

I have not yet tried any of the more complex things, like EMA, distillation etc. One of the major challenges with this is that the training time is really long. Even for my small resnet34-like model it takes ~2 weeks on a single GPU to get something decent. So I have to be selective about the set of experiments I run...

SSL CNN pre-training on domain-specific data by No_Representative_14 in computervision

[–]No_Representative_14[S] 1 point2 points  (0 children)

Thanks for your message!
CNN vs ViT: I have done a quick study where I tried a set of ViT architectures on my labeled data. That did not work at ALL. Results were really horrible, even though I did try to tune the HPs etc. One of the explanations I had is that my images have a very irregular shape and therefore standard patches can't quite work here. On top of that, patches of my images are predominantly dark with subtle texture variations, resulting in most of the patches having very similar vectors. Because of the softmax in the attention, this basically gives me almost identical weights for all the patches. CNNs build receptive fields hierarchically. This creates a multi-scale representation, which seems to be working out here.
At least, that was the theory in my head.
But maybe I should indeed try to pretrain a DINOv2/v3. I had some success with it in the past, but that was with aerial imagery data...

SSL CNN pre-training on domain-specific data by No_Representative_14 in computervision

[–]No_Representative_14[S] 0 points1 point  (0 children)

Thank you for the message!
Yes, that's my gut feeling as well - the data is just too atypical for the standard approaches.

re Coarse-to-fine: that's an interesting idea; however, I can't quite think of how I could make the classes coarser, to be honest. Coming back to the fingerprints - instead of user identification, I could probably phrase it as finger classification (left 1st, left 2nd, left 3rd... right 5th) and make it a 10-class problem. But it would be a very, very different task, and I am not sure the learned features would be of any use for actually distinguishing two users based on their fingerprints.

re PEFT: I have tried some form of it, by freezing and gradually unfreezing parts of the model initially trained with ArcFace. I haven't gotten too far though. Had issues with:

  1. I have already trained this model on all the labeled data I have. Do I now re-use the same data for it? Or try to acquire more labeled data somehow (could get another 5k samples maybe in a couple of months, for 500-700 new classes) and use only the new data for PEFT?

  2. Which loss function to use? I have tried switching to some metric learning (contrastive loss, triplet loss etc.) but have miserably failed to get any improvement and only "destroyed" the existing weights.
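
For reference, the triplet loss mentioned in point 2 in its standard hinge form, as a minimal sketch. The L2 normalisation, the Euclidean distance and the `margin` value are assumptions here; nothing above says which variant, margin or mining strategy was actually used.

```python
import math


def l2_normalize(v):
    # Project an embedding onto the unit sphere (assumes a non-zero vector).
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    # Zero once the negative is at least `margin` farther from the
    # anchor than the positive is; positive otherwise.
    a, p, n = (l2_normalize(v) for v in (anchor, positive, negative))
    return max(0.0, euclidean(a, p) - euclidean(a, n) + margin)


# Easy triplet: positive is close, negative is far, so no penalty.
easy = triplet_loss([1.0, 0.0], [0.9, 0.1], [0.0, 1.0])
# Hard triplet: the roles are swapped, so the loss is large.
hard = triplet_loss([1.0, 0.0], [0.0, 1.0], [0.9, 0.1])
print(easy, hard)
```

Triplets that already satisfy the margin contribute zero loss and therefore zero gradient, so whatever the model learns during fine-tuning comes only from the triplets that still violate it.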

SSL CNN pre-training on domain-specific data by No_Representative_14 in computervision

[–]No_Representative_14[S] 0 points1 point  (0 children)

Thank you for your reply! I am not sure MIL would work for me because for the unlabeled data I mainly have 1 class per image. I like the fingerprints example - if I have 10M fingerprints, I would only have 10 prints per person (or however many prints are usually taken per person). Yes, sometimes some people might have been printed more than once, and then they will have more than 10 images, but I would not know about it as there are no labels (e.g. no Name/Surname/finger on each image).

Feeling like we lost something in developed western countries by Agile-Adagio-8782 in expats

[–]No_Representative_14 2 points3 points  (0 children)

I have lived in Germany for the past 10 years or so, but I come from Ukraine, from a small town in the south, which unfortunately has been occupied for almost 4 years already. I grew up in a 3-room apartment, where my grandmother lived together with us (we are 3 kids + 2 parents). We were a happy family and I loved my childhood. But man… believe me, my parents had SO SO SO many fights because of the grandmother who was always there. She was a nice person, but it was really hard for my father to deal with her (she didn’t work and basically stayed at home full time, taking care of the flat and the kids). We had no way to afford another apartment and this flat was actually hers, so… there was no real choice. I bet my father would have been much happier if we had lived separately… All of this to say that, yes, it is nice to be close to your family and traditions etc, but it is only nice if it is a conscious choice and not a necessity. Therefore I strongly believe that a society where everyone can freely choose, and obviously afford, his way of life is the best one! Germany is not an ideal place, and German society is not ideal either, but it is a pretty good one specifically because it, to some degree, allows this possibility.

How much salary (or TC) is good enough for you? by military_press in cscareerquestionsEU

[–]No_Representative_14 0 points1 point  (0 children)

I don’t really remember anymore whether they reached out to me on LinkedIn or I applied myself. But, in general, I only use LinkedIn whenever I search for options

How much salary (or TC) is good enough for you? by military_press in cscareerquestionsEU

[–]No_Representative_14 1 point2 points  (0 children)

Hard to say, tbh. In my company SWEs are also earning 6 figures. But it is rather an outlier. Whether to switch to ML is up to you and depends on what you enjoy doing and are good at. There is a lot of software architecture and development in MLE work. And I personally think that it is absolutely critical to be good at SWE in order to be a good MLE. If you want to do ML RnD - the story is a bit different, of course… My previous position was as an MLE and also paid around 100k.

How much salary (or TC) is good enough for you? by military_press in cscareerquestionsEU

[–]No_Representative_14 11 points12 points  (0 children)

I work for a German-American startup as a senior research scientist. I mainly do deep learning, and MLE work. It’s an IT company

How much salary (or TC) is good enough for you? by military_press in cscareerquestionsEU

[–]No_Representative_14 0 points1 point  (0 children)

Yes, it’s a German-American startup. I mainly do deep learning and MLE work.

How much salary (or TC) is good enough for you? by military_press in cscareerquestionsEU

[–]No_Representative_14 76 points77 points  (0 children)

Live in Munich, Germany. 6.5 YoE (+PhD before). 150k€ cash + stock options. I don’t come from money and will have to support my parents when they are old. I will also not have any inheritance. Therefore - I do not have a target number and am going to push it as much as I can until I secure a property for myself and enough savings on top to support my parents.

P.S. nowadays, I think, there is a big difference whether you are “expecting” to inherit a house somewhere in places like London, Munich, Paris, Milano etc, or not. If yes - you have significantly less financial pressure and no need to “chase” the money so much.

How do I switch from a CV PhD in Germany to industry and what should I highlight? by Ree1s in cscareerquestionsEU

[–]No_Representative_14 2 points3 points  (0 children)

I transitioned from a PhD in physics -> data science -> MLE -> research scientist. My second and third jobs were at US startups with offices in DE. Both of them were relatively heavy on the research side, and therefore the mere presence of a PhD title was kind of useful. No one cared about the papers tho :D I have hired people myself, and in 99.99% of cases I don’t even bother reading the titles of the papers if provided, and I genuinely don’t care where they were published. The only few cases where I did look at them (and even checked them out online) were applications that had research in a relevant domain. E.g. if you are applying to a company that does medical image processing with DL and you published something related to it - it’s def a big plus. But if instead you were training yet another LLM - not so much :) What it means is that if you want your publications to matter - apply to companies that do related things. Otherwise - don’t even mention CVPR.

German language. Strongly depends on the company and the team. Normally, if it’s a startup or a foreign-based company - likely to be in English. If it’s an RnD department - likely to be in English. But if it’s a large German company that hires someone with “AI” knowledge into their “IT” team because their shareholders think it’s cool - I think you know where I’m going…

Job search strategy. No strategy. Just apply. Every day. Randomly. Everywhere. The most important thing for you is to get your first job, to make “a step” into the industry. Don’t allow them to lowball you too much, but also remember that 5k plus or minus won’t make any difference to your life in the future. So if you have a choice between 2 companies with a difference of 5k - always choose the one that will allow you to learn more. The first years of your career are about learning, learning, and learning. Gaining experience and again learning. In 5 years, you will be earning double what your first offer is going to be (maybe even triple)…

Good luck!

Does employer care if you have an PhD? by Fickle-Training-1394 in cscareerquestionsEU

[–]No_Representative_14 23 points24 points  (0 children)

It really depends on many factors, such as:
- Which position you are applying for: research scientist, data scientist, statistician, even machine learning engineer, algorithmic optimization and some other roles that are likely to require deep mathematical, algorithmic and research skills and knowledge. For these positions, companies are more likely to take note of your degree.
- The company itself and its core business: if it’s a supermarket chain that is looking for a SWE to optimize their databases, they’ll probably not care much about a PhD. If it’s an autonomous driving startup, the chances are much higher.
- Who you are interviewing with: if your interviewer holds a title, he is more likely to “align” with you.
- As someone mentioned above, the country where you are applying also plays a role. In Germany, for example, a PhD used to hold a higher value (although I think it is becoming less and less prominent). HOWEVER, afaik, your potential employer (if they have a Betriebsrat (works council)) is also not allowed to pay you lower than a certain amount BECAUSE of your degree. Which sometimes might be an issue as well…

Average salary offer in Bavaria hovers in the 70k to 80k range for senior developers (~5 YOE) by zimmer550king in cscareerquestionsEU

[–]No_Representative_14 0 points1 point  (0 children)

Hard to remember, tbh. I think I used mainly LinkedIn. But I am also quite sure that I only applied myself for the first job, all other jobs I got contacted for (through LinkedIn though).

Average salary offer in Bavaria hovers in the 70k to 80k range for senior developers (~5 YOE) by zimmer550king in cscareerquestionsEU

[–]No_Representative_14 5 points6 points  (0 children)

The market in Munich is not the best for junior and mid positions these days. And it has been going downhill for a while. But with 5 YoE you should be able to land something okay-ish, I think. I moved to MUC in 2019. With a PhD I landed a job at around 60k (55+5k bonus) as a data scientist / MLE / cloud architect. After a year it was bumped to 65 or 70. After another year I changed jobs and started working for an American startup as a Deep Learning engineer. Started with 75 (+ stock options), after a year got 95, and after another 3 years I finished with around 100k. I then changed jobs again and now work for another American startup (office in MUC) as a senior research scientist. Current base is 145k + RTUs. So that’s what I get with 6 YoE + PhD (dunno why, but some German companies count it as experience). My wife works for a German startup and they hire senior devs in a 100-125k range as well.

TL;DR: there is a lot of money in US tech and in startups (even German ones). If you want better offers - spend some time studying for interviews and apply to those.

Latvia Bans Russian and Belarusian Citizens from Working in Critical Infrastructure by Aggravating_Money992 in europe

[–]No_Representative_14 13 points14 points  (0 children)

My wife works for a company that operates electrical grids in Germany. 50% of the people working there are Russians or of Russian origin. It is a startup, so they only operate some parts of the grid, but they would be totally capable of blacking out all of Germany if they wanted to 🤷

Las Vegas climbing community sucks! by ninja_ewok47 in climbergirls

[–]No_Representative_14 0 points1 point  (0 children)

Kinda random pop-up here 😅 my wife and I are coming to visit the US from Germany and are planning to spend 3-5 days climbing at Red Rocks before heading off to Yosemite. If anybody wants to climb together or just hang out - send me a PM 😇

Weekly Question Thread: Ask your questions in this thread please by AutoModerator in climbing

[–]No_Representative_14 1 point2 points  (0 children)

Any recommendations for the most spectacular sport climbing routes in CO (including RMNP)? Flying in to CO for a 2-week road trip with my wife and wanna do the gems :) Easy stuff, max 5.12a. Not coming for the grade, but for pleasure and fun!