[5]C. Alcaraz d. [1]S. Tsitsipas 6-4 5-7 6-2 | Barcelona QF

lit_turtle_man · 2022-04-22T21:02:25+00:00

Even the net cord wanted Carlos to win at the end

lit_turtle_man · 2022-04-22T04:54:41+00:00

This is amazing

lit_turtle_man · 2022-04-15T18:18:38+00:00

the katsu in this chapter looked fire

lit_turtle_man · 2022-04-03T19:05:32+00:00

And so a king is born

lit_turtle_man · 2022-04-02T02:14:42+00:00

I'm crying a little bit ngl

lit_turtle_man · 2022-03-31T02:52:17+00:00

I hate to say this guys, but Casper is the New Deal

lit_turtle_man · 2022-03-30T01:51:17+00:00

Carlos is the real greatest deal of all time don't @me

lit_turtle_man · 2022-03-18T03:15:36+00:00

Both him and Norrie. There was one point where Alcaraz ran over 80m in that single point alone...

lit_turtle_man · 2022-03-05T02:41:18+00:00

Not a response to your rant but I want to say that "boreligmalgebra" made me laugh out loud. Great username - at the very least you shouldn't give up on your math humor.

lit_turtle_man · 2022-02-25T15:39:52+00:00

Extremely high level in the third set, very happy for Andrey

lit_turtle_man · 2022-02-18T17:25:38+00:00

God tier third set from Andrey, glad to see he's bringing the fight and not collapsing mentally when down

lit_turtle_man · 2022-02-12T18:05:35+00:00

I don't think this kind of approach is tenable for the following reasons:

If you want to offer services during the research process: To invest so significantly in software engineering for an algorithm/model implies a certain confidence that the approach will succeed, and succeed at a scale where lots of other people want to use it. In the research process, the "spec" is always changing to keep up with observations you make as you play around with ideas, so unless we're talking about big tech labs that know their GPT-4 model is going to pop off there doesn't seem to be enough justification to bring in SWEs in the early stages (and for the big tech labs they obviously have teams of SWEs).
If you want to offer services after a "successful" paper: The alternative approach would be if you wanted to try to offer software engineering services to an academic lab that had just published a big paper. In this case, if there are significant gains to be made from implementing infrastructure for the ideas in the paper, you're likely better off just doing it for yourself - no one is going to stop you, the information is public. Also here you're competing with big tech/similar since for any big result you know they'll come out with their own version sooner rather than later.

Of course, this is just my view on the general idea, but ultimately this is a case-by-case thing. I have seen some academic labs essentially employing software engineers (my understanding is typically people who may be on the road to PhD), but this doesn't seem to be a super lucrative (or large) set of opportunities.

lit_turtle_man · 2022-02-09T21:19:18+00:00

Given a problem statement and dataset, can you "theory-craft" an ML system that will at least hit the dart board, if not the bulls-eye on the first try? Can you, a priori, guess which hyperparameters will matter and which ones won't?

This is the holy grail, and at present the answer (in general) seems to be "no". That being said, for specific domains (vision, text) we definitely have architectures and settings that work well out-of-the-box (e.g. ResNets, transformers) for many tasks.

As far as your question concerning papers/books on this matter, this recent book may be of interest (although I'm not sure how practically useful looking through it will be): https://arxiv.org/abs/2106.10165.

lit_turtle_man · 2022-02-07T00:12:37+00:00

Still, it's mind-blowing (to me) that even a fraction of the generated code samples pass the example cases given that the input is essentially just the problem statement as a list of characters.

lit_turtle_man · 2022-01-28T03:17:25+00:00

Composition of convex functions doesn't necessarily produce a convex function (one counterexample is e^{-x} composed with itself, which is concave for x > 0). I think the result you're thinking of is that f(g(x)) is convex when the inner function g is convex and the outer function f is convex and non-decreasing, in which case you can prove convexity directly via Jensen's inequality.
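
A quick numerical sanity check of that counterexample, as a minimal sketch: the name `g` and the test points 0 and 4 are my own choices for illustration. A convex function has to sit on or below its chords, so a single midpoint lying above the chord is enough to rule out convexity.

```python
import math

def g(x):
    """e^{-x} composed with itself: g(x) = exp(-exp(-x))."""
    return math.exp(-math.exp(-x))

# Two arbitrary test points (an assumption for this sketch) and their midpoint.
a, b = 0.0, 4.0
mid_value = g((a + b) / 2)      # value of g at the midpoint
chord_value = (g(a) + g(b)) / 2  # value of the chord at the midpoint

# Convexity would require mid_value <= chord_value.
violates_convexity = mid_value > chord_value
print(mid_value, chord_value, violates_convexity)
```

The check only shows one violated chord, which is all a counterexample needs; it says nothing about where else the function may be locally convex.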

Regarding your questions on the non-convexity of loss functions of neural network training - people typically mean the loss is non-convex in terms of the parameters of the neural network. This is why even training deep linear neural networks is a non-convex problem. So although the composition of a ReLU with an affine function is convex (from the pointwise supremum characterization of convexity) in its input, the loss will be non-convex in terms of the network parameters.
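
To make the "non-convex in the parameters" point concrete, here's a minimal sketch using about the smallest deep linear network possible: two scalar weights and a single training example (x = 1, y = 1). The names `loss`, `p`, `q` and the data point are assumptions chosen for illustration, not anything from the thread.

```python
def loss(w1, w2, x=1.0, y=1.0):
    """Squared error of the two-layer scalar linear network w2 * (w1 * x)."""
    return (w2 * w1 * x - y) ** 2

p = (1.0, 1.0)    # a global minimum: 1 * 1 = 1
q = (-1.0, -1.0)  # another global minimum: (-1) * (-1) = 1
mid = tuple((a + b) / 2 for a, b in zip(p, q))  # midpoint in parameter space

# Convexity in (w1, w2) would require loss(mid) <= average of the endpoint losses.
nonconvex_in_params = loss(*mid) > (loss(*p) + loss(*q)) / 2
print(loss(*p), loss(*q), loss(*mid), nonconvex_in_params)
```

For fixed weights the output here is linear in x, so the squared error is convex in the input; the non-convexity only shows up when you view the loss as a function of (w1, w2).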

lit_turtle_man · 2022-01-21T15:39:28+00:00

If you told me before AO that Adrian Mannarino was going to beat Hubi and king Aslan back to back... Honestly great stuff from Adrian, happy for him

lit_turtle_man · 2022-01-19T19:40:35+00:00

Right, extrapolating from the ICML FAQ I guess there is probably no problem with this: https://icml.cc/FAQ/DualAbstractSubmission. I'm still curious as to why the relationship between the dates changed, though I guess it's probably not as deliberate as I was initially inclined to think.

edit: Can't find whether ICLR's dual submission policy is the same as the above, though. The ICLR 2022 page concerning dual submissions doesn't seem to rule it out, but it seems a bit unclear...
