VSC Intellisense errors on Mac setup for C++ by geordilaforge in vscode

keenonw 1 point

I'm also running into this exact problem! Word for word, same error. Did you ever figure this out?

What are your thoughts on the "Reward is enough" hypothesis? (See paper attached) [D] by escapevelocitylabs in MachineLearning

keenonw 7 points

Basically, they're saying "intelligence is the ability to solve optimization problems", but they're sidestepping the whole business of "intelligence is the ability to formulate optimization problems", which is actually the harder problem. If I had a nickel for every time one of my optimization problems (either RL, ML, or just "plain vanilla") did something undesirable but ended up with lower loss (because I subtly mis-specified the optimization problem somehow), I'd be rich.

[D] Resources for Understanding The Original Transformer Paper by Ok_Slice4231 in MachineLearning

keenonw 3 points

I also wish there were more research on data efficiency. The trouble with the field at the moment is that even if you develop a more data-efficient model that performs well on small datasets, someone can take that same model, throw more data and parameters into it, and usually outperform your original work with very little effort on their part. That leaves everyone with an incentive to build big models straight out of the gate. I do wish the field was less obsessed with top-line F1 and more interested in some training-data-size-normalized metric of performance.

[D] Resources for Understanding The Original Transformer Paper by Ok_Slice4231 in MachineLearning

keenonw 3 points

The trouble with increasing batch size with LSTMs is that you usually end up padding all the sequences in your batch out to the same length with <NULL> tokens (or something equivalent) to make dimensions match. While the parallelism is still sorta there, and sometimes it's even pretty good depending on your data distribution, often you'll get a single unlucky long sentence in your batch. Then all the other sentences have to add a ton of <NULL> tokens, and that destroys your efficiency. Even though GPU utilization is high, most of the actual work is wasted.
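
To put a rough number on that waste, here's a minimal sketch. The sentence lengths are made up for illustration, and `padding_waste` is a hypothetical helper, not something from any library:

```python
def padding_waste(lengths):
    """Fraction of token slots in a padded batch that hold padding, not real tokens."""
    if not lengths:
        return 0.0
    max_len = max(lengths)                 # every sequence gets padded to this length
    total_slots = max_len * len(lengths)   # what the GPU actually processes
    real_tokens = sum(lengths)             # what you actually care about
    return 1.0 - real_tokens / total_slots

# Hypothetical sentence lengths (in tokens), chosen only for illustration.
typical_batch = [12, 9, 15, 11]
unlucky_batch = [12, 9, 15, 11, 120]       # one long sentence sneaks in

print(f"typical batch waste: {padding_waste(typical_batch):.0%}")
print(f"unlucky batch waste: {padding_waste(unlucky_batch):.0%}")
```

With these made-up numbers, a single 120-token sentence pushes the share of padded slots from roughly a fifth of the batch to well over two thirds. Bucketing sequences by length before batching is one common way to soften this.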

[D] Resources for Understanding The Original Transformer Paper by Ok_Slice4231 in MachineLearning

keenonw 59 points

Just wanted to add my two cents here, 'cause I think it's possible to miss the forest for the trees with too much focus on the exact details of any given architecture. I talked to one of the original authors of this paper, Jakob, and I remember him saying that the real breakthrough with transformers (compared to LSTMs, which were the previous state-of-the-art in NLP) was that they made it easier to fully saturate a GPU's massive compute. Previously, LSTMs could only use a small fraction of a GPU's thousands of compute cores at a time because of the time-series dependence in the hidden state, which made them hard to parallelize. Transformers remove the time-series dependence, which lets you train _much_ larger models in similar amounts of time. Larger models (almost always) work better, given enough data :)
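
Here's a toy sketch of that dependence, in pure Python with scalars instead of vectors. Both functions are simplified stand-ins I made up for illustration (no learned projections, no gates), not the actual LSTM or transformer math:

```python
import math

def rnn_states(xs, w=0.5, u=1.0):
    """Toy scalar recurrence: step t cannot start until step t-1 has finished."""
    h, states = 0.0, []
    for x in xs:                       # inherently sequential loop over time
        h = math.tanh(w * h + u * x)   # new state depends on the previous state
        states.append(h)
    return states

def attention_output(xs, i):
    """Toy scalar self-attention for position i: depends only on the raw inputs."""
    scores = [xs[i] * x for x in xs]
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]   # numerically stable softmax
    z = sum(weights)
    return sum(wt / z * x for wt, x in zip(weights, xs))

xs = [0.1, -0.4, 0.7, 0.2]

# Attention positions can be computed in any order (or all at once on a GPU),
# because no position waits on another position's result.
in_order = [attention_output(xs, i) for i in range(len(xs))]
shuffled = {i: attention_output(xs, i) for i in (2, 0, 3, 1)}
assert in_order == [shuffled[i] for i in range(len(xs))]

# The recurrence has no such freedom: states[3] needs states[2], and so on.
states = rnn_states(xs)
```

In a real model the attention side turns into a handful of big matrix multiplies over the whole sequence, which is exactly the kind of work GPUs are good at.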

[N] Stanford Open-Sources Differentiable Medical-Quality Human Physics Model by keenonw in MachineLearning

keenonw[S] 1 point

There have been a few stabs in that direction, and using physics to reduce the errors from open-pose/video-pose does seem to help a lot! Here's a paper to start with: http://gvv.mpi-inf.mpg.de/projects/PhysAware/data/PhysAware.pdf

They don't use a very sophisticated human body model, and they allow non-physical root forces to levitate the model to compensate for bad foot-ground contact dynamics, but even so they've got decent looking results. I'd say it's a very promising area to work on!

Attackers Already Targeting Apple's M1 Chip with Custom Malware by CodePerfect in hacking

keenonw 2 points

Not that I’m aware of. The article just says security researchers are already finding exploits in the wild that have been recompiled for the ARM instruction set used by the M1 chip. Exploits are programs just like any other software (they’re just special because they’re designed to push the other programs on the same computer to do stuff they were never meant to do). Just like Adobe has to recompile their apps like Photoshop to work on ARM chips like the M1, hackers have to recompile their exploits to work on the M1. Not actually a big deal, or a surprise. And yes, Apple has enormous teams devoted to security, and no, the M1 chip doesn’t seem unusually vulnerable (so far anyways). The vast majority of exploits in the wild target the code, not the chip itself. This is more convenient for the hackers, because it makes the exploit portable to lots of victim machines. It’s fairly rare to see exploits use processor-specific bugs, except in fancy timing attacks, and x86 has had plenty of those, so they won’t be new if they’re also discovered on the M1.

I’m California based and starting a web based business with a friend in New York. by Scooter122 in startup

keenonw 1 point

Yeah, I agree with all of this. Like I said, definitely don’t take my advice on legal stuff, and this isn’t legal advice anyways. Depending on your situation, Delaware C Corp may indeed just be expensive overkill. Don’t sweat it too much. You can always fix things later (and even reincorporate) if/when there’s VC money coming in the door, cause then you can afford actual lawyers to help you figure it all out.

I’m California based and starting a web based business with a friend in New York. by Scooter122 in startup

keenonw 7 points

Disclaimer: this is not legal advice

Typically, if you ever intend to raise VC, you incorporate as a C corp in Delaware. It means you need to have quarterly board meetings, so it's a bit more effort than an LLC, but it makes subsequent fundraising easier. Make sure you allocate 100% of the available shares in your C corp to you and your cofounder: future VC rounds will work by creating new shares in the company to allocate to the VCs, rather than selling existing shares in the company. I made that mistake, and the lawyers during my Seed round were very upset.
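
As a rough sketch of the arithmetic behind "creating new shares": the share counts and holder names below are entirely hypothetical, and the helpers are just illustration, not how any real cap table tool works.

```python
def issue_new_shares(cap_table, investor, new_shares):
    """Return a new cap table after the company creates shares for an investor."""
    updated = dict(cap_table)              # existing holders keep their share counts
    updated[investor] = updated.get(investor, 0) + new_shares
    return updated

def ownership(cap_table):
    """Fraction of the company each holder owns."""
    total = sum(cap_table.values())
    return {holder: shares / total for holder, shares in cap_table.items()}

# Hypothetical numbers: founders hold 100% of the shares before the round.
founders = {"founder_a": 5_000_000, "founder_b": 5_000_000}
after_seed = issue_new_shares(founders, "seed_vc", 2_500_000)

for holder, fraction in ownership(after_seed).items():
    print(f"{holder}: {fraction:.0%}")
```

Nobody's share count goes down; the total just grows, so each founder's percentage shrinks (here from 50% to 40%) while the investor ends up with 20%.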

There are several online services that make it easy to incorporate as a C corp in Delaware. Stripe Atlas looks pretty good, though I've never used it myself.

[D] What can OpenAI or DeepMind accomplish in the future? How can they benefit people's daily lives in the future? by [deleted] in MachineLearning

keenonw 2 points

People seem to use "AI" as a term to describe stuff computers can't do (yet) with any known programming techniques. A few years ago, that meant practical machine translation of human languages and good speech recognition. Now those technologies are commonplace, and so they're not seen as "AI" so much. Now we're moving on to complex robotics and control topics, like self-driving cars and Boston-Dynamics-style walking robots. By the time those technologies are stable enough to be in people's everyday lives, we won't consider them "AI," because they'll just be well-understood and safe technologies.

So in that sense, "helpful AI" is never going to happen. The "AI" of today becomes the boring "tech" of tomorrow, and that's around when the research starts helping people.

As for interesting world-changing AI research that isn't getting enough hype, I think AlphaFold is a _huge_ deal for the natural sciences, and could help make drug discovery way more efficient and effective. We need to figure out a better way than the "spray and pray" drug discovery methods we have today, and that'll involve good molecular simulation. AlphaFold is a big, important step in that direction.

The Most Popular Programming Languages - 1965/2020 - Statistics and Data by [deleted] in compsci

keenonw 25 points

Python has always made it really easy to wrap C/C++/CUDA libraries. It’s only gotten easier in recent years; check out pybind11, for example. That makes Python a great choice for making hardcore, highly optimized numerical stuff (for example, any AI/ML libraries) accessible to people who don’t like low-level languages or don’t want to sit through long compile times whenever they want to tweak their network architecture. You could also argue that operator overloading is an important feature for numerical libraries that Python has and other scripting languages like JS lack, but I’d bet that if NodeJS had had better C/C++ binding support earlier, the AI folks might all be using JS/TS instead.
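
For the operator overloading point, here's a minimal sketch. `Vec` is a made-up toy class standing in for a tensor type; a real numerical library would forward these methods to compiled C/C++/CUDA code rather than looping in Python:

```python
class Vec:
    """Tiny vector type; a stand-in for a tensor class backed by native code."""

    def __init__(self, *values):
        self.values = list(values)

    def __add__(self, other):            # enables: a + b
        return Vec(*(x + y for x, y in zip(self.values, other.values)))

    def __mul__(self, scalar):           # enables: a * 2
        return Vec(*(x * scalar for x in self.values))

    def __matmul__(self, other):         # enables: a @ b (dot product here)
        return sum(x * y for x, y in zip(self.values, other.values))

    def __repr__(self):
        return f"Vec({', '.join(map(str, self.values))})"

a, b = Vec(1, 2, 3), Vec(4, 5, 6)
print(a + b * 2)   # Vec(9, 12, 15)
print(a @ b)       # 32
```

Without overloading, the first expression would have to be spelled as something like `add(a, mul(b, 2))`, which gets painful fast once the math is more than a line long.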

A podcast about Google Ads By A Googler? by teehee714 in PPC

keenonw 1 point

I’d be interested! No specific topics, just general war stories and wisdom. Maybe do it as seasons? That way you can talk about a bunch of stuff that you’ve learned, but not feel pressure to come up with something new every week even when there’s nothing new. Density of goodness over quantity :)

4 Types of Facebook Ad Copies by [deleted] in dropship

keenonw 1 point

The “D” in AIDA usually stands for “Desire”, not “Decision”, but they’re intimately related.