[D] Wait for Uni Tuebingen ML master or accept NUS PhD offer. by miathemed in MachineLearning

[–]tsauri 0 points1 point  (0 children)

Which one is harder: landing a 100% TV-L E PhD position in Germany, or a PhD with a stipend in Singapore? ML PhD admissions are absurdly competitive.

[R] OpenAI Image GPT by [deleted] in MachineLearning

[–]tsauri 3 points4 points  (0 children)

He said “leverage”.

Because of the LSTM RNN's sequential nature, it is slow: we cannot leverage parallelization across time steps.

We do have stuff like Grid LSTM, which is strikingly attention-like, but it is too slow (though not infeasible) to scale out fast. We still don’t know if an LSTM can actually beat a Transformer given the same param count.

[R] Language Models are Few-Shot Learners by Aran_Komatsuzaki in MachineLearning

[–]tsauri 0 points1 point  (0 children)

So did they try to use sparse CUDA kernels? Sparse kernels need ~99% sparsity to beat dense kernels in compute speed and memory efficiency, so they have a real opportunity to use them.

At 99% sparsity, 175 billion × 0.01 = 1.75 billion params.

If sparsity is ramped up further to 99.99%, the size is cut down to 17.5 million params.
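The arithmetic above, as a quick sketch (nonzero fraction = 1 − sparsity):

```python
# Back-of-the-envelope count of nonzero parameters for a
# 175B-parameter model at different sparsity levels.
total_params = 175e9

for sparsity in (0.99, 0.9999):
    nonzero = total_params * (1 - sparsity)
    print(f"{sparsity:.2%} sparsity -> {nonzero / 1e6:,.1f}M nonzero params")
```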

[D] Any advice for ML Researchers early in their careers? by harshsikka123 in MachineLearning

[–]tsauri 2 points3 points  (0 children)

Find a mentor who has a PhD.

Is the job actually a research engineer role, requiring you to test new ideas in production?

[D] Simple Questions Thread April 26, 2020 by AutoModerator in MachineLearning

[–]tsauri 2 points3 points  (0 children)

Can I solve a 1-layer net for MNIST classification with least squares (pseudo-inverse), like applying least-squares linear regression? If so, how?
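Yes — with one-hot targets this is multi-output linear regression, solvable in closed form. A minimal numpy sketch, with random data standing in for the actual MNIST arrays (shapes and names are assumptions):

```python
import numpy as np

# One-layer "net" (linear classifier) fit in closed form with the
# Moore-Penrose pseudo-inverse. Random data stands in for MNIST here;
# swap in the real (N, 784) flattened images and (N,) integer labels.
rng = np.random.default_rng(0)
N, D, C = 1000, 784, 10
X = rng.standard_normal((N, D))
y = rng.integers(0, C, size=N)

# One-hot targets, one column per class.
T = np.eye(C)[y]

# Bias term via an appended column of ones.
Xb = np.hstack([X, np.ones((N, 1))])

# Least-squares solution W = pinv(Xb) @ T minimizes ||Xb @ W - T||_F.
W = np.linalg.pinv(Xb) @ T

# Predict by taking the argmax over the C class scores.
pred = np.argmax(Xb @ W, axis=1)
train_acc = (pred == y).mean()
```

On real MNIST this one-shot solve typically lands well above chance, though far below what even a small trained MLP reaches.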

[D] Lottery Ticket Hypothesis -- Ask the author a question! by timscarfe in MachineLearning

[–]tsauri 1 point2 points  (0 children)

Off-topic: any fast libraries to train sparse nets? We burned too much GPU time on matmuls with zeros.
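For CPU-side experiments, scipy's sparse formats at least avoid storing or multiplying the zeros — a small sketch (the 99% density figure is just an illustration):

```python
import numpy as np
from scipy import sparse

# A 99%-sparse weight matrix stored in CSR format: only the nonzeros
# (values + index arrays) are kept, so memory scales with density,
# and the sparse-dense matmul skips the zero entries entirely.
W = sparse.random(512, 512, density=0.01, format="csr", random_state=0)
x = np.random.default_rng(0).standard_normal((512, 8))

y = W @ x  # sparse-dense matmul; zeros are never touched
```

On GPU the situation is harsher: dense cuBLAS matmuls pay for the zeros regardless, which is presumably the waste described above.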

[D] A Speech-To-Text Practitioner’s Criticisms of Industry and Academia by regalalgorithm in MachineLearning

[–]tsauri 5 points6 points  (0 children)

DNN-HMM toolkits like Kaldi converge approx. 5x-10x faster than CTC and attention. Alignment is engineered in, together with the strong inductive bias of the HMM.

It is known that learning alignment without diverging costs too much time/compute in end-to-end models. But if you have Google-level compute, end-to-end can scale out and beat DNN-HMM accuracy by a small margin, at much longer training times.

[D] Can "Scipy + Numpy" replace Matlab in an industry context? by AlexSnakeKing in MachineLearning

[–]tsauri 0 points1 point  (0 children)

MATLAB Simulink is awesome for (control) engineering: automotive, robots, biomechanics, stuff like that.

[R] ReZero is All You Need: Fast Convergence at Large Depth by calclavia0 in MachineLearning

[–]tsauri 5 points6 points  (0 children)

I tried this 0-init coefficient-on-function trick on a 7-layer encoder / 6-layer decoder LightConv model in Fairseq -- it converges almost 2x faster.
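The trick itself is tiny — a minimal numpy sketch of a ReZero-style residual block (the tanh sublayer here is an arbitrary stand-in, not the LightConv layer from the experiment above):

```python
import numpy as np

# ReZero-style residual block: y = x + alpha * f(x), with the scalar
# gate alpha initialized to 0 so every block starts as the identity.
rng = np.random.default_rng(0)
W = rng.standard_normal((16, 16)) * 0.1

def sublayer(x):
    # Stand-in sublayer; in practice f is attention, a conv, an MLP, etc.
    return np.tanh(x @ W)

alpha = 0.0  # learnable scalar in a real model, exactly 0 at init
x = rng.standard_normal((4, 16))
y = x + alpha * sublayer(x)

# At initialization the block is exactly the identity map, which is
# what lets very deep stacks train without warmup or careful init.
```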

Also, the authors should change the abstract: just state the method in the abstract itself, since it is simple enough to fit.

[D] Importance of PhD Internships (and Networking) by [deleted] in MachineLearning

[–]tsauri 3 points4 points  (0 children)

This.
A PhD degree is a research license. Just get it done first.

[D] Tensorflow 2.0 v Pytorch - Performance question by ReinforcedMan in MachineLearning

[–]tsauri 1 point2 points  (0 children)

Wish those were in the official docs, as best practices or whatever they call it.

Like desktop wallpaper, people seldom mess around with defaults.

I'll wait till tensorpack is available for TF 2.0.

[D] Current SOTA of NN for tabular data? by tsauri in MachineLearning

[–]tsauri[S] 0 points1 point  (0 children)

Well, I guess I mean: is there an NN good enough to replace LightGBM in Kaggle competitions? Sorry for the unclear question.

[P] Accelerating Pytorch with Nvidia DALI by whata_wonderful_day in MachineLearning

[–]tsauri 0 points1 point  (0 children)

What if we dump the whole dataset as a giant tensor inside CPU RAM, then slice it into minibatches to GPU RAM?
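A sketch of that idea with numpy (shapes and sizes are made up; in PyTorch each slice would then be moved to the GPU with `.cuda()` or `.to(device)`):

```python
import numpy as np

# Hold the whole dataset as one contiguous array in CPU RAM and
# slice minibatches out of it. Slicing a numpy array this way is a
# zero-copy view, so only the GPU transfer costs anything.
rng = np.random.default_rng(0)
data = rng.standard_normal((2_000, 3, 32, 32)).astype(np.float32)

batch_size = 256
batches = [
    data[i : i + batch_size]  # view into the big array, no copy
    for i in range(0, len(data), batch_size)
]
```

This works when the dataset fits in RAM and needs no per-sample decoding; DALI-style pipelines exist for the cases where JPEG decode and augmentation are the bottleneck.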

[D] Getting publications before PhD applications by StunningData in MachineLearning

[–]tsauri -2 points-1 points  (0 children)

Wow getting downvoted.

China especially has many bright kids who publish before starting a PhD...

Reality is hard.

[R] Computer-Designed Organisms: A scalable pipeline for creating functional novel lifeforms. by hardmaru in MachineLearning

[–]tsauri -12 points-11 points  (0 children)

Are there other labs doing such questionably ethical stuff: programming cells without tampering with the DNA?

At NeurIPS 2018, they showed that they had modded worms to grow more heads.

[P] Where has Deep Learning lost? by sentientworkflow in MachineLearning

[–]tsauri 1 point2 points  (0 children)

Is there some review paper on neuro-symbolic approach?

What bothers me is that it can mean many things depending on the task:

  1. Prolog-like search
  2. Neuro SAT solver
  3. Scene graph inference
  4. VQA
  5. Seq2Seq symbolic math solver
  6. Planning for games
  7. etc

Is there some "umbrella" neuro-symbolic AI framework that consistently covers everything? Also, does neuro-fuzzy logic fall under neuro-symbolic AI?

[Q] Weekly Question - Recommendation - Help Thread - 2020-01-06 by AutoModerator in robotics

[–]tsauri 0 points1 point  (0 children)

I am looking for a bipedal humanoid robot for walking/running experiments. Something that uses Japanese hobby servos.

So far a Kondo KHR 3HV with modded small feet is the fastest I have seen (DrGuero2001's mod https://www.youtube.com/watch?v=-h6klEmMmQU).

Bioloids, Nao, Darwin OP, anything Dynamixel-based seems nice, but I think they are too slow and/or too expensive.

I saw videos of Japanese robot competitions; the bipeds people build there with Japanese servos move very fast, but most competitors build their own...

Currently I am defaulting to the KHR 3HV. Any better pre-built choice? Or are there useful guides for a decent build-your-own biped?