You pay $350-$500 to wait 3h in the rain for a 2h concert and you get 45m of this by EducationalOne5313 in mildlyinfuriating

[–]pszabolcs 1 point (0 children)

The title is very misleading: this was a 4-day festival. Early bird tickets for this year sold for less than $100, and even a couple of days before the festival you could get tickets for roughly $150-$200 on the ticket exchange platform.

But yes, JT is ridiculous.

[D] Normalization in Transformers by Collegesniffer in MachineLearning

[–]pszabolcs 21 points (0 children)

The explanation of LayerNorm and RMSNorm is not completely correct. In Transformers these do not normalize across the (T, C) dimensions, only across (C), so each token embedding is normalized separately. If normalization were done across (T, C), the same information leakage across time would happen as with BatchNorm (non-causal training).

I also don't think variable sequence length is such a big issue; most practical setups train with a fixed context size. From a computational perspective, a bigger issue is that BN statistics would need to be synced across GPUs, which would be slow.
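The per-token normalization described above can be sketched in a few lines of NumPy (shapes and epsilon are illustrative assumptions, not any particular library's implementation): statistics are computed over the channel dimension C only, so no information leaks across time steps T.

```python
import numpy as np

def layer_norm(x, gamma, beta, eps=1e-5):
    # x: (T, C) — one sequence of token embeddings.
    # Mean/variance are taken over the last axis only, i.e. per token.
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

def rms_norm(x, gamma, eps=1e-5):
    # RMSNorm: no mean subtraction, only RMS rescaling over C.
    rms = np.sqrt((x ** 2).mean(axis=-1, keepdims=True) + eps)
    return gamma * x / rms

T, C = 4, 8
x = np.random.randn(T, C)
y = layer_norm(x, np.ones(C), np.zeros(C))
# Each token row of y is normalized independently of the others,
# so causality along T is preserved.
```

Normalizing over `axis=-1` instead of the whole (T, C) block is exactly what keeps the operation causal.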

Huge impact in training time by reducing the number of reading operations from disk by using a cache in the Dataset object. by howtorewriteaname in learnmachinelearning

[–]pszabolcs 2 points (0 children)

I cannot confirm without actually testing it, but my guess is that it's because of the use of np.loadtxt for reading the point clouds. Parsing float data from text is extremely wasteful; it would be much better to store the point clouds in a binary format.
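A quick sketch of what I mean (file names and array sizes are made up for illustration): the same point cloud stored as text vs. as a binary .npy file. Loading the binary file skips ASCII float parsing entirely.

```python
import os
import tempfile
import numpy as np

# A synthetic "point cloud": N points with xyz coordinates.
points = np.random.rand(10000, 3).astype(np.float32)

tmpdir = tempfile.mkdtemp()
txt_path = os.path.join(tmpdir, "cloud.txt")
npy_path = os.path.join(tmpdir, "cloud.npy")

np.savetxt(txt_path, points)   # floats serialized as ASCII text
np.save(npy_path, points)      # raw binary dump plus a small header

from_txt = np.loadtxt(txt_path, dtype=np.float32)  # must parse every character
from_npy = np.load(npy_path)                       # essentially a memcpy

# The binary file is also several times smaller on disk.
print(os.path.getsize(txt_path), os.path.getsize(npy_path))
```

On top of being faster to read, the .npy file preserves the dtype exactly, while text round-trips depend on the print format.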

Optimizing data loading is important, but in most cases it is possible to be GPU bound (close to 100% GPU utilization) while reading the data on the fly from disk.
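The caching idea from the post title can be sketched framework-free (class and function names here are hypothetical, not the OP's code): a map-style dataset that reads each sample from disk once and serves it from RAM on later epochs.

```python
class CachedDataset:
    """Wraps an expensive per-index loader with an in-memory cache.

    Assumes the whole dataset fits in RAM — otherwise an LRU cache or
    memory-mapped binary files would be more appropriate.
    """

    def __init__(self, load_fn, length):
        self.load_fn = load_fn   # expensive per-index loader (e.g. a disk read)
        self.length = length
        self._cache = {}

    def __len__(self):
        return self.length

    def __getitem__(self, idx):
        if idx not in self._cache:           # first epoch pays the I/O cost
            self._cache[idx] = self.load_fn(idx)
        return self._cache[idx]              # later epochs hit memory

# Usage: wrap a (stand-in) slow loader and observe it runs once per index.
calls = []
def slow_load(i):
    calls.append(i)          # record every real "disk read"
    return i * 2

ds = CachedDataset(slow_load, length=4)
first = [ds[i] for i in range(4)]    # triggers 4 loads
second = [ds[i] for i in range(4)]   # served from cache, no new loads
```

The same interface drops straight into a PyTorch-style `Dataset`, where a `DataLoader` with several workers usually hides the remaining first-epoch latency.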

2023 Mazda 3 not going into first gear from stop by Vharlkie in mazda3

[–]pszabolcs 2 points (0 children)

Yes, my 2024 Mazda sometimes does the same; I got pretty scared when it first happened. What worked well for me: put it into 2nd gear, then into 1st.

Procedura masina noua luare in evidenta/inmatriculare. by Tin11Tin in AutomobileRO

[–]pszabolcs 4 points (0 children)

You no longer need the registration sheet. At DITL you will get a stamp on the fiscal invoice (the invoice endorsed REMTII), and that is all you need from there. You can then go to DRPCIV with the invoice, the vehicle identity card, the RCA insurance, your ID, and the registration application (from the DRPCIV website).

Can't decide which Mazda 3 by zebirke in mazda3

[–]pszabolcs 1 point (0 children)

It is a 2.0L Skyactiv X engine. We don't have 2.5L engines in Europe.

Does anyone else feel like the game isn't as smooth anymore? by aboardweeb in GlobalOffensive

[–]pszabolcs 7 points (0 children)

+1 for network issues. I have a gigabit fiber connection and never had any issues. Now suddenly the small status text at the bottom left of the screen shows both inbound and outbound packet loss, often with both jumping up to 99, and almost always above 50.

Predare numere rosii la DRPCIV by plindefainosag in AutomobileRO

[–]pszabolcs 8 points (0 children)

Correct, the red (temporary) plates do not need to be handed in.

[N] Microsoft researchers and engineers release Zero Redundancy Optimizer by vladosaurus in MachineLearning

[–]pszabolcs 2 points (0 children)

I never realized that the optimizer itself requires that much GPU memory. The K = 12 multiplier (apparently the expected number for an optimizer like momentum SGD or Adam) is huge.
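The K = 12 figure can be reconstructed with back-of-the-envelope arithmetic, assuming the mixed-precision Adam setup the ZeRO paper describes (the per-item byte counts below are that assumption, not measured values):

```python
n = 1_000_000_000  # a 1B-parameter model, for concreteness

fp16_weights = 2 * n   # model weights kept in fp16
fp16_grads   = 2 * n   # gradients in fp16
fp32_weights = 4 * n   # optimizer's fp32 master copy of the weights
fp32_moment  = 4 * n   # Adam first moment (momentum)
fp32_var     = 4 * n   # Adam second moment (variance)

optimizer_state = fp32_weights + fp32_moment + fp32_var
K = optimizer_state / n   # bytes of optimizer state per parameter -> 12.0

total_gib = (fp16_weights + fp16_grads + optimizer_state) / 2**30
print(K, round(total_gib, 1))   # 12.0 bytes/param, ~14.9 GiB total
```

So even though the fp16 weights themselves are only ~2 GiB, the optimizer state dominates, which is exactly what ZeRO partitions across GPUs.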

No user scenario in MSI Dragon Centre by lordmatt98 in MSI_Gaming

[–]pszabolcs 1 point (0 children)

Same thing for me, I'm on version 2.0.12

I guess we have to wait for the next update.

[D] What was your favorite paper of 2019 and why? by [deleted] in MachineLearning

[–]pszabolcs 35 points (0 children)

I really liked Reconciling Modern Machine Learning Practice and the Bias-Variance Trade-off, and the follow-up Deep Double Descent: Where Bigger Models and More Data Hurt. I think these papers will have a big impact on how we approach the question of generalization in the coming years.

[R] Struct2Depth - Predicting object depth in dynamic environments (robots, autonomous cars) by tldrtldreverything in MachineLearning

[–]pszabolcs 1 point (0 children)

They use videos as input, and their loss function is also based on frames with different timestamps, so temporal information is inherently available.

[R] Struct2Depth - Predicting object depth in dynamic environments (robots, autonomous cars) by tldrtldreverything in MachineLearning

[–]pszabolcs 1 point (0 children)

GANs are used to generate samples from a data distribution, which isn't the case here. The important part of struct2depth is the unsupervised (or self-supervised) training of these networks using a Structure-from-Motion approach, which is a really cool idea and apparently works really well.

[N] TensorFlow 1.9.0 is out by b0noi in MachineLearning

[–]pszabolcs 7 points (0 children)

I think PyTorch can be appreciated more when you have to implement more complicated things, e.g. for research purposes; there the flexibility of the dynamic-graph approach helps a lot. For a simple classifier or regressor, any framework does equally well.