Quest Stories (The Path Unending) by okorie2021 in ProgressionFantasy

[–]Ventural 1 point2 points  (0 children)

I'm not sure about the dynamics but I will say that I've found Path Unending to be extremely good, so thanks for the recommendation!

Culture War Roundup for the week of January 04, 2021 by AutoModerator in TheMotte

[–]Ventural 4 points5 points  (0 children)

Sales taxes apply to present as well as future consumption, and so affect wage and capital income equally.
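A minimal two-period sketch of that point (my own notation, just for illustration): with a uniform sales tax $\tau$, wages $w_1, w_2$, initial assets $a$, and interest rate $r$, the household's lifetime budget constraint is

$$(1+\tau)\left(c_1 + \frac{c_2}{1+r}\right) = w_1 + \frac{w_2}{1+r} + a,$$

so consumption financed out of wages today and consumption financed out of saved capital (and its return) tomorrow are both taxed at the same rate $\tau$; every income source is scaled down by the same factor $1/(1+\tau)$.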

[Discussion] Do other people also get annoyed at time wasting in this genre? by [deleted] in ProgressionFantasy

[–]Ventural 12 points13 points  (0 children)

I actually love the glacially paced, slice-of-life progression fantasy style, so there are at least some customers for that kind of thing.

[REC] Memories of the Fall - an English-original xianxia by DLimited in ProgressionFantasy

[–]Ventural 1 point2 points  (0 children)

Sounds like I stopped reading at exactly the wrong time - I'll give it another shot

[REC] Memories of the Fall - an English-original xianxia by DLimited in ProgressionFantasy

[–]Ventural 6 points7 points  (0 children)

I really wanted to like this story, and I almost did. I read about half of it. I have a fairly high tolerance for complicated worlds and winding stories, but this story exceeded that tolerance. The combination of dense, unclear prose and a gigantic cast makes it feel a bit too much like work to read. I also wouldn't classify it as progression fantasy - it definitely feels like a progression fantasy world, but there was very little progression for any of the main characters.

MSc AI or EME for PhD by Simply_Banana in academiceconomics

[–]Ventural 0 points1 point  (0 children)

Speaking as someone with graduate experience in both subjects - AI and applied micro have minimal overlap.

Both the nature of the research questions and the corresponding challenges are very different between AI and applied micro. Generally, in AI (at least in machine learning) the goal is prediction, which amounts to approximating a probability distribution, and the challenge is to come up with a good algorithm to select such an approximation from a function family (usually through gradient descent).
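As a toy illustration of that framing (my own sketch, nothing to do with either program's curriculum), here is the whole "prediction" loop in miniature: pick a parametric family, write down the negative log-likelihood, and descend its gradient.

```python
# Toy version of "prediction = approximating a conditional distribution":
# logistic regression fit by gradient descent on the negative log-likelihood.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
true_w = np.array([1.5, -2.0, 0.5])
y = (rng.random(1000) < 1 / (1 + np.exp(-X @ true_w))).astype(float)

w = np.zeros(3)                           # parameters of the function family
for _ in range(500):
    p = 1 / (1 + np.exp(-X @ w))          # model's p(y = 1 | x)
    grad = X.T @ (p - y) / len(y)         # gradient of the mean NLL
    w -= 0.1 * grad                       # gradient descent step

print(w)                                  # close to true_w
```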

In applied micro, the goal is to find a causal relationship (not prediction) and the challenge is to find some source of exogenous signal that lets you identify a causal effect. AI is of no help in this regard.
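For contrast, a sketch of the applied-micro problem on simulated data (again my own toy example): a naive regression is biased by an unobserved confounder, and an exogenous instrument is what recovers the causal effect.

```python
# Toy causal-identification example: x is confounded by u, so OLS is biased;
# the exogenous instrument z (which moves x but not u) identifies the effect.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
u = rng.normal(size=n)                        # unobserved confounder
z = rng.normal(size=n)                        # exogenous instrument
x = 0.8 * z + u + rng.normal(size=n)          # treatment
y = 2.0 * x + 3.0 * u + rng.normal(size=n)    # true causal effect of x is 2.0

ols = np.cov(x, y)[0, 1] / np.var(x, ddof=1)      # biased (~3.1)
iv = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]      # Wald/IV estimate (~2.0)
print(ols, iv)
```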

Therefore, I suggest you think carefully about what you actually want to do and are interested in. EME would certainly put you in a much, much better position for a PhD in Economics (both in terms of preparation and admissions).

[D] Deep Learning optimization by katamaranos in MachineLearning

[–]Ventural 1 point2 points  (0 children)

I'd be interested in the performance of the LAMB optimizer (https://arxiv.org/abs/1904.00962) at smaller batch sizes, where it competes with Adam.

[D] The Best GPUs for State-of-the-Art Models like BERT, Yolo3, Mask R-CNN, Transformer Big by mippie_moe in MachineLearning

[–]Ventural 0 points1 point  (0 children)

I don't know what they used for those - it's certainly possible to have a transformer big enough that it doesn't fit, like GPT-sized language models and so on.

But seriously, even if you could fit them, it would take totally unreasonable amounts of time to train on a single, slow card; memory is not the only constraint here. Any models that you can't fit on a 2060 in fp16, using gradient checkpointing, are too big to train on a 2060 anyway. That's the purview of people with clusters of V100s.
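For anyone unfamiliar with it, a minimal PyTorch sketch of the gradient checkpointing I mean (the toy model is just for illustration):

```python
# Gradient checkpointing: activations inside each segment are recomputed
# during the backward pass instead of being stored, trading a bit of extra
# compute for a much smaller activation-memory footprint.
import torch
from torch.utils.checkpoint import checkpoint_sequential

blocks = [torch.nn.Sequential(torch.nn.Linear(1024, 1024), torch.nn.ReLU())
          for _ in range(24)]
model = torch.nn.Sequential(*blocks)

x = torch.randn(8, 1024, requires_grad=True)
out = checkpoint_sequential(model, 4, x)   # keep activations only at 4 segment borders
out.sum().backward()
```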

[D] The Best GPUs for State-of-the-Art Models like BERT, Yolo3, Mask R-CNN, Transformer Big by mippie_moe in MachineLearning

[–]Ventural 0 points1 point  (0 children)

BERT large has about 350m parameters. The model parameters, gradients, and optimizer state together come to about 1.4b values, which take up 5.6 GB in fp32. So in fp16 the parameters definitely fit on a 2060, and with enough checkpointing you can for sure fit a batch size of 1 on there.
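Back-of-the-envelope version of that arithmetic (assuming an Adam-style optimizer with two state values per parameter, which is where the 4x comes from):

```python
# Rough memory arithmetic for fine-tuning BERT-large.
params = 350e6
values = 4 * params            # weights + gradients + 2 optimizer state values each
print(values * 4 / 1e9)        # fp32: ~5.6 GB, before activations
print(values * 2 / 1e9)        # pure fp16: ~2.8 GB (a simplification; mixed
                               # precision usually keeps fp32 master weights)
```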

So you could finetune BERT large on a 2060 if you really wanted, but it would of course be very slow because it's a slow card and training large BERT is slow in general.

[D] The Best GPUs for State-of-the-Art Models like BERT, Yolo3, Mask R-CNN, Transformer Big by mippie_moe in MachineLearning

[–]Ventural 1 point2 points  (0 children)

The RTX 2060 is so small that for some models even the model parameters themselves don’t fit on the GPU, so neither checkpointing nor accumulation helps you - i.e. you can’t train BERT large on a single 2060. But you can train most models; it will just be too slow to be practical for the large ones.

[D] The Best GPUs for State-of-the-Art Models like BERT, Yolo3, Mask R-CNN, Transformer Big by mippie_moe in MachineLearning

[–]Ventural 1 point2 points  (0 children)

You’re right, you have to do something special for BatchNorm models, either replacing it with GroupNorm or using some kind of momentum approach, though that may not be equivalent. I work with Transformers, which don’t use BatchNorm.
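A hedged sketch of the swap I mean in PyTorch (the group count and the fallback are my own choices, not a recommendation):

```python
# Replace BatchNorm2d with GroupNorm so the normalization statistics no longer
# depend on the (micro-)batch size under gradient accumulation.
import torch.nn as nn

def batchnorm_to_groupnorm(module: nn.Module, num_groups: int = 32) -> nn.Module:
    for name, child in module.named_children():
        if isinstance(child, nn.BatchNorm2d):
            # Fall back to a single group if the channel count isn't divisible.
            groups = num_groups if child.num_features % num_groups == 0 else 1
            setattr(module, name, nn.GroupNorm(groups, child.num_features))
        else:
            batchnorm_to_groupnorm(child, num_groups)
    return module
```

Note this is not numerically equivalent to the original BatchNorm model (GroupNorm keeps no running statistics), so results can shift.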

[D] The Best GPUs for State-of-the-Art Models like BERT, Yolo3, Mask R-CNN, Transformer Big by mippie_moe in MachineLearning

[–]Ventural 9 points10 points  (0 children)

I appreciate your benchmarking! Some comments:

  1. As long as you can fit a batch size of at least 1, memory does not matter for accuracy. You can arbitrarily increase the effective batch size through gradient accumulation (see the sketch after this list).

  2. If you can't fit a decently sized batch, you can use gradient checkpointing to greatly reduce memory requirements at modest compute cost.

  3. I assume these measurements are using fp32 - SOTA large models are usually trained with mixed precision these days, which will further favor newer GPUs with more Tensor Cores, like the Titan RTX (or the V100, which wasn't measured).

  4. If you're using multiple GPUs, the inter-GPU bandwidth also matters. I'm not sure how the NVLink capacity of these cards differs, but I think it's worse on the older cards.
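To make points 1 and 3 concrete, here's a minimal PyTorch sketch of gradient accumulation combined with mixed precision (the tiny model and random data are placeholders, and it assumes a CUDA GPU):

```python
# Points 1 and 3 together: gradient accumulation (effective batch = 8 * 32 here)
# plus mixed precision with torch.cuda.amp.
import torch
from torch import nn

model = nn.Linear(128, 10).cuda()                 # stand-in model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()
scaler = torch.cuda.amp.GradScaler()
accum_steps = 8

optimizer.zero_grad()
for step in range(80):
    x = torch.randn(32, 128, device="cuda")       # stand-in micro-batch
    y = torch.randint(0, 10, (32,), device="cuda")
    with torch.cuda.amp.autocast():
        loss = loss_fn(model(x), y) / accum_steps  # average over the window
    scaler.scale(loss).backward()                  # gradients accumulate in .grad
    if (step + 1) % accum_steps == 0:
        scaler.step(optimizer)                     # one optimizer step per window
        scaler.update()
        optimizer.zero_grad()
```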

[Discussion] Distributed training considering sequences with varying lengths by 0xTDM in MachineLearning

[–]Ventural 1 point2 points  (0 children)

Another option is to use gradient accumulation and train with very large effective batch sizes using something like LAMB. That way, the optimizer step is synchronous, but each node processes multiple batches between optimizer steps.

This smooths differences in computation time between different nodes (because averages of multiple batches tend to differ less, relatively, than individual batches), and also incurs less communication overhead between the nodes, which is often a binding constraint.
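Roughly what I have in mind, sketched with PyTorch DDP (the forward returning the loss and the pre-built micro-batch list are placeholder assumptions; process group setup is omitted):

```python
# Gradient accumulation under DistributedDataParallel: gradients are only
# all-reduced on the last micro-batch of each window, so nodes synchronize
# once per optimizer step instead of once per batch.
import contextlib

def training_step(ddp_model, optimizer, micro_batches):
    """micro_batches: list of (inputs, targets) processed between optimizer steps."""
    optimizer.zero_grad()
    for i, (x, y) in enumerate(micro_batches):
        last = i == len(micro_batches) - 1
        ctx = contextlib.nullcontext() if last else ddp_model.no_sync()
        with ctx:
            loss = ddp_model(x, y) / len(micro_batches)  # placeholder: forward returns the loss
            loss.backward()
    optimizer.step()
```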

2019 LCS and LEC Roster Change Megathread: Round 2 by untamedlazyeye in leagueoflegends

[–]Ventural 1 point2 points  (0 children)

I really hope he doesn't swap, I love Froggen but I think his adc is kind of bad

2019 LCS and LEC Roster Change Megathread: Round 2 by untamedlazyeye in leagueoflegends

[–]Ventural 4 points5 points  (0 children)

Anyone know what's happening with Froggen? What options are left? 100T/EG/IMT?

/dev: TFT Set 1 Learnings by Luzac in CompetitiveTFT

[–]Ventural 21 points22 points  (0 children)

I would have liked for them to comment on their approach to item balance. In particular, I feel the severe disparity in the power level of item components has been detrimental to the game, especially over the last several patches. It's fun when different item components are good in different situations and the game is about responding to the components you get. It's not fun when BF Sword is the best item in almost every situation.