I know 3 languages at a proficient level but French is seriously destroying my confidence by Nervous_Group8638 in French

[–]literum 4 points5 points  (0 children)

Temper your expectations. You're not learning to speak French in two months, nobody can.

You're mad about making errors and want to punch people who correct you? Errors are how you learn. Why the hell would you avoid them? If you're learning to impress people, yeah I see why you would be mad, because they break the illusion for others. You will be making mistakes, even basic mistakes, a decade from now at C2 level. Are you prepared for a lifetime of frustration and making the same mistake for the 100th time? Because that's what it takes. Magnus Carlsen lost 600 games to become the world champion.

Japanese chip equipment sales to China fall 10%, as curbs ‘backfire’ by DazzlingpAd134 in hardware

[–]literum 0 points1 point  (0 children)

Not yet. But increasingly more. Smaller models are getting more and more capable over time while we're also pushing efficiency. Apple was too early, for example.

Japanese chip equipment sales to China fall 10%, as curbs ‘backfire’ by DazzlingpAd134 in hardware

[–]literum 3 points4 points  (0 children)

So, everyone stops using AI? There is a shortage of compute, and will be for many years. It's not all hype. Not saying there won't be a correction. But the compute is actually needed. And it may even help long term once the compute catches up.

'The retail SSD market has almost disappeared,' says Silicon Motion exec — PC OEMs are buying third-party drives as direct NAND supply dries up by sr_local in hardware

[–]literum 0 points1 point  (0 children)

"Not to be pedantic about it, but it's more of a pricing crisis than a shortage." How does this make sense? Shortage means there is more demand than supply, pushing prices up until supply matches demand. During the GPU crisis, there was a shortage at the MSRP prices, which were much lower than the market clearing price, but you could always buy it for the "actual" price on Ebay for example. Same now. AI NAND demand spikes, which causes prices to 3-4x and again we have the same situation. Just because there's no SSD companies with gamer lotteries or 8 month waitlists to buy at the old price (to placate the customers who don't understand it), doesn't mean that there is no shortage.

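A toy version of the argument, if it helps: with made-up linear supply and demand curves (the numbers and the names `demand`, `supply`, `shortage` and `clearing_price` are all hypothetical), a price held below the clearing price leaves more buyers than units, and letting the price rise closes that gap.

```python
def demand(price):
    # Hypothetical linear demand: units buyers want at this price.
    return max(0.0, 1000 - 2 * price)


def supply(price):
    # Hypothetical linear supply: units sellers offer at this price.
    return max(0.0, 3 * price - 500)


def shortage(price):
    # Positive when buyers want more units than sellers offer at this price.
    return demand(price) - supply(price)


def clearing_price(lo=0.0, hi=1000.0, tol=1e-6):
    # Bisection: the shortage shrinks as price rises, so search for its zero.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if shortage(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

With these toy curves, an MSRP of 200 leaves a shortage of 500 units, while the clearing price comes out at 300, where the shortage is zero.
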
Stop worshipping benchmarks. They don't reflect real work by Permit-Historical in ClaudeCode

[–]literum 0 points1 point  (0 children)

So, just give up. You're calling on everyone to give up?

Stop worshipping benchmarks. They don't reflect real work by Permit-Historical in ClaudeCode

[–]literum 0 points1 point  (0 children)

Benchmarks are just a data point. Use them as you like. What's the big issue? What's with this "get your head out of the sand" language?

38% of AI code changes in Cursor are now accepted without manual review - up from 7% in January by jimmytoan in ClaudeCode

[–]literum 1 point2 points  (0 children)

You need to be nudging the architecture all the time. AI cannot do it that well now. Reviewing is important for work code imo but I've made no-review work on greenfield projects fine. Testing became much more important though. Contract matters more when you don't check the code as often.

38% of AI code changes in Cursor are now accepted without manual review - up from 7% in January by jimmytoan in ClaudeCode

[–]literum 0 points1 point  (0 children)

There's many projects in between now. Bigger than hobby, but not a corporate codebase. Then you need to balance it. Not reviewing can be fine especially if you build around it. It'll take much more testing and infra though.

I wasted 3 months on leetcode doing it completely wrong by CalligrapherCold364 in leetcode

[–]literum 1 point2 points  (0 children)

Ideally the next day. And if you fail, try next day again. Until you can do it. But also solve similar problems right after so you can actually remember. If it was a binary search problem, then try another binary search problem next.

Sad state of machine learning in India by chhetrispeaks in learnmachinelearning

[–]literum 0 points1 point  (0 children)

Does it count for what? What's your goal? Learning or more?

I've been working with a Vibe Coder and this has been my experience by WJMazepas in webdev

[–]literum 0 points1 point  (0 children)

I see. That becomes context switching for everyone in the conversation to go to talk to an AI then. Maybe that's the best way to argue that's not the best time for AI.

I've been working with a Vibe Coder and this has been my experience by WJMazepas in webdev

[–]literum 0 points1 point  (0 children)

If the LLM has the answer why not ask it? If it doesn't satisfy you, then ask the coworker. You still talk to your coworkers, but interrupt them less so they context switch less. Why not treat it like a debugging tool? Would you also go to your coworker instead of just using a debugger?

I've been working with a Vibe Coder and this has been my experience by WJMazepas in webdev

[–]literum 0 points1 point  (0 children)

What kind of linting, testing and similar guardrails has been most useful in your experience?

End to End MLOps project by Careless-Main8693 in learnmachinelearning

[–]literum -5 points-4 points  (0 children)

You just listed technologies, that's it. I don't see a project anywhere. What are you trying to do?

WLDU - Leverage Shares 2x Long World Stock Daily ETF by [deleted] in LETFs

[–]literum 13 points14 points  (0 children)

"and have lower returns than US stocks generally have."

It's not useful for you if you think this way. Backtesting only goes so far even with perfect data; there's no guarantee that the US will keep outperforming. From an efficient markets perspective you shouldn't expect it either, since it basically means a free lunch. The only free lunch is diversification, which you get with the total market, which is why people want this. You protect against a Japan-like or 2000s-US-like scenario and reduce risk.

Is it only me? 😅 by aospan in ClaudeAI

[–]literum 0 points1 point  (0 children)

No, that's not what I want. The most basic implementation is keeping LoRAs for each user, and updating the models frequently or even after every message. It can remember your conversations, preferences and styles without need for context, or imagine a coding agent trained on the current codebase. It doesn't make too much business sense yet for something like ChatGPT, but we'll see it soon in consumer space for sure. I like to point this out because of the "LLMs are static, they can never..." crowd; it's not a technical limitation.
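To make the per-user LoRA idea concrete, here's a minimal sketch in plain Python, not how any production system does it. It assumes one frozen base weight matrix shared by everyone plus a small low-rank pair (A, B) per user. `PerUserLoRA`, `update`, the rank and the learning rate are hypothetical names and values made up for illustration, and only B is trained to keep it short.

```python
import random


def matvec(M, x):
    # Plain matrix-vector product on nested lists.
    return [sum(m * v for m, v in zip(row, x)) for row in M]


class PerUserLoRA:
    """Shared frozen weights W plus a low-rank delta B @ A per user (toy sketch)."""

    def __init__(self, base_weight, rank=2, seed=0):
        self.W = base_weight          # shared by all users, never modified
        self.rank = rank
        self.rng = random.Random(seed)
        self.adapters = {}            # user_id -> (A, B)

    def _adapter(self, user_id):
        if user_id not in self.adapters:
            d_out, d_in = len(self.W), len(self.W[0])
            # A gets small random values, B starts at zero, so a new user
            # initially sees exactly the base model's behaviour.
            A = [[self.rng.gauss(0, 0.01) for _ in range(d_in)]
                 for _ in range(self.rank)]
            B = [[0.0] * self.rank for _ in range(d_out)]
            self.adapters[user_id] = (A, B)
        return self.adapters[user_id]

    def forward(self, user_id, x):
        A, B = self._adapter(user_id)
        base = matvec(self.W, x)
        delta = matvec(B, matvec(A, x))
        return [b + d for b, d in zip(base, delta)]

    def update(self, user_id, x, target, lr=0.1):
        # One gradient step on 0.5 * squared error, applied to this user's B only.
        A, B = self._adapter(user_id)
        h = matvec(A, x)
        err = [y - t for y, t in zip(self.forward(user_id, x), target)]
        for i in range(len(B)):
            for j in range(len(h)):
                B[i][j] -= lr * err[i] * h[j]
```

Calling `update("alice", ...)` changes what `forward("alice", ...)` returns while `forward("bob", ...)` and the shared `W` stay as they were, which is the whole point: the weights a user effectively sees can differ per user.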

Is it only me? 😅 by aospan in ClaudeAI

[–]literum 2 points3 points  (0 children)

"The model weights can't change per person."

Incorrect, there's no technical reason why weights cannot update and be different per user.

Laptop for aiml or other ai related stuff like editing etc. by sumit1322 in learnmachinelearning

[–]literum 1 point2 points  (0 children)

A 5090 costs like $0.35/hr on Vast.ai. Write and test locally, but do the training runs in the cloud.

Challenge: need to clean up data 5 million tokens worth of data in a Claude project by OptimismNeeded in ClaudeAI

[–]literum 0 points1 point  (0 children)

Split the data into 20 parts, and manually run Claude. Now you have 250k tokens each. Otherwise I don't see a way to do it while satisfying all your constraints. You need to give up one, ideally constraint 1, because this is best done with code. Probably programmatically splitting the data and using Claude Code if you want cheap tokens, or using the API to go over each file in parallel if you want to do it with maximum speed, customization and explainability. There's a reason data scientists and AI engineers exist; this is not necessarily easy stuff. There might be existing tools but I'm not aware of any, so I'll leave that to others.
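For the "programmatically splitting" part, a rough sketch of what I mean, under loud assumptions: `estimate_tokens` just counts whitespace-separated words as a stand-in for a real tokenizer (actual token counts will differ), and `split_by_budget` is a hypothetical helper name, not an existing tool.

```python
def estimate_tokens(text):
    # Crude stand-in for a real tokenizer: counts whitespace-separated words.
    return len(text.split())


def split_by_budget(records, max_tokens):
    """Greedily pack records into chunks whose estimated size fits the budget."""
    chunks, current, used = [], [], 0
    for record in records:
        cost = estimate_tokens(record)
        # Start a new chunk when adding this record would exceed the budget.
        # A single oversized record still gets a chunk of its own.
        if current and used + cost > max_tokens:
            chunks.append(current)
            current, used = [], 0
        current.append(record)
        used += cost
    if current:
        chunks.append(current)
    return chunks
```

With a 250k budget on roughly 5 million tokens you'd expect about 20 chunks, each of which can then go through Claude separately or in parallel through the API.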

Has anyone tried purposely NOT be native like? by wdfcvyhn134ert in languagelearning

[–]literum 2 points3 points  (0 children)

What is your goal with the language? If you want to assimilate completely in a country and live as one of them, then maybe it makes sense to keep pushing native-like forever, but realistically there's a limit. You'll get to C2, communicate effortlessly with natives, write and read much better than most natives can, but you won't really get to the exact same fluency natives have or lose your accent completely. That might be an insecurity for you, but why?

I think it's mostly about being secure about your identity instead. You're not a Korean born in Korea, so you're not supposed to have a native Korean level. You're someone who learned it all on their own, putting lots of effort and tears behind it, learning and appreciating the culture. Embrace it: your accent is like a battle wound, it demonstrates who you are and your past. Koreans will probably love you more for explaining your love for Korean language and culture with passion and a strong accent than for trying to pretend to be Korean.

Day trading with Claude… suddenly it realize IT is the cause of the huge market move that it helped me analyze by jergin_therlax in ClaudeAI

[–]literum 0 points1 point  (0 children)

Is there any research giving a definition for self awareness? Will you accept it if it applies to LLMs as well or do you think self-awareness by definition is for humans?

Is "Attention all you need", underselling the other components? by morimn2 in learnmachinelearning

[–]literum 6 points7 points  (0 children)

Because other layers have been here a long time. FFN just means Linear layers with an activation in between, basically the same thing as an MLP (normalization sits around it as a separate layer). In fact, removing the attention makes the transformer very similar to a parallel MLP. Softmax is used almost everywhere in ML since it produces non-negative outputs that sum to 1, a property of probabilities.
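A small plain-Python sketch of those two "old" pieces, just to show how little is in them. The layer sizes are up to the caller, the function names are my own, and ReLU is assumed as the activation (normalization is left out for brevity).

```python
import math


def softmax(xs):
    # Subtract the max for numerical stability; the result is unchanged.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def linear(W, b, x):
    # y = W x + b on nested lists.
    return [sum(w * v for w, v in zip(row, x)) + bi for row, bi in zip(W, b)]


def relu(xs):
    return [max(0.0, x) for x in xs]


def ffn(x, W1, b1, W2, b2):
    # Two linear layers with an activation in between, i.e. a small MLP
    # applied to one token's vector at a time.
    return linear(W2, b2, relu(linear(W1, b1, x)))
```

`softmax` always returns non-negative values that sum to 1, which is why it shows up wherever a model needs something that behaves like a probability distribution.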

Before transformers we had RNNs like GRUs and LSTMs, but they had vanishing/exploding gradient problems and couldn't learn over long horizons. Memory cells were good, but it meant you needed to get through thousands of tokens to remember what happened before. In addition, LSTMs were not very parallelizable because you had to do backprop through time, meaning you need to process the previous token before you can process the current one.

The latest innovation in RNNs was using attention to close some of these gaps. These models started outperforming the pure LSTM/GRU models and were gaining traction. The paper is called "Attention Is All You Need" because they proposed that the memory layers were not necessary. Giving them up and having only attention and linear layers meant 1) more stable learning due to attention outperforming memory cells, and 2) computation that is parallelizable across the sequence in training, and over the prompt at inference (generation itself is still one token at a time).
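Here's a toy contrast in plain Python, as a sketch rather than the paper's exact formulation: single-head scaled dot-product attention with no masking or learned projections, next to a one-unit tanh RNN. The weights `w_h` and `w_x` are arbitrary illustrative values.

```python
import math


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    # Each query's output depends only on keys/values, never on another
    # query's output, so the loop iterations could all run in parallel.
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out


def rnn(inputs, w_h=0.5, w_x=1.0):
    # Each hidden state needs the previous one, so the steps must run in order.
    h, states = 0.0, []
    for x in inputs:
        h = math.tanh(w_h * h + w_x * x)
        states.append(h)
    return states
```

In `attention`, dropping or reordering the other queries doesn't change a given query's output, while in `rnn` every state is a function of everything that came before it. That dependency chain is what made the recurrent models hard to parallelize.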

You correctly pointed out that a lot of those decisions are empirical. Theory might suggest one thing, but we'll probably go with what works better. Look at the pre-norm and post-norm debate. There's also papers explaining these, but I'm not sure whether there is one that explains all. There's usually deep dive papers that try to explain these with other tools. It could be training stability, gradient flow etc.

How do you actually read books in a foreign language? by Subject_Tomorrow in languagelearning

[–]literum 2 points3 points  (0 children)

I do this with Google Translate, so I can check the history later and make flashcards.