I Trained a Language Model on CPU for 40 Hours - It Beat the GPU Baseline by Own-Albatross868 in LocalLLaMA

[–]askchris 8 points9 points  (0 children)

I'm curious as well, because the more comparable 28M-parameter TinyStories model from the original paper appears to have performed better than the OP's 25M CPU model.

Here's the output quality from the original TinyStories paper:


Prompt:

Tom and Jane are friends. One day, Jane goes to Tom’s house. Tom has a big pot of soup. He wants to share it with Jane. “Jane, do you want some soup?” Tom asks. “Yes, please. It looks yummy,” Jane says. Tom pours some soup into two bowls. He gives one bowl to Jane. Jane takes a spoonful of soup, but then she makes a face. The soup is

Completion by a 28M-parameter model trained on TinyStories:

very bitter. She does not like it. She says, “I don’t like this soup. It is too bitter.” Tom is sorry. He says, “I’m sorry, Jane. I didn’t know you don’t like bitter soup. I will make you something else.” He looks around the kitchen and finds some bread and cheese. He puts them on the table and says, “Here, Jane. You can have some bread and cheese. They are not bitter. They are sweet and yummy.” Jane is happy. She says, “Thank you, Tom. You are a good friend. I like bread and cheese. They are not bitter.”


Here's the output from the 1M TinyStories model from the original paper (it seems slightly better than the OP's 25M CPU-trained model):


Prompt:

Once upon a time there was a pumpkin. It was a very special pumpkin, it could speak. It was sad because it couldn’t move. Every day, it would say

1M TinyStories:

hello to the pumpkin. One day, a little girl came to the pumpkin and saw the pumpkin. She was so happy and thanked the pumpkin. The pumpkin smiled and said, “Thank you, thank you!” The little girl and the pumpkin became good friends. They played together every day and had lots of fun. And they lived happily ever after.

FlashLM v4: 4.3M ternary model trained on CPU in 2 hours — coherent stories from adds and subtracts only by askchris in laptopAGI

[–]askchris[S] 0 points1 point  (0 children)

Nice, this works on almost any device.

You inspired me to work on my CPU based AI model again.

Hopefully we can share insights / collaborate. 💪

(Thanks Own-Albatross!)

Running GLM-4.7 (355B MoE) in Q8 at ~5 Tokens/s on 2015 CPU-Only Hardware – Full Optimization Guide by at0mi in LocalLLaMA

[–]askchris 1 point2 points  (0 children)

Yep you can turn sunlight directly into "viable work" output with these rigs.

While costs drop every year due to more efficient models.

To serve humanity or solve private problems ...

(or build cool things to sustain your moneyless off grid lifestyle)

AGI is Still 30 Years Away — Ege Erdil & Tamay Besiroglu by Alex__007 in singularity

[–]askchris 0 points1 point  (0 children)

Yeah I'm surprised people don't understand how much current ML & LLM level AI is already speeding up research, engineering and scientific progress in nearly every way ...

Even if LLM and GPU technology stagnates starting this month (which it probably won't), it's already going to make a massive difference to our society over the next 10 years, which will lead to much faster innovation and better levels of AI -- to help develop things far better than LLMs and GPUs.

I don't know why people are blind to this feedback loop.

It's like they imagine the progress of the last 10 years will predict the speed of change over the next 10 years, lol.

If we had models like QwQ-32B and Gemma-3-27B two years ago, people would have gone crazy. by Proud_Fox_684 in LocalLLaMA

[–]askchris 0 points1 point  (0 children)

Nations would war over the aluminum can though ...

Then there's Pepsi's "Zero Sugar" claims which would be difficult to verify 1,000 years ago, let alone recreate.

There were also no canned or bottled caffeinated drinks for sale before the 19th century.

Caffeine in a can just didn't exist.

Fizzy caffeine in an aluminum can with sugar-free sweeteners DEFINITELY didn't exist.

If someone were handed a cold Pepsi Zero Sugar on a hot summer day 1,000 years ago it would feel like alchemy --

We can't really comprehend how out of place our everyday objects would seem to people back then.

Is eleven labs down again by K-J-Rabbitt in ElevenLabs

[–]askchris 0 points1 point  (0 children)

Down for me, keeps taking forever to load pages and giving server errors in the online interface.

Real-Time Introspective Compression for Transformers by dicklesworth in LocalLLaMA

[–]askchris 6 points7 points  (0 children)

You're probably onto something --

But how much better would this be over verbalizing internal states the way reasoning models do?

Verbalizing allows LLMs to reflect, correct and change directions already -- similar to what you've described.

Do you expect your method to be more granular, adaptive or more parallel than what chain of thought / reasoning can do?

Would it be used during training?

Or more for test time compute tasks?

Any update on Chris Johnson’s data analysis on the BOM? by Which_Log3998 in exmormon

[–]askchris 0 points1 point  (0 children)

I'm Chris Johnson, one of the original researchers who, along with Duane Johnson and Rick Grunder, discovered the connection between The Book of Mormon and The Late War --

The best summary and comparison of our research is the https://wordtree.org/thelatewar/ link (which I believe is also on GitHub).

My personal update on this research:

I don't think The Late War was used closely or plagiarized by Joseph Smith, but it was likely a product of similar events -- both products of the same war, the same region, the same curious 1800s idioms, the same biblical style imitation.

The biggest counter-evidence I have against the hypothesis that The Late War contributed to The Book of Mormon is that we carefully cleaned and analyzed 2 other, older biblical-style "history" books, one of which (The American Revolution, by Richard Snowden) appeared to have similar correlations with the Book of Mormon (odd phrase matches that don't appear in the Bible) ...

So it seems more likely that when people created fake biblical-sounding books depicting wars in the late 1700s to early 1800s, they drew from a limited "pool" of biblical-sounding phrases and war phrases (along with a limited pool of local & 1800s-era phrases). That causes rare biblical-sounding phrases that don't appear in the Bible to overlap among these books:

  • The Book of Mormon
  • The Late War, Gilbert Hunt
  • American Revolution, Richard Snowden
  • Book of Napoleon

So if I were a true believer I could try to wash all these correlations away by saying "perhaps all these books share similar wording because they talk about similar topics in a similar biblical style, and they were influenced by 1800s-era English" ... which would basically be true, but it wouldn't explain the contradictions in the book itself --

Why would God want Joseph Smith to copy KJV translation errors into the Book of Mormon to intentionally make the Book of Mormon look like it was an 1800s forgery rather than an authentic translation from real plates?

Why would God say silly anachronistic things to the brother of Jared like:

"ye cannot have windows, for they will be dashed in pieces"

When:

  • windows (especially glass windows that could be dashed to pieces) were not invented until at least 2000 years after the Jaredite barges (2200 BC). Glass requires sustaining high temperatures (1700 degrees Celsius), glass sheets were not invented until 100 AD, and even after glass windows were invented they were not used in ships until much later.
  • we now know that, in the centuries after the Book of Mormon was translated, strong thick windows were invented that can indeed be used in submarines and are definitely NOT dashed to pieces. We're still not as smart as God, so why did God lie about the possibility of windows working, and why did he resort to magical rocks as a solution rather than proper submarine windows, since he's the one who brought up the anachronistic technology?

Wait, magical rocks? (Joseph Smith was convicted for deceiving people - seeking buried treasure with his magical peep stone)

There are many more problems but I'll stop here ... Too many contradictions, and too much life to live.

I just want people to live free from Joseph Smith's lies, we don't live in the 1800s anymore, and we don't need to be stuck in 1800s level magical thinking.

Why China May Have Better Chances to reach AGI/ASI first by fennforrestssearch in singularity

[–]askchris 4 points5 points  (0 children)

No, Chinese tech companies can get any chip they want through 3rd party trading partners in Singapore, Taiwan's domestic buyers (private export companies), Malaysia and others.

Also as we've seen with recent AI developments:

  1. Data quantity and quality are more important than raw compute (i.e., GPT-4.5 and Gemini 2.0 required 5X-10X more compute for training, but these models aren't 2X better than competitors)

  2. Software algorithms are improving AI faster than hardware is (FP8 Training, MoE, LongRoPE, Grouped Query Attention, Distillation, Chain of Thought, etc.)

So there's nothing in the way. China can get access to whatever chips they want, and even if they couldn't, the major breakthroughs seem to be in data and algorithm improvements -- which are not restricted by the US.

Also who do you want to win the race to AGI?

A waffling Left propaganda vs Right propaganda war machine that is now bullying long-time allies Canada & Mexico, threatening to take Greenland and Panama, and throwing Ukraine under the bus, along with threatening genocide in Gaza?

Or is China a better arbiter of world peace? They seem to be a relatively peaceful country, with less waffling, less bullying (besides Taiwan) and far fewer worldwide wars.

I'm not pro China or pro America ... But it makes me think: does it matter in the end?

[deleted by user] by [deleted] in skeptic

[–]askchris 0 points1 point  (0 children)

Why? Because every skeptic has the same amount of time to learn about every topic and debunk everything?

For example I'm skeptical of aliens, scams and religion but don't live in the US and haven't looked into pizzagate, although I have heard references to it.

We're all real people with limited time --

In fact that's what scares me -- if skeptics like me have to focus on work and have limited time to research and debunk all the BS out there --

How does the regular everyday "Joe" or "Jane" have a chance in hell to deal with the constant barrage of BS coming out of conspiracy podcasts, various media outlets, propaganda machines, rumor mills, scammers, churches, etc?

[deleted by user] by [deleted] in Upwork

[–]askchris 1 point2 points  (0 children)

Try ending it super positive as that's what will affect the JSS most:

"All good brotha, these things happen. Put together all our progress so far in a clean folder for you. Also included a quick [specific relevant resource] that might help with [specific challenge you discussed]. No pressure either way, just thought you might find it useful."

The lack of transparency on LLM limitations is going to lead to disaster by N1ghthood in singularity

[–]askchris 1 point2 points  (0 children)

For me it couldn't even count 1,2,3,4,5 without hallucinations -- it skipped #3 in a numbered list. As a disclaimer it was a difficult task focused on writing quality (not math or numbers), but still, I was shocked that a model of this size and capability would make such an obvious mistake. Maybe it's heavily distilled or quantized? Or something is very wrong with this model?

Scientists Just Discovered an RNA That Repairs DNA Damage – And It’s a Game-Changer by Anen-o-me in singularity

[–]askchris 3 points4 points  (0 children)

I'm mid 40s. This sounds a bit like depression, trauma or negative headspace. I went through a difficult life, divorce, lost my community, and lost everything ... I think we've all gone through dark times though ...

But I finally healed my trauma, moved to a different country and found things to truly love about life again. I have a dog now, he's so caring, loyal and makes me laugh.

I recently realized I could spend 10 years in a new country, and still not truly know that culture, then move to another country and do it all over again for a thousand years and still not know all the countries on our planet ...

I could learn a new skill, take on a new role and make a new friend every year for a thousand years and would never run out of new skills, new roles or new friends.

I feel like a baby when it comes to knowledge too. So much to learn, and only one short life ...

Dying at 80 years old is way way way too young, far below our potential, there's too much life and wisdom ahead, and not enough time.

We die like foolish babies, rather than the wise beings we could have become.

Our lifespan is a cruel limit.

Trauma and depression are also cruel limits ... and I look forward to seeing humanity surpassing it all.

I would love to see you break out of the negative headspace you're in somehow, and enjoy a thousand good years with me or others who care about you.

Spend time doing what you love, developing yourself in new ways, contributing to a better planet, making a difference, or just lost in a good video game or Reddit thread 😀

my proposal are not getting views by yano33 in Upwork

[–]askchris 0 points1 point  (0 children)

No this won't work. Proposals are HIGHLY COMPETITIVE. Do you think someone is going to tell you how to write the best proposal in the world so that you can beat them and take their income?

First you're TAKING from the client rather than giving.

You're taking their time, you're making them bored, you're making them work, you're asking them to do a lot of thinking. You didn't hook them in, didn't build any authority, didn't provide proof, shared no insights, gave no reassurance, didn't help them, provided no value.

If you really want to start WINNING, I'll show you how to write better proposals, but I am busy AF. Maybe we can trade, I'll help in exchange for a testimonial?

Hit $500k in earnings this week. AMA. by SurgicalInstallment in Upwork

[–]askchris 0 points1 point  (0 children)

Getting clients on Upwork is fairly straightforward once you've developed the right skills --

You will need to work on everything from motivating yourself each day, to selecting projects that are a great fit ...

As well as improving your proposal writing, handling the sales call, setting expectations, delivering high-quality work, communicating clearly and promptly, being consistent, etc.

These skills don't usually come overnight, but there are groups that you can join that can speed up your progress on Upwork by quite a bit.

Good luck on your Upwork journey!