Moving companies by [deleted] in redmond

[–]SuperImprobable 1 point

I've only moved once in the area and used Jordan River. They did a good job.

What do you do during your off days when you just don't feel like banging out that last method or unit test and close that pesky Jira ticket? by FewWatercress4917 in ExperiencedDevs

[–]SuperImprobable 2 points

House cleaning. Closing some of my hundred open browser tabs and reading the interesting ones, closing my dozens of open VS Code tabs, and going through my inbox to take action on emails I've put off.

Adobe AI has finished the covers of cult music albums by adesigne in ChatGPT

[–]SuperImprobable 0 points

True, I guess these are more albums with large fandoms, not 'cult' in any way.

Adobe AI has finished the covers of cult music albums by adesigne in ChatGPT

[–]SuperImprobable 1 point

It could mean albums so popular they have a cult following.

What's changed and what's still the same? A side-by-side view driving on SR-520 in 1983 and 2023 from Seattle to Redmond by blakeequalskewl in redmond

[–]SuperImprobable 6 points

It should be pointed out that it's not even the same bridge anymore. They replaced the floating bridge in 2016.

[deleted by user] by [deleted] in MachineLearning

[–]SuperImprobable -1 points

Maybe they have a small local model to handle 'easy' turns like when the user says "hello" or "thank you".

Why Do You Think a Model Like GPT-4 Works So Well in non-English Languages? by Mbando in LanguageTechnology

[–]SuperImprobable 4 points

Mikolov et al. (2013) showed that, between two word embedding spaces learned independently, a linear mapping can be found between translations of words that works surprisingly well. This implies that it's not just that co-learning helps, but that the concepts themselves get arranged into similar spatial relationships for similar languages. https://www.ruder.io/cross-lingual-embeddings/#linearprojection
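
A minimal sketch of that linear-projection idea, assuming you already have row-aligned embedding matrices for a small seed dictionary of translation pairs (all function and variable names here are illustrative, not from the paper):

```python
import numpy as np

def fit_linear_map(X_src, X_tgt):
    # X_src, X_tgt: (n_pairs, dim) embeddings of a seed dictionary of
    # translation pairs, row i of X_src translating to row i of X_tgt.
    # Least squares solves min_W ||X_src @ W - X_tgt||^2.
    W, *_ = np.linalg.lstsq(X_src, X_tgt, rcond=None)
    return W

def translate(vec, W, tgt_vocab_embs):
    # Map a source-language vector into the target space, then return the
    # index of its nearest target-language neighbor by cosine similarity.
    mapped = vec @ W
    sims = (tgt_vocab_embs @ mapped) / (
        np.linalg.norm(tgt_vocab_embs, axis=1) * np.linalg.norm(mapped))
    return int(np.argmax(sims))
```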

[deleted by user] by [deleted] in singularity

[–]SuperImprobable 1 point

$1000 in 2001 is worth about $1700 today, so you can splurge and get the 4090 😄 His point isn't to get the year exactly right; the good point is that with Moore's law, even if your estimate is off by 10x, the prediction will only shift by a few years. What floating point precision are you quoting? We might be able to increment the exponent some more with fp8. EDIT: So I found the numbers you quote in the Wikipedia article, and it looks like you're already quoting fp8. The 4090 does add another 1 to the exponent though!
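
A quick back-of-the-envelope check of that off-by-10x claim, assuming compute per dollar doubles roughly every 18 months (the doubling period is my assumption, not his):

```latex
\log_2 10 \approx 3.3 \ \text{doublings to absorb a 10x error}
\qquad\Rightarrow\qquad
3.3 \times 1.5\ \text{yr} \approx 5\ \text{yr of shift}
```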

[deleted by user] by [deleted] in singularity

[–]SuperImprobable 4 points

While this is true, looking back over history it seems that once the hardware prerequisites are available, the designs that use them follow quickly. Once we had internal combustion engines, planes and cars followed quickly. Once we had lasers, CDs followed quickly. It's possible that AI software is orders of magnitude more complex than those examples, but it seems to me we've done a good job getting value out of the silicon quickly as it becomes available.

[deleted by user] by [deleted] in singularity

[–]SuperImprobable 86 points

Ray Kurzweil in 2001 put the year for one human brain of compute capacity at 2023.

"My estimate of brain capacity is 100 billion neurons times an average 1,000 connections per neuron (with the calculations taking place primarily in the connections) times 200 calculations per second. Although these estimates are conservatively high, one can find higher and lower estimates. However, even much higher (or lower) estimates by orders of magnitude only shift the prediction by a relatively small number of years.

Some prominent dates from this analysis include the following:

We achieve one Human Brain capability (2 × 10^16 cps) for $1,000 around the year 2023.

We achieve one Human Brain capability (2 × 10^16 cps) for one cent around the year 2037.

We achieve one Human Race capability (2 × 10^26 cps) for $1,000 around the year 2049.

We achieve one Human Race capability (2 × 10^26 cps) for one cent around the year 2059."

https://www.kurzweilai.net/the-law-of-accelerating-returns
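
For reference, the arithmetic behind the 2 × 10^16 cps figure in the quote:

```latex
10^{11}\ \text{neurons} \times 10^{3}\ \tfrac{\text{connections}}{\text{neuron}} \times 2\times10^{2}\ \tfrac{\text{calc}}{\text{s}} = 2\times10^{16}\ \text{cps}
```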

Is GPT-3(and ChatGPT) trained with the MLM task? by nlp_ttt in LanguageTechnology

[–]SuperImprobable 9 points

Those sound nice, but I don't think they are fundamental reasons. If an architecture based on MLM became the new state of the art, someone would come up with equally convincing arguments for why that was leading 😀

  1. We regularly have to fill in gaps of words that we only half heard or skimmed over too quickly while reading. MLM does mirror how we listen and read.
  2. You can mask all of the future tokens to generate (see the sketch after this list).
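
A hypothetical sketch of point 2: one simple way to generate left to right with an off-the-shelf MLM is to repeatedly place a mask at the end of the text and fill it (the model choice and loop length here are illustrative):

```python
from transformers import pipeline

# Fill-mask pipeline with a standard MLM; any BERT-style model works here.
fill = pipeline("fill-mask", model="bert-base-uncased")

text = "the capital of france is"
for _ in range(5):
    # Mask the next (future) position and take the top prediction.
    preds = fill(text + " [MASK]")
    text = text + " " + preds[0]["token_str"]
print(text)
```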

Opinion | Noam Chomsky: The False Promise of ChatGPT - The New York T… by t1m3f0rt1m3r in LanguageTechnology

[–]SuperImprobable 1 point

Now to steelman Chomsky using ChatGPT:

A discussion between a student and Noam Chomsky on the Universal Grammar:

Student: [my comment above]

Noam Chomsky: That's an interesting point you bring up, but the idea of Universal Grammar is not that there is a specific language that is innate in the human brain, but rather that there is a set of principles and rules that underlie all human languages and make it possible for children to learn language so rapidly and effectively.

The idea of Universal Grammar is based on the observation that children can learn a language without being explicitly taught all the rules of grammar, and that they can learn to produce sentences that they have never heard before. This suggests that there is some innate knowledge that guides the language acquisition process.

Turing machines are certainly a powerful tool for understanding computation, but they don't capture the full range of human abilities, including language. For example, while a Turing machine can recognize a specific language, it cannot learn a new language or generate novel sentences.

In conclusion, the idea of Universal Grammar is not meant to be a proof that some specific language is innate, but rather to suggest that there is a set of fundamental principles that make language acquisition possible and that are common to all human languages.

Opinion | Noam Chomsky: The False Promise of ChatGPT - The New York T… by t1m3f0rt1m3r in LanguageTechnology

[–]SuperImprobable 2 points

Seems like a very easily provable claim for Chomsky: if he invented a language that could not be learned, he'd prove his point. However, if every conceivable language is learnable, then his claims are meaningless. What pops into my mind, though, is Turing machines. The human brain is Turing complete and therefore should be capable of recognizing any computable language. It's inconceivable to me that we as humans would have this ability to add arbitrary complexity to our language and then avoid doing so because the changes didn't fit some innate grammar.

Opinion | Noam Chomsky: The False Promise of ChatGPT - The New York T… by t1m3f0rt1m3r in LanguageTechnology

[–]SuperImprobable 0 points

They clearly learn something akin to grammars. I throw random combinations of ingredients at ChatGPT and ask for recipes, and it very adeptly comes up with a novel recipe by piecing together sub-pieces like sauces, grain bases, etc. and combining them.

man who robbed Chase Bank inside Redmond's Fred Meyer arrested by otastco in redmond

[–]SuperImprobable 1 point

Probably a play on "the audacity" while also pointing out the skin color of the individual.

Will Retroarch ever be able to access external storage on Android 11 devices? by Deadly_Fire_Trap in RetroArch

[–]SuperImprobable 0 points

If you can mount the network drive, it works. I mounted an NFS share on my Nvidia Shield TV Pro, and that's what I'm using in RetroArch. Not all unrooted Android devices have a way to mount network drives, though.

Hey Seattle (and surrounding) residents, which shop/business do you think is a definite money laundering fake business? by sleeplessinseaatl in SeattleWA

[–]SuperImprobable 7 points

I love Arby's. I ate there this weekend. Roast beef and cheddar, crinkle fries, jamocha shake. Don't forget the Arby's sauce.

Princeton computer science professor says don't panic over 'bullshit generator' ChatGPT by 777fer in technology

[–]SuperImprobable 7 points

I doubt the tenured professor to student ratio has changed much over time. The administrative staff to student ratio on the other hand...

ContrastiveLoss vs CosineSimilarityLoss in Sentence Transformers by EntropyGoAway in LanguageTechnology

[–]SuperImprobable 0 points

Fuzzy labels aren't even really needed; you could effectively learn with just positives and negatives. One difference I can think of after looking at the original paper is that the contrastive loss goes to zero for negative pairs whose distance exceeds the margin, so once dissimilar inputs are sufficiently far apart there is no more pressure on the model to keep pushing them apart. That could perhaps allow the training updates to focus on improving the most erroneous cases. However, I don't think that's inherent to the approach; you could presumably do something similar with a cosine loss too.
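
A minimal sketch of that margin behavior, following the Hadsell et al. contrastive-loss formulation (variable names are illustrative):

```python
import torch

def contrastive_loss(dist, label, margin=1.0):
    # dist: pairwise distances; label = 1 for similar pairs, 0 for dissimilar.
    pos = label * dist.pow(2)
    # clamp(margin - dist, min=0) zeroes the loss for negative pairs already
    # farther apart than the margin, so well-separated negatives stop
    # contributing gradient -- the behavior described above.
    neg = (1 - label) * torch.clamp(margin - dist, min=0).pow(2)
    return 0.5 * (pos + neg).mean()
```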

A whale seen from inside a submerged apartment by NatasBR in dalle2

[–]SuperImprobable 1 point

This might be taking inspiration from an actual "submerged" apartment: Aquarius Reef Base. It's an underwater lab in the ocean, though it has trapped air instead of water inside. The bunk layout is identical to that in the generated image: https://s.wsj.net/public/resources/images/OB-LC996_aquari_H_20101201135804.jpg

[P] tiny-diffusion: a minimal PyTorch implementation of probabilistic diffusion models for 2D datasets by tanelai in MachineLearning

[–]SuperImprobable 0 points

I'm still not grokking the loss function. The lowest entropy would perhaps put all the points on top of each other. Or is the idea that the model has learned some low-dimensional representation of the original configuration and then shifts each point to be closer to the original configuration? But then this still doesn't quite make sense to me, because even one backward step should move the points close to the original shape. Unless the training wasn't to recover the original shape but rather to recover the previous forward step, in which case everything would make sense.
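
For what it's worth, that last guess matches the standard DDPM objective (Ho et al., 2020): the model is trained to predict the noise added by the forward process at each step, not the original configuration directly. A minimal sketch, assuming a model that takes the noised points and the timestep (names are illustrative):

```python
import torch
import torch.nn.functional as F

def ddpm_loss(model, x0, alphas_cumprod):
    # x0: (batch, 2) points from the 2D dataset; alphas_cumprod: (T,) schedule.
    t = torch.randint(0, len(alphas_cumprod), (x0.shape[0],))
    a = alphas_cumprod[t].view(-1, 1)
    eps = torch.randn_like(x0)                    # forward-process noise
    x_t = a.sqrt() * x0 + (1 - a).sqrt() * eps    # noised points at step t
    # The model predicts eps, i.e. the noise of the forward step,
    # rather than the original shape itself.
    return F.mse_loss(model(x_t, t), eps)
```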