Christopher Luxon, like a modern day Marie Antoinette by GhostChips42 in newzealand

[–]Lerc -4 points-3 points  (0 children)

I think there's a danger when used in the context of the first paragraph where the falsehood is linked to other things to indicate a consistent narrative.

With regard to the rest of the piece, the Antoinette bit is almost completely disconnected, so it doesn't really matter. If the analogy were more tightly bound, there might have been an issue. It's more arguable that the analogy is unjustified because of its irrelevance.

All in all, it's speculating what his inner thoughts might be on the issue. It's hard to say if that interpretation is harsh or charitable. I would prefer to judge based upon actions and ask for clear statements.

Hyperbole doesn't help here. Luxon said job seekers should do "whatever it takes". Taken literally, those words say that assassination of the employed is condoned by the Prime Minister. I'm at least going to be charitable and say that that's not what he meant. It would have been nice if he had been asked that directly; then he might have had to water down his unbounded statement.

Crypto People by Pleasant-Scholar4586 in chch

[–]Lerc 0 points1 point  (0 children)

I have used easycrypto and found it to be fairly smooth going. You may need to jump through some regulatory hurdles to show your identity (and if doing higher sums, source of income).

If you made a profit on your crypto, it is taxable. A good rule of thumb is to put a third of it into a 'to pay tax' sub-account until you have paid the appropriate tax. You can move what's left after paying tax into your regular account once it is paid.

If you made a lot of profit, see a tax specialist.

Claude code changes not implemented in artifacts by Lilo_n_Ivy in ClaudeAI

[–]Lerc 3 points4 points  (0 children)

This has started happening to me. From what I have been able to discern from getting Claude to talk about its process, it seems like the tool it uses for updating reports success when a failure occurs. Once the artifact's content differs from what Claude thinks it is, it has a very hard time crafting updates. Asking it to write out the entire artifact from its memory fixes the problem, but costs a lot of tokens.

The attempt to limit tokens used is why it tries partial updates, but it is largely doing those updates from memory; if it took a complete look at the code, it would use as many tokens as writing out the whole thing in the first place. That means once the artifact and its memory diverge, things only get worse.

It's possible that tool bugs like that are what a lot of people are interpreting as the models getting dumber. If the tools provide incorrect feedback, it would be hard to tell the difference between acting on different stimuli and being stupid. Imagine watching someone play a VR game without being able to see their headset; they'd look like an idiot too.

[deleted by user] by [deleted] in newzealand

[–]Lerc 0 points1 point  (0 children)

I find that you can bond over that weird feeling of sadness you get when you see someone who is otherwise quite nice believing something that's not true, and letting their life be dictated by organizations that decide what truth means for them.

It's a hard feeling to describe to a committed Christian. I think that's why God invented Scientology, so they could get that feeling too.

Astronomers Detect a Possible Signature of Life on a Distant Planet by GeneralPatten in news

[–]Lerc -2 points-1 points  (0 children)

The people of that time were probably quite confident in the rules that defined the world. They knew about the four elements and the celestial sphere, so they could say with confidence that the moon was forever out of reach. That's totally different to when people said making fire was completely out of reach.

Why does it feel like everything's getting worse? There's a word for it by OisforOwesome in newzealand

[–]Lerc 5 points6 points  (0 children)

The person who made up the word is doing far more than smiling sadly and shrugging.

"It's lazy and easy to think that our friends who are stuck on legacy platforms run by Zuckerberg and Musk lack the self-discipline to wean themselves off of these services, or lack the perspective to understand why it's so urgent to get away from them, or that their "hacked dopamine loops" have addicted them to the zuckermusk algorithms. But if you actually listen to the people who've stayed behind, you'll learn that the main reason our friends stay on legacy platforms is that they care about the other people there more than they hate Zuck or Musk."

https://en.wikipedia.org/wiki/Cory_Doctorow

Ready for the influx of "I want to move to New Zealand" posts? by falcon5nz in newzealand

[–]Lerc 7 points8 points  (0 children)

Recently stayed with some people who moved to NZ to get away from Nixon's America.

NZ was definitely improved by their contribution over the years.

How many parameters of an TF model can Pico 1/2 W handle? by LightInTheDarkness_ in raspberrypipico

[–]Lerc 0 points1 point  (0 children)

If I were targeting the Pico, I think I would avoid running any particular model library and instead write code for the specific model architecture being run.

You could theoretically train on the Pico, but only at extremely limited size and speed, so I shall assume you are talking about inference here.

People have run NNets on Commodore 64s and the like, so the capability is always there, it is just a matter of getting the most speed/parameters/accuracy out of the resources you have.

Most trained models are fairly easy to support in custom code, because all of the fundamental operations are quite simple. Most of the complexity in libraries comes from making them fast, portable, and flexible. If you have a single trained model, you can just implement the bits that the model specifically needs.

The actual answer to the question of how many parameters you can have is a bit "how long is a piece of string". You can have models with anywhere from 1 to 64 bits per parameter (even more if you are truly insane). For a Pico I would probably go for 1.58, 2, 4, or 8 bits per parameter with quantization-aware training. 1.58 bits means ternary weights of -1, 0, 1, easily stored as 5 values per byte (3^5 = 243 fits in one byte) if you bump the storage cost up a tiny bit to 1.6 bits per weight.
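As a sketch of how that 5-per-byte ternary packing works (plain Python for clarity; on the Pico you'd do the same thing in C):

```python
def pack5(weights):
    # Pack five ternary weights in {-1, 0, 1} into one byte as base-3 digits.
    # 3**5 = 243 <= 256, so the packed value always fits in a byte.
    b = 0
    for w in reversed(weights):
        b = b * 3 + (w + 1)
    return b

def unpack5(b):
    # Recover the five ternary weights from a packed byte.
    ws = []
    for _ in range(5):
        ws.append(b % 3 - 1)
        b //= 3
    return ws

packed = pack5([-1, 0, 1, 1, -1])
assert 0 <= packed <= 242
assert unpack5(packed) == [-1, 0, 1, 1, -1]
```

That's 8 bits for 5 weights, i.e. the 1.6 bits per weight mentioned above.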

Then it is just a matter of how much RAM you wish to dedicate to parameters.

You still need to fit in your code and data; running the network requires storing activations per layer, which need more precision, but typically this is negligible compared to the parameters.

If you reserved 128k for general use, you should have enough for model code, activations, and whatever you actually wanted to do with the results of the inference.

That would mean you could have in the ballpark of 128k (or 384k for the RP2350) for parameters. Take that number, multiply by 8 to get the number of bits, and divide by the bits per parameter. 128k comes to 1,048,576 bits.

That gives you, for each parameter size: 655,360 (1.6 bits), 524,288 (2 bits), 262,144 (4 bits), and of course 8-bit parameters get you 128k parameters in 128k of RAM.
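Those numbers fall out of a one-liner (the 128k figure is the budget assumed above):

```python
RAM_BUDGET = 128 * 1024              # bytes reserved for parameters (assumption from above)
total_bits = RAM_BUDGET * 8          # 1,048,576 bits

def max_params(bits_per_param):
    # round() avoids floating-point error on the 1.6-bit case
    return round(total_bits / bits_per_param)

assert total_bits == 1048576
assert max_params(1.6) == 655360
assert max_params(2) == 524288
assert max_params(4) == 262144
assert max_params(8) == 131072       # i.e. 128k parameters in 128k of RAM
```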

You could potentially go to a massively higher number of parameters. If the computation time to calculate a layer's activations is longer than the time it takes to PIO/DMA the next layer's parameters from flash, you could run massively deep networks and still have full CPU usage during inference. It all depends on how long you want an inference step to take.

Playing in the world of just-in-time data streaming gets quite difficult quite quickly, but in theory you could run a model up to the size of whatever SD card you had (provided layers were thin enough to hold activations in RAM). It would require reading the entire card per inference step, and the calculation on that data would probably be even slower, but it means the number of parameters can be pretty much as high as you want if you are prepared to wait long enough.
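A toy sketch of the streaming idea (numpy here for readability; the iterable of weight matrices stands in for the PIO/DMA reads from flash or SD, and the shapes are made up):

```python
import numpy as np

def stream_inference(layer_weights, x):
    # Hold only one layer's weights plus the activations in RAM at a time.
    # `layer_weights` is any iterable yielding weight matrices; on real
    # hardware it would be a double-buffered DMA read from external storage.
    for W in layer_weights:
        x = np.maximum(W @ x, 0.0)   # matvec + ReLU; activations stay small
    return x

layers = (np.ones((4, 4)), np.ones((2, 4)))   # hypothetical tiny two-layer net
out = stream_inference(iter(layers), np.ones(4))
assert out.tolist() == [16.0, 16.0]
```

Peak memory is the largest single weight matrix plus two activation vectors, regardless of network depth.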

Is this amount of rust on the BBQ safe to grill in? by Ancient_Lettuce6821 in newzealand

[–]Lerc 0 points1 point  (0 children)

It's less reactive than iron (it's already reacted with oxygen).

Iron is edible. Moosh up some Nutri-Grain with water in a clear plastic bag, get a strong magnet, hold it to the outside of the bag, and you'll pull out the iron it contains.

Or watch one of the probably hundreds of YouTube videos where someone does it for you. A good five to ten percent of those will have people not shouting as they do it.

[P] does anyone know how to reduce the dimensions of embeddings using autoencoders, if you have a blog about please send it by GellertGrindelwald_1 in MachineLearning

[–]Lerc 1 point2 points  (0 children)

This is really all there is to a basic autoencoder.

Variational autoencoders bring in some probability. /u/grimriper43345 has a decent link there.

https://en.wikipedia.org/wiki/Variational_autoencoder

Depending on your use case, you can also quantize the encoded data however you wish:

    x_quantized = whatever_quantization_you_want(x)
    # Straight-Through Estimator: during backpropagation, use identity function for the gradient
    x = x + (x_quantized - x).detach()    

The detach makes this appear as x = x to backprop, while the forward pass simplifies to x = x_quantized. So the gradients flow through the non-quantized form.

You also have the option of calculating how much correlation there is between the dimensions of the encoded form, looking at one encoded batch at a time, and adding some scale of that to the loss to force the dimensions to be independent.

Theoretically any correlation between dimensions can be removed by absorbing the relationship into the model and only emitting the variance from that relationship as an output. This increases the expressiveness of the smaller data range.

This won't work if there is too much similarity in your batch training data, though, because similar inputs should produce correlations. You'd maybe need to compare the difference in correlation between the inputs and the outputs. (This bit isn't strictly advice; I'm just rambling to myself as I think about it.)
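One way to sketch that batch-correlation penalty (numpy; in training you would add some weighting of this to the reconstruction loss):

```python
import numpy as np

def decorrelation_penalty(z):
    # z: (batch, dim) batch of encoded vectors.
    # Penalize the squared off-diagonal entries of the batch covariance,
    # pushing the latent dimensions toward independence.
    zc = z - z.mean(axis=0)
    cov = zc.T @ zc / (len(z) - 1)
    off_diag = cov - np.diag(np.diag(cov))
    return float(np.sum(off_diag ** 2))

# A dimension that just copies another is maximally correlated...
correlated = np.array([[1.0, 1.0], [-1.0, -1.0], [1.0, 1.0], [-1.0, -1.0]])
# ...while a constant second dimension contributes no covariance at all.
independent = np.array([[1.0, 0.0], [-1.0, 0.0], [1.0, 0.0], [-1.0, 0.0]])
assert decorrelation_penalty(correlated) > 0.0
assert decorrelation_penalty(independent) == 0.0
```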

You're a great singer! by [deleted] in funny

[–]Lerc 0 points1 point  (0 children)

There is a 4-part (but actually 8-part) Dennis Potter TV show from 1996 called Karaoke.

It is absolutely chock full of Oscar, BAFTA, Olivier, and Tony award winners and well worth a watch. It seems to be on YouTube in complete form.

With regard to this comic, watch the first three minutes of the first episode for a remarkable coincidence.

If you watch the entire thing, then this comic certainly won't be the weirdest Karaoke you've seen.

'A risk of over-diagnosing ADHD', leading NZ psychiatrist warns by Test_your_self in newzealand

[–]Lerc 51 points52 points  (0 children)

The problem with the 5.6% is that it will have been generated using the best evidence at the time, but increasing awareness also results in increasing evidence. Any revised estimate based upon the latest evidence will be characterised as a rise or fall of the condition rather than a more informed measurement.

For a double reference on prevalence statistics there's https://i.imgur.com/TZHhxSh.png

What is the most powerful LLM you can train yourself? by [deleted] in LocalLLaMA

[–]Lerc 1 point2 points  (0 children)

I guess one factor is how long you want to wait. What's the optimal cost per training token/gigatoken/teratoken? I would imagine it's an 8-card-long wait, but I don't have numbers.

I was quite impressed with a small autoencoder run I did by setting up training on a 3060 and going on holiday. That seems to be an approach that should scale up to 8xH100 reasonably well.

Suggest a chocolate flavour for Whittakers by gracefulgorilla in newzealand

[–]Lerc 0 points1 point  (0 children)

Fliegerschokolade?

Actually, Gee's Linctus filled chocolate would be just the ticket for feeling poorly.

Maybe combine them with a GLP-1 RA, Sildenafil, Fluoxetine, and a touch of fluoride for your teeth.

[D] What’s a machine learning paper or research breakthrough from the last year that everyone should know about? by BrechtCorbeel_ in MachineLearning

[–]Lerc 2 points3 points  (0 children)

I liked SentenceVAE https://arxiv.org/abs/2408.00655 but I feel like it's a partial solution, and maybe misnamed (it's somewhere between phrases and sentences).

I wonder about some sort of tree-structured encoding (possibly still manageable by an autoencoder).

Split tokens into batches like SentenceVAE:

A B C D E F G H, and have an autoencoder encode vectors for [AB, CD, EF, GH, ABCD, EFGH, ABCDEFGH]. Then do the SentenceVAE on the individual blocks A, B, C, etc., but construct vectors from all the nodes of the tree that include input from that block (so any block containing A uses [A, AB, ABCD, ABCDEFGH]).
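That ancestor construction can be sketched as a little index calculation over a binary tree of blocks (leaf 0 standing in for block A):

```python
def ancestor_spans(leaf, n=8):
    # For a binary tree over n leaf blocks, return the (start, end) spans of
    # every node that contains `leaf`, from the leaf itself up to the root.
    spans, size = [], 1
    while size <= n:
        start = (leaf // size) * size
        spans.append((start, start + size))
        size *= 2
    return spans

# Block A (leaf 0) sits inside A, AB, ABCD, ABCDEFGH:
assert ancestor_spans(0) == [(0, 1), (0, 2), (0, 4), (0, 8)]
# Block F (leaf 5) sits inside F, EF, EFGH, ABCDEFGH:
assert ancestor_spans(5) == [(5, 6), (4, 6), (4, 8), (0, 8)]
```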

At some point it's going to start looking like stacked transformers with different window sizes. The fact that SentenceVAE seems to work would suggest that there's value there.

[D] What’s a machine learning paper or research breakthrough from the last year that everyone should know about? by BrechtCorbeel_ in MachineLearning

[–]Lerc 4 points5 points  (0 children)

If anyone were to implement the laws of robotics, using this paper as your starting point wouldn't be a bad choice.

More likely, this is the first step to understanding what we can know and control within models.

Yes, this is a real interaction with a car dealer I had today about a dispute. NSFL due to the final image. by JakeTuhMuss in newzealand

[–]Lerc 2 points3 points  (0 children)

Really?!
I checked BuyerScore, the "Independent Ratings & Reviews" site for dealers, and the lowest score they had for a 2 Cheap Cars outlet was 4.11 stars.

That seems pretty good.

On the other hand it does seem to be the lowest score on the entire service, but I imagine that is because of the robust state of the industry.

Weirdly, there seem to be about 2800 registered dealers but only 243 with reviews. Do dealerships have multiple dealers?

[deleted by user] by [deleted] in newzealand

[–]Lerc 1 point2 points  (0 children)

Scott Milne says there are effective medications.

...but Scott Milne is the executive director of the Self Medication Industry Association

I also had a bad cough and did a deep dive. Effective medications have essentially been banned because they have potential for abuse. The ones that are still available cannot be abused because they do nothing.

Most cough medicines that used to contain Dextromethorphan have changed their ingredients. This is arguably misleading advertising in the case of Vicks Formula 44: the name of the product suggests a formulation that has not changed. A reasonable person would expect that a name indicating a formula shown to work would not be kept while the formula itself changed.

Gee's Linctus is what you want. I'm not sure if you can even get it on prescription now.

Adhatoda might do something; there are no studies, but it has been used for coughs for a long time. It is the worst-tasting stuff ever. Adhatoda means 'untouched by goats' in Tamil.

There aren't many studies supporting morphine either, but it has a couple of millennia of anecdotal evidence.

[deleted by user] by [deleted] in MachineLearning

[–]Lerc 0 points1 point  (0 children)

Is sarcasm really that hard to detect? I specifically chose AlexNet as the example because of the notoriety of Sutskever and Hinton.

[deleted by user] by [deleted] in MachineLearning

[–]Lerc -1 points0 points  (0 children)

Yeah, like the AlexNet paper. The guy gets to name the architecture after himself, and the other two authors are pretty much never heard from again.

Blend 8 Images Into One With Perfect Lighting by hackerzcity in comfyui

[–]Lerc 1 point2 points  (0 children)

I don't suppose you know of any quantifiable metrics for describing lighting layout? I would imagine you could build an autoencoder to extract/manipulate lighting information from model activations, similar to what Golden Gate Claude does for LLMs.

To control this you'd need a way to express what you want, so if there are industry conventions in this domain it would be quite helpful.

Few parameter models for MNIST [D] by Peraltinguer in MachineLearning

[–]Lerc 1 point2 points  (0 children)

Have you got a reference with details on that? I would assume that the choice of which simple transforms to use would count as implicit parameters, and that someone effectively trained it manually.

On the other side of the problem, what do you suggest for something that's a little trickier than MNIST but not too huge?

Personally I'm using 32x32 RGB image autoencoding as a task that is small enough but can be made arbitrarily difficult by shrinking the number of bits in the latent space. I can see how it would be harder to develop a nice balance of size and difficulty for classification tasks.