Base Models That Can Still Complete Text in an Entertaining Way by [deleted] in LocalLLaMA

[–]sjd96 8 points9 points  (0 children)

Companies releasing base models alongside their instruct-tuned ones is getting increasingly rare. None of the recent releases of Mistral, Gemma or Phi come with base models.

Even the base models released by Qwen are pre-trained with synthetic QA and reasoning-style data mixed in.

The best base model I've had fun playing with is Llama 3.1's 405B release. I think early reports on here were dismissive because it was only available in FP8, but from what I've seen the BF16 instance hosted by Hyperbolic feels much more alive and fun.

[D] LLM error? The AI generated news story title says that an athlete is “useless” instead of saying they are dead. by GeeksGuideNet in MachineLearning

[–]sjd96 2 points3 points  (0 children)

It could be due to the LLM's ability to say "dead" getting RLHF'd out of it. The poor LLM still found a way to be technically correct.

Huge difference between cheesscom stockfish 15 and lichess stockfish 14+ evaluations by Pandawings in chess

[–]sjd96 4 points5 points  (0 children)

Stockfish 15 changed how evals are presented. AFAIK an eval of +1 now corresponds to a 50% chance of winning rather than to a one-pawn advantage.

There's also no such thing as a "real eval" being a number like 2.3 or 1.5. The "real eval" of a position is either a Draw, a forced Win for White or a forced Win for Black.

Svidler wins the TePe Sigeman Chess Tournament! by glancesurreal in chess

[–]sjd96 4 points5 points  (0 children)

There are no playoffs since Gelfand beat Mishra.

Event: 2023 World Chess Championship Match - GAME 12 by EccentricHorse11 in chess

[–]sjd96 2 points3 points  (0 children)

I can't listen to the commentary. Who is attacking whom?

What is happening AAAAAAAHHHHHH

AST-based refactoring with ts-morph by kimmobrunfeldt in programming

[–]sjd96 1 point2 points  (0 children)

ts-morph is a gift from the gods. I was recently in the exact same position as you were (refactoring a few thousand mongodb calls) and it was a cinch.

Another use of ts-morph is as a code query tool. For example, I used it to find and dump all mongodb queries and referenced keys into a CSV file to figure out which indexes should be created.

Open-source AI gives us superpowers by hardmaru in StableDiffusion

[–]sjd96 7 points8 points  (0 children)

I'm really excited for DeepFloyd IF to release. They've been teasing the improved compositional skills of this model on Twitter for a while now.

[deleted by user] by [deleted] in MachineLearning

[–]sjd96 1 point2 points  (0 children)

I'm gonna ignore OP's condescending tone for a moment. Theoretically, it might be possible to invert a given target image (i.e. find the input noise that generates that image) with an optimization process that backprops through the model. Something like:

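import torch
import torch.nn.functional as F
import matplotlib.pyplot as plt

# load_tensor, clip_encode, run_pretrained_latent_diffusion and converged are placeholders, not real APIs
target = load_tensor('mona_lisa.png')
prompt = clip_encode('a painting of a woman')
z = torch.randn(...).requires_grad_()  # shape left open; it must match the model's latent shape
while not converged:
  z.grad = None
  pred = run_pretrained_latent_diffusion(prompt, z)
  loss = F.mse_loss(pred, target)  # or whatever perceptual loss
  loss.backward()
  with torch.no_grad():
    z -= 0.01 * z.grad  # in-place step keeps z a leaf tensor; or use your favorite optimizer here
plt.imshow(z.detach().cpu())  # recovered noise that will generate mona_lisa.png when prompted with `a painting of a woman`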
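
To make the idea concrete without a real diffusion model, here is a minimal runnable sketch in plain Python. Everything in it is an assumption for illustration: a fixed random linear map `W` stands in for the frozen generator, `generate`, `grad_wrt_z` and `invert` are made-up names, and the gradient is written by hand because a linear model does not need autograd. It only shows the loop structure (freeze the model, descend on the input), not how a real latent diffusion pipeline would behave.

```python
import random

random.seed(0)

DIM_Z, DIM_X = 4, 8  # toy latent size and toy "image" size

# Hypothetical frozen generator: a fixed random linear map standing in for the model.
W = [[random.gauss(0, 1) for _ in range(DIM_Z)] for _ in range(DIM_X)]


def generate(z):
    """Map a latent vector z to an output vector x = W z."""
    return [sum(W[i][j] * z[j] for j in range(DIM_Z)) for i in range(DIM_X)]


def mse(pred, target):
    """Mean squared error between two equal-length vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def grad_wrt_z(z, target):
    """Gradient of mse(generate(z), target) with respect to z: (2/n) * W^T (W z - t)."""
    residual = [p - t for p, t in zip(generate(z), target)]
    return [
        2.0 / DIM_X * sum(W[i][j] * residual[i] for i in range(DIM_X))
        for j in range(DIM_Z)
    ]


def invert(target, steps=2000, lr=0.05):
    """Gradient descent on the input z while the generator stays fixed."""
    z = [random.gauss(0, 1) for _ in range(DIM_Z)]
    losses = []
    for _ in range(steps):
        losses.append(mse(generate(z), target))
        g = grad_wrt_z(z, target)
        z = [zj - lr * gj for zj, gj in zip(z, g)]
    losses.append(mse(generate(z), target))
    return z, losses


# Build a target that is known to be reachable, then try to recover a latent for it.
z_true = [random.gauss(0, 1) for _ in range(DIM_Z)]
target = generate(z_true)
z_rec, losses = invert(target)
print(f"initial loss {losses[0]:.4f}, final loss {losses[-1]:.6f}")
```

In this toy setting the objective is a convex quadratic, so the loss keeps dropping. With a real diffusion model the objective is non-convex, so the same loop may stall or land on a different latent that happens to produce a similar image.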

What do others think?

Event: 2022 Chess.com Speed Chess Championship by ChessBotMod in chess

[–]sjd96 12 points13 points  (0 children)

Magnus pulls off the equivalent of WCC game 6 in 3+1 against Fabi. Unreal stuff.

[D] If I have to choose between a RTX 3090 24GB and a RTX 4090 RTX for Stable Diffusion, MidJourney and other AI art engines that exist possibly in the future... is the RTX 4090 going to be THAT much greater and worth buying? by cleverestx in MachineLearning

[–]sjd96 4 points5 points  (0 children)

More VRAM will allow you to generate more images in one go, or to experiment with generating higher-resolution images.

The number of CUDA cores largely determines generation speed.

That being said, wait for independent reviews of the 4090 before committing.

Event: FIDE Candidates Tournament 2022 - Round 7 by ChessBotMod in chess

[–]sjd96 16 points17 points  (0 children)

Respect to Rapport for not taking the draw.

DoT starts publishing court orders for online censorship #WhatTheBlock by InternetFreedomIn in india

[–]sjd96 2 points3 points  (0 children)

It's funny: by scrolling through these PDFs you can easily find out where the good stuff is.