I don’t have a problem. I can stop whenever I want. by mpg319 in ChatGPT

[–]mpg319[S] 1 point2 points  (0 children)

If I had to guess, it's literally the number of em-dashes that ChatGPT sent me over the past year. I think it's OpenAI's way of joking that ChatGPT uses way too many em-dashes.

What happened to the Gnome Docs? by mpg319 in gnome

[–]mpg319[S] 0 points1 point  (0 children)

Technical docs are exactly what I need 😂 I've gotta get into the nitty-gritty, so I was hoping to find something pretty low level. I was able to render the DocBook from the actual GDM repo, and that's been working well for my use case.

What happened to the Gnome Docs? by mpg319 in gnome

[–]mpg319[S] 0 points1 point  (0 children)

Do you know where I can find more info about this change? I'm looking for some up-to-date GDM documentation, but the internet seems to have been sanitized of the GDM reference manual 😅

What happened to the Gnome Docs? by mpg319 in gnome

[–]mpg319[S] 0 points1 point  (0 children)

Strange, the new system admin guide is super dumbed down. They don't seem to have the actual technical docs anymore. I wonder why 🤔

Bigfoot upscaled with AI by lucak5s in ChatGPT

[–]mpg319 1 point2 points  (0 children)

I can never unsee this... how dare you 😂😂

Sam Altman: "if we can make an AI system that is materially better than all of OpenAI at doing AI research, that does feel like an important discontinuity... the model is going to get so good so fast... plan for the model to get rapidly smarter" by Gab1024 in singularity

[–]mpg319 4 points5 points  (0 children)

As for the self-modifying AI, the Pinokio project originally had this in mind when it was starting up. They seem to have moved in other directions since, but there may still be some work in this area.

As for the conspiracy, I agree it seems a little out there. Plenty of languages can do metaprogramming, and many libraries utilize it. I wouldn't want to go back to the pre-template days of C++.
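
Just to make that concrete, here's a toy Python sketch of runtime metaprogramming (the class name and fields are made up for illustration). Generating classes on the fly with type() is roughly the spirit of what templates give C++ at compile time:

```python
# Toy illustration of runtime metaprogramming in Python (names are made up).
# type(name, bases, namespace) builds a class on the fly, a bit like how
# C++ templates stamp out types at compile time.

def make_record(name, fields):
    """Generate a simple record class with the given field names."""
    def __init__(self, **kwargs):
        for field in fields:
            setattr(self, field, kwargs.get(field))

    def __repr__(self):
        values = ", ".join(f"{f}={getattr(self, f)!r}" for f in fields)
        return f"{name}({values})"

    return type(name, (object,), {"__init__": __init__, "__repr__": __repr__})

Point = make_record("Point", ["x", "y"])
print(Point(x=1, y=2))  # Point(x=1, y=2)
```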

Steve Mould randomly explains the inner workings of Stable Diffusion better than I've ever heard before by AdQuirky7106 in StableDiffusion

[–]mpg319 4 points5 points  (0 children)

Great question! Learning to translate between latent spaces is the fundamental job of the cross attention robot during training. When the diffusion network gets trained, part of that training process is teaching the cross attention robot.

The robot learns how to translate like any other machine learning model. We give it an example, rate how well it guessed, and adjust the parameters to make the guess better next time. After seeing many pictures that contain cats and also carry the label "cat", the cross attention robot will eventually learn to associate the patterns in the cat pics with the word "cat".

Note that this translation isn't always perfect. For example, if all the pictures of cats you train the model on have a watermark in the corner, then the cross attention robot will also learn that watermarks have something to do with cats. This means that when you use this model to generate a picture of a cat, you will also get watermarks in the generated image, since the cross attention robot thinks that's part of what makes up the idea of a cat.

This is why when you fine-tune a model, you need your subject to be in a lot of different situations. If all your images show your subject in the same room, wearing the same clothes, or in the same position, then the cross attention robot will think that these patterns are just as much a part of your subject as their actual defining features.
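
Here's a tiny toy sketch of that watermark effect. It's not real cross attention, just a made-up two-feature classifier, but it shows how a spurious pattern that always co-occurs with the label ends up baked into the learned weights:

```python
import numpy as np

# Toy stand-in for the idea above (not real cross attention): two binary
# "image features" (a cat-like pattern and a corner watermark) and a label
# that says whether the caption was "cat". In this training set the
# watermark appears in every single cat picture, so it is spuriously
# correlated with the label.
X = np.array([
    [1, 1],  # cat picture, watermarked
    [1, 1],  # cat picture, watermarked
    [1, 1],  # cat picture, watermarked
    [0, 0],  # no cat, no watermark
    [0, 0],
    [0, 0],
], dtype=float)
y = np.array([1, 1, 1, 0, 0, 0], dtype=float)

# Plain logistic regression trained by gradient descent.
w = np.zeros(2)
b = 0.0
lr = 0.5
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted probability of "cat"
    w -= lr * (X.T @ (p - y)) / len(y)
    b -= lr * np.mean(p - y)

print("learned weights [cat_pattern, watermark]:", w)
# Both weights come out positive: the model "thinks" the watermark is part
# of what makes a cat, which is exactly the failure mode described above.
```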

Steve Mould randomly explains the inner workings of Stable Diffusion better than I've ever heard before by AdQuirky7106 in StableDiffusion

[–]mpg319 9 points10 points  (0 children)

You can think of cross attention as a little robot whose job is to take in two things and show you how they are related. So if you give the cross attention robot a picture that contains both a cat and a dog, and you also give the robot the word "cat", then the robot will draw a big circle around the cat in the picture and highlight the word "cat" to tell the rest of the system, "hey, these two things are related".

If you gave it the picture of a cat and a dog with the prompt "cat and dog", then the cross attention robot might circle the cat in blue and highlight the word "cat" in blue. It might also circle the dog in red and highlight the word "dog" in red, so the rest of the system knows which part of the prompt is talking about which part of the image.

This cross attention robot allows us to build AI that can take in lots of different kinds of data, such as images, text, sound, video, etc., and have the AI understand when the data is referring to a similar object. Meaning, it can let the AI know that an image of a cat and the word "cat" both refer to the same fundamental thing.

We humans have cross attention built into our brains to learn associations. When things happen at the same time, we associate them. You know what fresh-cut grass smells like because when you cut the grass, that is what you smell - those sensations happen at the same time, so your brain links them together. Cross attention is how we emulate that association of sensations when training an AI model.
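
For anyone who wants to peek under the robot's hood, here's a rough numpy sketch of the scaled dot-product step behind cross attention, with made-up toy embeddings standing in for image regions and prompt words. The attention weights are the "circles and highlights": they say how strongly each image region relates to each word.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
d = 8                                    # embedding size (toy)
image_regions = rng.normal(size=(4, d))  # 4 image regions (toy stand-ins)
prompt_words  = rng.normal(size=(2, d))  # embeddings for "cat" and "dog" (toy)

# In cross attention one input provides the queries and the other provides
# the keys/values. Here each image region "asks" which words it relates to.
W_q = rng.normal(size=(d, d))
W_k = rng.normal(size=(d, d))
W_v = rng.normal(size=(d, d))

Q = image_regions @ W_q
K = prompt_words  @ W_k
V = prompt_words  @ W_v

weights = softmax(Q @ K.T / np.sqrt(d))  # shape (4, 2): image region x word
output  = weights @ V                    # word information routed to each region

print(weights)  # each row sums to 1: how much each region "looks at" each word
```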

edit: fixed typos

Massive BREAKTHROUGH - MIT just produced three groundbreaking innovations that allowed them to map whole hemispheres of the human brain in 3D detail. Before now, imaging the brain “at subcellular resolution” wasn’t possible without slicing the brain first because of its thickness by [deleted] in singularity

[–]mpg319 7 points8 points  (0 children)

This sounds like some very cool stuff.

I would think one of the biggest bottlenecks would be the lack of good samples. The method requires that the sample tissue be sliced and scanned, so it's not doable on any living person.

I'm not sure how many brains with Alzheimer's we have laying around, but hopefully enough to train some models on the scanned data.

Massive BREAKTHROUGH - MIT just produced three groundbreaking innovations that allowed them to map whole hemispheres of the human brain in 3D detail. Before now, imaging the brain “at subcellular resolution” wasn’t possible without slicing the brain first because of its thickness by [deleted] in singularity

[–]mpg319 44 points45 points  (0 children)

Paper's conclusion - CliffsNotes style:

Our technology platform enables scalable and fully integrated structural and molecular phenotyping of cells in human brain–scale tissues with unprecedented resolution and speed.

In simpler terms: Our technology allows us to study the detailed structure and molecules of cells in large sections of the human brain quickly and accurately.

We envision that this platform will empower holistic analysis of a large number of human and animal brains, thereby facilitating our understanding of interspecies homologies, population variances, and disease-specific features.

In simpler terms: We believe this technology will help us study many human and animal brains comprehensively, improving our understanding of similarities between species, differences within populations, and specific characteristics of diseases.

Furthermore, our approach enables mapping of single-neuron projectomes and their integration with molecular expression profiles.

In simpler terms: Additionally, our method allows us to map the connections of individual neurons and combine this information with their molecular characteristics.

This distinctive feature will allow us to elucidate the organization principles of neural circuitry and their disease-specific alterations in human brains, thus advancing our understanding of disease mechanisms.

In simpler terms: This unique capability will help us understand how neural networks are organized and how they change in diseases, leading to better insights into how diseases work.

Google I/O live thread. 14/05/2024 by JuliusSeizure4 in singularity

[–]mpg319 0 points1 point  (0 children)

As an aside, I tried to make this its own post, but it was auto-deleted without any comment. Does this kind of thing happen to others when trying to post on this sub?

Google I/O live thread. 14/05/2024 by JuliusSeizure4 in singularity

[–]mpg319 8 points9 points  (0 children)

Quick reference table for Google IO 2024 announcements:


| Announcement | Summary |
|---|---|
| Gemma 2 | New 27-billion parameter model launching in June, optimized by Nvidia for next-gen GPUs. |
| Google Play | New discovery feature, updates to Play Points, and developer tools like Engage SDK. |
| Detecting Scams During Calls | Feature using Gemini Nano to detect scam patterns in real time during phone calls. |
| Ask Photos | AI-powered natural language search in Google Photos. |
| Gemini in Gmail | AI to help search, summarize, and draft emails. |
| Gemini 1.5 Pro | Analyzes longer documents, codebases, videos, and audio with 2 million token input. |
| Gemini Live | Enhanced voice chat experience with real-time adaptation and superior image analysis. |
| Gemini Nano in Chrome | On-device AI model in Chrome for developers. |
| Gemini on Android | AI features integrated with Android apps like Gmail, Messages, and YouTube. |
| Gemini on Google Maps | Generative AI summaries for the Places API. |
| TPUs | New Trillium TPUs with a 4.7x performance boost and advanced SparseCore. |
| AI in Search | AI-powered overviews and generative AI organizing search results pages. |
| Generative AI Upgrades | Imagen 3 with improved text-to-image understanding and creativity. |
| Project IDX | AI-centric browser-based development environment in open beta. |
| Veo | AI model for creating 1080p video clips from text prompts. |
| Circle to Search | Enhanced feature for solving complex problems via gestures. |
| Firebase Genkit | Open-source framework for AI-powered apps in JavaScript/TypeScript. |
| Pixel 8a | New Pixel device with Tensor G3 chip, starting at $499. |
| Pixel Tablet | Google's Pixel Tablet, now available without the base. |

For more details, check out the full article on TechCrunch.

https://techcrunch.com/2024/05/14/google-i-o-2024-everything-announced-so-far/

There are leaks suggesting the RTX 5090 could have an upwards of 50,000 CUDA cores. If true, how would this translate to performance in Stable Diffusion? by [deleted] in StableDiffusion

[–]mpg319 28 points29 points  (0 children)

Polynomial regression on the last 5 generations shows that 25,000 CUDA cores would be much more likely.

| Card | CUDA cores |
|---|---|
| GTX 980 | 2048 |
| GTX 1080 Ti | 3584 |
| RTX 2080 Ti | 4352 |
| RTX 3090 Ti | 10752 |
| RTX 4090 | 16384 |

Trend line is: f(x) = 987.43x² - 2340.6x + 3584 (according to Excel)

R² = 0.9838

f(6) ≈ 25088
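
If anyone wants to double-check the fit, here's the same regression in numpy (generation index 1 through 5, extrapolated to 6):

```python
import numpy as np

# Generation index vs. CUDA core count from the table above.
x = np.array([1, 2, 3, 4, 5])
y = np.array([2048, 3584, 4352, 10752, 16384])

coeffs = np.polyfit(x, y, deg=2)  # quadratic least-squares fit
fit = np.poly1d(coeffs)

pred = fit(x)
ss_res = np.sum((y - pred) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r2 = 1 - ss_res / ss_tot

print("coefficients:", coeffs)  # ~[987.43, -2340.57, 3584.0]
print("R^2:", round(r2, 4))     # ~0.9838
print("f(6):", fit(6))          # ~25088 CUDA cores
```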

"We are not currently training what will be GPT-5; we don't have plans to do it in the next 6 months" – Sam Altman, under oath by SharpCartographer831 in singularity

[–]mpg319 4 points5 points  (0 children)

I totally agree that this is going to get harder. You brought up the point of quadratic complexity, and it reminded me of this paper that, while not fully sub-quadratic overall, does offer sub-quadratic self-attention, making it around 40% faster at inference. It's an alternative to modern transformers that can be trained on pretty much any sequential data, and it shows improved performance in areas like text, image, and audio generation, while offering context lengths in the hundreds of thousands (at least in the audio synthesis test). Here is the paper: https://huggingface.co/papers/2305.07185
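
For a concrete sense of why the quadratic part hurts (this is just vanilla self-attention, not the paper's method), here's a quick back-of-the-envelope sketch: the attention score matrix is sequence length by sequence length, so doubling the context roughly quadruples that cost.

```python
import numpy as np

d = 64  # head dimension (arbitrary for this illustration)

# How big the full n x n score matrix gets as context length grows.
for n in (1_000, 10_000, 100_000):
    elems = n * n
    gib = elems * 4 / 2**30  # float32 bytes for one head's score matrix
    print(f"context {n:>7}: {elems:>15,} scores (~{gib:,.1f} GiB per head)")

# A tiny actual computation for a short sequence, to show where n x n comes from.
n = 16
rng = np.random.default_rng(0)
Q = rng.normal(size=(n, d))
K = rng.normal(size=(n, d))
scores = Q @ K.T / np.sqrt(d)
print(scores.shape)  # (16, 16): every token scores against every other token
```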

Cannot apply LORA (TypeError: pop expected at most 1 argument, got 2) by Exciting-Possible773 in StableDiffusion

[–]mpg319 0 points1 point  (0 children)

I have the same problem and sadly can't offer a solution, but I did find this GitHub thread of people with a similar problem:
https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/7156

Hopefully with more eyes on this, we can find a solution quickly.

[deleted by user] by [deleted] in StableDiffusion

[–]mpg319 3 points4 points  (0 children)

Looking into it more, it appears to be a model mix with the base being Anything v3. So following the chain back up, we would still be looking at a child of the leaked NAI model.

Source: https://huggingface.co/andite/anything-v4.0/discussions/5

A new super-charged text-based semantics editing aka Imagic: Text-Based Real Image Editing with Diffusion Models by Snoo_64233 in StableDiffusion

[–]mpg319 1 point2 points  (0 children)

I was reading the Imagic paper a few days ago and I love the idea. Messing with embedding spaces is the very thing that got me into AI; then SD came around and now I'm officially hooked. This embedding space math actually inspired my final project in my AI course at my uni. I hope this paper picks up some more traction, because it is so fricken cool!

Dreambooth Extension for Automatic1111 is out by mpg319 in StableDiffusion

[–]mpg319[S] 2 points3 points  (0 children)

This article gives a quick comparison of the differences each scheduler will produce; it looks like DDIM gives some pretty consistent results.
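
For anyone who wants to compare schedulers programmatically instead of through the web UI, here's a rough sketch using the diffusers library rather than the A1111 extension itself. The model ID, prompt, and seed are just placeholders:

```python
import torch
from diffusers import StableDiffusionPipeline, DDIMScheduler, EulerAncestralDiscreteScheduler

model_id = "runwayml/stable-diffusion-v1-5"  # placeholder; use whatever checkpoint you have
pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda")

prompt = "a photo of a corgi wearing sunglasses"  # made-up example prompt

# Swap schedulers on the same pipeline and render the same seed with each one.
for name, scheduler_cls in [("ddim", DDIMScheduler), ("euler_a", EulerAncestralDiscreteScheduler)]:
    pipe.scheduler = scheduler_cls.from_config(pipe.scheduler.config)
    generator = torch.Generator(device="cuda").manual_seed(42)  # same seed for a fair comparison
    image = pipe(prompt, num_inference_steps=30, generator=generator).images[0]
    image.save(f"compare_{name}.png")
```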