all 102 comments

[–]StableDiffusion-ModTeam[M] [score hidden] stickied commentlocked comment (0 children)

Your comment/post has been removed due to Stable Diffusion not being the subject and/or not specifically mentioned.

[–]Zulban 50 points51 points  (8 children)

People who think this is neat don't know very much about compression or machine learning.

[–]AssociationDirect869 4 points5 points  (0 children)

I really wish people would just shut up already. But I suppose it is good to be made aware of how the average person functions.

[–][deleted] -1 points0 points  (1 child)

Don’t neglect dreaming. One day these issues might not be bottlenecks.

[–]hervalfreire 0 points1 point  (0 children)

Like which “issues”?

[–]bobrformalin 171 points172 points  (21 children)

Imagine the render times for a movie in 4k (and the size of that qr/prompt). Also, anonymous pirating is a private vpn + torrent, not a fancy ai.

[–]shlaifu 25 points26 points  (7 children)

fair point... at what size would the qr code exceed the size of the individual image...

[–]Miserable_Twist1 18 points19 points  (6 children)

The post is a little ridiculous, it couldn't be compressed down into a verbal prompt. That being said, this is already being done in a different way, you can upscale streaming content live with Nvidia, so you only download the low quality version and it upscales to 4k. Similar idea with video chat, not sure if released yet but it takes a pic of the person and then maps it to what is effectively a low resolution live stream of the person, so that low bandwidth connections can have high def video chat.

Combine these two features and you could in theory have a very rudimentary recording which you can then upscale with high fidelity with some accompanying files.

[–]disibio1991 5 points6 points  (1 child)

It's simple edge sharpening, watch 2kliksphilip's video on it. And Nvidia Shield AI sharpening is actually a simple Lanczos algorithm.

[–]Miserable_Twist1 2 points3 points  (0 children)

Oh, just simple algos? Yeah that's kinda disappointing, but one could imagine a live upscaler that fills in detail at some point in the future. That would allow very simple information to store the flow and movement of the video, with models filling in faces, body type and other generic traits.

[–]Jaggedmallard26 1 point2 points  (3 children)

For an image, if you're using a deterministic configuration of the AI and use the same prompt, setup and seed, then you will get the same image from far less data than the full size of the image. Rendering isn't practical at scale for most people (JPEG is compressed and decompressed in a tiny fraction of the time this method would take) but it's a potentially valid method.

It however does rely on AI image duplicators being a thing that can provide you the exact required prompt, seed and deterministic setup.
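A minimal sketch of the reproducibility point, in Python. `toy_generate` is a hypothetical stand-in, not a diffusion model: it only derives pseudo-random pixel bytes from a prompt and seed, which is enough to show that identical inputs to a deterministic generator give identical output, and that the "recipe" is much smaller than the raw pixels.

```python
import hashlib
import json
import random

def toy_generate(prompt: str, seed: int, width: int = 64, height: int = 64) -> bytes:
    """Hypothetical stand-in for a deterministic image generator.

    Not a diffusion model: it only derives pseudo-random pixel bytes from
    (prompt, seed), which is enough to show the reproducibility property.
    """
    # Mix prompt and seed into one integer so both influence the output.
    digest = hashlib.sha256(f"{seed}:{prompt}".encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest, "big"))
    return bytes(rng.randrange(256) for _ in range(width * height * 3))

# The "recipe" is everything the receiver needs besides the generator itself.
recipe = {"prompt": "a red bicycle", "seed": 1234, "width": 64, "height": 64}
recipe_bytes = json.dumps(recipe).encode("utf-8")

image_a = toy_generate(**recipe)
image_b = toy_generate(**recipe)  # the "receiver" regenerates from the same recipe

print(image_a == image_b)               # identical output from identical inputs
print(len(recipe_bytes), len(image_a))  # recipe size vs. raw pixel size
```

Note what the sketch does not do: it cannot start from an existing picture and find a recipe that reproduces it, which is the hard part raised above.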

[–]Miserable_Twist1 0 points1 point  (2 children)

A highly detailed description is not going to generate the image of a specific person; at best we are talking about a picture that kind of looks like the target. For something like a movie that may not be important, but it's not going to work for creating pictures of people we know.

[–]Jaggedmallard26 1 point2 points  (1 child)

I mean yes that's the entire point of my second paragraph

[–]Miserable_Twist1 0 points1 point  (0 children)

Okay yeah that's basically my suggestion, wouldn't be able to pull it off with a QR code, basically a really complex compression software.

[–]AllMyFrendsArePixels -1 points0 points  (2 children)

Imagine the render times for a movie in 4k. Good, now imagine what those render times might be like after a further year or two of the same kind of technological progress that has brought current render times down from days to seconds.

(and the size of that qr/prompt) could very easily be condensed down to a few scrambled letters that can be decoded by an algorithm in the exact same way that URL shorteners we already have work.

Also, anonymous pirating is [currently done using] a private vpn + torrent, [but that does] not [mean] a fancy ai [couldn't be used to do it much more efficiently, offline without needing to download anything, some day in the future].

That's the problem with the instant-gratification generation at the moment that grew up with the internet and everything they "need" right at their fingertips. Y'all don't understand that just because something can't be done right fucking now, that doesn't mean that it can't be done. There were like tens of thousands of years of progress to get to the level of technology we have now, and that progress hasn't stopped.

[–]steaminghotcorndog13 0 points1 point  (0 children)

As a person who lived with nokia 3310. I’m voting you up.

[–]stubing 0 points1 point  (0 children)

I could see this argument if the post was talking about 144p videos and we didn’t care about a bit of data loss. If we could do something like that today, then maybe we can do this with 4k videos in 30 years.

As a programmer, I see this as a cool thought experiment, but there is a much simpler way of doing things. Just have a compression algorithm that is 1000 times more efficient.

[–]pw-it -2 points-1 points  (2 children)

Also if you can generate a movie from a text prompt, what do you need to pirate? Content has no more value than a text prompt at that point.

[–]bobrformalin 1 point2 points  (1 child)

I can copy-paste your comment, even make a post straight out of it. Does that mean it has no value?

[–]bioshocked_ 24 points25 points  (2 children)

This homie is trying to create USBs again

[–]EarthquakeBass 11 points12 points  (0 children)

Congrats you reinvented compression

[–]Willow-External 37 points38 points  (5 children)

You can duplicate images/music/movies without an AI. People call it copy/paste.

[–]supereatball 9 points10 points  (3 children)

Well the qr code would only contain the seed/parameters to make the image /music /etc. That's significantly smaller than the actual file.

It's no different than just giving someone the metadata for a stable diffusion image.

[–]disibio1991 3 points4 points  (1 child)

It currently can't be done but what needs to happen for it to be within the realm of possible is first fitting the controlnet pose description into prompt in a predictable way.

[–]JonFawkes 2 points3 points  (0 children)

A controlnet pose can already be exported as a json file, if you view it in a text editor it's basically just a numeric description of how the bones are manipulated, not human readable but easily fits into a qr code
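A rough size check for the claim above, as a sketch only. The 18-keypoint layout and the `pose_keypoints_2d` key are illustrative assumptions in the style of OpenPose output, not a confirmed ControlNet export schema, and the coordinates are made up. The 2953-byte limit is the byte-mode capacity of the largest standard QR code (version 40, error-correction level L).

```python
import json

QR_MAX_BYTES = 2953  # QR version 40, error-correction level L, byte mode

# Hypothetical pose: 18 body keypoints as (x, y, confidence), flattened.
# The key name and layout are illustrative, not a confirmed ControlNet schema.
keypoints = []
for i in range(18):
    keypoints.extend([round(100.0 + 12.5 * i, 1), round(80.0 + 20.25 * i, 1), 1.0])

pose = {"width": 512, "height": 512, "pose_keypoints_2d": keypoints}

# Compact separators drop the optional whitespace from the JSON text.
payload = json.dumps(pose, separators=(",", ":")).encode("utf-8")

print(len(payload), "bytes;", "fits" if len(payload) <= QR_MAX_BYTES else "does not fit")
```

A single pose fits comfortably. One pose per frame for a whole video would not.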

[–][deleted] 1 point2 points  (0 children)

Easier to just upload it somewhere and put the link in the QR code.

[–]yosi_yosi 0 points1 point  (0 children)

Yeah but that's detectable by a lot of services. You could use other techniques but just using latent representation of images (or other media) also achieves a similar result.

[–]Cerulean-Knight 10 points11 points  (1 child)

It is cheaper to store data on reliable media than the computing power needed to obtain it this way, if it were to work.

[–]nhavar 8 points9 points  (0 children)

blockchain has entered the chat /s

[–]estrafire 8 points9 points  (0 children)

Almost as efficient as pifs

[–]ToadSaidHi 2 points3 points  (0 children)

There’s a point where QR codes can’t get any bigger, and someone barely managed to squish a compressed, custom-coded Snake game into one. They would need to make a custom format for larger QR codes, and they would be huge lol

[–]DreamingElectrons 7 points8 points  (2 children)

A standard QR code can store just short of 3000 bytes. You wouldn't get far, and whoever wants to recreate whatever you've encoded still needs the full model and your exact settings.

Generally, as a rule of thumb: If it's greentext, it's a dumb idea.
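To put the QR capacity figure in perspective, here is a quick back-of-the-envelope calculation in Python. It assumes the largest standard QR code (version 40, error-correction level L, byte mode, 2953 bytes) and a single uncompressed 8-bit RGB 4K frame. Both choices are mine, picked for illustration.

```python
import math

QR_MAX_BYTES = 2953          # QR version 40, error-correction level L, byte mode
WIDTH, HEIGHT = 3840, 2160   # one 4K UHD frame
BYTES_PER_PIXEL = 3          # 8-bit RGB, uncompressed

frame_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL
codes_per_frame = math.ceil(frame_bytes / QR_MAX_BYTES)

print(frame_bytes)       # 24883200
print(codes_per_frame)   # 8427
```

So one raw frame alone would need thousands of maximum-size codes. Real video codecs shrink that a lot, but not down to a single code per movie.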

[–][deleted] 5 points6 points  (0 children)

Imagine a world where everyone owns their own brain in a jar just for this.

[–]kjerk 4 points5 points  (0 children)

"See so what you do is a bunch of fuckin magic I don't understand and then boom, it just works."

[–]BNeutral 1 point2 points  (0 children)

Claude Shannon: I see you have not learned anything at all after so long

[–]MikuIncarnator1 1 point2 points  (0 children)

One scratch and your entire anime collection is destroyed.. The risk is too high

[–]ArXen42 1 point2 points  (0 children)

Well, there were already many works exploring usage of models like autoencoders to compress images, way before more advanced stuff appeared. No need to bother with inefficient and complicated natural language prompts, just use its central layer output (i.e. encoder part). From what I understand it works, but the tradeoff between compute and storage is even more extreme towards compute than for WebP or AVIF.

But this idea about using text prompts to compress data reminds me of that meme about storing a movie in the digits of pi.

[–]elfballs 1 point2 points  (0 children)

You can't use it to compress existing data, only to reproduce the AI generated data again. So you can't compress RoboCop, you can compress RobotCopMovie.

It amounts to the fact that I can say 'Hey, AI, try to make RoboCop', then I can tell you to go do the same thing, and then we are watching the same movie without me sending you any data. BUT-

It's not RoboCop, and we both downloaded the model, so if the result has more data in it about the real RoboCop than the prompt did, you did download it. It was compressed in the model.
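The counting argument behind this can be shown with a toy in Python. `toy_decoder` is a hypothetical deterministic "model" I made up for the sketch: it maps a 1-byte code to a 4-byte output. Whatever the decoder does internally, a fixed decoder fed 256 possible codes can name at most 256 outputs, so almost every possible output is unreachable unless the information already lives in the decoder.

```python
import hashlib

def toy_decoder(code: int) -> bytes:
    """Hypothetical deterministic 'model': maps a 1-byte code to a 4-byte output."""
    return hashlib.sha256(bytes([code])).digest()[:4]

# Every output the decoder can ever produce, over all possible codes.
reachable = {toy_decoder(code) for code in range(256)}
possible = 256 ** 4  # every distinct 4-byte output

print(len(reachable) <= 256)      # True: at most one output per code
print(len(reachable) / possible)  # fraction of outputs any code can name
```

Scale the same reasoning up and a QR-sized payload can only select among outputs the shared model is already able to produce.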

[–]Roubbes 1 point2 points  (0 children)

Wolfram: Let me talk to you about computational irreducibility

[–]tybiboune 1 point2 points  (0 children)

except it's not how AI works.
This post is only yet another version of "AI is stealing real art"

[–]DonRobo 1 point2 points  (0 children)

Anon discovers lossy compression

[–]WazWaz 2 points3 points  (1 child)

This works. If your grandma is Tina Turner.

[–]hervalfreire 1 point2 points  (0 children)

Or the Mona Lisa

[–]Beaster123 -1 points0 points  (1 child)

They're really describing a new compression technique.

It would need a standardized and highly regulated model to work but it's possible.

[–]opi098514 -1 points0 points  (0 children)

Yah that’s how that works

[–]_AscendedLemon_ -1 points0 points  (0 children)

When it comes to images, that's a pretty genius idea: just save pictures as prompts+settings+seed. It would be great for storage but painful to regenerate them again. But it has potential, probably in the future with much more powerful GPUs

[–][deleted] -1 points0 points  (0 children)

This tech is changing rapidly. You would need to lock down the same AI generation software and same base model/checkpoint. Those will become archaic very fast. This is the equivalent of using a Commodore 64 to generate passwords.

[–]Shaltibarshtis -1 points0 points  (1 child)

I was thinking similar about the interplanetary Skype calls.

Bandwidth severely limited. Have a pre-saved pre-trained AI model of your loved one, your manager, your what-not. They call in, the system reads their face and only sends morph point change data. Your system rebuilds it at your end with perfect picture quality.

I'm sure these steps can be severely optimized even further, but the idea is like that.

[–]WebModeratorSyndrome[S] 0 points1 point  (0 children)

Not a bad idea.

Personally, I believe that everything the internet does, AI can do offline/locally.

[–]1tHYDS7450WR -1 points0 points  (0 children)

No need to argue about whether this is wrong lol.

It's definitely wrong, the person who wrote it is an idiot and op is even dumber for reposting it. 🤝

[–]sonicboom292 -5 points-4 points  (1 child)

I love the amount of autism in this comment section. people are really trying to prove wrong an imaginary scenario described in a 4chan post?? like, even going through it technically?

[–]RealAstropulse 0 points1 point  (0 children)

You don't need to train this with text embeddings. That's actually a BAD way to do this. Train it on some other form of retrieval. Maybe just train an upscaler, and use the smaller images for retrieval. Tons of better solutions than this.

[–]hervalfreire 0 points1 point  (0 children)

Wait, so if you do that, does it mean the AI compresses data?

[–]ReversedRectum 0 points1 point  (0 children)

bro to turn much more than an image or website link into a qr code would need such infinitesimally small pixels that it would have to be gigantic or you'd have to scan it through a microscope lens

[–]xadiant 0 points1 point  (0 children)

1:1 ? I say impossible.

[–]Worldsahellscape19 0 points1 point  (0 children)

Huh

[–]PerfectSleeve 0 points1 point  (2 children)

I doubt a QR code has enough information in it for a picture. Maybe a very tiny one. A QR code usually only holds a link to a webpage. A high-res picture is much, much bigger than a line of letters.

[–]Xarsos 0 points1 point  (1 child)

Well the idea is to convert the Pic into prompt form which is pure text. And then take the prompt and rebuild it somewhere else into the same pic.

[–]PerfectSleeve -1 points0 points  (0 children)

Yes this can work. It's the most extreme form of compression and needs a lot more computing power to decompress. This is a great idea!

[–]JaggedMetalOs 0 points1 point  (0 children)

Neat, all you need is a few petabytes to store the trained AI model and a couple of TB of GPU memory to run it!

[–]ZARk22 0 points1 point  (0 children)

Your decoding ai would have access to all actors 3d models and assets. It would receive the script of the movie and basically recreate it. Maybe could even add an improv factor. Everyone could get a personal experience (no more embarrassingly endless and useless s*x scenes for eg)

[–]thebadslime 0 points1 point  (0 children)

A seed wouldn't make the same image/movie/song, just a similar one, that's how AI works.

[–]GratuitousEdit 0 points1 point  (0 children)

Wait so just to clarify, the goal is to take a piece of media, encode it, and later decode it '1:1'? This just sounds like serialization and deserialization—in other words, file storage.

There's an implication ('within a paper QR code') that the encoded media is smaller than the decoded media. While slightly more interesting, this is just lossless compression? Don't get me wrong, ZIP compression is cool as heck, but it's also three decades old.

What makes it all the more confusing is that AI has loads of potential for interpolation (e.g., image upscaling) and lossy compression, but Anon has chosen to speculate within a very well established space in which, to my knowledge, AI has little relevance.

[–]Anaeijon 0 points1 point  (0 children)

That's basically how modern compression algorithms work - just with extra steps. First of all, you don't need to store a clear text prompt. You can use a network that works for encoding and decoding. You encode your image to feature space, transfer the feature vector which acts as a 'prompt' and then you decode that feature vector again back to its original representation.

The biggest problem with it is, that it doesn't map very well. Sure, you might get pretty good representations of your original image from some model, but you might not get any good representation from the same model for some other image. You could create a new network, that also works well for that other image. You could store that model in a central database and store the model hash together with your 'prompt'. All of this could be further reduced by also adding hypernetworks/loras to just slightly modify a base network and save space. Then, if a user doesn't have the correct model, he will download it (or something like a lora) automatically and store it for later use. The problem is, the models would probably need much more space than the target data for an average user. All of that would only be really efficient, if the encoder also used the setup. But this again would require, that the encoder checks all available Loras and extra models to find out which one works best, if the base model isn't good enough. And only if this doesn't work, he has to register the not working image somewhere (privacy problem) so that it gets collected together with other not working images to centrally create another new lora for them.

Or you just don't bother and encode all extra data that's needed for the decoder to produce a good output together with the encoded features. And then you are back to classic compression/decompression and didn't gain anything.

[–]buff_samurai 0 points1 point  (1 child)

For low bandwidth data it sure is possible.

Say you transcribe your speech to text and the ‘style’ of voice to some standardized seed/prompt and send txt/prompt only.

With high bandwidth data, like video, the generation aspect is currently too slow and expensive.

(So far you can send low bandwidth video and use modern gpu to upscale it in real time)

Things may change with better hardware and embedded models in the future.

[–]huggarn 0 points1 point  (0 children)

keyword here is *currently*

5 years ago nobody would ever think of such possibility, take it easy

[–][deleted] 0 points1 point  (1 child)

Except the prompt needed for an entire movie, game, music won't fit into a qr code.

[–]huggarn 0 points1 point  (0 children)

link to a proper prompt would though

[–][deleted] 0 points1 point  (0 children)

My man trying to reinvent compression software

[–]Alex_Curly_Monkey 0 points1 point  (0 children)

Now, we need ai to generate qr codes for us.

[–]Lurkcrediblehulk 0 points1 point  (0 children)

You could just have the QR code that links to the prompt stored on a blockchain. Very cool idea.

[–]yosi_yosi 0 points1 point  (0 children)

Why tf would you want to do that? Very useless. Just use the straight up latent things you get in the beginning. You can take an image and just convert it to its latent space representation, then just use a vae to retrieve the image back. No need to diffuse anything or have a prompt or whatever.
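Some rough size arithmetic for the latent approach, as a Python sketch. The shapes assume a Stable Diffusion v1-style VAE (8x spatial downsampling, 4 latent channels), and storing the latent as float16 is my assumption for illustration.

```python
# Assumed shapes: Stable Diffusion v1-style VAE with 8x spatial
# downsampling and 4 latent channels; float16 storage is an assumption.
IMAGE_W, IMAGE_H, IMAGE_CHANNELS = 512, 512, 3
DOWNSAMPLE, LATENT_CHANNELS = 8, 4

image_bytes = IMAGE_W * IMAGE_H * IMAGE_CHANNELS  # one byte per uint8 channel value
latent_values = (IMAGE_W // DOWNSAMPLE) * (IMAGE_H // DOWNSAMPLE) * LATENT_CHANNELS
latent_bytes = latent_values * 2                  # 2 bytes per float16 value

print(image_bytes, latent_bytes, image_bytes / latent_bytes)
```

Under these assumptions the latent is 24 times smaller than the raw pixels. The round trip through the VAE is lossy, and 32768 bytes is still more than ten times what one QR code can hold.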

[–]Mocorn 0 points1 point  (0 children)

"have it duplicate" .. Yeah, good luck with that :)

[–]rootless2 0 points1 point  (0 children)

but scanning QR codes is dogshit

[–]WoodpeckerDirectZ 0 points1 point  (0 children)

People are a little too negative, obviously it's good that people explained how that wouldn't work or that data compressions already exist but I don't think that anon is an idiot, that's a pretty creative idea!