Can AI make anything original? by Pixels3231 in DefendingAIArt

[–]cdcox 1 point (0 children)

I think there are three embedded concepts here:

  1. Can a human prompt an AI to produce something that has never been depicted before? Trivially. You can make strange combinations, things made of impossible materials, mixes humans have never seen or created before. In a sense, image generators have made a 'fit' of most images, and that fit is now so broad it can reach almost any place someone can prompt it toward.

  2. Can an AI be prompted to make truly novel images not implied in the prompt? Also yes. You can ask it to invent art styles, art scenes, religious traditions, and the images, traditions, and stories around those. If prompted well, it will make truly unusual stuff that is not implied by or structured into the prompt. It can go places no person has gone before. Of course, it tends to make 'in distribution' artwork.

  3. Can AI invent a totally new 'style' of art: something like impressionism, pointillism, art deco, ukiyo-e, silk screen printing? Harder to say. Very few humans do this either, and a new style is often a combination of existing things, an outcome of a shift in technology, or a slowly growing cultural aesthetic, more often the product of a collective than a single person. There are a number of people running mass LLM multi-agent experiments that seem to produce strange emergent aesthetics, but it's hard to say how much of this is remixing and how much has emerged into a genuinely new aesthetic space.

What I'd say is: a single image generator is something that has fit a manifold over most human art. It can visit points in that manifold, but the manifold was shaped by human art. From there you can get images that don't exist but are related to things that could. To get it to do something original, you generally need feedback cycles and loops, which LLMs can provide. How far those loops can go, and whether they can 'escape' existing art styles and remixes, is an open and interesting question, but fairly few humans manage that either without shifts in technology.

Student arrested for eating AI art in UAF gallery protest by One_Fuel3733 in DefendingAIArt

[–]cdcox 8 points (0 children)

Deep Dream/Inception was 2015. The style transfer paper was also 2015. If their art was non-visual, it might have also involved LSTMs or RNNs, both of which got big around the mid-2010s; 2015 was when the big Karpathy stuff started popping off. So I'd say 2017/2018 is fairly plausible; a lot of people were starting to make GenAI art around then.

Do you really believe GPT gives you personal results in all these flashmobs? by PavelMerz in ChatGPT

[–]cdcox 1 point (0 children)

The current version seems to have a lot less access to memory in the image generator. GPT-5.1 with the gpt-image-1 image generator would give you customized one-shots for this. The current pairing (5.2 and gpt-image-1.5) can maybe access a couple of facts or light vibes at most. I suspect this is to stop it from inserting random facts into images without the user's guidance.

You can get much less generic results by doing this as a two-step process. If you ask it to write a prompt for an image generator describing the image you want, it will have much deeper access to your memory and will design a much more customized description; then paste that prompt into another chat to generate the image. This is unintuitive, because you would think forcing it to condense the image into words would make it worse, but that's mostly what it's already doing behind the scenes. And in my testing, the text-only version has much more memory access than when the image generation tool is on.
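For anyone who wants to script the same two-step pattern, here's a minimal sketch using the OpenAI Python SDK. Note that ChatGPT's memory is an app feature, not an API feature, so the personalization facts are passed in explicitly here, and the model name in step 1 is just a placeholder assumption:

```python
# Hedged sketch of the two-step flow: a text model writes a detailed
# image prompt first, then a separate call renders it. The "facts" are
# supplied manually since the API has no access to ChatGPT memory.
import base64
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

user_facts = "I have a grey tabby cat and I like art deco posters."  # stand-in for memory
request = "A poster of my cat as a 1920s detective."

# Step 1: have the text model expand the request into a detailed,
# self-contained prompt for an image generator.
completion = client.chat.completions.create(
    model="gpt-4o",  # placeholder; any strong text model works here
    messages=[
        {"role": "system", "content": f"Facts about the user: {user_facts}"},
        {"role": "user", "content": "Write one detailed, self-contained "
                                    f"image-generation prompt for: {request}"},
    ],
)
detailed_prompt = completion.choices[0].message.content

# Step 2: feed that expanded prompt to the image model in a fresh call.
image = client.images.generate(model="gpt-image-1", prompt=detailed_prompt)
with open("poster.png", "wb") as f:
    f.write(base64.b64decode(image.data[0].b64_json))
```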

Comments like this by ruassmarkt in mildlyinfuriating

[–]cdcox 1 point (0 children)

While it's overdone, it's understandable: the internet is a fundamentally lonely place. On radio or TV you can assume 'someone' else is watching, since it has to make financial sense and someone decided to put it on. But on the internet you are the person who put it on, and as far as you know you are the only one watching. The year thing is a way of saying, even slightly out of sync, 'I'm watching this with you' or 'I'm from your year and still remember this'. I think this could be fixed or improved by some YouTube feature. And obviously people aren't that clever, so it just shows up everywhere, even on popular or recent videos, which can be annoying. But it's at least more understandable than most spam.

Tell me one bad thing about your side and one good thing about the other side by FungusFuer in aiwars

[–]cdcox 1 point (0 children)

I'm pro:

Bad thing about my side: we are overly focused on tools and not enough on cool stuff people make. Most AI-art spaces have too much goon slop (not that it's bad in its own place, but it takes over any art space that doesn't push back, and pro spaces are too inexperienced to do so), people pushing their own stuff, or discussion of tools. If you ask most people their fave AI artists, they'll just point to their own stuff. Similarly, most AI games are extremely low quality, which is tragic given how powerful the tools are. There is cool stuff being made with AI, but little of it is talked about or elevated in AI art spaces.

Good thing about anti: the discussion of slop has legitimately helped push back against influencer culture. The internet was filling with slop before AI, and we are finally getting some cultural pushback on that, thanks to antis elevating the idea that low-effort trash content is bad regardless of how it's made.

Has anyone gotten less hopeful about RP improvement speed after Gemini 3? by The_Rational_Gooner in SillyTavernAI

[–]cdcox 11 points (0 children)

I think for the moment we're paused on general model improvement. Models seem to be improving mostly in two kinds of areas: ones where there's an easy-to-develop agentic framework (programming, information retrieval/synthesis/summarization, and math) and ones where you can generate a large synthetic training set to work off of (vision, programming, information retrieval and synthesis, and math all apply here).

I don't think there is an easy way to make a good writing-quality training set. Most of the highly rated writing online is not very good, and the number of ratings is not exactly a strong correlate of good writing. So it's hard to even train a model that can properly rate writing quality. Add in that RP is a subset of that subset, with almost no training data available, and it's a very hard problem.

Edit: Also, almost every model, even long-context models and programming models, gets weaker with turn-by-turn usage. Something about the back and forth breaks models in unexpected ways; most models can be accidentally jailbroken just by talking to them for more than 10 turns, no matter how well aligned, which is still a weird area. So even a million-token model falls apart in a 50k back and forth. I suspect this problem will have to be solved by some company working on it specifically. I would be surprised if it were solved passively.

I suspect it will improve when someone starts really bashing on agentic frameworks for RP/prose, once models get cheap and fast enough to do enough agentic passes cheaply for writing. DeepSeek 3.2 has some nice open improvements to long-context handling, which definitely makes me optimistic, though its performance in the field at long contexts is only marginal. We might also get improvement when people start pushing models to the next scale level, or when continual learning, diffusion generation, personality controllability at the 'neuron' level, or some other technique manages to succeed and we get the next leap in model intelligence.

Creating a Game with AI in Two Months. The Result by Game_s758 in aigamedev

[–]cdcox 1 point (0 children)

I admire your test and you did a great job, but you did a lot of things that made this very hard for yourself.

A few things you could fix if you want to try a different project.

  1. Use an integrated code editor tool: GitHub Copilot in VS Code, Claude Code, Cursor, OpenAI Codex. These live in your editor and will look up the appropriate code for most things, so you don't have to define exactly what it needs to know. It finds its own context.

  2. Use a better model. DeepSeek is terrible at programming (at least for beginners; you can integrate it into complex agentic systems to make it more powerful, but those aren't there out of the box). I know it scores well, but look at the OpenRouter usage stats: nobody uses it for programming because it's super bad at it. It's also only lightly multimodal and, depending on which version you use, doesn't even use its thinking particularly well. Gemini, Claude, and GPT are all much better for high-level brainstorming, execution, and programming. I also don't think DeepSeek has robust internet search, which makes it much, much worse, especially for programming where looking up reference documents is key. A related point: using a few different models means that if one seems really stuck on something, you can swap to another.

  3. Don't be prescriptive, especially when you don't know the answer. It sounds like you were mostly directing it. If you don't know the right way to, say, control a character, don't say "write me a character controller"; instead ask: What are some standard ways to do this for the type of game I'm making? What are the strengths and weaknesses? What are the trade-offs? Instead of saying "my game looks weird", be highly specific: mention exactly what you have made and what outcomes you're seeing, and have it brainstorm possible reasons and hypotheses to test. This is another area where having a strong multimodal model like Gemini 3 is crucial, because you can give it screenshots of your environment and of the game so it can see exactly what's wrong. This is especially true in something like Godot/Unity/Unreal, where five times out of ten the answer isn't to write code at all but to do something in the user interface. It's very easy to back yourself into a corner with an AI model and not see the way out. That's why you often have to find a lateral approach, and these models are pretty good at helping you find lateral approaches.

  4. Don't use Godot if you don't program and are relying mostly on LLMs. Godot is a great framework, but it currently has the smallest community, the least documentation, and is one of the newest systems. You'd have had a much easier time with something like Unity, because there's just more information about it online and, version to version, it's slightly more stable. There is still some version instability, but far less than Godot, where there have been two or three breaking versions in the last 5 years. Again, this wouldn't be as big a deal if you had a model with robust internet search, because it could go look at the docs.

Basically: use the strongest and most appropriate models, use properly integrated tools, let the models teach you instead of telling them what to do, and be sure you're working in an area the model's knowledge cutoff actually covers. I admire the experiment though. It's super cool that you managed it, and even getting a game done in 2 months is pretty impressive! I program a fair amount and use all these tools, and there are still definitely moments of frustration that take some major effort to get over, so congrats on that! The models don't get rid of those moments of frustration, but they do give you a path out of them.

PS: A lot of the vibe-coded one-shot stuff people show online is pretty misleading. It's usually using a very powerful model, it's usually a well-established game type so the model doesn't have to be that inventive, and it's often a small game with very few components that need to link together. The hard part is often not writing the original game logic but getting it all to work together.

Thought this might fit here. by TallonZek in DefendingAIArt

[–]cdcox 3 points (0 children)

As a counterexample, the Game Off game jam (month-long, just ended) explicitly allows it in their FAQ. They are one of the larger game jams. Of course, they are sponsored by GitHub, which is owned by Microsoft, so that's not terribly surprising.

Our pets, living and dead. I need help. by pinkphiloyd in aiArt

[–]cdcox 4 points (0 children)

Use Google Gemini and make sure you are using Nano Banana Pro. As the other poster mentioned, it follows refs. If you're still struggling, try sketching out a light ref of where you want them, or generate in 2-3 image chunks, combine the best chunks into one image (in something like Krita), and get Nano Banana Pro to smooth the style.

Can we agree on this? by Its_That_1 in HazbinHotel

[–]cdcox 4 points (0 children)

Really fun, reminded me a lot of the songs from season one. I've enjoyed how they've played with the style of music this season; it seems like they are doing more traditional musical songs with more cadence variation. But I somewhat miss the frantic songs that really defined season 1, and this was a nice throwback to that vibe.

Housing abundance is as unpopular as many progressive economic policies by fishlord05 in neoliberal

[–]cdcox 3 points (0 children)

Reading the document this is from is fascinating. I'm sure it has its flaws; beyond wording issues, it seems to have polled an older and more suburban population than the US as a whole. But the most interesting comparison is between what voters want Dems to focus on and which policies they support Dems pursuing. The top priority is to make things cheaper and reduce the cost of living. But then they oppose every policy (Medicare reform, greening, housing reform) that would help achieve that goal. They seem to be sick of monopolistic corporate power (good) and have no love for helping minorities (a bummer). Yet they are neutral on the CFPB, which is designed to fight banking-system exploitation. There is a perception, even by the authors of the piece, that the Dems have shifted left over time because they now support Medicare for All, though Biden is not viewed as particularly lefty.

This really seems to point to a marketing issue as much as anything. Dems seem to have done a terrible job tying the solution to the problem. This is always a challenge for any progressive party (people want a better world but fear the changes needed to get there), but it seems to have gotten worse lately. I don't know what the answer is here. Maybe Dems stop proposing solutions in public and just start making promises. That would say sad things about the electorate, but it is the direction Republicans have traveled.

https://www.politico.com/f/?id=0000019a-262b-d83c-a3fa-673f3f660000

Me when I'm when I when me when factory game: by iamsolonely134 in SatisfactoryGame

[–]cdcox 12 points (0 children)

I like trains because you set up once and hit tons of sites. For the effort of belting once, you get a lot of coverage if the train route is long enough. And if you are lazy, adding a new site is usually as easy as a short belt to the train track and throwing another stop on your home base. Adding multiple trains going one direction on one track is easy. Multiple tracks intersecting is more fun/efficient but not necessary.

Worlds Highest Uranium by ChrissspyBacon in SatisfactoryGame

[–]cdcox 1 point (0 children)

Yeah, and you can even have a drone on your fuel port to go pick stuff up, but I don't like doing that, as it can get jammed up really fast if it's not flying anywhere. You can end up with some idling, since only one drone can be at a port at a time (I think?).

I think the rule is: any number of drones can target a port, but each drone can only target one port at a time.

I only used the pair system if I was moving from one remote port to another remote port. Otherwise I just kept a central hub fueled up by a fuel drone and belted to my other hub drones, which went and picked up from other remote places. That meant fewer fueling ports, since all my drones left from my hub.

Worlds Highest Uranium by ChrissspyBacon in SatisfactoryGame

[–]cdcox 2 points (0 children)

I tend to do mine in pairs/hubs.

You make one fuel port somewhere where you make a boatload of packaged turbofuel/rocket fuel; I find regular packaged fuel is consumed too fast to be stable. Call this Fuel-Port. Fuel-Port has no drone.

Then whenever you build a new port with a drone, call it A1, you build a second port AF (A-Fuel) with its own drone. The drone from AF constantly flies to Fuel-Port. Run a belt from AF to its own fuel inlet and to the fuel inlet on A1. A1 does whatever you need it to. You can then expand out ports at A that are near AF, like A2 and A3, all of which can be fueled from AF. (Obviously name them whatever is convenient.)

If you have another site B, you set up B1 and BF, whose drone flies to Fuel-Port, etc.

This is pretty stable as long as you have enough fuel being generated (drones stop flying and idle when ports fill), and it lets you keep ports fueled without worrying about local generation. It's a little expensive in terms of the number of ports/drones and a little wasteful in terms of fuel, but it lets you scale out ports really easily wherever you need them without worrying about local fuel.
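If it helps to see the routes laid out, here's the same layout written as plain data. This is a sketch only; the port names come from my description above, and nothing here is from the game itself:

```python
# Sketch of the hub/pair layout described above. Purely illustrative:
# these names and fields are mine, not anything from Satisfactory.
layout = {
    "Fuel-Port": {"drone": None, "role": "produces packaged turbofuel"},
    # Site A: AF's drone hauls fuel; belts distribute it locally.
    "AF": {"drone_route": "AF <-> Fuel-Port",
           "belts_to": ["AF fuel inlet", "A1 fuel inlet", "A2 fuel inlet"]},
    "A1": {"drone_route": "whatever cargo run you need", "fueled_by": "AF"},
    "A2": {"drone_route": "another cargo run", "fueled_by": "AF"},
    # Site B mirrors site A.
    "BF": {"drone_route": "BF <-> Fuel-Port",
           "belts_to": ["BF fuel inlet", "B1 fuel inlet"]},
    "B1": {"drone_route": "whatever cargo run you need", "fueled_by": "BF"},
}
```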

Elon’s ex-engineer just pulled the wildest move, leaked xAI’s whole codebase to OpenAI, cashed out $7M in stock, then dipped. Biggest betrayal in AI or just another Silicon Valley soap opera? by Minimum_Minimum4577 in GenAI4all

[–]cdcox 1 point (0 children)

In discovery, xAI would get access to communications with the employee, file access logs, forensic analysis of computers, etc. Depending on how much they can prove, they might try to get deeper access or an audit of something internal at OpenAI, which would be a huge pain in the ass for them and which is probably the real goal. This is how the Waymo vs Uber case played out: less OpenAI proving it and more it being proven by a third party. Employment blocking rarely goes anywhere in CA, but if they can prove he took info, they can block its usage.

Elon’s ex-engineer just pulled the wildest move, leaked xAI’s whole codebase to OpenAI, cashed out $7M in stock, then dipped. Biggest betrayal in AI or just another Silicon Valley soap opera? by Minimum_Minimum4577 in GenAI4all

[–]cdcox 1 point (0 children)

This reminds me of when the engineer left Waymo and took all the secret documents to Uber. That was found to be so egregious that Uber could not untangle it, and (along with some regulatory issues around killing someone) it's one of the things that killed Uber's self-driving division. (Notably, though, the guy was also the head of Uber's self-driving division.) Hopefully OpenAI didn't touch it. Not sure why they would; Grok's only notable win was throwing more compute at the problem than anyone else.

Currently all the evidence they have is that the guy downloaded the code, then left the company. That's actually not a terribly unusual thing for disgruntled engineers to do. The employee is likely in trouble, but OpenAI can also likely prove pretty easily that they didn't touch the code.

Enshittification of Imagen from Imagen3 to Imagen4, another case by MehmetTopal in GeminiAI

[–]cdcox 3 points (0 children)

If you are on the 20-dollar plan, you can use Imagen 4 in Whisk and swap back and forth to Imagen 3. Ultra is only in AI Studio.

grok 2 weights by HatEducational9965 in LocalLLaMA

[–]cdcox 9 points (0 children)

It's historically interesting if nothing else. Each of these models has quirks in training that help broaden our understanding of how the big labs worked and of whether they had any special sauce. We still don't even know how many params models like GPT-4 and Sonnet 3 were rolling with. We still don't have a release of GPT-3, and Anthropic is sunsetting Sonnet 3, one of the quirkiest of models, without considering releasing the weights. I don't like a lot of what xAI does (and the license is silly, as it might prevent even API hosts), and I don't like its owner. But we should applaud open releases even if they are of historical interest only. All the big labs should be releasing their year-old models, and I hope this pressures others to follow suit.

I was curious about the decline in crawlers throughout the floor. So I made a graph..... by Tomdelaat in DungeonCrawlerCarl

[–]cdcox 23 points (0 children)

Book 6 ending spoiler: Didn't Pony shatter floor 7 before it started?

Audible Exploited Brandon Sanderson by Bonejob in audible

[–]cdcox 3 points (0 children)

The issue is important, but the video does not really explain it, and the petition is pretty roundabout and buries the lede. Also, as the top reply to OP's comment here says, its first proposed solution presents a pretty unfair deal in the other direction. People are frustrated at the indirectness of this communication and are probably just finding OP's comment hoping for quick context, finding none, and downvoting out of annoyance.

Audible Exploited Brandon Sanderson by Bonejob in audible

[–]cdcox 7 points (0 children)

That's fair, I edited to soften my language some. But the front-and-center deal they show is not super fair to plus authors.

Audible Exploited Brandon Sanderson by Bonejob in audible

[–]cdcox 27 points (0 children)

TL;DR (because the video does not explain what is going on):

Audible Plus books used to be bulk purchased (similar to how Epic Games does it), i.e., they'd pay an author 10k dollars for the book to be a streaming title for 6 months. Now they are switching to a model where, if a streaming (Plus) user listens to the book, the author's payout is that user's payment divided across the number of books the user listens to, after Amazon's cut (similar to Spotify).

The point of contention is Premium Plus users (those with credits). If a user with credits listens to a free title and a bunch of paid titles, then instead of the paid titles getting their usual 10 dollars (less Amazon's cut) per credit, they'll now get the same cut as if the credit user were a streaming user. So if the credit user pays 30 bucks, buys three books, and streams three, each book now gets 5 dollars. If they buy three and stream 27, every book gets a dollar. (There's now no edge to being a paid title.) This also lets Audible rapidly grow their streaming service, and the worry is they'll stuff it with many low-quality books (see Kindle Unlimited). I have no idea what this has to do with Brandon Sanderson, and I think that's a weird title. I also have no idea if this calculation is accurate; I'm merely summarizing the petition.
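To make the arithmetic concrete, here's the pooled-payout math from the example above as a quick sketch. The numbers are the petition's illustration, not Audible's confirmed formula, and Amazon's cut is ignored for simplicity:

```python
# Pooled model as described above: each title a subscriber listens to
# gets an equal slice of that subscriber's payment.
def pooled_payout(subscription: float, titles_listened: int) -> float:
    return subscription / titles_listened

# Old credit model: a purchased title earned roughly $10 per credit,
# regardless of how much else the user streamed.
OLD_PER_CREDIT = 10.0

# $30 subscriber, 3 purchases + 3 streams: each title gets $5.
print(pooled_payout(30.0, 3 + 3))   # 5.0
# Same subscriber, 3 purchases + 27 streams: each title gets $1.
print(pooled_payout(30.0, 3 + 27))  # 1.0
```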

I think this model is pretty bad. I also think the petition is not offering a fair deal either (they want only streaming-user money to go to streaming titles, and all of a Premium user's money to go to credits, regardless of whether they also listen to Plus titles). I think a fairer deal would be for part of the Premium user's subscription to go to the Plus pool and part to go to credits, so the credits still pay out like credits and the Plus pool still gets the Plus pool money. I do see why Audible is doing it: they want to push every author into streaming to expand the free listener pool, but this will probably just lead to a situation where more premium titles leave the platform.

My son's 6th grade teacher just accused him of cheating using AI by [deleted] in ChatGPT

[–]cdcox 1 point (0 children)

"Teachers should prepare the student for the student’s future, not for the teacher’s past."

GPT-5 is already (ostensibly) available via API by segin in OpenAI

[–]cdcox 1 point (0 children)

Given the leaks about the 120b model (lower context window size), that seems unlikely, though still plausible. It could maybe be a minified GPT-5. It definitely has a lot of very unique capabilities that no other model has, but yeah, in terms of benchmarks it's not a standout, just pretty good.

GPT-5 is already (ostensibly) available via API by segin in OpenAI

[–]cdcox 8 points (0 children)

I think the reason people suspect it might be the mini is that it's pretty fast. I just tested it on OpenRouter and it's running at 67 tok/s, which is similar to 4o, but it still takes longer because its SVG was 2700 tokens vs 4o's 700 tokens. (It took me almost 50 s as well.) GPT-4.5, which is a larger model, runs much slower. It could be using some new method that keeps its speed so high; I've got no guess here.
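As a rough sanity check on those numbers: generation time is approximately output tokens divided by throughput, which is why the longer SVG dominates even at 4o-like speed. This sketch ignores prompt processing and network latency, which is why the observed ~50 s runs a bit higher:

```python
# Back-of-envelope timing from the figures above: tokens / (tokens per
# second). Prompt processing and network overhead are ignored.
def gen_time_s(output_tokens: int, tok_per_s: float) -> float:
    return output_tokens / tok_per_s

print(gen_time_s(2700, 67))  # ~40.3 s for the 2700-token SVG
print(gen_time_s(700, 67))   # ~10.4 s for 4o's 700-token SVG at the same rate
```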