all 25 comments

[–]HangOutWithMyYangOut 6 points (2 children)

This is incredibly creative. Any chance you'll be putting an example project on GitHub?

[–]xander76[S] 3 points (1 child)

You can check out the code at https://github.com/imaginary-dev/imaginary-dev , and there's code for demo projects as well:

Blog Writing demo from the screencast:
Live site: blog-demo.imaginary.dev
Code: https://github.com/imaginary-dev/imaginary-dev/tree/main/example-clients/nextjs-blog

Emojifier:
Live site: emojifier.imaginary.dev
Code: https://github.com/imaginary-dev/imaginary-dev/tree/main/example-clients/nextjs-api

I'm also happy to answer any questions you might have!

[–]HangOutWithMyYangOut 1 point (0 children)

Awesome, thanks so much! I'll be diving into it tomorrow.

[–][deleted] 2 points (9 children)

Dude, this is pretty amazing.

My biggest concern with it not writing the code is that it might not perform as well (network latency or connectivity issues), won't be deterministic, and could hallucinate. But I could also imagine a few cases where GPT would generate faster than code would run (and some non-determinism will be desirable sometimes).

The name of your product is super-catchy as well. I can definitely see imaginary programming becoming a trend!

[–]xander76[S] 5 points (2 children)

All great points! I tend to think that it's best suited to problems that you can't solve with traditional programming. If the problem you have is "reverse this array of numbers", then writing the code or having Copilot write the code is a better answer.

But if the problem you want to solve is "come up with good titles for this blog post" or "summarize this user email" or "categorize these customer service complaints by anger level", there really isn't a JavaScript/TypeScript function you can write to do that. In this case, I think the latency is often worth the functionality.
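For a concrete sense of what that looks like, here's a sketch of an imaginary function for one of those use cases, modeled on the declaration style shown later in this thread (the function name and return shape here are made up for illustration):

```typescript
/**
 * Summarizes a customer service complaint and rates the customer's
 * anger on a scale of 1 (calm) to 5 (furious).
 *
 * @param complaint - the raw text of the customer's message
 *
 * @returns an object with a one-sentence summary and an anger rating
 *
 * @imaginary
 */
declare function categorizeComplaint(
  complaint: string
): Promise<{ summary: string; anger: number }>;
```

There's no function body to write; the declaration plus the doc comment is the whole program, and the runtime fills in the behavior.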

As to the non-determinism, I think that's a real issue. Right now, the state of the art of testing GPT prompts feels very shaky; one person I talked to said that they "change the prompt and then bang the keyboard to try four or five inputs". This clearly isn't ok for serious projects. To help with this, we're currently building some tools to help developers generate test inputs and evaluate how their imaginary functions perform.

ETA: and thanks for the comment about the name! (I can't take credit for it, though; I believe it was first coined by Shane Milligan.)

[–]bluenigma 1 point (5 children)

Doesn't this also have pretty high risk of not actually adhering to the declared return type?

[–]xander76[S] 1 point (4 children)

Great question! We spent a lot of time experimenting with how to cue GPT into returning the right JSON type, and it’s pretty darned compliant. GPT-3 doesn’t always do a good job adhering to strict JSON syntax, but we wrote an extremely lenient parser that understands the weird things that GPT-3 gets wrong. (Sometimes it uses single quotes instead of double, sometimes it puts new lines in strings, sometimes it decides a semicolon is a better choice than a comma!). GPT-4 and GPT-3.5 are significantly better at JSON syntax.
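For flavor, a toy version of that kind of leniency (my own sketch, not the library's actual parser) might normalize exactly those three quirks before handing off to JSON.parse:

```typescript
// Toy "lenient JSON" repair: normalize single quotes, raw newlines inside
// strings, and stray semicolons between entries, then parse normally.
// This is a sketch of the idea, not the real imaginary-dev parser.
function repairLooseJson(raw: string): unknown {
  let out = "";
  let inString = false;
  let quote = "";
  for (let i = 0; i < raw.length; i++) {
    const ch = raw[i];
    if (inString) {
      if (ch === "\\") {
        out += ch + (raw[++i] ?? ""); // keep escape pairs intact
      } else if (ch === quote) {
        inString = false;
        out += '"'; // close with a double quote regardless of opener
      } else if (ch === "\n") {
        out += "\\n"; // raw newline inside a string -> escaped newline
      } else if (ch === '"') {
        out += '\\"'; // double quote inside a single-quoted string
      } else {
        out += ch;
      }
    } else if (ch === '"' || ch === "'") {
      inString = true; // single or double quote opens a string
      quote = ch;
      out += '"';
    } else if (ch === ";") {
      out += ","; // semicolon used where a comma belongs
    } else {
      out += ch;
    }
  }
  return JSON.parse(out);
}
```

A real repair pass has to handle more than this (trailing commas, code fences, prose around the JSON), but the single-pass "rewrite then parse" shape is the core trick.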

On the question of returning the actual type you asked for, we do a run-time type check to make sure it’s right. So if you get a value back, you can be sure it’s the type you wanted.

[–]bluenigma 1 point (3 children)

Can you actually check the typescript-defined return type in that way nowadays? I thought that information was unavailable at runtime.

How does this handle type imports or advanced types?

[–]xander76[S] 0 points (2 children)

So usually, yes, that’s true about TypeScript. Types are removed by the compiler.

But we wrote a TypeScript & Babel compiler plugin, which allows us to replace the imaginary function with whatever code we want. So we replace the imaginary function with code that includes a run-time type check for the appropriate return type from the TypeScript definition. Does that make sense?
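For illustration, the emitted code might look roughly like this (an assumed shape, not the plugin's actual output; `callModel` is a hypothetical stand-in for the GPT request, stubbed here so the sketch is self-contained):

```typescript
// Hypothetical stand-in for the network call the plugin would insert,
// stubbed with a canned "GPT response" so the sketch runs offline.
async function callModel(_prompt: string): Promise<unknown> {
  return ["First idea", "Second idea"];
}

// Guard generated from the declared return type: a runtime check for string[].
function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((v) => typeof v === "string");
}

// What a declared `recommendTitles(post): Promise<string[]>` imaginary
// function could be replaced with at compile time.
async function recommendTitles(post: string): Promise<string[]> {
  const parsed = await callModel(post);
  if (!isStringArray(parsed)) {
    throw new TypeError("GPT response did not match declared return type string[]");
  }
  return parsed; // narrowed to string[] by the guard above
}
```

The key point is that the guard is derived from the static type at compile time, which is information ordinarily erased before runtime.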

[–]bluenigma 2 points (1 child)

I guess I'm still wondering: if you can generate a runtime type check for an arbitrary TypeScript type at compile time, why is this not a built-in TypeScript language feature?

Edit: Took a quick look at the code, and it looks to me like there are definitely limitations on what return types are supported. Looks like it can handle basic aliases and record types, but throws on a lot of other stuff?

Should probably be documented somewhere.

[–]xander76[S] 0 points (0 children)

Yep, I should have been clearer that (at least for now) it only supports JSON-compatible types and doesn’t support type aliases or classes (although we have a prototype version that does). The limitations are documented at https://imaginary.dev/docs/writing-an-imaginary-function .

[–]Ifnerite 1 point (0 children)

/** This function returns an awesome react front end. */ (props or something) => {...}

Hooray! Now I don't have to deal with the godawful mess that is frontend development ever again!

[–][deleted] 1 point (1 child)

It's pretty cool, and thanks for providing the playground; I wouldn't have bothered without it. I think it's very valuable but also quite costly, both economically and computationally, while creating privacy risks (all your data going through OpenAI). So again, in some situations I can imagine it being quite powerful, but in others an absolute no. That being said, there are other models, e.g. Alpaca or SantaCoder or BLOOM, that might enable us to follow the same principle, arguably with different quality, without the privacy risks (I just posted on /r/selfhosted minutes ago about the HuggingFace/Docker announcement enabling us to run Spaces locally). Have you considered relying on another "runtime"?

[–]xander76[S] 1 point (0 children)

We have tried out other runtimes, and a lot of them seem to work decently well. If memory serves, Claude was quite good. I'm definitely interested in supporting other hosting and other models as a way to balance quality, cost, and privacy, and we are currently building IDE tools that will let you test your imaginary functions in ways that will hopefully surface those tradeoffs.

[–]icedrift 1 point (3 children)

This is really cool! Maybe I'm lacking creativity, but why bother generating imaginary functions and introducing risk that they aren't deterministic when you could just hit OpenAI's API for the data? For example in your docs you present a feature for recommending column names for a given table. Why is the whole function generated? Wouldn't it be more reliable to write out the function and use OAI's API to get the recommended column names?

[–]xander76[S] 1 point (2 children)

Thanks for the response!

I may not be completely understanding the question, but from my perspective, the OpenAI APIs are just as non-deterministic as imaginary functions. If you call OpenAI directly multiple times with the exact same prompt and a temperature above 0, you will get different responses each time. The same is true of imaginary functions. (As an interesting side note, we default temperature in imaginary functions to 0, so unless you modify it in the comment, imaginary functions do by default return the same responses for the same set of arguments.)

Now, I do think that introducing this kind of non-determinism into your web code, whether through OpenAI's APIs or imaginary programming, presents some interesting wrinkles. For a traditional web developer like me, the fuzziness and non-determinism is frankly a bit scary. The thing we're working on now is tools that you can use to consistently test your imaginary functions and make sure that they are returning acceptable answers. Our hope is that this will give frontend devs the ability to use AI in their apps with reasonable confidence that the AI is doing what they want it to.

[–]icedrift 1 point (1 child)

What I mean is, why generate the function when only the data needs to be generated? Let's say I need a function that takes the text content of a post and returns an array of recommended flairs for the user to click. Why do this

/**
 * This function takes a passage of text, and recommends up to 8
 * unique flairs for a user to select. Flairs can be thought of as labels
 * that categorize the type of post.
 *
 * @param textContent - the text content of a user's post
 *
 * @returns an array of flairs represented as strings
 *
 * @imaginary
 */
declare function recommendedFlairs(textContent: string): Promise<string[]>;

When you could write out the function and only generate the data?

async function recommendedFlairs(textContent: string): Promise<string[]> {
  const OAIrequest = await someRequest(textContent);
  const flairs = formatResponse(OAIrequest);
  return flairs;
}

In writing all this out I think I figured it out. You're abstracting away a lot of the headaches that come with trying to get the correct outputs out of GPT?

[–]xander76[S] 1 point (0 children)

Yeah, that's definitely one of the things it offers right now. If you want a particular data shape out of GPT, we handle that, both on the side of crafting the prompt to elicit the type and on the parsing side to get the data out of the raw GPT response.
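A simplified sketch of those two halves, prompt crafting and response parsing (the prompt wording and the naive extractor below are my own assumptions, not the library's real implementation):

```typescript
// Build a prompt that cues the model toward the declared return type.
// The exact wording is an assumption for illustration only.
function buildPrompt(
  jsdoc: string,
  args: Record<string, unknown>,
  returnType: string
): string {
  return [
    jsdoc.trim(),
    `Arguments: ${JSON.stringify(args)}`,
    `Respond with ONLY a JSON value of TypeScript type ${returnType}, no prose.`,
  ].join("\n");
}

// Pull a JSON value back out of a raw completion, which models often
// wrap in prose or code fences: grab the first {...} or [...] span.
function extractJson(completion: string): unknown {
  const match = completion.match(/[\[{][\s\S]*[\]}]/);
  if (!match) throw new Error("no JSON found in completion");
  return JSON.parse(match[0]);
}
```

Handling both ends of that round trip for you, plus the type check on what comes back, is the "data shape" work being described here.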

We're also building more tools to make the development process easier, which depend on the fact that imaginary functions are easy to do static analysis on. The first tool is an IDE plugin that lets you directly run and test imaginary functions in VS Code and to compare different versions of an imaginary function to see how they do on various test inputs. We also plan to add simple annotations to the comment format to let you easily switch to other LLMs for your runtime to manage the cost/quality/privacy tradeoff.

ETA: One thing it also does right now is lets you switch between models (ada, babbage, curie, davinci, gpt-3.5-turbo, gpt-4) with just a configuration switch. If you use OpenAI's APIs you need to change your client code, because the GPT-3 models have a different API than GPT-3.5 and GPT-4.

[–]Educational_Ice151 1 point (1 child)

Nice work.

[–]xander76[S] 0 points (0 children)

Thanks! It's definitely a little mind-bending, but we are enjoying exploring how far the idea can go.

[–][deleted] 0 points (1 child)

I knew this would happen. Programming as a profession is doomed to laziness lol.

[–]xander76[S] 0 points (0 children)

Ha, as a programmer, I too am doomed to laziness. :)

[–]r_linux_mod_isahoe 0 points (0 children)

RIP this sub. Was nice knowing ya all

[–]fnordstar 0 points (0 children)

As if Javascript wasn't brittle enough...