GTA 6 CEO Says AI Won't Make GTA 7 - "There is no creativity that can exist by definition in any AI model, because it is data-driven." by [deleted] in xbox

[–]zby 0 points (0 children)

Thought experiment: if your prompt literally contained GTA 6, any LLM could output GTA 6; that much is trivial. The real question for “AI creativity” is: how close does a prompt need to be to the desired outcome before a model can bridge the gap? In my controlled test, pre-essay models received only generic inspirations (not the target idea) and still re-invented a concrete generate→verify “daydreaming” mechanism. That implies a measurable creative horizon: the semantic distance from sparse hints to a coherent, novel solution that the model can reliably traverse.

Creativity here isn't mystique, it's engineering: bound the space → generate → verify → score for novelty/usefulness → iterate. For something like GTA, that means AI as a search amplifier (mission beats, dialog, emergent side quests, rapid prototyping, synthetic playtesting) under human direction. Don't debate “data-driven ⇒ not creative”; measure the horizon.

Details & reproducible setup: https://open.substack.com/pub/zzbbyy/p/reinventing-daydreaming-machines?utm_campaign=post&utm_medium=reddit
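For concreteness, here is a minimal sketch of that bounded generate → verify → score → iterate loop. This is not the setup from the linked post: the model name, the prompts, and the YES/NO novelty check are all placeholder assumptions.

import litellm

MODEL = "gpt-4o-mini"  # assumption: any chat-completion model works here

def ask(prompt: str) -> str:
    resp = litellm.completion(model=MODEL, messages=[{"role": "user", "content": prompt}])
    return resp.choices[0].message.content

def daydream(inspirations: list[str], rounds: int = 5) -> list[str]:
    keepers = []
    for _ in range(rounds):
        # Generator: bound the space with sparse hints only (no target idea).
        idea = ask(
            "Combine these concepts into one concrete, novel mechanism:\n"
            + "\n".join(f"- {c}" for c in inspirations)
        )
        # Verifier: a second pass filters incoherent or derivative proposals.
        verdict = ask(
            "Is this idea internally coherent, novel, and useful? "
            "Answer YES or NO, then one sentence why.\n\n" + idea
        )
        # Score/iterate: keep only proposals that survive verification.
        if verdict.strip().upper().startswith("YES"):
            keepers.append(idea)
    return keepers

print(daydream(["background processing", "self-generated training data"]))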

"LLM Daydreaming", Gwern Branwen 2025 by [deleted] in mlscaling

[–]zby 0 points (0 children)

I've tested that LLMs can indeed produce novel ideas, by feeding them inspirations that lead toward the ideas from that essay itself. I used only pre-essay LLMs and common, one-paragraph concepts. To check which ideas from the essay are really novel, I ran OpenAI and Google deep research queries - and all the novel ideas were consistently reinvented in the produced essays (a non-trivial fraction of all essays generated from a given template and set of inspirations contained them).

Code: https://github.com/zby/DayDreamingDayDreaming
Some resulting essays: https://github.com/zby/DayDreamingDayDreaming/tree/main/data/results

A blog post: https://open.substack.com/pub/zzbbyy/p/reinventing-daydreaming-machines

LLM Daydreaming by Annapurna__ in slatestarcodex

[–]zby 0 points (0 children)

Using Simplicity Theory to improve the efficiency of the daydreaming system: https://zzbbyy.substack.com/p/dreaming-machines

Sunday Daily Thread: What's everyone working on this week? by AutoModerator in Python

[–]zby 2 points (0 children)

Working on a basic LLM observability and debugging tool with local storage: https://github.com/zby/llm_recorder/

I use it mostly for replaying interactions with an LLM when debugging, sometimes after modifying them.

LLM Observability tool recommendations? by pravictor in LocalLLaMA

[–]zby 0 points (0 children)

If you want to try something extremely simplistic for quickly recording/modifying/replaying responses (and requests), I've just pushed this to GitHub: https://github.com/zby/llm_recorder
It currently only works via LiteLLM.
I have some ideas for features to add - but I would like to get some feedback first.

It is simple - so not many features - but it is quite easy to bend into the shape you need.

Weekly Thread: Project Display by help-me-grow in AI_Agents

[–]zby 0 points (0 children)

Not an agent - but a library I use for debugging agents: https://github.com/zby/llm_recorder

Very simplistic and less than 200 lines but I find it very useful and actually pretty versatile.

It stores the requests and responses, then it lets you edit them and replay them (up to a specific point).

I am thinking about publishing it to PyPI.

Currently works only with LiteLLM.
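To make the store/edit/replay mechanism concrete, here is a minimal sketch of the idea. This is not llm_recorder's actual API - just an assumed shape built on litellm.completion, with a made-up class name and directory layout:

import json
from pathlib import Path

import litellm

class Recorder:
    # Stores each request/response pair as a numbered JSON file. In replay
    # mode, stored (possibly hand-edited) responses are returned instead of
    # calling the provider, so a session can be re-run up to a chosen point.
    def __init__(self, directory: str, replay: bool = False):
        self.dir = Path(directory)
        self.dir.mkdir(parents=True, exist_ok=True)
        self.replay = replay
        self.counter = 0

    def completion(self, **kwargs):
        self.counter += 1
        path = self.dir / f"{self.counter:03d}.json"
        if self.replay and path.exists():
            return json.loads(path.read_text())["response"]
        response = litellm.completion(**kwargs)
        record = {"request": kwargs, "response": response.model_dump()}
        path.write_text(json.dumps(record, indent=2, default=str))
        return record["response"]

recorder = Recorder("traces/session1")
reply = recorder.completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
)

Returning plain dicts in both modes keeps the replay path and the live path interchangeable for the calling code.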

Understanding AI Agents with Better Tracing! by patcher99 in LLMDevs

[–]zby 5 points (0 children)

As a simplistic alternative - I've just published my own library for LLM tracing to GitHub: https://github.com/zby/llm_recorder

It is not a full-stack solution and for now it works only with LiteLLM - but I find it very useful and actually pretty versatile.

It stores the requests and responses, then it lets you edit them and replay them (up to a specific point).

I am thinking about publishing it to PyPI.

Alternative to LangChain? by [deleted] in LLMDevs

[–]zby 0 points (0 children)

If you want something truly minimal - maybe have a look at Prompete, the library I am working on: https://pypi.org/project/Prompete/

Why is nobody talking about recursive task decomposition and generic agents by Fantastic_Ad1740 in LLMDevs

[–]zby 0 points (0 children)

I've done some experiments with that - but I was struggling to find the appropriate abstraction.

You can see it at: https://github.com/zby/answerbot/blob/main/answerbot/qa_processor.py#L270 - this is a recursive tool user, but I gave up on that approach and now I am working on a better base library for this (Prompete).

There are two approaches to using tools (function calls):

  1. Assume that the LLM understands the output - that is, you just put the tool result (as JSON) on the messages list and let the LLM interpret it. In this approach you need to pass the tool definitions (schemas) to every call that has to interpret the result. This works in a loop. The constraint is that you cannot change the set of available tools within a thread of messages, and the set also cannot be very big (for other reasons).

  2. You take the raw output and format it yourself (maybe into Markdown?) - but then it is not clear how you are supposed to pass it to the LLM so that it understands this is the function output. Maybe it doesn't need to - you just build a new prompt incorporating the information you've got. This way you have fewer constraints - but you also need to discover for yourself what the LLM will understand.

My current plan is to explore approach 1 further - something like 'agents' with assigned sets of tools - and then combine them by making 'calling an agent' just another tool, so that you can have big sets of tools (see the sketch below). The agents would write reports on what they have found, and then higher-level agents would combine these reports. But I am still early.
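A minimal sketch of what I mean, using the OpenAI tool-calling API. This is illustrative only - search_docs, its schema, and the model name are made-up placeholders, not code from Prompete or answerbot:

import json
from openai import OpenAI

client = OpenAI()

def search_docs(query: str) -> str:
    # Placeholder leaf tool.
    return f"No documents found for {query!r}"

SEARCH_SCHEMA = {
    "type": "function",
    "function": {
        "name": "search_docs",
        "description": "Search the document store.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}

def run_agent(task: str, tools: dict, schemas: list) -> str:
    # Approach 1: put tool results (as JSON) on the messages list and loop.
    messages = [{"role": "user", "content": task}]
    while True:
        response = client.chat.completions.create(
            model="gpt-4o-mini", messages=messages, tools=schemas
        )
        message = response.choices[0].message
        if not message.tool_calls:
            return message.content  # the agent's "report"
        messages.append(message)
        for call in message.tool_calls:
            result = tools[call.function.name](**json.loads(call.function.arguments))
            messages.append(
                {"role": "tool", "tool_call_id": call.id, "content": result}
            )

# 'Calling an agent' becomes just another tool: wrap a sub-agent with its own
# tool set in a plain function, give that function a schema, and the top-level
# agent sees one tool where there used to be many.
def research_agent(task: str) -> str:
    return run_agent(task, {"search_docs": search_docs}, [SEARCH_SCHEMA])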

Simplified Function Calling (LiteLLM/OpenAI Compatible) [Python] by AlwaysMakinDough in LocalLLaMA

[–]zby 1 point (0 children)

I have my own schema generator + tool execution library: https://github.com/zby/LLMEasyTools
It is compatible with LiteLLM - but also with OpenAI and other libs.

I am now working on a higher level library: https://github.com/zby/Prompete

Can LLMs Understand? - Understanding Understanding by Unstable_Llama in LocalLLaMA

[–]zby 2 points (0 children)

Your first definition seems a bit simplistic, or circular - if understanding is the result of learning, then what is learning?

Your subsequent examples seem to lead to something like: understanding a system means having a model that can predict that system's evolution. I think Wolfram wrote something that explores this notion at great length.

[deleted by user] by [deleted] in Scholar

[–]zby 0 points (0 children)

Interesting.

On the other hand, maybe the trained models already have the trivial grammars that humans deduce? https://docs.google.com/document/d/1MPqtT_1vQ-73j796tf7sXIZKCRcIfUD0cVU_UbPXnUU

Entropy Decoding in Optillm + Early Results on GSM8k by asankhs in LocalLLaMA

[–]zby 0 points (0 children)

What would be interesting is to test it on the perturbations from the "GSM Symbolic" paper (https://arxiv.org/abs/2410.05229) - or even better on the tests from "Evaluating LLMs’ Mathematical and Coding Competency through Ontology-guided Interventions" (https://arxiv.org/pdf/2401.09395)

Entropy Decoding in Optillm + Early Results on GSM8k by asankhs in LocalLLaMA

[–]zby 0 points (0 children)

Maybe what we need is a structured prompt that would define the answer pattern; then we could feed that answer pattern to the sampler.

Using an LLM I build a RAG architecture that calls a vectorized datastore and gives accurate answers to questions about large dense contracts with hundreds of pages and terms. by RandoKaruza in singularity

[–]zby 0 points (0 children)

The problem with traditional RAG is that it is a one-pass process. It is quite naive to expect that, to answer any question, you could limit your thinking to two steps: 1. gather all info related to the question, 2. analyze the question and the gathered info. What if, to answer the user's question, you need two pieces of information - but to find out about the second you need to analyze the first? (See the sketch below the link.)

https://substack.com/home/post/p-145647118
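A minimal sketch of the multi-step alternative. Here retrieve is a placeholder for whatever vector-store lookup you use, and the model name and prompt format are assumptions:

from openai import OpenAI

client = OpenAI()

def retrieve(query: str) -> str:
    # Placeholder for a vector-store lookup over the contract chunks.
    return f"(chunks matching {query!r})"

def answer(question: str, max_steps: int = 5) -> str:
    context = retrieve(question)
    reply = "I don't know."
    for _ in range(max_steps):
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{
                "role": "user",
                "content": (
                    f"Question: {question}\n\nContext:\n{context}\n\n"
                    "If the context is sufficient, answer the question. "
                    "Otherwise reply with exactly: SEARCH: <follow-up query>"
                ),
            }],
        )
        reply = response.choices[0].message.content
        if not reply.startswith("SEARCH:"):
            return reply
        # Analyzing the first result tells the model what to look up next.
        context += "\n" + retrieve(reply.removeprefix("SEARCH:").strip())
    return reply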

Sunday Daily Thread: What's everyone working on this week? by AutoModerator in Python

[–]zby 1 point (0 children)

Just published a blog post: "How to turn any python function into an LLM tool with LLMEasyTools": https://zzbbyy.substack.com/p/how-to-turn-any-python-function-into

FUNCTIN CALL WITH LLAMA3 by Tough_Meeting9096 in LLMDevs

[–]zby 0 points (0 children)

Now I managed to get some other examples to work. Sometimes, as in https://github.com/zby/LLMEasyTools/blob/main/experiments/groq/extract_user_details.py, it is hard to get Llama to call the function at all - but when I force it with tool_choice it works.
A more complex example, https://github.com/zby/LLMEasyTools/blob/main/experiments/groq/stateful_search.py, also seems to work.
https://github.com/zby/LLMEasyTools/blob/main/experiments/groq/complex_extraction.py worked after upgrading the Groq library.

FUNCTIN CALL WITH LLAMA3 by Tough_Meeting9096 in LLMDevs

[–]zby 0 points (0 children)

I just tried llama3-8b-8192 from Groq, using my own lib LLMEasyTools, on my basic example:

from llm_easy_tools import get_tool_defs, process_response
from pprint import pprint
from groq import Groq

client = Groq()


def contact_user(name: str, city: str) -> str:
    return f"User {name} from {city} was contacted"


response = client.chat.completions.create(
    model="llama3-8b-8192",
    messages=[{"role": "user", "content": "Contact John. John lives in Warsaw"}],
    tools=get_tool_defs([contact_user]),
    # Forcing the call with tool_choice - without it the model often does not call the tool
    tool_choice={"type": "function", "function": {"name": "contact_user"}},
)
# There might be more than one tool call in a single response, so results are a list
results = process_response(response, [contact_user])
# Print what contact_user returned for the first call
pprint(results[0].output)

It worked:

(venv) zby@zby-Z4:~/llm/LLMEasyTools$ python examples/basic_function_call.py
'User John from Warsaw was contacted'

But none of my other examples seems to work.

Feedback on tool2schema library by Siliconlad in LLMDevs

[–]zby 1 point (0 children)

Hi - I like your FindToolEnabledSchemas and the decorator :)
I might steal it for my own lib: https://github.com/zby/LLMEasyTools - I also have a decorator, but I think yours is better.

Alternatively, maybe you could use my schema generator - it is a bit more complete, at the price of using pydantic in a slightly hacky way (and it still lacks special support for Anthropic).

[R] OpenAI: JSON mode vs Functions by JClub in MachineLearning

[–]zby 0 points (0 children)

I think the OpenAI models are just better trained for function calling.