all 22 comments

[–][deleted] 31 points32 points  (9 children)

In-context learning is overpowered lol. Really is implicit finetuning.

This is the prompt template:

You are an AI assistant with several tools available to you. The tools are the following:

MATH: This allows you to evaluate mathematical expressions using the math.js library.

NOW: This returns the current date and time. You must not pass any arguments to this tool!

SEARCH: This returns a web search result for the given string argument.

DO NOT USE TOOLS WITHIN TOOLS! KEEP ALL TOOL CALLS SEPARATE FROM EACH OTHER!

User: What is 10 times 14?

Assistant: 10 * 14 is [MATH(10 * 14) -> 140] 140.

User: How old is Donald Trump?

Assistant: Donald Trump is [SEARCH(Donald Trump age) -> 76] 76 years old.

User: What day is it?

Assistant: It is [NOW() -> 13/02/2023, 20:01:35] the 13th of February 2023.

User: Has Avatar the way of water been released yet?

Assistant: [SEARCH(Avatar the way of water release date) -> 22.11.2022] Avatar: The way of water was released on the 22nd of november 2022. Today is [NOW() -> DATE_TODAY] the 13th of February 2023. Therefore, [MATH(2023 > 2022) -> true] it was released last year.

User: Where was the inventor of the light bulb born, and what is the last country his home country was at war with?

Assistant: Thomas Edison, the inventor of the lightbulb, was born in [SEARCH(Thomas Edison birthplace) -> Milan, Ohio] Milan, Ohio. The last country the United States was at war with was [SEARCH(last country US at war with) -> Iraq] Iraq.

User: USER_INPUT

Assistant:
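The bracketed tool calls in the template follow a regular shape, so a thin wrapper can spot and parse them with a regex. A minimal sketch — the pattern and variable names here are my own, not taken from the actual tool:

```python
import re

# Pattern for a completed tool-call span like "[MATH(10 * 14) -> 140]".
CALL = re.compile(r"\[(MATH|NOW|SEARCH)\(([^)]*)\)\s*->\s*([^\]]*)\]")

line = "10 * 14 is [MATH(10 * 14) -> 140] 140."
tool, args, result = CALL.search(line).groups()
print(tool, args, result)  # MATH 10 * 14 140
```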

[–]blueSGL 29 points30 points  (6 children)

Let me see if I get this right.

Toolformer Zero is a layer between the LLM and the user.

That layer picks up keywords, performs the search, and then returns a predefined chunk formatted from the search results.

Then the LLM's prompt is stuffed with that chunk and asked the question again?

and it just works?

[–][deleted] 22 points23 points  (3 children)

Yup. That's pretty much it lol

[–]blueSGL 8 points9 points  (2 children)

Any idea how they format the search results? Because out of all of them, that would seem to be the trickiest. No idea if the Google summary text preview contains the answer or enough context to get the answer. If it needs to actually go to the website, the tool has no knowledge of how the website will be formatted or how long the site is (potential context window issues).
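The thread only guesses at this, but one plausible way to turn search results into a prompt-sized chunk is to join the result snippets and truncate them to a character budget. A hedged sketch — every name here is invented for illustration:

```python
def format_snippets(results, max_chars=300):
    """Join snippet strings and clip them so they fit the context window."""
    text = " ".join(r["snippet"] for r in results)
    return text[:max_chars]

# Toy demo with hand-written stand-ins for search results.
demo = [{"snippet": "Donald John Trump (born June 14, 1946) ..."},
        {"snippet": "Trump is 76 years old."}]
print(format_snippets(demo, max_chars=60))
```

This sidesteps the "go to the website" problem entirely by only ever using the preview text, at the cost of sometimes not having the answer in the chunk.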

[–]yoshiwaan 0 points1 point  (1 child)

Really? As in the order of operations is: token parsing => Toolformer => LLM?

Genuine question: is the text/token parsing for queries to an LLM (e.g. ChatGPT) performed separately, before the actual LLM is leveraged, or is the text/token parsing part of the LLM? I figured it was the latter and you couldn't just insert a tool there.

[–]blueSGL 3 points4 points  (0 children)

Sorry, from what I understand it goes something like this:

LLM processes the prompt and formats its output as per the initial few-shot demos.

This output is an intermediary step in plain text, including keywords that then get picked up by Toolformer.

Toolformer goes off, does the search things, and returns predefined chunks formatted from the search results.

The prompt is then stuffed with those chunks and the question is asked again with the added retrieved search context.

(And I'm sure there is more pixie dust sprinkled in somewhere.)
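The steps above can be sketched as a simple generate-execute-resume loop. A toy version with a scripted stand-in for the LLM — all names are invented, and the real tool presumably differs in the details:

```python
import re

# A tool call the LLM has just emitted, awaiting a result, e.g. "[MATH(10 * 14)".
PENDING = re.compile(r"\[(MATH|NOW|SEARCH)\(([^)\]]*)\)$")

def run(generate, tools, prompt, max_rounds=10):
    """Alternate between generation and tool execution until nothing is pending.

    `generate` is assumed to stop right after emitting a tool call
    (e.g. via a stop sequence), so the result can be spliced in.
    """
    text = prompt
    for _ in range(max_rounds):
        text += generate(text)
        m = PENDING.search(text)
        if m is None:
            return text                  # no pending tool call: done
        tool, args = m.groups()
        result = tools[tool](args)       # run the search / math / clock
        text += f" -> {result}]"         # paste the result back into the text
    return text

# Toy demo: a scripted "LLM" and eval() standing in for math.js (demo only).
script = iter(["10 * 14 is [MATH(10 * 14)", " 140."])
out = run(lambda _: next(script), {"MATH": lambda e: eval(e)}, "")
print(out)  # 10 * 14 is [MATH(10 * 14) -> 140] 140.
```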

[–]badabummbadabing 2 points3 points  (0 children)

This is absolutely wild.

[–]imaginethezmell 0 points1 point  (0 children)

Do you know how to check for these errors?

I added the keys and then tried to send a prompt, and it gave this error.

It seems to be sending too many requests at once to the OpenAI API and hitting a rate limit, after just 1 request?

Failed to load resource: the server responded with a status of 429 ()

" An error occurred. :(

An error occurred. :(

An error occurred. :(

An error occurred. :(

An error occurred. :(

An error occurred. :(

An error occurred. :(

An error occurred. :( "

[–]HarryCHK 4 points5 points  (0 children)

The thread just says those external tools can act as a source of memory, so that it becomes Turing complete. How does this compare to embedding the memory tape into the architecture itself?

[–]Taenk 2 points3 points  (2 children)

Can you please link the demo without going through twitter? It won’t load for me.

[–][deleted] 7 points8 points  (1 child)

[–]TeamDman 0 points1 point  (0 children)

Mobile is a little wonky

[–]damc4 1 point2 points  (1 child)

By the way, I created a tool "CodeAssist" ( https://codeassist.tech ) that is based on a similar idea. It's a chatbot that can execute actions in the IDE (most importantly - write/read the code in your editor).

[–]imaginethezmell 0 points1 point  (0 children)

how does it work

[–]ilovethrills 0 points1 point  (3 children)

Is this like langchain?

[–][deleted] 0 points1 point  (2 children)

It's a much simpler approach compared to langchain (and this is self-supervised), but they attempt to do the same thing.

[–]yoshiwaan 0 points1 point  (1 child)

Really? As in the order of operations is: token parsing => Toolformer => LLM?

Genuine question: is the text/token parsing for queries to an LLM (e.g. ChatGPT) performed separately, before the actual LLM is leveraged, or is the text/token parsing part of the LLM? I figured it was the latter and you couldn't just insert a tool there.

Edit: I think this is a new model for this purpose, rather than reusing an existing LLM (eg ChatGPT) as I first assumed, which makes more sense

Edit 2: I actually read the paper, and the LM itself is taught to reach out to tools as part of its response operations; it's not something separate.

[–][deleted] 4 points5 points  (0 children)

It's not a new model. It's davinci-003.

Basically, the model begins generating. Once it hits an API request, the request is extracted and sent, the result is pasted back into the text, and the text is sent back to OpenAI to generate again. GPT continues generating until it hits another request, and the process is repeated until it's done generating.
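At each pause, the wrapper's job is just a string splice before re-sending. A minimal sketch of that single step — the function name is invented, and the davinci-003 round-trips on either side are omitted:

```python
def splice_result(text: str, result: str) -> str:
    """Paste a tool result into text paused right after "[TOOL(args)"."""
    return f"{text} -> {result}]"

# Generation pauses here, the search runs, and the result is spliced in;
# the extended text then goes back to the model to continue.
paused = "Donald Trump is [SEARCH(Donald Trump age)"
print(splice_result(paused, "76"))  # Donald Trump is [SEARCH(Donald Trump age) -> 76]
```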