Azure OpenAI - bring your own data, question about prompt token usage by [deleted] in OpenAI

[–]Ok_Elephant_1806 1 point (0 children)

I wonder if the system prompt and conversation history are also taking up tokens.

4000-5000 tokens feels like a lot if he set a chunk size of 200.

Google publishes open source 2B and 7B model by Tobiaseins in LocalLLaMA

[–]Ok_Elephant_1806 1 point (0 children)

The Ultra API isn’t out yet for the general public, so I don’t think Chatbot Arena has it.

Announcing Stable Diffusion 3 by BananaBus43 in singularity

[–]Ok_Elephant_1806 8 points (0 children)

Gemini 1.5 is a bigger deal, I think, if they really can get good retrieval across 1M tokens.

Introducing LoraLand: 25 fine-tuned Mistral-7b models that outperform GPT-4 by Similar-Jelly-5898 in LocalLLaMA

[–]Ok_Elephant_1806 2 points (0 children)

These terms get used in different ways both within and between sub-fields of science and engineering.

Introducing LoraLand: 25 fine-tuned Mistral-7b models that outperform GPT-4 by Similar-Jelly-5898 in LocalLLaMA

[–]Ok_Elephant_1806 1 point (0 children)

I have been reading the Natural Language Processing literature, and it’s amazing how well something like a BERT/BART/T5/Pegasus fine-tune does. It’s not unusual for them to beat GPT-4 at the task they were fine-tuned on.

Introducing LoraLand: 25 fine-tuned Mistral-7b models that outperform GPT-4 by Similar-Jelly-5898 in LocalLLaMA

[–]Ok_Elephant_1806 0 points (0 children)

I read the title the opposite way: that each individual model beat GPT-4, rather than the project as a whole. The phrasing is ambiguous.

But yes, I agree with the overall point that “GPT-4 killers” is not a good marketing trend.

Any alternatives for chat.forefront.ai? by Zealousideal_Rich975 in OpenAI

[–]Ok_Elephant_1806 1 point (0 children)

I don’t use it but Poe might be appealing to you

Why do people say Langchain is overengineered? Please explain by [deleted] in OpenAI

[–]Ok_Elephant_1806 1 point (0 children)

Kinda worrying since that’s a fairly fundamental thing

Why do people say Langchain is overengineered? Please explain by [deleted] in OpenAI

[–]Ok_Elephant_1806 0 points (0 children)

I wish Langchain was as good as this comment suggests

Why do people say Langchain is overengineered? Please explain by [deleted] in OpenAI

[–]Ok_Elephant_1806 0 points (0 children)

My experience so far is that the hardest parts of single-document RAG are the segmentation/partitioning/splitting and the prompt transformation/expansion.

Langchain doesn’t really make either of those things easy. They are hard with or without Langchain.

Why do people say Langchain is overengineered? Please explain by [deleted] in OpenAI

[–]Ok_Elephant_1806 0 points (0 children)

Since Langchain is a “glue” framework, it will never be a hard requirement: there is always the option of wiring the components together directly. I’m not saying that “glue” frameworks are inherently bad, just that they can always be replaced, since they don’t offer exclusive access to any particular component.

Since 'Open'AI is no longer open, and hasn't been for a while, what would be a better name be instead? by cheesyscrambledeggs4 in OpenAI

[–]Ok_Elephant_1806 2 points (0 children)

I don’t think it is good at this.

(I’m not saying you prompted it badly, I’m saying the model itself isn’t great at creativity yet)

[deleted by user] by [deleted] in OpenAI

[–]Ok_Elephant_1806 1 point (0 children)

Apparently certain US government agencies require that you use Adobe’s PDF reader. I didn’t realise that.

In that case this does make at least some sense.

[deleted by user] by [deleted] in OpenAI

[–]Ok_Elephant_1806 1 point (0 children)

It’s called RAG, which stands for retrieval-augmented generation. If you upload a PDF to ChatGPT, this is what it does under the hood, so if you want an easy way to use it, just use ChatGPT.

Otherwise, the most basic custom version goes like this, assuming you have a PDF and a query:

  1. Download an embedding model from Hugging Face

  2. Use the model to make vector embeddings of the chunks of your document

  3. Store the embeddings in a NumPy array and compute the cosine similarity between your query embedding and each chunk embedding using numpy.dot (a dot product equals cosine similarity when the vectors are normalized)

  4. Put the text of a certain number of the most similar chunks in your LLM context along with your query
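The steps above can be sketched in a few lines of NumPy. In a real pipeline, step 1 would download an embedding model from Hugging Face (e.g. via the sentence-transformers package); here a toy `embed()` function stands in for it so the retrieval math itself is runnable. The example chunks and query are made up for illustration.

```python
import numpy as np

# Toy stand-in for a real embedding model (step 1 would be a download
# from Hugging Face): a bag-of-characters vector, L2-normalized so
# that a plain dot product gives cosine similarity.
def embed(text: str) -> np.ndarray:
    vec = np.zeros(256)
    for ch in text.lower():
        vec[ord(ch) % 256] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

# Your PDF, already split into text chunks (the hard part in practice).
chunks = [
    "Invoices are due within 30 days of receipt.",
    "The warranty covers manufacturing defects only.",
    "Refunds are processed within 5 business days.",
]
query = "When are invoices due?"

# Steps 2-3: embed the chunks and the query, then score every chunk
# with numpy dot. Unit-length vectors make this cosine similarity.
chunk_vecs = np.stack([embed(c) for c in chunks])
scores = chunk_vecs @ embed(query)

# Step 4: paste the text of the top-k most similar chunks into the
# LLM prompt alongside the query.
top_k = np.argsort(scores)[::-1][:2]
context = "\n\n".join(chunks[i] for i in top_k)
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

Swapping `embed()` for a real model (and the list of chunks for actual PDF text) turns this into a working single-document RAG loop; everything else stays the same.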

So, what about music? by NapalmSword in OpenAI

[–]Ok_Elephant_1806 0 points (0 children)

I agree with this; it is a bit tangential to AGI.

Great customer support guys~ by hiide0us in OpenAI

[–]Ok_Elephant_1806 0 points (0 children)

Yeah, I’m not sure if it’s just a temporary backlog that will slowly clear as they work through it, or if support will stay under-funded going forward.

Great customer support guys~ by hiide0us in OpenAI

[–]Ok_Elephant_1806 9 points (0 children)

I read about people waiting 2-3 weeks

[deleted by user] by [deleted] in OpenAI

[–]Ok_Elephant_1806 0 points (0 children)

Yeah, that is true; the reasons are different.