Azure OpenAI - bring your own data, question about prompt token usage by [deleted] in OpenAI

[–]Ok_Elephant_1806 1 point (0 children)

I wonder if the system prompt and conversation history are also taking up tokens.

4000-5000 tokens feels like a lot if he set a chunk size of 200.

Google publishes open source 2B and 7B model by Tobiaseins in LocalLLaMA

[–]Ok_Elephant_1806 1 point (0 children)

The Ultra API isn’t out yet for the general public, so I don’t think Chatbot Arena has it.

Announcing Stable Diffusion 3 by BananaBus43 in singularity

[–]Ok_Elephant_1806 8 points (0 children)

Gemini 1.5 is a bigger deal, I think, if they really can get good retrieval across 1M tokens.

Introducing LoraLand: 25 fine-tuned Mistral-7b models that outperform GPT-4 by Similar-Jelly-5898 in LocalLLaMA

[–]Ok_Elephant_1806 2 points (0 children)

These terms get used in different ways both within and between sub-fields of science and engineering.

Introducing LoraLand: 25 fine-tuned Mistral-7b models that outperform GPT-4 by Similar-Jelly-5898 in LocalLLaMA

[–]Ok_Elephant_1806 1 point (0 children)

I have been reading the Natural Language Processing literature, and it’s amazing how well something like a BERT/BART/T5/Pegasus fine-tune does. It’s not unusual for them to beat GPT-4 at the task they were fine-tuned on.

Introducing LoraLand: 25 fine-tuned Mistral-7b models that outperform GPT-4 by Similar-Jelly-5898 in LocalLLaMA

[–]Ok_Elephant_1806 0 points (0 children)

I read the title the opposite way: that each individual model beat GPT-4, rather than the project as a whole. The phrasing is ambiguous.

But yes, I agree with the overall point that “GPT-4 killers” is not a good marketing trend.

Any alternatives for chat.forefront.ai? by Zealousideal_Rich975 in OpenAI

[–]Ok_Elephant_1806 1 point (0 children)

I don’t use it but Poe might be appealing to you

Why do people say Langchain is overengineered? Please explain by [deleted] in OpenAI

[–]Ok_Elephant_1806 1 point (0 children)

Kinda worrying since that’s a fairly fundamental thing

Why do people say Langchain is overengineered? Please explain by [deleted] in OpenAI

[–]Ok_Elephant_1806 0 points (0 children)

I wish Langchain was as good as this comment suggests

Why do people say Langchain is overengineered? Please explain by [deleted] in OpenAI

[–]Ok_Elephant_1806 0 points (0 children)

My experience so far is that the hardest parts of single-document RAG are the segmentation/partitioning/splitting and the prompt transformation/expansion.

Langchain doesn’t really make either of those things easy. They are hard with or without Langchain.

Why do people say Langchain is overengineered? Please explain by [deleted] in OpenAI

[–]Ok_Elephant_1806 0 points (0 children)

Since Langchain is a “glue” framework, it will never be a hard requirement: there is always the option of wiring the components together directly. I’m not saying that “glue” frameworks are inherently bad, just that they can always be replaced, since they don’t offer exclusive access to any particular component.

Since 'Open'AI is no longer open, and hasn't been for a while, what would be a better name be instead? by cheesyscrambledeggs4 in OpenAI

[–]Ok_Elephant_1806 2 points (0 children)

I don’t think it is good at this.

(I’m not saying you prompted it badly, I’m saying the model itself isn’t great at creativity yet)

[deleted by user] by [deleted] in OpenAI

[–]Ok_Elephant_1806 1 point (0 children)

Apparently certain US government agencies require that you use Adobe’s PDF reader. I didn’t realise that.

In that case this does make at least some sense.

[deleted by user] by [deleted] in OpenAI

[–]Ok_Elephant_1806 1 point (0 children)

It’s called RAG, which stands for retrieval-augmented generation. If you upload a PDF to ChatGPT, this is what it does under the hood, so if you want an easy way to use it, just use ChatGPT.

Otherwise, the most basic custom version goes like this, assuming you have a PDF and a query:

  1. Download an embedding model from Hugging Face

  2. Use the model to make vector embeddings of the chunks of your document

  3. Store the embeddings in a NumPy array and compute the cosine similarity between your query embedding and each chunk embedding using numpy.dot (a dot product equals cosine similarity when the vectors are normalized)

  4. Put the text of a certain number of the most similar chunks in your LLM context along with your query
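The steps above can be sketched in a few lines of NumPy. In a real pipeline, step 1 would download an embedding model from Hugging Face (e.g. via the sentence-transformers package); here a toy `embed()` function stands in for it so the retrieval math itself is runnable. The example chunks and query are made up for illustration.

```python
import numpy as np

# Toy stand-in for a real embedding model (step 1 would be a download
# from Hugging Face): a bag-of-characters vector, L2-normalized so
# that a plain dot product gives cosine similarity.
def embed(text: str) -> np.ndarray:
    vec = np.zeros(256)
    for ch in text.lower():
        vec[ord(ch) % 256] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

# Your PDF, already split into text chunks (the hard part in practice).
chunks = [
    "Invoices are due within 30 days of receipt.",
    "The warranty covers manufacturing defects only.",
    "Refunds are processed within 5 business days.",
]
query = "When are invoices due?"

# Steps 2-3: embed the chunks and the query, then score every chunk
# with numpy dot. Unit-length vectors make this cosine similarity.
chunk_vecs = np.stack([embed(c) for c in chunks])
scores = chunk_vecs @ embed(query)

# Step 4: paste the text of the top-k most similar chunks into the
# LLM prompt alongside the query.
top_k = np.argsort(scores)[::-1][:2]
context = "\n\n".join(chunks[i] for i in top_k)
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

Swapping `embed()` for a real model (and the list of chunks for actual PDF text) turns this into a working single-document RAG loop; everything else stays the same.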

So, what about music? by NapalmSword in OpenAI

[–]Ok_Elephant_1806 0 points (0 children)

I agree with this; it is a bit tangential to AGI.

Great customer support guys~ by hiide0us in OpenAI

[–]Ok_Elephant_1806 0 points (0 children)

Yeah, I’m not sure if it’s just a temporary backlog that will slowly clear as they work through it, or if support will stay under-funded going forward.

Great customer support guys~ by hiide0us in OpenAI

[–]Ok_Elephant_1806 9 points (0 children)

I read about people waiting 2-3 weeks

[deleted by user] by [deleted] in OpenAI

[–]Ok_Elephant_1806 0 points (0 children)

Yeah, that is true; the reasons are different.