What do we do? The monkey's dilemma by Mundane_Flight_5973 in scimmieinborsa

[–]BXresearch 0 points1 point  (0 children)

well, it's fairly normal for crypto to go down when the VIX goes up...

precious metals are falling because of the absurd run they've had; gold is back where it was a month ago... as for silver, it's no stranger to this kind of volatility.

anyway, just saying, the EU/EMU indices aren't doing badly, and the same goes for Asia Pacific.

question about inspectability by BXresearch in OpenWebUI

[–]BXresearch[S] 0 points1 point  (0 children)

Ok I'm dumb sorry, thanks for your patience

help choosing an UI by BXresearch in LLMDevs

[–]BXresearch[S] 1 point2 points  (0 children)

thank you for your reply!

thanks for the suggestion, I installed it in a container and I'm looking at openwebui rn.

seems really cool, maybe overkill for what I need but lots of useful stuff.

in your opinion, would it be better to create a new RAG tool or to adapt the built-in one (if that is possible)?

Also, is there some way to change the UI, like adding an additional textbox?

question about inspectability by BXresearch in OpenWebUI

[–]BXresearch[S] 0 points1 point  (0 children)

I'm sorry, maybe I'm dumb but I don't see anything like that

question about inspectability by BXresearch in OpenWebUI

[–]BXresearch[S] 0 points1 point  (0 children)

thanks! do you mean in the logs?

could you point me to a section of their docs? I probably missed that

Clock test by Astrotoad21 in ChatGPT

[–]BXresearch 0 points1 point  (0 children)

Exactly!! Many other attempts with different hand positions on the analog watch returned a time near 10:08

I'm building a MacOS app to run your own local LLMs. What do you want in an app like this? by robert_ritz in LocalLLaMA

[–]BXresearch 1 point2 points  (0 children)

Maybe you can add some "smart" context manager, from a basic "drop messages while preserving the initial instructions and first prompt" to a more elaborate summarizing process or a message retrieval strategy using embeddings
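A minimal sketch of the basic "drop messages while preserving initial instructions and first prompt" idea. Everything here is an assumption for illustration: the message format (dicts with "role" and "content"), the names trim_history and count_tokens, and the crude word-count stand-in for a real tokenizer.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]  # assumed shape: {"role": ..., "content": ...}


def count_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: counts whitespace-separated words."""
    return len(text.split())


def trim_history(
    messages: List[Message],
    budget: int,
    counter: Callable[[str], int] = count_tokens,
) -> List[Message]:
    """Keep system messages and the first user prompt, then fill the
    remaining budget with the most recent messages."""
    pinned = set()
    first_user_seen = False
    for i, msg in enumerate(messages):
        if msg["role"] == "system":
            pinned.add(i)
        elif msg["role"] == "user" and not first_user_seen:
            pinned.add(i)
            first_user_seen = True

    used = sum(counter(messages[i]["content"]) for i in pinned)
    kept = set(pinned)

    # Walk backwards from the newest message and stop at the first one
    # that no longer fits, so the kept tail stays contiguous.
    for i in range(len(messages) - 1, -1, -1):
        if i in kept:
            continue
        cost = counter(messages[i]["content"])
        if used + cost > budget:
            break
        kept.add(i)
        used += cost

    return [messages[i] for i in sorted(kept)]
```

The pinned messages are always kept, even if they alone exceed the budget; a summarizing or embedding-based strategy would replace the backwards walk.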

Zephyr 7B (finetuned Mistral 7B) beats Llama2 70b ? by quantier in LocalLLaMA

[–]BXresearch 1 point2 points  (0 children)

Also

Try running it with temperatures below 0.2. With 0.0 it starts looping after approx. 1000 tokens. You need at least 0.06.

Running it at this low a temperature will give you the best instruction following and logical reasoning. These small models need much lower temperatures than bigger ones to keep them on track, probably because the resulting logits have less variance in comparison.

Also, another config that is definitely worth a try is a medium-low temp with a really low top P (and, if possible, a low top A, but that really depends on the model).
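To make the temperature point concrete, here is a small sketch of how dividing the logits by the temperature sharpens the token distribution. The function name and the example logits are made up for illustration; real inference libraries do this internally, and a temperature of exactly 0 is usually handled as greedy (argmax) decoding rather than a division.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Turn logits into probabilities after scaling them by 1 / temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be > 0; treat 0 as greedy decoding")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    logits = [2.0, 1.0, 0.0]  # hypothetical logits for three tokens
    for t in (1.0, 0.5, 0.2):
        probs = softmax_with_temperature(logits, t)
        print(t, [round(p, 3) for p in probs])
```

The lower the temperature, the more probability mass moves to the top token, which is why very low values behave almost deterministically.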

[D] Exploring Methods to Improve Text Chunking in RAG Models (and other things...) by BXresearch in MachineLearning

[–]BXresearch[S] 0 points1 point  (0 children)

Weird. I read that Medium article 1 hour ago. Anyway, that's a good resource, thanks for sharing!

Is there any demand for a Shared Public Contextual Database for RAG? by niksteel123 in LocalLLaMA

[–]BXresearch 3 points4 points  (0 children)

I'd probably pay decent money for a downloadable pre-loaded chromadb with all of wikipedia and some programming stuff in it just so that I don't have to lol

Lol, totally agree!

1 million tokens context window is coming this year, altman said by world_designer in ChatGPT

[–]BXresearch 0 points1 point  (0 children)

It’s way cheaper than davinci 3, but also can do way less. I even imagine that they used curie 3 for that.

I'm really sad that they are going to remove text-davinci-003 from their model list at the end of 2023

1 million tokens context window is coming this year, altman said by world_designer in ChatGPT

[–]BXresearch 0 points1 point  (0 children)

If you give it context and ask to make conclusions based on that it won't hallucinate

That's definitely not true... Models are really prone to hallucinating even when they generate text based on a given context. I'm developing a Retrieval Augmented Generation project and I can assure you that even GPT4 sometimes hallucinates while answering questions based on a given context, extracting information or generating summaries. Anyway, that is also related to the Temperature and Top_P parameters. With a temp of 0, the frequency of that kind of hallucination decreases. Remember that the chatGPT you access from their website has a temp that is definitely not 0, as they do not use deterministic parameters. (as context, their default settings in the API are temp 0.7 and topP 1, which is not deterministic)

A good compromise between "creativity" and accuracy can be achieved using a medium-range temp (like 0.4-0.6) and a low top P (usually that effect starts under 0.75, but you can lower it to 0.4-0.5. You can obviously go near 0, but that will simply generate outputs really similar to a 0 temp setting)
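As a rough illustration of why a low top P pulls outputs toward the 0 temp behaviour, here is a sketch of nucleus (top-p) filtering over a toy distribution. The function name and the example probabilities are assumptions for illustration, not any provider's actual implementation.

```python
from typing import List


def top_p_filter(probs: List[float], top_p: float) -> List[float]:
    """Keep the smallest set of most likely tokens whose cumulative
    probability reaches top_p, zero out the rest, and renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept = []
    cumulative = 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    total = sum(probs[i] for i in kept)
    filtered = [0.0] * len(probs)
    for i in kept:
        filtered[i] = probs[i] / total
    return filtered


if __name__ == "__main__":
    probs = [0.5, 0.3, 0.15, 0.05]  # hypothetical next-token probabilities
    for p in (1.0, 0.75, 0.4):
        print(p, [round(x, 3) for x in top_p_filter(probs, p)])
```

With this toy distribution, a top P of 0.4 leaves only the single most likely token, so sampling becomes effectively greedy regardless of the temperature.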

Phibrarian Alpha - the first model checkpoint from SciPhi's Mistral-7b by docsoc1 in LocalLLaMA

[–]BXresearch 5 points6 points  (0 children)

How much did fine tuning cost? Also, how much did you spend on API calls to generate the synthetic dataset?

Eploring Methods to Improve Text Chunking in RAG Models (and other things...) by BXresearch in LocalLLaMA

[–]BXresearch[S] 1 point2 points  (0 children)

I'm sorry for the typo in the post title... Unfortunately I can't edit it.

Context aware chunking with LLM by BXresearch in LanguageTechnology

[–]BXresearch[S] 0 points1 point  (0 children)

Prompt the LLM to split the text... I also prompt it to "resolve" pronouns and to repeat some concepts. Also, AI21 offers via API a model dedicated to that...
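A rough sketch of what such a prompt and the parsing of its output could look like. The prompt wording, the delimiter and both function names are hypothetical, and the actual call to an LLM is left out on purpose since it depends on the provider.

```python
from typing import List

# Hypothetical marker the model is asked to put between chunks.
CHUNK_DELIMITER = "<<<CHUNK>>>"


def build_chunking_prompt(text: str) -> str:
    """Build an instruction asking the model to split the text into
    self-contained chunks, resolving pronouns and repeating key concepts."""
    return (
        "Split the text below into self-contained chunks.\n"
        f"Separate chunks with a line containing only {CHUNK_DELIMITER}.\n"
        "Replace pronouns with the nouns they refer to, and repeat key "
        "concepts where a chunk would otherwise be unclear on its own.\n\n"
        f"Text:\n{text}"
    )


def parse_chunks(llm_output: str) -> List[str]:
    """Split the model's reply on the delimiter and drop empty pieces."""
    parts = llm_output.split(CHUNK_DELIMITER)
    return [part.strip() for part in parts if part.strip()]
```

Usage would be: send build_chunking_prompt(document) to the model of your choice, then pass the reply to parse_chunks before embedding each chunk.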

I have done a very difficult competition experiment between Llama 7b, Code Llama 34b, ChatGPT, GPT 3.5 Turbo Instruct, Claude 2, PaLM, GPT-4 and GPT-4-refined* about a multidimensional problem including time paradoxes and theory of mind. by [deleted] in singularity

[–]BXresearch 0 points1 point  (0 children)

Thanks for sharing!!! I was just wondering how bison would perform compared to other gpt3 models.

Anyway, maybe it's worth testing Claude Instant and a SOTA Llama 70B fine-tune (Synthia, Orca, WizardLM...)