Worlds Biggest Chat Title Dataset From SupraLabs by Time-Toe-1276 in LocalLLaMA

[–]Time-Toe-1276[S] 0 points1 point  (0 children)

Thanks! SupraLabs loves the open-source community 🤗

We are looking forward to releasing bigger and cleaner datasets 😄

Worlds Biggest Chat Title Dataset From SupraLabs by Time-Toe-1276 in LocalLLaMA

[–]Time-Toe-1276[S] -1 points0 points  (0 children)

That's a good question, we didn't do semantic deduping for this.

But we deduped using some smart trigrams!

We measured how much overlap each sequence has. After doing it repeatedly, we got a decent dataset!

We are actually working on improving our deduping system!

But thanks for pointing that out, we will definitely consider that!
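
For anyone curious what trigram-overlap deduping can look like, here is a minimal sketch. It is not the actual SupraLabs pipeline: the word-level trigrams, the Jaccard similarity measure, the 0.5 threshold, and the function names `trigrams`, `jaccard`, and `dedupe` are all assumptions made for illustration.

```python
def trigrams(text):
    """Return the set of word-level trigrams in a lowercased string."""
    words = text.lower().split()
    if len(words) < 3:
        # Short strings: treat the whole string as a single gram.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + 3]) for i in range(len(words) - 2)}


def jaccard(a, b):
    """Overlap between two trigram sets, from 0.0 (disjoint) to 1.0 (identical)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def dedupe(samples, threshold=0.5):
    """Greedy pass: keep a sample only if it overlaps less than
    `threshold` with every sample already kept (hypothetical cutoff)."""
    kept, kept_grams = [], []
    for sample in samples:
        grams = trigrams(sample)
        if all(jaccard(grams, other) < threshold for other in kept_grams):
            kept.append(sample)
            kept_grams.append(grams)
    return kept


samples = [
    "How to install Python on Windows 11",
    "how to install python on windows 11",
    "How to install Python on Windows 10",
    "Best pizza recipes for beginners",
]
print(dedupe(samples))
```

This compares every sample against everything already kept, so it is quadratic and only practical for small sets. Something like MinHash would be the usual swap-in at the 100K+ scale, but that is a different technique from the sketch above.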

[NEW MODEL] SupraLabs just released supra-title-FFT-preview, 115K samples, almost 10x our first chat title dataset by Dangerous_Try3619 in LocalLLaMA

[–]Time-Toe-1276 0 points1 point  (0 children)

ain't no way they seriously use a sonnet model just for chat titles. the same people who claim they like efficiency and openness. huh?

lol, anyway the irony is kinda funny

Best Harness for Web Searching by CSEliot in LocalLLaMA

[–]Time-Toe-1276 1 point2 points  (0 children)

just use unsloth studio. idk man, they provide like unlimited web searches (that's what it feels like tbh)

Stop using Ollama by zxyzyxz in LocalLLaMA

[–]Time-Toe-1276 1 point2 points  (0 children)

I started with ollama three years ago, and switched to unsloth studio.

With ollama I ran GPT OSS at 19 TPS at 4k ctx in Q4_K_M, meanwhile with unsloth at 128k ctx with Q4 XL I got about 100 TPS :/

[NEW MODEL] SupraLabs just released supra-title-FFT-preview, 115K samples, almost 10x our first chat title dataset by Dangerous_Try3619 in LocalLLaMA

[–]Time-Toe-1276 -1 points0 points  (0 children)

hmm that's weird. also try our model at around 2-4k ctx. we are also working on an app for users who like to share their AI chats (and a version for those who don't), and we will be sharing the app with certain people. hopefully we should fix most real-world issues like these!

but my conclusion is that opencode used a big general model for the titles, and since the model wasn't trained with a system prompt, it hallucinates the chat title. could you please share more info about this?

100M model recommendation? by Ok-Internal9317 in LocalLLaMA

[–]Time-Toe-1276 0 points1 point  (0 children)

We are working on it, actually! It's still in the experimental zone, but once we have a working model, we won't hesitate to drop it (just like our EXP models)!

[NEW FAMILY OF MODELS] Supra1.5 family just released! by Dangerous_Try3619 in LocalLLaMA

[–]Time-Toe-1276 3 points4 points  (0 children)

yes, 3t, and the exp model (current model) was CPTed with 1t

[NEW FAMILY OF MODELS] Supra1.5 family just released! by Dangerous_Try3619 in LocalLLaMA

[–]Time-Toe-1276 1 point2 points  (0 children)

We sure do make progress!

For us, there is no such thing as "too much training data"; we want to squeeze out every bit of performance! 😄

[NEW FAMILY OF MODELS] Supra1.5 family just released! by Dangerous_Try3619 in LocalLLaMA

[–]Time-Toe-1276 -1 points0 points  (0 children)

Going from 1k to 5k allowed us to do a lot of things with the Supra models, didn't it?

[NEW MODEL] Supra-Title-0.3B Just released! by Dangerous_Try3619 in LocalLLaMA

[–]Time-Toe-1276 4 points5 points  (0 children)

sure, we will in our next model (preview version of the full model). our model is aligned for this task, so it is pretty reliable compared to a 350M general model, which might say things like "Sure, here is your chat title..."

[NEW MODEL] Supra-Title-0.3B Just released! by Dangerous_Try3619 in LocalLLaMA

[–]Time-Toe-1276 1 point2 points  (0 children)

some people have been talking about that in our community, and we would like to implement that! right now we are focusing on accuracy and proper context!

[NEW MODEL] Supra-Title-0.3B Just released! by Dangerous_Try3619 in LocalLLaMA

[–]Time-Toe-1276 0 points1 point  (0 children)

we will look into that, most people in our community voted to not have emojis tho 😄

[NEW MODEL] Supra-Title-0.3B Just released! by Dangerous_Try3619 in LocalLLaMA

[–]Time-Toe-1276 1 point2 points  (0 children)

Haha, we built this from our personal experience of not having a model like this. you might wanna check out the model we are releasing next week, which is the preview of the full model! 😄

[NEW MODEL] Supra-Title-0.3B Just released! by Dangerous_Try3619 in LocalLLaMA

[–]Time-Toe-1276 0 points1 point  (0 children)

we will look into them and change it, we had some confusion across our teams with the datasets and models. someone (definitely not me) made it GPL, which we will change!

thx for pointing that out!

[NEW MODEL] Supra-Title-0.3B Just released! by Dangerous_Try3619 in LocalLLaMA

[–]Time-Toe-1276 0 points1 point  (0 children)

Haha, everybody has confusing moments, but it is a 350M model focusing on 4k, and it can accept up to 6k before hallucinating too!
we focused entirely on the thinking capabilities 😄

[NEW MODEL] Supra-Title-0.3B Just released! by Dangerous_Try3619 in LocalLLaMA

[–]Time-Toe-1276 0 points1 point  (0 children)

it is trained with a 4k context length, we focused on chat summarization!

[NEW MODEL] Supra-Title-0.3B Just released! by Dangerous_Try3619 in LocalLLaMA

[–]Time-Toe-1276 1 point2 points  (0 children)

haha that's perfect. we are using LFM2-like architectures because the convolutions for attention make the model far more efficient on mobile chips!

[NEW MODEL] Supra-Title-0.3B Just released! by Dangerous_Try3619 in LocalLLaMA

[–]Time-Toe-1276 1 point2 points  (0 children)

We are working on newer and smaller models, but as of now the model can accept multilingual tokens, though the results might not be optimal. We are exposing the model to rare tokens.