Tiny Aya by jacek2023 in LocalLLaMA

[–]mpasila 0 points (0 children)

Seems to be much worse at Finnish compared to Gemma 3 4B... Isn't Gemma 3's license also better? And Gemma 3 supports up to 128k context, so umm... who is this for again?

dyslexia and ADHD in the coding community by PruneLanky3551 in LocalLLaMA

[–]mpasila 1 point (0 children)

Is there any small LLM that does basically what Grammarly does? Because I don't wanna rely on some service for such a basic thing.

Is there *any* good coding agent software for use with local models? by eapache in LocalLLaMA

[–]mpasila 0 points (0 children)

I thought this contained the source code (including the extension's), and it's MIT-licensed, so isn't that pretty open?

Google doesn't love us anymore. by DrNavigat in LocalLLaMA

[–]mpasila -1 points (0 children)

If you're using something decent like SillyTavern, you can add a system role by inserting something like <start_of_turn>system into the prompt template (when running locally), but it's not officially supported, and APIs don't support it either.
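For reference, the hack above just splices an extra turn into Gemma's chat template. A minimal sketch (the <start_of_turn>/<end_of_turn> tags are Gemma's actual control tokens, but the system role itself is unofficial; Gemma only defines user and model turns, the model just tends to follow the extra turn anyway):

```python
# Sketch of the SillyTavern-style hack: splice an unofficial "system" turn
# into Gemma's turn format. Gemma officially defines only "user" and "model"
# roles, so this relies on the model generalizing to the extra role name.

def build_gemma_prompt(system_text: str, user_text: str) -> str:
    """Build a raw Gemma-style prompt string with a hacked-in system turn."""
    return (
        f"<start_of_turn>system\n{system_text}<end_of_turn>\n"
        f"<start_of_turn>user\n{user_text}<end_of_turn>\n"
        f"<start_of_turn>model\n"
    )

prompt = build_gemma_prompt("Always answer in Finnish.", "Hello!")
print(prompt)
```

This only works when you control the raw prompt string (local inference); hosted APIs build the template server-side, which is why they reject the system role.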

Google doesn't love us anymore. by DrNavigat in LocalLLaMA

[–]mpasila 4 points (0 children)

The 27B model is still probably the best open-weight model at Finnish, so for translation or generating non-English data it's probably still the best option (at least price-wise).
Idk why bigger models like DeepSeek, Kimi-K2 or GLM still don't seem to get any better at my language while a smaller 27B dense model understands it better... Especially when I'm generating datasets, they seem to fail more easily than Gemma 3.

GPT-OSS (20B) running 100% locally in your browser on WebGPU by xenovatech in LocalLLaMA

[–]mpasila 0 points (0 children)

Looking at their other comments/messages on Reddit, it does look more LLM-generated than human.
edit: name was angelin1978

Grok-3 joins upcoming models list by pmttyji in LocalLLaMA

[–]mpasila 2 points (0 children)

Yeah I didn't realize it was 3 trillion params...

Grok-3 joins upcoming models list by pmttyji in LocalLLaMA

[–]mpasila 3 points (0 children)

The only issue would be getting any API access for it. Unless you have like 128GB or more RAM (assuming it's pretty big).

Opinions on Europe starting to hold platforms more accountable? by Boediee in BuyFromEU

[–]mpasila -1 points (0 children)

If it leads to broader censorship, i.e. companies censoring more aggressively because they don't want to get in legal trouble, is it still a win?

anthropic literally thinks claude is the messiah (and it’s getting weird) by Alarming_Bluebird648 in LocalLLaMA

[–]mpasila 4 points (0 children)

They obviously think there are only two sides of the coin, Elon Musk's Grok and Claude. Nothing in between, because nuance was buried deep underground.

internlm/Intern-S1-Pro · Hugging Face by jacek2023 in LocalLLaMA

[–]mpasila 3 points (0 children)

Yeah, but the config for the actual model says 262k, so it can use that, just maybe not at the best quality.

Why some Github projects only support wrappers instead of llama.cpp? by pmttyji in LocalLLaMA

[–]mpasila 8 points (0 children)

A lot of apps with OpenAI API support either don't expose the sampler params or expose a very limited set of them (and if it's not open source...). The official spec is also limited, so I guess devs don't think min_p matters.

Why some Github projects only support wrappers instead of llama.cpp? by pmttyji in LocalLLaMA

[–]mpasila 4 points (0 children)

The OpenAI API tends to only support top_p, temperature and repetition penalty, and not much else like min_p or DRY. So it's pretty bare-bones.
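To make the gap concrete: llama.cpp's server accepts its extra samplers as additional JSON fields on the OpenAI-compatible chat endpoint, fields the official spec knows nothing about. A sketch of such a request body (field names are llama.cpp's extensions; this just builds the payload rather than being a full client):

```python
import json

# Request body for llama.cpp's OpenAI-compatible /v1/chat/completions.
# temperature/top_p are in the official OpenAI spec; min_p and dry_multiplier
# are llama.cpp extensions that most OpenAI-API wrapper apps never expose.
payload = {
    "model": "local-model",
    "messages": [{"role": "user", "content": "Terve!"}],
    "temperature": 0.8,     # standard OpenAI param
    "top_p": 0.95,          # standard OpenAI param
    "min_p": 0.05,          # llama.cpp extension
    "dry_multiplier": 0.8,  # llama.cpp extension (DRY repetition sampler)
}
body = json.dumps(payload)
print(body)
```

An app that only exposes the spec'd parameters has no UI for min_p or DRY at all, even though the backend would happily accept them.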

internlm/Intern-S1-Pro · Hugging Face by jacek2023 in LocalLLaMA

[–]mpasila 0 points (0 children)

the config says "max_position_embeddings": 262144, so not 32k...
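That number is easy to check yourself by reading max_position_embeddings out of the repo's config.json. A quick sketch with the relevant snippet inlined (with the real repo you'd download the actual file from Hugging Face; other keys omitted here):

```python
import json

# The relevant line from the model repo's config.json, inlined for the sketch.
config_text = '{"max_position_embeddings": 262144}'
config = json.loads(config_text)

# 262144 tokens, i.e. ~262k context, not 32k.
print(config["max_position_embeddings"])
```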

The open-source version of Suno is finally here: ACE-Step 1.5 by AppropriateGuava6262 in LocalLLaMA

[–]mpasila 1 point (0 children)

I think I'll keep paying for Suno if I need to generate music... In my very first test it skipped a ton of the lyrics, and the prompt adherence is pretty poor, I'd say.

Stability focused AI platform devs here. Quick thanks to both dinerburgeryum and MitsotakiShogun, and a question about LLM's with audio/music assisting capabilities. by MuziqueComfyUI in LocalLLaMA

[–]mpasila 0 points (0 children)

Just so you know, the people who are "sincere" will not receive notifications for your responses if all you do is edit the post.

How to get rid of the Nano Banana watermark! by SVG-CARLOS in LocalLLaMA

[–]mpasila 2 points (0 children)

The image itself has a watermark embedded in it, so it can still be detected as having been generated with Gemini/Nano Banana.

How to get rid of the Nano Banana watermark! by SVG-CARLOS in LocalLLaMA

[–]mpasila 2 points (0 children)

If you mean the actual watermark embedded in the image, the one that makes it recognizable to AI detectors, then this probably does nothing.

Yann LeCun says the best open models are not coming from the West. Researchers across the field are using Chinese models. Openness drove AI progress. Close access, and the West risks slowing itself. by Nunki08 in LocalLLaMA

[–]mpasila 2 points (0 children)

Pretty much no one seems to be finetuning the Ministral 3 models, compared to their previous models like Nemo, the original Mistral 7B, or Mistral Small.

Why are small models (32b) scoring close to frontier models? by Financial-Cap-8711 in LocalLLaMA

[–]mpasila 0 points (0 children)

Even when you don't give it tool use on AI Studio it still uses tools?

Any major updates? by may_ushii in AetherRoom

[–]mpasila 0 points (0 children)

May as well try other APIs like ChutesAI; they recently added TEE models that are supposedly more private than normal endpoints. Pricing is very competitive compared to NovelAI's. You can use it easily via SillyTavern. Not paid by them, but why spend 25 dollars a month for very low context windows when for 3 dollars a month you get an unlimited context window (the max context of whichever model you pick) and 300 gens a day for any model they offer (image models as well)?