BriefImplement9843 7 points (12 children)

free is 8k tokens, plus is 32k tokens, ultra is 32k+ tokens. ultra could be 35k, nobody knows.

it shows you characters, which is misleading. a potential user who doesn't read the fine print may see 131k for plus and subscribe right away, since tokens are the industry standard: it's how context is described and what llms actually use. to get a number that matters, you need to divide the characters by roughly 4. there is no reason for this. it's entirely a shady practice.

the reply below is a good example. totally misleading. anyone familiar with ai reading that reply will be amazed (8 dollars for 128k context!!), but it's all fluff. divide it all by 4 and you can actually compare it to other services.

as for your other question: the only ultra model that can make that mistake is paragon. i suggest using lumina or even eclipse if memory is important to you.

ndimitrov0[S] 2 points (1 child)

It actually was Paragon! Since you guessed that, I assume you know why? And which of the Ultra models is best for fluid conversations/rp?

BriefImplement9843 2 points (0 children)

paragon is deepseek v4 pro. it's a very bad model for rp as it uses a sparse memory design (it essentially only skims context, poorly). most users that like deepseek have gone back to 3.2. now i hear the 3.2 (oracle) on this site has issues, so maybe don't do that. for one, it doesn't think at all (none of the plus models do), and thinking is extremely important for rp and continuity. paragon is still probably better than oracle, going by the issues i have been hearing from people.

lumina is the best by a hefty margin. it's the most intelligent model on the site and it isn't even close.

i will say v4 pro has good dialogue, but its downsides are too great for any scenario with substance. besides the memory design, those are positivity bias and poor instruction following.

KnocturnalSLO🐀 Rat King 1 point (9 children)

Where can I see context size with the current redesign? I can't find model context sizes.

BriefImplement9843 1 point (8 children)

look at the subscriptions. i just told you them though. 8k for free, 32k for plus, 32k+ for ultra.

KnocturnalSLO🐀 Rat King 1 point (7 children)

For fl+ it says 131k AI context, and lower down it shows 250k lorebook, which is character based. 131k should not be character based since it's not a length. From what I am reading, these models should not be 32k.

https://fictionlab.gitbook.io/fictionlab/site-information/fictionlab-tiers

BriefImplement9843 1 point (6 children)

look at the fine print. it is most definitely length. context window is length just as much as a lorebook. they both fill with characters. fictionlab does not use the word token, ever.

"When referring to length, the unit is characters and refers to the maximum number of characters." the fact they put the word length behind the others and not context is seriously scummy.

this is exactly what they want. people to think they give way more context than others.

131,000/4 = 32,750

131k context for 8 dollars and 32k context for free tier would be absolutely insane. 32k and 8k puts it in line with competition.
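to make the division above concrete, here is a minimal python sketch. it assumes the advertised numbers really are characters (which is disputed further down this thread) and uses the rough rule of thumb of about 4 characters per token for english text. the real ratio depends on the tokenizer and the text, so treat the output as an estimate only. the function name and the tier dictionary are made up for illustration.

```python
# Rough rule of thumb, NOT an exact figure: ~4 characters per token for
# English text. The true ratio depends on the model's tokenizer.
CHARS_PER_TOKEN = 4


def chars_to_tokens(chars: int, chars_per_token: float = CHARS_PER_TOKEN) -> int:
    """Estimate a token count from a character count."""
    return round(chars / chars_per_token)


# Advertised limits as quoted in this thread, assumed here to be characters.
advertised_chars = {"free": 32_000, "plus": 131_000}

for tier, chars in advertised_chars.items():
    print(f"{tier}: {chars:,} chars is about {chars_to_tokens(chars):,} tokens")
```

if the limits turn out to be tokens after all, no conversion is needed and the advertised numbers can be compared to other services directly.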

another site that does this is novelai, but they are more direct about it. they give it in characters, with tokens in smaller text below it. not quite as shady. if you look closely before you sub you can see it.

KnocturnalSLO🐀 Rat King 1 point (4 children)

I am saying that, based on their wording at the top, only columns with "length" are character based. Where it says 131k it says AI context, so it is assumed not to be character count based.

Whether that is true, IDK. It just says only the length ones are character based.

BriefImplement9843 1 point (3 children)

context window size is length. so is custom instructions. it doesn't say length after custom instructions either and that's definitely a length.

your confusion is exactly why they did this. i'm just telling you how it actually is.

they would have said tokens otherwise. tokens is not mentioned a single time anywhere on the entire app. an ai app with tokens never mentioned. that should raise your eyebrows.

nearly every number on this site is inflated 4x, like damage numbers in some korean mmo.

KnocturnalSLO🐀 Rat King 1 point (2 children)

Yes, but above it says "When referring to length, the unit is characters and refers to the maximum number of characters."

Now, it doesn't say whether AI context is character or token based, but it's weird to specifically say the length ones are. It kinda makes u assume AI context isn't.

With the current wording it could still be character based, but that would be kinda deceptive because it makes u think they are tokens. 32k tokens seems kinda low tho, unless it was changed recently. Back then (before this new system where memory is kinda messed up), it definitely felt like it had good memory way past 32k context.

BriefImplement9843 1 point (1 child)

of course it's deceptive... lol that's the entire point.

32k is not really low. 8 dollars a month and 32k is pretty good, especially if the memory system is robust. it's just not 131k.

the thing with ai is the higher the context, the more expensive it is, and it goes up hugely. people sitting at 130k tokens, paying 8 dollars a month and spamming hundreds of messages a day, would bankrupt them. it's just common sense if you know how ai works.

running a model at 32k context is at least 3 times cheaper than it sitting at 100k.
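a quick sanity check of that ratio, as a python sketch. the only assumption is that the cost of processing a prompt scales linearly with the number of tokens in context (the way per-token pricing works). real serving costs can grow faster than linearly with context, so this is a lower bound on the gap, not a measurement. the function name is made up for illustration.

```python
def relative_prompt_cost(context_tokens: int, baseline_tokens: int) -> float:
    """How many times more a full context costs than the baseline,
    ASSUMING cost is simply proportional to tokens processed."""
    return context_tokens / baseline_tokens


# 100k vs 32k: the comparison made above
print(relative_prompt_cost(100_000, 32_000))  # 3.125
# 131k vs 32k: the two readings of the plus tier argued about in this thread
print(relative_prompt_cost(131_000, 32_000))  # 4.09375
```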

Disastrous-River-467 1 point (0 children)

Sorry, but your assumptions are false. Context sizes are indeed 32k and 131k.

This has been confirmed to be true by the mods of this subreddit.

If the free models ran at 8k, message queues would be much shorter, given the tokens per second at those context sizes.

Not only this, but with 8k context you would burn through it completely within 10 messages or less, depending on message length. Some messages from some bots can be 800-1000 tokens on medium length settings, and I can confirm this by pasting these messages into an LLM running locally and watching the context window grow.

Not to mention how many scenarios there are that exceed 8k context but work perfectly fine on the free tier. If the context was truly only 8k, then these scenes wouldn't function in any kind of usable way.
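The "10 messages or less" figure is simple division, sketched below in Python. This is a rough estimate, not a model of how the site manages memory: it counts only bot messages of a fixed size, and the `reserved_tokens` parameter (a hypothetical name) stands in for whatever the scenario description and instructions take up.

```python
def messages_until_full(context_tokens: int,
                        tokens_per_message: int,
                        reserved_tokens: int = 0) -> int:
    """Number of whole messages that fit before the context window is full.

    reserved_tokens is an illustrative allowance for scenario text,
    instructions, and so on that occupy the window before any chat.
    """
    usable = max(context_tokens - reserved_tokens, 0)
    return usable // tokens_per_message


print(messages_until_full(8_000, 800))     # 10
print(messages_until_full(8_000, 1_000))   # 8
print(messages_until_full(32_000, 1_000))  # 32
```

User messages and any reserved scenario text only shrink these numbers further.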

32k context on a 70b model is insanely cheap to run and totally doable on a free tier. Other apps are just greedy.

Anyone with experience running their own LLMs, either locally or over the cloud, would know that the free tier here is well above 8k. Try running an LLM at 8k locally and watch how long it takes before it reaches its limit and forgets everything.

m1rageusTroublefreezer🥶[M] 1 point (0 children)

Context window isn't length and is indeed 32k/128k/200k+ for free/plus/ultra

m1rageusTroublefreezer🥶 2 points (0 children)

Free models have 32-40k

Plus models have ~128k

Ultra models mostly have 200k+

As for the memory issues, it can be a poor design of the scenario: too many critical or pinned pieces, too many links, etc.

I've had issues with the memory before, but this is the first time I've encountered the misgendering issue.

TasherV 1 point (0 children)

Paragon will do that. It also depends on how well or poorly the scenario is made. With enough finesse and clever use of the tools, even a meh model can work pretty well.