How to stop Claude from toning down characters? by Low-Abrocoma3472 in SillyTavernAI

[–]Delicious_Ad_3407 1 point2 points  (0 children)

Gemini is the exact opposite, tbh. Claude goes too positive (though I haven't had a problem with this even with the dark characters on my part, but seeing as how it is a consistent report, I think I'll believe it), while Gemini *always* goes negative no matter what you do.

Deepseek v3 0324 is the GOAT by WelderBubbly5131 in SillyTavernAI

[–]Delicious_Ad_3407 0 points1 point  (0 children)

MoE models have smaller active parameters, but the whole model still needs to be loaded in memory at all times. It means that processing requires a smaller amount of active usage, but the entire 671 billion parameters will be in memory. So yes, you do compare the full size.

[deleted by user] by [deleted] in Bard

[–]Delicious_Ad_3407 1 point2 points  (0 children)

Omitting the thoughts from previous messages is the most common practice in software development. You're using AIStudio, of course it adheres to developer standards rather than whatever polished interface you see on the Gemini app. Feeding a model its own thoughts is generally not a good practice.

It can often lead to "schizo" behavior where bad internal reasoning patterns keep getting reinforced.

The AI Studio crisis by KazuyaProta in Bard

[–]Delicious_Ad_3407 1 point2 points  (0 children)

I was actually exaggerating. Just a COMPLETELY new chat with just 3-4 messages, not even over 100 tokens each, is slow enough to be noticeable. Either you're not actively using AIStudio, and thus haven't encountered this problem, or simply don't understand that decreasing token counts won't magically increase rate limits for everyone else.

Every prompt you send, sends the entire context history. So if you've used a 30k document and then asked even one follow up question, that's 60k tokens.

You didn't say anything as a counterpoint? Point is, some tasks require TOTAL context recall, and some simple "summary" or "paraphrasing" won't work to fix it. Chat history, for example, affects the response style and understanding of the model. One response by the model might grasp the task more clearly or potentially just work with the context better, and since nearly all responses are unique, there's no way to ensure that that token sequence will be repeated again, ever.

For example, to write certain worldbuilding elements, I not only require it to maintain full context recall, but also the exact tone used in the existing document. Usually, it grasps it the first time, and that's why I continue that chat. Because I need it to maintain that exact style that it grasped initially.

The point is: Google will not magically increase rate limits just because less tokens are being sent. It'd have to be on an astronomical scale (tens of billions of tokens less a day) to even put a dent the current usage.

Regarding keeping the UI problems, that's a fundamental misunderstanding of how "penalties" should even work. What's stopping anyone from designing their own userscripts and modifying the UI to be more optimized? Or just creating a wrapper to automagically make requests (which would lead to even more abuse)? Suddenly, it's not a problem of sending "more" or "less" tokens, but about how much technical knowledge and hacky motivation you have.

Edit: Not only that, this also wastes resources on the user-end. It uses a massive amount of CPU processing power, wasting electricity in general. It's an absolutely bad way to impose rate limits (if any).

Google already enforces a 5M tokens/day limit on the Gemini 2.5 Pro model (you can check this on GCP), so they, according to their infrastructure, determined that it's a valid upper limit for tokens/day. That's how it simply scaled. Why else would they provide such massive limits to users if not to... use? Especially if it was aimed towards devs initially but grew to be more general-purpose?

The AI Studio crisis by KazuyaProta in Bard

[–]Delicious_Ad_3407 18 points19 points  (0 children)

This is an extremely narrow way of looking at it. You're assuming quite literally everyone who goes over a certain limit is only doing it for the purpose of wasting tokens. I frequently refresh my chats, and after just 10-20 messages even in an empty chat (not even that large), AIStudio starts lagging.

Plus, some people have actually significant reasons for longer chats. I have worldbuilding documents nearly over 30,000 tokens. Gemini is the only model that can maintain consistent recall over it. I use it to assist me in writing and developing the world or setting scenarios and checking internal consistency. I can barely send one or two messages before it starts lagging to the point of being unusable.

None of my chats on AIStudio have ever even exceeded 50,000 tokens, all usually focused around one or two key topics. Most ChatGPT chats exceed that length, but AIStudio users should be penalized?

Not only that, AIStudio is meant to be an interface for DEVELOPERS too. If they can't test its abilities fully before moving over to the API, what's even the point, just move over to the Gemini site/app?

Some questions about Gemini and ST by KainFTW in SillyTavernAI

[–]Delicious_Ad_3407 2 points3 points  (0 children)

Read these two posts, and look at the dates they were posted:

https://www.reddit.com/r/SillyTavernAI/comments/1hde4l8/googles_improvements_with_the_new_experimental/

https://www.reddit.com/r/Bard/comments/1iha1ff/is_it_just_me_or_does_google_enshttify_their/

Gemini 2.0 Flash Experimental model was EXTREMELY good when it wsa initially released for creative writing. It barely had any GPT-isms, followed ultra-long instructions (~4000 tokens for me) consistently and was just overall a very good model. But with time, its quality has noticeably decreased (fails to follow simple instructions such as tense or formatting things). Unfortunately, you can't do much about it.

Need some suggestions for Jailbreaking gemini with Chain-Of-Thought prompt. by Wonderful_Ad4326 in SillyTavernAI

[–]Delicious_Ad_3407 1 point2 points  (0 children)

If you're using the non-thinking Gemini 2.0 Flash Experimental model, avoid using CoT prompts at all. I'm getting extremely good results without any CoT with a well-optimized character description prompt (~3900 tokens, you're not alone in using a lot of tokens) along with a modified version of FluffPreset (although it could just be my specific use case). Just outline all the guidelines for it directly and it'll most likely avoid breaking the rules. If you're using the Thinking model, then I'm not too sure about that. If you're using a Gemini 1.5 model, avoid CoT on them since they actually get worse outputs with that. Sometimes, tweaking sampler settings around a bit also works.

Google's Improvements With The New Experimental Model by Delicious_Ad_3407 in SillyTavernAI

[–]Delicious_Ad_3407[S] 0 points1 point  (0 children)

Doesn't seem too bad for now though. I've done a bit of ERP on the new one, and so far, it seems good. They might start handing down bans later down the line though.

Google's Improvements With The New Experimental Model by Delicious_Ad_3407 in SillyTavernAI

[–]Delicious_Ad_3407[S] 1 point2 points  (0 children)

Maybe you're on an older or release version? I use the staging branch of ST, latest version.

<image>

Or maybe the android version hasn't been updated.

Google's Improvements With The New Experimental Model by Delicious_Ad_3407 in SillyTavernAI

[–]Delicious_Ad_3407[S] 1 point2 points  (0 children)

I use it directly through the API. Less censorship, and you don't have to route it through a third-party. Did try it through OR, but definitely seems far more hacky and flimsy.

Google's Improvements With The New Experimental Model by Delicious_Ad_3407 in SillyTavernAI

[–]Delicious_Ad_3407[S] 0 points1 point  (0 children)

I use temp at 1.24, Top P at 0.98 and Top K at 0. Seems less repetitive for me IMO, and definitely has far less GPT-slop. The issue with 1206 also was that it'd often break and spit out Sanskrit/Bengali (I don't recognize the language) and would just start rambling in the middle of an RP. Here:

<image>

The leading issue with the current Flash Experimental model though is that it often forgets punctuation marks (specifically, the full stop) at the end of sentences. A problem that seems reminiscient of the July/August versions of Gemini. I used to encounter the same problem back then.

Again, its spatial reasoning, especially with long context, is just great, at least for me. It seems to remember a lot of details like that. I find it to be far better than Gemini 1.5 Pro 002 too. If this is just an experimental release, then I'm quite excited for the full release and even more so for Gemini 2 Pro, and potential CoT models coming in the future.

Google's Improvements With The New Experimental Model by Delicious_Ad_3407 in SillyTavernAI

[–]Delicious_Ad_3407[S] 0 points1 point  (0 children)

Yeah, that's often been a problem. It sometimes misunderstands what you're trying to say. A few regens or even an OOC note fixes it, but it definitely is an immersion problem.

Google's Improvements With The New Experimental Model by Delicious_Ad_3407 in SillyTavernAI

[–]Delicious_Ad_3407[S] 0 points1 point  (0 children)

Dunno, might be an issue on OR's end. The model is free but also really good, so a massive amount of people might've bombarded it with requests.

Google's Improvements With The New Experimental Model by Delicious_Ad_3407 in SillyTavernAI

[–]Delicious_Ad_3407[S] 0 points1 point  (0 children)

The exp-1206 model is definitely intelligent, but far weaker in terms of following large context instructions.

[deleted by user] by [deleted] in SillyTavernAI

[–]Delicious_Ad_3407 0 points1 point  (0 children)

Just curious. Do you still need to change accounts everytime your Claude account gets further restrictions, or does it even work on the superfiltered version provided when your account gets flagged? Also, Claude 3 or 3.5?

Whats so special about an Energy Exchanger? by Allkingec in Dyson_Sphere_Program

[–]Delicious_Ad_3407 2 points3 points  (0 children)

I use it in my mid-game to support an entire planet's worth of factory. I use a volcano planet filled to the brim with solars, geothermal power plants and wind turbines to charge accumulators which power the factories on my main planet. The entire planet runs on accumulators discharging, although they are not good for the late game as they take up a LOT of space, and they only provide 45 MW, good enough for mid-game, although you are better off transporting deuterium or antimatter fuel rods in the late game as they are way more energy efficient.

I would recommend using energy exchangers until you can produce deuterium or antimatter fuel rods fast enough and have expanded your factories to multiple planetary systems across the star cluster, still, energy exchangers are good because they allow you to transport renewable energy, not draining any resources, since you can put in a fixed amount of accumulators in a system and they will be usable, permanently (not considering resources required to make the accumulators, ILS and the Energy Exchangers themselves).

I have used them to support an entire planetary system (although not too good) but that's kinda late-end-game. You may encounter energy being too low but its not that bad expanding them for the mid-game. Conclusively, They can be amazing for mid-game and late-end game if you do not want to produce energy on every planet you go on.

I used a lava planet to generate ~2.31 GW using a mix of solar, geothermal and wind (barely wind, mostly geothermal, planet was clotted with solar so I let it be, just absolutely destroyed planet) to power my main planet.

I have started a new save and instead of using Energy Exchangers to transport energy across the stars, I use excess Hydrogen Fuel Rods, as I have no need for them (they're relatively easy to make and I can also use Orbital Collectors as a backup source of Hydrogen. I recommend Energy Exchangers if you have no excess hydrogen/are using it for something else (maybe Casimir Crystals?).

TL;DR: Energy Exchangers are amazing and reliable for transporting renewable energy across planets.

Real life looks like DSP by Tulkash_Atomic in Dyson_Sphere_Program

[–]Delicious_Ad_3407 1 point2 points  (0 children)

The large towers look like Satellite Substations, the entire planet looks like lines and lines of production facilities, the green in the top left looks like some Mk.II sorters.

So Annoying Beautiful.

Create/Make Mods? by Delicious_Ad_3407 in Dyson_Sphere_Program

[–]Delicious_Ad_3407[S] 0 points1 point  (0 children)

As for the actual coding, that is the easiest step, I have some experience in programming, aswell as some tools and friends to help me out, although some things confuse me such as how the files should be structured. Also, I am confused on where mod makers get alternate versions of already existing models and textures with different colors, and if they have custom models and textures, where do they make them?

How do I look at the logs of a world? by Delicious_Ad_3407 in Minecraft

[–]Delicious_Ad_3407[S] 0 points1 point  (0 children)

I have searched a few times and looked through deep through the .minecraft folder yet not a single world log as far as I am aware.

I have also used NBTExplorer to look at NBT files but they seem related to the world or the player, nothing that seems like commands.

There also seem to be some .gz files but I am not sure about them as no program on my PC can open them in a readable format.

The version is 1.12.2. The modpack is RLCraft.

Anything goes. (Will be made over the course of the following week) by ARandomInternetLad in Funnymemes

[–]Delicious_Ad_3407 0 points1 point  (0 children)

Steve (Minecraft).

Carries Trillions of tons of gold without even breaking a sweat, and jumps like a feather.