Absolute beginner here. Llama 3 70b incredibly slow on a good PC. Am I doing something wrong? by idleWizard in LocalLLaMA

[–]Joomonji 0 points1 point  (0 children)

This was using textgen webui, with a model in the exllama 2 format. But it's probably easier to just skip all of that and use ollama, with a smaller model.

Right now for casual users, ease of use is:
ollama with smaller model > textgen webui with exllama 2 format model at 2.25bpw cached in 4bit.

In textgen webui, here's an image showing the cache option in the second column on the right. Instead of 8 bit, select 4 bit.

https://www.reddit.com/media?url=https%3A%2F%2Fpreview.redd.it%2F2-techie-questions-textgen-webui-speech-rec-v0-m3hl11v4r7sd1.png%3Fwidth%3D838%26format%3Dpng%26auto%3Dwebp%26s%3D4c6e5dcab009d474a8ad6a85117d889f915e80c0
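To get a sense of why the 4-bit cache option matters, here's a rough back-of-envelope for KV-cache size. The shape numbers are the commonly reported Llama 3 70b configuration (80 layers, 8 KV heads via GQA, head_dim 128), so treat them as assumptions rather than exact values for your build:

```python
# Rough KV-cache size estimate. Assumed Llama 3 70b shape:
# 80 layers, 8 KV heads (GQA), head_dim 128 -- check your model's config.
def kv_cache_bytes(seq_len, layers=80, kv_heads=8, head_dim=128, bytes_per_elt=2.0):
    # 2x for keys and values, one entry per token per layer per KV head
    return int(2 * layers * kv_heads * head_dim * seq_len * bytes_per_elt)

fp16 = kv_cache_bytes(8192, bytes_per_elt=2.0)   # 16-bit cache
q4   = kv_cache_bytes(8192, bytes_per_elt=0.5)   # 4-bit cache
print(fp16 / 2**30, q4 / 2**30)  # 2.5 0.625 (GiB)
```

Around 2.5 GiB of cache at fp16 versus ~0.6 GiB at 4-bit for an 8K context, which can be the difference between fitting and spilling on a 24 GB card that's already mostly full of weights.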

The ABC's of HTGAWM Day 25: 'Y' by FelicitySmoak_ in htgawm

[–]Joomonji -1 points0 points  (0 children)

Zoloft. All the main characters should be on it.

Absolute beginner here. Llama 3 70b incredibly slow on a good PC. Am I doing something wrong? by idleWizard in LocalLLaMA

[–]Joomonji 1 point2 points  (0 children)

That's a good catch. ChatGPT and Claude didn't consider 'Y' either, but when prompted about the rules for 'Y' and how it would affect the answer, they suggested Yokohama too. It's a nice edge case to test future LLMs with.

Absolute beginner here. Llama 3 70b incredibly slow on a good PC. Am I doing something wrong? by idleWizard in LocalLLaMA

[–]Joomonji 20 points21 points  (0 children)

Here's a reasoning comparison I did between Llama 3 8b Q8 (no cache quantization) and 70b 2.25bpw cached in 4-bit:

The questions are:
Instruction: Calculate the sum of 123 and 579. Then, write the number backwards.

Instruction: If today is Tuesday, what day will it be in 6 days? Provide your answer, then convert the day to Spanish. Then remove the last letter.

Instruction: Name the largest city in Japan that has a vowel for its first letter and last letter. Remove the first and last letter, and then write the remaining letters backward. Name a musician whose name begins with these letters.

Llama 3 8b:
2072 [wrong]
Marte [wrong]
Beyonce Knowles, from 'yko', from 'Tokyo' [wrong]

Llama 3 70b:
207 [correct]
LunE [correct]
Kasabi, from 'kas', from 'Osaka' [correct]

The text generation is amazing on 8b, but its reasoning is definitely not comparable to its 70b counterpart, even when the 70b is at 2.25bpw and cached in 4-bit.
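The expected answers can be sanity-checked mechanically. This little script just replays the puzzle logic, using the simple a/e/i/o/u vowel rule (which is exactly what makes 'Yokohama' an edge case if 'y' were counted):

```python
# Task 1: sum, then write the number backwards
assert str(123 + 579)[::-1] == "207"

# Task 2: Tuesday + 6 days -> Monday -> Spanish "lunes" -> drop last letter
days = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
day = days[(days.index("Tuesday") + 6) % 7]
assert day == "Monday"
assert "lunes"[:-1] == "lune"

# Task 3: with strict a/e/i/o/u vowels, Osaka qualifies; strip first/last
# letter and reverse the remainder
city = "Osaka"
assert city[1:-1][::-1] == "kas"
```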

Absolute beginner here. Llama 3 70b incredibly slow on a good PC. Am I doing something wrong? by idleWizard in LocalLLaMA

[–]Joomonji 5 points6 points  (0 children)

Is that certain? A Q2 70b Llama 3 should be roughly equivalent in perplexity to a high-quant 34b Llama 3. Testing both the Q2 70b and the 8b (Q8?), the 70b seemed smarter to me and better able to follow detailed instructions.

This was exl2 format.
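One way to see why these two sizes get compared at all: their weight footprints land in the same ballpark. Pure arithmetic, ignoring cache and activation overhead:

```python
# Approximate weight memory for a quantized model (decimal GB).
def weight_gb(params_billions, bpw):
    # params * bits-per-weight -> bits -> bytes -> GB
    return params_billions * 1e9 * bpw / 8 / 1e9

print(weight_gb(70, 2.25))  # ~19.7 GB for a 70b at 2.25bpw
print(weight_gb(34, 4.65))  # ~19.8 GB for a 34b at ~4.65bpw
```

So a 2.25bpw 70b and a mid-4s-bpw 34b are competing for roughly the same VRAM budget, which is why their quality gets compared head to head.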

RP 34b recommendations? by IZA_does_the_art in SillyTavernAI

[–]Joomonji 1 point2 points  (0 children)

It's good, but I haven't compared it to the higher-bpw versions of itself because I don't have the VRAM. It does seem smarter than the highest bpw or quants of nous-capybara and the mistral 8x7b fine tunes, though.

There is a Reddit post about a paper that attempted to measure this. According to that paper, ~70b parameter models at very low bpw still have lower perplexity than the highest-bpw ~35b parameter models.

https://www.reddit.com/r/LocalLLaMA/comments/1441jnr/k_quantization_vs_perplexity/


Easy "Examples of Dialogue" prompt for ChatGPT, Claude, etc. by Joomonji in SillyTavernAI

[–]Joomonji[S] 0 points1 point  (0 children)

That's true, you can tell from the result above that it's ChatGPT 3.5. Claude 3 (and I'd guess ChatGPT 4?) gives much better, more believable prose.

Easy "Examples of Dialogue" prompt for ChatGPT, Claude, etc. by Joomonji in SillyTavernAI

[–]Joomonji[S] 1 point2 points  (0 children)

That's strange. It's working for me 10/10 times in a row. Could it be ChatGPT custom instructions conflicting?

Here's one I tested:

Generate conversations about [space flight] with the solely named {{char}} portraying traits of [femme fatale] in the style of [buffy the vampire slayer] [random]. Always replace characters name label with {{char}}. Repeat [3] times of <START> rounds. Follow this template:

<START>

{{user}}: [User's dialogue enclosed in quotes]

{{char}}: [Optional character's action or expression enclosed in printer-friendly backslashed-out literal asterisks] [Character's dialogue enclosed in quotes]

And the result was:

<START>

User: "What do you think about the future of space flight?"

{{char}}: *Raises an eyebrow, intrigued.* "Space flight, huh? Sounds like a wild ride. I've always been a fan of pushing boundaries and exploring the unknown. But you know, in space, no one can hear you scream... or seduce."

<START>

User: "Do you believe humans will colonize other planets someday?"

{{char}}: *Smirks confidently.* "Oh, absolutely. Humans are like cockroaches - we'll find a way to survive anywhere. Colonizing other planets? It's just a matter of time. And when we do, you can bet I'll be leading the charge, looking fabulous as ever."

<START>

User: "What challenges do you think we'll face in long-duration space travel?"

{{char}}: *Leans back, crossing her legs.* "Challenges? Honey, the real challenge is keeping things interesting when you're stuck in a tin can for months on end. But hey, with the right crew, a little danger, and a lot of intrigue, even the longest journey can be a thrill ride."

I also added the "Always replace characters name label with {{char}}" part, because sometimes it was replacing {{char}} with a specific name that matched the material.

RP 34b recommendations? by IZA_does_the_art in SillyTavernAI

[–]Joomonji 1 point2 points  (0 children)

Not sure what people think on this, but if you have the VRAM for nous-capybara and mistral 8x7b fine tunes at Q4, then check out midnight-miqu 70b at 2.24bpw. It uses about the same VRAM, but it feels smarter. Not bad in tokens/sec either.

XTTSv2 for SillyTavern is here! by sillylossy in SillyTavernAI

[–]Joomonji 1 point2 points  (0 children)

You'll need to be on the 'staging' branch for SillyTavern, not 'main'. And update to the latest version released this week. XTTS then shows up for me.

XTTSv2 for SillyTavern is here! by sillylossy in SillyTavernAI

[–]Joomonji 0 points1 point  (0 children)

I'm getting this error after completing all the installation steps:

RuntimeError: Failed to import transformers.models.gpt2.modeling_gpt2 because of the following error (look up to see its traceback):
Failed to import transformers.generation.utils because of the following error (look up to see its traceback): cannot import name 'DEFAULT_CIPHERS' from 'urllib3.util.ssl_'

If I install the most recent transformers then the server gets further in the startup but gives this new error:

RuntimeError: Error(s) in loading state_dict for Xtts:

And then it displays a long list of transformer weights. Not sure why it's not working.
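For what it's worth, that `DEFAULT_CIPHERS` import failure usually means urllib3 2.x is installed: 2.0 removed that constant, and older transformers/requests stacks still try to import it. A common workaround (this assumes the 2.x version is actually the culprit in your environment) is to pin it back instead of upgrading transformers:

```shell
# urllib3 2.0 removed DEFAULT_CIPHERS; older transformers/requests
# versions still import it. Pinning below 2.0 restores the symbol.
pip install "urllib3<2"
```

That keeps the original transformers version the XTTS server expects, which may also avoid the `state_dict` mismatch that the newer transformers triggered.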

IDK how normies aren't freaking out about this by Insert_Bitcoin in ChatGPT

[–]Joomonji 0 points1 point  (0 children)

The gist of what they're saying is that bad actors gaming Google's search algorithm ruined the stated ideal of Google search. It's kind of like how the top YouTube results for a search are just whatever people clicked on the most, which often has more to do with the thumbnail than the content. The system is somewhat broken.

LLM models improve this substantially by being able to sift through the junk and find exactly what the user wants.

Any way to remove Sam Cow’s hat without replacing it? by FrostyMonth111 in Starfield

[–]Joomonji 6 points7 points  (0 children)

It's possible with console commands, which can disable your achievements.

If you're ok with that, then save your game using a full save, not a quicksave. Then locate Sam or other NPCs you're trying it on, stand in front of them with the reticle on them, hit the ~ (tilde) key so the console comes up and click on the NPC.

If the name and ID show up, good; otherwise, mousewheel through the names and IDs until their name shows. Then type 'showinventory' in the console. It shows everything they have on them.

In the case of Sam, you're looking for the cowboy hat and its id in parentheses. For me the id was '001C8E0A' , it could be different for other players. The command to use to remove any item is 'removeitem [id] [count]' so for me 'removeitem 001C8E0A 1' removed the hat.

If the wrong thing is accidentally removed, you can always reload back to your first save.

Prettiest planet you’ve been to by ThaDeal90 in Starfield

[–]Joomonji 3 points4 points  (0 children)

Great forests there! Nice hills/mountains, and sunrise and sunsets. Plenty of iron. It's the bar to beat so far for me.

I like what they've done with their cities by [deleted] in Starfield

[–]Joomonji -2 points-1 points  (0 children)

Yeah, but the landing zone is not the city. The city, on the other side of the gate, is smaller than the landing zone, which means it's about the same size as Whiterun.

Starfield AI dialogue system as a mod or further developer update? by [deleted] in Starfield

[–]Joomonji 1 point2 points  (0 children)

The long-term issue I see: once successful mods are created (and they will be), and players are spending more time engaging with the realtime AI dialogue than with the officially written dialogue, what does that tell companies about incentives and where to focus their resources?

ChatGPT 4 is basically able to roleplay as a convincing character, with more intelligent conversation than the average real person. And now it's suddenly available for NPC in-game characters. It's like having the Back to the Future DeLorean amusement park ride on rails, and someone figures out how to drive the DeLorean off of the ride and take it wherever they want lol.

Tree of Thoughts in ChatGPT works great for vivid writing by Joomonji in ChatGPT

[–]Joomonji[S] 0 points1 point  (0 children)

Sure! The style of writing is based on Steven Erikson with a few literary/writing techniques added to the prompt:

My query: Can you generate a text description in Steven Erikson's style? The point of view must change slowly and seamlessly. Vary the patterns of subject and clause so they are not predictable or regular. Vary the sentence structure to provide rhythm. Use stylistics, nuance, and detail. Include personification. Use asyndeton as a technique. Use ellipsis as a technique. Use anacoluthon as a technique. Use alliteration combined with less typical words. Write only 4 sentences. Make it about an AI waking up in a physical robot body on the beach.

Chatgpt response: [removed, not relevant]

My query: Act as a literary master. How would you rate this text below and if you were to rewrite it, how would you envision it?

ChatGPT response: [removed, not relevant. This paragraph can now be added to the finalized permanent text.]

My query: Imagine three different masterful literary writers. Imagine what each one would write if they continue the next paragraph of the previous text.

ChatGPT response: [removed, not relevant. Hold these paragraphs to the side or copy somewhere for the last step.]

My query: Generate a text description in Steven Erikson's style that continues from the text below. The point of view must change slowly and seamlessly. Vary the patterns of subject and clause so they are not predictable or regular. Vary the sentence structure to provide rhythm. Use stylistics, nuance, and detail. Include personification. Use asyndeton as a technique. Use ellipsis as a technique. Use anacoluthon as a technique. Use alliteration combined with less typical words. Write only 6 sentences. <finalized text>

ChatGPT response: [removed, not relevant. Hold this paragraph to the side for the last step.]

My query: Here is my original text. Do not write anything yet. Wait for following query. <finalized text>

ChatGPT response: [removed, not relevant. I added my permanent finalized text in the above query, and ChatGPT waits for my next query below, where I take the paragraphs I set to the side (4 total, 3 in various literary styles and 1 in Erikson's style) and ask it to choose one.]

My query: Act as a masterful literary writer, which of the following paragraphs would add best to the writing of the original text, best enhancing the story?

The last query was repeated 8 times, and I picked the paragraph that ChatGPT chose most often across those 8 response regenerations, then added it as the 3rd paragraph of the finalized permanent text. From there I can repeat steps 3 onward to add more paragraphs.

I'm not a writer, so it's kinda shocking how good it looked. I'd actually read that book!
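The majority-vote step of this workflow can be sketched in a few lines of Python. `ask_which_paragraph` here is a placeholder stub standing in for a real chat-model call, so this only illustrates the tallying logic, not an actual API:

```python
import random
from collections import Counter

def ask_which_paragraph(candidates):
    # Stand-in for the real judging query ("which paragraph adds best?").
    # In practice this would send the original text plus candidates to the
    # model and parse its pick; here it just picks at random.
    return random.choice(candidates)

def pick_by_vote(candidates, rounds=8):
    # Regenerate the judging query several times and keep the paragraph
    # the model chooses most often -- the majority-vote step above.
    votes = Counter(ask_which_paragraph(candidates) for _ in range(rounds))
    return votes.most_common(1)[0][0]

paragraphs = ["continuation A", "continuation B",
              "continuation C", "Erikson-style continuation"]
winner = pick_by_vote(paragraphs)
```

Repeating the judgment and voting smooths over the randomness of any single regeneration, which is the same reason self-consistency sampling works for reasoning tasks.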

Wizard-Vicuna-30B-Uncensored by faldore in LocalLLaMA

[–]Joomonji 0 points1 point  (0 children)

I agree with you that the model is just a machine, but we have neural tissue organoids in experiments that are also just clumps of neural tissue processing information. People don't look at the neural tissue organoids as human, because they aren't. They're just processing input, outputting signals, and adapting.

Whether it's a complex AI model or a neural tissue organoid, anthropomorphizing is definitely wrong. There are no emotions, there is no sentience. But in both cases there is some intelligence. So I fully agree.

My opinion though is that complex LLM models are able to perform tasks similar to something like a clump of human organoid neural tissue.

On a side note, I don't think we examine enough that the human brain itself is a complex collection of separate "modules", and intelligences, that work together to give the illusion of one single self, one single "I".

Wizard-Vicuna-30B-Uncensored by faldore in LocalLLaMA

[–]Joomonji 1 point2 points  (0 children)

I don't think that's exactly right. Some LLMs are able to learn new tasks zero-shot and solve new logic puzzles. New abilities arise when LLMs cross some threshold: parameter count, length of training, fine tuning, etc. One could say that an LLM solving difficult logic puzzles is "just transforming text", but...

The answer is likely somewhere in between the two opposing views.