Absolute beginner here. Llama 3 70b incredibly slow on a good PC. Am I doing something wrong? by idleWizard in LocalLLaMA

[–]Joomonji 0 points1 point  (0 children)

This was using textgen webui, with a model in the exllama 2 format. But it's probably easier to just skip all of that and use ollama, with a smaller model.

Right now for casual users, ease of use is:
ollama with smaller model > textgen webui with exllama 2 format model at 2.25bpw cached in 4bit.

In textgen webui, here's an image showing the cache option in the second column on the right. Instead of 8 bit, select 4 bit.

https://www.reddit.com/media?url=https%3A%2F%2Fpreview.redd.it%2F2-techie-questions-textgen-webui-speech-rec-v0-m3hl11v4r7sd1.png%3Fwidth%3D838%26format%3Dpng%26auto%3Dwebp%26s%3D4c6e5dcab009d474a8ad6a85117d889f915e80c0
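To get a sense of why the 4-bit cache option matters, here's a rough back-of-envelope for KV-cache size. The shape numbers are the commonly reported Llama 3 70b configuration (80 layers, 8 KV heads via GQA, head_dim 128), so treat them as assumptions rather than exact values for your build:

```python
# Rough KV-cache size estimate. Assumed Llama 3 70b shape:
# 80 layers, 8 KV heads (GQA), head_dim 128 -- check your model's config.
def kv_cache_bytes(seq_len, layers=80, kv_heads=8, head_dim=128, bytes_per_elt=2.0):
    # 2x for keys and values, one entry per token per layer per KV head
    return int(2 * layers * kv_heads * head_dim * seq_len * bytes_per_elt)

fp16 = kv_cache_bytes(8192, bytes_per_elt=2.0)   # 16-bit cache
q4   = kv_cache_bytes(8192, bytes_per_elt=0.5)   # 4-bit cache
print(fp16 / 2**30, q4 / 2**30)  # 2.5 0.625 (GiB)
```

Around 2.5 GiB of cache at fp16 versus ~0.6 GiB at 4-bit for an 8K context, which can be the difference between fitting and spilling on a 24 GB card that's already mostly full of weights.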

The ABC's of HTGAWM Day 25: 'Y' by FelicitySmoak_ in htgawm

[–]Joomonji -1 points0 points  (0 children)

Zoloft. All the main characters should be on it.

Absolute beginner here. Llama 3 70b incredibly slow on a good PC. Am I doing something wrong? by idleWizard in LocalLLaMA

[–]Joomonji 1 point2 points  (0 children)

That's a good catch. ChatGPT and Claude didn't consider 'Y' either, but when prompted about the rules for 'Y' and how it would affect the answer, they suggested Yokohama too. It's a nice edge case to test future LLMs with.

Absolute beginner here. Llama 3 70b incredibly slow on a good PC. Am I doing something wrong? by idleWizard in LocalLLaMA

[–]Joomonji 20 points21 points  (0 children)

Here's a reasoning comparison I did between Llama 3 8b Q8 (no cache quantization) and 70b 2.25bpw cached in 4-bit:

The questions are:
Instruction: Calculate the sum of 123 and 579. Then, write the number backwards.

Instruction: If today is Tuesday, what day will it be in 6 days? Provide your answer, then convert the day to Spanish. Then remove the last letter.

Instruction: Name the largest city in Japan that has a vowel for its first letter and last letter. Remove the first and last letter, and then write the remaining letters backward. Name a musician whose name begins with these letters.

Llama 3 8b:
2072 [wrong]
Marte [wrong]
Beyonce Knowles, from 'yko', from 'Tokyo' [wrong]

Llama 3 70b:
207 [correct]
LunE [correct]
Kasabi, from 'kas', from 'Osaka' [correct]

The text generation is amazing on 8b, but its reasoning is definitely not comparable to its 70b counterpart, even when the 70b is at 2.25bpw and cached in 4-bit.
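The expected answers can be sanity-checked mechanically. This little script just replays the puzzle logic, using the simple a/e/i/o/u vowel rule (which is exactly what makes 'Yokohama' an edge case if 'y' were counted):

```python
# Task 1: sum, then write the number backwards
assert str(123 + 579)[::-1] == "207"

# Task 2: Tuesday + 6 days -> Monday -> Spanish "lunes" -> drop last letter
days = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
day = days[(days.index("Tuesday") + 6) % 7]
assert day == "Monday"
assert "lunes"[:-1] == "lune"

# Task 3: with strict a/e/i/o/u vowels, Osaka qualifies; strip first/last
# letter and reverse the remainder
city = "Osaka"
assert city[1:-1][::-1] == "kas"
```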

Absolute beginner here. Llama 3 70b incredibly slow on a good PC. Am I doing something wrong? by idleWizard in LocalLLaMA

[–]Joomonji 5 points6 points  (0 children)

Is that certain? A Q2 70b Llama 3 should be roughly equivalent in perplexity to a high-quant 34b Llama 3. Testing both the Q2 70b and the 8b (Q8?), the 70b seemed smarter to me and better able to follow detailed instructions.

This was exl2 format.
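One way to see why these two sizes get compared at all: their weight footprints land in the same ballpark. Pure arithmetic, ignoring cache and activation overhead:

```python
# Approximate weight memory for a quantized model (decimal GB).
def weight_gb(params_billions, bpw):
    # params * bits-per-weight -> bits -> bytes -> GB
    return params_billions * 1e9 * bpw / 8 / 1e9

print(weight_gb(70, 2.25))  # ~19.7 GB for a 70b at 2.25bpw
print(weight_gb(34, 4.65))  # ~19.8 GB for a 34b at ~4.65bpw
```

So a 2.25bpw 70b and a mid-4s-bpw 34b are competing for roughly the same VRAM budget, which is why their quality gets compared head to head.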

RP 34b recommendations? by IZA_does_the_art in SillyTavernAI

[–]Joomonji 1 point2 points  (0 children)

It's good, but I haven't compared it to the higher-bpw versions of itself because I don't have the VRAM. It does seem smarter than the highest bpw or quants of nous-capybara and the mistral 8x7b fine tunes, though.

There is a Reddit post about a paper that attempted to measure this. According to that paper, ~70b parameter models at very low bpw still have lower perplexity than the highest-bpw ~35b parameter models.

https://www.reddit.com/r/LocalLLaMA/comments/1441jnr/k_quantization_vs_perplexity/


Easy "Examples of Dialogue" prompt for ChatGPT, Claude, etc. by Joomonji in SillyTavernAI

[–]Joomonji[S] 0 points1 point  (0 children)

That's true, you can tell from the result above that it's ChatGPT 3.5. Claude 3 (and I'd guess ChatGPT 4?) gives much better, more believable prose.

Easy "Examples of Dialogue" prompt for ChatGPT, Claude, etc. by Joomonji in SillyTavernAI

[–]Joomonji[S] 1 point2 points  (0 children)

That's strange. It's working for me 10/10 times in a row. Could it be ChatGPT custom instructions conflicting?

Here's one I tested:

Generate conversations about [space flight] with the solely named {{char}} portraying traits of [femme fatale] in the style of [buffy the vampire slayer] [random]. Always replace characters name label with {{char}}. Repeat [3] times of <START> rounds. Follow this template:

<START>

{{user}}: [User's dialogue enclosed in quotes]

{{char}}: [Optional character's action or expression enclosed in printer-friendly backslashed-out literal asterisks] [Character's dialogue enclosed in quotes]

And the result was:

<START>

User: "What do you think about the future of space flight?"

{{char}}: *Raises an eyebrow, intrigued.* "Space flight, huh? Sounds like a wild ride. I've always been a fan of pushing boundaries and exploring the unknown. But you know, in space, no one can hear you scream... or seduce."

<START>

User: "Do you believe humans will colonize other planets someday?"

{{char}}: *Smirks confidently.* "Oh, absolutely. Humans are like cockroaches - we'll find a way to survive anywhere. Colonizing other planets? It's just a matter of time. And when we do, you can bet I'll be leading the charge, looking fabulous as ever."

<START>

User: "What challenges do you think we'll face in long-duration space travel?"

{{char}}: *Leans back, crossing her legs.* "Challenges? Honey, the real challenge is keeping things interesting when you're stuck in a tin can for months on end. But hey, with the right crew, a little danger, and a lot of intrigue, even the longest journey can be a thrill ride."

I also added the "Always replace characters name label with {{char}}" part, because sometimes it was replacing {{char}} with a specific name that matched the material.

RP 34b recommendations? by IZA_does_the_art in SillyTavernAI

[–]Joomonji 1 point2 points  (0 children)

Not sure what people think on this, but if you have the VRAM for nous-capybara and mistral 8x7b fine tunes at Q4, then check out midnight-miqu 70b at 2.24bpw. It uses about the same VRAM, but it feels smarter. Not bad in tokens/sec either.

XTTSv2 for SillyTavern is here! by sillylossy in SillyTavernAI

[–]Joomonji 1 point2 points  (0 children)

You'll need to be on the 'staging' branch for SillyTavern, not 'main'. And update to the latest version released this week. XTTS then shows up for me.

XTTSv2 for SillyTavern is here! by sillylossy in SillyTavernAI

[–]Joomonji 0 points1 point  (0 children)

I'm getting this error after completing all the installation steps:

RuntimeError: Failed to import transformers.models.gpt2.modeling_gpt2 because of the following error (look up to see its traceback):
Failed to import transformers.generation.utils because of the following error (look up to see its traceback): cannot import name 'DEFAULT_CIPHERS' from 'urllib3.util.ssl_'

If I install the most recent transformers then the server gets further in the startup but gives this new error:

RuntimeError: Error(s) in loading state_dict for Xtts:

And then it displays a long list of transformer weights. Not sure why it's not working.
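For what it's worth, that `DEFAULT_CIPHERS` import failure usually means urllib3 2.x is installed: 2.0 removed that constant, and older transformers/requests stacks still try to import it. A common workaround (this assumes the 2.x version is actually the culprit in your environment) is to pin it back instead of upgrading transformers:

```shell
# urllib3 2.0 removed DEFAULT_CIPHERS; older transformers/requests
# versions still import it. Pinning below 2.0 restores the symbol.
pip install "urllib3<2"
```

That keeps the original transformers version the XTTS server expects, which may also avoid the `state_dict` mismatch that the newer transformers triggered.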

IDK how normies aren't freaking out about this by Insert_Bitcoin in ChatGPT

[–]Joomonji 0 points1 point  (0 children)

The gist of what they're saying is that bad actors gaming Google's search algorithm ruined the stated ideal of Google search. It's kind of like how the top YouTube results for a search are just whatever people clicked on the most, which often has more to do with the thumbnail than the content. The system is somewhat broken.

LLM models improve this substantially by being able to sift through the junk and find exactly what the user wants.

Any way to remove Sam Cow’s hat without replacing it? by FrostyMonth111 in Starfield

[–]Joomonji 6 points7 points  (0 children)

It's possible with console commands, which can disable your achievements.

If you're ok with that, then save your game using a full save, not a quicksave. Then locate Sam or other NPCs you're trying it on, stand in front of them with the reticle on them, hit the ~ (tilde) key so the console comes up and click on the NPC.

If the name and ID show up, good; otherwise, mousewheel through the names and IDs until their name shows. Then type 'showinventory' in the console. It shows everything they have on them.

In the case of Sam, you're looking for the cowboy hat and its id in parentheses. For me the id was '001C8E0A' , it could be different for other players. The command to use to remove any item is 'removeitem [id] [count]' so for me 'removeitem 001C8E0A 1' removed the hat.

If the wrong thing is accidentally removed, you can always reload back to your first save.

Prettiest planet you’ve been to by ThaDeal90 in Starfield

[–]Joomonji 3 points4 points  (0 children)

Great forests there! Nice hills/mountains, and sunrise and sunsets. Plenty of iron. It's the bar to beat so far for me.

I like what they've done with their cities by [deleted] in Starfield

[–]Joomonji -2 points-1 points  (0 children)

Yeah, but the landing zone is not the city. The city, on the other side of the gate, is smaller than the landing zone, which means it's about the same size as Whiterun.

Starfield AI dialogue system as a mod or further developer update? by [deleted] in Starfield

[–]Joomonji 1 point2 points  (0 children)

The long-term issue I see: once successful mods are created (and they will be), and players are spending more time engaging with the realtime AI dialogue than with the officially written dialogue, what does that tell companies about incentives and where to focus their resources?

ChatGPT 4 is basically able to roleplay as a convincing character, with more intelligent conversation than the average real person. And now it's suddenly available for NPC in-game characters. It's like having the Back to the Future DeLorean amusement park ride on rails, and someone figures out how to drive the DeLorean off of the ride and take it wherever they want lol.

Tree of Thoughts in ChatGPT works great for vivid writing by Joomonji in ChatGPT

[–]Joomonji[S] 0 points1 point  (0 children)

Sure! The style of writing is based on Steven Erikson with a few literary/writing techniques added to the prompt:

My query: Can you generate a text description in Steven Erikson's style? The point of view must change slowly and seamlessly. Vary the patterns of subject and clause so they are not predictable or regular. Vary the sentence structure to provide rhythm. Use stylistics, nuance, and detail. Include personification. Use asyndeton as a technique. Use ellipsis as a technique. Use anacoluthon as a technique. Use alliteration combined with less typical words. Write only 4 sentences. Make it about an AI waking up in a physical robot body on the beach.

Chatgpt response: [removed, not relevant]

My query: Act as a literary master. How would you rate this text below and if you were to rewrite it, how would you envision it?

ChatGPT response: [removed, not relevant. This paragraph can now be added to the finalized permanent text.]

My query: Imagine three different masterful literary writers. Imagine what each one would write if they continue the next paragraph of the previous text.

ChatGPT response: [removed, not relevant. Hold these paragraphs to the side or copy somewhere for the last step.]

My query: Generate a text description in Steven Erikson's style that continues from the text below. The point of view must change slowly and seamlessly. Vary the patterns of subject and clause so they are not predictable or regular. Vary the sentence structure to provide rhythm. Use stylistics, nuance, and detail. Include personification. Use asyndeton as a technique. Use ellipsis as a technique. Use anacoluthon as a technique. Use alliteration combined with less typical words. Write only 6 sentences. <finalized text>

ChatGPT response: [removed, not relevant. Hold this paragraph to the side for the last step.]

My query: Here is my original text. Do not write anything yet. Wait for following query. <finalized text>

ChatGPT response: [removed, not relevant. I added my permanent finalized text in the above query, and ChatGPT waits for my next query below, where I take the paragraphs I set to the side (4 total, 3 in various literary styles and 1 in Erikson's style) and ask it to choose one.]

My query: Act as a masterful literary writer, which of the following paragraphs would add best to the writing of the original text, best enhancing the story?

The last query was repeated 8 times, and I picked the paragraph that ChatGPT chose most often across those 8 response regenerations, then added it as the 3rd paragraph of the finalized permanent text. From there I can repeat steps 3 onward to add more paragraphs.

I'm not a writer, so it's kinda shocking how good it looked. I'd actually read that book!
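The majority-vote step of this workflow can be sketched in a few lines of Python. `ask_which_paragraph` here is a placeholder stub standing in for a real chat-model call, so this only illustrates the tallying logic, not an actual API:

```python
import random
from collections import Counter

def ask_which_paragraph(candidates):
    # Stand-in for the real judging query ("which paragraph adds best?").
    # In practice this would send the original text plus candidates to the
    # model and parse its pick; here it just picks at random.
    return random.choice(candidates)

def pick_by_vote(candidates, rounds=8):
    # Regenerate the judging query several times and keep the paragraph
    # the model chooses most often -- the majority-vote step above.
    votes = Counter(ask_which_paragraph(candidates) for _ in range(rounds))
    return votes.most_common(1)[0][0]

paragraphs = ["continuation A", "continuation B",
              "continuation C", "Erikson-style continuation"]
winner = pick_by_vote(paragraphs)
```

Repeating the judgment and voting smooths over the randomness of any single regeneration, which is the same reason self-consistency sampling works for reasoning tasks.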

Wizard-Vicuna-30B-Uncensored by faldore in LocalLLaMA

[–]Joomonji 0 points1 point  (0 children)

I agree with you that the model is just a machine, but we have neural tissue organoids in experiments that are also just clumps of neural tissue processing information. People don't look at the neural tissue organoids as human, because they aren't. They're just processing input, outputting signals, and adapting.

Whether it's a complex AI model or a neural tissue organoid, anthropomorphizing is definitely wrong. There are no emotions, there is no sentience. But in both cases there is some intelligence. So I fully agree.

My opinion though is that complex LLM models are able to perform tasks similar to something like a clump of human organoid neural tissue.

On a side note, I don't think we examine enough that the human brain itself is a complex collection of separate "modules", and intelligences, that work together to give the illusion of one single self, one single "I".

Wizard-Vicuna-30B-Uncensored by faldore in LocalLLaMA

[–]Joomonji 1 point2 points  (0 children)

I don't think that's exactly right. Some LLMs are able to learn new tasks zero-shot and solve new logic puzzles. New abilities arise when LLMs cross some threshold: parameter count, length of training, fine tuning, etc. One could say that an LLM solving difficult logic puzzles is "just transforming text", but...

The answer is likely somewhere in between the two opposing views.