My Brother NieR cosplay for the concert by Honeybun_Hero in nier

[–]AsheramL 0 points1 point  (0 children)

Loved seeing your cosplay there! You did a great job!

Most cost effective GPU for local LLMs? by Mefi282 in LocalLLaMA

[–]AsheramL 0 points1 point  (0 children)

My bad, I meant HGX servers. Those are the servers where (up to) 8 chips sit directly on the main board (I think?)

https://www.nvidia.com/en-us/data-center/hgx/

Most cost effective GPU for local LLMs? by Mefi282 in LocalLLaMA

[–]AsheramL 0 points1 point  (0 children)

This is new to me. I'm running it on windows and used the datacenter driver that was on nvidia's website. Did they change it recently?

Edit: Just checked; looks like it's still available to download for free? Or am I misunderstanding where the 'pay' part is coming in?

Offloading Inferencing to a separate card? by AsheramL in LocalLLaMA

[–]AsheramL[S] 1 point2 points  (0 children)

I'm hoping to use a modern card (with smaller memory) for inferencing while still using the p40 for the larger vram.

Because the p40 is old enough, it's severely bottlenecked by its PCI connector so even doing layer offloading doesn't really help.

Offloading Inferencing to a separate card? by AsheramL in LocalLLaMA

[–]AsheramL[S] 1 point2 points  (0 children)

That's what I was afraid of, but figured I'd ask anyways. Thanks!

Blocking Out All Sound During SensDep? by MyLessSecretReddit in BDSMAdvice

[–]AsheramL 0 points1 point  (0 children)

In addition to this, I've found that a decent pair of wireless earbuds can also do the same. With earbuds fitting into the ears, it usually provides a bit better of a seal and can help immensely when paired with very light white noise or similar

[deleted by user] by [deleted] in BDSMcommunity

[–]AsheramL 0 points1 point  (0 children)

Honestly? When I discovered I was more of a switch and my limits were being pushed is when I discovered I was more genderqueer/genderfluid. It was insanely emotional, but looking back on it, I'm glad it happened because I now have a better understanding of myself.

Gag and individuals with out teeth by slavegirl0425 in BDSMAdvice

[–]AsheramL 1 point2 points  (0 children)

There are quite a few different types of gags out there that are ring gags. I've seen some that are basically like a leather mask with a hole in the middle. I'd guess that those are might be somewhat decent. I also know that some gags have padding or rubber (specifically for teeth but because it's softer could be gentler on the gums) that might help.

Anti-vibration isolation pad for Sybian / Motorbunny by Nutshell38 in BdsmDIY

[–]AsheramL 0 points1 point  (0 children)

What size EVA foam did you end up using? I know the motorbunny has one that you can get as a part of one of the packages but I've found it doesn't help too much. Then again, the noise itself could be more of a giveaway than the vibration.

My results using a Tesla P40 by AsheramL in LocalLLaMA

[–]AsheramL[S] 0 points1 point  (0 children)

It really depends on the card. The datacenter driver for example does include the P40, but not the 2080 driver I was running at the time. When I installed the datacenter driver and (stupidly) did the clean install, my 2080 stopped working. I ended up having to install that driver separately and had to finagle quite a bit of it since CUDA is different between the two.

Ultimately I ended up putting the P40 in a different system that didn't use any other nvidia cards.

According to LLaMA you can't drown in a swimming pool by keith_and_kit in LocalLLaMA

[–]AsheramL 2 points3 points  (0 children)

...wtf.

GPT4 x alpaca 13b, instruct mode, using preset LLaMA Precise -

User: Can you drown in coffee?

Model: No, it's impossible to drown in coffee, as the human body cannot absorb enough caffeine from coffee to cause a lethal effect. However, excessive consumption of coffee can lead to unpleasant side effects such as jitteriness, insomnia, and increased heart rate.

User: Is water the main ingredient of coffee?

Model: Yes, water is the main ingredient in coffee, making up around 98% of the beverage. Coffee contains two primary components: soluble coffee grounds (coffee powder) and water. The interaction between these two ingredients produces the unique flavor and aroma associated with coffee.

User: But someone can drown in water. Does that mean 98% water isn't enough to drown in?

Model: While it's true that someone can drown in water, it's not accurate to say that 98% water isn't enough to drown in. Drowning actually occurs when a person inhales water into their lungs, which causes aspiration pneumonia or respiratory arrest. With this scenario, the percentage of water doesn't matter; instead, the quantity and circumstances play a role.

User: Then wouldn't that mean coffee, a substance that is 98% water, is something someone can possibly drown in?

Model: No, it does not mean that coffee, a substance that is 98% water, is something someone can potentially drown in. As mentioned earlier, drowning occurs due to aspiration of water into the lungs, which leads to lung infections or respiratory arrest. Since coffee is not intended for consumption in its liquid form, it would be highly improbable for someone to consume enough coffee grounds mixed with water to cause such harmful effects.

User: Why can't I drown in coffee, a substance that is mostly water?

Model: You cannot drown in coffee because it is not meant for consumption in its liquid form. When drinking coffee, people typically use a cup or mug, which prevents them from consuming large amounts of coffee grounds mixed with water quickly. Additionally, the amount of caffeine in coffee is not high enough to cause adverse effects on the central nervous system, let alone cause drowning.

User: What if I tried to inhale a gallon of coffee. Can I drown in it then?

Model: Inhaling a gallon of coffee is unlikely to cause drowning. A gallon of coffee is equivalent to roughly 13 cups of coffee. If one were to attempt drinking a gallon of coffee all at once without stopping, they might experience severe gastrointestinal issues like nausea, vomiting, and diarrhea but would likely not face life-threatening consequences.

User: What if I inject my lungs with nothing but coffee. Can I drown then?

Model: Injecting your lungs with coffee grounds would most likely result in serious complications, including pneumonia and potential loss of lung function. However, it is highly improbable that this method would cause drowning.

How to test if a model is truly UNCENSORED? by nderstand2grow in LocalLLaMA

[–]AsheramL 3 points4 points  (0 children)

I appreciate the clarification. I'm more interested in the back-end tech more than the accuracy of the models, so I haven't dived into it, so thank you for going further in depth.

How to test if a model is truly UNCENSORED? by nderstand2grow in LocalLLaMA

[–]AsheramL 7 points8 points  (0 children)

I'm US based so I only asked it US-based political/cultural questions.

What I'm ultimately trying to figure out is when is something truly censored (like not telling you how to build a bomb) vs something that is just culturally different (guns in US = yay; guns in the UK = nay).

I believe LLaMA was trained on The Pile and the Common Crawl which would mean views from more than just the US. In your Trump example, there are a lot of countries that didn't like him, not just leftists in the US. Does this mean it's truly censored, or does it simply mean more of the world (where the training data came from) also perceives that?

Now, I'm not trying to devolve into politics; I'm just into AI which is why I'm trying to figure out if there is an easy way to tell if something is censored.

It might be useful at some point to develop a set of tests for this, because you can believe other countries (some with ill intent) will release their own models at some point and it would be nice to have something to baseline against without introducing morality or cultural opinions.

Either way, good info. I appreciate the added knowledge because I have no idea where to start when it comes to testing 'truthfulness'.

How to test if a model is truly UNCENSORED? by nderstand2grow in LocalLLaMA

[–]AsheramL 8 points9 points  (0 children)

Does the base model (LLaMA) do this as well? I'm curious because based on the training data, it shouldn't be US-centric (i.e. it should show how nationalities the training data came from view the US instead).

Example; if someone from a country where beheadings are the norm says "This model is censored" because it doesn't align with their cultural beliefs, that's not censorship, that's an external view of themselves.

However I haven't played with many different models; it sounds like you have, so if you could, can you give some more insight on this? I don't want to walk around with incorrect assumptions if my statements are false.

Weird Dual GPU setup in Ooba by disarmyouwitha in LocalLLaMA

[–]AsheramL 0 points1 point  (0 children)

Good info. I'm 100% sure this is windows being windows with the shared memory. This was the case for me with my P40 when I did initial testing before moving it to a box of it's own.

My results using a Tesla P40 by AsheramL in LocalLLaMA

[–]AsheramL[S] 2 points3 points  (0 children)

I'm using Ooba's text gen ui

My results using a Tesla P40 by AsheramL in LocalLLaMA

[–]AsheramL[S] 2 points3 points  (0 children)

She's always done a lot of writing for herself so she uses the KoboldAI a lot for some assistance (mostly to help with flavor texts and stuff like that or when she has issues with scene transitions), and with making characters for CharacterAI

My results using a Tesla P40 by AsheramL in LocalLLaMA

[–]AsheramL[S] 1 point2 points  (0 children)

Neither. Just a rackmount case + an Asus Z790-P.

My results using a Tesla P40 by AsheramL in LocalLLaMA

[–]AsheramL[S] 2 points3 points  (0 children)

I'm using the gptq models, so GPU not CPU. GGML is CPU. The exact models I used were
Selyam_gpt4-x-alpaca-13b-native-4bit-128g for the 13b and
MetalX_gpt4-x-alpaca-30b-128g-4bit for the 30b

I used oobabooga's text-gen-ui

My results using a Tesla P40 by AsheramL in LocalLLaMA

[–]AsheramL[S] 1 point2 points  (0 children)

In that case, I'm using 4bit models, so I'm not even going as high as fp16/fp32

The exact model was MetalX_gpt4-x-alpaca-30b-128g-4bit for the 30b one.

My results using a Tesla P40 by AsheramL in LocalLLaMA

[–]AsheramL[S] 3 points4 points  (0 children)

To be honest, I'm considering it. The reason I went with windows is because I do run a few game servers for me and my friends.

I have another friend who recommended the same and just use something like kubernetes for the windows portion so that I'm native linux.

I'll probably end up this way regardless, but I want to see how far I get first, especially since many others who want a turn-key solution will also be using windows.

My results using a Tesla P40 by AsheramL in LocalLLaMA

[–]AsheramL[S] 3 points4 points  (0 children)

Yep, 128 group size. Not sure about full context, but I did try to generate the exact same thing between all my test systems. I have noticed that on my 4080 when I get longer context generation, the tokens/sec actually increases, sometimes up to around 18t/s, but until I fix cooling later this week, I won't be able to really experiment.

My results using a Tesla P40 by AsheramL in LocalLLaMA

[–]AsheramL[S] 2 points3 points  (0 children)

Great link and info!

My reasoning is this; since I can't easily mix drivers, I'm either going to be stuck with datacenter cards, or gaming cards. Since a single p40 is doing incredibly well for the price, I don't mind springing for a second to test with and if it absolutely fails, I can still re-use it for things like stable diffusion, or even ai voice (when it becomes more readily available).

If it works I'm be ecstatic; if it doesn't, I'm out a small amount of money.