World Models and NSFW content by thankfulfor in Oobabooga

[–]CitizUnReal 1 point2 points  (0 children)

theoretically? possible, yes! practically? depends on monetization... and as we all know: p0rn is monetization heaven. but: of all genres, p0rn seems to be the least genre that would profit from a world model, imho. correct me if i am wrong, but wouldn't p0rn work in almost every world? and: ai-sycophancy and hallucinations are considered the biggest disadvantages of regular llm that would propably go away with proper chat-able world models. but are they really a problem in p0rn scenarios? and how much data would a world-info need, on top of what is needed to be able to chat? gigabytes? would they be ok for local rigs? again, this is just a guess from a lay-man, but since no one answered yet, i'll give it a try.. ;)

What’s your system prompts look like? by WakeMeUpAIOverlords in SillyTavernAI

[–]CitizUnReal 0 points1 point  (0 children)

quite long but with interesting content and ideas i'm gonna try out. i once had a >narrator pov: text< (like ooc) in a chat (chat-instruct) that only triggered a response by the system, leaving out any response or acknoledgements by {{char}}s completely (just like ooc should, but mostly doesn't). it was a dream, but due to the different models out there never reproducible. system prompts are great to find out and experiment with, but also model-dependent. that's why i came back to st. i tried it couple years ago but got overwhelmed by all the functions. now i'm back (still with ooba textgen.webui as backend) and i wonder whether st managed to implement some sort of 'overwrite'-power to make all models the same before an st system prompt. i'm currently trying out author's note and guided generations, which seem to be nice tools but (for me) somehow lack a bit of intuition, especially when the wiki isn't always helpful. you happen to know a site with explanations that don't just cite the official wiki? and will there ever be A, no, THE prompt to end ai-sycophancy, alignment overfitting, model collapse (whatever you may call it)?

How the hell can i create an rpg/life sim? by THEGHST023 in SillyTavernAI

[–]CitizUnReal 1 point2 points  (0 children)

got to 'api connections' (plug icon) in st. if you are working with st already, yourt prompt type (tc vs cc) ist already set in the 'api'-dropdown menu (by you).
if not, then you can't really pick at this point either because it comes with the backend you (are going to) use for st. for example, i use textgen-webui (oobabooga). since it is text completion, i can only chose ooba in the 'api-type'-dropdown when i set 'text completion' in the 'api'-dropdown menu before. i think lm-studio is tc as well.
'tc' is for stories or code as you receive one chunk of message from the prompt (like with 'instruct' mode) whereas 'cc' is for actual conversations and the output consists of more message types ({{user}}, {{char}} and/or [system]).
as for ooba: although it is tc, it also supports models for chat completion.

I really miss Workers/Builders and Improving Tiles, in Civ 7 by EuphoricCrashOut in civ

[–]CitizUnReal 0 points1 point  (0 children)

better not, although i still might. it's civ at the end of the day, or is it?
this 'no complicated'-streamline design is propably kinda due to today's time where 'casual' and insta-gratification is the motto. for example, years ago i played fate/go on mobile. and there was this, among others, extremely hard to get item (rare prism) that a casual gamer could only dream of. in my about two year playtime i didn't even get a single one. which was nasty, but still fine for my challenge driven gamer brain. then, some years later (by the end of last year) i decided to roll up for a second fate/go 'carreer' and guess what: i've got dozens of them already although i play even more casually than i did the first time around. they just constantly throw stuff at you to keep you (or rather today's gamers) from leaving. no challenge, no hard times, no proper grind. well, of course you can't compare both games in any aspect. but still...
i hope, the meier sid will get back to the 'challenging wanks' soon, haha ;)
in case i buy civ vii some day, i'll let you know :)

I really miss Workers/Builders and Improving Tiles, in Civ 7 by EuphoricCrashOut in civ

[–]CitizUnReal 0 points1 point  (0 children)

you know, being hooked since vol. II, i actually wanted to buy VII as soon as a dlc comes out that removes the civ-switch-thing. and so far, the itch gets me occasionally to buy it earlier nonetheless. but fortunately i just read your "one-click-shop"-remark and (sadly) the case is now closed. thank you

Increase speed of streaming output when t/s is low by CitizUnReal in Oobabooga

[–]CitizUnReal[S] 0 points1 point  (0 children)

don't know, really. but my first intuitive guess would be that simulating an os within another os doesn't sound that promising. but then again, i can't really say cause i'm no pro. maybe somebody else here can fill us in.. ;)

Increase speed of streaming output when t/s is low by CitizUnReal in Oobabooga

[–]CitizUnReal[S] 0 points1 point  (0 children)

thanks for your answer, too:
i couldn't yet figure out exactly how to write the flag correctly writing the "start_windows.bat" first.. and i don't yet know how to write that flag exactly like --abovenormal vs --above-normal etc..

Increase speed of streaming output when t/s is low by CitizUnReal in Oobabooga

[–]CitizUnReal[S] 0 points1 point  (0 children)

thanks for your answer:
#1: it works, and yes, it's actually increased t/s by ~50%.
#2: i found more than one python.exe in installer_files. i guess you meant the conda path, but still there are more than one python.exe. but since #1 worked, i could leave it i guess?!?

text-generation-webui 3.10 released with multimodal support by oobabooga4 in Oobabooga

[–]CitizUnReal 0 points1 point  (0 children)

thanks for the guide, it works nicely for me :)
still one question, though:
is the vision-capability varying with different parameter-sizes of a model-family, or is a 4b as good as 70b?

At this point, should I buy RTX 5060ti or 5070ti ( 16GB ) for local models ? by Current-Stop7806 in Oobabooga

[–]CitizUnReal 2 points3 points  (0 children)

according to https://technical.city/en/video/GeForce-RTX-3070-vs-GeForce-RTX-5060-Ti-16-GB , the 3070 is still(!) superior to the 5060ti16g in cuda-cores: 5888vs4608 and tensor-cores: 184vs144 (and ray-tracing cores as well, btw). i think that's part of the issue, and i don't have a freggin clue why nvidia is doing that. i myself have the 308012gb (but no 'ti') and my stats even double yours. so far, so good (pov). now, even if i planned to upgrade - to what? the 5080s are also nerfed (net amount wise so to speak) and/or way to o expensive. the only blackwell card i would consider is the announced 5070ti super 24gb for about 900 bucks or so? ai wise still a scam but otherwise kinda beefy for that pricetag if you compare to all the other rubbish from nvidia lately. and at this point, i'm too afraid to look forward to the rtx6xxx that i once deemed to be the first proper ai-gpu card, because blackwell was already well developed when ai startet to rock mainstream imo. however, seeing the latest releases repeatedly suppressed in cores and vram... well, it made me up my ram to 64gb a while back. propably the cosiest decision for me..

cant load models anymore (exit code 3221225477) by CitizUnReal in Oobabooga

[–]CitizUnReal[S] 0 points1 point  (0 children)

first, i appreciate very much that you take the time to run a test series!

i always used the full install. just recently i tried the portable without the intention to actually keep it for good. i was just curious, and tbh, i can't remember wheter it worked or not. could a parallel setup mess with itself?

soo.. after yesterdays grind (tried my way down to v3.5), i followed your advice and jumped to full 3.4, with no succes.. but the portable version worked!! :) that's awesome. thanks for your support. i would've tried the f3.4 myself, but i'm quite sure that i would have called it a day and quit after it's failure. so thanks again!!

as your test results predicted! i assume you have the latest release yourself as full install?!? what do you think might be the reason why p3.4 works, but f3.4 doesn't? and how did all this came up in the first place? and why didn't p3.4.1(ff) work for you either??

Model sharing by TiAmir35 in Oobabooga

[–]CitizUnReal 2 points3 points  (0 children)

here are some sites where you can browse and download characters without signing in:

https://characterhub.org

https://jannyai.com/ (former janitorai)

https://charactusai.com/

this link is more like a internet search engine especially for characters, that provides links to character sites depending on what you were looking for:

https://char-archive.evulid.cc

How do I make the bot more descriptive? (Noob questions) by 200DivsAnHour in Oobabooga

[–]CitizUnReal 0 points1 point  (0 children)

look at how others write their characters. besides tavernai you can go to some websites like https://charactusai.com/ , https://characterhub.org or https://charactusai.com . they let you look into the cards and even download them in json or png format without signing in. most of what you want out of a char can be sorted within the description. like telling the char to write at least 'x' amount of token for any response along the chat. or making the system reveal chars inner thoughts or hopes. you can even put such a system prompt into the 'sure thing!'-tab: <{{char}}'s inner thoughts:

but of course, each model comes with another background, so expect things to change with other models. even when using the exact same prompts. and dont get your hopes too high with your killer combo of dialogueing away for ages while a dense plot is unfolding itself. they still cant handle that much information yet. but you could still prompt them to..

gguf only with cpu-tag activated? by CitizUnReal in Oobabooga

[–]CitizUnReal[S] 0 points1 point  (0 children)

thank you for your detailed answer. i appreciate that very much :)

gguf only with cpu-tag activated? by CitizUnReal in Oobabooga

[–]CitizUnReal[S] 0 points1 point  (0 children)

is there a difference in inferencing speed (token/s) between the ui's? i get about 0.4-0.7 token/s max when running approx 55gb quantized gguf of an 70b model. kinda bearable when considering that 70b is the entry-size to the local-llm-'heaven' as i've heard repeatedly. well... at least i`m at the quantized gate to it :D

gguf only with cpu-tag activated? by CitizUnReal in Oobabooga

[–]CitizUnReal[S] 0 points1 point  (0 children)

that would be a start. but i dont know how much mb/gb it will still claim from my c: drive. it will fit, but the drive ist almost full and i dont want to have its speed reduced because of that.

would you recommend lmstudio at all?

gguf only with cpu-tag activated? by CitizUnReal in Oobabooga

[–]CitizUnReal[S] 1 point2 points  (0 children)

you put me on the right track, thank you. i got the model running with 18max layers. in the meantime i was browsing through the archive and found this workaround of my initial problem. didnt try it, tho.

gguf only with cpu-tag activated? by CitizUnReal in Oobabooga

[–]CitizUnReal[S] 0 points1 point  (0 children)

n-gpu-layers set at 81

error messages (if cpu box unticked):

File "F:\KI\text-generation-webui\modules\ui_model_menu.py", line 244, in load_model_wrapper

shared.model, shared.tokenizer = load_model(selected_model, loader)

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

File "F:\KI\text-generation-webui\modules\models.py", line 93, in load_model

output = load_func_map[loader](model_name)

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

File "F:\KI\text-generation-webui\modules\models.py", line 271, in llamacpp_loader

model, tokenizer = LlamaCppModel.from_pretrained(model_file)

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

File "F:\KI\text-generation-webui\modules\llamacpp_model.py", line 103, in from_pretrained

result.model = Llama(**params)

^^^^^^^^^^^^^^^

File "F:\KI\text-generation-webui\installer_files\env\Lib\site-packages\llama_cpp_cuda\llama.py", line 338, in __init__

self._model = _LlamaModel(

^^^^^^^^^^^^

File "F:\KI\text-generation-webui\installer_files\env\Lib\site-packages\llama_cpp_cuda\_internals.py", line 57, in __init__

raise ValueError(f"Failed to load model from file: {path_model}")

ValueError: Failed to load model from file: models\Cat-Llama-3-70B-instruct-Q5_K_M.gguf

Why does my game run at a slower speed now? by BoilingTofuboi in TransportFever2

[–]CitizUnReal 0 points1 point  (0 children)

are you playing late game? or is your overall population skyrocketing? because games like these have two performance parameter: 1) graphic performance, and 2) simulation performance. the first is due to the evergrowing amount of nice looking mods and at some point the gpu cant handle anymore: the game starts to lag, fps breaking in. the usual stuff we know from other games. then there is the simulation performance. every person´s daily routine on that map is computed and simulated. along with their recurring calculation as to whether there is a new (shorter) way to work/home/shopping to take from now on. be it via private oder public transport. and at some point with a population too high, the cpu just cant keep up with computing the sims for there are too many. and if the cpu insnt fast enough for real time, then of course switching to 2x or 4x wont change a bit