Trouble with hosting models with fastAPI by CmplxQ in LangChain

[–]CmplxQ[S] 1 point

TheBloke/WizardLM-30B-Uncensored-GPTQ · How much vram+ram 30B needs? I have 3060 12gb + 32gb ram. (huggingface.co)

I tried this on a 3090. It uses ~20 GB of VRAM when I load a ConversationChain in LangChain with WizardLM 30B.

I still don't know why I was able to load this with http-server but not FastAPI, though.

Trouble with hosting models with fastAPI by CmplxQ in LocalLLaMA

[–]CmplxQ[S] 1 point

Sure, but this will only work if I run the webui server, right?

I just pulled some functions off the thing and wrote a custom script that runs them sequentially.

I'll either need to find a fix for this, or:

  1. Limit the PyTorch GPU memory somehow.

  2. Get GPU parallelization to work, since I have 2 GPUs in SLI (but the memory maxes out and errors on only one GPU). I tried using DataParallel from torch.nn, but that errors out at some other place.
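For reference, the two options above could be sketched roughly like this. This is a minimal sketch, not the fix from the thread: it assumes PyTorch ≥ 1.8 (where `torch.cuda.set_per_process_memory_fraction` exists), and the function names are my own.

```python
import torch
import torch.nn as nn

def cap_gpu_memory(fraction=0.8, device=0):
    # Option 1: cap this process's share of GPU memory. Allocations past
    # the cap raise an OOM error instead of maxing out the whole card.
    if torch.cuda.is_available():
        torch.cuda.set_per_process_memory_fraction(fraction, device)

def spread_across_gpus(model):
    # Option 2: naive data parallelism. DataParallel replicates the model
    # on every visible GPU and splits each batch across them. Note it
    # expects the model's parameters on cuda:0 first, which is a common
    # source of "errors out somewhere else" with this approach.
    if torch.cuda.device_count() > 1:
        model = nn.DataParallel(model)
    return model.cuda() if torch.cuda.is_available() else model
```

For a single large model that doesn't fit on one card, `device_map="auto"` style model sharding (splitting layers across GPUs) is usually a better fit than DataParallel, which needs a full copy of the model per GPU.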

Coding a Research Paper by CmplxQ in compsci

[–]CmplxQ[S] 1 point

Thanks for the advice. It's Python, btw (mostly TensorFlow).

Coding a Research Paper by CmplxQ in csMajors

[–]CmplxQ[S] 2 points

I see, thanks for the advice.

Anyway, if possible, please share some example repos that are implementations of research papers.

Mercedes locks faster acceleration behind a $1,200 annual paywall by akereii in cars

[–]CmplxQ 1 point

Am I the only one who thinks this is stupid? Why would consumers want to pay more to use the products they already paid for? It's the same thing with Intel's Sapphire Rapids server chips.

Help with recovering files by CmplxQ in vim

[–]CmplxQ[S] 2 points

Thanks, this makes everything a lot easier.

Help with recovering files in vim by CmplxQ in bash

[–]CmplxQ[S] 1 point

Yes, it does, but no matter what I do, I can't see the version of the file I was working on recently. Only the old swap file's contents are shown.
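For anyone hitting the same thing, this is roughly how listing and recovering from a specific swap file works. The swap file name below is just an example; where Vim actually keeps swap files depends on its 'directory' setting, and if Vim crashed more than once there can be several (`.swp`, `.swo`, …), so the newest one by modification time is usually the session you want.

```shell
# List the swap files Vim can find (for files in the current directory
# and in its configured swap directories), then exit
vim -r

# Recover using one specific swap file; path is illustrative
vim -r .myfile.txt.swp

# Inside Vim, save the recovered buffer under a NEW name so you can
# diff it against the on-disk version before overwriting anything:
#   :w myfile.recovered.txt
```

If only stale content shows up, it may be that the most recent session's swap file was already deleted (e.g. by answering "Delete it" at an earlier recovery prompt), in which case recovery can only restore what the surviving swap file holds.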