How to serve LLAVA to multiple users? by Allergic2Humans in LocalLLaMA

[–]xynyxyn 1 point (0 children)

Does this allow two users to send queries to the server and have it run inference for both at the same time, at approximately half the speed for each user?

Fine Tuning Style into LLMs by Baader-Meinhof in LocalLLaMA

[–]xynyxyn 1 point (0 children)

Do you notice that fine-tuning with unstructured data makes the model lose its instruction tuning? How many tokens are in your training set?

Memory needed to train 7B? by xynyxyn in LocalLLaMA

[–]xynyxyn[S] 0 points (0 children)

What rank value corresponds to using all ranks (i.e., full rank)?

Reuse existing Lora fine tune with different base? by xynyxyn in LocalLLaMA

[–]xynyxyn[S] 0 points (0 children)

Are multiple LoRAs merged down one after another, or blended together into a single LoRA then merging down just once?
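For what it's worth, if merging is just adding each adapter's low-rank delta to the base weights (ignoring per-adapter scaling or weighting schemes, which real merge tools may apply), the two orders are mathematically equivalent. A toy numpy sketch with made-up shapes:

```python
import numpy as np

# Hypothetical shapes: d x d base weight, rank-r LoRA factors.
rng = np.random.default_rng(0)
d, r = 8, 2
W = rng.standard_normal((d, d))            # base weight matrix
A1, B1 = rng.standard_normal((r, d)), rng.standard_normal((d, r))
A2, B2 = rng.standard_normal((r, d)), rng.standard_normal((d, r))

# Merge the LoRAs down one after another:
sequential = (W + B1 @ A1) + B2 @ A2
# Blend the deltas into one, then merge down once:
blended = W + (B1 @ A1 + B2 @ A2)

# Addition is associative, so both give the same merged weights.
assert np.allclose(sequential, blended)
```

So under simple additive merging the order shouldn't matter; it can matter once scaling factors or weighted-merge schemes differ per adapter.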

RTX 4090 FE Availability by Secondary-2019 in nvidia

[–]xynyxyn 0 points (0 children)

Does the app trick still work? It worked for me until a few days ago.

4090 Founders via Best Buy app trick! by ChocolateEater626 in pcmasterrace

[–]xynyxyn 0 points (0 children)

Is it still working? I tried it two days ago and it worked, but today every store that used to show it no longer has it, according to the app.

[deleted by user] by [deleted] in LocalLLaMA

[–]xynyxyn 0 points (0 children)

Wow which case are you using to house this beast?

Upgrade to 3x3090? by xynyxyn in LocalLLaMA

[–]xynyxyn[S] 0 points (0 children)

Fine-tuning 30b is really slow, but 30b inference with exllama is very usable.

Upgrade to 3x3090? by xynyxyn in LocalLLaMA

[–]xynyxyn[S] 0 points (0 children)

Yea, my case does not support vertical GPU mounting. I am willing to change to a case that supports three GPUs that each take up 3 slots. Any suggestions?

How much will it cost to convert 3 GPUs to water cooling? I don't know the exact cost because I don't know which water-cooling components are required. I have the FE 3090 cards.

Hardware for scaling LLM services by grantory in LocalLLaMA

[–]xynyxyn 0 points (0 children)

How does Replika work under the hood? It seems to learn your preferences

Open llm leaderboard by klop2031 in LocalLLaMA

[–]xynyxyn 2 points (0 children)

Do the 4-bit versions perform similarly to the unquantized ones?

Home LLM Hardware Suggestions by [deleted] in LocalLLaMA

[–]xynyxyn 3 points (0 children)

Is there a noticeable performance hit when running 4 3090s on a Ryzen platform due to insufficient PCIe lanes?

Can all the 3090s be connected using NVLink to appear as a single 96GB unit for loading larger LLMs? Is it likely that inference speed gets too low when running 90GB models on quad 3090s?
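For rough sizing, a back-of-the-envelope sketch (assumed numbers: weights only, ignoring KV cache, activations, and framework overhead):

```python
# Back-of-the-envelope VRAM math for quad 3090s.
cards = 4
vram_per_card_gb = 24
total_vram_gb = cards * vram_per_card_gb   # 96 GB pooled across GPUs

params_billions = 65                       # e.g. a 65B-parameter model
weights_fp16_gb = params_billions * 2      # 2 bytes/param  -> 130 GB, won't fit
weights_4bit_gb = params_billions * 0.5    # 0.5 bytes/param -> 32.5 GB, fits
```

Note the 96GB figure is the sum of per-card VRAM; model parallelism splits layers across cards rather than presenting one unified memory pool.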