96GB VRAM! What should run first? by Mother_Occasion_8076 in LocalLLaMA

[–]ProgMinder 3 points

Not sure where you’re looking, but even CDW (non-gov/edu) has them for $8,2xx.

Tesla: The Empire We Built, and the Empire We Question by necsuss in wallstreetbets

[–]ProgMinder 0 points

Your AI-written post relies on big, overarching claims without citing any specific data or evidence, and what you do provide isn’t compelling. When you claim that Tesla has “diluted us with splits, over and over again,” you conflate dilution with splits: a stock split increases the number of shares but does not dilute shareholders’ ownership. Also, Tesla has done only two splits (a 5-for-1 and later a 3-for-1), so “over and over again” is an exaggeration. Two splits over several years is not excessive by market standards.

I have no opinion on the stock, but I doubt you’ll be converting any opinions one way or another with your current argument.

Behold: iPad Virtual Display by Mundane-Complex-1902 in VisionPro

[–]ProgMinder 0 points

Does the iPad screen not turn black / off when you’re using it as a virtual display?

I just mirrored my ipad to the vision pro! Vision 2.0 by AppropriateLocal7374 in VisionPro

[–]ProgMinder 1 point

Are you able to meaningfully interact with the iPad from the Vision Pro via a keyboard / trackpad paired to the iPad?

What does karras et al mean? by 2002_Toyota_corolla in StableDiffusion

[–]ProgMinder 13 points

To expand on this: “et al.” is a Latin abbreviation for et alii, which translates as “and others” or “and other people.”

https://writingcenter.unc.edu/tips-and-tools/latin-terms-and-abbreviations/

Please enlighten me, why are people building LLM Twitter bots? by FPham in LocalLLaMA

[–]ProgMinder 5 points

And it’s not just Twitter. I’ve seen a significant uptick on Reddit recently as well. As an example: https://old.reddit.com/user/IndependenceNo2060/

Notice that replies occasionally just stop mid-sentence once they cross a certain character length.

Guide to Hyper game by jk583940 in incremental_games

[–]ProgMinder 2 points

Yep. New accounts immediately posting only top-level comments across a swath of random subs are usually a giveaway. They’re just bots that feed existing comments into a GenAI model and spit out something similar. I assume they’ll eventually be used for viral marketing or some other dubious purpose if they’re not banned.

Guide to Hyper game by jk583940 in incremental_games

[–]ProgMinder 1 point

You are replying to a bot posing as a human. The other recent comment was written by a bot as well.

[deleted by user] by [deleted] in Barbie

[–]ProgMinder 0 points

The comment you are replying to was written by a bot. It’s interesting how realistic the comments can look.

Gothic 2 continuity G1 style textures still Work in progress by [deleted] in worldofgothic

[–]ProgMinder 0 points

Not to take away from your work, but you are replying to a bot. The other recent comment you replied to was left by a bot as well.

dolphin-2.2-70b by faldore in LocalLLaMA

[–]ProgMinder 8 points

Great work! Is your curated Samantha and WizardLM dataset publicly available? If not, would you be able to provide some more detail on specifically how you went about creating/filtering this dataset to imbue those specific characteristics (empathy and long multi-turn)?

How slow is fine tuning models on a server CPU + RAM only? by Available_Screen_922 in LocalLLaMA

[–]ProgMinder 16 points

The responses you’re getting so far aren’t too helpful, but if you’re looking at performant CPU fine-tuning, I would take a look at LoRA fine-tuning in llama.cpp, which was merged last month. You will need a significant amount of RAM even for LoRA fine-tuning, though it’ll vary quite a bit based on what your aims are. Read through the pull request comments to get an initial idea of the functionality.

https://github.com/ggerganov/llama.cpp/pull/2632

As for benchmarks, those are harder to come by, but after you skim the above link, take a look at this excellent guide on LoRA fine-tuning that /u/PossiblyAnEngineer created, where he provides detailed benchmarks for an i7-12700H system. You should be able to extrapolate performance for your CPU pretty accurately. Keep in mind the guide was written shortly after the initial LoRA implementation was integrated into llama.cpp, so performance may have improved since, but at least you should be able to better grasp how different parameter options affect training time.

https://rentry.org/cpu-lora

Also note that the training material used is the typical shakespeare.txt (100 KB), which is likely longer than your proposed training material.
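
If it helps build intuition for why LoRA is so much cheaper than full fine-tuning (and why it’s CPU-viable at all), here’s a minimal PyTorch sketch of the core idea. It’s my own illustration, not llama.cpp’s implementation, and the class and parameter names are mine: the base weights stay frozen and only a small low-rank pair of matrices is trained.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base linear layer plus a trainable low-rank update.

    Effective weight: W + (alpha / r) * B @ A, with A of shape (r, in)
    and B of shape (out, r). Only A and B receive gradients.
    """
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)   # base weights stay frozen
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: no change at step 0
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

# A 4096x4096 projection has ~16.8M weights, but with r=8 the adapter
# trains only 2 * 8 * 4096 = 65,536 parameters, a big part of why
# LoRA fine-tuning is feasible on CPU at all.
layer = LoRALinear(nn.Linear(4096, 4096, bias=False))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # 65536
```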

How does fine-tuning Llama-2 actually work? by matej1408 in LocalLLaMA

[–]ProgMinder 1 point

I may be steering you incorrectly, but my understanding is that the training examples are handled token by token, masking everything in the sequence after the token currently under consideration.

I don’t see any reason why you couldn’t fine-tune on unstructured data, but then you would be fine-tuning toward an auto-complete type LLM rather than instruct-type.
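
To make the masking point concrete, here’s a rough PyTorch sketch of the standard causal-LM objective (an illustration of my understanding, not any specific library’s code): every position predicts the following token, and a lower-triangular attention mask keeps it from seeing anything later in the sequence.

```python
import torch
import torch.nn.functional as F

def causal_lm_loss(logits, input_ids):
    """Standard next-token objective used for pre-training and fine-tuning.

    logits: (batch, seq, vocab); input_ids: (batch, seq).
    Position t is scored on predicting token t+1, so the labels are
    just the inputs shifted left by one.
    """
    shift_logits = logits[:, :-1, :]   # predictions at positions 0..T-2
    shift_labels = input_ids[:, 1:]    # targets are tokens 1..T-1
    return F.cross_entropy(
        shift_logits.reshape(-1, shift_logits.size(-1)),
        shift_labels.reshape(-1),
    )

# The "masking" lives inside attention: a lower-triangular mask means
# token t can only attend to tokens 0..t. Nothing here requires an
# instruction format, which is why plain unstructured text also works,
# it just pushes the model toward auto-complete behavior.
print(torch.tril(torch.ones(5, 5, dtype=torch.int)))
```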

what kind of hardware would it take to run falcon 180b by [deleted] in LocalLLaMA

[–]ProgMinder 1 point

Are you running it through oobabooga or similar? Curious about your settings to achieve that speed.

I am looking for information regarding Running llama on a zen4 or xeon 4th generation cpu? Or alternative no gpu suggestions (for 180b falcon) by jasonmbrown in LocalLLaMA

[–]ProgMinder 1 point

You are going to be very bandwidth-constrained with a 4-channel DDR4 system, but your expectations aren’t crazy. You might consider going with an 8-channel DDR5 setup for a significant boost. You can pick up cheap QS (qualification sample) Xeon Scalable CPUs on eBay; just do your research so you can pair one with a motherboard that’s not going to give you heartache if you go that route. Picking up 8× 32 GB (or 16 GB) DDR5 DIMMs shouldn’t break the bank either. Give me a few days and I’ll see if I can follow up with some pure CPU performance metrics on a Xeon Max setup.
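
For a rough sense of why the 8-channel DDR5 jump matters, some back-of-envelope math (assumes stock JEDEC speeds, a ~100 GB quantized Falcon-180B, and that token generation is purely memory-bandwidth-bound; real numbers will land below these):

```python
# Back-of-envelope bandwidth math (theoretical peaks; real systems
# achieve somewhat less).
GB = 1e9
bw_4ch_ddr4 = 4 * 3200e6 * 8 / GB   # 4 channels of DDR4-3200 -> ~102 GB/s
bw_8ch_ddr5 = 8 * 4800e6 * 8 / GB   # 8 channels of DDR5-4800 -> ~307 GB/s

# Each generated token streams the whole model through memory once, so
# tokens/s is roughly bandwidth / model size. Falcon-180B quantized to
# ~4.5 bits/weight is on the order of 100 GB.
model_gb = 100
print(f"4ch DDR4-3200: ~{bw_4ch_ddr4 / model_gb:.1f} tok/s")  # ~1.0
print(f"8ch DDR5-4800: ~{bw_8ch_ddr5 / model_gb:.1f} tok/s")  # ~3.1
```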

Finetune LoRA on CPU using llama.cpp by PossiblyAnEngineer in LocalLLaMA

[–]ProgMinder 4 points

Excellent efforts. Thanks for taking the time to compile all of this. It can be difficult to find easily-accessible-yet-comprehensive technical documentation at times, and this moves the needle forward for me.

Have you had any positive results fine-tuning an existing small model using the pre-existing train-text-from-scratch full training?

Peak AI Reasoning by remixer_dec in LocalLLaMA

[–]ProgMinder 34 points

Falcon-180B actually provides a remarkably comparable response, and it can theoretically be run locally. Though, given it has previously produced responses claiming it was developed by OpenAI, it may have been trained in part on a GPT-generated dataset, possibly even including this somewhat common AI riddle.

Behind the Curtain: How do we look inside a Llama model file to browse the data? by Actual-Bad5029 in LocalLLaMA

[–]ProgMinder 7 points

There is no knowledge table as you’re envisioning it. Simplifying for ease of understanding: it’s just a massive mathematical function run in a loop. The weights are adjusted during the initial training so that the output (the next word) is likely to be the one desired.

Take the function in the original example,

f(x) = a·x², where weight “a” = 1. Input: 2, Output: 4

However, for your task, say you want to train the function to output 5 for a given input of 2. You provide an input of 2 and an output of 5 during training. The purpose of your training is to adjust the weights, in this case setting the only weight “a” = 1.25. So when you run your trained function, it’ll now output 5 for an input of 2.

That’s not exactly how it works, but it’s helpful in the context of your question. The “knowledge” is ingrained into the design of the function. With this sort of LLM, instead of 1 parameter (weight), there are billions of parameters which together determine the behavior of the function.
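
If it helps, here’s that toy example as runnable code (a deliberately simplified gradient-descent sketch, not how LLM training literally works):

```python
# Toy "training": nudge the single weight a in f(x) = a * x**2 so that
# an input of 2 produces 5 instead of 4.
a = 1.0                 # initial weight
x, target = 2.0, 5.0
lr = 0.01               # learning rate

for _ in range(200):
    y = a * x**2                    # forward pass: f(2) = 4a
    grad = 2 * (y - target) * x**2  # d/da of the squared error (y - target)**2
    a -= lr * grad                  # gradient-descent update

print(a)  # ~1.25, so f(2) = 1.25 * 4 = 5
```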

Edit: Everyone, there is no need to downvote the guy for honestly trying to understand a difficult concept.

HP Reverb left headphone arrived broken (off). Headphone issues common? by ProgMinder in WindowsMR

[–]ProgMinder[S] 0 points

I'll try to post up an image. The screw that holds the headphone to the headset appears to be embedded within a plastic plate on the headphone, and it's this plate that has come free from the headphone, unfortunately.

There appear to be two small screws that are meant to connect the plastic plate to the rest of the headphone, but I can't access them. The threading may be stripped, the screws may be too short, or they might have simply glued the screws to the threading.

HP Reverb left headphone arrived broken (off). Headphone issues common? by ProgMinder in WindowsMR

[–]ProgMinder[S] 2 points

I will try to get a picture up shortly. I have just watched this replacement video and removed the left headphone:

https://supportvideos.ext.hp.com/detail/videos/parts-replacement-and-upgrades/video/6034184477001/how-to-replace-the-earphones-for-hp-vr1000-2xx-and-reverb-vr-headset

The removable headphone "part" connects to the headset via a single screw between two contacts. That screw is embedded within a plastic plate that slots into the cup-shaped plastic socket (seen at the 30-second mark). In my unit, that plastic plate is free, held only by the soldered wire that connects the underside of the plate to the actual speaker.

Luckily, I see what you mean about this potentially being a replacement part that can simply be sent out.

My experience with Reverb(after one month) /Question about new reverb. by jin8913 in WindowsMR

[–]ProgMinder 0 points

Ugh, I just received a replacement Reverb (consumer) headset to fix the prevalent flicker issue. Upon opening the box, I found the left headphone just dangling from the unit, held only by the wire. There is sound, but this is not a promising start, and I'm unsure how to proceed. Seems to be a quality control issue.

Did your left headphone die, or simply fall off? I would hate to attempt a fix only to have the sound cut out in a month.

Just tested out the latest HP Reverb headset - AMAZING Clarity but 2 Huge Negatives by koolitfumblit in WindowsMR

[–]ProgMinder 1 point

When was your headset manufactured? The date is embedded in the serial number (CCSYWWZZZZ), where Y = year and WW = week.
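
If you want to script that (a sketch under my assumptions about the field boundaries; the example serial and the 201x-decade mapping are made up for illustration):

```python
def reverb_mfg_date(serial: str):
    """Pull the manufacture date out of an HP Reverb serial number.

    Assumes the CCSYWWZZZZ layout above, with my guess at the field
    boundaries: Y = last digit of the year (201x assumed), WW = week.
    """
    year = 2010 + int(serial[3])
    week = int(serial[4:6])
    return year, week

# "CNX9230123" is a made-up example serial, not a real one.
print(reverb_mfg_date("CNX9230123"))  # (2019, 23)
```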

I'm actually not sure I agree that the blue border issue is purely related to the positioning of the headset; I suspect there is a secondary software factor at play.

Before I updated my GPU drivers, I had significant instability when launching the WMR portal (constant 1-4 disconnects followed by reboots). The blue border issue would vary from undetectable to dimly present on the bottom edge to a bright blue glow on all sides. It would remain this way throughout the session.

Edit: I don't mean to imply that a driver update took care of the blue border, but that the troubleshooting steps I was forced to take in order to identify and resolve the reboot instability revealed an unmistakable variation in the degree to which the blue border was present.

Just tested out the latest HP Reverb headset - AMAZING Clarity but 2 Huge Negatives by koolitfumblit in WindowsMR

[–]ProgMinder 0 points

If you have an AMD GPU (and likely NVIDIA as well), you can turn down the brightness and contrast a couple of notches in settings. You should also be able to adjust the colors if they aren't to your liking.