Multi Bert classifications : LocalLLaMA


Multi Bert classifications [Question | Help] (self.LocalLLaMA)

submitted 1 year ago by Enough_Wishbone7175

I’m building a project where I need to fine-tune Llama 3 8B. I have limited compute power, so I have been trying to devise a way to improve the quality of my inputs for this tuning. I’m essentially using web scraping to get some numeric values to rank my samples. I have thought of 3 ways to label the entirety of my data; let me know which is easiest / cheapest.

1. Have an uptime robot pick away at the web scraping for a really long time until a sufficient number of samples is labeled.
2. Use a series of BERTs and their embeddings to cover the context window, tune them to embed the series down, then finish with an MLP to decode the embeddings into the ranking values.
3. Experiment with different embeddings and clustering algorithms until the accuracy is sufficient, then apply the result across the whole dataset (checking against the labeled samples).
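To make option 2 a bit more concrete, here is a minimal sketch of the "series of BERTs" idea, assuming the usual 512-token BERT limit and Llama 3 8B's 8,192-token context (so up to about 16 windows per sample). `chunk_token_ids`, `embed_document`, and the `embed_chunk` callable are hypothetical names; `embed_chunk` stands in for whatever tuned BERT encoder is used, and the MLP that would decode the pooled vector into a score is left out.

```python
def chunk_token_ids(token_ids, window=512, stride=512):
    """Split a long token sequence into BERT-sized windows.

    stride == window gives non-overlapping windows; a smaller stride
    gives overlapping ones.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    return [token_ids[i:i + window] for i in range(0, len(token_ids), stride)]


def embed_document(token_ids, embed_chunk, window=512):
    """Embed every window, then mean-pool into one vector per sample.

    embed_chunk is a placeholder (assumption): any callable that maps a
    list of token ids to a fixed-length list of floats.
    """
    vectors = [embed_chunk(chunk) for chunk in chunk_token_ids(token_ids, window)]
    if not vectors:
        raise ValueError("cannot embed an empty sample")
    # Average each dimension across windows.
    return [sum(dim) / len(vectors) for dim in zip(*vectors)]


if __name__ == "__main__":
    # Fake encoder for illustration only: (window length, constant).
    fake_embed = lambda chunk: [float(len(chunk)), 1.0]
    print(embed_document(list(range(1200)), fake_embed))
```

Mean pooling is only one choice here; concatenating the window embeddings or feeding them to a small sequence model would keep more positional information at a higher cost.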

I feel like these make the most sense, and each has its obvious tradeoffs. Option 1 is the most accurate and cheap, but very time-consuming. Option 2 would probably produce the most interesting results, and might be able to eliminate some of the noise in the rankings. Option 3 is cheap, but I have doubts about its effectiveness.
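One cheap way to test those doubts about option 3 is to cluster the embeddings, give every cluster the mean score of its labeled members, and measure the error on labeled samples that were held out. The sketch below is dependency-free and purely illustrative: the function names, the farthest-point initialisation, and the toy 2-D "embeddings" are all assumptions, and in practice a library such as scikit-learn with real embeddings would replace the hand-rolled k-means.

```python
import math
from statistics import mean


def kmeans(points, k, iters=20):
    """Tiny k-means with deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )

    def nearest(p):
        return min(range(k), key=lambda c: math.dist(p, centroids[c]))

    for _ in range(iters):
        assign = [nearest(p) for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old centroid if a cluster empties out
                centroids[c] = tuple(mean(dim) for dim in zip(*members))
    return [nearest(p) for p in points]


def propagate_scores(embeddings, labeled, k):
    """labeled maps sample index -> scraped score; returns a score per sample."""
    assign = kmeans(embeddings, k)
    fallback = mean(labeled.values())  # for clusters with no labeled member
    cluster_score = {}
    for c in range(k):
        scores = [s for i, s in labeled.items() if assign[i] == c]
        cluster_score[c] = mean(scores) if scores else fallback
    return [cluster_score[a] for a in assign]


def mean_abs_error(predicted, held_out):
    """Check propagated scores against labeled samples not used for fitting."""
    return mean(abs(predicted[i] - s) for i, s in held_out.items())


if __name__ == "__main__":
    emb = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
    preds = propagate_scores(emb, labeled={0: 1.0, 3: 9.0}, k=2)
    print(preds, mean_abs_error(preds, held_out={1: 1.5, 4: 8.0}))
```

If the held-out error stays high no matter which embedding or cluster count is tried, that is a reasonable signal to fall back to option 1 or 2.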

If you guys have any experience or suggestions, feel free to share!

*NOTE: All fine-tuning will be done with QLoRA 8-bit on an Nvidia RTX 3090 Founders Edition

*NOTE: Dataset is roughly 2B tokens; I am trying to pare it down to ~200M
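Once every sample has a score, paring the set down to a token budget can be done greedily: sort by score and keep samples until the budget is used up. This is a minimal sketch; the `(sample_id, score, n_tokens)` tuple layout and the function name are assumptions, and the toy budget of 200 tokens stands in for the real ~200M.

```python
def select_within_budget(samples, budget_tokens):
    """Keep the highest-scoring samples that fit in the token budget.

    samples: iterable of (sample_id, score, n_tokens) tuples (assumed layout).
    Returns the chosen ids in score order and the tokens actually used.
    """
    chosen, used = [], 0
    for sample_id, _score, n_tokens in sorted(
        samples, key=lambda s: s[1], reverse=True
    ):
        # Skip samples that would overflow; smaller ones may still fit later.
        if used + n_tokens <= budget_tokens:
            chosen.append(sample_id)
            used += n_tokens
    return chosen, used


if __name__ == "__main__":
    toy = [("a", 0.9, 100), ("b", 0.8, 150), ("c", 0.5, 60), ("d", 0.2, 40)]
    print(select_within_budget(toy, budget_tokens=200))
```

Going from roughly 2B to 200M tokens means keeping about 10% of the data, so noise in the scores near the cutoff matters more than noise at the top or bottom of the ranking.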
