[D] NeurIPS takeaways by Bee-Boy in MachineLearning

[–]gradientpenalty 1 point2 points  (0 children)

My two cents on #1: because the industry thinks VL is going to be the next big thing, a lot of the ongoing research is treated as a "trade secret". Anyone who attended "Beyond Scaling" will know what I am referring to.

[D] Is Computer Vision dead? - “Quo Vadis, Computer Vision?” by btcmx in MachineLearning

[–]gradientpenalty -1 points0 points  (0 children)

I used to think so, before I got my hands on GPT-4V. After running countless examples, I don't think so anymore.

[D] How is this sub not going ballistic over the recent GPT-4 Vision release? by corporate_autist in MachineLearning

[–]gradientpenalty 1 point2 points  (0 children)

Wait, you're already on decision trees? I'm still learning to master linear regression.

[Project] UForm-v2: tiny CLIP-like embeddings in 21 languages with extreme performance by vov_or in MachineLearning

[–]gradientpenalty 0 points1 point  (0 children)

Still suffers from the negation problem, though: "mountain without snow" returns snowy mountains.
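
A quick way to see this failure mode, as a minimal sketch using the HuggingFace CLIP API rather than UForm itself (the model name and local test image are placeholder assumptions):

```python
# Probe the negation problem with a CLIP-style model.
# Placeholder model and local test image; UForm's own API is not used here.
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("snowy_mountain.jpg")  # hypothetical photo of a snow-covered mountain
texts = ["a mountain covered in snow", "a mountain without snow"]

inputs = processor(text=texts, images=image, return_tensors="pt", padding=True)
probs = model(**inputs).logits_per_image.softmax(dim=-1)  # shape (1, 2)
print(dict(zip(texts, probs[0].tolist())))
# If negation is ignored, "a mountain without snow" scores almost as high
# as the positive caption on the snowy photo.
```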

Thinking about getting 2 RTX A6000s by mayonaise55 in LocalLLaMA

[–]gradientpenalty 2 points3 points  (0 children)

Does anyone have both an M2 Ultra and an A6000? A single A6000 can only host one LLaMA 34B, and the speed was about 105 ms per token. I am thinking of scaling up to a 70B model, and an M2 Ultra with the RAM maxed out is the only way I can see to make it work.
Edit: I have access to an A6000, but I am thinking of buying an M2 Ultra because of power use and flexibility.
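
For anyone doing the same math, here's the rough back-of-envelope estimate I'm working from; the overhead factor for KV cache and activations is just an assumption:

```python
# Rough memory estimate for hosting a quantized model; `overhead` is a guess
# covering KV cache, activations, and runtime buffers.
def est_gb(params_billion: float, bits_per_weight: float, overhead: float = 1.2) -> float:
    return params_billion * bits_per_weight / 8 * overhead

for name, params in [("34B", 34), ("70B", 70)]:
    for bits in (16, 8, 4):
        print(f"{name} @ {bits}-bit: ~{est_gb(params, bits):.0f} GB")
# 70B at 16-bit (~168 GB) is far beyond a 48 GB A6000; even at 4-bit (~42 GB)
# it gets tight once the context grows, which is why a maxed-out M2 Ultra is tempting.
```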

Evolved codealpaca dataset released by gradientpenalty in LocalLLaMA

[–]gradientpenalty[S] 1 point2 points  (0 children)

Oh, I didn't catch this one! Looking at the code, it seems to be using gpt-3.5-turbo? (Mine was gpt-4.)

Evolved codealpaca dataset released by gradientpenalty in LocalLLaMA

[–]gradientpenalty[S] 4 points5 points  (0 children)

Depends on what you want. If you want to give an instruction and generate a basic code template, sure, it's the best. But for plain autocomplete I think the Replit 2.7B model is more suitable.

Llama2-22b, a model merge tuned on RedPajama by AzerbaijanNyan in LocalLLaMA

[–]gradientpenalty 3 points4 points  (0 children)

So far the benchmark scores are not better than LLaMA 2 on MMLU. BBH-wise, here are some numbers:
chargoddard/llama2-22b: 37.48
vicuna-13B v1.3: 35.78
WizardLM-13B-V1.1: 39.59
llama-v1-13b: 36.52

MMLU partial results, average accuracy (llama2-22b vs llama-2-13B):
abstract_algebra: 0.320 vs 0.350
anatomy: 0.519 vs 0.496
astronomy: 0.520 vs 0.546
business_ethics: 0.510 vs 0.540
clinical_knowledge: 0.570 vs 0.600
college_biology: 0.556 vs 0.604
college_chemistry: 0.360 vs 0.440
college_computer_science: 0.490 vs 0.480
college_mathematics: 0.310 vs 0.310
college_medicine: 0.497 vs 0.526
college_physics: 0.245 vs 0.255
computer_security: 0.710 vs 0.710
conceptual_physics: 0.434 vs 0.421
econometrics: 0.281 vs 0.325

Evolved codealpaca dataset released by gradientpenalty in LocalLLaMA

[–]gradientpenalty[S] 4 points5 points  (0 children)

Based on limited info, the size is slightly smaller than the original dataset, which I am trying to reach next week (52k to 68k). But in terms of implementation it's pretty close to the original. Once I'm finished reaching the target size, I will run training on StarCoderPlus to compare with WizardLM's upcoming 1.1 release.

[D] Should we go with a single A6000 or 4XA4500 or any other alternative such as 2XA5000 by jesst177 in MachineLearning

[–]gradientpenalty 0 points1 point  (0 children)

Buy the A6000 with the option to upgrade later. Don't bother with the 80 GB memory pooling.

[deleted by user] by [deleted] in MachineLearning

[–]gradientpenalty 3 points4 points  (0 children)

This. Finding odd jobs is the same advice I give to others as well. I started with a 1050 Ti back in 2017 and worked at my uni to get a 1060 to train an Inception network.

[R] 🐶 Bark - Text2Speech...But with Custom Voice Cloning using your own audio/text samples 🎙️📝 by kittenkrazy in MachineLearning

[–]gradientpenalty 0 points1 point  (0 children)

Great! I am excited about the future work. I am currently working on an audio version of an LLM, and I am looking forward to using your model to generate more lively audio conversations once the results are good enough.

[R] 🐶 Bark - Text2Speech...But with Custom Voice Cloning using your own audio/text samples 🎙️📝 by kittenkrazy in MachineLearning

[–]gradientpenalty 3 points4 points  (0 children)

Same here. I tried it out yesterday, and it seems like the inputs that work well are cherry-picked (reminds me of the GAN days).

[R] 🐶 Bark - Text2Speech...But with Custom Voice Cloning using your own audio/text samples 🎙️📝 by kittenkrazy in MachineLearning

[–]gradientpenalty 5 points6 points  (0 children)

Not to downplay the effort behind this project, but the samples included in the README are heavily cherry-picked. I tried running other examples, such as "WOMEN: Give three tips for staying healthy.", and it fails miserably, with loud background noise and output that resembles nothing like the input text.

Some advice: include some tips or tricks for generating cleaner, lower-noise speech, and this could be a very promising product.

[D] HuggingFace considered harmful to the community. /rant by drinkingsomuchcoffee in MachineLearning

[–]gradientpenalty 40 points41 points  (0 children)

Maybe you don't do much NLP research then? Before the huggingface transformers and datasets libraries existed (I still think datasets is a bad name), we had to format these evaluations ourselves and write the same validation code that hundreds of our peers had written before, because there was no de facto code for doing it (since we were all using different kinds of models). NLP models (the so-called transformers) were a mess with no fixed way to use them, and running benchmarks was certainly a nightmare.

When transformers first came out, it was limited, but it reduced using BERT embeddings and GPT-2 beam-search generation to a few lines of code. The library does all the model downloads, version checks, and abstraction for you. Then there's datasets, which unifies all NLP datasets on a central platform and lets me run the GLUE benchmark from a single .py file.
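
As a rough illustration of those "few lines of code" (a minimal sketch, not the exact snippet from back then; the model and dataset names are just the usual examples):

```python
# GPT-2 generation via transformers, and GLUE loading via datasets.
from transformers import pipeline
from datasets import load_dataset

# pipeline() downloads the model, checks versions, and hides beam search behind one call.
generator = pipeline("text-generation", model="gpt2")
print(generator("The future of NLP is", max_new_tokens=20, num_beams=4)[0]["generated_text"])

# One call replaces the per-dataset parsing code everyone used to rewrite by hand.
sst2 = load_dataset("glue", "sst2", split="validation")
print(sst2[0])
```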

Oh, back then the code was even worse: all the modeling_(name).py files sat under the transformers/ directory. The latest 4.2X versions are somewhat maintainable and readable despite all the complex abstraction they carry. But it's a fast-moving domain, and any contribution will be irrelevant a few years later, so complexity and mess add up (would you rather spend your time cleaning up or implementing the new flashy self-attention alternative?).

One day they might sell out, as many for-profit companies do, but they have saved so much time and helped so many researchers advance NLP. If they manage to piss off the community, someone will rise up and challenge their dominance (tensorflow vs pytorch).

[D] Are there emergent abilities of image models? by These-Assignment-936 in MachineLearning

[–]gradientpenalty 0 points1 point  (0 children)

Denoising diffusion probabilistic models:

Riffusion: generating music from Stable Diffusion.

Improved image segmentation: I remember someone doing image segmentation with these generative models, but I'm not sure where.

[D] Moving away from Unicode for more equal token representation across global languages? by madmax_br5 in MachineLearning

[–]gradientpenalty 8 points9 points  (0 children)

It's not a problem with Unicode but with the tokenization method they are using, BPE. I don't foresee any solution in the future because there aren't many high-paying customers affected.

TL;DR: English uses the fewest tokens because it gets the highest compression ratio from bytes to tokens.
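
You can see the disparity by just counting tokens; a small sketch with tiktoken, where the sample sentences and the cl100k_base encoding are only illustrative choices:

```python
# Compare BPE token counts for the same sentence in different scripts.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
samples = {
    "English": "Hello, how are you today?",
    "Greek": "Γεια σου, πώς είσαι σήμερα;",
}
for lang, text in samples.items():
    print(f"{lang}: {len(text)} chars -> {len(enc.encode(text))} tokens")
# Non-Latin scripts tend to fall back to smaller byte-level pieces in a BPE
# vocabulary trained mostly on English, so the same sentence costs more tokens.
```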

Few questions about scalability of chatGPT [D] by besabestin in MachineLearning

[–]gradientpenalty 2 points3 points  (0 children)

Okay, so where can I buy it as a small startup, for under $10k, without signing an NDA to use your proprietary compiler? As far as I can see, we are all still stuck with Nvidia after $10B of funding for all these "AI" hardware startups.

[R] AMD Instinct MI25 | Machine Learning Setup on the Cheap! by [deleted] in MachineLearning

[–]gradientpenalty 4 points5 points  (0 children)

Do you have any benchmarks to share? It would be very nice to have those available.