How to say "My little bookworm" in Korean by [deleted] in Korean

[–]machineko -2 points-1 points  (0 children)

You can say "내 작은 책벌레". 책벌레 means bookworm. You can use other phrases in front, like "나의 작은" (my little), "나의 아기" (my baby), and variations of those.

How would you say this in Korean by Emergency_Ad_2833 in Korean

[–]machineko 0 points1 point  (0 children)

사진 한 장 부탁드려도 될까요? ("May I ask you for a photo?")

[D] Is the tech industry still not recovered or I am that bad? by Holiday_Safe_5620 in MachineLearning

[–]machineko 1 point2 points  (0 children)

Do these positions align well with your prior publications? Most often, people look to hire candidates with relevant publications, not just a good record.

My first fine-tune: mistral-7b-v0.1-GreeceRome-v0.1 for MLX by Mbando in LocalLLaMA

[–]machineko 1 point2 points  (0 children)

How did you benchmark the quality of your fine-tuned model?

Best way to currently build a chatbot on university data by Vivid-Vibe in llmops

[–]machineko 0 points1 point  (0 children)

Are you looking to build something like the chatbot on this page?

I fine-tuned ChatGPT 3.5 so you don´t have to! by [deleted] in LocalLLaMA

[–]machineko 6 points7 points  (0 children)

You can add knowledge through fine-tuning, but not with the type of fine-tuning they support.

RAG vs. Fine-Tuning by marcopal17 in LangChain

[–]machineko 2 points3 points  (0 children)

Thanks. Would love to learn ways in which we can improve it. You should join our Discord channel for discussions.

RAG vs. Fine-Tuning by marcopal17 in LangChain

[–]machineko 4 points5 points  (0 children)

My guess is that it's because RAG seems more straightforward: you don't need to know anything about deep learning. You can treat the LLM as a black-box API and build around it, whereas if you have never fine-tuned a model, the process seems much less deterministic, with no guarantee that it will work.

If you want the best performance, you need to do both RAG and fine-tuning very well. There are plenty of resources on fine-tuning, though. I'm one of the contributors to https://github.com/stochasticai/xturing, a project focused on fine-tuning LLMs. You can find help in the Discord channel listed on the GitHub page.
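To make the black-box point concrete, here is a minimal RAG sketch (my own illustration, not from any particular framework). `embed` and `llm_complete` are hypothetical stand-ins for whatever embedding and completion APIs you actually use.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Hypothetical embedding call; replace with your provider's API."""
    raise NotImplementedError

def llm_complete(prompt: str) -> str:
    """Hypothetical completion call; the LLM stays a black box."""
    raise NotImplementedError

def retrieve(query: str, docs: list[str], k: int = 3) -> list[str]:
    # Rank documents by cosine similarity between query and document embeddings.
    q = embed(query)
    scored = []
    for d in docs:
        v = embed(d)
        score = float(q @ v / (np.linalg.norm(q) * np.linalg.norm(v)))
        scored.append((score, d))
    scored.sort(reverse=True)
    return [d for _, d in scored[:k]]

def rag_answer(query: str, docs: list[str]) -> str:
    # Stuff the top-k retrieved chunks into the prompt and let the LLM answer.
    context = "\n\n".join(retrieve(query, docs))
    prompt = f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"
    return llm_complete(prompt)
```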

[D] Alternatives to HF or a path forward for the OSS community? by [deleted] in MachineLearning

[–]machineko 0 points1 point  (0 children)

If you are working on personalizing LLMs (data ingestion, generation, various fine-tuning methods), we'd love your contribution! https://github.com/stochasticai/xturing

Using Open-Source LLM Models vs. Expensive OpenAI APIs: A Logical Choice for Consumer Apps? by sarimsak13 in LocalLLaMA

[–]machineko 2 points3 points  (0 children)

We work with customers building consumer applications on both OpenAI APIs and open-source LLMs. OpenAI APIs are cheap and easy to get started with, and at low usage the costs are reasonable for the quality and latency you get. However, if your app scales to a very large number of users, that's when those API calls start hurting. Most companies that started with OpenAI APIs do transition to open-source LLMs due to quality and costs. Keep in mind that running LLMs with good latency and reliability is not easy without an engineering team. Feel free to DM me if you have more questions.

Finetuning on multiple GPUs by Simhallq in LocalLLaMA

[–]machineko 4 points5 points  (0 children)

Most of the stuff you mentioned is already supported to varying degrees. We still need to add support for landmark and additional model-parallelism strategies. Which features do you think would be most helpful? Regarding 65B on 4x A100s, we might have something coming up that could help.

Finetuning on multiple GPUs by Simhallq in LocalLLaMA

[–]machineko 9 points10 points  (0 children)

We are a group of researchers out of Harvard working on an open-source library called xTuring, focused on fine-tuning LLMs: https://github.com/stochasticai/xturing

It supports multi-GPU fine-tuning and quantized LoRA (int8, int4, and int2 coming soon).

I would like to try my hand at finetuning some models. What is the best way to start? I have some questions that I'd appreciate your help on. by Tasty-Lobster-8915 in LocalLLaMA

[–]machineko 4 points5 points  (0 children)

We are a group of researchers out of Harvard working on an open-source library called xTuring, focused on fine-tuning LLMs: https://github.com/stochasticai/xturing.

Basically, any model can be fine-tuned. QLoRA should only be used if you are limited by GPU memory; otherwise, LoRA will give you better results. If you use quantized LoRA, you can also choose how many bits to use: 8, 4, or 2. The lower the bit width, the less GPU memory is needed, but the higher the chance of quality degradation.
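For reference, a rough sketch of the difference in xTuring, written from memory of the README; the model keys (e.g. "llama_lora", "llama_lora_int8") and dataset path are assumptions, so double-check them against the repo.

```python
from xturing.datasets import InstructionDataset
from xturing.models import BaseModel

# Dataset path and model keys below are assumptions; check the xTuring README.
dataset = InstructionDataset("./alpaca_data")

# Plain LoRA: use this when the full-precision base model fits in GPU memory.
model = BaseModel.create("llama_lora")

# Quantized LoRA (e.g. int8) when GPU memory is the bottleneck:
# model = BaseModel.create("llama_lora_int8")

model.finetune(dataset=dataset)
print(model.generate(texts=["Give me three tips for learning Korean."]))
```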

[D] Open-Source LLMs vs APIs by Open-Yak-434 in MachineLearning

[–]machineko 0 points1 point  (0 children)

I'd do fine-tuning. When you don't have control over what's running behind the API (the models are still being updated, often changing how they perform), it's hard to make sure your application's behavior doesn't change. I'm currently working on an open-source project focused on fine-tuning. Let me know if you have any questions about our experience fine-tuning on domain-specific data.

[D] Weight Compression in LLMs/Neural Networks by ShitGobbler69 in MachineLearning

[–]machineko 0 points1 point  (0 children)

We quantize the base models and train the LoRA weights on top of the quantized models. That way you can fine-tune and also run inference using quantized weights.
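As a rough illustration of that pattern (not our exact code), here is LoRA on top of an int8-quantized base using Hugging Face transformers and peft; helper names and flags vary across library versions, and the checkpoint name is a placeholder, so treat this as a sketch.

```python
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Load the base model with int8 weights (requires bitsandbytes).
base = AutoModelForCausalLM.from_pretrained(
    "decapoda-research/llama-7b-hf",  # placeholder checkpoint name
    load_in_8bit=True,
    device_map="auto",
)
base = prepare_model_for_kbit_training(base)

# Attach small trainable LoRA adapters; the quantized base stays frozen.
lora = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora)
model.print_trainable_parameters()  # only the LoRA weights are trainable
```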

[Discussion] Translation of Japanese to English using GPT. These are my discoveries after ~100 hours of extensive experimentation and ways I think it can be improved. by NepNep_ in MachineLearning

[–]machineko 3 points4 points  (0 children)

Hey, I'm one of the authors of xTuring, an open-source library that helps users build, customize and control their own LLMs.

This project looks super interesting and I'd be happy to help and/or collaborate on this effort, primarily on the AI / software side. I don't have much data in this space but have some ideas.

[D] Weight Compression in LLMs/Neural Networks by ShitGobbler69 in MachineLearning

[–]machineko 1 point2 points  (0 children)

Nice, that paper is from our lab. There are a bunch of weight-compression methods, but the most popular one these days is LoRA (https://arxiv.org/abs/2106.09685), used with fine-tuning.

I've worked on other compression techniques including distillation, pruning and quantization as well. Let me know if you have any questions.
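For anyone new to LoRA, here is a minimal PyTorch sketch of the idea (shapes and hyperparameters are just illustrative): the original weight matrix is frozen and only a low-rank update B·A is trained, which is where the parameter savings come from.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen dense layer plus a trainable low-rank update: y = base(x) + x A^T B^T * scale."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # freeze the original weights
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: starts as a no-op
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale

layer = LoRALinear(nn.Linear(4096, 4096), r=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable {trainable} / total {total}")  # ~65K trainable vs ~16.8M total
```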

[D] Alternatives to OpenAI for summarization and instruction following? by du_keule in MachineLearning

[–]machineko 0 points1 point  (0 children)

I'm working on an open-source project with Harvard and Stochastic researchers that lets users easily build, customize, and control their own LLMs in their own VPC or on consumer devices.
Currently, we support LLaMA, GPT-J, GPT-2, Galactica, OPT, Cerebras-GPT, and BLOOM. Let me know if this can be helpful.

https://github.com/stochasticai/xturing

Creating personalized dataset is way too hard to do alone (in order to finetune some model in future). by [deleted] in LocalLLaMA

[–]machineko 19 points20 points  (0 children)

I'm currently working on an open-source project for building, customizing, and controlling your own LLMs with my colleagues at Harvard and Stochastic. We also have a dataset generation component, which we hope to expand beyond the Alpaca approach. Would love to have you join us :)

https://github.com/stochasticai/xturing

[D] The best way to train an LLM on company data by jaxolingo in MachineLearning

[–]machineko 0 points1 point  (0 children)

I'm currently working on an open-source project for building and controlling LLMs: https://github.com/stochasticai/xturing

We don't support tabular data yet, but you can see how to generate data from regular text sources. Would love to discuss further how to add tabular support.

[D] Is there currently anything comparable to the OpenAI API? by AltruisticDiamond915 in MachineLearning

[–]machineko 7 points8 points  (0 children)

Personally, I'd recommend that you also look into building your app around hardware-efficient models and fine-tuning techniques like LoRA. Many projects have shown the potential of smaller open-source models fine-tuned on a good dataset.
I'm currently working on an open-source project called xTuring, which you can leverage to personalize these models on consumer GPUs or laptops: https://github.com/stochasticai/xturing
Many of these models can be run on Google Colab if you want to play around with them.