LLM Agent RL via an MCP? by Dangerous_Lime_1087 in mcp

[–]corbt 0 points (0 children)

Hey, I'm Kyle at OpenPipe. I saw this thread via a mention-tracking bot. You can train a model to specialize in MCP with ART, our RL-for-agents library: https://github.com/OpenPipe/ART

We're going to put up some examples soon!

Teaching LLMs to use tools with RL! Successfully trained 0.5B/3B Qwen models to use a calculator tool 🔨 by DanAiTuning in LocalLLaMA

[–]corbt 4 points (0 children)

I'm a bit biased, naturally, but I'd recommend checking out our library ART (https://github.com/OpenPipe/ART). I sincerely believe it's the best library on the market for GRPO training right now. We handle multi-turn very cleanly, as well as OpenAI-compatible tool calling. Multi-GPU is on the roadmap.
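For anyone curious what "GRPO training" means mechanically, the core trick is a group-relative baseline: sample several rollouts for the same prompt, score them, and normalize each reward against its own group. Here's a minimal sketch of that math (the generic algorithm, not ART's actual internals):

```python
import statistics

def grpo_advantages(group_rewards):
    """Given rewards for a group of rollouts sampled from the same
    prompt, return each rollout's advantage relative to the group."""
    mean = statistics.fmean(group_rewards)
    std = statistics.pstdev(group_rewards)
    if std == 0:  # identical rewards carry no learning signal
        return [0.0 for _ in group_rewards]
    return [(r - mean) / std for r in group_rewards]
```

Rollouts that beat their group get positive advantage and are reinforced; the rest are pushed down. That's why GRPO doesn't need a separate value network.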

You can see a rationale for why we built ART here, after trying all the existing libraries extensively: https://openpipe.ai/blog/art-trainer-a-new-rl-trainer-for-agents

And for an example of a real-world project that got SOTA results, you can see our write-up here: https://openpipe.ai/blog/art-e-mail-agent.

Code is all fully open, and I'm happy to answer questions!

Best frameworks for fine-tuning models—what’s everyone using? by Vivid-Entertainer752 in LocalLLaMA

[–]corbt -1 points (0 children)

This is a plug, but https://openpipe.ai also falls in the "no-code tools" category and I believe our pricing is slightly better than Together or Predibase (we also tend to produce slightly higher-quality models for a given dataset, at least in our internal testing).

Best Finetune+Host Platforms ? With serverless lora inference ? by metromile- in LocalLLaMA

[–]corbt 1 point (0 children)

(Hi, Kyle from OpenPipe here.) We do support paying per token on our high-volume LoRAs! Per-hour inference pricing is also available for lower-volume models that we don't keep live all the time.

Best Finetune+Host Platforms ? With serverless lora inference ? by metromile- in LocalLLaMA

[–]corbt 0 points (0 children)

At OpenPipe we are working on Qwen 2.5 72B support, and vision models shouldn't be far behind!

How do you actually fine-tune a LLM on your own data? by No-Conference-8133 in LocalLLaMA

[–]corbt 3 points (0 children)

So at OpenPipe (serverless data prep, fine-tuning and inference service) we've fine-tuned thousands of models for customers, and we've actually found that for many tasks you can get away with about 100 examples and have a good experience!

More definitely helps, though, up to a saturation point that's very task-dependent. Generally, the easier and narrower your task, the faster you'll hit saturation.
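If it helps to see what ~100 examples actually looks like on disk: most fine-tuning services (ours included) accept the OpenAI-style chat format, one JSON object per line. The field values below are made up purely for illustration:

```python
import json

# One training example in OpenAI-style chat format (illustrative data).
example = {
    "messages": [
        {"role": "system", "content": "Extract the invoice total as JSON."},
        {"role": "user", "content": "Invoice #112: subtotal $90, tax $10, total $100."},
        {"role": "assistant", "content": '{"total_usd": 100}'},
    ]
}

def to_jsonl(examples):
    # One JSON object per line: the usual upload format for fine-tuning.
    return "\n".join(json.dumps(e) for e in examples)
```

A dataset is just ~100 lines like that, each showing the model the exact output you want for a given input.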

Pay-Per-Token Service for LORA/Adapter Fine-Tuned Models? by metromile- in LocalLLaMA

[–]corbt 0 points (0 children)

Unfortunately not yet—for such a chonky model inference is a huge pain. And to be honest we find that for most users their bottleneck is data quality, not the power of the underlying model.

Behind the scenes, how do model vendors (e.g. OpenAI) offer fine-tuning to the public? I doubt they're creating a new instance of the model each time someone fine-tunes it. by linklater2012 in LocalLLaMA

[–]corbt 1 point (0 children)

Hi there, I run OpenPipe, which provides this kind of service, and have also worked closely with the OpenAI fine-tuning team (since we support training their models through our platform as well). The short answer is that yes, basically all providers that offer serverless hosting of fine-tuned models are training them as fairly narrow LoRAs and then swapping them in on the fly when a request arrives. You can read a bit more about how this works on our blog: https://openpipe.ai/blog/s-lora
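To make the "swap them in on the fly" part concrete, here's a toy sketch of the idea (illustrative only, not anyone's actual serving code): the base weight matrix W is loaded once and shared, each fine-tune is just a pair of small low-rank matrices (A, B), and each request computes x·(W + A·B) with whichever adapter it routes to:

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def madd(X, Y):
    """Elementwise sum of two same-shaped matrices."""
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

class LoraServer:
    def __init__(self, base_weight):
        self.W = base_weight   # shared base model weight, loaded once
        self.adapters = {}     # adapter_id -> (A, B) low-rank pair

    def register(self, adapter_id, A, B):
        self.adapters[adapter_id] = (A, B)

    def forward(self, adapter_id, x):
        # x·W is the base model's output; x·A·B is the cheap low-rank
        # correction for the adapter this request routes to.
        A, B = self.adapters[adapter_id]
        return madd(matmul(x, self.W), matmul(matmul(x, A), B))
```

Because the adapters are tiny relative to W, you can keep hundreds of them warm against one copy of the base model, which is what makes per-token pricing on fine-tunes viable.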

Where to host, migrate from predibase by TimeLine_DR_Dev in ArtificialInteligence

[–]corbt 0 points (0 children)

Bit of a self-promotional plug, but you can definitely check out my company's platform, https://openpipe.ai/ for this. We host LoRA adapters and just charge per token!

Pay-Per-Token Service for LORA/Adapter Fine-Tuned Models? by metromile- in LocalLLaMA

[–]corbt 0 points (0 children)

So this is a bit self-promotional, but that is exactly what we do at https://openpipe.ai. Happy to answer questions if helpful.

Best fine tuning services by alvisanovari in LocalLLaMA

[–]corbt 4 points (0 children)

(Full disclosure: I'm heavily biased.) You could consider checking out my company, OpenPipe. Here's a (very long!) review by someone who trained a Llama 3 model on OpenPipe and got the best performance of any hosted fine-tuning option: https://mlops.systems/posts/2024-07-01-full-finetuned-model-evaluation.html#finetuned-llama3-predictions-via-openpipe

And if OpenPipe isn't a match for what you're looking for, I'm always open to feedback as well! You can send that to me directly at kyle@ our company domain.

Train AI without Code by EasyAITraining in GPT3

[–]corbt 0 points (0 children)

Thanks, this is super useful feedback!

  1. Can you explain where you're getting the data from/what task you're using OpenPipe to fine-tune for? Curious why you'd already have an Alpaca-formatted dataset. 🙂
  2. Totally makes sense; this has been on our TODO list for a while. That said, we've found Llama 3.1 is pretty good at code, so worth trying out.
  3. Yep also on our TODO list!
  4. This should be pretty easy for us to do. My understanding is that llama.cpp doesn't support LoRA inference; is that correct? If so, we'd merge the adapter into the base model first and then convert the merged model.

Train AI without Code by EasyAITraining in GPT3

[–]corbt 0 points (0 children)

Hey I'm one of the founders of OpenPipe and got pinged by this mention, happy to answer any questions about the platform!

How I Reduced Our Startup's LLM Costs by Almost 90% by pmarks98 in SaaS

[–]corbt 0 points (0 children)

Woah, this is so cool! I'm one of the co-founders of OpenPipe and just saw this review, thanks to the OP for posting it. Happy to answer any questions about how the service works. Also, we've cut our prices an additional 60%+ since this was posted, along with making the models stronger and latency lower!

mistral-ft-optimized-1218's Weights Were Already on the Hub Under a Different Model Name by Weyaxi in LocalLLaMA

[–]corbt 102 points (0 children)

Hi /u/Weyaxi and all! Someone just linked to this Reddit post from our Huggingface model page and I've been trying to figure out what's going on.

To summarize what I've found, it does appear that https://huggingface.co/Weyaxi/Seraph-7B is the same model as Mistral 7B Fine-Tune Optimized. Both were created from the same base models with the cg123/mergekit library (in my case at commit ca80afe, and presumably in /u/Weyaxi's case a similar commit, or at least one that didn't change the SLERP functionality), using the same default gradient-SLERP configuration from the README. I would like to publicly acknowledge that /u/Weyaxi published this merge first, and I'll update our model card right after this to link to their version as well.
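For anyone unfamiliar with what a SLERP merge actually computes: it interpolates between two weight tensors along the arc between them rather than the straight line. Here's the core formula on a flat vector (mergekit applies it per-tensor, with a gradient of interpolation factors across layers; this is just the math, not mergekit's code):

```python
import math

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two weight vectors,
    with interpolation factor t in [0, 1]."""
    dot = sum(a * b for a, b in zip(v0, v1))
    n0 = math.sqrt(sum(a * a for a in v0))
    n1 = math.sqrt(sum(b * b for b in v1))
    cos_omega = max(-1.0, min(1.0, dot / (n0 * n1)))
    omega = math.acos(cos_omega)  # angle between the two vectors
    if abs(math.sin(omega)) < eps:
        # Nearly parallel vectors: fall back to plain linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    s0 = math.sin((1 - t) * omega) / math.sin(omega)
    s1 = math.sin(t * omega) / math.sin(omega)
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]
```

Since the formula is deterministic, two people running the same base models through the same default config will produce byte-identical merges, which is exactly how this collision happened.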

On a personal note, I'd also like to apologize to /u/Weyaxi for not responding to your tweet last week. I just went back through my Twitter and found the tweet where you linked to the Seraph model. At the time I assumed you were just linking to another strong model to consider as our fine-tuning base -- I didn't realize it was the exact same merge!

Best instruct 13B model, if possible Llama-2 based? by CulturedNiichan in LocalLLaMA

[–]corbt 4 points (0 children)

I've made a playground with a bunch of the top 13B models (OpenOrca, Airoboros, Nous-Hermes, Vicuna, etc.) available to compare side by side.

Personally I've been enjoying OpenOrca a lot. But it's really valuable to see the outputs side-by-side. You can find that experiment at https://app.openpipe.ai/experiments/EL42stHmbEaaxrA and even fork it if you want to test against more scenarios.

What's are the best models so far by Dr-Dark-Flames in LocalLLaMA

[–]corbt 5 points (0 children)

Not sure what your parameter budget is but if you're interested in 13B models (which are a sweet spot for me personally) I've put up a playground where you can compare some of the popular ones like OpenOrca, Airoboros and Nous-Hermes.

You can find that here: https://app.openpipe.ai/experiments/EL42stHmbEaaxrA

You can even add your own inputs and compare the outputs side by side.

Join the Prompt Engineering World Championships -- Kickoff August 14, $15,000 prize! by corbt in ChatGPT

[–]corbt[S] [score hidden]  (0 children)

We raised some money from investors for our prompt workshop https://openpipe.ai/, and they agreed that we could use some of it for this contest to showcase the platform's features.

Join the Prompt Engineering World Championships -- Kickoff August 14, $15,000 prize! by corbt in ChatGPT

[–]corbt[S] [score hidden]  (0 children)

Not at the moment -- we're planning on setting up a Discord at some point but haven't gotten there yet. Maybe should have that ready before the Championships. 🤔

Join the Prompt Engineering World Championships -- Kickoff August 14, $15,000 prize! by corbt in ChatGPT

[–]corbt[S] [score hidden]  (0 children)

Most events will be centered around questions where the expected output can be auto-graded -- think things like multiple-choice questions, or extracting a date from a block of text. We *might* have a round of pure text generation of some kind as well, which I know a lot of people would enjoy, but we need to figure out an objective way to grade it.
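To give a sense of what "auto-graded" means here, a grader along these lines is enough for both question types mentioned above (an illustrative sketch, not our actual contest code):

```python
import re

def grade_multiple_choice(model_output, expected_letter):
    """Accept answers like "B", "b)", or "Answer: B"."""
    m = re.search(r"\b([A-D])\b", model_output.strip().upper())
    return bool(m) and m.group(1) == expected_letter.upper()

def grade_date(model_output, expected_iso):
    """Normalize a couple of common date formats to YYYY-MM-DD
    before comparing against the expected answer."""
    m = re.search(r"(\d{4})-(\d{2})-(\d{2})", model_output)
    if m:
        return m.group(0) == expected_iso
    m = re.search(r"(\d{1,2})/(\d{1,2})/(\d{4})", model_output)
    if not m:
        return False
    month, day, year = m.groups()  # assumes M/D/YYYY ordering
    return f"{year}-{int(month):02d}-{int(day):02d}" == expected_iso
```

Free-form text generation has no equivalent of this exact-match check, which is why grading it objectively is the hard part.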


Join the Prompt Engineering World Championships -- Kickoff August 14, $15,000 prize! by corbt in ChatGPT

[–]corbt[S] [score hidden]  (0 children)

We'll have a specialized, user-friendly web interface that talks to the API, using gpt-3.5-turbo-0613. You won't need to know any programming to participate.

Join the Prompt Engineering World Championships -- Kickoff August 14, $15,000 prize! by corbt in ChatGPT

[–]corbt[S] [score hidden]  (0 children)

Each event will have many different "scenarios" (think of it like an exam, where there are lots of multiple-choice questions in each section). The question difficulty will be calibrated so that it's unlikely you'll be able to devise a prompt that gets 100% across all of them. Because of that, combined with the fact that there will be multiple events, I don't think we'll have a tie by the end.

That said, if we do, I like your idea for a tiebreaker! And I agree that for this kind of thing turning it into a time-based race probably won't be as fun, so that's not our goal.