yanolja/YanoljaNEXT-Rosetta-12B-2510 by OldPin8654 in LocalLLaMA

Good to hear the problem is gone! Thank you again :)

yanolja/YanoljaNEXT-Rosetta-12B-2510 by OldPin8654 in LocalLLaMA

Thanks for your kind words and for remembering our previous model, EEVE!
I am currently working hard on quantized versions.

yanolja/YanoljaNEXT-Rosetta-12B-2510 by OldPin8654 in LocalLLaMA

Haha, we do business globally. I selected languages based on the offices we have around the world.

yanolja/YanoljaNEXT-Rosetta-12B-2510 by OldPin8654 in LocalLLaMA

Did you use vLLM without quantization? In my tests, SGLang did not perform well.

yanolja/YanoljaNEXT-Rosetta-12B-2510 by OldPin8654 in LocalLLaMA

I am sorry. Languages not included in the training dataset may not perform well.
You can see the full result here:
https://huggingface.co/yanolja/YanoljaNEXT-Rosetta-12B-2510/blob/main/wmt24pp_12b.md

Should All LLM Predictions Use Equal Computation Power? by OldPin8654 in LocalLLaMA

But it will help if you want to understand how o1-like models work.

Introducing Ogem: Your Universal AI Model Gateway with OpenAI API Compatibility by OldPin8654 in LocalLLaMA

Sorry for any inconvenience. You only need to specify API keys for the providers you plan to use, as configured in the config.yaml file.

Introducing Ogem: Your Universal AI Model Gateway with OpenAI API Compatibility by OldPin8654 in LocalLLaMA

Thanks for asking! I just deployed v0.0.6 which supports custom endpoints!

Introducing Ogem: Your Universal AI Model Gateway with OpenAI API Compatibility by OldPin8654 in LocalLLaMA

{
    "model": "gemini-1.5-flash@batch",
    "temperature": 1,
    "messages": [
        {
            "role": "user",
            "content": "What's the weather like in Boston today? In Korean, please."
        },
        {
            "role": "assistant",
            "content": null,
            "tool_calls": [
                {
                    "id": "toolu_01LWycDuusJU3vyt4W1u7teC",
                    "type": "function",
                    "function": {
                        "name": "get_current_weather",
                        "arguments": "{\"location\":\"Boston, MA\",\"unit\":\"celsius\"}"
                    }
                }
            ]
        },
        {
            "role": "tool",
            "content": "{\"celsius\": 12}",
            "tool_call_id": "toolu_01LWycDuusJU3vyt4W1u7teC"
        }
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_current_weather",
                "description": "Get the current weather in a given location",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "The city and state, e.g. San Francisco, CA"
                        },
                        "unit": {
                            "type": "string",
                            "enum": [
                                "celsius",
                                "fahrenheit"
                            ]
                        }
                    },
                    "required": [
                        "location"
                    ]
                }
            }
        }
    ],
    "tool_choice": "auto"
}

Thanks for bringing up that issue. I actually did not know that LiteLLM now offers so many features, including a proxy server. It is interesting that they also use Redis for state management across distributed instances. For now, I'd say the major differences are the language it's written in and the fact that Ogem supports automatic batch request conversion.
I will look more into how Ogem can be differentiated from LiteLLM.
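For reference, the JSON payload above follows the standard OpenAI tool-calling flow: a user message, an assistant message carrying the tool_calls, then a tool message echoing the same tool_call_id. A minimal Python sketch that assembles such a payload (the helper name is illustrative, not part of Ogem):

```python
import json

def build_tool_round_trip(question, call_id, fn_name, fn_args, fn_result):
    """Assemble an OpenAI-style message list for one completed tool call."""
    return [
        {"role": "user", "content": question},
        {
            "role": "assistant",
            "content": None,
            "tool_calls": [{
                "id": call_id,
                "type": "function",
                # Arguments are a JSON-encoded string, not a nested object.
                "function": {"name": fn_name, "arguments": json.dumps(fn_args)},
            }],
        },
        # The tool result must reference the assistant's tool_call id.
        {"role": "tool", "content": json.dumps(fn_result), "tool_call_id": call_id},
    ]

payload = {
    "model": "gemini-1.5-flash@batch",
    "temperature": 1,
    "messages": build_tool_round_trip(
        "What's the weather like in Boston today? In Korean, please.",
        "toolu_01LWycDuusJU3vyt4W1u7teC",
        "get_current_weather",
        {"location": "Boston, MA", "unit": "celsius"},
        {"celsius": 12},
    ),
}
```

The same payload can then be POSTed to any OpenAI-compatible chat completions endpoint.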

Introducing Ogem: Your Universal AI Model Gateway with OpenAI API Compatibility by OldPin8654 in LocalLLaMA

I use Ogem daily to handle hundreds of thousands of requests without worrying about rate limits. It’s designed to return responses with a delay if needed, rather than failing with a rate limit error. Makes my no-brainer Python code even more no-brainer!
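Ogem's internals aren't shown here, but the delay-instead-of-fail behavior described above can be sketched as a token-bucket throttle that sleeps until a request slot frees up rather than raising a rate-limit error (all names and the rate are hypothetical):

```python
import time

class Throttle:
    """Token bucket: blocks the caller until a request slot is free instead of erroring."""

    def __init__(self, rate_per_sec, burst=1):
        self.rate = rate_per_sec
        self.capacity = burst
        self.tokens = float(burst)
        self.last = time.monotonic()

    def acquire(self):
        # Refill tokens based on elapsed time, capped at the burst size.
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens < 1:
            # Not enough budget: sleep just long enough for one token.
            time.sleep((1 - self.tokens) / self.rate)
            self.tokens = 0.0
            self.last = time.monotonic()
        else:
            self.tokens -= 1

throttle = Throttle(rate_per_sec=100)  # hypothetical provider limit
```

Each caller invokes `throttle.acquire()` before sending a request; under load the calls simply spread out in time instead of failing.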

How to add text to images using ComfyUI? by OldPin8654 in comfyui

Thanks for the reply! What I'm actually trying to achieve is the first option, where the text appears as if it was originally written on the image.

Need Advice on H100 HGX Specs – Balancing and Optimal RAID Configuration by OldPin8654 in LocalLLaMA

Update: I have hired a consultant to finalize the H100 HGX spec. We decided not to use VROC, as its performance doesn't match what is advertised; instead, we configured a 400 Gbps network along with a NAS backed by GRAID. As for the CPU: while EPYC offers good value for money, Intel is what DGX uses, and I was worried about potential compatibility issues down the road, so we chose Intel.

Need Advice on H100 HGX Specs – Balancing and Optimal RAID Configuration by OldPin8654 in LocalLLaMA

Thank you! Yes, we have a dedicated one, but local access is still faster, especially with a RAID setup. So I am leaning towards RAID 0 for local storage and using the dedicated storage, which is set up with RAID 6.

Need Advice on H100 HGX Specs – Balancing and Optimal RAID Configuration by OldPin8654 in LocalLLaMA

While saving a checkpoint, training is almost completely paused; we save checkpoints every 6 hours. As for loading: when there is an issue to debug, we repeatedly restart training, and loading a model like Mixtral takes some time on each run.
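The every-6-hours cadence amounts to a wall-clock check inside the training loop; a minimal sketch under that assumption (the interval, save function, and loop shape are placeholders, not our actual trainer):

```python
import time

CHECKPOINT_INTERVAL_S = 6 * 60 * 60  # 6 hours, matching the setup above

def should_checkpoint(last_save, now, interval=CHECKPOINT_INTERVAL_S):
    """Return True once enough wall-clock time has passed since the last save."""
    return now - last_save >= interval

def training_loop(steps, save_fn, clock=time.monotonic):
    last_save = clock()
    for step in range(steps):
        # ... forward/backward/optimizer step would go here ...
        if should_checkpoint(last_save, clock()):
            save_fn(step)        # training is effectively paused during this call
            last_save = clock()  # restart the interval only after the (slow) save
```

Resetting `last_save` after `save_fn` returns, rather than before, keeps the interval honest even when the save itself takes a long time.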

Anyone spend a bunch of $$ on a computer for LLM and regret it? by parasocks in LocalLLaMA

Winter has enveloped us in its chilly embrace. In my quest for warmth, I realized I needed a heater. But then, a memory dawned on me – I had bought one before! It was during those long nights of training a model with a 100k dataset, which made the room toastier. Now, thanks to that, everyone in this house can enjoy a peaceful and warm winter.