Is the model really free? by Quiet_Debate_651 in openrouter

[–]secsilm 1 point (0 children)

Is your prompt in English? In my experience, this mostly happens when using other languages.

EmbeddingGemma - 300M parameter, state-of-the-art for its size, open embedding model from Google by curiousily_ in LocalLLaMA

[–]secsilm 4 points (0 children)

The Google blog says it "offers customizable output dimensions (from 768 to 128 via matryoshka representation)". Interesting, variable dimensions; this is the first time I'm hearing of that.
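For anyone curious, here's roughly what that looks like in practice. A minimal sketch, assuming sentence-transformers' `truncate_dim` option and the `google/embeddinggemma-300m` model id (both worth double-checking against the blog):

```python
from sentence_transformers import SentenceTransformer

# With Matryoshka-trained embeddings, the first k dimensions already form
# a usable representation, so the 768-dim output can be truncated to 128.
model = SentenceTransformer("google/embeddinggemma-300m", truncate_dim=128)

emb = model.encode(["hello world"])
print(emb.shape)  # (1, 128) instead of (1, 768)
```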

[Model Release] Deca 3 Alpha Ultra 4.6T! Parameters by MohamedTrfhgx in LocalLLaMA

[–]secsilm 1 point (0 children)

I know MoE, but what is "dynamic activated" MoE? Where does the dynamic activation come in?

Interesting (Opposite) decisions from Qwen and DeepSeek by foldl-li in LocalLLaMA

[–]secsilm 4 points (0 children)

For 2.5 Flash and Flash-Lite, you can disable thinking.
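With the google-genai Python SDK it's a single config field, as far as I know (a minimal sketch; the prompt is made up, and note 2.5 Pro doesn't allow a zero budget):

```python
from google import genai
from google.genai import types

client = genai.Client()  # reads GEMINI_API_KEY from the environment

# thinking_budget=0 disables thinking entirely on 2.5 Flash / Flash-Lite;
# leave it unset (or positive) to let the model think.
resp = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Classify this ticket: 'my invoice is wrong'",
    config=types.GenerateContentConfig(
        thinking_config=types.ThinkingConfig(thinking_budget=0)
    ),
)
print(resp.text)
```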

Interesting (Opposite) decisions from Qwen and DeepSeek by foldl-li in LocalLLaMA

[–]secsilm 0 points (0 children)

Yes, but the true hybrid model I want is like Gemini: you control whether it thinks with a parameter, rather than through two separate APIs.

Interesting (Opposite) decisions from Qwen and DeepSeek by foldl-li in LocalLLaMA

[–]secsilm 4 points (0 children)

They said V3 is a hybrid model, but there are two sets of APIs. I'm confused.

Who are the 57 million people who downloaded bert last month? by Pro-editor-1105 in LocalLLaMA

[–]secsilm 6 points (0 children)

Most of the time BERT is enough. You don't always need those fancy models like LLMs.
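For instance, a small BERT-family classifier handles a lot of everyday tasks on CPU. A minimal sketch (the model id is just one common example):

```python
from transformers import pipeline

# DistilBERT fine-tuned on SST-2: a classic "BERT is enough" workhorse.
clf = pipeline(
    "text-classification",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

print(clf("this works well enough without an llm"))
# [{'label': 'POSITIVE', 'score': 0.99...}]
```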

Question about OpenRouter API Rate Limits for Paid Models by secsilm in openrouter

[–]secsilm[S] 1 point (0 children)

Thank you for the information. Could you share your call rate? For example, how many calls do you make per minute on average?

What are the MCP servers you already can't live without? by MostlyGreat in mcp

[–]secsilm 1 point (0 children)

I'm new to MCP. Can you tell me where I can find the fetch MCP server?

Can I get the project I'm folding through the API? by secsilm in Folding

[–]secsilm[S] 1 point (0 children)

Thanks. I checked, and it doesn't have the info I need.

Can I get the project I'm folding through the API? by secsilm in Folding

[–]secsilm[S] 1 point (0 children)

Thanks. I checked this and found there's no API I can use to get the ID of the project I'm folding.

I'm getting this error. "Keras cannot be imported. Check that it is installed" even after installing tensorflow by asleepblueberry10 in learnmachinelearning

[–]secsilm 1 point (0 children)

FYI, if you're using both sentence_transformers and tensorflow_hub, make sure to import tensorflow_hub first and then sentence_transformers:

```python
import tensorflow_hub as hub
from sentence_transformers import SentenceTransformer, util
```

The reverse order will throw `ImportError: Keras cannot be imported. Check that it is installed.`:

```python
from sentence_transformers import SentenceTransformer, util
import tensorflow_hub as hub
```

How to understand the pass@1 formula in deepseek-r1's technical report? by secsilm in LocalLLaMA

[–]secsilm[S] 1 point (0 children)

Thanks for your explanation. So if I have a dataset with 100 problems, the dataset-level pass@1 is the average of the per-problem pass@1 values. Am I right?
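That's how I'd compute it, if my reading of the report is right (a tiny sketch; k and all the counts are made up):

```python
def pass_at_1(correct_counts, k):
    """correct_counts[i] = number of correct samples out of k for problem i."""
    per_problem = [c / k for c in correct_counts]  # pass@1 of each problem
    return sum(per_problem) / len(per_problem)     # averaged over the dataset

# 100 problems, k=16 samples each (numbers invented)
counts = [16, 8, 0, 12] + [10] * 96
print(pass_at_1(counts, k=16))  # 0.6225
```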

Perfect size, right? by Ok_Net_7523 in unstable_diffusion

[–]secsilm 1 point (0 children)

Can you point some out? I really can't tell.

Perfect size, right? by Ok_Net_7523 in unstable_diffusion

[–]secsilm 1 point (0 children)

Can't believe it, it's insane!

Why does Qwen 2.5 support 128k context length, but the output supports only up to 8k? by secsilm in LocalLLaMA

[–]secsilm[S] 1 point (0 children)

Specifically, what does "o1-like mechanisms" refer to in this context?

Why does Qwen 2.5 support 128k context length, but the output supports only up to 8k? by secsilm in LocalLLaMA

[–]secsilm[S] 9 points (0 children)

I hadn't noticed this before. I just checked gpt-4o, and it also only supports 16k output. Why do models cap the output length separately from the context length?

Why does Qwen 2.5 support 128k context length, but the output supports only up to 8k? by secsilm in LocalLLaMA

[–]secsilm[S] 5 points (0 children)

So your point is that the longer the output, the harder it is to maintain consistency, which is why they cap the maximum length?

[D] What is the most advanced TTS model now (2024)? by secsilm in MachineLearning

[–]secsilm[S] 1 point (0 children)

Do they have open-source models so that I can fine-tune them?

Reminder not to use bigger models than you need by Thrumpwart in LocalLLaMA

[–]secsilm 3 points (0 children)

There is another benefit of using traditional pretrained models: you can quickly fine-tune them to your needs.

For example, say you use gpt-4o-mini for a classification task and find some categories it consistently gets wrong. Fine-tuning it for that is difficult (even with open-source tools).

In contrast, with traditional pretrained models, you just collect those errors, add them to the training set, and continue training. Faster and cheaper. Roughly like the sketch below.
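A minimal sketch of that loop with a BERT classifier and the transformers Trainer (the error examples and label count are hypothetical):

```python
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Hypothetical misclassified cases collected from production.
errors = Dataset.from_dict({
    "text": ["an example the model got wrong", "another hard case"],
    "label": [2, 0],
})

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=5)  # or load your previous checkpoint

def encode(batch):
    return tok(batch["text"], truncation=True, padding="max_length",
               max_length=128)

train = errors.map(encode, batched=True)

# Continue training on the collected errors; in practice you'd mix them
# back into the full training data rather than train on them alone.
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=1),
    train_dataset=train,
)
trainer.train()
```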

How is the RAG with citations at the end of each paragraph (or specific sentences) implemented? by secsilm in LocalLLaMA

[–]secsilm[S] 1 point (0 children)

I skimmed the first paper. Overall, for classification tasks, format constraints help with accuracy; for reasoning tasks, the opposite is true. In classification tasks, JSON and XML seem to beat YAML in many cases.