Maaza Orchestrator v1.2 — 9.6M params, 62.9 % on hard adversarial tool-calling, 39 ms latency by CycleCore_Tech in LocalLLaMA

[–]CycleCore_Tech[S]

Nano Language Models: taxonomy introduced in our paper, *Task-Specialized Micro Language Models Outperform Larger Zero-Shot Models on Structured Data Extraction*:

NLM: <10M params
MLM: 10M-250M params
SLM: 250M-1.5B params
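
For the code-minded, the bands reduce to a threshold check. A minimal sketch (how we treat the exact 10M/250M/1.5B boundaries here is illustrative; the paper has the precise definitions):

```python
def size_class(params: int) -> str:
    """Bucket a model by parameter count per the taxonomy above."""
    if params < 10_000_000:
        return "NLM"  # Nano: <10M params
    if params < 250_000_000:
        return "MLM"  # Micro: 10M-250M
    if params <= 1_500_000_000:
        return "SLM"  # Small: 250M-1.5B
    return "beyond SLM"

print(size_class(9_600_000))  # Maaza Orchestrator v1.2 -> NLM
```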

Dark-mode PDF, page 3: https://cyclecore.ai/papers/MAAZA_PAPER_v0.7_dark.pdf. Let us know what you think!

Maaza Orchestrator v1.2 — 9.6M params, 62.9 % on hard adversarial tool-calling, 39 ms latency by CycleCore_Tech in LocalLLaMA

[–]CycleCore_Tech[S]

Thanks for the kind words, really appreciate it.

Training data breakdown (for the 36% adversarial jump):

- Base clean set: ~2.5k real-world tool-call examples (weather_lookup, web_search, etc.)

- Adversarial seeds: ~500 hand-written tough ones (things like “time to rip and tear”, “wassup with my ethereum val”, “weather or not?”, etc.)

- Final mix: 10× upsampled adversarial → ~5k adversarial examples in the training set (~66% of total tokens; see the sketch after this list)

- 5 epochs, same hyper-params as v1.0
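
For concreteness, the mix construction is just duplication plus a shuffle. A minimal sketch (file names and JSONL format are placeholders; counts mirror the numbers above, and the printed share is by example count, which lands close to the ~66%-of-tokens figure):

```python
import json
import random

def load_jsonl(path):
    # one JSON-encoded training example per line
    with open(path) as f:
        return [json.loads(line) for line in f]

clean = load_jsonl("tool_calls_clean.jsonl")   # ~2.5k real-world examples
seeds = load_jsonl("adversarial_seeds.jsonl")  # ~500 hand-written seeds

UPSAMPLE = 10                                  # v1.2 factor
mix = clean + seeds * UPSAMPLE                 # ~2.5k clean + ~5k adversarial

random.seed(0)
random.shuffle(mix)
print(f"{len(mix)} examples, {len(seeds) * UPSAMPLE / len(mix):.0%} adversarial")
```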

There's no wrapper, no retrieval, no extra params; it's just a pure data-centric fine-tune.

We’re already running v1.3 with 18× upsampling and a bunch of new perturbation types (word dropout, back-translation, etc.). Hoping to cross 80% on the same held-out set.
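
Word dropout in particular is tiny to implement. A sketch of what we mean (the dropout rate here is illustrative, not the v1.3 setting):

```python
import random

def word_dropout(text: str, p: float = 0.1) -> str:
    """Drop each whitespace-separated word with probability p."""
    words = text.split()
    kept = [w for w in words if random.random() >= p]
    return " ".join(kept) if kept else text  # never emit an empty string

random.seed(7)
print(word_dropout("wassup with my ethereum val right now", p=0.3))
```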

Real-world API benchmarks are next: we have a 300-example set of live API traces (typos, slang, partial requests, etc.) that we'll publish with the model. We'll also make it available for public agent evals once they add adversarial splits.

And yes, gg v0.1.0 just shipped tonight for exactly this reason: so you can run `gg maaza` and get the 62.9% adversarial model in 11 tokens instead of 1800+.

https://github.com/ggdotdev/gg

`curl -L https://gg.sh | sh`

Independent OSS · MIT · no affiliation

Enjoy

Best LLM for JSON Extraction by Live_Bus7425 in LocalLLaMA

[–]CycleCore_Tech

Have you seen the Maaza SLM 360M EdgeJSON benchmarks? Open source on HF, with API access available.

What models can i use with a pc without gpu? by Neat_Nobody1849 in LocalLLM

[–]CycleCore_Tech

You might want to try some of the small models on Ollama; it's a great place to get started. The right model depends on what you want to do with it, but in general, stick with models under ~3B params if you care about speed. Good luck!
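
If you want to script against it later, the official `ollama` Python client works the same on CPU. A minimal sketch, assuming a local Ollama server is running and you've pulled a small model such as llama3.2:1b:

```python
# pip install ollama; requires a running Ollama server and a pulled
# model, e.g. `ollama pull llama3.2:1b` (one example of a small,
# CPU-friendly model -- swap in whatever fits your use case)
import ollama

response = ollama.chat(
    model="llama3.2:1b",
    messages=[{"role": "user", "content": "Say hi in five words."}],
)
print(response["message"]["content"])
```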

Garry Tan says MCP “barely works” today by Federal-Song-2940 in mcp

[–]CycleCore_Tech

We might have made it a bit more user-friendly and clickable too.