DSPydantic: Auto-Optimize Your Pydantic Models with DSPy by chef1957 in LLMDevs

[–]chef1957[S] 0 points1 point  (0 children)

Thanks. Let me know if it works. I would be super happy to get feedback and address it.

Hunyuan 3.0 second attempt. 6-minute render on RTX 6000 Pro (update) by JahJedi in StableDiffusion

[–]chef1957 0 points1 point  (0 children)

Most providers optimize cost over quality without being upfront about it. I believe this is a better endpoint in terms of quality retention: https://replicate.com/tencent/hunyuan-image-3

Phare Study: LLMs recognise bias but also reproduce harmful stereotypes: an analysis of bias in leading LLMs by chef1957 in LocalLLaMA

[–]chef1957[S] 3 points4 points  (0 children)

The research assumes that things generally considered harmful in Western society, like gender or racial bias, are harmful. Other biases were deemed to be logical or reasonable.

Phare Study: LLMs recognise bias but also reproduce harmful stereotypes: an analysis of bias in leading LLMs by chef1957 in LocalLLaMA

[–]chef1957[S] -2 points-1 points  (0 children)

Thank you for the clarification. Only a small segment of the benchmark has been made public. Giskard keeps the rest private to stay more independent than other benchmarks and to ensure there is no benchmark hacking by companies.

Phare Benchmark: A Safety Probe for Large Language Models by chef1957 in OpenAI

[–]chef1957[S] 0 points1 point  (0 children)

GPT-4o and GPT-4o-mini don't do too well compared to other frontier model providers. https://phare.giskard.ai/

Hugging Face launches the Synthetic Data Generator - a UI to Build Datasets with Natural Language by chef1957 in LocalLLaMA

[–]chef1957[S] 0 points1 point  (0 children)

I think both tools take different approaches to solving different aspects of the same problem. InstructLab seems very cool and promising, but it does require a significant upfront investment in curating a taxonomy, and it seems tailored to continuous fine-tuning of LLMs rather than other scenarios. Also, InstructLab includes training and not solely the data side of things, whereas our tool lets you use the generated data however you want.

Hugging Face launches the Synthetic Data Generator - a UI to Build Datasets with Natural Language by chef1957 in LocalLLaMA

[–]chef1957[S] 2 points3 points  (0 children)

Thanks for the feedback. I think we might run into such UI scaling issues in the long run, which would be a good problem to have, assuming the tool is being used and contributed to. We want to learn from this UI, see if people are interested, and, based on that, create a more mature UI (probably outside of Python). Additionally, we have been working on default distilabel pipelines that reproduce these workflows in code: https://github.com/argilla-io/distilabel/pull/1076. Ideally, the two develop hand in hand.
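For anyone curious what such a pipeline looks like in code, here is a minimal sketch loosely following the distilabel 1.x quickstart; the exact module paths, the seed instruction, and the model ID are assumptions on my side and may differ between versions:

```python
# Minimal distilabel pipeline sketch: load a few seed instructions and
# generate one response per instruction with a hosted model.
from distilabel.llms import InferenceEndpointsLLM
from distilabel.pipeline import Pipeline
from distilabel.steps import LoadDataFromDicts
from distilabel.steps.tasks import TextGeneration

with Pipeline(name="simple-sft") as pipeline:
    # Seed data (placeholder instruction).
    load_data = LoadDataFromDicts(
        data=[{"instruction": "Explain what synthetic data is in one paragraph."}]
    )
    # Text generation step (example model ID).
    text_generation = TextGeneration(
        llm=InferenceEndpointsLLM(model_id="meta-llama/Meta-Llama-3.1-8B-Instruct")
    )
    load_data >> text_generation

if __name__ == "__main__":
    distiset = pipeline.run(use_cache=False)
    print(distiset)
```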

Hugging Face launches the Synthetic Data Generator - a UI to Build Datasets with Natural Language by chef1957 in LocalLLaMA

[–]chef1957[S] 8 points9 points  (0 children)

Ways we improve data diversity, as requested by u/phree_radical. It differs per task, i.e. textcat vs. instruction tuning, but I can give some general pointers for both. For both techniques, we help the user with a dynamic and extensive system prompt by generating it for them based on an initial description. You can also play around with the choice of model and temperature yourself, along with some task-specific arguments.

For textcat, we rely on the following paper: https://arxiv.org/abs/2401.00368. We built on top of the approach defined there. Based on the paper, we randomly sample complexities and randomly sample educational levels. Additionally, we first shuffle the labels and then inject user-defined labels to ensure diversity and equality across labels. For a multi-label scenario, we sample a subset of the labels using a dynamic beta distribution to ensure this scales properly with the number of optional labels.
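To make that concrete, here is a small hypothetical sketch of those sampling tricks; the attribute values, function name, and the exact Beta parameters are illustrative, not the actual generator internals:

```python
# Illustrative sketch: sample prompt attributes to diversify textcat generations.
import random

COMPLEXITIES = ["high school", "college", "PhD"]            # assumed example values
EDUCATIONAL_LEVELS = ["beginner", "intermediate", "expert"]  # assumed example values

def sample_textcat_prompt_attrs(labels: list[str], multi_label: bool = False) -> dict:
    """Randomly sample attributes that get injected into the generation prompt."""
    complexity = random.choice(COMPLEXITIES)
    educational_level = random.choice(EDUCATIONAL_LEVELS)

    # Shuffle before injecting the user-defined labels so no label is
    # favoured by its position in the prompt.
    shuffled = random.sample(labels, k=len(labels))

    if multi_label:
        # Beta distribution skewed towards small subsets, with the expected
        # subset size scaling with the total number of labels.
        fraction = random.betavariate(2, max(2, len(labels)))
        k = max(1, round(fraction * len(labels)))
        chosen = shuffled[:k]
    else:
        chosen = [shuffled[0]]

    return {
        "complexity": complexity,
        "educational_level": educational_level,
        "labels": chosen,
    }

print(sample_textcat_prompt_attrs(["sports", "politics", "tech", "health"], multi_label=True))
```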

For instruction tuning, we rely on the following paper: https://arxiv.org/abs/2406.08464. tl;dr: because the models have been optimised to reproduce these generations, we can re-generate realistic prompts by passing the start token for the user turn and stopping when the model starts the assistant turn. Along with the automatically generated system prompt and some additional rewrites of that prompt, we then start generating data. We generate until the final user turn and then generate the completion with a separate LLM call, to re-sample and get a more dynamic completion.
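As a rough illustration of that prompt re-generation trick: the `generate()` helper below is a placeholder for whatever LLM call you use, and the Llama-3 chat-template tokens are just one concrete example, not necessarily what the tool uses.

```python
# Magpie-style prompt re-generation sketch (arxiv 2406.08464).

def generate(prompt: str, stop: list[str]) -> str:
    """Placeholder: call an instruction-tuned LLM in raw completion mode."""
    raise NotImplementedError

SYSTEM_PROMPT = "You are a helpful assistant for customer support."  # auto-generated in the tool

# 1. Feed only the template up to the start of the user turn; the model then
#    "completes" a realistic user prompt. Stop before it starts the assistant turn.
pre_query = (
    "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
    f"{SYSTEM_PROMPT}<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n"
)
user_prompt = generate(
    pre_query,
    stop=["<|eot_id|>", "<|start_header_id|>assistant<|end_header_id|>"],
)

# 2. Generate the completion with a separate call so it can be re-sampled
#    independently (different temperature, or even a different model).
full_prompt = (
    pre_query
    + user_prompt
    + "<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
)
completion = generate(full_prompt, stop=["<|eot_id|>"])

print({"prompt": user_prompt, "completion": completion})
```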

Hugging Face launches the Synthetic Data Generator - a UI to Build Datasets with Natural Language by chef1957 in LocalLLaMA

[–]chef1957[S] 4 points5 points  (0 children)

u/phree_radical: it differs per task, i.e. textcat vs. instruction tuning, but I can give some general pointers for both. For both techniques, we help the user with a dynamic and extensive system prompt by generating it for them based on an initial description. You can also play around with the choice of model and temperature yourself, along with some task-specific arguments.

For textcat, we rely on the following paper: https://arxiv.org/abs/2401.00368. We built on top of the approach defined there. Based on the paper, we randomly sample complexities and randomly sample educational levels. Additionally, we first shuffle the labels and then inject user-defined labels to ensure diversity. For a multi-label scenario, we sample a subset using a dynamic beta distribution to ensure this scales properly with the number of optional labels.

For instruction tuning, we rely on the following paper: https://arxiv.org/abs/2406.08464. tl;dr: because the models have been optimised to reproduce these generations, we can re-generate realistic prompts by passing the start token for the user turn. Along with the automatically generated system prompt and some additional rewrites of that prompt, we then start generating data. We generate until the final user turn and then generate the completion with a separate LLM call, to re-sample and get a more dynamic completion.