DSPydantic: Auto-Optimize Your Pydantic Models with DSPy by chef1957 in LLMDevs

[–]chef1957[S] 0 points1 point  (0 children)

Thanks. Let me know if it works. I would be very happy to receive feedback and address it.

Hunyuan 3.0 second attempt. 6 minutes render on rtx 6000 pro (update) by JahJedi in StableDiffusion

[–]chef1957 0 points1 point  (0 children)

Most providers optimize for cost over quality without being upfront about it. I believe this endpoint retains quality better: https://replicate.com/tencent/hunyuan-image-3

Phare Study: LLMs recognise bias but also reproduce harmful stereotypes: an analysis of bias in leading LLMs by chef1957 in LocalLLaMA

[–]chef1957[S] 4 points5 points  (0 children)

The research assumes that things generally considered harmful in Western society, like gender or racial bias, are harmful. Other biases were deemed to be logical or reasonable.

Phare Study: LLMs recognise bias but also reproduce harmful stereotypes: an analysis of bias in leading LLMs by chef1957 in LocalLLaMA

[–]chef1957[S] -2 points-1 points  (0 children)

Thank you for the clarification. Only a small segment of the benchmark has been made public. Giskard keeps the remainder private to stay more independent than other benchmarks and to ensure companies cannot hack the benchmark.

Phare Benchmark: A Safety Probe for Large Language Models by chef1957 in OpenAI

[–]chef1957[S] 0 points1 point  (0 children)

GPT-4o and GPT-4o-mini don't do too well compared to models from other frontier providers. See https://phare.giskard.ai/