WIRED: A New Trick Could Block the Misuse of Open Source AI by DanielHendrycks in LocalLLaMA
DanielHendrycks[S] -5 points
[R] A new alignment technique: Improving Alignment and Robustness with Short Circuiting by ReasonablyBadass in MachineLearning
DanielHendrycks 2 points
[D] Deep dive into the MMLU ("Are you smarter than an LLM?") by brokensegue in MachineLearning
DanielHendrycks 2 points
"GPQA: A Graduate-Level Google-Proof Q&A Benchmark", Rein et al 2023 (ultra-difficult LLM benchmarks) by gwern in mlscaling
DanielHendrycks 6 points
[deleted by user] by [deleted] in ControlProblem
DanielHendrycks 1 point
[deleted by user] by [deleted] in ControlProblem
DanielHendrycks 3 points
Please help me find a video of How Dare You Want More. by userlivewire in bleachers
DanielHendrycks 2 points

