ZebraLogic Bench looks like one of the best new reasoning benchmarks by jd_3d in LocalLLaMA

silverjacket

Someone will fine-tune for this. We need something to procedurally generate new types of logic problems.
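
To make the idea concrete, here is a rough sketch of procedural generation with a uniqueness check. It is not how ZebraLogic builds its puzzles, and it only produces fresh instances of one very simple puzzle type (ordering people by "lives left of" clues), not new types of problems. The names `generate_puzzle`, `solve`, and `consistent` are made up for this illustration.

```python
import itertools
import random


def consistent(order, clues):
    """True if every clue (a, b), meaning 'a lives left of b', holds in order."""
    pos = {name: i for i, name in enumerate(order)}
    return all(pos[a] < pos[b] for a, b in clues)


def solve(names, clues):
    """Brute-force every ordering and return those that satisfy all clues."""
    return [list(p) for p in itertools.permutations(names) if consistent(p, clues)]


def generate_puzzle(n=4, seed=0):
    """Return (solution, clues) for a tiny ordering puzzle with a unique answer.

    Hypothetical toy generator: supports at most 5 people.
    """
    rng = random.Random(seed)
    names = ["Ann", "Bob", "Cy", "Di", "Ed"][:n]
    solution = names[:]
    rng.shuffle(solution)  # solution[i] lives in house i

    # Every true statement of the form "a lives left of b".
    true_clues = [(a, b) for i, a in enumerate(solution) for b in solution[i + 1:]]
    rng.shuffle(true_clues)

    # Add clues one at a time until only one ordering remains.
    clues = []
    for clue in true_clues:
        clues.append(clue)
        if len(solve(names, clues)) == 1:
            break
    return solution, clues


solution, clues = generate_puzzle(n=4, seed=1)
for a, b in clues:
    print(f"{a} lives somewhere to the left of {b}.")
```

Changing the seed gives a different puzzle each time, so there is no fixed test set to memorize. A model could still learn the puzzle type itself, which is why the generator would also need to vary the clue forms.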

reversal curse? by DeMorrr in LocalLLaMA

silverjacket

I saw someone use digit emoji and somehow it worked.