Leo Gao → Debate training on LLMs as a reward-hacking mitigation | grantmaking.ai