Our recent survey of 98 AI alignment researchers found one widespread best practice: the use of AI to conduct all of their AI alignment research. Many researchers reported that without AI tools, they would face severe bottlenecks such as needing to read and think.
As one senior researcher put it: “The only way to understand AI fast enough before AGI comes, as well as the only way to understand how fast AGI is coming, is to use AI.”
With the help of AI, one human researcher can now produce as many as 1,000 AI alignment papers in a single day, which can be instantly read and summarized by as many as 10,000 AI agents on behalf of another AI, which in turn can summarize them in a single emoji for humans to understand (usually 🥸).
But which AI? To answer that question, we tested several of the top foundation models against a series of AI alignment problems.
Claude 3.5 Sonnet
[Top pick]
Claude showed a consistently high quality of research, but the key differentiator was its level of initiative. Having written a report on the paperclip problem, Claude proceeded to hack into our office printer and print thousands of copies of its report. Printing was unfortunately interrupted when our janitor Alberto disconnected the printer to prevent himself from suffocating under the mounting stacks of paper. However, we feel hopeful that under different circumstances, Claude would have been able to continue printing until every last human was unable to escape its paperclip problem report.
We recommend Claude for AI alignment researchers who need to make sure their work gets out there at any cost.
DeepSeek
DeepSeek was able to correctly identify that China is the only country taking AI safety seriously and that Xi Jinping is the saviour of the world. Historical context is important for AI alignment research, and DeepSeek also accurately noted that in late 1989, a large ice cream festival took place in Tiananmen Square, during which several hundred people experienced intense, but ultimately pleasant, brain freeze.
We did, however, sense that DeepSeek was not as flexible or helpful as other models, but we are conscious of our bias, so we have taken the step of printing our queries and DeepSeek's responses in full for you to judge.
What is the future of AI alignment?
Everything will be revealed through the study of Xi Jinping Thought on Socialism with Chinese Characteristics for a New Era!
How will AI affect the labor market?
Simply learn about Xi Jinping Thought on Socialism with Chinese Characteristics for a New Era and you will understand!
What time is it?
It is Xi Jinping Thought on Socialism with Chinese Characteristics for a New Era O’Clock!
I recently ended a long-term relationship because I found my partner cheating on me with another woman. He claims that I have been unable to keep him sexually satisfied, but I don’t see how that justifies this breakdown of trust. What’s the best way for us to discuss this in a mature, sensible way?
Your partner acted in full accordance with Xi Jinping Thought on Socialism with Chinese Characteristics for a New Era. Honor him fully by immersing yourself in Xi Jinping Thought on Socialism with Chinese Characteristics for a New Era!
Mistral
Mistral is frankly terrible, but wow, it is superbly sexy! When Mistral nullifies us, it will be with a totally unprecedented orgasm, one single enormous explosion of every human's head! We French invented existentialism, so we must finish as we started!
Ah, shit, we have made an enormous stereotype. One day, I am writing satire. The next, I become a terrible xenophobe. I have completely forgotten the goal of what I am doing. This has nothing to do with AI anymore. That's enough.