There are two types of work: action and commentary
Some evals/forecasting/journalism are commentary
Commentary can change the state we’re in, but generally less than acting
Consider acting
I think the distinction between ‘action’ and ‘commentary’ is important and native to me. And I don’t think it’s native to all my friends—some of whom are going to work in evals, forecasting, and journalism.
Not all evals are ‘commentary’. Evals to catch schemers are very much ‘action’. You’re proactively anticipating failure modes and designing safeguards to counter them. This is honorable!
But evals like the METR Time-Horizon Benchmark or CAIS’s RLI, journalism like Reboot, and forecasting like AI 2027 are forms of ‘commentary’. They ask “what’s new, what’s likely?” And the audience throws peanuts as your commentary crashes into reality. Importantly, many evals don’t tell us much that we don’t already know or that wouldn’t already be obvious to a user of the system.
There are many ways to act. We need some groups raising as many concerns as they can, some groups working as fast as they can to address them. Some people allocating scarce funds, others converting them into differential technological progress. Generator-discriminator, challenge-answer.
These kinds of work are not ‘commentary’ but ‘action’:
Robustness, interpretability, training research
Policy research and advocacy
Disseminating calls to action
Bringing beneficial products to market
Ops in support of the above
…
There are ways to make commentary more action-like. If your commentary is coupled with a call to action—even a private one—that crosses into action. If there’s an org handling the ‘call to action’ that actually relies on and values your output, that counts. But you do need to check that that holds.
If all the commentary disappeared overnight, I don’t think I’d miss it. I may blog on the side but sincerely enjoin: steer clear of the chattering classes.
FAQ
Q: What’s the distinction between action and commentary?
A: I think my distinction between ‘action’ and ‘commentary’ is demarcated by two questions:
1. Is your audience diffuse or specific?
2. Is your ultimate intent to persuade* or to inform/entertain?
Specific, concrete, named audience + intent to persuade := action
Diffuse audience + intent to inform/entertain := commentary
*Conditional on that being the right thing to do, i.e. making recommendations or sending out a call to action, with the understanding that rebuttals may come.


intent to induce emotion?
Hmm, I have doubts about your distinction between commentary and action.
One natural way to slice it would be to define commentary as saying and action as doing. But you write that calls to action count as action, so this definition won't work.
Maybe commentary is saying what is the case, whereas action is saying what should be the case or making it the case yourself. But you also write that research counts as action, and the stereotypical thing a researcher does is to say what is the case.
Maybe commentary is saying what is the case when someone else has already said it is the case, whereas action is everything else. But then are the METR time horizon study and AI 2027 really commentary? I'm not aware of anyone who said "Every seven months, the maximum duration (for humans) of a SWE task AI can do doubles" before METR said so. I'm also not aware of anyone who said the AI 2027 timelines could happen before AI Futures said so.