A tale of two longform pieces du jour.
Nielsen: ‘Which Future?’
Influences: nuclear calculation calibration, dual-use considerations, misuse, fine-tuning attacks, Petrov / Arkhipov, Vulnerable World Hypothesis, market-supplied safety, optimal development timing, the tech tree, differential technological progress, societal alignment, safety of reality
Highlights:
That is: a deep enough understanding of reality is intrinsically dual use.
tags: #outside-view-ai-risk
The underlying issue is that such problems don’t reside in the systems; they reside in any deep understanding of the structure of the world. And so the question is, again, whether reality itself is safe. Trying to make a powerful aligned AI system is like trying to make matches “safe” for fire. You may briefly “succeed” very narrowly with a single guardrailed system, but the situation is intrinsically unstable. The idea of powerful and stably aligned AI systems is an oxymoron.
tags: #unstable-equilibria, #what’s-hiding-out-there, #known-and-unknown-unknowns
One intriguing idea is from my friend Hannu Rajaniemi, CEO of Red Queen Bio, who has suggested immune-computer interfaces. The idea is that people will wear devices which do real-time detection of environmental threats, and then develop and deploy countermeasures, also in real time. It’d be just-in-time immune system modulation, based on surveillance and response.
tags: things-that-might-be-worth-working-on-#1, #robustly-good, #advancing-defence
Another possibility is to secure the built environment. […] We don’t address the challenge of fire by putting guardrails on matches, making them “safe” or “aligned”. Instead, we align the entire external world through materials and surveillance and institutions.
tags: #robustly-good, #advancing-defence
That difference in intuition depends heavily on people's prior expertise – people with certain types of background find the Vulnerable World Hypothesis very plausible, while people without those backgrounds do not.
tags: #sociology-of-epistemics
It's a strange feedback loop: train more such technical alignment people, which accelerates the rate of adoption of the systems, which causes many more such people to be trained, which reinforces this mistaken focus and the collective concentration of power, at the expense of what I believe is the true primary threat.
tags: #you-palliate-a-bad-equilibrium
But market-supplied safety tends to struggle when: costs are illegible because of long timelines (e.g., asbestos, cigarettes, sugar); or are borne by third parties or collectively (e.g., polluting waterways, fire, air pollution, CO2 emissions). […]
Unfortunately, many of the downsides of ASI will be illegible for a long time, hidden in the models; harm will be a side effect of collective progress, not attributable to any single actor; and will arise in dual-use ways, creating mixed incentives.
tags: #undersupplied-by-the-market
any “optimism” which refuses to acknowledge genuine threats is a foolish optimism, especially when those threats are systematic products of the institutions we use to understand and control the world. Wise optimism means truly understanding the situation we’re in, and developing institutions and technologies to respond
tags: #wisdom/judgement, #techno-optimism
I returned often to AI over the years, and between 2011 and 2015 concentrated on it, writing a book about neural networks, which Chris Olah and Greg Brockman credit with helping get them involved in the field.
tags: #gold-standard-field-building
It also illustrates a kind of scientific dysergy, where moderately concerning individual discoveries can be combined into something much worse – a pattern that could also be illustrated with the history of nuclear weapons.
tags: #conceptual-engineering
Governance and policy is only a small part of the external alignment work that is required. And external alignment – that is, making reality outside the system safe – is historically far more expensive, far slower, and far less incentivized by the market.
tags: #societal-alignment
It’s important to seed imaginative approaches now, so they’re ready as windows of opportunity open. When you don’t yet see a path to a full solution, you fall back on imagination and improving the ideas you have.
tags: #proactively-awaiting-an-opening, #pronotre
An ongoing project for me is to understand what (if any) design principles enable surveillance that enables safety, while balancing all parties’ need for flourishing; and what leads it to fail, leading to authoritarianism?
tags: #things-others-consider-worth-working-on, #surveillance
People sometimes take a techno-determinist view, that exploration of the technology tree has a near inevitable quality, sometimes even down to timing. But even if that were true low in the technology tree, exponential explosion of the design space means it’s almost certainly not true higher up. Almost all possible technologies will never be invented, no matter how long and aggressively we explore. So it genuinely matters what ideas and institutions modulate how we explore the technology tree. And how vulnerable the world is depends upon those ideas and institutions.
tags: #naive-non-weighted-combinatorics
Takes
A refreshing realism about the benefits of surveillance.
Preventing transmission of diseases is clearly easier than curing them. I had neglected this quadrant: we should focus on technical prevention of transmission as much as on technical cures (and on both over social prevention).
Nielsen does not demarcate a menu of futures. Ben Norman will soon.
I do think there should be third-party inspectors for ASI labs.
Michael Nielsen is a scientist, a man of action. Sam Kriss is a man of commentary. Yesterday, Nielsen published his longform ‘Which Future?’; today, Kriss published his fly-by ‘Child’s Play’. Kriss thinks he’s tracking the rationalists, but clearly Nielsen’s work is where the real story lies. Kriss’s engagement looks impoverished and gaspingly shallow by comparison.
Kriss: ‘Child’s Play’
Influences: observation, sightseeing, interview, cynicism, The Whispering Earring, east coast mentality, pseudoanthropology, Scott Alexander, X
Notes:
The method they landed on for rebuilding all of human knowledge is Bayes’s theorem, a formula invented by an eighteenth-century English minister that is used in statistics to work out conditional probabilities. In the mid-Aughts, armed with the theorem, the rationalists discovered that humanity is in jeopardy of a rogue superintelligent AI wiping out all life on the planet. This has been their overriding concern ever since.
Is Kriss here alluding to “Anthropic Shadow: Observation Selection Effects and Human Extinction Risks” (I’d be severely impressed), something Yudkowskian, or other sources entirely?
[Scott Alexander’s] best-case scenario for AI is essentially the antithesis of Roy’s: superintelligence that will actively refuse to give us everything we want, for the sake of preserving our humanity.
I hadn’t fully appreciated this as Scott’s stance.
The Whispering Earring is a sharp voice in angel-on-the-shoulder AI tool development.


Sam Kriss, apart from being a great writer, is archetypically cynical, and much of his commentary is negative. I think the name of his Substack column acknowledges this. The great sadistic pleasure in reading his work is watching him tear down people I want to see torn down, but when he comes for people I know and love, I feel deeply unseen and misunderstood. I sometimes come away from a Sam Kriss article feeling wiser, or at least smugger, but rarely do I come away from a binge of his articles feeling better about the world.
>Is Kriss here alluding to “Anthropic Shadow: Observation Selection Effects and Human Extinction Risks” (I’d be severely impressed), something Yudkowskian, or other sources entirely?
I'd guess it's just:
* Rationalists talk a lot about Bayes' Theorem
* Rationalists talk a lot about AI risk
* Therefore they must have used Bayes' Theorem to discover AI risk.