Holden Karnofsky, seen as a leader in EA, shares his thoughts on the movement’s future.
When Sam Bankman-Fried’s cryptocurrency exchange FTX imploded in November, and accusations swirled that he’d lost at least $1 billion in client money after secretly transferring it to a hedge fund he owned, it came as a huge blow to effective altruism.
Effective altruism is a social movement that’s all about using reason and evidence to do the most good for the most people. Bankman-Fried (or SBF, as he’s known) was one of its brightest stars and biggest funders. Yet it looks like he’s done a lot of bad to a lot of people. In December, he was arrested on charges of wire fraud, securities fraud, money laundering, and more. He pleaded not guilty to all of them.
Effective altruists are now rethinking their convictions, asking themselves: Did the logic of effective altruism itself produce this bad outcome? Or was effective altruism just part of the scam? And can the movement be redeemed?
To get at these questions, I spoke to Holden Karnofsky, who’s seen as a leader in the EA movement. He co-founded an organization called GiveWell, which does research to find the most effective charities, and he currently serves as co-CEO of Open Philanthropy. Like EA writ large, Karnofsky started out with a focus on helping people in the here and now — mainly poor people in poor countries with problems like malaria and intestinal parasites — but has become increasingly interested in safeguarding the long-term future of humanity from threats like rogue artificial intelligence.
We talked about ways he is and isn’t sold on the trend of “longtermism,” whether EA overemphasized utilitarian thinking, and what the next act of EA should look like. A shortened version of our conversation, edited for length and clarity, is below. You can hear the fuller version on The Gray Area podcast.
For transparency, I should note: In August 2022, SBF’s philanthropic family foundation, Building a Stronger Future, awarded Vox’s Future Perfect a grant for a 2023 reporting project. That project is now on pause.
When the SBF scandal broke, how surprised were you, on a scale from 1 to 10? With 1 being “Yep, this was entirely foreseeable” and 10 being “I am completely flabbergasted, how the hell could this have happened?!”
Way on the high end of that scale. I don’t know if I want to quite give it a 10, but I wasn’t expecting it.
What was your general impression of SBF?
I do think there were signs of things to be concerned about with respect to SBF and FTX. He ran this Alameda company, and there were a number of people who had worked there who’d left very upset with how things had gone down. And I did hear from some of them what had gone wrong and from their perspective what they were unhappy about.
There were things that made me say … I certainly see some reasons that one could be concerned, that one could imagine low-integrity behavior, less than honest and scrupulous behavior. At the same time, I just don’t think I knew anything that rose to the level of expecting what happened to happen or really being in a position to go around denouncing him.
Now it feels a little bit different in hindsight. And some of that does feel regrettable in hindsight.
SBF is deeply influenced by effective altruism, which comes with a hefty dose of utilitarianism. The general idea of utilitarianism — try to produce the greatest good for the greatest number, try to maximize the overall good — at first can sound kind of nice. But it can lead to a weird “ends justify the means” mentality.
Do you think this style of thinking might have led SBF astray? As in, he might have thought it’s fine to do this alleged fraud because he can make billions of dollars that way and then donate it all to amazing charities?
I think there’s a bunch of ideas here that kind of are sitting near each other but are actually different ideas. There’s utilitarianism, which is the idea that doing the most good is all there is to ethics. Then there’s “the ends justify the means,” which might mean you believe you can do arbitrarily deceptive, coercive, nasty things as long as you worked out the numbers and they lead to a lot of good. And then I think effective altruism is neither of those. It’s a third thing, which we can get to.
So I honestly don’t know if SBF was motivated by “ends justify the means” reasoning. It is possible that that’s what motivated him, and I think that’s a problem. The fact that it’s possible alone bothers me.
Beyond just the SBF question, has EA generally leaned too hard into utilitarianism? I know EA and utilitarianism are not one and the same, but there’s a pretty strong flavor of utilitarianism among a lot of top EA thinkers. And I wonder if you think that creates a big risk that members will be likely to apply this philosophy in naive, harmful ways?
I feel like it is a risk, and I wrote a piece about this called “EA is about maximization and maximization is perilous.” This was back in September, well before any of this [SBF scandal] stuff came out.
I said in this piece, here’s something that makes me nervous. EA is “doing the most good,” maximizing the amount of good. Anytime you’re maximizing something, that’s just perilous. Life is complicated, and there’s a lot of different dimensions that we care about. So if you take some thing X and you say to maximize X, you better really hope you have the right X!
And I think we’re all extremely confused about that. Even for the effective altruists who are lifetime philosophy professors, I don’t think there’s a good, coherent answer to what we are supposed to be maximizing.
So we’ve got to watch out. I wrote that and then this happened, and then I said, okay, maybe we have to watch out even more than I thought we had to watch out. Knowing what I know now, I would’ve worried about it more. But all the worrying I do has costs because there’s many things to worry about. And then there’s many things to not worry about and to move forward with to try and help a lot of people.
Are you basically saying that “maximize the good” is a recipe for disaster?
Effective altruism, in my opinion, works best with a strong dose of moderation and pluralism.
Maybe this would be a good time for me to talk about what I see as the difference between utilitarianism and effective altruism. They both have this idea of doing as much good as you can.
Utilitarianism is a philosophical theory, and it says doing the most good, that’s the same thing as ethics. Ethics equals doing the most good. So if you did the most good, you were a good person, and if you didn’t, you weren’t, or something like that. That’s an intellectual view. You can have that view without doing anything. You could be a utilitarian and never give to charity and say, well, I should give to charity, but I just didn’t. I’m utilitarian because I think I should.
And effective altruism is kind of the flip of what I just said, where it says, hey, doing the most good — that’s, for lack of a better word, cool. We should do it. We’re going to take actions to help others as effectively as we can. There’s no claim that this is all there is to ethics.
But this is confusing, and it’s not exactly shocking if a bunch of utilitarians are very interested in effective altruism, and a bunch of effective altruists are very interested in utilitarianism. These two things are going to be hanging out in the same place, and you are going to face this danger … some people will say, “Hey, that is all I want to do with my life.” I think that’s a mistake, but there are people who think that way and those people are going to be drawn into the effective altruism community.
I want to put before you a slightly different way to read the EA movement. A few smart guys like Will MacAskill, and some others at Oxford especially, who wanted to help the world and give to charity basically looked around at the philanthropy landscape and thought: This seems kind of dumb. People are donating millions of dollars to their alma maters, to Harvard or Yale, when obviously that money could do a lot more good if you used it to help, say, poor people in Kenya. They basically realized that the charity world could use more utilitarian-style thinking.
But then they overcorrected and started bringing that utilitarian mindset to everything, and that overcorrection is now more or less EA. What would you say about a reading like that?
I definitely think there are people who take EA too far, but I wouldn’t say that EA equals the overcorrection. What effective altruism means to me is basically, let’s be ambitious about helping a lot of people. … I feel like this is good, so I think I’m more in the camp of, this is a good idea in moderation. This is a good idea when accompanied by pluralism.
So you’d like to see more moral pluralism, more embrace of other moral theories — not only utilitarianism, but also more commonsense morality like deontology (the moral theory that says an action is good if it’s obeying certain clear rules or duties, and it’s bad if it’s not). Is that fair to say?
I would like to see more of it. And something I’ve been thinking about is, is there a way to encourage or to just make a better intellectual case for pluralism and moderation.
When you started out in your career, you were a lot more focused on present-day problems like global poverty, global health — the classic EA concerns. But then EA pivoted toward longtermism, which is more or less the idea that we should really prioritize positively influencing the long-term future of humanity, thousands or even millions of years from now. How did you become more convinced of longtermism?
Longtermism tends to emphasize the importance of future generations. But there’s a separate idea of just, like, global catastrophic risk reduction. There’s some risks facing humanity that are really big and that we’ve got to be paying more attention to. One of them is climate change. One of them is pandemics. And then there’s AI. I think the dangers of certain kinds of AI that you could easily imagine being developed are vastly underappreciated.
So I would say that I’m currently more sold on bio risk and AI risk as just things that we’ve got to be paying more attention to, no matter what your philosophical orientation. I’m more sold on that than I am on longtermism.
But I am somewhat sold on both. I’ve always kind of thought, “Hey, future people are people and we should care about what happens in the future.” But I’ve always been skeptical of claims to go further than that and say something like, “The value of future generations, and in particular the value of as many people as possible getting to exist, is so vast that it just completely trumps everything else, and you shouldn’t even think about other ways to help people.” That’s a claim that I’ve never really been on board with, and I’m still not on board with.
This has me thinking about how in EA circles, there’s a commonly talked-about fear that we will inadvertently design AI that’s a single-minded optimizing machine, that’s doing whatever it takes to achieve a goal, but in a way that’s not necessarily aligned with values we approve of.
The paradigmatic example is we design an AI and we tell it, “Your one function is to make as many paper clips as possible. We want to maximize the number of paper clips.” We think the AI will do a fine job of that. But the AI doesn’t have human values. And so it goes and does crazy stuff to get as many paper clips as possible, colonizing the world and the whole universe to gather as much matter as possible and turn all matter, including people, into paper clips!
I think this core fear that you hear a lot in EA is about a way that AI could go really wrong if it’s a single-minded optimizing machine. Do you think that some effective altruists have basically become the thing that they’re scared of? Single-minded optimizing machines?
That’s interesting. I do think it may be a bit of projection. There might be some people in effective altruism who are kind of trying to turn themselves from humans into ruthless maximizers of something, and they may be imagining that an AI would do the same thing.