One thing people might mean when they say "accountable" is "able to explain its reasoning in making the controversial decision", such as a human driver explaining after the fact what led to them running over a cat. The human’s memory might be partially reconstructed but it’s still grounded in cognitive processes that did occur, whereas an AI giving a similar account would generate a plausible-sounding story with no connection to its actual computation.
Compare this to a world where autonomous vehicles were "built, not grown" and an engineer could point at the line in a program to explain the decision. Of course, we don't know how to build such an autonomous vehicle, but in such a world, not only could an engineer or decision maker be blamed, but also we could have a fix whose robustness we could be confident in.
This is not to say autonomous vehicles shouldn't be on the road, or that all current decisions are explainable (try getting a straight answer from a bureaucracy about why your claim was denied). But I can empathize with people being concerned that we are heading towards an AI-mediated world where things increasingly happen for reasons we can’t explain.
That's not how self-driving cars work. You could look at the car's memory, find a segmentation mask for the cat with an associated probability, and replay the data to see why it allegedly ran over the cat. You can make changes like prioritizing avoidance of blurry cat-like blobs, if that's what you want to do. Can't say the same for the human truck driver who slammed into 37 cats a few weeks ago ("Cats missing after animal rescue group involved in crash that killed 8 people on I-85," Atlanta News First, Oct. 15, 2025).
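To make that concrete, here's a minimal sketch of what replaying a perception log might look like (the log format, field names, and threshold are all made up for illustration; real AV stacks log far richer data):

```python
# Sketch of auditing a logged perception trace after an incident.
# Hypothetical JSON-lines log format; field names are invented.
import json

def find_low_confidence_cats(log_path, target_class="cat", conf_floor=0.5):
    """Replay logged frames and flag target-class detections that fell
    below the confidence the planner required before braking."""
    flagged = []
    with open(log_path) as f:
        for line in f:
            frame = json.loads(line)
            for det in frame.get("detections", []):
                if det["class"] == target_class and det["confidence"] < conf_floor:
                    flagged.append((frame["timestamp"], det["confidence"]))
    return flagged

# The "fix" is then a concrete, inspectable change, e.g. lowering the
# confidence required to avoid small blurry animal-like blobs.
```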
This idea that we "grow" these models and don't know how they work is a pretty egregious misnomer, at least to my mind, as it makes what this "meme" is about more likely to happen (giving agency to computer programs).
We're also not getting worse at interpretability, we're getting better at it. Though it wouldn't matter even if the logic we audit really were a "black box" (it's more like a transparent box with some slightly opaque spots); the point is that computer programs don't have agency.
If we write a program that turns a light on or off, it's not the program turning the light on or off, it is us turning it on or off by extension.
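A trivial sketch of what I mean (illustrative, obviously):

```python
# The program doesn't "decide" anything: it executes a rule the
# author chose. The agency lives with whoever wrote the rule.

def should_light_be_on(hour: int) -> bool:
    return hour >= 19 or hour < 6  # the author's rule, not the program's

for hour in (5, 12, 20):
    print(hour, should_light_be_on(hour))
```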
As to where we draw the line at blame, that's an interesting question. If the programmer used a library that had a flaw, would it be the library creator's fault that the programmer's program didn't work as expected? What if the library had been used for years and the problem with it was an extreme edge case? Should some specific person write every line of code used in some process?
I do not think responsibility has a direct relationship with creation. If a bolt fails in an airplane wing, is it the fault of the person who made the bolt? Or should some tests have been run on the "end" product? (Hopefully quality assurance tests are run on all the products in the chain, but again, we have to draw lines, uh, places.)
I think the only definition of "held accountable" under which it makes sense to claim that AIs can't be held accountable is if it means something like "punished in a way that genuinely hurts the one being punished," which is impossible for current AIs because they aren't conscious. I always thought it was pretty nonsensical to care about this definition of accountability. What matters is that you have some way of changing bad behavior, either by encouraging better behavior in the future or by making the misbehaving agent unable to continue their misbehavior. This is the entire purpose of holding people accountable! But this can clearly be done to AIs - in fact, this is just what AI training is.

The only reason you would care about the other definition of accountability is if you cared about punishment for its own sake - i.e., retributivism - and you somehow applied this principle even to non-conscious AIs. But why on Earth would getting revenge on a non-conscious being matter intrinsically? And even worse, why would it matter more than actually saving lives? I care a lot more about reducing car deaths than I do about making beings that did not even consciously do anything wrong suffer.
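To gesture at why training is exactly that kind of behavior change, a toy sketch (a one-parameter "model" with made-up numbers, nothing more):

```python
# Toy gradient descent: "correcting the AI" is literally a weight
# update that makes the penalized behavior go away. Illustrative only.

w = 5.0       # parameter currently producing the bad behavior
target = 0.0  # behavior we actually want
lr = 0.1      # learning rate

for step in range(100):
    grad = 2 * (w - target)  # gradient of the penalty (w - target)**2
    w -= lr * grad           # the "consequence": adjust the weight

print(f"final parameter: {w:.4f}")  # ~0.0: the misbehavior is gone
```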
One point about social punishment: although it is much less effective than neurosurgery, it's also very public. Especially in the context of management decisions, a company performing neurosurgery might be doing anything with its employee, and the public won't be able to confirm it.
If Andy gets fired after running over a squirrel, we know that whatever issue might've caused that behavior is probably no longer present. If the company does neurosurgery instead, then we don't actually know whether Andy might keep on running over squirrels. Now, if Andy goes out next week and runs over another squirrel, then obviously the public knows that the neurosurgery didn't work, and the company's likely going to get some public backlash over letting an unsafe driver back on the road.
However, this only really happens when the company's interests align with the public's and/or the public can clearly tell when a failure has occurred.
Let's say that Alice is being incredibly discriminatory in her hiring practices. Like, she rejects everybody whose name starts with 'B' (Alice really hates Bob). Given that management decisions aren't public, it's not even immediately obvious that Alice is doing something wrong. Now, if Alice's questionable hiring practices come to light and we get some good ol' public punishment, the public can be reasonably content that there won't be any more discriminatory hiring practices, especially since any potential Alice 2 might be wary of acting like her predecessor. And if newly-hired Alice 2 *does* turn out to have similar hiring practices, then the blame falls on her and the issue can be easily fixed by firing her.
However, let's say that the company performs neurosurgery on Alice. And let's say that the company isn't particularly interested in actually fixing Alice. Neurosurgery is hard and incredibly expensive, the company doesn't actually care whether its hiring is discriminatory, and it thinks it can get away with it. Maybe it's a bit of a stretch for this contrived example (since it's incredibly high-risk to intentionally keep Alice the same for basically zero benefit), but there certainly exist situations where companies have incentives misaligned with the public good. So the company will maybe do a token neurosurgery and tell the public that the problem is fixed, and it'll remain difficult for the public to verify that Alice 2 is actually different from Alice 1.
And if the reprogrammed Alice 2 is the same as Alice 1, then who is to blame? You can't blame Alice, since AI is just math. But obviously if Alice was supposed to have neurosurgery and nothing changed, then there's an issue somewhere in the company. Even if we take Alice 2 away and neurosurgery her into Alice 3, it's possible (and even probable) that the issue isn't Alice, but some other decision within the company that allowed an incorrect Alice to be released. But who exactly is the source of that decision, and how can the public ensure that it will not happen again? Since decisions are made through an AI, the AI acts as a buffer against any responsibility, and against any action the public can demand to change negative behavior.
There's also the fact that an AI doesn't really have the ability to be wary of acting incorrectly the way a human does. Even the human Alice might personally realize that she did something wrong after failing the first time and change, but an algorithm that the company keeps the same will continue to act in the same incorrect manner. A human may also have hang-ups about behaving immorally, while an AI that is presumably aligned solely with the company might not have any such thing. For humans, if the company doesn't change, the human still might, and the problem will get fixed. But for AI, if the company doesn't want to change, the problem remains.
Thus, with public punishment of humans, there's an easy way to verify that some action has happened, and the company cannot shirk responsibility. A company with AI and unsavory motivations can act at odds with the public interest, and that can't easily be caught without extensive third-party auditing (which, as far as I know, does not happen in the US).
Granted, humans are not exactly saints. I agree that when AIs and companies do mostly align with the public (which is probably the majority of situations), AIs can be more easily held accountable by the company. And with humans, there are plenty of opaque decisions that make it hard for the public to guarantee accountability. But this is a very notable example where accountability is vital, and AI makes it a bit more difficult to get when the company that should be holding the AI accountable doesn't care to.
This has unstuck a couple of gears in my brain, so thanks for that!
But it still seems to me like our society is built on social systems that assume humans will be filling most of the roles and making most of the decisions we care about.
I've had a lot of Kafkaesque customer support experiences with companies like Google, where a machine has made a decision about my life, or where it's impossible to make an account or log in to a system without lying about my name, email address, etc., due to bad code on the surface.
It seems that this populist meme is actually about that sense of disempowerment: people want to have a person who backstops the process, so that if something is going wrong it can be escalated and changed.
But computers, since the marginal cost or social cost of staring at you and saying "no" 1 billion times is ~0, will never bend or fold and fix your issue.
The 99PI podcast recently did a good piece on this responsibility-shirking even in systems that contain almost 100% humans (call centers):
https://99percentinvisible.org/episode/644-your-call-is-important-to-us/
So I'm not convinced that human-run systems are especially nice either, but there is something dystopian about getting locked out of my life in some way by a computer and simply having no pathway to escalate.
Do you have any mental model of how we can prevent "Computer Says So"-style disempowerment while still actually putting this technology in everything?
I agree.
Also, it seems like people compare AI to some imaginary human ideal, when often, humans are way worse about whatever is being evaluated—just like your Waymo example.
I figured it was something like that, after watching Alexander Avile's video essay on this, with other Marxist analysis on the sidelines. There has been a moral panic that requires cooling down.
That proprietary DRM they mostly enacted was such an unsettling, authoritarian incoherence: abolishing fair use by making style into trade dress. Nope!
It shows that on this issue there are ideological inconsistencies, leaving a splotch on the map. Thanks for clearing this up.
This misses the point. Consider the following real scenario: a health insurance company wants to maximize the number of denied claims.
It would be hard to convince a doctor to do this: they would be personally liable, they might lose their license to practice if discovered, and the paper trail of the back-and-forth would damage the company.
But if you set up an AI, none of this is a problem: you can disguise your ask of maximizing denials as part of training the AI, or hide it behind target OKRs.
If the scheme comes out, at best some director or software engineer would get fired, and they would quickly be hired by other companies that want to set up similar schemes.
But this applies equally well to any product a corporation sells. Corporations in part exist to limit liability. If a car has a defective part, it's also often impossible to trace that back to an individual, so we have institutions to influence corporate behavior: governments, social approval, the market itself. This argument seems like an argument against corporations producing anything we rely on, or against automation.
Just because people already have other ways to deflect responsibility doesn't mean it's not troubling that we are adding another tool.
Think about the possibility of using AI to identify targets for military strikes: it practically allows a motivation to be manufactured post hoc for any case.
Sure, you could say that intelligence can do that as well, but what we are adding here is a further layer of complexity, and something that is for all intents and purposes a black box that would be impenetrable in a court of law.
Again this just reads to me as an argument against complexity and specialization in general.
You can call a specialist to testify under oath and explain their decision. You can't ask an AI model to explain how it came to a decision, and even the engineers would have no idea.
I've noticed many people treat punishment as a moral imperative, or as the natural state of affairs, and they panic when it seems like we might not be able to punish everyone all the time.
"Why aren't there ever any consequences for [public figure we don't like]?" Like getting a big advance on a book deal? That counts as some kind of consequence, right? Oh, wait, you really mean "why aren't they punished". Got it.
"We can't allow self-driving cars on the road until countless legal issues have been worked out! After all, what happens if a car does [bad thing]?" Well, we have terms and conditions, insurances, traffic laws, liability laws, right? Oh, you really mean "nobody might go to jail". Got it.
"With realistic Sora videos, people might get convicted for crimes they didn't commit!" Well, that sounds unlikely, given standards of evidence and how everyone knows how easy these videos are to make. "But then someone might escape their punishment even if I catch them on video - that's even worse!" (No, it's really not.)
And my absolutely favorite:
"The worst thing about the 2008 financial crisis is that nobody went to jail for it!"
The purpose of punishment is the threat of it, and that is not effective on computers. When I say that a computer can never be held accountable, I mean that a computer cannot be punished in any way that will inspire fear in it or its "peers".
And that's important, because fear is load-bearing in society. We don't imprison criminals for the entertainment value. Bureaucracy already has a responsibility-diffusing effect, and the addition of mechanical bureaucrats could easily take that to new extremes.
If AI is allowed to make management decisions for which no human is considered responsible, the white-collar legal system quickly falls apart. Crimes are committed, but not by anyone, and it becomes impossible to properly deter them. Some people go laughing to the bank, while others suffer and/or die.
I'm not convinced by this argument.
Humans punish other humans for their actions, as we have a deep impulse stemming from our evolutionary need to punish those in our group who don't contribute.
Today, most countries operate systems of justice which assume that punishment is the best way to prevent criminals from reoffending. In most places, a more rational system for actually reducing the level of crime would probably be to reduce the level of punishment in favour of more complex rehabilitation schemes.
A computer cannot be punished for its actions, because punishment is not for rewiring, except as a crude evolutionary proxy. We punish because we have a need to cause pain in another person for the actions that we perceive them to have done. Since a computer cannot suffer, it cannot be punished.
I think the argument at face value has merit, and I appreciate that Masley made it. It's easy to get stuck in a way of thinking, and oftentimes the easiest way to get dislodged from a rut is for someone else to push you out.
Now, that doesn't mean I completely agree with his conclusions, and more than anything I think he doesn't take his reasoning far enough. Because he's right: assuming we have a decent understanding of the systems we're working with, we can and do direct them to our desires. So the bigger question becomes: why are these undesirable outcomes still happening, if we have no problem exerting total control over computers? Whose desires are they working for?
I think you can answer this question in a lot of ways, but an obvious direction people will look is to the owners of the AI, computers, and algorithms. Perhaps accountability is still needed; we just need it directed at the people who do things like casually request that the president send the National Guard into SF, or work to seek a multibillion-dollar bailout for their industry.
I think the point behind the original presentation is that a person should never be fired by a computer.
But beyond that, computers can’t be held accountable because they are inanimate objects. The computer doesn’t care if you replace it because it can’t care. It isn’t alive, it isn’t conscious, it is just a thing.
I think the original meme makes more sense if we consider who the "we" is.
If "we" consists of average consumers, then perhaps from that perspective there are times that "we" can do next to nothing about AI models behaving badly.
For example, one type of accountability is the ability to provide reasons why a decision was made. Many AI models fail this test.
As a consequence, I understand that financial institutions are limited to using interpretable models in assessing whether to offer someone a loan, because they are legally required to provide specific reasons when someone is denied.
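As a sketch of why interpretable models make that requirement easy to satisfy (a toy linear scorecard with made-up weights and cutoff, not any real lender's system): each denial can be traced to the features that subtracted the most points.

```python
# Toy interpretable scorecard: with a linear model, the "specific
# reasons" for a denial are just the largest negative contributions.
WEIGHTS = {"income": 0.4, "credit_history_years": 0.3,
           "missed_payments": -0.8, "debt_ratio": -0.5}
CUTOFF = 1.0

def decide(applicant):
    contributions = {k: WEIGHTS[k] * applicant[k] for k in WEIGHTS}
    score = sum(contributions.values())
    if score >= CUTOFF:
        return "approved", []
    # adverse-action-style reasons: features that hurt the score most
    reasons = sorted(contributions, key=contributions.get)[:2]
    return "denied", reasons

print(decide({"income": 2.0, "credit_history_years": 1.0,
              "missed_payments": 2.0, "debt_ratio": 1.5}))
# -> ('denied', ['missed_payments', 'debt_ratio'])
```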
An even worse scenario is dealing with health insurance companies. In that case, insurance providers are incentivized to deny claims, and to provide as little transparency as legally allowed into their reasons.
In those scenarios, from the consumer's point of view, fixing the model isn't an option.
Many of the best things we do to shape human behavior in flourishing societies are not forms of social punishment, but they aren't regarded as scary evil totalitarian thought control mostly because they're the status quo. What happens when people buy super fun lawn darts, use them a few times according to the instructions, then gradually start using them with increasing casual abandon, then start using them sometimes while drunk or giving them to small children? Social punishment? Nope, no more lawn darts one decade and "what are lawn darts?" the following decade.
Thank you for a really interesting post! I was using the IBM quote the other day to argue that current AI shouldn't be making high-consequence decisions, and now I'm starting to reconsider it.
But I'm still struggling with a couple of points:
1. When you say we can "kill" a misbehaving AI and reprogram it, or that we keep Waymos under "totalitarian surveillance" - it feels to me like those aren't punishments the way they'd be for humans. They're more like...quality control? If a batch of medicine is contaminated, we don't "punish" it by destroying it, we just destroy it because that's the appropriate response to a faulty product. The computer has no subjective experience of being punished, no deterrent effect from knowing punishment exists, no learning from social consequences.
2. The accountability in the Waymo case is real - but it's on Waymo the company, not the individual cars. Which seems like it's supporting the IBM quote: it's not the computer that's responsible, it's the humans in the loop.
Would love to hear your thoughts!
Right, on 1 my point was that quality control is actually what we're aiming for with accountability:
I think when people say “A computer can never be held accountable” what they mean is “A computer can never be socially punished for a bad decision.” That’s true, but social punishment is just a means to a goal: better behavior or harm prevention. It’s also a pretty clumsy, often useless tool. Many people are socially punished and still behave terribly. If we had the ability to perform neurosurgery on everyone behaving badly to permanently change their behavior, I think this would be seen as:
1. Much more effective than regular social punishment.
2. Deeply evil and totalitarian.
And yet we do the same to computers regularly. Why would we want to “hold computers accountable” when that’s way, way less effective than what we can actually do to them?
On 2, yeah, definitely: the car's not responsible, but it's a very, very visible beacon to the humans who are in charge. Corporations can often be way less private than human drivers. I think there are a lot of other cases like that.