Mathematics: Interesting Application of Bayes' Theorem

This post isn't so much about the politics of the NSA mass surveillance program as it is a demonstration of a well-known statistical concept.

Basically, what the article quoted below argues is this: suppose you want to determine if someone is a terrorist threat. Suppose you have some algorithm that is reasonably accurate when you apply it (say, it checks criteria X, Y, Z, W, ... and flags 90% of the people who really are terrorist threats). Now you apply this process to the population at large and it turns up names, say, including Ollie. So, now that Ollie has been identified, what is the probability that Ollie is indeed a terrorist threat?

Answer: pretty low; probably less than 50%. 50% is what you would get if you were to merely flip a coin with heads: "he is a terrorist" and tails: "he isn't".

Of course, one could argue that this first mass process is really a weeding-out step: it yields a collection of people, each with a slightly higher than normal chance of being a terrorist threat, and one could then apply more expensive tests to this group, knowing that the "false alarm" rate is going to be very high.

Anyway, here goes with the article:

Why Does the NSA Engage in Mass Surveillance of Americans When It's Statistically Impossible for Such Spying to Detect Terrorists?


The Bush administration and the National Security Agency (NSA) have been secretly monitoring the email messages and phone calls of all Americans. They are doing this, they say, for our own good. To find terrorists. Many people have criticized NSA's domestic spying as an unlawful invasion of privacy, as search without a warrant, as abuse of power, as misuse of the NSA's resources, as unconstitutional, as something the communists would do, something very un-American.

In addition, however, mass surveillance of an entire population cannot find terrorists. It is a probabilistic impossibility. It cannot work.

What is the probability that people are terrorists given that NSA's mass surveillance identifies them as terrorists? If the probability is zero (p=0.00), then they certainly are not terrorists, and NSA was wasting resources and damaging the lives of innocent citizens. If the probability is one (p=1.00), then they definitely are terrorists, and NSA has saved the day. If the probability is fifty-fifty (p=0.50), that is the same as guessing the flip of a coin. The conditional probability that people are terrorists given that the NSA surveillance system says they are, that had better be very near to one (p≈1.00) and very far from zero (p=0.00).

The mathematics of conditional probability were figured out by the English minister and mathematician Thomas Bayes. If you Google "Bayes' Theorem", you will get more than a million hits. Bayes' Theorem is taught in all elementary statistics classes. Everyone at NSA certainly knows Bayes' Theorem.

To know if mass surveillance will work, Bayes' Theorem requires three estimates:

1) The base-rate for terrorists, i.e., what proportion of the population are terrorists;

2) The accuracy rate, i.e., the probability that real terrorists will be identified by NSA;

3) The misidentification rate, i.e., the probability that innocent citizens will be misidentified by NSA as terrorists.

No matter how sophisticated and super-duper are NSA's methods for identifying terrorists, no matter how big and fast are NSA's computers, NSA's accuracy rate will never be 100% and their misidentification rate will never be 0%. That fact, plus the extremely low base-rate for terrorists, means it is logically impossible for mass surveillance to be an effective way to find terrorists.

I will not put Bayes' computational formula here. It is available in all elementary statistics books and is on the web should any readers be interested. But I will compute some conditional probabilities that people are terrorists given that NSA's system of mass surveillance identifies them to be terrorists.
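For reference, one standard way to write the formula for this problem is sketched below. The symbols are notation chosen here for illustration, not the article's: $T$ is the event that a person is a terrorist and $F$ the event that the surveillance system flags that person, so $P(T)$ is the base-rate, $P(F \mid T)$ the accuracy rate, and $P(F \mid \neg T)$ the misidentification rate.

```latex
P(T \mid F) \;=\; \frac{P(F \mid T)\,P(T)}
                       {P(F \mid T)\,P(T) \;+\; P(F \mid \neg T)\,\bigl(1 - P(T)\bigr)}
```

The numerator is the share of the population that is both a terrorist and flagged; the denominator is the share of the population that is flagged at all, guilty or innocent.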

The US Census shows that there are about 300 million people living in the USA.

Suppose that 1,000 of them are terrorists, which is probably a high estimate. The base-rate would be 1 terrorist per 300,000 people. In percentages, that is .00033%, which is way less than 1%. Suppose that NSA surveillance has an accuracy rate of .40, which means that 40% of real terrorists in the USA will be identified by NSA's monitoring of everyone's email and phone calls. This is probably a high estimate, considering that terrorists are doing their best to avoid detection. There is no evidence thus far that NSA has been so successful at finding terrorists. And suppose NSA's misidentification rate is .0001, which means that .01% of innocent people will be misidentified as terrorists, at least until they are investigated, detained and interrogated. Note that .01% of the US population is 30,000 people, against only 400 real terrorists identified. With these suppositions, the probability that people are terrorists given that NSA's system of surveillance identifies them as terrorists is only p=0.0132, which is near zero, very far from one. Ergo, NSA's surveillance system is useless for finding terrorists.
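The arithmetic behind that first figure can be checked by counting people rather than manipulating probabilities. The following is only a minimal sketch in Python using the article's own suppositions; the variable names are invented here for illustration.

```python
# Head-count version of the article's first scenario.
# All numbers are the article's suppositions; the names are illustrative.
population = 300_000_000
terrorists = 1_000
accuracy = 0.40              # P(flagged | terrorist)
misidentification = 0.0001   # P(flagged | innocent)

innocents = population - terrorists
true_positives = accuracy * terrorists            # 400 real terrorists flagged
false_positives = misidentification * innocents   # about 30,000 innocents flagged

# Of everyone flagged, what fraction are real terrorists?
posterior = true_positives / (true_positives + false_positives)
print(f"P(terrorist | flagged) = {posterior:.4f}")
```

Seen this way, the result is unsurprising: roughly 400 real hits sit in a pile of about 30,400 flagged people.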

Suppose that NSA's system is more accurate than .40, let's say, .70, which means that 70% of terrorists in the USA will be found by mass monitoring of phone calls and email messages. Then, by Bayes' Theorem, the probability that a person is a terrorist if targeted by NSA is still only p=0.0228, which is near zero, far from one, and useless.

Suppose that NSA's system is really, really good, with an accuracy rate of .90, and a misidentification rate of .00001, which means that only 3,000 innocent people are misidentified as terrorists. With these suppositions, the probability that people are terrorists given that NSA's system of surveillance identifies them as terrorists is only p=0.2308, which is far from one and well below a coin flip. NSA's domestic monitoring of everyone's email and phone calls is useless for finding terrorists.
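The three scenarios above can be reproduced with a few lines of Python. This is a sketch under the article's stated suppositions; the function name `posterior` and its parameter names are chosen here for illustration and do not come from the article.

```python
def posterior(base_rate, hit_rate, false_alarm_rate):
    """P(terrorist | flagged) via Bayes' theorem.

    base_rate        -- fraction of the population that are terrorists
    hit_rate         -- the article's "accuracy rate", P(flagged | terrorist)
    false_alarm_rate -- the "misidentification rate", P(flagged | innocent)
    """
    flagged_guilty = hit_rate * base_rate
    flagged_innocent = false_alarm_rate * (1.0 - base_rate)
    return flagged_guilty / (flagged_guilty + flagged_innocent)


# The article's supposition: 1,000 terrorists among 300 million people.
BASE_RATE = 1_000 / 300_000_000

# (accuracy rate, misidentification rate) for the three scenarios.
scenarios = [(0.40, 0.0001), (0.70, 0.0001), (0.90, 0.00001)]
results = [posterior(BASE_RATE, hit, miss) for hit, miss in scenarios]

for (hit, miss), p in zip(scenarios, results):
    print(f"accuracy={hit:.2f}  misidentification={miss:.5f}  p={p:.4f}")
```

Rounded to four places, the results agree with the article's p=0.0132, p=0.0228 and p=0.2308. Note how little the answer moves when accuracy improves from .40 to .70; cutting the misidentification rate tenfold matters far more.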

NSA knows this. Bayes' Theorem is elementary common knowledge. So, why does NSA spy on Americans knowing it's not possible to find terrorists that way? Mass surveillance of the entire population is logically sensible only if there is a higher base-rate. Higher base-rates arise from two lines of thought, neither of them very nice:

1) McCarthy-type national paranoia;

2) political espionage.

The whole NSA domestic spying program will seem to work well, will seem logical and possible, if you are paranoid. Instead of presuming there are 1,000 terrorists in the USA, presume there are 1 million terrorists. Americans have gone paranoid before, for example, during the McCarthyism era of the 1950s. Imagining a million terrorists in America puts the base-rate at .00333, and now, with the .90 accuracy rate and .00001 misidentification rate supposed above, the probability that a person is a terrorist given that NSA's system identifies them is above p=.99, which is near certainty. But only if you are paranoid. If NSA's surveillance requires a presumption of a million terrorists, and if in fact there are only 100 or only 10, then a lot of innocent people are going to be misidentified and confidently mislabeled as terrorists.

The ratio of real terrorists to innocent people in the prison camps of Guantanamo, Abu Ghraib, and Kandahar shows that the US is paranoid and is not bothered by mistaken identifications of innocent people. The ratio of real terrorists to innocent people on Bush's no-fly lists shows that the Bush administration is not bothered by mistaken identifications of innocent Americans.

Also, mass surveillance of the entire population is logically plausible if NSA's domestic spying is not looking for terrorists, but looking for something else, something that is not so rare as terrorists. For example, the May 19 Fox News opinion poll of 900 registered voters found that 30% dislike the Bush administration so much that they want the president impeached. If NSA were monitoring email and phone calls to identify pro-impeachment people, and if the accuracy rate were .90 and the error rate were .01, then the probability that people are pro-impeachment given that NSA surveillance system identified them as such, would be about p=.97, which is coming close to certainty (p≈1.00). Mass surveillance by NSA of all Americans' phone calls and emails would be very effective for domestic political intelligence.
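The effect of the base-rate can be seen by running the same calculation for a rare target and for common ones. The sketch below is illustrative only: the helper `posterior` and the labels are invented here, and the "1 million terrorists" case assumes the .90 accuracy and .00001 misidentification rates of the "really good" scenario, since the article does not say which rates it used there.

```python
def posterior(base_rate, hit_rate, false_alarm_rate):
    """P(target | flagged) via Bayes' theorem."""
    flagged_target = hit_rate * base_rate
    flagged_other = false_alarm_rate * (1.0 - base_rate)
    return flagged_target / (flagged_target + flagged_other)


POPULATION = 300_000_000

cases = {
    # label: (base rate, accuracy rate, misidentification rate)
    "1,000 terrorists": (1_000 / POPULATION, 0.90, 0.00001),
    "1 million terrorists (assumed rates)": (1_000_000 / POPULATION, 0.90, 0.00001),
    "30% pro-impeachment": (0.30, 0.90, 0.01),
}

probabilities = {label: posterior(*rates) for label, rates in cases.items()}

for label, p in probabilities.items():
    print(f"{label:40s} p={p:.4f}")
```

With identical detection rates, raising the base-rate from 1 in 300,000 to 1 in 300 lifts the probability from about 0.23 to above 0.99. The pro-impeachment case comes out near 0.97 even with a misidentification rate a thousand times worse.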

But finding a few terrorists by mass surveillance of the phone calls and email messages of 300 million Americans is mathematically impossible, and NSA certainly knows that.

How Not to Play Against Cheaters

If there was ever any doubt that the Democrats in Congress are clueless about how to fight against the disreputable tactics of the Bushite Republicans, that doubt should have been dispelled by the recent debacle in the House concerning a resolution to "stay the course" in Iraq.

The resolution in question had but one purpose, and that was to put the Democrats in a no-win situation politically. If they voted for the resolution, they would be tying themselves to President Bush's disastrous policy. If they voted against the resolution, the Democrats would expose themselves to the charge that theirs was a position of "cut and run"--precisely Karl Rove's strategy for turning the albatross of the Iraq disaster into another Republican election-winner, as it was in 2002 and 2004.

The process was carefully set up by the Republicans to assure there would be no real discussion of anything, no genuine progress toward clarifying the real situation in Iraq or the nature of our real policy choices there.

It was designed, that is, to serve no national purpose but only a partisan political purpose.

All of which is typical of what the Republicans have been doing with their power for some years now. It should be no surprise that the same folks who put poison pills into the Homeland Security legislation, so they could paint a war-hero like Max Cleland as "Osama bin Laden's man" back in 2002, would similarly poison American politics here in 2006 by trying to compel the opposition party to choose between two politically toxic options.

What's downright astonishing is that the Democrats seem to have learned nothing in the intervening four years about how to cope with these Bushite tactics. The Republicans having set their trap, the Democrats obligingly walked into it.

And now the chorus of "cut and run"--foolish and dishonest, but possibly politically effective--is heard in the land.

How brilliant do you have to be to understand that, given a choice between death by hanging and death by shooting, the proper response is "Neither"? Is there any law that required the Democrats to vote either for or against that purely manipulative resolution?

No, no law, only the Democrats' habit of playing "politics as usual" even when politics has become decidedly not as usual, even in this unprecedentedly dishonest political environment the Bushites have created.

In most of American history, the appropriate response to a measure before Congress has been to either support or oppose it. In today's morally corrupt political environment, the proper response is neither support nor opposition but rather denunciation of the whole scurrilous game.

But the Democrats keep letting the Republicans get away with their "Have you stopped beating your wife?" way of rigging the game.

Just as John Kerry gave away the game by campaigning against George W. Bush as "a good man," who just happened to have some different ideas about how to achieve our common goals, so also now, two years later, the Democrats of the House gave away the game by treating the measure before them as legitimate rather than as the scandal that it clearly was.

It's About that Rogue Elephant in the Room

This is but the latest example of an almost invariable pattern: the Bushite Republicans behave scandalously and the clueless or craven Democrats let them get away with it.

It is about the scandalous Republican way of governing, through lies and the abuse of power--not about the various bogus issues through which the Republicans seek to gain political advantage--that the Democrats should be talking to the American people.

"It is a scandal," the Democrats should have announced to the country just before walking out of the chamber, declaring that they would not dignify this fundamentally dishonest resolution by voting on it, "for the Republicans to be playing politics on this vital and painful national issue. Where lives are at stake, we need genuine discussion, not mere grandstanding for political advantage."

It is the job of the opposition party to help the American people recognize how profoundly these Bushite Republicans are debasing our political process. And part of this is to draw the contrast between how the Republicans are operating, and how a healthy democracy would operate. So the Democrats' statement might continue:

"How can we know whether or not we should 'stay the course' without conducting the substantive discussion this country so sorely needs in order to illuminate just what our real choices are with respect to this terrible mess that the Bush leadership has made in Iraq?

"The Republicans act as if we can somehow know what we should do just by virtue of some primitive logic that says because we went in we have to stay there until we achieve our goals. But such logic is foolish until we establish whether 'staying the course' will indeed get us any closer to reaching those goals.

"Neither is that logic sufficient that says that because it was a mistake to invade Iraq in the first place we ought simply to leave. We'd need first to assess what the consequences of a withdrawal would be."

The Democrats do not need to take some firm stand between the choices Karl Rove offers them. Indeed, among the Democrats there is no unity on this issue. But what they can be united on is their denunciation of the dishonest Republican effort to reduce American discourse to the primitive and simple-minded level at which the Rovian manipulations succeed in feeding Bushite power even at the cost of crippling the nation. Their denunciation might continue:

"The question of what course of action is likely now to serve best American interests and values--including the responsibility we have to those people whose country ours has invaded--can't be answered by primitive logic based only on slogans and ignorance. It can only be answered on the basis of genuine knowledge and expertise concerning the situation in Iraq and other relevant aspects of the world.

"America has many knowledgeable people--both in government service and in our private institutions--who could help us in the Congress come to judgments on such questions as: What is likely to happen if we continue what we are doing, and what is likely to happen if we choose some other course? What are the likely costs and benefits of the different courses of action, and how should those be weighed against each other?

"But that is precisely the kind of discussion that the Republicans have refused to have. As this scandal of a Republican-sponsored resolution in the House demonstrates, the same regime that deceived Congress to lead us into war, and that 'fixed' the intelligence to support that decision to go to war--the genuine reasons for which we are left still to wonder about--has no interest in helping us to make reasoned decisions based on true understanding.

"The Republicans have seen to it that we hold no thoughtful hearings, have no substantive debate, contemplate no reasonable alternatives. All they seek is the political advantage they believe they can get by reducing our political discourse to simple-minded slogans like 'stay the course,' and by engaging in the kind of patriotic flag-waving that has rightly been dubbed 'the last refuge of scoundrels.'

"America deserves better. America needs better.

"We have already seen how disastrous are the results of making policy on the basis of ignorance and arrogant assumptions. This Bush regime may have no great respect for genuine knowledge about reality, but reality does not go away just because we try to assume it away.

"That's why we have this mess in Iraq--from the series of colossal misjudgments and blunders this administration has made in both leading us into this invasion and in running the subsequent occupation. At every turn they have scorned what proved to be good counsel from people with real knowledge and experience. And now America and Iraq and indeed the whole world suffer the consequences."

The Bushite arrogance and dishonesty are hardly confined to the Iraq war. They form a pattern that pervades everything the Bushites do. And so whatever the immediate manifestation of this scandalous Bushite way of wielding power, the Democratic denunciation needs to tie the specific issue to the larger pattern, so that the pattern becomes more visible to the American people, who can then join in the repudiation of this disastrous and destructive regime.

"So enough of this playing politics with a matter so grave," the Democratic statement might conclude. "Enough of putting political advantage over the interests of the nation.

"We need real deliberation, not posturing. We need genuine leadership, not mere manipulation and deception.

"The Republicans now have the power to disserve this country and disgrace this Congress with this dishonest charade. We Democrats cannot now stop it. But neither will we participate in it."


