David Burton

Voter fraud and voter ID in NC: statistical analysis of Interstate Crosscheck data

David Burton <...> Thu, May 5, 2016 at 8:32 PM
To: "Prof. Lorraine Minnite" <...>
Dear Prof. Minnite,

Did you receive my 4/29/2016 email? It didn't bounce, but I just found a different email address for you, so perhaps I sent my message to the wrong one.

This time I'm sending it to both addresses. Please let me know if you receive it, and which is the best address.

I did a bit more work on my statistical simulation program for analyzing the 2012 Interstate Crosscheck results. The simulation program is now much faster, and more "user friendly." Use the "-h" (help) option for usage instructions and examples, like this:

perl test_voter_fraud_stats.pl -h

The speed improvements made it possible to run many more election simulations. So I ran 25 million (instead of just 400,000) election simulations, for improved precision. Here are the results:

First column is number of coincidences per 35,750 matches
 : second column is number of runs (out of 25,000,000) which had that number of coincidences
  : third column is percentage of runs which had that number of coincidences
  0  :  702010  :   2.808040
  1  : 2503854  :  10.015416
  2  : 4474013  :  17.896052
  3  : 5333187  :  21.332748
  4  : 4767657  :  19.070628
  5  : 3408399  :  13.633596
  6  : 2029394  :   8.117576
  7  : 1038496  :   4.153984
  8  :  462684  :   1.850736
  9  :  184315  :   0.737260
 10  :   65800  :   0.263200
 11  :   21406  :   0.085624
 12  :    6353  :   0.025412
 13  :    1802  :   0.007208
 14  :     485  :   0.001940
 15  :     114  :   0.000456
 16  :      26  :   0.000104
 17  :       2  :   0.000008
 18  :       1  :   0.000004
 19  :       2  :   0.000008
Average = 3.57489

As you can see, out of 25 million simulated elections, each with 35,750 coincidental "name + DOB" matches with voters in other States (like the 2012 North Carolina General Election), just 31 elections (0.000124% of 25 million) resulted in more than 15 coincidental "name + DOB + Last4SSN" matches, and none resulted in more than 19.
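
For anyone who wants to re-tally the table, here is a minimal Python sketch (an illustration only, not part of the Perl program). The run counts are copied from the table above; the names RUN_COUNTS and runs_with_more_than are made up for this example.

```python
# Run counts copied from the 25,000,000-run table above:
# list index = number of coincidences, value = number of runs.
RUN_COUNTS = [
    702010, 2503854, 4474013, 5333187, 4767657,
    3408399, 2029394, 1038496, 462684, 184315,
    65800, 21406, 6353, 1802, 485,
    114, 26, 2, 1, 2,
]

def runs_with_more_than(threshold, counts=RUN_COUNTS):
    """Number of runs with strictly more than `threshold` coincidences."""
    return sum(counts[threshold + 1:])

total_runs = sum(RUN_COUNTS)
tail_15 = runs_with_more_than(15)
mean = sum(k * c for k, c in enumerate(RUN_COUNTS)) / total_runs

print(total_runs)                      # 25000000
print(tail_15)                         # 31
print(100 * tail_15 / total_runs)      # tail as a percentage of all runs
print(round(mean, 5))                  # average coincidences per run
```

The tail count divided by the total is the fraction of simulated elections that exceeded the threshold.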

Since 765 name + DOB + Last4SSN matches were identified in the actual 2012 NC General Election, we can say with >99.9998% statistical certainty that at least 750 of the 765 Last4SSN matches were cases of actual voter fraud, and with >99.99999% certainty that at least 746 of the 765 Last4SSN matches were actual voter fraud.

BTW, I am a computer scientist, not a statistician, but a draft of this work was vetted by an eminent statistician.

It is unknown how many of the voter fraud cases were instances of a single person fraudulently voting in two States, and how many were people impersonating other voters (presumably after determining that the actual registered voters had moved out-of-State). Both are examples of voter fraud, but with different culprits.

Those two kinds of fraud cannot be distinguished, because in 2012 NC had no voter ID requirement. Unfortunately, if the culprits cannot be conclusively identified, the crimes cannot be prosecuted.

Note, also, that Interstate Crosscheck cannot identify all cases of voter impersonation. An impersonator won't be detected unless the person whom he impersonates actually votes in another State in the same election. Even then, the fraud won't be detected unless the other State participates in the Interstate Crosscheck project. (About half of the States participated in 2012, representing about 78% of the nation's voters.)

Likewise, cases of the same voter registering and voting in two States won't be detected unless both States participate in the Interstate Crosscheck project, and unless the voter uses the same name and his real social security number in both States.

So it is clear that the real number of fraudulent votes in NC was considerably higher than 750.

"When my information changes, I alter my conclusions. What do you do, sir?"
- John Maynard Keynes

I hope this new evidence will cause you to reconsider your 2010 published conclusion that the sort of voter fraud which could be prevented or prosecuted through voter ID requirements is quite rare.

If there's anything that I can do to assist you with your research, please do not hesitate to ask.

I look forward to hearing from you.

Sincerely,

Dave Burton
Cary, NC 
M: 919-244-3316



On Fri, Apr 29, 2016 at 12:04 PM, David Burton <...> wrote:
Dear Prof. Minnite,

In 2010 you wrote that your research indicated that incidents of deliberate voter fraud in the United States are "quite rare," and your work has been widely cited in support of that conclusion. However, as time passes, new evidence emerges, and I think it is time for you to reconsider that conclusion.

Using Interstate Crosscheck data, the NC SBOE identified 35,750 voters with the same name and date of birth (DOB) as voters in other participating States, in the 2012 general election.

Most were innocent coincidences: people with the same name & DOB as someone else in a different State. But many were not.

For most participating Crosscheck States, the NC SBOE also had access to the last four digits of voters' social security numbers ("Last4SSN"). Matching Last4SSN eliminates 99.99% of coincidental matches. (Last4SSN is a four-digit number, 0000-to-9999, so matching Last4SSN eliminates 9,999 out of 10,000 coincidental [random] matches.)

So if all 35,750 matching name & DOB voters had been innocent coincidences, you'd expect to find only 3 or 4 "false positives" who also matched Last4SSN (35,750 / 10,000 = 3.575).

They found 765.
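
The "3 or 4" figure is just the number of name + DOB matches times the chance of a coincidental Last4SSN match. A minimal Python sketch of that arithmetic follows. It assumes Last4SSN values are uniformly distributed and independent between unrelated voters; the variable names are made up for this example.

```python
name_dob_matches = 35_750    # name + DOB matches reported by the NC SBOE
p_last4_match = 1 / 10_000   # ASSUMPTION: unrelated Last4SSN values are uniform and independent

# Expected number of purely coincidental name + DOB + Last4SSN matches.
expected_false_positives = name_dob_matches * p_last4_match
print(expected_false_positives)   # about 3.575
```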

I ran a statistical simulation of 400,000 "elections." Each simulated election had 35,750 voters whose names and DOBs matched, by innocent coincidence, voters from other States. The goal was to determine the probability distribution of the number of those matches which would also match Last4SSN. The program is short and simple. You can easily run it yourself. The only prerequisite is a free copy of Perl (almost any version). Here is the program:

http://sealevel.info/test_voter_fraud_stats.pl

(Simulating 400,000 elections takes about five hours on my modest desktop PC, but you can do quicker runs, at a cost of lost accuracy, by editing the "$numruns = 400000;" line at the top, and changing "400000" to a smaller number.)
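
For readers without Perl, the kind of simulation described above can be sketched in Python. This is an illustration only, not a port of the Perl program linked above, whose internals may differ. It assumes each coincidental name + DOB match independently also matches Last4SSN with probability 1/10,000, which is equivalent to comparing two independent, uniformly random four-digit numbers. The function names, the seed, and the small run count are made up for this example.

```python
import random
from collections import Counter

# ASSUMPTION: two unrelated voters' Last4SSN values agree with probability 1/10,000.
P_LAST4_MATCH = 1 / 10_000

def simulate_election(n_matches, rng):
    """Count how many of n_matches coincidental name+DOB pairs also match Last4SSN."""
    return sum(1 for _ in range(n_matches) if rng.random() < P_LAST4_MATCH)

def simulate_many(num_runs, n_matches=35_750, seed=0):
    """Run num_runs simulated elections; return {coincidence count: number of runs}."""
    rng = random.Random(seed)  # fixed seed so a run can be repeated exactly
    return Counter(simulate_election(n_matches, rng) for _ in range(num_runs))

if __name__ == "__main__":
    num_runs = 50  # hypothetical small value so the sketch finishes quickly
    histogram = simulate_many(num_runs)
    for k in sorted(histogram):
        pct = 100 * histogram[k] / num_runs
        print(f"{k:3d}  : {histogram[k]:6d}  : {pct:9.4f}")
```

The figures that follow are from the Perl program's run, not from this sketch.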

Here are the results:


First column is number of coincidences per 35,750 matches
 : second column is number of runs (out of 400,000) which had that number of coincidences
  : third column is percentage of runs which had that number of coincidences
  0  : 11185  :   2.7963
  1  : 39863  :   9.9657
  2  : 71770  :  17.9425
  3  : 85277  :  21.3193
  4  : 76261  :  19.0652
  5  : 54745  :  13.6862
  6  : 32473  :   8.1182
  7  : 16641  :   4.1603
  8  :  7302  :   1.8255
  9  :  2946  :   0.7365
 10  :  1071  :   0.2678
 11  :   344  :   0.0860
 12  :    93  :   0.0233
 13  :    22  :   0.0055
 14  :     6  :   0.0015
 15  :     1  :   0.0003
Average = 3.57563
Run time = 280.9 minutes


As you can see, in 400,000 simulated elections there were 1537 (0.3844%) with 10-15 innocent coincidences of name, DOB & Last4SSN, but just one (0.0003%) with 15 innocent coincidences, and none with more than 15 innocent coincidences.

Thus we can say with >99.6% certainty that in the 2012 North Carolina general election there were at least 755 cases of fraud (765 minus 10), and with >99.999% certainty that there were at least 750 cases of fraud (765 minus 15).
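
The simulated tail probabilities can also be cross-checked without simulation. If each of the 35,750 matches independently has a 1-in-10,000 chance of also matching Last4SSN (an assumption implied by the reasoning above), the number of coincidences follows a binomial distribution. The Python sketch below is an illustration using the exact binomial formula, a different technique from the simulation; the function names are made up for this example.

```python
from math import comb

N_MATCHES = 35_750   # coincidental name + DOB matches per election
P = 1 / 10_000       # ASSUMPTION: independent 1-in-10,000 chance per match

def binom_pmf(k, n=N_MATCHES, p=P):
    """Exact binomial probability of exactly k coincidences out of n matches."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def prob_more_than(threshold, n=N_MATCHES, p=P):
    """Probability of strictly more than `threshold` coincidences.

    Computed as 1 minus the cumulative probability up to the threshold,
    which keeps the binomial coefficients small enough for floats.
    """
    return 1.0 - sum(binom_pmf(k, n, p) for k in range(threshold + 1))

print(f"P(0 coincidences)         = {100 * binom_pmf(0):.4f}%")
print(f"P(more than 9, i.e. 10+)  = {100 * prob_more_than(9):.4f}%")
print(f"P(more than 15)           = {100 * prob_more_than(15):.6f}%")
```

These percentages can be compared directly with the corresponding rows and tail sums of the simulation table.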

I don't consider 750 cases of actual, identifiable voter fraud in the 2012 NC general election "quite rare." Do you?

It is unknown how many of the voter fraud cases were instances of a single person fraudulently voting in two States, and how many were people impersonating other voters (presumably after determining that the actual registered voters had moved out-of-State). Both are examples of voter fraud, but with different culprits.

To the best of my knowledge, there's been no attempt to prosecute any of those 765 cases, perhaps because of the SBOE's inability to distinguish between cases of the same person voting in two States, and cases of impersonation. That's because in 2012 NC had no voter ID requirement. If they can't conclusively identify the culprits, they can't prosecute the crimes.

Note that Interstate Crosscheck cannot identify all, or even most, cases of voter impersonation. An impersonator won't be detected unless the person whom he impersonates actually votes in another State in the same election. Even then, the fraud won't be detected unless the other State participates in the Interstate Crosscheck project. (About half of the States did so in 2012, representing about 78% of the nation's voters.) So it is clear that the real number of fraudulent votes in NC was higher than 750, perhaps several times higher.

4,505,372 North Carolinians voted in the 2012 Presidential election. A few thousand fraud cases is not a large percentage of that number, but it certainly is not "quite rare." 750 provable fraud cases is about 0.0166% of the total vote (750 / 4,505,372).

That's not a large percentage, but it is large enough to matter. Sometimes even a tiny percentage can change the outcome of an election, with momentous consequences.

In the 2000 Presidential election, the outcome depended on the result in Florida, where Bush beat Gore by a margin of just 0.00922% of the vote, which is much smaller than the percentage of provably fraudulent NC votes in 2012. Likewise, Sen. Al Franken’s 312-vote (0.0109%) MN victory margin in 2008 ultimately provided the deciding vote to enact ObamaCare, but that margin was also much smaller than the percentage of provably fraudulent NC votes in 2012.

"When my information changes, I alter my conclusions. What do you do, sir?"
- John Maynard Keynes

Are you ready to reconsider your 2010 conclusion, Prof. Minnite?

Sincerely,

Dave Burton
Cary, NC