The Usability of Punched Ballots
Improving Usability in America's Voting Systems
Bob Bailey
Human Factors International, Inc.
Improving usability
Theresa LePore, the Supervisor of Elections in Palm Beach County, Florida, has received much criticism for the ballot she designed for this year's presidential election. In fact, she made several good decisions. For example, she attempted to improve the ballot for older voters by making the characters larger. She also wanted to have all presidential candidates on one page, and her creative solution was to use the "butterfly ballot." To ensure adequate reading performance, she needed to attend to at least five issues:
Font size -- For the majority of voters, a font size of 10 points would have been satisfactory. Most books are printed using type that is 10 or 11 points (a "point" is 1/72 of an inch). To accommodate older users, however, the research suggests that the characters should have been at least 12 points (maybe even 14 points). Using all uppercase letters, which she elected to do, made the characters larger for users. It is acceptable to use smaller font sizes when users can move closer to the text (or move the text closer to them) in order to make the image in the eye (the angle subtended on the retina) larger.
Font type -- She used a "sans serif" font for the names. This decision was acceptable. There is one study that suggests that people over age 60 read "serif" fonts faster than sans serif fonts. In this case, the speed of reading is not as important as the accuracy of reading. Florida law allows each voter five minutes in the voting booth.
Text vs. background -- The fastest and most accurate readability comes from using black text on a white background. This is what she did. The ballot appears to be black print on "white" card stock.
Illumination level -- We do not know about the illumination level where the votes were cast. One recent research study found that in 71% of over 50 different "public places" in Florida, the light level was too low for adequate reading. Older adults need more illumination in order to see well. In general, because the main usability issue was reading accurately, not reading quickly, Ms. LePore did an adequate job of dealing with these basic human factors issues.
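The point about font size and viewing distance can be illustrated with a small calculation. The following is a minimal sketch, not a measurement procedure: it assumes the nominal point size can stand in for character height (actual letterforms are somewhat smaller), and the viewing distances in the usage lines are hypothetical values chosen only for illustration.

```python
import math

POINTS_PER_INCH = 72  # a point is 1/72 of an inch

def visual_angle_arcmin(font_points: float, distance_inches: float) -> float:
    """Approximate angle subtended at the eye, in minutes of arc.

    Assumption: the nominal point size is treated as the character height.
    """
    height_inches = font_points / POINTS_PER_INCH
    angle_rad = 2 * math.atan(height_inches / (2 * distance_inches))
    return math.degrees(angle_rad) * 60

# Hypothetical viewing distances, for illustration only.
far_12pt = visual_angle_arcmin(12, 16)   # 12-point type at 16 inches
near_10pt = visual_angle_arcmin(10, 12)  # 10-point type at 12 inches
print(f"12 pt at 16 in: {far_12pt:.1f} arcmin")
print(f"10 pt at 12 in: {near_10pt:.1f} arcmin")
```

Under these assumptions, a smaller font held closer can subtend a larger angle than a bigger font held farther away, which is why moving closer can compensate for smaller type.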
Layout of the Ballot
The issues surrounding the layout of the ballot were not as easy to detect and resolve. It is difficult, even for usability experts, to identify layout and formatting problems. For this reason, usability professionals make considerable use of usability tests.
Usability testing -- Usability tests are intended to identify and correct problems before products are used by large numbers of users. Ms. LePore would have been interested in finding and fixing most of the serious problems voters would have on Election Day. For her ballot, a usability test would require several people pretending to vote while using the proposed ballot. While voting, these test participants would be observed by experienced usability testers. The testers would note and record any difficulties that the "voters" appeared to be having. After voting, the participants would be individually interviewed about any problems they experienced. This information would be used to change the ballot, and then a second round of usability testing would take place. Sometimes it takes 3 to 5 (or more) iterations (design, test, redesign) to achieve the desired outcome, i.e., meet the performance goals for the ballot.
The "Buchanan" problem -- Buchanan got 3,407 votes for president in this heavily Democratic county, more than he received in any other Florida county. One explanation for the large number of votes relates to the way Palm Beach County's punch-card style ballot was laid out for the presidential race. Candidates were listed on both sides of the vertical row of holes where the voters punch their choices. The top hole was for Bush, listed at top left; the second hole was for Buchanan, listed at top right; and the third hole was for Gore, listed under Bush on the left.
Informal evaluations -- Theresa LePore designed the ballot and then had it reviewed. Her usability testing, however, was less formal than just described. It initially consisted of seeking approval from two other members of the canvassing board of which she was a member. These two evaluators were intelligent and highly experienced in conducting elections -- one was a county commissioner and the other was a judge. Even so, the probability of one or the other of these two people detecting the "Buchanan" problem by simply looking at the ballot was very low. I calculated it as being about two chances in 100. Ms. LePore then sent the ballot to both the Democratic and Republican National Committees for review. If we assume that the two groups had a total of ten people look at the ballot, the probability that one or more people in this group would have found the "Buchanan" problem was also very low. I calculated that they had about 1 chance in 10 of finding the problem. Obviously, none of these reviewers identified the "Buchanan" problem. Ms. LePore was not familiar with usability testing, but neither are many other highly experienced designers. For example, shortly after the Florida voting issue became known, one highly experienced system developer wrote: "Would usability testing (which often only uses 5-20 people of each background) have caught it? I think so." He links users to Jakob Nielsen's web site, where Nielsen has suggested that "100% of usability problems can be found using only 15 subjects." Both are wrong.
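The two figures quoted for the reviewers (about 2 in 100, and about 1 in 10) can be reproduced with the expression given in the footnote. The sketch below assumes each reviewer independently has a 1% chance of running into the problem, which is the article's estimate of p; the function name is chosen here for illustration.

```python
def detection_probability(p: float, n: int) -> float:
    """Chance that at least one of n independent reviewers encounters a
    problem that affects each person with probability p."""
    return 1.0 - (1.0 - p) ** n

# p = 0.01 is the article's estimate for the "Buchanan" problem.
two_reviewers = detection_probability(0.01, 2)   # the two board members
ten_reviewers = detection_probability(0.01, 10)  # the assumed ten party reviewers
print(f"two reviewers: {two_reviewers:.3f}")
print(f"ten reviewers: {ten_reviewers:.3f}")
```

With these inputs the results come out near 0.02 and 0.10, in line with the estimates above.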
Number of test participants -- How many usability test participants would have been required for Ms. LePore to feel confident of finding these types of problems? This answer can be calculated.* We know what percentage of the voters had problems. The "Buchanan" problem was only a difficulty for about 1% of all the voters. My calculations show that she would have required 289 test participants to be 95% confident of detecting the "Buchanan" problem (423 participants would be needed to be 99% confident). Let's be a bit more precise. If Palm Beach County were like the other Florida counties, Buchanan would have gotten around 600 votes, instead of 3,407. Many have proposed that this suggests that about 2,800 votes (3,400 minus 600) were erroneously made. We do not know for sure; the votes may have been correctly made for Buchanan. Keep in mind that Buchanan received over 8,000 votes in Palm Beach County in the 1996 presidential primary when he was running against Bob Dole. What most of her critics are ignoring is that more than 99% of the voters had no trouble voting by using Ms. LePore’s ballot. The question becomes: what is different about the people who may have had a problem with the ballot? There are several possibilities:
A good usability tester would have tried to determine which of the above reasons affected the voters. Assuming that the voters actually did not want to vote for Buchanan, the question becomes which one, or combinations of several, or one that is not listed, was most responsible.
The "Multiple votes" problem -- The same reasoning and calculations can be used with the other major problem. In Palm Beach County there were 19,010 ballots that were not considered valid (they were disqualified) because the voters had voted (punched) for more than one presidential candidate. In the initial voting, there were 432,286 ballots completed in that county. This means that 4.4% of the ballots were considered invalid. The question is how many test participants would have been required to have almost certainly detected the problem? The same formula can be applied. I calculate that they would have required 94 participants to complete a sample ballot, in order to be 99% confident of detecting the "Multiple votes" problem. This is far fewer than were required for the "Buchanan" problem because a higher percentage actually ended up making the "Multiple votes" error.
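The sample-size reasoning for both problems can be sketched in a few lines. This is a minimal illustration that assumes test participants are independent and each encounters the problem with probability p; the function name is hypothetical, and the code simply solves the footnote's expression for the smallest whole number of participants.

```python
import math

def participants_needed(p: float, confidence: float) -> int:
    """Smallest n with 1 - (1 - p)**n >= confidence.

    Assumes independent participants, each hitting the problem
    with probability p.
    """
    if not (0.0 < p < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("p and confidence must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))

# "Buchanan" problem: the article's estimate of p is about 1%.
buchanan_95 = participants_needed(0.01, 0.95)
buchanan_99 = participants_needed(0.01, 0.99)

# "Multiple votes" problem: 19,010 invalid ballots out of 432,286.
multiple_99 = participants_needed(19010 / 432286, 0.99)

print(buchanan_95, buchanan_99, multiple_99)
```

Evaluated this way, the expression yields 299, 459, and 103 participants, which is somewhat higher than the figures quoted in the text. The article's numbers may rest on rounding or on an estimate of p that is not stated; the qualitative conclusion, that the rarer problem needs several times as many participants, is unchanged.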
The "Dimpled ballot" problem -- Even the highly publicized "dimpled ballot" problem could have been identified before the election if the test participants, the ballots, the punching instrument, and the test items had been truly representative of the actual voting experience. All voters were given the following instruction, both before the election and when voting:
"STEP 3 -- To vote, hold the voting instrument straight up. Punch straight down through the ballot card for the candidates of your choice." (In the original version "Punch straight down through" was in boldface type.)
The usability test would have shown that some people were not punching through the card as they were directed to do in the instructions. This may have meant that better instructions were required, or that some people were unable to adequately use the punching instrument (or both). Some were obviously pressing lightly on the card ("dimpling") rather than punching through the card as they were instructed. One final point should be made. Each ballot had a final instruction, in all capital letters, at the bottom of the "instructions" page:
"AFTER VOTING, CHECK YOUR BALLOT CARD TO BE SURE YOUR VOTING SELECTIONS ARE CLEARLY AND CLEANLY PUNCHED AND THERE ARE NO CHIPS LEFT HANGING ON THE BACK OF THE CARD."
Conclusion
My conclusion is that Theresa LePore should not be so severely criticized for making design decisions that led to the "Buchanan" and "Multiple votes" problems. In the past, few (if any) ballots (and their related instructions) have received the kind of rigorous usability testing that would have identified these problems. Having a certain number of voter problems was considered simply a part of the cost of holding an election with millions of voters.
Generally, usability testing has been considered too expensive. I figure that it would have cost less than $20,000 to run the necessary performance tests on the Palm Beach ballot. These usability tests would have enabled ballot designers to find and rectify the "Buchanan" problem, the "Multiple votes" problem, and even the "dimpled ballot" problem. For comparison purposes, the two presidential candidates spent about one billion dollars trying to get elected.
Footnote:
*Calculation of required number of test participants:
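A reasonable estimate of the number of participants required to detect the problem can be made by using the formula $1 - (1 - p)^n$, which gives the probability that at least one of n test participants encounters the problem, where p = the probability of the usability problem occurring for any one participant (in this case 0.01), and n = the number of test participants. In this case we set that probability equal to the desired confidence level and solve for n.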
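Written out, solving for n might go as follows. The symbol C is introduced here for the desired confidence level (for example 0.95 or 0.99); it does not appear in the original footnote, and the derivation assumes that participants encounter the problem independently.

```latex
\begin{align*}
1 - (1 - p)^n &\ge C \\
(1 - p)^n &\le 1 - C \\
n \ln(1 - p) &\le \ln(1 - C) \\
n &\ge \frac{\ln(1 - C)}{\ln(1 - p)}
\end{align*}
```

The inequality reverses in the last step because $\ln(1 - p)$ is negative for any p between 0 and 1. The required number of participants is the result rounded up to the next whole number.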
© 2000 Human Factors International, Inc.