Brian Wichmann is the Editor of Voting matters and a visiting professor of The Open University.
For an STV algorithm, we could have too much stability in that part of the ballot papers are simply ignored - for instance, by only using the first preference. On the other hand, we could have an algorithm which lacks stability in vital respects by changing the result for inconsequential changes to a ballot paper.
One change made to a ballot paper can be regarded as small, due to the nature of the preferential system. Since the usual means of balloting does not provide for the voter to give equal preference, when the ballot paper records ABC, this might be because A and B were regarded as equal, but the voter specified A first arbitrarily. Hence the voter could equally have written BAC instead. Hence given the ballot ABC, the voter's true intentions could perhaps have been expressed as BAC or ACB. In general, given n preferences, n-1 ballot papers constructed by interchanging neighbouring preferences could be regarded as small differences.
Now consider two algorithms for STV which have broadly similar properties (as do all serious contenders). Figures 1 and 2 represent graphically these two algorithms.
Figure 1
Figure 1 represents a stable algorithm since small changes are unlikely to change the result of an election, while Figure 2 represents an unstable algorithm. If we were operating in two dimensions, then the property of stability could be measured rather like the game of shove-halfpenny: one would measure the probability that a small circle placed at random on the figure crossed one of the dividing lines.
Figure 2
In the case of STV algorithms, we do not have a simple two-dimensional system, and hence the figures are a crude diagrammatic representation. To measure the probabilities we must conduct a suitably controlled experiment. Fortunately, we can use a computer to aid this process so that we can perform the equivalent of shove-halfpenny sufficiently often to obtain results which are likely to be meaningful.
We select an actual election for which the ballot papers are available. We also choose a number, about 20, which is the number of ballot papers from the full set that is to be selected at random. (We return to the choice of this number n later.)
From each real election, we derive 100 mini-elections by randomly selecting n ballot papers. The experimental method is to analyse the effect of making small changes to these mini-elections. The analysis is as follows. Firstly, we compute the result of the two algorithms from the mini-election. (The result need not be the same for the two algorithms, nor the same as for the full election.) We now consider all the possible similar mini-elections derived by making one small change to one of the ballot papers. (This is potentially hundreds of elections - hence the computer.) This particular mini-election is on the edge if a specific criterion is met, say at least one of the small changes produces a different result.
The choice of n is important. If n is very small (say 1), then it is clear that the mini-election will not be representative of the real election. On the other hand, if n is large (say the full election), then the computation of the 'edge' becomes too large, and also the number of possible mini-elections becomes too small (in this case only 1). Care must be taken over the specific criterion for being on the edge. If one takes something like the ERS council elections (i.e., several posts to fill with no parties, so that small changes are likely to make a difference to the outcome), with the criterion that any small change resulting in a difference implies being on the edge, then there is a danger that all mini-elections are on the edge!
For the 100 random mini-elections we perform a different analysis in each of the three experiments given here. If one could assume statistical independence, then it would be a simple matter to undertake a chi-squared test to see if the result is significant. Unfortunately, we do not have elections with a large enough number of ballot papers to ensure the independence, and therefore we must be content with a non-statistical treatment.
The number of changes to those elected for one small change is usually 0 (no change), but is sometimes 1, rarely 2 and very rarely 3. Hence for each ballot paper in the mini-election, n-1 integers are output, representing the number of changes arising from each of the n-1 possible interchanges of adjacent preferences, where n is the number of preferences marked on the ballot paper. This implies that the output is of similar length to the input - an important consideration, since if complete results were printed for each election result computed, hundreds of pages of material would be produced.
The analysis is most easily seen by considering an example. A mini-election from election R038 is as follows:
17 5 1 11 9 10 0 1 10 17 5 9 11 16 0 1 6 16 2 1 14 17 10 9 11 5 8 4 12 13 15 0 1 4 8 12 15 13 0 1 17 5 11 1 16 10 2 0 1 5 9 10 11 17 0 1 3 7 9 14 17 0 ... 1 8 10 13 11 17 0 1 9 5 6 17 0 1 11 10 5 17 9 0 1 6 4 15 14 16 8 1 0 1 6 14 16 1 2 4 13 12 8 15 0 1 13 4 15 12 8 0 0 "A.1 " "B.2 " "C.3 " "D.4 " "E.5 " "F.6 " "G.7 " "H.8 " "I.9 " "J.10 " "K.11 " "L.12 " "M.13 " "N.14 " "O.15 " "P.16 " "Q.17 " "1R038: H3H "The above data is for an election with 17 candidates for 5 seats, in which the first ballot paper selects candidate 11 (K) as the first preference, then 9 and lastly 10. The names of the candidates are the letters A-Q, a convention used throughout.
The program computes the effect of making all possible interchanges of adjacent preferences, which for Meek gives:
v1 +F-L-B-O-P-H-G-N-E-M-C+I+Q-K+J+A-D 68 0 1 1 1 2 1 1 1 1 1 1 0 0 2 1 1 0 0 1 0 0 1 1 ... 2 1 1 1 0 1 0 1 1 0 0 1 1 0 0 0 0 0 2 1 0 1 1 1 0 2 0 0 1 1 0 1 0 0 1 1 1 1 0 1 1 0 1 mThe first line gives the result (with Meek) for this mini-election, where +F-L means F is elected and L is excluded, etc. (The v1 and 68 are not relevant.) Then, starting with the last ballot paper and working back towards the beginning, the number of differences to the result is printed for each possible interchange. Hence the last ballot paper has four possible interchanges, the first one giving no difference, but the last three each making a single difference. So in this case, interchanging the first two preferences makes no difference, but interchanging the 2nd and 3rd preferences does change the result by one candidate. The 'm' relates to the third experiment and is explained later.
One other program is needed which selects n ballot papers at random from a real election, and repeats this 100 times. This program is fast and straightforward.
For the main election data, six real elections have been chosen from the data already available (see Voting matters, Issue 2). The statistics from these elections are as follows:
Identifier Papers Candidates Seats n R006 239 9 2 20 R008 261 10 3 25 R010 270 9 5 27 R017 479 8 1 15 R033 196 14 7 25 R038 177 17 5 20Unfortunately, none of the elections in the data base are from elections involving parties, and so such elections could not be selected for this study.
We can now summarise the results obtained by example. For election R017, 100 mini-elections are computed by selecting 15 ballot papers from the actual 479. For each of these mini-elections, we compute what difference (if any) would be made by a single transposition of a preference. This is repeated for each possible transposition, which in this case, involves the analysis of 4585 elections!
Criterion: Some change for any transposition
Election ERS edge Meek edge R006 74 65 R008 80 74 R010 95 87 R017 69 74 R033 99 95 R038 100 100This table means, for instance, that for the 100 mini-elections derived from R006, 74 are on the 'edge' for ERS and 65 for Meek - which implies that there were 26 or 35 elections for which no change was made by any transpositions. Hence a very high proportion of the mini-elections are on the 'edge', over three quarters in almost all cases. However, even the most optimistic assumption shows that there is not much difference between the two algorithms.
We now change the criterion for being on the edge so that a lower proportion are on the edge.
Criterion: More than three transpositions make a change
Election ERS edge Meek edge R006 41 38 R008 55 46 R010 76 57 R017 32 49 R033 91 75 R038 94 91We again conclude that there is not much difference between the two algorithms.
We need to look at aspects other than the actual size of the edge to see significant differences.
Given a mini-election which is on the edge, then we know at least some transpositions of the preferences will change the result. It is therefore natural to ask which specific transpositions can change the result. Clearly, it is more likely that transposing the first two preferences will alter the result, but what about the subsequent transpositions? We therefore analyse the number of times a transposition makes a change, against the position of the transposition (pi).
Combined results p1 p2 p3 p4 p5 p6 p7 p8 p9 p10 p11 p12 R006 ERS 283 31 14 6 4 0... Meek 310 147 61 39 23 16 14 0... R008 ERS 452 56 11 8 5 4 0 1 3 0... Meek 393 161 70 42 25 13 12 11 0... R010 ERS 668 173 36 21 9 4 2 0... Meek 423 174 82 34 21 18 13 0... R017 ERS 214 27 4 5 2 1 0... Meek 279 210 123 119 104 94 0... R033 ERS 979 227 78 31 17 8 3 2 1 0 0 1 Meek1876 392 225 144 117 91 69 61 57 41 41 34 37 R038 ERS 734 203 44 31 17 6 3 2 0 1 1 0 Meek 723 502 376 346 157 138 107 97 91 44 36 33In the table above, for each of the six elections, the number of times a transposition makes (at least) one change to the result is tabulated against the preference position for all the 100 mini-elections. The difference between ERS and Meek is now obvious. The number of changes for the first preference between the two algorithms is similar and is surely not significant. However, in all subsequent preferences, many more changes arise from Meek than from ERS.
In examining the subsequent preferences, there is no natural scale to work to, since a change in preference n is more significant if there are n candidates than 2n candidates. The number of seats is also relevant to this scale. Hence in analysing the table above, both the number of candidates and seats must be considered.
We can add up the results from each election for those positions beyond the number of seats (s) for each election, giving the following results:
Position s+1 s+2 s+3 >s+3 ERS 61 21 15 11 Meek 530 364 293 596 ratio 8.7 17 19 54Hence we conclude that transposing preferences beyond the number of seats has virtually no effect with ERS as compared with Meek.
We can now compare the violation rate for ERS and Meek, which is as follows:
Election ERS violations Meek violations R006 0 32 R008 2 29 R010 5 8 R017 5 78 R033 5 70 R038 0 141Hence there is no question that Meek violates mono-raise much more than ERS. This is likely to be due to the increased sensitivity of Meek to the effects of late preferences.
It appears that the results presented here have some limitations. Firstly, the mini-elections necessarily have a small number of ballot papers and so the results need not apply to larger elections. Secondly, a consequence of the small number of ballot papers is that in many cases, random choices are made by both the ERS and Meek algorithms.