Pontificating on which of Estonia's electoral polling companies produced the most accurate pre-election rating is superfluous, writes Tarmo Jüristo. The methods by which the three or four companies engaged in polling attained their results were neither transparent nor comparable, he goes on.
Warren Buffett has, among his many other witty aphorisms, noted that "it's only when the tide goes out that you learn who has been swimming naked."
For Estonian polling companies, this tidal ebb moment came last Sunday night, when the final Riigikogu election results became public; results which Norstat, Kantar Emor, Turu-uuringute AS and, to the surprise of all, Faktum Ariko, had been busily engaged in predicting in the weeks leading up to the elections.
In the Estonian media, party ratings have in many ways become an entertainment section akin to horoscopes. Although the target audiences are somewhat different, both are eagerly awaited every week, and when they arrive, each reader can usually find something there that confirms their beliefs and strokes their hopes.
Should Emor's rating bring bad news, one can quickly turn to Norstat's or Turu-uuringute's instead, and there is a very good chance that at least one of these will offer a more positive picture to share with fellow supporters on social media. Then repeat the process the next week.
The somewhat more discerning reader, however, will often find these ratings cause some head-scratching, since they frequently paint a picture that simply does not fit within the one-to-three percent maximum statistical error proudly announced by the polling companies themselves.
It is not a rare thing for surveys conducted by different companies over essentially the same period to result in ratings which do not immediately fit into these ranges.
Usually, however, an outsider has little reason to consider one poll to be more true and fair than another. It also does not help matters when all survey companies confidently state that it was their ratings that turned out to be the most accurate last time, each of them, of course, citing different indicators from the others' to confirm these statements. And even if one of them did come closest to the result last time out, this is no real guarantee that the same thing will happen this time (just by way of a prescient hint, this is what happened this time around).
At this point, it might be appropriate to make a small technical digression into how these ratings were born in the first place.
The point of party popularity ratings is to measure party support among voters at a given point in time. If that point is relatively close to election day, this in turn permits us at least to predict the likely result of the vote on polling day.
If one wanted to do exactly that, one would of course need to hold a vote in which the entire electorate could express its preferences.
This would understandably be a very costly and impractical way of doing things, so survey companies instead use samples to obtain this information.
In statistics, a sample is a subgroup of manageable size which is expected to reflect, in its main features, the characteristics of the population it is trying to describe; in other words, a sample which is representative of the population.
If the sample is indeed representative, it may be assumed that the observations and conclusions drawn from it remain valid (with some possible variation, which the aforementioned statistical error rate is meant to measure) even when translated to the general population.
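To give a rough sense of the error rate referred to above, the textbook calculation for a simple random sample can be sketched as follows (this is an illustration of the standard formula, not a description of any particular company's method; the function name and figures are invented for the example):

```python
import math

def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Half-width of the ~95% confidence interval for a proportion p
    estimated from a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

# A party polling at 25 percent in a sample of 1,000 respondents:
moe = margin_of_error(0.25, 1000)
print(f"+/- {moe:.1%}")  # prints "+/- 2.7%"
```

This is where the "one-to-three percent" figures quoted by pollsters come from; note that the formula strictly applies only to a pure random sample, a caveat the article returns to below.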
The trouble, however, is that assembling a truly representative sample is often a very tricky task. There is a wide range of ways and means of achieving this, and picking the most appropriate of them for a given task is a job carrying great responsibility.
Among other things, compromises of one kind or another are inevitable, and it is quite understandable, and to be expected, that different polling companies may make different choices.
The consequence of these choices, however, is that the methodologies used by different companies to conduct their surveys differ, which in turn means that the results may also diverge.
There is nothing wrong or unacceptable about this per se — as long as we can get an idea of the probable causes that may have led to such divergences.
That is not the case at present, however. At best, the ratings which appear in the media (and often surveys in general) are accompanied only by a fleeting footnote stating the sample size.
Sometimes this is accompanied by the margin of statistical error (which, strictly speaking, applies only to a pure random sample; this is not the case with the online panels widely used by survey companies, for instance) and the survey methodologies used. But even in the latter case, it is not at all common to indicate the exact split between online and telephone responses.
It is often stated simply that "the sample is representative," without specifying the parameters under which it is considered representative, not to mention details of a more technical nature, such as the weighting efficiency, the maximum permissible weight or the non-response/interruption rate of the poll in question.
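To make the "weighting efficiency" figure concrete, one common way to compute it is via the Kish effective sample size (an assumption on my part; individual polling companies may define the metric differently, and the weights below are invented for illustration):

```python
def weighting_efficiency(weights):
    """Kish weighting efficiency: effective sample size divided by
    actual sample size. 1.0 means weighting costs no precision."""
    n = len(weights)
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    n_eff = total * total / total_sq  # Kish effective sample size
    return n_eff / n

# Half the respondents up-weighted to 1.5, half down-weighted to 0.5:
w = [1.5] * 500 + [0.5] * 500
print(f"{weighting_efficiency(w):.0%}")  # prints "80%"
```

In this invented example, a nominal sample of 1,000 behaves like one of only 800 once weighted, which is exactly why the published margin of error understates the true uncertainty when such details go unreported.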
The ordinary newspaper reader will, of course, not be interested in these things, just as the ordinary shopper is not likely to carefully read the fine-print composition table on a yogurt pot. It cannot be inferred from this, however, that the information itself is irrelevant.
Let's now return to the topic of this year's pre-election ratings. The question of whose prediction turned out to be the most accurate is, in fact, superfluous.
The problem is that the methods by which these results (however accurate or inaccurate they may be) were achieved were neither transparent nor comparable.
The elections finally gave us a reference point this time around, one which is simply unavailable between elections; and the ratings, although very visible, are ultimately only a small part of the work that polling companies do on a daily basis.
In the vast majority of cases, the client has no way of commissioning another study on the same topic, over the same period, against which the results could be compared and checked.
There is only the commissioned study, which comes with confirmation from the company conducting it that it holds an ISO international quality certificate.
Based on surveys in Estonia, weighty choices are made every week in both the public and private sectors; choices which often have a significant impact on all our lives, and for years to come. It is in all our interests that these decisions be made on the basis of information as truthful and accurate as possible.
Yes, it must be accepted that final and inexorable accuracy will inevitably remain an unachievable goal with sample-based surveys, which will always carry some variability and uncertainty. All the more necessary, then, that we, as users of this information, be able to critically evaluate the methods by which results were achieved in cases of doubt, or simply to keep them at arm's length.
That said, sampling and conducting surveys are often very difficult tasks, and in the end, all of this is still done by human beings. And human beings, no matter how good and careful they may be, are prone to the occasional mistake. There is nothing wrong with this, and it, too, has to be accepted. What cannot and must not be accepted, however, is a situation where it is not possible to identify and correct these errors.
It is unacceptable that survey results do not come with an exhaustive overview of the methodologies used, both in the collection of the data and in their weighting.
And if the individual conducting the study does not publish these, then there is no reason to take the results seriously at all, much less publish them in the media.
Ultimately, all of this is precisely in the interest of the research companies themselves: that the results of their hard work be something subscribers and users can rely on with peace of mind.
Editor: Andrew Whyte, Kaupo Meiel