On 12 November 2019 I took part in the first EU DataViz conference, held in Luxembourg City. I gave a presentation entitled “EU elections: the case for a harmonized treatment of European election data” in the first set of thematic sessions.
The presentation was filmed and can be found HERE (at 2h:51m:48s)
The slides are found at this link.
I took the liberty of publishing the presentation in the form of an article below. Enjoy!
I’m here to share with you what I’ve learned creating the most detailed maps of the European Elections ever.
Part I – Each election has its map
Today, when elections happen, news sites use maps to easily illustrate the result and patterns in the vote.
For example, when elections happen in the United States we see a lot of maps published on news sites such as the New York Times and the Washington Post, and on non-American ones like the Guardian. They vary in detail, aesthetic and subject matter (presidential elections, congressional elections, presidential primaries), and approach the vote from various angles.
We also see detailed maps when other big democracies vote, like India (on the left), which holds the world’s single largest elections, with roughly 900 million eligible voters. There are beautiful detailed maps of Brazil as well (here on the right), made by Gazeta do Povo for the country’s last presidential elections.
But when the European Union holds elections (the second largest in the world, with more than 400 million eligible voters) we often get maps like these. These one-country-one-color maps simplify the vote to the extreme. You do get to see the whole Union, but with virtually no detail.
Otherwise we see the alternative: complex maps that focus on the reader’s member state. El Diario does a detailed cartography of the Spanish vote without showing other countries, and Le Monde shows only how the French voted, not the rest of the Union.
There is something missing. A map that merges the best of both: a pan-EU look but on a detailed level.
These are the only two maps of this kind.
The map on the left was created by me, as a personal project, for the 2014 elections. It took 6 months of on-and-off work. The one on the right was a collaboration with Julius Tröger and Zeit Online, published in June 2019, right after this year’s European elections. It is the first time an online newspaper has published such a map.
Here it is on full screen.
(Apologies for cutting off Finland and Sweden.) Almost 80,000 administrative units, mostly municipalities. A historic first.
Part II – Why are there so few maps like this?
So why did it take until now for someone to make these maps? The problem is that election data is organized by national authorities. No data is harmonised to a common EU format, by either national or EU authorities. This means that national news sites will process only the data formats they already know from national elections.
And I encountered this a lot while working on my maps.
For example, you would think member states publish the data on the website of their election authority. Some do, some don’t. Sometimes it’s published on an open data portal, and sometimes it’s on a private site.
You would think that the data is made available for download. Yet sometimes it isn’t, and to get all the data I had to write small programs to scrape those websites page by page.
This becomes harder the more complicated the site is. Simple HTML sites are easy, but some of these websites have complicated JavaScript interfaces that make them pretty to look at but also make getting the data a headache.
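To give an idea of what such a small program looks like in the easy case, here is a minimal Python sketch that turns one simple HTML results table into rows of data. It is not the code used for the maps: the page content, the column names and the function `parse_results_page` are invented for illustration, and a real scraper would also have to fetch every page and cope with each site's own layout.

```python
from html.parser import HTMLParser


class ResultsTableParser(HTMLParser):
    """Collects the text of every table cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # row currently being read, or None
        self._cell = None   # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def parse_results_page(html):
    """Return one dict per data row, keyed by the cells of the header row."""
    parser = ResultsTableParser()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *data = parser.rows
    return [dict(zip(header, row)) for row in data]


# Invented sample page (hypothetical names and numbers).
SAMPLE_PAGE = """
<table>
  <tr><th>Municipality</th><th>Party</th><th>Votes</th></tr>
  <tr><td>Exampleville</td><td>Party A</td><td>1 204</td></tr>
  <tr><td>Exampleville</td><td>Party B</td><td>987</td></tr>
</table>
"""

rows = parse_results_page(SAMPLE_PAGE)
```

A parser like this only sees the HTML the server sends. When the numbers are loaded afterwards by JavaScript, the table is simply not in that HTML, which is why those sites are so much harder to scrape.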
Most of the time, if you manage to download a file, it’s a simple Excel or CSV file. But in the case of the Netherlands it was a more complicated XML file (a format many people are less familiar with).
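For readers who have never opened such a file, the sketch below shows how nested XML can be flattened into the same rows one would get from a CSV, using Python's standard library. The XML layout here is a simplified, invented one and not the actual Dutch format; the element names, attributes and numbers are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Invented, simplified structure (hypothetical element and attribute names).
SAMPLE_XML = """
<results>
  <municipality code="0001" name="Exampleville">
    <party name="Party A"><votes>1204</votes></party>
    <party name="Party B"><votes>987</votes></party>
  </municipality>
  <municipality code="0002" name="Sampletown">
    <party name="Party A"><votes>310</votes></party>
  </municipality>
</results>
"""


def flatten_results(xml_text):
    """Turn nested municipality/party elements into flat, CSV-like rows."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for municipality in root.iter("municipality"):
        for party in municipality.iter("party"):
            rows.append({
                "code": municipality.get("code"),
                "municipality": municipality.get("name"),
                "party": party.get("name"),
                "votes": int(party.findtext("votes")),
            })
    return rows


flat_rows = flatten_results(SAMPLE_XML)
```

Real election XML is usually deeper and uses namespaces, so the lookups need more care, but the idea of walking the tree and emitting one row per municipality and party stays the same.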
The data also changes based on the type of elections.
Most of Europe votes with list systems, each with its own twist (Luxembourg, for example, is somewhat special, with 6 votes per person). But some states or regions (Malta, Ireland and Northern Ireland) use the Single Transferable Vote, where candidates are ranked. This means no detailed data at the municipality level, since ballots are first centralised and then counted.
Voting abroad is another layer of complexity.
Sometimes votes from abroad are allocated to where the voter is from, if they vote by mail, and sometimes they are counted in a separate category. In some countries, embassy votes are added to the votes in the national capital, skewing the data there.
And even if the data is available to download, and even if the file is a clean Excel file, there is no guarantee the data is geo-referenced with the relevant municipality code. Municipality codes (called Local Administrative Unit codes by Eurostat) help with attaching the data to the map’s shapefile.
If only the municipality name is available, one can run into a lot of problems, especially if there are spelling variations (like accents) or language variations, as when a municipality in a bilingual area has its national name in one dataset and its regional name in another.
Thus it is always better that these datasets have the relevant municipality codes.
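The difference between joining on names and joining on codes can be shown in a few lines of Python. This is a minimal sketch with made-up data: the codes `X001` and `X002` and the vote counts are invented, and `normalize_name` and `join_on` are hypothetical helpers, not part of any library.

```python
import unicodedata


def normalize_name(name):
    """Strip accents and case so that 'Liège' and 'liege' compare equal."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return without_marks.casefold().strip()


def join_on(key_func, shapes, results):
    """Split results into those that find a map shape and those that do not."""
    index = {key_func(shape): shape for shape in shapes}
    matched, unmatched = [], []
    for result in results:
        if key_func(result) in index:
            matched.append(result)
        else:
            unmatched.append(result)
    return matched, unmatched


# Map side: invented codes, names as they might appear in a shapefile.
SHAPES = [
    {"code": "X001", "name": "Liège"},
    {"code": "X002", "name": "Donostia"},
]
# Election side: same places, different spelling or language (invented votes).
RESULTS = [
    {"code": "X001", "name": "Liege", "votes": 100},
    {"code": "X002", "name": "San Sebastián", "votes": 200},
]

name_matched, name_unmatched = join_on(
    lambda row: normalize_name(row["name"]), SHAPES, RESULTS
)
code_matched, code_unmatched = join_on(lambda row: row["code"], SHAPES, RESULTS)
```

Stripping accents rescues the spelling variation, but no amount of text cleaning will connect two different names for the same town. With codes, both rows match without any cleaning at all.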
So while mapping these elections, I thought a lot about how difficult and time-consuming it is to visualize the European vote (it shouldn’t be more difficult than the US or Indian elections). Here is what I think could be done about it, because something should be.
Part III – Three ways forward
1) The first possibility is a top-down approach: a hard policy mandating that a version of the national data follows a pre-determined format. Anyone who is familiar with the format can interpret it.
The advantage is that every country does things the same way, but since 28 states need to agree on the format, it might take a long time.
2) The second is what I call “The Bridge”: nations publish their own versions of the data like before, and an EU authority (maybe Eurostat?) publishes a harmonized version, thus bridging the space between the member states and third parties.
The advantage is that harmonizing happens centrally, but Eurostat must keep up with each state’s changes and shenanigans.
3) The third would be Eurostat publishing just a “nation by nation guide” for third parties, detailing how to access and harmonize the data.
This is the most cost-effective solution, and it requires minimal intervention from the European Union. On the other hand, it is others who will be harmonizing the data, so it could very well lead to inconsistencies.
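Whoever ends up doing it, the harmonizing step itself comes down to mapping each national layout onto one shared record. Here is a minimal Python sketch of that idea, under loud assumptions: the `HarmonizedResult` fields, the two national layouts and the country codes `AA` and `BB` are all invented placeholders, not a proposed standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HarmonizedResult:
    """One shared record shape (hypothetical fields)."""
    country: str
    lau_code: str
    party: str
    votes: int


def from_country_a(line):
    """Hypothetical country A publishes semicolon-separated text lines."""
    code, party, votes = line.split(";")
    return HarmonizedResult("AA", code.strip(), party.strip(), int(votes))


def from_country_b(record):
    """Hypothetical country B publishes records with its own field names
    and writes thousands with a dot, e.g. '1.204'."""
    votes = int(record["stimmen"].replace(".", ""))
    return HarmonizedResult("BB", record["gemeinde_nr"], record["partei"], votes)


harmonized = [
    from_country_a("X001; Party A; 1204"),
    from_country_b({"gemeinde_nr": "Y042", "partei": "Party Z", "stimmen": "1.204"}),
]
```

Each adapter is small, but somebody has to write and maintain one per country and per change of format. The three options differ mainly in who that somebody is.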
But one of these solutions needs to be implemented, as there is a need for this data. National and European institutions have already reached out to us.
By voting together, we are a single political space, although we do not see ourselves as such, partly because we are not represented that way in the media.
This is where dataviz comes in.
If harmonized EU data is readily available, we will use it, which means more visualisations in the media, and we will see ourselves represented more as a European Community.
12 November 2019 – Luxembourg
Some tweets from the event
How do you build the most detailed European election map results beautifully explained by @Arnold_Platon at #EUdataviz All it took was 6 months of data gathering and harmonisation 😅 pic.twitter.com/fcL2ck7M5d
— Karim Douïeb (@karim_douieb) November 12, 2019