Open Data: many organisations publish data for the public good. For example, the European Union publishes figures on economic activity, demographics, greenhouse gas emissions and gender equality. The European Central Bank publishes money supply statistics, interest rates and foreign exchange rates. Plenty of other institutions do the same, so there is a wealth of information out in the open!
Now how can you actually get this data? I wrote some tutorials with practical examples for the data science platform DataCareer. In these tutorials, you can read how to retrieve data from these open data sources with the programming language Python.
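To give a rough idea of what such a retrieval can look like, here is a minimal sketch using only the Python standard library. It is not taken from the tutorials: the URL, the column names `date` and `rate`, and the sample payload are all made up for illustration, so swap in the endpoint and fields of the data source you actually want to use.

```python
import csv
import io
from urllib.request import urlopen

# Hypothetical endpoint, for illustration only; replace it with a real open data API.
EXAMPLE_URL = "https://example.org/api/exchange-rates.csv"


def fetch_csv_text(url: str, timeout: float = 10.0) -> str:
    """Download a CSV document and return it as text (assumes UTF-8 encoding)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def parse_rates(csv_text: str) -> dict:
    """Turn CSV rows with 'date' and 'rate' columns into a {date: rate} mapping."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["date"]: float(row["rate"]) for row in reader}


# Made-up sample payload, so the parsing step runs without network access.
SAMPLE = "date,rate\n2024-01-02,1.10\n2024-01-03,1.09\n"
rates = parse_rates(SAMPLE)
print(rates)
```

With a real endpoint you would call `parse_rates(fetch_csv_text(EXAMPLE_URL))` instead of parsing the sample string. Keeping the download and the parsing in separate functions makes it easy to test the parsing on its own.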
As the tutorials show, most organisations now publish their data through APIs, which makes it easier for us to consume. It is great to find so much data publicly available. However, the main challenge is that it often comes in many different formats. As you can imagine, it is harder to process data from PDF files or poorly structured Excel files than from well-structured CSV files. Ben Wellington addressed this issue in his entertaining TED talk back in 2014.
To sum up, besides publishing datasets openly, we should also aim to use more open data standards. Hopefully these tutorials help you get started with open data. With more data we can get a better understanding of our world. Hans Rosling brilliantly shows this in his book Factfulness. If you haven’t read it yet, put it on your reading list!
Please feel free to share your feedback through the comment section below.