Swiss parliament transcriptions

https://github.com/douglas-watson/parl-scraping

During the OpenData Election Hackdays, a team of fellow hackers and I, in partnership with the newspaper Le Temps, compiled a full-text-searchable index of the transcriptions of all interventions in the Swiss parliament since 1990. The result can be browsed online at parlement.letemps.ch.

Search results

The core of the project is a web scraper, which retrieved all the transcriptions from the official Swiss website and combined each transcription with biographical information about the speaker (mostly party affiliation and canton of origin). The product is a large database of machine-readable transcriptions that researchers can easily reuse with their favorite tools. We produced the data in two formats: CSV and JSON. The CSV data can be imported into R and Tableau, as well as other natural language processing and graphing tools. The JSON data can be imported into ElasticSearch. We additionally set up a Kibana interface for quick searches and analytics, hosted at parlement.letemps.ch. Check out the GitHub page for instructions on using Kibana, on setting up your own ElasticSearch server (including Docker files), and on rebuilding the index with the web scraper.
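To illustrate the shape of that pipeline, here is a minimal sketch in Python of joining transcriptions with speaker biographies and exporting the result as both CSV and JSON. It is not the project's actual code: the field names (`date`, `speaker`, `party`, `canton`, `text`), the function names, and the index name `interventions` are all assumptions made for this example, as is the choice of the newline-delimited format accepted by ElasticSearch's bulk API for the JSON output.

```python
import csv
import io
import json

# Hypothetical column names; the real scraper's schema may differ.
FIELDS = ["date", "speaker", "party", "canton", "text"]


def merge_speakers(transcripts, speakers):
    """Attach party and canton to each transcript, keyed by speaker name."""
    rows = []
    for t in transcripts:
        # Speakers without a biography get empty party/canton fields.
        bio = speakers.get(t["speaker"], {})
        rows.append({
            "date": t["date"],
            "speaker": t["speaker"],
            "party": bio.get("party", ""),
            "canton": bio.get("canton", ""),
            "text": t["text"],
        })
    return rows


def to_csv(rows):
    """Serialize rows as CSV text with a header line (for R, Tableau, ...)."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def to_bulk_json(rows, index_name="interventions"):
    """Serialize rows as newline-delimited JSON for the bulk API:
    one action line followed by one document line per row."""
    lines = []
    for row in rows:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(row, ensure_ascii=False))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    transcripts = [
        {"date": "1990-01-01", "speaker": "Speaker A", "text": "Hello, world"},
    ]
    speakers = {"Speaker A": {"party": "Party X", "canton": "VD"}}

    rows = merge_speakers(transcripts, speakers)
    print(to_csv(rows))
    print(to_bulk_json(rows))
```

Keeping the merge step separate from the two serializers means both output formats are produced from the same rows, so the CSV and JSON exports cannot drift apart.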

More information about the project is available on the hackathon wiki: http://make.opendata.ch/wiki/project:chparlscraping

Visualization on Kibana