Description

This is the third release of a work in progress collecting political speeches from the German Presidency, Presidency of the Bundestag, Chancellery, and Ministry of Foreign Affairs, mostly from the 21st century.

See the description paper (PDF, in English, 6 pages), which you can use to refer to this corpus (BibTeX entry). The present version was released on the occasion of the LREC 2018 conference; it includes updated texts with metadata encoded in XML format as well as updated visualizations.
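Since the XML schema is not described on this page, a reasonable first step after downloading is to inspect which elements and attributes the files actually contain. The following is a minimal sketch in Python using only the standard library. The sample document and its element and attribute names (`collection`, `text`, `person`, `date`) are hypothetical placeholders for illustration, not the corpus's actual schema.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from io import BytesIO


def summarize_xml(source):
    """Count element tags and attribute names in an XML document.

    `source` may be a file path or a binary file-like object.
    """
    tags = Counter()
    attributes = Counter()
    # "start" events fire when an opening tag is read; attributes are available then.
    for _, element in ET.iterparse(source, events=("start",)):
        tags[element.tag] += 1
        for name in element.attrib:
            attributes[name] += 1
    return tags, attributes


# Hypothetical sample: these names are placeholders, not the real corpus schema.
SAMPLE = """<collection>
  <text person="A" date="2018-01-01"><p>Beispieltext</p></text>
  <text person="B" date="2018-01-02"><p>Beispieltext</p></text>
</collection>"""

if __name__ == "__main__":
    tags, attributes = summarize_xml(BytesIO(SAMPLE.encode("utf-8")))
    print(tags.most_common())
    print(attributes.most_common())
```

Passing the path of a downloaded file instead of the in-memory sample should give a quick overview of the metadata fields available before writing any corpus-specific code.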

If you wish to use the texts, please cite at least the following elements and if possible the permanent URL (http://purl.org/corpus/german-speeches):

Feel free to contact me if you have questions, would like to work on this corpus, or want a particular set of queries to be run on it.

Downloads

Current version

Legacy versions

Visualizations (updated, beta version)

Beyond this point, the pages are in German (navigation should be intuitive, though):

For maintenance reasons, the pages are static: they contain word lists for relevant queries, output as valid XHTML and CSS.
They should display decently on all desktop versions of Firefox, Safari, Chrome and Opera.

Mentions

The mentions below are updated on a regular basis.

Corpus and Computational Linguistics

History and Political Science

Miscellaneous

Changelog

2018-09-28 Refined speaker metadata and text base for the Chancellery.
2018-08-30 Refined text base and updated visualizations.
2018-05-09 Third release, updated text archive.
2012-08-03 First part of the (now outdated) code released: https://github.com/adbar/gps-corpus-builder
2012-03-05 Second release: POS tags, lemmas, XML TEI, keywords.
2011-12-06 Readme and CC BY-SA license added.
2011-09-08 Better visualizations of the speeches and better formatting (title and meta-description, paragraphs).
2011-08-16 Minor bugs corrected, new welcome page in German.
2011-07-25 First release.