This is the third release of a work-in-progress corpus of political speeches from the German Presidency, the Presidency of the Bundestag, the Chancellery, and the Ministry of Foreign Affairs.

The corpus was released on the occasion of the LREC 2018 conference and includes updated texts with metadata encoded in XML.

See the description paper (PDF, in English, 6 pages), which you can cite when referring to this corpus (BibTeX entry).

05/09/18 Third release, updated text archive.
08/03/12 First part of the code released (crawler and corpus builder).
03/05/12 Release of the second version, with POS tags, lemmas, XML TEI, and keywords.
12/06/11 Readme and CC BY-SA license added.
09/08/11 The texts are now numbered in chronological order. Better formatting (title and meta-description, paragraphs).
09/01/11 Better display of the speeches (CSS) and general list.
08/16/11 Minor bugs corrected, new welcome page in German.
07/25/11 First release.