Renate Bartsch on linguistic complexity

I just found a seminal article on complexity written by Renate Bartsch in 1973 (in German). It is a very good summary of the perspective on this topic at the beginning of the 1970s. Generative grammar as a background for research on language was beginning to be criticized, but it was still a landmark and a framework (most notably the reflection on surface and deep structure).

R. Bartsch, “Gibt es einen sinnvollen Begriff von linguistischer Komplexität?” Zeitschrift für Germanistische Linguistik, vol. 1, iss. 1, pp. 6-31, 1973.

Bartsch focuses on three main aspects of the problem to answer this question: does the idea …


Philosophy of technology, how things started: a typology

In my previous post, I presented a few references. I went on reading books and articles on this topic, and I am now able to sort them into several kinds of approaches.

This is mostly thanks to these books in French on philosophy of technology:

  • G. Simondon, L’invention dans les techniques : cours et conférences, Paris: Seuil, 2005.
  • G. Hottois, Philosophies des sciences, philosophies des techniques, Paris: Odile Jacob, 2004.
  • J. Goffi, La philosophie de la technique, Presses Universitaires de France, 1988.
  • G. Hottois, Le signe et la technique : la philosophie à l’épreuve de la technique, Paris: Aubier, 1984 …

Philosophy of technology: a few resources

As I once studied philosophy (back in the classes préparatoires), I like to keep in touch with this kind of reflection. Moreover, in this research field where everything is moving very fast, it is a way to find a few continuities and to ground the specific questions regarding the analysis of language in a more conceptual framework.

Here is a list of texts available on the Internet (some of them only in part) that seem important to me. Some are written in English, some in French or in German, as I chose the original versions.

It does not claim to …


Binary search to find words in a list: Perl tutorial

Given a dictionary, say one of the frequent-word lists of the University of Leipzig, and given a series of words: how can you check which ones belong to the list?

Another option would be to use the smartmatch operator, available since Perl 5.10:

    if ($word ~~ @list) {…}

But this gets very slow as the list grows, since each lookup may have to scan the whole list. I wrote a naive implementation of the binary search algorithm in Perl that I would like to share. It is not especially fast, but it is basic and it works.
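To make the idea concrete before going through the script itself, here is a minimal sketch of a binary search over a sorted word list. It is not the code from this post: the helper name `binary_search` and the toy five-word list are assumptions made for illustration, and the real dictionary would be read from a file. The only requirement is that the list is sorted with the same string comparison (`cmp`) that the search uses.

```perl
use strict;
use warnings;

# Hypothetical helper (an illustration, not the script from this post):
# returns 1 if $word occurs in the sorted array referenced by $list, else 0.
sub binary_search {
    my ($list, $word) = @_;
    my ($low, $high) = (0, $#{$list});

    while ($low <= $high) {
        # Look at the middle of the remaining range.
        my $mid = int(($low + $high) / 2);
        my $cmp = $word cmp $list->[$mid];

        if    ($cmp < 0) { $high = $mid - 1; }   # word sorts before the middle
        elsif ($cmp > 0) { $low  = $mid + 1; }   # word sorts after the middle
        else             { return 1; }           # exact match
    }
    return 0;    # range is empty: the word is not in the list
}

# Toy word list standing in for the real dictionary; it must be sorted
# with the same comparison (cmp) that the search relies on.
my @list = sort { $a cmp $b } qw(und der die das nicht);

for my $word (qw(der haus nicht)) {
    printf "%s: %s\n", $word,
        binary_search(\@list, $word) ? 'found' : 'not found';
}
```

Each lookup halves the remaining range, so the number of comparisons grows with the logarithm of the list size instead of with the list size itself. Passing the list by reference avoids copying it on every call.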

First of all, the word list is read:

my $dict = 'leipzig10000';
open (DICTIONARY, $dict …

Resource links update

I recently updated the blogroll section and I would also like to share a few links:

As I will be teaching LaTeX soon, the LaTeX links section of the blog has expanded.

Last but not least, here is an e-book, Mining of Massive Datasets, by A. Rajaraman and J. D. Ullman. It grew out of classes taught at Stanford and is now free to use (available chapter …


Quick review of the Falko Project

The Falko Project is an error-annotated corpus of German as a foreign language, maintained by the Humboldt Universität Berlin, which made it publicly accessible.

Recently a new search engine was made available, practically replacing the old CQP interface. This tool is named ANNIS2 and can handle complex queries on the corpus.

Corpus

There are several subcorpora, and apparently more to come. The texts were written by advanced learners of German. Most notably, there are summaries (with the original texts and a comparable corpus of summaries written by native speakers) and essays that come from different locations (with the same type of comparable …


Having fun and making money doing research

What do people look for? A few years ago it would have been difficult to gather information on a large scale and grasp it with a powerful, yet more or less objective tool. Nowadays a single company is able to know what you want, what you buy or what you just did. And sometimes it shares a little bit of the data.

So, the end of the year gives me an occasion to try and discover changes in mentalities using the ready-to-use Google Trends. Just for fun…

How does research compare with other interests?

First of all, research is …


Three series of recorded lectures

Here is my selection of introductory courses given by well-known specialists in Computer Science or Natural Language Processing and recorded so that they can be followed at home.

1. Artificial Intelligence | Natural Language Processing, Christopher D. Manning, Stanford University.
More than 20 hours, 18 lectures.
Introduction to the key topics of NLP, summary of existing models.
Lecture 12: Dan Jurafsky as a guest lecturer.
Requires the Silverlight plugin (no comment). Transcripts available.

2. Bits, Harry R. Lewis, Harvard University.
A general overview of information as quantity and quantitative methods.
Very comprehensive lecture (data theories, internet protocols, encryption, copyright issues, laws …


On Text Linguistics

Talking about text complexity in my last post, I did not realize how important it is to take the framework of text linguistics into account. This branch of linguistics is well-known in Germany but is not really regarded as a field in its own right elsewhere. Most of the time, no distinction is made between text linguistics and discourse analysis, although the background is not necessarily the same.

I saw a presentation by Jean-Michel Adam last week, who describes himself as the “last of the Mohicans” to use this framework in French research. He drew a comprehensive picture of its origin …


E. Castello, Text Complexity and Reading Comprehension Tests - Reading Notes

Here is what I retained from my reading of this book:

  • E. Castello, Text Complexity and Reading Comprehension Tests, Bern: Peter Lang, 2008.

Notional framework

To begin with, Castello identifies two types of complexity, and states that research in this field attempts to quantify inherent complexity and receiver-oriented complexity, i.e. complexity or difficulty per se on the one hand, and complexity in terms of the relation between reader and text on the other.

He cites C.J. Alderson and L. Merlini Barbaresi (strangely enough, we are not related, as far as I know) for their definition of linguistic complexity, M. Halliday and T. Gibson …
