About Google Reading Level

Jean-Philippe Magué told me there was a Google advanced search filter that checked the result pages to give a readability estimate. In fact, it was introduced about seven months ago and works to my knowledge only for the English language (that’s also why I didn’t notice it).

Description

For more information, you can read the official help page. I also found two convincing blog posts showing how it works, one by the Unofficial Google System Blog and the other by Daniel M. Russell.

The most interesting bits of information I was able to find consist in a brief …

more ...

A few links on producing posters using LaTeX

As I had to make a poster for the TALN 2011 conference to illustrate my short paper (PDF, in French), I decided to use LaTeX, even if it was not the easiest way. I am quite happy with the result (PDF).

I gathered a few links that helped me out. My impression is that there are two common models, and as I matter of fact I saw both of them at the conference. The one that I used, Beamerposter, was “made in Germany” by Philippe Dreuw, from the Informatics Department of the University of Aachen. I only had to adapt …

more ...

Lord Kelvin, Bachelard and Dilbert on Measurement

Lord Kelvin

Here is what William Thompson, better known as Lord Kelvin, once said about measure:

« I often say that when you can measure what you are speaking about, and express it in numbers, you know something about it; but when you cannot measure it, when you cannot express it in numbers, your knowledge is of a meager and unsatisfactory kind; it may be the beginning of knowledge, but you have scarcely in your thoughts advanced to the state of Science, whatever the matter may be. »
William Thompson, Lecture on “Electrical Units of Measurement” (3 May 1883)

Bachelard

I found …

more ...

Crawling a newspaper website to build a corpus

Basing on my previous post about specialized crawlers, I will show how I to crawl a French sports newspaper named L’Equipe using scripts written in Perl, which I did lately. For educational purpose, it works by now but it is bound to stop being efficient as soon as the design of the website changes.

Gathering links

First of all, you have to make a list of links so that you have something to start from. Here is the beginning of the script:

#!/usr/bin/perl #assuming you're using a UNIX-based system...
use strict; #because it gets messy without …
more ...

Building a basic specialized crawler

As I went on crawling again in the last few days I thought it could be helpful to describe the way I do.

Note that it is for educational purpose only (I am not assuming that I built the fastest and most reliable crawling engine ever) and that the aim is to crawl specific pages of interest. That implies I know which links I want to follow just by regular expressions, because I observe how a given website is organized.

I see two (or eventually three) steps in the process, which I will go through giving a few hints in …

more ...

Workshop on Complexity in Language – Day 2 (report)

I could not follow the whole second day of the Workshop on Complexity in Language (see previous post), but here is what I heard in the morning.

Salikoko Mufwene talked about the emergence of complexity, which he sees as a self-organization process : we don’t plan the way we are going to speak.

He adopts a relativistic perspective speaking of a multi-agent system and asking if the agents are really agentive or if there are triggers of particular behaviors. He likes to consider language as a technology that evolved. At the end of the talk he also tackled the notion …

more ...

Workshop on Complexity in Language - Day 1 (report)

I attended yesterday the first day of a workshop organized by Salikoko Mufwene and held at the ENS Lyon. This “Workshop on Complexity in Language: Developmental and Evolutionary Perspectives” lasts two days: HTML version of the program.

Here is my personal report on what I heard during the first day and on what I found interesting.

Complexity and complexity science

First of all, William S.-Y. Wang referred to Herbert Simon and Melanie Mitchell in particular to define complexity, two approaches that I described on this blog.

Tom Schoenemann talked about the increasing richness, subtlety and complexity of hominin conceptual …

more ...

Halliday on complexity (1992)

Sometimes you just feel lucky : I was reading the famous article by Charles J. Fillmore, “Corpus linguistics” or “Computer-aided armchair linguistics”, in the proceedings of a Nobel symposium which took place in 1991 (it is known for the introducing descriptions of the armchair and of the corpus linguist who don’t have anything to say to each other) as I decided to read the following article. The title did not seem promising to me, but still, it was written by Halliday :

M.A.K. Halliday, Language as system and language as instance: The corpus as a theoretical construct, pp. 61-77 …

more ...

Approaches to philosophy of technology

I held a presentation last week at the Easterhegg conference in Hamburg, which aim was to give a few insights into this topic and a few notions that could explain aspects of the hacker culture.

My talk was entitled Denkansätze zur Philosophie der Technik, as it dealt with approaches to philosophy of technology.

I started with a historical description of technology as a given fact that no one puts into question, then I spoke from the contempt regarding technicians and the difficulty to consider philosophy of technology as a subfield of philosophy.

The main part of my presentation consisted of …

more ...

Simon, Gell-Mann and Lloyd on complex systems

Definition

Herbert A. Simon is one of the first who tried to formalize the notion of a complex system: * H. A. Simon, “The Architecture of Complexity”, Proceedings of the American Philosophical Society, vol. 106, iss. 6, pp. 467-482, 1962.

First of all, here is how he defines it:

« Roughly, by a complex system I mean one made up of a large number of parts that interact in a nonsimple way. In such systems, the whole is more than the sum of the parts, not in an ultimate, metaphysical sense, but in the important pragmatic sense that, given the properties of …

more ...