My contribution to the Anglicism of the Year award

I contributed to the nominations for the Anglicism of the Year award. This is the second edition; the first was rather low-key but still got mentioned by the English-speaking press (e.g. by The Guardian). The jury is once again chaired by Anatol Stefanowitsch, a professor of linguistics at Hamburg University. The selection of the final nominees will be relayed by a few German bloggers specializing in linguistics.
My suggestions made it onto the first list of nominees, but no selection has been made so far; this phase runs until January 7th. News can be found on the official blog.

My suggestions are:

  • das Handyticketsystem
  • whistleblowen
  • der Occupist, die Occupisten
  • die Post-Privacy

In my opinion, the latter two have a good chance of advancing to the final stage. Among the other nominees, I like die Fazialpalmierung (facepalm) and die Liquid Democracy. But there are not that many interesting ones, which may be one reason why the deadline was postponed by a week.

I will keep this post up to date.

Updates:

Having fun and making money doing research

What do people look for? A few years ago, it would have been difficult to gather this kind of information on a large scale and to explore it with a powerful yet more or less objective tool. Nowadays, a single company is able to know what you want, what you buy, or what you just did. And sometimes it shares a little bit of the data.

So the end of the year gives me an occasion to try to discover changes in mentalities using the ready-to-use Google Trends. Just for fun…

How does research compare with other interests?

First of all, research is no fun. It used to be requested more often than money and was on a par with work, but things have changed. It still outnumbers fun in the news, though.

[Figure: A few trends regarding research, “Research is no fun”… (Source: Google, worldwide trends).]

People seem to look for money more often than a few years ago; it is the only term that has become more popular, while even work merely remains stable.

A remark: I think the search volume is much bigger now than it was back in 2004; there are also more languages available, and probably more search terms (since the users may ...

Using and parsing the hCard microformat, an introduction

Recently, as I decided to get involved in the design of my personal page, I learned how to represent semantic markup on a web page. I would like to share a few things about writing and parsing semantic information in this format. My intuition is that this is only the beginning, and that there will be more and more formats to describe who you are, what you do, who you are related to, and where you link to, as well as more engines that gather this information.

First of all, the hCard microformat refers to this standard, hCard 1.0.1. For an explanation of what it is, see here; for a general article on microformats, see also Wikipedia.

This information is useful because it is a way to mark up semantic relations, so that named entities are correctly identified, by search engines for instance: Google supports several formats, including hCard, and there are more specialized search engines which aim to gather information such as a contact or a product list from this kind of markup. For a comprehensive list, see here.
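To make the idea concrete, here is a minimal sketch of how such markup could be read, using only Python's standard library. It relies on the fact that hCard marks a contact with the class name "vcard" and its properties with class names such as "fn", "org" or "url". The names HCardParser, parse_hcards and SAMPLE, as well as the small subset of properties handled, are my own assumptions for illustration; this is not a complete hCard implementation, nor one of the dedicated tools.

```python
from html.parser import HTMLParser

# Assumption: a small subset of hCard property class names, enough for a demo.
HCARD_PROPERTIES = {"fn", "org", "url", "email", "tel", "title"}
# Elements without a closing tag are never pushed on the stack.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class HCardParser(HTMLParser):
    """Collects one dict per element carrying the class "vcard"."""

    def __init__(self):
        super().__init__()
        self.cards = []        # finished cards
        self._stack = []       # (tag, is_vcard_root, [(property, text chunks)])
        self._current = None   # card being filled, or None outside a vcard

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        is_root = "vcard" in classes and self._current is None
        if is_root:
            self._current = {}
        props = []
        if self._current is not None:
            for name in classes:
                if name not in HCARD_PROPERTIES:
                    continue
                if name == "url" and attrs.get("href"):
                    # For links, the value is the target, not the link text.
                    self._current.setdefault(name, attrs["href"])
                else:
                    props.append((name, []))
        self._stack.append((tag, is_root, props))

    def handle_data(self, data):
        # Text belongs to every property element that is currently open.
        for _tag, _is_root, props in self._stack:
            for _name, chunks in props:
                chunks.append(data)

    def handle_endtag(self, tag):
        if tag not in [entry[0] for entry in self._stack]:
            return  # stray closing tag, ignore it
        # Pop until the matching open tag (tolerates unclosed inner tags).
        while self._stack:
            open_tag, is_root, props = self._stack.pop()
            for name, chunks in props:
                text = " ".join("".join(chunks).split())
                if text:
                    self._current.setdefault(name, text)
            if is_root:
                self.cards.append(self._current)
                self._current = None
            if open_tag == tag:
                break


def parse_hcards(html):
    """Return a list of dicts, one per hCard found in the HTML string."""
    parser = HCardParser()
    parser.feed(html)
    parser.close()
    return parser.cards


# Hypothetical sample markup, not taken from a real page.
SAMPLE = """
<div class="vcard">
  <a class="url fn" href="http://example.org/">Jane Doe</a>,
  <span class="title">researcher</span> at
  <span class="org">Example Lab</span>
</div>
"""

if __name__ == "__main__":
    for card in parse_hcards(SAMPLE):
        print(card)
```

The sketch keeps only the first value found for each property and ignores nested structures such as "adr" or "n", which a real parser would have to handle.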

Now, if you are interested in parsing microformats, there are several tools. Among them, my pick ...

A short bibliography on Latent Semantic Analysis and Indexing

To go a bit further than my previous post, here are a few references that I recently found to be interesting.

For a definition and/or other short bibliographies, see Wikipedia or, for a change, Scholarpedia, with an article “curated” by T.K. Landauer and S.T. Dumais.

U. Mortensen, Einführung in die Korrespondenzanalyse, Universität Münster, 2009.

G. Gorrell and B. Webb, “Generalized Hebbian Algorithm for Incremental Latent Semantic Analysis,” in Ninth European Conference on Speech Communication and Technology, 2005.

P. Cibois, Les méthodes d’analyse d’enquêtes, Que sais-je ?, 2004.

B. Pincombe, Comparison of Human and Latent Semantic Analysis (LSA) Judgements of Pairwise Document Similarities for a News Corpus, Australian Department of Defence, 2004.

M. W. Berry, S. T. Dumais, and G. W. O’Brien, “Using Linear Algebra for Intelligent Information Retrieval,” SIAM Review, vol. 37, iss. 4, pp. 573-595, 1995.

S. Dumais, Enhancing performance in latent semantic indexing (LSI) retrieval, Bellcore, 1992.

S. Deerwester, S. T. Dumais, G. W. Furnas, T. K. Landauer, and R. Harshman, “Indexing by latent semantic analysis,” Journal of the American Society for Information Science, vol. 41, iss. 6, pp. 391-407, 1990.

G. Salton, A. Wong, and C. S. Yang, “A vector ...

Why I don’t blog on and why I might do so (someday…)

People around me at the lab keep talking about a French institutional blog platform. In fact, it is well known but no one is using it. The website is still fairly new; according to them, it currently hosts a hundred blogs.

The main benefits are visibility and durability as it is institutional, well-referenced and competently maintained.

It is what it claims to be, which is also why I hesitated and finally chose to set up a basic personal website.

  • First, you need to fill out a form to register, which is good as a quality label, but I don’t know how long or how often I am going to blog. I don’t want to request a service I might end up not using.
  • The second reason is that it is very useful for people who do not want to deal with layout issues, but all the pages look much the same apart from background colors and a few images. I think this may be meant to maintain a global coherence across the website.
  • It’s not that international, and that is not what it is meant to be. Most of the articles are in French, and I ...
