Bits of Language: corpus linguistics, NLP and text analytics

My contribution to the Anglicism of the Year award

I contributed to the Anglicism of the Year award nominations. This is the second edition; the first kept a rather low profile but was still mentioned by the English-speaking press (e.g. by The Guardian).

The jury is once again chaired by Anatol Stefanowitsch, a professor of linguistics at Hamburg University. The selection of the final nominees will be relayed by a few German bloggers who specialize in linguistics. I made it onto the first list of nominees, but no selection has been made so far; this phase runs until January 7th. News can be found on the official blog.

My suggestions are:

das Handyticketsystem
whistleblowen …

more ...

Having fun and making money doing research

What do people look for? A few years ago it would have been difficult to gather information on a large scale and examine it with a powerful yet more or less objective tool. Nowadays a single company is able to know what you want, what you buy or what you just did. And sometimes it shares a little bit of the data.

So the end of the year gives me an opportunity to try to spot changes in attitudes using the ready-to-use Google Trends. Just for fun…

How does research compare with other interests?

First of all, research is …

more ...

Using and parsing the hCard microformat, an introduction

Recently, as I decided to get involved in the design of my personal page, I learned how to represent semantic markup on a web page. I would like to share a few things about writing and parsing semantic information in this format. My intuition is that this is only the beginning, and that there will be more and more formats to describe who you are, what you do, who you are related to and where you link to, as well as engines that gather this information.
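To give a rough idea of what writing and parsing such markup can look like, here is a minimal sketch using only Python's standard library. The sample markup, the placeholder name and URL, and the helper names `HCardParser` and `parse_hcard` are all made up for illustration. The sketch relies only on the hCard class names `vcard`, `fn`, `org` and `url`, and it assumes a single, simple card per page.

```python
from html.parser import HTMLParser

# Hypothetical sample markup: the name, organization and URL are placeholders.
SAMPLE = """
<div class="vcard">
  <span class="fn">Jane Doe</span>
  <span class="org">Example Lab</span>
  <a class="url" href="http://example.org/">home page</a>
</div>
"""


class HCardParser(HTMLParser):
    """Collects the fn, org and url properties of the first hCard found.

    Simplification: the end of the card is not tracked, so only pages
    with a single, simple card are handled.
    """

    def __init__(self):
        super().__init__()
        self.in_card = False   # True once an element with class "vcard" is seen
        self.current = None    # property whose text is currently being collected
        self.card = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if "vcard" in classes:
            self.in_card = True
            return
        if not self.in_card:
            return
        # For links, the value of "url" is taken from the href attribute.
        if "url" in classes and attrs.get("href"):
            self.card["url"] = attrs["href"]
            return
        for name in ("fn", "org"):
            if name in classes:
                self.current = name
                self.card[name] = ""

    def handle_data(self, data):
        if self.current:
            self.card[self.current] += data

    def handle_endtag(self, tag):
        # Stop collecting text as soon as any element closes.
        self.current = None


def parse_hcard(html):
    """Return a dict of the hCard properties found in the given HTML string."""
    parser = HCardParser()
    parser.feed(html)
    return {key: value.strip() for key, value in parser.card.items()}


if __name__ == "__main__":
    print(parse_hcard(SAMPLE))
```

A real page would call for a dedicated microformats parser; this sketch is only meant to show that the class names carry the semantics.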

First of all, the hCard microformat points to this standard, hCard 1.0.1. For …

more ...

Why I don’t blog on hypotheses.org and why I might do so (someday…)

People around me at the lab keep talking about a French institutional blog platform named hypotheses.org. In fact, it is well known, but no one is using it. The website is still fairly new; according to the platform, it currently hosts about a hundred blogs.

The main benefits are visibility and durability, since the platform is institutional, well-referenced and competently maintained.

It is what it claims to be, which is also why I hesitated and finally chose to set up a basic personal website.

First you need to fill out a form to register, which is good in terms of …

more ...