I recently had to deal with a series of files with different encodings in the same corpus, and I would like to share the solution I found in order to try to convert automatically all the files in a directory to the same encoding (here UTF-8).
I first tried to write a script in order to detect and correct the encoding, but it was everything but easy, so I decided to use UNIX software instead, assuming these tools would be adequate and robust enough.
I was not disappointed, as
file for example gives relevant information
when used …