Archive of links gathered during my PhD thesis:
1 – Linguistics and NLP
General Linguistics
- Glottopedia, the free encyclopedia of Linguistics (project)
- Resource List of the Linguistic Society of America
- The Linguist List
- General Linguistics Internet Resources (Joaquim Llisterri, Universitat Autònoma de Barcelona)
Computational Linguistics
- Natural Language Processing FAQ
- comp.text Frequently Asked Questions (Usenet archive)
- Language Technology World
- Linguistics Computing Resources on the Internet
- Pattern Matching Pointers
- Natural Language Software Registry
- Semantic links
- Wordlists (Moby project)
- Regular compilations of links (Paris 3 University)
Online Articles and Conferences
- Open Access Journals > Linguistics
- Computation and Language on arxiv.org
- Electronically available Papers List (IMS Stuttgart)
- List of Computational Linguistics Conferences
- Mining of Massive Datasets by A. Rajaraman and J. D. Ullman. E-Book.
Lists of CL Blogs
- mendicantbug.com
- aclweb.org
- linguistlist.org
- Christopher Phipps (alias the lousy linguist) Blogger profile
Resources for German
- LINSE, the linguistics portal of the University of Essen
- Quantitative Linguistics Bibliography for German
- DeReWo – corpus-based wordlists, a project of the IDS Mannheim
- German SUBTLEX-DE word frequencies
- Lothar Lemnitzer’s Wortwarte (detection and sampling of neologisms)
Computer Science
- The DBLP Computer Science Bibliography
- The Collection of Computer Science Bibliographies
- Programming Language Research links
- UNIXhelp for users
2 – Corpus Linguistics
General
- David Lee’s Bookmarks for Corpus-based Linguists
- Corpora List (mailing-list)
- Links of the MCL (University of Michigan)
Handling Corpora
- Developing Linguistic Corpora: a Guide to Good Practice (academic project at Oxford)
- Online Text Collections in Western European Literature
- Links on annotation (page in French, international links, collected by Karën Fort - Paris 13)
- TEI (Text Encoding Initiative)
Tools
- Text Analysis Tools on the Digital Research Tools Wiki
- Documentation of the Data Science Toolkit
- TXM, a text/corpus analysis platform
N-Gram models
- N-Gram Extraction Tools
- CMU-Cambridge Statistical Language Modeling Toolkit
- Counting N-grams Over FSMs
- Statistical Language Modeling Toolkit
Conferences
3 – Perl
- The Effective Perler
- One-Line Scripts
- Why does modern Perl avoid UTF-8 by default?
- Understanding Unicode and UTF8 in Perl (slides)
- Devel::SizeMe – Visualizing Perl Memory Use
- Map of CPAN
Crawling
- A survey of Perl modules that make HTTP requests
- Link extraction
- Crawling intro
- Building a specialized crawler
Regex
- (very) comprehensive description (in German)
- troubleshooters.com
- The regular expression \K trick
- Oh Yes You Can Use Regexes to Parse HTML!
Vector space search
- Engine construction (perl.com)
- Description with examples (ibm.com)
4 – LaTeX
General
- The LaTeX Project
- The UK List of TeX Frequently Asked Questions
- LaTeX Wikibook (many languages available)
- Resources on the web (TeX Users Group)
- TeX/LaTeX information, including information for linguists (upenn.edu)
- A (La)TeX encyclopaedia
- Hypertext help and index (nasa.gov)
- LaTeX tips and tricks (blog)
- The comprehensive LaTeX symbol list (PDF)
- LaTeX Befehlsreferenz (Jürgen Weinelt, in German)
- Four steps to better LaTeX documents
- Are the LaTeX margins too big or is it the paper?
Getting started
- The Not So Short Introduction to LaTeX2ε by Tobias Oetiker (translations available)
- Getting Started with LaTeX (D.R. Wilkins, Trinity College, Dublin)
- Getting to grips with LaTeX (tutorials)
- Getting started with LaTeX – a collection of resources
- Writing a thesis in LaTeX
Online compilers
- A compiler by the University of Halle
- An online compiler project at the University of Brno
- Online demo from the ScienceSoft software
- MonkeyTeX (an online environment)
- scribtex.com (another online environment)
BibTeX
- BibTeXing (PS), the original manual by the (co-)author of BibTeX
- Tame the BeaST (PDF), an explanation of the BibTeX format and bst files (N. Markey, ENS Cachan)
- The multibib package
Specific applications
- A Beamer tutorial
- Beamer User Guide (PDF)
- Einführung zur Beamer-Klasse (introduction to the Beamer class, in German)
- ccBeamer: Creative Commons logos for your LaTeX Beamer presentations
- Progress bar for latex-beamer
- Producing posters with LaTeX
- Listings package (PDF manual for this source-code printer)
- Going further: TeX graphics examples
LaTeX for Humanities (and Linguists)
- Using LyX in Humanities Papers (wiki)
- Links about LaTeX and Linguistics (ubc.ca)
- Information for Linguists (ling.upenn.edu)
- LaTeX for Linguists (essex.ac.uk)
- ling-tex Mailing-List
Need help?
- tex.stackexchange.com: a growing user community
- Ling-TeX (mailing-list)
Pages in French
- Une courte (?) introduction à LaTeX 2e (French translation)
- Basics and technical sheets (tutors of the ENS Ulm)
- Groupe francophone des utilisateurs de TeX (GUTenberg), the French-speaking TeX users group
- LaTeX FAQ of the GRAPPA team (Lille 3)
- Tutorials (Xaver Perseguers)
- Information on LaTeX (slides) and some additional packages (Jean-Baptiste Rouquier)
5 – R
- R-bloggers (R news and tutorials)
- CRAN Task View: Machine Learning & Statistical Learning
- Resources to help you learn and use R
- An introduction to R
- Data Mining Algorithms In R (Wikibook, sometimes incomplete or a bit old)
- Data analysis and modeling in R: a crash course
6 – PhD related
- Tough love: An insensitive guide to thriving in your PhD
- How to avoid procrastination during the research phase of my PhD?
- Productivity tips, tricks and hacks for academics
- The disposable academic - Why doing a PhD is often a waste of time
- Academic Research: How can I carry my PhD research more efficiently?
- Writing a thesis in LaTeX
- How to skim through PhD theses
7 – Misc.
- Bash FAQ
- Powerful Command Line Tools For Developers
- Mozilla Thimble (learn HTML and CSS in your browser)
- The GitHub flow
- Git cheat sheet
- Porting to Python 3 and Ubuntu cheatsheet for developers
- The architecture of Battle for Wesnoth