The difficulty score is calculated from a combination of:
- The Automated Readability Index (see the Wikipedia article)
- The percentage of words in the text that are among the 2,000 most frequent words in the language. Most of the word frequency lists are based on movie subtitles and come from the Invoke IT Word Frequency Lists
It's far from perfect, but it seems to work reasonably well for Spanish, English, French and German texts. If anyone knows of a better, easy-to-calculate readability metric that works for multiple languages, please let me know!
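
For anyone curious, the two measures listed above can be sketched roughly as follows. This is only an illustration, not the actual code behind the score: the function names, the regex-based tokenizer and the sentence splitter are all assumptions, and the way the two numbers are combined into a single difficulty value is deliberately left out because it isn't described here. The ARI formula itself is the standard one: 4.71 × (characters / words) + 0.5 × (words / sentences) − 21.43.

```python
import re


def automated_readability_index(text: str) -> float:
    """Standard ARI formula, using a naive tokenizer (an assumption)."""
    # Runs of word characters count as words; this splits "It's" into
    # two tokens, which is a simplification.
    words = re.findall(r"\w+", text)
    # Sentences are approximated by splitting on ., ! and ?
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        return 0.0
    characters = sum(len(w) for w in words)
    return (4.71 * characters / len(words)
            + 0.5 * len(words) / len(sentences)
            - 21.43)


def frequent_word_percentage(text: str, frequent_words: set) -> float:
    """Percentage (0-100) of words found in `frequent_words`.

    `frequent_words` is assumed to be a set of lowercase words, e.g. the
    top 2000 entries loaded from a frequency list for the language.
    """
    words = [w.lower() for w in re.findall(r"\w+", text)]
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in frequent_words)
    return 100.0 * hits / len(words)


if __name__ == "__main__":
    sample = "The cat sat. The dog ran."
    # A tiny stand-in for a real top-2000 list.
    common = {"the", "cat", "dog"}
    print(automated_readability_index(sample))
    print(frequent_word_percentage(sample, common))
```

A higher ARI and a lower frequent-word percentage both point towards a harder text. Note that very short, simple texts can produce a negative ARI.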