Download the Ranked Wikipedia Lists

Last update: April 2015

Here you can download the raw text files used to create the Wikipedia search functionality found on this site. If you find yourself using the site a lot, you may want to download them and play with them offline.

The structure of the files is simple. Each line holds a title, an @ sign, and a score:

Olav V of Norway@100
Archimedes@100
Port Phillip@100
Optimus Prime@100
Merv@100
French people@100
Three's Company@100
...
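
Because the score always follows the final @, one way to pull the two fields apart is to split on the last @ rather than the first. The awk sketch below does that. The function name split_ranked is made up for illustration, and whether any titles in the files actually contain an @ is an assumption rather than something checked here.

```shell
# split_ranked [FILE...]
# Print each "Title@score" line as "score<TAB>title".
# Splits on the LAST "@" so a title containing "@" (an assumed possibility)
# stays intact. Lines without a trailing "@<digits>" are skipped.
split_ranked() {
  awk '{
    i = match($0, /@[0-9]+$/)   # position of the trailing "@score"
    if (i == 0) next            # not in the expected format
    printf "%s\t%s\n", substr($0, i + 1), substr($0, 1, i - 1)
  }' "$@"
}
```

For example, `split_ranked RankedWiki.txt | head` should print the first ten entries as score, tab, title.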

Don't bother trying to open the files in Excel; they're too big. Instead, just run some combination of grep and sed on them to find the entries you're looking for. (Windows users can get grep and sed through the excellent Cygwin environment.)

Since this data comes from Wikipedia and Wiktionary, it is distributed under the Creative Commons Attribution-ShareAlike 3.0 Unported License. In short, you are free to share and remix the work as long as you give attribution and distribute any derivative work under the same license.

Example searches:

# Find the top 10 entries of the form ??E?L??
> grep -iP -m 10 '^..e.l..@' RankedWiki.txt
Heerlen@99
Shellac@98
Gremlin@98
Siedlce@98
Ixelles@98
EHealth@96
Apelles@96
Feedlot@95
Peebles@95
Kremlin@95

# Find the top 10 entries of the form ??E?L?? in "crossword mode" (spaces and punctuation removed)
> sed 's/[^A-Za-z0-9@]//g' RankedWiki.txt | grep -iP -m 10 '^..e.l..@'
Heerlen@99
SheilaE@98
Shellac@98
Gremlin@98
Siedlce@98
Ixelles@98
ThePlay@97
TheBlob@97
TheBled@96
EHealth@96

# Find all people with first name "Ben" and score 100
> grep -iP '^ben \w+@100' RankedWiki.txt
Ben Harper@100
Ben Bernanke@100
Ben Folds@100
Ben Hogan@100
Ben Jonson@100
Ben Roethlisberger@100
Ben Affleck@100
Ben Stiller@100
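
The search above pins the score to exactly 100. To keep everything at or above a threshold instead, one option is to let awk compare the last @-separated field numerically. This is only a sketch: min_score is a hypothetical helper name, and it assumes every line ends in @ followed by an integer score.

```shell
# min_score N [FILE...]
# Print only the lines whose score (the text after the last "@") is >= N.
# The function body runs in a subshell so its variables do not leak.
min_score() (
  min=$1
  shift
  # $NF is the last "@"-separated field; adding 0 forces a numeric comparison.
  awk -F'@' -v min="$min" '$NF + 0 >= min + 0' "$@"
)
```

It can be chained after a search, for instance `grep -iP '^ben \w+@' RankedWiki.txt | min_score 98`.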

# Find the top 10 entries of 15 letters or fewer with a hidden "LOL" in crossword mode
> sed 's/[^A-Za-z0-9@]//g' RankedWiki.txt | grep -iP -m 50 '.lol.+@' | grep -iP -m 10 '^.{1,15}@'
BristolOldVic@99
Philology@98
SpecialOlympics@98
MarcelloLippi@98
Malolos@98
MickeyLolich@98
Vexillology@98
WillOldham@97
RunLolaRun@97
AlOliver@97
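
The crossword-mode searches above share the same sed step, so it can be convenient to wrap them in a small shell function. The following is a sketch rather than part of the original tooling: the name xword and its argument order are invented here, and it uses grep -E instead of -P so that it does not depend on PCRE support (the simple patterns above do not need it).

```shell
# xword PATTERN [LIMIT] [FILE]
# Strip everything except letters, digits and "@", then print the first
# LIMIT entries (default 10) whose stripped title matches PATTERN in full,
# ignoring case. FILE defaults to RankedWiki.txt.
# The function body runs in a subshell so its variables do not leak.
xword() (
  pattern=$1
  limit=${2:-10}
  file=${3:-RankedWiki.txt}
  sed 's/[^A-Za-z0-9@]//g' "$file" |
    grep -iE -m "$limit" "^(${pattern})@[0-9]+\$"
)
```

With this in place, `xword '..e.l..'` runs the same kind of search as the second example, and `xword '..e.l..' 25` raises the limit to 25 entries.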