For a project I took on, I needed a JSON object containing the point values of Scrabble tiles, across 8 dictionaries (6 languages). The problem was that I couldn't find a good source for the data. I have since found a far easier way to obtain it, but this was still a fun challenge.

The best I could do at the time was http://en.wikipedia.org/wiki/Scrabble_letter_distributions.

The data was laid out as shown below, and it didn't look very friendly.


English-language editions of Scrabble contain 100 letter tiles, in the following distribution:
  • 2 blank tiles (scoring 0 points)
  • 1 point: E ×12, A ×9, I ×9, O ×8, N ×6, R ×6, T ×6, L ×4, S ×4, U ×4
  • 2 points: D ×4, G ×3
  • 3 points: B ×2, C ×2, M ×2, P ×2
  • 4 points: F ×2, H ×2, V ×2, W ×2, Y ×2
  • 5 points: K ×1
  • 8 points: J ×1, X ×1
  • 10 points: Q ×1, Z ×1

With a little inspection, I was able to identify the formatting applied to the required elements.

Enter Google Chrome plus the jsshell extension. Jsshell allows you to run jQuery on any page. Now isn't that somethin' :-).

So after a few (debatable) minutes I came up with the following

In other words:

  1. For each of the six languages specified, find the span with an id equal to the language.
  2. Find the first list following the parent of the identified span.
  3. For each bold letter in the list, take the integer value of the italicized text that is a sibling of that letter and assign it to that letter's index in that language's array.
    • For example, in the English dictionary, Q and Z (bold) are 10 points (italicized). The integer value of "10 points" is, of course, 10.
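
Step 3 can be illustrated without the page's DOM. The sketch below is not the jQuery snippet itself. It is plain JavaScript that assumes the distribution has already been copied out as text lines in the layout shown earlier, and `parseTileValues` is a hypothetical helper name made up for this illustration. It leans on the same trick as the steps above: `parseInt` stops at the first non-digit, so "10 points" becomes 10.

```javascript
// Hypothetical helper (not the original jQuery code): turns lines such as
// "10 points: Q ×1, Z ×1" into a { letter: points } map.
function parseTileValues(lines) {
  const values = {};
  for (const line of lines) {
    // Capture the score label ("10 points") and the letter list after the colon.
    // Lines without that shape (the heading, the blank-tile line) are skipped.
    const match = line.match(/(\d+ points?):\s*(.+)$/);
    if (!match) continue;
    // parseInt reads leading digits only, so "10 points" yields 10.
    const points = parseInt(match[1], 10);
    for (const entry of match[2].split(",")) {
      // Each entry looks like "Q ×1"; the letter is the part before "×".
      const letter = entry.split("×")[0].trim();
      if (letter) values[letter] = points;
    }
  }
  return values;
}

// The English distribution, as listed above.
const english = parseTileValues([
  "English-language editions of Scrabble contain 100 letter tiles, in the following distribution:",
  "  • 2 blank tiles (scoring 0 points)",
  "  • 1 point: E ×12, A ×9, I ×9, O ×8, N ×6, R ×6, T ×6, L ×4, S ×4, U ×4",
  "  • 2 points: D ×4, G ×3",
  "  • 3 points: B ×2, C ×2, M ×2, P ×2",
  "  • 4 points: F ×2, H ×2, V ×2, W ×2, Y ×2",
  "  • 5 points: K ×1",
  "  • 8 points: J ×1, X ×1",
  "  • 10 points: Q ×1, Z ×1",
]);

console.log(JSON.stringify(english));
```

Here `english.Q` and `english.Z` both come out as 10, and the blank tiles are left out because that line has no "points:" label followed by a letter list.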

The result,

Lines of code: 10