Curiosities

funny words

beaut, baba, beeves

To get 'beaut', we had to swallow 'baba' and 'beeves' (yes, alas, it is a plural of 'beef'!). Words in nine or more (of 16) dictionaries. [1]

In exactly two dictionaries:

In only one:

funny letters

The word abbé, as it appears in the FRELI word list [2], is written there using HTML entities [3]. Different word lists encode letters outside standard ASCII in different ways.
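As a rough illustration of what decoding involves, the sketch below turns a handful of named HTML entities back into the letters they stand for. The function name decode_entities and the short entity table are assumptions made for this example; they are not taken from FRELI or Webster's, and a real word list may use other entities or a different scheme altogether.

```shell
# Hypothetical helper: decode a few named HTML entities read from stdin.
# The table is deliberately tiny; extend it to match the word list at hand.
decode_entities() {
  sed -e 's/&eacute;/é/g' \
      -e 's/&egrave;/è/g' \
      -e 's/&ntilde;/ñ/g' \
      -e 's/&iuml;/ï/g' \
      -e 's/&amp;/\&/g'   # decode &amp; last so it cannot create new entities
}

# Example: two entity-encoded words, one per line.
printf '%s\n' 'abb&eacute;' 'pi&ntilde;ata' | decode_entities
```

Decoding &amp; last is a small design choice: it keeps text that was itself escaped (such as &amp;eacute;) from being decoded twice.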

The only languages that can comfortably be written with the repertoire of US-ASCII happen to be Latin, Swahili, Hawaiian and American English without most typographic frills. It is rumoured that there are more languages in the world. [4]

For a particular English dictionary [4], it was found that the individual letters {a..z} each appear as separate entries. Webster's dictionaries have followed this practice (of listing individual letters, as well as letters from foreign alphabets, like alpha and beta) for quite some time [5]. I thought it made sense to see how many entries in the word list contain each of these stand-alone letters. In the commands below, $w is the word-list file.

$ for i in $(grep '^.$' "$w"); do echo "$i $(grep -c "$i" "$w")"; done | sort -nrk2 | head -26

e 253599
a 246831
i 214267
r 203284
n 195631
o 189798
s 188871
t 170121
l 156410
u 113642
c 107493
h 101831
d 101512
m 96576
g 82380
p 71336
b 65707
k 61467
y 60780
f 47018
v 41102
w 36484
z 22110
j 20261
x 8156
q 5844
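One caveat about these figures: grep -c reports the number of matching lines, not the number of matches, so a word such as 'beeves' adds one to the tally for e, not three. If total occurrences are wanted instead, a sketch along the following lines might do. The function names are made up for this example, and grep -o, though supported by both GNU and BSD grep, is not required by POSIX.

```shell
# Hypothetical helpers: $1 is a letter, $2 a word-list file (one word per line).

# Number of lines (entries) that contain the letter at least once.
entries_containing() { grep -c -- "$1" "$2"; }

# Number of times the letter occurs in the whole file.
# grep -o prints each match on its own line; tr strips the padding some wc versions add.
total_occurrences() { grep -o -- "$1" "$2" | wc -l | tr -d ' '; }

# Tiny stand-in word list, for illustration only.
tmp=$(mktemp)
printf '%s\n' beaut baba beeves > "$tmp"
echo "e: $(entries_containing e "$tmp") entries, $(total_occurrences e "$tmp") occurrences"
rm -f "$tmp"
```

On the three-word stand-in list this prints "e: 2 entries, 4 occurrences".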

However, one might ask: "Why stop with 26?" Indeed: 

$ for i in $(grep '^.$' "$w"); do echo "$i $(grep -c "$i" "$w")"; done | sort -nrk2 | head -27 | tail -1
é 3898

Continuing on, just a bit:

$ for i in $(grep '^.$' "$w"); do echo "$i $(grep -c "$i" "$w")"; done | sort -nrk2 | head -35 | tail -8
í 2321
á 1525
ï 1137
ó 1047
ä 939
è 879
ö 766
ñ 724

One needs an ñ from time to time in English. Imagine American cooking without the jalapeño, or the habanero (so often written 'habañero' by analogy). Or birthdays without piñatas. [6]

Imagine English romance without these (words and their frequencies):
$ grep é $e|head
fiancée 1121
café 1113
fiancé 1037


Further reading:

[1] It's a real beaut, that baba. A shame about the beeves.
[2] From the FRELI word list
[3] Dealing with HTML entities in FRELI and Webster's 1913 unabridged.

[4] Good ole's ASCII by Roman Czyborra

[5] My 1839 Webster's dictionary does not list letters as separate entries, though my 1913 one does.
[6] Musings on English letter frequencies