From the Guardian: "Google creates a tool to probe 'genome' of English words for cultural trends".
"Interest in computational approaches to the humanities and social sciences dates back to the 1950s," said Michel, a psychologist in Harvard's Program for Evolutionary Dynamics. "But attempts to introduce quantitative methods into the study of culture have been hampered by the lack of suitable data. We now have a massive dataset, available through an interface that is user-friendly and freely available to anyone."
When I was an undergraduate doing research on Shakespeare, I pored over a concordance that had been constructed by computer in the late 1960s, and which had given rise to several kinds of textual analysis (including work on some of the authorship disputes). There has long been an interest in that kind of quantitative analysis, and the thought that you can now do it for 5 million books is pretty daunting. Analyzing the patterns of borrowing may be quite analogous to analyzing reticulation in biological networks, I would imagine.
The article presents some interesting factoids, focusing mainly on the appearance of celebrity names. There's this:
"Science is a poor route to fame. Physicists and biologists eventually reached a similar level of fame as actors but it took them far longer," wrote the researchers. "Alas, even at their peak, mathematicians tend not to be appreciated by the public."
Alas.
I suppose somebody could do much more interesting work on spoken English by analyzing NSA recordings...