Books by the Numbers

By Dennis Baron


People judge you by the words you use. This warning, once the slogan of a vocabulary building course, is now the mantra of the new science of culturomics.

In “Quantitative Analysis of Culture Using Millions of Digitized Books” (Michel et al., Science, Dec. 17, 2010), a Harvard-led research team introduces “culturomics” as “the application of high throughput data collection and analysis to the study of human culture.” In plain English, they crunched a database of 500 billion words contained in 5 million books published between 1500 and 2008 in English and several other languages and digitized by Google. The resulting analysis provides insight into the state of these languages, how they change, and how they reflect culture at any given point in time.

In still plainer English, they turned Google Books into a massively multiplayer online game where players track word frequency and guess what writers from 1500 to 2008 were thinking, and why. The words you use tell the culturonomists exactly who you are, and they can even graph the results!

According to the psychologists and mathematicians on the culturomics team, reducing books and their words to numbers and graphs will finally give the fuzzy humanistic interpretation of history, literature, and the arts the rigorous scientific footing it has lacked for so long.

For example, the graph below tracks the frequency of the name Marc Chagall (1887-1985) in English and German books from 1900 to 2000, revealing a sharp dip in German mentions of the modernist Jewish artist from 1933 to 1945. You don’t need a graph to correlate Hitler’s ban on Chagall and his work with the artist’s disappearance from German print (other Jewish artists weren’t just censored by the Nazis, they were murdered), but it is interesting to note that both before and after the Hitler era, Chagall garners significantly more mentions in German books than he does in English ones.
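The kind of count that sits behind a graph like the Chagall one can be sketched in a few lines of Python. This is only an illustration of the general idea (occurrences of a name divided by the total number of words printed that year), not the method used by Michel et al.; the function names and the three-sentence toy corpus are invented for the example.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into word tokens (Unicode-aware)."""
    return re.findall(r"\w+", text.lower())

def relative_frequency_by_year(corpus, phrase):
    """Return {year: occurrences of phrase / total tokens for that year}.

    corpus is an iterable of (year, text) pairs; phrase may be several words.
    """
    target = tokenize(phrase)
    n = len(target)
    hits = defaultdict(int)
    totals = defaultdict(int)
    for year, text in corpus:
        tokens = tokenize(text)
        totals[year] += len(tokens)
        # Count every position where the token window matches the phrase.
        hits[year] += sum(
            1 for i in range(len(tokens) - n + 1) if tokens[i:i + n] == target
        )
    return {y: hits[y] / totals[y] for y in sorted(totals) if totals[y]}

# Invented toy data, for illustration only (not drawn from Google Books).
toy_corpus = [
    (1930, "Marc Chagall painted the village. Critics praised Marc Chagall."),
    (1940, "The exhibition opened without him."),
    (1950, "A new retrospective of Marc Chagall drew crowds."),
]

freqs = relative_frequency_by_year(toy_corpus, "Marc Chagall")
for year, freq in freqs.items():
    print(year, round(freq, 3))
```

Dividing by each year's total word count is what makes years comparable, since far more books were printed in some years than in others. The number itself says nothing about why a name rises or falls; that part is still interpretation.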

One problem with the culturome data set is that books don’t always reflect the spoken language accurately. When the telephone was invented in 1876, Americans adapted hello as a greeting to use when answering calls. Before that time, hello was an extremely rare word that served as a way of hailing a boat or as an expression of surprise. But as the telephone spread across American cities, hello quickly became the customary greeting, first for telephone and then for face-to-face conversation.

Expanding the data set of written English to include not just books but also newspapers, periodicals, letters, and informal writing, as we find in the smaller, 400-million-word Corpus of Historical American English, gives a better idea of the frequency of words like hello. But crunching numbers doesn’t tell the whole story: we can infer from contemporary published accounts, many of them strong objections to the new term, that hello was much more common in speech than its occurrence in writing indicates.

It’s one thing to read a book and speculate about its meaning—that’s what readers are supposed to do. But culturomics crunches millions of books—more than the most ardent book club groupie could get through in a lifetime. Since mos
