1. Finding the Word of the Year

Ammon Shea is a vocabularian, lexicographer, the author of Reading the OED: One Man, One Year, 21,730 Pages and a frequent OUPblog contributor.  In light of our Word of the Year 2009 announcement (WOTY) Ammon has taken a closer look at how WOTY is chosen.  In the post below he reveals the process that led to unfriend being chosen as WOTY 2009.

Every year, at about this time, the New Oxford American Dictionary releases its Word of the Year (WOTY), a combination of solid lexicographic practice and a light-hearted look at the changing face of English today. Since there are quite possibly thousands (or at least dozens) of people out there who wonder “where does the Word of the Year come from?” the following is a brief explanation of what this momentous process entails, and what it does not.

You could be forgiven for thinking that the Word of the Year is chosen by a group of unruly lexicographers, drunk on whimsy and an inflated sense of their own power, who are hell-bent on introducing silly words into English. So let’s see what actually happens.

The candidates for WOTY are drawn from three main sources, each of which reflects a particular strength of Oxford University Press and its unrivaled language research program. The first of these is the Oxford English Corpus, a database of over two and a half billion words drawn from current English the world over. The corpus is fully searchable, allowing the editors to find words that have either entered the language or changed meaning significantly enough to warrant attention. The use of the corpus allows tracking of words, and the examination of the shifts that occur in geography, register, and frequency of use.

The second body of candidates to merit consideration for the WOTY is composed of those that have been “catchworded” (catchworded words are those that have been identified as new or unusual usages by one of the vast number of readers who provide citations of word use for the OED and other Oxford Dictionaries). An editor who is responsible for new words in English combines the catchworded items into a digital database, a sort of mini-corpus, in which individual words can be analyzed by frequency, register, and region.

The third source for potential Words of the Year comes from the various editors at OUP, who are continually keeping tabs on the varieties of English and the ways in which these varieties are changing. These words come from the editors’ own reading, from conversations they’ve had, and from lists of new words taken from one of the numerous dictionaries published by OUP.

Once the preliminary list of words has been collected, it is sent to a group of perhaps 7 or 8 editors, who commence poking at the words with a sharp stick, weeding out those that aren’t in fact new, or that may be new but are not yet widespread enough to be more than a regionalism. The words are all checked to make sure that they do not exist in any current dictionary, and that there is sufficient evidence in the Oxford English Corpus, in various forms of print, and on internet search engines to warrant each one’s inclusion.

This list of words is sent around and winnowed to a short list, which is then itself winnowed to a final list, and from the final list a single word is chosen and accorded the honor of being the Word of the Year.

Although the process of picking the WOTY is quite similar to that of introducing a word into a dictionary, this status does not guarantee that the word will be included in any future reference works. The word in question may be quite widespread today and have fallen entirely from use within a few years. The WOTY is not a popularity contest, nor is it simply the word that has been used more than any other over the past year. It is a forward-looking examination of one small aspect of our language, one in which the Oxford lexicographers take a chance on picking the word that they think represents the use of language today, and that will continue to have an influence.

It can be a tricky business, trying to figure out which words will stick ahead of time, and there is no shame in making an educated guess that turns out to not be as accurate several years hence as it seems now. James Murray famously decided to leave the word appendicitis out of the first edition of the Oxford English Dictionary after receiving advice from William Osler (a famous doctor at Oxford) that it was likely not a word that would ever be in widespread use. A short time later the coronation of Edward VII was delayed after he had to undergo an emergency operation for his appendicitis. Although many people wondered why the word was not in the OED, there was no way that Murray could have made the necessary guess to include it.

The WOTY is an attempt to capture some of the breathtaking fluidity of our language, and to look at its semantic change and inventiveness in real time, through the use of solid research, editorial skill, and intuitive guesswork.

0 Comments on Finding the Word of the Year as of 11/18/2009 9:50:00 PM
2. RT this: OUP Dictionary Team monitors Twitterer’s tweets

Purdy, Director of Publicity

A recent study out of Harvard confirms Twitter is all vanity. This is not a big surprise to the dictionary team at Oxford University Press. OUP lexicographers have been monitoring roughly 1.5 million random tweets since January 2009 and have noticed any number of interesting facts about the impact of Twitter on language usage. For example, the 500 most frequently used words on Twitter are significantly different from the top 500 words in general English text. At the very top, there are many of the usual suspects: “the”, “to”, “as”, “and”, “in”… though “I” is right up at number 2, whereas for general text it is only at number 10. No doubt this reflects the intrinsically solipsistic nature of Twitter. The most common word is “the”, which is the same in general English.
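For readers curious how such a rank comparison might be done, here is a minimal sketch. It is not OUP's actual tooling: the function name word_ranks and the sample sentence are invented for illustration, and a real corpus would need far more careful tokenization.

```python
import re
from collections import Counter


def word_ranks(text):
    """Map each lowercased word to its 1-based frequency rank (1 = most common)."""
    # Crude tokenizer (an assumption): runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    ranked = Counter(words).most_common()
    return {word: rank for rank, (word, _count) in enumerate(ranked, start=1)}


# Invented sample text, not real tweet data.
sample = "the cat and the dog and the bird"
ranks = word_ranks(sample)
print(ranks["the"], ranks["and"])
```

Running the same function over a Twitter sample and over a general-English sample, then comparing the rank of a word such as “I” in each, gives the kind of contrast described above.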

Since January OUP’s dictionary team has sorted through many random tweets.  Here are the basic numbers:
Total tweets = 1,496,981
Total sentences = 2,098,630
Total words = 22,431,033
Average words per tweet = 14.98
Average sentences per tweet = 1.40
Average words per sentence in Twitter = 10.69
Average words per sentence in general usage = 22.09
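
The three Twitter averages above follow directly from the three totals. As a quick check, this short sketch recomputes them; the variable names are ours, and the general-usage figure of 22.09 cannot be rederived because its underlying totals are not given.

```python
# Totals as reported in the post.
total_tweets = 1_496_981
total_sentences = 2_098_630
total_words = 22_431_033

# Each average is one total divided by another.
words_per_tweet = total_words / total_tweets
sentences_per_tweet = total_sentences / total_tweets
words_per_sentence = total_words / total_sentences

print(f"Average words per tweet = {words_per_tweet:.2f}")
print(f"Average sentences per tweet = {sentences_per_tweet:.2f}")
print(f"Average words per sentence in Twitter = {words_per_sentence:.2f}")
```

Rounded to two decimal places, the results match the published figures of 14.98, 1.40 and 10.69.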

Other interesting tidbits include:

Verbs are much more common in their gerund form in Twitter than in general text. “Going”, “getting” and “watching” all appear in the top 100 words or so.

“Watching”, “trying”, “listening”, “reading” and “eating” are all in the top 100 first words, revealing just how often people use Twitter to report on whatever they are experiencing (or consuming) at the time.

Evidence of greater informality than general English: “ok” is much more common, and so is “f***”.

And that is how we roll here at OUP, monitoring new social media and the changes in the English language up to the minute.  Tweet on.

0 Comments on RT this: OUP Dictionary Team monitors Twitterer’s tweets as of 6/4/2009 11:05:00 AM