For years now it’s seemed to me like every fantasy novel, movie, TV show, etc. was titled with the same handful of words. I finally got around to doing some empirical investigation to see if it’s true. I wrote a script to “scrape” a major online bookseller’s list of roughly 2500 “Bestselling” Fantasy titles, split the resulting titles into individual words, and counted them. In addition, I counted up the aggregate appearances of particular “concepts” (i.e. multiple specific words with related meanings). “Darkness” turned out to be the most prominent concept, second only to “book” words (such as “book”, “chronicles”, and “trilogy”).
Like anything else subject to marketing efforts, fantasy fiction titles might be expected to become increasingly similar as they all try to attach themselves to words and concepts that sell well. I’ve been privately joking for years that someday I could write the ultimate best-selling work of fantasy fiction, titled something like “The Fireblood Throne of the Ice Song Sword Dance Crown Stone”. I wouldn’t be a real nerd/dork if I just assumed my impression of the increasing sameness of fantasy fiction titles was true without some scientific verification, though, so I finally got around to doing a real empirical test of the hypothesis.
Amazon.com, sadly, appears to make it nigh impossible to analyze things like this, but the Barnes and Noble website seems to provide a search with usable title data. A few test queries on their website allowed me to work out the proper structure of a query, and I used wget to fetch individual pages of Barnes and Noble’s “Bestselling” Fantasy books. I then used a script written in PHP to extract just the book titles from the results, split them into individual words, and count each word’s appearances. (I had originally intended the script to do the fetching automatically as well, which is easy in PHP, but for a one-off experiment I decided that fetching by hand was quicker.) The results were written out as a simple comma-separated-values text file and imported into a LibreOffice spreadsheet for further analysis and sorting.
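For the curious, the split-and-count step can be sketched roughly like this. It is Python rather than the PHP script I actually used, so treat it as an illustration only: the function names (`split_words`, `count_words`, `to_csv`) and the sample titles are made up, and the real script worked on the fetched result pages rather than a hard-coded list.

```python
import csv
import io
import re
from collections import Counter


def split_words(title):
    """Lowercase a title and split it into words (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", title.lower())


def count_words(titles):
    """Count how often each word appears across all titles."""
    counts = Counter()
    for title in titles:
        counts.update(split_words(title))
    return counts


def to_csv(counts):
    """Render the counts as CSV text, most frequent word first."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["word", "count"])
    for word, n in counts.most_common():
        writer.writerow([word, n])
    return buffer.getvalue()


# Hypothetical titles for illustration, not real results
sample_titles = [
    "Dark Blood: Book 38 of the MoonDragon Vampire Witch Saga Chronicles",
    "Blood Moon: Book One of the Dark Saga",
]
print(to_csv(count_words(sample_titles)))
```

Note that this simple pattern drops digits, so the “38” in a series number never shows up as a word. Text like the CSV this produces is what would then go into a spreadsheet.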
“Further Analysis” consisted of me sorting the list by number of appearances, then inserting rows to count word-combinations. Initially this was just for adding together related singular/plural words (“vampire”/“vampires”), but it quickly expanded to a selection of “concept” combinations. These were all added up “by hand” on the assumption that doing so was quicker than writing and debugging a script to do it would be. As a result, some of the numbers may be off by a few, but I feel confident that the overall ranking is accurate.
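For what it’s worth, the grouping I did by hand could have been scripted along these lines. This is a Python sketch, not something I actually ran: the `CONCEPTS` mapping and the `count_concepts` function are hypothetical, the word lists in the mapping are only examples, and the counts in the usage are made up.

```python
from collections import Counter

# Hypothetical concept groups: concept name -> the specific words it covers.
CONCEPTS = {
    "book": {"book", "chronicles", "trilogy"},
    "vampire": {"vampire", "vampires"},
}


def count_concepts(word_counts, concepts):
    """Sum the per-word counts for every word belonging to each concept."""
    totals = Counter()
    for concept, words in concepts.items():
        totals[concept] = sum(word_counts.get(w, 0) for w in words)
    return totals


# Made-up counts for illustration only
word_counts = {
    "book": 5, "chronicles": 2, "trilogy": 1,
    "vampire": 3, "vampires": 2, "moon": 4,
}
for concept, total in count_concepts(word_counts, CONCEPTS).most_common():
    print(concept, total)
```

Words that belong to no concept (like “moon” here) are simply ignored, which matches the spreadsheet approach of adding summary rows without touching the per-word counts.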
One thing that skews the results a bit is that the Barnes and Noble results’ “titles” often include more than the title of the individual book – namely, they often include the name for the “series” that the book is in. This is actually good for me because the “series” names are also of interest to me, but it does mean that, for example, the words “book”, “chronicles”, and “trilogy” are the most common words (I’ve discarded words like “the”, “and”, and so forth). As an imaginary example, a book “title” from these results might look something like “Dark Blood: Book 38 of the MoonDragon Vampire Witch Saga Chronicles”. This also means that a series with a large number of books in it will inflate the numbers for words in a “series” name, but since my hypothesis is about the overall popularity of certain words in fantasy fiction titles rather than just a count of specific books, I think this effect is appropriate here.
I found some interesting bits from the analysis, but I’ll save that for another post. For now, here are the 21 most used individual words in fantasy fiction titles from my experiment and how many times they appeared in just over 2500 results:
- “Book”: 388
- “Chronicles”: 98
- “Trilogy”: 92
- “Dark”: 91
- “Blood”: 83
- “One”: 81 (I suspect from all the “Book One of…”)
- “Magic”: 76
- “Saga”: 71
- “Moon”: 63
- “Dragon”: 53
- “Vampire”: 51
- “Night”: 47
- “Time”: 47
- “Black”: 47
- “Story”: 45
(Why 21? Because you can’t stop a fantasy-title list before you even get to “blade”, obviously…)
A follow-up post with links to raw data and the most common “concepts”/word-groupings/synonyms in fantasy fiction titles may be found in part 2.