State of the Computer Book Market 2008, part 4 — The Languages

In this fourth post (parts one, two and three are found here) on the State of the Computer Book Market, we will look at programming languages and drill in a little on each language area.

Overall the market for programming languages was down 5.9% in 2008 when compared with 2007. There were 1,849,974 units sold in 2007 versus 1,740,808 units sold in 2008, which is a decrease of 109,166 units. So the unhealthy 8% loss in the Overall Computer Book Market was not completely fueled by programming-oriented books.
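For readers who want to check the arithmetic, here is a minimal sketch of the year-over-year calculation using the two unit totals quoted above. The function name `year_over_year` is purely illustrative and is not taken from the queries behind this post.

```python
# Unit totals quoted in the paragraph above.
units_2007 = 1_849_974
units_2008 = 1_740_808


def year_over_year(previous, current):
    """Return (absolute change, percent change) going from previous to current."""
    change = current - previous
    return change, 100.0 * change / previous


change, pct = year_over_year(units_2007, units_2008)
print(f"{change:,} units ({pct:.1f}%)")  # -109,166 units (-5.9%)
```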

Before we begin to drill in on the languages, we thought it would be best to explain our “language dimension.” When we group books by their language dimension, we categorize them by the language used in their code examples. So Flash Programming with Java would be in our Flash atomic category, but the language dimension would be Java. Similarly, our Head First Design Patterns book contains all examples written in Java, so it too carries the “java” tag on the language dimension.   
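One way to picture the language dimension is as a second tag on each book, separate from its topic category. The sketch below is only an illustration: the `Book` class and its field names are hypothetical, and the category shown for Head First Design Patterns is a placeholder, since only its language tag is stated above.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Book:
    title: str
    category: str  # the "atomic category", i.e. the topic of the book
    language: str  # the "language dimension", i.e. the language of the code examples


books = [
    Book("Flash Programming with Java", category="flash", language="java"),
    # Placeholder category: the post only states this title's language tag.
    Book("Head First Design Patterns", category="design patterns", language="java"),
]


def group_by_language(items):
    """Group book titles by the language used in their code examples."""
    groups = defaultdict(list)
    for book in items:
        groups[book.language].append(book.title)
    return dict(groups)


by_language = group_by_language(books)
print(by_language)
```

Both titles land under "java" even though their topic categories differ, which is the point of keeping the two tags separate.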

A Treemap view of the Programming Languages

TM_qtr_py_Prog_Lang.jpg

In the above treemap view, which compares the fourth quarter of 2008 with the fourth quarter of 2007, you’ll notice a couple of bright green areas, namely Objective-C and ActionScript. PHP and C# are dark green and show nice growth over the fourth quarter of 2007 when compared to the rest of the larger languages. Unfortunately, what this does not show is how the size of each box changes over time. We reported last year that Ruby had grown nicely, had passed Perl and Python, and was knocking on the door for Visual Basic’s spot. However, Ruby had the largest decrease in unit sales in 2008. Of the large languages, the following show a healthy growth trend in 2008: C# with 17,397 more units, PHP with 10,896 more units, ActionScript with 23,881 more units, and Python with 11,517 more units.

Last year we reported that C# should surpass Java as the number one language this year. C# is now the largest programming language for all book sales, and that was the case for all of 2008.

If you look at the five-year trend for the languages shown below, you can see that C# has been steadily growing year after year while Java has been going in the opposite direction during the same period. PHP, ActionScript and Python are the other languages going in a positive direction. Ruby, Java, and C++ had the biggest declines in unit sales during 2008, and Ruby dropped out of the top 10 languages.

2008 Market Share


Computer Book Sales 2008 - All Languages 5yrs

Before we dive in, let’s look at the high-level picture for the grouping of languages. As you can see in the table below, the Major and Immaterial languages experienced growth in 2008 while the rest experienced a decline. The languages driving the growth in the Immaterial category are Alice, Haskell and F#. Titles in this category will be moving up as functional languages continue to take off. In the Major group it was Objective-C and Python that carried the group to a positive number compared to 2007.

| Category | Category Unit Range | 2008 Units | 2007 Units | Growth |
|---|---|---|---|---|
| Large | 100,000 – 275,000 | 1,075,317 | 1,173,444 | -98,127 |
| Major | 28,000 – 99,999 | 508,431 | 441,739 | 66,692 |
| Minor | 5,000 – 27,999 | 114,397 | 152,890 | -38,493 |
| Low-Volume | 2,000 – 4,999 | 32,679 | 77,482 | -44,803 |
| Immaterial | 1,000 – 1,999 | 9,950 | 4,392 | 5,558 |
| LineList | Less than 1k | 5,245 | 5,482 | -237 |
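The unit ranges in the table above amount to a simple lookup. Here is a minimal sketch, assuming the boundaries are inclusive and that LineList starts at zero (neither is stated explicitly). `CATEGORY_RANGES` and `classify` are illustrative names, not part of the queries behind this post.

```python
# (category, lowest units, highest units) for 2008, from the table above.
# Assumption: bounds are inclusive and LineList covers 0 to 999.
CATEGORY_RANGES = [
    ("Large", 100_000, 275_000),
    ("Major", 28_000, 99_999),
    ("Minor", 5_000, 27_999),
    ("Low-Volume", 2_000, 4_999),
    ("Immaterial", 1_000, 1_999),
    ("LineList", 0, 999),
]


def classify(units):
    """Return the category name whose unit range contains `units`."""
    for name, low, high in CATEGORY_RANGES:
        if low <= units <= high:
            return name
    raise ValueError(f"{units} units falls outside every category range")


print(classify(271_938))  # Large (C# in 2008)
print(classify(59_530))   # Major (Python in 2008)
print(classify(1_763))    # Immaterial (F# in 2008)
```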

For the sake of grouping and presenting this information in a more readable format, we have classified the categories for the languages in this way with the following headers:

| 1. Language | 2. 2008 Units | 3. 2007 Units | 4. 2008 Titles | 5. 2007 Titles | 6. 08 Mkt Share | 7. 07 Mkt Share |
  1. Name or short name of the language
  2. Units sold in 2008
  3. Units sold in 2007
  4. Number of Titles making Bookscan 3000 in 2008
  5. Number of Titles making Bookscan 3000 in 2007
  6. 2008 Market Share
  7. 2007 Market Share

The following table contains data for the Large languages. As you can see, C#, PHP and ActionScript were the only languages experiencing growth. It is interesting to note that last year we reported that PHP was surprisingly down, yet it rebounded in 2008 and gained about one percentage point of market share. ActionScript joined the Large languages, moving up from the Major language group. The .NET Languages dropped out of the Large category and are now in the Major language group. JavaScript lost ground despite seeing more titles make the top 3000; those JavaScript titles sold fewer units per book.

Large Programming Languages — 100,000 – 275,000 units in 2008

| Language | 2008 Units | 2007 Units | 2008 Titles | 2007 Titles | 08 Mkt Share | 07 Mkt Share |
|---|---|---|---|---|---|---|
| C# | 271,938 | 232,102 | 223 | 178 | 15.58% | 13.60% |
| Java | 211,009 | 241,628 | 316 | 306 | 12.09% | 13.60% |
| PHP | 173,214 | 158,538 | 129 | 103 | 9.93% | 8.86% |
| JavaScript | 172,667 | 203,225 | 142 | 117 | 9.89% | 10.91% |
| C/C++ | 145,926 | 167,344 | 220 | 238 | 8.36% | 9.24% |
| ActionScript | 100,563 | 85,971 | 66 | 41 | 5.76% | 4.84% |
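The point about JavaScript selling fewer units per book can be checked directly from the unit and title counts in the table. This is just a quick sketch of that division; the helper name `units_per_title` is illustrative.

```python
# JavaScript (units, titles) per year, copied from the Large languages table.
javascript = {2008: (172_667, 142), 2007: (203_225, 117)}


def units_per_title(units, titles):
    """Average units sold per title that made the top 3000."""
    return units / titles


per_title = {year: units_per_title(u, t) for year, (u, t) in javascript.items()}
for year in sorted(per_title):
    print(year, round(per_title[year]))  # 2007 1737, then 2008 1216
```

So although 25 more JavaScript titles made the list in 2008, the average title sold roughly 500 fewer units.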

Here are the top titles for the Large languages, and incidentally, the titles and order are the same whether you look at Units sold or Dollars generated:

| Publisher | Title |
|---|---|
| Apress | Pro C# 2008 and the .NET 3.5 Platform |
| Sams | Sams Teach Yourself PHP, MySQL and Apache All in One |
| Friends of Ed | The Essential Guide to Dreamweaver CS3 with CSS, Ajax, and PHP |
| O’Reilly | Head First Design Patterns |
| Peachpit | PHP 6 and MySQL 5 for Dynamic Web Sites: Visual QuickPro Guide |

You’ll notice in the Major languages that Python and Objective-C are the only two languages showing growth when you compare 2008 and 2007. Objective-C has one of the largest market share growths for all languages. It seems as though developers really want to build iPhone and Mac applications; we are not sure what else this growth could be attributed to.

Major Programming Languages — 28,000 – 99,999 units in 2008

| Language | 2008 Units | 2007 Units | 2008 Titles | 2007 Titles | 08 Mkt Share | 07 Mkt Share |
|---|---|---|---|---|---|---|
| .NET Languages | 94,169 | 107,077 | 89 | 89 | 5.40% | 6.10% |
| SQL | 79,722 | 89,289 | 84 | 82 | 4.57% | 5.03% |
| Visual Basic | 72,491 | 99,964 | 152 | 127 | 5.04% | 5.67% |
| Ruby | 61,171 | 95,731 | 69 | 40 | 3.51% | 5.39% |
| Python | 59,530 | 46,028 | 53 | 41 | 3.41% | 2.63% |
| VBA | 55,559 | 67,097 | 60 | 61 | 3.18% | 3.78% |
| Objective-C | 44,616 | 5,509 | 20 | 9 | 2.56% | 0.47% |
| Perl | 28,585 | 37,984 | 41 | 43 | 1.64% | 2.14% |

Here are the top titles for the Major languages.

| Publisher | Title |
|---|---|
| Wrox | Professional ASP.NET 3.5: In C# and VB |
| Addison Wesley | Cocoa |
| O’Reilly | Learning Python |
| Pragmatic | Agile Web Development with Rails |
| Sams | Sams Teach Yourself SQL in 10 Minutes |
| Wrox | Beginning ASP.NET 3.5: In C# and VB |

Minor Programming Languages — 5,000 – 27,999 units in 2008

The news in this category is that Lua, Processing and C had the largest growth in units, and 7 out of 12 languages in the category experienced unit growth. It is interesting to see Lua come out of nowhere and sell a bunch of units. Lua got a boost from the World of Warcraft title below that teaches some introductory Lua and uses the language in its examples.

| Language | 2008 Units | 2007 Units | 2008 Titles | 2007 Titles | 08 Mkt Share | 07 Mkt Share |
|---|---|---|---|---|---|---|
| Transact-SQL | 16,511 | 21,341 | 21 | 16 | .95% | 1.20% |
| Powershell | 12,836 | 13,961 | 16 | 9 | .79% | .74% |
| Lua | 11,155 | 2,367 | 6 | 3 | .64% | .13% |
| C | 10,760 | 4,854 | 29 | 15 | .62% | .27% |
| Shell Script | 10,113 | 11,479 | 17 | 12 | .58% | .65% |
| VBScript | 9,497 | 18,167 | 14 | 16 | .54% | 1.03% |
| Processing | 8,740 | 1,991 | 4 | 3 | .50% | .11% |
| PL/SQL | 8,296 | 7,295 | 23 | 18 | .48% | .41% |
| BASIC | 7,420 | 9,374 | 8 | 8 | .43% | .55% |
| MATLAB | 6,937 | 4,602 | 18 | 15 | .40% | .26% |
| SAS | 6,851 | 6,298 | 17 | 18 | .39% | .35% |
| Groovy | 5,281 | 3,733 | 7 | 3 | .30% | .21% |
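The two claims about the Minor languages (7 of 12 grew, with Lua, Processing and C growing the most) can be reproduced from the unit columns alone. This is a small sketch with the figures copied from the table; the variable names are illustrative.

```python
# (2008 units, 2007 units) for the twelve Minor languages, copied from the table.
minor_units = {
    "Transact-SQL": (16_511, 21_341),
    "Powershell": (12_836, 13_961),
    "Lua": (11_155, 2_367),
    "C": (10_760, 4_854),
    "Shell Script": (10_113, 11_479),
    "VBScript": (9_497, 18_167),
    "Processing": (8_740, 1_991),
    "PL/SQL": (8_296, 7_295),
    "BASIC": (7_420, 9_374),
    "MATLAB": (6_937, 4_602),
    "SAS": (6_851, 6_298),
    "Groovy": (5_281, 3_733),
}

# Unit growth per language, then the growers and the three largest gains.
growth = {lang: u08 - u07 for lang, (u08, u07) in minor_units.items()}
growers = [lang for lang, delta in growth.items() if delta > 0]
top_three = sorted(growth, key=growth.get, reverse=True)[:3]

print(len(growers), "of", len(minor_units), "languages grew")  # 7 of 12 languages grew
print(top_three)  # ['Lua', 'Processing', 'C']
```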

Here are the top titles for the Minor languages.

| Publisher | Title |
|---|---|
| Wiley | World of Warcraft Programming: A Guide and Reference for Creating WoW Addons |
| Dummies | Beginning Programming For Dummies |
| O’Reilly | Visualizing Data: Exploring and Explaining Data with the Processing Environment |
| MIT Press | Processing: A Programming Handbook for Visual Designers and Artists |
| O’Reilly | Classic Shell Scripting |

Low-Volume Languages — 2,000 – 4,999 units in 2008

The news in this category is that 9 out of 12 languages showed growth in 2008 when compared to 2007. AutoLISP and FBML led the pack, but were closely followed by Linden script and Alice. MDX, AppleScript and LaTeX are the only three languages in this grouping that sold fewer units in 2008 than in 2007.

| Language | 2008 Units | 2007 Units | 2008 Titles | 2007 Titles | 08 Mkt Share | 07 Mkt Share |
|---|---|---|---|---|---|---|
| Assembly | 4,474 | 3,762 | 12 | 13 | .26% | .21% |
| Linden script | 4,368 | 2,830 | 5 | 3 | .25% | .16% |
| MEL | 3,181 | 2,386 | 6 | 4 | .18% | .13% |
| Erlang | 2,622 | 2,617 | 1 | 1 | .15% | .15% |
| NXT-G | 2,575 | 1,659 | 1 | 1 | .15% | .09% |
| AutoLISP | 2,478 | 0 | 7 | 5 | .14% | 0% |
| FBML | 2,363 | 0 | 5 | 0 | .14% | 0% |
| MDX | 2,244 | 2,743 | 4 | 3 | .13% | .15% |
| AppleScript | 2,206 | 3,012 | 6 | 6 | .13% | .17% |
| LaTeX | 2,077 | 2,718 | 5 | 6 | .12% | .17% |
| Alice | 2,007 | 751 | 8 | 6 | .11% | .04% |

Here are the top titles for the Low-Volume languages.

| Publisher | Title |
|---|---|
| Pragmatic | Programming Erlang: Software for a Concurrent World |
| Apress | LEGO MINDSTORMS NXT-G Programming Guide |
| Wiley | AutoCAD 2009 & AutoCAD LT 2009 Bible |
| Sybex | Stop Staring: Facial Modeling and Animation Done Right |
| Sybex | Creating Your World: The Official Guide to Advanced Content Creation for Second Life |

Immaterial Programming Languages — 1,000 – 1,999 units in 2008

The following languages all sold between 1,000 and 1,999 units in 2008. These are what I am considering the Immaterial programming languages. It should be noted that in 2009, our Real World Haskell book has already sold as much as the whole Haskell market did in 2008. The noticeable trend with the Immaterial languages is the large growth of F# and NXT.

| Language | 2008 Units | 2007 Units | 2008 Titles | 2007 Titles | 08 Mkt Share | 07 Mkt Share |
|---|---|---|---|---|---|---|
| AWK | 1,971 | 2,572 | 2 | 2 | .11% | .14% |
| F# | 1,763 | 698 | 3 | 2 | .10% | .04% |
| Haskell | 1,491 | 1,268 | 4 | 4 | .09% | .07% |
| Scheme | 1,349 | 1,271 | 7 | 7 | .08% | .07% |
| R | 1,194 | 823 | 3 | 7 | .07% | .05% |
| Tcl | 1,180 | 1,588 | 4 | 5 | .07% | .09% |
| NXT | 1,002 | 0 | 1 | 0 | .06% | 0% |

Here are the top titles for the Immaterial languages.

| Publisher | Title |
|---|---|
| O’Reilly | sed & awk |
| Apress | Expert F# |
| Prentice Hall | Practical Programming in Tcl and Tk |
| Apress | Creating Cool MINDSTORMS NXT Robots |
| MIT Press | The Little Schemer |

LineList Programming Languages — < 1,000 units in 2008

Lastly, the following languages sold fewer than 1,000 units in 2008. Here is the list in alpha order: abap, ada, awd, blitzmax, cl, cobol, cs2, d, delphi, directx, dsl, e, eiffel, fortran, haxe, idl, javafx, jcl, kml, labview, lingo, lisp, m, maxscript, ml, mumps, mysql spl, natural, ocaml, octave, oopic, opl, pascal, pda languages, peoplecode, phrogram, pl/1, qbasic, realbasic, rexx, rpg, s, scratch, smalltalk, spark, sql server, squeak, unknown, unrealscript, windows script, and x++.

So this concludes the Languages view of the State of the Computer Book Market. I hope you enjoyed it. Pay attention to this space, as I will be publishing this information twice a year. Now that we have all the queries, spreadsheets, pivot-tables and systems down, we should be able to update these posts much more easily going forward. If you have anything you would like explored a bit more thoroughly, please leave a comment here and we will see what we can do.

  • http://cluonflux.com/ Alex

    No Scala? Maybe because the major Scala book last year was self-published?

  • Justin

    Graph data about Python is incorrect. Based on your data tables, Python grew by roughly 30%, but your graph shows a drop of 14%. I think you reversed your numerator and denominator here.

  • MySchizoBuddy

    The tree map is a comparison with the same quarter last year. At least that’s what the menu on the treemap says. That’s prolly why the numbers in the tables do not match the treemap.

  • Simon Hibbs

    Very confusing. The bar graph shows Visual Basic having a 2008 market share of around 75,000 units, only a bit above Python, yet in the table of Large languages its share is 147,000 compared to Python’s 59,000. The treemap also shows VB being only slightly larger than Python.

    Simon Hibbs

  • http://skein.tumblr.com Ted Han

    I don’t mean to troll, but isn’t there a rather large and unquantified hole in your data sources?

    If you’re only pulling numbers from Point Of Sale systems, aren’t you neglecting the entire ebook and online sales market? Things that i’d think would be important to account for with a tech-savvy consumer base? I can say from my (yes anecdotal) experience that i have ordered many more books online, in digital and dead-tree formats, than i have purchased off the shelf in the past 12 months.

    A total lack of accounting for that seems like a fatal and catastrophic flaw for any sort of data analysis you’d like to do.

    (It’s hard to find information about how bookscan does its data collection, perhaps it does include ebooks. Amazon.com seems to be tracked, but what about publishers who sell directly like The Pragmatic Press guys?)

  • cytwombly

    Lots of value and insight in all four parts, which is what I’ve come to expect from the “State” series. Thanks for making it available on the free web.

  • MySchizoBuddy

    @Ted
    “Based on data from Nielsen Bookscan, which aggregates point-of-sale data from about 70% of US bookstores, including Amazon, Barnes & Noble, Borders, and many smaller chains and leading independent bookstores, computer book sales”

    Btw you are assuming that once the hole is filled up the data will somehow change significantly. Think of how polls are done. the exact numbers don’t matter what matters is the trend.

  • http://binstock.blogspot.com Andrew Binstock

    Interesting: C/C++ sales are down 22%, but C sales are up in a big way.

  • http://www.mikehendrickson.com Mike Hendrickson

    @Justin The Treemap for languages is comparing the fourth quarter of 2008 with the fourth quarter of 2007. That is why there is a -14% drop for python although for the year there was growth.

    @AndrewBinstock I think part of the reason why C is up is that I asked for it to be separated from C++, because I do not like the lumping of the two together. Should we put all languages that are in a similar family tree together? I think not. So this is part of the reason. In general, as far as I can still tell, C is still heavily used for teaching in colleges and for devices.

    @simon.hibbs Thanks. VB was in the wrong place with the wrong number. I have adjusted it to where it belongs – so it is now correct. I had all of these posts deleted when I was about to post, so I had to re-write them from a hard copy that I had printed before the loss. I apparently did not get VB re-entered correctly, because my print out had 72,491 for VB. Good eye and thanks for spotting it.

    @TedHan I have a fifth posting that will contain a summary of 1-4, Top Authors, and eBooks, from what I can collect. Stay tuned. It is interesting to compare the growth of eBooks with the decline of print.

  • http://blog.purplearth.net Obbie Z

    Interesting article, but….

    Many experienced programmers wishing to learn a new language will download free info from the ‘net. Perhaps an added metric for measuring the “popularity” of a language would be the traffic to languages’ “official” site (e.g. php.net for php, or http://www.php.net/download-docs.php for the php manual). Yes, I know it would be hard to measure, but this is how *I* learn new languages (rather than buying books).

    Another important metric that’d be hard to measure would be traffic in language books at libraries – public and otherwise (schools, universities, etc.)… another source of info for people who don’t buy books.

  • http://www.phpreferencebook.com/ Mario

    Well, it’s good to know that PHP books are still trending upwards, there is still hope for those fighting for attention among the sea of PHP books. There is hope yet.

    On a side note, even if book sales from lulu and other avenues of distribution (such as creative commons pdfs of books) were included, MySchizoBuddy is correct in that it won’t change the overall trending results in a significant way.

  • http://majesticseacreature.com Gregory Brown

    Very interesting data and great visualizations!

    It’s sort of depressing to see Ruby fall out of the top 10, but not very surprising. Although plenty of Ruby books came out in 2008, few really pushed the envelope. It seemed like for a while there was a gold rush to put out anything that carried Ruby / Rails in the title, but the lack of diversity and depth in many of the books that came out left something to be desired. I think that now that the Rails honeymoon is over, authors and publishers will need to work harder to produce something that stands out rather than yet another introductory guide or quickly outdated reference book.

    It’s a shame, because some cool books did come out in 2008. I’ve heard really good things about “Design Patterns in Ruby”, and “Advanced Rails” by Brad Ediger is probably a hidden gem that could have been a big hit if it was actually marketed properly.

    But in 2009, I’m hoping that quality trumps quantity and people become seriously interested in Ruby again. Maybe it’s wishful thinking, but with Ruby 1.9.1 released, there is a lot to talk about. Books like David Black’s “The Well Grounded Rubyist” and the Pickaxe 3 will provide people with much better learning tools for a modern Ruby than what we’ve had over the last couple years.

    Of course, I’m hoping that my book, “Ruby Best Practices” also has an impact of its own, since it provides more of the “how” and “why” rather than simply the “what” component of Ruby programming. But only time will tell. If Ruby book sales never recover their growth rate, it’s not necessarily an indication of declining interest within the dedicated community, but the departure of fad-chasers off to whatever the ‘next big thing’ is.

  • http://dandascalescu.com Dan Dascalescu

    Perl’s book sales dropped in part because Oracle refused to publish two Perl books (one on Catalyst and another one on modern Perl), and laid off chromatic.

  • http://www.modernperlbooks.com/ chromatic

    @Dan, I appreciate the support, but none of those O’Reilly decisions happened in time to affect Q4 2008 sales.

    The sales figures here would be more meaningful if they took into account the frontlist/backlist breakdown of titles, showed the performance-per-unit in topic areas, and included information on publication date… but that’s probably a topic for a post of my own, and not a comment here.

  • Ted Han

    @MySchizoBuddy

    Well that’s the issue. I have no intuition (i would gather that very few people, if any do) as to whether ebook & online sales in aggregate follow the other 70% of the market (and that’s if bookscan is accurate on their coverage of the market). 30% is a significant slice of the pie, and who knows, they could all be standing on their heads for all we know.

    Given that we don’t know anything about that segment of the market, i’m -really- skittish about general conclusions, particularly about language popularity, which may or may not be correlated with the segment of the market that nielsen just happens to cover.

    @MikeH

    I like the data analysis you’re doing. I think it’s fun and interesting to look at data trends. But i’m disappointed that you’re drawing conclusions w/o any discussion of whether your dataset is representative (and what it’s representative of). I’m interested in reading more, but i’d like greater notice of the caveats that should be borne in mind, personally.

  • Tom

    @Ted han: Covering 70% of the market is more than statistically significant. Consider that most polls and marketing studies are done with just a tiny fraction of that. I don’t think any rational person would believe that the remaining 30% would radically skew the data and trends you see in these statistics.

    In other words, I think you are in denial because your favorite programming language is on the decline. :o)

  • http://robbiebow.co.uk/ Robbie Bow

    I like the treemap view, and these figures give pause for thought. SQL taking such a big drop over the year suggests the forces at play are maybe more complex than we are tempted to reduce them to. After all, SQL is pretty much ubiquitous for programmers, no?

  • http://dandascalescu.com Dan Dascalescu

    @chromatic: noted, and s/Oracle/O’Reilly/ # duh

  • MySchizoBuddy

    SystemVerilog is missing from the list. it already has over 20 books on amazon, not to mention all the fpga design books that use systemverilog

  • http://www.buildingblock.com.au Natacha

    Sql Server sells less than 1000 units in 2008 – Really??? Are you sure?

    Since the last time I checked it was still Microsoft’s primary database development platform!

    :D

  • http://www.ffconsultancy.com Jon Harrop

    This is an interesting analysis but there is a lot of room for improvement:

    Firstly, you are still salami slicing data that is now over a year out of date. Using data from this year would be much more interesting.

    Secondly, low-volume books are typically more expensive so it is silly to label them “immaterial” when they make far more profit than most of the books O’Reilly publish.

    Thirdly, responding primarily to the other comments, the statistics presented here cover nowhere near 70% of the (worldwide) computer book market. Books like OCaml for Scientists sell only 30% to the US and under 1% through bookstores.

  • dhughes

    It is true that the sampling is cause for caution in interpreting these results.

    The issue is not the sample size, which seems large enough to me. In fact, you could use a far smaller sample size to arrive at statistically significant results.

    The issue is whether or not the sample is representative of the population.

    One could make a strong prima facie case that online book buyers may be different from those who buy books in stores. So I wouldn’t be surprised if the numbers came out a bit differently. I would be surprised if the overall pattern were radically different. But that’s an empirical question of course.

    Overall I feel confident that these numbers are close enough for anything you’d want to do with them.

  • http://millenniumweb.com ahsan

    Where would ColdFusion fall in these statistics?

  • Hassan

    I don’t know if it’s true that the sampling method wouldn’t strongly affect the results so that the trend isn’t represented. For instance, how many people buy the Ruby Pickaxe book direct from the publisher and so didn’t get counted?

    And I don’t even like Ruby (decent syntax, shocking performance).

    Then how likely do you think it is that the kind of people who write C# or Java code (and maybe even PHP + ActionScript for different reasons) are considerably more likely to buy books than a Perl, C or even Python dev?

    Then there are things like shell scripting. Who needs to buy a book on shell scripting? It’s easy and there are high quality free books on the web where you can learn it from. I’m sure it’s still a fairly minor language, but less than Lua? Seriously, there must be a lot of hopeless sysadmins out there, or maybe they just learned it from the Advanced Bash Script HOWTO like everyone else.

    This is the best thing we have at the moment to measure the trends, no doubt about it, but I am *certain* beyond any reasonable doubt that there IS significant difference between this and certain actual trends. This is a really rough radar, for all it is the best one we have.
