HAPLR and the LJ Index
Imitation is the sincerest form of flattery. So, as the author these past ten years of HAPLR (Hennen’s American Public Library Ratings), I am deeply flattered by the recent publication of the LJ Index, because some of my chief critics have finally agreed on the need to evaluate public libraries, although they have embraced different methods. But I am also perplexed, because Ray Lyons and Keith Lance, the LJ Index authors, have previously argued that the task of rating, ranking, or evaluating public libraries cannot, indeed should not, ever be done. Yet now they, and their backers, Library Journal and Baker & Taylor’s Bibliostat, have chosen to do so, without saying much to justify setting aside their prior objections. Nevertheless, competition is a very good thing. I am sure that having a competing rating system will lead to improvements in both systems.
What are the differences between HAPLR and the LJ Index? They are many, but the fundamental difference is that HAPLR includes input measures while the LJ Index does not. The LJ Index looks at only one side of the library service equation; HAPLR looks at both.
Should we compare libraries?
In his 2000 Library Journal article, “Lies, Damn Lies, and Indexes,” Lance asked why “someone had not developed a rating system” before HAPLR. Lance answered that it was because “the prevailing sentiment in public librarianship since at least the early 1980s is that the public library is a creature of the local community to be defined, managed, and evaluated locally.” I responded in the same issue of American Libraries that different restaurants have different menus, but that you expect, and get, a different level of service (and cost) from a five-star restaurant than you do from a fast-food place. I am flattered that Lance liked the idea of using star ratings for the LJ Index, of course. But I am still perplexed about how he knows that the “prevailing sentiment” has changed in the intervening nine years. Where is his evidence? I see none offered, and I doubt that a survey of public librarianship would find that a sea change on the question has arisen these past few years. Go figure.
Back in that 2000 Library Journal article, Lance said that, using valid statistical principles, anyone could and should construct a proper index based on thoughtful deliberation about the things that make for good library service. One should not just use available data, Lance intoned. What the LJ Index delivers, almost ten years later, is, it must be said, more of the same. The LJ Index simply uses the available data. There is no search for new data, no seeking of standards of excellence, none of that. I am perplexed, but as I said back in 2000, go figure!
The LJ Index authors both use and reject library spending as a measure. Go figure! Lyons and Lance would have to agree that spending per capita is highly correlated with the LJ Index’s four output measures, yet they reject any spending measures because they are “inputs.” So I am perplexed when the LJ Index then turns around and bases its comparison categories on spending! Allow me a direct quote from Lyons and Lance in June of 2008:
That is truly perplexing. Are we to understand that the LJ Index is suppressing valid information? What is the reason, again? Is it to protect libraries, taxpayers, or whom, exactly? Have they suppressed information to protect libraries, or protected libraries at the expense of information? Shouldn’t library professionals respect the free flow of information?
Adding to the perplexity is their decision not to use library expenditures, because of problems with differing accounting methods, even as they proceed to use expenditures as the basis for defining peer libraries! There are similar, though unacknowledged, problems with some of the measures they did choose. For example, a crowded library may have no programming space and inadequate space for public access computers. Yet those measures are included while the LJ Index excludes spending, staffing levels, and materials counts, items less susceptible to building configuration. Go figure!
The LJ Index authors chose to give “stars” only to the output side of the library service equation. HAPLR ratings acknowledge the input side as well. I believe that staffing, funding, and collection size have a major impact on library service and belong in the ratings. Lyons, Lance, Library Journal, and Bibliostat apparently do not.
Spending per Capita and the LJ Index Scores
The LJ Index scores are highly correlated with total operating expenditure per capita, as the table below demonstrates. Although statisticians might argue about the interpretation, the correlation of 0.57 for the over $30 million budget category means that roughly a third of the variation in the score (r² ≈ 0.32) moves with the rate of per capita spending. The LJ Index has thus both used and rejected spending as a factor in the rating scheme.
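The relationship between a correlation coefficient and the share of variation it explains can be sketched in a few lines of code. The figures below are hypothetical, invented for illustration only; they are not the actual LJ Index or spending data.

```python
# A minimal sketch of how a correlation coefficient relates to the
# share of variance explained (r squared). All data are hypothetical.

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical per capita spending and index scores for six libraries.
spending = [20.0, 35.0, 41.0, 55.0, 62.0, 80.0]
scores = [400, 900, 700, 1500, 1100, 2100]

r = pearson_r(spending, scores)
print(f"r = {r:.2f}, r^2 = {r * r:.2f}")  # r^2 is the share of variance explained
```

A correlation of 0.57 therefore corresponds to an r² of about 0.32, which is why "a third of the variation" is the defensible reading, not "6 times out of 10."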
LJ Index Spending Categories
I am perplexed that fifty percent of the libraries in the over $30 million spending category got stars. That kind of thing happens in Garrison Keillor’s Lake Wobegon, where all the children are above average, but why does it happen in the LJ Index? Why did half of the libraries in the over $30 million spending category get stars, but only 3% of those in the smaller spending categories? Go figure.
The LJ Index could, of course, have stayed with the population categories that FSCS established when Lance was involved with the data at the national level, something they promised as recently as June of 2008. Lance urged me to stay with the established POPULATION categories for HAPLR. Yet, perplexingly, the LJ Index abandons population categories that are long established and agreed upon in favor of new, arbitrary SPENDING categories. Lyons, Lance, LJ, and Baker & Taylor seem not to have explained that decision. Will they?
Let’s ask some questions about the new and arbitrary spending categories that the LJ Index chose, starting with the results those categories produce.
a) Note that there was a tie in the $5.0M to $9.9M Category.
What have we here? Libraries with budgets that exceed $30 million are star-rated almost half the time. At $10 million, only about a third are star-rated by the LJ Index. At $5 million, the tally is one in five. Below $1 million, it is less than 3%. Why did they choose the spending categories that they did? They could have put every library over $1 million into a single category and arrived at a spending category about the same size as all the others, something they seem to have decided was important for the smaller spending categories. But they did not. Why not?
Had they combined the top four categories to get about the same number of libraries per category, only 43 libraries in the over $1 million spending range would have received stars, compared to 106 under the arbitrary method they chose. Why?
Now it must be said that HAPLR does a bit of the same when listing top ten libraries. The top ten libraries in the over 500,000 population category represent a much larger percentage of that category than the top ten in smaller categories do. There are only 83 libraries in the over 500,000 population category, so the top ten amount to 12% of the total. That is a lot, but compared to the LJ Index’s 48% for its top category, it is relatively mild.
Furthermore, I chose to use the population categories used by the Institute of Museum and Library Services. The LJ Index authors chose their categories arbitrarily, with no explanation of their choice. They could have placed libraries in spending categories that each contained about the same number of libraries. Why didn’t they? You will look in vain on their site for an answer.
I am sure that others will notice that the funding categories contain libraries with vastly different populations served. HAPLR has been berated by both LJ Index authors for not comparing “true peers.” The 87 libraries in the $10 million to $30 million category range in population served from 60,679 to 1,477,156, almost a 24 to 1 ratio. But in the $1 million category, the range from the smallest population (2,143) to the largest (365,685) is more than 170 to 1!
Magnets or Outliers
What do these LJ scores mean? The LJ Index eliminates all libraries with less than 1,000 population and less than a $10,000 budget because this group may contain “outliers.” An outlier is a statistical term for an observation, in this case an output per capita, that is far out of range from the rest of the observations. A library reporting 100 circulations per capita would be an example, since it is hard to imagine every person in a community checking out 100 items per year. It seems that when “outliers” appear in other categories, Lyons and Lance rename them “magnet libraries.” Go figure.
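An outlier screen of the kind described above amounts to a simple per capita cutoff. The sketch below is hypothetical; the library names, figures, and the cutoff of 50 circulations per capita are invented for illustration and are not the LJ Index’s actual screening rules.

```python
# A minimal, hypothetical sketch of a per capita outlier screen.
# The cutoff and all library data are invented, not the LJ Index's rules.

def flag_outliers(libraries, measure="circulation", cutoff=50.0):
    """Return names of libraries whose per capita value exceeds the cutoff."""
    flagged = []
    for lib in libraries:
        per_capita = lib[measure] / lib["population"]
        if per_capita > cutoff:
            flagged.append(lib["name"])
    return flagged

libraries = [
    {"name": "Anytown", "population": 900, "circulation": 95_000},        # ~105 per capita
    {"name": "Midville", "population": 25_000, "circulation": 250_000},   # 10 per capita
    {"name": "Big City", "population": 600_000, "circulation": 9_000_000},  # 15 per capita
]

print(flag_outliers(libraries))  # only the tiny "Anytown" library is flagged
```

Note that such a screen flags the small library, while an equally extreme per capita figure in a larger category would survive it, which is the “magnet library” asymmetry the article describes.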
One of the most perplexing things about the LJ Index is the range of total scores in each spending category. In the $1 million spending category, the top score is over 6,000, while in all the others it never goes above 4,000. But moving just one library out of that $1 million spending category would make the top score fall to 2,700! That will raise some eyebrows. So what does the LJ score mean? Is a score of 6,000 in the $1 million spending category better than, or the same as, a score of 3,000 in a different category? If not, what does the score mean? I know that they describe the score formula on the star library web site. But where is the explanation of how the scores can change so radically under the influence of a single library? Perhaps the fault lies not in the index but in the stars.
Being “on the cusp” has a major impact. Lyons objected to HAPLR because moving from one population category to the next changes a library’s score. Now he helps develop the LJ Index with the same result when moving from one spending category to the next and never a word about it. Go figure.
As just noted, in the $1 million spending category, the top score is 6,000, and moving just one library out of that category would make the top score 2,700! Is a score of 6,000 in one spending category better than, or the same as, a score of 2,000 in a different category? If not, what does the score mean? I am sure that I am not the only one who will ask such questions. When a library scores at the top of a category in HAPLR, the best it can do is the 99th percentile. The total possible HAPLR score runs from 0 to 1,000, not from 0 to infinity, as appears to be true for the LJ Index.
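The contrast between a bounded percentile score and an unbounded ratio-style index can be made concrete. The sketch below uses invented peer data, and neither formula is the exact HAPLR or LJ Index computation; it only illustrates why one scale is capped and the other can be blown open by a single extreme library.

```python
# A hedged sketch contrasting a bounded percentile score with an
# unbounded ratio-to-mean index. Numbers are invented; neither formula
# is the exact HAPLR or LJ Index computation.

def percentile_rank(value, peers):
    """Percent of peer values at or below `value`, capped at the 99th percentile."""
    at_or_below = sum(1 for p in peers if p <= value)
    return min(99, round(100 * at_or_below / len(peers)))

def ratio_index(value, peers):
    """Value expressed relative to the peer mean, scaled by 1,000 (no upper bound)."""
    mean = sum(peers) / len(peers)
    return round(1000 * value / mean)

peers = [5, 8, 10, 12, 15, 60]  # one extreme library in the category

print(percentile_rank(60, peers))  # bounded: can never exceed 99
print(ratio_index(60, peers))      # unbounded: the extreme library dominates
```

Under the percentile approach, removing the extreme library barely changes anyone else’s score; under the ratio approach, it shifts the category mean and therefore every score in the category, which is the “on the cusp” problem described above.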
Why does the LJ Index not use reference data? Don’t we deserve more than the offhanded “that’s for another article,” as they put it in the LJ Index article? We need to be clear: “users of electronic resources” is included as a category in the LJ Index, but reference is not. There should be some very unhappy reference librarians out there. The federal data on which both the LJ Index and HAPLR are based have included reference queries since their inception. The users of electronic resources number is new this year (2006), although it has been in the testing phase for a number of years. How can the LJ Index reject, without any real comment, a measure that has been around for decades and embrace one that is brand new, also without comment? I have long said that measures of electronic use should be incorporated into the HAPLR rating system. I have been reluctant to do so because the electronic use data as reported by libraries are vastly dissimilar. Until very recently Keith Lance, one of the LJ Index authors, agreed, noting that the data were too skewed to justify incorporation into HAPLR. Lance must now believe that the data are no longer too skewed, because he has included them in the LJ Index. He has done nothing that I can see to justify this change. The electronic use numbers are still far more skewed than visits, circulation, or attendance, but the LJ Index uses them without a word of caution. That, it seems, is reserved for small library outliers and reference queries. Go figure, again, please.
Let’s consider the output side of the equation a bit further. Lyons and Lance argue that valid statistical theory requires the use of highly correlated factors. They do not argue that input measures such as spending per capita, staffing per capita, and materials owned per capita are uncorrelated with the four output measures they have sanctioned; they could not, because the correlations are high. The authors provide at least some justification for leaving out spending, but they are silent on why staffing levels, hours of service, and collection size should be excluded. I disagree with those exclusions.
Jim Scheppke, State Librarian of Oregon, recently cited his 1999 Library Journal article, “The Trouble with Hennen.” Back then he said of his ideal rating system: “It ought to measure quality as perceived by library users. That means inputs, not outputs, and that means things like hours, facilities, collections, MLS librarians, children’s services, and public access computers.”
Back then he wanted to rate using INPUTS. Now he says that the LJ Index gets it right by stressing OUTPUTS. Which is it?
He also said back then: “If high FTE staff per 1000 population is good, why is low “cost/circulation” good as well? Those two statistics may actually work against each other.” But there are a lot of libraries that have a high level of staff to serve the public AND a low spending rate per circulation. That, it seems to me, is a measure of efficiency that cannot be captured by looking exclusively at either inputs OR outputs.
Imputation and Premature Use of Electronic Resource Use Measures
The LJ Index leaves out many libraries that did not report data in their 2006 annual reports. Of 9,212 libraries, it rates only 7,115, leaving 2,097 unrated. Some are unrated because they serve a population of under 1,000, which is too small by LJ standards. Others are rejected because they do not have tax support or definable staff positions. Hundreds of libraries are excluded because they did not report visits, circulation, program attendance, or electronic resource use. For many of these libraries the data were imputed: IMLS, the federal agency that collects library data, estimates a missing value by one of a series of methods, most frequently by using prior year data to approximate current year amounts.
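Prior-year imputation, the most common method described above, can be sketched simply. The field names and figures below are hypothetical, chosen for illustration; IMLS’s actual imputation procedures are more elaborate than this carry-forward.

```python
# A minimal, hypothetical sketch of prior-year imputation: filling a
# missing value in this year's report from last year's. Field names
# and figures are invented; IMLS's actual methods are more elaborate.

def impute_from_prior_year(current, prior, fields):
    """Fill missing (None) fields in the current report from the prior report."""
    imputed = dict(current)
    imputed["imputed_fields"] = []
    for field in fields:
        if imputed.get(field) is None and prior.get(field) is not None:
            imputed[field] = prior[field]
            imputed["imputed_fields"].append(field)
    return imputed

report_2006 = {"visits": None, "circulation": 120_000}
report_2005 = {"visits": 45_000, "circulation": 115_000}

filled = impute_from_prior_year(report_2006, report_2005, ["visits", "circulation"])
print(filled["visits"])          # 45000, carried over from 2005
print(filled["imputed_fields"])  # ['visits']
```

The point of flagging the imputed fields is exactly the issue raised here: a rating built on imputed figures is a rating built partly on last year’s guesses, and excluding such libraries outright is the alternative the LJ Index chose.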
A lot of libraries were left out of the LJ Index because they did not report electronic use sessions in 2006, the first year that the data were reported nationally in this form. Many libraries did not report electronic use because of problems with the definition of such use. Almost a third of the libraries in the largest (over $30 million) and smallest ($10,000 to $49,999) spending categories were omitted because their electronic use figures were either missing or imputed. When that is rectified, 7 of the 15 libraries in the over $30 million category will probably be supplanted by newcomers. Cuyahoga will probably zoom to the head of the five-star class, the same place it holds in HAPLR. Toledo and Seattle will supplant four-star libraries. Cincinnati and King County will supplant libraries in the three-star category.
BIX, the German Version of Public Library Ratings
In 1999, the same year that HAPLR was introduced, BIX, the German rating system, was launched. It was sponsored by the Bertelsmann Foundation and the German Library Association. Over 260 libraries of all sizes have decided to participate in the project. Since July 2005, BIX has been organized by a network of partners within the Library Network of Expertise. Funding is provided by the participating libraries via an annual fee of 170 euros per library.
Lance and Lyons reject the use of input measures in a rating system, but not so the BIX. As the graph below demonstrates, the BIX is just over half input measures, HAPLR is just over a third, and the LJ Index relies on inputs not at all.
Weighting of the Measures
Lyons and Lance object that HAPLR gives more weight to some measures than to others. Yet by not weighting their factors, they are making a value judgment as well: leaving the factors unweighted makes them equal in importance in the LJ Index ratings. That raises the question: is a library visit truly equal in value to a circulation, an electronic resource session, or attendance at a program? They need to address this perplexing question, and they have not done so. The table below indicates the relative weights assigned to possible measures by HAPLR, BIX, and the LJ Index. HAPLR assigns 38% of the weight to input measures, BIX assigns 56%, and the LJ Index assigns 0%.
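The claim that leaving factors unweighted is itself a weighting choice can be shown directly: an unweighted composite is simply a weighted composite with all weights equal. The measure names, scores, and weights below are hypothetical, not the actual HAPLR, BIX, or LJ Index values.

```python
# A minimal sketch showing that an "unweighted" composite is just a
# weighted composite with equal weights. Measures, scores, and weights
# are hypothetical, not actual HAPLR, BIX, or LJ Index values.

def composite(measures, weights=None):
    """Weighted average of per-measure scores; equal weights if none given."""
    if weights is None:
        weights = {name: 1.0 for name in measures}
    total_weight = sum(weights.values())
    return sum(measures[name] * weights[name] for name in measures) / total_weight

scores = {"visits": 80, "circulation": 40, "programs": 90, "e_use": 30}

unweighted = composite(scores)
weighted = composite(scores, {"visits": 1, "circulation": 3, "programs": 1, "e_use": 1})

print(round(unweighted, 1))  # equal weights: a visit counts the same as a circulation
print(round(weighted, 1))    # tripling circulation's weight changes the ranking inputs
```

Choosing `weights=None` is not neutrality; it asserts that every measure matters equally, which is precisely the value judgment the LJ Index makes without saying so.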
To those, like Lyons and Lance, who argue that the measures in a rating system ought neither to be weighted nor include input measures, I point to BIX, the German public library rating system that does both. Its success may be measured by the fact that hundreds of libraries pay annually to be rated using the BIX methodology. I long ago conceded that HAPLR is not an index but a rating system of my own design. The HAPLR system does not simply develop scores for libraries. It offers a variety of reports that compare a library’s performance to that of comparably sized libraries in its state and in the nation. Over the years, thousands of libraries have used standard or specialized reports to evaluate current operations and chart future courses of action. I am pleased that many libraries have improved their funding and service profiles with these reports. What about the LJ Index will inspire anyone to provide better financial support for libraries?
In the end, I believe that competition will make both of our endeavors better, and I welcome the LJ Index.