Diversifying Readership Through Open Access

By Ros Pyne, Director of Open Access Books and Book Policies, Springer Nature

A few years ago, we did some work looking at the effect of open access (OA) on downloads and citations of scholarly books. Our authors were excited to hear about the impact that OA could have on their work, but the next question was always along the lines of, ‘But where are those extra downloads coming from? Is OA actually helping books to achieve a more diverse audience?’ A survey of book authors’ attitudes to OA that we conducted last year confirmed this concern: we found that reaching a broad readership – and reaching non-academic audiences such as policymakers and practitioners – ranked high among book authors’ motivations. Reaching readers in low-income and lower-middle-income countries (LICs and LMICs) was particularly important to authors who had published an OA book.

OA books are now in their second decade, but we find many authors are still sceptical, or at any rate unsure whether OA is really worth it. Perhaps it seems obvious, or intuitive, that OA expands a book’s readership, but being able to point to evidence for this important benefit can be immensely powerful in making the case for OA to book authors.

We’re at a crucial stage in the policy debate. European funders are starting to engage with OA for books, and Plan S for books is looming on the horizon. We knew the question of equity of access to scholarship was key for many funders, and hoped that evidence that OA was truly supporting this might help tip the balance in favour of greater policy support and funding for OA books.

With this in mind, we set out to find out how OA affected the geographic usage of scholarly books, with the aim of increasing the body of evidence-based research to support a greater understanding of the value of OA to books.

Others have asked this question before. Notably, Ronald Snijder’s 2013 study, based on a sample of 180 books, showed that despite a ‘digital divide’ in discovery and use between poorer and richer countries, OA led to increased proportions of usage in LICs and LMICs. Six years later, we are able to revisit the question using a much larger dataset of OA (and non-OA) books, and to explore it in more detail.

It was important to us to partner with an academic team to ensure that the study would be methodologically rigorous. A couple of years ago, the Collaborative Open Access Research & Development (COARD) team at Curtin University, comprising Cameron Neylon, Lucy Montgomery, and Alkim Özaygen, approached us about collaborating on a follow-up project to our previous work on OA book impact. COARD of course have a stellar reputation for their scholarly research on OA – and in particular, for their expertise in OA books – and when they reached out we knew exactly what project to propose.

We provided COARD with a dataset of almost 4,000 books, including 281 OA books. The resulting white paper, Diversifying readership through open access: A usage analysis for OA books, presents the analysis of that data, digging into the key question of what effect OA has on the geographic usage of books. Those interested in the technical detail can consult the accompanying preprint.

So, what did COARD’s analysis find?

  • OA has a robust effect on the number of downloads, geographical diversity of downloads, and citations of books. Downloads of OA books in the study were on average 10 times higher than those of non-OA books, and citations of OA books were 2.4 times higher on average – an even larger OA effect than we found in our previous research in this area.
  • For every category of book in the sample there is an increase of at least 2.7-fold in downloads for OA books. The effect was seen for all disciplinary groupings, in HSS and STM, across all three years of publication in the dataset, for all types of book (monographs, contributed volumes, and mid-length books) and for every month after publication.
  • OA books in the study had a greater proportion of usage in a wider range of countries. They were downloaded in 61% more countries than non-OA books. Importantly, OA books had higher usage in low-income or lower-middle-income countries, including a high number of countries in Africa. Analysis using the Gini coefficient disparity index showed that OA books have quantitatively greater geographic diversity of downloads.
  • Downloads of OA books from the open web were generally around double those from institutional network points. Of course, we can’t rule out that the open web downloads are simply off-campus downloads from readers who already have institutional access, but the balance between the two, and the fact that the OA books reached so many more countries, do point to a more diverse readership.
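
For readers unfamiliar with the Gini coefficient mentioned above, the following is a minimal sketch of one standard way to compute it over per-country download counts. It is an illustration only: the function name and the download figures are hypothetical, and the white paper's exact computation may differ. A value near 0 means downloads are spread evenly across countries; a value near 1 means they are concentrated in a few.

```python
def gini(values):
    """Gini coefficient of a list of non-negative counts (0 = perfectly even)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one value and a non-zero total")
    # Rank-weighted sum over the sorted values (ranks start at 1).
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical per-country download counts, invented for illustration only.
concentrated = [900, 50, 30, 20]   # most downloads come from one country
spread = [300, 250, 250, 200]      # downloads are shared more evenly

print(gini(concentrated))
print(gini(spread))
```

In this sketch the more evenly spread list yields the lower coefficient, which is the sense in which a lower Gini value indicates greater geographic diversity of downloads.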

OA is, in other words, making a substantial difference to the reach of books and their authors.

We also wanted to understand how we could best support dissemination and visibility of OA books. Just making something free to access doesn’t mean it will be found and read, and publishers play a crucial role in supporting that dissemination. The COARD team looked at the effect of title on the number of downloads, and found that books that contained the names of countries and regions in their title generally showed enhanced usage in those regions, with the effect most apparent for Latin America and Africa. This effect was much stronger for OA books. That is, not only is OA enhancing usage in countries under-represented in global scholarship, it is also enhancing the global usage of scholarship about those countries.

We acknowledge that there are limitations. The analysis doesn’t control for the prestige of the authors, and so we can’t rule out a connection between authors who have access to funding for OA and authors with a high profile and network. Set against this, Springer Nature is a large publisher with extensive reach – our non-OA books are also read widely – and, if anything, we would expect this to diminish the impact of OA, so the striking increase is all the more notable for that.

The biggest limitation, perhaps, is that the study only looks at books published by Springer Nature (under both the Springer and Palgrave Macmillan imprints), and we hope this study will encourage further work of this kind by others. I’m particularly excited by the work being done by the OA eBook Usage Project to establish a datatrust that will support further research and benchmarking in this area. Here’s to making the case for OA and supporting a vibrant OA future for scholarly books.

  1. Unfortunately I am not able to make it for the discussion. But I thought I would just note a couple of things that I hope you will be willing to consider.
    First, of course, it is great that SN is exposing this data for us to analyse – few other commercial publishers are being so open. Doing so allows us all to think carefully about the issues – and also allows us to identify and assess methodological issues in the analysis.

    I noted some concerns about the methodology used in a previous incarnation of this analysis and dataset: https://rupertgatti.wordpress.com/2017/12/11/handle-with-care-pitfalls-in-analysing-book-usage-data/
    Basically those concerns fell into two categories:
    1. the underlying data used
    2. failing to control for other possible explanatory factors
    I remain concerned about both of these, as neither seems to be addressed in this report either.

    1. Underlying data.
    The report uses a metric for OA downloads which adds together the total number of individual chapter downloads for a title. When a book is downloaded in its entirety this is recorded as a download for every chapter – thus a whole book downloaded is recorded by the total number of chapters in the book (which, of course, varies from book to book).
    Data in this form has many difficulties – but one important feature is that changes or differences in download ‘behaviour’ by readers (either by topic, or country, or publication type) will cause differences in the number of downloads recorded.
    An example: if readers of works in countries with less stable internet connections are ‘relatively’ more likely to download the entire book – while readers with more stable internet connections are ‘relatively’ more likely to download individual chapters – then each reader with less stable internet will be recorded as making more downloads than those with better internet. (Why might that be? Well, maybe readers with more reliable internet access just download to their browser the chapter they want to read when they want to read it – while readers with less reliable internet access download the whole book to their device while they have internet access, so that they can read it at their leisure offline.) Thus the observation that African readers make heavier use of OA content ‘may’ just be an artifact of different downloading/reading behaviours of readers with less stable internet access (interacting with the data aggregation method adopted).
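
    The counting effect described in this example can be shown with a small sketch. It assumes only the aggregation rule stated above (a whole-book download is recorded once per chapter); the function name, event labels, and the 12-chapter book are hypothetical and are not taken from the report.

```python
def recorded_downloads(events, chapters_in_book):
    """Count downloads the way the comment describes the metric:
    a whole-book download is logged as one download per chapter."""
    total = 0
    for event in events:
        if event == "whole_book":
            total += chapters_in_book
        elif event == "chapter":
            total += 1
        else:
            raise ValueError(f"unknown event type: {event!r}")
    return total


# Two hypothetical readers of the same hypothetical 12-chapter book.
offline_reader = ["whole_book"]          # grabs everything in one go
online_reader = ["chapter", "chapter"]   # reads two chapters in the browser

print(recorded_downloads(offline_reader, 12))  # prints 12
print(recorded_downloads(online_reader, 12))   # prints 2
```

    Under this rule, one reader who takes the whole book counts six times as much as one who reads two chapters, even though each is a single reader of a single title.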

    2. Other ‘confounding’ explanatory factors
    Given that it often costs authors/institutions/funders quite a lot of money to make a book OA at Springer Nature, there may be a selection bias in the works that are published that way (even amongst all works published by SN). Works from ‘wealthy’ institutions may be more likely to be published OA, more senior/prestigious authors may be more likely to publish OA, works that are thought to have a broader audience may be more likely to be published OA, etc. Not correcting for this means that we cannot be ‘certain’ that OA itself is contributing anything to increased downloads – it may just be that books more likely to be downloaded are also more likely to be ‘self-selected’ for OA publication by the authors. There have been numerous studies highlighting the importance of correcting for author/institution/topic characteristics in the analysis of journal articles – in the vast majority, the correction significantly reduces the measured impact of OA. (https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2269040)
    All this information is available to SN and the researchers – but not (as I understand it) to anybody else, or any other researcher. So the only people presently able to undertake that analysis are internal to SN.

