• May 25, 2013


Is There an Open-Access Citation Advantage?

October 19, 2010, 4:37 pm

It’s Open Access Week, a worldwide effort to promote open access “as a new norm in scholarship and research,” organized by the folks at the Scholarly Publishing and Academic Resources Coalition, or Sparc. Many institutions are hosting related events this week, including workshops for graduate students and faculty members interested in open-access publishing models. (See, for instance, this how-to session organized by libraries at the University of Minnesota-Twin Cities. You can note your own event on ProfHacker.)

A key point in debates about open access centers on whether, and how much, authors benefit from open-access publication. Advocates such as Stevan Harnad maintain that such publication creates a citation advantage, meaning that openly available articles are cited more frequently. Mr. Harnad holds a research chair in cognitive science at the University of Quebec at Montreal and is a professor of cognitive science at the University of Southampton. He is also one of the co-authors of a paper, released yesterday by PLoS One, titled “Self-Selected or Mandated, Open Access Increases Citation Impact for Higher Quality Research.” Among the co-authors is Yassine Gargouri, one of his colleagues at Montreal.

The paper reports the findings of an investigation the co-authors conducted into whether authors got a citation boost when they voluntarily made copies of their articles available rather than being required to do so under an open-access, or OA, policy. As a news release about the paper puts the question, are open-access articles “more likely to be cited because they were made OA, or were they made OA because they were more likely to be cited?”

Skeptics have argued that self-archiving of articles leads to more citations because authors are likely to make their best work available. As the paper’s title suggests, however, the investigators concluded that it was open access that most affected citations, not whether the open archiving was voluntary or required. According to their findings, open access created a “quality advantage” because users decided “what to use and cite, freed by OA from the constraints of selective accessibility to subscribers only.”

How reliable is that conclusion? When a preprint of the article circulated earlier this year, there was a lively debate about its merits on the Scholarly Kitchen blog. Philip M. Davis, a postdoctoral associate in the department of communication at Cornell University, identified what he considered a number of idiosyncrasies in the investigators’ approach. “Their unorthodox methodology,” he wrote, “results in some inconsistent and counterintuitive results that are not properly addressed in their narrative.” Other scholars weighed in on the issue in the comments thread.

Mr. Davis is now three years into his own research project on open access and citation rates. The Chronicle asked him for his reaction to the final version of the paper by Mr. Harnad et al. Describing it as a piece of advocacy more than research, he reiterated his concerns about the methodology used. “As a result, its conclusions—that ‘the OA advantage is real, independent and causal’—greatly overstate its findings,” Mr. Davis said via e-mail. He added that “none of the journals in our study are showing any citation effect as a result of open access.”

The debate continues. If you’d like to weigh in, please do so in the comments.


17 Responses to Is There an Open-Access Citation Advantage?

gahnett - October 19, 2010 at 5:53 pm

I would say there is a Google-Access Citation advantage…although I wouldn’t agree that that is a good thing…

stevanharnad - October 19, 2010 at 9:55 pm

Correlation, Causation, and the Weight of Evidence

Jennifer Howard seems to have missed the point of our article. It is undisputed that study after study has found that Open Access (OA) is *correlated* with a higher probability of citation. The question our study addressed was whether making an article OA causes the higher probability of citation, or the higher probability causes the article to be made OA.

The latter is the “author self-selection bias” hypothesis, according to which the only reason OA articles are cited more is that authors do not make all articles OA: only the better ones, the ones that are also more likely to be cited.

The Davis et al study tested this by making articles OA randomly, instead of letting the authors do it self-selectively, and they found no increased citation for the OA articles one year after publication.

But almost no one finds that OA articles are cited more a year after publication. The OA citation advantage only becomes statistically detectable after citations have accumulated for 2-3 years.

Even more important, Davis et al. did not test the obvious and essential control condition in their randomized OA experiment: They did not test whether there was a statistically detectable OA advantage for self-selected OA in the same journals and time-window. You cannot show that an effect is an artifact of self-selection unless you show that with self-selection the effect is there, whereas with randomization it is not. All Davis et al showed was that there is no detectable OA advantage at all in their sample; randomness and self-selection have nothing to do with it.

Davis et al released their results prematurely. We are waiting to hear what Davis finds after 2-3 years, when he completes his doctoral dissertation. But if all he reports is that he has found no OA advantage at all in that sample and that interval, rather than an OA advantage for the self-selected subset and no OA advantage for the randomized subset, then again, all we will have is a failure to replicate the effect that has now been reported by many other investigators, in field after field, often with far larger samples than Davis et al’s.

Meanwhile, our study was similar to that of Davis et al, except that it was a much bigger sample, across many fields, and a much larger time window — and, most important, we *did* have a self-selective matched-control subset, which *did* show the usual OA advantage. Instead of comparing self-selective OA with randomized OA, however, we compared it with *mandated* OA — which amounts to much the same thing, because the point of the self-selection hypothesis is that the author picks and chooses what to make OA, whereas if the OA is mandatory (required), the author is not picking and choosing, just as the author is not picking and choosing when the OA is imposed randomly.

And our finding is that the mandated OA advantage is just as big as the self-selective OA advantage.

As we discussed in our article, if someone really clings to the self-selection hypothesis, there are some remaining points of uncertainty in our study that self-selectionists can still hope will bear them out: Compliance with the mandates was not 100%, but 60-70%. So the self-selection hypothesis has a chance of being resurrected if one argues that now it is no longer a case of positive selection for the stronger articles, but a refusal to comply with the mandate for the weaker ones. One would have expected, however, that if this were true, the OA advantage would at least be weaker for mandated OA than for unmandated OA, since the percentage self-archived under a mandate is almost three times the percentage (5-25%) that is self-archived self-selectively. Yet the OA advantage is undiminished with 60-70% mandate compliance. And we have since extended the 2002-2006 study by three years, to 2009; the compliance rate rises by another 10%, but the mandated OA advantage remains undiminished.

The other way of saving the self-selection hypothesis despite our findings is to argue that there was a “self-selection” artifact in terms of which institutions do and do not mandate OA: Maybe it’s the better ones that self-select to do so. There may be a case to be made that one of our four mandated institutions — CERN — is an elite institution. (It is also physics-only.) But we re-did our analysis removing CERN, and we got the same outcome. Even if the objection of eliteness is extended to Southampton ECS, removing that did not change the outcome either. We leave it to the reader to decide whether our remaining two mandating institutions — University of Minho in Portugal and Queensland University of Technology in Australia — count as elite institutions, compared to other universities. It is a historical fact, however, that these four institutions were the first in the world to elect to mandate OA.

We will not speculate on the reason some might still wish to cling to the self-selection bias hypothesis in the face of the evidence to date. We point out only that it seems to be almost a matter of common sense that making articles more accessible to users also makes them more usable and citable — especially in a world where all researchers are familiar with the situation of arriving at a link to an article that they would like to read (but their institution does not subscribe), so they are asked to drop it into the shopping cart and pay $30 at the check-out counter. I would say that that was the default hypothesis, based on both plausibility and the weight of the evidence, and that the burden of proof is now on the advocates of the alternative.

Davis, PM, Lewenstein, BV, Simon, DH, Booth, JG, & Connolly, MJL (2008) Open access publishing, article downloads, and citations: randomised controlled trial. British Medical Journal 337: a568
http://www.bmj.com/content/337/bmj.a568.full

Gargouri, Y., Hajjem, C., Lariviere, V., Gingras, Y., Brody, T., Carr, L. and Harnad, S. (2010) Self-Selected or Mandated, Open Access Increases Citation Impact for Higher Quality Research. PLOS ONE. http://www.plosone.org/article/info:doi/10.1371/journal.pone.0013636

Harnad, S. (2008) Davis et al’s 1-year Study of Self-Selection Bias: No Self-Archiving Control, No OA Effect, No Conclusion. Open Access Archivangelism July 31 2008

richardtaborgreene - October 20, 2010 at 8:08 am

There are KINDS of literature review and some kinds are nearly entirely googled while others require a great deal more precision. IF IF IF I can get away with mere googling I always do so, and what does not admit of that treatment, I reluctantly, with complaint and lamentation, do via paying outrageous rates to commercial publishers who have bought up all the academic journals.

What AMUSES me greatly is I can recognize in the references sections of papers of others, outstanding mediocre papers that we all use because they cover nearly the same ground as an outstanding paper hidden in an overpriced snobby piece of unread elitism by Elsevier for example. Love it.

jabberwocky12 - October 20, 2010 at 8:10 am

I’m afraid that “The Scholarly Kitchen” is known for being biased against open-access. Superficially, the writers might tip their hats at it, but the objections filter through. They may say something reasonable, and then follow it with a “but” and then posit an unreasonable statement or rhetorical question leaving the reader questioning open-access models.

Please don’t refer to them, unless you add a disclaimer.

drgunn - October 20, 2010 at 10:16 am

To be fair, Stevan Harnad is as much an advocate for OA as Phil Davis et al. are against it, but Stevan does appear to have published data on his side. The comments above from richardtaborgreene imply a likely mechanism for this, as I’ve seen, and practiced, that kind of behavior myself.

mbelvadi - October 20, 2010 at 1:47 pm

Richardtaborgreene, haven’t you heard of Interlibrary Loan? Even if you aren’t affiliated with a university, your local public library should also be able to get any scholarly article for you for a processing fee far less than the publisher’s online fee! Or are you too impatient to wait for ILL? Most academic libraries are now processing ILL electronically, and can often get articles within the same day, or in a couple of days at worst.

Mr. Harnad, this is clearly outside the scope of your research, but I wonder to what extent people are finding open access (and thus citing it) more because they’re using non-library search tools (like Google Scholar) off-campus and stopping at paywalls when in fact with the appropriate use of library tools like the LibX toolbar, they might have gotten past the paywalls in just a click or two. That is, how often they COULD have had the non-OA article immediately for free if they just had better skills (or their libraries did a better job of offering the right tools). Combined with my previous para about ILL, I do wonder how much of the OA effect is actually illustrating a failure of libraries, not of the business model.

drgunn - October 20, 2010 at 4:02 pm

mbelvadi – Those are very interesting suggestions. I know many librarians work very hard to educate the researchers at their institutions, but I also know that some of them wonder if it’s a Sisyphean task to teach people the current right way to do something when a shorter and easier way exists.

Given that the shorter and easier way also has significant advantages, not only in easing the burden of libraries, but in improving the efficiency of the entire research process, perhaps the easier way should become the new right way?

stevanharnad - October 20, 2010 at 4:41 pm

In response to mbelvadi: Yes, some users are no doubt missing some articles to which their institution has licensed access, but that’s just a minor problem, and even if all users were proficient in accessing their institution’s licensed content, that would still only amount to a small fraction of the at least 2.5 million articles published annually in at least 25,000 refereed journals across all fields: No institution can afford all, most or even many of those journals. Nor is ILL (interlibrary loan) a solution: institutions can’t afford every paper a researcher wants to read that way either (not to mention the cost of the grotesque delay in the click-through age).
Yes, users rely on the easy (and efficient) tools (like google and google scholar) and — more important — on the accessible content. That means the OA content (for everything to which the user’s institution lacks licensed access). And that’s the self-selective factor behind the OA impact advantage.
And the universal remedy is within reach; all universities need do is grasp it, at long last: Mandate OA.
(Let’s hope this year’s OA Week Mandate Challenge hastens that optimal and inevitable day.)
A hyperlinked version of my prior post is available at:
http://openaccess.eprints.org/index.php?/archives/772-guid.html

sk_griffhoven - October 20, 2010 at 9:51 pm

I have read the paper and find the analysis rather inelegant, repetitive and obtuse. More importantly, the authors were unable to control for institutional effects in their model. While deposit mandates might be responsible for the results they report, they might not, and I don’t see how mandates would outperform self-selection. Most importantly, there is no basis for making a causal claim. I agree with Philip Davis: the authors greatly overstate their results.

stevanharnad - October 21, 2010 at 9:30 am

SELF-SELECTIVE EPICYCLES

Reply to sk_griffhoven:

(1) Controlling for Institutional Effects: Out of about 10,000 universities, there are (since the first one in 2003), a total of 102 institutional mandates and 26 departmental mandates in ROARMAP so far, most of them too recent to test. These mandates are all “self-selected.” But although they include elite universities such as Harvard and MIT, they also include non-elite universities, such as Minho and QUT, which were in our sample, and, as noted, when they were analyzed separately, the outcome was the same.

(2) The causal claim is not that mandated OA out-performs self-selected OA, but that self-selected OA does *not* out-perform mandated OA, hence OA is causal.

Both plausibility and the weight of the evidence to date strongly favour the common sense default hypothesis that authors making published articles more accessible causes those articles to become more citable, rather than the alternative hypothesis that articles being more citable causes their authors to make them more accessible.

Yes, one can resurrect the self-selection hypothesis at the level of institutions too: being more cited makes institutions more likely to mandate OA, versus mandating OA makes institutions more cited. But the burden of proof is now on the proponents of the self-selection hypothesis.

For example, if a proponent of the self-selection hypothesis really believes that the citation advantage of the mandated OA was due to a self-selection bias on the part of more-cited institutions toward adopting an OA mandate, they can conduct a covariance analysis partialling out the citation rank of the author’s institution from the size of the OA advantage for both mandated and self-selected OA to test whether that eliminates or substantially reduces the mandated OA advantage relative to the self-selected OA advantage.

signofthefourwinds - October 21, 2010 at 9:49 am

One area of inquiry that would help shed light on this topic is to investigate how the OA advantage occurs. Most faculty writing journal articles use library resources, so how does the OA advantage occur? Are researchers looking outside of the library resources? If so, why? Is this a change in faculty researcher behavior? In my experience a researcher develops a habit in graduate school, based on feedback from their mentors, regarding what databases to use for their research. Then, typically, they use those resources for years (and years). If this is the case, how are the faculty coming in contact with these OA resources? It doesn’t make sense that researchers suddenly give up their habit of consulting certain databases to go search in Google Scholar. What user behavior change accounts for the OA advantage?

stevanharnad - October 21, 2010 at 10:43 am

signofthefourwinds writes: “It doesn’t make sense that researchers suddenly give up their habit of consulting certain databases to go search in Google Scholar. What user behavior change accounts for the OA advantage?”

The databases (e.g. Scopus, Web of Science, PubMed) are increasingly providing the URLs of online OA versions of the articles they index. That’s a natural development.

And, yes, Google Scholar (and CiteSeerx, Citebase, Base, Scirus and many others) are becoming the first port of call for many researchers. Once all research (and not just 15% of it) is OA, they will become the preferred ports of call, and their capabilities will be duly upgraded and enriched. (At 15% it is as yet hardly worth the bother to enrich them.)

stevanharnad - October 22, 2010 at 9:26 am

Further to signofthefourwinds who wrote: “It doesn’t make sense that researchers suddenly give up their habit of consulting certain databases to go search in Google Scholar. What user behavior change accounts for the OA advantage?”

I should of course have added something it is very important to understand: The OA Advantage is not just, or primarily, a convenience or laziness effect (though some of that no doubt contributes to it too): It is not that scholars have become sloppy, relying on Google Scholar instead of consulting more established databases. It is that when their institution cannot afford access to articles they need, they must make do with only those of them that they can access for free online. And that’s how the OA research garners its extra usage and impact, if (and only if) it is (1) OA [only 15% is OA unmandated, whereas 60-100% is OA if it is mandated] and if and only if (2) it is of sufficient quality to warrant using and citing it: As our article notes, articles may be unusable and uncitable no matter how accessible they are made. The Pareto/Seglen effect is that the top 20% of articles are the ones that are used and cited most (80%). They are hence also the ones that derive the greatest OA Advantage.

We think the OA Advantage has several contributing components. In likely order of magnitude, they are (1) the quality advantage (the more useful articles benefit more, as noted); (2) the early access advantage (the sooner new research can be accessed and used, the greater its impact); (3) the usage advantage (reflected in downloads that come much earlier than citations, but are correlated with them and predict them); (4) the competitive advantage (while OA is still only at 15%, the OA minority has a competitive advantage over the non-OA majority, but once OA reaches 100%, this advantage will vanish); and, yes, we think there is also (5) a quality bias, with authors having a tendency to self-archive their better work when the self-archiving is self-selective: this too will vanish at 100% mandated OA, and we have shown it is not the basis of all or most of the OA Advantage, but we do not doubt that it is a contributor too, and especially in the self-archiving of unrefereed preprints, where authors are much more nervous about posting results that might prove to be wrong after refereeing…
http://www.openaccessweek.org/photo/components-of-the-oa-advantage?context=user

Further to sk_griffhoven’s point about institutional self-selection bias (in deciding whether or not to adopt a mandate), the supplementary analyses showing that the OA Advantage is present even if we analyze only Minho and QUT are here (SUPPLEMENT-1: Analysis minus CERN and minus CERN&Southampton):
http://eprints.ecs.soton.ac.uk/18493/

pchard - October 22, 2010 at 3:32 pm

I don’t think the real question is whether Open Access increases the number of citations. It is rather whether measuring science by counting citations is of any value for scientific development. There is a confusion between the market for citations and the operational scientific impact of research.
OA has many other advantages that improve researchers’ position in the knowledge market. Above all, it allows a broader diffusion of knowledge, and that is what matters, in particular for the “small” journals with limited circulation (whether because they cannot afford access to distribution channels or because of their language of publication). A journal’s circulation should not be the sole criterion of its scientific quality.
But this type of OA is itself restrictive. There are at least two other approaches that aim at a better diffusion of research: open archives, which make it possible to post a paper in construction; and direct electronic open-access journals. Together, these means should give a chance to scientific work produced in the least-equipped countries.

Patrick Chardenet

signofthefourwinds - October 22, 2010 at 3:42 pm

stevanharnad writes: We think the OA Advantage has several contributing components. In likely order of magnitude, they are (1) the quality advantage (the more useful articles benefit more, as noted)………

The five OA advantages and the order in which they are presented don’t ring true to me, because there is not enough discussion about the change in discovery paths which has to occur in order for the OA advantage to occur. An increase in discovery paths would have to be present prior to an increase in citations. Most subject-specific databases still don’t include many links to OA journals, and I don’t think researchers, who are under enormous time constraints, spend their time trolling Google Scholar for the needle-in-the-haystack article that they missed in, say, PsycINFO (and how would they know that they missed an important OA article?). Stevan, to help support your argument, I think that you need user stories from researchers to help provide real-world examples of how their discovery paths have changed, to put concrete examples around the data that you present. I think that it is fair to say that we all have the belief that faculty habits die hard, and until you can describe how faculty discovery paths have changed, the OA advantage won’t be understood by a substantial number of folks.

pchard - October 22, 2010 at 3:43 pm

RePEc: Research Papers in Economics (http://repec.org) is a volunteer-driven initiative to create a public-access database. The RePEc database links information on the published and unpublished works of thousands of economists. RePEc promotes scholarly communication by providing a database that welcomes all providers of unpublished materials in various languages.

stevanharnad - October 22, 2010 at 5:53 pm

My reply (in French) to Patrick Chardenet is here:
http://www.openaccessweek.org/forum/topics/come-see-and-join-the?xg_source=activity

In a nutshell, citations are not the goal of research; the goal is that the research should be read, used and built upon, in further research and applications. And citations are a measure of that. But for research to be read, used and built upon, it has to be accessible. That is why and how OA increases citations. And what is being mandated is OA self-archiving by authors, of all their articles (final, refereed, accepted drafts, not just drafts “in construction”), in all journals, the great and the small, in all disciplines and in all languages. The mandate is not about OA journals (though they are very welcome too).

In reply to signofthefourwinds, on “discovery paths”: Most researchers today have discovered the web, and hence they can discover the research of others, and others can discover their research, on the web. But the web is useless if the paper you want is not freely accessible on it, and your institution cannot afford to subscribe, regardless of what database you discovered it in. Most databases are online today; whether you find the title of a paper from PsycINFO or Google Scholar, you will not be able to access it without having to pay unless your institution subscribes or the paper has been made OA. (Pasting a title found in PsycINFO into a google scholar search window is a piece of cake.)

The faculty habits that need updating are not their online searching habits; it’s their online self-archiving habits — and that’s what the OA mandates from their institutions and funders are for.
