
Some Untested Citation Metrics

These are descriptions of (and information about) some citation metrics that have been suggested by users for use on SPIRES.

All of these are currently available in the wwwcitesummary2 format. If you have suggestions about them, please let us know. If you think something should be removed, or something else added, we'd like to know that too.

Of course, any discussion of citations needs to include the fine print. Note that all stats are provided twice: once for published papers only and once for all papers. This distinction refers to the status of the paper itself, not of the papers citing it, and uses the definition of "published" explained in the fine print.

We currently display the following citation statistics for any group of papers:

Breakdown of search results by citation
This is our traditional citesummary format, displaying how many papers in the group fall into which category of citation count. Note that the scale is roughly a log scale, and the stats for the database as a whole can be found here
Total eligible papers analyzed
This is simply the number of papers from your search that were counted in the citesummary. This is the number used for averages, etc. Other papers may have been found, but may not have been eligible for citation tracking (again, see the fine print).
Total number of citations
Simply the total number of citations to the papers in your search.
Average citations per paper
The ratio of the above two numbers: total citations divided by total eligible papers.
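
As a rough illustration, the traditional statistics above could be computed from a list of per-paper citation counts as sketched below. The function name and the bin edges are assumptions made for this example. The text only says the scale is "roughly a log scale", so the edges here are illustrative and should not be read as the actual SPIRES categories.

```python
from bisect import bisect_right

# Hypothetical, roughly logarithmic lower bin edges (an assumption for
# illustration; the real category boundaries are not given in this text).
BIN_EDGES = [0, 1, 10, 50, 100, 250, 500]


def cite_summary(cite_counts):
    """Summarize non-negative citation counts of the eligible papers."""
    n = len(cite_counts)
    total = sum(cite_counts)
    breakdown = {edge: 0 for edge in BIN_EDGES}
    for count in cite_counts:
        # Find the largest bin edge that is <= count.
        edge = BIN_EDGES[bisect_right(BIN_EDGES, count) - 1]
        breakdown[edge] += 1
    return {
        "eligible_papers": n,
        "total_citations": total,
        # Guard against an empty search result.
        "average_citations": total / n if n else 0.0,
        "breakdown": breakdown,
    }
```

Only eligible papers should be passed in, since the count of eligible papers is the denominator used for the average.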

The above statistics have been around for a while, which does not mean that they are particularly good at predicting a physicist's Nobel prize potential, height, eye color, or any other characteristic. However, some newer statistical measures have been suggested, which are displayed in the next section. Some of these may be better as measures on an individual paper, rather than taken over a group.

h
A measure proposed by Hirsch in physics/0508025. It is explained better there than here...
Total of cites/authors
This is a sum over all papers of the quantity
  <cites to the paper>/<number of authors listed on the paper>
This should depress (further) the weight of experimental papers in a citesummary (they generally get suppressed already by the RPP), but could also eliminate some effects from being "in the room" with the authors of a renowned paper. A roundoff/truncation error in the calculation of this quantity was corrected in March 2006.
Average of cites/authors
The above number divided by the number of papers. Note that this is not the ratio of the average cites to the average authors, which would probably be less useful.
Total of (cites_to + cites_from)
This is the sum over all papers of the number of cites to a paper (we call these citations), plus the number of citations from the paper (we call these references). This may say something about how connected to the rest of the literature this paper is.
Average of (cites_to + cites_from)
Again, simply the average of the above. (For the arithmetically challenged: this is exactly the same as average cites + average references).
Total of (cites_to - self_cites)
This is the sum over all papers of the number of citations minus the number of self-citations, i.e. citations to a paper from papers written by at least one of the authors of that paper. Things to note here:
  • If you are doing a citesummary for author A, the self_cites for her paper 1 are all papers citing paper 1 by author A or her co-authors for paper 1. The self_cites for paper 2 are author A or her co-authors on paper 2.
  • If there are more than 20 authors on a paper, the collaboration name is used to exclude cites from papers with the same collaboration name. Collaboration is not used for papers under 20 authors. If the paper has more than 20 authors and no collaboration name, only the first 20 authors are used. This will exclude most self citations from that same group, but will miss a few from individual authors later in the list. This is a rare case.
  • Names are ambiguous. See our author page for more information. In this case, equality of authors is determined by Last, F., i.e. the name is truncated after the first initial. John and Jane Smith will be conflated, and will have a few spurious self citations, but this is better than treating J.D. Smith and James Smith as different.
    • We do not have much ability to handle this in a smarter way, because the software is limited in what it can do while producing these stats.
    • It isn't perfect, but it appears to be pretty good, and if you know that there is someone in the same field (a likely citer) with the same last name and initial, consider yourself forewarned.
Average of (cites_to - self_cites)
Again, simply the average of the above.
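
The newer measures above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the SPIRES implementation: the `Paper` record, its field names, and all function names are hypothetical. The text also leaves some cases open, so the sketch assumes that for a paper with more than 20 authors and a collaboration name only the collaboration name is compared, that a paper with exactly 20 authors is treated like a smaller one, and that all authors of the citing paper are checked. The `h_index` function follows Hirsch's definition: the largest h such that h papers have at least h citations each.

```python
from dataclasses import dataclass, field

AUTHOR_LIMIT = 20  # threshold quoted in the text; behaviour at exactly 20 is assumed


@dataclass
class Paper:
    """Hypothetical record; field names are illustrative, not SPIRES fields."""
    authors: list[str]                 # "Last, First" strings, in listed order
    collaboration: str = ""            # empty when there is no collaboration name
    cited_by: list["Paper"] = field(default_factory=list)


def name_key(name: str) -> str:
    """Truncate 'Last, First' to 'last, f', so J.D. Smith matches James Smith."""
    last, _, first = name.partition(",")
    return f"{last.strip().lower()}, {first.strip()[:1].lower()}"


def is_self_cite(paper: Paper, citing: Paper) -> bool:
    """True if `citing` counts as a self-citation of `paper`."""
    if len(paper.authors) > AUTHOR_LIMIT and paper.collaboration:
        # Assumption: for large collaborations only the name is compared.
        return citing.collaboration == paper.collaboration
    # Otherwise compare truncated names, using at most the first 20 authors
    # of the cited paper (the slice only matters above 20 authors).
    own = {name_key(a) for a in paper.authors[:AUTHOR_LIMIT]}
    return any(name_key(a) in own for a in citing.authors)


def non_self_cites(paper: Paper) -> int:
    """cites_to - self_cites for a single paper."""
    return sum(1 for c in paper.cited_by if not is_self_cite(paper, c))


def h_index(cite_counts) -> int:
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(cite_counts, reverse=True)
    return sum(1 for rank, c in enumerate(ranked, start=1) if c >= rank)


def experimental_metrics(papers):
    """Totals and averages over a group; assumes every paper lists an author."""
    n = len(papers)
    per_author = sum(len(p.cited_by) / len(p.authors) for p in papers)
    net = sum(non_self_cites(p) for p in papers)
    return {
        "h": h_index([len(p.cited_by) for p in papers]),
        "total_cites_per_author": per_author,
        "average_cites_per_author": per_author / n if n else 0.0,
        "total_non_self_cites": net,
        "average_non_self_cites": net / n if n else 0.0,
    }
```

For example, `h_index([10, 8, 5, 4, 3])` gives 4, because four papers have at least four citations each but there are not five with at least five. Note that `name_key` deliberately conflates "Smith, John" and "Smith, Jane", which reproduces the spurious self-citations described above.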

The purpose of providing these alternative measures is not to endorse them, but rather to help us and others explore their potential relevance. If you are convinced that you want some citation metric, but don't know which one to use, use the ones above the line: citation breakdown, avg. citation count, and total citations. While all citation metrics have problems, it is certainly true that these traditional metrics have some non-zero correlation with influential papers. The same cannot necessarily be said of the "experimental" ones. If you are curious about other ways to look at the data, have a look at these, and if you have another suggestion, let us know.


SPIRES HEP was a joint project of SLAC, DESY & FNAL as well as the worldwide HEP community.
It was superseded by INSPIRE.

Last Updated: 08/05/2004
