Some Untested Citation Metrics
These are descriptions of (and information about) some citation metrics
that users have suggested for use on SPIRES.
All of these are currently
available in the wwwcitesummary2 format. If you have suggestions
about them, please let us
know: if you think something should be removed, or something else
added, we'd like to hear about it.
Of course any discussion of citations needs to include the fine print. Note that all
stats are provided twice: once for published papers only and once for all papers. "Published" refers to
the status of the paper being cited, not of the papers citing it, and uses the definition of
published explained in the fine print.
We currently display the following citation statistics for any group of papers:
- Breakdown of search results by citation
- This is our traditional citesummary format, displaying how many
papers in the group fall into each category of citation count. Note that
the scale is roughly a log scale, and the stats for the database as a
whole can be found here.
- Total eligible papers analyzed
-
This is simply the number of papers from your search that were counted in
the citesummary. This is the number used for averages, etc. Other papers
may have been found, but may not have been eligible for citation tracking
(again, see the fine print).
- Total number of citations
- Simply the total number of citations to the papers in your search
- Average citations per paper
- The ratio of the above two numbers: total number of citations divided by total eligible papers analyzed.
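
The traditional statistics above can be sketched in a few lines of Python. This is only an illustration, not the code SPIRES runs: the function name and the bin edges are assumptions, chosen to be roughly logarithmic as the text describes, and the input is assumed to be the citation counts of the eligible papers only.

```python
from bisect import bisect_right

# Hypothetical bin edges, roughly logarithmic. The real citesummary
# categories may use different boundaries.
BIN_EDGES = [0, 1, 10, 50, 100, 250, 500]


def traditional_summary(cite_counts):
    """Summarize citation counts for the eligible papers in a search."""
    n = len(cite_counts)                  # total eligible papers analyzed
    total = sum(cite_counts)              # total number of citations
    average = total / n if n else 0.0     # average citations per paper
    breakdown = [0] * len(BIN_EDGES)
    for count in cite_counts:
        # Index of the last bin edge that is <= count.
        breakdown[bisect_right(BIN_EDGES, count) - 1] += 1
    return {"eligible": n, "total": total, "average": average,
            "breakdown": breakdown}
```

Each entry of "breakdown" counts the papers whose citation count lies at or above the matching edge and below the next one; the last bin is open-ended.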
The above statistics have been around for a while, which does not mean
that they are particularly good at predicting a physicist's Nobel prize
potential, height, eye color, or any other characteristic. However, some
newer statistical measures have been suggested, which are displayed in the
next section. Some of these may be better as measures on an individual
paper, rather than taken over a group.
- h
- A measure proposed by Hirsch in physics/0508025: h is the largest number
such that h of the papers have at least h citations each. It is
explained better there than here...
- Total of cites/authors
-
This is a sum over all papers of the quantity
<cites to the paper>/<number of authors listed on the paper>
This should depress (further) the weight of experimental papers in a
citesummary (they generally get suppressed already by the RPP), but could
also eliminate some effects from being "in the room" with the authors of a
renowned paper. A roundoff/truncation error in the calculation
of this quantity was corrected in March 2006.
- Average of cites/authors
- The above number divided by the number of papers. Note that this is
not the ratio of the average cites to the average authors, which would
probably be less useful.
- Total of (cites_to + cites_from)
- This is the sum over all papers of the number of cites to a paper (we
call these citations),
plus the number of citations from the paper (we call these references).
This may say something about how connected to the rest of the
literature this paper is.
- Average of (cites_to + cites_from)
- Again, simply the average of the above. (For the arithmetically
challenged: this is exactly the same as average cites + average
references).
- Total of (cites_to - self_cites)
- This is the sum over all papers of the number of citations minus the
number of citations to a paper by at least one of the authors of that
paper. Things to note here:
-
If you are doing a citesummary for author A, the self_cites for her paper
1 are all papers citing paper 1 by author A or her co-authors for paper 1. The
self_cites for paper 2 are all papers citing paper 2 by author A or her co-authors on paper 2.
-
If there are more than 20 authors on a paper, the collaboration name is
used to exclude cites from papers with the same collaboration name.
Collaboration is not used for papers under 20 authors. If the paper has
more than 20 authors and no collaboration name, only the first 20 authors
are used. This will exclude most self citations from that same
group, but will miss a few from individual authors later in the list.
This is a rare case.
- Names are ambiguous. See
our author page for more information. In this case, equality of
authors is determined by Last, F., i.e. the name is truncated
after the first initial. John and Jane Smith will be conflated, and will
have a few spurious self citations, but this is better than treating
J.D. Smith and James Smith as different.
- We do not have much ability to
handle this in a smarter way, because the software is limited in what it
can do while producing these stats.
- It isn't perfect, but it appears to
be pretty good, and if you know that there is someone in the same field (a
likely citer) with the same last name and initial, consider yourself
forewarned.
- Average of (cites_to - self_cites)
- Again, simply the average of the above.
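
As a rough illustration of the newer measures listed above, here is a minimal Python sketch. It is not the SPIRES implementation. The Paper record, its field names, and the function names are all hypothetical. Two points the text leaves open are filled with assumptions: for a paper with more than 20 authors and a collaboration name, only the collaboration name is compared, and name comparison ignores case.

```python
from dataclasses import dataclass, field

MAX_AUTHORS = 20  # threshold quoted in the text above


@dataclass
class Paper:
    """Hypothetical record; the field names are illustrative only."""
    authors: list                 # names as "Last, First", in listed order
    collaboration: str = ""       # collaboration name, empty if none
    citing: list = field(default_factory=list)  # Paper objects citing this one
    n_references: int = 0         # number of papers this one cites


def name_key(name):
    """Reduce "Last, First M." to (last name, first initial)."""
    last, _, first = name.partition(",")
    return (last.strip().lower(), first.strip()[:1].lower())


def is_self_cite(paper, citer):
    """True if `citer` counts as a self citation of `paper`."""
    if len(paper.authors) > MAX_AUTHORS and paper.collaboration:
        # Assumption: for large collaborations only the name is compared.
        return citer.collaboration == paper.collaboration
    # Otherwise compare truncated names, using at most the first 20 authors.
    own = {name_key(a) for a in paper.authors[:MAX_AUTHORS]}
    return any(name_key(a) in own for a in citer.authors)


def h_index(papers):
    """Largest h such that h papers have at least h citations each."""
    counts = sorted((len(p.citing) for p in papers), reverse=True)
    return sum(1 for rank, c in enumerate(counts, start=1) if c >= rank)


def experimental_summary(papers):
    """Totals and averages of the newer measures (papers must be non-empty)."""
    n = len(papers)
    totals = {
        "cites/authors": sum(len(p.citing) / len(p.authors) for p in papers),
        "cites_to+cites_from": sum(len(p.citing) + p.n_references
                                   for p in papers),
        "cites_to-self_cites": sum(
            sum(1 for c in p.citing if not is_self_cite(p, c)) for p in papers
        ),
    }
    averages = {key: value / n for key, value in totals.items()}
    return {"h": h_index(papers), "total": totals, "average": averages}
```

With this sketch, a paper by "Smith, John" that is cited by a paper by "Smith, Jane" is counted as a self citation, which is the conflation the text warns about. Note also that the "cites/authors" average is the per-paper ratio averaged over papers, not the ratio of average cites to average authors.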
The purpose of providing these alternative measures is not to endorse
them, but rather to help us and others explore their potential relevance.
If you are convinced that you want some citation metric, but don't know which
one to use, use the ones above the line: citation breakdown, avg. citation count, and total
citations. While all citation metrics have problems, it is
certainly true that these traditional metrics have some non-zero correlation with
influential papers.
The same cannot necessarily be said of the "experimental" ones. If you are curious about other ways to look at the data, have
a look at these, and if you have another suggestion, let us know.