In recent years, there has been growing interest in developing new scientometric measures that go beyond traditional citation-based bibliometric measures. This interest is motivated, on the one hand, by the wider availability and emergence of new information evidencing research performance, such as article downloads, views and Twitter mentions, and, on the other, by the continued frustrations and problems surrounding the application of citation-based metrics to evaluating research performance in practice.
Semantometrics are a new class of research evaluation metrics built on the premise that the full text is needed to assess the value of a publication. This talk will present the results of an investigation into the properties of the semantometric contribution measure (Knoth & Herrmannova, 2014) and will provide a comparative evaluation of the contribution measure against traditional citation-based bibliometric measures.
2. 2/26
Towards full-text based research metrics:
Exploring semantometrics
13th June 2016: Announcement of
the report release:
https://scholarlyfutures.jiscinvolve.org/wp/2016/06/towards-full-text-based-research-metrics-exploring-semantometrics/
Report available at:
http://repository.jisc.ac.uk/6376/1/Jisc-semantometrics-experiments-report-final.pdf
3. 3/26
Current impact metrics
• Pros: simplicity
• Cons: insufficient evidence that they capture quality and research
contribution; defined ad hoc / established axiomatically
4. 4/26
The crisis of research evaluation?
Figure: Rejection rates vs Journal Impact Factor (JIF) according to (da Silva, 2015).
5. 5/26
Problems of current impact metrics
• Sentiment, semantics, context and motives [Nicolaisen, 2007]
• Popularity and size of research communities [Brumback, 2009;
Seglen, 1997]
• Time delay [Priem and Hemminger, 2010]
• Skewness of the distribution [Seglen, 1992]
• Differences between types of research papers [Seglen, 1997]
• Ability to game/manipulate citations [Arnold and Fowler,
2010; PLoS Medicine Editors, 2006]
6. 6/26
Alternative metrics
• Alt-/Webo-metrics etc.
– Impact still dependent on the number of interactions in a
scholarly communication network (downloads, views,
readers, tweets, etc.)
8. 8/26
Many possibilities for semantometrics …
• Detecting whether good research practices were followed
(sound methodology, research data/code shared, …)
• Detecting paper type …
• Analysing citation contexts (tracking fact
propagation) …
• Detecting the sentiment of citations …
• Normalising by the size of the community that is likely to
read the research …
• Detecting good writing style …
9. 9/26
Semantometrics – contribution metric
Hypothesis: The added value of a publication p can be estimated
based on the semantic distance from the publications cited by p
to the publications citing p.
Detailed explanation: http://semantometrics.org
10. 10/26
Contribution metric
• Based on semantic distance between citing
and cited publications
– Cited publications – state-of-the-art in the domain
of the publication in question
– Citing publications – areas of application
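As a concrete illustration, the contribution measure can be sketched as the average pairwise semantic distance between the set of publications cited by p and the set of publications citing p. The sketch below uses a plain bag-of-words cosine distance on short texts as a stand-in; the published measure operates on full texts with a more sophisticated similarity (see semantometrics.org for the exact formulation), so treat the representation here as an assumption, not the paper's method.

```python
from collections import Counter
import math

def cosine_distance(text_a, text_b):
    """Semantic distance as 1 - cosine similarity of bag-of-words
    vectors. A simple placeholder for the full-text similarity
    used by the actual measure."""
    va, vb = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * \
           math.sqrt(sum(c * c for c in vb.values()))
    return 1.0 - (dot / norm if norm else 0.0)

def contribution(cited_texts, citing_texts, dist=cosine_distance):
    """Average pairwise semantic distance from the publications cited
    by p (state of the art) to the publications citing p (areas of
    application)."""
    pairs = [(a, b) for a in cited_texts for b in citing_texts]
    if not pairs:
        return 0.0
    return sum(dist(a, b) for a, b in pairs) / len(pairs)
```

Under this sketch, a paper whose citing literature is lexically distant from its cited literature (a "long bridge") scores close to 1, while one embedded in a single homogeneous community scores close to 0.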
12. 12/26
Experiment – dataset
• Obtained by merging three open datasets:
– Connecting Repositories (CORE) – OA publications,
metadata and full-texts
– Microsoft Academic Graph (MAG) – citation
network
– Mendeley – publication texts (abstracts) and
readership information
• Over 1.6 million CORE publications, over 12
million publications in total
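The merge of the three datasets can be pictured as a key-based join. The sketch below uses a crude normalised-title key and an invented record schema (`title`, `citations`, `readers`); the actual CORE/MAG/Mendeley matching pipeline is not described in the slides, so everything here is purely illustrative.

```python
def norm_title(title):
    """Crude normalisation key for matching records across datasets
    (hypothetical; real matching pipelines are more involved)."""
    return "".join(ch for ch in title.lower() if ch.isalnum())

def merge_records(core, mag, mendeley):
    """Join three record lists on a normalised-title key. Each record
    is a dict with at least a 'title' field (illustrative schema)."""
    mag_idx = {norm_title(r["title"]): r for r in mag}
    men_idx = {norm_title(r["title"]): r for r in mendeley}
    merged = []
    for r in core:
        key = norm_title(r["title"])
        if key in mag_idx:  # keep only CORE articles matched with MAG
            rec = dict(r)
            rec["citations"] = mag_idx[key].get("citations", 0)
            rec["readers"] = men_idx.get(key, {}).get("readers", 0)
            merged.append(rec)
    return merged
```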
13. 13/26
Experiment – dataset statistics
Articles from CORE matched with MAG    1,655,835
Average number of received citations   16.09 (SD 66.30; max 13,979)
Average readership                     15.94 (SD 42.17; max 15,193)
Average contribution value             0.89 (SD 0.0810)
Total number of publications           12,075,238
17. 17/26
Experiment – results
• No direct correlation between the contribution
measure and citations/readership
• When working with mean citation, readership
and contribution values, a clear behavioural
trend emerges
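The aggregation behind this finding can be sketched in two steps: a correlation coefficient over raw values (which shows no direct relationship), followed by averaging contribution values within citation-count bins, where the trend becomes visible. The helpers below are dependency-free sketches; the bin width and any inputs are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient (no SciPy dependency)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def binned_means(citations, contributions, width=10):
    """Mean contribution per citation-count bin: the kind of
    aggregation under which a trend can emerge even when the
    raw values are uncorrelated."""
    bins = {}
    for c, v in zip(citations, contributions):
        bins.setdefault(c // width, []).append(v)
    return {b * width: sum(vs) / len(vs) for b, vs in sorted(bins.items())}
```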
20. 20/26
Current impact metrics vs semantometrics
Unaffected by                                    Current impact metrics   Semantometrics
Citation sentiment, semantics, context, motives            ✗                   ✔
Popularity & size of research communities                  ✗                   ✔
Time delay                                                 ✗                  ✗/✔*
Skewness of the citation distribution                      ✗                   ✔
Differences between types of research papers               ✗                   ✔
Ability to game/manipulate the metrics                     ✗                  ✗/✔**
* reduced to 1 citation
** assuming that self-citations are not taken into account
21. 21/26
Metrics for evaluating article sets
• Encourage focusing on quality rather than
quantity
• Comparable regardless of discipline, seniority,
etc.
22. 22/26
Evaluating research metrics
• Need for a data-driven approach
– Ground truth
– Human judgments
– Many facets of performance (societal impact,
economic impact, rigour, originality/novelty)
23. 23/26
WSDM Cup – work on new metrics
• The goal of the challenge is to assess the
query-independent importance of scholarly
articles, using data from the Microsoft
Academic Graph (>120M papers).
• Human judgements
• But no full text in MAG
25. 25/26
Conclusions
• Full text is necessary for research evaluation
• Semantometrics are a new class of methods
• We are studying one semantometric method
to assess research contribution
• Need for a data-driven approach for evaluating
metrics
26. 26/26
References
• Jeppe Nicolaisen. 2007. Citation Analysis. Annual Review of
Information Science and Technology, 41(1):609-641.
• Douglas N Arnold and Kristine K Fowler. 2010. Nefarious
numbers. Notices of the American Mathematical Society,
58(3):434-437.
• Roger A Brumback. 2009. Impact factor wars: Episode V -- The
Empire Strikes Back. Journal of child neurology, 24(3):260-2,
March.
• The PLoS Medicine Editors. 2006. The impact factor game.
PLoS medicine, 3(6), June.
27. 27/26
References
• Jason Priem and Bradley M. Hemminger. 2010. Scientometrics
2.0: Toward new metrics of scholarly impact on the social
Web. First Monday, 15(7), July.
• Per Ottar Seglen. 1992. The Skewness of Science. Journal of
the American Society for Information Science, 43(9):628-638,
October.
• Per Ottar Seglen. 1997. Why the impact factor of journals
should not be used for evaluating research. BMJ: British
Medical Journal, 314(February):498-502.
Editor's Notes
In this presentation, I would like to talk about the problems of existing impact metrics. These problems led us to the development of a new approach, based on the processing of publication full texts, for automatically assessing research contribution.
As you might know, there is a wide range of metrics currently used for the evaluation of research. You can see them in the following tag cloud. These metrics have one thing in common: they are all based on citations. This has both advantages and disadvantages. They are … Let me say a few more things about the cons.
Evidence for citations comes externally. Does it mean that people who cite a paper need to read it? Not necessarily.
There is no correlation.
Does it mean that citation counting does not work, does it mean that peer-review does not work? Probably a bit of both.
There are a number of properties a good research evaluation metric should have. (Jennings, 2006) mentions the following:
Reliability – Level of accuracy comparable or better than that of the current peer-review system.
Digestibility – It must allow quick decisions about what to read.
Economical and fast to produce – The production of the metric should be possible in a reasonable time and shouldn't be expensive or human-labour intensive.
Resistance to gaming – It shouldn’t be possible to improve the metric in any other way than by improving the research.
There is now also a lot of interest in new metrics, referred to as alt- and webometrics. However, these metrics are (in the same way as citation-based metrics) still dependent on the number of interactions in the scholarly communication network.
We are proposing a new class of evaluation metrics => semantometrics. These metrics are based on the assumption that it is not possible to evaluate research impact without accessing the research outputs.
The problem with Altmetrics is that they measure popularity, the size of the community, etc, but not quality.
It is an insult to researchers and to science that the current metrics do not consider the text of the manuscript in evaluation, but only take external evidence into account.
This hypothesis is based on the process by which research builds on existing knowledge in order to create new knowledge on which others can build. A publication which in this way creates a "bridge" between what we already know and something new which people will develop based on this knowledge brings a contribution to science. A publication has a high contribution if it creates a "long bridge" between more distant areas of science.
Dasha and Petr have participated in the challenge, which is part of the upcoming Web Search and Data Mining (WSDM) conference. The challenge, co-organised by Microsoft and Elsevier, was to assess the importance of scholarly articles using data from the Microsoft Academic Graph -- a large heterogeneous graph comprising more than 120 million publications and the related authors, venues, organizations and fields of study. Dasha and Petr (team BletchleyPark) were the best of 32 teams in the training round of the competition and, after the validation round, were invited to take part in the second phase of the challenge as one of the eight best of the 32 teams. They will also present their method at the workshop in San Francisco, California, in February.
See more at: http://kmi.open.ac.uk/news/article/18807
This leads us to the conclusion that it is not possible to evaluate research impact without accessing the research outputs.