DOI: 10.1371/journal.pcbi.1002854
Corpus ID: 14620651
Getting More Out of Biomedical Documents with GATE's Full Lifecycle Open Source Text Analytics
H. Cunningham, V. Tablan, +1 author Kalina Bontcheva
Published 2013
Computer Science, Medicine
PLoS Computational Biology
This software article describes the GATE family of open source text analysis tools and processes. GATE is one of the most widely used systems of its type with yearly download rates of tens of…
301 Citations
Highly Influential Citations: 30
Background Citations: 46
Methods Citations: 121
Results Citations: 1
Figures, Tables, and Topics from this paper
Figures 1-7, Table 1
GATE
Open-source software
Text mining
GNU
Mental disorders
Ecosystem
Software developer
Operating system
Download
Computation
Plant Roots
Biological Science Disciplines
Biomedicine
FOCAL (programming language)
Reuse (action)
Engineering
Deploy
Paper Mentions
Blog Post
D-Lib: Semantic Enrichment and Search: A Case Study on Environmental Science Literature
Planet Code4Lib
15 January 2015
301 Citations
PubMedPortable: A Framework for Supporting the Development of Text Mining Applications
Kersten Döring, B. Grüning, Kiran K. Telukunta, P. Thomas, S. Günther
Medicine, Computer Science
PLoS ONE
2016
TLDR
The presented workflows show how to use PubMedPortable to retrieve, store, and analyse a disease-specific data set and the approach was tested extensively and applied successfully in several projects.
6 Citations
Text Mining in Medicine
Slavko Zitnik, M. Bajec
Computer Science
2013
TLDR
This chapter overviews some methods and tools that enable researchers to automatically retrieve, extract, and integrate unstructured medical data.
Using large-scale text mining for a systematic reconstruction of molecular mechanisms of diseases : a case study in thyroid cancer
Chengkun Wu
Medicine
2014
TLDR
A systematic and efficient text-mining pipeline that can identify key pathways and genes that characterise a disease and build networks of their interactions (interactome) and can be applied to other diseases for systematic studies and assisted curation efforts is developed.
Pattern-based Mining in Electronic Health Records for Complex Clinical Process Analysis
O. Metsker, E. Bolgova, A. Yakovlev, Anastasia A. Funkner, S. Kovalchuk
Computer Science
2017
TLDR
The efficiency of this method is demonstrated through correlation analysis of the effect of comorbidities on the treatment duration of ACS, and through the use of the extracted data to develop process models with complexity metrics from the control-flow perspective of process mining techniques.
TextHunter - A User Friendly Tool for Extracting Generic Concepts from Free Text in Clinical Research
R. Jackson, M. Ball, R. Patel, R. Hayes, R. Dobson, Robert Stewart
Computer Science, Medicine
AMIA
2014
TLDR
TextHunter is presented, a tool for the creation of training data, construction of concept extraction machine learning models and their application to documents and achieved recall measurements as high as 99% in real world use cases.
36 Citations
Development of an information retrieval tool for biomedical patents
Tiago Alves, Rúben Rodrigues, H. Costa, Miguel Rocha
Computer Science, Medicine
Comput. Methods Programs Biomed.
2018
TLDR
This work builds a patent pipeline addressing IR tasks over patent repositories to make these documents amenable to BioTM tasks, decreasing drastically the time required for this task, and provides graphical interfaces to ease the use of these tools.
5 Citations
Application of Domain Ontologies to Natural Language Processing: A Case Study for Drug-Drug Interactions
María Herrero-Zazo, Isabel Segura-Bedmar, Janna Hastings, Paloma Martínez
Computer Science
Int. J. Inf. Retr. Res.
2015
TLDR
The authors apply the drug-drug interactions ontology DINTO to named entity recognition and relation extraction from pharmacological texts and evaluate their results in the framework of the last SemEval-2013 DDI Extraction task.
5 Citations
iTextMine: integrated text-mining system for large-scale knowledge extraction from the literature
Jia Ren, Gang Li, +7 authors Cathy H. Wu
Computer Science, Medicine
Database J. Biol. Databases Curation
2018
TLDR
The iTextMine system is described, with an automated workflow to run multiple text-mining tools on large-scale text for knowledge extraction and a text alignment algorithm to resolve text discrepancies for result integration.
11 Citations
PathNER: a tool for systematic identification of biological pathway mentions in the literature
Chengkun Wu, J. Schwartz, G. Nenadic
Computer Science, Medicine
BMC Systems Biology
2013
TLDR
In contrast to existing text-mining efforts that target the automatic reconstruction of pathway details from molecular interactions mentioned in the literature, PathNER focuses on identifying specific named pathway mentions that can support large-scale curation and pathway-related systems biology applications.
Supporting the annotation of chronic obstructive pulmonary disease (COPD) phenotypes with text mining workflows
Xiao Fu, R. Batista-Navarro, Rafal Rak, S. Ananiadou
Computer Science, Medicine
J. Biomed. Semant.
2015
TLDR
A semi-automatic methodology for producing a corpus that can ultimately support the development of text mining tools that, in turn, will expedite the process of identifying groups of COPD patients, and demonstrates that the corpus, although still a work in progress, can foster the development of significantly better performing COPD phenotype extractors.
References
SHOWING 1-10 OF 108 REFERENCES
Getting Started in Text Mining
K. Cohen, L. Hunter
Computer Science, Medicine
PLoS Comput. Biol.
2008
TLDR
A surprising phenomenon can be noted in the recent history of biomedical text mining: although several systems have been built and deployed in the past few years, such as Chilibot, Textpresso, and PreBIND (see Text S1 for these and most other citations), the ones that are seeing high usage rates and are making productive contributions to the working lives of bioscientists have been built not by text mining specialists, but by bioscientists.
230 Citations
Biomedical Text Mining and Its Applications
R. Rodriguez-Esteban
Computer Science, Medicine
PLoS Comput. Biol.
2009
TLDR
This tutorial examines the relationship between progressive multifocal leukoencephalopathy (PML) and antibodies and introduces text mining, the subfield that deals with text that comes from biology, medicine, and chemistry.
99 Citations
MutationFinder: a high-performance system for extracting point mutation mentions from text
J. Caporaso, W. Baumgartner, David A. Randolph, K. Cohen, L. Hunter
Medicine, Computer Science
Bioinform.
2007
TLDR
An open-source, rule-based system, MutationFinder, for extracting point mutation mentions from text achieves nearly perfect precision and a markedly improved recall over a baseline on blind test data.
154 Citations
Research Paper: ALICE: An Algorithm to Extract Abbreviations from MEDLINE
H. Ao, T. Takagi
Computer Science, Medicine
J. Am. Medical Informatics Assoc.
2005
TLDR
ALICE not only facilitates recognition of an undefined abbreviation in a paper by constructing an abbreviation database or dictionary, but also makes biomedical literature retrieval more accurate.
82 Citations
An overview of MetaMap: historical perspective and recent advances
A. Aronson, François-Michel Lang
Computer Science, Medicine
J. Am. Medical Informatics Assoc.
2010
TLDR
This study reports on MetaMap's evolution over more than a decade, concentrating on those features arising out of the research needs of the biomedical informatics community both within and outside of the National Library of Medicine.
1,240 Citations
A Simple Algorithm for Identifying Abbreviation Definitions in Biomedical Text
Ariel S. Schwartz, Marti A. Hearst
Computer Science, Medicine
Pacific Symposium on Biocomputing
2003
TLDR
This paper shows that the problem of identifying abbreviations' definitions can be solved with a much simpler algorithm than that proposed by other research efforts, and achieves 96% precision and 82% recall on a standard test collection, which is at least as good as existing approaches.
534 Citations
Getting Started in Text Mining: Part Two
A. Rzhetsky, Michael R. Seringhaus, M. Gerstein
Computer Science, Medicine
PLoS Comput. Biol.
2009
TLDR
This article is intended to continue where Cohen and Hunter left off in “Getting Started in Text Mining,” an introduction in the January 2008 issue of PLoS Computational Biology which covered the actual mining of text and its digestion into small quanta of computer-manageable information.
42 Citations
ABNER: an open source tool for automatically tagging genes, proteins and other entity names in text
Burr Settles
Medicine, Computer Science
Bioinform.
2005
ABNER (A Biomedical Named Entity Recognizer) is an open source software tool for molecular biology text mining. At its core is a machine learning system using conditional random fields with a variety…
257 Citations
Automated recognition of malignancy mentions in biomedical literature
Yang Jin, Ryan T. McDonald, +6 authors Peter S. White
Computer Science, Medicine
BMC Bioinformatics
2006
TLDR
Together, these results suggest that the identification of disparate biomedical entity classes in free text may be achievable with high accuracy and only moderate additional effort for each new application domain.
UIMA: an architectural approach to unstructured information processing in the corporate research environment
D. Ferrucci, Adam Lally
Computer Science
Natural Language Engineering
2004
TLDR
A general introduction to UIMA is given, focusing on the design points of its analysis engine architecture, and how UIMA is helping to accelerate research and technology transfer is discussed.
970 Citations
Related Papers
TwitIE: An Open-Source Information Extraction Pipeline for Microblog Text
Kalina Bontcheva, Leon Derczynski, A. Funk, M. Greenwood, D. Maynard, N. Aswani
Computer Science
RANLP
2013
Twitter is the largest source of microblog text, responsible for gigabytes of human discourse every day. Processing micr...
203 Citations
Natural Language Processing Tools
Justin F. Brunelle, Chutima Boonthum-Denecke
Computer Science
2012
This chapter discusses a subset of Natural Language Processing (NLP) tools available for researchers and enthusiasts of ...
10 Citations