Welcome to JSTOR's Data for Research (DfR) service.
The DfR service is provided by JSTOR for use by the research community.
It provides a set of web-based tools for selecting and interacting with content from
the JSTOR archive. The service also provides the ability to obtain data sets
via bulk downloads or using a REST API.
Features provided by the site include:
- Full-text and fielded searching of the entire JSTOR archive using a powerful faceted search
interface. Using this interface one can quickly and easily define content of interest through
an iterative process of searching and results filtering.
- Online viewing of document-level data including word frequencies, citations, key terms, and ngrams.
- Request and download datasets containing word frequencies, citations, key terms, or ngrams
associated with the content selected.
- API for content selection and retrieval.
Release Notes:
Data for Research beta #4 was released on March 5th, 2010.
This release adds a couple of enhancements for content discovery, including some significant improvements in the
calculation of search results relevancy and the addition of an experimental free-text recommendation tool
in a new ‘Labs’ section of the site.
- Tuning of search results relevancy. This release changes how DfR search results are ranked. The relevancy algorithm
now incorporates recency (based on year of publication), citation frequency (how many times an article is cited by
other articles in the JSTOR corpus), and article size.
- Experimental Free Text recommender tool. A new 'Labs' section has been added to the site. The initial tool
to be included in this section is what we call a Free Text Recommender. This tool enables a user to submit arbitrary
text, which is then analyzed and used to select articles in the JSTOR corpus thought to be similar. This is
highly experimental, but does show much promise. This will be an ongoing subject of research and refinement in
future releases.
- User interface streamlining.
- Misc bug fixes.
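As a rough illustration of the relevancy tuning described in the beta #4 notes above, the sketch below combines a base text-match score with boosts for recency, citation frequency, and article size. It is not the DfR algorithm: the function name, the multiplicative form, the log damping, the 50-year and 1000-word scales, and the choice to favor longer articles are all assumptions made only to show how such signals might be combined.

```python
import math


def boosted_score(text_score, pub_year, cited_by, word_count, current_year=2010):
    """Combine a base text-match score with three illustrative boosts.

    All constants and the multiplicative form are hypothetical; the
    release notes name the signals but not how they are weighted.
    """
    # Recency: the boost shrinks as the article gets older (assumed 50-year scale).
    age = max(current_year - pub_year, 0)
    recency = 1.0 / (1.0 + age / 50.0)

    # Citation frequency: log-damped so heavily cited articles do not dominate.
    citations = 1.0 + math.log1p(cited_by)

    # Article size: log-damped length signal (assumed 1000-word reference point).
    size = 1.0 + math.log1p(word_count / 1000.0)

    return text_score * recency * citations * size


# Two articles with the same text match: the newer, more cited one ranks higher.
older = boosted_score(1.0, pub_year=1950, cited_by=2, word_count=4000)
newer = boosted_score(1.0, pub_year=2005, cited_by=40, word_count=4000)
print(newer > older)
```

A multiplicative form keeps the text match as the primary signal, since an article that does not match the query at all still scores zero however recent or well cited it is.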
Data for Research beta #3 was released on May 27th, 2009. The new features provided in this release
include:
- The addition of references for search and export. The JSTOR archive contains nearly
35 million parsed citations from approximately 1.25 million articles. The DfR Explore
tool now supports searching and filtering on reference text as well as tools for viewing
reference patterns over time. The "References Profile" tab
on the Explore screen provides graphs depicting the average number and age of references
per article by year for all documents in a search result set. The age of a reference
refers to the difference between the publication date of the citing work and the
publication date of the referenced work. For instance, if an article was published in
1990 and references a work from 1980, the age of the reference is 10 years.
Full-text searching can now be performed on the
approximately 35 million references in the archive. Searching reference text
is accomplished by selecting the "in References" option in the 'Field' selector in
the search box. As an example, searching on the term "new york times" using the "in References"
option returns hits for over 58,000 articles that reference the New York Times.
- The addition of new facets for searching and filtering. The search interface
has been enhanced by the addition of facets for 'Publisher', 'Reviewed Work',
'Reviewed Author', and 'Has References'. The 'Reviewed Work' and 'Reviewed Author' facets
provide greater visibility into the nearly 1.6 million review articles in the
JSTOR archive. Using the 'Reviewed Author' facet, one finds that the archive has over 250 reviews
of works authored by William Shakespeare. The combination of the 'Reviewed Author'
and 'Reviewed Work' facets shows that JSTOR has 19 reviews of Shakespeare's
'Hamlet'.
The 'Has References' facet provides a
convenient way of filtering content based on whether it has parsed references.
- More options for search results sorting. In addition to relevance, publication
date, and dataset order, search results may now be sorted by reference count,
average reference age, and the number of times that an article has been cited
by other articles in the JSTOR archive.
- Downloadable chart data. The DfR tool provides a number of charts that
visually summarize the contents of result sets. These include charts that
depict the distribution of documents by discipline and year of publication, as well
as the new charts provided in this version for parsed references. The data used
to build these charts can now be downloaded as Excel-compatible CSV files.
- Application Programming Interface (API) support. In addition to
the bulk download option supported in earlier versions, the DfR site now provides
the means to search and download data programmatically using a REST-based API. The
API is based on the SRU (Search/Retrieve via URL) and CQL
(Contextual Query Language) standards. More information on the API can be found here.
- Disabling of email notifications. By default, email notifications are
generated when a dataset request has been completed. While this can be useful,
some users have expressed an interest in being able to disable this feature. The
site now provides an option in a user's account preferences to turn this
feature off.
- Performance improvements. In addition to the visual changes to
the site, the underlying infrastructure has been largely reimplemented using
a new framework, resulting in a noticeable improvement in response times for most
operations.
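To make the SRU and CQL basis of the API noted above more concrete, here is a minimal sketch of how a searchRetrieve request URL is typically assembled. The parameter names (operation, version, query, startRecord, maximumRecords) come from the SRU standard itself; the endpoint address, the SRU version, and the helper name build_sru_url are placeholders and assumptions, since this page does not state them. The query is a bare CQL phrase; any DfR-specific index names would need to be taken from the API documentation.

```python
from urllib.parse import urlencode

# Placeholder endpoint: the real DfR API address is not given in this text.
SRU_BASE_URL = "http://example.org/sru"


def build_sru_url(cql_query, start_record=1, maximum_records=10,
                  base_url=SRU_BASE_URL):
    """Build an SRU searchRetrieve URL for a CQL query (illustrative helper)."""
    params = {
        "operation": "searchRetrieve",  # standard SRU operation name
        "version": "1.1",               # assumed; the service may use another version
        "query": cql_query,             # CQL query string, URL-encoded below
        "startRecord": start_record,    # 1-based index of the first record wanted
        "maximumRecords": maximum_records,
    }
    return base_url + "?" + urlencode(params)


# A quoted phrase is a valid CQL query on its own.
url = build_sru_url('"new york times"')
print(url)
```

Paging through a large result set would then be a matter of repeating the request with startRecord advanced by maximumRecords each time.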
About JSTOR
JSTOR is a not-for-profit
service that helps scholars, researchers, and students discover, use, and build upon a wide range of
content in a trusted digital archive. We use information technology and tools to increase productivity
and facilitate new forms of scholarship.
As of March 2010, the JSTOR archive contains approximately 6.2 million journal articles covering a broad range of
disciplines.
Data for Research has been developed by JSTOR's Advanced Technology Research (ATR) group.
The Advanced Technology Research Group is dedicated to discovering and using relevant technologies in
support of JSTOR and the broader scholarly community.
To view other projects built by ATR and the greater scholarly community,
please visit us at http://showcase.jstor.org.