Science as a public good

Public Use and Public Funding of Science
Nature Human Behaviour (Forthcoming)


Yian Yin, Yuxiao Dong, Kuansan Wang, Dashun Wang, Benjamin F. Jones

Paper Summary

Knowledge of how science is consumed in public domains is essential for a deeper understanding of the role of science in human society. While science is heavily supported by public funding, common depictions suggest that scientific research remains an isolated or ‘ivory tower’ activity, with weak connectivity between scientific research activity and its public use, and little correspondence between the funding of science and its public use.

This paper examines public uses of science, the public funding of science, and how use and funding relate. Specifically, we integrate five large-scale datasets that link scientific publications from all scientific fields to their upstream funding support and downstream public uses across three public domains -- government documents, the news media, and marketplace invention.

We find that the public uses of science are extremely diverse, with different public domains drawing distinctively across scientific fields. Yet amidst these differences, we find key forms of alignment in the interface between science and society. First, despite concerns that the public does not engage with high-quality science, we find universal alignment, in each scientific field and public domain, between what the public consumes and what is highly impactful within science. Second, a scientific field’s public funding is closely aligned with the field’s collective public use.

Overall, public uses of science present a rich landscape of specialized consumption, yet collectively science and society interface with remarkable, quantifiable alignment between scientific use, public use, and funding.

Data & Code

Building on prior research that considers the use of science within a given public domain, here we integrate five large-scale datasets that link scientific publications from all scientific fields to their upstream funding support and downstream public uses across three public domains.

Our first dataset covers scientific publications and is drawn from the Microsoft Academic Graph (MAG), one of the largest bibliometric databases of scientific research in the world.

Our second dataset leverages the Microsoft Bing search engine to collect over 6 million government documents available online across all branches of the U.S. government. Using machine reading technology, we systematically identify academic publications that are referenced in these government documents and match these references to the MAG. This pipeline allows us to collect a large-scale dataset on how government documents consume scientific knowledge, including 389,896 unique academic publications cited by 43,014 government documents.
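The matching step can be illustrated with a toy example. The sketch below is not the machine reading pipeline used in the paper, which is not described here; it substitutes a much simpler technique, exact lookup on normalized titles, only to show what linking an extracted reference to a publication record involves. The function names, record IDs, and sample titles are all hypothetical.

```python
import re


def normalize_title(title):
    """Lowercase a title and collapse punctuation and whitespace into single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def match_references(reference_titles, title_index):
    """Map each reference title to a paper ID, or None if no exact normalized match exists."""
    return {t: title_index.get(normalize_title(t)) for t in reference_titles}


# Hypothetical publication records (ID -> title); not real MAG identifiers.
publication_titles = {
    "m1": "Public Use and Public Funding of Science",
    "m2": "An Example Title: With Punctuation",
}

# Build a lookup from normalized title to paper ID.
title_index = {normalize_title(t): pid for pid, t in publication_titles.items()}

# Hypothetical reference strings as they might appear in a government document.
references = ["public use and public funding of science.", "Unknown report"]
matches = match_references(references, title_index)
```

Exact matching on normalized titles misses references with typos, truncated titles, or missing words, so a real pipeline would likely need fuzzy matching plus checks on authors and year.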

Our third dataset uses Altmetric data to track academic publications covered by mainstream media reports. Matching these publications to the MAG data yields 724,849 unique papers covered by 2,701 media outlets.

Our fourth dataset links all patents granted by the United States Patent and Trademark Office (USPTO) to the academic papers they reference, yielding 4,276,940 papers cited by 1,932,642 patents.

Finally, we integrate funding records, using the Dimensions dataset, which includes 5 million projects funded by 400 funding agencies worldwide and links each funded project with its resulting publications.
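Once every dataset is keyed to the same publication identifiers, the integration reduces to joining on paper ID. The following is a minimal sketch under that assumption, using made-up records; the IDs, field labels, and the `summarize_by_field` helper are illustrative and do not reflect the released data layout.

```python
from collections import defaultdict

# Hypothetical toy records: paper ID -> scientific field.
papers = {
    "p1": "Biology",
    "p2": "Biology",
    "p3": "Computer science",
    "p4": "Economics",
}

# Hypothetical sets of paper IDs observed in each linked dataset.
cited_by_government = {"p1", "p4"}
covered_by_news = {"p1", "p2"}
cited_by_patents = {"p3"}
funded = {"p1", "p3"}


def summarize_by_field(papers, government, news, patents, funded):
    """Count, per field, the papers used in each public domain and the papers with funding."""
    counts = defaultdict(
        lambda: {"papers": 0, "government": 0, "news": 0, "patents": 0, "funded": 0}
    )
    for paper_id, field in papers.items():
        row = counts[field]
        row["papers"] += 1
        # Membership tests return booleans, which add as 0 or 1.
        row["government"] += paper_id in government
        row["news"] += paper_id in news
        row["patents"] += paper_id in patents
        row["funded"] += paper_id in funded
    return dict(counts)


summary = summarize_by_field(
    papers, cited_by_government, covered_by_news, cited_by_patents, funded
)
```

A per-field table of this shape is the kind of structure from which a field's share of public use in each domain can be compared with its share of funded papers.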

The de-identified data and code necessary to reproduce all plots and statistical analyses in this paper are freely available for download. MAG raw data is publicly available at https://docs.microsoft.com/en-us/academic-services/graph/. MAG-USPTO linkage data is publicly available at https://doi.org/10.5281/zenodo.3575146. Those interested in the raw Altmetric and Dimensions data should contact Digital Science directly.

Media Coverage
