Leading-edge digital initiatives at the National Institutes of Health (NIH) are building a modernized and integrated biomedical data ecosystem that supports academic researchers and other stakeholders addressing today’s challenging issues. In case you missed the lecture, hear all about them in this Data Citizens Distinguished Lecture Series session with Dr. Susan Gregurick.
“We are developing a common infrastructure and architecture, leveraging industry resources and adapting technologies and software from other fields to be used in biomedical research,” said Susan K. Gregurick, PhD, Associate Director for Data Science and Director of the Office of Data Science Strategy (ODSS) at the National Institutes of Health (NIH). “We also seek to enhance the biomedical data science workforce through improved programs and novel partnerships with other agencies, nonprofits, and industry.”
Gregurick spoke at a March 22 webinar sponsored by the University of Miami Institute for Data Science and Computing (IDSC). Her talk, “Connecting Data, Enhancing Software, and Creating a Digital Data Health Ecosystem,” was part of Data Citizens: A Distinguished Lecture Series, co-sponsored by the Miami Clinical and Translational Science Institute (CTSI). She was introduced by long-term research collaborator Yelena Yesha, PhD, Professor in the UM Department of Computer Science, IDSC Innovation Director, and the first IDSC Knight Chair of Data Science and Artificial Intelligence.
“We are committed to fair data sharing, streamlining access to data and creating interoperable solutions, while ensuring the security and confidentiality of data,” Gregurick said. “We want to capture, curate, validate, store, and analyze clinical data for biomedical research.”
Under Gregurick’s leadership, the ODSS leads the implementation of the NIH Strategic Plan for Data Science through scientific, technical, and operational collaboration with 27 institutes, centers, and offices at NIH. She noted that her office manages 163 petabytes of data across multiple clouds that can be accessed by authorized users. “We are rolling out a Cloud Lab Pilot, reducing barriers for researchers to work with us,” she said. “You can try different configurations and cloud opportunities before making a commitment.”
In her talk, Gregurick emphasized the importance of ethical frameworks for biomedical artificial intelligence (AI) and machine learning (ML) projects. “It is vital to avoid biases in the collection and employment of data,” she said. “Along with strong governance through standardized practices, we must address concerns about data privacy and security. That means thinking about the inherent fairness of data and providing training in ethics and accountability.”
Gregurick added that NIH has been working with a diverse group of scientists, patient advocates, legal scholars, and other professionals to discuss the ethical considerations related to the collection of data, the reuse of existing data, and the development of new models. The goal is to build multidisciplinary collaborations with various stakeholders to advance the use of ethics in AI and ML.
With the “Bridge2AI” initiative, NIH is promoting ethical best practices in generating flagship data sets and preparing AI/ML-friendly data in ways that address cultural challenges and build trust, according to Gregurick.
Later in the session, Swarup S. Swaminathan, MD, Assistant Professor of Clinical Ophthalmology and IDSC Member, asked about the challenges of sharing patient data from different institutions. “We take various approaches,” Gregurick said. “We put de-identified patient data in various enclaves and we can send the computation tools to the data, so the information doesn’t have to be shared outside the institution.”
Social Determinants of Health
One of NIH’s priorities is data science projects related to the social determinants of health. “We want to understand and advance our understanding of inequities in our health system by applying artificial intelligence and machine learning tools to the data,” Gregurick said. As an example, she pointed to racial differences in pulse oximetry measurements: low blood-oxygen levels go undetected more often in Black patients than in white patients.
Gregurick highlighted several other NIH initiatives, such as the “AIM-AHEAD” program (Artificial Intelligence/Machine Learning Consortium to Advance Health Equity and Researcher Diversity), which identifies research needs and gaps. NIH is recruiting consortium members, creating fellowship opportunities for trainees, and engaging community members.
NIH has also partnered with the National Science Foundation (NSF) on a “Smart and Connected Health” (SCH) program that includes projects related to wearables, sensors, imaging, and robotics. “We want to bring computer and biomedical scientists together,” she said.
Gregurick also outlined NIH’s digital-twin strategy, which involves using multimodal data integration and algorithms to model individual health and support the early diagnosis of disease. One example would be a cancer-patient digital twin that incorporates different types of data to provide a prognosis for an individual patient.
Finally, NIH is committed to preparing for future public health challenges, while building a pipeline of biomedical data science professionals through its STRIDES initiative (Science and Technology Research Infrastructure for Discovery, Experimentation, and Sustainability). Gregurick noted that NIH has devoted extensive data-science resources to the COVID-19 pandemic in the past two years, including the development of LitCovid, a curated literature hub. “We want to expedite drug delivery and discovery, and have supported plenty of studies on COVID’s spike protein,” Gregurick said. “We also want to be able to flip on a dime and dedicate major resources for the community when there is another crisis.”
STORY by Richard Westlund | PowerPoint Presentation
This Data Citizens: A Distinguished Lecture Series talk took place on Tuesday, March 22, 2022 (2:00-3:00 PM ET, via Zoom).
TALK TITLE: Connecting Data, Enhancing Software, and Creating a Digital Data Health Ecosystem
The NIH works to maximize opportunities that data science can bring to aid in our mission of improving the health and wellbeing of all US citizens. NIH’s strategic plan for data science serves as a north star for optimizing data storage, creating connected data resources, modernizing our data repository ecosystem, sustaining software development, expanding the national data science workforce, and supporting the NIH data management and sharing policy. Inspired by the strategic plan, and in partnership with colleagues from across the NIH Institutes and Centers, this talk highlights work to catalyze new capabilities in biomedical data science.
About Susan Gregurick
Susan K. Gregurick, Ph.D., was appointed Associate Director for Data Science and Director of the Office of Data Science Strategy (ODSS) at the National Institutes of Health (NIH) on September 16, 2019. Under Dr. Gregurick’s leadership, the ODSS leads the implementation of the NIH Strategic Plan for Data Science through scientific, technical, and operational collaboration with the institutes, centers, and offices that comprise NIH. Dr. Gregurick received the 2020 Leadership in Biological Sciences Award from the Washington Academy of Sciences for her work in this role. She was instrumental in the creation of the ODSS in 2018 and served as a senior advisor to the office until being named to her current position.