New Algorithm Uses Online Learning for Massive Cell Data Sets

The method enables researchers to analyze millions of cells with the amount of memory found on a standard computer.

9:47 AM

Author | Kelly Malcom

Michigan Medicine

The fact that the human body is made up of cells is a basic, well-understood concept. Yet amazingly, scientists are still trying to determine the various types of cells that make up our organs and contribute to our health.

A relatively recent technique called single-cell sequencing is enabling researchers to recognize and categorize cell types by characteristics such as which genes they express. But this type of research generates enormous amounts of data, with datasets of hundreds of thousands to millions of cells.

A new algorithm developed by Joshua Welch, Ph.D., of the Department of Computational Medicine and Bioinformatics, Ph.D. candidate Chao Gao and their team uses online learning, greatly speeding up this process and providing a way for researchers worldwide to analyze large data sets using the amount of memory found on a standard laptop computer. The findings are described in the journal Nature Biotechnology.


"Our technique allows anyone with a computer to perform analyses at the scale of an entire organism," says Welch. "That's really what the field is moving towards."

The team demonstrated their proof of principle using data sets from the National Institutes of Health's BRAIN Initiative, a project aimed at understanding the human brain by mapping every cell. The project involves investigative teams throughout the country, including Welch's lab.

Typically, explains Welch, for projects like this one, each single-cell data set that is submitted must be re-analyzed with the previous data sets in the order they arrive. Their new approach allows new datasets to be added to existing ones, without reprocessing the older datasets. It also enables researchers to break up datasets into so-called mini-batches to reduce the amount of memory needed to process them.
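To make the mini-batch idea concrete, here is a minimal Python sketch. It is not the team's algorithm: as a much simpler stand-in, it keeps a running per-gene mean and variance (Welford's update) so that each mini-batch can be discarded after use and a later dataset can be folded in without revisiting earlier ones. The names `RunningGeneStats` and `mini_batches`, and the toy numbers, are hypothetical and chosen only for illustration.

```python
from typing import Iterable, Iterator, List, Sequence


class RunningGeneStats:
    """Per-gene mean and variance, updated one mini-batch at a time.

    Illustrative stand-in only: the published method learns a matrix
    factorization, not simple summary statistics.
    """

    def __init__(self, n_genes: int) -> None:
        self.n_cells = 0
        self.mean: List[float] = [0.0] * n_genes
        self.m2: List[float] = [0.0] * n_genes  # running sum of squared deviations

    def update(self, batch: Sequence[Sequence[float]]) -> None:
        """Fold one mini-batch (rows = cells, columns = genes) into the summary."""
        for cell in batch:
            self.n_cells += 1
            for g, x in enumerate(cell):
                delta = x - self.mean[g]
                self.mean[g] += delta / self.n_cells
                self.m2[g] += delta * (x - self.mean[g])

    def variance(self) -> List[float]:
        """Population variance per gene over every cell seen so far."""
        return [m / self.n_cells for m in self.m2]


def mini_batches(cells: Iterable[Sequence[float]], size: int) -> Iterator[list]:
    """Yield successive chunks of at most `size` cells."""
    batch = []
    for cell in cells:
        batch.append(cell)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


# Toy data: two "datasets" arriving at different times, two genes each.
dataset_a = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0]]
dataset_b = [[7.0, 6.0]]

stats = RunningGeneStats(n_genes=2)
for batch in mini_batches(dataset_a, size=2):
    stats.update(batch)  # only one mini-batch is held in memory at a time

# A new dataset arrives later: update the summary, no reprocessing of dataset_a.
for batch in mini_batches(dataset_b, size=2):
    stats.update(batch)

print(stats.n_cells, stats.mean, stats.variance())
```

The point of the sketch is the shape of the computation: the summary stays the same size no matter how many cells have been seen, which is what lets an online method run within a fixed memory budget.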

"This is crucial for the sets increasingly generated with millions of cells," Welch says. "This year, there have been five to six papers with two million cells or more and the amount of memory you need just to store the raw data is significantly more than anyone has on their computer."
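A rough back-of-the-envelope calculation shows the scale Welch is describing. The figures below are assumptions for illustration, not numbers from the paper: about 20,000 genes measured per cell, values stored densely as 4-byte numbers, and a hypothetical mini-batch of 5,000 cells. Real single-cell data are usually stored in sparse form, so the dense figure is an upper bound.

```python
def dense_matrix_gib(n_cells: int, n_genes: int, bytes_per_value: int = 4) -> float:
    """Memory needed for a dense cells-by-genes matrix, in GiB."""
    return n_cells * n_genes * bytes_per_value / 2**30


# Assumed sizes (illustrative only).
N_GENES = 20_000
full_gib = dense_matrix_gib(2_000_000, N_GENES)  # a two-million-cell data set
batch_gib = dense_matrix_gib(5_000, N_GENES)     # one hypothetical mini-batch

print(f"full data set: {full_gib:.1f} GiB, one mini-batch: {batch_gib:.2f} GiB")
```

Under these assumptions the full matrix comes to roughly 149 GiB, far beyond a typical laptop, while a single mini-batch needs well under 1 GiB.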

Welch likens the online technique to the continuous data processing done by social media platforms like Facebook and Twitter, which must process continuously generated data from users and serve up relevant posts to people's feeds. "Here, instead of people writing tweets, we have labs around the world performing experiments and releasing their data."


The finding has the potential to greatly improve efficiency for other ambitious projects like the Human Body Map and Human Cell Atlas. Says Welch, "Understanding the normal complement of cells in the body is the first step towards understanding how they go wrong in disease."

Paper cited: "Iterative single-cell multi-omic integration using online learning," Nature Biotechnology. DOI: 10.1038/s41587-021-00867-x

