Reviews & Analysis

A multi-task learning strategy to pretrain models for medical image analysis

Pretraining powerful deep learning models requires large, comprehensive training datasets, which are often unavailable for medical imaging. In response, the universal biomedical pretrained (UMedPT) foundational model was developed based on multiple small and medium-sized datasets. This model reduced the amount of data required to learn new target tasks by at least 50%.

Research Briefing 19 Jul 2024
Multi-task learning for medical foundation models

A multi-task approach is proposed to address the challenge of pretraining foundational models, which normally requires large datasets, thus helping to overcome the data scarcity problem in biomedical imaging.

Jiancheng Yang
News & Views 19 Jul 2024
Free-form metamaterials design with isotropic materials

A recent study proposes a computational method for the design of free-form metamaterial systems. The method simplifies the design process by avoiding the anisotropic materials that conventional methods usually require. It can be applied to the design of both two-dimensional and three-dimensional metamaterials that are subject to multiple physical fields.

Juan Manuel Restrepo-Flórez
News & Views 18 Jul 2024
Boosting graph neural networks with virtual nodes to predict phonon properties

A graph neural network using virtual nodes is proposed to predict the properties of complex materials with variable dimensions or dimensions that depend on the input. The method is used to accurately and quickly predict phonon dispersion relations in complex solids and alloys.

Research Briefing 16 Jul 2024
Unlocking T-cell receptor–epitope insights with structural analysis

A method leverages protein structural data to predict T-cell receptor–peptide interactions for unseen peptide epitopes, which can be particularly useful for applications in cancer immunotherapy, autoimmunity studies, and vaccine design.

Miaozhe Huo
Yuepeng Jiang
Shuai Cheng Li
News & Views 10 Jul 2024
Promising directions of machine learning for partial differential equations

Machine learning has enabled major advances in the field of partial differential equations. This Review discusses some of these efforts and other ongoing challenges and opportunities for development.

Steven L. Brunton
J. Nathan Kutz
Review Article 28 Jun 2024
The exciting potential and daunting challenge of using GPS human-mobility data for epidemic modeling

While large-scale GPS location datasets have been instrumental to applications in epidemiology, there are still several challenges with these data that should be considered and addressed to make data-driven epidemiology more reliable.

Francisco Barreras
Duncan J. Watts
Perspective 19 Jun 2024
A nonlinear dimension for machine learning in optical disordered media

A recent study shows that, by leveraging nonlinear optical processes in disordered media, photonic processors can transform high-dimensional machine-learning data, using nonlinear functions that are otherwise challenging for digital electronic processors to compute.

Tianyu Wang
News & Views 14 Jun 2024
Linguistics-based formalization of the antibody language as a basis for antibody language models

The parallels between natural language and antibody sequences could serve as a stepping stone to using deep language models for analyzing antibody sequences. This Perspective discusses how issues in antibody language model rule mining could be addressed by linguistically formalizing the antibody language.

Mai Ha Vu
Philippe A. Robert
Victor Greiff
Perspective 14 Jun 2024
Systematic simulations and analysis of transition states using committor functions

Data about the transition states of rare transitions between long-lived states are needed to simulate physical and chemical processes; however, existing computational approaches often gather little information about these states. A machine-learning technique resolves this challenge by exploiting the century-old theory of committor functions.

Research Briefing 05 Jun 2024
Outsourcing eureka moments to artificial intelligence

A two-stage learning algorithm is proposed to directly uncover the symbolic representation of rules for skill acquisition from large-scale training log data.

Martijn Meeter
News & Views 24 May 2024
Discrete latent embeddings illuminate cellular diversity in single-cell epigenomics

CASTLE, a deep learning approach, extracts interpretable discrete representations from single-cell chromatin accessibility data, enabling accurate cell type identification, effective data integration, and quantitative insights into gene regulatory mechanisms.

Zhi Wei
News & Views 24 May 2024
Designing semiconductor materials and devices in the post-Moore era by tackling computational challenges with data-driven strategies

Discovering improved semiconductor materials is essential for optimal device fabrication. In this Perspective, data-driven computational frameworks for semiconductor discovery and device development are discussed, including the challenges and opportunities moving forward.

Jiahao Xie
Yansong Zhou
Lijun Zhang
Perspective 23 May 2024
Shuffling haplotypes to share reference panels for imputation

We present a method to alleviate the re-identification risks associated with sharing haplotype reference panels for imputation. In an anonymized reference panel, one might try to infer the genomes’ phenotypes to re-identify their owners. Our method protects against such attacks by shuffling the reference panel’s genomes while maintaining imputation accuracy.

Research Briefing 22 May 2024
A multidimensional dataset for structure-based machine learning

MISATO, a dataset for structure-based drug discovery, combines quantum mechanics property data and molecular dynamics simulations on ~20,000 protein–ligand structures. It substantially extends the amount of data available to the community and holds potential for advancing work in drug discovery.

Matthew Holcomb
Stefano Forli
News & Views 14 May 2024
Advancements in multicellular simulations

Multicellular modeling is increasingly being used to understand biological systems. SimuCell3D is a tool for developing and running mechanically realistic simulations based on the deformable cell model.

Domenic P. J. Germano
James M. Osborne
News & Views 02 May 2024
In search of the most cooperative network

Cooperation is crucial for human prosperity, and population structure fosters it through pairwise interactions and coordinated behavior in larger groups. A recent study explores the evolution of behavioral strategies in higher-order population structures, including pairwise and multi-way interactions, and reveals that higher-order interactions promote cooperation across networks, especially when they are formed by conjoined communities.

Valerio Capraro
Matjaž Perc
News & Views 25 Apr 2024
Annotating cell types in single-cell ATAC data via the guidance of the underlying DNA sequences

SANGO efficiently removed batch effects between the query and reference single-cell ATAC signals by using the underlying genome sequences, enabling cell type assignment according to the reference data. The method achieved superior performance on diverse datasets and could detect unknown tumor cells, providing valuable functional biological signals.

Research Briefing 22 Apr 2024
Discovering metal complexes in vast chemical spaces

Approaches are needed to accelerate the discovery of transition metal complexes (TMCs), which is challenging owing to their vast chemical space. A large dataset of diverse ligands is now introduced and leveraged in a multiobjective genetic algorithm that enables the efficient optimization of TMCs in chemical spaces containing billions of them.

Research Briefing 18 Apr 2024
Digital twins in mechanical and aerospace engineering

While there is a clear opportunity for digital twins to bring value in mechanical and aerospace engineering, they must be considered as an asset in their own right so that their full potential can be realized.

Alberto Ferrari
Karen Willcox
Perspective 26 Mar 2024