Integrating multiple modalities of patient data can provide more powerful prognostic insights than any single modality alone, and transformers make multimodal models easier than ever to develop. Guillaume Jaume et al. combined whole slide images (WSIs) with bulk transcriptomics to predict survival. Their model, SurvPath, first tokenizes each modality; the tokens are then fused with transformer attention using both intra- and cross-modality terms. Because the large number of patches (and therefore tokens) in a WSI would make the memory requirement prohibitive, they dropped the intra-modality term for the WSI tokens. Finally, survival is predicted with binary predictors over a set of non-overlapping time intervals. Evaluated on five TCGA datasets, SurvPath usually outperformed single-modality baselines and other methods of integrating WSIs with transcriptomics.

paper: https://lnkd.in/etp75Dij
code: https://lnkd.in/eGcEJgWJ

If you enjoy these posts and want to hear more, sign up for my Computer Vision Insights newsletter: https://lnkd.in/g9bSuQDP

#Pathology #CancerResearch #PrecisionMedicine #MedicalImaging #MachineLearning #DeepLearning #ComputerVision
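The "binary predictors over non-overlapping time intervals" idea is a standard discrete-time survival head. Here is a minimal numpy sketch of that idea; the function name and the example logit values are illustrative assumptions, not taken from the SurvPath code.

```python
import numpy as np

def discrete_survival(interval_logits):
    """Turn per-interval logits into hazard and survival estimates.

    Hypothetical sketch of a discrete-time survival head: one binary
    predictor (hazard) per non-overlapping time interval.
    """
    # Sigmoid maps each interval's logit to a hazard probability in (0, 1)
    hazards = 1.0 / (1.0 + np.exp(-interval_logits))
    # Probability of surviving past interval k = product of (1 - hazard) up to k
    survival = np.cumprod(1.0 - hazards)
    # One common scalar risk score: negative sum of the survival curve
    risk = -np.sum(survival)
    return hazards, survival, risk

# Example: four intervals with increasing predicted hazard
logits = np.array([-2.0, -1.0, 0.0, 1.0])
hazards, survival, risk = discrete_survival(logits)
```

With hazards below 1, the survival curve is monotonically non-increasing, so higher risk scores correspond to patients predicted to die earlier.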
Thanks Heather Couture, PhD for sharing this paper. It's good to see transformer models not only coping with these multimodal challenges (including WSIs) but also delivering strong performance. I expect this will be an area of rapid progress.
Amazing, looking forward to catching up more when we meet in Morocco. :)
Insightful, and thanks for sharing.
Thank you for sharing! Your newsletter is so valuable!
Maichan Lau Willa Yim, PhD
Memory should not be such a problem; there are smart ways to deal with it, like the Gradient Accumulation used in this paper: https://arxiv.org/abs/2203.03981
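For readers unfamiliar with the technique mentioned above: gradient accumulation processes micro-batches one at a time (bounding peak memory) but averages their gradients before a single parameter update, so the step matches a large-batch update. A minimal numpy sketch, with a hypothetical least-squares gradient as the example objective:

```python
import numpy as np

def accumulated_sgd_step(w, micro_batches, grad_fn, lr=0.1):
    """One SGD step whose gradient is accumulated over micro-batches.

    Each micro-batch is processed separately (only one batch needs to be
    in memory at a time); gradients are averaged before the single update.
    """
    grad = np.zeros_like(w)
    for xb, yb in micro_batches:
        grad += grad_fn(w, xb, yb)   # gradient on one micro-batch
    grad /= len(micro_batches)       # average, as if one big batch
    return w - lr * grad

def lsq_grad(w, x, y):
    """Gradient of the mean squared error 0.5 * mean((x @ w - y)^2)."""
    return x.T @ (x @ w - y) / len(y)
```

Splitting a batch into equal-sized halves and accumulating gives the same parameter update as one pass over the full batch, which is exactly why the trick trades memory for a few extra forward/backward passes.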