Extended Data Fig. 1: Overview of high-throughput bNMF pipeline for multi-ancestry (MA) clusters.
From: Multi-ancestry polygenic mechanisms of type 2 diabetes
![Extended Data Fig. 1](https://cdn.statically.io/img/media.springernature.com/full/springer-static/esm/art%3A10.1038%2Fs41591-024-02865-3/MediaObjects/41591_2024_2865_Fig5_ESM.jpg)
Flowchart of the steps implemented in the clustering pipeline. Steps include: 1) extract variants from a diverse set of T2D GWAS datasets, 2) apply LD-pruning across reference panels for all populations included, to ensure independent genetic signals, 3) find proxy variants for variants that are multi-allelic, ambiguous, or have low trait counts, 4) align variants to risk-increasing alleles in the largest MA T2D GWAS and remove those whose P value in this GWAS does not meet a Bonferroni threshold, 5) filter trait GWAS by minimum sample size, 6) filter trait GWAS by a minimum Bonferroni-corrected P value across the selected variants, 7) filter traits by the correlation between them and 8) generate the variant-by-trait association matrix, which serves as the bNMF input.
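
Steps 5 to 8 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`select_traits`, `build_matrix`), the parameters `min_n`, `alpha` and `max_abs_r`, the greedy pruning order in step 7 and the toy z-scores and P values are all assumptions made for illustration. The caption does not state the actual thresholds or how correlated traits are resolved.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    denom = sqrt(vx * vy)
    return cov / denom if denom > 0 else 0.0


def select_traits(z, p, n_samples, min_n, alpha, max_abs_r):
    """Hypothetical trait filter covering steps 5 to 7.

    z, p      : dict trait -> per-variant z-scores / P values (same variant order)
    n_samples : dict trait -> GWAS sample size
    """
    n_variants = len(next(iter(z.values())))
    bonferroni = alpha / n_variants  # correct for the number of selected variants

    # Step 5 (minimum sample size) and step 6 (at least one variant passes Bonferroni).
    kept = [t for t in z if n_samples[t] >= min_n and min(p[t]) < bonferroni]

    # Step 7 (assumed strategy): visit traits from strongest to weakest association
    # and drop any trait too correlated with one that is already selected.
    kept.sort(key=lambda t: min(p[t]))
    selected = []
    for t in kept:
        if all(abs(pearson(z[t], z[s])) < max_abs_r for s in selected):
            selected.append(t)
    return selected


def build_matrix(z, traits):
    """Step 8: rows are variants, columns are the selected traits."""
    if not traits:
        return []
    n_variants = len(z[traits[0]])
    return [[z[t][i] for t in traits] for i in range(n_variants)]


# Toy data for four variants (values are invented, not taken from the study).
z = {
    "BMI": [4.0, -1.0, 2.0, 0.5],
    "WHR": [4.1, -0.9, 2.1, 0.4],   # nearly collinear with BMI
    "HDL": [-1.0, 5.0, 1.0, -1.0],
    "tiny": [6.0, 0.1, 0.2, 0.3],   # strong signal but small GWAS
    "null": [0.1, -0.2, 0.3, 0.0],  # no significant association
}
p = {
    "BMI": [6e-5, 0.3, 0.05, 0.6],
    "WHR": [4e-5, 0.4, 0.04, 0.7],
    "HDL": [0.3, 6e-7, 0.3, 0.3],
    "tiny": [2e-9, 0.9, 0.8, 0.7],
    "null": [0.9, 0.8, 0.7, 1.0],
}
n_samples = {"BMI": 50000, "WHR": 40000, "HDL": 30000, "tiny": 500, "null": 60000}

selected = select_traits(z, p, n_samples, min_n=5000, alpha=0.05, max_abs_r=0.85)
matrix = build_matrix(z, selected)
print(selected)
print(matrix)
```

In this toy run, `tiny` is removed by the sample-size filter, `null` by the Bonferroni filter, and `BMI` by the correlation filter because it is nearly collinear with the already selected `WHR`. Visiting traits in order of their smallest P value is one possible design choice; a real pipeline might instead keep the trait with the larger sample size or prune by clustering.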