Abstract
Transposon-encoded tnpB and iscB genes encode RNA-guided DNA nucleases that promote their own selfish spread through targeted DNA cleavage and homologous recombination (refs. 1–4). These widespread gene families were repeatedly domesticated over evolutionary timescales, leading to the emergence of diverse CRISPR-associated nucleases including Cas9 and Cas12 (refs. 5,6). We set out to test the hypothesis that TnpB nucleases may have also been repurposed for novel, unexpected functions other than CRISPR–Cas adaptive immunity. Here, using phylogenetics, structural predictions, comparative genomics and functional assays, we uncover multiple independent genesis events of programmable transcription factors, which we name TnpB-like nuclease-dead repressors (TldRs). These proteins use naturally occurring guide RNAs to specifically target conserved promoter regions of the genome, leading to potent gene repression in a mechanism akin to CRISPR interference technologies invented by humans (ref. 7). Focusing on a TldR clade found broadly in Enterobacteriaceae, we discover that bacteriophages exploit the combined action of TldR and an adjacently encoded phage gene to alter the expression and composition of the host flagellar assembly, a transformation with the potential to impact motility (ref. 8), phage susceptibility (ref. 9), and host immunity (ref. 10). Collectively, this work showcases the diverse molecular innovations that were enabled through repeated exaptation of transposon-encoded genes, and reveals the evolutionary trajectory of diverse RNA-guided transcription factors.
Data availability
Next-generation sequencing data generated in this study were deposited in the NCBI SRA (BioProject accession PRJNA1029663) and GEO (GSE245749). The published genome used for ChIP–seq analyses was obtained from NCBI (GenBank NC_000913.3). Publicly available RNA-seq data analysed for TldR–gRNA expression are in the NCBI SRA (ERR6044061) and GEO (GSE115009) databases. The published genomes used for bioinformatics analyses were obtained from NCBI (Supplementary Table 4). The ISfinder database can be accessed at https://www-is.biotoul.fr/index.php.
Code availability
Custom scripts used for bioinformatics, TAM library analyses, and ChIP–seq data analyses are available on request. The R script describing initial steps to discover TldRs is available at https://github.com/sternberglab/Wiegand_etal_2024.
References
Altae-Tran, H. et al. The widespread IS200/IS605 transposon family encodes diverse programmable RNA-guided endonucleases. Science 374, 57–65 (2021).
Karvelis, T. et al. Transposon-associated TnpB is a programmable RNA-guided DNA endonuclease. Nature 599, 692–696 (2021).
Meers, C. et al. Transposon-encoded nucleases use guide RNAs to promote their selfish spread. Nature 622, 863–871 (2023).
Zedaveinyte, R. et al. Antagonistic conflict between transposon-encoded introns and guide RNAs. Preprint at bioRxiv https://doi.org/10.1101/2023.11.20.567912 (2023).
Kapitonov, V. V., Makarova, K. S. & Koonin, E. V. ISC, a novel group of bacterial and archaeal DNA transposons that encode Cas9 homologs. J. Bacteriol. 198, 797–807 (2015).
Chylinski, K., Makarova, K. S., Charpentier, E. & Koonin, E. V. Classification and evolution of type II CRISPR–Cas systems. Nucleic Acids Res. 42, 6091–6105 (2014).
Larson, M. H. et al. CRISPR interference (CRISPRi) for sequence-specific control of gene expression. Nat. Protoc. 8, 2180–2196 (2013).
Nakamura, S. & Minamino, T. Flagella-driven motility of bacteria. Biomolecules 9, 279 (2019).
Samuel, A. D. et al. Flagellar determinants of bacterial sensitivity to χ-phage. Proc. Natl Acad. Sci. USA 96, 9863–9866 (1999).
Wilson, D. R. & Beveridge, T. J. Bacterial flagellar filaments and their component flagellins. Can. J. Microbiol. 39, 451–472 (1993).
Aziz, R. K., Breitbart, M. & Edwards, R. A. Transposases are the most abundant, most ubiquitous genes in nature. Nucleic Acids Res. 38, 4207–4217 (2010).
Cosby, R. L., Chang, N. C. & Feschotte, C. Host–transposon interactions: conflict, cooperation, and cooption. Genes Dev. 33, 1098–1116 (2019).
Jangam, D., Feschotte, C. & Betran, E. Transposable element domestication as an adaptation to evolutionary conflicts. Trends Genet. 33, 817–831 (2017).
Gould, S. J. & Vrba, E. S. Exaptation — a missing term in the science of form. Paleobiology 8, 4–15 (1982).
Koonin, E. V. & Makarova, K. S. Mobile genetic elements and evolution of CRISPR–Cas systems: all the way there and back. Genome Biol. Evol. 9, 2812–2825 (2017).
Koonin, E. V. & Makarova, K. S. Origins and evolution of CRISPR–Cas systems. Phil. Trans. R. Soc. B 374, 20180087 (2019).
Nunez, J. K., Lee, A. S., Engelman, A. & Doudna, J. A. Integrase-mediated spacer acquisition during CRISPR–Cas adaptive immunity. Nature 519, 193–198 (2015).
McGinn, J. & Marraffini, L. A. Molecular mechanisms of CRISPR–Cas spacer acquisition. Nat. Rev. Microbiol. 17, 7–12 (2019).
Krupovic, M., Makarova, K. S., Forterre, P., Prangishvili, D. & Koonin, E. V. Casposons: a new superfamily of self-synthesizing DNA transposons at the origin of prokaryotic CRISPR–Cas immunity. BMC Biol. 12, 36 (2014).
Hickman, A. B., Kailasan, S., Genzor, P., Haase, A. D. & Dyda, F. Casposase structure and the mechanistic link between DNA transposition and spacer acquisition by CRISPR–Cas. eLife 9, e50004 (2020).
Klompe, S. E., Vo, P. L. H., Halpin-Healy, T. S. & Sternberg, S. H. Transposon-encoded CRISPR–Cas systems direct RNA-guided DNA integration. Nature 571, 219–225 (2019).
Strecker, J. et al. RNA-guided DNA insertion with CRISPR-associated transposases. Science 365, 48–53 (2019).
Faure, G. et al. CRISPR–Cas in mobile genetic elements: counter-defence and beyond. Nat. Rev. Microbiol. 17, 513–525 (2019).
Seed, K. D., Lazinski, D. W., Calderwood, S. B. & Camilli, A. A bacteriophage encodes its own CRISPR/Cas adaptive response to evade host innate immunity. Nature 494, 489–491 (2013).
Al-Shayeb, B. et al. Clades of huge phages from across Earth’s ecosystems. Nature 578, 425–431 (2020).
Frost, L. S., Leplae, R., Summers, A. O. & Toussaint, A. Mobile genetic elements: the agents of open source evolution. Nat. Rev. Microbiol. 3, 722–732 (2005).
Bao, W. & Jurka, J. Homologues of bacterial TnpB_IS605 are widespread in diverse eukaryotic transposable elements. Mob. DNA 4, 12 (2013).
Jiang, K. et al. Programmable RNA-guided DNA endonucleases are widespread in eukaryotes and their viruses. Sci. Adv. 9, eadk0171 (2023).
Saito, M. et al. Fanzor is a eukaryotic programmable RNA-guided endonuclease. Nature 620, 660–668 (2023).
Siguier, P., Gourbeyre, E., Varani, A., Ton-Hoang, B. & Chandler, M. Everyman’s guide to bacterial insertion sequences. Microbiol. Spectr. 3, MDNA3-0030-2014 (2015).
Altae-Tran, H. et al. Diversity, evolution, and classification of the RNA-guided nucleases TnpB and Cas12. Proc. Natl Acad. Sci. USA 120, e2308224120 (2023).
Makarova, K. S. et al. Evolutionary classification of CRISPR–Cas systems: a burst of class 2 and derived variants. Nat. Rev. Microbiol. 18, 67–83 (2020).
Huang, C. J., Adler, B. A. & Doudna, J. A. A naturally DNase-free CRISPR–Cas12c enzyme silences gene expression. Mol. Cell 82, 2148–2160.e4 (2022).
Wu, W. Y. et al. The miniature CRISPR–Cas12m effector binds DNA to block transcription. Mol. Cell 82, 4487–4502.e7 (2022).
Peters, J. E., Makarova, K. S., Shmakov, S. & Koonin, E. V. Recruitment of CRISPR–Cas systems by Tn7-like transposons. Proc. Natl Acad. Sci. USA 114, E7358–E7366 (2017).
Nakamura, M., Gao, Y., Dominguez, A. A. & Qi, L. S. CRISPR technologies for precise epigenome editing. Nat. Cell Biol. 23, 11–22 (2021).
Zetsche, B. et al. Cpf1 is a single RNA-guided endonuclease of a class 2 CRISPR–Cas system. Cell 163, 759–771 (2015).
He, S. et al. The IS200/IS605 family and “peel and paste” single-strand transposition mechanism. Microbiol. Spectr. https://doi.org/10.1128/microbiolspec.MDNA3-0039-2014 (2015).
Doeven, M. K., van den Bogaart, G., Krasnikov, V. & Poolman, B. Probing receptor–translocator interactions in the oligopeptide ABC transporter by fluorescence correlation spectroscopy. Biophys. J. 94, 3956–3965 (2008).
Biemans-Oldehinkel, E., Doeven, M. K. & Poolman, B. ABC transporter architecture and regulatory roles of accessory domains. FEBS Lett. 580, 1023–1035 (2006).
Mukherjee, S. et al. CsrA–FliW interaction governs flagellin homeostasis and a checkpoint on flagellar morphogenesis in Bacillus subtilis. Mol. Microbiol. 82, 447–461 (2011).
Lawrence, J. G. Shared strategies in gene organization among prokaryotes and eukaryotes. Cell 110, 407–413 (2002).
Siguier, P., Perochon, J., Lestrade, L., Mahillon, J. & Chandler, M. ISfinder: the reference centre for bacterial insertion sequences. Nucleic Acids Res. 34, D32–D36 (2006).
Leenay, R. T. & Beisel, C. L. Deciphering, communicating, and engineering the CRISPR PAM. J. Mol. Biol. 429, 177–191 (2017).
Michaux, C. et al. Single-nucleotide RNA maps for the two major nosocomial pathogens Enterococcus faecalis and Enterococcus faecium. Front. Cell. Infect. Microbiol. 10, 600325 (2020).
Nety, S. P. et al. The transposon-encoded protein TnpB processes its own mRNA into ωRNA for guided nuclease activity. CRISPR J. 6, 232–242 (2023).
Ohnishi, K., Kutsukake, K., Suzuki, H. & Iino, T. Gene fliA encodes an alternative sigma factor specific for flagellar operons in Salmonella typhimurium. Mol. Gen. Genet. 221, 139–147 (1990).
Ide, N., Ikebe, T. & Kutsukake, K. Reevaluation of the promoter structure of the class 3 flagellar operons of Escherichia coli and Salmonella. Genes Genet. Syst. 74, 113–116 (1999).
Klepsch, M. M. et al. Escherichia coli peptide binding protein OppA has a preference for positively charged peptides. J. Mol. Biol. 414, 75–85 (2011).
Solovyev, V. A. S. in Metagenomics and its Applications in Agriculture, Biomedicine and Environmental Studies (ed Li, R. W.) 61–78 (Nova Science Publishers, 2011).
Swarts, D. C., van der Oost, J. & Jinek, M. Structural basis for guide RNA processing and seed-dependent DNA targeting by CRISPR–Cas12a. Mol. Cell 66, 221–233.e4 (2017).
Qi, L. S. et al. Repurposing CRISPR as an RNA-guided platform for sequence-specific control of gene expression. Cell 152, 1173–1183 (2013).
Zhang, X. et al. Multiplex gene regulation by CRISPR–ddCpf1. Cell Discov. 3, 17018 (2017).
Kim, S. K. et al. Efficient transcriptional gene repression by type V-A CRISPR–Cpf1 from Eubacterium eligens. ACS Synth. Biol. 6, 1273–1282 (2017).
Samatey, F. A. et al. Structure of the bacterial flagellar protofilament and implications for a switch for supercoiling. Nature 410, 331–337 (2001).
Yonekura, K., Maki-Yonekura, S. & Namba, K. Complete atomic model of the bacterial flagellar filament by electron cryomicroscopy. Nature 424, 643–650 (2003).
Reid, S. D., Selander, R. K. & Whittam, T. S. Sequence diversity of flagellin (fliC) alleles in pathogenic Escherichia coli. J. Bacteriol. 181, 153–160 (1999).
Esteves, N. C., Bigham, D. N. & Scharf, B. E. Phages on filaments: a genetic screen elucidates the complex interactions between Salmonella enterica flagellin and bacteriophage Chi. PLoS Pathog. 19, e1011537 (2023).
Guttenplan, S. B. & Kearns, D. B. Regulation of flagellar motility during biofilm formation. FEMS Microbiol. Rev. 37, 849–871 (2013).
Dacquay, L. C. et al. E. coli Nissle increases transcription of flagella assembly and formate hydrogenlyase genes in response to colitis. Gut Microbes 13, 1994832 (2021).
Kim, M. J., Lim, S. & Ryu, S. Molecular analysis of the Salmonella typhimurium tdc operon regulation. J. Microbiol. Biotechnol. 18, 1024–1032 (2008).
Esteves, N. C. & Scharf, B. E. Flagellotropic bacteriophages: opportunities and challenges for antimicrobial applications. Int. J. Mol. Sci. 23, 7084 (2022).
Yoon, S. I. et al. Structural basis of TLR5-flagellin recognition and signaling. Science 335, 859–864 (2012).
Tenthorey, J. L. et al. The structural basis of flagellin detection by NAIP5: a strategy to limit pathogen immune evasion. Science 358, 888–893 (2017).
Wang, L., Rothemund, D., Curd, H. & Reeves, P. R. Species-wide variation in the Escherichia coli flagellin (H-antigen) gene. J. Bacteriol. 185, 2936–2943 (2003).
Cullender, T. C. et al. Innate and adaptive immunity interact to quench microbiome flagellar motility in the gut. Cell Host Microbe 14, 571–581 (2013).
Brussow, H., Canchaya, C. & Hardt, W. D. Phages and the evolution of bacterial pathogens: from genomic rearrangements to lysogenic conversion. Microbiol. Mol. Biol. Rev. 68, 560–602 (2004).
Rees, D. C., Johnson, E. & Lewinson, O. ABC transporters: the power to change. Nat. Rev. Mol. Cell Biol. 10, 218–227 (2009).
Holtzman, L. & Gersbach, C. A. Editing the epigenome: reshaping the genomic landscape. Annu. Rev. Genomics Hum. Genet. 19, 43–71 (2018).
Bikard, D. et al. Programmable repression and activation of bacterial gene expression using an engineered CRISPR–Cas system. Nucleic Acids Res. 41, 7429–7437 (2013).
Cui, L. et al. A CRISPRi screen in E. coli reveals sequence-specific toxicity of dCas9. Nat. Commun. 9, 1912 (2018).
Vigouroux, A., Oldewurtel, E., Cui, L., Bikard, D. & van Teeffelen, S. Tuning dCas9’s ability to block transcription enables robust, noiseless knockdown of bacterial genes. Mol. Syst. Biol. 14, e7899 (2018).
Workman, R. E. et al. A natural single-guide RNA repurposes Cas9 to autoregulate CRISPR–Cas expression. Cell 184, 675–688.e19 (2021).
Ratner, H. K. et al. Catalytically active Cas9 mediates transcriptional interference to facilitate bacterial virulence. Mol. Cell 75, 498–510.e5 (2019).
Eddy, S. R. Accelerated profile HMM searches. PLoS Comput. Biol. 7, e1002195 (2011).
Huang, Y., Niu, B., Gao, Y., Fu, L. & Li, W. CD-HIT Suite: a web server for clustering and comparing biological sequences. Bioinformatics 26, 680–682 (2010).
Katoh, K., Kuma, K., Toh, H. & Miyata, T. MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Res. 33, 511–518 (2005).
Capella-Gutierrez, S., Silla-Martinez, J. M. & Gabaldon, T. trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics 25, 1972–1973 (2009).
Minh, B. Q. et al. IQ-TREE 2: new models and efficient methods for phylogenetic inference in the genomic era. Mol. Biol. Evol. 37, 1530–1534 (2020).
Letunic, I. & Bork, P. Interactive Tree of Life (iTOL) v5: an online tool for phylogenetic tree display and annotation. Nucleic Acids Res. 49, W293–W296 (2021).
Wright, E. S. DECIPHER: harnessing local sequence context to improve protein multiple sequence alignment. BMC Bioinformatics 16, 322 (2015).
Biostrings: String objects representing biological sequences, and matching algorithms. R package version 2.70.1 (Pagès, H. A. P., Gentleman, R. & DebRoy, S., 2023).
Camacho, C. et al. BLAST+: architecture and applications. BMC Bioinformatics 10, 421 (2009).
Cantalapiedra, C. P., Hernandez-Plaza, A., Letunic, I., Bork, P. & Huerta-Cepas, J. eggNOG-mapper v2: functional annotation, orthology assignments, and domain prediction at the metagenomic scale. Mol. Biol. Evol. 38, 5825–5829 (2021).
Hyatt, D. et al. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 11, 119 (2010).
Price, M. N., Dehal, P. S. & Arkin, A. P. FastTree 2 — approximately maximum-likelihood trees for large alignments. PLoS ONE 5, e9490 (2010).
Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).
Goddard, T. D. et al. UCSF ChimeraX: meeting modern challenges in visualization and analysis. Protein Sci. 27, 14–25 (2018).
Waterhouse, A. M., Procter, J. B., Martin, D. M., Clamp, M. & Barton, G. J. Jalview version 2 — a multiple sequence alignment editor and analysis workbench. Bioinformatics 25, 1189–1191 (2009).
Mirdita, M. et al. ColabFold: making protein folding accessible to all. Nat. Methods 19, 679–682 (2022).
Pei, J., Kim, B. H. & Grishin, N. V. PROMALS3D: a tool for multiple protein sequence and structure alignments. Nucleic Acids Res. 36, 2295–2300 (2008).
Nawrocki, E. P. & Eddy, S. R. Infernal 1.1: 100-fold faster RNA homology searches. Bioinformatics 29, 2933–2935 (2013).
Will, S., Joshi, T., Hofacker, I. L., Stadler, P. F. & Backofen, R. LocARNA-P: accurate boundary prediction and improved detection of structural RNAs. RNA 18, 900–914 (2012).
Vasimuddin M., Misra, S., Li, H. & Aluru, S. Efficient architecture-aware acceleration of BWA-MEM for multicore systems. In 2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS) https://doi.org/10.1109/IPDPS.2019.00041 (IEEE, 2019).
Danecek, P. et al. Twelve years of SAMtools and BCFtools. Gigascience 10, giab008 (2021).
Ramirez, F. et al. deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Res. 44, W160–W165 (2016).
Robinson, J. T. et al. Integrative genomics viewer. Nat. Biotechnol. 29, 24–26 (2011).
Hoffmann, F. T. et al. Selective TnsC recruitment enhances the fidelity of RNA-guided transposition. Nature 609, 384–393 (2022).
Zhang, Y. et al. Model-based analysis of ChIP–seq (MACS). Genome Biol. 9, R137 (2008).
Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).
Bailey, T. L. et al. MEME SUITE: tools for motif discovery and searching. Nucleic Acids Res. 37, W202–W208 (2009).
Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal https://doi.org/10.14806/ej.17.1.200 (2011).
Liao, Y., Smyth, G. K. & Shi, W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30, 923–930 (2014).
Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).
Sharan, S. K., Thomason, L. C., Kuznetsov, S. G. & Court, D. L. Recombineering: a homologous recombination-based method of genetic engineering. Nat. Protoc. 4, 206–223 (2009).
Luo, G. et al. flrA, flrB and flrC regulate adhesion by controlling the expression of critical virulence genes in Vibrio alginolyticus. Emerg. Microbes Infect. 5, e85 (2016).
Kreutzberger, M. A. B. et al. Flagellin outer domain dimerization modulates motility in pathogenic and soil bacteria from viscous environments. Nat. Commun. 13, 1422 (2022).
Shevchenko, A., Tomas, H., Havlis, J., Olsen, J. V. & Mann, M. In-gel digestion for mass spectrometric characterization of proteins and proteomes. Nat. Protoc. 1, 2856–2860 (2006).
Kulak, N. A., Pichler, G., Paron, I., Nagaraj, N. & Mann, M. Minimal, encapsulated proteomic-sample processing applied to copy-number estimation in eukaryotic cells. Nat. Methods 11, 319–324 (2014).
Meier, F. et al. Online parallel accumulation-serial fragmentation (PASEF) with a novel trapped ion mobility mass spectrometer. Mol. Cell. Proteomics 17, 2534–2545 (2018).
Cox, J. et al. Andromeda: a peptide search engine integrated into the MaxQuant environment. J. Proteome Res. 10, 1794–1805 (2011).
Acknowledgements
We thank S. R. Pesari and Z. Akhtar for laboratory support; G. D. Lampe for suggesting the TldR moniker; A. Bernheim for helpful discussions; F. Tesson, A. Bernheim, A. M. Earl and D. Gray for sharing E. coli and Enterobacter strains; C. Lu for Covaris sonicator access; R. K. Soni for mass spectrometry support; L. F. Landweber for qPCR instrument access; and the JP Sulzberger Columbia Genome Center for next-generation sequencing support. S.T. was supported by a Medical Scientist Training Program grant (5T32GM145440-02) from the NIH. M.W.G.W. was supported by a National Science Foundation Graduate Research Fellowship. C.M. was supported by the NIH Postdoctoral Fellowship F32 GM143924-01A1. S.H.S. was supported by the NSF Faculty Early Career Development Program (CAREER) Award 2239685, a Pew Biomedical Scholarship, an Irma T. Hirschl Career Scientist Award, and a startup package from the Columbia University Irving Medical Center Dean’s Office and the Vagelos Precision Medicine Fund.
Author information
Contributions
T.W., C.M., and S.H.S. conceived and designed the project. T.W. performed all of the bioinformatics experiments and aided in the design of the experimental assays. F.T.H. performed plasmid interference, ChIP–seq, and the RFP repression assays. M.W.G.W. designed and generated the E. coli strains and plasmids for the RFP repression assays, fragments for Enterobacter recombineering, conducted the motility assays and isolated flagella for liquid chromatography with tandem mass spectrometry. S.T. performed and analysed the RNA-seq and RIP-seq experiments. E.R. cultured Enterobacter strains, extracted RNA for RNA-seq, and performed the RT–qPCR and recombineering experiments. C.M. performed the preliminary TnpB bioinformatics and neighbourhood analyses, together with H.C.L., and helped design the ChIP–seq and RFP repression assays. T.W. and S.H.S. discussed the data and wrote the manuscript, with input from all authors.
Ethics declarations
Competing interests
Columbia University has filed a patent application related to this work. M.W.G.W. is a co-founder of Can9 Bioengineering. S.H.S. is a co-founder and scientific advisor to Dahlia Biosciences, a scientific advisor to CrisprBits and Prime Medicine, and an equity holder in Dahlia Biosciences and CrisprBits. All other authors declare no competing interests.
Peer review
Peer review information
Nature thanks Wen Wu and the other, anonymous reviewer(s) for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data figures and tables
Extended Data Fig. 1 Phylogeny and RuvC nuclease domain analysis of oppF-associated TldRs.
a, Phylogenetic tree of oppF-associated TldR proteins from Fig. 2a, together with closely related TnpB proteins that contain intact RuvC active sites. The rings indicate RuvC DED active site intactness (inner) and TldR/TnpB domain composition (outer). Homologs marked with an orange square (TnpB) or green circle (TldR) were tested in heterologous experiments. b, Multiple sequence alignment of representative TnpB and TldR sequences from a, highlighting deterioration of RuvC active site motifs (shaded in red) and loss of the C-terminal zinc-finger (ZnF)/RuvC domain. Highly conserved residues are shaded in grey. c, Empirical (DraTnpB) and predicted AlphaFold structures of TnpB and TldR homologs marked with an asterisk in b, showing progressive loss of the active site catalytic triad.
Extended Data Fig. 2 Diverse prophages encode fliCP-associated tldR genes.
a, Genomic architecture of representative prophage elements whose boundaries could be identified by comparison to closely related isogenic strains. In each example, the prophage-containing strain is shown above the prophage-lacking strain, with species/strain names and NCBI genomic accession IDs indicated. Sequences flanking the left (5′) and right (3′) ends are highlighted in purple and yellow, respectively, together with their percentage sequence identities calculated using BLASTn. b, Alignment of distinct prophage elements, constructed using Mauve. Empty boxes represent open reading frames, and windows show sequence conservation for regions compared between prophage genomes with lines. Putative gene functions are shown below sequence conservation windows for the fliCP-tldR-encoding prophage from Enterobacter AR_163 (bottom). c, DNA sequence identities between the prophages in a, calculated with BLASTn. Identities were calculated as total matching nucleotides across the two genomes being compared, divided by the length of the query prophage genome.
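The identity calculation described for panel c can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' script: the function name `prophage_identity`, the per-hit match counts and the genome length are all hypothetical, and the hits are assumed not to overlap on the query genome.

```python
def prophage_identity(matching_nt_per_hit, query_length):
    """Percent identity as described in the legend: total matching
    nucleotides divided by the length of the query prophage genome.

    Assumption (not from the source): the hits passed in do not overlap
    on the query, so their match counts can simply be summed.
    """
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    total_matches = sum(matching_nt_per_hit)
    return 100.0 * total_matches / query_length


# Hypothetical example: three BLASTn hits against a 40,000-bp query prophage.
identity = prophage_identity([12000, 8000, 5000], 40000)
print(f"{identity:.1f}% identity")
```

Because the denominator is the query length, the value is not symmetric: swapping query and subject prophages of different lengths gives a different percentage.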
Extended Data Fig. 3 RIP-seq reveals that some oppF-associated TldR proteins use short, 9–11-nt guides.
a, RNA immunoprecipitation sequencing (RIP-seq) data for an oppF-associated TldR homolog from Enterococcus faecalis (Efa1TldR) reveals the boundaries of a mature gRNA containing a 9-nt guide sequence. Reads were mapped to the TldR-gRNA expression plasmid; an input control is shown. b, Published RNA-seq data for Enterococcus faecalis V583 reveals similar gRNA boundaries, including an approximately 11-nt guide. c, RIP-seq data as in a for a second biological replicate of Efa1TldR, further corroborating the observed 9–11-nt guide length.
Extended Data Fig. 4 oppF-associated TldRs target conserved genomic sequences that overlap with promoter elements driving oppA expression.
a, Schematic of original (left) and improved (right) search strategy to identify putative targets of gRNAs used by oppF-associated TldRs. Key insights resulted from the use of TAM and a shorter, 9-nt guide. b, Analysis of the guide sequence from the Efa1TldR-associated gRNA in Extended Data Fig. 3 revealed a putative genomic target near the predicted promoter of oppA encoded within the same ABC transporter operon immediately adjacent to the tldR gene. The magnified schematics at the bottom show the predicted TAM and gRNA-target DNA base-pairing interactions for two representatives (Efa1TldR and EceTldR), in which the gRNAs target opposite strands. Promoter elements predicted with BPROM are shown as brown squares. c, WebLogos of predicted guides and genomic targets associated with diverse oppF-associated TldRs highlighted in Extended Data Fig. 1. d, Schematic of the oppF-tldR genomic locus (left) alongside the predicted function of OppA as a solute binding protein that facilitates transport of polypeptide substrates from the periplasm to the cytoplasm, in complex with the remainder of the ABC transporter apparatus. e, Published RNA-seq data for Enterococcus faecium AUS000445, highlighting the oppA transcription start site (TSS). The predicted gRNA guide sequence (grey) is shown beneath the putative TAM (yellow) and target (purple) sequences, with guide-target complementarity represented by grey circles.
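The improved search strategy in panel a, which combines a TAM with a short 9-nt guide, can be illustrated with a small sketch. Everything here is an assumption made for illustration: the function names, the placement of the TAM immediately 5′ of the target on the same strand, and the toy TAM, guide and genome sequences are hypothetical and do not come from the study's data.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def find_tam_flanked_targets(genome, tam, guide_dna, max_mismatches=0):
    """Scan both strands for a TAM immediately followed by a guide match.

    guide_dna is the guide written as DNA on the TAM-containing strand.
    Returns (start, strand, mismatches) tuples, where start is the 0-based
    forward-strand coordinate of the leftmost base of the TAM+target site.
    Assumption (hypothetical): the TAM sits directly 5' of the target.
    """
    genome, tam, guide_dna = genome.upper(), tam.upper(), guide_dna.upper()
    site_len = len(tam) + len(guide_dna)
    hits = []
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        for i in range(len(seq) - site_len + 1):
            if seq[i:i + len(tam)] != tam:
                continue  # no TAM at this position
            target = seq[i + len(tam):i + site_len]
            mismatches = sum(a != b for a, b in zip(target, guide_dna))
            if mismatches <= max_mismatches:
                # Convert reverse-strand offsets to forward coordinates.
                start = i if strand == "+" else len(seq) - (i + site_len)
                hits.append((start, strand, mismatches))
    return hits


# Toy example with a made-up 5-nt TAM and 9-nt guide.
toy_genome = "GGG" + "TTGAT" + "ACGTACGTA" + "CCC"
print(find_tam_flanked_targets(toy_genome, "TTGAT", "ACGTACGTA"))
```

Requiring the TAM before comparing the guide is what makes a 9-nt search tractable: a 9-nt sequence alone would match many genomic positions by chance, whereas the TAM-plus-guide site is considerably longer.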
Extended Data Fig. 5 oppF-associated TldR homologs may target additional sites across the genome.
Schematic of Enterococcus cecorum genome and inset showing the oppF-tldR locus (top), with additional putative targets of the gRNA, other than the oppA promoter, numbered and highlighted in yellow along the genomic coordinate. A magnified view for each numbered target is shown below, with TAMs in yellow, prospective targets in purple, and TldR gRNA guide sequences in grey. Grey circles (right) represent positions of expected guide-target complementarity.
Extended Data Fig. 6 Genome-wide binding data from ChIP–seq experiments suggest a high mismatch tolerance for some TldR homologs.
a, Genome-wide ChIP–seq profiles for the indicated fliCP-associated TldR homologs, normalized to the highest peak within each dataset. The magnified insets at the bottom show the off-target sequences (grey) compared to the intended (engineered) on-target sequence (purple), with TAMs in yellow. Off-target #3 has no clear TAM-flanked off-target sequence but is intriguingly located at a tRNA locus, and binding was observed for diverse fliCP- and oppF-associated TldRs that recognized distinct TAMs. The phylogenetic tree at right indicates the relatedness of the tested and labeled homologs. b, Results for the indicated oppF-associated TldR homologs, shown as in a.
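Normalizing each profile to its highest peak, as described for panel a, amounts to dividing every coverage value by the dataset maximum. The sketch below assumes coverage is available as a plain list of per-bin values; the function name and the numbers are hypothetical.

```python
def normalize_to_max(coverage):
    """Scale a coverage profile so that its highest peak equals 1.0."""
    peak = max(coverage)
    if peak <= 0:
        raise ValueError("coverage must contain a positive value")
    return [value / peak for value in coverage]


# Hypothetical per-bin coverage for one dataset.
profile = [0, 5, 10, 2]
print(normalize_to_max(profile))
```

Since each dataset is scaled by its own maximum, the normalized profiles show the relative heights of on-target and off-target peaks within a dataset, but they do not allow absolute enrichment to be compared between homologs.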
Extended Data Fig. 7 Plasmid interference assays confirm that TldR homologs lack detectable nuclease activity.
a, Schematic of E. coli-based plasmid interference assay using pEffector and pTarget. b, Representative dilution spot assays for GstTnpB3 and synthetically inactivated RuvC mutant (D196A), showing the entire plate (left) and the magnified area of plating. Transformants were serially diluted, plated on selective media, and cultured at 37 °C for 16 h. Colony visibility was enhanced by inverting the colors and increasing contrast/brightness. c, Dilution spot assays for the indicated fliC-associated TldR homologs (left) and closely related TnpB homologs (right). Non-targeting (NT) gRNA controls are shown at the bottom, and the phylogenetic tree indicates the relatedness of the tested proteins. d, Results for the indicated oppF-associated TldR and TnpB homologs, shown as in c.
Extended Data Fig. 8 RFP repression assays reveal variable abilities of TldR homologs to block transcription elongation.
a, RFP repression activity was measured (right) as in Fig. 4f,g using modified gRNAs exhibiting variable complementarity to the target site, as schematized in the grid (left). A gRNA was also tested that lacked the extra 5′ sequence which was absent in RIP-seq reads of mature gRNAs (20 nt no 5′ seq). Bars indicate mean ± s.d. (n = 3 biological replicates). b, Schematic of RFP repression assay in which gRNAs were designed to target either the top or bottom strand within the 5′ UTR of RFP, downstream of the promoter. The phylogenetic trees (right) indicate the relatedness of the tested and labeled homologs. c, Bar graphs plotting normalized RFP fluorescence for the indicated conditions and TldR homologs. EV, empty vector; NT, non-targeting guide. Results with nuclease-dead dCas12 and dCas9 are shown for comparison. Bars indicate mean ± s.d. (n = 3 biological replicates for TldR; n = 6 biological replicates for dCas12/dCas9).
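The bar values reported as mean ± s.d. across biological replicates can be computed as in the sketch below. The legend does not state how fluorescence was normalized, so dividing by the mean of a non-targeting control is an assumption made here for illustration only; the function name and all readings are hypothetical.

```python
import statistics


def normalized_mean_sd(sample_values, control_values):
    """Return (mean, s.d.) of sample values divided by the control mean.

    Assumption (not from the source): normalization is to the mean of a
    non-targeting control. The s.d. is the sample standard deviation (n - 1).
    """
    control_mean = statistics.mean(control_values)
    normalized = [value / control_mean for value in sample_values]
    return statistics.mean(normalized), statistics.stdev(normalized)


# Hypothetical fluorescence readings, three biological replicates each.
non_targeting = [1000, 1100, 900]
targeting = [100, 120, 80]
mean, sd = normalized_mean_sd(targeting, non_targeting)
print(f"normalized RFP fluorescence: {mean:.2f} ± {sd:.2f}")
```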
Extended Data Fig. 9 Enterobacter RNA-seq data confirm the native expression of gRNAs from fliCP-tldR loci.
a, RNA-seq read coverage from three Enterobacter strains that natively encode fliCP-tldR loci, revealing clear peaks associated with mature gRNAs containing ~95–97-nt scaffolds and 16-nt guides. Data from three biological replicates are overlaid. b, Predicted secondary structure and sequence of the gRNA associated with EhoTldR. c, Multiple sequence alignment of the DNA encoding gRNA scaffold sequences for representative fliCP-associated TldRs, with conserved positions colored in darker blue.
Extended Data Fig. 10 FliCP is expressed and incorporated into Enterobacter flagella, concomitantly with host FliC repression.
a, RNA-seq read coverage across the tldR-encoding prophage of Enterobacter sp. BIDMC93, demonstrating strong expression of fliCP, tldR, and the gRNA, alongside other genes involved in lysogeny maintenance (e.g. CI). b, Motility assays (left) with wild-type (WT) and Enterobacter deletion strains reveal similar motility phenotypes, as visualized with LB-agar plate images (middle) and a bar graph quantifying motility via halo size (right). Plate images and bar graphs represent three biological replicates; bars indicate mean ± s.d. c, Schematic representation of FliC/FliCP homologs encoded by Enterobacter sp. BIDMC93, with relative genomic positions indicated. FliC2 is a second host flagellin gene copy encoded at an alternate flagellar assembly locus within this strain, which is not targeted by TldR and not commonly present in other Enterobacter strains. d, Results from liquid chromatography with tandem mass spectrometry (LC–MS/MS) analyses performed on digested peptides from purified flagellar filaments, isolated from the three indicated Enterobacter sp. BIDMC93 strains. The WT (+CmR) strain encodes the cmR gene downstream of the tldR-gRNA locus (as in Fig. 5e). Data represent the label-free quantification (LFQ) intensities reflecting the variable D2-3 regions of FliC, FliCP, or FliC2. Although FliC2 appears to be the dominant flagellin component, the relevant amounts of host FliC and FliCP demonstrate that prophage-encoded FliCP readily assembles into extracellular flagellar filaments, and that host FliC production is de-repressed upon prophage deletion. e, Quantification of changes in the expression profiles of Enterobacter FliC homologs, measured from RNA-seq data of three biological replicates depicted in Fig. 5f,g. TPM, transcripts per million. f, Alignment of fliC/fliCP/fliC2 promoters indicates that guide RNA-target DNA mismatches prevent TldR-targeting of fliC2 and fliCP in Enterobacter sp. BIDMC93. g, RNA-seq read coverage in the host fliC promoter/5′-UTR region overlaid for three biological replicates of four Enterobacter strains, with labeled TAM and target sequences highlighted upstream of the TSS. Strain AR136 (top) does not encode a fliCP-tldR locus; note the distinct expression levels, measured via relative counts per million (CPM). h, Alignment of host fliC promoter regions for the strains shown in g compared to E. coli K12, with percent sequence identities indicated on the right. Reported FliA/σ28 promoter elements from E. coli K12 are shown below the alignment. i, RNA-seq read coverage in the prophage-encoded fliCP promoter/5′-UTR region overlaid for three biological replicates of two representative Enterobacter strains, confirming the predicted TSS. j, Schematic of multiple sequence alignment of the promoter region driving fliCP gene expression, across six verified prophages described in Extended Data Fig. 2, highlighting the region that was queried for MEME motif detection.
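The two expression units used in this legend, TPM (panel e) and CPM (panel g), follow standard definitions that can be written out briefly. This is a generic sketch of those definitions rather than the study's pipeline; the function names, read counts and gene lengths are hypothetical.

```python
def cpm(counts):
    """Counts per million: each count scaled by the library size."""
    total = sum(counts)
    return [count * 1e6 / total for count in counts]


def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalize first, then scale to 1e6."""
    # Reads per kilobase of feature length.
    rates = [count / (length / 1000) for count, length in zip(counts, lengths_bp)]
    total_rate = sum(rates)
    return [rate * 1e6 / total_rate for rate in rates]


# Hypothetical counts for two genes of 1,000 bp and 3,000 bp.
counts = [100, 300]
lengths = [1000, 3000]
print(cpm(counts))
print(tpm(counts, lengths))
```

In this toy case the longer gene has three times the reads but the same TPM, which is why TPM is the more appropriate unit when comparing flagellin homologs of different lengths, whereas CPM suffices for comparing coverage at the same locus across samples.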
Supplementary information
Supplementary Information
This file contains Supplementary Figs. 1–6.
Supplementary Tables
Supplementary Tables 1–8.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wiegand, T., Hoffmann, F.T., Walker, M.W.G. et al. TnpB homologues exapted from transposons are RNA-guided transcription factors. Nature 631, 439–448 (2024). https://doi.org/10.1038/s41586-024-07598-4
DOI: https://doi.org/10.1038/s41586-024-07598-4