

TnpB homologues exapted from transposons are RNA-guided transcription factors

Abstract

Transposon-encoded tnpB and iscB genes encode RNA-guided DNA nucleases that promote their own selfish spread through targeted DNA cleavage and homologous recombination1,2,3,4. These widespread gene families were repeatedly domesticated over evolutionary timescales, leading to the emergence of diverse CRISPR-associated nucleases including Cas9 and Cas12 (refs. 5,6). We set out to test the hypothesis that TnpB nucleases may have also been repurposed for novel, unexpected functions other than CRISPR–Cas adaptive immunity. Here, using phylogenetics, structural predictions, comparative genomics and functional assays, we uncover multiple independent genesis events of programmable transcription factors, which we name TnpB-like nuclease-dead repressors (TldRs). These proteins use naturally occurring guide RNAs to specifically target conserved promoter regions of the genome, leading to potent gene repression in a mechanism akin to CRISPR interference technologies invented by humans7. Focusing on a TldR clade found broadly in Enterobacteriaceae, we discover that bacteriophages exploit the combined action of TldR and an adjacently encoded phage gene to alter the expression and composition of the host flagellar assembly, a transformation with the potential to impact motility8, phage susceptibility9, and host immunity10. Collectively, this work showcases the diverse molecular innovations that were enabled through repeated exaptation of transposon-encoded genes, and reveals the evolutionary trajectory of diverse RNA-guided transcription factors.


Fig. 1: Bioinformatic identification of naturally occurring, nuclease-deficient TnpB homologues.
Fig. 2: tldR genes are strongly associated with diverse non-transposon genes and encoded in prophages.
Fig. 3: TldR proteins are encoded next to gRNAs that target conserved genomic sites.
Fig. 4: TldRs are RNA-guided DNA binding proteins capable of programmable transcriptional repression.
Fig. 5: Flagellin-associated TldRs repress host flagellin gene expression in native Enterobacter strains.


Data availability

Next-generation sequencing data generated in this study were deposited in the NCBI SRA (BioProject accession PRJNA1029663) and GEO (GSE245749). The published genome used for ChIP–seq analyses was obtained from NCBI (GenBank NC_000913.3). Publicly available RNA-seq data analysed for TldR–gRNA expression are in the NCBI SRA (ERR6044061) and GEO (GSE115009) databases. The published genomes used for bioinformatics analyses were obtained from NCBI (Supplementary Table 4). The ISfinder database can be accessed at https://www-is.biotoul.fr/index.php.

Code availability

Custom scripts used for bioinformatics, TAM library analyses, and ChIP–seq data analyses are available on request. The R script describing initial steps to discover TldRs is available at https://github.com/sternberglab/Wiegand_etal_2024.

References

  1. Altae-Tran, H. et al. The widespread IS200/IS605 transposon family encodes diverse programmable RNA-guided endonucleases. Science 374, 57–65 (2021).


  2. Karvelis, T. et al. Transposon-associated TnpB is a programmable RNA-guided DNA endonuclease. Nature 599, 692–696 (2021).


  3. Meers, C. et al. Transposon-encoded nucleases use guide RNAs to promote their selfish spread. Nature 622, 863–871 (2023).


  4. Zedaveinyte, R. et al. Antagonistic conflict between transposon-encoded introns and guide RNAs. Preprint at bioRxiv https://doi.org/10.1101/2023.11.20.567912 (2023).

  5. Kapitonov, V. V., Makarova, K. S. & Koonin, E. V. ISC, a novel group of bacterial and archaeal DNA transposons that encode Cas9 homologs. J. Bacteriol. 198, 797–807 (2015).


  6. Chylinski, K., Makarova, K. S., Charpentier, E. & Koonin, E. V. Classification and evolution of type II CRISPR–Cas systems. Nucleic Acids Res. 42, 6091–6105 (2014).


  7. Larson, M. H. et al. CRISPR interference (CRISPRi) for sequence-specific control of gene expression. Nat. Protoc. 8, 2180–2196 (2013).


  8. Nakamura, S. & Minamino, T. Flagella-driven motility of bacteria. Biomolecules 9, 279 (2019).


  9. Samuel, A. D. et al. Flagellar determinants of bacterial sensitivity to χ-phage. Proc. Natl Acad. Sci. USA 96, 9863–9866 (1999).


  10. Wilson, D. R. & Beveridge, T. J. Bacterial flagellar filaments and their component flagellins. Can. J. Microbiol. 39, 451–472 (1993).


  11. Aziz, R. K., Breitbart, M. & Edwards, R. A. Transposases are the most abundant, most ubiquitous genes in nature. Nucleic Acids Res. 38, 4207–4217 (2010).


  12. Cosby, R. L., Chang, N. C. & Feschotte, C. Host–transposon interactions: conflict, cooperation, and cooption. Genes Dev. 33, 1098–1116 (2019).


  13. Jangam, D., Feschotte, C. & Betran, E. Transposable element domestication as an adaptation to evolutionary conflicts. Trends Genet. 33, 817–831 (2017).


  14. Gould, S. J. & Vrba, E. S. Exaptation — a missing term in the science of form. Paleobiology 8, 4–15 (1982).


  15. Koonin, E. V. & Makarova, K. S. Mobile genetic elements and evolution of CRISPR–Cas systems: all the way there and back. Genome Biol. Evol. 9, 2812–2825 (2017).


  16. Koonin, E. V. & Makarova, K. S. Origins and evolution of CRISPR–Cas systems. Phil. Trans. R. Soc. B 374, 20180087 (2019).


  17. Nunez, J. K., Lee, A. S., Engelman, A. & Doudna, J. A. Integrase-mediated spacer acquisition during CRISPR–Cas adaptive immunity. Nature 519, 193–198 (2015).


  18. McGinn, J. & Marraffini, L. A. Molecular mechanisms of CRISPR–Cas spacer acquisition. Nat. Rev. Microbiol. 17, 7–12 (2019).


  19. Krupovic, M., Makarova, K. S., Forterre, P., Prangishvili, D. & Koonin, E. V. Casposons: a new superfamily of self-synthesizing DNA transposons at the origin of prokaryotic CRISPR–Cas immunity. BMC Biol. 12, 36 (2014).


  20. Hickman, A. B., Kailasan, S., Genzor, P., Haase, A. D. & Dyda, F. Casposase structure and the mechanistic link between DNA transposition and spacer acquisition by CRISPR–Cas. eLife 9, e50004 (2020).


  21. Klompe, S. E., Vo, P. L. H., Halpin-Healy, T. S. & Sternberg, S. H. Transposon-encoded CRISPR–Cas systems direct RNA-guided DNA integration. Nature 571, 219–225 (2019).


  22. Strecker, J. et al. RNA-guided DNA insertion with CRISPR-associated transposases. Science 365, 48–53 (2019).


  23. Faure, G. et al. CRISPR–Cas in mobile genetic elements: counter-defence and beyond. Nat. Rev. Microbiol. 17, 513–525 (2019).


  24. Seed, K. D., Lazinski, D. W., Calderwood, S. B. & Camilli, A. A bacteriophage encodes its own CRISPR/Cas adaptive response to evade host innate immunity. Nature 494, 489–491 (2013).


  25. Al-Shayeb, B. et al. Clades of huge phages from across Earth’s ecosystems. Nature 578, 425–431 (2020).


  26. Frost, L. S., Leplae, R., Summers, A. O. & Toussaint, A. Mobile genetic elements: the agents of open source evolution. Nat. Rev. Microbiol. 3, 722–732 (2005).


  27. Bao, W. & Jurka, J. Homologues of bacterial TnpB_IS605 are widespread in diverse eukaryotic transposable elements. Mob. DNA 4, 12 (2013).


  28. Jiang, K. et al. Programmable RNA-guided DNA endonucleases are widespread in eukaryotes and their viruses. Sci. Adv. 9, eadk0171 (2023).


  29. Saito, M. et al. Fanzor is a eukaryotic programmable RNA-guided endonuclease. Nature 620, 660–668 (2023).


  30. Siguier, P., Gourbeyre, E., Varani, A., Ton-Hoang, B. & Chandler, M. Everyman’s guide to bacterial insertion sequences. Microbiol. Spectr. 3, MDNA3-0030-2014 (2015).


  31. Altae-Tran, H. et al. Diversity, evolution, and classification of the RNA-guided nucleases TnpB and Cas12. Proc. Natl Acad. Sci. USA 120, e2308224120 (2023).


  32. Makarova, K. S. et al. Evolutionary classification of CRISPR–Cas systems: a burst of class 2 and derived variants. Nat. Rev. Microbiol. 18, 67–83 (2020).


  33. Huang, C. J., Adler, B. A. & Doudna, J. A. A naturally DNase-free CRISPR–Cas12c enzyme silences gene expression. Mol. Cell 82, 2148–2160.e4 (2022).


  34. Wu, W. Y. et al. The miniature CRISPR–Cas12m effector binds DNA to block transcription. Mol. Cell 82, 4487–4502.e7 (2022).


  35. Peters, J. E., Makarova, K. S., Shmakov, S. & Koonin, E. V. Recruitment of CRISPR–Cas systems by Tn7-like transposons. Proc. Natl Acad. Sci. USA 114, E7358–E7366 (2017).


  36. Nakamura, M., Gao, Y., Dominguez, A. A. & Qi, L. S. CRISPR technologies for precise epigenome editing. Nat. Cell Biol. 23, 11–22 (2021).


  37. Zetsche, B. et al. Cpf1 is a single RNA-guided endonuclease of a class 2 CRISPR–Cas system. Cell 163, 759–771 (2015).


  38. He, S. et al. The IS200/IS605 family and “peel and paste” single-strand transposition mechanism. Microbiol. Spectr. https://doi.org/10.1128/microbiolspec.MDNA3-0039-2014 (2015).

  39. Doeven, M. K., van den Bogaart, G., Krasnikov, V. & Poolman, B. Probing receptor–translocator interactions in the oligopeptide ABC transporter by fluorescence correlation spectroscopy. Biophys. J. 94, 3956–3965 (2008).


  40. Biemans-Oldehinkel, E., Doeven, M. K. & Poolman, B. ABC transporter architecture and regulatory roles of accessory domains. FEBS Lett. 580, 1023–1035 (2006).


  41. Mukherjee, S. et al. CsrA–FliW interaction governs flagellin homeostasis and a checkpoint on flagellar morphogenesis in Bacillus subtilis. Mol. Microbiol. 82, 447–461 (2011).


  42. Lawrence, J. G. Shared strategies in gene organization among prokaryotes and eukaryotes. Cell 110, 407–413 (2002).


  43. Siguier, P., Perochon, J., Lestrade, L., Mahillon, J. & Chandler, M. ISfinder: the reference centre for bacterial insertion sequences. Nucleic Acids Res. 34, D32–D36 (2006).


  44. Leenay, R. T. & Beisel, C. L. Deciphering, communicating, and engineering the CRISPR PAM. J. Mol. Biol. 429, 177–191 (2017).


  45. Michaux, C. et al. Single-nucleotide RNA maps for the two major nosocomial pathogens Enterococcus faecalis and Enterococcus faecium. Front. Cell. Infect. Microbiol. 10, 600325 (2020).


  46. Nety, S. P. et al. The transposon-encoded protein TnpB processes its own mRNA into ωRNA for guided nuclease activity. CRISPR J. 6, 232–242 (2023).


  47. Ohnishi, K., Kutsukake, K., Suzuki, H. & Iino, T. Gene fliA encodes an alternative sigma factor specific for flagellar operons in Salmonella typhimurium. Mol. Gen. Genet. 221, 139–147 (1990).


  48. Ide, N., Ikebe, T. & Kutsukake, K. Reevaluation of the promoter structure of the class 3 flagellar operons of Escherichia coli and Salmonella. Genes Genet. Syst. 74, 113–116 (1999).


  49. Klepsch, M. M. et al. Escherichia coli peptide binding protein OppA has a preference for positively charged peptides. J. Mol. Biol. 414, 75–85 (2011).


  50. Solovyev, V. A. S. in Metagenomics and its Applications in Agriculture, Biomedicine and Environmental Studies (ed Li, R. W.) 61–78 (Nova Science Publishers, 2011).

  51. Swarts, D. C., van der Oost, J. & Jinek, M. Structural basis for guide RNA processing and seed-dependent DNA targeting by CRISPR–Cas12a. Mol. Cell 66, 221–233.e4 (2017).


  52. Qi, L. S. et al. Repurposing CRISPR as an RNA-guided platform for sequence-specific control of gene expression. Cell 152, 1173–1183 (2013).


  53. Zhang, X. et al. Multiplex gene regulation by CRISPR–ddCpf1. Cell Discov. 3, 17018 (2017).


  54. Kim, S. K. et al. Efficient transcriptional gene repression by type V-A CRISPR–Cpf1 from Eubacterium eligens. ACS Synth. Biol. 6, 1273–1282 (2017).


  55. Samatey, F. A. et al. Structure of the bacterial flagellar protofilament and implications for a switch for supercoiling. Nature 410, 331–337 (2001).


  56. Yonekura, K., Maki-Yonekura, S. & Namba, K. Complete atomic model of the bacterial flagellar filament by electron cryomicroscopy. Nature 424, 643–650 (2003).


  57. Reid, S. D., Selander, R. K. & Whittam, T. S. Sequence diversity of flagellin (fliC) alleles in pathogenic Escherichia coli. J. Bacteriol. 181, 153–160 (1999).


  58. Esteves, N. C., Bigham, D. N. & Scharf, B. E. Phages on filaments: a genetic screen elucidates the complex interactions between Salmonella enterica flagellin and bacteriophage Chi. PLoS Pathog. 19, e1011537 (2023).


  59. Guttenplan, S. B. & Kearns, D. B. Regulation of flagellar motility during biofilm formation. FEMS Microbiol. Rev. 37, 849–871 (2013).


  60. Dacquay, L. C. et al. E. coli nissle increases transcription of flagella assembly and formate hydrogenlyase genes in response to colitis. Gut Microbes 13, 1994832 (2021).


  61. Kim, M. J., Lim, S. & Ryu, S. Molecular analysis of the Salmonella typhimurium tdc operon regulation. J. Microbiol. Biotechnol. 18, 1024–1032 (2008).


  62. Esteves, N. C. & Scharf, B. E. Flagellotropic bacteriophages: opportunities and challenges for antimicrobial applications. Int. J. Mol. Sci. 23, 7084 (2022).


  63. Yoon, S. I. et al. Structural basis of TLR5-flagellin recognition and signaling. Science 335, 859–864 (2012).


  64. Tenthorey, J. L. et al. The structural basis of flagellin detection by NAIP5: a strategy to limit pathogen immune evasion. Science 358, 888–893 (2017).


  65. Wang, L., Rothemund, D., Curd, H. & Reeves, P. R. Species-wide variation in the Escherichia coli flagellin (H-antigen) gene. J. Bacteriol. 185, 2936–2943 (2003).


  66. Cullender, T. C. et al. Innate and adaptive immunity interact to quench microbiome flagellar motility in the gut. Cell Host Microbe 14, 571–581 (2013).


  67. Brussow, H., Canchaya, C. & Hardt, W. D. Phages and the evolution of bacterial pathogens: from genomic rearrangements to lysogenic conversion. Microbiol. Mol. Biol. Rev. 68, 560–602 (2004).


  68. Rees, D. C., Johnson, E. & Lewinson, O. ABC transporters: the power to change. Nat. Rev. Mol. Cell Biol. 10, 218–227 (2009).


  69. Holtzman, L. & Gersbach, C. A. Editing the epigenome: reshaping the genomic landscape. Annu. Rev. Genomics Hum. Genet. 19, 43–71 (2018).


  70. Bikard, D. et al. Programmable repression and activation of bacterial gene expression using an engineered CRISPR–Cas system. Nucleic Acids Res. 41, 7429–7437 (2013).


  71. Cui, L. et al. A CRISPRi screen in E. coli reveals sequence-specific toxicity of dCas9. Nat. Commun. 9, 1912 (2018).


  72. Vigouroux, A., Oldewurtel, E., Cui, L., Bikard, D. & van Teeffelen, S. Tuning dCas9’s ability to block transcription enables robust, noiseless knockdown of bacterial genes. Mol. Syst. Biol. 14, e7899 (2018).


  73. Workman, R. E. et al. A natural single-guide RNA repurposes Cas9 to autoregulate CRISPR–Cas expression. Cell 184, 675–688.e19 (2021).


  74. Ratner, H. K. et al. Catalytically active Cas9 mediates transcriptional interference to facilitate bacterial virulence. Mol. Cell 75, 498–510.e5 (2019).


  75. Eddy, S. R. Accelerated profile HMM searches. PLoS Comput. Biol. 7, e1002195 (2011).


  76. Huang, Y., Niu, B., Gao, Y., Fu, L. & Li, W. CD-HIT Suite: a web server for clustering and comparing biological sequences. Bioinformatics 26, 680–682 (2010).


  77. Katoh, K., Kuma, K., Toh, H. & Miyata, T. MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Res. 33, 511–518 (2005).


  78. Capella-Gutierrez, S., Silla-Martinez, J. M. & Gabaldon, T. trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics 25, 1972–1973 (2009).


  79. Minh, B. Q. et al. IQ-TREE 2: new models and efficient methods for phylogenetic inference in the genomic era. Mol. Biol. Evol. 37, 1530–1534 (2020).


  80. Letunic, I. & Bork, P. Interactive Tree of Life (iTOL) v5: an online tool for phylogenetic tree display and annotation. Nucleic Acids Res. 49, W293–W296 (2021).


  81. Wright, E. S. DECIPHER: harnessing local sequence context to improve protein multiple sequence alignment. BMC Bioinformatics 16, 322 (2015).


  82. Biostrings: String objects representing biological sequences, and matching algorithms. R package version 2.70.1 (Pagès, H. A. P., Gentleman, R. & DebRoy, S., 2023).

  83. Camacho, C. et al. BLAST+: architecture and applications. BMC Bioinformatics 10, 421 (2009).


  84. Cantalapiedra, C. P., Hernandez-Plaza, A., Letunic, I., Bork, P. & Huerta-Cepas, J. eggNOG-mapper v2: functional annotation, orthology assignments, and domain prediction at the metagenomic scale. Mol. Biol. Evol. 38, 5825–5829 (2021).


  85. Hyatt, D. et al. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 11, 119 (2010).


  86. Price, M. N., Dehal, P. S. & Arkin, A. P. FastTree 2 — approximately maximum-likelihood trees for large alignments. PLoS ONE 5, e9490 (2010).


  87. Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).


  88. Goddard, T. D. et al. UCSF ChimeraX: meeting modern challenges in visualization and analysis. Protein Sci. 27, 14–25 (2018).


  89. Waterhouse, A. M., Procter, J. B., Martin, D. M., Clamp, M. & Barton, G. J. Jalview version 2 — a multiple sequence alignment editor and analysis workbench. Bioinformatics 25, 1189–1191 (2009).


  90. Mirdita, M. et al. ColabFold: making protein folding accessible to all. Nat. Methods 19, 679–682 (2022).


  91. Pei, J., Kim, B. H. & Grishin, N. V. PROMALS3D: a tool for multiple protein sequence and structure alignments. Nucleic Acids Res. 36, 2295–2300 (2008).


  92. Nawrocki, E. P. & Eddy, S. R. Infernal 1.1: 100-fold faster RNA homology searches. Bioinformatics 29, 2933–2935 (2013).


  93. Will, S., Joshi, T., Hofacker, I. L., Stadler, P. F. & Backofen, R. LocARNA-P: accurate boundary prediction and improved detection of structural RNAs. RNA 18, 900–914 (2012).


  94. Vasimuddin M., Misra, S., Li, H. & Aluru, S. Efficient architecture-aware acceleration of BWA-MEM for multicore systems. In 2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS) https://doi.org/10.1109/IPDPS.2019.00041 (IEEE, 2019).

  95. Danecek, P. et al. Twelve years of SAMtools and BCFtools. Gigascience 10, giab008 (2021).


  96. Ramirez, F. et al. deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Res. 44, W160–W165 (2016).


  97. Robinson, J. T. et al. Integrative genomics viewer. Nat. Biotechnol. 29, 24–26 (2011).


  98. Hoffmann, F. T. et al. Selective TnsC recruitment enhances the fidelity of RNA-guided transposition. Nature 609, 384–393 (2022).


  99. Zhang, Y. et al. Model-based analysis of ChIP–seq (MACS). Genome Biol. 9, R137 (2008).


  100. Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).


  101. Bailey, T. L. et al. MEME SUITE: tools for motif discovery and searching. Nucleic Acids Res. 37, W202–W208 (2009).


  102. Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal https://doi.org/10.14806/ej.17.1.200 (2011).

  103. Liao, Y., Smyth, G. K. & Shi, W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30, 923–930 (2014).


  104. Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).


  105. Sharan, S. K., Thomason, L. C., Kuznetsov, S. G. & Court, D. L. Recombineering: a homologous recombination-based method of genetic engineering. Nat. Protoc. 4, 206–223 (2009).


  106. Luo, G. et al. flrA, flrB and flrC regulate adhesion by controlling the expression of critical virulence genes in Vibrio alginolyticus. Emerg. Microbes Infect. 5, e85 (2016).


  107. Kreutzberger, M. A. B. et al. Flagellin outer domain dimerization modulates motility in pathogenic and soil bacteria from viscous environments. Nat. Commun. 13, 1422 (2022).


  108. Shevchenko, A., Tomas, H., Havlis, J., Olsen, J. V. & Mann, M. In-gel digestion for mass spectrometric characterization of proteins and proteomes. Nat. Protoc. 1, 2856–2860 (2006).


  109. Kulak, N. A., Pichler, G., Paron, I., Nagaraj, N. & Mann, M. Minimal, encapsulated proteomic-sample processing applied to copy-number estimation in eukaryotic cells. Nat. Methods 11, 319–324 (2014).


  110. Meier, F. et al. Online parallel accumulation-serial fragmentation (PASEF) with a novel trapped ion mobility mass spectrometer. Mol. Cell. Proteomics 17, 2534–2545 (2018).


  111. Cox, J. et al. Andromeda: a peptide search engine integrated into the MaxQuant environment. J. Proteome Res. 10, 1794–1805 (2011).



Acknowledgements

We thank S. R. Pesari and Z. Akhtar for laboratory support; G. D. Lampe for suggesting the TldR moniker; A. Bernheim for helpful discussions; F. Tesson, A. Bernheim, A. M. Earl and D. Gray for sharing E. coli and Enterobacter strains; C. Lu for Covaris sonicator access; R. K. Soni for mass spectrometry support; L. F. Landweber for qPCR instrument access; and the JP Sulzberger Columbia Genome Center for next-generation sequencing support. S.T. was supported by a Medical Scientist Training Program grant (5T32GM145440-02) from the NIH. M.W.G.W. was supported by a National Science Foundation Graduate Research Fellowship. C.M. was supported by the NIH Postdoctoral Fellowship F32 GM143924-01A1. S.H.S. was supported by the NSF Faculty Early Career Development Program (CAREER) Award 2239685, a Pew Biomedical Scholarship, an Irma T. Hirschl Career Scientist Award, and a startup package from the Columbia University Irving Medical Center Dean’s Office and the Vagelos Precision Medicine Fund.

Author information


Contributions

T.W., C.M., and S.H.S. conceived and designed the project. T.W. performed all of the bioinformatics experiments and aided in the design of the experimental assays. F.T.H. performed plasmid interference, ChIP–seq, and the RFP repression assays. M.W.G.W. designed and generated the E. coli strains and plasmids for the RFP repression assays and the fragments for Enterobacter recombineering, conducted the motility assays, and isolated flagella for liquid chromatography with tandem mass spectrometry. S.T. performed and analysed the RNA-seq and RIP-seq experiments. E.R. cultured Enterobacter strains, extracted RNA for RNA-seq, and performed the RT–qPCR and recombineering experiments. C.M. performed the preliminary TnpB bioinformatics and neighbourhood analyses, together with H.C.L., and helped design the ChIP–seq and RFP repression assays. T.W. and S.H.S. discussed the data and wrote the manuscript, with input from all authors.

Corresponding author

Correspondence to Samuel H. Sternberg.

Ethics declarations

Competing interests

Columbia University has filed a patent application related to this work. M.W.G.W. is a co-founder of Can9 Bioengineering. S.H.S. is a co-founder and scientific advisor to Dahlia Biosciences, a scientific advisor to CrisprBits and Prime Medicine, and an equity holder in Dahlia Biosciences and CrisprBits. All other authors declare no competing interests.

Peer review

Peer review information

Nature thanks Wen Wu and the other, anonymous reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Phylogeny and RuvC nuclease domain analysis of oppF-associated TldRs.

a, Phylogenetic tree of oppF-associated TldR proteins from Fig. 2a, together with closely related TnpB proteins that contain intact RuvC active sites. The rings indicate RuvC DED active site intactness (inner) and TldR/TnpB domain composition (outer). Homologs marked with an orange square (TnpB) or green circle (TldR) were tested in heterologous experiments. b, Multiple sequence alignment of representative TnpB and TldR sequences from a, highlighting deterioration of RuvC active site motifs (shaded in red) and loss of the C-terminal zinc-finger (ZnF)/RuvC domain. Highly conserved residues are shaded in grey. c, Empirical (DraTnpB) and predicted AlphaFold structures of TnpB and TldR homologs marked with an asterisk in b, showing progressive loss of the active site catalytic triad.

Extended Data Fig. 2 Diverse prophages encode fliCP-associated tldR genes.

a, Genomic architecture of representative prophage elements whose boundaries could be identified by comparison to closely related isogenic strains. In each example, the prophage-containing strain is shown above the prophage-lacking strain, with species/strain names and NCBI genomic accession IDs indicated. Sequences flanking the left (5′) and right (3′) ends are highlighted in purple and yellow, respectively, together with their percentage sequence identities calculated using BLASTn. b, Alignment of distinct prophage elements, constructed using Mauve. Empty boxes represent open reading frames, and windows show sequence conservation for regions compared between prophage genomes with lines. Putative gene functions are shown below sequence conservation windows for the fliCP-tldR-encoding prophage from Enterobacter AR_163 (bottom). c, DNA sequence identities between the prophages in a, calculated with BLASTn. Identities were calculated as total matching nucleotides across the two genomes being compared, divided by the length of the query prophage genome.
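As a rough illustration of the identity calculation described in c, the sketch below sums the matching nucleotides reported for each BLASTn alignment and divides by the length of the query prophage genome. The function name, the input format (one matching-nucleotide count per alignment) and the example numbers are assumptions made for illustration, not the authors' script; the sketch also assumes that the alignments do not overlap on the query.

```python
def prophage_identity(hsp_matches, query_length):
    """Return matching nucleotides divided by query prophage length.

    hsp_matches: iterable of matching-nucleotide counts, one per BLASTn
                 alignment (hypothetical input; assumed non-overlapping).
    query_length: length of the query prophage genome in nucleotides.
    """
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    total_matches = sum(hsp_matches)
    return total_matches / query_length


# Made-up example: three alignments against a 40,000-nt query prophage.
identity = prophage_identity([12000, 8500, 3000], 40000)
print(f"{identity:.2%}")
```

Because the denominator is the query length, the value is not symmetric: swapping query and subject prophages can give a different identity when the two genomes differ in length.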

Extended Data Fig. 3 RIP-seq reveals that some oppF-associated TldR proteins use short, 9–11-nt guides.

a, RNA immunoprecipitation sequencing (RIP-seq) data for an oppF-associated TldR homolog from Enterococcus faecalis (Efa1TldR) reveal the boundaries of a mature gRNA containing a 9-nt guide sequence. Reads were mapped to the TldR-gRNA expression plasmid; an input control is shown. b, Published RNA-seq data for Enterococcus faecalis V583 reveal similar gRNA boundaries, including an approximately 11-nt guide. c, RIP-seq data as in a for a second biological replicate of Efa1TldR, further corroborating the observed 9–11-nt guide length.

Extended Data Fig. 4 oppF-associated TldRs target conserved genomic sequences that overlap with promoter elements driving oppA expression.

a, Schematic of the original (left) and improved (right) search strategies to identify putative targets of gRNAs used by oppF-associated TldRs. Key insights resulted from the use of the TAM and a shorter, 9-nt guide. b, Analysis of the guide sequence from the Efa1TldR-associated gRNA in Extended Data Fig. 3 revealed a putative genomic target near the predicted promoter of oppA encoded within the same ABC transporter operon immediately adjacent to the tldR gene. The magnified schematics at the bottom show the predicted TAM and gRNA-target DNA base-pairing interactions for two representatives (Efa1TldR and EceTldR), in which the gRNAs target opposite strands. Promoter elements predicted with BPROM are shown as brown squares. c, WebLogos of predicted guides and genomic targets associated with diverse oppF-associated TldRs highlighted in Extended Data Fig. 1. d, Schematic of the oppF-tldR genomic locus (left) alongside the predicted function of OppA as a solute binding protein that facilitates transport of polypeptide substrates from the periplasm to the cytoplasm, in complex with the remainder of the ABC transporter apparatus. e, Published RNA-seq data for Enterococcus faecium AUS000445, highlighting the oppA transcription start site (TSS). The predicted gRNA guide sequence (grey) is shown beneath the putative TAM (yellow) and target (purple) sequences, with guide-target complementarity represented by grey circles.
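The idea behind the improved search in panel a, requiring a TAM next to a short guide match, can be sketched as a simple two-strand scan. This is an illustrative sketch under stated assumptions rather than the authors' pipeline: the function names, the toy sequences, and the placement of the TAM immediately 5′ of the guide-matching sequence are all assumptions made for the example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_tam_flanked_targets(genome: str, tam: str, guide: str, max_mismatches: int = 0):
    """Scan both strands for TAM + guide-matching sequence.

    Assumes the TAM sits directly 5' of the guide-matching sequence on the
    scanned strand. Returns (strand, tam_start, site, mismatches) tuples,
    where tam_start is a 0-based index on the scanned strand.
    """
    site_len = len(tam) + len(guide)
    hits = []
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        for i in range(len(seq) - site_len + 1):
            if seq[i:i + len(tam)] != tam:
                continue  # TAM must match exactly in this sketch
            site = seq[i + len(tam):i + site_len]
            mismatches = sum(a != b for a, b in zip(site, guide))
            if mismatches <= max_mismatches:
                hits.append((strand, i, site, mismatches))
    return hits


# Made-up toy sequences: a 5-nt placeholder TAM and a 9-nt guide (written as DNA).
toy_genome = "GGG" + "TTGAT" + "ACGTACGTA" + "CCC"
print(find_tam_flanked_targets(toy_genome, tam="TTGAT", guide="ACGTACGTA"))
```

With a guide as short as 9 nt, guide-only matches occur frequently by chance in a bacterial genome, so the exact TAM requirement is what narrows the candidate list in this sketch.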

Extended Data Fig. 5 oppF-associated TldR homologs may target additional sites across the genome.

Schematic of the Enterococcus cecorum genome with an inset showing the oppF-tldR locus (top). Additional putative targets of the gRNA, other than the oppA promoter, are numbered and highlighted in yellow along the genomic coordinates. A magnified view for each numbered target is shown below, with TAMs in yellow, prospective targets in purple, and TldR gRNA guide sequences in grey. Grey circles (right) represent positions of expected guide-target complementarity.

Extended Data Fig. 6 Genome-wide binding data from ChIP-seq experiments suggest a high mismatch tolerance for some TldR homologs.

a, Genome-wide ChIP–seq profiles for the indicated fliCP-associated TldR homologs, normalized to the highest peak within each dataset. The magnified insets at the bottom show the off-target sequences (grey) compared to the intended (engineered) on-target sequence (purple), with TAMs in yellow. Off-target #3 has no clear TAM-flanked off-target sequence but is intriguingly located at a tRNA locus, and binding was observed for diverse fliCP- and oppF-associated TldRs that recognized distinct TAMs. The phylogenetic tree at right indicates the relatedness of the tested and labeled homologs. b, Results for the indicated oppF-associated TldR homologs, shown as in a.

Extended Data Fig. 7 Plasmid interference assays confirm that TldR homologs lack detectable nuclease activity.

a, Schematic of the E. coli-based plasmid interference assay using pEffector and pTarget. b, Representative dilution spot assays for GstTnpB3 and a synthetically inactivated RuvC mutant (D196A), showing the entire plate (left) and the magnified area of plating. Transformants were serially diluted, plated on selective media, and cultured at 37 °C for 16 h. Colony visibility was enhanced by inverting the colors and increasing contrast/brightness. c, Dilution spot assays for the indicated fliC-associated TldR homologs (left) and closely related TnpB homologs (right). Non-targeting (NT) gRNA controls are shown at the bottom, and the phylogenetic tree indicates the relatedness of the tested proteins. d, Results for the indicated oppF-associated TldR and TnpB homologs, shown as in c.

Extended Data Fig. 8 RFP repression assays reveal variable abilities of TldR homologs to block transcription elongation.

a, RFP repression activity was measured (right) as in Fig. 4f,g using modified gRNAs exhibiting variable complementarity to the target site, as schematized in the grid (left). A gRNA was also tested that lacked the extra 5′ sequence which was absent in RIP-seq reads of mature gRNAs (20 nt no 5′ seq). Bars indicate mean ± s.d. (n = 3 biological replicates). b, Schematic of RFP repression assay in which gRNAs were designed to target either the top or bottom strand within the 5′ UTR of RFP, downstream of the promoter. The phylogenetic trees (right) indicate the relatedness of the tested and labeled homologs. c, Bar graphs plotting normalized RFP fluorescence for the indicated conditions and TldR homologs. EV, empty vector; NT, non-targeting guide. Results with nuclease-dead dCas12 and dCas9 are shown for comparison. Bars indicate mean ± s.d. (n = 3 biological replicates for TldR; n = 6 biological replicates for dCas12/dCas9).

Extended Data Fig. 9 Enterobacter RNA-seq data confirm the native expression of gRNAs from fliCP-tldR loci.

a, RNA-seq read coverage from three Enterobacter strains that natively encode fliCP-tldR loci, revealing clear peaks associated with mature gRNAs containing ~95–97-nt scaffolds and 16-nt guides. Data from three biological replicates are overlaid. b, Predicted secondary structure and sequence of the gRNA associated with EhoTldR. c, Multiple sequence alignment of the DNA encoding gRNA scaffold sequences for representative fliCP-associated TldRs, with conserved positions colored in darker blue.

Extended Data Fig. 10 FliCP is expressed and incorporated into Enterobacter flagella, concomitantly with host FliC repression.

a, RNA-seq read coverage across the tldR-encoding prophage of Enterobacter sp. BIDMC93, demonstrating strong expression of fliCP, tldR, and the gRNA, alongside other genes involved in lysogeny maintenance (e.g. CI). b, Motility assays (left) with wild-type (WT) and Enterobacter deletion strains reveal similar motility phenotypes, as visualized with LB-agar plate images (middle) and a bar graph quantifying motility via halo size (right). Plate images and bar graphs represent three biological replicates; bars indicate mean ± s.d. c, Schematic representation of FliC/FliCP homologs encoded by Enterobacter sp. BIDMC93, with relative genomic positions indicated. FliC2 is a second host flagellin gene copy encoded at an alternate flagellar assembly locus within this strain, which is not targeted by TldR and not commonly present in other Enterobacter strains. d, Results from liquid chromatography with tandem mass spectrometry (LC–MS/MS) analyses performed on digested peptides from purified flagellar filaments, isolated from the three indicated Enterobacter sp. BIDMC93 strains. The WT (+CmR) strain encodes the cmR gene downstream of the tldR-gRNA locus (as in Fig. 5e). Data represent the label-free quantification (LFQ) intensities reflecting the variable D2-3 regions of FliC, FliCP, or FliC2. Although FliC2 appears to be the dominant flagellin component, the relative amounts of host FliC and FliCP demonstrate that prophage-encoded FliCP readily assembles into extracellular flagellar filaments, and that host FliC production is de-repressed upon prophage deletion. e, Quantification of changes in the expression profiles of Enterobacter FliC homologs, measured from RNA-seq data of three biological replicates depicted in Fig. 5f,g. TPM, transcripts per million. f, Alignment of fliC/fliCP/fliC2 promoters indicates that guide RNA-target DNA mismatches prevent TldR-targeting of fliC2 and fliCP in Enterobacter sp. BIDMC93. g, RNA-seq read coverage in the host fliC promoter/5′-UTR region overlaid for three biological replicates of four Enterobacter strains, with labeled TAM and target sequences highlighted upstream of the TSS. Strain AR136 (top) does not encode a fliCP-tldR locus; note the distinct expression levels, measured via relative counts per million (CPM). h, Alignment of host fliC promoter regions for the strains shown in g compared to E. coli K12, with percent sequence identities indicated on the right. Reported FliA/σ28 promoter elements from E. coli K12 are shown below the alignment. i, RNA-seq read coverage in the prophage-encoded fliCP promoter/5′-UTR region overlaid for three biological replicates of two representative Enterobacter strains, confirming the predicted TSS. j, Schematic of multiple sequence alignment of the promoter region driving fliCP gene expression, across six verified prophages described in Extended Data Fig. 2, highlighting the region that was queried for MEME motif detection.
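For readers unfamiliar with the two expression units used in this legend, the standard definitions of TPM and CPM can be sketched as below. This is a generic illustration with made-up counts and gene lengths, not the authors' analysis code; the function names are chosen for the example.

```python
from typing import List, Sequence


def cpm(counts: Sequence[float]) -> List[float]:
    """Counts per million: scale raw read counts by library size only."""
    total = sum(counts)
    return [c * 1e6 / total for c in counts]


def tpm(counts: Sequence[float], lengths_bp: Sequence[float]) -> List[float]:
    """Transcripts per million: length-normalize first, then scale to 1e6."""
    rates = [c / length for c, length in zip(counts, lengths_bp)]
    total_rate = sum(rates)
    return [r * 1e6 / total_rate for r in rates]


# Toy data: three genes with arbitrary read counts and lengths.
toy_counts = [100, 300, 600]
toy_lengths = [1000, 1500, 3000]
print(cpm(toy_counts))
print(tpm(toy_counts, toy_lengths))
```

TPM values always sum to one million within a sample, which makes relative comparisons between flagellin genes of different lengths more direct than raw counts or CPM.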

Supplementary information

Supplementary Information

This file contains Supplementary Figs. 1–6.

Reporting Summary

Supplementary Tables

Supplementary Tables 1–8.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wiegand, T., Hoffmann, F.T., Walker, M.W.G. et al. TnpB homologues exapted from transposons are RNA-guided transcription factors. Nature 631, 439–448 (2024). https://doi.org/10.1038/s41586-024-07598-4

  • DOI: https://doi.org/10.1038/s41586-024-07598-4
