Background

Plants have evolved a range of mutualistic endosymbiotic partnerships with microbes to enhance nutrient uptake. The most ancient mutualistic endosymbiosis is the interaction between plant roots and Glomeromycota fungi, also known as arbuscular mycorrhizal (AM) fungi, which evolved over 400 million years ago [1]. Even today, AM endosymbiosis still occurs in ~ 72% of all higher plants [2]. Besides AM symbiosis, several plant lineages evolved additional or even alternative mutualistic endosymbiotic interactions, like orchid mycorrhiza, ericoid mycorrhiza, and diazotrophic rhizobia or Frankia bacteria hosted in root nodules. Interestingly, the evolution of these mutualistic endosymbiotic partnerships co-opted a signalling pathway critical for AM symbiosis. This pathway, known as the common symbiosis signalling pathway, is highly conserved and can be found in angiosperms, gymnosperms, monilophytes, and bryophytes [3].

The common symbiosis signalling pathway was first discovered in pea (Pisum sativum), where it was shown to be critical for AM symbiosis and rhizobium-induced nodulation [4]. The subsequent molecular genetic characterisation in the legume models Lotus japonicus and Medicago truncatula revealed that the pathway consists of four conserved components stretching from an LRR-type transmembrane receptor kinase down to the transcription factor LjCYCLOPS/MtIPD3 [5, 6]. The LRR-type receptor kinase is generally named SYMRK (SYMBIOSIS SIGNALLING RECEPTOR KINASE), except for pea, M. truncatula, and Medicago sativa, where it is named PsSYM19, MtDMI2, and MsNORK, respectively [7]. The SYMRK extracellular structure varies between species, but in eudicots it possesses a malectin domain, a conserved GDPC motif, and 2–3 LRR domains linked to a canonical intracellular serine-threonine kinase domain [7,8,9,10]. The malectin domain is cleaved in the absence of symbiotic signalling [11, 12]. Studies in L. japonicus showed that the remaining part of the SYMRK protein interacts with the LysM-type transmembrane receptor LjNFR5 [11, 12]. LjNFR5 is part of the receptor complex essential for recognising rhizobium-secreted lipo-chitooligosaccharide (LCO) signal molecules [13, 14]. Legume symrk knockout mutants are blocked in rhizobium LCO-induced signalling through the common symbiosis signalling pathway. Consequently, neither nodule formation nor rhizobium infection is initiated in symrk mutants [7, 8, 15, 16]. LjSYMRK also interacts with the innate immune receptor LjBAK1 (BRASSINOSTEROID INSENSITIVE 1-ASSOCIATED RECEPTOR KINASE 1), which may allow repression of immune responses upon symbiotic infection [17]. Such a role is supported by analyses of symrk mutant and RNA interference (RNAi) lines, which revealed fortification of the plant cell wall upon infection with the AM fungus Glomus mosseae or with rhizobium [18, 19].

Studies on SYMRK in non-legumes are limited. RNAi knockdown studies in the actinorhizal plants Datisca glomerata and Casuarina glauca showed that, as in legumes, SYMRK is essential for nodulation [9, 20]. These findings demonstrate that the common symbiosis signalling pathway defines a conserved genetic basis for nodulation with rhizobia or Frankia.

More recent phylogenomic studies support the hypothesis that the nodulation trait has a single evolutionary origin in the last common ancestor of the orders Fabales, Fagales, Cucurbitales and Rosales, representing all ten nodulating plant lineages [20,21,22]. The occurrence of non-nodulating lineages in these four taxonomic orders allowed the identification of nodulation-specific genes, as such genes are prone to pseudogenization from the moment a plant lineage loses the nodulation trait. We identified seven such nodulation-specific genes by comparing nodulating Parasponia species to their non-nodulating sister species of the genus Trema [23]. Among these is an NFR5 orthologous LysM-type receptor named NFP2, essential for nodulation in Parasponia [24]. To our surprise, however, we also identified a seemingly critical mutation in SYMRK of Trema orientalis (accession RG33; TorSYMRKRG33), originating from the Sabah Province in Malaysian Borneo [25]. This suggests that the TorSYMRKRG33 allele is undergoing pseudogenization, despite the fact that T. orientalis accession RG33 can still establish an AM symbiosis [23].

TorSYMRKRG33 has a conserved gene structure but carries a mutation in the 5’-donor splice site of intron 12 that converts this generally highly conserved dinucleotide motif into ‘GA’. This led us to investigate the symbiotic functioning of SYMRK in the Trema-Parasponia lineage and to assess the impact of a seemingly critical SNP in an intron donor splice site of this gene.

Results

Trema orientalis and Parasponia andersonii differ in Rhizophagus irregularis colonization

Since SYMRK is known to be important for arbuscular mycorrhization in a range of species [7, 8, 20, 26], we first questioned whether T. orientalis accession RG33 can be effectively mycorrhized. To investigate this, we compared the mycorrhization dynamics of T. orientalis RG33 to P. andersonii (accession WU1). Both species are close relatives that diverged less than 20 million years ago [22], though have a somewhat different root architecture. T. orientalis plantlets have a shorter main root, whereas its lateral roots are longer when compared to P. andersonii (Fig. S1).

To compare the mycorrhization efficiency, seedlings of both species were inoculated with 125 spores of Rhizophagus irregularis DAOM197198. Mycorrhization was quantified for 6 weeks, focussing on the frequency of mycorrhizal presence in the root system (F%), the intensity of mycorrhization in the root system (M%), the arbuscule abundance in the root system (A%), and the averaged arbuscule abundance in randomly selected infected root segments (a%) [27]. This revealed a clear difference in mycorrhizal colonization dynamics between the two species. The root system of P. andersonii is broadly colonized, showing an abundant presence of hyphae 4 weeks post-inoculation (F% > 80%, M% > 50%, Fig. 1A, C). In contrast, T. orientalis RG33 showed reduced mycorrhizal infection and a low abundance of mycorrhizal hyphae in the root (F% < 20%, M% < 10%, 4 weeks post-inoculation) (Fig. 1A, D). These reduced mycorrhizal infection rates of T. orientalis RG33 were also reflected in a reduced number of arbuscules found in the root system (A%). However, when evaluating the infected root segments, the arbuscule abundance (a%) was comparable to P. andersonii (Fig. 1B). This indicates that T. orientalis RG33 is infected less frequently by R. irregularis DAOM197198 than P. andersonii, but that once a root segment is infected, the number of arbuscules formed in it is similar between both species.

Fig. 1

Trema orientalis accession RG33 and Parasponia andersonii accession WU1 differ in mycorrhizal colonisation. A Comparison of mycorrhization efficiency in the root system of P. andersonii WU1 (blue) and T. orientalis RG33 (red) at 2, 4 and 6 weeks post-inoculation with Rhizophagus irregularis DAOM197198. F%: The frequency of mycorrhiza in the root system. M%: The intensity of mycorrhizal colonisation in the root system. A%: Arbuscule abundance in the root system. B a%: Averaged arbuscule abundance detected in 50 randomly selected 1 cm infected segments of a root system. Error bars represent the SE of 10 biological replicates, for each of which 50 × 1 cm root segments were analysed. Analysis was done according to Trouvelot et al. (1986) [27]. C Toluidine blue-stained P. andersonii and D T. orientalis root segment visualising R. irregularis arbuscules 6 weeks post-inoculation. Size bar = 10 μm

Parasponia andersonii SYMRK is essential for arbuscular mycorrhization and nodulation

As T. orientalis RG33 can establish an arbuscular mycorrhizal symbiosis, we questioned whether SYMRK represents a single-copy gene in the Trema-Parasponia taxonomic lineage. We analysed genome sequences of 20 species representing monocots and major clades of dicots, including Fabales, Fagales, Cucurbitales, and Rosales species. The closest SYMRK paralogs of P. andersonii and T. orientalis were included as an outgroup. This revealed that SYMRK is a single-copy gene in the Parasponia-Trema lineage (Fig. S2).

Knock-down experiments in legumes and the actinorhizal species C. glauca and D. glomerata showed that SYMRK plays a dual role in establishing arbuscular mycorrhizal symbiosis and nodulation [7,8,9, 15, 16, 20, 28]. Furthermore, studies in L. japonicus revealed that ectopic expression of LjSYMRK results in the spontaneous onset of nodule organogenesis in the absence of rhizobia [29]. To determine whether SYMRK in the Parasponia-Trema lineage fulfils a similar symbiotic role, two experiments were conducted. We generated CRISPR/Cas9 symrk knockout mutants in P. andersonii and conducted PanSYMRK ectopic expression studies in roots.

In total, three Pansymrk knockout mutant lines (homozygous line Pansymrk-4 and the bi-allelic mutant lines Pansymrk-5 and Pansymrk-6) were obtained by targeting the fourth and fifth coding exon using two single guide RNAs (sgRNAs) (Fig. S3A). All mutant alleles represent large deletions, only encoding a fragment of the extracellular domain (Fig. S3B). To determine whether SYMRK fulfils a key symbiotic function in P. andersonii, we first studied the nodulation phenotype of the Pansymrk mutants. Pansymrk-4, Pansymrk-5 and Pansymrk-6 plantlets were inoculated with Mesorhizobium plurifarium BOR2, and the nodulation phenotypes were examined six weeks post-inoculation. The transgenic empty vector control plants (EV) were effectively nodulated, having nodule numbers ranging from 25 to 61 per plant. In contrast, the three Pansymrk mutant lines were unable to nodulate (Fig. 2A).

Fig. 2

Parasponia andersonii SYMRK is essential for mycorrhization and nodulation. A Nodule numbers formed in P. andersonii empty vector control line (EV) and three Pansymrk mutant lines, 6 weeks post-inoculation with Mesorhizobium plurifarium BOR2. B Mycorrhization efficiency in the root system of P. andersonii EV-control and three independent Pansymrk mutant lines 6 weeks post-inoculation with Rhizophagus irregularis DAOM197198. F%: The frequency of mycorrhiza in the infected root system. M%: The intensity of mycorrhizal colonisation in the infected root system. A%: Arbuscule abundance in the infected root system. a%: Averaged arbuscule abundance detected in 50 randomly selected 1 cm segments of a root system. Error bars represent the SE of 10 biological replicates, for each of which 50 × 1 cm root segments were analysed. Analysis was done according to Trouvelot et al. (1986) [27]. (C-F) Toluidine blue-stained P. andersonii EV-control C, Pansymrk-4 D, Pansymrk-5 E, and Pansymrk-6 F root segment visualizing R. irregularis infections 6 weeks post-inoculation. Size bar = 10 μm

Next, we investigated the role of PanSYMRK in arbuscular mycorrhizal symbiosis. Pansymrk-4, Pansymrk-5, Pansymrk-6, and EV control plantlets were inoculated with an R. irregularis DAOM197198 spore suspension. Mycorrhization phenotypes were examined six weeks post-inoculation by quantifying four parameters: F%, M%, a%, and A%, as described above. The EV control plants interacted normally with the applied symbiont, with F%, M%, a%, and A% of 65.4%, 36.8%, 77.1%, and 26.1%, respectively (Fig. 2B, C). Although some intraradical hyphae were observed in a minority of the Pansymrk root segments (7 out of 417, 6 out of 760, and 9 out of 1085 segments) (Fig. 2B, D-F), no arbuscules were observed in any of the tested Pansymrk mutant plantlets. This demonstrates that SYMRK is essential for nodulation and arbuscular mycorrhization of P. andersonii roots.

Next, we questioned whether the ectopic expression of PanSYMRK is sufficient for spontaneous formation of nodule-like structures. We employed Agrobacterium rhizogenes (A. rhizogenes)-mediated root transformation to introduce PanSYMRK driven by the L. japonicus UBIQUITIN 1 (LjUBI1) promoter. This revealed spontaneous formation of nodule-like structures on roots ectopically expressing PanSYMRK (n = 5/25) (Fig. 3A-C). Longitudinal sections revealed that these nodule-like structures originate from dividing cortical and pericycle cells, similar to genuine Parasponia nodules (Fig. 3D). This led us to conclude that SYMRK is a key regulatory LRR-type receptor kinase essential for the onset of the nodule developmental program in the non-legume P. andersonii.

Fig. 3

PanSYMRK ectopic expression induces spontaneous nodulation in Parasponia andersonii. (A, B) Bright-field A and green fluorescent image B of P. andersonii A. rhizogenes-transformed roots expressing GFP and PanSYMRK under control of the pLjUBI1 promoter showing spontaneously formed nodule-like structures (6 weeks post planting). C Relative gene expression of PanSYMRK in P. andersonii A. rhizogenes-transformed roots containing an empty vector (EV) or pLjUBI1:PanSYMRK (n = 3). D Longitudinal section of a spontaneously formed nodule-like structure visualizing cortical and pericycle cell divisions

The GA mutation of the 5’-donor splice site of intron 12 does not affect SYMRK function

As T. orientalis RG33, which possesses a single SYMRK gene copy, can be mycorrhized effectively, the TorSYMRKRG33 allele presumably encodes a functional protein that supports this plant-fungus symbiosis. Earlier studies in M. truncatula revealed that the SYMRK requirements differ between mycorrhizal colonization and rhizobium nodulation [7]. The M. truncatula R38 dmi2 mutant possesses a missense mutation converting a glycine to glutamic acid at position 794 of the protein [7]. This mutation affects the kinase phosphorylation activity and the capacity of the protein to interact with its downstream target 3-HYDROXY-3-METHYLGLUTARYL COENZYME A REDUCTASE1 (MtHMGR1) [30, 31]. M. truncatula R38 dmi2 is affected in nodulation but not in mycorrhization, suggesting that a functional SYMRK kinase domain is less critical for the latter interaction [7]. As the T. orientalis SYMRKRG33 allele may encode, at least in part, a truncated SYMRK protein lacking essential domains of the kinase motif (Fig. 4A), we questioned to what extent this allele could function in nodulation.

Fig. 4

Parasponia symrk-5 mutant trans-complementation of root nodule symbiosis. A Schematic representation of P. andersonii SYMRK gene structure. Arrowhead points to the location of the introduced GA mutation in PanSYMRK at the 5’-donor splice site of intron 12. B Nodule number per plant formed on Pansymrk-5 A. rhizogenes-transformed roots carrying the pPanSYMRK:PanSYMRK gene (n = 13). (C-E) Nodule number per plant (n = 5) C, representative image of a green fluorescent protein (GFP)-expressing nodule D and a section through a mature nodule E of Pansymrk-5 A. rhizogenes-transformed roots carrying pPanSYMRK:PanSYMRKGA, which has a GA mutation at the 5’-donor splice site of intron 12. Nodules were harvested and analysed at 8 weeks post-inoculation with Mesorhizobium plurifarium BOR2 (OD600 = 0.025)

To investigate this, we first identified the native promoter region of P. andersonii SYMRK. We used A. rhizogenes root transformation to show that the PanSYMRK gene driven by a ~ 3 kb upstream region including the 5’-UTR functionally complemented the Pansymrk-5 mutant (4.9 nodules/plant at 8 wpi) (Fig. 4B; Fig. S4B). Next, we used this promoter to drive a PanSYMRK gene mutant harbouring a GA at the donor site of intron 12, mimicking the TorSYMRKRG33 allele, to determine its functionality in the P. andersonii Pansymrk-5 mutant background. Using A. rhizogenes root transformation, we found full complementation of the Pansymrk mutant phenotype (Fig. 4C-E; Fig. S4C). On average, 13 nodules per plant were formed at 8 wpi. Sections of these nodules revealed a wild-type cytoarchitecture, including a large zone of cells possessing fixation threads. This shows that the GA point mutation at the donor site of intron 12 does not affect SYMRK gene functionality.

A GA 5’-donor splice site is very rare, though effectively spliced in TorSYMRKRG33

We questioned how effectively an intron that possesses a GA as the first two nucleotides of its donor splice site is spliced. To determine this, we compared the RNA-seq read coverage of the 15 exons and 14 introns of the SYMRK gene of T. orientalis and P. andersonii. SYMRK is highly similar in both species, though introns show some variation in length (Table 1). SYMRK is known to be expressed in the root [8]. We grew T. orientalis and P. andersonii seedlings in vitro on a low nitrate medium and subsequently isolated 1 cm regions of roots just above the root meristematic zone. RNA extracted from these samples was sequenced (in triplicate), mapped, and analysed (Fig. S5A). When focussing on intron 12, we found a mean per-base coverage of 4.6 ± 1.0 for TorSYMRKRG33, whereas in P. andersonii the mean per-base coverage of this intron is only 0.2 ± 0.3 (Table 1). Comparing the splicing efficiency of intron 12, we observed that the GA splice site in T. orientalis is spliced with an efficiency of approximately 95%, whereas the GC splice site in P. andersonii shows an efficiency of 99.9%. This difference in intron retention between PanSYMRK and TorSYMRKRG33 was also observed by qRT-PCR on root mRNA (Fig. S5B). These data suggest that SYMRK intron 12 is spliced less efficiently in T. orientalis when compared to P. andersonii. However, a similar variance is observed for other introns that possess canonical donor and acceptor splice sites, e.g. PanSYMRK intron 11 (Table 1), suggesting that some intron retention does not hamper gene function. Therefore, we conclude that SYMRKRG33 is fully functional, despite a non-canonical GA dinucleotide motif in the donor splice site.
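The relation between per-base coverage and splicing efficiency used above can be illustrated with a minimal sketch. It assumes that efficiency is estimated as one minus the ratio of intron coverage to the coverage of the flanking exons; the exact calculation used for Table 1 is not stated in the text, so the function name `splicing_efficiency` and the example coverage values are hypothetical.

```python
def splicing_efficiency(intron_coverage, exon_coverage):
    """Estimate the fraction of transcripts in which an intron is removed.

    Assumption: transcripts retaining the intron cover it at the same depth
    as the flanking exons, so intron/exon coverage approximates retention.
    """
    if exon_coverage <= 0:
        raise ValueError("exon coverage must be positive")
    # Cap retention at 1.0 so noisy coverage cannot give a negative efficiency.
    retention = min(intron_coverage / exon_coverage, 1.0)
    return 1.0 - retention


# Hypothetical mean per-base coverages (not values from Table 1).
example_intron = 5.0
example_exon = 100.0
print(f"{splicing_efficiency(example_intron, example_exon):.1%}")
```

Under this assumption, an intron covered at one twentieth of the depth of its flanking exons corresponds to an efficiency of about 95%.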

Table 1 Splicing efficiency of Trema orientalis and Parasponia andersonii SYMRK in the root susceptible zone for symbiotic engagement

RNA-seq quantification for each intron and exon in the SYMRK gene of Trema orientalis and Parasponia andersonii is determined by the mean per base coverage of three biological replicates.

Next, we questioned how unique a GA donor splice site is in plants. For this, we analysed all annotated introns in T. orientalis, P. andersonii, and the model plant species L. japonicus, M. truncatula, and Arabidopsis thaliana [23, 32,33,34]. This showed that a GA donor splice site is extremely rare, varying from none in the annotated gene models of M. truncatula to 14 in A. thaliana (Table 2, Table S1).
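Counting donor dinucleotides over annotated introns can be sketched as follows. This is an illustration only, not the pipeline used for Table 2: it assumes introns are given as 1-based inclusive coordinates with a strand, and the toy genome, the coordinates and all function names are hypothetical.

```python
from collections import Counter

# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def donor_dinucleotide(chrom_seq, start, end, strand):
    """Return the first two intron bases in transcript orientation.

    start and end are 1-based inclusive intron coordinates on the forward strand.
    """
    if strand == "+":
        return chrom_seq[start - 1:start + 1].upper()
    # On the minus strand the donor site lies at the end of the interval.
    return reverse_complement(chrom_seq[end - 2:end]).upper()


def count_donor_sites(genome, introns):
    """Tally donor dinucleotides over (chrom, start, end, strand) records."""
    counts = Counter()
    for chrom, start, end, strand in introns:
        counts[donor_dinucleotide(genome[chrom], start, end, strand)] += 1
    return counts


# Toy example: one forward-strand GT-AG intron and one reverse-strand GA-AG intron.
toy_genome = {"chr1": "AAGTAAGCAGTTCTAAAATCGG"}
toy_introns = [("chr1", 3, 10, "+"), ("chr1", 13, 20, "-")]
print(count_donor_sites(toy_genome, toy_introns))
```

Reading the dinucleotide in transcript orientation matters: a minus-strand GA donor appears as TC at the end of the intron interval on the forward strand.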

Table 2 Frequency of predicted canonical and non-canonical donor splice sites. Splice site occurrences are based on existing gene models predictions for Trema orientalis, Parasponia andersonii, Lotus japonicus, Medicago truncatula and Arabidopsis thaliana

Trema orientalis SYMRK RG33 GA donor splice site is geographically limited

As T. orientalis RG33 possesses an extremely rare GA motif at the donor splice site of intron 12, we questioned to what extent such a polymorphism is unique in SYMRK. First, we analysed SYMRK orthologs in a broad phylogenetic context. This showed that a non-canonical GC donor splice site is common in SYMRK intron 12 of dicotyledon species (Fig. 5).

Fig. 5

Phylogeny of SYMRK including the splice site dinucleotide motifs for intron 12. Phylogeny was reconstructed based on an alignment of SYMRK orthologous proteins from 19 species. Leaves are labelled by their respective species, gene name (if available) and gene identifier. The non-canonical GC donor splice site is common in SYMRK intron 12 of dicotyledon species, except in Glycine max SYMRKβ and Pisum sativum SYM19, where GC is substituted by GT. In contrast, only Trema orientalis RG33 possesses a GA motif in this position (highlighted in red)

However, none of the analysed SYMRK genes possesses a GA motif at this position. Subsequently, we analysed SYMRK of the Parasponia-Trema species complex. Among others, T. orientalis accession RG33 was collected during an expedition in Sabah Province, Malaysian Borneo, in 2012 [23, 25]. We analysed 27 additional T. orientalis individuals collected from five distinct locations in Malaysian Borneo (Fig. 6A). All possess the rare GA intron 12 donor splice site, whereas this mutation is absent in Trema and Parasponia accessions sampled outside Borneo (Fig. 6; Table S2). This demonstrates that the SYMRKRG33 allele is not unique to accession RG33, but is associated with the Borneo T. orientalis population.

Fig. 6

SYMRK intron 12 unique non-canonical donor splice site occurs in a Trema orientalis population native to Sabah, Malaysia. A Locations of 28 Trema orientalis specimens collected in Malaysian Borneo, province of Sabah. 1: Sayap, 2: Poring, 3: Mahua, 4: Gunung Alab, and 5: Inobong. Plants were collected in 2012 as described in Merckx et al. (2015) [25] (see also Table S2). Map data © 2023 Google. B The ‘GA’ donor splice site of intron 12 is unique to Trema orientalis of Sabah, Malaysia, whereas related accessions and species possess a non-canonical ‘GC’ at this position in SYMRK

Discussion

The LRR-type receptor kinase SYMRK is a critical component in the common symbiosis signalling pathway controlling endosymbioses. In legumes, SYMRK is essential for rhizobium LCO-induced signalling. We identified a seemingly critical mutation of the conserved dinucleotide motif in the 5’-donor splice site of intron 12 in SYMRK of T. orientalis accession RG33. T. orientalis is a non-nodulating relative of nitrogen-fixing Parasponia species and has experienced pseudogenization of several key nodulation genes [23]. Here we show that despite a mutation in a splice site motif, TorSYMRKRG33 remains a functional allele that can be effectively spliced. The dominant occurrence of the TorSYMRKRG33 allele in the Malaysian Borneo T. orientalis population underlines that the splice site mutation does not affect the fitness of the tree species.

Splicing is a highly conserved process in eukaryotes, requiring a spliceosome complex consisting of five small nuclear RNAs and several proteins. The vast majority of introns are spliced by the so-called U2-type spliceosome, recognizing two highly conserved di-nucleotide motifs at the start and end of the intron sequence, namely GT-AG. Bioinformatic studies in plant, animal, and fungal species indicate that alternative dinucleotide motifs are used in less than 2% of cases, among which GC-AG is the most abundant non-canonical splice motif representing 1.5% of all introns annotated in plant gene models [35, 36]. The GA-AG splicing motif, as found in TorSYMRKRG33 intron 12, is reported to occur in > 0.03% of the cases [36].

The mechanism driving the evolution of rare non-canonical splice sites remains elusive. The GA-AG dinucleotide splicing motif was found at higher frequency in two unrelated animal species: the copepod Eurytemora affinis and the tunicate Oikopleura dioica [36,37,38]. However, it remains unknown whether both species have gained these motifs by convergent evolution or whether, alternatively, they represent an ancestral trait preserved in only a few species [36]. In the case of SYMRK, we noted that in related species, SYMRK intron 12 possesses the more common non-canonical GC-AG dinucleotide splice motif. This may lead to the hypothesis that such a GC-AG motif is the ancestral state allowing the evolution of the even rarer GA-AG motif. We introduced the GC to GA mutation in the P. andersonii SYMRK gene and showed that this variant is fully functional when expressed under its native promoter. This suggests that a single nucleotide polymorphism is sufficient to allow the evolution of the GA-AG dinucleotide splicing motif in TorSYMRKRG33. We analysed genomes of five plant species for gene models possessing a GA dinucleotide motif in the donor splice site. We found that the GA motif is indeed present in the annotated gene models, albeit at very low frequency in the analysed species.

Using CRISPR-Cas9 technology in P. andersonii, we demonstrated for the first time by mutant analysis that SYMRK plays an essential dual symbiotic role in nodulation and AM symbiosis in a non-legume. Earlier studies using RNAi in C. glauca and D. glomerata provided evidence that SYMRK is required for Frankia-induced nodulation and mycorrhization [9, 20]. Parasponia, Casuarina, and Datisca, together with legumes, represent all four taxonomic orders that contain nodulating species and for which SYMRK is an essential symbiotic gene. This supports the hypothesis that SYMRK, together with other components of the common symbiosis signalling pathway, was recruited to function in nodulation in a common ancestor that lived before the divergence of the Fabales, Fagales, Cucurbitales, and Rosales orders.

Conclusions

This study of the LRR-type receptor kinase SYMRK in T. orientalis, a non-nodulating relative of nitrogen-fixing Parasponia species, led to the identification of a functional splice site mutation in the gene. The discovery of this rare non-canonical GA-AG splice site motif in SYMRK raises questions about the evolution of such motifs and the mechanisms driving their occurrence. Furthermore, this study demonstrates the conservation of SYMRK functioning in nodulation and AM symbiosis in both legumes and non-legumes. The Parasponia-Trema comparative system was established to obtain insight into the evolutionary trajectory of the nodulation trait. It uncovered several genes critical for rhizobium-induced nodulation in a non-legume [23, 24, 39]. Eventually, Trema species can serve as an experimental test system to uncover the genes essential to rebuild the nodulation trait. Additionally, we demonstrated that the Parasponia-Trema comparative system is equally valuable for uncovering the functionality of rare non-canonical splicing motifs. Overall, this study contributes to our understanding of both the common symbiosis signalling pathway and the mechanisms of gene splicing in plants.

Material and methods

Plant materials and growth conditions

Trema orientalis plants used in this study were collected between September 10th and 25th, 2012, during the Crocker Range/Kinabalu Scientific Expedition. This expedition was conceived, organized, funded, and conducted jointly by Sabah Parks (Malaysia) and the Naturalis Biodiversity Center (The Netherlands). Detailed information about the expedition is available in Merckx et al., 2015 [25]. Taxonomic analysis of the T. orientalis samples has been previously published in van Velzen et al., 2018 [23]. P. andersonii WU1 and T. orientalis RG33 were grown and maintained as described previously [40, 41]. Plantlets for nodulation and mycorrhization assays were vegetatively propagated in vitro and rooted [40, 41].

Mycorrhization assays and trypan blue staining

Mycorrhization assays were performed using a commercial spore inoculum of Rhizophagus irregularis (Agronutrition-DAOM197198, Carbonne, France). Spore inoculum preparation, inoculation, and trypan blue staining were performed as described previously [41].

To quantify mycorrhization, a minimum of ~ 50 cm of roots for each sample was cut into 1 cm fragments. 25–30 root fragments were placed on a single microscope slide, and 30% glycerol was added. Roots were covered with a cover glass and pressed until the root fragments became flat. The frequency of mycorrhiza (F%), the intensity of mycorrhizal colonization (M%), and the arbuscule abundance (A%) in the root system were scored and calculated according to Trouvelot et al. [27].
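As an illustration of how F% and M% follow from per-fragment scores, the sketch below assumes the commonly cited Trouvelot colonisation classes 0 to 5 with class weights 1, 5, 30, 70 and 95. The function names and the example scores are hypothetical, and the arbuscule parameters (a% and A%) are not covered.

```python
from collections import Counter

# Weights for colonisation classes 1-5 (assumed from the Trouvelot scheme);
# class 0 means no colonisation and contributes nothing.
CLASS_WEIGHTS = {1: 1, 2: 5, 3: 30, 4: 70, 5: 95}


def frequency_percent(classes):
    """F%: share of root fragments showing any mycorrhizal colonisation."""
    colonised = sum(1 for c in classes if c > 0)
    return 100.0 * colonised / len(classes)


def intensity_percent(classes):
    """M%: class-weighted colonisation intensity averaged over all fragments."""
    counts = Counter(classes)
    weighted = sum(CLASS_WEIGHTS[c] * n for c, n in counts.items() if c > 0)
    return weighted / len(classes)


# Hypothetical class scores for ten 1 cm fragments (not data from this study).
example_scores = [0, 0, 1, 2, 3, 3, 4, 5, 0, 2]
print(frequency_percent(example_scores))
print(intensity_percent(example_scores))
```

Because M% is averaged over all fragments, including uncolonised ones, a root system with few but heavily colonised fragments can combine a low F% with a moderate M%.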

Nodulation assay

P. andersonii plantlets for nodulation were inoculated with Mesorhizobium plurifarium BOR2 (OD600 = 0.05) [23, 40, 41]. Plants were removed from the pots six weeks post-inoculation, roots were washed with running water to remove perlite, and nodules were counted. In (trans) complementation studies, plant roots were examined under fluorescent stereo microscopy, and nodule number was quantified for each transgenic root (eight weeks post-inoculation with Mesorhizobium plurifarium BOR2 (OD600 = 0.025)).

Root growth assay

Five seedlings of P. andersonii and T. orientalis RG33 were grown on ½ strength modified Hoagland medium in 12 cm square plates. Plants were grown vertically at a 60-degree angle for 21 days at 28 °C under a 16/8 h day-night regime. The primary root was defined as the main root that emerged from the seedling, whereas lateral roots were defined as roots that emerged from the primary root. Per plant, primary root length, the number of lateral roots, and lateral root density (per cm main root) were determined 21 days post germination. Primary root growth was measured by following its development every day for 21 days post germination. The average lateral root length was determined by measuring five selected lateral roots 21 days post germination.

Vectors and constructs

Single-guide RNAs (sgRNAs) were designed using the ‘Find CRISPR Targets’ function implemented in Geneious software version 9.1.5 (Biomatters, New Zealand) and subsequently checked against the P. andersonii genome for high identity off-targets. For CRISPR/Cas9-mediated mutagenesis and complementation studies, binary transformation constructs were created using Golden Gate assembly as described previously [40, 41]; the constructs generated for both studies are listed in Table S3. For CRISPR/Cas9-mediated mutagenesis, two sgRNAs were used to target the fourth and the fifth coding exons of PanSYMRK (Fig. S3). Selected sgRNAs were amplified using sequence-specific forward primers and a universal reverse primer (Table S4), using Addgene plasmid no. 46966 as template [42]. To allow for Golden Gate cloning, BpiI and BsaI restriction sites in the putative promoter sequence of PanSYMRK were mutated by introducing single nucleotide substitutions [43]. For the complementation study, the sequences of the P. andersonii SYMRK promoter, 5’ untranslated region (5’ UTR), genomic DNA, 3’ untranslated region (3’ UTR), and terminator were synthesized. In addition, a modified version of the P. andersonii SYMRK genomic DNA was synthesized harbouring a point mutation at the donor splice site of the 12th intron, mimicking T. orientalis SYMRKRG33 (Invitrogen, Thermo Fisher Scientific, United States).

Plant transformation

Agrobacterium tumefaciens-mediated transformation and genotyping were done based on previously published protocols [40, 41]. Primers used for genotyping are listed in Table S4. Hairy root transformations were performed according to Cao et al. [44], where A. rhizogenes MSU440 or AR1193 harbouring the plasmid of interest was used to infect micro-propagated plants wounded at their base. Infected plants were grown on agar plates of Schenk and Hildebrandt medium (SH medium) [45] and incubated at 21 °C for one week under a 16/8 h light/dark regime. Transformed plants were transferred to agar plates of SH medium supplemented with 10 g sucrose/L, cefotaxime 100 μg/mL, and kanamycin 50 μg/mL and subsequently incubated at 21 °C for one week followed by 28 °C for two weeks. Plants were checked for transgenic roots using a fluorescence stereo microscope.

RNA Sequencing

For RNA isolation, tissue was harvested from a ~ 1 cm region just above the meristematic zone of young growing roots and snap-frozen in liquid nitrogen. Material from ~ 5 plants was combined to form a single biological replicate. RNA was isolated in triplicate as previously described [23]. Library preparation and RNA sequencing were conducted by BGI (Shenzhen, China). Mapped RNA-sequencing reads covering the SYMRK gene in P. andersonii and T. orientalis were visualized using Integrative Genomics Viewer (IGV) [46]. Based on the different splice sites, two SYMRK splice variants were manually constructed. Functional protein domains for these variants were annotated using InterProScan 5 [47].

Phylogenetic reconstruction

Orthologs of SYMRK were identified among 49 publicly available proteomes by applying a Reciprocal Best Hits (RBH) approach, using L. japonicus SYMRK (Lj2g3v1467920.1) as the query sequence. Identified orthologous proteins were aligned using Clustal Omega 1.2.3 [48]. A phylogenetic SYMRK tree was constructed using PhyML 3.0 [49] with the LG substitution model and 1,000 bootstrap replicates, and rooted on the two Poales outgroup species. The tree was visualized using the Interactive Tree Of Life (iTOL) tree viewer [50]. A sub-selection of 20 species was extracted from the SYMRK orthogroup, and a tree was constructed using the same methods described above. Based on the SYMRK gene models for these 20 species, the splice site at intron 12 for each SYMRK ortholog was added.
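The Reciprocal Best Hits principle can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used here: similarity search results are assumed to be available as (query, subject, score) tuples, and the protein identifiers, scores and function names are hypothetical.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    hits: iterable of (query, subject, score) tuples, for example parsed
    from tabular output of a protein similarity search.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return pairs that are each other's best hit in both search directions."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return {a: b for a, b in forward.items() if reverse.get(b) == a}


# Hypothetical identifiers and scores, for illustration only.
lotus_vs_target = [
    ("LjSYMRK", "target_SYMRK", 900.0),
    ("LjSYMRK", "target_paralog", 400.0),
    ("LjOther", "target_paralog", 300.0),
]
target_vs_lotus = [
    ("target_SYMRK", "LjSYMRK", 890.0),
    ("target_paralog", "LjSYMRK", 410.0),
    ("target_paralog", "LjOther", 250.0),
]
print(reciprocal_best_hits(lotus_vs_target, target_vs_lotus))
```

In this toy case the paralog is excluded because its own best hit points back to a protein whose best hit is a different sequence, which is the property that makes RBH a conservative ortholog filter.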

Statistical analysis

Graphs and statistical analysis for mycorrhization quantification were performed using RStudio version 1.1.456. The Ramf R package was used to analyze and display quantitative AM fungal root colonization data [51]. Statistical tests on three classes of mycorrhization efficiency were done using the Kruskal–Wallis test in combination with a post-hoc test using Fisher’s least significant difference criterion. Statistical significance was defined as p < 0.01. Statistical tests on root growth assays and on nodule number quantification in the complementation study were done using Student’s t-test. Statistical significance for these parameters was defined as p < 0.05.