
1 Introduction

With the introduction of precision genome editing using Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) and CRISPR-associated protein (Cas) technology, we have entered a new era of genetic engineering. CRISPR technology allows straightforward, cost-effective and efficient gene editing compared with earlier technologies such as zinc finger nucleases (ZFNs) and transcription activator-like effector nucleases (TALENs). CRISPR represents a new perspective for plant breeding and a powerful alternative for genetic engineering to speed up the introduction of improved traits by precise and predictable modifications (deletions, substitutions, insertions) directly in an elite background [38]. A major challenge for the application of genome editing in crop breeding is generating plants without introducing a transgene, and this has led to new challenges for the regulation and social acceptance of genetically modified organisms (GMOs). Furthermore, CRISPR technology can produce novel transgene-free plants that are indistinguishable from natural variants or from plants generated by conventional breeding techniques [42]. It has emerged as a powerful tool for applications in medicine, agriculture, and basic studies of gene function. In plants, since the first demonstration of CRISPR-mediated DNA editing in 2013, there has been much progress in basic plant science and crop improvement, with applications in biotic and abiotic stress tolerance, yield improvement, biofortification and enhancement of plant quality [118], ranging from model plants such as Arabidopsis thaliana to food crops [38].

In addition, CRISPR technology can accelerate crop domestication, a labour-intensive process involving alteration of a plant from its wild state to a new form that can serve human needs. Recently, CRISPR was used to domesticate wild tomato, Solanum pimpinellifolium, which is remarkably stress tolerant but is defective in terms of fruit production [77]. In one study, six loci that are important for yield and productivity were targeted, and the engineered lines displayed increased fruit size, fruit number and fruit lycopene accumulation [157].

In prokaryotes, CRISPR loci are a family of DNA sequences found in bacterial and archaeal genomes; together with the Cas proteins they form an adaptive immune system that naturally protects cells from DNA virus infections [38]. As a biotechnological tool, the CRISPR/Cas9 system comprises the Cas9 endonuclease and a synthetic single guide RNA (sgRNA), which combines the functions of the CRISPR RNA (crRNA) and the trans-activating CRISPR RNA (tracrRNA) to direct the Cas9 protein to the DNA target sequence preceding the protospacer adjacent motif (PAM) [153].

Various strategies and different types of CRISPR/Cas systems, as genetic manipulation tools, have been attempted to generate and study the impact of functional mutations in crop improvement [92].

Furthermore, the requirement for a specific PAM sequence is a major factor restricting the selection of target sequences. For this reason, different engineered SpCas9 or orthologous Cas9 nucleases derived from different organisms, able to recognize different PAM sequences, have been used for genome editing [28].

The desired genetic modification is initiated by inducing double-strand breaks (DSBs) in a target sequence using nucleases, and it is subsequently attained through DNA repair by non-homologous end joining (NHEJ) or homology-directed repair (HDR) [36].

The genome editing events obtained in plants through the NHEJ mechanism and described in the literature are mainly gene knock-outs [117], whereas the rate of knock-in is much lower [41]. In fact, NHEJ plays a major role in the repair of DSBs through insertions and deletions (indels) at the junction, which change the nucleotide sequence surrounding the repair region [140]. These indels may cause frameshifts and lead to knock-out of the corresponding gene [122].

HDR-mediated genome editing, also known as gene targeting (GT), is the approach used to introduce precise insertions (knock-ins) using information from an exogenously supplied DNA donor template to repair the break. The information is copied from the donor template to the chromosome, achieving the desired DNA sequence modification [22]. GT in higher plants was extremely difficult for decades. One of the obstacles in achieving HDR has been the ability to deliver sufficient donor template to the plant cell to repair the DSB [51]. The development of approaches to improve GT efficiency in plants is in progress; however, there is still no universal and efficient method for increasing the knock-in frequency. Currently the best approach for precise modification of plant genomes is geminivirus-mediated gene targeting, but other approaches that are not HDR-mediated, such as Base Editing (BE) and Prime Editing (PE), are also promising [44].

Irrespective of the nuclease or editing system, a major concern of CRISPR/Cas technology is the guide efficiency and specificity for a given gene [74]. Moreover, despite the huge repertoire of CRISPR-based molecular tools and their great potential for improving food and nutrition security, the application of genome editing in agriculture is still slowed down by some limitations. One of the main bottlenecks is the ability to regenerate transformed plants. This process is very time-consuming and labour-intensive, and in many cases it leads to a very low transformation frequency, especially for species and genotypes that are recalcitrant to regeneration or poorly transformable. For this reason, it becomes essential to rely on highly efficient editing systems and, consequently, on a rigorous experimental design. This should include a careful choice of the most suitable editing system and the validation of the CRISPR reagents before undertaking stable transformation. In particular, the evaluation of guide efficiency is of utmost importance to ensure a successful CRISPR experiment.

2 Approaches and Constraints of Genome Editing Using CRISPR Technology

2.1 CRISPR-Based Editors

Many CRISPR-based genome editing tools have been developed to facilitate efficient plant genome engineering. Thanks to its flexibility, efficiency and low cost, CRISPR/Cas technology has been widely used in plants for fundamental and applied research.

CRISPR/Cas systems are divided into two distinct classes according to their structures and functions: class 1 systems (including types I, III, and IV) that use multiprotein complexes to destroy foreign nucleic acids, and class 2 systems (including types II, V, and VI) that use single proteins [93].

The CRISPR/Cas9 system (type II) is the most frequently used. It is composed of a Cas9 nuclease and an engineered single guide RNA (sgRNA). The sgRNA comprises a scaffold sequence necessary for Cas9 binding and a specific spacer sequence designed to be complementary to the target DNA site. In the genome, the target site must be followed by a short DNA sequence that acts as a binding signal for Cas9, the protospacer adjacent motif (PAM). Every Cas9 requires a specific PAM to recognize and cleave the target DNA: for example, the widely used Cas9 from Streptococcus pyogenes (SpCas9) recognizes a 5′-NGG-3′ PAM sequence [134]. Cas9 contains two nuclease domains, HNH and RuvC, which together cleave the double-stranded DNA (dsDNA) close to the PAM in the target DNA. The resulting DSB is then usually repaired by NHEJ, the most active repair pathway; this usually leads to nucleotide insertions or deletions (indels) at the cleavage position. This system enables targeted mutations to be introduced into genomes with high efficiency, but the resulting mutations can vary and are not easy to predict.

Although SpCas9 is very efficient, its specific PAM requirement, target specificity and large protein size limit its applications. To overcome these limitations, several Cas9 orthologs and variants have been studied for their suitability in genome editing; they exhibit diverse PAM preferences and vary in molecular weight. These include the Cas9 proteins of Staphylococcus aureus (SaCas9), Neisseria meningitidis (NmeCas9), Streptococcus thermophilus (StCas9), Campylobacter jejuni (CjeCas9) and others [88]. The PAM sequences recognized by these Cas9s are relatively complex, which limits the widespread use of these nucleases in genome editing. However, some engineered Cas9 variants with increased PAM compatibility have been reported, such as VQR-Cas9 (NGA PAM), VRER-Cas9 (NGCG PAM), EQR-Cas9 (NGAG PAM), xCas9 (NG, GAA and GTA PAMs), SpCas9-NG (NG PAM), SpG, and SpRY, a near-PAMless SpCas9 variant (NRN>NYN PAMs) [88]. These variants allow the targeting of simple, non-canonical PAM sites, thus expanding the range of targetable sequences.
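To make the effect of PAM flexibility concrete, the following minimal Python sketch counts how many forward-strand target sites a given PAM makes available in a sequence. The PAM strings are those quoted above; the helper names (`pam_regex`, `count_sites`), the 20-nt spacer length and the restriction to the forward strand are illustrative assumptions, not part of any published tool.

```python
import re

# IUPAC codes needed for the PAMs quoted in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "[ACGT]", "R": "[AG]", "Y": "[CT]"}

# PAMs as listed in the text (written 5'->3', located 3' of the protospacer).
PAMS = {
    "SpCas9": "NGG",
    "VQR-Cas9": "NGA",
    "VRER-Cas9": "NGCG",
    "EQR-Cas9": "NGAG",
    "SpCas9-NG": "NG",
}

def pam_regex(pam):
    """Translate an IUPAC PAM string into a regular expression fragment."""
    return "".join(IUPAC[base] for base in pam)

def count_sites(seq, pam, spacer_len=20):
    """Count forward-strand positions where a protospacer is followed by the PAM.

    spacer_len=20 is an assumed default; a lookahead is used so that
    overlapping sites are all counted.
    """
    pattern = re.compile(r"(?=[ACGT]{%d}%s)" % (spacer_len, pam_regex(pam)))
    return len(pattern.findall(seq.upper()))

if __name__ == "__main__":
    example = "A" * 20 + "TGG"  # toy sequence, not a real locus
    for name, pam in PAMS.items():
        print(name, pam, count_sites(example, pam))
```

On a real gene the same loop gives a quick impression of how much a relaxed PAM such as NG enlarges the set of candidate sites compared with NGG.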

In addition to the Cas9 proteins, class 2 type V CRISPR-Cas systems involving Cas12a (or Cpf1) and Cas12b (or C2c1) have been adopted for modifying genomes at AT-rich PAM sequences [10], and presently AsCpf1, LbCpf1, and FnCpf1 are the most common types of Cpf1 used in genome editing. Another interesting alternative Cas nuclease recently identified is the type I CRISPR/Cas10, which causes long-range deletions [143]. An example application can be found in Osakabe et al. [111], where a 7.2 kb deletion was obtained in tomato.

During CRISPR experiment design, a major concern is guide efficiency and specificity. In fact, although this system can be programmed to cleave virtually any sequence preceding a PAM site, it does not always succeed in targeting all the predicted sites [91]. Multiple guides designed with different target sequences determine the rate at which simultaneous modifications can be introduced in the genome and therefore the ability to perform comprehensive genome engineering at the corresponding specific sites [30]. This feature is especially important for editing multiple loci simultaneously in the same individual.

The CRISPR system has also been used to achieve precise changes in the plant genome by different approaches, such as homology-directed DNA repair (HDR)-mediated GT [44], BE [50, 82] and PE [6, 44, 82].

GT is an HDR-mediated targeted gene replacement that requires the presence of a DNA donor template containing the desired sequence delivered with the guide and Cas9. GT enables specific nucleotide changes ranging from a single nucleotide change to large insertions or deletions. However, GT is inefficient because HDR occurs with extremely low frequency in plants, limiting the widespread use of this process for gene modification.

Different strategies have been applied to improve GT efficiencies, such as increasing the copy number of the repair template using geminivirus replicons or releasing the template from a T-DNA or manipulating the DNA repair pathway to improve HDR frequency [44], but despite these attempts, efficiencies are still low.

Unlike HDR, base editors (BEs) require neither the formation of DSBs nor a repair template. In general, a base editor is composed of an impaired nuclease, either a nickase (nCas9) or a dead Cas9 (dCas9), fused with a deaminase that converts one nucleotide into another. According to the type of deaminase, BE systems are classified as cytosine base editors (CBEs, which convert C to T), adenine base editors (ABEs, which convert A to G) and dual-base editors. In recent years several base editors have been developed using Cas protein variants with different PAM requirements (SpCas9, SaCas9, SaKKH-Cas9, VQR-Cas9, SpRY, SpCas9-NG), testing several deaminases to improve editing efficiency (APOBEC, BE3, AID, CDA1, A3A, ABE7.10, ABEmax, ABE8e, ABE9) and engineering their connection to Cas9 to alter the position and width of the editing window [50, 102]. Currently, efficient C-to-T and A-to-G editing has been achieved with BEs, but not all BE systems work equally well in plants and the technology still has some limits. BE is limited by the targeting scope of the Cas protein, it works only within a narrow activity window, and it has low accuracy when multiple target nucleotides of the deaminase are present within that window. In addition, the purity of the cytosine base editing (CBE) product depends on the uracil N-glycosylase inhibitor (UGI).
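The bystander problem described above can be checked in silico before a guide is chosen. The sketch below lists the positions of the deaminase substrate inside an activity window and flags additional editable bases next to the intended one. The window of protospacer positions 4 to 8 (counted from the PAM-distal end) is only an assumed default for illustration; the real window depends on the editor used, and the function names are hypothetical.

```python
def editable_bases(protospacer, target_base="C", window=(4, 8)):
    """Return 1-based positions of target_base inside the activity window.

    Positions are counted from the PAM-distal (5') end of the protospacer.
    The default window is an assumption and must be adapted to the editor.
    """
    start, end = window
    seq = protospacer.upper()
    return [pos for pos in range(start, end + 1) if seq[pos - 1] == target_base]

def has_bystanders(protospacer, intended_pos, target_base="C", window=(4, 8)):
    """True if bases other than the intended one could also be deaminated."""
    hits = editable_bases(protospacer, target_base, window)
    return any(pos != intended_pos for pos in hits)

if __name__ == "__main__":
    spacer = "GACCTCAGGTTTAGGACTGA"  # toy protospacer, not a real target
    print(editable_bases(spacer, "C"))       # candidate positions for a CBE
    print(has_bystanders(spacer, 4, "C"))    # another C shares the window
    print(editable_bases(spacer, "A"))       # candidate positions for an ABE
```

A guide for which `has_bystanders` returns True is not necessarily unusable, but the expected product purity is lower and an editor with a narrower window, or a shifted guide, may be preferable.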

Prime editing (PE) is a precise genome editing technology capable of introducing a predefined change in a genome without the need for a DSB. PE can achieve a variety of edits, including all 12 types of base substitution, small indels, and replacements. The PE system consists of a Cas9 nickase (nCas9) fused with a reverse transcriptase (RT), the Moloney Murine Leukemia Virus RT (M-MLV RT), and a PE gRNA (pegRNA), an extended sgRNA that both specifies the target site and encodes the desired edit [6]. The nCas9/pegRNA complex binds to the desired target region and nicks one strand of the targeted DNA, providing a primer to initiate the synthesis of edited DNA, while the pegRNA acts as the template for reverse transcription.

Different versions of PEs have been developed in plants, such as PE1, PE2, PE3 and PE3b [102]. Unfortunately, the editing efficiency of PE is low. Different strategies have been used to increase it, such as the use of alternative promoters for the expression of Cas9 and RT, codon optimization of their coding sequences and fusion with nuclear localization signals [50]. Furthermore, a recent study in rice has shown that designing the primer binding site with a melting temperature of 30 °C and using two pegRNAs in trans encoding the same edits enhanced the editing efficiency [86]. Nevertheless, further improvement is needed to broaden the application of PE in plants, by optimizing key parameters such as RT enzyme type, experimental conditions, and pegRNA design [50].

2.2 Plant Transformation and Regeneration Bottlenecks

To initiate a CRISPR-mediated genome editing experiment, CRISPR reagents are delivered into plant cells through various methods such as Agrobacterium-based delivery [2], particle bombardment [112], or protoplast-based delivery [152]. Regardless of the delivery method, plant cells must undergo tissue culture procedures after transformation or transfection to obtain fully edited plants. The process of organogenesis involves three phases: cell dedifferentiation, cell reprogramming, and the development of new apical meristems (root apical meristem or shoot apical meristem) [11, 73]. These steps are challenging, time-consuming, and labor-intensive.

The success of the regeneration process depends on the ability of explant cells to overcome their programmed cell type. Once reprogramming and regeneration are activated, cells acquire a new fate, leading to the generation of new meristems and organs. However, the process is not linear and can encounter obstacles in each phase, affecting organ or plant development.

Factors such as the ratio of auxins to cytokinins [128], carbon sources, salts, vitamins [25, 151], hormones [40], and the type of explant used [33] can influence the success of regeneration. Epigenetic factors [90] and other intrinsic factors such as endogenous hormones, hormone receptors, transcription factors, and hormone signal transducers [11, 73] also play a role in guiding cell fate during the regeneration process.

Certain plant species, like Nicotiana and tomato [37, 48], can be regenerated in vitro with relatively high efficiency, while others like pepper [97] or fruit trees [95, 100, 137] exhibit strong recalcitrance to regeneration. One strategy to enhance regenerative capacity is the expression of key transcription factors involved in meristem organization and development, such as BABY BOOM [14, 131], WUSCHEL [13, 18, 58], and SHOOT MERISTEMLESS [18, 156].

The regenerative capacity of each species can be a limiting factor when using genome editing technologies. These processes can act as bottlenecks in the gene editing pipeline, particularly for plants that are difficult to culture and regenerate in vitro. Therefore, it is crucial to test and validate the efficiency of the CRISPR guides used for mutagenesis to increase the likelihood of successful mutagenesis events and reduce the number of plants needed for analysis after transformation and regeneration.

3 In Silico Designing of a Successful CRISPR Experiment

3.1 Features Affecting CRISPR-Mediated Editing

When designing a CRISPR experiment, the main issue to be taken into account for successful gene editing is the optimal trade-off between efficiency and specificity of CRISPR machinery. The ultimate goal is to maximize the on-target mutation rate and to avoid off-target mutations, which can occur when unintended genomic sites are cleaved due to sequence homology with the target site(s).

The specificity issue is less problematic in plant breeding than in clinical research because unwanted mutations can be segregated away from the on-target mutation(s) by crossing mutants with wild-type plants. However, the crossing procedure can be laborious, time-consuming, or even impossible for perennial plants and asexually propagated crops. Consequently, as a general rule, it is important to choose the RNA guides with the highest specificity scores (minimum off-target risk) [35, 103]. Because it depends tightly on sequence homology, the off-target risk is normally predictable by in silico analysis, and many bioinformatic tools are now freely available. While SpCas9 tolerates single-base mismatches in the PAM-distal region, the PAM-proximal region is much more sensitive: even single mismatches can inhibit cleavage [47, 62]. Therefore, only guides whose potential off-target sites have at least 1–2 mismatches within the PAM-proximal region should be considered highly specific.
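The mismatch criterion stated above can be expressed as a simple filter. The following sketch splits the mismatches between a guide and a candidate off-target site into PAM-distal and PAM-proximal counts and accepts a guide only if every listed off-target site carries enough PAM-proximal mismatches. The length of the PAM-proximal region (`seed_len=10`) is an assumption made for illustration, since the text does not fix it, and the function names are hypothetical; established scores such as MIT or CFD weight each position more finely.

```python
def mismatch_profile(guide, off_target, seed_len=10):
    """Return (distal_mismatches, proximal_mismatches) for two spacers.

    Both sequences are written 5'->3' with the PAM following the 3' end,
    so the last seed_len bases are the PAM-proximal region.
    seed_len is an assumed value, not taken from the source.
    """
    if len(guide) != len(off_target):
        raise ValueError("guide and off-target site must have equal length")
    split = len(guide) - seed_len
    mismatches = [a != b for a, b in zip(guide.upper(), off_target.upper())]
    return sum(mismatches[:split]), sum(mismatches[split:])

def guide_is_specific(guide, off_sites, seed_len=10, min_proximal=1):
    """True if every off-target site has at least min_proximal seed mismatches."""
    return all(
        mismatch_profile(guide, site, seed_len)[1] >= min_proximal
        for site in off_sites
    )

if __name__ == "__main__":
    guide = "ACGTACGTACGTACGTACGT"        # toy spacer
    distal_hit = "TCGTACGTACGTACGTACGT"   # single mismatch far from the PAM
    proximal_hit = "ACGTACGTACGTACGTACGA" # single mismatch next to the PAM
    print(mismatch_profile(guide, distal_hit))
    print(mismatch_profile(guide, proximal_hit))
    print(guide_is_specific(guide, [distal_hit, proximal_hit]))
```

Setting `min_proximal=2` applies the stricter end of the 1–2 mismatch range mentioned in the text.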

Mutagenesis efficiency, on the other hand, is another bottleneck for genome editing in plants. In fact, a low mutation frequency coupled with poor regeneration performance may jeopardize the success of the experiment, resulting in no edited plants.

Several aspects can affect a CRISPR-mediated editing experiment; they concern intrinsic features of the nucleases and, most importantly, of the RNA guides, which differ depending on the DNA sequence to be mutagenized.

When designing a CRISPR experiment, the first step is deciding which type of nuclease to use. In plants, codon-optimized versions of SpCas9 from Streptococcus pyogenes are the most widely used nucleases. Its PAM (NGG) is normally well distributed in the genome [12]; therefore, unless there are specific requirements, an NGG PAM is usually suitable for generating loss-of-function mutants. For T-rich regions, Cpf1 has found wide application in plant genome editing [10]. However, in some cases it is necessary to edit specific genomic regions for which SpCas9 or Cpf1 cannot be used, e.g. in Gene-Targeting, Base Editing or Prime Editing experiments. Having a wide repertoire of gene editing tools at one’s disposal gives a better chance of obtaining the desired mutation. To this end, many efforts have been made to engineer SpCas9 to expand its ability to recognize different [70] or more flexible PAMs [144]. Furthermore, Cas9 orthologues with different PAM preferences have been identified and used for genome editing [23, 46, 104, 120, 133].

Considering the huge genetic variability in microorganisms, it is expected to see an increasing number of Cas nucleases available for genome editing in the near future. To this end, Ciciani et al. [27] developed a computational pipeline to identify and isolate sequence-tailored Cas9 nucleases expanding the genome editing toolbox to respond to possibly any PAM requirement.

3.2 Selection of the Gene Region for Mutagenesis

The availability and distribution of PAMs within the gene of interest identify potential target sites for mutagenesis. However, not all positions within the gene are equivalent. The choice of the target sequence depends on the specific aim of the editing. In certain cases, such as the removal or disruption of a specific regulatory element in the promoter or in an intron, or when modifying a particular sequence by Base Editing, Prime Editing or Gene-Targeting, there may be limited flexibility in selecting the target sequence. To knock out gene function, it is generally recommended to target the coding sequence relatively close to the region encoding the N-terminus in order to generate a premature termination codon (PTC). However, targeting regions too close to the initial ATG might be impractical due to a lack of PAMs. Even if it were possible, there is a risk that other ATG codons downstream of the mutation could act as translation re-initiation sites, leading to N-terminally truncated proteins with partial activity [129]. If the functional or structural domain(s) of the protein are already characterized, the guide can be designed to target those specific domains, rendering the protein non-functional [125]. However, designing guides on a single exon might not guarantee a successful knock-out if alternative splicing eliminates the mutated exon and produces a partially functional protein [129]. Other strategies include designing guides that bind at the exon-intron junctions to disrupt the splice site and generate misprocessed mRNA, or attempting to delete the entire gene by designing target sequences upstream and downstream of the coding sequence. However, the latter event may occur at a low frequency due to the length of the gene. Point mutations in the promoter and in the untranslated regions (UTRs) may have only a minor effect on the expression and stability of the mRNA, respectively. For instance, it has been observed in mammalian cells that sgRNAs targeting the 5′ and 3′ UTRs were highly ineffective [34]. To ensure gene disruption, the best strategy is to employ multiple guides targeting different positions along the gene, leading to multiple mutations or deletions of varying size.
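The reasoning behind targeting the coding sequence to obtain a PTC can be illustrated with a toy simulation. The sketch below applies a deletion to a coding sequence and translates the result up to the first stop codon, showing how a 1-bp deletion shifts the frame and truncates the protein, whereas a 3-bp deletion only removes one residue. The coding sequence is invented for the example, the standard genetic code is assumed, and the helper names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(cds):
    """Translate a coding sequence, stopping at the first stop codon ('*')."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]
        protein.append(amino_acid)
        if amino_acid == "*":
            break
    return "".join(protein)

def delete_bases(cds, pos, length):
    """Return the sequence with `length` bases removed starting at 0-based pos."""
    return cds[:pos] + cds[pos + length:]

if __name__ == "__main__":
    wild_type = "ATGGCTCTAAGTGGTTAA"  # invented toy CDS
    print(translate(wild_type))                      # full-length product
    print(translate(delete_bases(wild_type, 3, 1)))  # frameshift, early stop
    print(translate(delete_bases(wild_type, 3, 3)))  # in-frame deletion
```

The in-frame case is a reminder that not every indel produces a knock-out, which is one reason why several guides per gene are recommended.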

In addition to the selection of the nuclease and of the position on the target gene, another critical aspect in a CRISPR experiment is the guide’s ability to form a complex with the nuclease and to trigger the cut at the target site. Several papers have investigated the characteristics of an ideal guide for optimal mutation induction, primarily based on the SpCas9 system. It has been observed that nucleotide composition, GC content and secondary structure play a pivotal role in determining guide efficiency.

Wang et al. [145] found that the nucleotide composition of the PAM-proximal region is an important factor determining cutting efficiency: purines (G/A) are preferred in the last 4 bases of the guide, while pyrimidines are disfavored. These data have been substantially confirmed in other papers; in particular, there is strong evidence for the preference of a G in the first position before the PAM [34, 103, 148] and for the disfavoring of T in the last 4 bases of the spacer [34, 148]. In plants, Liang et al. [83] did not find a specific relationship between spacer nucleotide composition and guide efficiency. This finding suggests that there might be a distinction between animals and plants in this regard. However, further analysis is required, as the authors used fewer guides in their study.

GC content has a great impact on sgRNA efficacy [34, 91]. Doench et al. [34] found that sgRNAs with low (about 35% or below) or high (about 75% or above) GC content were less active in mammalian cells. Similarly, in plants Liang et al. [83] showed that for 97% of the efficient guides the GC content ranged between 30% and 80%; this is therefore the range to be taken into consideration when designing a guide.
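The composition rules discussed above are easy to apply as a first-pass filter. The sketch below computes the GC content of a spacer, checks it against the 30–80% range reported for plants, and summarizes the PAM-proximal composition (purines and T in the last four bases, G at the position preceding the PAM). It is a minimal illustration with hypothetical function names, not a validated efficiency score.

```python
def gc_content(spacer):
    """Percentage of G and C bases in the spacer."""
    seq = spacer.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)

def passes_gc_filter(spacer, low=30.0, high=80.0):
    """True if GC content lies in the range reported for efficient plant guides."""
    return low <= gc_content(spacer) <= high

def pam_proximal_summary(spacer):
    """Describe the last four bases of the spacer (the ones closest to the PAM)."""
    tail = spacer.upper()[-4:]
    return {
        "purines_in_last4": sum(base in "AG" for base in tail),
        "t_in_last4": tail.count("T"),
        "g_before_pam": tail[-1] == "G",
    }

if __name__ == "__main__":
    for spacer in ("GACCTCAGGTTTAGGACTGA", "ATATATATATATATATATAT"):
        print(spacer, gc_content(spacer), passes_gc_filter(spacer),
              pam_proximal_summary(spacer))
```

Because the nucleotide preferences were derived mostly from animal data, the summary is best used to rank otherwise equivalent guides rather than to discard candidates.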

In addition to base composition, the secondary structure of the sgRNA can affect the ability to form the ribonucleoprotein complex and/or its activity. Alterations in the canonical secondary structure of the sgRNA can impede the interaction between the sgRNA scaffold and Cas9 or the binding of the sgRNA seed sequence to the target DNA. Hairpin formation in the spacer region of the guide can prevent the recognition of the target DNA and thus its cleavage [91]. Liang et al. [83] evaluated the secondary structure of a population of effective sgRNAs and found some common features, which made it possible to define criteria for guide design. Based on this analysis, it was suggested to check the secondary structure and select guides with an overall intact tetra loop structure (especially for the loops 2, 3 and RAR), with no more than 12 spacer bases pairing with other bases of the sgRNA and no more than 7 consecutive base pairs. Moreover, the spacer sequence should have a low level of self-pairing, with no more than 6 base pairs.

3.3 Computational Tools for Guide Activity Prediction

Conventionally, computational methods for designing efficient and specific guides can be classified into three groups: (i) alignment-based methods, in which suitable guides are retrieved from the genome simply by locating the PAM sequence and aligning the candidate sites; (ii) hypothesis-driven methods, in which guide activity is predicted using empirical rules (e.g. GC content or nucleotide composition at position 20); (iii) machine and deep learning-based methods, in which guides are scored using trained models that consider several features [71, 98, 149].
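As an illustration of the first, alignment-based group, the following sketch retrieves every candidate SpCas9 site from a sequence by locating NGG PAMs on both strands. It assumes a 20-nt spacer and reports, for each hit, the strand, the 0-based forward-strand coordinate of the leftmost base of the 23-nt site, the spacer and the PAM. Function names are hypothetical, and real tools additionally align each spacer against the whole genome to find off-target sites.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def find_spcas9_targets(seq, spacer_len=20):
    """List (strand, left, spacer, pam) for every spacer followed by NGG.

    `left` is the forward-strand coordinate of the leftmost base of the
    spacer+PAM site; spacer and PAM are written 5'->3' on their own strand.
    """
    seq = seq.upper()
    site_len = spacer_len + 3
    # Lookahead with capture groups so that overlapping sites are reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % spacer_len)
    hits = []
    for strand, template in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(template):
            if strand == "+":
                left = match.start()
            else:
                left = len(seq) - (match.start() + site_len)
            hits.append((strand, left, match.group(1), match.group(2)))
    return hits

if __name__ == "__main__":
    toy = "TTTTGACCTCAGGTTTAGGACTGATGGTTTT"  # invented sequence
    for hit in find_spcas9_targets(toy):
        print(hit)
```

The candidate list produced this way is the input to the hypothesis-driven and learning-based scoring steps described in (ii) and (iii).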

The most reliable predictions come from the hypothesis-driven and learning-based methods [146, 150, 154] because they are driven by previously described features [149]. Among these, learning-based models are considered the current cutting edge for in silico prediction of guide efficacy. The development of reliable models requires large datasets of guides and their respective experimentally determined cutting performance. Many algorithms have been developed to design guides suitable for SpCas9, such as Azimuth 2.0, CRISPRpred, TUSCAN, CRISPRscan, sgDesigner and many others reviewed in Konstantakos et al. [71]. Furthermore, some computational algorithms have also emerged for Cpf1 (deepCpf1 [66]) and SaCas9 [106]. To date, many web-based bioinformatic platforms are available for guide design and for the estimation of on- and off-target activity; they rely on one or more of the above-mentioned computational methods [43].

Considering the wide availability of software tools for guide design, choosing the most appropriate one is challenging. Several features should be taken into consideration, such as the kind of input that the program allows, the types of nucleases it supports, the algorithms used to predict on- and off-target activity, and the availability of the genome of interest. The latter is a crucial point, as a large number of software tools do not support guide design for plants. The most used and recommended tools for plants are CRISPOR, CRISPR-P, RGEN Cas-Designer and CHOPCHOP [43]. Among them, CRISPOR is one of the most complete and reliable. It supports hundreds of species and tens of different nucleases, giving wide coverage in terms of organisms and molecular tools. It also integrates multiple scoring models for sgRNA efficiency prediction for SpCas9, in particular the CRISPRscan algorithm [103] and the Azimuth 2.0 algorithm [35], considered a state-of-the-art model [71], as well as the deepCpf1 and Najm’s models for Cpf1 and SaCas9, respectively. Furthermore, CRISPOR predicts CRISPR/Cas outcomes in terms of the probability of obtaining out-of-frame deletions based on the microhomology around the target site [7], or of obtaining a frameshift due to any type of insertion or deletion [24]. Regarding specificity, CRISPOR includes scores such as MIT [47] and CFD [35] for off-target prediction, and it supports the identification of off-target sites in the genome and primer design.

Although machine learning-based guide design algorithms are exceptionally useful tools, they have some limitations. Most computational models have been built using datasets on the guide performance of SpCas9. With the exception of SaCas9 and Cpf1, other Cas nucleases completely lack reliable models for optimal guide design. Furthermore, it is important to emphasize that the reliability of efficiency prediction may depend on the cell type or on the species [80], and that many algorithms have been developed with human or animal datasets. Therefore, it is not obvious that they work equally well for plants. Indeed, Naim et al. [105] examined the prediction performance of several online tools and did not find a statistical correlation between software rankings and in vivo effectiveness measured in several plants. This suggests that current algorithms based on rules designed for guides in animals do not perform well for plants.

Consequently, further efforts are required to improve in silico guide design for plant genome editing. With this in mind, after a preliminary in silico evaluation of guide efficiency by using the most suitable tools, it is advisable to experimentally test their performance before starting with a plant stable transformation.

Regarding prime editing, in addition to the usual features of sgRNAs that mediate binding to the target sequence, other aspects must be evaluated during pegRNA design. In fact, the pegRNA has two more components affecting the editing efficiency: the RT template, which guides the DNA repair, and the primer binding site (PBS), which anneals to the nicked target DNA strand [6]. It has been recommended that the length should be between 9 and 15 nt for the PBS and between 10 and 15 nt for the RT template [68]. Moreover, Lin et al. [86] found that, in rice, the optimal melting temperature of the PBS is 30 °C.
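The PBS recommendations above can be turned into a small enumeration. The sketch below generates PBS candidates of 9 to 15 nt as the reverse complement of the bases immediately 5′ of the nick and picks the one whose estimated melting temperature is closest to 30 °C. Two points are assumptions made only for this illustration: the nick is placed 3 nt upstream of the PAM, and the melting temperature is estimated with the crude Wallace rule (2 °C per A/T, 4 °C per G/C), which is not necessarily the model used by dedicated pegRNA design tools. Function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def wallace_tm(seq):
    """Crude Tm estimate in degrees C (Wallace rule); an assumed stand-in."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    return 2 * (len(seq) - gc) + 4 * gc

def pbs_candidates(protospacer, min_len=9, max_len=15, nick_offset=3):
    """Return (length, pbs_sequence, tm) for each allowed PBS length.

    The protospacer is written 5'->3' on the PAM strand; the nick is assumed
    to lie nick_offset bases upstream of the PAM. The PBS is complementary
    to the bases immediately 5' of the nick.
    """
    seq = protospacer.upper()
    nick = len(seq) - nick_offset
    candidates = []
    for length in range(min_len, max_len + 1):
        primer = seq[nick - length:nick]
        pbs = reverse_complement(primer)
        candidates.append((length, pbs, wallace_tm(pbs)))
    return candidates

def closest_to_target_tm(candidates, target_tm=30.0):
    """Pick the candidate whose estimated Tm is closest to the target."""
    return min(candidates, key=lambda item: abs(item[2] - target_tm))

if __name__ == "__main__":
    spacer = "GACCTCAGGTTTAGGACTGA"  # toy protospacer
    for candidate in pbs_candidates(spacer):
        print(candidate)
    print("selected:", closest_to_target_tm(pbs_candidates(spacer)))
```

Swapping `wallace_tm` for a nearest-neighbour estimate would change the absolute values, so the selected length should be treated as a starting point for experimental testing.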

Many computational tools have been developed for pegRNA design: Easy-Prime [78], PrimeDesign [48], pegFinder [26], PnB Designer [127], PINE-CONE [132], PE-Analyzer [53], peg-IT [4] and PlantPegDesigner [61, 86]. Unfortunately, very few are suitable for plants. The most advanced one is PlantPegDesigner [61, 86], which was developed on the basis of rice experimental data and takes into consideration a series of known parameters such as the optimal Tm, the exclusion of a C as the first base of the 3′ extension of the RT template, the length of the RT template, the GC content of the PBS and the PE window. This tool is promising, but at the moment it has been used to predict efficient pegRNAs only in rice. Further studies are necessary to investigate whether the PlantPegDesigner algorithm parameters are suitable for other plants or whether they need to be re-evaluated and adjusted accordingly.

At present, the application of prime editing in plants presents significant challenges. Therefore, it would be appropriate to experimentally validate pegRNAs before starting stable transformation in order to save time and resources.

4 Experimental Approaches for CRISPR Reagent Validation

Stable transformation and plant regeneration are tedious, time-consuming and cost-intensive. Moreover, for certain species, especially crops, they still pose a challenge [1]. Having efficient CRISPR reagents, particularly guides, is crucial to realistically obtain edited plants. Since reliable computational software based on plant-specific training datasets is lacking, the evaluation and selection of the most efficient guides should be carried out experimentally. Over time, a number of procedures have been proposed to validate the effectiveness of CRISPR reagents in plants (Fig. 6.1).

Fig. 6.1 Outline of the design of CRISPR-mediated genome editing experiments. (a) Schematic representation of the first step in CRISPR-mediated genome editing experiments, involving the use of bioinformatic tools to design specific and efficient guides suitable for plant genomes. The selected guides meeting the desired specificity and efficiency criteria undergo experimental validation as shown in (b). (b) Experimental validation methods for the selected guides, including the in vitro cleavage assay as a pre-screening system to reduce the number of guides for in planta validation. Depending on the species and the desired genotyping depth, different assays such as agroinfiltration, protoplast transfection, or hairy root transformation can be employed. (c) Analytical methods used for genotyping in genome editing experiments. Various techniques can be selected based on the specific requirements of the experiment. (Image sources: some images have been obtained from the freely available collection on the pixabay.com website)

4.1 Endonuclease Cleavage In Vitro Assay

An easy way to test guide efficiency is through the endonuclease cleavage in vitro assay. It has been demonstrated that CRISPR reagents can successfully work in vitro, and protocols have been developed to produce and purify recombinant Cas nucleases by heterologous expression. Additionally, guides can be prepared by in vitro transcription followed by ribonucleoprotein complex assembly and cleavage activity assays [3, 101]. This approach has been used to assess the effectiveness of the CRISPR system with the specific guide [60, 65, 67, 94]. Such a system is very simple and rapid, and the availability of commercial purified nucleases and in vitro transcription kits makes its application for guide evaluation even easier and more attractive. However, the reliability of this system is undermined by several aspects. The efficiency of T7-mediated in vitro transcription can be affected by the nucleotide composition of the initially transcribed region [31]. Consequently, the altered amount of guide transcript can affect the efficiency of the cleavage, leading to potentially misleading results. Moreover, the in vitro cleavage assay cannot simulate the in vivo expression level of nuclease and guide, which is one of the well-known factors affecting the mutagenesis rate. It has been shown that low concentrations of SpCas9 reduce on-target cleavage activity [47]. In addition, the in vitro assay does not replicate the accessibility of the genomic target sequence or the in vivo biochemical conditions for the folding of guide secondary structures. Recently, Sagarbarria and Caraan [123] found a discrepancy between in vitro and in vivo evaluation of sgRNA efficiency, as guides that appeared to be functional in vitro did not lead to successful mutagenesis in stably transformed plants of S. melongena.

Lastly, while the in vitro approach can provide insights into cleavage efficiency, it cannot predict the types of mutations induced by the repair systems after the DSB. It also does not offer information regarding the ability of Base or Prime editors to introduce the desired edits, nor does it provide any information on the risk of off-targets in the genome.

For these reasons, the in vitro approach cannot be considered truly predictive of mutagenesis efficiency and should be used only as a pre-screening method. To better simulate physiological conditions and obtain reliable estimates of the mutagenesis rates, in vivo systems should be preferred.

4.2 Agroinfiltration Assay

Transient expression systems constitute a good compromise between ease and reliability for testing the efficacy of CRISPR reagents before proceeding with stable transformation. Acting directly on the genome, these systems provide more detailed and realistic information about the in vivo cleavage and mutagenesis efficiency. Additionally, they offer the advantage of being rapid to execute, as they do not involve time-consuming tissue culture.

Among the in vivo approaches used with plants, agroinfiltration is a rapid method that involves the use of special strains of Agrobacterium tumefaciens. These modified bacteria, which carry the CRISPR machinery genes in the T-DNA, are infiltrated into the intercellular space of plant tissues by syringe or vacuum infiltration. This process leads to the transient expression of nuclease and guide. Agroinfiltration can be performed in various parts of the plant, such as fruit [110], petal [52], whole seedling [75] and pollen [32]. However, the most commonly used organ for agroinfiltration is the leaf, for which numerous protocols have been optimized (reviewed in [63]) and high levels of expression and high transformation efficiency have been reached [130]. The main advantage of this technique is its rapidity, with high expression levels of transgenes typically achieved within a few days [63]. For this reason, transient agroinfiltration has been widely used as a preliminary experiment to test the effectiveness of constructs expressing guides and Cas nucleases for in vivo targeting of specific genes through CRISPR-mediated genome editing.

In some cases, this system has been used as a proof-of-concept study for assessing whether the CRISPR machinery was active in different species of interest [16, 59, 75, 107, 136], or for testing new cloning approaches to assemble CRISPR constructs [64, 141]. Moreover, several studies have reported the use of agroinfiltration as a technique to experimentally validate constructs before proceeding with stable transformation. Baltes et al. [8] tested the activity of Cas9 and sgRNAs in Nicotiana benthamiana to target BeYDV replicons. Zhang et al. [155] used agroinfiltration to verify the mutagenesis efficiency of hundreds of guides in tomato and N. benthamiana to edit 63 immunity-associated genes.

However, the agroinfiltration method has several limitations. The transformation efficiency may depend on many factors, including the biological compatibility and tissue accessibility of the plant species (and genotypes) with A. tumefaciens [147]. For example, comparing CRISPR constructs harboring guides that recognize identical target sites in both species, Zhang et al. [155] found a much lower mutagenesis rate in tomato than in N. benthamiana, concluding that the agroinfiltration system in tomato leaves can give misleading results by underestimating guide efficiency.

To overcome the incompatibility, recalcitrance or inaccessibility that make agroinfiltration problematic for some plant species, guide efficiency could be determined by co-transforming the CRISPR system (Cas9 and guide) with its target DNA in N. benthamiana. This method was developed by Khan et al. [64] as a proof of concept that an exogenous gene (YFP) transiently co-transformed in N. benthamiana together with the CRISPR apparatus can be successfully mutated. In principle, this system could allow virtually any guide to be assessed against its target DNA from any species.

Another limitation of agroinfiltration is the inability to accurately estimate the mutagenesis efficiency due to the presence of genomic DNA from non-transformed cells. This can introduce variability in the results, as there is no selection of cells that have integrated the T-DNA. In particular, Li et al. [75] showed that the mutagenesis efficiency appeared to be higher when using protoplast transfection compared to foliar agroinfiltration in both A. thaliana and N. benthamiana. They proposed that this difference might be attributed to a higher gene transfer efficiency in protoplasts which leads to a different dilution ratio between transformed and untransformed genomic DNA.

4.3 Protoplast Assay

Protoplasts, plant cells deprived of their cell wall, are very useful biological and biotechnological tools for both basic and applied plant science [152]. They are mainly, but not exclusively, obtained from leaves through mannitol-mediated plasmolysis and exposure of mesophyll cells to cell-wall-digesting enzymes (macerozyme and cellulase). Due to their high transfection efficiency and rapidity, protoplasts have become an excellent system to evaluate the effectiveness of CRISPR vectors in plants before attempting to transform an entire organism [152]. PEG-mediated protoplast transfection has been employed in a multitude of plant species to check the efficacy of designed CRISPR tools, not only in model plants like Arabidopsis [60], N. benthamiana [60] or N. tabacum [85], but also in several crops, including both dicotyledonous [88, 103] and monocotyledonous plants [15, 85].

Once the protocols for protoplast isolation and transfection are established, the protoplast platform to validate CRISPR constructs is simple, reliable, and inexpensive. The yield of viable protoplasts is normally high, and this allows several tests to be carried out from the same preparation, e.g. to evaluate many CRISPR constructs or to examine different experimental conditions [9]. Moreover, protoplast transfection is suitable for both RNPs and plasmids, making it possible to assess not only the activity of the Cas-guide complex, but also different plasmid architectures, and to compare RNPs with plasmids. Jiang et al. [60] showed that RNPs are more efficient than plasmids, suggesting that, although production and purification of Cas nuclease and guides may be more challenging (or costly, if purchased) than extracting plasmid from bacteria, RNPs may offer better mutagenesis performance and a greater likelihood of successful editing of target genes. Another important feature of the protoplast platform is its versatility. Many studies have focused on the evaluation of the mutagenesis rate caused mainly by SpCas9 (and its variants or orthologs) and Cpf1, which generate random mutations at target sites. Furthermore, it has been proven that protoplasts can also be useful to detect precise mutagenesis events, like those determined by HDR-mediated Gene-Targeting, Prime Editing [60] and Base Editing [39].

Although the plant protoplast platform is robust and widely applicable in multiple plant species, it does have certain limitations. Protoplasts, despite their high transfection efficiency, require specific expertise for their careful handling, from isolation to transfection. The transfection of protoplasts generates thousands of independent events, which can be genotyped using various systems. For precise and detailed information about the mutations generated by the CRISPR machinery, amplicon sequencing by NGS is the preferred technology, but it requires specific expertise that may not always be readily available, unless an external service is used. Furthermore, even with NGS technology, there is still some uncertainty regarding the exact mutation rate and composition. This is because it is not possible to differentiate between transfected and non-transfected protoplasts in the total DNA extracted, unless a transfection marker, such as a GFP reporter, is employed.
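As a rough illustration of how a transfection marker can help interpret the measured rate, the following Python sketch rescales the editing frequency observed in the bulk DNA by the fraction of transfected protoplasts. It rests on two assumptions that are not taken from the cited studies: that the GFP-positive fraction approximates the transfected fraction, and that non-transfected cells contribute only wild-type alleles. The function name and the example values are hypothetical.

```python
def normalized_editing_rate(observed_rate, transfection_efficiency):
    """Rescale a bulk editing rate by the fraction of transfected cells.

    Assumption (illustrative only): non-transfected protoplasts contribute
    only wild-type DNA, so they dilute the edited reads proportionally.
    Both arguments are fractions between 0 and 1.
    """
    if not 0 <= observed_rate <= 1:
        raise ValueError("observed_rate must be between 0 and 1")
    if not 0 < transfection_efficiency <= 1:
        raise ValueError("transfection_efficiency must be in (0, 1]")
    # Cap at 1.0 because measurement noise can push the ratio above 100%.
    return min(observed_rate / transfection_efficiency, 1.0)


# Hypothetical example: 12% edited reads in bulk DNA, 60% GFP-positive cells.
estimate = normalized_editing_rate(0.12, 0.60)
print(f"Estimated editing rate in transfected cells: {estimate:.0%}")
```

Such a correction is only an approximation; sorting GFP-positive cells before DNA extraction would avoid the dilution altogether.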

4.4 Hairy Root Assay

A possible way to overcome the weaknesses of agroinfiltration and protoplast transfection is offered by Agrobacterium rhizogenes-mediated hairy root generation.

The hairy root system is a rapid and convenient approach for obtaining stably transformed roots. It is based on the natural ability of Agrobacterium rhizogenes (Rhizobium rhizogenes) to infect injured parts of the plant, triggering root organogenesis from the wounded sites and giving rise to the well-known “hairy root disease”. The combination of the hairy root approach and CRISPR/Cas techniques represents an excellent platform for an easy, rapid, accurate and cost-effective evaluation of CRISPR reagents, with the added value of a possible functional analysis of genes of interest in roots.

Currently, several articles in the literature report the use of hairy root transformation to deliver CRISPR vectors in a multitude of plant species, such as tomato [56, 81, 121], potato [19], cucumber [108], soybean [20], Brassica napus [57], peanut [126], papaya [45], Populus [139], rubber dandelion [54], Brassica carinata [69], Salvia miltiorrhiza [76], and Medicago truncatula [99].

The main advantage of the hairy root assay lies in the fact that transformed roots can serve as a simulation of stable whole plant transformation. Each root can be considered an independent transformation event, and the transformation efficiency is very high, as escapes can be avoided by antibiotic-mediated selection. These characteristics facilitate straightforward and rapid genotyping, enabling the calculation of the mutagenesis rate as the proportion of roots harboring at least one edited allele among the total number of transformed roots.
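The calculation described above can be sketched in a few lines of Python. The data layout (a dictionary mapping each root to its list of allele calls, with "wt" marking an unedited allele) and the root names are assumptions made for illustration, not part of any published protocol.

```python
def mutagenesis_rate(roots):
    """Fraction of transformed roots carrying at least one edited allele.

    `roots` maps a root identifier to the allele calls obtained by
    genotyping; any call other than "wt" is counted as edited.
    """
    if not roots:
        raise ValueError("at least one genotyped root is required")
    edited = sum(
        1 for alleles in roots.values()
        if any(allele != "wt" for allele in alleles)
    )
    return edited / len(roots)


# Hypothetical genotyping results for four independent hairy roots.
genotypes = {
    "root_01": ["wt", "wt"],   # unedited
    "root_02": ["wt", "-1"],   # one allele with a 1-bp deletion
    "root_03": ["-4", "+1"],   # two differently edited alleles
    "root_04": ["wt", "wt"],   # unedited
}
rate = mutagenesis_rate(genotypes)
print(f"Mutagenesis rate: {rate:.0%}")
```

Because each root is treated as one independent event, the rate does not depend on how many alleles are edited within a root.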

The drawbacks of this system include the need for some tissue culture procedures and a waiting time between the infection of the explant and the formation of the hairy roots, which can range from 10 days to several weeks depending on the plant species. However, these drawbacks are offset by the simplicity and speed of genotyping using PCR followed by Sanger sequencing. The availability of commercial PCR kits with engineered DNA polymerases that are resistant to the PCR inhibitors present in plant extracts allows for direct genotyping of root fragments without the need for genomic DNA purification, using minimal amounts of plant material. This enables the amplification of numerous samples in a short period of time.

Many plant species, mostly dicotyledonous, are susceptible to A. rhizogenes; consequently, it is easy to obtain from these plants hairy roots transformed with T-DNA harboring the CRISPR machinery. This characteristic makes the hairy root system a promising method of choice for evaluating guide efficiency or new CRISPR systems in a rapid and reliable way. Furthermore, it has been shown that tomato hairy roots exhibit morphological similarity to normal adventitious roots, making them a valuable tool for studying the function of genes in this organ [121]. In cases where target genes are completely inactivated, hairy roots can provide insights into the potential phenotypic changes that may arise in roots after stable transformation with CRISPR constructs.

A typical hairy root protocol workflow, established in our lab to evaluate CRISPR approaches in tomato, is described in Fig. 6.2. With some adaptations, this protocol can serve as a roadmap for designing and validating genome editing experiments in other plant species.

Fig. 6.2
A flowchart of the steps in the CRISPR workflow. The steps are guide design, cloning, A. rhizogenes transformation, explant infection, co-cultivation, transfer on selection medium, hairy root formation, genotyping by PCR, and Sanger sequencing.

Workflow of CRISPR construct assembly and hairy root generation from tomato explants. The workflow illustrates the steps involved in the generation of CRISPR constructs and the induction of hairy roots in tomato explants. First, computational tools are used to design the guides required for the CRISPR experiment. The guide expression cassette is then produced and assembled with the nuclease and antibiotic resistance expression cassettes. These components are combined to form the final constructs. The CRISPR constructs are introduced into Agrobacterium rhizogenes cells, which are used to infect cotyledon and/or leaf explants from 10–15-day-old tomato plantlets. The infected explants undergo a co-cultivation process with Agrobacterium rhizogenes for 3 days. Subsequently, the explants are transferred to a selective medium and incubated. After approximately 2 weeks, the first hairy roots start to develop and can be collected. The collected hairy roots are then subjected to direct genotyping using PCR followed by Sanger sequencing to analyze the desired genomic modifications. (Image sources: Some images have been obtained from the freely available collection on the pixabay.com website)

5 Analytical Techniques to Estimate the Editing Efficiency

Genotyping is a pivotal step to assess the editing efficiency of CRISPR constructs or RNPs. Effective screening methods are necessary before undertaking the time-consuming transformation and regeneration procedures to validate the efficacy of genome-editing reagents (guide and nuclease) in order to increase the efficiency of genome editing. For this purpose, many experimental methods and bioinformatic tools have been developed to detect and analyze indels generated after CRISPR-mediated targeting.

The choice of analytical technique depends mainly on the type of information required in terms of mutagenesis occurrence, precise identification of indels and the accurate frequency of each type of mutation. Some editing approaches, such as Base Editing, Prime Editing and HDR-mediated knock-in, are not very efficient and the desired mutations may occur at very low frequencies. In these cases, the detection and quantification of induced mutations are very challenging and require specific methods for large-scale mutation detection. Conversely, when gene inactivation with typical out-of-frame mutations is desired, frequencies are normally higher, partly because a wide range of indels can result in a knockout effect. In these cases, techniques that can provide qualitative (or semi-quantitative) information about mutagenesis occurrence are preferred. Such techniques should be simple and cost-effective, should not require special equipment, and should enable the detection of a mutated allele in a background of wild-type alleles even when mutations are obtained at a very low frequency or in a complex polyploid plant genome.

Although sequencing will ultimately be required to confirm and identify the exact sequence of the mutant DNA, the availability of powerful high-throughput strategies during the screening stage can significantly reduce the cost involved in generating and identifying mutants.

A commonly used method for detecting modified genes relies on the enzymatic cleavage of heteroduplexes formed, after melting and re-annealing, by hybridization of wild-type and mutated DNA strands or of two differently mutated DNA strands. A bacteriophage resolvase, such as T7E1, or single-strand-specific plant endonucleases, such as those of the CEL family (commercialized under the brand Surveyor), are used to recognize and digest unpaired heteroduplex DNA independently of the sequence. Therefore, they are suitable for screening any target sequence [142]. Enzymatic digestion of heteroduplexes has been utilized, for example, by Cai et al. [20] to test target site editing efficiency in soybean hairy roots, by Khan et al. [64] for sgRNA efficiency testing in infiltrated leaves of N. benthamiana, and by Brandt et al. [15] to optimize wheat protoplast transformation with CRISPR-Cas ribonucleoprotein complexes. A side-by-side comparison has shown that T7E1 preferentially identifies insertions and deletions, whereas Surveyor has better sensitivity for substitutions [142]. The T7E1/Surveyor assays are reproducible, inexpensive, and easy to use, but are unable to differentiate between identically mutated (homozygous mutant) and wild-type alleles, or between biallelic mutants and heterozygous monoallelic mutants. Therefore, additional testing is necessary to detect biallelic mutations, and homozygous mutant clones may be discarded because they are falsely reported as wild type. Furthermore, the T7E1 and Surveyor nuclease assays do not provide information about the type of induced mutation or the size of the indels, making it impossible to exclude indels of (multiples of) three nucleotides. In addition, in polyploid species, false positive signals can arise from the formation of heteroduplexes between non-identical paralogs.

In some cases, a heteroduplex mobility assay (HMA) using PAGE has been used to analyze heteroduplex formation instead of enzymatic digestion, as reported by Hoang et al. [45] and Nguyen et al. [108] to test the efficiency of genome editing in hairy roots of papaya and cucumber, respectively.

Another classic method used for mutation detection is the polymerase chain reaction (PCR)/restriction enzyme (RE) assay. If a restriction site is present in the target locus, the region around it can be amplified by PCR and run on an agarose gel after digestion with the RE to display the digestion pattern. When a mutation disrupts the restriction site, the amplified fragment remains uncut and appears as a single undigested band. Although it is straightforward and accessible to most laboratories, the PCR/RE method is heavily dependent on the availability of a restriction enzyme site at the target sequence, whose choice is already constrained by the need for a PAM for the nuclease near the cleavage site. Furthermore, each target sequence requires a specific set-up, which hinders the general optimization of the protocol. Nevertheless, this approach is widely used for rapid and inexpensive screening of mutagenesis events, especially when optimizing protocols in less explored species, such as in hairy roots of a wild diploid potato relative [19] and Taraxacum [54], in tobacco and maize protoplasts [85], in soybean protoplasts and hairy roots [135], or in agroinfiltrated N. benthamiana leaves [141].
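To illustrate the dependence on a suitably placed restriction site, the following Python sketch checks whether any enzyme from a small list has a recognition site spanning the expected cut position. It assumes an SpCas9 guide given on the same strand as the amplicon, with the cut placed 3 bp upstream of the PAM; the enzyme list is a short illustrative selection of palindromic sites, and the function name and sequences are hypothetical.

```python
# A few common enzymes with palindromic recognition sites (illustrative list).
ENZYMES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "PstI": "CTGCAG",
    "NcoI": "CCATGG",
}


def sites_spanning_cut(amplicon, protospacer, enzymes=ENZYMES, cut_offset=3):
    """Return (enzyme, position) pairs whose site spans the expected cut.

    Assumptions: the protospacer lies on the given strand, immediately 5'
    of the PAM, and the nuclease cuts `cut_offset` bp upstream of the PAM.
    """
    amplicon, protospacer = amplicon.upper(), protospacer.upper()
    start = amplicon.find(protospacer)
    if start < 0:
        raise ValueError("protospacer not found on the given strand")
    cut = start + len(protospacer) - cut_offset  # boundary between cut-1 and cut
    hits = []
    for name, site in enzymes.items():
        pos = amplicon.find(site)
        while pos != -1:
            # The cut boundary must fall strictly inside the recognition site.
            if pos < cut < pos + len(site):
                hits.append((name, pos))
            pos = amplicon.find(site, pos + 1)
    return hits


# Hypothetical 20-nt protospacer followed by a TGG PAM inside a short amplicon.
guide = "ACGTACGTACGTAGGAATTC"
amplicon = "TTGACC" + guide + "TGG" + "ACTGAT"
print(sites_spanning_cut(amplicon, guide))
```

An empty result for a candidate guide would indicate that the PCR/RE assay is not applicable with the enzymes considered, and that another genotyping method is needed.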

Recently, a method based on PCR followed by digestion with purified ribonucleoprotein complexes of SpCas9 or FnCpf1 (known as the PCR/RNP method) was reported to detect nuclease-induced mutations in both polyploid and diploid plants [84]. This method is more widely applicable than PCR/RE, as the CRISPR nuclease RNPs digest PCR products matching the guide (wild type) but fail to digest PCR products with mutated sequences (mutants), without the need for an additional restriction site. According to the authors, the PCR/RNP method is less effective in detecting SNPs than indels, but the former can still be distinguished from the wild-type sequence. In addition, the PCR/RNP method appears to be superior to the T7E1 assay in terms of accuracy, and to Sanger sequencing in terms of sensitivity [84]. The main drawback of this screening strategy is that it requires purified in vitro transcribed guides and purified nucleases to preassemble CRISPR ribonucleoprotein (RNP) complexes.

To overcome the limitations of the PCR/RE method, several other PCR-based protocols have been developed, such as annealing at critical temperature PCR (ACT-PCR) [49], double-strand break site-targeted PCR (DST-PCR) [55], and bindel-PCR [124]. In addition, real-time PCR [114] and droplet digital PCR [115] are interesting alternatives to conventional PCR methods because they do not require post-PCR product manipulation. They are fast, high-throughput, and reduce the risk of laboratory contamination. Both protocols require the design of two differently labelled probes recognizing different parts of the same amplicon. The probe designed outside the expected mutation position will bind to all alleles, allowing the assessment of the total amount of amplicons present in the sample, while the probe designed on the gene-editing target site will only bind to the wild-type sequence, thus revealing the presence of a new mutation. The ratio of the relative amplification values of the two probes in qPCR, or the ratio of mutant droplets (positive for one fluorophore) to wild-type droplets (positive for both fluorophores) in dPCR, can then be used to distinguish wild-type, homozygous, and heterozygous mutations and to quantify the mutation frequency of gene editing. qPCR and dPCR can detect single nucleotide indels or single nucleotide mutations with high sensitivity, especially qPCR. The cost of the labelled probes, the inability to detect large deletions in homozygous samples, and the need for direct sequencing to determine the exact mutated sequence are the major drawbacks of these two protocols.
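The droplet-based readout can be illustrated with a short Python sketch. It is a simplification under stated assumptions: droplets positive only for the reference probe are counted as mutant, double-positive droplets as wild type, and no Poisson correction for droplets containing more than one template is applied. The mutant fraction is expressed relative to all positive droplets, and the genotype thresholds for a single diploid sample are arbitrary illustrative values; all names are hypothetical.

```python
def dpcr_mutation_frequency(reference_only, double_positive):
    """Fraction of positive droplets that lack the wild-type probe signal."""
    total = reference_only + double_positive
    if total == 0:
        raise ValueError("no positive droplets counted")
    return reference_only / total


def classify_diploid(frequency, tolerance=0.15):
    """Rough genotype call for one diploid sample (thresholds are assumed)."""
    if frequency <= tolerance:
        return "wild type"
    if abs(frequency - 0.5) <= tolerance:
        return "heterozygous"
    if frequency >= 1 - tolerance:
        return "homozygous or biallelic"
    return "mosaic or mixed sample"


# Hypothetical droplet counts for one sample.
freq = dpcr_mutation_frequency(reference_only=480, double_positive=520)
print(f"Mutant fraction: {freq:.2f} -> {classify_diploid(freq)}")
```

For pooled samples such as protoplasts, only the mutant fraction is meaningful, and the genotype call should be skipped.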

Fluorescent PCR-capillary gel electrophoresis/DNA fragment analysis [119] and high-resolution melting curve analysis (HRM) [138] are also PCR-based approaches that are successfully employed for genotyping genome editing events. Fluorescent PCR-capillary gel electrophoresis/DNA fragment analysis, of which several variants are available, employs fluorophore-labelled primers to amplify the genomic region containing the expected edited site. The labelled amplicons are then separated by capillary electrophoresis and sized by comparison to an internal standard mixed with the sample before electrophoresis. Software-guided data analysis identifies mutants based on the shifts in fragment size compared to the wild-type fragment. Recently, Carlsen et al. [21] successfully employed a specific fragment analysis strategy called Indel Detection by Amplicon Analysis (IDAA) to evaluate editing efficiency in tetraploid potato protoplasts in a study aimed at improving guide efficiency design. Additionally, High Resolution Fragment Analysis (HRFA), described by Andersson et al. [5], has been utilized to test mutation efficiency during genome editing in tomato protoplasts [89], and the editing and regeneration protocol in rapeseed protoplasts [79]. Fragment analysis efficiently and accurately detects the number of nucleotides inserted or deleted at the target site, but is not accurate in detecting indels larger than 30 base pairs, and does not detect base substitutions or single nucleotide polymorphisms (SNPs). By providing information about the size and the relative abundance of different amplicons, fragment analysis allows detection of different alleles, homo- and heterozygosity, chimerism, sample mixtures, and genome editing efficiency. In addition, it offers high sensitivity and resolution, being able to discriminate fragments that differ by a single base pair, and is fast, high-throughput, automatable and multiplexable.
On the other hand, it requires specialized equipment and software that are not commonly available and accessible to all laboratories. This also applies to another easy and fast technique, High Resolution Melting (HRM), which combines PCR and heteroduplex formation. By increasing the temperature after PCR of small amplicons (about 100 bp) containing the target region, specific and different dissociation profiles are observed for homoduplexes and heteroduplexes of fluorescently labelled double-stranded DNA. Subsequently, the Tm and the characteristic signatures of the PCR products are used by dedicated software to identify the presence of mutant alleles without the need for additional manipulations after PCR. The main advantage of the HRM protocol is that it can distinguish individual mutant alleles within a complex background, allowing detection of even rare mutations and chimerism as low as 5%. On the other hand, in addition to the cost of the equipment, a limitation of HRM is that it cannot detect large deletions due to the small size of the amplicons.
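For the fragment analysis described above, the interpretation of size shifts can be sketched in Python as follows. The input format (a list of fragment sizes in bp with their peak areas) and the function name are assumptions made for illustration, and the in-frame/frameshift label is only meaningful if the target lies within a coding sequence. A peak at the wild-type size may still hide substitutions, which this technique cannot detect.

```python
def classify_fragments(peaks, wild_type_size):
    """Describe each peak by its size shift, frame effect and abundance.

    `peaks` is a list of (size_bp, peak_area) tuples from one sample.
    Returns a list of (shift_bp, label, relative_abundance) tuples.
    """
    total_area = sum(area for _, area in peaks)
    if total_area <= 0:
        raise ValueError("total peak area must be positive")
    calls = []
    for size, area in peaks:
        shift = size - wild_type_size  # negative: deletion, positive: insertion
        if shift == 0:
            label = "wild-type size"
        elif shift % 3 == 0:
            label = "in-frame indel"
        else:
            label = "frameshift indel"
        calls.append((shift, label, area / total_area))
    return calls


# Hypothetical electropherogram peaks for a 250-bp wild-type amplicon.
sample = [(250, 600.0), (249, 300.0), (247, 100.0)]
for shift, label, fraction in classify_fragments(sample, wild_type_size=250):
    print(f"{shift:+d} bp  {label:<18} {fraction:.0%}")
```

Summing the relative abundance of all peaks that differ from the wild-type size gives a simple estimate of editing efficiency for the sample.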

Sanger sequencing of PCR products is certainly the most direct and definitive approach to obtain detailed information about the type(s) of mutation in a sample, and is used in almost all studies, both to directly analyze amplified target regions and to accurately determine the mutated sequence after screening with one of the aforementioned methods. Although highly informative, sequencing methods are costly for high-throughput genotyping and require bioinformatic skills to analyze the data. In fact, chromatograms of PCR amplicons derived from complex samples (heterozygotes, biallelic mutants, chimeras, polyploid species) can contain multiple traces, and several online bioinformatics tools, such as TIDE [17], DSDecode [87], and ICE [29], have been developed to decode the underlying mutation types.

Finally, thanks to its high sensitivity (0.01%), deep sequencing of the amplicons using NGS-based methods represents the gold standard for mutation detection, especially when targeted mutagenesis frequencies are low and rare editing events must be detected among a high background of unmodified alleles. The genomic region around the target site is amplified with a proofreading DNA polymerase, and the PCR products are barcoded (to distinguish the reads from independent amplifications of the target site) and indexed (to enable library multiplexing in the flow cell) to allow high-throughput sequencing. The raw data are then analyzed to calculate mutation efficiency as the percentage of reads containing indels in a defined window around the cleavage site, with the help of specific bioinformatic pipelines such as Cas-Analyzer [113], CRISPResso [116] or CRISPAltRations [72]. Several examples of deep sequencing of amplicons for assessing editing efficiency can be found in the literature, specifically in studies involving protoplast assays [9, 39, 60] or agroinfiltration assays [8, 96].
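The window-based calculation of mutation efficiency can be sketched in Python as follows. This is a minimal illustration, not the algorithm of any of the pipelines cited above: it assumes that reads have already been aligned to the reference amplicon and that each read is represented simply by the list of indels it carries. The `Indel` type, the function names and the default window of 10 bp are hypothetical.

```python
from typing import NamedTuple


class Indel(NamedTuple):
    start: int   # 0-based reference position of the first affected base
    length: int  # reference bases deleted; 0 for an insertion


def read_is_edited(indels, cut_site, window):
    """True if any indel of the read overlaps the window around the cut."""
    low, high = cut_site - window, cut_site + window
    for indel in indels:
        # An insertion occupies no reference base; treat it as one position.
        end = indel.start + max(indel.length, 1)
        if indel.start < high and end > low:
            return True
    return False


def editing_efficiency(reads, cut_site, window=10):
    """Percentage of reads with at least one indel near the cleavage site."""
    if not reads:
        raise ValueError("no reads provided")
    edited = sum(read_is_edited(r, cut_site, window) for r in reads)
    return 100.0 * edited / len(reads)


# Hypothetical aligned reads: two unedited, two edited near position 100,
# and one carrying a distant indel that is likely a PCR or sequencing error.
reads = [[], [], [Indel(98, 3)], [Indel(100, 0)], [Indel(10, 2)]]
print(f"Editing efficiency: {editing_efficiency(reads, cut_site=100):.1f}%")
```

Restricting the count to a window around the cleavage site keeps indels arising from amplification or sequencing errors elsewhere in the amplicon from inflating the estimate.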

Unfortunately, due to the high costs and the need for specialized skills and bioinformatic tools, NGS methods are not widely applicable for the initial screening of edited lines.

6 Summary and Conclusions

CRISPR/Cas technology is considered the cutting-edge tool for genome editing for both fundamental and applied research in plants. It relies on a wide repertoire of molecular tools, including nucleases with the ability to target a large spectrum of PAM sequences, as well as precise editing approaches such as Base Editing and Prime Editing. This versatility makes CRISPR/Cas an invaluable technology. However, the challenge lies in obtaining transformed plants that can effectively express the editing machinery, as this step remains challenging for many plant species. Without verifying the actual efficacy of the selected CRISPR reagents, undertaking a stable plant transformation can be demanding and risky using current transformation and regeneration protocols. Therefore, knowing the actual editing frequencies is crucial in selecting the most efficient guides and planning the editing experiment accordingly.

The development of computational tools enormously facilitates the selection of putative editing sites on target genes by predicting the most specific and efficient guides. The development of these computational models requires large datasets of experimentally determined guide performance. To date, computational models are not available for all editing approaches, especially for new nucleases, and it will be difficult to keep up with the continuous emergence of new molecular tools. Furthermore, bioinformatic tools built using plant-specific datasets are almost completely lacking. For this reason, it is essential to have rapid systems capable of evaluating the performance of CRISPR reagents in planta. This allows for screening different guides and choosing the most effective one, as well as testing new editing systems quickly.

The in vitro test is extremely simple, but has many limitations, so it can only be considered as a pre-screening system, while the in vivo tests, such as agroinfiltration, protoplasts and hairy roots are to be preferred. The choice of the in vivo system depends mostly on the species and the desired type of genotypic data, which in turn relies on the analytical method used to determine the editing rate. The disadvantage of agroinfiltration and protoplasts is that they require NGS for accurate genotyping.

Among the in vivo platforms, hairy roots represent a promising tool to test the efficacy of CRISPR reagents in A. rhizogenes-susceptible plants. This system combines the advantage of stable transformation with the rapid analysis of transient systems, allowing for accurate and rapid genotyping using PCR-based methods, and potentially Sanger sequencing, without the need for regenerating entire plants.