Introduction

Genetic variance plays a pivotal role in both breeding programs and mutant screening for genetic research, whether the source of this variance is natural or artificially induced. Natural variance is rather limited, but there is a broad range of artificial methods to induce genetic mutations. Traditional screening and breeding programs rely heavily on random mutagenesis induced by chemicals or radiation to generate mutations across the genome1,2,3. Although these methods can occasionally produce desirable traits, they have several significant limitations. First, classic mutagenesis cannot target specific genes or regions of the genome and often produces unpredictable outcomes4. Second, random mutagenesis approaches often unintentionally result in mutations in multiple genes, causing off-target noise5,6,7. Third, classic mutagenesis in current breeding programs cannot deal with genetic linkage, as genes situated closely on the chromosome stay together during crossover events. For example, in plants, approximately 12% of genes within each family are genetically linked8,9, so individual mutated lines cannot be combined to obtain double mutants in closely related genes. This linkage is a critical concern for breeding programs that rely on crossing to develop desired crop varieties. Finally, perhaps one of the most overlooked shortcomings of these methods is genetic redundancy: Genes with high sequence similarity often have overlapping or redundant functions10. Redundancy masks the effects of mutations in individual genes, thereby leading to buffered phenotypic plasticity that poses significant challenges in deciphering the precise roles of genes and understanding their functions11,12.

Over the past decade, CRISPR-Cas technology has emerged as a dominant and extremely versatile tool in genetic engineering. The system involves an endonuclease directed by a single guide RNA (sgRNA) to a complementary sequence in the genome that also includes a protospacer adjacent motif (PAM); CRISPR enables site-specific gene editing with high efficiency and low off-target effects13,14,15. Moreover, sgRNA multiplexing, in which two or more sgRNAs are introduced into the same plant, can enable simultaneous targeting of multiple genes to circumvent issues of genetic redundancy16. For example, the BREEDIT tool has been used to facilitate multiplexed genome editing of several gene families in maize to improve complex traits such as yield and drought tolerance17. Similarly, the multiplexing strategy was applied to elite winter wheat varieties for simultaneous editing of multiple genomic loci in one generation18. A recent multiplex CRISPR approach was carried out in poplar to increase fiber production capacity19. sgRNA multiplexing has also enabled efficient genetic editing in tomato20,21. For instance, introducing six sgRNAs targeting three genes involved in fruit color regulation led to diverse inherited mutation combinations and a broad spectrum of fruit colors22. In another study, this approach resulted in a fivefold increase in lycopene content23. Although the sgRNA multiplexing approach tackles functional redundancy and enables selection of a desired phenotype to improve agronomically important traits, it suffers from low scalability, as each vector is cloned manually. Because gene lists must be filtered manually and only a handful of potential candidates can be targeted, the likelihood of success is relatively low. Due to this very low scalability, the influence of sgRNA multiplexing approaches on crop improvement and breeding programs has been limited24.

Recent studies have generated small, medium, and even large-scale CRISPR libraries in several crop plants. The first instance of such a system was demonstrated in tomato, where 165 sgRNAs were designed to target all 54 genes within the LRR-RLK subfamily XII, achieving a reported mutation detection rate of 62.5%25. In another example, a CRISPR library was applied to rice with 25,604 sgRNAs targeting 12,802 genes expressed in the shoot26. A parallel study in rice generated a genome-scale CRISPR library of 88,541 sgRNAs, targeting 34,234 genes with an average coverage of 2.59 sgRNA vectors per gene27. In two other recent studies, a CRISPR library of 246 sgRNAs was used to screen cotton for insect-resistance genes28, and 4379 sgRNAs were designed to target 990 transcription factors in tomato29. These libraries demonstrate the potential of large-scale targeted genetics but do not bypass the issue of functional redundancy that results from large gene families with partially overlapping activities. Since, on average, 64.5% of the genes in the genome of plants are part of paralogous gene families30,31,32, phenotypic plasticity may limit the utility of these large CRISPR libraries.

In an attempt to utilize CRISPR’s pinpoint accuracy and high efficiency while maintaining the high scalability of traditional methods, we previously developed an approach for the design of libraries in which each vector harbors a single sgRNA that targets conserved sequences across multiple genes, allowing simultaneous editing of several gene family members33,34,35,36. Applying this approach in Arabidopsis generated a genome-wide multi-targeted CRISPR library, which included tens of thousands of unique sgRNAs targeting genes in both genome-wide and functional group-specific manners37. Given the relevance of this approach to agricultural crops and breeding programs, we here design and develop a genome-wide, multi-targeted CRISPR library for use in a crop plant, tomato. Using this library, we generated approximately 1300 independent CRISPR lines and identified over 100 independent lines that have a wide range of phenotypes, including those related to fruit shape and size, fruit flavor, pathogen response, and nutrient uptake. In addition, we developed an sgRNA mapping system, CRISPR-GuideMap, that uses a double barcode tagging system and deep sequencing to enhance the use of CRISPR libraries in breeding programs. By integrating precise gene targeting with advanced sequencing technology, we streamlined the identification of gene functions and broadened the scope of functional characterization obtainable from these libraries.

Results

Design of tomato genome-wide, multi-targeted CRISPR library

Given that high sequence similarity among members of gene families in plants can lead to phenotypic buffering, classic forward genetic screening is limited in its ability to identify unknown phenotypes. Previously, we showed that by designing sgRNAs that target multiple genes in a family containing the same or similar target sites, double or multi-knockout mutants could be generated at the genome scale37. To date, this approach has only been demonstrated in the model plant Arabidopsis. Here, we adapt and apply this strategy to tomato, a major crop species.

To this end, we first grouped all coding gene sequences of Solanum lycopersicum into gene families based on amino acid sequence similarity and used the CRISPys algorithm38 to design multiple sgRNAs for each family. For each gene family, a phylogenetic tree was reconstructed, such that genes that are more closely related are placed closer to each other. CRISPys then designed multiple sgRNAs that could optimally target multiple members within each subgroup (represented by internal nodes in these trees). To maximize the likelihood of gene knockouts, sgRNA targets were confined to the first two-thirds of the coding sequence. We note that during this procedure, multiple sgRNAs could be designed for the same set of genes, allowing for different types of mutations to be created. Based on the similarity between the 20-nucleotide sequence of the sgRNA and its target genes, an “on-target” score was calculated using the cutting frequency determination (CFD) scoring function39, and sgRNAs with an on-target score below 0.8 were discarded. Once an optimal sgRNA was generated, specificity was verified by scanning the rest of the genome for sequences with similarity to the sgRNA. To ensure specificity, we filtered out sgRNAs with potential off-target effects, applying stricter thresholds for off-targets in exons (20% of the on-target score) than for other genomic regions (50% of the on-target score). These strict parameters were chosen to achieve high cleavage efficacy while maintaining the robustness required for a genome-scale tool. The design was then applied to all subgroups and gene families (Fig. 1). This strategy generated a library with 15,804 unique sgRNAs targeting 10,036 of the 34,075 genes in tomato (Fig. 2a, b). Approximately 95% of the sgRNAs target groups of two or three genes, with the remaining sgRNAs targeting groups of four to eight genes (Fig. 2c). On average, every sgRNA targeted 2.23 genes. Analysis of all the matches between sgRNAs and the targeted genes showed that 25% had no mismatches, 33% had 1 mismatch, 32% had 2 mismatches, and 10% had 3 or more mismatches, with an average of 1.21 mismatches per gene (Fig. 2d).
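The filtering thresholds described above can be sketched in a few lines of Python. This is a minimal illustration, not the CRISPys implementation: the class and function names are hypothetical, CFD scores are assumed to be precomputed, and the reading that an off-target site disqualifies an sgRNA when its CFD score exceeds the stated fraction of the on-target score is an assumption.

```python
from dataclasses import dataclass, field

ON_TARGET_MIN = 0.8       # minimum on-target CFD score stated in the text
EXON_OFF_FRACTION = 0.2   # off-target limit in exons, as a fraction of the on-target score
OTHER_OFF_FRACTION = 0.5  # off-target limit in other genomic regions


@dataclass
class OffTarget:
    cfd: float     # precomputed CFD score of the sgRNA against this off-target site
    in_exon: bool  # True if the site lies within an exon


@dataclass
class Candidate:
    name: str
    on_target: float  # precomputed on-target CFD score
    off_targets: list = field(default_factory=list)


def passes_filters(c: Candidate) -> bool:
    """Return True if the candidate meets both thresholds (assumed interpretation)."""
    if c.on_target < ON_TARGET_MIN:
        return False
    for site in c.off_targets:
        fraction = EXON_OFF_FRACTION if site.in_exon else OTHER_OFF_FRACTION
        # Reject if the off-target score exceeds the allowed fraction of the on-target score.
        if site.cfd > fraction * c.on_target:
            return False
    return True
```

Expressing the off-target limit relative to the on-target score, rather than as an absolute cutoff, means a stronger on-target sgRNA tolerates a proportionally stronger off-target match.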

Fig. 1: Genome-wide multi-targeted CRISPR-Cas library genetic workflow in tomato.

Schematic overview illustrating the library workflow, from design to screening. All coding genes were divided into phylogenetic trees, and trees were classified by function. sgRNAs were designed to target multiple genes located in close proximity to one another phylogenetically (indicated by colors). Each sgRNA was cloned into the Cas9 vector, creating a plasmid library. Transformed lines were screened with multi-targeted, large-scale, forward genetics for specific traits of interest, revealing hidden phenotypes. *Whole genome indicates all coding genes excluding transporters, transcription factors, and enzymes.

Fig. 2: Design and construction of the genome-wide, multi-targeted tomato CRISPR library.

a Schematic visualization of the 10 sub-libraries, detailing the main gene families targeted, the number of genes targeted, and the number of sgRNAs designed. MC, mitochondrial carrier. Numbers in bold and in brackets in the ‘Genes’ column represent the number of genes targeted and the total number of genes, respectively, in the indicated groups. b Bar chart of the number of genes targeted (black) versus the total number of genes in each sub-library (gray). c Distribution of the number of target genes per sgRNA in the library. d Pie charts of the number of mismatches between the designed sgRNAs and their target genes for sub-library 1 (top) and the entire library (bottom). e The frequency of numbers of sgRNA reads in sub-library 1 as determined by deep sequencing. Coverage was compared to the number of sgRNAs theoretically present in the library designed in silico. f Skewness (left axis, orange) and coverage (right axis, blue) of each sub-library. Orange and blue dotted lines mark the high-quality thresholds for skewness (−1, 1) and coverage (95%), respectively. Skewness for deep sequencing results was calculated as 3 × (mean − median)/standard deviation. Source data are provided as a Source Data file.

Sub-library construction and transformation

To create a research-ready tool that can be easily and flexibly used, we split the sgRNA library into 10 sub-libraries based on the function of the targeted genes. Sub-libraries 1–9 target specific families within transporter, transcription factor, and enzyme functional groups, and sub-library 10 targets all other genes (Fig. 2a). The three sub-libraries within each of the transporter, transcription factor, and enzyme functional groups focus on specific gene families. For example, for the transporter functional group, sub-library 1 includes sgRNAs targeting genes from the ABC, MFS, and DMT families, whereas sub-libraries 2 and 3 target genes from the APC family, channels and porins, mitochondrial carriers, and cation carriers, among others (Fig. 2a). The transcription factor sub-libraries (4–6) target notable groups such as ARF, AP2/ERF, ARR/GRF, MYB, WRKY, bHLH, bZIP, HSF and NAC, and the enzyme sub-libraries (7–9) were split into hydrolases, isomerases, oxidoreductases, lyases, ligases and transferases (Fig. 2a). Sub-libraries 1–9 contain between 179 and 511 sgRNAs targeting between 90 and 342 genes, and sub-library 10 contains 12,923 sgRNAs targeting 8156 genes (Fig. 2b). A detailed breakdown of functional groups, their associated smaller families, and the corresponding numbers of genes and sgRNAs is provided in Supplementary Table 1. The genome-scale library design and output are available in Supplementary Data File 1.

During synthesis, sub-library-specific adaptors of approximately 30 nucleotides were added on each side of every sgRNA according to their target gene’s functional group. This enabled the amplification of specific sub-libraries using primers complementary to these adaptors. Once amplified, the sub-library sgRNAs were cloned into a binary Cas9 vector (pMR284) in bulk as previously described17, resulting in copies of the vector that contain different sgRNAs. For each sub-library, we evaluated coverage (i.e., how many of the sgRNAs were cloned into the vector) and the distribution of the amplified sgRNAs. Deep sequencing of the 10 sub-libraries revealed essentially full coverage (> 99%) of the possible sgRNAs and a narrow bell-shaped distribution (skew values ranging from 0.02 to 0.26) (Fig. 2e, f, Supplementary Fig. 1), indicating that the sgRNAs were adequately distributed with no gross overrepresentation of individual sgRNAs.
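The two quality metrics used here, coverage and skewness, can be computed from per-sgRNA read counts as in the following minimal sketch. The skewness formula is the one given in the Fig. 2 legend, 3 × (mean − median)/standard deviation; the function names are hypothetical, and the choices to use the population standard deviation and to count an sgRNA as covered when it has at least one read are assumptions, not details taken from the Methods.

```python
from statistics import mean, median, pstdev


def coverage(read_counts: dict, designed: list) -> float:
    """Fraction of designed sgRNAs detected at least once (assumed criterion)."""
    detected = sum(1 for guide in designed if read_counts.get(guide, 0) > 0)
    return detected / len(designed)


def median_skewness(values: list) -> float:
    """Skewness as 3 * (mean - median) / standard deviation."""
    sd = pstdev(values)  # population standard deviation (assumption)
    if sd == 0:
        return 0.0  # all values identical: no skew
    return 3 * (mean(values) - median(values)) / sd
```

A value near zero indicates a roughly symmetric read distribution, whereas a large positive value indicates that a few sgRNAs account for a disproportionate share of reads.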

To demonstrate the robustness and applicability of this tool in overcoming redundancy in forward-genetics screens, we transformed sub-library 1 into the M82 sp- tomato background. The 502 cloned sgRNAs, targeting 199 genes belonging to three main transporter families (ABC, MFS, and DMT), were introduced into plants via Agrobacterium tumefaciens-mediated transformation. The transformation was conducted in bulk: the pool of vectors encoding the 502 sgRNAs was introduced into a single batch of Agrobacterium, and this heterogeneous Agrobacterium batch was then used for plant transformation. We generated ~250 independent transgenic lines, each harboring a unique sgRNA targeting multiple transporter genes. DNA was extracted from individual T0 plants. Notably, not all lines yielded seeds. In addition, we generated ~1060 lines in an Ailsa Craig background from sub-libraries targeting genes encoding transporters (sub-libraries 2 and 3) and transcription factors (sub-libraries 4 and 5). This large-scale transformation demonstrates the versatility of our CRISPR library system, which resulted in over 1300 unique transgenic lines in two tomato backgrounds.

Identifying effective predictors of Cas9 cleavage efficiency

Precise sgRNA design is essential for efficient CRISPR editing, but in order to achieve multi-targeting in our library, we applied a strategy allowing mismatches between sgRNA and target gene. This approach accommodates the multi-targeted nature of our library by enabling a single sgRNA to target multiple homologous genes, often in cases where no fully identical sequence is located near a PAM site or where off-target effects must be minimized. While allowing mismatches facilitates broader targeting, it may affect cleavage efficiency. To evaluate how mismatches and CFD scores influence Cas9 activity, we performed Sanger sequencing on 146 target genes with varying mismatch counts and CFD scores, using both randomly selected plants and those with observable phenotypes for analysis. We also included plants from two tomato backgrounds, M82 and Ailsa Craig, to assess whether these trends were background-dependent. The results showed that both mismatch count and CFD score were significant predictors of whether a gene would be cleaved (Fig. 3a, b), while neither the selection method (phenotype-based vs. random) nor the background (M82 or Ailsa Craig) significantly impacted cleavage efficiency (Fig. 3c, d). Having established that mismatch count and CFD score are reliable predictors of cleavage efficiency in our sequenced samples, we extended the analysis to the entire library to verify whether similar trends held across all target genes. The data confirmed a strong correlation between CFD score and mismatch count across the full library (Fig. 3e). In summary, our findings indicate that the ability to induce mutations can be predicted by target gene-sgRNA parameters, most notably mismatch count, and is independent of other factors like selection method or transformation background (Fig. 3f).
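As an illustration of the kind of summary underlying this analysis, the sketch below counts mismatches between an sgRNA and each target site and tabulates the fraction of cleaved targets per mismatch count. It is a simplified sketch under stated assumptions: the function names and record layout are hypothetical, sequences are compared position by position over equal lengths, and no statistical test is performed.

```python
from collections import defaultdict


def count_mismatches(sgrna: str, target: str) -> int:
    """Number of positions at which the sgRNA and target site differ."""
    if len(sgrna) != len(target):
        raise ValueError("sgRNA and target site must have the same length")
    return sum(a != b for a, b in zip(sgrna.upper(), target.upper()))


def cleavage_rate_by_mismatch(records) -> dict:
    """Map mismatch count to the fraction of targets that were cut.

    records: iterable of (sgrna, target_site, was_cut) tuples (hypothetical layout).
    """
    totals = defaultdict(int)
    cuts = defaultdict(int)
    for sgrna, target, was_cut in records:
        m = count_mismatches(sgrna, target)
        totals[m] += 1
        cuts[m] += int(was_cut)
    return {m: cuts[m] / totals[m] for m in sorted(totals)}
```

In practice the `was_cut` flag would come from Sanger sequencing of each target gene, as described above.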

Fig. 3: Mismatch count and CFD score as effective predictors of Cas9 cleavage efficiency.

a Number of mismatches in cut (n = 63) and uncut (n = 83) plants, with mutations verified by Sanger sequencing. Statistical significance was evaluated by Student's two-sided t-test, ***p-value < 0.001, assuming equal variances. b CFD score in cut (n = 63) and uncut (n = 83) plants. Statistical significance was evaluated by Student's two-sided t-test, *p-value < 0.05, assuming equal variances. Data are presented as mean values ± SD. c Cas9 edits (cleavage efficiency) in phenotype-based selection (n = 66) and randomly selected plants (n = 80). Not significant by Student's two-sided t-test (p-value = 0.064). Data are presented as mean values ± SD. d Cas9 cleavage efficiencies in two tomato backgrounds, M82 (n = 66) and Ailsa Craig (AC, n = 80). Data are presented as mean values ± SD. Not significant by Student's two-sided t-test (p-value = 0.134). e CFD score in all library targets, divided by mismatch number: 0, 1, 2, and 3 or more mismatches. Sample sizes (n) are indicated in brackets below. Statistical significance was evaluated by one-way ANOVA and Tukey’s multiple comparison test, p-value < 0.0001. Data are presented as a box plot with the center line at the median, box limits corresponding to the 25th and 75th percentiles, and whiskers extending to the minimum and maximum values. f Cas9 cleavage efficiencies in two tomato backgrounds, M82 (n = 66) and Ailsa Craig (AC, n = 80), among target genes with 0, 1 or 2 mismatches. “Random” includes Cas9 cleavage efficiency data from randomly selected independent plants. “Phenotype” indicates phenotype-based selection and includes Cas9 cleavage efficiency data from plants selected based on observable phenotypes. Source data are provided as a Source Data file.

CRISPR-GuideMap—a barcode tagging tool enables whole-library sgRNA sequencing

CRISPR-based forward genetic screens are powerful for identifying gene functions and interactions in plants. However, due to the large number of generated plants, a major limitation of these screens is the inability to track all CRISPR constructs within the library. Since the transformation is carried out in bulk, the identity of sgRNAs in individual plants is unknown. To identify the sgRNAs, researchers typically sequence only plants showing phenotypes, leaving all other plants un-sequenced and unstudied. We envisioned that a double barcode tagging system would enable us to determine the identity of the inserted sgRNA in each plant at a library scale. Thus, we aimed to sequence sgRNAs present in all plants in the library, regardless of whether the plant shows a phenotype at first sight. Such an approach would enhance the utility of our library by enabling complete characterization of the population and maximizing the information obtained from each plant.

To establish this barcoding approach, DNA was separately extracted from 253 T0 M82 transgenic plants transformed with sub-library 1, and the sgRNA region of each plant was amplified with forward and reverse primers that each have unique 8-nucleotide overhangs; this results in a unique “double barcode” for each individual plant (Fig. 4a). We deep sequenced the PCR products of all samples pooled together using the paired-end 150 (PE150) strategy, in which 150 nucleotides of each strand of every individual amplicon are sequenced, creating an overlapping region. Use of n forward primers and n reverse primers results in n² combinations of barcodes. For example, a library of 1024 samples will require 32 forward primers and 32 reverse primers. The deep sequencing produced millions of reads, many of which required filtering to remove irrelevant data and isolate meaningful information. The details of the process are described in the Methods section. Of the 253 samples, 17 (6.7%) had no sgRNA sequences. We speculate that this resulted from loss of DNA during extraction or unsuccessful PCR amplification rather than an inherent absence of sgRNAs in these plants. We detected 146 unique sgRNAs among the 236 samples in which an sgRNA was detected. The majority of these sgRNAs, 133 out of 146, were observed three or fewer times; however, we identified three sgRNAs that were significantly overrepresented, appearing 11, 24, and 37 times, respectively, across our sample set (Fig. 4b). In addition, we found that 186 (78%) of the plants harbored a single sgRNA, 41 (17%) harbored 2 sgRNAs, and 9 plants (4%) harbored 3 or more sgRNAs (Fig. 4c). These findings are consistent with insertion rates reported in the literature40. Importantly, when comparing our CRISPR-GuideMap results to those obtained through Sanger sequencing, we observed over 95% matching results. Comprehensive datasets of these results, along with additional data containing barcoding information for 441 transcription factor-targeted plants, are available in Supplementary Data File 2. Together, these results demonstrate the reliability and robustness of this system in providing a comprehensive overview of the entire sgRNA repertoire within the plant library. This enables a reverse-genetics approach, as data are available for every plant rather than just a selected few.
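The double-barcode arithmetic and the assignment of reads to plants can be sketched as follows. This is a simplified illustration rather than the CRISPR-GuideMap pipeline described in the Methods: the function names are hypothetical, and the sketch assumes that each barcode occupies the first 8 nucleotides of its read and that only exact barcode matches are accepted.

```python
import math
from itertools import product


def primers_needed(n_samples: int) -> int:
    """Smallest n such that n forward x n reverse primers give >= n_samples pairs."""
    if n_samples <= 0:
        return 0
    return math.isqrt(n_samples - 1) + 1  # integer ceiling of sqrt(n_samples)


def assign_barcodes(samples, forward, reverse) -> dict:
    """Give every sample a unique (forward, reverse) barcode pair."""
    if len(samples) > len(forward) * len(reverse):
        raise ValueError("not enough barcode combinations for all samples")
    return {pair: sample for pair, sample in zip(product(forward, reverse), samples)}


def demultiplex(read_pairs, barcode_to_sample, barcode_len=8) -> dict:
    """Count read pairs per sample; unmatched barcode pairs are skipped.

    read_pairs: iterable of (forward_read, reverse_read) strings.
    """
    counts = {}
    for fwd_read, rev_read in read_pairs:
        key = (fwd_read[:barcode_len], rev_read[:barcode_len])
        sample = barcode_to_sample.get(key)
        if sample is not None:
            counts[sample] = counts.get(sample, 0) + 1
    return counts
```

With this scheme, `primers_needed(1024)` evaluates to 32, matching the example in the text, and the 253 plants sequenced here would need 16 primers of each orientation.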

Fig. 4: CRISPR-GuideMap streamlines phenotype discovery in CRISPR Libraries.

a Schematic visualization of the CRISPR-GuideMap system. DNA from each plant is amplified using primers with unique 8-nucleotide overhangs, generating amplicons with unique barcodes and sgRNAs. b Number of sgRNAs versus the number of plants that harbor a given sgRNA. c Number of sgRNAs inserted per plant. d Phylogenetic tree of the SlNPF1 clade showing the genes targeted by sgRNAs in the library (left) and the combinations detected using CRISPR-GuideMap in planta (right). e Bacterial population counts in 4-week-old M82 (n = 9) and 2 independent npf triple mutant plants (n = 3) at 0 and 3 DPI. Each point represents a biological replicate, averaged from three technical replicates. Statistical significance was evaluated by Student's two-sided t-test, *p-value < 0.05, assuming equal variances. NS, not significant by Student's t-test (p-value = 3.32 × 10−5 and 5.45 × 10−4 for alleles 1 and 2, respectively). Data are presented as a box plot with the center line at the median, box limits corresponding to the 25th and 75th percentiles, and whiskers extending to the minimum and maximum values. f Representative images of symptoms on leaflets of M82 and two alleles of the npf triple mutant at 4 DPI. Scale bar = 1 cm. g Phylogenetic tree of the SlPT clade, showing the targeted sgRNA combinations in the library design and the combinations detected in plants. Solyc09g073010 has not been previously named and is referred to as PT9. h–j Phenotypic data of 8-week-old M82 and pt2/pt6 mutant plants grown under normal and low-phosphate conditions. h n = 13 in M82 control conditions, n = 10 in M82 low P, and pt2/pt6. i, j n = 14 in M82 control conditions, n = 10 in M82 low P, n = 10 in pt2/pt6 control conditions, n = 9 in pt2/pt6 low P. Data are presented as a box plot with the center line at the median, box limits corresponding to the 25th and 75th percentiles, and whiskers extending to the minimum and maximum values: h shoot biomass dry weight (not significant by Student's two-sided t-test, p-value = 0.992); i root surface area (not significant by Student's two-sided t-test, p-value = 0.872); and j number of root tips (not significant by Student's two-sided t-test, p-value = 0.603). Statistical significance was evaluated by Student's t-test, *p-value < 0.05, **p-value < 0.01, assuming equal variances. k Representative images of the whole root system of M82 and pt2/pt6 mutant under normal (30 ppm) and low (3 ppm) phosphate conditions. Scale bar = 2 cm. Source data are provided as a Source Data file.

Phenotypic screening of CRISPR library in tomato

Two distinct populations of tomato plants transformed with our CRISPR library were screened, each targeting different gene families and having a specific phenotypic focus. The first population, generated by transforming sub-library 1 into the M82 background, targeted transporter genes from the ABC, MFS, and DMT families. In these plants, we aimed to identify physiological and developmental abnormalities under standard and low fertilization conditions. We speculated that screening under low fertilization conditions might accentuate some phenotypic differences.

During initial screening of the mutant population, we identified a mutant line that exhibited an apparent increased susceptibility to pathogens. Sequencing analysis revealed that this line harbored an sgRNA targeting three adjacent genes within the NPF1 subfamily: NPF1.10, NPF1.11, and NPF1.12 (Solyc05g005990, Solyc05g006000, Solyc05g006010), all with zero mismatches. While these genes were successfully amplified in wild-type plants, the mutant plant showed no amplification of the target genes, raising the possibility that a large deletion of all 3 genes had occurred; this line is referred to as npf1.10/11/12-1 (allele 1) (Supplementary Fig. 2a). Leveraging our CRISPR-GuideMap system, we searched the library for another independent sgRNA line targeting the same genes and identified an additional line harboring an sgRNA targeting NPF1.10, NPF1.11, and NPF1.12 (Fig. 4d). Using the forward primer of NPF1.10 and the reverse primer of NPF1.12, we successfully amplified the product expected from a large deletion. Sequencing confirmed a large deletion of 10,739 base pairs between the target sites in NPF1.10 and NPF1.12 (Supplementary Fig. 2a, b). NPF1.10 was truncated in the first third of the gene, NPF1.11 was deleted entirely, and the deletion caused a frameshift in NPF1.12, resulting in a knockout of all 3 genes. This line is referred to as npf1.10/11/12-2 (allele 2). Data from the Tomato Functional Genomics Database indicated that NPF1.10 is upregulated in response to various pathogens, suggesting a role in the pathogen response. To test this hypothesis, we conducted a Xanthomonas euvesicatoria bacterial inoculation assay on wild-type M82 plants and both npf1.10/11/12 mutant alleles. Plant samples were taken at 0 and 3 days post inoculation (DPI) with bacteria. No differences were observed at 0 DPI, but both alleles showed increased bacterial growth at 3 DPI compared to the wild-type plant, indicating that loss of NPF1.10/11/12 enhanced susceptibility to X. euvesicatoria (Fig. 4e, f). Although the exact mechanism remains unclear, several studies have suggested potential links between nitrogen metabolism, a putative NPF substrate, and pathogen responses41,42,43.

In plants grown under low fertilization conditions, we identified a mutant line with heightened sensitivity to nutrient deficiency. Sequencing revealed mutations in PT2 and PT6 (Solyc03g005530, Solyc03g005560), which encode PHT1 phosphate transporter gene family members (Fig. 4g, Supplementary Fig. 3a). Although PT2 and PT6 knockouts have not been studied in tomato thus far, the eight PHT1 family members play crucial roles in phosphate uptake and transport, and evolutionary relationships and functional divergence within the Solanaceae family have been characterized44,45. The identical tandem duplicates, PT2 and PT6, are expressed in roots, with upregulation observed under low-phosphate conditions45,46. We compared the growth of the pt2/pt6 double mutant with wild-type plants under both control and low-phosphate conditions. Although the double mutant had no noticeable defects under standard fertilized conditions, under low-phosphate conditions the mutant had stunted growth and reduced shoot biomass accumulation (Fig. 4h). Since both genes are expressed in the root, we assessed root elongation on agar and found that while pt2/pt6 exhibited reduced root length under low-phosphate conditions, no effect was detected under normal phosphate fertilization (Supplementary Fig. 3b). To further investigate the phenomenon in a more natural environment, we imaged the root system under varying phosphate conditions. Under normal phosphate conditions, there was no significant difference between wild-type plants and the pt2/pt6 mutant. However, under low-phosphate conditions, the root surface area and overall number of root tips were both decreased in pt2/pt6 plants compared to wild-type plants (Fig. 4i–k). Root system depth and width did not show a significant change (Supplementary Fig. 3c, d). These results indicate that the two genes are important in phosphate uptake and response. In conclusion, our results demonstrate that multi-targeted CRISPR libraries, combined with CRISPR-GuideMap, are effective tools for generating higher-order mutants and uncovering previously unknown phenotypes.

Manipulation of tomato fruit-related traits

Because the multi-targeted CRISPR approach applied in sub-library 1 proved feasible for revealing hidden phenotypes, we wanted to evaluate the universality of our approach at larger scales. We therefore transformed sub-libraries 2 and 3, targeting transporters, and 4 and 5, targeting transcription factors (Fig. 2a and Supplementary Table 1), into the Ailsa Craig indeterminate background, and generated an additional 1062 independent lines. To identify genes associated with tomato fruit quality, such as fruit size, shape, and Brix content, we screened all 1062 lines in two consecutive seasons at the T1 generation (over 10,000 plants, with each line grown in 10 replicates). We identified 125 lines with putative fruit-related phenotypes. Among these, six mutant lines displayed significant variations in fruit size and shape (Fig. 5a, Supplementary Fig. 4). Measurements of fruit diameter, tip length, and Brix content showed significant differences compared to the fruits from control plants (Fig. 5b). Sequencing of the sgRNAs and their target genes identified one single mutant and five double mutants associated with the phenotypes (Fig. 5c, d). For example, line 0815-105 targets two Dof transcription factor genes, Solyc02g067230 (SlDof3) and Solyc02g088070 (SlDof8), resulting in smaller fruits (Fig. 5). Although loss-of-function mutants of Dof3 and Dof8 have not been reported in tomato, a Dof9 single mutant was shown to affect fruit yield47. Additionally, the sgRNA expressed in line 0815-122 targets two genes from the ERF family, several members of which are key transcriptional regulators in the ethylene response pathway48,49,50. Sequencing of the targeted genes showed that the gene family members Solyc08g081650 and Solyc08g081670 (ERF118-like genes) were mutated, and the fruits contained lower Brix levels (Fig. 5). Mutant plants showed slightly lower fruit weight compared to Ailsa Craig, while seed number per fruit and seed germination rate were not affected (Supplementary Fig. 4). Another example of lower Brix content is line 0815-155, which targets Solyc03g043820, Solyc03g043830, and Solyc03g043840, three unstudied bZIP family members. Interestingly, in both cases (lines 0815-122 and 0815-155), the targeted gene family members were genetically linked to each other, highlighting the strength of the approach in uncovering redundant activities of genetically linked genes. Lines 0815-246 and 0815-350 showed significant modification in fruit shape and fruit weight, as well as reduced seed number per fruit (Fig. 5, Supplementary Fig. 4). Both lines target AP2 family genes using different sgRNA sequences, with Solyc02g093150 common to the two. Previous research indicates that AP2a, another member of the AP2 family, functions as a negative regulator of fruit ripening. Inhibiting AP2a expression enhances ethylene synthesis and accelerates the ripening process in tomato51.

Fig. 5: Multi-targeted CRISPR transcription factors library reveals diverse fruit-related phenotypes in tomato.

a Representative images of fruit-related phenotypes in 18-week-old wild-type (Ailsa Craig) and the indicated mutant tomato plants. Scale bar = 1 cm. b Quantification of fruit-related phenotypes such as Brix score, diameter, and tip length. For line 0815-105, n = 7, AC n = 13. For line 0815-122, n = 10, AC n = 10. For line 0815-155, n = 8, AC n = 12. For line 0815-246, n = 4, AC n = 4. For line 0815-350, n = 18, AC n = 6. For line 0815-456, n = 16, AC n = 5. Each data point represents a biological replicate. Statistical significance was evaluated by two-sided Student’s t-test, **p-value < 0.01, ***p-value < 0.001. Data are presented as box plots with the center line at the median, box limits at the 25th and 75th percentiles, and whiskers extending to the minimum and maximum values. c DNA alignments of sgRNAs and target genes. Letters in red indicate mismatches. d Sequencing chromatograms of the targeted genes. sgRNA sequence is in blue, and deletions or substitutions are highlighted in red. e Phylogenetic trees of closely related homologs (targeted genes are highlighted in red). Source data are provided as a Source Data file.

Finally, line 0815-456 targets two ARF genes (ARF2B, Solyc12g042070, and ARF3, Solyc02g077560) and shows significant changes in fruit shape. Notably, the fruits of this line contain placenta but no seeds (Supplementary Fig. 4b). No previous reports have linked ARF2B and ARF3 to tomato fruit shape; however, ARF3 was shown to play a role in the development of epidermal cells and trichomes. In this respect, the homologous gene ARF9 has been implicated in regulating tomato cell division and expansion, affecting fruit size52,53. Together, these data demonstrate that gene discovery at large scales, such as across all transcription factor-encoding genes, combined with multi-genic editing of gene families, can reveal hidden phenotypic variation required for crop improvement.

Discussion

Classic mutagenesis screening techniques, which are often imprecise and produce unpredictable outcomes, have significant limitations. These methods typically generate random mutations, resulting in off-target effects that complicate the identification and characterization of relevant genes. Additionally, genetic linkage, which means that genes located close together on a chromosome tend to be inherited together, further complicates traditional breeding efforts52,54,55. Although often overlooked, this issue presents a substantial challenge for breeding programs that rely on crossing to develop desirable traits. Previously employed CRISPR systems offer specificity but lack scalability. Recently, the first CRISPR library for tomato was generated; it targeted all annotated transcription factors, and its application demonstrated the significant scalability of CRISPR libraries29. Despite the scale of this system, it lacked the ability to generate double or triple mutants, as only a single gene was targeted in each plant. Given that genetic redundancy buffers phenotypic plasticity, the inability to target multiple genes simultaneously posed a significant limitation. Our research introduces a multi-targeted CRISPR library designed to tackle genetic redundancy by targeting multiple genes within the same family. Unlike previous single-gene targeting approaches, this strategy provides a robust platform for functional genomics and agricultural innovation by enabling simultaneous editing of multiple family members. One of the primary advantages of CRISPR-Cas9 technology is its ability to perform high-efficiency, site-specific gene editing with minimal off-target effects. Building on our previous work in Arabidopsis, which demonstrated the feasibility of genome-wide, multi-targeted CRISPR libraries37, we extended this approach to tomato. Our successful implementation in tomato demonstrates the feasibility of this approach in crop species.
This strategy could be readily adapted to any crop with a sequenced genome and an established transformation protocol, such as maize, wheat, and rice, expanding its potential for functional genomics and agricultural improvement. The developed library, comprising tens of thousands of unique sgRNAs designed as sub-libraries to target genes within functional groups, represents a significant step forward in utilizing CRISPR technology to enhance crop traits and improve agricultural productivity.

To simultaneously target multiple genes from the same family, we allowed for mismatches between the sgRNAs and the targeted genes. This approach allows for high coverage, but mismatches lower the efficiency of the system due to lower complementarity between the sgRNA and the target genes. As expected, our results demonstrated that a high number of mismatches significantly hindered the ability of Cas9 to induce double-stranded breaks in DNA. While we acknowledge that all sgRNAs designed by the CRISPys algorithm have passed additional limiting criteria of the CFD scoring function, such as mismatch position and substitution type, we still observe a clear trend within the approved sgRNAs: lower mismatch counts consistently correlate with higher cleavage efficiency. This suggests that even among sgRNAs that meet the CFD scoring thresholds, mismatch count remains a dominant factor influencing Cas9 activity. This finding underscores the impact of mismatches on CRISPR efficiency and highlights the need for optimized sgRNA design. Our findings are in line with previous reports on the deleterious effect of mismatches on Cas9 cleavage efficiencies in both bacterial and mammalian cells53,56. Several strategies could enhance the robustness of the system in designing future libraries. First, the use of multiplexing, wherein each plant is transformed with multiple sgRNAs, could enhance the overall effectiveness of the system by simultaneously targeting multiple genes without requiring mismatches. However, because of technical oligo synthesis and cloning limitations, this has not been carried out at the large-scale level. Second, genetically engineered versions of the Cas9 protein, such as an intronized Cas957, have increased cleavage efficiency in Arabidopsis and may offer similar benefits in other plant systems including tomato. 
Third, while our current design relied on the CFD39 scoring function to predict mismatch effects, newer scoring functions are continuously being developed that leverage large-scale experimental data and deep learning to better predict off-target effects58,59,60. Such improved scoring methods could be incorporated into our algorithm to optimize sgRNA design in future libraries.

Our library features a barcoding-sequencing system, CRISPR-GuideMap, that enables tracking of sgRNAs present in all plants. Unlike previous CRISPR screens where only plants displaying phenotypes are typically sequenced and analyzed, this approach allows the identification of sgRNAs in all plants in the library, regardless of their phenotypes. This comprehensive characterization provides information about the full spectrum of potential genetic modifications in the population.

For the M82 background transformation of sub-library 1, a small percentage of transformed plants (17 out of 253) had no detectable sgRNA sequences, likely due to technical issues with DNA extraction or PCR amplification. Analysis of the remaining samples revealed that 78% expressed a single sgRNA, 17% expressed two, and 4% harbored three or more sgRNAs. This distribution aligns with expectations based on existing literature40 and demonstrates the effectiveness of our barcoding-sequencing system in providing an accurate view of the sgRNA content in each plant. Notably, although most sgRNAs were evenly distributed, a few were significantly overrepresented, and the factors contributing to their prevalence warrant further investigation. One way to reduce the overrepresentation would be to perform transformations and tissue culture regeneration in a trackable manner, labeling each regenerated callus.

The ability to identify the sgRNAs present in every plant offers several important advantages for both research and breeding programs. First, by using next-generation sequencing, CRISPR-GuideMap enables the identification of plants harboring multiple sgRNAs, information that cannot be obtained through Sanger sequencing. This is particularly valuable as it ensures the reliability of genotype-phenotype causality inferences. Second, by identifying other plants within the transformed library that harbor the same or different sgRNAs targeting the same genes of interest, researchers can efficiently verify genotype-phenotype relationships. Additionally, this system expands homology research by allowing for the identification of plants with sgRNAs targeting neighboring genes on the phylogenetic tree, thereby enhancing our understanding of genetic relationships and evolutionary patterns. This information guides the choice of lines for continued research. Third, by revealing the distribution of sgRNAs across the entire population of transformed plants, CRISPR-GuideMap provides a comprehensive overview of the genetic variation generated, benefiting both basic research and breeding applications.

Lastly, CRISPR-GuideMap can facilitate a ‘reverse genetics’ approach, allowing researchers to identify plants carrying sgRNAs targeting specific genes or gene families even prior to screening. For example, studies focusing on genes encoding particular nutrient or hormone transporters can be conducted using only plants with relevant sgRNAs, increasing screening efficiency. Overall, CRISPR-GuideMap represents a significant advancement in genetic screening technology. It not only maximizes the amount of usable data from the library but also enhances the system’s overall functionality, making it a powerful tool for research and breeding alike.

In conclusion, the developed CRISPR library in tomato represents a significant advancement in genetic engineering, extending the toolbox for functional genomics and crop improvement. By overcoming the limitations of genetic redundancy and implementing a barcoding-based sequencing strategy, our approach provides powerful multi-targeted tools for researchers and breeders. This innovative strategy has the potential to accelerate the development of various crops with enhanced traits, contributing to food security and sustainable agriculture.

Methods

Plant material and growth conditions

Tomato (S. lycopersicum) plants in M82 sp- and Ailsa Craig (Sl) backgrounds were used throughout this study. Plants were grown in a greenhouse or a growth room with long-day conditions (16 h light/8 h dark) at 20–30 °C.

Tomato Solanaceae Genomics Network (SGN) accession numbers: Solyc05g005990 (NPF1.10), Solyc05g006000 (NPF1.11), Solyc05g006010 (NPF1.12), Solyc03g005530 (PT2), Solyc03g005560 (PT6), Solyc02g067230 (SlDOF3), Solyc02g088070 (SlDOF8), Solyc08g081650 (ERF118-like), Solyc08g081670 (ERF118-like), Solyc03g043820, Solyc03g043830, Solyc03g043840, Solyc02g064960, Solyc02g093150, Solyc03g044300, Solyc12g042070 (ARF2b), Solyc02g077560 (ARF3).

Bacterial material and growth conditions

All bacteria were grown on LB agar media: 20 g of LB (Lennox) (Accumedia) and 15 g bacteriological agar (Accumedia) were added to 1 L doubly distilled water and autoclaved for 20 min at 121 °C. Where required, antibiotics were added at final concentrations of 50 μg/ml kanamycin, 100 μg/ml carbenicillin, 25 μg/ml gentamycin, 50 μg/ml spectinomycin, and 25 μg/ml rifampicin.

Plant DNA extraction and genotyping

To isolate genomic DNA from young tomato leaves as a template for PCR, sequencing, and genotyping, approximately 100 mg of leaf tissue was placed in a 2-ml round-tip Eppendorf tube together with a metal bead and rapidly frozen in liquid nitrogen. The frozen tissue was then ground to a fine powder using a tissue-lyser. The powdered tissue was homogenized with 400 μl of DNA extraction buffer containing 200 mM Tris-HCl (pH 7.5–8.0), 25 mM EDTA, 250 mM NaCl, and 0.5% SDS. After homogenization, the tubes were vortexed for 5 s and then centrifuged at 21,130 × g for 1 min in an Eppendorf mini centrifuge. The supernatant was transferred to a new tube, and DNA was precipitated by adding 300 μl of isopropanol. Following a 5-min incubation at room temperature, the tube was centrifuged at 21,130 × g for 10 min at room temperature. The visible pellet was washed with 600 μl of 70% ethanol and centrifuged for 1 min at 21,130 × g at room temperature. Finally, the DNA pellet was resuspended in 50 μl doubly distilled water. The concentration of the extracted DNA was determined from absorbance measured using a Nanodrop spectrophotometer.

All lines shown in this study were screened in T1 and confirmed for phenotype and genotype in the T2 generation. The presented mutants are either homozygous or biallelic. Plants were first genotyped for sgRNA and Cas9 insertion, followed by detailed genotyping for mutations in target genes using primers indicated in Supplementary Table 4. After validating homozygous or biallelic lines, plants were genotyped again to test whether Cas9 was still present in the line or had segregated out: lines npf1.10/11/12_2, 0815-105, 0815-246, and 0815-350 were homozygous for Cas9; lines npf1.10/11/12_1, 0815-122, and 0815-155 were heterozygous for Cas9; and lines pt2/6 and 0815-456 had no Cas9 insertion.

Construction of multi-targeted CRISPR libraries and tomato transformation

The 20-nucleotide sgRNA target sites were appended to specific adaptors containing sites for the type IIS restriction enzyme BsaI (Supplementary Table 4). Synthesis of the 15,804 DNA oligonucleotides (total yield: 500 ng) corresponding to the sgRNAs was performed by Twist Bioscience. Using adaptor-specific primers (Supplementary Table 4), the libraries were prepared as described by Hu et al.37. Briefly, vectors were assembled using the Golden Gate cloning system61, in which type IIS digestion generates specific overhangs that allow for one-pot, directional assembly of multiple DNA fragments. Each fragment is flanked by unique 4-bp overhangs, enabling seamless and predefined assembly into a destination vector via simultaneous digestion and ligation. The final binary vectors, pMR284, were introduced into A. tumefaciens strain GV3101 by electroporation. The constructs were transformed into M82 sp- and Ailsa Craig cotyledons using the transformation and regeneration methods described by McCormick62. Briefly, tomato cotyledons from 10-day-old seedlings of M82 sp- or Ailsa Craig were excised and placed on regeneration medium for a 2-day pre-culture period to enhance their competence for transformation. The explants were then infected with Agrobacterium tumefaciens carrying the desired construct and co-cultivated in the dark for 2 days to allow for T-DNA transfer. After co-cultivation, the cotyledons were transferred to a selective regeneration medium containing appropriate antibiotics to inhibit Agrobacterium growth and select for transformed cells; they were subcultured every 14 days until shoot formation. Kanamycin-resistant T0 plants were moved to soil, and independent transgenic lines were numbered according to the order in which they were moved to soil.

Barcode primer design and amplification

Primers with no overhangs were tested to verify sufficient amplification from transgenic T0 plants from the library (forward primer, 5′-cacatcgcttagataagaaaacg-3′; reverse primer, 5′-cctaggtaatgccaactttgtac-3′). PCR was conducted using the Vazyme X2 Rapid Taq Kit with annealing at 54 °C, elongation for 5 s, and 30 cycles. Next, 64 barcode sequences, each 8 nucleotides in length, were retrieved from Hamady et al.63. These barcodes were designed following specific criteria, such as 40–60% G/C content, no consecutive triplets, and no self-complementarity, to ensure optimized PCR amplification. Thirty-two of the sequences were concatenated to the 5′ end of the forward primer, and the remaining 32 to the 5′ end of the reverse primer. A list of all final primers can be found in Supplementary Table 1, and a table of all combinations can be found in Supplementary Table 2.
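As an illustration of the dual-barcode design, the following minimal sketch shows how 32 forward and 32 reverse barcodes yield up to 32 × 32 = 1024 distinct sample labels, and how two of the stated barcode criteria could be checked. The function names are hypothetical, the reading of "no consecutive triplets" as "no base repeated three times in a row" is an assumption, and the self-complementarity check is omitted.

```python
from itertools import product

# Primer sequences as given in the text.
FWD_PRIMER = "cacatcgcttagataagaaaacg"
REV_PRIMER = "cctaggtaatgccaactttgtac"


def gc_fraction(seq):
    """Fraction of G/C bases in a sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def has_homotriplet(seq):
    """True if any base occurs three times in a row (assumed meaning of 'triplet')."""
    seq = seq.upper()
    return any(seq[i] == seq[i + 1] == seq[i + 2] for i in range(len(seq) - 2))


def passes_criteria(barcode):
    """Check length, 40-60% G/C content, and absence of homotriplets."""
    return (
        len(barcode) == 8
        and 0.4 <= gc_fraction(barcode) <= 0.6
        and not has_homotriplet(barcode)
    )


def barcoded_primers(barcodes, primer):
    """Concatenate each barcode to the 5' end of a primer."""
    return [bc.lower() + primer for bc in barcodes]


def sample_barcode_pairs(fwd_barcodes, rev_barcodes):
    """All (forward, reverse) barcode combinations, one per sample."""
    return list(product(fwd_barcodes, rev_barcodes))
```

With 32 barcodes on each side, `sample_barcode_pairs` returns 1024 pairs, which is the upper bound on the number of plants that can be pooled in a single sequencing run under this scheme.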

Each sample was amplified using a unique combination of barcode primers. The expected length of each amplified fragment, including the two 8-bp barcodes, is 222 bp. The individual PCR products were then pooled and separated by gel electrophoresis. The DNA was isolated from the gel using the NucleoSpin Gel and PCR Clean-up Kit (Macherey-Nagel), and samples were deep sequenced (PE150) by Novogene.

The analysis of deep sequencing PE150 data commenced with rigorous quality checks to ensure the reliability of the reads. Each pair of reads was examined for an overlapping region of 78 bp, encompassing the sgRNA. Reads with discrepancies in this region were discarded. Additionally, reads containing barcode sequences at the 5′ end of reads that did not match our predefined list were removed from the dataset. Non-variable regions in the sequencing were scrutinized, and reads with mismatches, insertions, or deletions were removed as well. Once high-quality reads were obtained, we proceeded to assign pairs of reads to their respective plant numbers using the unique barcode combinations listed in Supplementary Table 2. Once the plant of origin was identified, the corresponding sgRNA sequence was assigned to that plant.
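The 78-bp overlap follows from the read and amplicon lengths: two 150-bp reads covering a 222-bp fragment share 2 × 150 − 222 = 78 bp. The sketch below illustrates the overlap-consistency check and the barcode lookup under simplifying assumptions: read 1 always starts at the forward-primer end, read 2 at the reverse-primer end, and reads are full length. The function names and the `pair_to_plant` mapping are hypothetical and do not reproduce the pipeline used in this study.

```python
READ_LEN = 150
AMPLICON_LEN = 222
OVERLAP = 2 * READ_LEN - AMPLICON_LEN  # 78 bp shared by the two reads

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq):
    """Reverse complement of a DNA sequence (returned in upper case)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def overlap_agrees(read1, read2):
    """True if the last 78 bp of read 1 match the reverse-complemented read 2.

    Read 1 covers amplicon positions 0-149; the reverse complement of
    read 2 covers positions 72-221, so the shared region is 72-149.
    """
    if len(read1) != READ_LEN or len(read2) != READ_LEN:
        return False
    read2_forward = revcomp(read2)
    return read1[-OVERLAP:].upper() == read2_forward[:OVERLAP]


def assign_plant(read1, read2, pair_to_plant, barcode_len=8):
    """Look up the plant from the barcodes at the 5' end of each read.

    Returns None when the barcode pair is not in the predefined list.
    """
    key = (read1[:barcode_len].upper(), read2[:barcode_len].upper())
    return pair_to_plant.get(key)
```

Read pairs for which `overlap_agrees` is false, or for which `assign_plant` returns `None`, would correspond to the reads that are discarded in the filtering steps described above.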

Despite the initial filtering, the data contained some “noise,” necessitating further evaluation to extract meaningful information. We ranked the reads for each plant by abundance and examined whether they were adequately represented. Plants in which the most abundant read accounted for less than 20% of all reads for that plant, or with limited occurrences (<750 reads), were classified as having insufficient sequencing results. Plants for which the most abundant read was at least three times more frequent than the next most abundant read were identified as containing a single sgRNA. For the remaining plants, we examined the reads in order of abundance; the first read that was more than double the abundance of the following read marked the last sgRNA counted for that plant, resulting in 2, 3, or in rare cases 4 sgRNAs.
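The calling rules above can be sketched as a short function. This is one possible reading of the description, not the code used in the study: it assumes that the 750-read cutoff applies to the total number of reads per plant, that the abundance scan for multi-sgRNA plants starts at the second most abundant read (so that such plants are called with at least two sgRNAs), and that every distinct read is counted when no twofold drop is found. The function name and parameter names are hypothetical.

```python
def call_sgrna_count(read_counts, min_total=750, min_top_fraction=0.20,
                     single_ratio=3.0, step_ratio=2.0):
    """Estimate the number of sgRNAs in one plant from per-sgRNA read counts.

    read_counts maps each sgRNA sequence to its read count for that plant.
    Returns None for plants with insufficient sequencing results.
    """
    counts = sorted(read_counts.values(), reverse=True)
    total = sum(counts)

    # Insufficient data: too few reads, or no read dominant enough.
    if not counts or total < min_total or counts[0] < min_top_fraction * total:
        return None

    # Single sgRNA: top read at least three times the runner-up.
    if len(counts) == 1 or counts[0] >= single_ratio * counts[1]:
        return 1

    # Multiple sgRNAs: scan from the second read (assumption) and stop at
    # the first read that is more than double the one that follows it.
    for i in range(1, len(counts) - 1):
        if counts[i] > step_ratio * counts[i + 1]:
            return i + 1

    # No clear drop found: count every distinct read (assumption).
    return len(counts)
```

For example, counts of 500, 400, 40, and 30 reads give a call of two sgRNAs, because the second read is more than double the third, whereas counts of 900, 50, and 30 give a single sgRNA.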

Gene family classification and multi-target sgRNA design

The set of protein-coding genes of S. lycopersicum was obtained from the PLAZA 4.5 Plant Comparative Genomics Database64. To efficiently design sgRNAs for each gene family, we utilized the CRISPys algorithm, which considers the homologous relationships within each family38. For a given gene family, its alignment was computed using MAFFT version 765, and the phylogeny was reconstructed using a hierarchical clustering algorithm66. Large gene families were partitioned into subfamilies of at most eight genes.

The design strategy of CRISPys was then applied recursively to each subgroup of each gene tree, determining the optimal sgRNAs for targeting specific subfamilies. The CRISPys algorithm utilized the CFD score39 as the scoring function, with a targeting efficacy threshold (Ω) of 0.8. The potential sgRNA targets were confined to the first two-thirds of the coding sequence.

To avoid generating the same sgRNA twice, we accounted for cases where the same sgRNA could be assigned to different subgroups of homologous genes, with one subgroup being a subset of another (e.g., generating candidate sgRNAs for both {gene1, gene2} and {gene1, gene2, gene3}). In such cases, only one occurrence of the sgRNA was considered. Additionally, to prevent multiple sgRNAs from targeting essentially the same genomic region, we allowed a maximum of 2 bp overlap between sgRNAs.
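The two rules in this paragraph can be illustrated with a simplified sketch. It is not the CRISPys implementation: each candidate is reduced to a single 20-nt target site on one gene (real multi-target sgRNAs have several sites), candidates are assumed to arrive in order of preference, and conflicts are resolved greedily. The function names and the tuple layout are hypothetical.

```python
SITE_LEN = 20  # length of an sgRNA target site in nucleotides


def overlap_bp(a, b):
    """Number of shared bases between two half-open intervals (start, end)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def filter_candidates(candidates, max_overlap=2):
    """Drop duplicate sgRNAs and sgRNAs overlapping a kept site by > max_overlap bp.

    candidates: list of (sequence, gene, start) tuples in order of preference.
    """
    kept = []
    seen_sequences = set()
    for seq, gene, start in candidates:
        # Rule 1: keep only one occurrence of each sgRNA sequence.
        if seq in seen_sequences:
            continue
        site = (start, start + SITE_LEN)
        # Rule 2: reject sites that overlap a kept site on the same gene
        # by more than the allowed number of bases.
        clash = any(
            g == gene and overlap_bp(site, (s, s + SITE_LEN)) > max_overlap
            for _, g, s in kept
        )
        if not clash:
            kept.append((seq, gene, start))
            seen_sequences.add(seq)
    return kept
```

In this sketch, a site starting 18 bp downstream of a kept site shares exactly 2 bp with it and is retained, whereas a site starting 10 bp downstream shares 10 bp and is dropped.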

After generating the sgRNAs, a genome-wide off-target detection search was applied. We defined an off-target gene as a potential genomic target outside the specified gene family, whereas on-target genes were genomic targets within the family, even with some mismatches. The off-target threshold was set at one-fifth of the on-target score. For instance, if an sgRNA had an on-target score of 0.9, the off-target threshold would be 0.18 (0.9 × 0.2 = 0.18). Any sgRNA with an off-target sequence having a CFD score above 0.18 was discarded, ensuring the selection of specific sgRNA sequences. Following this filtering, we limited the number of designed sgRNAs to a maximum of 8 per subgroup (i.e., internal node in the tree).
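The off-target rule can be expressed as a small filter. The sketch below assumes that CFD scores for the on-target and off-target sites have already been computed elsewhere; the dictionary keys and function names are hypothetical, and ranking the surviving sgRNAs by on-target score before applying the cap of 8 is an assumption, since the text does not state how the retained sgRNAs were chosen.

```python
def passes_off_target_filter(on_target_score, off_target_scores, ratio=0.2):
    """True if no off-target CFD score exceeds one-fifth of the on-target score."""
    threshold = on_target_score * ratio
    return all(score <= threshold for score in off_target_scores)


def select_for_subgroup(candidates, max_per_subgroup=8):
    """Keep at most max_per_subgroup sgRNAs that pass the off-target filter.

    candidates: list of dicts with keys "seq", "on_target", and "off_targets".
    Survivors are ranked by on-target score (assumption, see lead-in).
    """
    passing = [
        c for c in candidates
        if passes_off_target_filter(c["on_target"], c["off_targets"])
    ]
    passing.sort(key=lambda c: c["on_target"], reverse=True)
    return passing[:max_per_subgroup]
```

For an sgRNA with an on-target score of 0.9, an off-target site scoring 0.10 is tolerated, while one scoring 0.19 causes the sgRNA to be discarded.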

Phylogenetic tree building

Gene families were taken from PLAZA Dicots 4.5. Amino acid sequences of proteins in every family of interest were aligned using MAFFT version 7 with the BLOSUM62 scoring matrix, a gap opening penalty of 1.53, and an offset value of 0. Results were reformatted to PAUP/NEXUS format and downloaded from the MAFFT website65,67. Nexus files were uploaded to phylogeny.fr using the “one-click” mode for analysis, with the Gblocks program enabled. Completed trees were downloaded in Newick format and rendered with FigTree v1.4.4.

Xanthomonas euvesicatoria bacterial inoculation assay

X. euvesicatoria strain 85-10 (Xe 85-10) was vacuum infiltrated into the leaves of 4-week-old wild-type M82 sp- and npf1 mutant plants. The bacteria were inoculated into the tomato leaves at 2 × 10⁵ colony-forming units/ml (OD600 = 0.0004) in a solution containing 10 mM MgCl2 and 0.08% Silwet L-77. Following inoculation, 1-cm² leaf discs were punched out of the third leaf at 0 DPI (i.e., 3 h after inoculation) and at 3 DPI. To quantify bacteria, three leaf discs from the inoculated leaves were crushed in 2-ml Eppendorf tubes with 10 mM MgCl2. The leaf extracts containing the bacteria were then serially diluted and spotted onto LB agar plates, and the resulting colonies were counted. Leaves were photographed at 4 DPI.

Shoot and root phenotyping

Seeds were germinated in soil with slow-release fertilizer (vendor). Plants received regular fertilization and drip irrigation (2 × 5 min daily). After 28 days, plants were moved to 12-L sand pots. Control conditions contained 3% phosphate (70 ppm N, 30 ppm P2O5, 70 ppm K2O, micronutrients - Koratin); and low-phosphate conditions contained 0.3% phosphate (70 ppm N, 3 ppm P2O5, 70 ppm K2O, micronutrients - Koratin). At 2 months, plants were scanned and analyzed using the Phenoroot imaging system (phenoroot.com). Shoots were cut at the soil level, separating the shoot from the root. Shoots were placed in brown paper bags at 60 °C for 72 h. After drying, each plant was weighed to determine shoot biomass. List of fertilizers and suppliers: Multi-K (13-0-46) (Haifa Group), Phosphoric Acid 85% (Haifa Group), Liquid Ammonium Sulfate 21% (Deshen Gat), Koratin (ICL).

For root length measurements on ½ MS plates with varying phosphate (P) levels, standard ½ MS medium was used for 100% phosphate plates. For 0% phosphate plates, ½ MS -P (phosphate-free) plates were prepared using MS -P powder (catalog number MSP11, Caisson Labs). To create plates containing 10% phosphate, standard ½ MS medium was mixed with ½ MS -P medium in a 1:9 ratio, achieving a final phosphate concentration of 10% relative to standard MS. Sterilized M82 and pt2/6 seeds were germinated, and root length was measured after 6 days using ImageJ software.

Fruit phenotyping

Tomato CRISPR lines were grown under normal conditions in the greenhouse (25–30 °C), and naturally ripened fruits were harvested for phenotypic characterization. Total soluble solids were determined using a digital Brix refractometer (ATAGO PAL-BX/ACID3) on juice extracted from ripe fruits. The diameter of ripe fruits was measured using a vernier caliper. To measure the length of the pointed tip of the fruit, mature fruits were photographed longitudinally and analyzed using ImageJ. For single-fruit weight, fruits of relatively uniform size were selected from 18-week-old plants. The seeds of each fruit were then scooped out and counted to determine the seed number per fruit. For germination assays, each biological replicate included 30 seeds germinated in a culture dish, and germinated seeds were counted after 2 weeks. Germination rate = (number of germinated seeds / 30) × 100%. For each mutant line, measurements were taken from at least three biological replicates.

Statistics and reproducibility

Statistical analysis was performed using Microsoft Excel 2019 and GraphPad Prism v.8.0 (GraphPad software). Statistical significance was determined by using unpaired two-tailed Student’s t-test for two-group comparisons and one-way ANOVA for multiple comparisons. Asterisks indicate significant differences (*P < 0.05, **P < 0.01, ***P < 0.001). Different lowercase letters indicate significant differences (P < 0.05). All experiments were repeated independently three times with consistent results. No statistical method was used to predetermine sample size, no data were excluded from the analyses, the experiments were not randomized, and the investigators were not blinded to allocation during experiments and outcome assessment.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.