Frequently Asked Questions

How does it work?

CRISPR-mediated base editing enables programmable DNA base conversion without DNA double-strand break formation (Komor et al. Nature 2016). Different generations of CRISPR base editors (BE1, BE2, BE3 and BE3 variants) can convert cytidine to uridine, resulting in C -> T and G -> A substitutions in the genome. We and others (Billon P., Bryant E. et al. Molecular Cell, 2017; Kuscu et al. Nature Methods 2017) have demonstrated that CRISPR-mediated base editing is an efficient system for introducing premature stop codons in human cells, resulting in the generation of gene knock-outs. Induction of STOP codons (iSTOP) is mediated by sgRNAs for iSTOP (named sgSTOPs) that target 4 different codons (CAA, CAG, CGA, and TGG) located at a precise distance from a protospacer adjacent motif (PAM). CAA, CAG, and CGA codons can be converted into STOP codons when targeted on the coding strand, while TGG can be converted into a STOP codon when targeted on the non-coding strand.
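The codon conversions described above can be checked with a short Python sketch. It is only an illustration and not part of the iSTOP pipeline; the function name edited_codons is hypothetical. Targeting the non-coding strand is modelled as G -> A changes read on the coding strand.

```python
from itertools import combinations

STOP_CODONS = {"TAA", "TAG", "TGA"}

def edited_codons(codon, src, dst):
    """Return every codon obtained by converting one or more `src` bases to `dst`."""
    positions = [i for i, base in enumerate(codon) if base == src]
    products = set()
    for n in range(1, len(positions) + 1):
        for subset in combinations(positions, n):
            bases = list(codon)
            for i in subset:
                bases[i] = dst
            products.add("".join(bases))
    return products

# Coding-strand targeting: C -> T in the codon itself.
for codon in ("CAA", "CAG", "CGA"):
    print(codon, "->", sorted(edited_codons(codon, "C", "T") & STOP_CODONS))

# Non-coding-strand targeting: seen as G -> A on the coding strand.
print("TGG", "->", sorted(edited_codons("TGG", "G", "A") & STOP_CODONS))
```

Each of CAA, CAG, and CGA yields a single stop codon, while TGG can yield TAG, TGA, or TAA depending on which G is converted.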

The Table

Column Name: Description
Gene Name: Gene symbol (UCSC gene names for all species except A. thaliana, which uses TAIR gene names)
Relative Position in Largest Isoform: Relative position in the largest gene isoform of the coordinate of the targeted base (0 = the beginning of the coding sequence, 1 = the end of the coding sequence)
No Upstream G: TRUE = no G is located on the immediate 5'-side of the targeted base
RFLP Loss: Restriction enzymes that uniquely cut +/- 50 bases of genomic sequence from the targeted base before editing. Multiple enzymes are separated by "|".
RFLP Gain: Restriction enzymes that uniquely cut +/- 50 bases of genomic sequence from the targeted base after editing. Multiple enzymes are separated by "|".
NMD Prediction (%): Percentage of isoforms predicted to incur nonsense-mediated decay. This prediction is based on the targeting of an isoform's coding sequence 55 bases upstream of the final exon-exon junction.
PAM: The 20 nt guide sequence for the corresponding PAM (the targeted C is lowercase)
Off-Target Prediction PAM: Number of sequences that may be unintentionally targeted by the sgSTOP. Sequences matching sgRNAs (including the PAM) are searched in the genome while allowing up to two mismatches in positions 1-8 of the seed sequence.
Cancer type(s): Cancer type(s) in which the sgSTOP is predicted to model a nonsense mutation observed. Multiple cancer types are separated by "|". Only for H. sapiens.
Chromosome: Chromosome Name/Number (hidden on website)
Strand: Strand of the targeted base in the coding sequence (hidden on website)
Genomic Coordinate: Genomic coordinate of the targeted base (hidden on website)
Targeted Codon: Codon that is targeted (hidden on website)
Number of Isoforms: Number of isoforms considered for the gene (hidden on website)
Percent Isoforms: Percentage of the isoforms that are targeted at the coordinate of the targeted base (hidden on website)

The columns listed in red above (marked "hidden on website") are hidden in the table by default but can be revealed by using the "Column Visibility" button. Even when hidden, these columns are included in the Excel or .csv file that the user may download using the "Download Table (Excel)" or "Download Table (.csv)" buttons.

Why can’t I see some of the columns?

The table is responsive to the size of the browser window, which may cause some columns to be hidden. Click the "+" sign (within the "Gene Name" column) to see the hidden columns for the corresponding row. If there is no "+" sign within the "Gene Name" column, then all of the columns are being displayed. You can also use the “Column Visibility” button to show or hide individual columns, or select specific parameters in “Advanced Search” to restrict the number of columns. Finally, we recommend clicking on the “Download Table (Excel)” or "Download Table (.csv)" buttons to download the result as a separate Excel (.xlsx) or comma-separated values (.csv) file.

How do I find the targeted base in the sgSTOP sequence?

The lowercase letter indicates the targeted base. For example: TTTTcAGCTTGACACAGGTT.
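When working with a downloaded table, the targeted base can be located programmatically. The following is a minimal Python sketch; the helper name targeted_base_positions is hypothetical, and positions are reported 1-based from the 5' end of the guide as written.

```python
def targeted_base_positions(sgstop):
    """Return the 1-based positions of lowercase (targeted) bases in a guide."""
    return [i + 1 for i, base in enumerate(sgstop) if base.islower()]

guide = "TTTTcAGCTTGACACAGGTT"
print(targeted_base_positions(guide))  # position(s) of the targeted base
print(guide.upper())                   # all-uppercase form, e.g. for ordering oligos
```

For the example guide, the only lowercase letter is the fifth base.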

What are the hyperlinks on the sgSTOPs and how do I use them?

The links only work for organisms available in the UCSC genome browser. Each link directs users to the results page of a UCSC genome browser BLAT search for the sequence clicked. IMPORTANT! Before clicking the link on an sgSTOP sequence, make sure to set the BLAT search in the UCSC genome browser to the organism that you want to search. For example, if you are searching for your gene of interest in humans, select "Human" in the "Genome" drop-down. To confirm that you are in the proper BLAT search, check that the upper left corner says "Human BLAT Search". Then click on the sgSTOP link, and it will automatically search for the indicated sgSTOP in the UCSC genome browser.

The Search

How can I select sgSTOPs to introduce stop codons into my favorite gene using base editing?

In “Gene Search”, select the species using the drop-down menu and input the name of your favorite gene. Then click “Submit” to get the list of all the sgSTOPs available in your gene. In "Cancer Search", select the cancer type of interest with (left side) or without (right side) selecting your favorite gene using the drop-down menu and then click "Submit". The right side allows users to restrict the search in Homo sapiens for sgSTOPs that model nonsense mutations identified in cancer. The left side allows users to get a list of all the sgSTOPs for any gene that would model nonsense mutations in that cancer type. To restrict the number of sgSTOPs, we strongly recommend using the “Advanced Search” feature to filter the search. In particular, we suggest selecting sgSTOPs that can be easily monitored on a gel using a Restriction Fragment Length Polymorphism (RFLP) assay and that have the maximum on-target efficiency (see details below).

The table uses UCSC or TAIR gene names, can the search recognize some aliases?

Yes! For many of the species, you can search for your gene of interest using an alias. For example, you can search for "tip60" in H. sapiens and obtain the sgSTOPs for KAT5 (UCSC gene name). For C. elegans and A. thaliana we suggest using the UCSC and TAIR gene names, respectively, since the alias list is not as comprehensive.

What are the different parameters in the “Advanced Search”?

PAM: Different BE3 variants available to introduce STOP codons (for details see Kim et al. Nature Biotechnology 2017)

To express the CRISPR base editors in mammalian cells, use the plasmids created by Komor et al. Nature 2016 and Kim et al. Nature Biotechnology 2017, as indicated below:

PAM: NGG, Base editor: BE3, plasmid on Addgene #73021

PAM: NGA, Base editor: VQR-BE3, plasmid on Addgene #85171

PAM: NGAG, Base editor: EQR-BE3, plasmid on Addgene #85172

PAM: NGCG, Base editor: VRER-BE3, plasmid on Addgene #85173

PAM: NNGRRT, Base editor: SaBE3, plasmid on Addgene #85169

PAM: NNNRRT, Base editor: SaKKH-BE3, plasmid on Addgene #85170

Off-target Prediction: Preliminary prediction of the uniqueness of the guides in the selected genome. This search tolerates 2 mismatches in positions 1 to 8 of the sgRNA.

NMD Prediction (%): Percentage of isoforms predicted to undergo nonsense-mediated decay (NMD) following the introduction of a stop codon by the sgSTOP.

RFLP Assay: sgSTOPs that can be monitored on a gel by RFLP assay (see our paper Billon P., Bryant E. et al. Molecular Cell, 2017). This assay can monitor the efficiency of iSTOP-mediated base editing in cellular populations and clones.

Upstream G: sgSTOPs that do not have a G on the immediate 5'-side of the targeted base. These guides are expected to be more efficient.

How does the RFLP assay work?

The RFLP assay relies on the amplification of a targeted locus by PCR and digestion by a specific restriction enzyme.

Two situations are possible:

1- Restriction site(s) overlap with the targeted base. The transition of the base destroys the restriction site(s), thereby making an edited PCR product refractory to digestion. This can be detected on a gel. Find the list of the restriction site(s) available for a given guide in the “loss” column.

2- Restriction site(s) are created by the transition of the targeted base. The change of the targeted base creates a restriction site that can be monitored by digestion on a gel. Find the list of the restriction site(s) in the “gain” column. It is important to note that monitoring for the gain of a restriction site may underestimate the real base editing efficiency, since several bases can be modified in the window of high activity of BE3, which can prevent the creation of the restriction site.

The restriction sites displayed in the “gain” and “loss” columns have been selected because they are unique within a window of [-50 bp, +50 bp] around the targeted base(s). This allows users to amplify by PCR a minimal region of 100 bp, which can be easily digested and monitored on a gel. However, we encourage users to map the genomic loci around the targeted bases to ensure efficient PCR amplification of the locus of interest.
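The uniqueness criterion can be illustrated with a minimal Python sketch. It assumes that "unique" means the recognition sequence occurs exactly once in the window, and it searches one strand only, which is sufficient for palindromic sites such as PvuII (CAGCTG). The function names and the synthetic test loci are hypothetical and are not taken from the database.

```python
def count_sites(seq, site):
    """Count occurrences of `site` in `seq`, including overlapping ones."""
    count, start = 0, seq.find(site)
    while start != -1:
        count += 1
        start = seq.find(site, start + 1)
    return count

def is_unique_in_window(locus, target_index, site, flank=50):
    """True if `site` occurs exactly once within +/- `flank` bases of the target.

    `target_index` is the 0-based index of the targeted base in `locus`.
    """
    window = locus[max(0, target_index - flank): target_index + flank + 1]
    return count_sites(window, site) == 1

# Synthetic loci (not real genomic sequence): AT repeats around PvuII sites.
one_site = "AT" * 25 + "CAGCTG" + "AT" * 25
two_sites = "AT" * 10 + "CAGCTG" + "AT" * 12 + "CAGCTG" + "AT" * 25

print(is_unique_in_window(one_site, 50, "CAGCTG"))
print(is_unique_in_window(two_sites, 50, "CAGCTG"))
```

A site that fails this check would give additional digestion products in both edited and unedited amplicons, making the gel harder to interpret.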

For testing these detection methods we invite users to try the sgSTOP targeting the gene SPRTN (5'-GGGCCAGCTGGAGGCCGTCG-3'). Base editing with this sgSTOP can be detected by both the gain and loss of a restriction site (see Figure 2C in Billon P., Bryant E. et al. Molecular Cell, 2017). Plasmids that express the SPRTN sgSTOP alone or in combination with an sgRNA targeting ATP1A1, a gene used for co-selection strategies (see Figure 3 in Billon P., Bryant E. et al. Molecular Cell, 2017), will be made available on Addgene.

Amplifying the SPRTN locus and checking by RFLP assay

Briefly, amplify the SPRTN locus using the primers PB571 5’-GCAAAGAGTAAAGGCTGAAACTAGC-3’ and PB572 5’-CACTATCATAAGGCAAATCAGGAAC-3’.

Next, digest the PCR amplicons. A PvuII site will be efficiently lost upon base editing, while an NheI site will be created. Therefore, PCR amplicons containing the edited base will be refractory to PvuII digestion but susceptible to NheI digestion. Conversely, if there is no base editing, the PCR amplicons will be digested by PvuII but not by NheI.

(Note: Digestion with NheI underestimates the efficiency of base editing because a second base can be potentially edited leading to the inactivation of the new NheI restriction site)
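The expected digestion pattern can be reproduced in silico on the SPRTN guide sequence itself. The Python sketch below uses the standard recognition sequences of PvuII (CAGCTG) and NheI (GCTAGC). It assumes that the targeted C is the fifth base of the guide and that the neighbouring C at position 4 is the second base that can be edited; both positions are inferred from the sequence and are not stated in the text. Only the 20 nt guide is examined, not the full amplicon.

```python
SPRTN_GUIDE = "GGGCCAGCTGGAGGCCGTCG"
PVUII = "CAGCTG"  # PvuII recognition sequence
NHEI = "GCTAGC"   # NheI recognition sequence

def convert_c_to_t(seq, positions):
    """Convert the C at each 1-based position in `positions` to T."""
    bases = list(seq)
    for pos in positions:
        if bases[pos - 1] != "C":
            raise ValueError(f"position {pos} is not a C")
        bases[pos - 1] = "T"
    return "".join(bases)

variants = {
    "unedited": SPRTN_GUIDE,
    "C5 edited": convert_c_to_t(SPRTN_GUIDE, [5]),         # assumed target
    "C4 and C5 edited": convert_c_to_t(SPRTN_GUIDE, [4, 5]),  # assumed bystander
}

for name, seq in variants.items():
    print(f"{name:18s} {seq}  PvuII={PVUII in seq}  NheI={NHEI in seq}")
```

Under these assumptions, the single edit removes the PvuII site and creates the NheI site, while editing both Cs removes the PvuII site without creating an NheI site. This is consistent with the note that NheI digestion can underestimate the editing efficiency.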

What makes an efficient guide?

The presence of a G immediately upstream of the targeted base strongly inhibits BE3 activity. Guides expected to be less efficient due to the presence of this G can be removed from the search by selecting the “Upstream G” box in “Advanced Search”. If users want to knock out their gene of interest, sgSTOPs predicted to induce nonsense-mediated decay (NMD) are preferred.
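The same selection can be applied to a downloaded .csv file. The Python sketch below is only an illustration: it assumes that the file headers match the column names listed in "The Table" (the headers in an actual download may differ), and the rows, gene name, and EnzA/EnzB enzyme names are invented placeholders. It keeps guides that have no upstream G, a non-zero NMD prediction, and at least one RFLP enzyme.

```python
import csv
import io

# Invented sample rows; headers are assumed to match the column names above.
SAMPLE_CSV = """Gene Name,No Upstream G,NMD Prediction (%),RFLP Loss,RFLP Gain
GENE_A,TRUE,100,PvuII,NheI
GENE_A,FALSE,100,,
GENE_A,TRUE,0,EnzA|EnzB,
"""

def preferred_guides(rows, min_nmd=1.0):
    """Keep rows with no upstream G, NMD >= min_nmd, and an RFLP enzyme."""
    keep = []
    for row in rows:
        no_upstream_g = row["No Upstream G"].strip().upper() == "TRUE"
        nmd = float(row["NMD Prediction (%)"] or 0)
        # Multiple enzymes are separated by "|"; merge the loss and gain lists.
        combined = row["RFLP Loss"] + "|" + row["RFLP Gain"]
        enzymes = [name for name in combined.split("|") if name]
        if no_upstream_g and nmd >= min_nmd and enzymes:
            keep.append((row["Gene Name"], nmd, enzymes))
    return keep

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
selected = preferred_guides(rows)
print(selected)
```

To use a real download, replace io.StringIO(SAMPLE_CSV) with an open file handle and adjust the header names if needed.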

Why is the NMD prediction shown as a percentage?

The number represents the percentage of isoforms for the given gene that are predicted to be affected by NMD after insertion of a stop codon using the specified sgSTOP.

What does the “Off-target Prediction” column mean?

The “Off-target Prediction” column indicates the number of off-target positions in the genome that match the guide sequence (including the PAM) allowing for up to two mismatches in the first 8 positions of the guide (positions 1-8 of the seed sequence). Guides with an Off-target Prediction greater than 0 can be removed by selecting the “No off-targets” box under "Off-target Prediction" in “Advanced Search”.
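The mismatch rule can be sketched in Python as follows. This is not the search used to build the database. It assumes that positions 1-8 are the first eight bases of the guide as written 5' to 3', that all other positions must match exactly, and that the PAM has already been checked separately. The function name and the mutated test sites are hypothetical.

```python
def is_potential_off_target(guide, site, window=8, max_mismatches=2):
    """True if `site` matches `guide` with at most `max_mismatches`
    mismatches in the first `window` positions and none elsewhere."""
    guide, site = guide.upper(), site.upper()
    if len(guide) != len(site):
        return False
    mismatches = sum(g != s for g, s in zip(guide[:window], site[:window]))
    return mismatches <= max_mismatches and guide[window:] == site[window:]

guide = "TTTTcAGCTTGACACAGGTT"
print(is_potential_off_target(guide, "AATTCAGCTTGACACAGGTT"))  # 2 mismatches in 1-8
print(is_potential_off_target(guide, "AAATCAGCTTGACACAGGTT"))  # 3 mismatches in 1-8
print(is_potential_off_target(guide, "TTTTCAGCTAGACACAGGTT"))  # mismatch at position 10
```

Note that the intended on-target site also satisfies this test, so it has to be excluded when counting off-targets.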

For more advanced off-target prediction, please consider using one of the following tools:

1) CRISPR Design (Feng Zhang, MIT)

2) CRISPR Design tools (The Broad Institute)