A do-it-yourself protocol for simple transcription activator-like effector assembly

Background: TALEs (transcription activator-like effectors) are powerful molecules that have broad applications in genetic and epigenetic manipulations. The simple design of TALEs, coupled with high binding predictability and specificity, is bringing genome engineering power to the standard molecular laboratory. Currently, however, custom TALE assembly is either costly or limited to a few research centers, due to complicated assembly protocols, long set-up time and specific training requirements.
Results: We streamlined a Golden Gate-based method for custom TALE assembly. First, by providing ready-made, quality-controlled monomers, we eliminated the error-prone and time-consuming set-up procedures. Second, we optimized the protocol toward a fast, two-day assembly of custom TALEs, based on four thermocycling reactions. Third, we increased the versatility for diverse downstream applications by providing a series of vector sets to generate both TALENs (TALE nucleases) and TALE-TFs (TALE-transcription factors) under the control of different promoters. Finally, we validated our system by assembling a number of TALENs and TALE-TFs with DNA sequencing confirmation. We further demonstrated that an assembled TALE-TF was able to transactivate a luciferase reporter gene and a TALEN pair was able to cut its target.
Conclusions: We established and validated a do-it-yourself system that enables individual researchers to assemble TALENs and TALE-TFs within 2 days. The simplified TALE assembly combined with multiple choices of vectors will facilitate the broad use of TALE technology.


Background
With the recent emergence of transcription activator-like effector (TALE) technology, gene editing has entered an exciting new era [1][2][3]. While zinc finger nucleases have been well established for the purpose of generating targeted mutations [4], their challenging design and need for experimental optimization have restricted this technology to a few highly specialized laboratories. In contrast, TALEs are simple to design, able to target almost any DNA sequence within the genome, and promise fewer off-target effects than zinc finger nucleases [5][6][7].
Native TALEs are transcription factors used by plant-pathogenic bacteria in the genus Xanthomonas. They activate transcription of host genes by binding to specific sequences in the promoter region of the targeted gene [8,9]. The TALE DNA-binding domain consists of tandem repeats of 33-35 amino acids, followed by a single half repeat of 20 amino acids. The tandem repeats are nearly identical, except for the two amino acids at positions 12 and 13, referred to as the "repeat-variable di-residue" (RVD). Each of the four most common RVDs specifies binding to one of the four nucleotide bases [10,11]. Taking advantage of the simplicity of the TALE coding principle, customized TALEs can be easily designed to allow genetic and epigenetic manipulation. For example, the TALE DNA-binding domain can be combined with either a catalytic DNA endonuclease domain, such as FokI, to allow gene editing, or a transcription factor (TF) domain for gene activation. Indeed, both TALENs and TALE-TFs have been successfully applied to gene editing or activation in a number of species [2][12][13][14][15][16].
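The one-RVD-per-base coding principle can be illustrated with a short lookup. This is only a sketch: the text above does not state which RVD pairs with which base, so the mapping below (NI for A, HD for C, NN for G, NG for T) is taken from the commonly reported TALE code in the wider literature and is an assumption here, as is the helper name `rvds_for_target`.

```python
# Commonly reported RVD code (assumption: not listed in this article).
# NN is generally described as recognizing G, with some tolerance for A.
RVD_FOR_BASE = {"A": "NI", "C": "HD", "G": "NN", "T": "NG"}


def rvds_for_target(sequence: str) -> list[str]:
    """Return one RVD per nucleotide of a DNA target (hypothetical helper)."""
    seq = sequence.upper().replace(" ", "")
    unknown = set(seq) - set(RVD_FOR_BASE)
    if unknown:
        raise ValueError(f"unsupported characters in target: {sorted(unknown)}")
    return [RVD_FOR_BASE[base] for base in seq]


if __name__ == "__main__":
    print(rvds_for_target("ATCG"))
```

Reading the list from left to right gives the order in which repeats would have to be arranged, which is why the repeat order must be preserved during assembly.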
Because of the repetitive nature of the DNA-binding domain, the assembly of customized TALEs by direct synthesis or traditional cloning is expensive and technically challenging. Realizing the potential of TALE technology, a number of approaches for TALE assembly have been devised to allow low to medium throughput [5,17], or high throughput with automation [18,19]. It is worth noting that these methods are based on the Golden Gate procedure, a cloning strategy that makes use of type IIS restriction enzymes, which cut adjacent to, rather than within, their recognition sites, and allows seamless ligation of repetitive sequences in a specific order [5,17]. The Golden Gate cloning technique has proven powerful, but typically relies on large numbers of plasmids or amplified monomers, making this strategy impractical for the individual research lab.
The objective of this research was to establish a system for simple TALE assembly, and to develop vector sets with gene editing or gene activation capability under the control of different promoters, to allow for a variety of downstream applications.

Results and discussion
We established a do-it-yourself system for the fast and simple assembly of TAL repeats into a collection of vectors for TALEN and TALE-TF expression in mammalian cells (Figure 1). This system simplifies the assembly of custom TALEs in three main ways. First, by eliminating the need for time-consuming and error-prone set-up, it reduces the time, effort and cost of assembly. Second, by providing a streamlined protocol, it simplifies an otherwise complicated approach, making it feasible for any standard molecular lab. Finally, a collection of backbone vectors with TALEN and TALE-TF domains under the control of various promoters offers choices for diverse downstream applications in the mammalian system. Using this approach, individual researchers can assemble one or several TALEs into a vector of choice in just 2 days (Figure 2) by using standard molecular techniques and a thermocycler.
The assembly is based on the Golden Gate method, which relies on the ability of type IIS restriction enzymes to cut outside of their recognition site. Type IIS recognition sites arranged in inverse orientation at the 5' and 3' ends of a DNA fragment will be removed upon cleavage, allowing simultaneous restriction and ligation. The continuous re-digestion of unwanted ligation products increases the formation of the desired construct. As type IIS fusion sites can be designed to have different sequences, Golden Gate cloning enables directional and seamless assembly of multiple DNA fragments. As a first step of our do-it-yourself protocol, we assembled monomers into multimers (Figure 3A, B), using a procedure based on restriction, ligation and amplification. Multimers 1 and 2 are designed to be hexamers, but the length of multimer 3 can vary to allow variations in the final length, such as 14-19 bp binding sequences. To remove the incompletely assembled and thus linear ligation products, DNA exonuclease treatment was carried out after the multimer assembly. The correctly assembled circular multimers were subsequently amplified by PCR.
Figure 1 Schematic overview of multimer assembly from the ready-made monomer library into a vector of choice for gene editing or gene activation.
On day 2, gel-purified multimers were assembled into a vector of choice, using a second restriction-ligation-based procedure, followed by bacterial transformation (Figure 3C). Colony PCR was performed to confirm insert size (Figure 3D); typically, 40-90% of colonies displayed the correct insert size. We recommend using two colonies of correct insert size for sequence confirmation (Figure 3E, F); typically, 80-90% of sequences revealed correct assembly. A detailed protocol of this approach is provided in the Materials and Methods section.
We have successfully assembled a number of custom TALEs into various vectors. To examine the functionality of an assembled TALE, we performed a co-transfection experiment in human embryonic kidney 293 cells. Consistent with previous findings [20], the custom-assembled TALE-TF increased luciferase activity 65-fold, validating its functionality (Figure 3G, H).
To confirm the functionality of an assembled TALEN, we performed a transfection experiment in human embryonic kidney 293 cells using a pair of assembled TALENs to target the AAVS1 locus in the human genome (Figure 4A). For TALEN activity, the Surveyor nuclease mutation detection assay provides a functional validation of successful de novo cutting with a particular pair of TALENs. As shown in Figure 4B, we designed primers that flank the centrally located TALEN target and produce an amplicon of 650 bp. PCR amplification using this primer pair produced a single band in both mock- and TALEN-transfected samples. After denaturing, heteroduplex reannealing and Surveyor nuclease treatment, AAVS1 TALEN-transfected cells displayed an extra band of ~320 bp as predicted (Figure 4C-D). These data validate the function of the assembled TALENs using the Surveyor nuclease mutation detection assay.
Figure 3 A) A ready-made library of normalized, quality-controlled monomers provides the building blocks for TALE assembly. B) According to the custom TALE design, monomers are assembled into 2-3 multimers in a restriction and ligation-based procedure, using a thermocycler. In the example shown here, multimers 1 and 2 are hexamers while multimer 3 is a tetramer. C) E. coli transformation with the final assembly product in a vector of choice typically results in tens to hundreds of colonies, while the negative control should have significantly fewer or no colonies. D) Correct assembly of multimers into the vector can be assessed by colony PCR, and further confirmed by sequencing (E). F) As the TALE binding specificity is based on 4 types of RVDs, color-coding the RVD-encoding nucleotides can quickly reveal the correct order of tandem repeats. G) For functional validation, a custom TALE domain was assembled into the EF1-TALE-TF vector. In addition, a dual reporter construct was generated carrying 3 copies of the TALE binding sequence. H) Co-transfection of the custom TALE-TF and the dual reporter construct into HEK293 cells showed strong induction of luciferase activity, confirming the TALE-TF functionality.
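The expected Surveyor cleavage pattern follows from simple arithmetic on the amplicon. The sketch below assumes a single cleavage position measured from one end of the amplicon and ignores the few base pairs gained or lost through indels; the function name `surveyor_fragments` is hypothetical.

```python
def surveyor_fragments(amplicon_bp: int, cut_offset_bp: int) -> tuple[int, int]:
    """Return the two fragment sizes (largest first) produced by one cleavage.

    Assumption: the mismatch is cleaved at a single position, cut_offset_bp
    from one end of the amplicon.
    """
    if not 0 < cut_offset_bp < amplicon_bp:
        raise ValueError("cut position must lie inside the amplicon")
    pieces = (cut_offset_bp, amplicon_bp - cut_offset_bp)
    return (max(pieces), min(pieces))


if __name__ == "__main__":
    # A 650 bp amplicon with a centrally located target
    print(surveyor_fragments(650, 325))
```

For a centrally located target both fragments have nearly the same size and co-migrate, which is consistent with a single extra band in the ~320 bp range rather than two distinct bands.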

Conclusions
We established and validated a do-it-yourself strategy that enables researchers to assemble TALE-TFs and TALENs in just 2 days. The simplicity of this approach and its minimal hands-on time make gene editing an affordable and practical choice for the standard molecular lab. The choice of several useful vector sets should further extend TALE technology to a variety of applications.

Experimental design
Free online tools such as TALEN Targeter [6] and idTALE [21] are available to design TALENs and TALE-TFs that are specific and have a low risk of off-target effects. TALE-TFs require only one effector protein, while TALENs require the design of protein pairs, which bind two DNA sites, usually spaced 15-30 bp apart to allow for optimal FokI dimerization and cutting [5]. The length of the DNA-binding sequences may vary, typically ranging from 14 to 20 bp. In humans, 20 bp may offer high specificity, considering the genome size. It is worth noting that longer domains (18-20 bp) may decrease cell toxicity by reducing the risk of off-target effects [19].
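The design constraints above (binding sites of 14-20 bp, spacer of 15-30 bp) can be checked with a few lines of code before ordering or assembling anything. This is a minimal sketch, not a substitute for the design tools cited; the function name, the 0-based top-strand coordinate convention and the default ranges as parameters are assumptions made for illustration.

```python
def check_talen_pair(left_start, left_len, right_start, right_len,
                     spacer_range=(15, 30), site_range=(14, 20)):
    """Return (spacer_bp, problems) for a candidate TALEN pair.

    Coordinates are 0-based positions on the top strand (assumption);
    right_start is where the footprint of the right TALEN begins.
    """
    problems = []
    for name, length in (("left", left_len), ("right", right_len)):
        if not site_range[0] <= length <= site_range[1]:
            problems.append(
                f"{name} site is {length} bp, outside "
                f"{site_range[0]}-{site_range[1]} bp")
    # The spacer is the gap between the end of the left site
    # and the start of the right site.
    spacer = right_start - (left_start + left_len)
    if not spacer_range[0] <= spacer <= spacer_range[1]:
        problems.append(
            f"spacer is {spacer} bp, outside "
            f"{spacer_range[0]}-{spacer_range[1]} bp")
    return spacer, problems


if __name__ == "__main__":
    print(check_talen_pair(100, 18, 134, 18))
```

An empty problem list only means the pair satisfies the two length rules; specificity and off-target risk still need to be assessed with a dedicated design tool.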
The appropriate vector can be chosen from a collection of TALEN and TALE-TF backbone vectors, listed in Table 1.

Choosing monomers (5 min).
Divide each target sequence of 14-20 nucleotides into multimers, excluding the first (5') T and the last (3') nucleotide, which is vector-encoded. The first two multimers should be hexamers; the last multimer is variable in size and contains however many monomers remain (excluding the vector-encoded last nucleotide).

Example T | A T C G C C | T C T A G C | C A C T* | G
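The division rule can be sketched in code as follows. It assumes that the 14-20 nucleotide target length includes both the 5' T and the vector-encoded last nucleotide, which is how the example above works out (1 + 6 + 6 + 4 + 1 = 18); the function name `split_target` and the returned dictionary layout are hypothetical.

```python
def split_target(target: str) -> dict:
    """Split a TALE target into the 5' T, up to three multimers, and the
    vector-encoded last nucleotide (illustrative helper)."""
    seq = target.upper().replace(" ", "").replace("|", "")
    if set(seq) - set("ACGT"):
        raise ValueError("target may only contain A, C, G and T")
    if not 14 <= len(seq) <= 20:
        raise ValueError("target must be 14-20 nucleotides long")
    if seq[0] != "T":
        raise ValueError("target must start with a 5' T")

    core = seq[1:-1]                      # monomer-encoded positions
    multimers = [core[0:6], core[6:12]]   # multimers 1 and 2 are hexamers
    if core[12:]:                         # multimer 3 holds the remainder
        multimers.append(core[12:])
    return {
        "five_prime": seq[0],
        "multimers": multimers,
        "vector_encoded": seq[-1],
    }


if __name__ == "__main__":
    print(split_target("T | A T C G C C | T C T A G C | C A C T | G"))
```

For the example target, the sketch yields the hexamers ATCGCC and TCTAGC, the tetramer CACT, and G as the vector-encoded last nucleotide.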
Take the corresponding color-coded monomers from the EZ-TAL™ kit (Figure 1). A special set of "end" monomers can be used to adjust for target sizes shorter than 20 nucleotides (14-19).
Table 2. Add 4 μL of mix to each multimer for a total of 10 μL. Place each multimer tube in a thermocycler and use cycling conditions specified in Table 3 for ~2.5 hours.
3. Exonuclease treatment (bench time 10 min, total time 1.2 hours)
To degrade any noncircular ligation products, add components specified in Table 4. Add 5 μL of master mix to each multimer tube, for a total of 15 μL. Place each tube in a thermocycler and use cycling conditions specified in Table 5 for 1 hour.
4. Multimer amplification (bench time 30 min, total time 1.3 hours)
To amplify each multimer, combine components specified in Table 6. Add 49 μL of mix to 1 μL of each multimer template. Perform multimer PCR using conditions specified in Table 7.

Day 2
5. Gel electrophoresis and purification of multimers (bench time 1 hour, total time 2 hours)
Run the entire 50 μL of each amplified multimer in 1 large well or 2 medium-sized wells on a 2% agarose gel; include a molecular weight marker. Excise multimer bands of the correct size. To estimate the correct band size, multiply the number of assembled monomers by 103 bp and add 20 bp. For example, a hexamer will run at about 640 bp. When excising the bands, avoid cross-contamination between multimers that are intended for assembly of different TALEs. While bright multimer bands in the range of 20 ng/μL and above are preferred, lower concentrations (10 ng/μL range) can be assembled, though with lower efficiency. Purify the bands using an appropriate gel purification kit, such as the QIAquick Gel Extraction Kit (Qiagen, Valencia, CA), following the manufacturer's instructions.
6. Assembly of multimers into vector (bench time 30 min, total time 4.5 hours)
Choose an EZ-TAL™ backbone vector according to the intended downstream application (see Table 1 for a list of available backbone vectors). The vector should contain the half-repeat that specifies the last (3') nucleotide of the target sequence. Combine the corresponding multimers, vector, and components of the EZ-TAL™ kit as specified in Table 8 in a PCR tube for a total of 10 μL. Include a negative control as shown in Table 8. If the concentration of multimers is below the recommended range (below 20 ng/μL), reduce the vector concentration accordingly. Place tubes in a thermocycler and use the cycling conditions specified in Table 9.
7. Transformation (bench time 30 min, total time 1.5 hours)
Transform competent E. coli with the assembly product, following the manufacturer's instructions. Plate transformed E. coli on LB with carbenicillin (100 μg/mL); alternatively, ampicillin (100 μg/mL) can be used. Expect tens to hundreds of colonies on plates transformed with the assembled product, while the negative control plates should display significantly fewer or no colonies (Figure 3C).
Table 10. Add 19 μL of mix to 1 μL of diluted template (colony suspension or negative control in 100 μL H2O) for a total of 20 μL. Place tubes in a thermocycler, and run using cycling conditions as specified in Table 11.

TALE assembly confirmation
Agarose gel electrophoresis of colony PCR (bench time 30 min, total time 1. hours)
Run all 20 μL of the colony-PCR samples and negative control on a 1% agarose gel; include a molecular weight marker. The expected size of the band can be calculated as the number of inserted monomers × 103 bp plus 250 bp.
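The two size rules given in this protocol (multimer bands at gel purification, and colony-PCR bands here) can be collected in a small helper for quick reference at the bench. The constants come directly from the text; the function and constant names are hypothetical.

```python
MONOMER_BP = 103            # size contributed by each monomer
MULTIMER_EXTRA_BP = 20      # added to an amplified multimer (step 5)
COLONY_PCR_EXTRA_BP = 250   # added to a colony-PCR product


def multimer_band_bp(n_monomers: int) -> int:
    """Expected size of an amplified multimer on the gel."""
    if n_monomers < 1:
        raise ValueError("a multimer needs at least one monomer")
    return n_monomers * MONOMER_BP + MULTIMER_EXTRA_BP


def colony_pcr_band_bp(n_inserted_monomers: int) -> int:
    """Expected size of the colony-PCR product for a full insert."""
    if n_inserted_monomers < 1:
        raise ValueError("an insert needs at least one monomer")
    return n_inserted_monomers * MONOMER_BP + COLONY_PCR_EXTRA_BP


if __name__ == "__main__":
    print(multimer_band_bp(6))      # hexamer
    print(colony_pcr_band_bp(16))   # two hexamers plus a tetramer
```

A hexamer evaluates to 638 bp, matching the "about 640 bp" quoted in step 5, and an insert of two hexamers plus a tetramer (16 monomers) gives a colony-PCR product of 1898 bp.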

Trouble-shooting advice
Poor multimer amplification
A faint or missing multimer band in combination with a smear in the high molecular-weight range may indicate failed or impaired exonuclease treatment. Ensure correct storage conditions of the exonuclease enzyme and ATP, and make sure that all components are added in the correct amounts. Do not skip or shorten the exonuclease incubation.

Colony PCR does not show bands of correct size
To assess the quality and quantity of the gel-purified multimers, run an aliquot on an agarose gel. Multimers at 10 ng/μL or less can still be assembled, but with reduced efficiency. If the concentration of multimers is lower than recommended (below 20 ng/μL), reduce the vector concentration in the assembly step (step 6) accordingly. If necessary, redo any multimer of low concentration and/or quality.

Sequence error
About 10-20% of sequenced clones may display a sequence error, such as an incorrect repeat or a frame shift. In this case, we recommend sequencing additional clones. If this does not reveal an error-free sequence, we recommend redoing the multimer that contains the error and re-assembling it into the vector. Note that the half-repeat RVD in the EF1-TALE-TF-NG vector is encoded by AAT GGC instead of AAC GGA, and that of the TALE-TF-NN vector is encoded by AAT AAC instead of AAC AAC (see Table 1, footnote).

Competing interests
BL, NG, TC and JH declare a financial competing interest as SBI (System Biosciences) employees. CUS declares a competing interest as a collaborator with SBI.