Just as qPCR and sequencing studies verified the integrity of the encoding DNA, we next sought to demonstrate that the multiple exposures to aqueous-phase encoding conditions did not compromise either purity or yield of the synthesized compounds. LC-MS analysis of 1–8 cleaved from DESPS samples revealed a predominant product peak by absorbance detection of the coumarin chromophore (λ = 330 nm) installed in the linker (R, the top branch of A), and mass analysis of each predominant peak yielded parent ion m/z in agreement with the predicted mass of the compound. Trace side products included unreacted resin linker (a), hydrolyzed haloacid (b, d), and truncations (c, e, f). These side products appear with equal abundance in the control solid-phase syntheses (SPS+) of each compound that were performed (omitting intervening aqueous encoding conditions). DESPS of 1–5 yielded average compound purity of 48%. DESPS of 6–8 yielded average compound purity of 67%. Mass spectrometric analysis of HPLC-purified 1–8 using ETD-based fragmentation (MS/MS-ETD) yielded z ion series sequencing data that agreed with the proposed oligomer sequence, and high-resolution MS analysis of 1–8 agreed within 3.2 ppm of the predicted exact mass.
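The ppm agreement quoted above follows from the standard relative mass error formula. The following is a minimal sketch of that calculation; the function names `ppm_error` and `within_tolerance` and the m/z values are hypothetical and are not taken from the study.

```python
def ppm_error(observed_mz: float, predicted_mz: float) -> float:
    """Signed relative mass error in parts per million."""
    if predicted_mz <= 0:
        raise ValueError("predicted m/z must be positive")
    return (observed_mz - predicted_mz) / predicted_mz * 1e6


def within_tolerance(observed_mz: float, predicted_mz: float,
                     tol_ppm: float = 3.2) -> bool:
    """True when the absolute ppm error does not exceed tol_ppm."""
    return abs(ppm_error(observed_mz, predicted_mz)) <= tol_ppm


# Hypothetical observed and predicted values, for illustration only.
print(round(ppm_error(500.0010, 500.0000), 2))
print(within_tolerance(500.0010, 500.0000))
```

The default tolerance of 3.2 ppm simply mirrors the largest deviation reported in the text; any other acceptance threshold could be passed instead.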
Solid-phase combinatorial synthesis and DNA-encoded libraries (DELs) represent two of the most powerful strategies for generating large nonbiological molecular libraries. To date, DELs containing diversity ranging from 10⁵ to 10¹⁰ are possible by uniquely encoding single compounds with DNA. The solubility and information content of the DNA are the only fundamental limiting factors for library size and complexity. Furthermore, library screening throughput via solution-phase binding assays is equally impressive because it is a selection, which interrogates all members of the library simultaneously. This approach has recently gained incredible momentum as solution-phase encoded libraries have yielded ligands of, for example, sirtuins, tankyrase 1, and PAD4. However, the reactions used to generate DELs must solubilize DNA since the encoding DNA is the compound carrier, and HPLC purification accompanies each encoded synthesis step (which does not guarantee compound fidelity). Solid-phase synthesis, on the other hand, is compatible with a broad range of solvents; there are numerous solid-phase chemical reactions available, sample purification is trivial (washing), and each bead harbors numerous copies of the compound for synthesis quality control. However, solid-phase strategies rely almost exclusively on mass analysis for structure elucidation, limiting diversity (as discussed above) and analytical throughput.
We set out to combine the best attributes of OBOC solid-phase combinatorial synthesis and nucleic acid encoding in order to overcome the disadvantages of each. We describe here DNA-encoded solid-phase synthesis (DESPS), which integrates solid-phase chemical synthesis of popular combinatorial library scaffold types (e.g., peptoids) as well as those derived from more specialized submonomers displaying stereochemical and regiochemical diversity. Accompanying the DESPS approach is a rationally designed encoding language that addresses constraints based on oligonucleotide secondary structure thermodynamics and compatibility with next-generation sequencing read lengths. Error correction informatics make the language almost resistant to the typical single-base errors of DNA sequencing analysis, which we demonstrate using PCR products obtained from single synthesis resin particles displaying chimeric oligomers that would otherwise prove to be analytically intractable by mass spectrometry. We finally apply DESPS to OBOC library synthesis, incorporating scaffold, regiochemical and stereochemical diversification, and we present a mixed-scale strategy that allows library quality control.
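To illustrate how an encoding language can tolerate single-base sequencing errors, the sketch below uses generic minimum-distance decoding: a read is assigned to the closest codeword by Hamming distance, and rejected if it is too far from all of them. This is an illustration of the general principle only, not the encoding language described here; the codebook, the monomer labels, and the function names are all hypothetical.

```python
from typing import Iterable, Optional

# Hypothetical codebook for illustration; NOT the encoding language of the text.
CODEBOOK = {
    "ACGTA": "monomer_1",
    "CATGC": "monomer_2",
    "GTACG": "monomer_3",
    "TGCAT": "monomer_4",
}


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_distance(codewords: Iterable[str]) -> int:
    """Smallest pairwise Hamming distance within a set of codewords."""
    words = list(codewords)
    return min(hamming(a, b)
               for i, a in enumerate(words) for b in words[i + 1:])


def decode(read: str, max_errors: int = 1) -> Optional[str]:
    """Return the label of the nearest codeword, or None if the read
    differs from every codeword by more than max_errors substitutions."""
    best = min(CODEBOOK, key=lambda w: hamming(read, w))
    return CODEBOOK[best] if hamming(read, best) <= max_errors else None


# A minimum distance of at least 3 guarantees that any single substitution
# still leaves the read closest to its original codeword.
assert min_distance(CODEBOOK) >= 3
print(decode("ACGTT"))  # one substitution away from "ACGTA"
print(decode("AAAAA"))  # too far from every codeword
```

This sketch handles substitutions only; insertions and deletions would shift the read and need a different distance measure.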
Of the three phosphorylation methods described above, only the phosphotriester approach (Fig. 1a) is really suitable for the synthesis of DNA sequences in solution. This method, which was developed largely in the 1970s, is very versatile and is particularly suitable for the coupling of oligonucleotide blocks (i.e., the addition of two or more nucleotide residues at a time) as well as for stepwise synthesis. Phosphotriester block coupling was a key feature of the original synthesis of the human insulin gene (12). Although the methodology has been refined (13) since then, the development of automated solid-phase synthesis (see above) in the 1980s provided a much faster and less labor-intensive method for the preparation of the very small (usually milligram or even smaller) quantities of synthetic DNA sequences that are generally required in molecular biology. Solution-phase synthesis is much more laborious in that it is normally advisable to purify the products by chromatography after each coupling step. Although such purification processes need not necessarily amount to much more than filtration through a bed of silica gel, they are time consuming. Furthermore, solution-phase synthesis has not yet been automated. It is, nevertheless, not at all unlikely that solution-phase synthesis will become the method of choice if really large (i.e., multikilogram to tonne) quantities of moderately sized (containing ca. 20 nucleotide residues) DNA sequences or their analogs are required in antisense or antigene chemotherapy. Automated solid-phase synthesis has recently been scaled up to the multigram level (14) in order to provide sufficient material for clinical trials. However, if such clinical trials are successful and very much larger quantities of pure DNA sequences and their analogs are required for drug purposes, further substantial scaling-up of solid-phase synthesis may not prove to be a practical proposition.
It is quite likely that solution-phase synthesis, or perhaps a combination of solution-phase and solid-phase synthesis, might lend itself much more readily to scaling-up. The phosphotriester approach has the further advantage that the fully protected intermediates obtained are soluble in organic solvents and may, therefore, be purified by conventional chromatographic techniques. After all of the protecting groups have been removed, the unprotected DNA sequences obtained may, if necessary, be further purified in the same way as material that has been prepared on a solid support.
The mixed-scale combinatorial library synthesis introduces numerous advantages in miniaturizing the scale of both library synthesis and screening. Reduced reagent consumption at this scale not only enables usage of more expensive or designer monomers, such as the chiral chloropentenoic acids of this study, but also automated high-throughput flow cytometry-based screening. Oligonucleotide-encoded peptide synthesis reached this degree of miniaturization more than two decades ago, but this line of research has remained dormant, likely because parallel oligonucleotide synthesis has poor step economy relative to information yield (3 synthesis steps yield 2 bits, whereas one ligation step yields 12 bits) and introduces numerous aggressive reaction conditions and two orthogonal protection strategies. DESPS provides a more approachable strategy for accessing the benefits of DNA-based encoding and, combined with 10-μm-scale library preparation, raises the possibility of functional screening (e.g., in microfluidic droplets) by virtue of the solid-phase synthesis bead colocalizing many copies of one compound library member. Direct functional screening would provide a powerful alternative to solution-phase library competition binding as a mode of discovery.
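The step-economy comparison above can be made explicit with a short calculation. The sketch below assumes that the 2 bits per stepwise addition correspond to one choice among four bases (log2 4 = 2), which is our reading rather than something stated in the text; the helper names are hypothetical.

```python
import math


def bits_per_choice(n_options: int) -> float:
    """Information content, in bits, of one choice among n_options."""
    return math.log2(n_options)


def bits_per_step(bits: float, steps: int) -> float:
    """Information yield normalized by the number of chemical steps."""
    return bits / steps


# Stepwise oligonucleotide synthesis: one base out of four (2 bits),
# spread over the 3 synthesis steps quoted in the text.
stepwise = bits_per_step(bits_per_choice(4), 3)

# Ligation: 12 bits delivered in a single step, i.e. 2**12 distinct codes.
ligation = bits_per_step(12, 1)
codes_per_ligation = 2 ** 12

print(f"stepwise: {stepwise:.2f} bits/step")
print(f"ligation: {ligation:.0f} bits/step ({codes_per_ligation} codes)")
print(f"ratio:    {ligation / stepwise:.0f}x")
```

Under these assumptions a single ligation carries as much information as eighteen stepwise synthesis steps.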
Automated solid-phase synthesis by the phosphoramidite approach has also been used successfully (11) in the preparation of DNA sequences in which the base residues, sugar residues, and internucleotide linkages are modified. DNA sequences with attached fluorescent and other reporter groups have also been prepared by solid-phase synthesis. Such modified DNA sequences have found numerous important applications in molecular biology.
Relatively high molecular weight DNA sequences have been prepared successfully by the phosphotriester approach in solution by following essentially the procedure indicated in outline in Figure 1a. However, solution-phase synthesis is relatively laborious in that chromatographic purification steps are usually necessary after each coupling step. Nevertheless, if a very large quantity of a specific sequence is required (see text below), solution-phase synthesis may very well prove to be the method of choice. If, on the other hand, relatively small (i.e., milligram to gram) quantities of material are required for biological or biophysical studies, there is little doubt that solid-phase synthesis is to be preferred. While all three of the above phosphorylation methods (Fig. 1) have been used in solid-phase synthesis, the phosphoramidite approach (9) has emerged as the method of choice. This is mainly because its use leads to high coupling efficiencies and no significant side reactions. Furthermore, most commercial automatic synthesizers have been designed specifically to accommodate phosphoramidite chemistry. The main advantages of solid-phase synthesis, particularly by the phosphoramidite approach, are: (1) it is very rapid, and a DNA sequence containing, say, 50 nucleotide residues can easily be assembled and unblocked within one day; (2) only one purification step is required at the end of a synthesis, as the growing DNA sequence is attached to a solid support (such as controlled pore glass [CPG] or polystyrene) and the excesses of all reagents are washed away; (3) all chemical reactions can be made to proceed in very high yield by using large excesses of reagents; and (4) the whole process may be fully automated in a DNA synthesizer. Solid-phase DNA synthesis has been developed to such an extent that the whole process can be carried out by a competent technician with no specialist knowledge of nucleotide chemistry.
Automatic synthesizers, some of which are capable of assembling several different specific DNA sequences simultaneously, are readily available, and all the necessary building blocks [particularly phosphoramidites 17] and other reagents and solvents may be purchased in containers that are designed to be attached directly to the synthesizer.
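The importance of high coupling efficiency, noted above, becomes clear when per-step yields are compounded over a whole sequence. The following is a minimal sketch under simplifying assumptions that are ours, not the source's: the first residue is taken to be pre-attached to the support, so an n-residue sequence needs n - 1 couplings; every coupling has the same efficiency; and losses during deprotection, cleavage, and purification are ignored. The function name and the efficiency values are hypothetical.

```python
def full_length_yield(n_residues: int, coupling_efficiency: float) -> float:
    """Fraction of chains reaching full length after n_residues - 1 couplings,
    assuming a constant per-step coupling efficiency."""
    if n_residues < 1:
        raise ValueError("n_residues must be at least 1")
    if not 0.0 <= coupling_efficiency <= 1.0:
        raise ValueError("coupling_efficiency must lie between 0 and 1")
    return coupling_efficiency ** (n_residues - 1)


# Illustrative efficiencies for a 50-residue sequence (values are assumptions).
for efficiency in (0.98, 0.99, 0.995):
    fraction = full_length_yield(50, efficiency)
    print(f"{efficiency:.3f} per step -> {fraction:.2f} full-length")
```

Because the yield falls off geometrically with length, even a one percent change in per-step efficiency has a large effect on the amount of full-length product for a 50-residue sequence.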