UMD Undergraduate Research Journal

Efforts Toward Manipulating SMYD Proteins for Bio-orthogonal Profiling of Protein Methylation

Rhiannon Aguilar, Jamie McBean, Joshua Linscott, and Minkui Luo

Abstract

The SET- and MYND-domain-containing proteins, or SMYDs, are a family of protein lysine methyltransferases (PKMTs) that use the small molecule S-Adenosyl-L-methionine (SAM) as a cofactor to methylate various histone and non-histone targets. The Luo Laboratory has recently synthesized several synthetic analogues of SAM that can be utilized by engineered PKMTs to add a tag containing a terminal alkyne group, instead of a methyl group, on their substrates. This allows the modified proteins to react with fluorescent dyes via click chemistry for their detection. The goal of this research is to use these cofactors to profile the substrates of the SMYD proteins, a key step toward full elucidation of SMYDs' biological roles. So far, SMYD1, SMYD2, SMYD3, and SMYD5 have been cloned from the bacterial pET28-MHL vector into the mammalian pcDNA3 vector. Five single mutants of the mammalian vector clone of SMYD3 were made. Each mutation was strategically placed to alter the size and shape of SMYD3's cofactor-binding pocket. These five mutants have been transfected into HEK293T cells. Western blotting was used to confirm transfection success and whole-cell lysates have been screened with four synthetic clickable cofactors. Preliminary results indicate potential success with one of the cofactors across several of the mutants. Repeat experiments will be done to confirm results, and new mutagenesis sites will be explored in the future across all five SMYD proteins, beginning with SMYD2.

Figure 1: SMYD Methylation Reaction
The SMYD proteins use the cofactor S-Adenosyl-L-Methionine (SAM) to methylate their peptide substrates. The byproduct of this reaction is S-Adenosyl-Homocysteine (SAH).

Introduction

Epigenetic changes made to gene expression are not directly caused by the DNA's coding nucleotide sequence. Such changes affect a wide spectrum of genes and can be inherited through several generations. The various mechanisms behind epigenetics include chemical reactions involving the DNA backbone or histone proteins, such as methylation, acetylation, deacetylation, and demethylation. Other mechanisms, such as those involving RNA, are currently being studied [1]. Several classes of enzymes play key roles in mechanisms affecting gene expression, such as the DNA Methyltransferases (DNMTs), Protein Arginine Methyltransferases (PRMTs), Protein Lysine Methyltransferases (PKMTs), and Histone Deacetylases (HDACs).

The PKMTs use a small-molecule cofactor, S-Adenosyl-L-Methionine (SAM), as the methyl donor [2] (Figure 1). Their targets include histone sites [3], each with an activating or deactivating effect on gene transcription. The histone substrates of many of the PKMTs are well-documented [3]. Additionally, non-histone targets of PKMTs are currently being studied. Several members of the SMYD family of PKMTs, for example, are known to methylate histone sites such as Histone 3 Lysine 4 (H3K4) and H3K36. Recent research has revealed, however, that SMYD2 also methylates p53 and the retinoblastoma tumor suppressor protein [4], both of which act in tumor-suppressing pathways. These recent discoveries demonstrate how little is known about the SMYDs' substrate profiles and how much remains to be studied.

On a larger scale, the SMYDs have been shown to have important biological roles. SMYD1 has been linked to cardiac development; fetal mortality is common in mice with the SMYD1 gene knockout [5]. Overexpression of SMYD3 has been linked to the outgrowth of breast [6] and colorectal [7] cancer cells, and its suppression can inhibit the growth of these same cells. Relevance to vital organ development and cancer cell growth makes the SMYD family an interesting target of study. There is potentially a whole host of unknown substrates for this enzyme family, any of which may give a vital clue to the mechanisms behind the SMYDs' larger biological roles, including those causing disease. It is these substrates that are particularly interesting research subjects.

Figure 2: Methodology of BPPM
In nature, a native enzyme reacts with native SAM to produce a methylated protein substrate and SAH as a byproduct. In BPPM, the enzyme is mutated to accept a synthetic SAM analogue, producing a product labeled with the synthetic R-group of the analogue.

Substrate profiling of protein methyltransferases (PMTs) has been explored in recent research using synthetic SAM analogues, which label substrates with an easily detectable group donated by the cofactor analogue. Initial efforts used analogues that could be accepted by native enzymes in the cofactor-binding pocket intended for native SAM. However, this method proved to be useful for only a select few of the numerous PMTs [8]. Bio-orthogonal Profiling of Protein Methylation, or BPPM (Figure 2), has been used successfully by the Luo Laboratory as an alternative method of substrate profiling. This method is considered bio-orthogonal because it does not interfere with the natural processes of PKMTs: the SAM analogue can only be taken up by the engineered enzyme, so the native enzyme is not affected. If the cofactor-binding pocket of the targeted PKMT, in this case a SMYD protein, is mutated in the correct way, the enzyme can take up a synthetic SAM analogue, whereas the native enzyme accepts only SAM. The Luo Laboratory has synthesized several SAM analogues, notably those containing a terminal alkyne group such as 4-propargyloxy-but-2-enyl (Pob)-SAM [9] and (E)-hex-2-en-5-ynyl (Hey)-SAM [2]. These SAM analogues have been used recently to profile the substrates of the methyltransferases PRMT1 and G9a, respectively, using targeted mutations to their cofactor-binding pockets [2], [9]. The terminal alkyne group present on both synthetic cofactors is able to undergo high-efficiency click chemistry with azide dyes, allowing for analysis of labeled substrate proteins.

Given this method's success in the PKMT family, the goal of our research is to extend it to the SMYDs. Beginning with SMYD2 and SMYD3, the cofactor-binding pockets of the SMYDs will be engineered to accept synthetic SAM analogues, facilitating labeling of their protein substrates and allowing them to be profiled.

Materials and Methods

Molecular Cloning: The SMYD DNA sequences were originally contained in the pET28-MHL bacterial plasmid vector. The inserts were amplified using the Qiagen® HotStar® Hi-Fidelity Polymerase Chain Reaction Kit. To prepare for the SMYD insert, the mammalian plasmid vector pcDNA3, containing an unwanted insert, was obtained and digested with appropriate restriction enzymes (Hi-Fidelity EcoRI, NheI and/or NotI). The PCR product of each SMYD sequence was digested with the same restriction enzymes, and ligation was then carried out using T4 DNA Ligase. All enzymes were obtained from New England Biolabs, Inc®. A 1:6 ratio of insert:vector was used for ligation, and ligation products were transformed into TOP10 E. coli cells. Successful clones were identified by sequencing and transformed into DH5α E. coli cells, from which DNA was harvested using the Qiagen® Hi-Speed® Maxiprep kit®.
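The insert:vector ratio used for ligation can be converted into DNA masses with a short calculation. The sketch below assumes that the 1:6 ratio is a molar ratio, which the text does not state, and it uses hypothetical fragment lengths and a hypothetical vector mass. The function name insert_mass_ng and every number in the example are illustrative assumptions, not values from these experiments.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, insert_per_vector):
    """Mass of insert (ng) that gives the requested insert:vector molar ratio.

    For double-stranded DNA, moles are proportional to mass / length, so
    insert_ng = vector_ng * (insert_bp / vector_bp) * (mol insert / mol vector).
    """
    if min(vector_ng, vector_bp, insert_bp, insert_per_vector) <= 0:
        raise ValueError("all arguments must be positive")
    return vector_ng * (insert_bp / vector_bp) * insert_per_vector


# Hypothetical example values, chosen only for illustration.
VECTOR_NG = 60.0   # mass of digested vector in the ligation
VECTOR_BP = 5400   # assumed length of the vector backbone
INSERT_BP = 1300   # assumed length of a SMYD coding insert
RATIO = 1 / 6      # 1:6 insert:vector, read here as a molar ratio

mass = insert_mass_ng(VECTOR_NG, VECTOR_BP, INSERT_BP, RATIO)
print(f"insert needed: {mass:.2f} ng")
```

If the ratio was instead set by mass, the length correction drops out and the insert mass is simply one sixth of the vector mass.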

Mutagenesis: Mutagenesis of Y239, I237, and N181 was performed using the Qiagen® QuikChange® Mutagenesis Kit, and the products were transformed into XL10 Gold E. coli cells. Successful mutants, as determined by sequencing, were again transformed into DH5α cells, which amplified the DNA so it could be collected using the Qiagen® Maxiprep Kit®.

Figure 3: Experimental Outline
The SMYD protein insert, originally contained in the pET28-MHL vector, was cloned into the pcDNA3 vector. The cloned vector was transfected into mammalian cells, which produced the protein. Finally, the click reaction was carried out, and the result was tested using in-gel fluorescence to check for targeted protein labeling.

Transfection and Cell Lysates: HEK293T cells were transfected with 4 µg of DNA in 250 µL of Opti-MEM and 8 µL of lipofectamine. The cells were allowed to grow for 8 hours before being treated with 15 µM AdOx. 48 hours after the AdOx treatment, they were pelleted and frozen. The frozen cells were thawed and lysed using sonication in a RIPA buffer containing Roche protease inhibitor and 1 mM tris(2-carboxyethyl)phosphine (TCEP). Protein concentrations were measured using the Bradford Assay, and cofactor incubation was carried out overnight in 50 mM 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid (HEPES) buffer (pH 8.5) with 0.0005% Bovine Serum Albumin (BSA), 0.005% Tween® 20, and 50 nM 5'-Methylthioadenosine/S-adenosylhomocysteine nucleosidase (MTAN). Click reactions were performed using 100 µM tetramethylrhodamine azide, 1 mM CuSO4, 2 mM TCEP, and 100 µM tris[(1-benzyl-1H-1,2,3-triazol-4-yl)methyl]amine (TBTA). After washing with methanol, water, and chloroform, the resulting proteins were analyzed using in-gel fluorescence. The fluorescence assay picture was taken at a wavelength of 600 nm on a GE Healthcare® Typhoon Trio® Variable Mode Imager. A Western blot was run to confirm successful transfection.
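The final concentrations listed for the click reaction can be translated into pipetting volumes with the dilution relation C1 x V1 = C2 x V2. The sketch below takes the four final concentrations from the text; the stock concentrations, the 50 µL reaction volume, and the function name stock_volumes_ul are assumptions made only for illustration, since the text does not report them.

```python
# Final concentrations from the text, in micromolar (uM).
FINAL_UM = {
    "tetramethylrhodamine azide": 100.0,
    "CuSO4": 1000.0,
    "TCEP": 2000.0,
    "TBTA": 100.0,
}

# Hypothetical stock concentrations in uM; not stated in the text.
STOCK_UM = {
    "tetramethylrhodamine azide": 5000.0,
    "CuSO4": 50000.0,
    "TCEP": 50000.0,
    "TBTA": 5000.0,
}


def stock_volumes_ul(final_um, stock_um, total_ul):
    """Volume of each stock (uL) from C1*V1 = C2*V2, plus the volume left over."""
    volumes = {}
    for reagent, final in final_um.items():
        stock = stock_um[reagent]
        if stock <= final:
            raise ValueError(f"stock of {reagent} must exceed its final concentration")
        volumes[reagent] = final * total_ul / stock
    remainder = total_ul - sum(volumes.values())
    if remainder < 0:
        raise ValueError("stocks are too dilute for this reaction volume")
    # Whatever is not reagent stock is available for lysate and buffer.
    volumes["lysate and buffer"] = remainder
    return volumes


# Assumed 50 uL reaction volume, for illustration only.
for reagent, volume in stock_volumes_ul(FINAL_UM, STOCK_UM, total_ul=50.0).items():
    print(f"{reagent}: {volume:.1f} uL")
```

Keeping the remaining volume as an explicit entry makes it easy to see whether the chosen stocks leave enough room for the lysate.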

Results

The SMYD DNA sequences contained in the pET28-MHL vector can be used to express the SMYD proteins in E. coli bacteria for in vitro experiments. However, a combination of factors made this an unfavorable system for testing the SMYD proteins. First, previous experiments revealed a markedly low activity level of the SMYDs in vitro, which makes experimentation difficult. In addition, in vitro experiments can give misleading results, as they do not fully represent the natural environment in which the SMYDs, or any proteins, are active. As an alternative, we decided to express the SMYD proteins in mammalian HEK293T cells by transfection. Before transfection, it was necessary to clone each SMYD insert from the bacterial vector pET28-MHL into the mammalian vector pcDNA3, which could be used for transfection. Successful clones of SMYD1, SMYD2, SMYD3, and SMYD5 were obtained, and the plasmid DNA stocks were made ready for use.

Figure 4: The cofactor-binding pocket of SMYD3
Targeted residues N181, Y239, and I237 are labeled.

Of the four cloned SMYD proteins, SMYD3 was chosen for the first experiments, as its well-documented association with cancer cell proliferation makes it a particularly interesting target of study. Before transfection, five single mutations were chosen as the first SMYD3 variants to be screened against several synthetic cofactors. These five mutations (Y239A, Y239G, N181A, N181G, and I237A) are all located in the active site (Figure 4) and are designed to change its shape, enlarging it to accept bulkier cofactor analogues. These three mutation sites (Y239, N181, and I237) in particular were targeted based on previous research. It has been noted that residues highly conserved within a family are more likely to be important in maintaining the shape of the active site, and hence more likely, when mutated, to allow the pocket to accept SAM analogues. Y239 is an example of one such highly conserved residue, as demonstrated by Figure 5 [10] (dark box). I237 (light box) is also partially conserved through the family. It appears in SMYD5 as an isoleucine and in SMYD1 as the similarly structured valine. Target sites were also chosen by analyzing homologous enzymes to look for analogous residues. If mutations to these analogous residues were found to be successful, the corresponding residues in the SMYDs may be targets for useful mutations. N181 was chosen on this basis, although the supporting result has not been formally published: when an analogous asparagine is mutated in SET7/9, another PKMT, the structure of the cofactor-binding pocket is successfully changed without decreasing the measurable enzymatic activity (data not shown). Therefore, these five mutations were chosen, and all five mutants, plus native SMYD3, were transfected into mammalian cells.
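The mutation labels used above follow the usual convention of wild-type residue, position, and replacement residue (Y239A replaces the tyrosine at position 239 with alanine). The sketch below shows how such labels can be parsed and applied to a protein sequence while checking that the expected wild-type residue is actually present. The helper names and the ten-residue toy sequence are hypothetical; the toy sequence is not SMYD3, and the real sequence would have to be supplied to use the labels from the text.

```python
import re

# Mutation labels taken from the text.
MUTATIONS = ["Y239A", "Y239G", "N181A", "N181G", "I237A"]


def parse_mutation(label):
    """Split a label such as 'Y239A' into (wild type, 1-based position, replacement)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a single-residue substitution: {label!r}")
    wild_type, position, replacement = match.groups()
    return wild_type, int(position), replacement


def apply_mutation(sequence, label):
    """Return a copy of the sequence carrying the substitution named by the label."""
    wild_type, position, replacement = parse_mutation(label)
    if not 1 <= position <= len(sequence):
        raise ValueError(f"{label}: position {position} is outside the sequence")
    found = sequence[position - 1]
    if found != wild_type:
        # Guards against numbering errors before any primers are designed.
        raise ValueError(f"{label}: expected {wild_type} at {position}, found {found}")
    return sequence[:position - 1] + replacement + sequence[position:]


for label in MUTATIONS:
    print(label, parse_mutation(label))

# Toy sequence for demonstration only; NOT SMYD3, so a renumbered label is used.
toy_sequence = "MKNGIAYLSE"
print(apply_mutation(toy_sequence, "Y7A"))
```

The wild-type check is the useful part of the design: a label that does not match the residue found at that position usually signals a numbering offset, for example from an affinity tag.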

Figure 5: Overlay of SMYD Family
Targeted SMYD3 residues I237 and Y239 are labeled to highlight parallel residues among the SMYD family.

Figure 6: Synthetic SAM Analogues Used
The methyl group of SAM was exchanged for 3 different "R" groups: Pob-SAM (1), (E)-pent-2-en-4-ynyl-SAM (2), and Hey-SAM (3). A fourth, unpublished structure was also used.

After transfection, the cells were lysed and the resulting proteins were collected for screening against several clickable cofactors. Four were chosen: Pob-SAM (1), (E)-pent-2-en-4-ynyl-SAM (2), Hey-SAM (3), and an unpublished SAM analogue, compound X (Figure 6). These four cofactors reacted efficiently with the azide dye used, and Figure 7 shows selected results from the in-gel fluorescence image obtained after the reaction, along with the Coomassie stain of the protein gels, used as a loading control. The boxed wells are notable because of the striking difference in the size and darkness of the bands present. The first two boxes (negative control and native SMYD3) were not expected to label any substrates because the enzyme was not transfected and not mutated, respectively. For the mutants, it was hoped that the bands present would be significantly thicker and darker, indicating that the modified enzyme had taken up the synthetic cofactor and labeled its substrates with the clickable group present on that cofactor. The presence of thicker bands for mutants N181A and Y239A, in the wells containing proteins potentially labeled by compound X, suggests that these mutations alter the SAM-binding site so that it can accept compound X and transfer its synthetic R group onto SMYD3's substrates in the cell lysates. A similar effect seems to be present in the in-gel fluorescence images for compound 2 on the mutant Y239A. However, the Coomassie stain shows significantly more protein in the Y239A well for compound 2 than in the well of any other enzyme variant, so the stronger signal in that lane may partly reflect uneven loading.
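One common way to account for uneven loading when comparing lanes is to divide each lane's fluorescence signal by its Coomassie signal and express the result relative to a reference lane. The sketch below illustrates that arithmetic only. No band intensities were quantified in this work, so the numbers are placeholder values in arbitrary units, and the function name normalized_labeling is hypothetical.

```python
def normalized_labeling(fluorescence, coomassie, reference_lane):
    """Fluorescence per unit of loaded protein, relative to a reference lane."""
    ratios = {}
    for lane, signal in fluorescence.items():
        loading = coomassie[lane]
        if loading <= 0:
            raise ValueError(f"lane {lane!r} has no measurable loading signal")
        ratios[lane] = signal / loading
    reference = ratios[reference_lane]
    if reference <= 0:
        raise ValueError("reference lane has no fluorescence signal")
    return {lane: ratio / reference for lane, ratio in ratios.items()}


# Placeholder intensities in arbitrary units; NOT measured values.
fluorescence = {"native": 100.0, "Y239A": 300.0}
coomassie = {"native": 50.0, "Y239A": 100.0}

for lane, value in normalized_labeling(fluorescence, coomassie, "native").items():
    print(f"{lane}: {value:.2f}")
```

With these placeholder values, a threefold difference in raw fluorescence shrinks to a 1.5-fold difference once the twofold difference in loading is taken into account.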

Figure 7a: In-gel fluorescence image
In-gel fluorescence showed significant difference in the amount of protein labeling in mutations N181A and Y239A relative to the two controls. Compound X, the SAM analogue which produced this labeling, is the unpublished structure.

Figure 7b: Protein gel Coomassie stain
Stain showing the amount of protein loaded into each well of the fluorescence gel. The similarly-sized bands show that a similar amount of protein was loaded into each well and that the varying amounts of fluorescence were not caused by varying amounts of protein.

Parallel to the in-gel fluorescence assay, a Western Blot was run to confirm that transfection was successful and that the enzymes were in fact expressed in cells. The pcDNA3 vector into which the SMYD sequences were cloned contained a FLAG-tag, which was expressed along with the SMYD proteins. An antibody to the FLAG-tag was used in the Western blot, and all six transfections were shown to be successful (Figure 8).

Figure 8: Western Blot Image
The Western blot used the FLAG-tag present on each cloned protein to measure expression of transfected protein in the cells used. Native was over-expressed, as expected, but otherwise all of the transfections were successful and had relatively consistent expression.

Discussion

The results obtained from the fluorescence assay indicate that there may have been some success in engineering SMYD3 to accept SAM analogues. If this is the case, then more specific tests can be run to identify the labeled proteins, a step toward elucidating a comprehensive profile of the histone and non-histone substrates of SMYD3. However, before this conclusion can be concretely drawn, it is necessary to repeat the experiment several times in an attempt to reduce the background that is present in the image. This background is likely due to the presence of excess cofactor that has reacted with the dye independently of enzymatic activity. In future experiments, extra steps will first be taken to eliminate this excess cofactor.

Once the fluorescence assay process has been refined in a way that will allow us to accurately verify substrate labeling by the SMYD3 mutants, further exploration will be done. If one of the five initial mutants was successful, then more specific analysis of the labeled proteins can be done to determine their identities. If none of the first attempts are successful, then other mutations will be attempted within the active site. Instead of single mutants, double, and possibly triple, mutants will be made to alter the size and shape of the cofactor-binding pocket even more, making the likelihood of the mutant accepting the synthetic cofactors significantly higher.

While experiments are run with SMYD3, efforts will also be made to profile the substrates of the other SMYD proteins, with emphasis on SMYD2. Because the active sites of SMYD1, SMYD2, and SMYD3 line up closely, a mutation or combination of mutations that can be shown to be successful in SMYD3 may also be successful when used in SMYD1 or SMYD2. Also, if the experiments are run in parallel, a new mutation in SMYD1 or SMYD2 may be found that can be applied to SMYD3. These three are the principal targets because their biological significance is evident. SMYD4 and SMYD5 may eventually be added to the list of targets if it is later shown that they also have vital cellular functions. As a result of these preliminary efforts, we are on the way to developing a system that may be used to profile the substrates of the entire SMYD family. Also, once a highly reliable combination of mutations is matched with a synthetic cofactor, such a method can be applied to research beyond simply discovering a list of substrates that react with the SMYDs. Researchers inside and outside the Luo Laboratory may be able to screen biological systems, including particular cancers, to determine which proteins in that particular system may be regulated by a SMYD protein, and what can be done to alter this regulation in a way that will benefit the system.

Acknowledgements

I would like to thank the director, administrators, and staff of the Gateways to the Laboratory Program and Tri-Institutional MD/PhD program for organizing such a rewarding summer experience. Thank you to Dr. Wei Xiong, for her helpful advice and consultations. Thanks to Rui Wang, Glorymar Ibanez-Sanchez, Alejandro Gener, and Han Chen for helping me get started with cell culture.

References

[1] A. Eccleston et al., "Epigenetics," Nature, vol. 447, no. 7143, p. 395, 2007.

[2] K. Islam et al., "Expanding cofactor repertoire of protein lysine methyltransferase for substrate labeling," ACS Chemical Biology, vol. 6, no. 7, pp. 679-684, 2011.

[3] T. Kouzarides, "Chromatin modifications and their function," Cell, vol. 128, no. 4, pp. 693-705, 2007.

[4] L. A. Saddic et al., "Methylation of the retinoblastoma tumor suppressor by SMYD2," J. of Biological Chemistry, vol. 285, no. 48, pp. 37733-37740, 2010.

[5] P. D. Gottlieb et al., "Bop encodes a muscle-restricted protein containing MYND and SET domains and is essential for cardiac differentiation and morphogenesis," Nature Genetics, vol. 31, pp. 25-32, 2002.

[6] R. Hamamoto et al., "Enhanced SMYD3 expression is essential for the growth of breast cancer cells," Cancer Sci., vol. 97, no. 2, pp. 113-118, 2006.

[7] R. Hamamoto et al., "SMYD3 encodes a histone methyltransferase involved in the proliferation of cancer cells," Nature Cell Biology, vol. 6, no. 8, pp. 731-740, 2004.

[8] O. Binda et al., "A chemical method for labeling lysine methyltransferase substrates," ChemBioChem, vol. 12, no. 2, pp. 330-334, 2010.

[9] R. Wang et al., "Labeling substrates of protein arginine methyltransferase with engineered enzymes and matched S-Adenosyl-L-Methionine analogues," J. of the Amer. Chemical Soc., vol. 133, no. 20, pp. 7648-7651, 2011.

[10] M. Brown et al., "Identification and characterization of Smyd2: a split SET/MYND domain-containing histone H3 lysine 36-specific methyltransferase that interacts with the histone deacetylase complex," Molecular Cancer, vol. 5, p. 26, 2006.