
SPONSORED CONTENT

Supplement to Nature Publishing Group Journals 

Integrated biology Epigenetics

December 2012

Produced with support from:

Personalized Therapeutics The Power of Epigenetics

New Approaches to Personalized Cancer Treatment


Each day we learn more about the biology of cancer and how genetic mutations in cancer cells cause them to grow and spread. This is the age of personalized therapeutics: medicines that hone in on

The next frontier of personalized therapeutics: epigenetics


Epizyme is at the forefront of drug discovery and development, leveraging discoveries to

www.epizyme.com

Cover art provided by Epizyme, Inc.

Sponsor's foreword
Disease-Driving Genes and Molecules to Target Them Create the Promise of Personalized Therapeutics

Nature Reprint Collection Epigenetics


publisher: Melanie Brazil
editor: Terry L. Sheppard, Amy Donner
copyeditor: Yasmin Tayag
senior art editor: Erin Dewalt
production editor: Carol Evangelista
production manager: Mabel Eng, Kelly Hopkins
marketing: Nazly De La Rosa
sponsorship: Reya Silao, Yvette Smith
sponsor: Epizyme

Nature - WWW.NATURE.COM/NATURE
The Macmillan Building, 4 Crinan Street, London N1 9XW, UK
Tel: +44 (0) 20 7833 4000
e-mail: nature@nature.com

CITING THE COLLECTION
All papers have been previously published in Nature, Nature Reviews Drug Discovery and Nature Chemical Biology. Please use original citation, which can be found on the table of contents.

VISIT THE COLLECTION
www.nature.com/reprintcollections/epigenetics

SUBSCRIPTIONS AND CUSTOMER SERVICES
For UK/ROW (excluding Japan): Nature Publishing Group, Subscriptions, Brunel Road, Basingstoke, Hants, RG21 6XS, UK. Tel: +44 (0) 1256 329242.
Subscriptions and customer services for Americas including Canada, Latin America and the Caribbean: Nature Publishing Group, Subscription Department, PO Box 5161, Brentwood, TN 37024-5161, USA. Tel: +1 (800) 524 2688 (US) or +1 615 850 5315 (outside the US).

Personalized therapeutics pairs the identification of disease-causing genes with the discovery of innovative therapies that target key genetic aberrations. Today, a common precursor to personalized therapeutics is the discovery of small molecule tool compounds that bridge pathobiological understanding and chemical biology, thus establishing that drug-like chemicals can effectively modulate disease-relevant targets. From this crucible emerge bona fide drug discovery efforts that eventually lead to new medicines. Histone methylation stands at the dawn of such a transformational moment. Histones are methylated by a class of enzymes known as protein methyltransferases (PMTs) and this methyl marking is reversed by another class of enzymes known as histone demethylases (HDMs). The reprints collected here illustrate how genetic alterations in both specific PMTs and HDMs lead to pathogenic changes that drive particular human cancers. A clear roadmap for translating these biological observations into systematic drug discovery for the PMTs is also described within the collection. This approach has led to potent and selective small molecule inhibitors of several PMTs that display cancer-specific cell killing effects; these are exemplified in the reprint collection by inhibitors of G9a and of EZH2. The collection also highlights how selective PMT inhibitors may play a role in regenerative medicine, by mediating the conversion of differentiated cells into a more stem cell-like state of pluripotency. The ultimate test of these targets will come from clinical trials of specific enzyme inhibitors in genetically defined patients, with a relevant companion diagnostic. The first such clinical trial of a PMT inhibitor, EPZ-5676, began in September 2012; other specific inhibitors are likely to enter clinical trials shortly. How well the pathobiology of histone methylation translates into meaningful new medicines for genetically defined patients will be exciting to see.
Robert A. Copeland, Ph.D.
Chief Scientific Officer, Epizyme, Inc.

Protein methyltransferases as a target class for drug discovery. Copeland, RA et al. Nat. Rev. Drug Discov. 8, 724–732 (2009).
Frequent mutation of histone-modifying genes in non-Hodgkin lymphoma. Morin, RD et al. Nature 476, 298–303 (2011).
A chemical probe selectively inhibits G9a and GLP methyltransferase activity in cells. Vedadi, M et al. Nat. Chem. Biol. 7, 566–574 (2011).
Chromatin-modifying enzymes as modulators of reprogramming. Onder, TT et al. Nature 483, 598–602 (2012).
Novel mutations target distinct subgroups of medulloblastoma. Robinson, G et al. Nature 488, 43–48 (2012).
A selective inhibitor of EZH2 blocks H3K27 methylation and kills mutant lymphoma cells. Knutson, SK et al. Nat. Chem. Biol. 8, 890–896 (2012).

This supplement is published by Nature Publishing Group on behalf of Epizyme. All content has been chosen by Epizyme.


The human protein methyltransferases


Methyltransferases are enzymes that facilitate the transfer of a methyl (CH3) group to [...] a reaction mechanism in which the nucleophilic acceptor site attacks the electrophilic carbon of S-adenosyl-L-methionine (SAM) in an SN2 displacement reaction that produces a methylated biomolecule and S-adenosyl-L-homocysteine (SAH) as a byproduct. Methylation reactions are essential transformations in small-molecule metabolism, [...] dynamic and reversible methylation of amino acid side chains of chromatin proteins, particularly within the N-terminal tail of histone proteins, has revealed the importance of methyl marks as regulators of gene expression. Human protein methyltransferases (PMTs) fall into two major families, protein lysine methyltransferases (PKMTs) and protein arginine methyltransferases (PRMTs), that are distinguishable by the amino acid that accepts the methyl group and by the conserved sequences of their respective catalytic domains. Given their involvement in many cellular processes, PMTs have attracted attention as potential drug targets, spurring the search for small-molecule PMT inhibitors. Several classes of small-molecule PMT inhibitors have been identified, but new specific chemical probes that are active in cells will be required to elucidate the biological roles of PMTs and serve as potent leads for PMT-focused drug development.

Protein lysine methyltransferases (PKMTs)

The phylogenetic tree shows 51 genes predicted to encode PKMTs, which are positioned in the tree on the basis of the similarities of their amino acid sequences1. This tree excludes one validated PKMT, DOT1L, which lacks a SET domain, the catalytic domain

[Figure: structures of the lysine side-chain amine in its unmethylated, monomethylated, dimethylated and trimethylated states]

Protein arginine methyltransferases (PRMTs)

The human PRMT phylogenetic tree comprises 45 predicted enzymes including the PKMT DOT1L (ref. 1). There are two major types of PRMT; both catalyze the formation of monomethylarginine (Rme1) but distinct reaction mechanisms yield symmetric (Rme2s) or asymmetric (Rme2a)

[Figure: phylogenetic trees of the human PKMTs and PRMTs, with the methylation states of the arginine side chain and the chemical structures of selected small-molecule inhibitors arranged around the trees. Inhibitors labelled in the figure include AZ505 (ref. 5), BIX-01294 (ref. 7), chaetocin (ref. 15), EPZ004777 (ref. 4), IBAO, UNC-0224 and UNC-0638.]

Targeting PMTs

A selection of small-molecule PMT inhibitors with some target selectivity is shown (minimally validated in quantitative in vitro assays) around the trees along with the name of the molecule, citation information and the chemical structure2,3.

DOT1L is a validated therapeutic target for mixed-lineage leukemia4. The majority of these leukemias result from chromosomal rearrangements that cause aberrant recruitment of DOT1L to MLL-fusion target genes. Inhibition of DOT1L with EPZ004777 demonstrated that these leukemia cells are addicted to DOT1L activity and established proof of concept for DOT1L inhibition as a therapeutic option.

Priority therapeutic targets also include MLL for leukemias; SETD1B and CARM1 for neurodegeneration; as well as EZH2, SMYD3 and EHMTs for multiple cancers.

Additional PMTs have been implicated in human diseases and may yet emerge as therapeutic targets.

Elucidation of the biological function of PMTs would be facilitated by the development of selective chemical probes; this is a compelling area for future chemical biology studies, given the paucity of available tool compounds, many of which remain to be validated in cells. In particular, the emergence of these enzyme families as therapeutic targets suggests that such chemical probes could yield lead compounds for drug development.

Understanding the mechanisms [...] especially for nonhistone targets, merits additional study.

associated with the underlying causes of multiple human diseases. Our patient-driven approach to the creation of personalized therapeutics represents the future of cancer therapy, creating better therapeutics matched to the right patients more quickly and at lower cost than traditional approaches.

rcopeland@epizyme.com

[Figure: chemical structures of S-adenosyl-L-methionine (SAM) and S-adenosyl-L-homocysteine (SAH)]

Dr. Victoria Richon Vice President, Biological Sciences vrichon@epizyme.com

4. Daigle, S.R. et al. Cancer Cell 20, 53–65 (2011).
5. Ferguson, A.D. et al. Structure 19, 1262–1273 (2011).
6. Mori, S. et al. Bioorg. Med. Chem. 18, 8158–8166 (2010).
7. Kubicek, S. et al. Mol. Cell 25, 473–481 (2007).
11. Allan, M. et al. Bioorg. Med. Chem. Lett. 19, 1218–1223 (2009).
12. Huynh, T. et al. Bioorg. Med. Chem. Lett. 19, 2924–2927 (2009).
13. Yao, Y. et al. J. Am. Chem. Soc. 133, 16746–16749 (2011).
14. Cheng, D. et al. J. Med. Chem. 54, 4928–4932 (2011).
15. Greiner, D. et al. Nat. Chem. Biol. 1, 143–145 (2005).

© 2011 Nature Publishing Group. Available online at:

http://www.nature.com/nchembio/poster/hpm.pdf

FREE POSTER

Human protein methyltransferases


Nature Chemical Biology presents a poster highlighting the human protein methyltransferase families, the small molecules known to target them and the prospects for PMT-focused drug development. Human protein methyltransferases (PMTs) transfer one or more methyl groups to the sidechains of lysine or arginine amino acids. Given their roles in regulating gene expression and driving disease, PMTs have attracted attention as potential drug targets. Several classes of small-molecule PMT inhibitors have been identified, but new specific chemical probes that are active in cells will be required to elucidate the biological roles of PMTs and serve as leads for PMT-focused drug development.

Download the Poster today by visiting: www.nature.com/nchembio/poster/hpm

Poster sponsored by:

REVIEWS

First published in Nature Reviews Drug Discovery 8, 724–732 (2009); doi: 10.1038/nrd2974

Protein methyltransferases as a target class for drug discovery


Robert A. Copeland, Michael E. Solomon and Victoria M. Richon

Abstract | The protein methyltransferases (PMTs), which methylate protein lysine and arginine residues and have crucial roles in gene transcription, are emerging as an important group of enzymes that play key parts in normal physiology and human diseases. The collection of human PMTs is a large and diverse group of enzymes that have a common mechanism of catalysis. Here, we review the biological, biochemical and structural data that together present PMTs as a novel, chemically tractable target class for drug discovery.
Epigenetics
A stably heritable change in phenotype or gene expression in an organism or cell, resulting from changes in a chromosome that are not caused by a change in DNA sequence. The process of eukaryotic cell differentiation is one of the most well-known examples of epigenetic changes.

Epizyme, Inc., 840 Memorial Drive, Cambridge, Massachusetts 02139, USA. Correspondence to R.A.C. e-mail: RCopeland@epizyme.com doi:10.1038/nrd2974

Cellular differentiation is one of the most important components of embryonic development and postnatal tissue maintenance and repair. Almost every nucleated cell of the human body contains the same, complete complement of genomic DNA. However, the ability of pluripotent cells to differentiate into distinct lineages and ultimate cell types is conferred by specific patterns of transcription of subsets of genes in the genome. A large and growing body of data support the idea that epigenetic regulation of gene transcription is a key biological determinant of cellular differentiation1. The chromosomes within eukaryotic cell nuclei are packaged together with structural proteins (histones) to form the complex known as chromatin. Four major histones (H2A, H2B, H3 and H4) form an octameric, disc-shaped aggregate composed of two copies of each histone type around which the DNA is wound to form regular, repeating units known as nucleosomes (FIG. 1). Chromatin exists in two main conformational states: a condensed state (heterochromatin) in which the nucleosomes are tightly packed together and gene transcription is largely repressed; and a more relaxed state (euchromatin) in which gene transcription is activated. Epigenetic regulation of gene transcription is mediated by selective, enzyme-catalysed, covalent modification of specific nucleotides within the genes and also by post-translational modifications of the histone proteins (FIG. 1). Modification of DNA can silence gene transcription directly, whereas the post-translational modifications of histones control the conformational transition between the heterochromatin and euchromatin states2. The enzymes that covalently modify DNA and histones are therefore the key mediators of epigenetic regulation of gene transcription. Several putative epigenetic enzymes have recently been identified and, in some cases, their catalytic mechanism and three-dimensional structures have been determined2,3.

Epigenetic enzymes that are encoded in the human genome catalyse group transfer reactions and can be categorized according to the nature of the covalent modifications that they catalyse and by the substrates upon which they act. In humans, these enzymes include DNA methyltransferases (DNMTs), which methylate the carbon atom at the 5-position of cytosine in the CpG dinucleotide sites of the genome; protein methyltransferases (PMTs), which methylate lysine or arginine residues on histones and other proteins; protein demethylases, which remove methyl groups from the lysine or arginine residues of proteins; histone acetyltransferases, which acetylate lysine residues on histones and other proteins; histone deacetylases (HDACs), which remove acetyl groups from lysine residues on histones and other proteins; ubiquitin ligases, which add ubiquitin to lysine residues on histones and other proteins; and specific kinases that phosphorylate serine residues on histones4,5. Given that small-molecule inhibitors have been successfully designed for HDACs and DNMTs (discussed below), it is likely that additional families of histone-modifying enzymes will also be amenable to small-molecule modulation. The opportunity for chemical-probe development and pharmacological control of epigenetic gene transcription is therefore of great interest in the fields of basic biology and drug discovery4,5. Indeed, the role of these enzymes in human diseases is highlighted by the recent approval of three drugs by the US Food and Drug Administration6 that act as selective, small-molecule inhibitors of HDACs and DNMTs for the treatment of specific human cancers (TABLE 1). In recent years, there have been numerous reviews in the literature that highlight different aspects of the biology, disease association and/or structural biology of various histone-modifying enzymes. In this Review, we
www.nature.com/reviews/drugdisc

724 | SEPTEMBER 2009 | VOLUME 8

focus on the PMTs, and in particular on those aspects that make PMTs attractive targets for drug discovery efforts. We summarize the data that contribute to the validation of PMTs as targets for specific human diseases, as well as the structural and mechanistic data that suggest PMTs are a tractable (that is, druggable) target class.
[Figure 1 labels: acetyltransferases; deacetylases (~18 family members); kinases; ligases; PKMTs (52 family members); PRMTs (10 family members); demethylases (~30 family members); marks shown on lysine (K), serine (S) and arginine (R) residues: Ac, P, Ub, Me]

Target class
A group of proteins that are related by a common type of drug-binding pocket, but sufficiently diverse that selective inhibition of specific proteins can be achieved, using medicinal chemical elaboration of the basic chemotype structures.

SAM
S-adenosyl-L-methionine, the universal methyl group donor of all enzymatic methyltransferase reactions.

Figure 1 | A nucleosome and the post-translational histone protein modifications that can influence epigenetic regulation of gene transcription. Modifications of the histone protein tail are shown: changes in acetylation (Ac) by acetyltransferases and deacetylases, phosphorylation (P) by kinases, ubiquitylation (Ub) by ligases and changes in methylation (Me) by methyltransferases and demethylases. The enzyme families that are responsible for the various post-translational modifications are shown. PKMT, protein lysine methyltransferase; PRMT, protein arginine methyltransferase.

PKMTs and PRMTs in human disease

In surveying the histone-modifying enzymes of the human genome, the enzymes that catalyse methylation of lysine residues (protein lysine methyltransferases (PKMTs)) and arginine residues (protein arginine methyltransferases (PRMTs)) are of substantial interest from the perspective of drug discovery and medicinal chemistry. The action of these enzymes is crucial in controlling gene regulation, and there is an increasing amount of biochemical and biological data to suggest that the enzymatic activities of several of these proteins have pathogenic roles in cancer, inflammatory diseases, neurodegenerative diseases and other conditions of importance7–15. For example, with the exception of DOT1-like, histone H3 methyltransferase (DOT1L; also known as KMT4), all human PKMTs contain a ~130 amino-acid domain, referred to as the SET domain, which constitutes the catalytic domain of these enzymes14–16. Enhancer of zeste homologue 2 (EZH2; also known as KMT6) is a SET domain protein that forms the catalytic subunit of the 4–5-protein core of polycomb repressive complex 2 (PRC2). PRC2 is a PKMT that catalyses the methylation of lysine 27 of histone H3 (in the nomenclature of histone modification, this site is referred to as H3K27). Although EZH2 contains the catalytic active site, all of the proteins of the PRC2 complex are required for full PKMT activity. Overexpression of EZH2 or another PRC2 subunit, suppressor of zeste 12 homologue (SUZ12), has been associated with numerous human cancer types, including prostate, breast, bladder, colon, skin, liver, endometrial, lung and gastric cancers, as well as lymphomas and myelomas15. In breast carcinomas, increased levels of EZH2 have been shown to correlate with increased invasiveness and proliferation rate; it has been suggested that EZH2 could be a prognostic indicator of patient outcome for breast cancer10.

In cell culture, overexpression of EZH2 in breast epithelial cells causes anchorage-independent cell growth and increased invasiveness. Additionally, when EZH2-overexpressing cells were injected into the mammary fat pads of nude mice, the animals developed tumours, demonstrating the tumorigenicity of EZH2 overexpression. Importantly, the phenotypic effects of EZH2 overexpression are correlated with increased H3K27 methylation and are dependent on the presence of an intact SET domain, both of which imply a role for EZH2 enzymatic activity in pathogenesis10,15. Several other human PKMTs and PRMTs are strongly associated with human cancers, as summarized in TABLE 2. Similarly, there is compelling evidence that other PKMTs and PRMTs have a pathogenic role in serious human diseases other than cancer7–9. For example, SET domain, bifurcated 1 (SETDB1)17 and coactivator-associated arginine methyltransferase 1 (CARM1; also known as PRMT4)18 have been implicated in the neurodegenerative diseases Huntington's disease and spinal muscular atrophy, respectively. SET domain-containing lysine methyltransferase 7 (SETD7; also known as KMT7)19, CARM1 (REF. 20) and PRMT1 (REF. 21) have been associated with nuclear factor-κB-related inflammatory diseases, and SET domain-containing protein 1A (SETD1A)22 and CARM1 (REF. 23) have been associated with viral infections involving Herpes simplex virus and human T-lymphotropic virus, respectively. PKMTs and PRMTs are therefore emerging as compelling targets for drug discovery efforts4,5.
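The site nomenclature used in this section (for example, H3K27 for lysine 27 of histone H3) is regular enough to be read mechanically. The following is a minimal sketch, not part of the original Review, of how such labels might be parsed; the names `HistoneSite`, `RESIDUES` and `parse_site` are hypothetical, and the pattern assumes only the four core histones and the lysine, arginine and serine residues mentioned in the text.

```python
import re
from typing import NamedTuple, Optional

# One-letter codes for the residue types mentioned in the text (assumption:
# only lysine, arginine and serine sites are of interest here).
RESIDUES = {"K": "lysine", "R": "arginine", "S": "serine"}


class HistoneSite(NamedTuple):
    histone: str   # core histone, e.g. "H3"
    residue: str   # residue type, e.g. "lysine"
    position: int  # residue number, e.g. 27


# Core histone name, one-letter residue code, then the residue number.
_SITE = re.compile(r"^(H2A|H2B|H3|H4)([KRS])(\d+)$")


def parse_site(label: str) -> Optional[HistoneSite]:
    """Parse a label such as 'H3K27'; return None if it is not a site label."""
    match = _SITE.match(label)
    if match is None:
        return None
    histone, code, position = match.groups()
    return HistoneSite(histone, RESIDUES[code], int(position))
```

Under these assumptions, `parse_site("H3K27")` yields histone H3, residue lysine, position 27, whereas an enzyme name such as `"EZH2"` is rejected and returns `None`.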

PMTs as a drug target class

From a chemical biology and medicinal chemistry perspective, the PKMTs and PRMTs are of interest because they have a common mechanism of catalysis (discussed below), involving a small, organic cofactor. As other druggable classes of enzymes, such as the protein kinases, share this mechanistic feature, it is likely that the PMTs will be similarly amenable to inhibition by small, organic molecules. The PKMTs and PRMTs catalyse methyl transfer from their universal methyl donor, S-adenosyl-L-methionine (SAM; also known as AdoMet) (FIG. 2), to a nitrogen atom of lysine or arginine side chains3. Protein substrate specificity can be stringent in these enzymes; some PKMTs seem to selectively methylate a particular lysine residue on a specific histone, and the extent of methylation on a single lysine residue (that is, mono-, di- or tri-methylation) that is catalysed by a particular enzyme can also be stringent (discussed below). Clearly, the structure of each enzyme is unique16,24, as is the resulting biology and pathobiology associated with each enzyme. Yet, the shared chemical mechanisms of the PKMTs and PRMTs allow for certain efficiencies and economies in the discovery of selective drugs for these enzymes, by treating them as a target class25–27. Several of these enzymes have been found to catalyse methyl transfer to lysine or arginine residues on a number of cellular proteins; this is especially the case for the PRMTs, for which several cytosolic substrates have been identified24,28. With respect to gene regulation, however, the most important targets for both PKMTs and PRMTs are likely to be the histones, as post-translational modification of these proteins is clearly a determinant of chromatin remodelling and therefore regulation of gene transcription.

Table 1 | Epigenetic-enzyme inhibitors for cancer therapy

Generic name | Alternative names | Clinical status*
DNMT inhibitors
5-azacitidine | Vidaza | Approved in the United States for myelodysplastic syndrome
Decitabine | Dacogen | Approved in the United States for myelodysplastic syndrome
HDAC inhibitors
Vorinostat | Zolinza | Approved in the United States for cutaneous manifestation in cutaneous T cell lymphoma
Romidepsin | FK228 | New drug application filing
Panobinostat | LBH-589 | Phase II
Belinostat | PXD-101 | Phase II
Entinostat | MS-275, SNDX-275 | Phase II
MGCD-0103 | MG-0103 | Phase II
JNJ-26481585 | None | Phase I
Givinostat | ITF2357 | Phase II

*See REFS 51,52 for comprehensive reviews of novel cancer therapies, including those described in the table. DNMT, DNA methyltransferase; HDAC, histone deacetylase.

Representation of PMTs in the human genome

The PMT target class is represented in many species, and the human genome encodes several PKMTs and PRMTs. Attempts to quantify the representation of PKMTs in a particular organism, and to understand the relatedness of these proteins to one another, have focused on the sequence alignment of the SET domain because, as discussed above, this domain is common to all PKMTs (except DOT1L). Several attempts have been made to systematically group the PKMTs on the basis of sequence homology and substrate14,29. For example, in 2007, a nomenclature convention was proposed for the PKMTs, along with other types of chromatin-modifying enzymes29. In this study, 24 human PKMTs were identified. These SET domain PKMTs have been divided into related families on the basis of sequence alignment; initially four, and later seven, major families were defined in this manner14,16: the suppressor of variegation 39 (SUV39) family; the SET1 (also known as MLL) family; the SET2 (also known as NSD) family; the retinoblastoma protein-interacting zinc finger protein (RIZ) (also known as PRDM) family; the SET and MYND domain-containing (SMYD) family; the enhancer of zeste (EZ) family; and the SUV420 family. An eighth family, known as 'others', included the enzymes SETD7 and SETD8 (also known as PRSET7). Finally, DOT1L, the human, non-SET domain PKMT, can be considered a ninth family of PKMTs. Our group has recently extended this work to systematically identify all of the SET domains that are encoded by the human genome. This study has more than doubled the number of putative human PKMTs to 52 (51 SET domain proteins plus DOT1L) (L.F. Jerva, K.O. Elliston, V.M.R., M.E.S. and R.A.C., unpublished observations). From the perspective of drug discovery, the salient point is that these enzymes are numerous in humans.

The PRMTs are similarly well represented in humans. There are at least eight human PRMTs for which some level of methyltransferase activity has been shown. These proteins have a canonical sequence domain that is associated with the binding sites for the cofactor and substrate (arginine), although the sequence conservation among these proteins is low. Estimates of the total number of PRMTs that are encoded by the human genome vary, depending on the method of sequence alignment and the level of alignment stringency that is applied. Nevertheless, it is clear that 10–50 of these enzymes are represented in humans. The human PMTs are thus a large class of enzymes, and several of them already have well established disease association (discussed above). Furthermore, owing to the common features of their chemical mechanism of catalysis (discussed below), the PMTs are likely to be inherently tractable as targets for small-molecule drug intervention. The PMT target class, as defined here, therefore provides an important pool of potential targets for drug discovery efforts.

SAH
S-adenosyl-L-homocysteine, the universal product of all enzymatic methyltransferase reactions, formed by methyl group transfer from S-adenosyl-L-methionine.

The PMT active site The pursuit of the PMTs as a drug target class is facilitated by a rich literature base of crystallographic and enzyme kinetic studies of these enzymes that have helped to define their mechanisms of catalysis. All of these enzymes probably use a common bimolecular nucleophilic substitution (SN2) methyl transfer mechanism3,24,30. The lone pair electrons of a nitrogen atom (from lysine or arginine) attacks the electrophilic methylsulphonium cation of SAM at a 180 angle to the leaving group, to form a penta-coordinate carbon transition state. The transition state structure then collapses, with methyl group relocation to the nitrogen atom of the lysine or arginine side chain and formation of S-adenosyl-l-homocysteine (SAH; also known as AdoHcy) (FIG. 2) as a product. The use of a naturally occurring adenosyl analogue as the universal group transfer donor by PMTs is reminiscent of protein kinases another large family of druggable enzyme targets, the ATP-binding pockets of which have proved to be highly tractable targets for drug discovery 31,32. Furthermore, despite binding a
www.nature.com/reviews/drugdisc

726 | SEPTEMBER 2009 | VolUME 8

NATURE REPRINT COLLECTION Epigenetics

S5

REVIEWS
Table 2 | Selected PKMTs and PRMTs that have shown an association with human cancers
Protein methyltransferase
SUV39H1 EHMT2

Methylation substrates
H3K9 H3K9

cancers
Colon cancer Lung, prostate and hepatocellular carcinoma Leukaemia Acute myeloid leukaemia Myeloma Lung and breast cancers, and childhood acute myeloid leukaemia MLL-rearranged leukaemias Breast, liver, colon and gastric cancers Breast, prostate, colon, gastric, bladder and liver cancers, melanoma and lymphoma Breast cancers

cancer association
Increased expression in colorectal tumours; associated with transcriptional repression Increased expression in lung cancer cell lines; regulates centrosome duplication, presumably through chromatin structure Chromosomal aberrations involving MLL are a cause of acute leukaemias; the SET domain is lost in translocation Translocation fuses NSD1 to nucleoporin 98 in human acute myeloid leukaemia Translocated and increased expression in myeloma; associated with transcriptional regulation Amplified in lung cancer and breast cancer; translocation with nucleoporin 98; mediates transcriptional activation

refs
53 54,55

MLL NSD1 WHSC1 WHSC1L1

H3K4 H3K36 H3K36 and H4K20 H3K4

5658 59 6062 6364

DOT1L

H3K79

Recruited by MLL fusion partners MLLT1, MLLT2, MLLT3 and MLLT10 to homeobox genes; associated with transcriptional activation and elongation Overexpressed in multiple tumour types; associated with transcriptional activation Amplified and increased expression in several tumour types; a member of the polycomb repressive complex 2; associated with transcriptional repression

11, 66,67

SMYD3 EZH2

H3K4 H3K27

68,69 10,15,70,71

SETD7

H3K4

SET7-mediated methylation stabilizes the oestrogen receptor and is necessary for the recruitment of the oestrogen receptor to its target genes and target gene transactivation Amplified and overexpressed in cancers; associated with transcriptional repression Increased expression correlates with androgen independence in human prostate carcinoma; overexpressed in breast tumours and associated with transcriptional activation PRMT5 expression and H3R8 methylation levels are increased in lymphoid cancer cells; PRMT5 mediates p53 methylation, which promotes cell arrest rather than cell death; H4R3 methylation promotes recruitment of DNMT3A, subsequent promoter CpG methylation and gene silencing

72

PRDM14 CARM1

No known substrate H3R17, EP300CBP and NCOA3 H3R8, p53, SNRPD1, SNRPD3 and SUPT5H

Breast cancers Breast and prostate cancers Lymphoma

73 74,75

PRMT5

12, 76

CARM1, coactivator-associated arginine methyltransferase 1 (also known as PRMT4); CBP, CREB-binding protein; DNMT3A, DNA (cytosine-5-)-methyltransferase 3; EHMT2, euchromatic histone-lysine N-methyltransferase 2 (also known as G9A and KMT1C); EP300, E1A-binding protein p300; EZH2, enhancer of zeste homologue 2 (also known as KMT6); DOT1L, DOT1-like, histone H3 methyltransferase (also known as KMT4); MLL, myeloid, lymphoid or mixed-lineage leukaemia (also known as KMT2A); MLLT1, myeloid, lymphoid or mixed-lineage leukemia, translocated to 1; NCOA3, nuclear receptor coactivator 3; NSD1, nuclear receptor-binding SET domain protein 1; PKMT, protein lysine methyltransferase; PRDM14, PR domain-containing protein 14; PRMT, protein arginine methyltransferase; SETD7, SET domain-containing lysine methyltransferase 7 (also known as KMT7); SMYD3, SET and MYND domain-containing protein 3; SNRPD1, small nuclear ribonucleoprotein D1 polypeptide 16kDa (also known as SMD1); SNRPD3, small nuclear ribonucleoprotein D3 polypeptide 18kDa (also known as SMD3); SUPT5H, suppressor of Ty 5 homologue; SUV39H1, suppressor of variegation 39 homologue 1 (also known as KMT1A); WHSC1, WolfHirschhorn syndrome candidate 1 (also known as MMSET and NSD2); WHSC1L1, WolfHirschhorn syndrome candidate 1-like protein 1 (also known as NSD3).

common natural ligand, the ATP-binding pockets of protein kinases have afforded medicinal chemists a rich diversity of chemical scaffolds, which have resulted in a range of drug molecules of varying degrees of target selectivity (REF. 32). Similarly, the commonality of SAM use by the PMTs belies the structural, biological and pathological diversity of these enzymes. From the perspective of drug discovery and medicinal chemistry, the diversity of SAM-binding modes and catalytic mechanisms of these enzymes is of key importance.
NATURE REVIEWS | Drug Discovery

A common structural feature of PKMTs and PRMTs that distinguishes these enzymes from other proteins that use SAM is the overall architecture of their extended catalytic active sites. This generally consists of a SAM-binding pocket that is accessed from one face of the protein, and a narrow, hydrophobic, acceptor (that is, lysine or arginine) channel that extends to the opposite face of the protein surface, such that the two substrates enter the active site from opposite sides of the enzyme surface.
VOLUME 8 | SEPTEMBER 2009 | 727

[Figure 2 image. Panel a: reaction scheme in which PMTs transfer the methyl group of SAM to a protein lysine side chain, giving the methylated lysine and SAH. Panel b: generalized scheme, Nu + C–LG → [Nu···C···LG]‡ → Nu–C + LG.]

Figure 2 | PMT-catalysed methylation of proteins by an SN2 reaction with SAM as the methyl donor. The protein methyltransferases (PMTs) catalyse methyl transfer from their universal methyl donor, S-adenosyl-L-methionine (SAM; also known as AdoMet), to a nitrogen atom of lysine or arginine side chains to form S-adenosyl-L-homocysteine (SAH; also known as AdoHcy). a | The methyl group (shown in red) of the SAM sulphonium cation is attacked by the lone pair electrons of a lysine (shown here) or arginine (not shown) side-chain nitrogen atom. The reaction results in transfer of the methyl group to the attacking nitrogen atom and the production of SAH from the reaction cofactor. b | A more generalized chemical scheme of a bimolecular nucleophilic substitution (SN2) group transfer reaction, illustrating the attacking nucleophile (Nu; lysine or arginine in the case of PMTs), the leaving group (LG; SAH in the case of PMTs, which departs as the methyl group is transferred) and the transient but essential formation of a penta-coordinate carbon transition state (‡).

Crystallographic studies have revealed two distinct binding modes for SAM or SAH in the cofactor-binding pockets of PMTs (REF. 24). For the SET domain PKMTs that have been co-crystallized with SAM or SAH, the cofactor adopts a U-shaped configuration within the active site (FIG. 3) that aligns the methylsulphonium cation of SAM at the base of the narrow lysine channel, in perfect juxtaposition to the ε-amino group of the acceptor lysine residue, which facilitates group transfer. This U-shaped configuration is induced by a conserved aspartate or glutamate residue that binds to the ribose hydroxyl groups, and a positively charged lysine or arginine residue that forms a salt bridge with the carboxylate of SAM. In striking contrast to the U-shaped configuration that is adopted by the cofactor when bound to PKMTs, SAM bound within the active site of PRMTs adopts an extended configuration that resembles the extended SAM configuration seen in the DNA methyltransferases; again, the binding motif results in alignment of the SAM methylsulphonium cation with the base of the acceptor-binding channel. Another distinction between cofactor binding within the PKMTs and the PRMTs is that, in PRMTs, dimer formation seems to be a crucial component of SAM binding and catalysis, whereas this is not the case for PKMTs (REFS 3,24). The mechanistic consequences of obligate dimer formation in the PRMTs are not yet clear, but dimerization may be involved in multiple methylations of the arginine residue.

From the above discussion, it could be concluded that the configuration of the bound SAM is structurally related to the identity of the methyl acceptor nitrogen species upon which the enzymes act; that is, U-shaped for PKMTs and extended for PRMTs. However, data on the non-SET domain PKMT DOT1L do not support this conclusion. In the co-crystal structures of human DOT1L bound to SAM (REF. 33) and of the yeast homologue Dot1p bound to SAH (REF. 34), the cofactor is bound in the extended configuration, similar to that seen in the PRMTs. Additionally, the solvent-exposed surface area of the bound cofactor in DOT1L is more similar to that seen in the PRMTs than in the PKMTs, as is the overall amino-acid sequence around the cofactor-binding pocket (REFS 24,33). Therefore, from a structural perspective, DOT1L seems to link the PKMT and PRMT groups of PMTs.

The discovery and optimization of selective drugs for the PMTs will depend not only on the static structure of the active site of the enzyme, as revealed through crystallographic studies, but also on the structural dynamics of the active site that accompany catalytic turnover (REFS 27,35). Studies on the kinetic mechanisms of the PMTs may provide some information in this area. Some of the SET domain PKMTs, such as SETD7, perform a single round of catalysis on a lysine residue, resulting in a mono-methylated product, whereas other SET domain PKMTs catalyse multiple rounds of methylation on a specific lysine residue. Crystallographic studies suggest that the difference between single-turnover and multiple-turnover SET domain enzymes results from the degree of steric crowding and hydrogen-bonding patterns in the lysine-binding channel of these enzymes (REFS 3,24,36,37). In particular, the identity of an aromatic residue within the lysine-binding pocket seems to be the key determinant of the multiplicity of lysine methylation. In the PKMT DIM5, this residue is a phenylalanine (F281), and the enzyme can tri-methylate the acceptor lysine residue of its protein substrate. The corresponding residue in SETD7 is a tyrosine (Y305), and this enzyme can only mono-methylate its protein substrate. Remarkably, the mutant F281Y transforms DIM5 into a mono-methylating PKMT, and the corresponding mutant Y305F in SETD7 results in an enzyme that is capable of multiple rounds of lysine methylation (REF. 38). These mutagenesis results have been extended to the PKMT euchromatic histone lysine N-methyltransferase 2 (EHMT2; also known as G9A) (REF. 39), and the tyrosine–phenylalanine switch seems to be a general determinant of product specificity among the SET domain PKMTs (REF. 24). Molecular dynamics and hybrid quantum mechanical–molecular mechanical studies also suggest a key role for bound water molecules (a water channel) in the extent of lysine methylation by PKMTs (REF. 30).

An outstanding question that has yet to be reconciled with the mechanistic hypothesis described above is how the quaternary nitrogen atom is deprotonated to generate a neutral amine methyl acceptor. At physiological pH, the lysine amine is protonated (the negative logarithm of the acid dissociation constant (pKa) of the side-chain amine is ~10.8 (REF. 35)), and so there are no lone pair electrons to act as the attacking nucleophile in the
www.nature.com/reviews/drugdisc

[Figure 3 image. Panel a: PRMT. Panel b: DOT1L. Panel c: SET domain.]

Figure 3 | Variations in the configuration of SAM or SAH bound within the active sites of different PMTs. a | The representative conformation shown for the protein arginine methyltransferases (PRMTs) was taken from the crystal structure of S-adenosyl-L-homocysteine (SAH; also known as AdoHcy) bound to coactivator-associated arginine methyltransferase 1 (CARM1) (REF. 49). b | The conformation shown for DOT1-like, histone H3 methyltransferase (DOT1L) was taken from the crystal structure of S-adenosyl-L-methionine (SAM; also known as AdoMet) bound to this protein (REF. 33). c | The representative conformation shown for the protein lysine methyltransferases (PKMTs) was taken from the crystal structure of SAH bound to SET domain-containing lysine methyltransferase 8 (REF. 50). Carbon atoms are represented by grey circles; nitrogen atoms are represented by blue circles; oxygen atoms are represented by red circles; and sulphur atoms are represented by yellow circles.

General base catalysis


A mechanism that can occur in enzyme catalysis, in which a basic group accepts protons from a substrate molecule, usually to stabilize a charged transition-state species.

SN2-mediated methyl transfer reaction. A potential mechanism of deprotonation is through general base catalysis. However, inspection of the amino acids in the active sites of PKMTs reveals no obvious basic side chains that could act in this capacity. Another hypothesis is that the solvent acts as a proton sponge; however, this seems inconsistent with the fact that the lysine side chain is buried deeply in the protein, with no clear access to bulk solvent. An alternative hypothesis has recently been proposed, based on molecular dynamics simulations (REF. 30). According to this model, binding of SAM and protein substrates creates a water shuttle that can remove a proton from the buried lysine side chain and ferry this proton along a contiguous chain of water molecules to be deposited into the bulk solvent. Additionally, the electrostatic repulsion between the quaternary nitrogen atom and the positively charged SAM cofactor lowers the pKa of the lysine side-chain amine to ~8.2, thereby facilitating this deprotonation process. Furthermore, the water shuttle hypothesis provides an alternative mechanism to explain the differences in extent of lysine methylation by the PKMTs. The molecular dynamics studies suggest that the ability to form a water shuttle will determine the extent of methylation that is catalysed by a given enzyme. For example, simulations of SETD7-mediated catalysis suggest that mono-methylation of lysine prevents re-formation of a new water shuttle, and so this enzyme terminates catalysis after one round of methylation. The same simulations suggest that other PKMTs, such as the ribulose bisphosphate carboxylase/oxygenase large subunit methyltransferase, can readily re-form the water shuttle, leading to multiple rounds of methylation.

Enzymes that perform multiple rounds of catalysis on a macromolecular substrate can do so by one of two mechanisms: a distributive enzyme mechanism, in which each round of catalysis results in macromolecular product dissociation and rebinding, or a processive mechanism, in which multiple rounds of catalysis proceed before dissociation of the macromolecular product. PMTs use both of these mechanisms: some SET domain PKMTs that perform multiple rounds of lysine methylation have been found to use a processive mechanism (REFS 3,24), whereas DOT1L has been shown to perform multiple rounds of H3K79 methylation through a non-processive (distributive) mechanism (REF. 40).

The PRMTs are also capable of performing multiple rounds of arginine methylation to produce either mono- or di-methylated arginine products. The PRMTs that have been studied so far follow an ordered, sequential mechanism in which SAM binds before the arginine-containing substrate, and di-methyl arginine production occurs through a processive mechanism (REF. 3). On the basis of product specificity, PRMTs can be subdivided into two types: type I PRMTs, which produce an asymmetrical N,N-dimethyl arginine (both methyl groups on the same terminal nitrogen); and type II PRMTs, which produce a symmetrical N,N′-dimethyl arginine (one methyl group on each terminal nitrogen) (REF. 3).
The variations in active-site structure and chemical mechanism that are summarized above reflect a target class with the potential for substantial chemical diversity among small-molecule modulators of individual enzymes in the class. Therefore, the opportunity for the development of different chemotypes that compete with the common, natural ligands of these enzymes (for example, SAM, lysine and arginine), and can be modified to produce enzyme-selective inhibitors, seems promising.
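As a rough check on the water-shuttle argument above, the Henderson–Hasselbalch relation estimates what fraction of the lysine side-chain amine is in the neutral, nucleophilic form before and after the reported pKa shift. The sketch below is illustrative only: the pKa values of 10.8 and 8.2 come from the text, whereas the pH of 7.4 and the function name `neutral_amine_fraction` are assumptions made for this example, and treating a buried active-site residue as if it equilibrated with bulk pH is a simplification.

```python
def neutral_amine_fraction(pka: float, ph: float = 7.4) -> float:
    """Fraction of an amine in the neutral (deprotonated) form at a given pH.

    Henderson-Hasselbalch: [B] / ([B] + [BH+]) = 1 / (1 + 10**(pKa - pH)).
    The default pH of 7.4 is an assumed 'physiological' value.
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# pKa values quoted in the text; labels are descriptive only.
cases = (
    ("lysine side chain, unperturbed", 10.8),
    ("lysine side chain, SAM and substrate bound", 8.2),
)

for label, pka in cases:
    fraction = neutral_amine_fraction(pka)
    print(f"{label}: pKa {pka} -> {fraction:.4%} neutral amine")
```

Under these assumptions the neutral fraction rises from roughly 0.04% to roughly 14%, an increase of more than 300-fold, which is consistent with the suggestion that the pKa shift makes deprotonation far more accessible.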

Known inhibitors of PMTs

Despite the convergence of data concerning PMTs, the search for potent, selective inhibitors of these enzymes has only recently begun in earnest. Some indirect approaches to inhibiting or depleting PMTs have been reported. For example, the antiviral compound 3-deazaneplanocin (DZNep) inhibits the enzyme SAH hydrolase and thereby increases intracellular levels of the universal product of PMTs, SAH (REF. 41). Product inhibition by SAH would therefore be expected for all PMTs and other SAM-dependent enzymes, with the degree of inhibition for specific enzymes being related to their relative inhibition constant (Ki) and Michaelis constant (Km) values for
SAH and SAM, respectively (REF. 27). Similarly, the activity of all SAM-dependent enzymes in a cell could be reduced by blocking SAM biosynthesis, for example by inhibiting dihydrofolate reductase or SAM synthase, which are two enzymes involved in SAM biosynthesis (REF. 42). Also, the pan-HDAC inhibitor panobinostat has recently been shown to cause depletion of cellular levels of the PMT EZH2 (REF. 43). Although the mechanism by which this

Table 3 | Chemical structures and biochemical data for small-molecule inhibitors of PMTs

[Structure column: chemical structure drawings, not recoverable from the extracted text.]

Compound | Mechanism and potency | Selectivity* | Refs
SAH | Product of the reactions catalysed by PMTs; IC50 values range from 0.1 to 20 μM | Non-selective | 77,78
Sinefungin | Natural product analogue of SAM and SAH; IC50 values range from 0.1 to 20 μM | Non-selective | 36
Chaetocin | SAM-competitive inhibitor of SUV39; IC50 = 0.6 μM | >4-fold | 79
BIX-01294 | SAM-non-competitive inhibitor of EHMT2; IC50 = 2.7 μM | >4-fold | 80
Methylgene compound 7a of REF. 45 | CARM1 inhibitor; IC50 = 60 nM | >100-fold for PRMT1 and SETD7 | 45
Bristol–Myers Squibb compound 7f of REF. 47 | CARM1 inhibitor; IC50 = 40 nM | >100-fold for PRMT1 and PRMT3 | 46,47

CARM1, coactivator-associated arginine methyltransferase 1 (also known as PRMT4); EHMT2, euchromatic histone lysine N-methyltransferase 2 (also known as G9A and KMT1C); IC50, half-maximal inhibitory concentration; PMT, protein methyltransferase; PRMT, protein arginine methyltransferase; SAH, S-adenosyl-L-homocysteine (also known as AdoHcy); SAM, S-adenosyl-L-methionine (also known as AdoMet); SETD7, SET domain-containing lysine methyltransferase 7 (also known as KMT7); SUV39, suppressor of variegation 39. *Selectivity is given as the ratio of the IC50 value for the most potent inhibition at a non-target PMT over the IC50 value for the primary target. See REF. 27.
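The selectivity definition in the table footnote can be written as a short calculation. The following is a minimal sketch; the function name `selectivity_fold`, the off-target labels and all IC50 values in the usage example are hypothetical and are not data from Table 3.

```python
def selectivity_fold(ic50_primary: float, ic50_off_targets: dict[str, float]) -> float:
    """Ratio of the lowest off-target IC50 to the primary-target IC50.

    All IC50 values must be expressed in the same units.
    """
    if ic50_primary <= 0:
        raise ValueError("primary-target IC50 must be positive")
    if not ic50_off_targets:
        raise ValueError("at least one off-target IC50 is required")
    # The most potent off-target inhibition corresponds to the smallest IC50.
    return min(ic50_off_targets.values()) / ic50_primary


# Hypothetical values in micromolar, for illustration only.
example_off_targets = {"off-target A": 12.0, "off-target B": 30.0}
print(selectivity_fold(0.06, example_off_targets))
```

With these made-up numbers the compound would be described as about 200-fold selective, because the most potent off-target IC50 (12 μM) is about 200 times the primary-target IC50 (0.06 μM).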

Structure–activity relationship
The relationship between the chemical structure of a compound and its pharmacological activity.

occurs is not yet fully understood, an approach of this type would nevertheless deplete the protein levels of EZH2 and so abolish the PMT catalytic activity of the enzyme along with any other non-enzymatic functions of EZH2.

Direct inhibitors of PMTs have recently been reviewed (REF. 4), along with other probes of histone-modifying enzymes. Some natural ligands for these enzymes have been known for some time, including the reaction product, SAH, and a natural inhibitor isolated from Streptomyces spp. cultures, sinefungin (TABLE 3). More selective inhibitors have been identified for SUV39 (chaetocin; reported half-maximal inhibitory concentration (IC50) = 0.6 μM) and for EHMT2 (BIX-01294; reported IC50 = 1.6 μM), but no further optimization of these compounds has been reported to date (REF. 4). A co-crystal structure of BIX-01294 bound to EHMT2 has recently been published (REF. 44). Surprisingly, the compound was found to bind to the enzyme non-competitively with respect to SAM, in a groove that is normally occupied by a portion of the protein substrate.

More recently, two groups have reported potent, selective, pyrazole-based inhibitors of the PRMT CARM1 (REFS 45–47) (TABLE 3). These compounds are the first examples of inhibitors of a specific PMT that are effective at nanomolar concentrations and display >100-fold selectivity for the primary target over related enzymes. The compound series reported by Methylgene (REF. 45) was found to be inactive in cellular assays; no cellular data have been reported for the compound series from Bristol–Myers Squibb (REFS 46,47). Therefore, although an exciting first step has been made towards developing selective inhibitors of PMTs, substantial work remains to be done before these findings can be translated into pharmacologically tractable species.

The paucity of potent, selective, pharmacologically tractable inhibitors of the PMTs creates a crucial therapeutic gap that medicinal chemists should strive to fill. As described here, the pathobiological relevance of these enzymes and the structural and mechanistic information that suggests their druggability as a target class converge to make the PMTs an attractive and important class of novel enzymes for contemporary drug discovery.
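The earlier statement that the degree of product inhibition by SAH depends on each enzyme's Ki for SAH and Km for SAM can be illustrated with the standard rate equation for a competitive inhibitor. This is a sketch under stated assumptions rather than a model of any particular PMT: SAH is treated as purely competitive with SAM, the protein substrate is treated as saturating so that a single-substrate Michaelis–Menten form applies, and the function name `fractional_activity` and every numerical value below are hypothetical.

```python
def fractional_activity(sam: float, sah: float, km_sam: float, ki_sah: float) -> float:
    """Inhibited rate divided by uninhibited rate for a competitive inhibitor.

    Assumes SAH competes only with SAM. All concentrations and constants
    must share the same units.
    """
    if km_sam <= 0 or ki_sah <= 0 or sam <= 0 or sah < 0:
        raise ValueError("constants and [SAM] must be positive; [SAH] must not be negative")
    # v0 = Vmax*S / (Km + S) and vi = Vmax*S / (Km*(1 + I/Ki) + S);
    # Vmax and S cancel in the ratio vi / v0.
    return (km_sam + sam) / (km_sam * (1.0 + sah / ki_sah) + sam)


# Two hypothetical enzymes exposed to the same SAM and SAH concentrations.
for name, km, ki in (("enzyme A", 1.0, 0.5), ("enzyme B", 1.0, 20.0)):
    activity = fractional_activity(sam=10.0, sah=5.0, km_sam=km, ki_sah=ki)
    print(f"{name}: {activity:.3f} of uninhibited activity")
```

In this made-up comparison, enzyme A, which binds SAH tightly relative to its Km for SAM, retains only about half of its activity, whereas enzyme B is almost unaffected. This is the sense in which a global rise in SAH would be expected to inhibit different SAM-dependent enzymes to different extents.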

Conclusions

There is a growing body of evidence that enzymes in this target class have important pathogenic roles in human diseases. The structures and enzymatic mechanisms of the PMTs support the view that pharmacological modulation of these enzymes by small-molecule inhibitors will be an effective means of therapeutic intervention in cancer and numerous other unmet medical needs. The discovery of small-molecule inhibitors of PMTs as starting points for drug development should clearly be a key focus of new research efforts. Beyond this goal, there are many opportunities to use chemical probes of PMT function to define the underlying biology and pathobiology that are associated with protein modification by these enzymes. The nature of PMT catalysis, and the available structural information about these enzymes, should facilitate the discovery of PMT ligands through mechanism- and structure-guided discovery methods (REF. 48), as well as methods that do not rely on mechanistic knowledge, such as high-throughput screening of diverse chemical libraries.

A key remaining question when considering the PMTs as a drug discovery target class is whether or not selective inhibition of particular enzymes can be achieved through targeting the SAM-binding pocket. This is analogous to the question that hindered the early acceptance of protein kinases as drug targets: whether it was possible to achieve selectivity among the ATP-binding pockets of the kinases. In retrospect, it is clear that the diversity of binding-site architecture and the binding-site dynamics associated with enzyme catalysis provide ample opportunities for selective inhibition of kinases through medicinal chemistry efforts. Will the same be true for the SAM-binding pockets of PMTs? Ultimately, structure–activity relationship profiles, selectivity and collateral inhibition of off-target enzymes by PMT inhibitors will need to be determined empirically. Despite these limitations, it is our hope that the data presented here will help to stimulate systematic exploration of the human PMT target class towards the goal of developing selective inhibitors of PMTs as therapeutic agents for human diseases.
1. Strahl, B. D. & Allis, C. D. The language of covalent histone modifications. Nature 403, 41–45 (2000).
2. Kouzarides, T. Chromatin modifications and their function. Cell 128, 693–705 (2007). A thorough overview of post-translational modifications on core histones, the enzymes that mediate these modifications and the biological functions of the modification.
3. Smith, B. C. & Denu, J. M. Chemical mechanisms of histone lysine and arginine modifications. Biochim. Biophys. Acta 1789, 45–57 (2008). An excellent review of the chemical biology of lysine- and arginine-modifying enzymes.
4. Cole, P. A. Chemical probes for histone-modifying enzymes. Nature Chem. Biol. 4, 590–597 (2008).
5. Keppler, B. R. & Archer, T. K. Chromatin-modifying enzymes as therapeutic targets – Part 1. Expert Opin. Ther. Targets 12, 1301–1312 (2008).
6. Pray, L. At the flick of a switch: epigenetic drugs. Chem. Biol. 15, 640–641 (2008).
7. Jones, P. A. & Baylin, S. B. The epigenomics of cancer. Cell 128, 683–692 (2007).
8. Wilson, C. B., Rowell, E. & Sekimata, M. Epigenetic control of T-helper-cell differentiation. Nature Rev. Immunol. 9, 91–105 (2009).
9. Tsankova, N., Renthal, W., Kumar, A. & Nestler, E. J. Epigenetic regulation in psychiatric disorders. Nature Rev. Neurosci. 8, 355–367 (2007).
10. Kleer, C. G. et al. EZH2 is a marker of aggressive breast cancer and promotes neoplastic transformation of breast epithelial cells. Proc. Natl Acad. Sci. USA 100, 11606–11611 (2003).
11. Krivtsov, A. V. et al. H3K79 methylation profiles define murine and human MLL-AF4 leukemias. Cancer Cell 14, 355–368 (2008).
12. Jansson, M. et al. Arginine methylation regulates the p53 response. Nature Cell Biol. 10, 1431–1439 (2008).
13. Hong, H. et al. Aberrant expression of CARM1, a transcriptional coactivator of androgen receptor, in the development of prostate carcinoma and androgen-independent status. Cancer 101, 83–89 (2004).
14. Schneider, R., Bannister, A. J. & Kouzarides, T. Unsafe SETs: histone lysine methyltransferases and cancer. Trends Biochem. Sci. 27, 396–402 (2002).
15. Simon, J. A. & Lange, C. A. Roles of the EZH2 histone methyltransferase in cancer epigenetics. Mutat. Res. 647, 21–29 (2008).
16. Dillon, S. C., Zhang, X., Trievel, R. C. & Cheng, X. The SET-domain protein superfamily: protein lysine methyltransferases. Genome Biol. 6, 227 (2005).
17. Ryu, H. et al. ESET/SETDB1 gene expression and histone H3 (K9) trimethylation in Huntington's disease. Proc. Natl Acad. Sci. USA 103, 19176–19181 (2006).
18. Cheng, D., Cote, J., Shaaban, S. & Bedford, M. T. The arginine methyltransferase CARM1 regulates the coupling of transcription and mRNA processing. Mol. Cell 25, 71–83 (2007).
19. Li, Y. et al. Role of the histone H3 lysine 4 methyltransferase, SET7/9, in the regulation of NF-κB-dependent inflammatory genes. Relevance to diabetes and inflammation. J. Biol. Chem. 283, 26771–26781 (2008).
20. Covic, M. et al. Arginine methyltransferase CARM1 is a promoter-specific regulator of NF-κB-dependent gene expression. EMBO J. 24, 85–96 (2005).
21. Hassa, P. O., Covic, M., Bedford, M. T. & Hottiger, M. O. Protein arginine methyltransferase 1 coactivates NF-κB-dependent gene expression synergistically with CARM1 and PARP1. J. Mol. Biol. 377, 668–678 (2008).
22. Huang, J. et al. Trimethylation of histone H3 lysine 4 by Set1 in the lytic infection of human herpes simplex virus 1. J. Virol. 80, 5740–5746 (2006).
23. Jeong, S. J. et al. Coactivator-associated arginine methyltransferase 1 enhances transcriptional activity of the human T-cell lymphotropic virus type 1 long terminal repeat through direct interaction with Tax. J. Virol. 80, 10036–10044 (2006).
24. Cheng, X., Collins, R. E. & Zhang, X. Structural and sequence motifs of protein (histone) methylation enzymes. Annu. Rev. Biophys. Biomol. Struct. 34, 267–294 (2005).
25. Goldstein, D. M., Gray, N. S. & Zarrinkar, P. P. High-throughput kinase profiling as a platform for drug discovery. Nature Rev. Drug Discov. 7, 391–397 (2008).
26. Mook, R. A. The importance and complexity of target class selectivity in drug discovery. The American Association for Cancer Research Education Book 223–226 (The American Association for Cancer Research, Philadelphia, 2005).
27. Copeland, R. A. Evaluation of Enzyme Inhibitors in Drug Discovery: A Guide for Medicinal Chemists and Pharmacologists (Wiley, Hoboken, 2005).
28. Cheng, D. et al. Small molecule regulators of protein arginine methyltransferases. J. Biol. Chem. 279, 23892–23899 (2004).
29. Allis, C. D. et al. New nomenclature for chromatin-modifying enzymes. Cell 131, 633–636 (2007).
30. Zhang, X. & Bruice, T. C. Enzymatic mechanism and product specificity of SET-domain protein lysine methyltransferases. Proc. Natl Acad. Sci. USA 105, 5728–5732 (2008). This work provides a detailed theoretical basis to explain the substrate specificity of the protein lysine methyltransferases.
31. Fedorov, O. et al. A systematic interaction map of validated kinase inhibitors with Ser/Thr kinases. Proc. Natl Acad. Sci. USA 104, 20523–20528 (2007).
32. Karaman, M. W. et al. A quantitative analysis of kinase inhibitor selectivity. Nature Biotech. 26, 127–132 (2008).
33. Min, J., Feng, Q., Li, Z., Zhang, Y. & Xu, R. M. Structure of the catalytic domain of human DOT1L, a non-SET domain nucleosomal histone methyltransferase. Cell 112, 711–723 (2003).
34. Sawada, K. et al. Structure of the conserved core of the yeast Dot1p, a nucleosomal histone H3 lysine 79 methyltransferase. J. Biol. Chem. 279, 43296–43306 (2004).
35. Copeland, R. A. Enzymes: A Practical Introduction to Structure, Mechanism and Data Analysis 2nd edn (Wiley, Hoboken, 2000).
36. Couture, J. F., Hauk, G., Thompson, M. J., Blackburn, G. M. & Trievel, R. C. Catalytic roles for carbon–oxygen hydrogen bonding in SET domain lysine methyltransferases. J. Biochem. 281, 19280–19287 (2006).
37. Collins, R. E. et al. In vitro and in vivo analyses of a Phe/Tyr switch controlling product specificity of histone lysine methyltransferases. J. Biol. Chem. 280, 5563–5570 (2005). This study provides a structural basis for the wide range of lysine methylation patterns that is achieved by different SET domain PKMTs.
38. Trievel, R. C., Flynn, E. M., Houtz, R. L. & Hurley, J. H. Mechanism of multiple lysine methylation by the SET domain enzyme Rubisco LSMT. Nature Struct. Biol. 10, 545–552 (2003).
39. Zhang, X. et al. Structural basis for the product specificity of histone lysine methyltransferases. Mol. Cell 12, 177–185 (2003).
40. Frederiks, F. et al. Nonprocessive methylation by Dot1 leads to functional redundancy of histone H3K79 methylation states. Nature Struct. Mol. Biol. 15, 550–557 (2008).
41. Chiang, P. K. Biological effects of inhibitors of S-adenosylhomocysteine hydrolase. Pharmacol. Ther. 77, 115–134 (1998).
42. Bender, C. M., Zingg, J.-M. & Jones, P. A. DNA methylation as a target for drug design. Pharm. Res. 15, 175–187 (1998).
43. Fiskus, W. et al. Panobinostat treatment depletes EZH2 and DNMT1 levels and enhances decitabine mediated de-repression of JunB and loss of survival of human acute leukemia cells. Cancer Biol. Ther. 8, 939–950 (2009).
44. Chang, Y. et al. Structural basis for G9a-like protein lysine methyltransferase inhibition by BIX-01294. Nature Struct. Mol. Biol. 16, 312–317 (2009).
45. Allan, M. et al. N-Benzyl-1-heteroaryl-3-(trifluoromethyl)-1H-pyrazole-5-carboxamides as inhibitors of co-activator associated arginine methyltransferase 1 (CARM1). Bioorg. Med. Chem. Lett. 19, 1218–1223 (2009). The first examples of potent, drug-like inhibitors of a human PMT.
46. Purandare, A. V. et al. Pyrazole inhibitors of coactivator associated arginine methyltransferase 1 (CARM1). Bioorg. Med. Chem. Lett. 18, 4438–4441 (2008).
47. Huynh, T. et al. Optimization of pyrazole inhibitors of coactivator associated arginine methyltransferase 1 (CARM1). Bioorg. Med. Chem. Lett. 19, 2924–2927 (2009).
48. Copeland, R. A., Gontarek, R. & Luo, L. in Textbook of Drug Design and Discovery 4th edn Ch. 12 (eds Krogsgaard-Larsen, P., Madsen, U. & Stromgaard, K.) 378–407 (Taylor and Francis, New York, 2009).
49. Troffer-Charlier, N., Cura, V., Hassenboehler, P., Moras, D. & Cavarelli, J. Functional insights from structures of coactivator-associated arginine methyltransferase 1 domains. EMBO J. 26, 4391–4401 (2007).
50. Couture, J.-F., Collazo, E., Brunzelle, J. S. & Trievel, R. C. Structural and functional analysis of SET8, a histone H4 Lys-20 methyltransferase. Genes Dev. 19, 1455–1465 (2005).
51. Ma, W. W. & Adjei, A. A. Novel agents on the horizon for cancer therapy. CA Cancer J. Clin. 59, 111–137 (2009). A review of the current knowledge on how aberrant epigenetic mechanisms can contribute to the development of cancer and the progress in developing therapies that target these mechanisms.
52. Cortez, C. C. & Jones, P. A. Chromatin, cancer and drug therapies. Mutat. Res. 647, 44–51 (2008).
53. Kang, M. Y. et al. Association of the SUV39H1 histone methyltransferase with the DNA methyltransferase 1 at mRNA expression level in primary colorectal cancer. Int. J. Cancer 121, 2192–2197 (2007).
54. Watanabe, H. et al. Deregulation of histone lysine methyltransferases contributes to oncogenic transformation of human bronchoepithelial cells. Cancer Cell Int. 8, 15 (2008).
55. Kondo, Y. et al. Downregulation of histone H3 lysine 9 methyltransferase G9a induces centrosome disruption and chromosome instability in cancer cells. PLoS One 3, e2037 (2008).
56. Tkachuk, D., Kohler, S. & Cleary, M. L. Involvement of a homolog of Drosophila trithorax by 11q23 chromosomal translocations in acute leukemias. Cell 71, 691–700 (1992).
57. Gu, Y. et al. The t(4;11) chromosome translocation of human acute leukemias fuses the ALL-1 gene, related to Drosophila trithorax, to the AF-4 gene. Cell 71, 701–708 (1992).
58. Liedtke, M. & Cleary, M. L. Therapeutic targeting of MLL. Blood 113, 6061–6068 (2009).
59. Wang, G. G., Cai, L., Pasillas, M. P. & Kamps, M. P. NUP98-NSD1 links H3K36 methylation to Hox-A gene activation and leukaemogenesis. Nature Cell Biol. 9, 804–812 (2007).
60. Marango, J. et al. The MMSET protein is a histone methyltransferase with characteristics of a transcriptional corepressor. Blood 111, 3145–3154 (2008).
61. Kim, J. Y. et al. Multiple-myeloma-related WHSC1/MMSET isoform RE-IIBP is a histone methyltransferase with transcriptional repression activity. Mol. Cell Biol. 28, 2023–2034 (2008).
62. Lauring, J. et al. The multiple myeloma associated MMSET gene contributes to cellular adhesion, clonogenic growth, and tumorigenicity. Blood 111, 856–864 (2008).
63. Angrand, P. O. et al. NSD3, a new SET domain-containing gene, maps to 8p12 and is amplified in human breast cancer cell lines. Genomics 74, 79–88 (2001).
64. Rosati, R. et al. NUP98 is fused to the NSD3 gene in acute myeloid leukemia associated with t(8;11)(p11.2;p15). Blood 99, 3857–3860 (2002).
65. Tonon, G. et al. High-resolution genomic profiles of human lung cancer. Proc. Natl Acad. Sci. USA 102, 9625–9630 (2005).
66. Okada, Y. et al. hDOT1L links histone methylation to leukemogenesis. Cell 121, 167–178 (2005).
67. Bitoun, E., Oliver, P. L. & Davies, K. E. The mixed-lineage leukemia fusion partner AF4 stimulates RNA polymerase II transcriptional elongation and mediates coordinated chromatin remodeling. Hum. Mol. Genet. 16, 92–106 (2007).
68. Hamamoto, R. et al. SMYD3 encodes a histone methyltransferase involved in the proliferation of cancer cells. Nature Cell Biol. 6, 731–740 (2004).
69. Hamamoto, R. et al. Enhanced SMYD3 expression is essential for the growth of breast cancer cells. Cancer Sci. 97, 113–118 (2006).
70. Bracken, A. P. et al. EZH2 is downstream of the pRB-E2F pathway, essential for proliferation and amplified in cancer. EMBO J. 22, 5323–5335 (2003).
71. Varambally, S. et al. The polycomb group protein EZH2 is involved in progression of prostate cancer. Nature 419, 624–629 (2002).
72. Subramanian, K. et al. Regulation of estrogen receptor alpha by the SET7 lysine methyltransferase. Mol. Cell 30, 336–347 (2008).
73. Nishikawa, N. et al. Gene amplification and overexpression of PRDM14 in breast cancers. Cancer Res. 67, 9649–9657 (2007).
74. Majumder, S., Liu, Y., Ford, O. H., 3rd, Mohler, J. L. & Whang, Y. E. Involvement of arginine methyltransferase CARM1 in androgen receptor function and prostate cancer cell viability. Prostate 66, 1292–1301 (2006).
75. Frietze, S., Lupien, M., Silver, P. A. & Brown, M. CARM1 regulates estrogen-stimulated breast cancer growth through up-regulation of E2F1. Cancer Res. 68, 301–306 (2008).
76. Zhao, Q. et al. PRMT5-mediated methylation of histone H4R3 recruits DNMT3A, coupling histone and DNA methylation in gene silencing. Nature Struct. Mol. Biol. 16, 304–311 (2009).
77. Patnaik, D. et al. Substrate specificity and kinetic mechanism of mammalian G9a histone H3 methyltransferase. J. Biol. Chem. 279, 53248–53258 (2004).
78. Chin, H. G., Patnaik, D., Esteve, P.-O., Jacobsen, S. E. & Pradhan, S. Catalytic properties and kinetic mechanism of human recombinant lys-9 histone H3 methyltransferase SUV39H1: participation of the chromodomain in enzymatic catalysis. Biochemistry 45, 3272–3284 (2006).
79. Greiner, D., Bonaldi, T., Eskeland, R., Roemer, E. & Imhof, A. Identification of a specific inhibitor of the histone methyltransferase SU(VAR)39. Nature Chem. Biol. 1, 143–145 (2005).
80. Kubicek, S. et al. Reversal of H3K9me2 by a small-molecule inhibitor for the G9a histone methyltransferase. Mol. Cell 25, 473–481 (2007).

Acknowledgements

We are grateful to K. Shiosaki, C. T. Walsh, H. R. Horvitz, Y. Zhang, and R. Gould for their insights, constant support and encouragement. We also thank K. Boater, E. Olhava, L. Jin and T. Luly for expert help in preparation of this manuscript.

Competing interests statement

The authors declare competing financial interests: see web version for details.

DATABASES
UniProtKB: http://www.uniprot.org CARM1 | DOT1L | EHMT2 | EZH2 | PRMT1 | SETD7 | SETD8 | SETD1A | SETDB1 | SUZ12

FURTHER INFORMATION
Authors' homepage: http://www.epizyme.com
ALL LINKS ARE ACTIVE IN THE ONLINE PDF

732 | SEPTEMBER 2009 | VOLUME 8

www.nature.com/reviews/drugdisc


REVIEWS

Protein methyltransferases as a target class for drug discovery

Robert A. Copeland, Michael E. Solomon and Victoria M. Richon

First published in Nature Reviews Drug Discovery 8, 724–732 (2009); doi: 10.1038/nrd2974

Epizyme, Inc., 840 Memorial Drive, Cambridge, Massachusetts 02139, USA. Correspondence to R.A.C. e-mail: RCopeland@epizyme.com

Abstract | The protein methyltransferases (PMTs), which methylate protein lysine and arginine residues and have crucial roles in gene transcription, are emerging as an important group of enzymes that play key parts in normal physiology and human diseases. The collection of human PMTs is a large and diverse group of enzymes that have a common mechanism of catalysis. Here, we review the biological, biochemical and structural data that together present PMTs as a novel, chemically tractable target class for drug discovery.

Cellular differentiation is one of the most important components of embryonic development and postnatal tissue maintenance and repair. Almost every nucleated cell of the human body contains the same, complete complement of genomic DNA. However, the ability of pluripotent cells to differentiate into distinct lineages and ultimate cell types is conferred by specific patterns of transcription of subsets of genes in the genome. A large and growing body of data support the idea that epigenetic regulation of gene transcription is a key biological determinant of cellular differentiation1.

The chromosomes within eukaryotic cell nuclei are packaged together with structural proteins (histones) to form the complex known as chromatin. Four major histones (H2A, H2B, H3 and H4) form an octameric, disc-shaped aggregate composed of two copies of each histone type around which DNA is wound to form regular, repeating units known as nucleosomes (FIG. 1). Chromatin exists in two main conformational states: a condensed state (heterochromatin) in which the nucleosomes are tightly packed together and gene transcription is largely repressed; and a more relaxed state (euchromatin) in which gene transcription is activated. Epigenetic regulation of gene transcription is mediated by selective, enzyme-catalysed, covalent modification of specific nucleotides within the genes and also by post-translational modifications of the histone proteins (FIG. 1). Modification of DNA can silence gene transcription directly, whereas the post-translational modifications of histones control the conformational transition between heterochromatin and euchromatin states2. The enzymes that covalently modify DNA and histones are therefore the key mediators of epigenetic regulation of gene transcription.

Several putative epigenetic enzymes have recently been identified and, in some cases, their catalytic mechanism and three-dimensional structures have been determined2,3.

Epigenetic enzymes that are encoded in the human genome catalyse group transfer reactions and can be categorized according to the nature of the covalent modifications that they catalyse and by the substrates upon which they act. In humans, these enzymes include DNA methyltransferases (DNMTs), which methylate the carbon atom at the 5-position of cytosine in the CpG dinucleotide sites of the genome; protein methyltransferases (PMTs), which methylate lysine or arginine residues on histones and other proteins; protein demethylases, which remove methyl groups from the lysine or arginine residues of proteins; histone acetyltransferases, which acetylate lysine residues on histones and other proteins; histone deacetylases (HDACs), which remove acetyl groups from lysine residues on histones and other proteins; ubiquitin ligases, which add ubiquitin to lysine residues on histones and other proteins; and specific kinases that phosphorylate serine residues on histones4,5.

Given that small-molecule inhibitors have been successfully designed for HDACs and DNMTs (discussed below), it is likely that additional families of histone-modifying enzymes will also be amenable to small-molecule modulation. The opportunity for chemical-probe development and pharmacological control of epigenetic gene transcription is therefore of great interest in the fields of basic biology and drug discovery4,5. Indeed, the role of these enzymes in human diseases is highlighted by the recent approval of three drugs by the US Food and Drug Administration6 that act as selective, small-molecule inhibitors of HDACs and DNMTs for the treatment of specific human cancers (TABLE 1).

In recent years, there have been numerous reviews in the literature that highlight different aspects of the biology, disease association and/or structural biology of various histone-modifying enzymes. In this Review, we focus on the PMTs, and in particular on those aspects that make PMTs attractive targets for drug discovery efforts. We summarize the data that contribute to the validation of PMTs as targets for specific human diseases, as well as the structural and mechanistic data that suggest PMTs are a tractable (that is, druggable) target class.

Epigenetics
A stably heritable change in phenotype or gene expression in an organism or cell, resulting from changes in a chromosome that are not caused by a change in DNA sequence. The process of eukaryotic cell differentiation is one of the most well-known examples of epigenetic changes.

Target class
A group of proteins that are related by a common type of drug-binding pocket, but sufficiently diverse that selective inhibition of specific proteins can be achieved, using medicinal chemical elaboration of the basic chemotype structures.

SAM
S-adenosyl-L-methionine, the universal methyl group donor of all enzymatic methyltransferase reactions.

724 | SEPTEMBER 2009 | Discovery VolUME 8 NATURE REVIEWS | Drug NATURE REPRINT COLLECTION Epigenetics

www.nature.com/reviews/drugdisc VolUME 8 | SEPTEMBER 2009 | 725 S3

REVIEWS
Table 1 | Epigenetic-enzyme inhibitors for cancer therapy focus on the PMTs, and in particular on those aspects generic name Alernative clinical status* names that make PMTs attractive targets for drug discovery efforts. We summarize the data that contribute to DNMT inhibitors the validation of PMTs as targets for specific human 5-azacitidine Vidaza diseases, as Approved in the United States well as the structural and mechanistic data forPMTs myelodysplastic syndrome that suggest are a tractable (that is, druggable) target classApproved . Decitabine Dacogen in United States for
myelodysplastic syndrome

REVIEWS
suppressor of variegation 39 (SUV39) family; the SET1 (also known as Mll) family; the SET2 (also known as PKMTs Acetyltransferases NSD) family; the retinoblastoma protein-interacting 52 family zinc Deacetylases members finger protein (RIZ) (also known as PRDM) family; the ~18 family Me SET and MYND domain-containing (SMYD) family; Demethylases members ~30 family the enhancer of zeste (EZ) family; and the SUV420 K Ac family, known as Me family. An eighth othersmembers , included the PRMTs K SETD8 enzymes SETD7 and (also known as PRSET7). R 10 family P Kinases DoT1l, Finally, the human, non-SET domain PKMT members S can be considered Ub a ninth family of PKMTs. Ligases our group has recently extended this work to systemK atically identify all of the SET domains that are encoded by the human genome. This study has more than doubled the number of putative human PKMTs to 52 (51 SET domain proteins plus DoT1l) (l.F. Jerva, K.o. Elliston, V.M.R., M.E.S. and R.A.C., unpublished observations). From the perspective of drug discovery, the salient point Figure 1 | A nucleosome the post-translational is that these enzymes areand numerous in humans. histone protein modifications that can influence The PRMTs are similarly well represented in humans. Nature Reviews | Drug Discovery epigenetic gene transcription. There are atregulation least eight of human PRMTs for which some Modifications of the histone protein tail are shown: level of methyltransferase activity has been shown. These changes in acetylation (Ac) by acetyltransferases and proteins have phosphorylation a canonical sequence domainubiquitylation that is assocideacetylases, (P) by kinases, ated with the binding sites for the cofactor and substrate (Ub) by ligases and changes in methylation (Me) by (arginine), although the sequence conservation among methyltransferases and demethylases. The enzyme families these proteins is low. 
the total number of that are responsible forEstimates the variousof post-translational modifications areencoded shown. PKMT, protein lysine PRMTs that are by the human genome vary, methyltransferase; PRMT, protein of arginine methyltransferase. depending on the method sequence alignment and the level of alignment stringency that is applied. Nevertheless, it is clear that 1050 of these enzymes are represented in humans. arginine methyltransferase 1 (CARM1; also known as The human PMTs are thus a large class of enzymes, PRMT4)18 have been implicated in the neurodegeneraand several of them already have well established distive diseases Huntingtons disease and spinal muscular ease association (discussed above). Furthermore, owing atrophy, respectively. SET domain-containing lysine to the common features of their chemical mechanism methyltransferase 7 (SETD7; also known as KMT7)19, of catalysis (discussed below), the PMTs are likely to be CARM1 (REF. 20) and PRMT1 (REF. 21) have been inherently as targets for small-molecule drug associatedtractable with nuclear factor-B-related inflammaintervention. The PMT target class, as defined here, tory diseases, and SET domain-containing protein 1A therefore provides an important pool of potential targets 22 and CARM1 (REF. 23) have been associated (SETD1A) for drug discovery efforts. with viral infections involving Herpes simplex virus and

PKMTs and PRMTs in human disease HDAC inhibitors In surveying the histone-modifying enzymes of the Vorinostat Zolinza Approved in United States for cutaneous human genome, the enzymes that catalyse methylation manifestation in cutaneous T cell lymphoma of lysine residues (protein lysine methyltransferases Romidepsin FK228 New drug application filing (PKMTs)) and arginine residues (protein arginine Panobinostat LBH-589 Phase II (PRMTs)) are of substantial interest methyltransferases from the perspective of drug discovery and medicinal Belinostat PXD-101 Phase II chemistry. The action of these enzymes is crucial in Entinostat MS-275 Phase II controlling gene regulation, and there is an increasing SNDX-275 amount of biochemical and biological data to suggest MGCD-0103 MG-0103 Phase II that the enzymatic activities of several of these proteins JNJ-26481585 None have pathogenic Phase Iroles in cancer, inflammatory diseases, Givinostat ITF2357 neurodegenerative Phase II diseases and other conditions of 715 . cancer therapies, including those described importance *See REFS 51,52 for comprehensive reviews of novel For example, with the exception of DoT1-like, histone in the table. DNMT, DNA methyltransferase; HDAC, histone deacetylase. H3 methyltransferase (DoT1l; also known as KMT4), all human PKMTs contain a ~130 amino-acid domain, to(discussed as the SET domain, which constitutes bereferred stringent below). Clearly, the structure the of 1416 . Enhancer of catalytic domain of these 16,24 enzymes each enzyme is unique , as is the resulting biology zeste homologue 2 (EZH2; also known as KMT6) is a and pathobiology associated with each enzyme. Yet, the SET domain protein that forms the catalytic subunit of shared chemical mechanisms of the PKMTs and PRMTs the 45- protein core of polycomb repressive complex 2 allows for certain efficiencies and economies in the dis(PRC2). 
PRC2 is a PKMT that catalyses the methylacovery of selective drugs for these enzymes, by treating tion of lysine 27 of histone H3 (in the nomenclature of them as a target class2527. Several of these enzymes histone modification, this site is referred to as H3K27). have been found to catalyse methyl transfer to lysine Although EZH2 contains the catalytic active site, all of or arginine residues on a number of cellular proteins; the proteins of the PRC2 complex are required for full this is especially the case for the PRMTs, for which sevPKMT activity. overexpression of EZH2 or another 24,28 . With eral cytosolic substrates have been identified PRC2 subunit, suppressor of zeste 12 homologue respect to gene however, the most important (SUZ12), hasregulation, been associated with numerous human targets for both PKMTs and PRMTs are to be the cancer types, including prostate, breast,likely bladder, colon, histones, as post-translational modification of these proskin, liver, endometrial, lung and gastric cancers, as well teins is clearly a determinant of 15 chromatin . In breastremodelling carcinomas, as lymphomas and myelomas and therefore regulation of have gene been transcription. increased levels of EZH2 shown to correlate with increased invasiveness and proliferation rate; it has Representation of PMTs the be human genome been suggested that EZH2in could a prognostic indica10 The target class is represented in many species, and . In cell culture, torPMT of patient outcome for breast cancer the human genome and PRMTs. overexpression of encodes EZH2 inseveral breast PKMTs epithelial cells causes Attempts to quantify the representation PKMTs in a anchorage-independent cell growth of and increased particular organism, and to understand the relatedinvasiveness. 
Additionally, when EZH2-overexpressing Target class ness ofwere these proteins to the onemammary another, have focused on cells injected into fat pads of nude A group of proteins that are the sequence alignment of the SET domain because, the as mice, the animals developed tumours, demonstrating related by a common type discussed above, this domain is common to all PKMTs tumorigenicity of EZH2 overexpression. Importantly, the of drug-binding pocket, (except DoT1l). but sufficiently diverse that phenotypic effects of EZH2 overexpression are correlated selective inhibition of specific Several attempts have been made to systematically with increased H3K27 methylation and are dependent on proteins can be achieved, group the PKMTs on the basis of sequence homology and the presence of an intact SET domain, both of which imply using medicinal chemical 14,29 10,15 . For example, in 2007, a nomenclature consubstrate . a role for EZH2 enzymatic activity in pathogenesis elaboration of the basic vention was proposed the PKMTs, alongare with other Several other humanfor PKMTs and PRMTs strongly SAH chemotype structures. 29 S-adenosyl-l-homocysteine, 2. associated with human cancers,enzymes as summarized in TABLE . In this study, types of chromatin-modifying theSAM universal product of all Similarly, there is compelling evidence thatSET other PKMTs 24 human PKMTs were identified. These domain S-adenosyll-methionine, enzymatic methyltransferase and PRMTs have divided a pathogenic role infamilies serious on human PKMTs have been into related the the universal methyl reactions, formed by group example, SET domain, diseases other than cancer 79. For basis of sequence alignment; initially four, and later donor of alltransfer enzymatic methyl group from 17 and coactivator-associated bifurcated 1 (SETDB1) methyltransferase reactions. l-methionine. S-adenosylseven, major families were defined in this manner14,16: the
REVIEWS | Drug Discovery 726 | SEPTEMBER 2009 | VolUME 8 S4 NATURE

human T lymphotrophic virus, respectively. PKMTs and The PMT active siteemerging as compelling targets for PRMTs are therefore The pursuit of the PMTs 4,5as a drug target class is facilitated . drug discovery efforts by a rich literature base of crystallographic and enzyme kinetic studies of these class enzymes that have helped to PMTs as a drug target define mechanisms of catalysis. All of these enzymes From their a chemical biology and medicinal chemistry probably use athe common bimolecular nucleophilic subperspective, PKMTs and PRMTs are of interest 3,24,30 2) methyl transfer mechanism . The lone stitution (S because they have a common mechanism of catalysis N pair electrons of a nitrogen atom (from lysine or arginine) (discussed below), involving a small, organic cofactor. As attacks the electrophilic methylsulphonium cation other druggable classes of enzymes, such as the protein of SAM at a 180 to the feature, leaving it group, tothat form a kinases, share thisangle mechanistic is likely the penta-coordinate carbon transition state. The transition PMTs will be similarly amenable to inhibition by small, state structure then collapses, with methyl group reloorganic molecules. cation to the nitrogen atom catalyse of the lysine or transfer arginine side The PKMTs and PRMTs methyl from chain formation of S -adenosyl-l-homocysteine (SAH; their and universal methyl donor, S-adenosyl-l-methionine (FIG. 2) (FIG. as a 2) product. also known as AdoHcy) also known as AdoMet) , to a nitrogen atom (SAM; 3 use a naturally occurring adenosyl analogue . 
Protein substrate speof The lysine or of arginine side chains cificity can be stringent in thesedonor enzymes; some PKMTs as the universal group transfer by PMTs is remiseem toof selectively a another particular lysine residue niscent protein methylate kinases large family of on a specific histone, and the extent of methylation on druggable enzyme targets, the ATP-binding pockets a single lysine (that is, mono-, or tri-methylof which haveresidue proved to be highly ditractable targets 31,32 by a particular enzyme can also ation) that is catalysed . Furthermore, despite binding a for drug discovery
VolUME 8 |COLLECTION SEPTEMBEREpigenetics 2009 | 725 www.nature.com/reviews/drugdisc NATURE REPRINT

REVIEWS
focus on the PMTs, and in particular on those aspects that make PMTs targets for drug discovery Table 1 | Epigenetic-enzyme inhibitors for attractive cancer therapy efforts. We summarize the data that contribute to generic name Alernative clinical status* the validation of PMTs as targets for specific human names diseases, as well as the structural and mechanistic data DNMT inhibitors that suggest PMTs are a tractable (that is, druggable) 5-azacitidine Vidaza target class.Approved in the United States
for myelodysplastic syndrome Decitabine
Acetyltransferases

REVIEWS
52 family suppressor of variegation 39 (SUV39) family; the SET1 Deacetylases members (also known as Mll) family; the SET2 (also known as ~18 family Me Demethylases NSD) family; the retinoblastoma protein-interacting zinc members ~30 family finger protein (RIZ) (also known as PRDM) family; the K members Ac Me SET and MYND domain-containing (SMYD) family; K (EZ) family; the enhancer of zeste and the PRMTs SUV420 R 10 family P Kinases family. An eighth family, known as others, included the members S enzymes SETD7 and SETD8 (also known as PRSET7). Ub Ligases Finally, DoT1l, the human, non-SET domain PKMT K can be considered a ninth family of PKMTs. our group has recently extended this work to systematically identify all of the SET domains that are encoded by the human genome. This study has more than doubled the number of putative human PKMTs to 52 (51 SET domain proteins plus DoT1l) (l.F. Jerva, K.o. Elliston, V.M.R., and R.A.C., unpublished observations). FigureM.E.S. 1 | A nucleosome and the post-translational From the perspective of drug discovery, the salient point histone protein modifications that can influence is epigenetic that these enzymes are numerous in humans. Nature Reviews | Drug Discovery regulation of gene transcription. The PRMTs are similarly well represented humans. Modifications of the histone protein tail arein shown: changes in least acetylation (Ac) by acetyltransferases There are at eight human PRMTs for whichand some deacetylases, phosphorylation (P) by kinases, ubiquitylation level of methyltransferase activity has been shown. These (Ub) byhave ligases and changes in methylation (Me) by proteins a canonical sequence domain that is associmethyltransferases and demethylases. The and enzyme families ated with the binding sites for the cofactor substrate that are responsible for the various post-translational (arginine), although the sequence conservation among modifications are shown. PKMT,of protein lysine these proteins is low. 
Estimates the total number of methyltransferase; PRMT, protein arginine methyltransferase. PRMTs that are encoded by the human genome vary, depending on the method of sequence alignment and the level of alignment stringency that is applied. Nevertheless, it is clear that 1050 of these enzymes are arginine methyltransferase 1 (CARM1; also known as represented in humans. PRMT4)18 have been implicated in the neurodegeneraThe human PMTs are thus a large class of enzymes, tive diseases Huntingtons disease and spinal muscular and several of them already have well established disatrophy, respectively. SET domain-containing lysine ease association (discussed above). Furthermore, owing 19 methyltransferase 7 (SETD7; also known as KMT7) , to the common features of their chemical mechanism CARM1 (REF. 20) and PRMT1 (REF. 21) have been of catalysis (discussed below), the PMTs are likely to be associated with nuclear factor-B-related inflammainherently tractable as targets for small-molecule drug tory diseases, and SETtarget domain-containing protein intervention. The PMT class, as defined here, 1A 22 and CARM1 (REF. 23) have been associated (SETD1A) therefore provides an important pool of potential targets with viral infections involving Herpes simplex virus and for drug discovery efforts. human T lymphotrophic virus, respectively. PKMTs and PRMTs therefore as compelling targets for The PMT are active site emerging 4,5 . drug target class is facilitated drug discovery efforts The pursuit of the PMTs as a PKMTs

human genome, the enzymes that catalyse methylation (protein lysine methyltransferases Vorinostat Zolinzaof lysine residues Approved in United States for cutaneous (PKMTs))manifestation and arginine residues T (protein arginine in cutaneous cell lymphoma (PRMTs)) are of substantial interest Romidepsin FK228 methyltransferases New drug application filing from the perspective of drug discovery and medicinal Panobinostat LBH-589 Phase II chemistry. The action of these enzymes is crucial in Belinostat PXD-101 controllingPhase gene II regulation, and there is an increasing amount of biochemical and biological data to suggest Entinostat MS-275 Phase II SNDX-275 that the enzymatic activities of several of these proteins have pathogenic roles in cancer, inflammatory diseases, MGCD-0103 MG-0103 Phase II neurodegenerative JNJ-26481585 None Phase I diseases and other conditions of importance715. Givinostat ITF2357 Phase II For example, with the exception of DoT1-like, histone *See REFS 51,52 for comprehensive reviews of novel cancer (DoT1l; therapies, including those described H3 methyltransferase also known as KMT4), in the table. DNMT, DNA methyltransferase; HDAC, histone deacetylase. all human PKMTs contain a ~130 amino-acid domain, referred to as the SET domain, which constitutes the catalytic domain of these enzymes1416. Enhancer of be stringent (discussed below). Clearly, the structure zeste homologue 2 (EZH2; also known as KMT6) of is a each enzyme is unique16,24, as is the resulting biology SET domain protein that forms the catalytic subunit of and pathobiology associated with each enzyme. Yet, the the 45- protein core of polycomb repressive complex 2 shared chemical mechanisms of the PKMTs and PRMTs (PRC2). 
NATURE REVIEWS | Drug Discovery | VOLUME 8 | SEPTEMBER 2009 | NATURE REPRINT COLLECTION Epigenetics


Table 1 | Epigenetic-enzyme inhibitors for cancer therapy

Generic name | Alternative names | Clinical status*
DNMT inhibitors
5-azacitidine | Vidaza | Approved in the United States for myelodysplastic syndrome
Decitabine | Dacogen | Approved in the United States for myelodysplastic syndrome
HDAC inhibitors
Vorinostat | Zolinza | Approved in the United States for cutaneous manifestation in cutaneous T cell lymphoma
Romidepsin | FK228 | New drug application filing
Panobinostat | LBH-589 | Phase II
Belinostat | PXD-101 | Phase II
Entinostat | MS-275, SNDX-275 | Phase II
MGCD-0103 | MG-0103 | Phase II
JNJ-26481585 | None | Phase I
Givinostat | ITF2357 | Phase II

*See REFS 51,52 for comprehensive reviews of novel cancer therapies, including those described in the table. DNMT, DNA methyltransferase; HDAC, histone deacetylase.

focus on the PMTs, and in particular on those aspects that make PMTs attractive targets for drug discovery efforts. We summarize the data that contribute to the validation of PMTs as targets for specific human diseases, as well as the structural and mechanistic data that suggest PMTs are a tractable (that is, druggable) target class.

Figure 1 | A nucleosome and the post-translational histone protein modifications that can influence epigenetic regulation of gene transcription. Modifications of the histone protein tail are shown: changes in acetylation (Ac) by acetyltransferases and deacetylases, phosphorylation (P) by kinases, ubiquitylation (Ub) by ligases and changes in methylation (Me) by methyltransferases and demethylases. The enzyme families that are responsible for the various post-translational modifications are shown. PKMT, protein lysine methyltransferase; PRMT, protein arginine methyltransferase.

PKMTs and PRMTs in human disease
In surveying the histone-modifying enzymes of the human genome, PMTs, the enzymes that catalyse methylation of lysine residues (protein lysine methyltransferases (PKMTs)) and arginine residues (protein arginine methyltransferases (PRMTs)), are of substantial interest from the perspective of drug discovery and medicinal chemistry. The action of these enzymes is crucial in controlling gene regulation, and there is an increasing amount of biochemical and biological data to suggest that the enzymatic activities of several of these proteins have pathogenic roles in cancer, inflammatory diseases, neurodegenerative diseases and other conditions of importance7–15.

For example, with the exception of DOT1-like, histone H3 methyltransferase (DOT1L; also known as KMT4), all human PKMTs contain a ~130 amino-acid domain, referred to as the SET domain, which constitutes the catalytic domain of these enzymes14–16. Enhancer of zeste homologue 2 (EZH2; also known as KMT6) is a SET domain protein that forms the catalytic subunit of the 4–5-protein core of polycomb repressive complex 2 (PRC2). PRC2 is a PKMT that catalyses the methylation of lysine 27 of histone H3 (in the nomenclature of histone modification, this site is referred to as H3K27). Although EZH2 contains the catalytic active site, all of the proteins of the PRC2 complex are required for full PKMT activity. Overexpression of EZH2 or another PRC2 subunit, suppressor of zeste 12 homologue (SUZ12), has been associated with numerous human cancer types, including prostate, breast, bladder, colon, skin, liver, endometrial, lung and gastric cancers, as well as lymphomas and myelomas15. In breast carcinomas, increased levels of EZH2 have been shown to correlate with increased invasiveness and proliferation rate; it has been suggested that EZH2 could be a prognostic indicator of patient outcome for breast cancer10. In cell culture, overexpression of EZH2 in breast epithelial cells causes anchorage-independent cell growth and increased invasiveness. Additionally, when EZH2-overexpressing cells were injected into the mammary fat pads of nude mice, the animals developed tumours, demonstrating the tumorigenicity of EZH2 overexpression. Importantly, the phenotypic effects of EZH2 overexpression are correlated with increased H3K27 methylation and are dependent on the presence of an intact SET domain, both of which imply a role for EZH2 enzymatic activity in pathogenesis10,15.

Several other human PKMTs and PRMTs are strongly associated with human cancers, as summarized in TABLE 2. Similarly, there is compelling evidence that other PKMTs and PRMTs have a pathogenic role in serious human diseases other than cancer7–9. For example, SET domain, bifurcated 1 (SETDB1)17 and coactivator-associated arginine methyltransferase 1 (CARM1; also known as PRMT4)18 have been implicated in the neurodegenerative diseases Huntington's disease and spinal muscular atrophy, respectively. SET domain-containing lysine methyltransferase 7 (SETD7; also known as KMT7)19, CARM1 (REF. 20) and PRMT1 (REF. 21) have been associated with nuclear factor-κB-related inflammatory diseases, and SET domain-containing protein 1A (SETD1A)22 and CARM1 (REF. 23) have been associated with viral infections involving Herpes simplex virus and human T lymphotrophic virus, respectively. PKMTs and PRMTs are therefore emerging as compelling targets for drug discovery efforts4,5.

Target class: A group of proteins that are related by a common type of drug-binding pocket, but sufficiently diverse that selective inhibition of specific proteins can be achieved, using medicinal chemical elaboration of the basic chemotype structures.
SAM: S-adenosyl-L-methionine, the universal methyl group donor of all enzymatic methyltransferase reactions.
SAH: S-adenosyl-L-homocysteine, the universal product of all enzymatic methyltransferase reactions, formed by methyl group transfer from S-adenosyl-L-methionine.

PMTs as a drug target class
From a chemical biology and medicinal chemistry perspective, the PKMTs and PRMTs are of interest because they have a common mechanism of catalysis (discussed below), involving a small, organic cofactor. As other druggable classes of enzymes, such as the protein kinases, share this mechanistic feature, it is likely that the PMTs will be similarly amenable to inhibition by small, organic molecules.

The PKMTs and PRMTs catalyse methyl transfer from their universal methyl donor, S-adenosyl-L-methionine (SAM; also known as AdoMet) (FIG. 2), to a nitrogen atom of lysine or arginine side chains3. Protein substrate specificity can be stringent in these enzymes; some PKMTs seem to selectively methylate a particular lysine residue on a specific histone, and the extent of methylation on a single lysine residue (that is, mono-, di- or tri-methylation) that is catalysed by a particular enzyme can also be stringent (discussed below). Clearly, the structure of each enzyme is unique16,24, as is the resulting biology and pathobiology associated with each enzyme. Yet, the shared chemical mechanisms of the PKMTs and PRMTs allow for certain efficiencies and economies in the discovery of selective drugs for these enzymes, by treating them as a target class25–27. Several of these enzymes have been found to catalyse methyl transfer to lysine or arginine residues on a number of cellular proteins; this is especially the case for the PRMTs, for which several cytosolic substrates have been identified24,28. With respect to gene regulation, however, the most important targets for both PKMTs and PRMTs are likely to be the histones, as post-translational modification of these proteins is clearly a determinant of chromatin remodelling and therefore regulation of gene transcription.

Representation of PMTs in the human genome
The PMT target class is represented in many species, and the human genome encodes several PKMTs and PRMTs. Attempts to quantify the representation of PKMTs in a particular organism, and to understand the relatedness of these proteins to one another, have focused on the sequence alignment of the SET domain because, as discussed above, this domain is common to all PKMTs (except DOT1L).

Several attempts have been made to systematically group the PKMTs on the basis of sequence homology and substrate14,29. For example, in 2007, a nomenclature convention was proposed for the PKMTs, along with other types of chromatin-modifying enzymes29. In this study, 24 human PKMTs were identified. These SET domain PKMTs have been divided into related families on the basis of sequence alignment; initially four, and later seven, major families were defined in this manner14,16: the suppressor of variegation 3–9 (SUV39) family; the SET1 (also known as MLL) family; the SET2 (also known as NSD) family; the retinoblastoma protein-interacting zinc finger protein (RIZ) (also known as PRDM) family; the SET and MYND domain-containing (SMYD) family; the enhancer of zeste (EZ) family; and the SUV4–20 family. An eighth family, known as 'others', included the enzymes SETD7 and SETD8 (also known as PRSET7). Finally, DOT1L, the human, non-SET domain PKMT, can be considered a ninth family of PKMTs.

Our group has recently extended this work to systematically identify all of the SET domains that are encoded by the human genome. This study has more than doubled the number of putative human PKMTs to 52 (51 SET domain proteins plus DOT1L) (L.F. Jerva, K.O. Elliston, V.M.R., M.E.S. and R.A.C., unpublished observations). From the perspective of drug discovery, the salient point is that these enzymes are numerous in humans.

The PRMTs are similarly well represented in humans. There are at least eight human PRMTs for which some level of methyltransferase activity has been shown. These proteins have a canonical sequence domain that is associated with the binding sites for the cofactor and substrate (arginine), although the sequence conservation among these proteins is low. Estimates of the total number of PRMTs that are encoded by the human genome vary, depending on the method of sequence alignment and the level of alignment stringency that is applied. Nevertheless, it is clear that 10–50 of these enzymes are represented in humans.

The human PMTs are thus a large class of enzymes, and several of them already have well established disease association (discussed above). Furthermore, owing to the common features of their chemical mechanism of catalysis (discussed below), the PMTs are likely to be inherently tractable as targets for small-molecule drug intervention. The PMT target class, as defined here, therefore provides an important pool of potential targets for drug discovery efforts.

Table 2 | Selected PKMTs and PRMTs that have shown an association with human cancers

Protein methyltransferase | Methylation substrates | Cancers | Cancer association | Refs
SUV39H1 | H3K9 | Colon cancer | Increased expression in colorectal tumours; associated with transcriptional repression | 53
EHMT2 | H3K9 | Lung, prostate and hepatocellular carcinoma | Increased expression in lung cancer cell lines; regulates centrosome duplication, presumably through chromatin structure | 54,55
MLL | H3K4 | Leukaemia | Chromosomal aberrations involving MLL are a cause of acute leukaemias; the SET domain is lost in translocation | 56–58
NSD1 | H3K36 | Acute myeloid leukaemia | Translocation fuses NSD1 to nucleoporin 98 in human acute myeloid leukaemia | 59
WHSC1 | H3K36 and H4K20 | Myeloma | Translocated and increased expression in myeloma; associated with transcriptional regulation | 60–62
WHSC1L1 | H3K4 | Lung and breast cancers, and childhood acute myeloid leukaemia | Amplified in lung cancer and breast cancer; translocation with nucleoporin 98; mediates transcriptional activation | 63–64
DOT1L | H3K79 | MLL-rearranged leukaemias | Recruited by MLL fusion partners MLLT1, MLLT2, MLLT3 and MLLT10 to homeobox genes; associated with transcriptional activation and elongation | 11,66,67
SMYD3 | H3K4 | Breast, liver, colon and gastric cancers | Overexpressed in multiple tumour types; associated with transcriptional activation | 68,69
EZH2 | H3K27 | Breast, prostate, colon, gastric, bladder and liver cancers, melanoma and lymphoma | Amplified and increased expression in several tumour types; member of the repressive polycomb complex 2; associated with transcriptional repression | 10,15,70,71
SETD7 | H3K4 | Breast cancers | SET7-mediated methylation stabilizes the oestrogen receptor and is necessary for the recruitment of the oestrogen receptor to its target genes and target gene transactivation | 72
PRDM14 | No known substrate | Breast cancers | Amplified and overexpressed in cancers; associated with transcriptional repression | 73
CARM1 | H3R17, EP300–CBP and NCOA3 | Breast and prostate cancers | Increased expression correlates with androgen independence in human prostate carcinoma; overexpressed in breast tumours and associated with transcriptional activation | 74,75
PRMT5 | H3R8, p53, SNRPD1, SNRPD3 and SUPT5H | Lymphoma | PRMT5 expression and H3R8 methylation levels are increased in lymphoid cancer cells; PRMT5 mediates p53 methylation, which promotes cell arrest rather than cell death; H4R3 methylation promotes recruitment of DNMT3A, subsequent promoter CpG methylation and gene silencing | 12,76

CARM1, coactivator-associated arginine methyltransferase 1 (also known as PRMT4); CBP, CREB-binding protein; DNMT3A, DNA (cytosine-5-)-methyltransferase 3α; DOT1L, DOT1-like, histone H3 methyltransferase (also known as KMT4); EHMT2, euchromatic histone-lysine N-methyltransferase 2 (also known as G9A and KMT1C); EP300, E1A-binding protein p300; EZH2, enhancer of zeste homologue 2 (also known as KMT6); MLL, myeloid, lymphoid or mixed-lineage leukaemia (also known as KMT2A); MLLT1, myeloid, lymphoid or mixed-lineage leukemia, translocated to 1; NCOA3, nuclear receptor coactivator 3; NSD1, nuclear receptor-binding SET domain protein 1; PKMT, protein lysine methyltransferase; PRDM14, PR domain-containing protein 14; PRMT, protein arginine methyltransferase; SETD7, SET domain-containing lysine methyltransferase 7 (also known as KMT7); SMYD3, SET and MYND domain-containing protein 3; SNRPD1, small nuclear ribonucleoprotein D1 polypeptide 16kDa (also known as SMD1); SNRPD3, small nuclear ribonucleoprotein D3 polypeptide 18kDa (also known as SMD3); SUPT5H, suppressor of Ty 5 homologue; SUV39H1, suppressor of variegation 3–9 homologue 1 (also known as KMT1A); WHSC1, Wolf–Hirschhorn syndrome candidate 1 (also known as MMSET and NSD2); WHSC1L1, Wolf–Hirschhorn syndrome candidate 1-like protein 1 (also known as NSD3).

The PMT active site
The pursuit of the PMTs as a drug target class is facilitated by a rich literature base of crystallographic and enzyme kinetic studies of these enzymes that have helped to define their mechanisms of catalysis. All of these enzymes probably use a common bimolecular nucleophilic substitution (SN2) methyl transfer mechanism3,24,30. The lone pair electrons of a nitrogen atom (from lysine or arginine) attack the electrophilic methylsulphonium cation of SAM at a 180° angle to the leaving group, to form a penta-coordinate carbon transition state. The transition state structure then collapses, with methyl group relocation to the nitrogen atom of the lysine or arginine side chain and formation of S-adenosyl-L-homocysteine (SAH; also known as AdoHcy) (FIG. 2) as a product.

Figure 2 | PMT-catalysed methylation of proteins by an SN2 reaction with SAM as the methyl donor. The protein methyltransferases (PMTs) catalyse methyl transfer from their universal methyl donor, S-adenosyl-L-methionine (SAM; also known as AdoMet), to a nitrogen atom of lysine or arginine side chains to form S-adenosyl-L-homocysteine (SAH; also known as AdoHcy). a | The methyl group (shown in red) of the SAM sulphonium cation is attacked by the lone pair electrons of a lysine (shown here) or arginine (not shown) side-chain nitrogen atom. The reaction results in transfer of the methyl group to the attacking nitrogen atom and the production of SAH from the reaction cofactor. b | A more generalized chemical scheme of a bimolecular nucleophilic substitution (SN2) group transfer reaction, illustrating the attacking nucleophile (Nu; lysine or arginine in the case of PMTs), the leaving group (LG; the methyl group in the case of PMTs), and the transient but essential formation of a penta-coordinate carbon transition state.

The use of a naturally occurring adenosyl analogue as the universal group transfer donor by PMTs is reminiscent of protein kinases, another large family of druggable enzyme targets, the ATP-binding pockets of which have proved to be highly tractable targets for drug discovery31,32. Furthermore, despite binding a common natural ligand, the ATP-binding pockets of protein kinases have afforded medicinal chemists a rich diversity of chemical scaffolds, which have resulted in a range of drug molecules of varying degrees of target selectivity32. Similarly, the commonality of SAM use by the PMTs belies the structural, biological and pathological diversity of these enzymes. From the perspective of drug discovery and medicinal chemistry, the diversity of SAM-binding modes and catalytic mechanisms of these enzymes is of key importance.

A common structural feature of PKMTs and PRMTs that distinguishes these enzymes from other proteins that use SAM is the overall architecture of their extended catalytic active sites. This generally consists of a SAM-binding pocket that is accessed from one face of the protein, and a narrow, hydrophobic, acceptor (that is, lysine or arginine) channel that extends to the opposite face of the protein surface, such that the two substrates enter the active site from opposite sides of the enzyme surface.

Crystallographic studies have revealed two distinct binding modes for SAM or SAH in the cofactor-binding pockets of PMTs24. For the SET domain PKMTs that have been co-crystallized with SAM or SAH, it is known that the cofactor adopts a U-shaped configuration within the active site (FIG. 3) that aligns the methylsulphonium cation of SAM at the base of the narrow lysine channel, in perfect juxtaposition to the ε-amino group of the acceptor lysine residue, which facilitates group transfer. This U-shaped configuration is induced by a conserved aspartate or glutamate residue that binds to the ribose hydroxyl groups, and a positively charged lysine or arginine residue that forms a salt bridge with the carboxylate of SAM. In striking contrast to the U-shaped configuration that is adopted by the cofactor when bound to PKMTs, SAM bound within the active site of PRMTs adopts an extended configuration that resembles the extended SAM configuration seen in the DNA methyltransferases; again, the binding motif results in alignment of the SAM methylsulphonium cation with the base of the acceptor-binding channel. Another distinction between cofactor binding within the PKMTs and the PRMTs is that, in PRMTs, dimer formation seems to be a crucial component of SAM binding and catalysis, whereas this is not the case for PKMTs3,24. The mechanistic consequences of obligate dimer formation in the PRMTs are not yet clear, but it may be involved in multiple methylations of the arginine residue.

From the above discussion, it could be concluded that the configuration of the bound SAM is structurally related to the identity of the methyl acceptor nitrogen species upon which the enzymes act; that is, U-shaped for PKMTs and extended for PRMTs. However, data on the non-SET domain PKMT, DOT1L, do not support this conclusion. In the co-crystal structures of human DOT1L bound to SAM33, and the yeast homologue DOT1P bound to SAH34, the cofactor is bound in the extended configuration, similar to that seen in the PRMTs. Additionally, the solvent-exposed surface area of the bound cofactor in DOT1L is more similar to that seen in the PRMTs than the PKMTs, as is the overall amino-acid sequence around the cofactor-binding pocket24,33. Therefore, from a structural perspective, DOT1L seems to link the PKMT and PRMT groups of PMTs.

The discovery and optimization of selective drugs for the PMTs will depend not only on the static structure of the active site of the enzyme, as revealed through crystallographic studies, but also on the structural dynamics of the active site that accompany catalytic turnover27,35. Studies on the kinetic mechanisms of the PMTs may provide some information in this area.

Some of the SET domain PKMTs, such as SETD7, perform a single round of catalysis on a lysine residue, resulting in a mono-methylated product, whereas other SET domain PKMTs catalyse multiple rounds of methylation on a specific lysine residue. Crystallographic studies suggest that the difference between single-turnover and multiple-turnover SET domain enzymes results from the degree of steric crowding and hydrogen-bonding patterns in the lysine-binding channel of these enzymes3,24,36,37. In particular, the identity of an aromatic residue within the lysine-binding pocket seems to be the key determinant of the multiplicity of lysine methylation. In the PKMT DIM5, this residue is a phenylalanine (F281), and the enzyme can tri-methylate the acceptor lysine residue of its protein substrate. The corresponding residue in SETD7 is a tyrosine (Y305), and this enzyme can only mono-methylate its protein substrate. Remarkably, the mutant F281Y transforms DIM5 into a mono-methylating PKMT, and the corresponding mutant Y305F in SETD7 results in an enzyme that is capable of multiple rounds of lysine methylation38. These mutagenesis results have been extended to the PKMT euchromatic histone-lysine N-methyltransferase 2 (EHMT2; also known as G9A)39, and the tyrosine–phenylalanine switch seems to be a general determinant of product specificity among the SET domain PKMTs24. Molecular dynamics and hybrid quantum mechanical–molecular mechanical studies also suggest a key role for bound water molecules (a water channel) in the extent of lysine methylation by PKMTs30.

An outstanding question that has yet to be reconciled with the mechanistic hypothesis described above is how the quaternary nitrogen atom is deprotonated to generate a neutral amine methyl acceptor. At physiological pH, the lysine amine is protonated (the negative logarithm of the acid dissociation constant (pKa) of the side chain amine is ~10.8 (REF. 35)), and so there are no lone pair electrons to act as the attacking nucleophile in the SN2-mediated methyl transfer reaction.

Figure 3 | Variations in the configuration of SAM or SAH bound within the active sites of different PMTs. a | The representative conformation shown for the protein arginine methyltransferases (PRMTs) was taken from the crystal structure of S-adenosyl-L-homocysteine (SAH; also known as AdoHcy) bound to coactivator-associated arginine methyltransferase 1 (CARM1)49. b | The conformation shown for DOT1-like, histone H3 methyltransferase (DOT1L) was taken from the crystal structure of S-adenosyl-L-methionine (SAM; also known as AdoMet) bound to this protein33. c | The representative conformation shown for the protein lysine methyltransferases (PKMTs) was taken from the crystal structure of SAH bound to SET domain-containing lysine methyltransferase 8 (REF. 50). Carbon atoms are represented by grey circles; nitrogen atoms are represented by blue circles; oxygen atoms are represented by red circles; and sulphur atoms are represented by yellow circles.

General base catalysis: A mechanism that can occur in enzyme catalysis, in which a basic group accepts protons from a substrate molecule, usually to stabilize a charged transition-state species.

A potential mechanism of deprotonation for the SN2-mediated methyl transfer reaction is through general base catalysis. However, inspection of the amino acids in the active sites of PKMTs reveals no obvious basic side chains that could act in this capacity. Another hypothesis is that the solvent acts as a proton sponge; however, this seems inconsistent with the fact that the lysine side chain is buried deeply in the protein, with no clear access to bulk solvent. An alternative hypothesis has recently been proposed, based on molecular dynamics simulations30. According to this model, binding of SAM and protein substrates creates a water shuttle that can remove a proton from the buried lysine side chain and ferry this proton along a contiguous chain of water molecules to be deposited into the bulk solvent. Additionally, the electrostatic repulsion created by the quaternary nitrogen atom and the positively charged SAM cofactor lowers the pKa of the lysine side-chain amine to ~8.2, thereby facilitating this deprotonation process.
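
The consequence of a pKa shift from ~10.8 to ~8.2 for the availability of the neutral, nucleophilic amine can be estimated with the Henderson–Hasselbalch relationship. The following is a minimal sketch and not part of the original analysis: it assumes a physiological pH of 7.4 (the text does not give a value), treats the side-chain amine as a simple monoprotic acid, and uses a hypothetical helper name, neutral_fraction.

```python
def neutral_fraction(pka: float, ph: float) -> float:
    """Fraction of an amine present in the neutral (deprotonated) form.

    Henderson-Hasselbalch for a monoprotic acid BH+ <-> B + H+:
    [B] / ([B] + [BH+]) = 1 / (1 + 10 ** (pKa - pH)).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


ASSUMED_PH = 7.4  # assumption: the text says only "physiological pH"

# pKa values quoted in the text for the lysine side-chain amine
unperturbed = neutral_fraction(10.8, ASSUMED_PH)  # free lysine side chain
perturbed = neutral_fraction(8.2, ASSUMED_PH)     # in the SAM-bound active site

print(f"neutral fraction at pKa 10.8: {unperturbed:.5f}")
print(f"neutral fraction at pKa 8.2:  {perturbed:.3f}")
print(f"fold increase: {perturbed / unperturbed:.0f}")
```

Under these assumptions, roughly 0.04% of the amine is neutral at a pKa of 10.8, compared with roughly 14% at a pKa of 8.2, an increase of several hundred-fold in the population that can act as a nucleophile.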

Furthermore, the water shuttle hypothesis provides an alternative mechanism to explain the differences in extent of lysine methylation by the PKMTs. The molecular dynamics studies suggest that the ability to form a water shuttle will determine the extent of methylation that is catalysed by a given enzyme. For example, simulations of SETD7-mediated catalysis suggest that mono-methylation of lysine prevents re-formation of a new water shuttle, and so this enzyme terminates catalysis after one round of methylation. The same simulations suggest that other PKMTs, such as the ribulose bisphosphate carboxylase oxygenase large subunit methyltransferase, can readily re-form the water shuttle, leading to multiple rounds of methylation.

Enzymes that perform multiple rounds of catalysis on a macromolecular substrate can do so by one of two mechanisms: a distributive enzyme mechanism, in which each round of catalysis results in macromolecular product dissociation and rebinding, or a processive mechanism, in which multiple rounds of catalysis proceed before dissociation of the macromolecular product. PMTs use both of these mechanisms: some SET domain PKMTs that perform multiple rounds of lysine methylation have been found to use a processive mechanism3,24, whereas DOT1L has been shown to perform multiple rounds of H3K79 methylation through a non-processive (distributive) mechanism40.

The PRMTs are also capable of performing multiple rounds of arginine methylation to produce either mono- or di-methylated arginine products. The PRMTs that have been studied so far follow an ordered, sequential mechanism in which SAM binds before the arginine-containing substrate, and di-methyl arginine production occurs through a processive mechanism3.
The PMT target class, as defined here, N,N -dimethyl arginine; and type II PRMTs, which therefore provides an important pool of potential targets produce a symmetrical N,N-dimethyl arginine3. for drug discovery efforts. The variations in active-site structure and chemical mechanism that are summarized above reflect a target The PMT active site class with the potential for substantial chemical diversity The pursuit of the PMTs as a drug target class is facilitated among small-molecule modulators of individual enzymes by a rich literature base of crystallographic and enzyme in the class. Therefore, the opportunity for the developkinetic studies of these enzymes that have helped to ment of different chemotypes that compete with the comdefine their mechanisms of catalysis. All ofexample, these enzymes mon, natural ligands of these enzymes (for SAM, probably use a common bimolecular nucleophilic sublysine and arginine), and can be modified to produce 3,24,30 2) methyl transfer mechanism . The lone stitution (S N enzyme-selective inhibitors, seems promising. pair electrons of a nitrogen atom (from lysine or arginine) attacks the electrophilic methylsulphonium cation Known inhibitors of PMTs of SAM at a convergence 180 angle to the leaving group, to form a Despite the of data concerning PMTs, the penta-coordinate carbon transition state. The transition search for potent, selective inhibitors of these enzymes has state thenin collapses, with indirect methyl group reloonly structure recently begun earnest. Some approaches cation to the nitrogen atom of the lysine orreported. arginine side to inhibiting or depleting PMTs have been For chain and formation of S-adenosyl-l-homocysteine (SAH; example, the antiviral compound 3-deazaneplanocin 2) as a product. also knowninhibits as AdoHcy) (DZNep) the (FIG. 
enzyme SAH hydrolase and The use of a naturally occurring adenosyl analogue thereby increases intracellular levels of the universal 41 as the universal group transfer donor by PMTsby is remi. Product inhibition SAH product of PMTs, SAH niscent of protein kinases for another largeand family of would therefore be expected all PMTs other druggable enzyme targets, the ATP-binding pockets SAM-dependent enzymes, with the degree of inhibition of which have proved torelated be highly tractable for specific enzymes being to their relative targets inhibi31,32 . Furthermore, despite binding a for drug discovery Michaelis constant (Km) values for tion constant (Ki) and
www.nature.com/reviews/drugdisc VolUME 8 |COLLECTION SEPTEMBEREpigenetics 2009 | 729 NATURE REPRINT

726 | SEPTEMBER 2009 | VolUME 8 NATURE REVIEWS | Drug Discovery S8

REVIEWS

REVIEWS
focus on the PMTs, and in particular on those aspects PKMTs Acetyltransferases thatand make PMTs attractive 27 targets for drug discovery family SAH SAM, respectively . Similarly, the activity of all two enzymes involved in SAM biosynthesis4252 . Also, the Deacetylases members efforts. We summarize the data that contribute SAM-dependent enzymes in a cell could be reduced by to pan-HDAC inhibitor panobinostat has recently been ~18 family Me the validation of PMTs as targets for specific human shown members blocking SAM biosynthesis for example, by inhibitto cause depletion of cellular levels ofDemethylases the PMT ~30 family diseases, as wellreductase as the structural mechanistic data EZH2 (REF. 43). Although the mechanism by which this ing dihydrofolate or SAM and synthase, which are K members Ac Me that suggest PMTs are a tractable (that is, druggable) PRMTs K target class. 10 family P inhibitorsRof PMTs Kinases Table 3 | Chemical structures and biochemical data for small-molecule members S PKMTs and PRMTs in human disease compound structure Mechanism selectivity* refs Ub Ligases and potency In surveying the histone-modifying enzymes of the K NH2 Product of the reactions SAH Non-selective 77,78 human genome, the enzymes that catalyse methylationcatalysed by PMTs N N of lysine residues (protein lysine methyltransferases IC50 values range from H2N S 0.1 to 20 M (PKMTs)) and arginine residues (protein arginine N N O methyltransferases (PRMTs)) are of substantial interest CO2 H from the perspective of drug discovery and medicinal OH OH chemistry. The action of these enzymes is crucial in controlling gene regulation, and there is NH an2increasingNatural product analogue Non-selective 36 Sinefungin Figure 1 | A nucleosome and the post-translational and SAH histone protein modifications that can influence amount of biochemical and biological data to suggestof SAM N NH 2 N values range from IC50 Nature Reviews | Drug Discovery epigenetic regulation of gene transcription. 
that the enzymatic H2N activities of several of these proteins0.1 to 20 M N Modifications of the histone protein tail are shown: N have pathogenic roles in cancer, inflammatory diseases, O 2H neurodegenerativeCO diseases and other conditions of changes in acetylation (Ac) by acetyltransferases and deacetylases, phosphorylation (P) by kinases, ubiquitylation OH OH importance715. (Ub) by ligases and changes in methylation (Me) by For example, with the exception of DoT1-like, histone methyltransferases and demethylases. The enzyme families OH O SAM-competitive > 4-fold 79 Chaetocin H3 methyltransferase (DoT1l; also known as KMT4),inhibitor of SUV39 that are responsible for the various post-translational H H N all human PKMTs contain a ~130 S N amino-acid domain,IC modifications = 0.6 M N are shown. PKMT, protein lysine S 50 referred to as the SET domain, which constitutes the methyltransferase; PRMT, protein arginine methyltransferase. 1416 O . Enhancer of catalytic domain of these enzymes zeste homologue 2 (EZH2; also O known as KMT6) is a SET domain protein that forms the catalytic subunit of N S S N repressive complex 2 arginine methyltransferase 1 (CARM1; also known as the 45- protein core N of polycomb H H (PRC2). PRC2 is a PKMT that catalyses the methyla- PRMT4)18 have been implicated in the neurodegeneraOH tion of lysine 27 of histoneO H3 (in the nomenclature of tive diseases Huntingtons disease and spinal muscular atrophy, respectively. SET domain-containing lysine histone modification, this site is referred to as H3K27).SAM-non-competitive > 4-fold 80 BIX-01294 N site, all ofinhibitor of EHMT2 methyltransferase 7 (SETD7; also known as KMT7)19, Although EZH2 contains the catalytic active N MeO N = 2.7 M (REF. 20) and PRMT1 (REF. 21) have been the proteins of the PRC2 complex are required for fullIC50CARM1 N EZH2 or another associated with nuclear factor-B-related inflammaPKMT activity. 
overexpression of MeO PRC2 subunit, suppressor of NH zeste 12 homologue tory diseases, and SET domain-containing protein 1A (SUZ12), has been associated with numerous human (SETD1A)22 and CARM1 (REF. 23) have been associated N cancer types, including prostate, breast, bladder, colon, with viral infections involving Herpes simplex virus and skin, liver, endometrial, lung and gastric cancers, as well human T lymphotrophic virus, respectively. PKMTs and F3C > 100-fold for 45 Methylgene CARM1 inhibitor 15 are therefore emerging as compelling targets for as lymphomas and myelomas CH3O . In breast carcinomas,IC PRMTs = 60 nM PRMT1 and compound 7a 50 4,5 . increased levels of EZH2 have been shown to correlate drug discovery efforts H SETD7 of REF. 45 N N with increased invasiveness and proliferation rate; it has N PMTs as a drug target class been suggested that EZH2 O could be a prognostic indica10 . In cell culture, From a chemical biology and medicinal chemistry tor of patient outcome for breast cancer S overexpression of EZH2 in breast epithelial cells causes perspective, the PKMTs and PRMTs are of interest NH anchorage-independent cell growth and increased because they have a common mechanism of catalysis invasiveness. Additionally, when (discussed below), involving a small, organic cofactor. As O NH2 EZH2-overexpressing cells were injected into the mammary fat pads of nude other druggable classes of enzymes, such as the protein BristolMyers 46,47 inhibitor Ndemonstrating theCARM1 mice, the animals developed tumours, kinases, share this mechanistic>100-fold feature, itfor is likely that the F3C S Squibb PRMT1 and IC50 = 40 nM tumorigenicity of EZH2 overexpression. Importantly, the PMTs will be similarly amenable to inhibition by small, compound 7f PRMT3 O phenotypic of EZH2 overexpression are correlated organic molecules. of REF. 
47 effectsN N with increased H3K27 methylation and are dependent on The PKMTs and PRMTs catalyse methyl transfer from N N the presence of an intact SET domain, both of which imply their universal methyl donor, S-adenosyl-l-methionine H a role for EZH2 enzymatic N activity in pathogenesis10,15. (SAM; also known as AdoMet) (FIG. 2), to a nitrogen atom NH2 Several other human PKMTs O and PRMTs are strongly of lysine or arginine side chains3. Protein substrate speassociated with human cancers, as summarized in TABLE 2. cificity can be stringent in these enzymes; some PKMTs CARM1, coactivator-associated arginine methyltransferase 1 (also known as PRMT4); EHMT2, euchromatic histone lysine Similarly, there is compelling that other seem to selectively methylate a particular lysine residue , half-maximal inhibitory concentration; PMT, protein methyltransferase; N-methyltransferase 2 (also knownevidence as G9A and KMT1C); ICPKMTs 50 PRMT, protein arginine methyltransferase; SAH, -adenosyll-homocysteine knownhistone, as AdoHcy); SAM, S-adenosyll-methionine and PRMTs have a pathogenic role inSserious human on a(also specific and the extent of methylation on (also known as AdoMet); SETD7, SET domain-containing lysine methyltransferase 7 (also known as KMT7); SUV39, suppressor of . For example, SET domain, a single lysine residue (that is, mono-, di- or tri-methyldiseases other than cancer 79 variegation 39; *Selectivity is given as the ratio of the IC50 value for the most potent inhibition at a non-target PMT over the IC50 value 17 and coactivator-associated ation) that is catalysed by a particular enzyme can also bifurcated 1target. (SETDB1) . for the primary See REF. 27
VolUME 8 | SEPTEMBER 2009 | 725 www.nature.com/reviews/drugdisc S9

Target class
A group of proteins that are related by a common type of drug-binding pocket, but sufficiently diverse that selective inhibition of specific proteins can be achieved, using medicinal chemical elaboration of the basic chemotype structures.

SAM
S-adenosyl-l-methionine, the universal methyl group donor of all enzymatic methyltransferase reactions.

NATURE REVIEWS | Drug Discovery 730 | SEPTEMBER 2009 | VolUME 8 NATURE REPRINT COLLECTION Epigenetics

REVIEWS
Table 1 | Epigenetic-enzyme inhibitors for cancer therapy occurs is not yet fully understood, an approach of this type generic name Alernative clinical status* Structureactivity relationship names would nevertheless deplete the protein levels of EZH2 and so abolish the PMT catalytic activity of the enzyme along The relationship between DNMT inhibitors the chemical structure with any other non-enzymatic functions of EZH2. of a compound and its Vidaza 5-azacitidine Approvedof inPMTs the United States been reviewed4, Direct inhibitors have recently pharmacological activity. for myelodysplastic syndrome along with other probes of histone-modifying enzymes. Decitabine Dacogen Approved United States for Some natural ligandsin for these enzymes have been known myelodysplastic for some time, including thesyndrome reaction product, SAH, and a natural inhibitor isolated from Streptomyces spp. cultures, HDAC inhibitors (TABLE 3). More selective inhibitors have been sinefungin Vorinostat Zolinza Approved in United States for cutaneous identified manifestation for SUV39 (chaetocin; reported half-maximal in cutaneous T cell lymphoma inhibitory concentration (IC50) = 0.6 M) and for EHMT2 Romidepsin FK228 New drug application filing (BIX-01294; reported IC50 = 1.6 M), but no further optiPanobinostat LBH-589mization of Phase IIcompounds has been reported to date4. these A co-crystal structure of BIX-01294 bound to EHMT2 Belinostat PXD-101 Phase II has recently been published44. Surprisingly, the compound Entinostat MS-275 Phase II was found to bind to the enzyme non-competitively with SNDX-275 respect to SAM, in a groove that is normally occupied by MGCD-0103 MG-0103 Phase II a portion of the protein substrate. JNJ-26481585 None Phase I two groups have reported potent, selecMore recently, inhibitors of the PRMT CARM1 Givinostat ITF2357 tive, pyrazole-based Phase II (REFS 4547) (TABLE 3)therapies, . 
These compounds are the first *See REFS 51,52 for comprehensive reviews of novel cancer including those described examples of inhibitors of a specific PMT that are effective in the table. DNMT, DNA methyltransferase; HDAC, histone deacetylase. at nanomolar concentrations and display >100-fold selectivity for the primary target over related enzymes. The 45 was found to series reported by Methylgene becompound stringent (discussed below). Clearly, the structure of be inactive in cellular assays; no cellular data have been 16,24 each enzyme is unique , as is the resulting biology reported for theassociated compound series from BristolMyers and pathobiology with each enzyme. Yet, the 46,47 . Therefore, although exciting step Squibb shared chemical mechanisms of the an PKMTs andfirst PRMTs has been made towards developing selective inhibitors allows for certain efficiencies and economies in the disof PMTs, substantial work remains to be done before covery of selective drugs for these enzymes, by treating these findings can be translated into pharmacologically them as a target class2527. Several of these enzymes tractable species. have been found to catalyse methyl transfer to lysine The paucity of potent, selective, pharmacologically or arginine residues on a number of cellular proteins; tractable inhibitors of the PMTs creates a crucial therathis is especially the case for the PRMTs, for which sevpeutic gap which medicinal chemists should strive to eral cytosolic substrates have been identified24,28. With fill. 
As described here, the pathobiological relevance of respect to gene regulation, however, the most important these enzymes, together with the structural and mechatargets for both PKMTs and PRMTs are likely to be the nistic information that suggests their druggability as a histones, as post-translational modification these protarget class, converge to make the PMTsof an attractive teins is clearly a determinant of chromatin remodelling and important class of novel enzymes for contemporary and therefore regulation of gene transcription. drug discovery.
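The earlier remark that product inhibition by SAH should scale with each enzyme's Ki for SAH and Km for SAM can be illustrated with a small calculation. The sketch below is not taken from the article: it assumes, for illustration only, that SAH behaves as a simple competitive inhibitor with respect to SAM under textbook Michaelis-Menten kinetics, and the function name and all numerical values are hypothetical.

```python
def fractional_activity(sam_uM, sah_uM, km_sam_uM, ki_sah_uM):
    """Return v_inhibited / v_uninhibited for a competitive inhibitor.

    Assumption (not from the article): SAH competes with SAM, so the
    apparent Km for SAM is scaled by (1 + [SAH]/Ki) and Vmax is unchanged.
    """
    if min(sam_uM, km_sam_uM, ki_sah_uM) <= 0 or sah_uM < 0:
        raise ValueError("concentrations and constants must be positive")
    uninhibited = sam_uM / (km_sam_uM + sam_uM)
    inhibited = sam_uM / (km_sam_uM * (1.0 + sah_uM / ki_sah_uM) + sam_uM)
    return inhibited / uninhibited


# Two hypothetical enzymes exposed to the same SAM (10 uM) and SAH (5 uM).
# Enzyme A binds SAM tightly and SAH weakly; enzyme B is the reverse.
enzyme_a = fractional_activity(sam_uM=10.0, sah_uM=5.0, km_sam_uM=1.0, ki_sah_uM=5.0)
enzyme_b = fractional_activity(sam_uM=10.0, sah_uM=5.0, km_sam_uM=10.0, ki_sah_uM=0.5)
print(round(enzyme_a, 3))  # 0.917: about 8% of activity lost
print(round(enzyme_b, 3))  # 0.167: about 83% of activity lost
```

Under these assumptions the same rise in cellular SAH barely affects one enzyme and largely silences the other, which is why an indirect approach such as SAH hydrolase inhibition would not be expected to affect all SAM-dependent enzymes equally.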


Conclusions
There is a growing body of evidence that enzymes in this target class have important pathogenic roles in human diseases. The structures and enzymatic mechanisms of the PMTs support the view that pharmacological modulation of these enzymes by small-molecule inhibitors will be an effective means of therapeutic intervention in cancer and numerous other unmet medical needs. The discovery of small-molecule inhibitors of PMTs as starting points for drug development should clearly be a key focus of new research efforts. Beyond this goal, there are many opportunities to use chemical probes of PMT function to define the underlying biology and pathobiology that are associated with protein modification by these enzymes. The nature of PMT catalysis, and the available structural information about these enzymes, should facilitate the discovery of PMT ligands through mechanism- and structure-guided discovery methods (REF. 48), as well as methods that do not rely on mechanistic knowledge, such as high-throughput screening of diverse chemical libraries.

A key remaining question when considering the PMTs as a drug discovery target class is whether or not selective inhibition of particular enzymes can be achieved through targeting the SAM-binding pocket. This is analogous to the question that hindered the early acceptance of protein kinases as drug targets: whether it was possible to achieve selectivity among the ATP-binding pockets of the kinases. In retrospect, it is clear that the diversity of binding-site architecture and the binding-site dynamics associated with enzyme catalysis provide ample opportunities for selective inhibition of kinases through medicinal chemistry efforts. Will the same be true for the SAM-binding pockets of PMTs? Ultimately, structure–activity relationship profiles, selectivity and collateral inhibition of off-target enzymes by PMT inhibitors will need to be determined empirically. Despite these limitations, it is our hope that the data presented here will help to stimulate systematic exploration of the human PMT target class towards the goal of developing selective inhibitors of PMTs as therapeutic agents for human diseases.

SAH
S-adenosyl-L-homocysteine, the universal product of all enzymatic methyltransferase reactions, formed by methyl group transfer from S-adenosyl-L-methionine.

1. Strahl, B. D. & Allis, C. D. The language of covalent histone modifications. Nature 403, 41–45 (2000).
2. Kouzarides, T. Chromatin modifications and their function. Cell 128, 693–705 (2007). A thorough overview of post-translational modifications on core histones, the enzymes that mediate these modifications and the biological functions of the modification.
3. Smith, B. C. & Denu, J. M. Chemical mechanisms of histone lysine and arginine modifications. Biochim. Biophys. Acta 1789, 45–57 (2008). An excellent review of the chemical biology of lysine- and arginine-modifying enzymes.
4. Cole, P. A. Chemical probes for histone-modifying enzymes. Nature Chem. Biol. 4, 590–597 (2008).
5. Keppler, B. R. & Archer, T. K. Chromatin-modifying enzymes as therapeutic targets – Part 1. Expert Opin. Ther. Targets 12, 1301–1312 (2008).
6. Pray, L. At the flick of a switch: epigenetic drugs. Chem. Biol. 15, 640–641 (2008).
7. Jones, P. A. & Baylin, S. B. The epigenomics of cancer. Cell 128, 683–692 (2007).
8. Wilson, C. B., Rowell, E. & Sekimata, M. Epigenetic control of T-helper-cell differentiation. Nature Rev. Immunol. 9, 91–105 (2009).
9. Tsankova, N., Renthal, W., Kumar, A. & Nestler, E. J. Epigenetic regulation in psychiatric disorders. Nature Rev. Neurosci. 8, 355–367 (2007).
10. Kleer, C. G. et al. EZH2 is a marker of aggressive breast cancer and promotes neoplastic transformation of breast epithelial cells. Proc. Natl Acad. Sci. USA 100, 11606–11611 (2003).
11. Krivtsov, A. V. et al. H3K79 methylation profiles define murine and human MLL-AF4 leukemias. Cancer Cell 14, 355–368 (2008).
12. Jansson, M. et al. Arginine methylation regulates the p53 response. Nature Cell Biol. 10, 1431–1439 (2008).
13. Hong, H. et al. Aberrant expression of CARM1, a transcriptional coactivator of androgen receptor, in the development of prostate carcinoma and androgen-independent status. Cancer 101, 83–89 (2004).
14. Schneider, R., Bannister, A. J. & Kouzarides, T. Unsafe SETs: histone lysine methyltransferases and cancer. Trends Biochem. Sci. 27, 396–402 (2002).
15. Simon, J. A. & Lange, C. A. Roles of the EZH2 histone methyltransferase in cancer epigenetics. Mutat. Res. 647, 21–29 (2008).
16. Dillon, S. C., Zhang, X., Trievel, R. C. & Cheng, X. The SET-domain protein superfamily: protein lysine methyltransferases. Genome Biol. 6, 227 (2005).
17. Ryu, H. et al. ESET/SETDB1 gene expression and histone H3 (K9) trimethylation in Huntington's disease. Proc. Natl Acad. Sci. USA 103, 19176–19181 (2006).
18. Cheng, D., Cote, J., Shaaban, S. & Bedford, M. T. The arginine methyltransferase CARM1 regulates the coupling of transcription and mRNA processing. Mol. Cell 25, 71–83 (2007).
19. Li, Y. et al. Role of the histone H3 lysine 4 methyltransferase, SET7/9, in the regulation of NF-κB-dependent inflammatory genes. Relevance to diabetes and inflammation. J. Biol. Chem. 283, 26771–26781 (2008).
20. Covic, M. et al. Arginine methyltransferase CARM1 is a promoter-specific regulator of NF-κB-dependent gene expression. EMBO J. 24, 85–96 (2005).
21. Hassa, P. O., Covic, M., Bedford, M. T. & Hottiger, M. O. Protein arginine methyltransferase 1 coactivates NF-κB-dependent gene expression synergistically with CARM1 and PARP1. J. Mol. Biol. 377, 668–678 (2008).
22. Huang, J. et al. Trimethylation of histone H3 lysine 4 by Set1 in the lytic infection of human herpes simplex virus 1. J. Virol. 80, 5740–5746 (2006).

23. Jeong, S. J. et al. Coactivator-associated arginine methyltransferase 1 enhances transcriptional activity of the human T-cell lymphotropic virus type 1 long terminal repeat through direct interaction with Tax. J. Virol. 80, 10036–10044 (2006).
24. Cheng, X., Collins, R. E. & Zhang, X. Structural and sequence motifs of protein (histone) methylation enzymes. Annu. Rev. Biophys. Biomol. Struct. 34, 267–294 (2005).
25. Goldstein, D. M., Gray, N. S. & Zarrinkar, P. P. High-throughput kinase profiling as a platform for drug discovery. Nature Rev. Drug Discov. 7, 391–397 (2008).
26. Mook, R. A. The importance and complexity of target class selectivity in drug discovery. The American Association for Cancer Research Education Book 223–226 (The American Association for Cancer Research, Philadelphia, 2005).
27. Copeland, R. A. Evaluation of Enzyme Inhibitors in Drug Discovery: A Guide for Medicinal Chemists and Pharmacologists (Wiley, Hoboken, 2005).
28. Cheng, D. et al. Small molecule regulators of protein arginine methyltransferases. J. Biol. Chem. 279, 23892–23899 (2004).
29. Allis, C. D. et al. New nomenclature for chromatin-modifying enzymes. Cell 131, 633–636 (2007).
30. Zhang, X. & Bruice, T. C. Enzymatic mechanism and product specificity of SET-domain protein lysine methyltransferases. Proc. Natl Acad. Sci. USA 105, 5728–5732 (2008). This work provides a detailed theoretical basis to explain the substrate specificity of the protein lysine methyltransferases.
31. Fedorov, O. et al. A systematic interaction map of validated kinase inhibitors with Ser/Thr kinases. Proc. Natl Acad. Sci. USA 104, 20523–20528 (2007).
32. Karaman, M. W. et al. A quantitative analysis of kinase inhibitor selectivity. Nature Biotech. 26, 127–132 (2008).
33. Min, J., Feng, Q., Li, Z., Zhang, Y. & Xu, R. M. Structure of the catalytic domain of human DOT1L, a non-SET domain nucleosomal histone methyltransferase. Cell 112, 711–723 (2003).
34. Sawada, K. et al. Structure of the conserved core of the yeast Dot1p, a nucleosomal histone H3 lysine 79 methyltransferase. J. Biol. Chem. 279, 43296–43306 (2004).
35. Copeland, R. A. Enzymes: A Practical Introduction to Structure, Mechanism and Data Analysis 2nd edn (Wiley, Hoboken, 2000).
44. Chang, Y. et al. Structural basis for G9a-like protein lysine methyltransferase inhibition by BIX-01294. Nature Struct. Mol. Biol. 16, 312–317 (2009).
45. Allan, M. et al. N-Benzyl-1-heteroaryl-3-(trifluoromethyl)-1H-pyrazole-5-carboxamides as inhibitors of co-activator associated arginine methyltransferase 1 (CARM1). Bioorg. Med. Chem. Lett. 19, 1218–1223 (2009). The first examples of potent, drug-like inhibitors of a human PMT.
46. Purandare, A. V. et al. Pyrazole inhibitors of coactivator associated arginine methyltransferase 1 (CARM1). Bioorg. Med. Chem. Lett. 18, 4438–4441 (2008).
47. Huynh, T. et al. Optimization of pyrazole inhibitors of coactivator associated arginine methyltransferase 1 (CARM1). Bioorg. Med. Chem. Lett. 19, 2924–2927 (2009).
48. Copeland, R. A., Gontarek, R. & Luo, L. in Textbook of Drug Design and Discovery 4th edn Ch. 12 (eds Krogsgaard-Larsen, P., Madsen, U. & Stromgaard, K.) 378–407 (Taylor and Francis, New York, 2009).
49. Troffer-Charlier, N., Cura, V., Hassenboehler, P., Moras, D. & Cavarelli, J. Functional insights from structures of coactivator-associated arginine methyltransferase 1 domains. EMBO J. 26, 4391–4401 (2007).
50. Couture, J.-F., Collazo, E., Brunzelle, J. S. & Trievel, R. C. Structural and functional analysis of SET8, a histone H4 Lys-20 methyltransferase. Genes Dev. 19, 1455–1465 (2005).
51. Ma, W. W. & Adjei, A. A. Novel agents on the horizon for cancer therapy. CA Cancer J. Clin. 59, 111–137 (2009). A review of the current knowledge on how aberrant epigenetic mechanisms can contribute to the development of cancer and the progress in developing therapies that target these mechanisms.
52. Cortez, C. C. & Jones, P. A. Chromatin, cancer and drug therapies. Mutat. Res. 647, 44–51 (2008).
53. Kang, M. Y. et al. Association of the SUV39H1 histone methyltransferase with the DNA methyltransferase 1 at mRNA expression level in primary colorectal cancer. Int. J. Cancer 121, 2192–2197 (2007).
54. Watanabe, H. et al. Deregulation of histone lysine methyltransferases contributes to oncogenic transformation of human bronchoepithelial cells. Cancer Cell Int. 8, 15 (2008).
64. Rosati, R. et al. NUP98 is fused to the NSD3 gene in acute myeloid leukemia associated with t(8;11)(p11.2;p15). Blood 99, 3857–3860 (2002).
65. Tonon, G. et al. High-resolution genomic profiles of human lung cancer. Proc. Natl Acad. Sci. USA 102, 9625–9630 (2005).
66. Okada, Y. et al. hDOT1L links histone methylation to leukemogenesis. Cell 121, 167–178 (2005).
67. Bitoun, E., Oliver, P. L. & Davies, K. E. The mixed-lineage leukemia fusion partner AF4 stimulates RNA polymerase II transcriptional elongation and mediates coordinated chromatin remodeling. Hum. Mol. Genet. 16, 92–106 (2007).
68. Hamamoto, R. et al. SMYD3 encodes a histone methyltransferase involved in the proliferation of cancer cells. Nature Cell Biol. 6, 731–740 (2004).
69. Hamamoto, R. et al. Enhanced SMYD3 expression is essential for the growth of breast cancer cells. Cancer Sci. 97, 113–118 (2006).
70. Bracken, A. P. et al. EZH2 is downstream of the pRB-E2F pathway, essential for proliferation and amplified in cancer. EMBO J. 22, 5323–5335 (2003).
71. Varambally, S. et al. The polycomb group protein EZH2 is involved in progression of prostate cancer. Nature 419, 624–629 (2002).
72. Subramanian, K. et al. Regulation of estrogen receptor alpha by the SET7 lysine methyltransferase. Mol. Cell 30, 336–347 (2008).
73. Nishikawa, N. et al. Gene amplification and overexpression of PRDM14 in breast cancers. Cancer Res. 67, 9649–9657 (2007).
74. Majumder, S., Liu, Y., Ford, O. H., 3rd, Mohler, J. L. & Whang, Y. E. Involvement of arginine methyltransferase CARM1 in androgen receptor function and prostate cancer cell viability. Prostate 66, 1292–1301 (2006).
75. Frietze, S., Lupien, M., Silver, P. A. & Brown, M. CARM1 regulates estrogen-stimulated breast cancer growth through up-regulation of E2F1. Cancer Res. 68, 301–306 (2008).
76. Zhao, Q. et al. PRMT5-mediated methylation of histone H4R3 recruits DNMT3A, coupling histone and DNA methylation in gene silencing. Nature Struct. Mol. Biol. 16, 304–311 (2009).
77. Patnaik, D. et al. Substrate specificity and kinetic mechanism of mammalian G9a histone H3 methyltransferase. J. Biol. Chem. 279, 53248–53258 (2004).
36. Couture, J. F., Hauk, G., Thompson, M. J., Blackburn, G. M. & Trievel, R. C. Catalytic roles for carbon–oxygen hydrogen bonding in SET domain
55. Kondo, Y. et al. Downregulation of histone H3 lysine 9 methyltransferase G9a induces centrosome disruption and chromosome instability in cancer cells.
78. Chin, H. G., Patnaik, D., Esteve, P.-O., Jacobsen, S. E. & Pradhan, S. Catalytic properties and kinetic mechanism of human recombinant lys-9 histone H3
PLoS One nuclear factor-B-related inflammaPKMT of EZH2 or another associated with lysine methyltransferases. J. Biochem. 281, activity. overexpression methyltransferase SUV39H1: participation of the 3, e2037 (2008). 1928019287 (2006). chromodomain enzymatic catalysis. Biochemistry 56. Tkachuk, D., Kohler, S. & Cleary, M. L. Involvement and SET in domain-containing protein 1A PRC2 subunit, suppressor of zeste 12 homologue tory diseases, 37. Collins, R. E. et al. In vitro and in vivo analyses 45, 32723284 (2006). of a homolog of Drosophila trithorax by 11q23 22 and CARM1 (REF. 23) have been associated (SUZ12), has with numerous human of a Phe/Tyr switch controlling product specificity of been associated 79. Greiner, D., Bonaldi, T., Eskeland, R., Roemer, E. & chromosomal translocations in acute leukemias. (SETD1A) histone lysine methyltransferases. J. Biol. Chem. 280 , Imhof, A. Identification ofHerpes a specific simplex inhibitor of the and Cell 71, 691700 (1992). involving virus cancer types, including prostate, breast, bladder, colon, with viral infections 55635570 (2005). histone methyltransferase SU(VAR)39. Nature 57. Gu, Y. et al. The t(4;11) chromosome translocation of human T lymphotrophic virus,(2005). respectively. PKMTs and liver, lung acute and leukemias gastric cancers, as well This study provides a structuralskin, basis for the endometrial, Chem. Biol. 1, 143145 human fuses the ALL-1 gene, related 15 trithorax, to the AF-4 gene. Cell 71, wide range of lysine methylation patterns that is and myelomas Kubicek, S. et al. Reversal as of H3K9me2 by targets for to Drosophila . In breast carcinomas, PRMTs 80. are therefore emerging compelling as lymphomas achieved by different SET domain PKMTs. a small-molecule inhibitor for the G9a histone 701708 (1992). 4,5 . increased levels of EZH2 have been shown to correlate drug discovery efforts 38. Trievel, R. C., Flynn, E. M., Houtz, R. L. & Hurley, J. H. methyltransferase. Mol. Cell 25, 473481 (2007). 58. 
Liedtke, M. & Cleary, M. L. Therapeutic targeting of Mechanism of multiple lysine methylation by the SET MLL. Blood 113, 60616068 (2009). with increased invasiveness and proliferation rate; it has domain enzyme Rubisco LSMT. Nature Struct. Biol. 59. Wang, G. G., Cai, L., Pasillas, M. P. & Kamps, M. P. Acknowledgements PMTs aare drug target class C. T. Walsh, H. R. Horvitz, been suggested that EZH2 could be a prognostic indica10, 545552 (2003). NUP98-NSD1 links H3K36 methylation to Hox-A gene as We grateful to K. Shiosaki, 39. Zhang, X. et al. Structural basis for the product activation and leukaemogenesis. Nature Cell Biol. 9, Y. Zhang, and R. Gould for their insights, constant support and 10 . In cell culture, From a chemical biology and medicinal chemistry tor of patient outcome for breast cancer specificity of histone lysine methyltransferases. 804812 (2007). encouragement. We also thank K. Boater, E. Olhava, L. Jin Mol. Cell 12, 177185 (2003). overexpression of EZH2 60. Marango, J. et epithelial al. The MMSET protein is a histone and T. Luly forPKMTs expert helpand in preparation of this manuscript. in breast cells causes perspective, the PRMTs are of interest 40. Frederiks, F. et al. Nonprocessive methylation by Dot1 methyltransferase with characteristics of a anchorage-independent cell growth and increased because they have a common mechanism of catalysis leads to functional redundancy of histone H3K79 transcriptional corepressor. Blood 111, 31453154 Competing interests statement methylation states. Nature Struct. Mol. Biol. 15, Additionally, (2008). The below), authors declare competing financial interests: see web As invasiveness. when EZH2-overexpressing (discussed involving a small, organic cofactor. 550557 61. Kim, J. Y. et al. Multiple-myeloma-related WHSC1/ version for details. Target class (2008). were the mammary fatispads of nude other druggable classes of enzymes, such as the protein 41. Chiang, P. K. 
Biological effects of cells inhibitors of injected intoMMSET isoform RE-IIBP a histone A group of proteins that are S-adenosylhomocysteine hydrolase. Pharmacol. methyltransferase with transcriptional repression mice, the animals developed tumours, demonstrating the kinases, share this mechanistic feature, it is likely that the DATABASES related by a common type Ther. 77, 115134 (1998). activity. Mol. Cell Biol. 28, 20232034 (2008). UniProtKB: http://www.uniprot.org tumorigenicity of EZH2 overexpression. Importantly, the PMTs will be similarly amenable to inhibition by small, of drug-binding pocket, 42. Bender, C. M., Zingg, J.-M. & Jones, P. A. DNA 62. Lauring, J. et al. The multiple myeloma associated CARM1 | DOT1L | EHMT2 | EZH2 | PRMT1 | SETD7 | SETD8 | methylationdiverse as a target Pharm. Res. MMSET gene contributes are to cellular adhesion, organic molecules. but sufficiently thatfor drug design. phenotypic effects of EZH2 overexpression correlated SETD1A | SETDB1 | SUZ12 15, 175187 (1998). clonogenic growth, and tumorigenicity. Blood 111, with increased H3K27 methylation and are dependent on The PKMTs and PRMTs catalyse methyl transfer from 43. Fiskus, W. et al. Panobinostat treatment depletes 856864 (2008). FURTHER INFORMATION proteins can be achieved, EZH2 and DNMT1 levels and enhances decitabine 63. Angrand, P. O. et al. NSD3, a new SET domain- their universal the presence of an intact SET domain, both of which imply methyl donor, S-adenosyl-l-methionine Authors homepage: http://www.epizyme.com using medicinal chemical of JunB and loss of survival mediated de-repression containing gene, maps to 8p12 and is amplified in 10,15 All liNks Are AcTive iN THe oNliNe . SAM; also known as AdoMet) (FIG. 2),PDf to a nitrogen atom a role for EZH2 enzymatic activity in pathogenesis ( of human leukemia cells. Cancer Biol. Ther. 8, human breast cancer cell lines. Genomics 74, 7988 elaboration ofacute the basic 939950 (2009). (2001). 
and PRMTs are strongly Several other human PKMTs of lysine or arginine side chains3. Protein substrate speselective inhibition of specific chemotype structures.

First published in Nature Reviews Drug Discovery 8, 724–732 (2009); doi: 10.1038/nrd2974

Protein methyltransferases as a target class for drug discovery

Robert A. Copeland, Michael E. Solomon and Victoria M. Richon

Abstract | The protein methyltransferases (PMTs), which methylate protein lysine and arginine residues and have crucial roles in gene transcription, are emerging as an important group of enzymes that play key parts in normal physiology and human diseases. The collection of human PMTs is a large and diverse group of enzymes that have a common mechanism of catalysis. Here, we review the biological, biochemical and structural data that together present PMTs as a novel, chemically tractable target class for drug discovery.

Epizyme, Inc., 840 Memorial Drive, Cambridge, Massachusetts 02139, USA. Correspondence to R.A.C. e-mail: RCopeland@epizyme.com doi:10.1038/nrd2974

Cellular differentiation is one of the most important components of embryonic development and postnatal tissue maintenance and repair. Almost every nucleated cell of the human body contains the same, complete complement of genomic DNA. However, the ability of pluripotent cells to differentiate into distinct lineages and ultimate cell types is conferred by specific patterns of transcription of subsets of genes in the genome. A large and growing body of data support the idea that epigenetic regulation of gene transcription is a key biological determinant of cellular differentiation (REF. 1).

The chromosomes within eukaryotic cell nuclei are packaged together with structural proteins (histones) to form the complex known as chromatin. Four major histones (H2A, H2B, H3 and H4) form an octameric, disc-shaped aggregate composed of two copies of each histone type around which DNA is wound to form the regular, repeating units known as nucleosomes (FIG. 1). Chromatin exists in two main conformational states: a condensed state (heterochromatin) in which the nucleosomes are tightly packed together and gene transcription is largely repressed; and a more relaxed state (euchromatin) in which gene transcription is activated. Epigenetic regulation of gene transcription is mediated by selective, enzyme-catalysed, covalent modification of specific nucleotides within the genes and also by post-translational modifications of the histone proteins (FIG. 1). Modification of DNA can silence gene transcription directly, whereas the post-translational modifications of histones control the conformational transition between heterochromatin and euchromatin states (REF. 2). The enzymes that covalently modify DNA and histones are therefore the key mediators of epigenetic regulation of gene transcription.

Several putative epigenetic enzymes have recently been identified and, in some cases, their catalytic mechanism and three-dimensional structures have been determined (REFS 2,3). Epigenetic enzymes that are encoded in the human genome catalyse group transfer reactions and can be categorized according to the nature of the covalent modifications that they catalyse and by the substrates upon which they act. In humans, these enzymes include DNA methyltransferases (DNMTs), which methylate the carbon atom at the 5-position of cytosine in the CpG dinucleotide sites of the genome; protein methyltransferases (PMTs), which methylate lysine or arginine residues on histones and other proteins; protein demethylases, which remove methyl groups from the lysine or arginine residues of proteins; histone acetyltransferases, which acetylate lysine residues on histones and other proteins; histone deacetylases (HDACs), which remove acetyl groups from lysine residues on histones and other proteins; ubiquitin ligases, which add ubiquitin to lysine residues on histones and other proteins; and specific kinases that phosphorylate serine residues on histones (REFS 4,5).

Given that small-molecule inhibitors have been successfully designed for HDACs and DNMTs (discussed below), it is likely that additional families of histone-modifying enzymes will also be amenable to small-molecule modulation. The opportunity for chemical-probe development and pharmacological control of epigenetic gene transcription is therefore of great interest in the fields of basic biology and drug discovery (REFS 4,5). Indeed, the role of these enzymes in human diseases is highlighted by the recent approval of three drugs by the US Food and Drug Administration (REF. 6) that act as selective, small-molecule inhibitors of HDACs and DNMTs for the treatment of specific human cancers (TABLE 1).

In recent years, there have been numerous reviews in the literature that highlight different aspects of the biology, disease association and/or structural biology of various histone-modifying enzymes. In this Review, we focus on the PMTs, and in particular on those aspects that make PMTs attractive targets for drug discovery efforts. We summarize the data that contribute to the validation of PMTs as targets for specific human diseases, as well as the structural and mechanistic data that suggest PMTs are a tractable (that is, druggable) target class.

Figure 1 | A nucleosome and the post-translational histone protein modifications that can influence epigenetic regulation of gene transcription. Modifications of the histone protein tail are shown: changes in acetylation (Ac) by acetyltransferases and deacetylases, phosphorylation (P) by kinases, ubiquitylation (Ub) by ligases and changes in methylation (Me) by methyltransferases and demethylases. The enzyme families that are responsible for the various post-translational modifications are shown. PKMT, protein lysine methyltransferase; PRMT, protein arginine methyltransferase. Family sizes labelled in the figure: deacetylases, ~18 family members; PKMTs, 52 family members; demethylases, ~30 family members; PRMTs, 10 family members.

Epigenetics
A stably heritable change in phenotype or gene expression in an organism or cell, resulting from changes in a chromosome that are not caused by a change in DNA sequence. The process of eukaryotic cell differentiation is one of the most well-known examples of epigenetic changes.

Target class
A group of proteins that are related by a common type of drug-binding pocket, but sufficiently diverse that selective inhibition of specific proteins can be achieved, using medicinal chemical elaboration of the basic chemotype structures.

SAM
S-adenosyl-l-methionine, the universal methyl group donor of all enzymatic methyltransferase reactions.

PKMTs and PRMTs in human disease
In surveying the histone-modifying enzymes of the human genome, the enzymes that catalyse methylation of lysine residues (protein lysine methyltransferases (PKMTs)) and arginine residues (protein arginine methyltransferases (PRMTs)) are of substantial interest from the perspective of drug discovery and medicinal chemistry. The action of these enzymes is crucial in controlling gene regulation, and there is an increasing amount of biochemical and biological data to suggest that the enzymatic activities of several of these proteins have pathogenic roles in cancer, inflammatory diseases, neurodegenerative diseases and other conditions of importance (REFS 7–15).

For example, with the exception of DOT1-like, histone H3 methyltransferase (DOT1L; also known as KMT4), all human PKMTs contain a ~130 amino-acid domain, referred to as the SET domain, which constitutes the catalytic domain of these enzymes (REFS 14–16). Enhancer of zeste homologue 2 (EZH2; also known as KMT6) is a SET domain protein that forms the catalytic subunit of the 4–5-protein core of polycomb repressive complex 2 (PRC2). PRC2 is a PKMT that catalyses the methylation of lysine 27 of histone H3 (in the nomenclature of histone modification, this site is referred to as H3K27). Although EZH2 contains the catalytic active site, all of the proteins of the PRC2 complex are required for full PKMT activity. Overexpression of EZH2 or another PRC2 subunit, suppressor of zeste 12 homologue (SUZ12), has been associated with numerous human cancer types, including prostate, breast, bladder, colon, skin, liver, endometrial, lung and gastric cancers, as well as lymphomas and myelomas (REF. 15). In breast carcinomas, increased levels of EZH2 have been shown to correlate with increased invasiveness and proliferation rate; it has been suggested that EZH2 could be a prognostic indicator of patient outcome for breast cancer (REF. 10). In cell culture, overexpression of EZH2 in breast epithelial cells causes anchorage-independent cell growth and increased invasiveness. Additionally, when EZH2-overexpressing cells were injected into the mammary fat pads of nude mice, the animals developed tumours, demonstrating the tumorigenicity of EZH2 overexpression. Importantly, the phenotypic effects of EZH2 overexpression are correlated with increased H3K27 methylation and are dependent on the presence of an intact SET domain, both of which imply a role for EZH2 enzymatic activity in pathogenesis (REFS 10,15).

Several other human PKMTs and PRMTs are strongly associated with human cancers, as summarized in TABLE 2. Similarly, there is compelling evidence that other PKMTs and PRMTs have a pathogenic role in serious human diseases other than cancer (REFS 7–9). For example, SET domain, bifurcated 1 (SETDB1) (REF. 17) and coactivator-associated arginine methyltransferase 1 (CARM1; also known as PRMT4) (REF. 18) have been implicated in the neurodegenerative diseases Huntington's disease and spinal muscular atrophy, respectively. SET domain-containing lysine methyltransferase 7 (SETD7; also known as KMT7) (REF. 19), CARM1 (REF. 20) and PRMT1 (REF. 21) have been associated with nuclear factor-κB-related inflammatory diseases, and SET domain-containing protein 1A (SETD1A) (REF. 22) and CARM1 (REF. 23) have been associated with viral infections involving Herpes simplex virus and human T lymphotropic virus, respectively. PKMTs and PRMTs are therefore emerging as compelling targets for drug discovery efforts (REFS 4,5).

PMTs as a drug target class
From a chemical biology and medicinal chemistry perspective, the PKMTs and PRMTs are of interest because they have a common mechanism of catalysis (discussed below), involving a small, organic cofactor. As other druggable classes of enzymes, such as the protein kinases, share this mechanistic feature, it is likely that the PMTs will be similarly amenable to inhibition by small, organic molecules.

The PKMTs and PRMTs catalyse methyl transfer from their universal methyl donor, S-adenosyl-l-methionine (SAM; also known as AdoMet) (FIG. 2), to a nitrogen atom of lysine or arginine side chains (REF. 3). Protein substrate specificity can be stringent in these enzymes; some PKMTs seem to selectively methylate a particular lysine residue on a specific histone, and the extent of methylation on a single lysine residue (that is, mono-, di- or tri-methylation) that is catalysed by a particular enzyme can also

be stringent (discussed below). Clearly, the structure of each enzyme is unique (REFS 16,24), as is the resulting biology and pathobiology associated with each enzyme. Yet, the shared chemical mechanisms of the PKMTs and PRMTs allow for certain efficiencies and economies in the discovery of selective drugs for these enzymes, by treating them as a target class (REFS 25–27). Several of these enzymes have been found to catalyse methyl transfer to lysine or arginine residues on a number of cellular proteins; this is especially the case for the PRMTs, for which several cytosolic substrates have been identified (REFS 24,28). With respect to gene regulation, however, the most important targets for both PKMTs and PRMTs are likely to be the histones, as post-translational modification of these proteins is clearly a determinant of chromatin remodelling and therefore regulation of gene transcription.

Table 1 | Epigenetic-enzyme inhibitors for cancer therapy
Generic name | Alternative names | Clinical status*
DNMT inhibitors
5-azacitidine | Vidaza | Approved in the United States for myelodysplastic syndrome
Decitabine | Dacogen | Approved in the United States for myelodysplastic syndrome
HDAC inhibitors
Vorinostat | Zolinza | Approved in the United States for cutaneous manifestation in cutaneous T cell lymphoma
Romidepsin | FK228 | New drug application filing
Panobinostat | LBH-589 | Phase II
Belinostat | PXD-101 | Phase II
Entinostat | MS-275, SNDX-275 | Phase II
MGCD-0103 | MG-0103 | Phase II
JNJ-26481585 | None | Phase I
Givinostat | ITF2357 | Phase II
*See REFS 51,52 for comprehensive reviews of novel cancer therapies, including those described in the table. DNMT, DNA methyltransferase; HDAC, histone deacetylase.

SAH
S-adenosyl-l-homocysteine, the universal product of all enzymatic methyltransferase reactions, formed by methyl group transfer from S-adenosyl-l-methionine.

Representation of PMTs in the human genome
The PMT target class is represented in many species, and the human genome encodes several PKMTs and PRMTs. Attempts to quantify the representation of PKMTs in a particular organism, and to understand the relatedness of these proteins to one another, have focused on the sequence alignment of the SET domain because, as discussed above, this domain is common to all PKMTs (except DOT1L).

Several attempts have been made to systematically group the PKMTs on the basis of sequence homology and substrate (REFS 14,29). For example, in 2007, a nomenclature convention was proposed for the PKMTs, along with other types of chromatin-modifying enzymes (REF. 29). In this study, 24 human PKMTs were identified. These SET domain PKMTs have been divided into related families on the basis of sequence alignment; initially four, and later seven, major families were defined in this manner (REFS 14,16): the suppressor of variegation 3-9 (SUV39) family; the SET1 (also known as MLL) family; the SET2 (also known as NSD) family; the retinoblastoma protein-interacting zinc finger protein (RIZ) (also known as PRDM) family; the SET and MYND domain-containing (SMYD) family; the enhancer of zeste (EZ) family; and the SUV4-20 family. An eighth family, known as 'others', included the enzymes SETD7 and SETD8 (also known as PRSET7). Finally, DOT1L, the human, non-SET domain PKMT, can be considered a ninth family of PKMTs.

Our group has recently extended this work to systematically identify all of the SET domains that are encoded by the human genome. This study has more than doubled the number of putative human PKMTs to 52 (51 SET domain proteins plus DOT1L) (L.F. Jerva, K.O. Elliston, V.M.R., M.E.S. and R.A.C., unpublished observations). From the perspective of drug discovery, the salient point is that these enzymes are numerous in humans.

The PRMTs are similarly well represented in humans. There are at least eight human PRMTs for which some level of methyltransferase activity has been shown. These proteins have a canonical sequence domain that is associated with the binding sites for the cofactor and substrate (arginine), although the sequence conservation among these proteins is low. Estimates of the total number of PRMTs that are encoded by the human genome vary, depending on the method of sequence alignment and the level of alignment stringency that is applied. Nevertheless, it is clear that 10–50 of these enzymes are represented in humans.

The human PMTs are thus a large class of enzymes, and several of them already have well established disease association (discussed above). Furthermore, owing to the common features of their chemical mechanism of catalysis (discussed below), the PMTs are likely to be inherently tractable as targets for small-molecule drug intervention. The PMT target class, as defined here, therefore provides an important pool of potential targets for drug discovery efforts.
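The family grouping and counts quoted in this section can be tallied in a few lines. The sketch below is purely illustrative: the family names and the numbers 24, 51 and 52 come from the text, while the variable and function names are hypothetical and not part of the original article.

```python
# Illustrative tally only. Family names and counts are taken from the text;
# all identifiers here are hypothetical, chosen for this sketch.
PKMT_FAMILIES = [
    "SUV39",
    "SET1 (MLL)",
    "SET2 (NSD)",
    "RIZ (PRDM)",
    "SMYD",
    "EZ",
    "SUV4-20",
    "others (SETD7, SETD8)",
    "DOT1L (non-SET domain)",
]

PKMTS_IDENTIFIED_2007 = 24  # human PKMTs in the 2007 nomenclature study
SET_DOMAIN_PROTEINS = 51    # SET domain proteins found in the genome-wide survey
NON_SET_PKMTS = 1           # DOT1L


def putative_pkmt_total():
    """Total putative human PKMTs: SET domain proteins plus DOT1L."""
    return SET_DOMAIN_PROTEINS + NON_SET_PKMTS


def more_than_doubled(before, after):
    """True when `after` exceeds twice `before`."""
    return after > 2 * before


if __name__ == "__main__":
    total = putative_pkmt_total()
    print(len(PKMT_FAMILIES), "families;", total, "putative PKMTs")
    print("more than doubled:", more_than_doubled(PKMTS_IDENTIFIED_2007, total))
```

Run as a script, this reports nine families and 52 putative PKMTs, and confirms that 52 exceeds twice the 24 enzymes identified earlier, consistent with the statement that the number was more than doubled.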

The PMT active site
The pursuit of the PMTs as a drug target class is facilitated by a rich literature base of crystallographic and enzyme kinetic studies of these enzymes that have helped to define their mechanisms of catalysis. All of these enzymes probably use a common bimolecular nucleophilic substitution (SN2) methyl transfer mechanism (REFS 3,24,30). The lone pair of electrons of a nitrogen atom (from lysine or arginine) attacks the electrophilic methylsulphonium cation of SAM at a 180° angle to the leaving group, to form a penta-coordinate carbon transition state. The transition state structure then collapses, with methyl group relocation to the nitrogen atom of the lysine or arginine side chain and formation of S-adenosyl-l-homocysteine (SAH; also known as AdoHcy) (FIG. 2) as a product.

The use of a naturally occurring adenosyl analogue as the universal donor of group transfer by PMTs is reminiscent of protein kinases, another large family of druggable enzyme targets, the ATP-binding pockets of which have proved to be highly tractable targets for drug discovery (REFS 31,32). Furthermore, despite binding a

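focus on the PMTs, and in particular on those aspects that make PMTs attractive targets for drug discovery efforts. We summarize the data that contribute to the validation of PMTs as targets for specific human diseases, as well as the structural and mechanistic data that suggest PMTs are a tractable (that is, druggable) target class.

Table 1 | Epigenetic-enzyme inhibitors for cancer therapy
Generic name | Alternative names | Clinical status*
DNMT inhibitors
5-azacitidine | Vidaza | Approved in the United States for myelodysplastic syndrome
Decitabine | Dacogen | Approved in United States for myelodysplastic syndrome
HDAC inhibitors
Vorinostat | Zolinza | Approved in United States for cutaneous manifestation in cutaneous T cell lymphoma
Romidepsin | FK228 | New drug application filing
Panobinostat | LBH-589 | Phase II
Belinostat | PXD-101 | Phase II
Entinostat | MS-275, SNDX-275 | Phase II
MGCD-0103 | MG-0103 | Phase II
JNJ-26481585 | None | Phase I
Givinostat | ITF2357 | Phase II
*See REFS 51,52 for comprehensive reviews of novel cancer therapies, including those described in the table. DNMT, DNA methyltransferase; HDAC, histone deacetylase.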

Figure 1 | A nucleosome and the post-translational histone protein modifications that can influence epigenetic regulation of gene transcription. Modifications of the histone protein tail are shown: changes in acetylation (Ac) by acetyltransferases and deacetylases, phosphorylation (P) by kinases, ubiquitylation (Ub) by ligases and changes in methylation (Me) by methyltransferases and demethylases. The enzyme families that are responsible for the various post-translational modifications are shown. PKMT, protein lysine methyltransferase; PRMT, protein arginine methyltransferase.
[Figure labels: PKMTs, 52 family members; PRMTs, 10 family members; deacetylases, ~18 family members; demethylases, ~30 family members; acetyltransferases; kinases; ligases.]

Representation of PMTs in the human genome
The PMT target class is represented in many species, and the human genome encodes several PKMTs and PRMTs. Attempts to quantify the representation of PKMTs in a particular organism, and to understand the relatedness of these proteins to one another, have focused on the sequence alignment of the SET domain because, as discussed above, this domain is common to all PKMTs (except DOT1L).

Several attempts have been made to systematically group the PKMTs on the basis of sequence homology and substrate (REFS 14,29). For example, in 2007, a nomenclature convention was proposed for the PKMTs, along with other types of chromatin-modifying enzymes (REF. 29). In this study, 24 human PKMTs were identified. These SET domain PKMTs have been divided into related families on the basis of sequence alignment; initially four, and later seven, major families were defined in this manner (REFS 14,16): the suppressor of variegation 39 (SUV39) family; the SET1 (also known as MLL) family; the SET2 (also known as NSD) family; the retinoblastoma protein-interacting zinc finger protein (RIZ) (also known as PRDM) family; the SET and MYND domain-containing (SMYD) family; the enhancer of zeste (EZ) family; and the SUV4-20 family. An eighth family, known as 'others', included the enzymes SETD7 and SETD8 (also known as PR-SET7). Finally, DOT1L, the human, non-SET domain PKMT, can be considered a ninth family of PKMTs. Our group has recently extended this work to systematically identify all of the SET domains that are encoded by the human genome. This study has more than doubled the number of putative human PKMTs to 52 (51 SET domain proteins plus DOT1L) (L.F. Jerva, K.O. Elliston, V.M.R., M.E.S. and R.A.C., unpublished observations). From the perspective of drug discovery, the salient point is that these enzymes are numerous in humans.

The PRMTs are similarly well represented in humans. There are at least eight human PRMTs for which some level of methyltransferase activity has been shown. These proteins have a canonical sequence domain that is associated with the binding sites for the cofactor and substrate (arginine), although the sequence conservation among these proteins is low. Estimates of the total number of PRMTs that are encoded by the human genome vary, depending on the method of sequence alignment and the level of alignment stringency that is applied. Nevertheless, it is clear that 10–50 of these enzymes are represented in humans.

The human PMTs are thus a large class of enzymes, and several of them already have well established disease association (discussed above). Furthermore, owing to the common features of their chemical mechanism of catalysis (discussed below), the PMTs are likely to be inherently tractable as targets for small-molecule drug intervention. The PMT target class, as defined here, therefore provides an important pool of potential targets for drug discovery efforts.
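As a reading aid, the family grouping and the counts quoted in this section can be restated as a small data structure. This is only an illustrative sketch: the names `PKMT_FAMILIES`, `SET_DOMAIN_PROTEINS`, `NON_SET_PKMTS`, `putative_human_pkmts` and `set_domain_families` are hypothetical, and the values simply repeat figures given in the text (nine families; 51 SET domain proteins plus DOT1L).

```python
# Illustrative sketch only: all names here are hypothetical and the values
# restate numbers quoted in the review text, not an independent data source.

# The nine PKMT families named in the text, mapped to the alternative family
# name given in parentheses there (None where the text gives no alias).
PKMT_FAMILIES = {
    "SUV39": None,
    "SET1": "MLL",
    "SET2": "NSD",
    "RIZ": "PRDM",
    "SMYD": None,
    "EZ": None,
    "SUV4-20": None,   # hyphenated spelling assumed
    "others": None,    # includes SETD7 and SETD8
    "DOT1L": None,     # the only non-SET-domain family
}

SET_DOMAIN_PROTEINS = 51  # SET domain proteins reported for the human genome
NON_SET_PKMTS = 1         # DOT1L


def putative_human_pkmts() -> int:
    """Total putative human PKMTs: SET domain proteins plus DOT1L."""
    return SET_DOMAIN_PROTEINS + NON_SET_PKMTS


def set_domain_families() -> list[str]:
    """Families whose members carry a SET domain (everything except DOT1L)."""
    return [name for name in PKMT_FAMILIES if name != "DOT1L"]
```

Calling `putative_human_pkmts()` reproduces the total of 52 quoted above, and `set_domain_families()` returns the eight SET domain families.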

PKMTs and PRMTs in human disease
In surveying the histone-modifying enzymes of the human genome, the enzymes that catalyse methylation of lysine residues (protein lysine methyltransferases (PKMTs)) and arginine residues (protein arginine methyltransferases (PRMTs)) are of substantial interest from the perspective of drug discovery and medicinal chemistry. The action of these enzymes is crucial in controlling gene regulation, and there is an increasing amount of biochemical and biological data to suggest that the enzymatic activities of several of these proteins have pathogenic roles in cancer, inflammatory diseases, neurodegenerative diseases and other conditions of importance (REFS 7–15).

For example, with the exception of DOT1-like, histone H3 methyltransferase (DOT1L; also known as KMT4), all human PKMTs contain a ~130 amino-acid domain, referred to as the SET domain, which constitutes the catalytic domain of these enzymes (REFS 14–16). Enhancer of zeste homologue 2 (EZH2; also known as KMT6) is a SET domain protein that forms the catalytic subunit of the 4–5-protein core of polycomb repressive complex 2 (PRC2). PRC2 is a PKMT that catalyses the methylation of lysine 27 of histone H3 (in the nomenclature of histone modification, this site is referred to as H3K27). Although EZH2 contains the catalytic active site, all of the proteins of the PRC2 complex are required for full PKMT activity. Overexpression of EZH2 or another PRC2 subunit, suppressor of zeste 12 homologue (SUZ12), has been associated with numerous human cancer types, including prostate, breast, bladder, colon, skin, liver, endometrial, lung and gastric cancers, as well as lymphomas and myelomas (REF. 15). In breast carcinomas, increased levels of EZH2 have been shown to correlate with increased invasiveness and proliferation rate; it has been suggested that EZH2 could be a prognostic indicator of patient outcome for breast cancer (REF. 10). In cell culture, overexpression of EZH2 in breast epithelial cells causes anchorage-independent cell growth and increased invasiveness.
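The site nomenclature used in this section (for example, H3K27 for lysine 27 of histone H3) is regular enough to be decoded mechanically. The following is a minimal sketch under stated assumptions: `HistoneSite`, `parse_site`, `RESIDUES` and `SITE_PATTERN` are hypothetical names, and the parser handles only lysine (K) and arginine (R) sites on histones H2A, H2B, H3 and H4, which covers the labels used in this review (such as H3K27 and H3R17).

```python
import re
from typing import NamedTuple

# One-letter amino-acid codes for the two residue types methylated by PMTs.
RESIDUES = {"K": "lysine", "R": "arginine"}

# Histone name, residue code, then residue position, e.g. "H3K27".
# Assumption: only core histones H2A, H2B, H3 and H4 are accepted.
SITE_PATTERN = re.compile(r"^(H(?:2A|2B|3|4))([KR])(\d+)$")


class HistoneSite(NamedTuple):
    histone: str
    residue: str
    position: int


def parse_site(label: str) -> HistoneSite:
    """Decode a label such as 'H3K27' into histone, residue and position."""
    match = SITE_PATTERN.match(label)
    if match is None:
        raise ValueError(f"unrecognized histone site label: {label!r}")
    histone, code, position = match.groups()
    return HistoneSite(histone, RESIDUES[code], int(position))
```

For instance, `parse_site("H3K27")` yields histone `H3`, residue `lysine` and position `27`; labels outside the assumed pattern raise `ValueError` rather than being guessed at.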

The PMT active site
The pursuit of the PMTs as a drug target class is facilitated by a rich literature base of crystallographic and enzyme kinetic studies of these enzymes that have helped to define their mechanisms of catalysis. All of these enzymes probably use a common bimolecular nucleophilic substitution (SN2) methyl transfer mechanism (REFS 3,24,30). The lone pair electrons of a nitrogen atom (from lysine or arginine) attack the electrophilic methylsulphonium cation of SAM at a 180° angle to the leaving group, to form a penta-coordinate carbon transition state. The transition state structure then collapses, with methyl group relocation to the nitrogen atom of the lysine or arginine side chain and formation of S-adenosyl-L-homocysteine (SAH; also known as AdoHcy) (FIG. 2) as a product.

The use of a naturally occurring adenosyl analogue as the universal group transfer donor by PMTs is reminiscent of protein kinases, another large family of druggable enzyme targets, the ATP-binding pockets of which have proved to be highly tractable targets for drug discovery (REFS 31,32). Furthermore, despite binding a


common natural ligand, the ATP-binding pockets of protein kinases have afforded medicinal chemists a rich diversity of chemical scaffolds, which have resulted in a range of drug molecules of varying degrees of target selectivity (REF. 32). Similarly, the commonality of SAM use by the PMTs belies the structural, biological and pathological diversity of these enzymes. From the perspective of drug discovery and medicinal chemistry, the diversity of SAM-binding modes and catalytic mechanisms of these enzymes is of key importance.

A common structural feature of PKMTs and PRMTs that distinguishes these enzymes from other proteins that use SAM is the overall architecture of their extended catalytic active sites. This generally consists of a SAM-binding pocket that is accessed from one face of the protein, and a narrow, hydrophobic, acceptor (that is, lysine or arginine) channel that extends to the opposite face of the protein surface, such that the two substrates enter the active site from opposite sides of the enzyme surface.

Table 2 | Selected PKMTs and PRMTs that have shown an association with human cancers
Protein methyltransferase | Methylation substrates | Cancers | Cancer association | Refs
SUV39H1 | H3K9 | Colon cancer | Increased expression in colorectal tumours; associated with transcriptional repression | 53
EHMT2 | H3K9 | Lung, prostate and hepatocellular carcinoma | Increased expression in lung cancer cell lines; regulates centrosome duplication, presumably through chromatin structure | 54,55
MLL | H3K4 | Leukaemia | Chromosomal aberrations involving MLL are a cause of acute leukaemias; the SET domain is lost in translocation | 56–58
NSD1 | H3K36 | Acute myeloid leukaemia | Translocation fuses NSD1 to nucleoporin 98 in human acute myeloid leukaemia | 59
WHSC1 | H3K36 and H4K20 | Myeloma | Translocated and increased expression in myeloma; associated with transcriptional regulation | 60–62
WHSC1L1 | H3K4 | Lung and breast cancers, and childhood acute myeloid leukaemia | Amplified in lung cancer and breast cancer; translocation with nucleoporin 98; mediates transcriptional activation | 63,64
DOT1L | H3K79 | MLL-rearranged leukaemias | Recruited by MLL fusion partners MLLT1, MLLT2, MLLT3 and MLLT10 to homeobox genes; associated with transcriptional activation and elongation | 11,66,67
SMYD3 | H3K4 | Breast, liver, colon and gastric cancers | Overexpressed in multiple tumour types; associated with transcriptional activation | 68,69
EZH2 | H3K27 | Breast, prostate, colon, gastric, bladder and liver cancers, melanoma and lymphoma | Amplified and increased expression in several tumour types; a member of the polycomb repressive complex 2; associated with transcriptional repression | 10,15,70,71
SETD7 | H3K4 | Breast cancers | SET7-mediated methylation stabilizes the oestrogen receptor and is necessary for the recruitment of the oestrogen receptor to its target genes and target gene transactivation | 72
PRDM14 | No known substrate | Breast cancers | Amplified and overexpressed in cancers; associated with transcriptional repression | 73
CARM1 | H3R17, EP300–CBP and NCOA3 | Breast and prostate cancers | Increased expression correlates with androgen independence in human prostate carcinoma; overexpressed in breast tumours and associated with transcriptional activation | 74,75
PRMT5 | H3R8, p53, SNRPD1, SNRPD3 and SUPT5H | Lymphoma | PRMT5 expression and H3R8 methylation levels are increased in lymphoid cancer cells; PRMT5 mediates p53 methylation, which promotes cell arrest rather than cell death; H4R3 methylation promotes recruitment of DNMT3A, subsequent promoter CpG methylation and gene silencing | 12,76
CARM1, coactivator-associated arginine methyltransferase 1 (also known as PRMT4); CBP, CREB-binding protein; DNMT3A, DNA (cytosine-5-)-methyltransferase 3α; EHMT2, euchromatic histone-lysine N-methyltransferase 2 (also known as G9A and KMT1C); EP300, E1A-binding protein p300; EZH2, enhancer of zeste homologue 2 (also known as KMT6); DOT1L, DOT1-like, histone H3 methyltransferase (also known as KMT4); MLL, myeloid, lymphoid or mixed-lineage leukaemia (also known as KMT2A); MLLT1, myeloid, lymphoid or mixed-lineage leukemia, translocated to 1; NCOA3, nuclear receptor coactivator 3; NSD1, nuclear receptor-binding SET domain protein 1; PKMT, protein lysine methyltransferase; PRDM14, PR domain-containing protein 14; PRMT, protein arginine methyltransferase; SETD7, SET domain-containing lysine methyltransferase 7 (also known as KMT7); SMYD3, SET and MYND domain-containing protein 3; SNRPD1, small nuclear ribonucleoprotein D1 polypeptide 16kDa (also known as SMD1); SNRPD3, small nuclear ribonucleoprotein D3 polypeptide 18kDa (also known as SMD3); SUPT5H, suppressor of Ty 5 homologue; SUV39H1, suppressor of variegation 39 homologue 1 (also known as KMT1A); WHSC1, Wolf–Hirschhorn syndrome candidate 1 (also known as MMSET and NSD2); WHSC1L1, Wolf–Hirschhorn syndrome candidate 1-like protein 1 (also known as NSD3).

Figure 2 | PMT-catalysed methylation of proteins by an SN2 reaction with SAM as the methyl donor. The protein methyltransferases (PMTs) catalyse methyl transfer from their universal methyl donor, S-adenosyl-L-methionine (SAM; also known as AdoMet) to a nitrogen atom of lysine or arginine side chains to form S-adenosyl-L-homocysteine (SAH; also known as AdoHcy). a | The methyl group (shown in red) of the SAM sulphonium cation is attacked by the lone pair electrons of a lysine (shown here) or arginine (not shown) side-chain nitrogen atom. The reaction results in transfer of the methyl group to the attacking nitrogen atom and the production of SAH from the reaction cofactor. b | A more generalized chemical scheme of a bimolecular nucleophilic substitution (SN2) group transfer reaction, illustrating the attacking nucleophile (Nu; lysine or arginine in the case of PMTs), the leaving group (LG; the methyl group in the case of PMTs), and the transient but essential formation of a penta-coordinate carbon transition state (‡).
[Figure panels: a, chemical structures of the lysine side chain, SAM and SAH; b, generalized scheme with Nu and LG.]

Crystallographic studies have revealed two distinct binding modes for SAM or SAH in the cofactor-binding pockets of PMTs (REF. 24). For the SET domain PKMTs that have been co-crystallized with SAM or SAH, it is known that the cofactor adopts a U-shaped configuration within the active site (FIG. 3) that aligns the methylsulphonium cation of SAM at the base of the narrow lysine channel, in perfect juxtaposition to the ε-amino group of the acceptor lysine residue, which facilitates group transfer. This U-shaped configuration is induced by a conserved aspartate or glutamate residue that binds to the ribose hydroxyl groups, and a positively charged lysine or arginine residue that forms a salt bridge with the carboxylate of SAM. In striking contrast to the U-shaped configuration that is adopted by the cofactor when bound to PKMTs, SAM bound within the active site of PRMTs adopts an extended configuration that resembles the extended SAM configuration seen in the DNA methyltransferases; again, the binding motif results in alignment of the SAM methylsulphonium cation with the base of the acceptor-binding channel. Another distinction between cofactor binding within the PKMTs and the PRMTs is that, in PRMTs, dimer formation seems to be a crucial component of SAM binding and catalysis, whereas this is not the case for PKMTs (REFS 3,24). The mechanistic consequences of obligate dimer formation in the PRMTs are not yet clear, but it may be involved in multiple methylations of the arginine residue.

From the above discussion, it could be concluded that the configuration of the bound SAM is structurally related to the identity of the methyl acceptor nitrogen species upon which the enzymes act; that is, U-shaped for PKMTs and extended for PRMTs. However, data on the non-SET domain PKMT, DOT1L, do not support this conclusion. In the co-crystal structures of human DOT1L bound to SAM (REF. 33), and the yeast homologue Dot1p bound to SAH (REF. 34), the cofactor is bound in the extended configuration, similar to that seen in the PRMTs. Additionally, the solvent-exposed surface area of the bound cofactor in DOT1L is more similar to that seen in the PRMTs than the PKMTs, as is the overall amino-acid sequence around the cofactor-binding pocket (REFS 24,33). Therefore, from a structural perspective, DOT1L seems to link the PKMT and PRMT groups of PMTs.

The discovery and optimization of selective drugs for the PMTs will depend not only on the static structure of the active site of the enzyme, as revealed through crystallographic studies, but also on the structural dynamics of the active site that accompany catalytic turnover (REFS 27,35). Studies on the kinetic mechanisms of the PMTs may provide some information in this area.

Some of the SET domain PKMTs, such as SETD7, perform a single round of catalysis on a lysine residue, resulting in a mono-methylated product, whereas other SET domain PKMTs catalyse multiple rounds of methylation on a specific lysine residue. Crystallographic studies suggest that the difference between single-turnover and multiple-turnover SET domain enzymes results from the degree of steric crowding and hydrogen-bonding patterns in the lysine-binding channel of these enzymes (REFS 3,24,36,37). In particular, the identity of an aromatic residue within the lysine-binding pocket seems to be the key determinant of the multiplicity of lysine methylation. In the PKMT DIM5, this residue is a phenylalanine (F281), and the enzyme can tri-methylate the acceptor lysine residue of its protein substrate. The corresponding residue in SETD7 is a tyrosine (Y305), and this enzyme can only mono-methylate its protein substrate. Remarkably, the mutant F281Y transforms DIM5 into a mono-methylating PKMT, and the corresponding mutant Y305F in SETD7 results in an enzyme that is capable of multiple rounds of lysine methylation (REF. 38). These mutagenesis results have been extended to the PKMT euchromatic histone lysine N-methyltransferase 2 (EHMT2; also known as G9A) (REF. 39), and the tyrosine–phenylalanine switch seems to be a general determinant of product specificity among the SET domain PKMTs (REF. 24). Molecular dynamics and hybrid quantum mechanical–molecular mechanical studies also suggest a key role for bound water molecules (a 'water channel') in the extent of lysine methylation by PKMTs (REF. 30).

An outstanding question that has yet to be reconciled with the mechanistic hypothesis described above is how the quaternary nitrogen atom is deprotonated to generate a neutral amine methyl acceptor. At physiological pH, the lysine amine is protonated (the negative logarithm of the acid dissociation constant (pKa) of the side chain amine is ~10.8 (REF. 35)), and so there are no lone pair electrons to act as the attacking nucleophile in the
methyl transfer from of lysine PKMTs their universal methyl donor, -adenosyl-l-methionine An outstanding question that S has yet to be reconciled known as AdoMet)described (FIG. 2), to above a nitrogen atom (SAM; with the also mechanistic hypothesis is how 3 . Protein substrate speofquaternary lysine or arginine side chains the nitrogen atom is deprotonated to genercificity can amine be stringent in acceptor. these enzymes; some PKMTs ate a neutral methyl At physiological seem selectively a particular lysine residue pH, theto lysine aminemethylate is protonated (the negative logon a specific histone, and the extent of methylation arithm of the acid dissociation constant (pKa) of the side on a single lysine residue (that is, mono-, di- are or tri-methyl(REF. 35)), and so there no lone chain amine is ~10.8 ation) that is catalysed by a particular enzymein can also pair electrons to act as the attacking nucleophile the
VOLUME 8 | SEPTEMBER 2009 | 725 www.nature.com/reviews/drugdisc S7

NATURE REVIEWS | Drug Discovery 728 | SEPTEMBER 2009 | VOLUME 8 NATURE REPRINT COLLECTION Epigenetics

REVIEWS
Table 1 | Epigenetic-enzyme inhibitors for cancer therapy

Generic name | Alternative names | Clinical status*

DNMT inhibitors
5-Azacitidine | Vidaza | Approved in the United States for myelodysplastic syndrome
Decitabine | Dacogen | Approved in the United States for myelodysplastic syndrome

HDAC inhibitors
Vorinostat | Zolinza | Approved in the United States for cutaneous manifestation in cutaneous T cell lymphoma
Romidepsin | FK228 | New drug application filing
Panobinostat | LBH-589 | Phase II
Belinostat | PXD-101 | Phase II
Entinostat | MS-275, SNDX-275 | Phase II
MGCD-0103 | MG-0103 | Phase II
JNJ-26481585 | None | Phase I
Givinostat | ITF2357 | Phase II

*See REFS 51,52 for comprehensive reviews of novel cancer therapies, including those described in the table. DNMT, DNA methyltransferase; HDAC, histone deacetylase.

Figure 3 | Variations in the configuration of SAM or SAH bound within the active sites of different PMTs. a | The representative conformation shown for the protein arginine methyltransferases (PRMTs) was taken from the crystal structure of S-adenosyl-L-homocysteine (SAH; also known as AdoHcy) bound to coactivator-associated arginine methyltransferase 1 (CARM1)49. b | The conformation shown for DOT1-like, histone H3 methyltransferase (DOT1L) was taken from the crystal structure of S-adenosyl-L-methionine (SAM; also known as AdoMet) bound to this protein33. c | The representative conformation shown for the protein lysine methyltransferases (PKMTs) was taken from the crystal structure of SAH bound to SET domain-containing lysine methyltransferase 8 (REF. 50). Carbon atoms are represented by grey circles; nitrogen atoms are represented by blue circles; oxygen atoms are represented by red circles; and sulphur atoms are represented by yellow circles.

be stringent (discussed below). Clearly, the structure of each enzyme is unique16,24, as is the resulting biology and pathobiology associated with each enzyme. Yet, the shared chemical mechanisms of the PKMTs and PRMTs allow for certain efficiencies and economies in the discovery of selective drugs for these enzymes, by treating them as a target class25–27. Several of these enzymes have been found to catalyse methyl transfer to lysine or arginine residues on a number of cellular proteins; this is especially the case for the PRMTs, for which several cytosolic substrates have been identified24,28. With respect to gene regulation, however, the most important targets for both PKMTs and PRMTs are likely to be the histones, as post-translational modification of these proteins is clearly a determinant of chromatin remodelling and therefore regulation of gene transcription.

SAH
S-adenosyl-L-homocysteine, the universal product of all enzymatic methyltransferase reactions, formed by methyl group transfer from S-adenosyl-L-methionine.

General base catalysis
A mechanism that can occur in enzyme catalysis, in which a basic group accepts protons from a substrate molecule, usually to stabilize charged transition-state species.
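The deprotonation argument in the main text rests on two side-chain pKa values for the lysine amine: ~10.8 for the unperturbed residue and ~8.2 once SAM and the protein substrate are bound. As a rough illustration of what that shift means, the sketch below applies the Henderson-Hasselbalch relation to estimate the fraction of amine in the neutral, nucleophilic form. The pH of 7.4 and the function name are assumptions made for this example; neither comes from the article.

```python
def neutral_fraction(pka: float, ph: float = 7.4) -> float:
    """Fraction of a monoprotic amine present in the neutral, deprotonated form.

    Follows from the Henderson-Hasselbalch relation:
    [B] / ([B] + [BH+]) = 1 / (1 + 10**(pKa - pH)).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# pKa values are those quoted in the text; pH 7.4 is an assumed physiological value.
free_lysine = neutral_fraction(10.8)
bound_lysine = neutral_fraction(8.2)
fold_increase = bound_lysine / free_lysine

print(f"neutral fraction at pKa 10.8: {free_lysine:.1e}")
print(f"neutral fraction at pKa 8.2:  {bound_lysine:.2f}")
print(f"fold increase: {fold_increase:.0f}")
```

Under these assumptions, only about 0.04% of the amine is neutral at a pKa of 10.8, against roughly 14% at a pKa of 8.2, an increase of a few hundred-fold. This is a bulk-solution estimate only and ignores the buried, water-shuttle environment that the simulations describe.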

Representation of PMTs in the human genome
The PMT target class is represented in many species, and the human genome encodes several PKMTs and PRMTs. Attempts to quantify the representation of PKMTs in a particular organism, and to understand the relatedness of these proteins to one another, have focused on the sequence alignment of the SET domain because, as discussed above, this domain is common to all PKMTs (except DOT1L).

Several attempts have been made to systematically group the PKMTs on the basis of sequence homology and substrate14,29. For example, in 2007, a nomenclature convention was proposed for the PKMTs, along with other types of chromatin-modifying enzymes29. In this study, 24 human PKMTs were identified. These SET domain PKMTs have been divided into related families on the basis of sequence alignment; initially four, and later seven, major families were defined in this manner14,16: the suppressor of variegation 39 (SUV39) family; the SET1 (also known as MLL) family; the SET2 (also known as NSD) family; the retinoblastoma protein-interacting zinc finger protein (RIZ) (also known as PRDM) family; the SET and MYND domain-containing (SMYD) family; the enhancer of zeste (EZ) family; and the SUV4-20 family. An eighth family, known as 'others', included the enzymes SETD7 and SETD8 (also known as PRSET7). Finally, DOT1L, the human, non-SET domain PKMT, can be considered a ninth family of PKMTs.

Our group has recently extended this work to systematically identify all of the SET domains that are encoded by the human genome. This study has more than doubled the number of putative human PKMTs to 52 (51 SET domain proteins plus DOT1L) (L.F. Jerva, K.O. Elliston, V.M.R., M.E.S. and R.A.C., unpublished observations). From the perspective of drug discovery, the salient point is that these enzymes are numerous in humans.

The PRMTs are similarly well represented in humans. There are at least eight human PRMTs for which some level of methyltransferase activity has been shown. These proteins have a canonical sequence domain that is associated with the binding sites for the cofactor and substrate (arginine), although the sequence conservation among these proteins is low. Estimates of the total number of PRMTs that are encoded by the human genome vary, depending on the method of sequence alignment and the level of alignment stringency that is applied. Nevertheless, it is clear that 10–50 of these enzymes are represented in humans.

The human PMTs are thus a large class of enzymes, and several of them already have well established disease association (discussed above). Furthermore, owing to the common features of their chemical mechanism of catalysis (discussed below), the PMTs are likely to be inherently tractable as targets for small-molecule drug intervention. The PMT target class, as defined here, therefore provides an important pool of potential targets for drug discovery efforts.

The PMT active site
The pursuit of the PMTs as a drug target class is facilitated by a rich literature base of crystallographic and enzyme kinetic studies of these enzymes that have helped to define their mechanisms of catalysis. All of these enzymes probably use a common bimolecular nucleophilic substitution (SN2) methyl transfer mechanism3,24,30. The lone pair electrons of a nitrogen atom (from lysine or arginine) attack the electrophilic methylsulphonium cation of SAM at a 180° angle to the leaving group, to form a penta-coordinate carbon transition state. The transition state structure then collapses, with methyl group relocation to the nitrogen atom of the lysine or arginine side chain and formation of S-adenosyl-L-homocysteine (SAH; also known as AdoHcy) (FIG. 2) as a product.

The use of a naturally occurring adenosyl analogue as the universal group transfer donor by PMTs is reminiscent of protein kinases, another large family of druggable enzyme targets, the ATP-binding pockets of which have proved to be highly tractable targets for drug discovery31,32. Furthermore, despite binding a

SN2-mediated methyl transfer reaction. A potential mechanism of deprotonation is through general base catalysis. However, inspection of the amino acids in the active sites of PKMTs reveals no obvious basic side chains that could act in this capacity. Another hypothesis is that the solvent acts as a proton sponge; however, this seems inconsistent with the fact that the lysine side chain is buried deeply in the protein, with no clear access to bulk solvent. An alternative hypothesis has recently been proposed, based on molecular dynamics simulations30. According to this model, binding of SAM and protein substrates creates a water shuttle that can remove a proton from the buried lysine side chain and ferry this proton along a contiguous chain of water molecules to be deposited into the bulk solvent. Additionally, the electrostatic repulsion created by charges on the quaternary nitrogen atom and the positively charged SAM cofactor lowers the pKa of the lysine side-chain amine to ~8.2, thereby facilitating this deprotonation process. Furthermore, the water shuttle hypothesis provides an alternative mechanism to explain the differences in extent of lysine methylation by the PKMTs. The molecular dynamics studies suggest that the ability to form a water shuttle will determine the extent of methylation that is catalysed by a given enzyme. For example, simulations of SETD7-mediated catalysis suggest that mono-methylation of lysine prevents re-formation of a new water shuttle, and so this enzyme terminates catalysis after one round of methylation. The same simulations suggest that other PKMTs, such as the ribulose bisphosphate carboxylase oxygenase large subunit methyltransferase, can readily re-form the water shuttle, leading to multiple rounds of methylation.

Enzymes that perform multiple rounds of catalysis on a macromolecular substrate can do so by one of two mechanisms: a distributive enzyme mechanism, in which each round of catalysis results in macromolecular product dissociation and rebinding, or a processive mechanism, in which multiple rounds of catalysis proceed before dissociation of the macromolecular product. PMTs use both of these mechanisms: some SET domain PKMTs that perform multiple rounds of lysine methylation have been found to use a processive mechanism3,24, whereas DOT1L has been shown to perform multiple rounds of H3K79 methylation through a non-processive (distributive) mechanism40.

The PRMTs are also capable of performing multiple rounds of arginine methylation to produce either mono- or di-methylated arginine products. The PRMTs that have been studied so far follow an ordered, sequential mechanism in which SAM binds before the arginine-containing substrate, and di-methyl arginine production occurs through a processive mechanism3. On the basis of product specificity, PRMTs can be subdivided into two types: type I PRMTs, which produce an asymmetrical N,N-dimethyl arginine; and type II PRMTs, which produce a symmetrical N,N′-dimethyl arginine3.

The variations in active-site structure and chemical mechanism that are summarized above reflect a target class with the potential for substantial chemical diversity among small-molecule modulators of individual enzymes in the class. Therefore, the opportunity for the development of different chemotypes that compete with the common, natural ligands of these enzymes (for example, SAM, lysine and arginine), and can be modified to produce enzyme-selective inhibitors, seems promising.

Known inhibitors of PMTs
Despite the convergence of data concerning PMTs, the search for potent, selective inhibitors of these enzymes has only recently begun in earnest. Some indirect approaches to inhibiting or depleting PMTs have been reported. For example, the antiviral compound 3-deazaneplanocin (DZNep) inhibits the enzyme SAH hydrolase and thereby increases intracellular levels of the universal product of PMTs, SAH41. Product inhibition by SAH would therefore be expected for all PMTs and other SAM-dependent enzymes, with the degree of inhibition for specific enzymes being related to their relative inhibition constant (Ki) and Michaelis constant (Km) values for
SAH and SAM, respectively27. Similarly, the activity of all SAM-dependent enzymes in a cell could be reduced by blocking SAM biosynthesis, for example, by inhibiting dihydrofolate reductase or SAM synthase, which are two enzymes involved in SAM biosynthesis42. Also, the pan-HDAC inhibitor panobinostat has recently been shown to cause depletion of cellular levels of the PMT EZH2 (REF. 43). Although the mechanism by which this

focus on the PMTs, and in particular on those aspects that make PMTs attractive targets for drug discovery efforts. We summarize the data that contribute to the validation of PMTs as targets for specific human diseases, as well as the structural and mechanistic data that suggest PMTs are a tractable (that is, druggable) target class.

PKMTs and PRMTs in human disease
In surveying the histone-modifying enzymes of the human genome, the enzymes that catalyse methylation of lysine residues (protein lysine methyltransferases (PKMTs)) and arginine residues (protein arginine methyltransferases (PRMTs)) are of substantial interest from the perspective of drug discovery and medicinal chemistry. The action of these enzymes is crucial in controlling gene regulation, and there is an increasing amount of biochemical and biological data to suggest that the enzymatic activities of several of these proteins have pathogenic roles in cancer, inflammatory diseases, neurodegenerative diseases and other conditions of importance7–15.

For example, with the exception of DOT1-like, histone H3 methyltransferase (DOT1L; also known as KMT4), all human PKMTs contain a ~130 amino-acid domain, referred to as the SET domain, which constitutes the catalytic domain of these enzymes14–16. Enhancer of zeste homologue 2 (EZH2; also known as KMT6) is a SET domain protein that forms the catalytic subunit of the 4–5-protein core of polycomb repressive complex 2 (PRC2). PRC2 is a PKMT that catalyses the methylation of lysine 27 of histone H3 (in the nomenclature of histone modification, this site is referred to as H3K27). Although EZH2 contains the catalytic active site, all of the proteins of the PRC2 complex are required for full PKMT activity. Overexpression of EZH2 or another PRC2 subunit, suppressor of zeste 12 homologue (SUZ12), has been associated with numerous human cancer types, including prostate, breast, bladder, colon, skin, liver, endometrial, lung and gastric cancers, as well as lymphomas and myelomas15. In breast carcinomas, increased levels of EZH2 have been shown to correlate with increased invasiveness and proliferation rate; it has been suggested that EZH2 could be a prognostic indicator of patient outcome for breast cancer10. In cell culture, overexpression of EZH2 in breast epithelial cells causes anchorage-independent cell growth and increased invasiveness. Additionally, when EZH2-overexpressing cells were injected into the mammary fat pads of nude mice, the animals developed tumours, demonstrating the tumorigenicity of EZH2 overexpression. Importantly, the phenotypic effects of EZH2 overexpression are correlated with increased H3K27 methylation and are dependent on the presence of an intact SET domain, both of which imply a role for EZH2 enzymatic activity in pathogenesis10,15.

Several other human PKMTs and PRMTs are strongly associated with human cancers, as summarized in TABLE 2. Similarly, there is compelling evidence that other PKMTs and PRMTs have a pathogenic role in serious human diseases other than cancer7–9. For example, SET domain, bifurcated 1 (SETDB1)17 and coactivator-associated arginine methyltransferase 1 (CARM1; also known as PRMT4)18 have been implicated in the neurodegenerative diseases Huntington's disease and spinal muscular atrophy, respectively. SET domain-containing lysine methyltransferase 7 (SETD7; also known as KMT7)19, CARM1 (REF. 20) and PRMT1 (REF. 21) have been associated with nuclear factor-κB-related inflammatory diseases, and SET domain-containing protein 1A (SETD1A)22 and CARM1 (REF. 23) have been associated with viral infections involving Herpes simplex virus and human T lymphotrophic virus, respectively. PKMTs and PRMTs are therefore emerging as compelling targets for drug discovery efforts4,5.

PMTs as a drug target class
From a chemical biology and medicinal chemistry perspective, the PKMTs and PRMTs are of interest because they have a common mechanism of catalysis (discussed below), involving a small, organic cofactor. As other druggable classes of enzymes, such as the protein kinases, share this mechanistic feature, it is likely that the PMTs will be similarly amenable to inhibition by small, organic molecules.

The PKMTs and PRMTs catalyse methyl transfer from their universal methyl donor, S-adenosyl-L-methionine (SAM; also known as AdoMet) (FIG. 2), to a nitrogen atom of lysine or arginine side chains3. Protein substrate specificity can be stringent in these enzymes; some PKMTs seem to selectively methylate a particular lysine residue on a specific histone, and the extent of methylation on a single lysine residue (that is, mono-, di- or tri-methylation) that is catalysed by a particular enzyme can also

Figure 1 | A nucleosome and the post-translational histone protein modifications that can influence epigenetic regulation of gene transcription. Modifications of the histone protein tail are shown: changes in acetylation (Ac) by acetyltransferases and deacetylases, phosphorylation (P) by kinases, ubiquitylation (Ub) by ligases and changes in methylation (Me) by methyltransferases and demethylases. The enzyme families that are responsible for the various post-translational modifications are shown. PKMT, protein lysine methyltransferase; PRMT, protein arginine methyltransferase.

Table 3 | Chemical structures and biochemical data for small-molecule inhibitors of PMTs

Compound | Mechanism and potency | Selectivity* | Refs
SAH | Product of the reactions catalysed by PMTs; IC50 values range from 0.1 to 20 μM | Non-selective | 77,78
Sinefungin | Natural product analogue of SAM and SAH; IC50 values range from 0.1 to 20 μM | Non-selective | 36
Chaetocin | SAM-competitive inhibitor of SUV39; IC50 = 0.6 μM | >4-fold | 79
BIX-01294 | SAM-non-competitive inhibitor of EHMT2; IC50 = 2.7 μM | >4-fold | 80
Methylgene compound 7a of REF. 45 | CARM1 inhibitor; IC50 = 60 nM | >100-fold for PRMT1 and SETD7 | 45
Bristol-Myers Squibb compound 7f of REF. 47 | CARM1 inhibitor; IC50 = 40 nM | >100-fold for PRMT1 and PRMT3 | 46,47

CARM1, coactivator-associated arginine methyltransferase 1 (also known as PRMT4); EHMT2, euchromatic histone lysine N-methyltransferase 2 (also known as G9A and KMT1C); IC50, half-maximal inhibitory concentration; PMT, protein methyltransferase; PRMT, protein arginine methyltransferase; SAH, S-adenosyl-L-homocysteine (also known as AdoHcy); SAM, S-adenosyl-L-methionine (also known as AdoMet); SETD7, SET domain-containing lysine methyltransferase 7 (also known as KMT7); SUV39, suppressor of variegation 39. *Selectivity is given as the ratio of the IC50 value for the most potent inhibition at a non-target PMT over the IC50 value for the primary target. See REF. 27.
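The selectivity definition in the Table 3 footnote can be written out as a short calculation. The sketch below is illustrative only: the CARM1 IC50 of 60 nM is taken from the table, but the off-target IC50 values and the function name are hypothetical placeholders, since the table reports only that selectivity exceeds 100-fold.

```python
def selectivity_ratio(ic50_primary: float, ic50_off_targets: dict) -> float:
    """Selectivity as defined in the Table 3 footnote.

    Ratio of the IC50 for the most potently inhibited non-target PMT
    (the smallest off-target IC50) over the IC50 for the primary target.
    All values must share the same concentration unit.
    """
    most_potent_off_target = min(ic50_off_targets.values())
    return most_potent_off_target / ic50_primary


carm1_ic50_nm = 60.0  # primary-target IC50 from Table 3
# Hypothetical off-target values, chosen only to illustrate the arithmetic.
off_target_ic50_nm = {"PRMT1": 7000.0, "SETD7": 9000.0}

ratio = selectivity_ratio(carm1_ic50_nm, off_target_ic50_nm)
print(f"selectivity: {ratio:.0f}-fold")
```

Taking the minimum off-target IC50 makes the reported figure a worst case across the enzymes tested, which is why a compound's stated selectivity can only be as broad as the panel it was screened against.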

Target class
A group of proteins that are related by a common type of drug-binding pocket, but sufficiently diverse that selective inhibition of specific proteins can be achieved, using medicinal chemical elaboration of the basic chemotype structures.

SAM
S-adenosyl-L-methionine, the universal methyl group donor of all enzymatic methyltransferase reactions.

Structure–activity relationship
The relationship between the chemical structure of a compound and its pharmacological activity.

occurs is not yet fully understood, an approach of this type would nevertheless deplete the protein levels of EZH2 and so abolish the PMT catalytic activity of the enzyme along with any other non-enzymatic functions of EZH2.

Direct inhibitors of PMTs have recently been reviewed4, along with other probes of histone-modifying enzymes. Some natural ligands for these enzymes have been known for some time, including the reaction product, SAH, and a natural inhibitor isolated from Streptomyces spp. cultures, sinefungin (TABLE 3). More selective inhibitors have been identified for SUV39 (chaetocin; reported half-maximal inhibitory concentration (IC50) = 0.6 μM) and for EHMT2 (BIX-01294; reported IC50 = 1.6 μM), but no further optimization of these compounds has been reported to date4. A co-crystal structure of BIX-01294 bound to EHMT2 has recently been published44. Surprisingly, the compound was found to bind to the enzyme non-competitively with respect to SAM, in a groove that is normally occupied by a portion of the protein substrate.

More recently, two groups have reported potent, selective, pyrazole-based inhibitors of the PRMT CARM1 (REFS 45–47) (TABLE 3). These compounds are the first examples of inhibitors of a specific PMT that are effective at nanomolar concentrations and display >100-fold selectivity for the primary target over related enzymes. The compound series reported by Methylgene45 was found to be inactive in cellular assays; no cellular data have been reported for the compound series from Bristol-Myers Squibb46,47. Therefore, although an exciting first step has been made towards developing selective inhibitors of PMTs, substantial work remains to be done before these findings can be translated into pharmacologically tractable species.

The paucity of potent, selective, pharmacologically tractable inhibitors of the PMTs creates a crucial therapeutic gap which medicinal chemists should strive to fill. As described here, the pathobiological relevance of these enzymes, together with the structural and mechanistic information that suggests their druggability as a target class, converge to make the PMTs an attractive and important class of novel enzymes for contemporary drug discovery.
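The earlier remark that product inhibition by SAH should vary between enzymes according to their relative Ki (for SAH) and Km (for SAM) values can be made concrete with a simple rate equation. The sketch below assumes purely competitive inhibition of SAM binding by SAH under Michaelis-Menten kinetics, which is a modelling assumption rather than something the text establishes for any particular PMT. The two enzymes and all concentrations and constants are hypothetical.

```python
def fractional_activity(sam: float, sah: float, km_sam: float, ki_sah: float) -> float:
    """Rate with SAH present divided by rate without it, at fixed SAM.

    Assumes competitive inhibition:
        v = Vmax * [SAM] / (Km * (1 + [SAH] / Ki) + [SAM])
    Vmax cancels in the ratio. All concentrations share one unit.
    """
    uninhibited = sam / (km_sam + sam)
    inhibited = sam / (km_sam * (1.0 + sah / ki_sah) + sam)
    return inhibited / uninhibited


# Hypothetical cellular concentrations and kinetic constants (all in micromolar).
sam_conc, sah_conc = 10.0, 5.0
enzyme_a = fractional_activity(sam_conc, sah_conc, km_sam=1.0, ki_sah=1.0)
enzyme_b = fractional_activity(sam_conc, sah_conc, km_sam=1.0, ki_sah=10.0)

print(f"enzyme A retains {enzyme_a:.0%} of its activity")
print(f"enzyme B retains {enzyme_b:.0%} of its activity")
```

With identical Km values, the enzyme that binds SAH more tightly (lower Ki) loses more activity at the same SAH concentration. On this model, an indirect approach such as raising cellular SAH would therefore affect SAM-dependent enzymes unevenly rather than selectively.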


Conclusions
There is a growing body of evidence that enzymes in this target class have important pathogenic roles in human diseases. The structures and enzymatic mechanisms of the PMTs support the view that pharmacological modulation of these enzymes by small-molecule inhibitors will be an effective means of therapeutic intervention in cancer and numerous other unmet medical needs. The discovery of small-molecule inhibitors of PMTs as starting points for drug development should clearly be a key focus of new research efforts. Beyond this goal, there are many opportunities to use chemical probes of PMT function to define the underlying biology and pathobiology that are associated with protein modification by these enzymes. The nature of PMT catalysis, and the available structural information about these enzymes, should facilitate the discovery of PMT ligands through mechanism- and structure-guided discovery methods48, as well as methods that do not rely on mechanistic knowledge, such as high-throughput screening of diverse chemical libraries.

A key remaining question when considering the PMTs as a drug discovery target class is whether or not selective inhibition of particular enzymes can be achieved through targeting the SAM-binding pocket. This is analogous to the question that hindered the early acceptance of protein kinases as drug targets: whether it was possible to achieve selectivity among the ATP-binding pockets of the kinases. In retrospect, it is clear that the diversity of binding-site architecture and the binding-site dynamics associated with enzyme catalysis provide ample opportunities for selective inhibition of kinases through medicinal chemistry efforts. Will the same be true for the SAM-binding pockets of PMTs? Ultimately, structure–activity relationship profiles, selectivity and collateral inhibition of off-target enzymes by PMT inhibitors will need to be determined empirically. Despite these limitations, it is our hope that the data presented here will help to stimulate systematic exploration of the human PMT target class towards the goal of developing selective inhibitors of PMTs as therapeutic agents for human diseases.

1. Strahl, B. D. & Allis, C. D. The language of covalent histone modifications. Nature 403, 41–45 (2000).
2. Kouzarides, T. Chromatin modifications and their function. Cell 128, 693–705 (2007).
A thorough overview of post-translational modifications on core histones, the enzymes that mediate these modifications and the biological functions of the modification.
3. Smith, B. C. & Denu, J. M. Chemical mechanisms of histone lysine and arginine modifications. Biochim. Biophys. Acta 1789, 45–57 (2008).
An excellent review of the chemical biology of lysine- and arginine-modifying enzymes.
4. Cole, P. A. Chemical probes for histone-modifying enzymes. Nature Chem. Biol. 4, 590–597 (2008).
5. Keppler, B. R. & Archer, T. K. Chromatin-modifying enzymes as therapeutic targets - Part 1. Expert Opin. Ther. Targets 12, 1301–1312 (2008).
6. Pray, L. At the flick of a switch: epigenetic drugs. Chem. Biol. 15, 640–641 (2008).
7. Jones, P. A. & Baylin, S. B. The epigenomics of cancer. Cell 128, 683–692 (2007).
8. Wilson, C. B., Rowell, E. & Sekimata, M. Epigenetic control of T-helper-cell differentiation. Nature Rev. Immunol. 9, 91–105 (2009).
9. Tsankova, N., Renthal, W., Kumar, A. & Nestler, E. J. Epigenetic regulation in psychiatric disorders. Nature Rev. Neurosci. 8, 355–367 (2007).
10. Kleer, C. G. et al. EZH2 is a marker of aggressive breast cancer and promotes neoplastic transformation of breast epithelial cells. Proc. Natl Acad. Sci. USA 100, 11606–11611 (2003).
11. Krivtsov, A. V. et al. H3K79 methylation profiles define murine and human MLL-AF4 leukemias. Cancer Cell 14, 355–368 (2008).
12. Jansson, M. et al. Arginine methylation regulates the p53 response. Nature Cell Biol. 10, 1431–1439 (2008).
13. Hong, H. et al. Aberrant expression of CARM1, a transcriptional coactivator of androgen receptor, in the development of prostate carcinoma and androgen-independent status. Cancer 101, 83–89 (2004).
14. Schneider, R., Bannister, A. J. & Kouzarides, T. Unsafe SETs: histone lysine methyltransferases and cancer. Trends Biochem. Sci. 27, 396–402 (2002).
15. Simon, J. A. & Lange, C. A. Roles of the EZH2 histone methyltransferase in cancer epigenetics. Mutat. Res. 647, 21–29 (2008).
16. Dillon, S. C., Zhang, X., Trievel, R. C. & Cheng, X. The SET-domain protein superfamily: protein lysine methyltransferases. Genome Biol. 6, 227 (2005).
17. Ryu, H. et al. ESET/SETDB1 gene expression and histone H3 (K9) trimethylation in Huntington's disease. Proc. Natl Acad. Sci. USA 103, 19176–19181 (2006).
18. Cheng, D., Cote, J., Shaaban, S. & Bedford, M. T. The arginine methyltransferase CARM1 regulates the coupling of transcription and mRNA processing. Mol. Cell 25, 71–83 (2007).
19. Li, Y. et al. Role of the histone H3 lysine 4 methyltransferase, SET7/9, in the regulation of NF-κB-dependent inflammatory genes. Relevance to diabetes and inflammation. J. Biol. Chem. 283, 26771–26781 (2008).
20. Covic, M. et al. Arginine methyltransferase CARM1 is a promoter-specific regulator of NF-κB-dependent gene expression. EMBO J. 24, 85–96 (2005).
21. Hassa, P. O., Covic, M., Bedford, M. T. & Hottiger, M. O. Protein arginine methyltransferase 1 coactivates NF-κB-dependent gene expression synergistically with CARM1 and PARP1. J. Mol. Biol. 377, 668–678 (2008).
22. Huang, J. et al. Trimethylation of histone H3 lysine 4 by Set1 in the lytic infection of human herpes simplex virus 1. J. Virol. 80, 5740–5746 (2006).
Furthermore, despite S-adenosylseven, major families were defined in (2008). this manner : the for drug discovery
NATURE REVIEWS | Drug Discovery 726 | SEPTEMBER 2009 | VolUME 8 S10 VolUME 8 COLLECTION | SEPTEMBEREpigenetics 2009 | 731 www.nature.com/reviews/drugdisc NATURE REPRINT

REVIEWS
focus on the PMTs, and in particular on those aspects
Acetyltransferases

REVIEWS
PKMTs

that make PMTs attractive for drug 44. Chang, targets Y. et al. Structural basisdiscovery for G9a-like protein 64. Rosati, R. et al. NUP98 is fused to the NSD3 gene 23. Jeong, S. J. et al. Coactivator-associated arginine 52 family Deacetylasesin acute myeloid leukemia associated with lysine methyltransferase inhibition by BIX-01294. t(8;11) methyltransferase 1 enhances transcriptional activity members efforts. We summarize the data that contribute to Nature Struct. Mol. Biol. 16, 312317 (2009). ~18 family (p11.2;p15). Blood 99, 38573860 (2002). of the human T-cell lymphotropic virus type 1 long Me the validation as M. targets specific human 45. Allan, et al. Nfor -Benzyl-1-heteroaryl-3-(trifluorometh 65. Tonon, G. et al. High-resolution genomic profiles of terminal repeat through direct interaction with Tax. of PMTs Demethylases members yl)-1H-pyrazole-5-carboxamides as inhibitors of human lung cancer. Proc. Natl Acad. Sci. ~30 USA family 102, J. Virol. 80, 1003610044 (2006). diseases, as well as the structural and mechanistic data co-activator associated arginine methyltransferase 1 96259630 (2005). 24. Cheng, X., Collins, R. E. & Zhang, X. Structural and K members Ac Y. et al. hDOT1L links Me that suggest PMTs are(CARM1). a tractable is, druggable) Bioorg(that Med. Chem. Lett. 19, 12181223 66. Okada, histone methylation sequence motifs of protein (histone) methylation (2009). to leukemogenesis. Cell 121, 167178 (2005). enzymes. Annu. Rev. Biophys. Biomol. Struct. 34, PRMTs K target class. The first examples of potent, drug-like inhibitors 67. Bitoun, E., Oliver, P. L. mixed267294 (2005). R & Davies, K. E. The 10 family P Kinases of a human PMT. lineage leukemia fusion partner AF4 stimulates RNA 25. Goldstein, D. M., Gray, N. S. & Zarrinkar, P. P. members S 46. Purandare, A. V. et al. 
Pyrazole inhibitors of polymerase II transcriptional elongation and mediates High-throughput kinase profiling as a platform for drug PKMTs and PRMTs incoactivator human associated diseasearginine methyltransferaseLigases 1 coordinated discovery. Nature Rev. Drug Discov. 7, 391397 (2008). Ub chromatin remodeling. Hum. Mol. Genet. (CARM1). Bioorg Med. Chem. Lett. of 18, the 44384441 16, 92106 (2007). 26. Mook, R. A. The importance and In complexity of target surveying the histone-modifying enzymes K (2008). 68. Hamamoto, R. et al. SMYD3 encodes a histone class selectivity in drug discovery. The American the enzymes catalyse methylation 47. Huynh, T. that et al. Optimization of pyrazole inhibitors of methyltransferase involved in the proliferation of Association for Cancer Research human Education genome, Book cancer cells. Nature Cell Biol. 6, 731740 (2004). 223226 (The American Association for Cancer coactivator associated arginine methyltransferase 1 of lysine residues (protein lysine methyltransferases 69. Hamamoto, R. et al. Enhanced SMYD3 expression Research, Philadelphia, 2005). (CARM1). Bioorg Med. Chem. Lett. 19, 29242927 (PKMTs)) and arginine residues (protein arginine is essential for the growth of breast cancer cells. 27. Copeland, R. A. Evaluation of Enzyme Inhibitors in (2009). Cancer Sci. 97, 113118 (2006). Drug Discovery: A Guide for Medicinal Chemists and 48. Copeland, R. A., R. & Luo, L. in Textbook methyltransferases (PRMTs)) are ofGontarek, substantial interest 70. Bracken, A. P. et al. EZH2 is downstream of the Pharmacologists (Wiley, Hoboken, 2005). of Drug Design and Discovery 4th edn Ch. 12 (eds. from the perspective of drug discovery and medicinal pRB-E2F pathway, essential for proliferation and 28. Cheng, D. et al. Small molecule regulators of protein Krogsgaard-Larsen, P., Madsen, U. & Stromgaard, K.) amplified in cancer. EMBO J. 22, 53235335 arginine methyltransferases. J. Biol. Chem. 
279, The action378407 (Taylor and Francis, New York, 2009). chemistry. of these enzymes is crucial in (2003). 2389223899 (2004). 49. Troffer-Charlier, N., Cura, V., Hassenboehler, P., controlling gene regulation, and there is an increasing Figure 1 71. | A nucleosome the post-translational Varambally, S. etand al. The polycomb group protein 29. Allis, C. D. et al. New nomenclature for chromatinMoras, D. & Cavarelli, J. Functional insights from EZH2 modifications is involved in progression prostate cancer. modifying enzymes. Cell 131, 633636 (2007). structures of coactivator-associated arginine histone protein thatof can influence amount of biochemical and biological data to suggest Nature 419, 624629 (2002). 30. Zhang, X. & Bruice, T. C. Enzymatic mechanism and methyltransferase 1 domains. EMBO J. 26, Nature Reviews | Drug Discovery epigenetic regulation of gene transcription. that the enzymatic activities of several of these proteins 72. Subramanian, K. et al. Regulation of estrogen product specificity of SET-domain protein lysine 43914401 (2007). receptor alpha by theprotein SET7 lysine methyltransferase. methyltransferases. Proc. Natl Acad. Sci. USA 105, 50. in Couture, J.-F., Collazo, E., Brunzelle, J. S. & Trievel, R. C. Modifications of the histone tail are shown: have pathogenic roles cancer, inflammatory diseases, Mol. Cell 30, 336347 (2008). 57285732 (2008). Structural and functional analysis of SET8, a histone changes73. in acetylation (Ac) by acetyltransferases and neurodegenerative diseases and other conditions Nishikawa, N. et al. Gene amplification and This work provides a detailed theoretical basis H4 Lys-20 methyltransferase. Genes Dev.of 19, deacetylases, phosphorylation (P) by ubiquitylation 715 overexpression of PRDM14 inkinases, breast cancers. to explain the substrate specificity of the protein 14551465 (2005). . importance Cancer 67, 96499657 (2007). (Me) by 51. Ma, W. W. & Adjei, A. A. Novel agents on the horizon lysine methyltransferases. 
(Ub) by ligases andRes. changes in methylation For map example, with the exception of DoT1-like, histone 74. Majumder, S., Liu, Y., Ford, O. H., 3rd, Mohler, J. L. for cancer therapy. CA Cancer J. Clin. 59, 111137 31. Fedorov, O. et al. A systematic interaction of methyltransferases and demethylases. The enzyme families (2009). validated kinase inhibitors with Ser/Thr kinases. & Whang, Y. E. Involvement of arginine H3 methyltransferase (DoT1l; also known as KMT4), that are responsible for theCARM1 various A review of the current knowledge on how aberrant Proc. Natl Acad. Sci. USA 104, 2052320528 (2007). methyltransferase in post-translational androgen receptor all analysis human contain a ~130 amino-acid domain, 32. Karaman, M. W. et al. A quantitative of PKMTs kinase epigenetic mechanisms can contribute to the modifications function and prostate cancer cell viability. Prostate are shown. PKMT, protein lysine inhibitor selectivity. Nature Biotech. 26, 127132 of cancer and the progress in 66, 12921301 (2006). referred to as the SET development domain, which constitutes the methyltransferase; protein arginine methyltransferase. (2008). developing therapies that target these 75. Frietze,PRMT, S., Lupien, M., Silver, P. A. & Brown, M. 33. Min, J., Feng, Q., Li, Z., Zhang, Y.catalytic & Xu, R. M. domain of these CARM1 regulates estrogen-stimulated breast cancer mechanisms. enzymes1416. Enhancer of Structure of the catalytic domain of human DOT1L, growth through up-regulation of E2F1. Cancer Res. 52. Cortez, C. C. & Jones, P. A. Chromatin, cancer and zeste homologue 2 (EZH2; also known as KMT6) is a a non-SET domain nucleosomal histone 68, 301306 (2008). drug therapies. Mutat. Res. 647, 4451 (2008). methyltransferase. Cell 112, 711723 53. Kang, M. Y. et al. Association the SUV39H1 SET (2003). domain protein that forms the catalytic of subunit of histone 76. Zhao, Q. et al. PRMT5-mediated methylation of 34. Sawada, K. et al. 
Structure of the conserved core of histone H4R3 recruits DNMT3A, coupling histone and methyltransferase with the DNA methyltransferase 1 arginine methyltransferase 1 (CARM1; also known as the 45protein core ofat polycomb repressive complex 2 cancer. the yeast Dot1p, a nucleosomal histone H3 lysine 79 DNA methylation in gene silencing. Nature Struct. mRNA expression level in primary colorectal 18 methyltransferase. J. Biol. Chem. 279 , 4329643306 Mol. Biol. 16, 304311 (2009). Int. J. Cancer 121, 21922197 (2007). been implicated in the neurodegeneraPRMT4) have (PRC2). PRC2 is a PKMT that catalyses the methyla(2004). 77. Patnaik, D. et al. Substrate specificity and kinetic 54. Watanabe, H. et al. Deregulation of histone lysine Huntingtons disease spinal muscular tion of lysineto 27 of histone H3 (in the nomenclature of tive diseases 35. Copeland, R. A. Enzymes: A Practical Introduction mechanism of mammalian G9a and histone H3 methyltransferases contributes to oncogenic Structure, Mechanism and Data Analysis 2nd edn methyltransferase. J. Biol. Chem. 279, 5324853258 transformation of human bronchoepithelial cells. SET domain-containing lysine histone modification, this site is referred to as H3K27). atrophy, respectively. (Wiley, Hoboken, 2000). (2004). Cancer Cell Int. 8, 15 (2008). 19 , methyltransferase 7 Patnaik, (SETD7; also known as KMT7) Although EZH2 contains the active site, all of 36. Couture, J. F., Hauk, G., Thompson, M. J., 78. Chin, H. G., D., Esteve, P.-O., Jacobsen, S. E. 55. Kondo, Y. catalytic et al. Downregulation of histone H3 lysine 9 Blackburn, G. M. & Trievel, R. C. Catalytic roles for of the PRC2 & Pradhan, S. Catalytic properties and21) kinetic methyltransferase G9a induces centrosome 20) and PRMT1 (REF. have been CARM1 (REF. the proteins complex are required for fulldisruption carbonoxygen hydrogen bonding in SET domain mechanism of human recombinant lys-9 histone H3 and chromosome instability in cancer cells. 
PLoS One nuclear factor-B-related inflammaPKMT of EZH2 or another associated with lysine methyltransferases. J. Biochem. 281, activity. overexpression methyltransferase SUV39H1: participation of the 3, e2037 (2008). 1928019287 (2006). chromodomain enzymatic catalysis. Biochemistry 56. Tkachuk, D., Kohler, S. & Cleary, M. L. Involvement and SET in domain-containing protein 1A PRC2 subunit, suppressor of zeste 12 homologue tory diseases, 37. Collins, R. E. et al. In vitro and in vivo analyses 45, 32723284 (2006). of a homolog of Drosophila trithorax by 11q23 22 and CARM1 (REF. 23) have been associated (SUZ12), has with numerous human of a Phe/Tyr switch controlling product specificity of been associated 79. Greiner, D., Bonaldi, T., Eskeland, R., Roemer, E. & chromosomal translocations in acute leukemias. (SETD1A) histone lysine methyltransferases. J. Biol. Chem. 280 , Imhof, A. Identification ofHerpes a specific simplex inhibitor of the and Cell 71, 691700 (1992). involving virus cancer types, including prostate, breast, bladder, colon, with viral infections 55635570 (2005). histone methyltransferase SU(VAR)39. Nature 57. Gu, Y. et al. The t(4;11) chromosome translocation of human T lymphotrophic virus,(2005). respectively. PKMTs and liver, lung acute and leukemias gastric cancers, as well This study provides a structuralskin, basis for the endometrial, Chem. Biol. 1, 143145 human fuses the ALL-1 gene, related 15 trithorax, to the AF-4 gene. Cell 71, wide range of lysine methylation patterns that is and myelomas Kubicek, S. et al. Reversal as of H3K9me2 by targets for to Drosophila . In breast carcinomas, PRMTs 80. are therefore emerging compelling as lymphomas achieved by different SET domain PKMTs. a small-molecule inhibitor for the G9a histone 701708 (1992). 4,5 . increased levels of EZH2 have been shown to correlate drug discovery efforts 38. Trievel, R. C., Flynn, E. M., Houtz, R. L. & Hurley, J. H. methyltransferase. Mol. Cell 25, 473481 (2007). 58. 
Liedtke, M. & Cleary, M. L. Therapeutic targeting of Mechanism of multiple lysine methylation by the SET MLL. Blood 113, 60616068 (2009). with increased invasiveness and proliferation rate; it has domain enzyme Rubisco LSMT. Nature Struct. Biol. 59. Wang, G. G., Cai, L., Pasillas, M. P. & Kamps, M. P. Acknowledgements PMTs aare drug target class C. T. Walsh, H. R. Horvitz, been suggested that EZH2 could be a prognostic indica10, 545552 (2003). NUP98-NSD1 links H3K36 methylation to Hox-A gene as We grateful to K. Shiosaki, 39. Zhang, X. et al. Structural basis for the product activation and leukaemogenesis. Nature Cell Biol. 9, Y. Zhang, and R. Gould for their insights, constant support and 10 . In cell culture, From a chemical biology and medicinal chemistry tor of patient outcome for breast cancer specificity of histone lysine methyltransferases. 804812 (2007). encouragement. We also thank K. Boater, E. Olhava, L. Jin Mol. Cell 12, 177185 (2003). overexpression of EZH2 60. Marango, J. et epithelial al. The MMSET protein is a histone and T. Luly forPKMTs expert helpand in preparation of this manuscript. in breast cells causes perspective, the PRMTs are of interest 40. Frederiks, F. et al. Nonprocessive methylation by Dot1 methyltransferase with characteristics of a anchorage-independent cell growth and increased because they have a common mechanism of catalysis leads to functional redundancy of histone H3K79 transcriptional corepressor. Blood 111, 31453154 Competing interests statement methylation states. Nature Struct. Mol. Biol. 15, Additionally, (2008). The below), authors declare competing financial interests: see web As invasiveness. when EZH2-overexpressing (discussed involving a small, organic cofactor. 550557 61. Kim, J. Y. et al. Multiple-myeloma-related WHSC1/ version for details. Target class (2008). were the mammary fatispads of nude other druggable classes of enzymes, such as the protein 41. Chiang, P. K. 
Biological effects of cells inhibitors of injected intoMMSET isoform RE-IIBP a histone A group of proteins that are S-adenosylhomocysteine hydrolase. Pharmacol. methyltransferase with transcriptional repression mice, the animals developed tumours, demonstrating the kinases, share this mechanistic feature, it is likely that the DATABASES related by a common type Ther. 77, 115134 (1998). activity. Mol. Cell Biol. 28, 20232034 (2008). UniProtKB: http://www.uniprot.org tumorigenicity of EZH2 overexpression. Importantly, the PMTs will be similarly amenable to inhibition by small, of drug-binding pocket, 42. Bender, C. M., Zingg, J.-M. & Jones, P. A. DNA 62. Lauring, J. et al. The multiple myeloma associated CARM1 | DOT1L | EHMT2 | EZH2 | PRMT1 | SETD7 | SETD8 | methylationdiverse as a target Pharm. Res. MMSET gene contributes are to cellular adhesion, organic molecules. but sufficiently thatfor drug design. phenotypic effects of EZH2 overexpression correlated SETD1A | SETDB1 | SUZ12 15, 175187 (1998). clonogenic growth, and tumorigenicity. Blood 111, with increased H3K27 methylation and are dependent on The PKMTs and PRMTs catalyse methyl transfer from 43. Fiskus, W. et al. Panobinostat treatment depletes 856864 (2008). FURTHER INFORMATION proteins can be achieved, EZH2 and DNMT1 levels and enhances decitabine 63. Angrand, P. O. et al. NSD3, a new SET domain- their universal the presence of an intact SET domain, both of which imply methyl donor, S-adenosyl-l-methionine Authors homepage: http://www.epizyme.com using medicinal chemical of JunB and loss of survival mediated de-repression containing gene, maps to 8p12 and is amplified in 10,15 All liNks Are AcTive iN THe oNliNe . SAM; also known as AdoMet) (FIG. 2),PDf to a nitrogen atom a role for EZH2 enzymatic activity in pathogenesis ( of human leukemia cells. Cancer Biol. Ther. 8, human breast cancer cell lines. Genomics 74, 7988 elaboration ofacute the basic 939950 (2009). (2001). 
and PRMTs are strongly Several other human PKMTs of lysine or arginine side chains3. Protein substrate speselective inhibition of specific chemotype structures.

SAM
S-adenosyl-l-methionine, the universal methyl group donor of all enzymatic methyltransferase reactions.

associated with human cancers, as summarized in TABLE 2. Similarly, there is compelling evidence that other PKMTs and PRMTs have a pathogenic role in serious human diseases other than cancer 79. For example, SET domain, bifurcated 1 (SETDB1)17 and coactivator-associated

cificity can be stringent in these enzymes; some PKMTs seem to selectively methylate a particular lysine residue on a specific histone, and the extent of methylation on a single lysine residue (that is, mono-, di- or tri-methylation) that is catalysed by a particular enzyme can also
VolUME 8 | SEPTEMBER 2009 | 725 www.nature.com/reviews/drugdisc S11

NATURE REVIEWS | Drug Discovery 732 | SEPTEMBER 2009 | VolUME 8 NATURE REPRINT COLLECTION Epigenetics

ARTICLE
First published in Nature 476, 298–303 (2011); doi: 10.1038/nature10351


Frequent mutation of histone-modifying genes in non-Hodgkin lymphoma


Ryan D. Morin1*, Maria Mendez-Lago1*, Andrew J. Mungall1, Rodrigo Goya1, Karen L. Mungall1, Richard D. Corbett1, Nathalie A. Johnson2, Tesa M. Severson1, Readman Chiu1, Matthew Field1, Shaun Jackman1, Martin Krzywinski1, David W. Scott2, Diane L. Trinh1, Jessica Tamura-Wells1, Sa Li1, Marlo R. Firme1, Sanja Rogic2, Malachi Griffith1, Susanna Chan1, Oleksandr Yakovenko1, Irmtraud M. Meyer3, Eric Y. Zhao1, Duane Smailus1, Michelle Moksa1, Suganthi Chittaranjan1, Lisa Rimsza4, Angela Brooks-Wilson1,5, John J. Spinelli6,7, Susana Ben-Neriah2, Barbara Meissner2, Bruce Woolcock2, Merrill Boyle2, Helen McDonald1, Angela Tam1, Yongjun Zhao1, Allen Delaney1, Thomas Zeng1, Kane Tse1, Yaron Butterfield1, Inan Birol1, Rob Holt1, Jacqueline Schein1, Douglas E. Horsman2, Richard Moore1, Steven J. M. Jones1, Joseph M. Connors2, Martin Hirst1, Randy D. Gascoyne2,8 & Marco A. Marra1,9

Follicular lymphoma (FL) and diffuse large B-cell lymphoma (DLBCL) are the two most common non-Hodgkin lymphomas (NHLs). Here we sequenced tumour and matched normal DNA from 13 DLBCL cases and one FL case to identify genes with mutations in B-cell NHL. We analysed RNA-seq data from these and another 113 NHLs to identify genes with candidate mutations, and then re-sequenced tumour and matched normal DNA from these cases to confirm 109 genes with multiple somatic mutations. Genes with roles in histone modification were frequent targets of somatic mutation. For example, 32% of DLBCL and 89% of FL cases had somatic mutations in MLL2, which encodes a histone methyltransferase, and 11.4% and 13.4% of DLBCL and FL cases, respectively, had mutations in MEF2B, a calcium-regulated gene that cooperates with CREBBP and EP300 in acetylating histones. Our analysis suggests a previously unappreciated disruption of chromatin biology in lymphomagenesis.

Non-Hodgkin lymphomas (NHLs) are cancers of B, T or natural killer lymphocytes. The two most common types of NHL, follicular lymphoma (FL) and diffuse large B-cell lymphoma (DLBCL), together comprise 60% of new B-cell NHL diagnoses each year in North America1. FL is an indolent and typically incurable disease characterized by clinical and genetic heterogeneity. DLBCL is aggressive and likewise heterogeneous, comprising at least two distinct subtypes that respond differently to standard treatments. Both FL and the germinal centre B-cell (GCB) cell of origin (COO) subtype of DLBCL derive from germinal centre B cells, whereas the activated B-cell (ABC) variety, which has a more aggressive clinical course, is thought to originate from B cells that have exited, or are poised to exit, the germinal centre2. Current knowledge of the specific genetic events leading to DLBCL and FL is limited to the presence of a few recurrent genetic abnormalities2.
For example, 85–90% of FL and 30–40% of GCB DLBCL cases3,4 harbour t(14;18)(q32;q21), which results in deregulated expression of the BCL2 oncoprotein. Other genetic abnormalities unique to GCB DLBCL include amplification of the c-REL gene and of the miR-17-92 microRNA cluster5. In contrast to GCB cases, 24% of ABC DLBCLs harbour structural alterations or inactivating mutations affecting PRDM1, which is involved in differentiation of GCB cells into antibody-secreting plasma cells6. ABC-specific mutations also affect genes regulating NF-κB signalling7,8,9, with TNFAIP3 (also known as A20) and MYD88 (ref. 10) the most abundantly mutated in 24% and 39% of cases, respectively.

To enhance our understanding of the genetic architecture of B-cell NHL, we undertook a study to (1) identify somatic mutations and (2) determine the prevalence, expression and focal recurrence of mutations in FL and DLBCL. Using strategies and techniques applied to cancer genome and transcriptome characterization by ourselves and others11,12,13, we sequenced tumour DNA and/or RNA from 117 tumour samples and 10 cell lines (Supplementary Tables 1 and 2) and identified 651 genes (Supplementary Figure 1) with evidence of somatic mutation in B-cell NHL. After validation, we showed that 109 genes were somatically mutated in two or more NHL cases. We further characterized the frequency and nature of mutations within MLL2 and MEF2B, which were among the most frequently mutated genes with no previously known role in lymphoma.
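The recurrence criterion stated above (genes somatically mutated in two or more NHL cases) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `recurrently_mutated`, the `(case, gene)` input format and the toy calls are assumptions made for illustration, and the sketch counts distinct cases per gene rather than individual mutations.

```python
from collections import defaultdict


def recurrently_mutated(confirmed_calls, min_cases=2):
    """Return genes with confirmed somatic mutations in >= min_cases distinct cases.

    confirmed_calls: iterable of (case_id, gene_symbol) pairs (hypothetical format).
    """
    cases_by_gene = defaultdict(set)
    for case_id, gene in confirmed_calls:
        # A set ensures several mutations in one case count only once.
        cases_by_gene[gene].add(case_id)
    return sorted(g for g, cases in cases_by_gene.items() if len(cases) >= min_cases)


# Hypothetical toy input, not data from the study.
toy_calls = [
    ("case_A", "MLL2"), ("case_B", "MLL2"),
    ("case_A", "MEF2B"), ("case_C", "MEF2B"),
    ("case_B", "TP53"),
    ("case_C", "EZH2"), ("case_C", "EZH2"),  # two calls in the same case
]
print(recurrently_mutated(toy_calls))
```

Counting distinct cases rather than raw calls keeps a single heavily mutated tumour from making a gene look recurrent; whether the study applied exactly this rule is not stated in this section.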

Identification of recurrently mutated genes


We sequenced the genomes or exomes of 14 NHL cases, all with matched constitutional DNA sequenced to comparable depths (Supplementary Tables 1 and 2). After screening for single nucleotide variants followed by subtraction of known polymorphisms and visual inspection of the sequence read alignments, we identified 717 nonsynonymous variants (coding single nucleotide variants; cSNVs) affecting 651 genes (Supplementary Figure 1 and Supplementary Methods). We identified between 20 and 135 cSNVs in each of these genomes. Only 25 of the 651 genes with cSNVs were represented in the cancer gene census (December 2010 release)14. We performed RNA sequencing (RNA-seq) on these 14 NHL cases and an expanded set of 113 samples comprising 83 DLBCL, 12 FL and 8 B-cell NHL cases with other histologies and 10 DLBCL-derived cell lines (Supplementary Table 2). We analysed these data to identify

1Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, British Columbia V5Z 1L3, Canada. 2Centre for Lymphoid Cancer, BC Cancer Agency, Vancouver, British Columbia V5Z 1L3, Canada. 3Centre for High-throughput Biology, Department of Computer Science, Vancouver, British Columbia V6T 1Z4, Canada. 4Department of Pathology, University of Arizona, Tucson, Arizona 85724, USA. 5Department of Biomedical Physiology and Kinesiology, Simon Fraser University, Burnaby, British Columbia V5A 1S6, Canada. 6Cancer Control Research, BC Cancer Agency, Vancouver, British Columbia V5Z 1L3, Canada. 7School of Population and Public Health, University of British Columbia, Vancouver, British Columbia V6T 1Z3, Canada. 8Department of Pathology, University of British Columbia, Vancouver, British Columbia V6T 2B5, Canada. 9Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia V6H 3N1, Canada. *These authors contributed equally to this work.

298 | NATURE | VOL 476 | 18 AUGUST 2011



novel fusion transcripts (Supplementary Table 3) and cSNVs (Fig. 1). We identified 240 genes with at least one cSNV in a genome/exome or an RNA-seq mutation hot spot (see later), and with cSNVs in at least three cases in total (Supplementary Table 4). We selected cSNVs from each of these 240 genes for re-sequencing to confirm their somatic status. We did not re-sequence genes with previously documented mutations in lymphoma (for example, CD79B, BCL2). We confirmed the somatic status of 543 cSNVs in 317 genes, with 109 genes having at least two confirmed somatic mutations (Supplementary Tables 4 and 5). Of the successfully re-sequenced cSNVs predicted from the genomes, 171 (94.5%) were confirmed somatic, 7 were false calls and 3 were present in the germ line. These 109 recurrently mutated genes were significantly enriched for genes implicated in lymphocyte activation (P = 8.3 × 10⁻⁴; for example, STAT6, BCL10), lymphocyte differentiation (P = 3.5 × 10⁻³; for example, CARD11), and regulation of apoptosis (P = 1.9 × 10⁻³; for example, BTG1, BTG2). Also significantly enriched were genes linked to transcriptional regulation (P = 5.4 × 10⁻⁴; for example, TP53) and genes involved in methylation (P = 2.2 × 10⁻⁴) and acetylation (P = 1.2 × 10⁻²), including histone methyltransferase (HMT) and acetyltransferase (HAT) enzymes known previously to be mutated in lymphoma (for example, EZH2 (ref. 13) and CREBBP (ref. 15); Supplementary Methods).

Mutation hot spots can result from mutations at sites under strong selective pressure and we have previously identified such sites using RNA-seq data13. We searched our RNA-seq data for genes with mutation hot spots, and identified 10 genes that were not mutated in the 14 genomes (PIM1, FOXO1, CCND3, TP53, IRF4, BTG2, CD79B, BCL7A, IKZF3 and B2M), of which five (FOXO1, CCND3, BTG2,

Figure 2 | Overview of mutations and potential cooperative interactions in NHL. This heat map displays possible trends towards co-occurrence (red) and mutual exclusion (blue) of somatic mutations and structural rearrangements. Colours were assigned by taking the minimum value of a left- and right-tailed Fisher's exact test. To capture trends a P-value threshold of 0.3 was used, with the darkest shade of the colour indicating those meeting statistical significance (P ≤ 0.05). The relative frequency of mutations in ABC (blue), GCB (red), unclassifiable (black) DLBCLs and FL (yellow) cases is shown on the left. Genes were arranged with those having significant (P < 0.05, Fisher's exact test) enrichment for mutations in ABC cases (blue triangle) towards the top (and left) and those with significant enrichment for mutations in GCB cases (red triangle) towards the bottom (and right). The total number of cases in which each gene contained either cSNVs or confirmed somatic mutations is shown at the top. The cluster of blue squares (upper-right) results from the mutual exclusion of the ABC-enriched mutations (for example, MYD88, CD79B) from the GCB-enriched mutations (for example, EZH2, GNA13). Presence of structural rearrangements involving the two oncogenes BCL6 and BCL2 (indicated as BCL6s and BCL2s) was determined with FISH techniques using break-apart probes (Supplementary Methods).
[Figure 2 heat map rows and columns: MYD88, CD79B, BCL6s, TNFAIP3, CARD11, FAS, TMEM30A, CD58, CD70, STAT3, ETS1, HIST1H1C, CCND3, KLHL6, BTG1, BTG2, IRF8, B2M, EP300, CREBBP, MLL2, FOXO1, TNFRSF14, MEF2B, TP53, BCL2, SGK1, GNA13, EZH2, BCL2s. Colour key: P < 0.05, 0.1–0.05, 0.3–0.1.]
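The colouring rule in the Figure 2 caption (the minimum of a left-tailed and a right-tailed Fisher's exact test, with a trend threshold of 0.3 and significance at 0.05) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the function names `fisher_tails` and `classify_pair`, the use of "≤" at both thresholds, the tie-breaking rule and the example counts are all hypothetical.

```python
from math import comb


def fisher_tails(both, only_a, only_b, neither):
    """One-sided Fisher exact P values for a 2x2 co-mutation table.

    both    = cases mutated in gene A and gene B
    only_a  = cases mutated in gene A only
    only_b  = cases mutated in gene B only
    neither = cases mutated in neither gene
    Returns (left, right): `left` is the probability of this few or fewer
    doubly mutated cases (mutual exclusion), `right` of this many or more
    (co-occurrence), with the table margins held fixed.
    """
    n_a = both + only_a              # cases with gene A mutated
    n_b = both + only_b              # cases with gene B mutated
    n = both + only_a + only_b + neither
    lowest = max(0, n_a + n_b - n)   # smallest feasible overlap
    highest = min(n_a, n_b)          # largest feasible overlap
    denom = comb(n, n_b)

    def prob(k):
        # Hypergeometric probability of exactly k doubly mutated cases.
        return comb(n_a, k) * comb(n - n_a, n_b - k) / denom

    left = sum(prob(k) for k in range(lowest, both + 1))
    right = sum(prob(k) for k in range(both, highest + 1))
    return left, right


def classify_pair(both, only_a, only_b, neither, trend=0.3, significant=0.05):
    """Label a gene pair following the colouring rule described in the caption."""
    left, right = fisher_tails(both, only_a, only_b, neither)
    p = min(left, right)
    # Ties are assigned to mutual exclusion; this choice is arbitrary.
    direction = "mutual exclusion" if left <= right else "co-occurrence"
    if p <= significant:
        shade = "significant"
    elif p <= trend:
        shade = "trend"
    else:
        shade, direction = "none", "none"
    return direction, shade, p


# Hypothetical counts, not taken from the paper.
print(classify_pair(0, 10, 10, 20))  # two genes never mutated together
print(classify_pair(8, 2, 2, 28))    # two genes usually mutated together
```

Taking the smaller of the two one-sided P values lets a single statistic capture both directions of association, which is why one heat map can show co-occurrence and mutual exclusion side by side.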

Figure 1 | Genome-wide visualization of somatic mutation targets in NHL. Overview of structural rearrangements and copy number variations (CNVs) in the 11 DLBCL genomes and cSNVs in the 109 recurrently mutated genes identified in our analysis. Inner arcs represent somatic fusion transcripts identified in at least one of the 11 genomes. The CNVs and LOH detected in each of the 11 DLBCL tumour/normal pairs are displayed on the concentric sets of rings. The inner 11 rings show regions of enhanced homozygosity plotted with blue (interpreted as LOH). The outer 11 rings show somatic CNVs. Purple circles indicate the position of genes with at least two confirmed somatic mutations with circle diameter proportional to the number of cases with cSNVs detected in that gene. Circles representing the genes with significant evidence for positive selection are labelled. Coincidence between recurrently mutated genes and regions of gain/loss are colour-coded in the labels (green, loss; red, gain). For example B2M, which encodes beta-2-microglobulin, is recurrently mutated and is deleted in two cases.

chromosome (BAC) clone sequencing in eight FL cases to show that in all eight cases the mutations were in trans, affecting both MLL2 alleles. This observation is consistent with the notion that there is a complete, or near-complete, loss of MLL2 in the tumour cells of such patients.

With the exception of two primary FL cases and two DLBCL cell lines (Pfeiffer and SU-DHL-9), the majority of MLL2 mutations seemed to be heterozygous. Analysis of Affymetrix 500k SNP array data from two FL cases with apparent homozygous mutations revealed that both tumours showed copy number neutral loss of heterozygosity (LOH) for the region of chromosome 12 containing MLL2 (Supplementary Methods). Thus, in addition to bi-allelic mutation, LOH is a second, albeit less common mechanism by which MLL2 function is lost.

MLL2 was the most frequently mutated gene in FL, and among the most frequently mutated genes in DLBCL (Fig. 2). We confirmed MLL2 mutations in 31 of 35 FL patients (89%), in 12 of 37 DLBCL patients (32%), in 10 of 17 DLBCL cell lines (59%) and in none of the eight normal centroblast samples we sequenced. Our analysis predicted that the majority of the somatic mutations observed in MLL2 were inactivating (91% disrupted the reading frame or were truncating point mutations), indicating to us that MLL2 is a tumour suppressor of significance in NHL.

Recurrent point mutations in MEF2B

Our selective pressure analysis also revealed genes with stronger pressure for acquisition of amino acid substitutions than for nonsense

IKZF3 and B2M) were not previously known targets of point mutation in NHL (Supplementary Table 6 and Supplementary Methods). FOXO1, BCL7A and B2M had hot spots affecting their start codons. The effect of a FOXO1 start codon mutation, which was observed in three cases, was further studied using a cell line in which the initiating ATG was mutated to TTG. Western blots probed with a FOXO1 antibody revealed a band with a reduced molecular weight, indicative of a FOXO1 amino-terminal truncation (Supplementary Figure 2), consistent with use of the next in-frame ATG for translation initiation. A second hot spot in FOXO1 at T24 was mutated in two cases. T24 is reportedly phosphorylated by AKT subsequent to B-cell receptor (BCR) stimulation inducing FOXO1 nuclear export (ref. 16).

We analysed the RNA-seq data to determine whether any of the somatic mutations in the 109 recurrently mutated genes showed evidence for allelic imbalance with expression favouring one allele. Out of 380 expressed heterozygous mutant alleles, we observed preferential expression of the mutation for 16.8% (64/380) and preferential expression of the wild type for 27.8% (106/380; Supplementary Table 7). Seven genes showed evidence for significant preferential expression of the mutant allele in at least two cases: BCL2, CARD11, CD79B, EZH2, IRF4, MEF2B and TP53 (Supplementary Methods). In 27 out of 43 cases with BCL2 cSNVs, expression favoured the mutant allele, consistent with the previously described hypothesis that the translocated (and hence, transcriptionally deregulated) allele of BCL2 is targeted by somatic hypermutation (ref. 17). Examples of mutations at known oncogenic hot spot sites such as F123I in CARD11 (ref. 18) showed allelic imbalance favouring the mutant allele in some cases. Similarly, we noted expression favouring two novel hot spot mutations in MEF2B (Y69 and D83) and two sites in EZH2 not previously reported as mutated in lymphoma (A682G and A692V).

We sought to distinguish new cancer-related mutations from passenger mutations using the approach proposed previously (ref. 19). We reasoned that this would reveal genes with strong selection signatures, and mutations in such genes would be good candidate cancer drivers. We identified 26 genes with significant evidence for positive selection (false discovery rate = 0.03, Supplementary Methods), with either selective pressure for acquiring non-synonymous point mutations or truncating/nonsense mutations (Supplementary Methods; Table 1 and Supplementary Table 8). Included were known lymphoma oncogenes (BCL2, CD79B (ref. 9), CARD11 (ref. 18), MYD88 (ref. 10) and EZH2 (ref. 13)), all of which showed signatures indicative of selection for non-synonymous variants.

Evidence for selection of inactivating changes

We expected tumour suppressor genes to show strong selection for the acquisition of nonsense mutations. In our analysis, the eight most significant genes included seven with strong selective pressure for nonsense mutations, including the known tumour suppressor genes TP53 and TNFRSF14 (ref. 20; Table 1). CREBBP, recently reported as commonly inactivated in DLBCL (ref. 15), also showed some evidence for acquisition of nonsense mutations and cSNVs (Supplementary Figure 3 and Supplementary Table 9). We also observed enrichment for nonsense mutations in BCL10, a positive regulator of NF-κB, in which oncogenic truncated products have been described in lymphomas (ref. 21). The remaining strongly significant genes (BTG1, GNA13, SGK1 and MLL2) had no reported role in lymphoma. GNA13 was affected by mutations in 22 cases including multiple nonsense mutations. GNA13 encodes the alpha subunit of a heterotrimeric G-protein coupled receptor responsible for modulating RhoA activity (ref. 22). Some of the mutated residues negatively affect its function (refs 23, 24), including a T203A mutation, which also showed allelic imbalance favouring the mutant allele (Supplementary Table 7). GNA13 protein was reduced or absent on western blots in cell lines harbouring either a nonsense mutation, a stop codon deletion, a frame shifting deletion, or changes affecting splice sites (Supplementary Methods and Supplementary Figure 4).
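The false discovery rate filter used above to select the 26 genes can be illustrated with a short sketch. It applies the standard Benjamini-Hochberg step-up adjustment to the 15 smallest raw P values listed in Table 1 (MLL2 through IRF8). The total number of tests, `M_TESTS = 100`, is an assumption: it is not stated in this section and was chosen only because it reproduces the q values reported in Table 1. The authors' actual procedure is described in their Supplementary Methods, and the function name here is illustrative.

```python
def bh_qvalues(sorted_pvalues, n_tests):
    """Benjamini-Hochberg q values for P values sorted in ascending order.

    n_tests is the total number of hypotheses tested. It may exceed
    len(sorted_pvalues) when only the smallest P values are supplied, in
    which case the result is valid as long as the omitted P values would
    not lower any of the adjusted values.
    """
    raw = [p * n_tests / rank for rank, p in enumerate(sorted_pvalues, start=1)]
    # Enforce monotonicity: each q is the minimum raw value at or after its rank.
    q = raw[:]
    for i in range(len(q) - 2, -1, -1):
        q[i] = min(q[i], q[i + 1])
    return [min(value, 1.0) for value in q]


# Raw P values of the 15 top-ranked genes in Table 1.
P_VALUES = [6.85e-8] * 8 + [9.35e-8] * 3 + [1.52e-7] * 2 + [2.05e-7, 4.55e-7]
M_TESTS = 100  # ASSUMPTION: number of genes tested, not given in the text

Q_VALUES = bh_qvalues(P_VALUES, M_TESTS)
for p, q in zip(P_VALUES, Q_VALUES):
    print(f"P = {p:.2e}  q = {q:.2e}")
```

Under this assumption the first eleven genes share a single adjusted value because the step-up procedure takes the minimum over all later ranks, which is why Table 1 lists the same q for different raw P values.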
18 AUGUST 2011 | VOL 476 | NATURE | 299



Table 1 | Overview of cSNVs and confirmed somatic mutations in most frequently mutated genes

Gene         Cases           Total           Somatic cSNVs      P (raw)      q            NS SP   T SP   Skew
             NS   S    T     NS   S    T     (RNA-seq cohort)*                                          (M, WT, both)†
MLL2‡        16   8    17    17   8    18    10                 6.85×10⁻⁸    8.50×10⁻⁷    0.834   14.4   WT
TNFRSF14ᴳ‡   7    1    7     8    1    7     11                 6.85×10⁻⁸    8.50×10⁻⁷    7.52    118    Both
SGK1ᴳ‡       18   6    6     37   10   6     9                  6.85×10⁻⁸    8.50×10⁻⁷    19.5    61.7   -
BCL10‡       2    0    4     3    0    4     4                  6.85×10⁻⁸    8.50×10⁻⁷    3.62    112    WT
GNA13ᴳ‡      21   1    2     33   1    2     5                  6.85×10⁻⁸    8.50×10⁻⁷    24.1    25.7   Both
TP53ᴳ‡       20   2    1     23   3    1     22                 6.85×10⁻⁸    8.50×10⁻⁷    15.6    14.1   Both
EZH2ᴳ‡       33   0    0     33   0    0     33                 6.85×10⁻⁸    8.50×10⁻⁷    11.4    0.00   Both
BTG2‡        12   6    1     14   6    1     2                  6.85×10⁻⁸    8.50×10⁻⁷    23.9    35.1   -
BCL2ᴳ‡       42   45   0     96   105  0     43                 9.35×10⁻⁸    8.50×10⁻⁷    3.78    0.00   M
BCL6‡§       11   2    0     12   2    0     2                  9.35×10⁻⁸    8.50×10⁻⁷    0.175   0.00   M
CIITA‡§      5    3    0     6    3    0     2                  9.35×10⁻⁸    8.50×10⁻⁷    0.086   0.00   -
FAS‡         2    0    4     3    0    4     2                  1.52×10⁻⁷    1.17×10⁻⁶    2.54    66.5   WT
BTG1‡        11   6    2     11   7    2     10                 1.52×10⁻⁷    1.17×10⁻⁶    17.5    52.5   Both
MEF2Bᴳ‡      20   2    0     20   2    0     10                 2.05×10⁻⁷    1.47×10⁻⁶    14.2    0.00   M
IRF8‡        11   5    3     14   5    3     3                  4.55×10⁻⁷    3.03×10⁻⁶    8.82    28.2   WT
TMEM30A‡     1    0    4     1    0    4     4                  6.06×10⁻⁷    3.79×10⁻⁶    0.785   65.0   WT
CD58‡        2    0    3     2    0    3     2                  2.42×10⁻⁶    1.43×10⁻⁵    2.29    69.2   -
KLHL6‡       10   2    2     12   2    2     4                  1.00×10⁻⁵    5.26×10⁻⁵    5.42    16.4   -
MYD88ᴬ‡      13   2    0     14   2    0     9                  1.00×10⁻⁵    5.26×10⁻⁵    12.4    0.00   WT
CD70‡        5    0    1     5    0    2     3                  1.70×10⁻⁵    8.48×10⁻⁵    7.08    44.0   -
CD79Bᴬ‡      7    2    1     9    2    1     5                  2.00×10⁻⁵    9.52×10⁻⁵    10.9    18.3   M
CCND3‡       7    1    2     7    1    2     6                  2.80×10⁻⁵    1.27×10⁻⁴    6.55    36.3   WT
CREBBP‡      20   7    4     24   7    4     9                  1.00×10⁻⁴    4.35×10⁻⁴    2.72    6.04   Both
HIST1H1C‡    9    0    0     10   0    0     6                  1.80×10⁻⁴    7.50×10⁻⁴    11.9    0.00   Both
B2M‡         7    0    0     7    0    0     4                  3.90×10⁻⁴    1.56×10⁻³    16.6    0.00   WT
ETS1‡        10   1    0     10   1    0     4                  4.10×10⁻⁴    1.58×10⁻³    5.76    0.00   WT
CARD11‡      14   3    0     14   3    0     3                  1.90×10⁻³    7.04×10⁻³    3.37    0.00   Both
FAT2‡§       2    1    0     2    1    0     2                  6.30×10⁻³    2.25×10⁻²    0.128   0.00   -
IRF4‡§       9    4    0     26   5    0     5                  7.00×10⁻³    2.41×10⁻²    0.569   0.00   Both
FOXO1‡       8    4    0     10   4    0     4                  7.60×10⁻³    2.53×10⁻²    4.02    0.00   -
STAT3        9    0    0     9    0    0     4                  2.19×10⁻²    6.08×10⁻²    -       -      Both
RAPGEF1      8    3    0     10   3    0     3                  2.98×10⁻²    7.45×10⁻²    -       -      WT
ABCA7        12   3    0     15   3    0     2                  7.76×10⁻²    1.67×10⁻¹    -       -      WT
RNF213       10   8    0     10   8    0     2                  7.87×10⁻²    1.67×10⁻¹    -       -      -
MUC16        17   12   0     39   25   0     2                  8.32×10⁻²    1.73×10⁻¹    -       -      -
HDAC7        8    4    0     8    4    0     2                  8.94×10⁻²    1.82×10⁻¹    -       -      WT
PRKDC        7    3    0     7    4    0     2                  1.06×10⁻¹    2.05×10⁻¹    -       -      -
SAMD9        9    2    0     9    2    0     2                  1.79×10⁻¹    3.01×10⁻¹    -       -      -
TAF1         10   0    0     10   0    0     2                  3.03×10⁻¹    4.74×10⁻¹    -       -      -
PIM1         20   19   0     33   34   0     11                 3.40×10⁻¹    5.23×10⁻¹    -       -      WT
COL4A2       8    2    0     8    2    0     2                  7.64×10⁻¹    8.99×10⁻¹    -       -      -
EP300        8    7    1     8    7    1     3                  9.54×10⁻¹    1.00         -       -      WT

Individual cases with non-synonymous (NS), synonymous (S) and truncating (T) mutations and the total number of mutations of each class are shown separately because some genes contained multiple mutations in the same case. The P values indicated in bold are the upper limit on the P value for that gene determined with the approach described in ref. 19 (see Supplementary Methods), q is the Benjamini-corrected q value, and NS SP and T SP refer to selective pressure estimates from this model for the acquisition of non-synonymous or truncating mutations, respectively. Genes with a superscript of either A or G were found to have mutations significantly enriched in ABC or GCB cases, respectively (P < 0.05, Fisher's exact test).
* Additional somatic mutations identified in larger cohorts and insertion/deletion mutations are not included in this total.
† 'Both' indicates that we observed separate cases in which skewed expression was seen but where this skew was not consistent for the mutant or wild-type allele.
‡ Genes significant at a false discovery rate of 0.03. SNVs in BCL2 and previously confirmed hot spot mutations in EZH2 and CD79B are probably somatic in these samples based on published observations of others.
§ Selective pressure estimates are both < 1, indicating purifying selection rather than positive selection acting on this gene.

SGK1 encodes a phosphatidylinositol-3-OH kinase (PI(3)K)-regulated kinase with functions including regulation of FOXO transcription factors (ref. 25), regulation of NF-κB by phosphorylating IκB kinase (ref. 26), and negative regulation of NOTCH signalling (ref. 27). SGK1 also resides within a region of chromosome 6 commonly deleted in DLBCL (Fig. 1; ref. 5). The mechanism by which SGK1 and GNA13 inactivation may contribute to lymphoma is unclear, but the strong degree of apparent selection towards their inactivation and their overall high mutation frequency (each mutated in 18 of 106 DLBCL cases) suggests that their loss contributes to B-cell NHL. Certain genes are known to be mutated more commonly in GCB DLBCLs (for example, TP53 (ref. 28) and EZH2 (ref. 13)). Here, both SGK1 and GNA13 mutations were found only in GCB cases (P = 1.93 × 10⁻³ and 2.28 × 10⁻⁴, Fisher's exact test; n = 15 and 18, respectively) (Fig. 2). Two additional genes (MEF2B and TNFRSF14) with no previously described role in DLBCL showed a similar restriction to GCB cases (Fig. 2).

Inactivating MLL2 mutations

MLL2 showed the most significant evidence for selection and the largest number of nonsense SNVs. Our RNA-seq analysis indicated that 26.0% (33/127) of cases carried at least one MLL2 cSNV. To address the possibility that variable RNA-seq coverage of MLL2 failed to capture some mutations, we PCR-amplified the entire MLL2 locus (~36 kilobases) in 89 cases (35 primary FLs, 17 DLBCL cell lines, and 37 DLBCLs). Of these cases 58 were among the RNA-seq cohort. Illumina amplicon re-sequencing (Supplementary Methods) revealed 78 mutations, confirming the RNA-seq mutations in the overlapping cases and identifying 33 additional mutations. We confirmed the somatic status of 46 variants using Sanger sequencing (Supplementary Table 10), and showed that 20 of the 33 additional mutations were insertions or deletions (indels). Three SNVs at splice sites were also detected, as were 10 new cSNVs that had not been detected by RNA-seq.

The somatic mutations were distributed across MLL2 (Fig. 3a). Of these, 37% (n = 29/78) were nonsense mutations, 46% (n = 36/78) were indels that altered the reading frame, 8% (n = 6/78) were point mutations at splice sites and 9% (n = 7/78) were non-synonymous amino acid substitutions (Table 2). Four of the somatic splice site mutations had effects on MLL2 transcript length and structure. For example, two heterozygous splice site mutations resulted in the use of a novel splice donor site and an intron retention event.

Approximately half of the NHL cases we sequenced had two MLL2 mutations (Supplementary Table 10).
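The class percentages quoted for the 78 MLL2 mutations follow directly from the per-class counts. A minimal sketch of that arithmetic, using only counts given in the text (the variable names are illustrative):

```python
# Counts of MLL2 somatic mutations by class, as quoted in the text (n = 78).
MLL2_MUTATION_COUNTS = {
    "nonsense": 29,
    "frameshift indel": 36,
    "splice site": 6,
    "non-synonymous": 7,
}

TOTAL = sum(MLL2_MUTATION_COUNTS.values())

# Percentage of all mutations in each class, rounded to the nearest integer.
PERCENT_BY_CLASS = {
    name: round(100 * count / TOTAL)
    for name, count in MLL2_MUTATION_COUNTS.items()
}

# ASSUMPTION: grouping nonsense, frameshift and splice-site changes as
# "inactivating" is one reading of the text, not a definition given by the authors.
INACTIVATING = sum(
    MLL2_MUTATION_COUNTS[name]
    for name in ("nonsense", "frameshift indel", "splice site")
)
PERCENT_INACTIVATING = round(100 * INACTIVATING / TOTAL)

print(TOTAL, PERCENT_BY_CLASS, PERCENT_INACTIVATING)
```

Under that grouping the inactivating share is 71 of 78, or about 91%, which agrees with the 91% figure the article gives for inactivating MLL2 mutations. Whether the authors grouped the classes in exactly this way is not stated here.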



Figure 2 | Overview of mutations and potential cooperative interactions in NHL. This heat map displays possible trends towards co-occurrence (red) and mutual exclusion (blue) of somatic mutations and structural rearrangements. Colours were assigned by taking the minimum value of a left- and right-tailed Fisher's exact test. To capture trends a P-value threshold of 0.3 was used, with the darkest shade of the colour indicating those meeting statistical significance (P ≤ 0.05). The relative frequency of mutations in ABC (blue), GCB (red), unclassifiable (black) DLBCLs and FL (yellow) cases is shown on the left. Genes were arranged with those having significant (P < 0.05, Fisher's exact test) enrichment for mutations in ABC cases (blue triangle) towards the top (and left) and those with significant enrichment for mutations in GCB cases (red triangle) towards the bottom (and right). The total number of cases in which each gene contained either cSNVs or confirmed somatic mutations is shown at the top. The cluster of blue squares (upper-right) results from the mutual exclusion of the ABC-enriched mutations (for example, MYD88, CD79B) from the GCB-enriched mutations (for example, EZH2, GNA13). Presence of structural rearrangements involving the two oncogenes BCL6 and BCL2 (indicated as BCL6s and BCL2s) was determined with FISH techniques using break-apart probes (Supplementary Methods).

[Figure 2 heat map. Genes in plotted order, from the ABC-enriched end to the GCB-enriched end: MYD88, CD79B, BCL6s, TNFAIP3, CARD11, FAS, TMEM30A, CD58, CD70, STAT3, ETS1, HIST1H1C, CCND3, KLHL6, BTG1, BTG2, IRF8, B2M, EP300, CREBBP, MLL2, FOXO1, TNFRSF14, MEF2B, TP53, BCL2, SGK1, GNA13, EZH2, BCL2s. Colour key: P < 0.05, 0.05-0.1, 0.1-0.3. Bar chart: number of cases (10-40) by subtype (ABC, GCB, U, FL).]

Figure 3 | Summary and effect of somatic mutations affecting MLL2 and MEF2B. a, Re-sequencing the MLL2 locus in 89 samples revealed mainly nonsense (red circles) and frameshift-inducing indel mutations (orange triangles; inverted triangles for insertions and upright triangles for deletions). A smaller number of non-synonymous somatic mutations (green circles) and point mutations or deletions affecting splice sites (yellow stars) were also observed. All of the non-synonymous point mutations affected a residue within either the catalytic SET domain, the FYRC domain (FY-rich carboxy-terminal domain) or PHD zinc finger domains. The effect of these splice-site mutations on MLL2 splicing was also explored (Supplementary Figure 7). b, The cSNVs and somatic mutations found in MEF2B in all FL and DLBCL cases sequenced are shown with the same symbols. Only the amino acids with variants in at least two patients are labelled. cSNVs were most prevalent in the first two protein-coding exons of MEF2B (exons 2 and 3). The crystal structure of MEF2 bound to EP300 supports the idea that two of the mutated sites (L67 and Y69) are important in the interaction between these proteins (Supplementary Figure 8 and Supplementary Discussion; ref. 50).

[Figure 3 panels. a, MLL2 with domains COG5141, SET, HMG box, PHD, PHD, FYRN and FYRC; axis 0-5500 bp. b, MEF2B with MADS box and MEF2 domains; axis 0-350 bp; labelled variants K4E, Y69C, Y69H, N81K, N81Y, D83G, D83V and D83A.]
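The colour assignment described in the Figure 2 legend (the minimum of a left- and right-tailed Fisher's exact test, with thresholds of 0.3 and 0.05) can be sketched as follows. The 2 × 2 counts are hypothetical, since the per-gene contingency tables are not given here, and the function names are illustrative rather than the authors' code. The sketch also collapses the colour key to two shades for brevity.

```python
from math import comb


def fisher_tails(a, b, c, d):
    """Left- and right-tailed Fisher's exact P values for the table [[a, b], [c, d]].

    a = cases with both alterations, b and c = cases with only one of them,
    d = cases with neither.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    lowest, highest = max(0, row1 + col1 - n), min(row1, col1)

    def pmf(k):
        # Hypergeometric probability of k double-positive cases with fixed margins.
        return comb(col1, k) * comb(n - col1, row1 - k) / comb(n, row1)

    left = sum(pmf(k) for k in range(lowest, a + 1))    # P(X <= a)
    right = sum(pmf(k) for k in range(a, highest + 1))  # P(X >= a)
    return left, right


def classify_pair(a, b, c, d, trend=0.3, significant=0.05):
    """Return (direction, shade, p) for one pair of alterations."""
    left, right = fisher_tails(a, b, c, d)
    p = min(left, right)
    if p > trend:
        return "none", "uncoloured", p
    # A small left tail means fewer joint cases than expected (blue in the legend);
    # a small right tail means more joint cases than expected (red in the legend).
    direction = "mutual exclusion" if left < right else "co-occurrence"
    shade = "darkest" if p <= significant else "lighter"
    return direction, shade, p


# Hypothetical counts for 100 cases.
print(classify_pair(0, 20, 25, 55))   # two alterations that never coincide
print(classify_pair(15, 5, 10, 70))   # two alterations that usually coincide
print(classify_pair(5, 15, 20, 60))   # joint count equal to its expectation
```

Taking the smaller of the two one-sided P values lets a single heat map show both directions of association, with the direction encoded by colour and the strength by shade.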

We used bacterial artificial chromosome (BAC) clone sequencing in eight FL cases to show that in all eight cases the mutations were in trans, affecting both MLL2 alleles. This observation is consistent with the notion that there is a complete, or near-complete, loss of MLL2 in the tumour cells of such patients.

With the exception of two primary FL cases and two DLBCL cell lines (Pfeiffer and SU-DHL-9), the majority of MLL2 mutations seemed to be heterozygous. Analysis of Affymetrix 500k SNP array data from two FL cases with apparent homozygous mutations revealed that both tumours showed copy number neutral loss of heterozygosity (LOH) for the region of chromosome 12 containing MLL2 (Supplementary Methods). Thus, in addition to bi-allelic mutation, LOH is a second, albeit less common mechanism by which MLL2 function is lost.

MLL2 was the most frequently mutated gene in FL, and among the most frequently mutated genes in DLBCL (Fig. 2). We confirmed MLL2 mutations in 31 of 35 FL patients (89%), in 12 of 37 DLBCL patients (32%), in 10 of 17 DLBCL cell lines (59%) and in none of the eight normal centroblast samples we sequenced. Our analysis predicted that the majority of the somatic mutations observed in MLL2 were inactivating (91% disrupted the reading frame or were truncating point mutations), indicating to us that MLL2 is a tumour suppressor of significance in NHL.

Table 2 | Summary of types of MLL2 somatic mutations

Sample type                    FL      DLBCL   DLBCL cell-line   Centroblast
Truncation                     18      4       7                 0
Indel with frameshift          22      8       6                 0
Splice site                    4       2       0                 0
SNV                            3       2       2                 0
Any mutation/number of cases   31/35   12/37   10/17             0/8
Percentage                     89      32      59                0

Recurrent point mutations in MEF2B

Our selective pressure analysis also revealed genes with stronger pressure for acquisition of amino acid substitutions than for nonsense
mutations. One such gene was MEF2B, which had not previously been linked to One lymphoma. We found that 20 (15.7%) cases had MEF2B mutations. such gene was MEF2B , which had not previously been cSNVs 4 (3.1%) cases had MEF2C cSNVs. All cases cSNVs detected by linked toand lymphoma. We found that 20 (15.7%) had MEF2B RNA-seq MADS box or MEF2 domains. To detercSNVs and affected 4 (3.1%)either cases the had MEF2C cSNVs. All cSNVs detected by mine the frequency and ofbox MEF2B mutations, we SangerRNA-seq affected either thescope MADS or MEF2 domains. To detersequenced exons 2 and in 261 samples; 259 DLBCL mine the frequency and 3 scope ofprimary MEF2B FL mutations, we Sangerprimary tumours; 17 cell 35 cases ofsamples; assorted259 NHL (IBL, sequenced exons 2 and 3 in lines; 261 primary FL DLBCL composite FL and17 PBMCL); and eight non-malignant centroblast primary tumours; cell lines; 35 cases of assorted NHL (IBL, samples. We a capture (Supplementary Methods) composite FL also and used PBMCL); and strategy eight non-malignant centroblast to sequence the entire MEF2B coding region in the 261 FL samples, samples. We also used a capture strategy (Supplementary Methods) six additional variants outside exons 2 the and 261 3. We identorevealing sequence the entire MEF2B coding region in FLthus samples, tified 69 cases (34 DLBCL, 12.67%; and 35 FL, 15.33%) with MEF2B revealing six additional variants outside exons 2 and 3. We thus idencSNVs or indels, failing to observe novel variants in other NHL and tified 69 cases (34 DLBCL, 12.67%; and 35 FL, 15.33%) with MEF2B non-malignant samples. Of the variants 55 (80%) affected residues cSNVs or indels, failing to observe novel variants in other NHL and within the MADS box and MEF2 domains encoded by exons 2 and 3 non-malignant samples. Of the variants 55 (80%) affected residues (Supplementary Table 11; Fig. 3b). 
Each patient generally had a single within the MADS box and MEF2 domains encoded by exons 2 and 3 MEF2B variant and we observed relatively few (eight in total, 10.7%) (Supplementary Table 11; Fig. 3b). Each patient generally had a single truncation-inducing SNVs or indels. Non-synonymous SNVs were by MEF2B variant and we observed relatively few (eight in total, 10.7%) far the most common type of change observed, with 59.4% of detected truncation-inducing SNVs indels. Non-synonymous SNVs were by variants affecting K4, Y69,or N81 or D83. In 12 cases MEF2B mutations far the most common type of change observed, with 59.4% of detected were shown to be somatic, including representative mutations at each variants K4, Y69, N81 or D83. In 12 cases MEF2B mutations of K4, affecting Y69, N81 and D83 (Supplementary Table 12). We did not were shown to be somatic, including representative mutations at each detect mutations in ABC cases, indicating that somatic mutations in ofMEF2B K4, Y69, N81 and D83 to (Supplementary Table 12). We did not have a role unique the development of GCB DLBCL and FL detect in ABC cases, indicating that somatic mutations in (Fig. mutations 2). MEF2B have a role unique to the development of GCB DLBCL and FL (Fig. 2). Table 2 | Summary of types of MLL2 somatic mutations
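The frequencies quoted for MLL2 can be re-derived from the counts in Table 2. The following is a minimal sketch in Python of that arithmetic, not the authors' analysis code; the names `MUTATION_COUNTS`, `CASES`, `mutation_type_shares` and `case_frequencies` are illustrative assumptions, and rounding to whole percentages is assumed to match how the table and text report them.

```python
# Counts transcribed from Table 2, in the column order
# FL, DLBCL, DLBCL cell-line, Centroblast.
MUTATION_COUNTS = {
    "Truncation": (18, 4, 7, 0),
    "Indel with frameshift": (22, 8, 6, 0),
    "Splice site": (4, 2, 0, 0),
    "SNV": (3, 2, 2, 0),
}

# (cases with any MLL2 mutation, cases sequenced) for each sample type.
CASES = {
    "FL": (31, 35),
    "DLBCL": (12, 37),
    "DLBCL cell-line": (10, 17),
    "Centroblast": (0, 8),
}


def percent(part, whole):
    """Whole-number percentage, as reported in the table."""
    return round(100 * part / whole)


def mutation_type_shares(counts):
    """Return the total number of mutations and each type's share of it."""
    totals = {name: sum(row) for name, row in counts.items()}
    grand_total = sum(totals.values())
    shares = {name: percent(n, grand_total) for name, n in totals.items()}
    return grand_total, shares


def case_frequencies(cases):
    """Return the percentage of mutated cases for each sample type."""
    return {
        sample: percent(mutated, sequenced)
        for sample, (mutated, sequenced) in cases.items()
    }


if __name__ == "__main__":
    total, shares = mutation_type_shares(MUTATION_COUNTS)
    print("mutations in total:", total)
    print("share by type (%):", shares)
    print("mutated cases (%):", case_frequencies(CASES))
```

Summing each row across sample types gives the per-type totals, so the same table yields both the "Percentage" row and the per-type shares.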



SGK1 encodes a phosphatidylinositol-3-OH kinase (PI(3)K)-regulated kinase with functions including regulation of FOXO transcription factors (ref. 25), regulation of NF-κB by phosphorylating IκB kinase (ref. 26), and negative regulation of NOTCH signalling (ref. 27). SGK1 also resides within a region of chromosome 6 commonly deleted in DLBCL (Fig. 1) (ref. 5). The mechanism by which SGK1 and GNA13 inactivation may contribute to lymphoma is unclear, but the strong degree of apparent selection towards their inactivation and their overall high mutation frequency (each mutated in 18 of 106 DLBCL cases) suggests that their loss contributes to B-cell NHL. Certain genes are known to be mutated more commonly in GCB DLBCLs (for example, TP53 (ref. 28) and EZH2 (ref. 13)). Here, both SGK1 and GNA13 mutations were found only in GCB cases (P = 1.93 × 10^-3 and 2.28 × 10^-4, Fisher's exact test; n = 15 and 18, respectively) (Fig. 2). Two additional genes (MEF2B and TNFRSF14) with no previously described role in DLBCL showed a similar restriction to GCB cases (Fig. 2).

Inactivating MLL2 mutations

MLL2 showed the most significant evidence for selection and the largest number of nonsense SNVs. Our RNA-seq analysis indicated that 26.0% (33/127) of cases carried at least one MLL2 cSNV. To address the possibility that variable RNA-seq coverage of MLL2 failed to capture some mutations, we PCR-amplified the entire MLL2 locus (~36 kilobases) in 89 cases (35 primary FLs, 17 DLBCL cell lines, and 37 DLBCLs). Of these cases 58 were among the RNA-seq cohort. Illumina amplicon re-sequencing (Supplementary Methods) revealed 78 mutations, confirming the RNA-seq mutations in the overlapping cases and identifying 33 additional mutations. We confirmed the somatic status of 46 variants using Sanger sequencing (Supplementary Table 10), and showed that 20 of the 33 additional mutations were insertions or deletions (indels). Three SNVs at splice sites were also detected, as were 10 new cSNVs that had not been detected by RNA-seq.

The somatic mutations were distributed across MLL2 (Fig. 3a). Of these, 37% (n = 29/78) were nonsense mutations, 46% (n = 36/78) were indels that altered the reading frame, 8% (n = 6/78) were point mutations at splice sites and 9% (n = 7/78) were non-synonymous amino acid substitutions (Table 2). Four of the somatic splice site mutations had effects on MLL2 transcript length and structure. For example, two heterozygous splice site mutations resulted in the use of a novel splice donor site and an intron retention event.

Approximately half of the NHL cases we sequenced had two MLL2 mutations (Supplementary Table 10).

Table 1 | Overview of cSNVs and confirmed somatic mutations in most frequently mutated genes

                 Cases          Total          Somatic cSNVs                                                  Skew
Gene             NS   S    T    NS   S    T    (RNA-seq cohort)*   P (raw)        q              NS SP   T SP   (M, WT, both)†
MLL2‡            16   8    17   17   8    18   10                  6.85 × 10^-8   8.50 × 10^-7   0.834   14.4   WT
TNFRSF14^G‡      7    1    7    8    1    7    11                  6.85 × 10^-8   8.50 × 10^-7   7.52    118    Both
SGK1^G‡          18   6    6    37   10   9    6                   6.85 × 10^-8   8.50 × 10^-7   19.5    61.7   –
BCL10‡           2    0    4    3    0    4    4                   6.85 × 10^-8   8.50 × 10^-7   3.62    112    WT
GNA13^G‡         21   1    2    33   1    2    5                   6.85 × 10^-8   8.50 × 10^-7   25.7    24.1   Both
TP53^G‡          20   2    1    23   3    1    22                  6.85 × 10^-8   8.50 × 10^-7   15.6    14.1   Both
EZH2^G‡          33   0    0    33   0    0    33                  6.85 × 10^-8   8.50 × 10^-7   11.4    0.00   Both
BTG2‡            12   6    1    14   6    1    2                   6.85 × 10^-8   8.50 × 10^-7   23.9    35.1   –
BCL2^G‡          42   45   0    96   105  0    43                  9.35 × 10^-8   8.50 × 10^-7   3.78    0.00   M
BCL6‡§           11   2    0    12   2    0    2                   9.35 × 10^-8   8.50 × 10^-7   0.175   0.00   M
CIITA‡§          5    3    0    6    3    0    2                   9.35 × 10^-8   8.50 × 10^-7   0.086   0.00   –
FAS‡             2    0    4    3    0    4    2                   1.52 × 10^-7   1.17 × 10^-6   2.54    66.5   WT
BTG1‡            11   6    2    11   7    2    10                  1.52 × 10^-7   1.17 × 10^-6   17.5    52.5   Both
MEF2B^G‡         20   2    0    20   2    0    10                  2.05 × 10^-7   1.47 × 10^-6   14.2    0.00   M
IRF8‡            11   5    3    14   5    3    3                   4.55 × 10^-7   3.03 × 10^-6   8.82    28.2   WT
TMEM30A‡         1    0    4    1    0    4    4                   6.06 × 10^-7   3.79 × 10^-6   0.785   65.0   WT
CD58‡            2    0    3    2    0    3    2                   2.42 × 10^-6   1.43 × 10^-5   2.29    69.2   –
KLHL6‡           10   2    2    12   2    2    4                   1.00 × 10^-5   5.26 × 10^-5   5.42    16.4   –
MYD88^A‡         13   2    0    14   2    0    9                   1.00 × 10^-5   5.26 × 10^-5   12.4    0.00   WT
CD70‡            5    0    1    5    0    2    3                   1.70 × 10^-5   8.48 × 10^-5   7.08    44.0   –
CD79B^A‡         7    2    1    9    2    1    7                   2.00 × 10^-5   9.52 × 10^-5   10.9    18.3   M
CCND3‡           5    1    2    7    1    2    6                   2.80 × 10^-5   1.27 × 10^-4   6.55    36.3   WT
CREBBP‡          20   7    4    24   7    4    9                   1.00 × 10^-4   4.35 × 10^-4   2.72    6.04   Both
HIST1H1C‡        9    0    0    10   0    0    6                   1.80 × 10^-4   7.50 × 10^-4   11.9    0.00   Both
B2M‡             7    0    0    7    0    0    4                   3.90 × 10^-4   1.56 × 10^-3   16.6    0.00   WT
ETS1‡            10   1    0    10   1    0    4                   4.10 × 10^-4   1.58 × 10^-3   5.76    0.00   WT
CARD11‡          14   3    0    14   3    0    3                   1.90 × 10^-3   7.04 × 10^-3   3.37    0.00   Both
FAT2‡§           2    1    0    2    1    0    2                   6.30 × 10^-3   2.25 × 10^-2   0.128   0.00   –
IRF4‡§           9    4    0    26   5    0    5                   7.00 × 10^-3   2.41 × 10^-2   0.569   0.00   Both
FOXO1‡           8    4    0    10   4    0    4                   7.60 × 10^-3   2.53 × 10^-2   4.02    0.00   –
STAT3            9    0    0    9    0    0    4                   2.19 × 10^-2   6.08 × 10^-2   –       –      Both
RAPGEF1          8    3    0    10   3    0    3                   2.98 × 10^-2   7.45 × 10^-2   –       –      WT
ABCA7            12   3    0    15   3    0    2                   7.76 × 10^-2   1.67 × 10^-1   –       –      WT
RNF213           10   8    0    10   8    0    2                   7.87 × 10^-2   1.67 × 10^-1   –       –      –
MUC16            17   12   0    39   25   0    2                   8.32 × 10^-2   1.73 × 10^-1   –       –      –
HDAC7            8    4    0    8    4    0    2                   8.94 × 10^-2   1.82 × 10^-1   –       –      WT
PRKDC            7    3    0    7    4    0    2                   1.06 × 10^-1   2.05 × 10^-1   –       –      –
SAMD9            9    2    0    9    2    0    2                   1.79 × 10^-1   3.01 × 10^-1   –       –      –
TAF1             10   0    0    10   0    0    2                   3.03 × 10^-1   4.74 × 10^-1   –       –      –
PIM1             20   19   0    33   34   0    11                  3.40 × 10^-1   5.23 × 10^-1   –       –      WT
COL4A2           8    2    0    8    2    0    2                   7.64 × 10^-1   8.99 × 10^-1   –       –      –
EP300            8    7    1    8    7    1    3                   9.54 × 10^-1   1.00           –       –      WT

Individual cases with non-synonymous (NS), synonymous (S) and truncating (T) mutations and the total number of mutations of each class are shown separately because some genes contained multiple mutations in the same case. The P values indicated in bold are the upper limit on the P value for that gene determined with the approach described in ref. 19 (see Supplementary Methods), q is the Benjamini-corrected q value, and NS SP and T SP refer to selective pressure estimates from this model for the acquisition of non-synonymous or truncating mutations, respectively. Genes with a superscript of either A or G were found to have mutations significantly enriched in ABC or GCB cases, respectively (P < 0.05, Fisher's exact test).
* Additional somatic mutations identified in larger cohorts and insertion/deletion mutations are not included in this total.
† Both indicates that we observed separate cases in which skewed expression was seen but where this skew was not consistent for the mutant or wild-type allele.
‡ Genes significant at a false discovery rate of 0.03. SNVs in BCL2 and previously confirmed hot spot mutations in EZH2 and CD79B are probably somatic in these samples based on published observations of others.
§ Selective pressure estimates are both < 1 indicating purifying selection rather than positive selection acting on this gene.

Discussion

In our study of genome, transcriptome and exome sequences from 127 B-cell NHL cases, we identified 109 genes with clear evidence of somatic mutation in multiple individuals. Significant selection seems to act on at least 26 of these for the acquisition of either nonsense or missense mutations. To the best of our knowledge, the majority of these genes had not previously been associated with any cancer type. We observed an enrichment of somatic mutations affecting genes involved in transcriptional regulation and, more specifically, chromatin modification.

MLL2 emerged from our analysis as a major tumour suppressor locus in NHL. It is one of six human H3K4-specific methyltransferases (ref. 29), all of which share homology with the Drosophila trithorax gene. Trimethylated H3K4 (H3K4me3) is an epigenetic mark associated with the promoters of actively transcribed genes. By laying down this mark, MLLs are responsible for the transcriptional regulation of developmental genes including the homeobox (Hox) gene family (ref. 30) which collectively control segment specificity and cell fate in the developing embryo (refs 31, 32). Each MLL family member is thought to target different subsets of Hox genes (ref. 33) and in addition, MLL2 is known to regulate the transcription of a diverse set of genes (ref. 34). Recently, MLL2 mutations were reported in a small-cell lung cancer cell line (ref. 35) and in renal carcinoma (ref. 36), but the frequency of nonsense mutations affecting MLL2 in these cancers was not established in these reports. Inactivating mutations in MLL2 or MLL3 were reported recently in 16% of medulloblastoma patients (ref. 37), further implicating MLL2 as a cancer gene.

Our data link MLL2 somatic mutations to B-cell NHL. The reported mutations are likely to be inactivating and in eight of the cases with multiple mutations, we confirmed that both alleles were affected, presumably resulting in essentially complete loss of MLL2 function. The high prevalence of MLL2 mutations in FL (89%) equals the frequency of the t(14;18)(q32;q21) translocation, which is considered the most prevalent genetic abnormality in FL (ref. 3). In DLBCL tumour samples and cell lines, MLL2 mutation frequencies were 32% and 59%, respectively, also exceeding the prevalence of the most frequent cytogenetic abnormalities, such as the various translocations involving 3q27, which occur in 25–30% of DLBCLs and are enriched in ABC cases (ref. 38). Importantly, we found MLL2 mutated in both DLBCL subtypes (Fig. 2). Our analyses thus indicate that MLL2 acts as a central tumour suppressor in FL and both DLBCL subtypes.

The MEF2 gene family encodes four related transcription factors that recruit histone-modifying enzymes including histone deacetylases (HDACs) and HATs in a calcium-regulated manner. Although truncating variants were detected in our analysis of MEF2 gene family members, our analysis suggests that, in contrast to MLL2, MEF2 family members tend to selectively acquire non-synonymous amino acid substitutions. In the case of MEF2B, 59.4% of all the cSNVs were found at four sites within the protein (K4, Y69, N81 and D83), and all four of these sites were confirmed to be targets of somatic mutation. D83 is affected in 39% of the MEF2B alterations, resulting in replacement of the charged aspartate with any of alanine, glycine or valine. Although we cannot yet predict the consequences of these substitutions on protein function, it seems likely that their effect would have an impact on the ability of MEF2B to facilitate gene expression and thus have a role in promoting the malignant transformation of germinal centre B cells to lymphoma (Supplementary Discussion).

MEF2B mutations can be linked to CREBBP and EP300 mutations, and to recurrent Y641 mutations in EZH2 (ref. 13). One target of CREBBP/EP300 HAT activity is H3K27, which is methylated by EZH2 to repress transcription. There is evidence that the action of EZH2 antagonizes that of CREBBP/EP300 (ref. 39). One function of MEF2 is to recruit either HDACs or CREBBP/EP300 to target genes (ref. 40), and it has been suggested that HDACs compete with CREBBP/EP300 for the same binding site on MEF2 (ref. 41). Under normal Ca2+ levels, MEF2 is bound by type IIa HDACs, which maintain the tails of histone proteins in a deacetylated repressive chromatin state (ref. 42). Increased cytoplasmic Ca2+ levels induce the nuclear export of HDACs, enabling the recruitment of HATs such as CREBBP/EP300, facilitating transcription at MEF2 target genes. Mutation of CREBBP, EP300 or MEF2B may have an impact on the expression of MEF2 target genes owing to reduced acetylation of nucleosomes near these genes (Supplementary Figure 5; Supplementary Discussion). In light of the recent finding that heterozygous EZH2 Y641 mutations enhance overall H3K27 trimethylation activity of PRC2 (refs 43, 44), it is possible that mutation of both MLL2 and EZH2 could cooperate in reducing the expression of some of the same target genes. Our data indicate that (1) post-transcriptional modification of histones is of key importance in germinal centre B cells and (2) deregulated histone modification due to these mutations is likely to result in reduced acetylation and enhanced methylation, and acts as a core driver event in the development of NHL (Supplementary Figure 5).

METHODS SUMMARY

All samples analysed contained at least 50% tumour cells. Genomes, exomes and transcriptomes were sequenced using a combination of Illumina GAIIx and HiSeq 2000 instruments to read lengths of between 36 and 100 nucleotides. Exome capture was performed using the Agilent SureSelect Target Enrichment System Protocol (Version 1.0, September 2009). Alignment was accomplished using BWA (ref. 45) and variants were identified using SNVmix (ref. 46). Variants were manually reviewed in IGV and were confirmed (where applicable) by PCR followed by either Sanger sequencing or Illumina re-sequencing. Structural rearrangements in genomes and transcriptomes were identified using ABySS (ref. 47). Gene expression values used for subtype assignment were calculated as reads per kilobase gene model per million mapped reads (RPKM) values (ref. 48) and subtypes were assigned using an adaptation of the method developed for data from Affymetrix expression arrays (ref. 49) trained with samples previously classified by this standard approach.

Received 13 November 2010; accepted 7 July 2011.
Published online 27 July 2011.

1. Anderson, J. R., Armitage, J. O., Weisenburger, D. D., Non-Hodgkin's Lymphoma Classification Project. Epidemiology of the non-Hodgkin's lymphomas: distributions of the major subtypes differ by geographic locations. Ann. Oncol. 9, 717–720 (1998).
2. Lenz, G. & Staudt, L. M. Aggressive lymphomas. N. Engl. J. Med. 362, 1417–1429 (2010).
3. Horsman, D. E. et al. Follicular lymphoma lacking the t(14;18)(q32;q21): identification of two disease subtypes. Br. J. Haematol. 120, 424–433 (2003).
4. Iqbal, J. et al. BCL2 translocation defines a unique tumor subset within the germinal center B-cell-like diffuse large B-cell lymphoma. Am. J. Pathol. 165, 159–166 (2004).
5. Lenz, G. et al. Molecular subtypes of diffuse large B-cell lymphoma arise by distinct genetic pathways. Proc. Natl Acad. Sci. USA 105, 13520–13525 (2008).
6. Pasqualucci, L. et al. Inactivation of the PRDM1/BLIMP1 gene in diffuse large B cell lymphoma. J. Exp. Med. 203, 311–317 (2006).
7. Kato, M. et al. Frequent inactivation of A20 in B-cell lymphomas. Nature 459, 712–716 (2009).
8. Compagno, M. et al. Mutations of multiple genes cause deregulation of NF-κB in diffuse large B-cell lymphoma. Nature 459, 717–721 (2009).
9. Davis, R. E. et al. Chronic active B-cell-receptor signalling in diffuse large B-cell lymphoma. Nature 463, 88–92 (2010).
10. Ngo, V. N. et al. Oncogenically active MYD88 mutations in human lymphoma. Nature 470, 115–119 (2011).
11. Mardis, E. R. et al. Recurring mutations found by sequencing an acute myeloid leukemia genome. N. Engl. J. Med. 361, 1058–1066 (2009).
12. Shah, S. P. et al. Mutational evolution in a lobular breast tumour profiled at single nucleotide resolution. Nature 461, 809–813 (2009).
13. Morin, R. D. et al. Somatic mutations altering EZH2 (Tyr641) in follicular and diffuse large B-cell lymphomas of germinal-center origin. Nature Genet. 42, 181–185 (2010).
14. Futreal, P. A. et al. A census of human cancer genes. Nature Rev. Cancer 4, 177–183 (2004).
15. Pasqualucci, L. et al. Inactivating mutations of acetyltransferase genes in B-cell lymphoma. Nature 471, 189–195 (2011).
16. Yusuf, I., Zhu, X., Kharas, M. G., Chen, J. & Fruman, D. A. Optimal B-cell proliferation requires phosphoinositide 3-kinase-dependent inactivation of FOXO transcription factors. Blood 104, 784–787 (2004).
17. Saito, M. et al. BCL6 suppression of BCL2 via Miz1 and its disruption in diffuse large B cell lymphoma. Proc. Natl Acad. Sci. USA 106, 11294–11299 (2009).
18. Lenz, G. et al. Oncogenic CARD11 mutations in human diffuse large B cell lymphoma. Science 319, 1676–1679 (2008).



19. Greenman, C., Wooster, R., Futreal, P. A., Stratton, M. R. & Easton, D. F. Statistical 47. Robertson, G. et al. De novo assembly and analysis of RNA-seq data. Nature 40 a analysis of pathogenicity of somatic mutations in cancer. Genetics 173, Methods 7, 909912 (2010). 30 21872198 (2006). 48. Mortazavi, A., Williams, B. A., Mccue, K., Schaeffer, L. & Wold, B. Mapping and ABC GCB 20 20. K. J. et al. Acquired TNFRSF14 mutations in follicular lymphoma U Cheung, FL quantifying mammalian transcriptomes by RNA-Seq. Nature Methods 5, 621628 10 are associated with worse prognosis. Cancer Res. 70, 91669174 (2010). (2008). MYD88 21. Du, M. Q. et al. BCL10 gene mutation in lymphoma. Blood 95, 38853890 (2000). MLL2 49. Wright, G. et al. A gene expression-based method to diagnose clinically distinct CD79B BCL6s 22. Kreutz, B., Hajicek, N., Yau, D. M., Nakamura, S. & Kozasa, T. Distinct regions of COG5141 HMG box PHD PHD subgroups of diffuse large B cell lymphoma. Proc. Natl Acad. Sci. USA 100,SET TNFAIP3 FYRN Ga13 participate in its regulatory interactions with RGS homology domain99919996 (2003). CARD11 FYRC containing RhoGEFs. Cell. Signal. 19, 16811689 (2007). FAS 50.0 He, J. et al. Structure of p300 bound to MEF2 on DNA reveals a mechanism of bp 500 1000 1500 2000 2500 3000 3500 4000 4500 5000 5500 TMEM30A 23. Bhattacharyya, R. & Wedegaertner, P. Ga13 requires palmitoylation for plasma enhanceosome assembly. Nucleic Acids Res. (2011). N81K membrane localization, Rho-dependent signaling, and promotion of CD58 p115b N81Y CD70 D83G Supplementary Information is linked to the online version of the paper at RhoGEF membrane binding. J. Biol. Chem. 275, 1499214999 (2000). STAT3 Y69C 24. Manganello, J. M., Huang, J., Kozasa, T., Voyno-Yasenetskaya, T. A. & LeETS1 Breton, G. C. www.nature.com/nature. 
D83V Y69H HIST1H1C Protein kinase A-mediated phosphorylation of the Ga13 switch I region alters the CCND3 Acknowledgements This study was funded in part by funding from the National Gabc13-G protein-coupled receptor complex and inhibits Rho activation. J. Biol. KLHL6 D83A Cancer Institute Office of Cancer Genomics (Contract No. HHSN261200800001E), the K4E BTG1 Chem. 278, 124130 (2003). MEF2B BTG2 Terry Fox Foundation (grant 019001, Biology of Cancer: Insights from Genomic 25. Brunet, A. et al. Protein kinase SGK mediates survival signals by phosphorylating MADS box MEF2 IRF8 Analyses of Lymphoid Neoplasms) and Genome Canada/Genome British Columbia the forkhead transcription factor FKHRL1 (FOXO3a). Mol. Cell. Biol. 21, 952965 B2M 0 Competition 50 100 150 High Resolution 200 250 bp EP300 Grant III (Project Title: Analysis of300 Follicular350 Lymphoma (2001). CREBBP Genomes) to J.M.C., R.D.G. and M.A.M. We acknowledge support from NIH grants 26. Tai, D. J. C., Su, C.-C., Ma, Y.-L. & Lee, E. H. Y. SGK1 phosphorylation of MLL2 IkB kinase a Figure 3 | Summary and effect of somatic mutations MLL2and and P50CA130805-01 SPORE in Lymphoma, Tissue Resource affecting Core (PI Fisher) FOXO1 and p300 Up-regulates NF-kB activity and increases N-methyl-D-aspartate TNFRSF14 MEF2B . a, Re-sequencing MLL2 to locus in 89 samplesand revealed mainly 1U01CA114778 Molecular the Signatures Improve Diagnosis Outcome in receptor NR2A and NR2B expression. J. Biol. Chem. 284, 40734089 (2009). MEF2B Lymphoma (PIcircles) Chan). and A.J.M. is a Career Development Program Fellow of the Leukemia (red frameshift-inducing indel mutations (orange 27. Mo, J. et al. Serum- and glucocorticoid-inducible kinase 1 (SGK1) controls TP53Notch1 nonsense and Lymphoma Society. N.A.J. was a research fellow of the Terry Fox Foundation (award BCL2 signaling by downregulation of protein stability through Fbw7 ubiquitin ligase. triangles; inverted triangles for insertions and upright triangles for deletions). 
A SGK1 NCIC 019005) and the Michael Smith Foundation for Health Research J. Cell Sci. 124, 100112 (2011). smaller number of non-synonymous somatic mutations (green circles) and GNA13 (ST-PDF-01793). M.A.M. is a Terry Fox Young Investigator and a Michael Smith Senior 28. Young, K. H. et al. Structural profiles of TP53 gene mutations predictEZH2 clinical mutations or deletions affecting splice sites (yellow were also BCL2s Research Scholar. R.D.M. is a Vanier Scholar (CIHR) and holds stars) a MSFHR senior outcome in diffuse large B-cell lymphoma: an international collaborative study. point observed. of the non-synonymous point mutations a residue within graduateAll studentship. M.M.-L. acknowledges support fromaffected a Postdoctoral Fellowship Blood 112, 30883098 (2008). <0.05 from the the catalytic Spanish Ministry of Education, underdomain the Programa Nacional de Movilidad SET domain, the FYRC (FY-rich carboxy-terminal 29. Shilatifard, A. Molecular implementation and physiological roles for histone H3 either 0.10.05 de Recursos Humanos del Plan Nacional de I-D 1i 2008-2011. D.W.S. was supported lysine 4 (H3K4) methylation. Curr. Opin. Cell Biol. 20, 341348 (2008). domain) or PHD zinc finger domains. The effect of these splice-site mutations 0.30.1 by the Terry Fox Foundation Strategic Health Research Training Program in Cancer 30. Milne, T. et al. MLL targets SET domain methyltransferase activity to Hox gene on MLL2 splicing was also explored (Supplementary Figure 7). b, The cSNVs Research at Canadian Institutes of Health Research (Grant No. TGT-53912). J.J.S. promoters. Mol. Cell 10, 11071117 (2002). somatic mutations found in MEF2BCancer in all FL and DLBCL cases sequenced Figure | Overview ofgenes mutations and potential cooperative interactions in and 31. 2 Krumlauf, R. Hox in vertebrate development. Cell 78, 191201 (1994). acknowledges funding from The Canadian Society and the Canadian Institutes with the same symbols. 
Only amino acids with variants in at least NHL. heatE. map displays possible trends towards co-occurrence (red) and are 32. This Canaani, et al. ALL-1//MLL1, a homologue of Drosophila TRITHORAX, modifies ofshown Health Research. R.G. is supported by athe UBC Four Year Fellowship. I.M.M. is directly involved in infant and acute leukaemia. Br. J. Cancer 90, acknowledges the Canadian Foundation Innovation for ain Leaders Opportunity Fund. two patients are labelled. cSNVs werefor most prevalent the first two proteinmutual chromatin exclusion and (blue) of somatic mutations structural rearrangements. (2004).by taking the minimum value of a left- and right-tailed The laboratory for this study undertaken at thestructure Genome Sciences Centre, coding exons ofwork MEF2B (exons 2was and 3). The crystal of MEF2 bound Colours756760 were assigned 33. Wang, P. et al. Global analysis of H3K4 methylation defines MLL family member Columbia Cancer Research Centre Centresites for Translational and Applied EP300 supports the idea that two of and the the mutated (L67 and Y69) are Fisherstargets exact test. To capture trends a P-value threshold of 0.3 was used, with toBritish and points to a role for MLL1-mediated H3K4 methylation in the regulation Genomics, a program of the Provincial Health Services Authority Laboratories. The important in the interaction between these proteins (Supplementary Figure 8 the darkest shade of theinitiation colour indicating those meeting significance of transcriptional by RNA polymerase II. Mol. statistical Cell. Biol. 29, 60746085 authors would like to thank C. Greenman for supplying his software and also 50 . and Supplementary Discussion) (P # 0.05). The relative frequency of mutations in ABC (blue), GCB (red), (2009). acknowledge D. Gerhard and S. Aparicio for discussions and guidance. Special thanks 34. Issaeva, I. et al. Knockdown of ALR (MLL2) reveals ALR target genes andGenes leads to to C. Suragh, R. Roscoe, A. Troussard and A. 
Drobnies for expert project management unclassifiable (black) DLBCLs and FL (yellow) cases is shown on the left. alterations in cell adhesion and growth. Mol. Cell. Biol.Fishers 27, 18891903 (2007). assistance, and to the Library Construction, Sequencing and Bioinformatics teams at were arranged with those having significant (P , 0.05, exact test) such geneThe was MEF2B , which had not previously been 35. Pleasance, E. D. et al. A small-cell lung cancer genome with complex signatures of mutations. the GenomeOne Sciences Centre. content of this publication does not necessarily enrichment for mutations in ABC cases (blue triangle) towards the top (and tobacco exposure. Nature 463, 184190 (2010). reflect the of policiesWe of the Department Health and Human Services, nor does linked to views lymphoma. found thatof 20 (15.7%) cases had MEF2B left) and those with significant enrichment for mutations in GCB cases (red 36. Dalgliesh, G. L. et al. Systematic sequencing of renal carcinoma reveals inactivation mention of trade names,cases commercial products, or organizations imply endorsement by cSNVs and 4 (3.1%) had MEF2C cSNVs. All cSNVs detected by oftowards histone modifying genes. Nature 463, 360363 (2010). of cases in which triangle) the bottom (and right). The total number the US Government. 37. Parsons, D. W. either et al. The genetic ofsomatic the childhood cancer each gene contained cSNVs or landscape confirmed mutations is shown at RNA-seq affected either the MADS box or MEF2 domains. To deterAuthor Contributions M.A.M., D.E.H., and J.M.C. conceived of the Sangerstudy and Science 331, 435439 (2011). mine the frequency andR.D.G., scope of M.H. MEF2B mutations, we the top.medulloblastoma. The cluster of blue squares (upper-right) results from the mutual led the design of the experiments. R.D.M. performed the analysis of sequence data, 38. Iqbal, J. et al. 
Distinctive patterns of BCL6 molecular alterations and their exons 2 and 3M.M.-L., in 261 primary FL produced samples; 259 and DLBCL exclusion of the ABC-enriched mutations (for example, MYD88 CD79B ) from sequenced identified mutations and, with A.J.M. and M.A.M., figures wrote functional consequences in different subgroups of diffuse large, B-cell lymphoma. the GCB-enriched mutations (for example, EZH2, GNA13). Presence of the manuscript. M.M.-L., A.J.M., S.Chan, D.S., H.M., NHL J.S., M.M., T.Z., Leukemia 21, 23322343 (2007). primary tumours; 17 cellD.L.T., lines; 35 S.Chittarajan, cases of assorted (IBL, 39. Pasini, D. et al. Characterization antwo antagonistic switch between histone H3 A.D., K.T., Y.B., M.R.F., and T.M.S. designed performed experiments to structural rearrangements involving of the oncogenes BCL6 and BCL2 composite FL andJ.T.-W. PBMCL); and eight and non-malignant centroblast lysine methylation and acetylation in the transcriptional regulation of amplify, discover and validate mutations. R.G., M.G. and I.M.M. contributed to analyses (indicated as 27 BCL6 s and BCL2 s) was determined with FISH techniques using samples. Wethe also used a capture strategy Methods) Polycomb group target genes. Nucleic Acids Res. (2010). and reviewed manuscript. N.A.J., M.B., B.W. and(Supplementary B.M. prepared the samples, break-apart probes (Supplementary Methods). 40. Giordano, A. & Avantaggiati, M. p300 and CBP: partners for life and death. J. Cell. to performed sample sorting and COO analysis and contributed to 261 the text. and sequence the entire MEF2B coding region in the FLA.B.-W. samples, Physiol. 181, 218230 (1999). J.J.S. collected and preparedvariants constitutional DNA samples. K.L.M., R.C., S.L., M.F. and S.J. revealing six additional outside exons 2 and 3. We thus idenchromosome (BAC) clone in eight FL cases to show 41. Han, A., He, J., Wu, Y., Liu,sequencing J. O. & Chen, L. 
Mechanism of recruitment ofthat class in II generated de novo assemblies and identified mutations. M.K., S.R., M.G., O.Y. and E.Y.Z. casesand (34 DLBCL,to 12.67%; andperformed 35 FL, 15.33%) with MEF2B histone by myocyte enhancer J. Mol. Biol. 345, 91102 all eight casesdeacetylases the mutations were in trans, factor-2. affecting both MLL2 alleles. tified wrote69 software contributed figures. R.D.C. copy number analysis and (2005). produced figure and S.B.-N.to performed confirmatory FISH experiments. Y.Z. and A.T. or aindels, failing observe novel variants in other NHL and This observation is consistent with the notion that there is a complete, cSNVs 42. Youn, H. & Liu, J. Cabin1 represses MEF2-dependent Nur77 expression and T cell produced the sequencing libraries. I.B., R.H., S.J.M.J., R.M., J.S. and M.H. contributed to non-malignant samples. Of the variants 55 (80%) affected residues or near-complete, loss of MLL2 in the tumour cells of such patients. apoptosis by controlling association of histone deacetylases and acetylases with the development of experimental and analytical protocols. L.R. provided materials and the MADS box and MEF2 domains encoded by exons 2 and 3 MEF2. 13, of 8594 (2000). With the Immunity exception two primary FL cases and two DLBCL cell within reviewed the manuscript. 43. Yap, D. B. et al. Somatic mutations at EZH2 Y641 act dominantly through a (Supplementary Table 11; Fig. 3b). Each patient generally had a single lines (Pfeiffer and SU-DHL-9), the majority of MLL2 mutations seemed Author Information The SRA accession number for the submission of the data not mechanism of selectively altered PRC2 catalytic activity, to increase H3K27 MEF2B variant and we observed relatively fewis(eight inthe total, 10.7%) to be heterozygous. Analysis of Affymetrix 500k SNP array data from included in previous publications is SRP001599, which linked to dbGAP study trimethylation. Blood 117, 24512459 (2011). accession phs000235.v2.p1. 
Reprints and permissions informationSNVs is available atby 44. C. apparent J. et al. Coordinated activities of wild-type revealed plus mutant EZH2 drive truncation-inducing SNVs or indels. Non-synonymous were two FLSneeringer, cases with homozygous mutations that both www.nature.com/reprints. This paper is distributed under the terms of the Creative tumor-associated hypertrimethylation of lysine 27 on histone H3 (H3K27) in far the most common type of change observed, with 59.4% of detected tumours showed copy number neutral loss of heterozygosity (LOH) Commons Attribution-Non-Commercial-Share Alike licence, and is freely available to human B-cell lymphomas. Proc. Natl Acad. Sci. USA 107, 2098020985 (2010). affecting K4, Y69, N81 or D83. In 12 cases mutations for 45. theLi, region of chromosome 12 containing MLL2 with (Supplementary all readers at www.nature.com/nature. The authors declare noMEF2B competing financial H. & Durbin, R. Fast and accurate short read alignment BurrowsWheeler variants interests. Readers are welcome including to comment representative on the online version of this article at transform. Bioinformatics 25, 17541760 shown to be somatic, mutations at each Methods). Thus, in addition to bi-allelic(2009). mutation, LOH is a second, were www.nature.com/nature. Correspondence and requests for materials should be 46. Goya, R. et al. SNVMix: predicting single nucleotide variants from next-generation ofaddressed K4, Y69, N81 and D83 (Supplementary Table 12). We did not albeit less common mechanism by which MLL2(2010). function is lost. sequencing of tumors. Bioinformatics 26, 730736 to M.A.M. (mmarra@bcgsc.ca).
[Figure 2 labels. Genes: MYD88, CD79B, BCL6s, TNFAIP3, CARD11, FAS, TMEM30A, CD58, CD70, STAT3, ETS1, HIST1H1C, CCND3, KLHL6, BTG1, BTG2, IRF8, B2M, EP300, CREBBP, MLL2, FOXO1, TNFRSF14, MEF2B, TP53, BCL2, SGK1, GNA13, EZH2, BCL2s. Axis: Cases. Annotations: GCB enrichment; ABC enrichment.]

MLL2 was the most frequently mutated gene in FL, and among the most frequently mutated genes in DLBCL (Fig. 2). We confirmed MLL2 mutations in 31 of 35 FL patients (89%), in 12 of 37 DLBCL patients (32%), in 10 of 17 DLBCL cell lines (59%) and in none of the eight normal centroblast samples we sequenced. Our analysis predicted that the majority of the somatic mutations observed in MLL2 were inactivating (91% disrupted the reading frame or were truncating point mutations), indicating to us that MLL2 is a tumour suppressor of significance in NHL.
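The frequencies quoted above follow directly from the case counts. As a minimal sketch (the function name and dictionary layout are illustrative assumptions, not part of the original analysis), the percentages can be recomputed as follows:

```python
# Cases with confirmed MLL2 mutations / cases sequenced, as quoted in the text.
MLL2_COUNTS = {
    "FL": (31, 35),
    "DLBCL": (12, 37),
    "DLBCL cell line": (10, 17),
    "Centroblast": (0, 8),
}


def mutation_frequency(mutated: int, total: int) -> int:
    """Return the percentage of mutated cases, rounded to the nearest integer."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * mutated / total)


for sample, (mutated, total) in MLL2_COUNTS.items():
    print(f"{sample}: {mutated}/{total} = {mutation_frequency(mutated, total)}%")
```

Rounding to whole percentages reproduces the values given in the text (89%, 32%, 59% and 0%).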

detect mutations in ABC cases, indicating that somatic mutations in MEF2B have a role unique to the development of GCB DLBCL and FL (Fig. 2).

Table 2 | Summary of types of MLL2 somatic mutations

| Sample type | Truncation | Indel with frameshift | Splice site | SNV | Any mutation/number of cases | Percentage |
|---|---|---|---|---|---|---|
| FL | 18 | 22 | 4 | 3 | 31/35 | 89 |
| DLBCL | 4 | 8 | 2 | 2 | 12/37 | 32 |
| DLBCL cell-line | 7 | 6 | 0 | 2 | 10/17 | 59 |
| Centroblast | 0 | 0 | 0 | 0 | 0/8 | 0 |

Recurrent point mutations in MEF2B

Our selective pressure analysis also revealed genes with stronger pressure for acquisition of amino acid substitutions than for nonsense

18 AUGUST 2011 | VOL 476 | NATURE

ARTICLE
PUBLISHED ONLINE: 10 JULY 2011 | DOI: 10.1038/NCHEMBIO.599
First published in Nature Chemical Biology 7, 566–574 (2011); doi: 10.1038/nchembio.599

A chemical probe selectively inhibits G9a and GLP methyltransferase activity in cells
Masoud Vedadi1,12, Dalia Barsyte-Lovejoy1,12, Feng Liu2,12, Sylvie Rival-Gervier3,5, Abdellah Allali-Hassani1, Viviane Labrie6, Tim J Wigle2, Peter A DiMaggio7, Gregory A Wasney1, Alena Siarheyeva1, Aiping Dong1, Wolfram Tempel1, Sun-Chong Wang6,8, Xin Chen2, Irene Chau1, Thomas J Mangano9, Xi-ping Huang9, Catherine D Simpson2, Samantha G Pattenden2, Jacqueline L Norris2, Dmitri B Kireev2, Ashutosh Tripathy10, Aled Edwards1, Bryan L Roth9, William P Janzen2, Benjamin A Garcia7, Arturas Petronis6, James Ellis3,4, Peter J Brown1, Stephen V Frye2, Cheryl H Arrowsmith1,11* & Jian Jin2*
Protein lysine methyltransferases G9a and GLP modulate the transcriptional repression of a variety of genes via dimethylation of Lys9 on histone H3 (H3K9me2) as well as dimethylation of non-histone targets. Here we report the discovery of UNC0638, an inhibitor of G9a and GLP with excellent potency and selectivity over a wide range of epigenetic and non-epigenetic targets. UNC0638 treatment of a variety of cell lines resulted in lower global H3K9me2 levels, equivalent to levels observed for small hairpin RNA knockdown of G9a and GLP with the functional potency of UNC0638 being well separated from its toxicity. UNC0638 markedly reduced the clonogenicity of MCF7 cells, reduced the abundance of H3K9me2 marks at promoters of known G9a-regulated endogenous genes and disproportionately affected several genomic loci encoding microRNAs. In mouse embryonic stem cells, UNC0638 reactivated G9a-silenced genes and a retroviral reporter gene in a concentration-dependent manner without promoting differentiation.
Protein lysine methylation is increasingly recognized as a major signaling mechanism in eukaryotic cells. This process has been most heavily studied in the context of epigenetic regulation of gene expression through methylation of lysine residues of histone proteins1–6, but a growing number of known non-histone substrates suggest that the impact of lysine methylation is not limited to chromatin biology7–10. Protein lysine methyltransferases (PKMTs) catalyze the transfer of a methyl group from S-adenosyl-L-methionine (SAM) to the ε-amino group of lysine residues of proteins, including histones1,11. Since the first PKMT was characterized in 2000 (ref. 12), more than 50 human PKMTs have been identified1,11. PKMTs show substantial variations in protein substrate selectivity and the degree of methylation on lysine, from mono- to di- to trimethylation. Selective pharmacological inhibition of the catalytic activity of individual PKMTs in cellular systems is a useful strategy for deciphering the complex signaling mechanisms of histone and protein lysine methylation. However, very few small-molecule tools are currently available for probing the activity of individual PKMTs13. Growing evidence suggests that PKMTs are important in the development of various human diseases11,14. In particular, G9a (also known as KMT1C or EHMT2), which was initially identified as a

histone H3 Lys9 (H3K9) methyltransferase15, is overexpressed in various human cancers including leukemia8, prostate carcinoma8,16, hepatocellular carcinoma17 and lung cancer18. It has been shown that knockdown of G9a inhibits prostate, lung and leukemia cancer cell growth16,18,19. The closely related protein GLP (also known as KMT1D or EHMT1) shares 80% sequence identity with G9a in their respective SET domains and forms a heterodimer with G9a20. In addition to catalyzing mono- and dimethylation of H3K9 (refs. 15,20), both G9a and GLP dimethylate Lys373 of the tumor suppressor p53, inactivating p53's transcriptional activity8. Moreover, G9a has been shown to be involved in cocaine addiction21, mental retardation22, maintenance of HIV-1 latency23 and DNA methylation in mouse embryonic stem (mES) cells24–26. Furthermore, pharmacologic inhibition of G9a and GLP has been reported to facilitate reprogramming of mouse fetal neural precursor cells into induced pluripotent stem (iPS) cells27,28. This broad range of cellular and disease-related activities poses a challenge for understanding G9a- and GLP-related biology and for the potential targeting of these proteins therapeutically. Thus, selective, potent and cell-active chemical probes for G9a and GLP would be extremely valuable tools for investigating the cellular role of these PKMTs, as well as for assessing their potential as therapeutic targets.

1Structural Genomics Consortium, University of Toronto, Toronto, Ontario, Canada. 2Center for Integrative Chemical Biology and Drug Discovery, Division of Medicinal Chemistry and Natural Products, UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA. 3Developmental and Stem Cell Biology Program, SickKids Hospital, Toronto, Ontario, Canada. 4Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada. 5INRA, UMR 1198 Biologie du Développement et Reproduction, Jouy en Josas, France. 6Krembil Family Epigenetic Laboratory, Centre for Addiction and Mental Health, Toronto, Ontario, Canada. 7Department of Chemistry, Princeton University, Princeton, New Jersey, USA. 8Institute of Systems Biology and Bioinformatics, National Central University, Jhongli City, Taiwan. 9National Institute of Mental Health Psychoactive Drug Screening Program, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA. 10Department of Biochemistry and Biophysics, UNC Macromolecular Interactions Facility, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA. 11Ontario Cancer Institute, Campbell Family Cancer Research Institute and Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada. 12These authors contributed equally to this work. *e-mail: carrow@uhnres.utoronto.ca or jianjin@unc.edu

NATURE CHEMICAL BIOLOGY | VOL 7 | AUGUST 2011 | www.nature.com/naturechemicalbiology


[Scheme 1 structures not reproduced. 1 (BIX01294): G9a IC50 = 180 nM; GLP IC50 = 34 nM. Improve in vitro potency: 2 (UNC0321): G9a IC50 < 15 nM; GLP IC50 = 15 nM. Improve cellular potency: 3 (UNC0638), R = H: G9a IC50 < 15 nM (n = 4); GLP IC50 = 19 ± 1 nM (n = 2). 4 (UNC0737), R = Me (negative control): G9a IC50 = 5,000 ± 200 nM (n = 2); GLP IC50 > 10,000 nM (n = 2).]

Scheme 1 | Discovery of UNC0638. Structure-based design and SAR exploration of the quinazoline template represented by BIX01294 led to the identification of UNC0321, a G9a and GLP inhibitor with high in vitro potency but poor cellular potency. The design and synthesis of several generations of new analogs aimed at improving cell membrane permeability while maintaining high in vitro potency resulted in the discovery of UNC0638, which has balanced in vitro potency and physicochemical properties aiding cell penetration. UNC0737, the N-methyl analog of UNC0638, was discovered as a structurally similar but less potent G9a and GLP inhibitor for use as a negative control.

The recent report of BIX01294 (1), a small-molecule inhibitor of G9a and GLP29, was an important advance, as this compound is, to our knowledge, the first potent and selective PKMT inhibitor. BIX01294 has since been used successfully as a probe of G9a in cellular reprogramming27,28 and reactivation of latent HIV-1 (ref. 23). BIX01294 at 4.1 μM reduced the abundance of the H3K9me2 mark in bulk histones in several cell lines and reduced H3K9me2 levels at G9a target genes29. However, BIX01294 was toxic to cells at concentrations higher than 4.1 μM (ref. 29). This poor separation between the concentration producing robust functional effects in cells and the concentration causing toxicity has limited the compound's usefulness as a G9a and GLP chemical probe. To provide a high-quality chemical probe30 of G9a and GLP with an improved ratio of toxicity to functional potency (toxicity/function ratio, which is determined by dividing the EC50 value of observed toxicity by the IC50 value of the functional potency), we have explored this 2,4-diamino-6,7-dimethoxyquinazoline template. We have previously reported the discovery of UNC0224 and UNC0321 (2) as potent and selective G9a and GLP inhibitors and described robust structure–activity relationships (SAR) of their analogs31,32. Other studies of the SAR of this scaffold have resulted in the discovery of E72 as a potent and selective GLP inhibitor33. However, UNC0321 (Supplementary Results, Supplementary Fig. 1) and E72 (ref. 33) are less potent than BIX01294 in cellular assays. Here we report that UNC0638 (3) is a potent, selective and cell-penetrant chemical probe for G9a and GLP, with a toxicity/function ratio of >100, compared to <6 for BIX01294. We describe the discovery of UNC0638 and its in vitro potency, selectivity, mechanism of action and kinetics, X-ray cocrystal structure and robust on-target activities in cells. This greatly improved, well-characterized chemical probe represents a substantial advance in PKMT probe discovery and will enable better understanding of the epigenetic and cellular role(s) of G9a and GLP.

Table 1 | Selectivity of UNC0638 against epigenetic targets

| Target | IC50 (nM)ᵃ | Ki (nM)ᵃ | Tm shift (°C)ᵇ |
|---|---|---|---|
| G9a | <15ᶜ | 3 ± 0.05ᵈ | 4 |
| GLP | 19 ± 1ᶜ | | 8 |
| SUV39H2 | >10,000ᶜ | | NT |
| SUV39H1 | >10,000ᵉ | | NT |
| SETD7 | >10,000ᶜ | | NT |
| SMYD3 | NT | | ND |
| MLL | >10,000ᵉ | | NT |
| EZH2 | >10,000ᵉ | | NT |
| DOT1L | NT | | ND |
| SETD8 | >10,000ᶜ | | ND |
| PRDM1 | NT | | ND |
| PRDM10 | NT | | ND |
| PRDM12 | NT | | ND |
| PRMT1 | >10,000ᵉ | | NT |
| PRMT3 | >10,000ᶜ | | ND |
| HTATIP | NT | | ND |
| JMJD2E | 4,500 ± 1,100ᶠ | | NT |
| DNMT1 | 107,000 ± 6,000ᵍ | | NT |

ᵃIC50 or Ki values are the average of at least two separate experiments. ᵇResults from single DSF or differential static light scattering (DSLS) assay at 100 μM. ᶜSAHH-coupled assay results. ᵈMCE assay results. ᵉAssay results from BPS Bioscience. ᶠAlphaScreen assay results. ᵍRadioactive methyl transfer assay results. NT, not tested. ND, not detected.

RESULTS
Discovery of UNC0638

Previously, initial inhibitor design and synthesis based on the X-ray cocrystal structures of the GLP–BIX01294 (PDB 3FPD)34 and G9a–UNC0224 (PDB 3K5K)31 complexes led us to discover UNC0321, a potent and selective inhibitor of G9a and GLP32 (Scheme 1). However, UNC0321 was less potent in cellular assays than BIX01294 (Supplementary Fig. 1), even though it was more potent than BIX01294 in biochemical assays. We hypothesized that the poor cellular potency of UNC0321 was probably due to poor cell membrane permeability. Here, to improve the cellular potency of this series of compounds, we exploited the SAR of the quinazoline scaffold discovered previously31,32 and designed several generations of new analogs aimed at increasing lipophilicity while maintaining high in vitro potency. Among the newly synthesized compounds, UNC0638 (Scheme 1), which has balanced in vitro potency and physicochemical properties aiding cell penetration, showed high potency in cellular assays and was considerably less toxic to cells than BIX01294 (see below). UNC0638 was efficiently synthesized via a novel seven-step synthetic sequence (Supplementary Scheme 1). In contrast to our previous synthetic route to UNC0321 (ref. 32), this new synthesis avoided the Mitsunobu reaction as the last synthetic step and thus greatly facilitated purification of the final compounds. In addition, we designed and synthesized UNC0737 (4) (Scheme 1), the N-methyl analog of UNC0638, as a structurally similar but less potent G9a and GLP inhibitor for use as a negative control. UNC0737 was designed to eliminate the hydrogen bond interaction seen in the G9a–UNC0224 cocrystal structure between Asp1083 of G9a and the secondary amino group at the 4-position of UNC0224's quinazoline


ring31. Indeed, UNC0737 was >300-fold less potent than UNC0638 in G9a and GLP biochemical assays (see below).

UNC0638 is a potent and substrate-competitive inhibitor

The inhibitory effect of UNC0638 on G9a and GLP activity was first evaluated using the fluorescence-based S-adenosyl-L-homocysteine hydrolase (SAHH)-coupled assay, which monitors the conversion of the cofactor, SAM, to the cofactor product, S-adenosyl-L-homocysteine (SAH)35. UNC0638 was a potent G9a (IC50 < 15 nM (n = 4)) and GLP inhibitor (IC50 = 19 ± 1 nM (n = 2)) in these SAHH-coupled assays (Table 1). An endoproteinase-coupled microfluidic capillary electrophoresis (MCE) assay36, which is orthogonal and complementary to the SAHH-coupled assay, was also used to evaluate G9a inhibition by UNC0638, yielding an IC50 < 10 nM (n = 3). In addition, UNC0638 displaced a fluorescein-labeled 15-mer H3 peptide (residues 1–15) with high efficiency in a G9a fluorescence-polarization assay, suggesting that UNC0638 binds in the substrate peptide-binding site of G9a (Supplementary Fig. 2). UNC0638 also stabilized G9a and GLP in differential scanning fluorimetry (DSF) experiments, with Tm shifts of 4 °C and 8 °C, respectively, consistent with high-affinity binding (Supplementary Fig. 3). We next determined detailed mechanism-of-action and Michaelis–Menten kinetic parameters associated with both the peptide and SAM as a function of UNC0638 concentration (Fig. 1a–d). These experiments confirmed that UNC0638 was competitive with the peptide substrate, as the Km of the peptide (Kmpeptide) increased linearly with UNC0638 concentration (Fig. 1b), and noncompetitive with cofactor SAM, as the apparent Km of SAM (Kmapp) remained constant in the presence of increasing concentrations of the compound (Fig. 1d). The Ki of UNC0638 was determined to be 3.0 ± 0.05 nM (n = 2). Consistent with this, the Morrison Ki (ref. 37) for UNC0638 was 3.7 ± 0.2 nM (n = 3) (Supplementary Fig. 4). The kinetics of the inhibition of G9a by UNC0638 were also studied using surface plasmon resonance (SPR). UNC0638 bound G9a tightly, with rapid association (ka = 2.12 × 10⁶ M⁻¹ s⁻¹) and dissociation (kd = 5.7 × 10⁻² s⁻¹) rates (Supplementary Fig. 5), consistent with a classic reversible mechanism of inhibition of G9a. The Kd of UNC0638 measured from equilibrium analysis of the Langmuir binding isotherms in the SPR studies was 27 nM, consistent with results from homogeneous assays. As expected, UNC0737, the N-methyl analog of UNC0638, was a poor inhibitor of G9a (IC50 = 5,000 ± 200 nM (n = 2)) and GLP (IC50 > 10,000 nM (n = 2)) in the SAHH-coupled assays (Supplementary Table 2). The combination of the high structural similarity between UNC0737 and UNC0638 and the >300-fold loss of potency in UNC0737 compared to UNC0638 makes UNC0737 an appropriate negative control for use in cellular and functional assays.

The selectivity of UNC0638 over a wide range of epigenetic targets was evaluated (Table 1). Notably, UNC0638 was inactive against other H3K9 (SUV39H1 and SUV39H2), H3K27 (EZH2), H3K4 (SETD7, MLL and SMYD3), H3K79 (DOT1L) and H4K20 (SETD8) methyltransferases, as well as PRDM1, PRDM10 and PRDM12. In addition, UNC0638 was inactive against protein arginine methyltransferases PRMT1 and PRMT3, and HTATIP, a histone acetyltransferase. Of note, UNC0638 had weak but measurable activity against JMJD2E (IC50 = 4,500 ± 1,100 nM (n = 3)), a Jumonji protein demethylase, and the DNA methyltransferase DNMT1 (IC50 = 107,000 ± 6,000 nM (n = 2)). Nevertheless, the selectivity of UNC0638 for G9a and GLP over JMJD2E was >200-fold, and selectivity for G9a and GLP over DNMT1 was >5,000-fold. We also evaluated the selectivity of UNC0638 over a broad range of non-epigenetic targets, including G protein-coupled receptors (GPCRs), ion channels, transporters and kinases (Supplementary Tables 3 and 4). UNC0638 was clean (<30% inhibition at 1 μM)

[Figure 1 panels: (a) Vobs (μM/min) versus H3K9 peptide (μM) and (c) Vobs (μM/min) versus SAM (μM), each at UNC0638 concentrations from 0 to 13.3 nM; (b) Kmpeptide (μM) versus UNC0638 (μM), with Ki = 3.0 ± 0.05 nM; (d) Kmapp (μM) versus UNC0638 (μM); (e,f) structure views labeling Asp1078, Asp1083, Leu1086, Asp1088 and Tyr1154, with side-chain hydrogen bonding marked.]
Figure 1 | UNC0638 competes with the peptide substrate but not with the cofactor SAM. We determined the velocity of the reaction by measuring the conversion of substrate to product at six time points spanning 100 min, analyzed these data by linear regression to determine initial steady-state enzyme velocity and fitted them to Michaelis–Menten kinetics. The Km values of the peptide and SAM were then plotted as a function of UNC0638 concentrations. (a,b) UNC0638 is competitive with the H3K9 peptide substrate, as Kmpeptide increases linearly with compound concentration. (c,d) UNC0638 does not compete with the cofactor SAM, as Kmapp was not affected by the compound. (e,f) The X-ray cocrystal structure of the G9a–UNC0638–SAH complex confirms the mechanism of action of UNC0638. UNC0638 (in gray, blue and red sticks) occupies the peptide binding groove and does not interact with the SAM binding pocket. The 7-(3-pyrrolidin-1-yl-)propoxy side chain of UNC0638 interacts with the lysine binding channel.
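The linear rise of Kmpeptide with inhibitor concentration is the textbook signature of competitive inhibition, where the apparent Km equals Km(1 + [I]/Ki). The sketch below illustrates that relationship and also forms the ratio kd/ka from the SPR rate constants. It is a hedged illustration only: the uninhibited peptide Km of 10 μM is a hypothetical value chosen for the example, and the rate constants are read here as ka = 2.12 × 10⁶ M⁻¹ s⁻¹ and kd = 5.7 × 10⁻² s⁻¹, which is an assumed reading of the exponents.

```python
def apparent_km_competitive(km: float, inhibitor: float, ki: float) -> float:
    """Apparent Km under competitive inhibition: Km * (1 + [I] / Ki).

    `inhibitor` and `ki` must share the same concentration unit.
    """
    if ki <= 0:
        raise ValueError("ki must be positive")
    return km * (1.0 + inhibitor / ki)


def kd_from_rate_constants(k_on: float, k_off: float) -> float:
    """Dissociation constant in M from k_off (1/s) and k_on (1/(M*s))."""
    if k_on <= 0:
        raise ValueError("k_on must be positive")
    return k_off / k_on


KI_NM = 3.0            # Ki of UNC0638 reported in the text
KM_PEPTIDE_UM = 10.0   # hypothetical uninhibited Km, for illustration only

for inhibitor_nm in (0.0, 3.0, 6.0, 12.0):
    km_app = apparent_km_competitive(KM_PEPTIDE_UM, inhibitor_nm, KI_NM)
    print(f"[I] = {inhibitor_nm:4.1f} nM -> apparent Km = {km_app:.1f} uM")

K_ON = 2.12e6   # assumed units: 1/(M*s)
K_OFF = 5.7e-2  # assumed units: 1/s
kd_nm = kd_from_rate_constants(K_ON, K_OFF) * 1e9
print(f"kd/ka = {kd_nm:.1f} nM")
```

Under these assumptions the apparent Km doubles when the inhibitor concentration equals Ki, and the ratio kd/ka comes out near 27 nM, in line with the equilibrium Kd quoted in the text.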

UNC0638 is a selective inhibitor of G9a and GLP

against 26 out of 29 targets in the Ricerca Selectivity Panel. At 1 μM concentration, UNC0638 showed 64%, 90% and 69% inhibition of muscarinic M2, adrenergic α1A and adrenergic α1B receptors, respectively. Because UNC0638 hit three GPCRs in the Ricerca Selectivity Panel, we further assessed its selectivity against GPCRs by testing UNC0638 in the US National Institute of Mental Health's Psychoactive Drug Screening Program Selectivity Panel, which consists of a total of 45 targets, including 36 GPCRs. UNC0638 had <50% inhibition at 1 μM against 39 targets in the panel, and >50% inhibition at 1 μM against 6 targets in the panel. Ki in the radioligand binding assay for each of the six interacting GPCRs was subsequently determined. The Ki measurements showed UNC0638 was at least 100-fold selective for G9a over these six GPCRs. In M1, M2 and M4 functional assays, UNC0638 had no agonist activity, low antagonist potency against M1 and M4 (IC50 > 10,000 nM (n = 2)), and modest antagonist potency against M2 (IC50 = 480 ± 10 nM (n = 2)). Furthermore, UNC0638 was tested against a panel of 24 kinases and showed <10% inhibition at 1 μM against these kinases. Therefore, we conclude that when used at appropriate concentrations (for example, <500 nM), the effects of UNC0638 on histone or other lysine methylation substrates can be interpreted as primarily due to the inhibition of G9a and GLP. The selectivity of UNC0737 is summarized in Supplementary Table 2. Like UNC0638, UNC0737 was inactive against SUV39H2, SETD7, SETD8 and PRMT3, had a binding-affinity range of 60 to



[Figure 2 panels: (a) in-cell western images of H3K9me2 and cell viability for UNC0638 (0 to 5,000 nM), no-antibody controls, and control, G9a and GLP shRNAs; (b) percentage H3K9me2 versus compound (nM) for BIX01294, UNC0638 and UNC0737, with the G9a and GLP shRNA level marked; (c) percentage H3K9me2 versus compound (80, 250 and 500 nM) at 1d, 2d, 3d, 4d, 2+2d and 2+2dw; (d) percentage MTT versus compound (nM) for BIX01294, UNC0638 and UNC0737; (e,f) percentage response (H3K9me2 and MTT) versus compound (nM) for UNC0638 (e) and BIX01294 (f).]

Figure 2 | UNC0638 inhibits cellular H3K9 dimethylation and shows good separation of functional potency and toxicity in MDA-MB-231 cells. (a) UNC0638 (48 h) or G9a and/or GLP shRNAs reduced H3K9 dimethylation levels. H3K9me2 antibody was used for cell immunostaining (in-cell western) and results were normalized to cell number measured by uptake of the nucleic acid dye DRAQ5. (b) UNC0638 was considerably more potent than BIX01294 and UNC0737 (negative control) in reducing cellular H3K9me2 levels, which were measured after MDA-MB-231 cells were treated with inhibitors for 48 h. Dashed line indicates level of H3K9me2 resulting from G9a and GLP knockdown. (c) Cellular levels of H3K9me2 were progressively reduced from 1 d to 4 d exposure to UNC0638 at three concentrations (80 nM, representing IC50; 250 nM, representing IC90; and 500 nM, representing 2 × IC90). The reductions with 250-nM and 500-nM treatments after 4 d were equal or very close to that of G9a and GLP knockdown cells. Refreshing the inhibitor after 2 d (2+2d) increased inhibition by UNC0638 at 80 nM but had little further effect at 250 and 500 nM. The effects of UNC0638 were long-lasting. In cells with 2 d exposure to UNC0638, levels of H3K9me2 remained low after washout of compound followed by 2 d incubation without the inhibitor (2+2dw). (d) UNC0638 and UNC0737 had lower cellular toxicity than BIX01294 in MTT (3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyl tetrazolium bromide) assays. (e,f) UNC0638 had good separation of functional potency (decrease in H3K9me2 levels) and toxicity (from the MTT assay), whereas BIX01294 had poor separation of these effects.

1,000 nM (Ki values) for α1A, α2C, M1, M2 and M4 receptors, had low to modest potency (IC50 800 to >10,000 nM) in M1, M2 and M4 functional assays and was inactive against a panel of 24 kinases. The similar molecular profiles of UNC0737 and UNC0638 against epigenetic targets (except G9a and GLP) and non-epigenetic targets make UNC0737 an appropriate negative control in terms of selectivity.
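The fold-selectivity figures quoted for UNC0638 against JMJD2E and DNMT1 can be checked with simple division. The sketch below is an illustration under one stated assumption: it uses the GLP IC50 of 19 nM as the on-target reference, because the G9a value is reported only as an upper bound (<15 nM), so the resulting folds are conservative lower bounds for G9a.

```python
ON_TARGET_IC50_NM = 19.0  # GLP IC50; G9a is reported only as <15 nM

# Off-target IC50 values (nM) quoted in the text.
OFF_TARGET_IC50_NM = {
    "JMJD2E": 4500.0,
    "DNMT1": 107000.0,
}


def fold_selectivity(off_target_ic50: float, on_target_ic50: float) -> float:
    """Ratio of off-target to on-target IC50; larger means more selective."""
    if on_target_ic50 <= 0:
        raise ValueError("on_target_ic50 must be positive")
    return off_target_ic50 / on_target_ic50


for target, ic50 in OFF_TARGET_IC50_NM.items():
    fold = fold_selectivity(ic50, ON_TARGET_IC50_NM)
    print(f"{target}: about {fold:.0f}-fold")
```

Both ratios clear the thresholds stated in the text (>200-fold for JMJD2E and >5,000-fold for DNMT1).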

Crystal structure of the G9a–UNC0638–SAH complex

An X-ray crystal structure of the G9a–UNC0638–SAH complex (2.56-Å resolution; Fig. 1e,f and Supplementary Table 1) provides structural insight into the mechanism of action. First, UNC0638 occupies the substrate binding groove and does not interact with the SAM binding pocket. This finding is consistent with the results from the inhibitor-peptide-SAM competition experiments. Second, the hydrogen of the secondary amino group at the 4-position of the quinazoline ring indeed forms a hydrogen bond with Asp1083, explaining the marked potency loss of UNC0737 compared to UNC0638. Finally, the lysine binding channel is occupied by the 7-(3-pyrrolidin-1-yl-)propoxy side chain. Compared to the X-ray crystal structures of the GLP–BIX01294 (PDB 3FPD)34 and G9a–UNC0224 (PDB 3K5K)31 complexes, the same binding mode was observed for UNC0638 (Supplementary Fig. 6).

UNC0638 is stable under cellular assay conditions

1H NMR and LC-MS analysis of a solution of UNC0638 (10 mM) in deuterated DMSO and deuterated H2O (90:10 ratio) that had been kept at room temperature for 4 weeks indicated that UNC0638 was stable under these conditions; no degradation products were found. Incubation of UNC0638 with or without MCF7, U2OS or H1299 cells in two types of cell media for 65 h also did not produce degradation products of UNC0638. In mouse drug metabolism and pharmacokinetic studies, UNC0638 had high clearance, short half-life, high volume of distribution and low exposure after intravenous, oral or intraperitoneal administration (Supplementary Table 5). Thus, although UNC0638 is probably not suitable for in vivo animal studies owing to low exposure levels, its high stability under cellular assay conditions, in combination with high potency and selectivity, makes UNC0638 an ideal chemical tool for cell-based studies.

UNC0638 has high cellular potency and low toxicity

G9a and GLP are the primary enzymes affecting dimethylation of histone H3K9 in cells15,20. To assess the cellular potency of UNC0638, we used an H3K9me2 antibody cell immunofluorescence or in-cell western assay. This assay allows rapid processing of multiple samples for H3K9me2 immunofluorescence signal (Fig. 2a, green signal) and normalization to cell number via the use of the nucleic acid dye DRAQ5 (Fig. 2a, red signal). We verified the specificity of the antibody by comparison of dose-dependent dot blots and by the reduced cellular immunofluorescence signal in G9a and GLP knockdown experiments (Fig. 2a). Initially, the data were normalized to total H3 levels (Supplementary Fig. 7); however, this was found to be consistent with the DRAQ5 normalization, and subsequently the latter was used. We characterized UNC0638 and UNC0737 in MDA-MB-231 cells because of their robust H3K9me2 levels and good tolerance of G9a and GLP knockdown. In MDA-MB-231 and MCF7 cells, treatment with several short hairpin RNAs (shRNAs) reduced G9a and GLP to 25–40% of the levels in control experiments, and also yielded consistently lower levels of H3K9me2 (Supplementary Fig. 8). In MDA-MB-231 cells, UNC0638 (48 h exposure) reduced H3K9me2 levels in a concentration-dependent manner with an IC50
[Figure 3 spectra: relative abundance versus m/z for H3K9me2 (a,b; m/z 520 to 528) and H3K9me3 (c,d; m/z 528 to 535). Traces in a,c: control shRNA, shRNA G9a and shRNA G9a + GLP. Traces in b,d: control, BIX01294 and UNC0638.]

Figure 3 | Quantitative MS analysis of histone post-translational modifications in MDA-MB-231 cells. (a–d) MS of doubly charged peptides (KSTGGKAPR) corresponding to H3K9me2 (a,b) and H3K9me3 (c,d). In a,c, cells were treated with indicated shRNAs (control, D0 propionyl labeled; G9a, D0,D5 propionyl labeled, ~2.5 m/z heavier than the control; G9a + GLP, D5,D5 propionyl labeled, ~5 m/z heavier than the control). In b,d, cells were mock-treated (control, D0 propionyl labeled) or treated with 1 μM of BIX01294 (D0,D5 propionyl labeled, ~2.5 m/z heavier than the control) or UNC0638 (D5,D5 propionyl labeled, ~5 m/z heavier than the control) for 48 h.

of 81 9 nM (n = 3), which indicates considerably higher potency than BIX01294 (IC50 = 500 43 nM (n = 3)) (Fig. 2b). The maximum effect of UNC0638 in reducing H3K9me2 levels was greater than that of BIX01294 and close, but not equal, to that of the double knockdown of G9a and GLP via shRNA (Fig. 2b). Consistent with its poor in vitro potency, UNC0737 (negative control) showed poor cellular potency in the in-cell western (IC50 > 5,000 nM (n = 3); Fig. 2b) and chromatin immunoprecipitation (ChIP) assays (Supplementary Fig. 9). We next studied the time course of effects of UNC0638 on cellular levels of H3K9me2. Because H3K9me2 has a half-life of about 1 d (ref. 38), we hypothesized that exposure beyond 48 h might result in even greater reduction of the mark. We found that H3K9me2 levels in MDA-MB-231 cells gradually decreased over the course of treatment (Fig. 2c). After 4 d, the cellular H3K9me2 levels under treatment with 250 or 500 nM of UNC0638 were equal or very close to those of G9a and GLP knockdown cells. At both UNC0638 concentrations, changing the cell medium after 2 d (denoted 2+2d in Fig. 2c) had little effect compared with not changing the medium (denoted 4d in Fig. 2c). Notably, reduced cellular levels of H3K9me2 were still observed at the 4-d time point after cells were exposed to UNC0638 for 2 d, followed by washout of the compound and another 2 d of culture without the inhibitor (denoted 2+2dw in Fig. 2c). The level of H3K9me2 at day 2+2dw was inversely proportional to the original dosage of UNC0638, suggesting that residual amounts of UNC0638 remain in the cells and can have a lasting effect. Inhibitor treatment did not affect the protein levels of G9a or GLP (Supplementary Fig. 10) or the mRNA levels of G9a (Supplementary Fig. 11), indicating the observed effects were due to inhibition of the enzymatic function of the proteins and not to changes in protein abundance. 
One of the desirable characteristics of a good chemical probe is low toxicity due to off-target effects. Both UNC0638 (EC50 = 11,000 ± 710 nM (n = 3)) and UNC0737 (EC50 = 8,700 ± 790 nM (n = 3)) were considerably less toxic than BIX01294 (EC50 = 2,700 ± 76 nM (n = 3)) in an MTT assay (Fig. 2d). Notably, UNC0737 had cellular toxicity similar to that of UNC0638, suggesting that the observed cellular toxicity is probably not due to inhibition of G9a and GLP in this cell type. Thus, the toxicity/function ratio of UNC0638 was 138, whereas the same ratio for BIX01294 was <6 (Fig. 2e,f). This much improved toxicity/function ratio enables UNC0638 to be used in cell-based model systems over a range of concentrations without interference from cellular toxicity. Furthermore, we have evaluated the cellular potency and toxicity of UNC0638 in other tumor and normal cell lines. UNC0638 had high potency, ranging from 48 to 238 nM, in reducing H3K9me2 levels in breast, prostate, colon carcinomas and normal fibroblast cells, with the two prostate carcinoma cell lines, PC3 (IC50 = 59 nM) and 22RV1 (IC50 = 48 nM), being the most sensitive (Supplementary Table 6). The EC50 for the cellular toxicity of UNC0638 (from MTT assays) in these tumor and normal cell lines was considerably higher than the corresponding IC50 for the functional effects. The toxicity/function ratio of UNC0638 in these cell lines varied by up to ten-fold (19 for IMR90 compared with 233 for PC3), but was well above the value of 6 observed for BIX01294 in MDA-MB-231 cells. These results again support the conclusion that UNC0638 is suitable as a chemical probe of G9a and GLP in a broad range of cell types without interference from potential off-target toxicity. Although UNC0638 is well tolerated by several cell types in terms of general cell viability, we investigated whether it might affect the growth properties of cancer cell lines. At concentrations of UNC0638 that considerably reduce H3K9me2 levels and for which acute off-target toxicity is minimal, we monitored the effect of G9a and GLP inhibition on cell growth using a clonogenicity assay.
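The toxicity/function ratio discussed above is the toxicity EC50 divided by the functional IC50. The sketch below recomputes it from the rounded mean values quoted in the text for MDA-MB-231 cells; the function name is hypothetical. The rounded inputs give about 136 for UNC0638 rather than the reported 138, presumably because the reported ratio was derived from unrounded data (an assumption, not stated in the text).

```python
def toxicity_function_ratio(ec50_toxicity_nm: float, ic50_function_nm: float) -> float:
    """Ratio of the MTT toxicity EC50 to the in-cell western IC50 (both in nM)."""
    return ec50_toxicity_nm / ic50_function_nm


# Rounded mean values quoted in the text for MDA-MB-231 cells.
unc0638 = toxicity_function_ratio(11_000, 81)
bix01294 = toxicity_function_ratio(2_700, 500)

print(f"UNC0638:  {unc0638:.0f}")
print(f"BIX01294: {bix01294:.1f}")
```

The same calculation with the rounded BIX01294 values gives 5.4, in line with the reported ratio of <6.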
There was a marked concentration-dependent reduction of clonogenicity in MCF7 cells upon treatment with UNC0638 or upon G9a or GLP knockdown but much less effect on MDA-MB-231 cells (Supplementary Fig. 12). These data show that inhibition of G9a and GLP can have differential phenotypic effects depending on the cell type, possibly

NATURE CHEMICAL BIOLOGY | VOL 7 | AUGUST 2011 | www.nature.com/naturechemicalbiology NATURE REPRINT COLLECTION Epigenetics

NATURE CHEMICAL BIOLOGY DOI: 10.1038/NCHEMBIO.599


[Figure 4 panels: (a) chr. 3, 17,300,000–17,700,000 (scale bar, 200 kb), control versus UNC0638 320 nM; (b) chr. X, MAGEA1 locus, 152,139,000–152,143,000 (scale bar, 2 kb), control versus UNC0638 320 nM, with RefSeq genes track; (c) y-axis, excess of DNA hypomethylated probes (0–4,000), chromosome 3.]

Figure 4 | Effects of UNC0638 on H3K9me2 and DNA methylation. (a) Example of a genomic region (3p24.3) showing reductions in H3K9me2 after UNC0638 exposure (P = 2.0 × 10⁻¹⁰). Light blue bars show the log ratios of anti-H3K9me2 to IgG in control (top) and treated (bottom) samples. (b) Administration of UNC0638 decreased H3K9me2 in the MAGEA1 promoter. Log ratios are shown as in a. The promoter was defined to be 4 kilobase pairs upstream and 0.5 kilobase pairs downstream of the transcription start site. (c) UNC0638 did not change DNA methylation levels. MCF7 cells were exposed to either UNC0638 (at 70 or 320 nM), 5-azacytidine (DAC) or control. y-axis scale is the excess of the number of significantly hypomethylated probes over the number of hypermethylated probes on chromosome 3. Significance cutoff was set at P = 10⁻³ for two-sample t-test between treated and control log intensities.
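The promoter definition in the Figure 4 legend (4 kb upstream and 0.5 kb downstream of the transcription start site) can be expressed as a small helper. This is a minimal sketch: the function name, the strand handling and the example coordinate are assumptions for illustration and are not taken from the paper's analysis code.

```python
from typing import Tuple


def promoter_window(tss: int, strand: str,
                    upstream: int = 4000, downstream: int = 500) -> Tuple[int, int]:
    """Return (start, end) of the promoter window around a transcription start site.

    'Upstream' is interpreted relative to the direction of transcription,
    so the window is mirrored for genes on the minus strand (an assumption).
    """
    if strand == "+":
        return tss - upstream, tss + downstream
    if strand == "-":
        return tss - downstream, tss + upstream
    raise ValueError("strand must be '+' or '-'")


# Hypothetical TSS coordinate, for illustration only.
print(promoter_window(100_000, "+"))
print(promoter_window(100_000, "-"))
```

Probes whose coordinates fall inside the returned window would then be assigned to that promoter before comparing log ratios between control and treated samples.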

related to differences in epigenetic state or p53 status (MCF7 cells have functional p53, whereas MDA-MB-231 cells do not).

MS analysis confirms that UNC0638 reduces H3K9me2 levels

To confirm the effect of UNC0638 on cellular levels of H3K9me2 and assess the potential effect on other histone post-translational modification marks, we analyzed acid-extracted histones from MDA-MB-231 cells treated with UNC0638 using quantitative MS-based proteomics as previously described39. After treatment of MDA-MB-231 cells with UNC0638 (1 µM for 48 h), the levels of H3K9me2 were considerably lower, similar to those observed with shRNA double knockdown of G9a and GLP (Fig. 3a,b). BIX01294 (1 µM for 48 h) reduced the cellular levels of H3K9me2 to a lesser extent than UNC0638 did. These results are consistent with the findings from the in-cell western assay. We note that the levels of unmodified H3K9 were higher upon treatment with UNC0638, consistent with decreased modification by G9a and GLP (Supplementary Fig. 13). In contrast, the levels of H3K9me3 remained constant with all treatments (Fig. 3c,d), suggesting that, at least in these cells, trimethylation of H3K9 is not dependent on prior dimethylation of H3K9 by G9a and/or GLP. We also analyzed other well-known histone marks after treatment with UNC0638, BIX01294 and shRNAs targeting G9a and GLP. With the exception of acetylation of histone H3 Lys14 (H3K14ac), no changes in abundance were observed for 21 different modification states of H3 and H4 (Supplementary Table 7). Notably, the levels of H3K14ac doubled both with UNC0638 treatment and with G9a and GLP knockdown, suggesting a possible link or cross-talk between H3K9me2 and H3K14ac (Supplementary Fig. 13). This result is consistent with a previous finding in HEK293 cells in which G9a and GLP were knocked down via siRNA39.

Genomic profiling of UNC0638-modulated H3K9me2 levels

To better understand how UNC0638 might regulate specific genes, we investigated the H3K9me2 levels at genomic loci along chromosomes 3 and X (chr3 and chrX). In chromatin immunoprecipitation on chip (ChIP-chip) experiments using a selective H3K9me2 antibody, MCF7 cells treated with UNC0638 at 320 nM (the IC90 for H3K9me2 inhibition) for 14 d had significantly fewer genomic regions containing H3K9me2 on chr3 and chrX (P < 2.2 × 10⁻¹⁶; Fig. 4a and Supplementary Fig. 14). Lower levels of H3K9me2 were observed in the MAGEA1 promoter in our ChIP-chip study (P = 4.3 × 10⁻³; Fig. 4b), and this was confirmed in an independent ChIP-quantitative PCR (ChIP-qPCR) analysis. In agreement with previously reported data29, our ChIP-chip and ChIP-qPCR data show significant, concentration-dependent reductions of H3K9me2 levels at the TBC1D5 and MAGEA2 promoters (P = 2.0 × 10⁻¹⁰ and 2.6 × 10⁻³, respectively) but not at the MAGEB4 promoter (P = 0.07) after exposure to UNC0638 (Supplementary Fig. 15a–c). Thus, UNC0638 shows robust on-target modulation of H3K9me2 levels, consistent with its activity as a selective G9a and GLP inhibitor. H3K9me2 is modestly correlated with euchromatic silenced genes40,41; however, it has also been reported at active genes42. Notably, when we examined the genes with the greatest reduction of H3K9me2 levels within their gene bodies, we found an enrichment of miRNA genes: 25% of the top 100 most affected genes encoded miRNAs, whereas only 5% of the probed genes on chr3 and chrX were miRNA genes (Supplementary Table 8). Furthermore, under the conditions of these experiments (treatment for 14 d at 70 or 320 nM), we found that the total levels of DNA methylation were not altered on chr3 and chrX in UNC0638-treated MCF7 cells compared to control-treated cells (Fig. 4c and Supplementary Fig. 15d). G9a inactivation has previously been shown to be ineffective in altering global DNA methylation in human cancer cell lines (in contrast to mES cells)24. We note that although our result supports the conclusion that inhibition of G9a catalytic activity does not produce global changes in genomic DNA methylation, it does not exclude the possibility of small, targeted changes below the resolution of these experiments. Taken together, these results further support the value of UNC0638 as a tool for investigating the effects of specific and global changes to H3K9me2 levels in human cells.

UNC0638 reactivates silenced gene expression in mES cells

Embryonic stem cells are unique in their ability to efficiently silence retroviruses through epigenetic mechanisms including H3K9 dimethylation43. To investigate the ability of UNC0638 to reactivate silent retrovirus vectors, we first determined the cellular potency of UNC0638 and BIX01294 in J1 mES cells. Consistent with the above results, UNC0638 showed greater cellular potency than BIX01294 (at 48 h, IC50 = 138 and 2,041 nM, respectively; Supplementary Fig. 16a). To establish retrovirus silencing, we infected J1 mES cells with an HSC1-EF1-EGFP-Puromycin retrovirus and selected for transduced cells with a short puromycin treatment. We observed the initial 100% EGFP+ cell population diminish to 30–36% EGFP+ cells as retrovirus silencing was gradually established over 150 d of extended culture. To investigate the ability of the probe compounds to reactivate silent retrovirus vectors, we followed EGFP expression by flow cytometry after treatment with UNC0638 (100, 250 or 500 nM), BIX01294 (2 µM) or UNC0737 (500 nM, as a negative control). Whereas UNC0737 did not reactivate EGFP expression above the 36% EGFP+ cells seen in the untreated sample, UNC0638 reactivated EGFP expression in a concentration-dependent manner to a maximal level of 63% EGFP+ cells at day 10 (Fig. 5a and Supplementary Fig. 16c). BIX01294 reactivated expression reaching the level of 53% EGFP+ cells at day 10, an expression level exceeded when cells were treated with UNC0638 at 250 nM, one-eighth the concentration of BIX01294. Moreover, we observed cell morphology changes under BIX01294 treatment, suggesting that this inhibitor may

[Figure 5 panels: (a) percentage of EGFP+ cells (20–80%) versus time (days 7–15) for untreated cells and cells treated with UNC0638 (100, 250 or 500 nM), UNC0737 (500 nM) or BIX01294 (2 µM); (b) relative expression of MAGEA2 and DUB1 (log scale, 0.1–10,000) for untreated cells and cells treated with UNC0737 (0.5 µM), UNC0638 (0.1, 0.25 or 0.5 µM) or BIX01294 (2 µM); (c) percentage of input at GFP, IgG versus H3K9me2; (d) percentage of input at MAGEA2, IgG versus H3K9me2.]

Figure 5 | UNC0638 reactivates a silent EGFP retrovirus vector and G9a-regulated endogenous genes in mES cells. (a) Time course of EGFP retrovirus activation during indicated treatments in J1 mES cells infected by HSC1-EF1-EGFP-Puromycin retroviral vector. Plotted percentage of EGFP expression is the mean of the percentage of EGFP+ cells in three independent experiments. Error bars, s.d. (b) Analysis of mRNA levels of two G9a-regulated endogenous genes (MAGEA2 and DUB1) in J1 mES cells treated for 10 d with indicated compounds. The graph shows the normalized expression of MAGEA2 and DUB1 relative to the mRNA level detected in untreated cells (ΔΔCt). (c,d) ChIP analysis of H3K9me2 enrichment at the EGFP gene (c) and MAGEA2 promoter (d) in cells treated with UNC0638 (500 nM) for 3 or 7 d. (e) Analysis of DNA methylation in the long terminal repeat (LTR) of the HSC1-EF1-EGFP-Puromycin retroviral vector and in the MAGEA2 promoter after 10 d of treatment.
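The relative expression values in panel b appear to be derived by the comparative Ct (ΔΔCt) method. The sketch below shows that calculation under stated assumptions: a reference gene is used for normalization, amplification efficiency is taken as exactly two-fold per cycle, and all Ct values are hypothetical numbers chosen for illustration rather than data from the paper.

```python
def relative_expression(ct_target_treated: float, ct_ref_treated: float,
                        ct_target_control: float, ct_ref_control: float) -> float:
    """Fold change of a target gene in treated versus control cells by the
    comparative Ct method, assuming 100% PCR efficiency (2-fold per cycle)."""
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_treated - delta_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target amplifies 6 cycles earlier after treatment,
# while the reference gene is unchanged.
fold_change = relative_expression(24.0, 18.0, 30.0, 18.0)
print(f"fold change: {fold_change:.0f}")
```

A target that crosses threshold six cycles earlier in treated cells, with an unchanged reference gene, corresponds to a 64-fold increase, which is why such data are usually plotted on a log scale.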


induce cell differentiation. By day 12, BIX01294-treated cells had arrested or died, whereas UNC0638 reactivated EGFP expression in 75% of cells without showing morphological signs of cell differentiation (Fig. 5a). At day 10 of BIX01294 treatment, only 65% of cells were positive for the pluripotency marker SSEA1 (Supplementary Fig. 16b). In contrast, UNC0638 treatment maintained expression of the SSEA1 pluripotency marker: the level of marker in cells treated with 100 nM of UNC0638 (97% SSEA1+ cells) was indistinguishable from that in UNC0737-treated cells or untreated cells. We conclude that inhibition of G9a with UNC0638 functionally reactivates silent retrovirus vectors without promoting differentiation into SSEA1-negative cells and is considerably more potent than BIX01294 treatment. We next tested whether MAGEA2 and DUB1, genes previously shown to be reactivated in G9a-knockout mES cells25,44, could be reactivated with UNC0638 treatment in J1 mES cells. At day 10, DUB1 and MAGEA2 genes were more highly expressed in UNC0638-treated cells than in untreated or UNC0737-treated cells (Fig. 5b). Similar to the results for retroviral vector reactivation, mRNA levels of DUB1 and MAGEA2 genes showed a concentration-dependent increase upon treatment with UNC0638. We note that reactivation of endogenous genes occurred by day 3 (Supplementary Fig. 16d), whereas EGFP retrovirus reactivation was first evident by flow cytometry at day 7 (Fig. 5a). In addition to directly methylating H3K9 (ref. 20), G9a has been reported to indirectly facilitate DNA methylation in mES cells24–26. We first analyzed the presence of H3K9me2 by ChIP on the EGFP provirus and the endogenous MAGEA2 promoter and found that UNC0638 treatment decreased H3K9me2 at both targets by day 3, with a further decrease by day 7 (Fig. 5c,d). To test whether this reduction of H3K9me2 affects DNA methylation, we performed bisulfite sequencing on the retrovirus long terminal repeat and MAGEA2 promoter after 10 d of treatment. Both regions were hypermethylated in untreated or UNC0737-treated mES cells. In contrast, UNC0638 treatment induced DNA hypomethylation in a concentration-dependent manner (Fig. 5e). These results suggest that inhibition of gene silencing by UNC0638 primarily drives H3K9me2 loss to reactivate gene expression and facilitates DNA hypomethylation in mES cells.

DISCUSSION

Protein lysine methyltransferase G9a has been implicated in various human diseases including leukemia8, prostate cancer8,16, liver cancer17, lung cancer18, drug addiction21, mental retardation22 and maintenance of HIV-1 latency23. Given the broad areas of biology in which G9a and GLP have a role, a high-quality chemical probe of these two PKMTs would be very valuable for dissecting the molecular mechanism(s) of these activities, the cell types in which they are relevant and which diseases (if any) would benefit from their inhibition. Here we report the discovery and characterization of UNC0638, which has all the properties of a high-quality chemical probe30. (i) UNC0638 is a potent, substrate-competitive inhibitor of G9a (IC50 < 15 nM, Ki = 3 nM) and the closely related GLP (IC50 = 19 nM). (ii) It is selective for G9a and GLP over a wide range of epigenetic and non-epigenetic targets. (iii) It is highly active in cells: at 250 nM concentration, it reduces the levels of H3K9me2 by ~60–80% in a variety of cell lines, similar to the reductions seen for shRNA knockdown of G9a and GLP, and modulates expression of known G9a-regulated genes. (iv) UNC0737, an N-methyl derivative of UNC0638, is >300-fold less potent against G9a and GLP, with similar selectivity and cellular toxicity compared to UNC0638, and therefore is a useful negative control. (v) UNC0638 has low cellular toxicity in seven cell lines tested at functional doses. Notably, the greatly improved cellular toxicity/function ratio relative to the previously available probe, BIX01294, makes UNC0638 much more versatile as a chemical probe. (vi) Finally, a useful chemical probe must be available to the biological research community. As such, we have made UNC0638 available through a commercial vendor (Sigma-Aldrich).


Our proteomics and immunofluorescence data show that pharmacologic inhibition of G9a and GLP with UNC0638 leads to global reductions in H3K9me2 levels, comparable to those achieved with shRNAs. However, the genomic regions specifically marked with H3K9me2 vary with cell type and possibly with disease state19–21,23,45. Similarly, the cellular levels of G9a and GLP also vary with cell type and disease state8. Indeed, we observed considerable variation in the global concentration-response behavior of UNC0638 in six cancer cell lines and human fibroblasts. We also observed a marked growth-inhibitory effect in MCF7 cells but not MDA-MB-231 cells at moderate UNC0638 concentrations under which H3K9me2 levels are fully suppressed. Given the association of G9a and GLP with DNMTs46,47 and repression of tumor suppressor p53 (refs. 8,48), there may be specific epigenetic cellular states (in particular, H3K9me2-mediated repression of a specific set of genes and/or p53 status) in which cells are selectively vulnerable to G9a and GLP inhibition. UNC0638 will be a useful tool to search for such types of cancer or other disease-related epigenetic states. BIX01294 is likely to be less effective in such studies owing to off-target toxic effects encountered at or near its required concentration for full H3K9me2 suppression. It is notable that most studies of G9a and GLP to date have used knockout or knockdown of G9a and/or GLP, whereas UNC0638 inhibits only the enzymatic function of G9a and GLP, and does not affect the protein and mRNA levels, thereby preserving a potential scaffolding function in the many protein complexes reported for G9a and/or GLP20,46–49. For example, it has been shown that catalytic activity of G9a or GLP is not required for all of its function25,26. This may explain the milder phenotype of UNC0638 compared to knockdowns of G9a and GLP, and it suggests that UNC0638 can be used to separate enzymatic from non-enzymatic functions of these proteins.
We also show that UNC0638 can reactivate endogenous genes and silenced retroviral reporters in mES cells, further implicating H3K9me2-mediated repression in these processes. Retroviral silencing is a reliable criterion for identification of fully reprogrammed cells and is a good indicator of pluripotency50. UNC0638 reduced H3K9me2 on endogenous genes and retrovirus vectors within 3 d, and DNA hypomethylation was observed by day 10, when the cells had already reactivated expression. Together, these results suggest that a cascade of events is involved in the reactivation of silenced genes, and concentration-dependent inhibition of G9a by UNC0638 drives this process. Therefore, UNC0638 is a potent chemical tool for modulating G9a-related activities in cells to alter their expression profiles and epigenetic landscapes, to assist in manipulating their cell identity and phenotype, and to decipher the timing and inter-relationship of H3K9me2 and DNA methylation in gene silencing.
METHODS

Biochemical assays. SAHH-coupled assay, peptide displacement, DSF and DSLS assay protocols are described in Supplementary Methods. Morrison Ki and IC50 in G9a MCE assay were determined as described36. Details of other selectivity assays are included in Supplementary Methods.

Cell immunostaining (in-cell western). We added 2% (w/v) formaldehyde in PBS to fix cells for 15 min. After five washes with 0.1% (v/v) Triton X-100 in PBS, cells were blocked for 1 h with 1% (w/v) BSA in PBS. Three of four replicates were exposed to primary H3K9me2 antibody, Abcam no. 1220 at 1:800 dilution in 1% BSA and PBS for 2 h. One replicate was reserved as a background control. The wells were washed five times with 0.1% (v/v) Tween 20 in PBS, then secondary IR800-conjugated antibody (LiCor) and a nucleic acid-intercalating dye, DRAQ5 (LiCor), were added for 1 h. After five washes with 0.1% Tween 20 in PBS, the plates were read on an Odyssey (LiCor) scanner at 800 nm (H3K9me2 signal) and 700 nm (DRAQ5 signal). Fluorescence intensity was quantified, normalized to the background and then to the DRAQ5 signal, and expressed as a percentage of control.

Quantitative MS analysis of histones. Histones from MDA-MB-231 cells were extracted using standard acid procedures and analyzed after chemical derivatization by propionic anhydride and trypsin as described39. Digested histone samples were analyzed by LC-MS/MS on an Orbitrap mass spectrometer as described39. All data were manually verified.

ChIP-chip. ChIP samples were amplified for arrays using a whole-genome amplification (WGA) method (reference is provided in Supplementary Methods). In WGA, DNA fragments are primed to generate a library of DNA fragments with a common end sequence. The library then is replicated using linear, isothermal amplification, followed by a limited round of geometric PCR amplifications. The GenomePlex Complete WGA kit (Sigma) was used for library preparation. ChIP samples concentrated to 10 µl were mixed with 2 µl library-preparation buffer and then with 1 µl library-stabilization solution. Samples were incubated at 95 °C for 2 min and afterwards immediately cooled on ice. Each sample was mixed with 1 µl of library-preparation enzyme and incubated as follows: 16 °C for 20 min, 24 °C for 20 min, 37 °C for 20 min, 75 °C for 5 min, 4 °C hold. Amplification of the samples was completed with the GenomePlex WGA kit (Sigma). Each sample was combined with 44 µl nuclease-free water, 7.5 µl Amplification Master Mix, 3 µl dNTP/dUTP mix (10 mM dATP, 10 mM dCTP, 10 mM dGTP, 8 mM dTTP and 2 mM dUTP) and 5 µl WGA DNA polymerase. Amplified samples were purified using the QIAquick PCR purification kit (Qiagen) and then processed for array hybridization as described below. Samples exposed to the H3K9me2 antibody or IgG control antibody were hybridized to the arrays (n = 2 per group).

Reactivation of retrovirus expression in mES cells. J1 mES cells were cultured (reference is provided in Supplementary Methods) in DMEM with 15% (v/v) ES-qualified FBS supplemented with 4 mM L-glutamine, 0.1 mM MEM non-essential amino acids, 1 mM sodium pyruvate, 0.55 mM 2-mercaptoethanol and purified recombinant leukemia inhibitory factor on 0.1% gelatin-coated plates. Cells were infected with a self-inactivated HSC1 retroviral vector (reference is provided in Supplementary Methods) engineered to harbor an EGFP-Puromycin bicistronic reporter gene controlled by the human EF1 promoter. EGFP gene expression was analyzed by flow cytometry. Pluripotency of mES cells was tested by SSEA1 (a surface marker of mouse undifferentiated cells) immunostaining and also measured by flow cytometry. Briefly, to perform flow cytometry, we fixed trypsinized cells with 2% formaldehyde in phosphate-buffered saline with 2% (v/v) FBS for 10 min at room temperature. Cells were then suspended in PBS with 2% (v/v) of serum (flow buffer) and filtered through 70-µm nylon membranes. EGFP expression analyses were performed by LSRII flow cytometer (Becton-Dickinson) using CellQuest Pro software. SSEA1 immunostaining was performed on non-permeabilized fixed cells. They were incubated with mouse IgM antibody to SSEA1 (DSHB, MC-480) for 30 min at 4 °C. After being washed three times with flow buffer, cells were incubated for 30 min with secondary antibody, Phycoerythrin-Cy5.5 (PE-Cy5.5) anti-mouse IgM (eBioscience), at 4 °C. Cells were washed three times in the flow buffer, and SSEA1 immunostaining was analyzed by LSRII flow cytometer. Cell debris were excluded from analysis by using forward- and side-scatter gating. The uninfected J1 ES cell line was used as a negative control to adjust EGFP fluorescence measurements. Mouse embryonic fibroblasts were used as a negative control cell line for SSEA1 immunostaining. Reactivation of endogenous genes, ChIP of endogenous gene and retrovirus, and DNA methylation analysis in mES cells are described in Supplementary Methods.

Additional methods. Synthesis of UNC0638 and UNC0737, UNC0638 mechanism-of-action and kinetic studies, determination of X-ray crystal structure of the G9a–UNC0638–SAH complex, UNC0638 stability studies, mouse drug metabolism and pharmacokinetic studies, cell growth, MTT, shRNA, ChIP, western blotting, immunofluorescence microscopy, clonogenicity, immunostaining flow cytometry, reverse transcription-qPCR, DNA extraction, enrichment of unmethylated DNA fraction, and microarray experiments and data analysis are described in Supplementary Methods.

Accession codes. Protein Data Bank: Coordinates and structure factors for the cocrystal structure of the G9a–UNC0638–SAH complex have been deposited with accession code 3RJW.
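The in-cell western quantification described in the Methods (background correction of the 800-nm H3K9me2 signal, normalization to the 700-nm DRAQ5 signal, and expression as a percentage of control) can be sketched as follows. This is an illustrative reading of that description: treating "normalized to the background" as subtraction is an assumption, and the function names and well intensities are hypothetical.

```python
from statistics import mean
from typing import Sequence, Tuple

Well = Tuple[float, float]  # (800-nm H3K9me2 signal, 700-nm DRAQ5 signal)


def normalized_signal(signal_800: float, background_800: float, draq5_700: float) -> float:
    """Background-corrected H3K9me2 signal per unit DRAQ5 (DNA content) signal."""
    return (signal_800 - background_800) / draq5_700


def percent_of_control(treated: Sequence[Well], control: Sequence[Well],
                       background_800: float) -> float:
    """Mean normalized signal of treated wells as a percentage of control wells."""
    treated_mean = mean(normalized_signal(s, background_800, d) for s, d in treated)
    control_mean = mean(normalized_signal(s, background_800, d) for s, d in control)
    return 100.0 * treated_mean / control_mean


# Hypothetical intensities for three replicate wells per condition.
control_wells = [(1100.0, 1000.0)] * 3
treated_wells = [(350.0, 1000.0)] * 3
result = percent_of_control(treated_wells, control_wells, background_800=100.0)
print(f"H3K9me2: {result:.0f}% of control")
```

Dividing by the DRAQ5 signal corrects for differences in cell number between wells, so a drop in the normalized value reflects less mark per cell rather than fewer cells.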

Received 25 February 2011; accepted 27 April 2011; published online 10 July 2011

References

1. Kouzarides, T. Chromatin modifications and their function. Cell 128, 693705 (2007). 2. Martin, C. & Zhang, Y. The diverse functions of histone lysine methylation. Nat. Rev. Mol. Cell Biol. 6, 838849 (2005). 3. Jenuwein, T. & Allis, C.D. Translating the histone code. Science 293, 10741080 (2001). 4. Bernstein, B.E., Meissner, A. & Lander, E.S. The mammalian epigenome. Cell 128, 669681 (2007). 5. Gelato, K.A. & Fischle, W. Role of histone modifications in defining chromatin structure and function. Biol. Chem. 389, 353363 (2008). 6. Strahl, B.D. & Allis, C.D. The language of covalent histone modifications. Nature 403, 4145 (2000). 7. Huang, J. et al. Repression of p53 activity by Smyd2-mediated methylation. Nature 444, 629632 (2006). 8. Huang, J. et al. G9A and GLP methylate lysine 373 in the tumor suppressor p53. J. Biol. Chem. 285, 96369641 (2010).
57 3 S25

NATURE BIOLOGY | VOL 7 | AUGUST 2011 | www.nature.com/naturechemicalbiology NATURECHEMICAL REPRINT COLLECTION Epigenetics

ARTICLE

NATURE CHEMICAL BIOLOGY DOI: 10.1038/NCHEMBIO.599


42. Vakoc, C.R., Mandat, S.A., Olenchock, B.A. & Blobel, G.A. Histone H3 lysine 9 methylation and HP1gamma are associated with transcription elongation through mammalian chromatin. Mol. Cell 19, 381391 (2005). 43. Wolf, D. & Goff, S.P. TRIM28 mediates primer binding site-targeted silencing of murine leukemia virus in embryonic cells. Cell 131, 4657 (2007). 44. Yokochi, T. et al. G9a selectively represses a class of late-replicating genes at the nuclear periphery. Proc. Natl. Acad. Sci. USA 106, 1936319368 (2009). 45. Hosey, A.M., Chaturvedi, C.P. & Brand, M. Crosstalk between histone modifications maintains the developmental pattern of gene expression on a tissue-specific locus. Epigenetics 5, 273281 (2010). 46. Epsztejn-Litman, S. et al. De novo DNA methylation promoted by G9a prevents reprogramming of embryonically silenced genes. Nat. Struct. Mol. Biol. 15, 11761183 (2008). 47. Estve, P.O. et al. Direct interaction between DNMT1 and G9a coordinates DNA and histone methylation during replication. Genes Dev. 20, 30893103 (2006). 48. Chen, L. et al. MDM2 recruitment of lysine methyltransferases regulates p53 transcriptional output. EMBO J. 29, 25382552 (2010). 49. Fritsch, L. et al. A subset of the histone H3 lysine 9 methyltransferases Suv39h1, G9a, GLP, and SETDB1 participate in a multimeric complex. Mol. Cell 37, 4656 (2010). 50. Stadtfeld, M., Maherali, N., Breault, D.T. & Hochedlinger, K. Defining molecular cornerstones during fibroblast to iPS cell reprogramming in mouse. Cell Stem Cell 2, 230240 (2008).

9. Rathert, P. et al. Protein lysine methyltransferase G9a acts on non-histone targets. Nat. Chem. Biol. 4, 344346 (2008). 10. Huang, J. et al. p53 is regulated by the lysine demethylase LSD1. Nature 449, 105108 (2007). 11. Copeland, R.A., Solomon, M.E. & Richon, V.M. Protein methyltransferases as a target class for drug discovery. Nat. Rev. Drug Discov. 8, 724732 (2009). 12. Rea, S. et al. Regulation of chromatin structure by site-specific histone H3 methyltransferases. Nature 406, 593599 (2000). 13. Cole, P.A. Chemical probes for histone-modifying enzymes. Nat. Chem. Biol. 4, 590597 (2008). 14. Esteller, M. Epigenetics in cancer. N. Engl. J. Med. 358, 11481159 (2008). 15. Tachibana, M. et al. G9a histone methyltransferase plays a dominant role in euchromatic histone H3 lysine 9 methylation and is essential for early embryogenesis. Genes Dev. 16, 17791791 (2002). 16. Kondo, Y. et al. Downregulation of histone H3 lysine 9 methyltransferase G9a induces centrosome disruption and chromosome instability in cancer cells. PLoS ONE 3, e2037 (2008). 17. Kondo, Y. et al. Alterations of DNA methylation and histone modifications contribute to gene silencing in hepatocellular carcinomas. Hepatol. Res. 37, 974983 (2007). 18. Watanabe, H. et al. Deregulation of histone lysine methyltransferases contributes to oncogenic transformation of human bronchoepithelial cells. Cancer Cell Int. 8, 15 (2008). 19. Goyama, S. et al. EVI-1 interacts with histone methyltransferases SUV39H1 and G9a for transcriptional repression and bone marrow immortalization. Leukemia 24, 8188 (2010). 20. Tachibana, M. et al. Histone methyltransferases G9a and GLP form heteromeric complexes and are both crucial for methylation of euchromatin at H3K9. Genes Dev. 19, 815826 (2005). 21. Maze, I. et al. Essential role of the histone methyltransferase G9a in cocaine-induced plasticity. Science 327, 213216 (2010). 22. Schaefer, A. et al. 
Control of cognition and adaptive behavior by the GLP/ G9a epigenetic suppressor complex. Neuron 64, 678691 (2009). 23. Imai, K., Togami, H. & Okamoto, T. Involvement of histone H3 Lysine 9 (H3K9) methyl transferase G9a in the maintenance of HIV-1 latency and its reactivation by BIX01294. J. Biol. Chem. 285, 1653816545 (2010). 24. Link, P.A. et al. Distinct roles for histone methyltransferases G9a and GLP in cancer germ-line antigen gene regulation in human cancer cells and murine embryonic stem cells. Mol. Cancer Res. 7, 851862 (2009). 25. Tachibana, M., Matsumura, Y., Fukuda, M., Kimura, H. & Shinkai, Y. G9a/ GLP complexes independently mediate H3K9 and DNA methylation to silence transcription. EMBO J. 27, 26812690 (2008). 26. Dong, K.B. et al. DNA methylation in ES cells requires the lysine methyltransferase G9a but not its catalytic activity. EMBO J. 27, 26912701 (2008). 27. Shi, Y. et al. Induction of pluripotent stem cells from mouse embryonic fibroblasts by Oct4 and Klf4 with small-molecule compounds. Cell Stem Cell 3, 568574 (2008). 28. Shi, Y. et al. A combined chemical and genetic approach for the generation of induced pluripotent stem cells. Cell Stem Cell 2, 525528 (2008). 29. Kubicek, S. et al. Reversal of H3K9me2 by a small-molecule inhibitor for the G9a histone methyltransferase. Mol. Cell 25, 473481 (2007). 30. Frye, S.V. The art of the chemical probe. Nat. Chem. Biol. 6, 159161 (2010). 31. Liu, F. et al. Discovery of a 2,4-diamino-7-aminoalkoxyquinazoline as a potent and selective inhibitor of histone lysine methyltransferase G9a. J. Med. Chem. 52, 79507953 (2009). 32. Liu, F. et al. Protein lysine methyltransferase G9a inhibitors: design, synthesis, and structure activity relationships of 2,4-diamino-7-aminoalkoxyquinazolines. J. Med. Chem. 53, 58445857 (2010). 33. Chang, Y. et al. Adding a lysine mimic in the design of potent inhibitors of histone lysine methyltransferases. J. Mol. Biol. 400, 17 (2010). 34. Chang, Y. et al. 
Structural basis for G9a-like protein lysine methyltransferase inhibition by BIX-01294. Nat. Struct. Mol. Biol. 16, 312317 (2009). 35. Collazo, E., Couture, J.F., Bulfer, S. & Trievel, R.C. A coupled fluorescent assay for histone methyltransferases. Anal. Biochem. 342, 8692 (2005). 36. Wigle, T.J. et al. Accessing protein methyltransferase and demethylase enzymology using microfluidic capillary electrophoresis. Chem. Biol. 17, 695704 (2010). 37. Morrison, J.F. Kinetics of the reversible inhibition of enzyme-catalysed reactions by tight-binding inhibitors. Biochim. Biophys. Acta 185, 269286 (1969). 38. Zee, B.M. et al. In vivo residue-specific histone methylation dynamics. J. Biol. Chem. 285, 33413350 (2010). 39. Plazas-Mayorca, M.D. et al. Quantitative proteomics reveals direct and indirect alterations in the histone code following methyltransferase knockdown. Mol. Biosyst. 6, 17191729 (2010). 40. Barski, A. et al. High-resolution profiling of histone methylations in the human genome. Cell 129, 823837 (2007). 41. Rice, J.C. et al. Histone methyltransferases direct different degrees of methylation to define distinct chromatin domains. Mol. Cell 12, 15911598 (2003).

Acknowledgments

We thank A. Tumber for JMJD2E assay support; J. Moffat (University of Toronto) for the gift of shRNAs; R. Bristow (University Health Network) for RV221 and PC3 cells; T. Hajian and F. Syeda for protein purification; G. Senisterra for contributing to DSF and DSLS data analysis; M. Herold for graphical design and illustration; I. Korboukh, M. Herold and J. Yost for critical reading of the manuscript; and R. Trump and C. Yates for helpful discussion. The research described here was supported by the National Institute of General Medical Sciences, US National Institutes of Health (NIH; grant RC1GM090732), the Carolina Partnership and University Cancer Research Fund from the University of North Carolina at Chapel Hill, the US National Science Foundation (NSF), the Ontario Research Fund, the Ontario Ministry of Health and Long-term Care and the Structural Genomics Consortium. The Structural Genomics Consortium is a registered charity (number 1097737) that receives funds from the Canadian Institutes for Health Research (CIHR), the Canada Foundation for Innovation, Genome Canada through the Ontario Genomics Institute, GlaxoSmithKline, Karolinska Institutet, the Knut and Alice Wallenberg Foundation, the Ontario Innovation Trust, the Ontario Ministry for Research and Innovation, Merck & Co. Inc., the Novartis Research Foundation, the Swedish Agency for Innovation Systems, the Swedish Foundation for Strategic Research and the Wellcome Trust. C.H.A. holds a Canada Research Chair in Structural Genomics. V.L. is supported by a CIHR fellowship. A.P. is supported by grants from the CIHR (199170 and 186007) and from the NIH (MH074127, MH088413, DP3DK085698 and HG004535). A.P. is Tapscott Chair in Schizophrenia Studies and a Senior Fellow of the Ontario Mental Health Foundation. J.E. is supported by CIHR grant IG1-102956. B.A.G. 
is supported by grants from the NSF (Early Faculty CAREER award and CBET-0941143), the American Society for Mass Spectrometry and the NIH Office of the Director (DP2OD007447). P.A.D. is supported by NIH postdoctoral fellowship F32 NRSA.

Author contributions

M.V., A.A.-H., A.S. and I.C. performed SAHH-coupled, fluorescence-polarization, DSF, DSLS and DNMT1 assays; D.B.-L. developed and performed in-cell western, MTT, ChIP, gene expression, clonogenicity, western blotting and immunofluorescence studies; F.L. developed the synthetic route to UNC0638 and UNC0737 and synthesized the compounds; S.R.-G. and J.E. developed and performed mES cell studies; V.L., S.-C.W. and A.P. performed H3K9me2 genomic localization and DNA methylation analysis; T.J.W. and W.P.J. performed mechanism-of-action studies; P.A.D. and B.A.G. performed MS-based proteomics studies; M.V., G.A.W., A.D., W.T., D.B.K. and C.H.A. solved and analyzed the X-ray crystal structure of the G9a-UNC0638-SAH complex; T.J.W. and A.T. performed SPR studies; X.C. and S.G.P. performed UNC0638 stability studies; S.G.P. performed RT-qPCR studies; T.J.M., X.-p.H. and B.L.R. performed GPCR selectivity studies; C.D.S. and W.P.J. performed kinase selectivity studies; J.L.N. purified proteins; J.J. designed UNC0638 and UNC0737; C.H.A., J.J., S.V.F., M.V., D.B.-L., P.J.B., J.E., S.R.-G., A.P., V.L., B.A.G., T.J.W. and A.E. designed studies and discussed results; J.J., C.H.A., D.B.-L., S.R.-G., M.V., V.L., S.V.F., J.E., A.P., B.A.G., P.J.B. and T.J.W. wrote the paper.

Competing financial interests

The authors declare no competing financial interests.

Additional information

Supplementary information, chemical compound information and chemical probe information is available online at http://www.nature.com/naturechemicalbiology/. Reprints and permissions information is available online at http://www.nature.com/reprints/index.html. Correspondence and requests for materials should be addressed to C.H.A. and J.J.

NATURE CHEMICAL BIOLOGY | VOL 7 | AUGUST 2011 | www.nature.com/naturechemicalbiology NATURE REPRINT COLLECTION Epigenetics

LETTER
First published in Nature 483, 598–602 (2012); doi:10.1038/nature10953

Chromatin-modifying enzymes as modulators of reprogramming


Tamer T. Onder1,2,3,4, Nergis Kara5, Anne Cherry1,2,3,4, Amit U. Sinha6,7, Nan Zhu3,6,7, Kathrin M. Bernt3,6,7, Patrick Cahan1,2,3,4, B. Ogan Mancarci8, Juli Unternaehrer1,2,3,4, Piyush B. Gupta9,10, Eric S. Lander9,11,12, Scott A. Armstrong3,6,7 & George Q. Daley1,2,3,4,6,13,14

Generation of induced pluripotent stem cells (iPSCs) by somatic cell reprogramming involves global epigenetic remodelling1. Whereas several proteins are known to regulate chromatin marks associated with the distinct epigenetic states of cells before and after reprogramming2,3, the role of specific chromatin-modifying enzymes in reprogramming remains to be determined. To address how chromatin-modifying proteins influence reprogramming, we used short hairpin RNAs (shRNAs) to target genes in DNA and histone methylation pathways, and identified positive and negative modulators of iPSC generation. Whereas inhibition of the core components of the polycomb repressive complex 1 and 2, including the histone 3 lysine 27 methyltransferase EZH2, reduced reprogramming efficiency, suppression of SUV39H1, YY1 and DOT1L enhanced reprogramming. Specifically, inhibition of the H3K79 histone methyltransferase DOT1L by shRNA or a small molecule accelerated reprogramming, significantly increased the yield of iPSC colonies, and substituted for KLF4 and c-Myc (also known as MYC). Inhibition of DOT1L early in the reprogramming process is associated with a marked increase in two alternative factors, NANOG and LIN28, which play essential functional roles in the enhancement of reprogramming. Genome-wide analysis of H3K79me2 distribution revealed that fibroblast-specific genes associated with the epithelial to mesenchymal transition lose H3K79me2 in the initial phases of reprogramming. DOT1L inhibition facilitates the loss of this mark from genes that are fated to be repressed in the pluripotent state. These findings implicate specific chromatin-modifying enzymes as barriers to or facilitators of reprogramming, and demonstrate how modulation of chromatin-modifying enzymes can be exploited to more efficiently generate iPSCs with fewer exogenous transcription factors.
To examine the influence of chromatin modifiers on somatic cell reprogramming, we used a loss-of-function approach to interrogate the role of 22 select genes in DNA and histone methylation pathways. We tested a pool of three hairpins for each of 22 target genes and observed knockdown efficiencies of >60% for 21 out of 22 targets (Supplementary Fig. 1). We infected fibroblasts differentiated from the H1 human embryonic stem cell (ESC) line (dH1fs) with shRNA pools, transduced them with reprogramming vectors expressing OCT4 (also known as POU5F1), SOX2, KLF4 and c-Myc (OSKM), and identified the resulting iPSCs by Tra-1-60 staining (Fig. 1a)4. Eight shRNA pools reduced reprogramming efficiency (Fig. 1b). Among the target genes were OCT4 (included as a control), and EHMT1 and SETDB1, two H3K9 methyltransferases whose histone mark is associated with transcriptional repression. The remaining five shRNA
pools targeted components of polycomb repressive complexes (PRC), major mediators of gene silencing and heterochromatin formation5. Inhibition of PRC1 (BMI1, RING1) and PRC2 components (EZH2, EED, SUZ12) significantly decreased reprogramming efficiency while having negligible effects on cell proliferation (Fig. 1c and Supplementary Fig. 2). This finding is of particular significance given that EZH2 is necessary for fusion-based reprogramming6 and highlights the importance of transcriptional silencing of the somatic cell gene expression program during generation of iPSCs. In contrast to genes whose functions seem to be required for reprogramming, inhibition of three genes enhanced reprogramming: YY1, SUV39H1 and DOT1L (Fig. 1b, d). YY1 is a context-dependent transcriptional activator or repressor7, whereas SUV39H1 is a histone H3K9 methyltransferase implicated in heterochromatin formation8. Interestingly, enzymes that modify H3K9 were associated with both inhibition and enhancement of reprogramming, which suggested that unravelling the mechanisms for their effects might be challenging. Thus, we focused on DOT1L, a histone H3 lysine 79 methyltransferase that has not previously been studied in the context of reprogramming9. We used two hairpin vectors that resulted in the most significant downregulation of DOT1L and concomitant decrease in global H3K79 methylation levels (Supplementary Fig. 3a, b). Fibroblasts expressing DOT1L shRNAs formed significantly more iPSC colonies when tested separately or in a context where they were fluorescently labelled and co-mixed with control cells (Fig. 2a and Supplementary Fig. 4). This enhanced reprogramming phenotype could be reversed by overexpressing an shRNA-resistant wild-type DOT1L, but not a catalytically inactive DOT1L, indicating that inhibition of catalytic activity of DOT1L is key to enhance reprogramming10 (Fig. 2a). 
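The Fig. 1b legend states that the dotted lines mark 3 standard deviations from the mean number of colonies in control wells. As a minimal sketch of how such a threshold could be used to label shRNA pools, the snippet below classifies colony counts against that band. The colony counts, the gene labels, the function name `call_hits` and the choice of the sample standard deviation are all assumptions made for illustration; none of them is taken from the study.

```python
from statistics import mean, stdev


def call_hits(control_counts, pool_counts, n_sd=3.0):
    """Label each shRNA pool by its colony count relative to control wells.

    A pool is 'decreased' or 'increased' when its count falls outside
    mean +/- n_sd * (sample standard deviation) of the control wells.
    """
    mu = mean(control_counts)
    sd = stdev(control_counts)  # sample s.d.; an assumption, see lead-in
    lower, upper = mu - n_sd * sd, mu + n_sd * sd
    calls = {}
    for gene, count in pool_counts.items():
        if count < lower:
            calls[gene] = "decreased"
        elif count > upper:
            calls[gene] = "increased"
        else:
            calls[gene] = "unchanged"
    return calls


# Hypothetical Tra-1-60+ colony counts, for illustration only.
control = [48, 52, 50, 47, 53]
pools = {"GENE_A": 5, "GENE_B": 120, "GENE_C": 51}
calls = call_hits(control, pools)
print(calls)
```

With these made-up numbers the control band is roughly 42 to 58 colonies, so only pools well outside the well-to-well variation of the controls are labelled as hits.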
Our findings with dH1fs were applicable to other human fibroblasts, as IMR-90 and MRC-5 cells also showed threefold and sixfold increases in reprogramming efficiency, respectively, upon DOT1L suppression (Supplementary Fig. 5). To validate our findings independently of shRNA-mediated knockdown, we used a recently discovered small molecule inhibitor of DOT1L catalytic activity. EPZ004777 (ref. 11, referred to as iDot1L) abrogated H3K79 methylation at concentrations ranging from 1 μM to 10 μM and increased reprogramming efficiency three- to fourfold (Fig. 2b and Supplementary Fig. 6a, b). Combination of inhibitor treatment with DOT1L knockdown did not further increase reprogramming efficiency, reinforcing our previous observation that inhibition of the catalytic activity of DOT1L is key to reprogramming (Supplementary Fig. 6c). iPSCs generated through DOT1L inhibition showed characteristic ESC morphology, immunoreactivity for SSEA4, SSEA3, Tra-1-81, OCT4 and NANOG,

1Stem Cell Transplantation Program, Division of Pediatric Hematology and Oncology, Manton Center for Orphan Disease Research, Children's Hospital Boston and Dana Farber Cancer Institute, Boston, Massachusetts, 02115, USA.
2Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, Massachusetts, 02115, USA.
3Harvard Stem Cell Institute, Cambridge, Massachusetts, 02138, USA.
4Stem Cell Program, Children's Hospital Boston, Boston, Massachusetts, 02115, USA.
5German Cancer Research Center, Heidelberg, 69120, Germany.
6Division of Hematology/Oncology, Children's Hospital, Harvard Medical School, Boston, Massachusetts, 02115, USA.
7Department of Pediatric Oncology, Harvard Medical School, Boston, Massachusetts, 02115, USA.
8Department of Molecular Biology and Genetics, Bilkent University, Ankara, 06800, Turkey.
9Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts, 02142, USA.
10Whitehead Institute for Biomedical Research, Cambridge, Massachusetts, 02142, USA.
11The Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, Massachusetts, 02142, USA.
12Department of Systems Biology, Harvard Medical School, Boston, Massachusetts, 02115, USA.
13Division of Hematology, Brigham and Women's Hospital, Boston, Massachusetts, 02115, USA.
14Howard Hughes Medical Institute, Chevy Chase, Maryland, 20815, USA.

598 | NATURE | VOL 483 | 29 MARCH 2012

Personalized Therapeutics: The Power of Epigenetics

Epizyme is creating personalized therapeutics for patients with genetically defined cancers based on breakthrough discoveries in the field of epigenetics.

MAPPING THE HMTome: The Power of Personalized Therapeutics

Epizyme mapped the HMTome, a therapeutically important class of enzymes known as histone methyltransferases (HMTs) that are proven drivers of diseases such as cancer. The HMTome includes two major families: lysine methyltransferases (KMTs) and arginine methyltransferases (RMTs). Epizyme is creating small molecule HMT inhibitors as personalized therapeutics for the treatment of patients with genetically defined cancers.

EZH2: In pre-clinical development for patients with genetically defined lymphomas and solid tumors

DOT1L: In Phase I development for patients with MLL-r, a genetically defined type of acute leukemia

To learn more about moving science forward in genetically defined cancers, visit www.epizyme.com

and differentiated into all three embryonic germ layers in vitro and in teratomas (Supplementary Fig. 7a–c). Therefore, iPSCs generated following DOT1L inhibition display all of the hallmarks of pluripotency.

We next assessed DOT1L inhibition in murine reprogramming. iDot1L treatment led to threefold enhancement of reprogramming of mouse embryonic fibroblasts carrying an OCT4-GFP (green fluorescent protein) reporter gene (OCT4-GFP MEFs; Fig. 2c). Reprogramming of tail-tip fibroblasts (TTFs) derived from a conditional knockout DOT1L mouse strain12 yielded significantly more iPSC colonies upon deletion of DOT1L (Supplementary Fig. 8a). Cre-mediated excision of both floxed DOT1L alleles in iPSC clones derived from homozygous TTFs was confirmed by genomic PCR (Supplementary Fig. 8b). DOT1L inhibition also increased reprogramming efficiency of MEFs and peripheral blood cells derived from an inducible secondary iPSC mouse strain13 (Supplementary Fig. 8c, d). Taken together, these results demonstrate that DOT1L inhibition enhances reprogramming of both mouse and human cells.

Figure 1 | Screening for inhibitors and enhancers of reprogramming. a, Timeline of shRNA infection and iPSC generation. b, Number of Tra-1-60+ colonies 21 days after OSKM transduction of 25,000 dH1f cells previously infected with pools of shRNAs against the indicated genes. Representative Tra-1-60-stained reprogramming wells are shown. The dotted lines indicate 3 standard deviations from the mean number of colonies in control wells. c, Validation of primary screen hits that decrease reprogramming efficiency. Fold change in Tra-1-60+ iPSC colonies relative to control cells. *P < 0.05, **P < 0.01 compared to control shRNA-expressing fibroblasts (n = 4; error bars, ± s.e.m.). Representative Tra-1-60-stained wells are shown. d, Validation of primary screen hits that increase reprogramming efficiency. Fold change in Tra-1-60+ iPSC colonies relative to control cells. *P < 0.05, **P < 0.01 compared to control shRNA-expressing fibroblasts (n = 4; error bars, ± s.e.m.). Representative Tra-1-60-stained wells are shown.

We next examined the cellular mechanisms by which DOT1L inhibition promotes reprogramming. DOT1L inhibition affected neither retroviral transgene expression nor cellular proliferation (Supplementary Fig. 9a–c). Although previous studies indicated that DOT1L-null cells have increased apoptosis and accumulation of cells in G2 phase9, we failed to observe a significant increase in apoptosis or change in the cell cycle profile of DOT1L-inhibited fibroblasts (Supplementary Fig. 9d, e). In human iPSC clones derived from shDot1L fibroblasts, DOT1L inhibition was no longer evident, reflecting the known silencing of retroviruses that occurs during reprogramming (Supplementary Fig. 10a). Quantitative PCR (qPCR) analysis revealed that the silencing occurred by day 15 after OSKM transduction (Supplementary Fig. 10b, c). To define the crucial time window for DOT1L inhibition, we treated fibroblasts with iDot1L at 1-week intervals during reprogramming. iDot1L treatment in either the first or second week was sufficient to enhance reprogramming, whereas treatment in the third week or a 5-day pretreatment had no effect (Supplementary Fig. 10d, e). Immunofluorescence analysis revealed significantly greater numbers of Tra-1-60-positive cell clusters on day 10 and day 14 in shDot1L cultures (Supplementary Fig. 11a, b), indicating that the emergence of iPSCs is accelerated upon DOT1L inhibition. When we extended the reprogramming experiments by 10 more days, shDot1L cells still yielded more iPSC colonies than controls (Supplementary Fig. 11c). Taken together, these findings indicate that DOT1L inhibition acts in early to middle stages to accelerate and increase the efficiency of the reprogramming process.

Figure 2 | DOT1L inhibition enhances reprogramming efficiency and substitutes for KLF4 and Myc. a, Fold change in the reprogramming efficiency of dH1f cells infected with two independent DOT1L shRNAs or co-infected with shRNA-1 and a vector expressing an shRNA-resistant wild-type or catalytically dead mutant DOT1L. Data correspond to the average and s.e.m.; n = independent experiments. *P < 0.01 compared to control shRNA-expressing fibroblasts. b, Fold change in the reprogramming efficiency of dH1f cells treated with iDot1L at the indicated concentrations for 21 days. Data correspond to the mean ± s.d.; n = 3. *P < 0.001 compared to untreated fibroblasts. c, Number of alkaline-phosphatase-positive (AP+) colonies derived from OSKM-transduced untreated or iDot1L-treated (10 μM) OCT4-GFP MEFs. *P < 0.001 compared to untreated MEFs (n = 4; error bars, ± s.d.). Representative AP-stained wells are shown. d, Tra-1-60-stained plates of shCntrl and shDot1L fibroblasts in the absence of each factor or both KLF4 and c-Myc. e, Tra-1-60-stained plates of untreated and iDot1L-treated (3.3 μM) fibroblasts in the absence of each factor or both KLF4 and c-Myc. f, Quantification of the Tra-1-60+ colonies in Fig. 2d, e representing mean and s.d. of two independent experiments done in triplicate.

To assess whether DOT1L inhibition could replace any of the reprogramming factors, we infected control and DOT1L-inhibited fibroblasts with three factors, omitting one factor at a time. In the absence of OCT4 or SOX2 no iPSC colonies emerged (Fig. 2d). When we omitted either KLF4 or c-Myc, DOT1L-inhibited fibroblasts gave rise to robust numbers of Tra-1-60-positive colonies, whereas control cells generated very few colonies, as reported previously4 (Fig. 2d–f and Supplementary Fig. 12a). Importantly, DOT1L-inhibited fibroblasts transduced with only OCT4 and SOX2 gave rise to Tra-1-60-positive colonies, whereas control fibroblasts did not (Fig. 2d–f). These two-factor iPSCs showed typical ESC morphology, silenced the reprogramming vectors and had all of the hallmarks of pluripotency as gauged by endogenous pluripotency factor expression and the ability to form all three embryonic germ layers in vitro and in teratomas (Supplementary Figs 7a–c and 12b). PCR on genomic DNA isolated from expanded colonies confirmed the absence of integrated KLF4 and c-Myc transgenes (Supplementary Fig. 12c). Thus, we were able to generate two-factor iPSCs either by suppression of DOT1L expression or chemical inhibition of its methyltransferase activity.

Figure 3 | NANOG and LIN28 are required for enhancement of reprogramming by DOT1L inhibition. a, Overlap of differentially upregulated genes in shDot1L cells 6 days post-OSKM and OSM transduction with the genes upregulated in OSKM-transduced iDot1L-treated cells. b, Heat maps showing differential expression levels of commonly upregulated genes in OSKM-transduced DOT1L-inhibited cells. c, Number of Tra-1-60+ iPSC colonies upon knockdown of Nanog or Lin28 in 2-factor reprogramming of shDot1L cells. Data represent mean and s.e.m. of 2 independent experiments done in triplicate. d, Fold-change in Tra-1-60+ iPSC colonies in 4-factor (OSKM) and 6-factor (OSKMNL) reprogramming of shCntrl and shDot1L fibroblasts. Data represent mean and s.e.m. of two independent experiments done in duplicate. Representative Tra-1-60-stained wells are shown above.

To gain insights into the molecular mechanisms of how DOT1L inhibition promotes reprogramming and replaces KLF4 we performed global gene-expression analyses on control and shDot1L fibroblasts before and 6 days after OSKM and OSM transduction, along with cells that were treated with iDot1L. Relatively few genes were differentially expressed in shDot1L cells on day 6 of reprogramming (22 up, 23 down; Supplementary Table 3). Inhibitor-treated cells showed broader gene expression changes (405 up and 175 down; Supplementary Table 3), presumably due to more complete inhibition of K79me2 levels (Fig. 3a). In the absence of KLF4, 94 genes were differentially upregulated in shDot1L cells; intersection of this set of genes with the set differentially upregulated in four-factor reprogramming of DOT1L-inhibited cells yielded only five common genes (Fig. 3a, b). We were particularly intrigued to find NANOG and LIN28 upregulated in all three instances of DOT1L inhibition, because these two genes are part of the core pluripotency network of human ESCs14,15 and can reprogram human fibroblasts into iPSCs when used in combination with OCT4 and SOX2 (ref. 16).
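The overlap analysis described in the text (gene sets from separate DOT1L-inhibition conditions intersected to find commonly upregulated genes) amounts to a set intersection. As a minimal sketch, the two gene lists below are the ones legible in the Fig. 3 panel labels. Which pairwise comparison each list belongs to cannot be recovered from the labels, so treating them as the two inputs to the final intersection is an assumption, and the variable names are illustrative only.

```python
# Gene symbols as they appear in the Fig. 3 panel labels. Which comparison
# each list represents is an assumption made for this illustration.
overlap_list_1 = {
    "ARL6IP1", "CADM1", "INHBA", "INHBB", "LEFTY1", "LIN28A",
    "LUM", "NANOG", "NPPB", "PMEPA1", "RUNX3", "UPP1",
}
overlap_list_2 = {
    "CDO1", "CHST15", "COL11A1", "LEFTY1", "LEFTY2", "LIN28A",
    "LUM", "NANOG", "PROM1", "SCG2", "UPP1",
}

# Genes present in both lists, sorted for stable display.
common_genes = sorted(overlap_list_1 & overlap_list_2)
print(len(common_genes), common_genes)
```

The intersection contains five genes, including NANOG and LIN28A, which is consistent with the five common genes reported in the text.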

Untreated-biorep2

iDot1L-biorep1 iDot1L-biorep2 shCntrl-biorep1 shCntrl-biorep2 shCntrl-biorep3 shDot1L-biorep1 shDot1L-biorep2 shDot1L-biorep3

4.5 4.0 3.5 200 3.0 180 2.5 160 2.0 140 1.5 120 1.0 100 0.5 0

Number of AP+ colonies derived from OCT4GFP MEFs

dH1f fibroblasts shRNA Re-seed OSKM Plate on MEFs

* *

450 400 350 300 250

c 1.2 a Day 21 ARL6IP1 CADM1 1.0 LEFTY1 Tra-1-60 LIN28A INHBA LUM Staining INHBB 0.8 NANOG LEFTY1

Fold change in Tra-1-60+ colonies

b
CDO1 CHST15 COL11A1 LEFTY1 LEFTY2 LIN28A LUM NANOG PROM1** SCG2 ** UPP1
CDO1 CHST15 COLL11A1 LEFTY1 LEFTY2 LIN28A LUM NANOG PROM1 SCG2 UPP1

UPP1 LIN28A 0.6 LUM NANOG NPPB 0.4 shDot1L_ o L_ PMEPA1 KM M OSKM RUNX3 ** (22) (22) (2 0.2 UPP1

**

**

RESEARCH LETTER
a
Day 6
Fold change in Tra-1-60+ colonies ARL6IP1 c CADM1 INHBA Day 21 INHBB Tra-1-60 LEFTY1 LIN28A Staining LUM NANOG NPPB PMEPA1 RUNX3 UPP1

LETTER RESEARCH
Fold change in Tra-1-60+ colonies

a
3.5 Fold change in Tra-1-60+ colonies 3.0 2.5 2.0 1.5 1.0 0.5

b
n=5

c
Day 5 Day 1 Day 0
Number of AP+ colonies derived from OCT4GFP MEFs

a
Day 6

b
1.2 1.0 0.8 0.6 0.4
LEFTY1 LIN28A LUM NANOG UPP1 CDO1 CHST15 COL11A1 LEFTY1 LEFTY2 LIN28A LUM NANOG ** PROM1 SCG2 UPP1
CDO1 CHST15 COLL11A1 LEFTY1 LEFTY2 LIN28A LUM NANOG PROM1 SCG2 UPP1

n=5

shDot1L_ OSM 0.2 (94)

sh C nt sh Do r l t1 L sh Do -1 sh t1L +D D -2 ot ot1 1 L s L_ -1 +D hD wt ot ot1 1L L-+ colonies Number of Tra-1-60 _m 1 ut

100 80

U nt re at ed 1 M 3. 3 M 10 M

U nt re at ed iD ot 1L

Fold change in Tra-1-60+ colonies

shCntrl

SUV39H1 YY1 DOT1L DNMT3A MECP2 NR2F1 DNMT1 + Tra-1-60 colonies SMYD2 CNTRL MBD2 MBD4 EZH1 SUV39H2 MBD1 G9A MBD3 SETDB1 OCT4 BMI1 RING1 SUZ12 EHMT1 EZH2 EED Number of Tra-1-60+ colonies

120 d 100 80 60

Fold change in Tra-1-60+ colonies

OSKM

OSM

60 OSK 40 20 0

iDot1L

OS

SKM

OKM

3.5 3.0 2.5 2.0 1.5 1.0 0.5


Cntrl

shDot1L

e
Untreated

SUV39H1

YY1

f DNMT3A DOT1L MECP2 NR2F1 DNMT1 SMYD2 40 250 shCntrl


200

CNTRL
iDot1L

MBD2

MBD4

150 EZH1 100 50

shDot1L Untreated iDot1L

20 0

SUV39H2 MBD1

G9A

MBD3

SETDB1 OCT4

BMI1

0 EZH2 RING1 OSKM OSM EHMT1 OSK OS SUZ12

of mouse embryonic fibroblasts carrying an OCT4-GFP (green Figs 7ac and 12b). reporter PCR on genomic DNA isolated from Fig. expanded fluorescent protein) gene (OCT4GFP MEFs; 2c). colonies confirmed the absence of integrated andfrom c-Myc Reprogramming of tail-tip fibroblasts (TTFs)KLF4 derived a transcongenes (Supplementary Fig.mouse 12c). Thus, we were able to generate twoditional knockout DOT1L strain yielded significantly more 12 factor iPSCs either by suppression of DOT1L expression or chemical iPSC colonies upon deletion of DOT1L (Supplementary Fig. 8a). inhibition of excision its methyltransferase activity. Cre-mediated of both floxed DOT1L alleles in iPSC clones To gain insights into the molecular mechanisms how DOT1L derived from homozygous TTFs was confirmed by of genomic PCR inhibition promotes reprogramming and replaces KLF4 we performed (Supplementary Fig. 8b). DOT1L inhibition also increased reprogramglobal gene-expression on control and shDot1L ming efficiency of MEFs analyses and peripheral blood cells derivedfibroblasts from an before and 6 days after OSKM and OSM 13 transduction, along with cells inducible secondary iPSC mouse strain (Supplementary Fig. 8c, d). that were treated with iDot1L. Relatively few genes were differentially Taken together, these results demonstrate that DOT1L inhibition expressed in shDot1L cells on day 6 of reprogramming (22 up, 23 enhances reprogramming of both mouse and human cells. down; Supplementary Table 3). Inhibitor-treated cells showed broader We expression next examined the cellular by which DOT1L gene changes (405 up and mechanisms 175 down; Supplementary Table 3), inhibition promotes reprogramming. DOT1L inhibition affected neipresumably due to more complete inhibition of K79me2 levels (Fig. 3a). ther retroviral expression cellular upregulated proliferation In the absence transgene of KLF4, 94 genes were nor differentially in (Supplementary Fig. 9ac). 
of Although previous studies indicated that shDot1L cells; intersection this set of genes with the set differentially DOT1L -null cells have increased apoptosis and of cells upregulated in four-factor reprogramming of accumulation DOT1L-inhibited cells 9 in yielded G2 phase , we failed to observe a significant increase in apoptosis or only five common genes (Fig. 3a, b). We were particularly change in the cell cycle profile of DOT1L-inhibited fibroblasts intrigued to find NANOG and LIN28 upregulated in all three instances (Supplementary Fig. 9d,because e). In these human iPSC clones derived from of DOT1L inhibition, two genes are part of the core shDot1L fibroblasts, DOT1L inhibition was and no longer evident, reflectpluripotency network of human ESCs14,15 can reprogram human ing the known silencing of retroviruses that occurs during reprogram6 0 0 (Supplementary | N A T U R E | V OFig. L 48 3 | 2 Quantitative 9 M A R C H 2 0PCR 12 ming 10a). (qPCR) analysis
NATURE REPRINT COLLECTION Epigenetics

Figure 2 | DOT1L inhibition enhances reprogramming efficiency and substitutes for KLF4 and Myc. a, Fold change in the reprogramming efficiency of dH1f infected with two independent DOT1L shRNAs or co-infected with Figure 1 | cells Screening for inhibitors and enhancers of reprogramming. shRNA-1 of and a vector expressing shRNA-resistant a, Timeline shRNA infection andan iPSC generation. b,wild-type Number or of catalytically Tra-1-601 dead mutant DOT1L. Data correspond to the average and s.e.m.; colonies 21 days after OSKM transduction of 25,000 dH1f cells previously n 5 independent *P , 0.01 comparedgenes. to control shRNA- Trainfected with pools experiments. of shRNAs against the indicated Representative expressing fibroblasts. b, Fold change the reprogramming efficiency of dH1f 1-60-stained reprogramming wells are in shown. The dotted lines indicates 3 cells treated with iDot1L at the indicated concentrations for 21 days. Data standard deviations from the mean number of colonies in control wells. correspond of to primary the mean 6 s.d.;hits n 5that 3. *P , 0.001 reprogramming compared to untreated c, Validation screen decrease efficiency. fibroblasts. c, Number of alkaline-phosphatase-positive (AP1) colonies derived from OSKM-transduced untreated or iDot1L-treated (10 mM) OCT4GFP and differentiated into alluntreated three embryonic layers in vitro MEFs. *P , 0.001 compared MEFs (n 5 4; germ error bars, 6 s.d.). and in teratomas (Supplementary Fig. 7ac). stained Therefore, Representative AP-stained wells are shown. d, Tra-1-60 of platesiPSCs of shCntrl and shDot1L DOT1L fibroblasts in the absence of each or hallmarks both KLF4 and generated following inhibition display allfactor of the of c-Myc. e, Tra-1-60-stained plates of untreated and iDot1L treated (3.3 mM) pluripotency. fibroblasts in the absence of each factor or both KLF4 and c-Myc. f, Quantification We next assessed DOT1L inhibition in murine reprogramming. of the Tra-1-601 colonies in Fig. 
2d, e representing mean and s.d. of two iDot1L treatment led todone threefold enhancement of reprogramming independent experiments in triplicate.

Figure 3 | NANOG and LIN28 are required for enhancement of reprogramming by DOT1L inhibition. a, Overlap of differentially upregulated genes in shDot1L cells 6 days post-OSKM and OSM transduction Fold in upregulated Tra-1-601 iPSC colonies relative to control cells.cells. *P ,b0.05, withchange the genes in OSKM-transduced iDot1L-treated , Heat ** P , showing 0.01 compared to control shRNA-expressing fibroblasts (n 5 4; error maps differential expression levels of commonly upregulated genes in bars, 6s.e.m.). Representative Tra-1-60-stained wells are d1 , Validation iPSC OSKM-transduced DOT1L-inhibited cells. c, Number of shown. Tra-1-60 colonies upon knockdown of Nanog or Lin28 in 2-factor reprogramming of in of primary screen hits that increase reprogramming efficiency. Fold change shDot1L1 cells. Data represent mean and s.e.m of 2 independent experiments iPSC colonies relative to control cells. *P , 0.05, **P , 0.01 Tra-1-60 1 iPSC colonies in 4-factor done in triplicate. d, shRNA-expressing Fold-change in Tra-1-60 compared to control fibroblasts (n 5 4; error (OSKM) and 6-factor (OSKMNL) reprogramming of shCntrl and shDot1L bars, 6 s.e.m.). Representative Tra-1-60-stained wells are shown. fibroblasts. Data represent mean and s.e.m. of two independent experiments done in duplicate. Representative Tra-1-60-stained wells are shown above.


We next assessed DOT1L inhibition in murine reprogramming. iDot1L treatment led to threefold enhancement of reprogramming of mouse embryonic fibroblasts carrying an OCT4–GFP (green fluorescent protein) reporter gene (OCT4–GFP MEFs; Fig. 2c). Reprogramming of tail-tip fibroblasts (TTFs) derived from a conditional knockout DOT1L mouse strain yielded significantly more iPSC colonies upon deletion of DOT1L¹² (Supplementary Fig. 8a). Cre-mediated excision of both floxed DOT1L alleles in iPSC clones derived from homozygous TTFs was confirmed by genomic PCR (Supplementary Fig. 8b). DOT1L inhibition also increased reprogramming efficiency of MEFs and peripheral blood cells derived from an inducible secondary iPSC mouse strain¹³ (Supplementary Fig. 8c, d). Taken together, these results demonstrate that DOT1L inhibition enhances reprogramming of both mouse and human cells.

We next examined the cellular mechanisms by which DOT1L inhibition promotes reprogramming. DOT1L inhibition affected neither retroviral transgene expression nor cellular proliferation (Supplementary Fig. 9a–c). Although previous studies indicated that DOT1L-null cells have increased apoptosis and accumulation of cells in G2 phase⁹, we failed to observe a significant increase in apoptosis or change in the cell cycle profile of DOT1L-inhibited fibroblasts (Supplementary Fig. 9d, e). In human iPSC clones derived from shDot1L fibroblasts, DOT1L inhibition was no longer evident, reflecting the known silencing of retroviruses that occurs during reprogramming (Supplementary Fig. 10a). Quantitative PCR (qPCR) analysis revealed that the silencing occurred by day 15 after OSKM transduction (Supplementary Fig. 10b, c). To define the crucial time window for DOT1L inhibition, we treated fibroblasts with iDot1L at 1-week intervals during reprogramming. iDot1L treatment in either the first or second week was sufficient to enhance reprogramming, whereas treatment in the third week or a 5-day pretreatment had no effect (Supplementary Fig. 10d, e). Immunofluorescence analysis revealed significantly greater numbers of Tra-1-60-positive cell clusters on day 10 and day 14 in shDot1L cultures (Supplementary Fig. 11a, b), indicating that the emergence of iPSCs is accelerated upon DOT1L inhibition. When we extended the reprogramming experiments by 10 more days, shDot1L cells still yielded more iPSC colonies than controls (Supplementary Fig. 11c). Taken together, these findings indicate that DOT1L inhibition acts in early to middle stages to accelerate and increase the efficiency of the reprogramming process.

To assess whether DOT1L inhibition could replace any of the reprogramming factors, we infected control and DOT1L-inhibited fibroblasts with three factors, omitting one factor at a time. In the absence of OCT4 or SOX2 no iPSC colonies emerged (Fig. 2d). When we omitted either KLF4 or c-Myc, DOT1L-inhibited fibroblasts gave rise to robust numbers of Tra-1-60-positive colonies, whereas control cells generated very few colonies, as reported previously⁴ (Fig. 2d–f and Supplementary Fig. 12a). Importantly, DOT1L-inhibited fibroblasts transduced with only OCT4 and SOX2 gave rise to Tra-1-60-positive colonies, whereas control fibroblasts did not (Fig. 2d–f). These two-factor iPSCs showed typical ESC morphology, silenced the reprogramming vectors and had all of the hallmarks of pluripotency as gauged by endogenous pluripotency factor expression and the ability to form all three embryonic germ layers in vitro and in teratomas (Supplementary Figs 7a–c and 12b). PCR on genomic DNA isolated from expanded colonies confirmed the absence of integrated KLF4 and c-Myc transgenes (Supplementary Fig. 12c). Thus, we were able to generate two-factor iPSCs either by suppression of DOT1L expression or chemical inhibition of its methyltransferase activity.

Figure 2 | DOT1L inhibition enhances reprogramming efficiency and substitutes for KLF4 and Myc. a, Fold change in the reprogramming efficiency of dH1f cells infected with two independent DOT1L shRNAs or co-infected with shRNA-1 and a vector expressing an shRNA-resistant wild-type or catalytically dead mutant DOT1L. Data correspond to the average and s.e.m.; n = independent experiments. *P < 0.01 compared to control shRNA-expressing fibroblasts. b, Fold change in the reprogramming efficiency of dH1f cells treated with iDot1L at the indicated concentrations for 21 days. Data correspond to the mean ± s.d.; n = 3. *P < 0.001 compared to untreated fibroblasts. c, Number of alkaline-phosphatase-positive (AP+) colonies derived from OSKM-transduced untreated or iDot1L-treated (10 μM) OCT4–GFP MEFs. *P < 0.001 compared to untreated MEFs (n = 4; error bars, ± s.d.). Representative AP-stained wells are shown. d, Tra-1-60-stained plates of shCntrl and shDot1L fibroblasts in the absence of each factor or both KLF4 and c-Myc. e, Tra-1-60-stained plates of untreated and iDot1L-treated (3.3 μM) fibroblasts in the absence of each factor or both KLF4 and c-Myc. f, Quantification of the Tra-1-60+ colonies in Fig. 2d, e representing mean and s.d. of two independent experiments done in triplicate.

To gain insights into the molecular mechanisms of how DOT1L inhibition promotes reprogramming and replaces KLF4 we performed global gene-expression analyses on control and shDot1L fibroblasts before and 6 days after OSKM and OSM transduction, along with cells that were treated with iDot1L. Relatively few genes were differentially expressed in shDot1L cells on day 6 of reprogramming (22 up, 23 down; Supplementary Table 3). Inhibitor-treated cells showed broader gene expression changes (405 up and 175 down; Supplementary Table 3), presumably due to more complete inhibition of K79me2 levels (Fig. 3a). In the absence of KLF4, 94 genes were differentially upregulated in shDot1L cells; intersection of this set of genes with the set differentially upregulated in four-factor reprogramming of DOT1L-inhibited cells yielded only five common genes (Fig. 3a, b). We were particularly intrigued to find NANOG and LIN28 upregulated in all three instances of DOT1L inhibition, because these two genes are part of the core pluripotency network of human ESCs¹⁴,¹⁵ and can reprogram human fibroblasts into iPSCs when used in combination with OCT4 and SOX2 (ref. 16).

We explored the possibility that NANOG and LIN28 upregulation might account for the enhanced reprogramming observed following DOT1L inhibition, and validated their upregulation in shDot1L fibroblasts upon OSM or OS transduction (Supplementary Fig. 13a, b). Interestingly, at this early time point REX1 (also known as ZFP42) and DNMT3B, two other well-characterized pluripotency genes, were not upregulated, indicating that DOT1L inhibition does not broadly upregulate the pluripotency network. Suppression of either Nanog or Lin28 abrogated the two-factor (OS) reprogramming of shDot1L fibroblasts, indicating the essential roles of NANOG and LIN28 in this process (Fig. 3c and Supplementary Fig. 13c). DOT1L inhibition also led to increased NANOG expression in the context of OCT4, SOX2 and LIN28 (OSL) and LIN28 expression in the context of OCT4, SOX2 and NANOG (OSN) (Supplementary Fig. 14a). Furthermore, DOT1L inhibition significantly increased the efficiency of three-factor reprogramming in the context of OSN and OSL (Supplementary Fig. 14b). Finally, inclusion of NANOG and LIN28 in the OSKM reprogramming cocktail did not confer any additional enhancement to shDot1L cells (Fig. 3d and Supplementary Fig. 14c). Taken together, these data implicate NANOG and LIN28 in the enhancement of reprogramming and replacement of KLF4 and c-Myc with DOT1L inhibition.

Figure 3 | NANOG and LIN28 are required for enhancement of reprogramming by DOT1L inhibition. a, Overlap of differentially upregulated genes in shDot1L cells 6 days post-OSKM and OSM transduction with the genes upregulated in OSKM-transduced iDot1L-treated cells. b, Heat maps showing differential expression levels of commonly upregulated genes in OSKM-transduced DOT1L-inhibited cells. c, Number of Tra-1-60+ iPSC colonies upon knockdown of Nanog or Lin28 in 2-factor reprogramming of shDot1L cells. Data represent mean and s.e.m. of 2 independent experiments done in triplicate. d, Fold-change in Tra-1-60+ iPSC colonies in 4-factor (OSKM) and 6-factor (OSKMNL) reprogramming of shCntrl and shDot1L fibroblasts. Data represent mean and s.e.m. of two independent experiments done in duplicate. Representative Tra-1-60-stained wells are shown above.

To gain insight into the genome-wide chromatin changes that are facilitated by DOT1L inhibition, we performed chromatin immunoprecipitation followed by DNA sequencing (ChIP-seq) for H3K79me2 and H3K27me3 in human ESCs as well as fibroblasts undergoing reprogramming, with or without iDot1L treatment (Supplementary Fig. 15). In both ESCs and fibroblasts, H3K79me2 is positively associated with transcriptionally active genes and negatively associated with genes marked by H3K27me3 (Supplementary Fig. 16a–c). ESC-specific genes marked by H3K79me2 included pluripotency factors, a subset of their downstream targets, and genes involved in epithelial cell adhesion such as CDH1 (E-cadherin) (280 genes; Supplementary Fig. 17a, b and Supplementary Tables 4, 5). In contrast, in fibroblasts, genes marked by H3K79me2 were significantly enriched in genes induced during the epithelial to mesenchymal transition (EMT) (377 genes; Supplementary Fig. 17a).

Among the 348 genes that showed reduced H3K79me2 6 days after OSKM expression, we likewise found a significant enrichment of gene sets associated with the induction of a mesenchymal state, including SNAI2, TGFB2 and TGFBR1 (Supplementary Fig. 18a)¹⁷,¹⁸. Only a few of these genes showed decreased expression at day 6 (12 out of 348), but the vast majority of them lacked this mark in the pluripotent state (272 out of the 348 devoid of H3K79me2 in ESCs), suggesting they were destined for transcriptional silencing during reprogramming. This finding prompted us to ask whether DOT1L inhibition results in the removal of H3K79me2 from such fibroblast-specific, EMT-associated genes. Upon DOT1L inhibitor treatment, H3K79me2 levels were reduced on almost all loci, with the exception of a subset comprised mostly of housekeeping genes that also had high levels of H3K79me2 in ESCs (Supplementary Fig. 19a). Strikingly, the genes that lost proportionally the most H3K79me2 in inhibitor-treated fibroblasts during reprogramming (eightfold or more) were again highly enriched in genes induced in EMT (Supplementary Fig. 19b). Mesenchymal master regulators such as SNAI1, SNAI2, ZEB1, ZEB2 and TGFB2 were among these genes (Fig. 4a)¹⁹. In the presence of the DOT1L inhibitor, these regulators were more strongly repressed during reprogramming, whereas epithelial genes such as CDH1 and OCLN were more robustly upregulated (Fig. 4b). The extinction of fibroblast gene expression was accompanied by increased deposition of the repressive H3K27me3 mark on the majority of fibroblast-specific regulators examined (Supplementary Fig. 20). In contrast, H3K27me3 was depleted to a greater extent on SOX2 and E-cadherin promoters, reflecting their activation during reprogramming. Finally, the H3K27me3 status of master regulators of other lineages, such as OLIG2, MYOD1, NKX2-1 and GATA4, remained unchanged upon DOT1L inhibitor treatment, indicating that the deposition of H3K27me3 was specific to fibroblast-specific regulators.

Figure 4 | Genome-wide analysis of H3K79me2 marks during reprogramming. a, H3K79me2 ChIP-sequencing tracks (blue) for select EMT-associated genes in fibroblasts (Fib) and ESCs along with the corresponding H3K27me3 tracks in ESCs (red). b, Expression of EMT-associated transcription factors (EMT-TF) and epithelial genes in control and iDot1L-treated fibroblasts at the indicated time points during reprogramming. qPCR was normalized to uninfected fibroblasts for EMT-TFs and H1 ESCs for CDH1 and OCLN. c, Number of Tra-1-60+ colonies derived from untreated and iDot1L-treated (3.3 μM) dH1f cells that are either infected with SNAI1, TWIST1 or ZEB1 expression vectors or treated with soluble TGF-β2 (2 ng ml−1) (n = 3; error bars, ± s.d.). Representative Tra-1-60-stained wells are shown. d, qRT–PCR quantification of NANOG mRNA level on day 6 of OSKM-expressing untreated or iDot1L-treated (3.3 μM) fibroblasts expressing the indicated EMT-factors. Expression levels were normalized to those observed in H1 ESCs. e, qRT–PCR quantification of LIN28A mRNA level on day 6 of OSKM-expressing untreated or iDot1L-treated (3.3 μM) fibroblasts expressing the indicated EMT-factors. Expression levels were normalized to those observed in H1 ESCs.

To test the functional importance of downregulation of mesenchymal regulators in the iDot1L-mediated enhancement of reprogramming, we overexpressed TWIST1, SNAI1 and ZEB1 or added soluble TGF-β2 to cells undergoing reprogramming in the presence of the DOT1L inhibitor. All of these perturbations significantly counteracted the enhancement observed with DOT1L inhibition (Fig. 4c). Interestingly, expression of these factors also abrogated the iDot1L-mediated upregulation of NANOG and LIN28, suggesting that the effect of DOT1L inhibition on these two pluripotency genes is likely to be indirect (Fig. 4d, e). Conversely, we tested whether destabilization of the mesenchymal state by inhibition of TGF-β signalling would be redundant with DOT1L inhibition. A small molecule inhibitor of TGF-β signalling (SB431542) increased reprogramming efficiency, but in combination with the DOT1L inhibitor, showed no significant further increase in iPSC colonies (Supplementary Fig. 21). Taken together these data indicate that in fibroblasts, downregulation of the mesenchymal gene expression program is critical to enhancement of reprogramming by DOT1L inhibition.

Our loss-of-function survey indicates that chromatin-modifying enzymes play critical roles for both reactivating silenced loci as well as reinstating closed domains of heterochromatin during the global epigenetic remodelling of differentiated cells to pluripotency, thus implicating specific enzymes as facilitators or barriers to cell fate transitions. DOT1L inhibition seems to enhance reprogramming at least in part by facilitating loss of H3K79me2 from fibroblast genes whose silencing is required for reprogramming (Supplementary Fig. 22). Interestingly, KLF4, which can be replaced by DOT1L inhibition, has been shown to facilitate a mesenchymal to epithelial transition (MET) by inducing E-cadherin expression²⁰. Persistent H3K79me2 at the fibroblast master regulators during the initial phases of reprogramming seems to prevent shutdown of these genes, thus hindering the acquisition of an epithelial phenotype concomitant with delayed activation of NANOG and LIN28. In this regard H3K79me2 acts as a barrier to efficient repression of the somatic program by the reprogramming factors. This notion is consistent with the role of Dot1 in yeast, where it antagonizes gene repression²¹. As reprogramming of blood cells is also enhanced by DOT1L inhibition, we speculate that DOT1L inhibition may enhance reprogramming in a broad range of cell types by facilitating the silencing of lineage-specific programs of gene expression. Finally, our results also demonstrate that specific chromatin modifiers can be modulated to generate iPSCs more efficiently and with fewer exogenously introduced transcription factors.

METHODS SUMMARY
shRNAs were designed using the RNAi Codex²². 97-mer oligonucleotides (Supplementary Table 1) were PCR-amplified and cloned into the MSCV-PM²³ vector. Reprogramming assays were carried out with either retroviral⁴ or lentiviral¹⁶ reprogramming vectors. dH1f cells were previously described⁴. For gene expression analyses, total RNA was extracted from two or three independent culture plates for each condition and transcriptional profiling was performed using Affymetrix U133A microarrays. ChIP-seq was performed as described with slight modifications¹².

Full Methods and any associated references are available in the online version of the paper at www.nature.com/nature.

Received 16 May 2011; accepted 16 February 2012.
Published online 4 March; corrected 28 March 2012 (see full-text HTML version for details).

1. Takahashi, K. & Yamanaka, S. Induction of pluripotent stem cells from mouse embryonic and adult fibroblast cultures by defined factors. Cell 126, 663–676 (2006).
2. Hawkins, R. D. et al. Distinct epigenomic landscapes of pluripotent and lineage-committed human cells. Cell Stem Cell 6, 479–491 (2010).
3. Mikkelsen, T. S. et al. Genome-wide maps of chromatin state in pluripotent and lineage-committed cells. Nature 448, 553–560 (2007).
4. Park, I.-H. et al. Reprogramming of human somatic cells to pluripotency with defined factors. Nature 451, 141–146 (2008).
5. Margueron, R. & Reinberg, D. The Polycomb complex PRC2 and its mark in life. Nature 469, 343–349 (2011).
6. Pereira, C. F. et al. ESCs require PRC2 to direct the successful reprogramming of differentiated cells toward pluripotency. Cell Stem Cell 6, 547–556 (2010).
7. Shi, Y. et al. Transcriptional repression by YY1, a human GLI-Krüppel-related protein, and relief of repression by adenovirus E1A protein. Cell 67, 377–388 (1991).
8. Schotta, G., Ebert, A. & Reuter, G. SU(VAR)3-9 is a conserved key function in heterochromatic gene silencing. Genetica 117, 149–158 (2003).
9. Jones, B. et al. The histone H3K79 methyltransferase Dot1L is essential for mammalian development and heterochromatin structure. PLoS Genet. 4, e1000190 (2008).
10. Okada, Y. et al. hDOT1L links histone methylation to leukemogenesis. Cell 121, 167–178 (2005).
11. Daigle, S. R. et al. Selective killing of mixed lineage leukemia cells by a potent small-molecule DOT1L inhibitor. Cancer Cell 20, 53–65 (2011).
12. Bernt, K. M. et al. MLL-rearranged leukemia is dependent on aberrant H3K79 methylation by DOT1L. Cancer Cell 20, 66–78 (2011).
13. Carey, B. W. et al. Single-gene transgenic mouse strains for reprogramming adult somatic cells. Nature Methods 7, 56–59 (2010).
14. Boyer, L. A. et al. Core transcriptional regulatory circuitry in human embryonic stem cells. Cell 122, 947–956 (2005).
15. Mikkelsen, T. S. et al. Dissecting direct reprogramming through integrative genomic analysis. Nature 454, 49–55 (2008).
16. Yu, J. et al. Induced pluripotent stem cell lines derived from human somatic cells. Science 318, 1917–1920 (2007).
17. Charafe-Jauffret, E. et al. Gene expression profiling of breast cell lines identifies potential new basal markers. Oncogene 25, 2273–2284 (2006).
18. Onder, T. T. et al. Loss of E-cadherin promotes metastasis via multiple downstream transcriptional pathways. Cancer Res. 68, 3645–3654 (2008).
19. Taube, J. H. et al. Core epithelial-to-mesenchymal transition interactome gene-expression signature is associated with claudin-low and metaplastic breast cancer subtypes. Proc. Natl Acad. Sci. USA 107, 15449–15454 (2010).
20. Samavarchi-Tehrani, P. et al. Functional genomics reveals a BMP-driven mesenchymal-to-epithelial transition in the initiation of somatic cell reprogramming. Cell Stem Cell 7, 64–77 (2010).
21. Stulemeijer, I. J. et al. Dot1 binding induces chromatin rearrangements by histone methylation-dependent and -independent mechanisms. Epigenetics Chromatin 4, 2 (2011).
22. Olson, A. et al. RNAi Codex: a portal/database for short-hairpin RNA (shRNA) gene-silencing constructs. Nucleic Acids Res. 34, D153–D157 (2006).
23. Schlabach, M. R. et al. Cancer proliferation gene discovery through functional genomics. Science 319, 620–624 (2008).

Supplementary Information is linked to the online version of the paper at www.nature.com/nature.

Acknowledgements We thank G. Hu and S. J. Elledge for providing the MSCV-PM vector, K. Ng and M. W. Lensch for teratoma injections and assessment and S. Loewer for discussions. We also thank E. Olhava and Epizyme Inc. for synthesizing and providing the DOT1L inhibitor, EPZ004777. G.Q.D. is an investigator of the Howard Hughes Medical Institute. Research was funded by grants from the US National Institutes of Health (NIH) to S.A.A. (CA140575) and G.Q.D., and the CHB Stem Cell Program.

Author Contributions T.T.O. performed project planning, experimental work, data interpretation and preparation of the manuscript. N.K., A.C., N.Z., J.U. and B.O.M. performed experimental work. P.C. and A.U.S. participated in data analysis. K.M.B. and S.A.A. provided critical materials and participated in the preparation of the manuscript. P.B.G. and E.S.L. participated in data acquisition, data interpretation and preparation of the manuscript. G.Q.D. supervised research and participated in project planning, data interpretation and preparation of the manuscript.

Author Information The microarray and ChIP-seq data have been deposited in the National Center for Biotechnology Information Gene Expression Omnibus (GEO) and are accessible through GEO Series accession numbers GSE29253 and GSE35791. Reprints and permissions information is available at www.nature.com/reprints. The authors declare competing financial interests: details accompany the full-text HTML version of the paper at www.nature.com/nature. Readers are welcome to comment on the online version of this article at www.nature.com/nature. Correspondence and requests for materials should be addressed to G.Q.D. (George.Daley@childrens.harvard.edu).

[Figure 1 labels recovered from the figure: y-axis, fold change in Tra-1-60+ colonies; shRNA pools, SUV39H1, YY1, DOT1L, DNMT3A, MECP2, NR2F1, DNMT1, SMYD2, CNTRL, MBD2, MBD4, EZH1, SUV39H2, MBD1, G9A, MBD3, SETDB1, OCT4, BMI1, RING1, SUZ12, EHMT1, EZH2, EED; validation knockdowns, shCntrl, shSETDB1, shBmi1, shRing1, shEed, shEzh2, shSuz12.]

ARTICLE
First published in Nature 488, 43–48 (2012); doi: 10.1038/nature11213


Novel mutations target distinct subgroups of medulloblastoma


Giles Robinson¹,²,³*, Matthew Parker¹,⁴*, Tanya A. Kranenburg¹,²*, Charles Lu¹,⁵, Xiang Chen¹,⁴, Li Ding¹,⁵,⁶, Timothy N. Phoenix¹,², Erin Hedlund¹,⁴, Lei Wei¹,⁴,⁷, Xiaoyan Zhu¹,², Nader Chalhoub¹,², Suzanne J. Baker¹,², Robert Huether¹,⁴,⁸, Richard Kriwacki¹,⁸, Natasha Curley¹,², Radhika Thiruvenkatam¹,², Jianmin Wang¹,⁹, Gang Wu¹,⁴, Michael Rusch¹,⁴, Xin Hong¹,⁵, Jared Becksfort¹,⁹, Pankaj Gupta¹,⁹, Jing Ma¹,⁷, John Easton¹,⁴, Bhavin Vadodaria¹,⁴, Arzu Onar-Thomas¹,¹⁰, Tong Lin¹,¹⁰, Shaoyi Li¹,¹⁰, Stanley Pounds¹,¹⁰, Steven Paugh¹,¹¹, David Zhao¹,⁹, Daisuke Kawauchi¹,¹², Martine F. Roussel¹,¹², David Finkelstein¹,⁴, David W. Ellison¹,⁷, Ching C. Lau¹,¹³, Eric Bouffet¹,¹⁴, Tim Hassall¹,¹⁵, Sridharan Gururangan¹,¹⁶, Richard Cohn¹,¹⁷, Robert S. Fulton¹,⁵,⁶, Lucinda L. Fulton¹,⁵,⁶, David J. Dooling¹,⁵,⁶, Kerri Ochoa¹,⁵,⁶, Amar Gajjar¹,³, Elaine R. Mardis¹,⁵,⁶,¹⁸, Richard K. Wilson¹,⁵,⁶,¹⁹, James R. Downing¹,⁷, Jinghui Zhang¹,⁴ & Richard J. Gilbertson¹,²,³

Medulloblastoma is a malignant childhood brain tumour comprising four discrete subgroups. Here, to identify mutations that drive medulloblastoma, we sequenced the entire genomes of 37 tumours and matched normal blood. One-hundred and thirty-six genes harbouring somatic mutations in this discovery set were sequenced in an additional 56 medulloblastomas. Recurrent mutations were detected in 41 genes not yet implicated in medulloblastoma; several target distinct components of the epigenetic machinery in different disease subgroups, such as regulators of H3K27 and H3K4 trimethylation in subgroups 3 and 4 (for example, KDM6A and ZMYM3), and CTNNB1-associated chromatin re-modellers in WNT-subgroup tumours (for example, SMARCA4 and CREBBP). Modelling of mutations in mouse lower rhombic lip progenitors that generate WNT-subgroup tumours identified genes that maintain this cell lineage (DDX3X), as well as mutated genes that initiate (CDH1) or cooperate (PIK3CA) in tumorigenesis. These data provide important new insights into the pathogenesis of medulloblastoma subgroups and highlight targets for therapeutic development.

Medulloblastoma is the most common malignant childhood brain tumour¹. The disease includes four subgroups (sonic hedgehog (SHH) subgroup, WNT subgroup, subgroup 3 and subgroup 4), defined primarily by gene expression profiling, that show differences in karyotype, histology and prognosis². Studies of genetically engineered mice show that these tumours arise from different cell types: SHH-subgroup medulloblastomas develop from committed cerebellar granule neuron progenitors (GNPs) in Ptch1+/− mice³,⁴; WNT-subgroup tumours are generated by lower rhombic lip progenitors (LRLPs) in Blbp-Cre;Ctnnb1+/lox(Ex3);Tp53flx/flx mice⁵; whereas subgroup-3 medulloblastomas probably arise from an undefined class of cerebellar progenitors⁶. The identification of medulloblastoma subgroups has not changed clinical practice. All patients currently receive the same combination of surgery, radiation and chemotherapy. This aggressive treatment fails to cure two thirds of patients with subgroup-3 disease, and probably over-treats children with WNT-subgroup medulloblastoma who invariably survive with long-term cognitive and endocrine side effects²,⁷. Drugs targeting the genetic alterations that drive each medulloblastoma subgroup could prove more effective and less toxic, but the identity of these alterations remains largely unknown.

The genomic landscape of medulloblastoma


To identify genetic alterations that drive medulloblastoma, we performed whole-genome sequencing (WGS) of DNA from 37 tumours and matched normal blood (discovery cohort). Tumours were subgrouped by gene expression (WNT subgroup, n = 5; SHH subgroup, n = 5; subgroup 3, n = 6; subgroup 4, n = 19; unclassified (profiles not available), n = 2; Fig. 1, Supplementary Figs 1–3 and Supplementary Table 1). Validation of all putative somatic alterations including single nucleotide variations (SNVs), insertion/deletions (indels) and structural variations (SVs) identified by CREST8, was conducted for 12 tumours using custom capture arrays and Illumina-based DNA sequencing (Supplementary Table 2). Putative coding alterations and SVs were validated in the remaining 25 discovery cohort cases by polymerase chain reaction (PCR) and Sanger-based sequencing. Mutation frequency was determined in a separate validation cohort of 56 medulloblastomas (WNT subgroup, n = 6; SHH subgroup, n = 8; subgroup 3, n = 11; subgroup 4, n = 19; unclassified, n = 12; Fig. 1 and Supplementary Table 1). WGS of the discovery cohort detected 22,887 validated or high-quality somatic sequence mutations (SNVs and indels), 536 validated or curated SVs, and 5,802 copy number variations (CNVs; 92%

1St Jude Children's Research Hospital, Washington University Pediatric Cancer Genome Project, Memphis, Tennessee 38105, USA. 2Department of Developmental Neurobiology, St Jude Children's Research Hospital, Memphis, Tennessee 38105, USA. 3Department of Oncology, St Jude Children's Research Hospital, Memphis, Tennessee 38105, USA. 4Department of Computational Biology and Bioinformatics, St Jude Children's Research Hospital, Memphis, Tennessee 38105, USA. 5The Genome Institute, Washington University School of Medicine in St Louis, St Louis, Missouri 63108, USA. 6Department of Genetics, Washington University School of Medicine in St Louis, St Louis, Missouri 63108, USA. 7Department of Pathology, St Jude Children's Research Hospital, Memphis, Tennessee 38105, USA. 8Department of Structural Biology, St Jude Children's Research Hospital, Memphis, Tennessee 38105, USA. 9Department of Information Sciences, St Jude Children's Research Hospital, Memphis, Tennessee 38105, USA. 10Department of Biostatistics, St Jude Children's Research Hospital, Memphis, Tennessee 38105, USA. 11Department of Pharmaceutical Sciences, St Jude Children's Research Hospital, Memphis, Tennessee 38105, USA. 12Department of Tumour Biology and Genetics, St Jude Children's Research Hospital, Memphis, Tennessee 38105, USA. 13Texas Children's Cancer and Hematology Centers, 6701 Fannin Street, Ste. 1420, Houston, Texas 77030, USA. 14The Hospital for Sick Children, 555 University Avenue, Toronto, Ontario M5G 1X8, Canada. 15The Royal Children's Hospital, 50 Flemington Road, Parkville, Victoria 3052, Australia. 16Duke University Medical Center, 102382, Durham, North Carolina 27710, USA. 17The School of Women's and Children's Health, University of New South Wales, Kensington, New South Wales NSW 2052, Australia. 18Siteman Cancer Center, Washington University School of Medicine in St Louis, St Louis, Missouri 63108, USA. 19Department of Medicine, Washington University School of Medicine in St Louis, St Louis, Missouri 63108, USA.
*These authors contributed equally to this work.
2 AUGUST 2012 | VOL 488 | NATURE | 43


concordant with 6.0 SNP mapping arrays; Supplementary Tables 3–6 and Supplementary Figs 4–7). In all but five tumours with the highest mutation rates, >50% of SNVs were C→T/G→A transitions (Supplementary Fig. 8). The mean missense:silent mutation ratio was 3.6:1 and 40% of all missense mutations were predicted to be deleterious, suggesting a selective pressure for SNVs that affect protein coding (Supplementary Table 5). Global patterns of total SNVs and amplifications varied significantly among medulloblastoma subgroups, even when corrected for age and sex, supporting the notion that these tumours are distinct pathological entities (Fig. 1 and Supplementary Fig. 6). Custom capture-based analysis of the allele frequency of all somatic mutations in 12 medulloblastomas allowed us to predict the ancestry of certain genetic alterations, suggesting that aneuploidy precedes widespread sequence mutation in medulloblastomas with highly mutated genomes (Supplementary Figs 9–11).

Figure 1 | The genomic landscape of medulloblastoma. Top, clinical, histological, gross chromosomal, nuclear CTNNB1 (nCTNNB1) and cohort (discovery or validation) details of 79 medulloblastomas by subgroup. ER, enrichment. Bottom, genetic alterations detected in 27 genes of particular interest. Colour key at top right. ANOVA (continuous) or Fisher's exact test (categorical) P value is shown on right. False discovery rate (FDR) estimates of each mutation are shown on right. Slash indicates loss of wild-type or mutation allele, including X chromosome in males. ***P < 0.0005; **P < 0.005; *P < 0.05; NS, not significant. F, female; M, male. amp., amplification; del., deletion; microdel., microdeletion; valid., validation cohort. ND, not done.

Figure 2 | Overview of mutations and potential cooperative interactions in NHL. This heat map displays possible trends towards co-occurrence (red) and mutual exclusion (blue) of somatic mutations and structural rearrangements. Colours were assigned by taking the minimum value of a left- and right-tailed Fisher's exact test. To capture trends a P-value threshold of 0.3 was used, with the darkest shade of the colour indicating those meeting statistical significance (P ≤ 0.05). The relative frequency of mutations in ABC (blue), GCB (red), unclassifiable (black) DLBCLs and FL (yellow) cases is shown on the left. Genes were arranged with those having significant (P < 0.05, Fisher's exact test) enrichment for mutations in ABC cases (blue triangle) towards the top (and left) and those with significant enrichment for mutations in GCB cases (red triangle) towards the bottom (and right). The total number of cases in which each gene contained either cSNVs or confirmed somatic mutations is shown at the top. The cluster of blue squares (upper-right) results from the mutual exclusion of the ABC-enriched mutations (for example, MYD88, CD79B) from the GCB-enriched mutations (for example, EZH2, GNA13). Presence of structural rearrangements involving the two oncogenes BCL6 and BCL2 (indicated as BCL6s and BCL2s) was determined with FISH techniques using break-apart probes (Supplementary Methods).

chromosome (BAC) clone sequencing in eight FL cases to show that in all eight cases the mutations were in trans, affecting both MLL2 alleles. This observation is consistent with the notion that there is a complete, or near-complete, loss of MLL2 in the tumour cells of such patients. With the exception of two primary FL cases and two DLBCL cell lines (Pfeiffer and SU-DHL-9), the majority of MLL2 mutations seemed to be heterozygous. Analysis of Affymetrix 500k SNP array data from two FL cases with apparent homozygous mutations revealed that both tumours showed copy number neutral loss of heterozygosity (LOH) for the region of chromosome 12 containing MLL2 (Supplementary Methods). Thus, in addition to bi-allelic mutation, LOH is a second, albeit less common mechanism by which MLL2 function is lost.

MLL2 was the most frequently mutated gene in FL, and among the most frequently mutated genes in DLBCL (Fig. 2). We confirmed MLL2 mutations in 31 of 35 FL patients (89%), in 12 of 37 DLBCL patients (32%), in 10 of 17 DLBCL cell lines (59%) and in none of the eight normal centroblast samples we sequenced. Our analysis predicted that the majority of the somatic mutations observed in MLL2 were inactivating (91% disrupted the reading frame or were truncating point mutations), indicating to us that MLL2 is a tumour suppressor of significance in NHL.

Recurrent point mutations in MEF2B

Our selective pressure analysis also revealed genes with stronger pressure for acquisition of amino acid substitutions than for nonsense mutations. One such gene was MEF2B, which had not previously been linked to lymphoma. We found that 20 (15.7%) cases had MEF2B cSNVs and 4 (3.1%) cases had MEF2C cSNVs. All cSNVs detected by RNA-seq affected either the MADS box or MEF2 domains. To determine the frequency and scope of MEF2B mutations, we Sanger-sequenced exons 2 and 3 in 261 primary FL samples; 259 DLBCL primary tumours; 17 cell lines; 35 cases of assorted NHL (IBL, composite FL and PBMCL); and eight non-malignant centroblast samples. We also used a capture strategy (Supplementary Methods) to sequence the entire MEF2B coding region in the 261 FL samples, revealing six additional variants outside exons 2 and 3. We thus identified 69 cases (34 DLBCL, 12.67%; and 35 FL, 15.33%) with MEF2B cSNVs or indels, failing to observe novel variants in other NHL and non-malignant samples. Of the variants 55 (80%) affected residues within the MADS box and MEF2 domains encoded by exons 2 and 3 (Supplementary Table 11; Fig. 3b). Each patient generally had a single MEF2B variant and we observed relatively few (eight in total, 10.7%) truncation-inducing SNVs or indels. Non-synonymous SNVs were by far the most common type of change observed, with 59.4% of detected variants affecting K4, Y69, N81 or D83. In 12 cases MEF2B mutations were shown to be somatic, including representative mutations at each of K4, Y69, N81 and D83 (Supplementary Table 12). We did not detect mutations in ABC cases, indicating that somatic mutations in MEF2B have a role unique to the development of GCB DLBCL and FL (Fig. 2).

Figure 3 | Summary and effect of somatic mutations affecting MLL2 and MEF2B. a, Re-sequencing the MLL2 locus in 89 samples revealed mainly nonsense (red circles) and frameshift-inducing indel mutations (orange triangles; inverted triangles for insertions and upright triangles for deletions). A smaller number of non-synonymous somatic mutations (green circles) and point mutations or deletions affecting splice sites (yellow stars) were also observed. All of the non-synonymous point mutations affected a residue within either the catalytic SET domain, the FYRC domain (FY-rich carboxy-terminal domain) or PHD zinc finger domains. The effect of these splice-site mutations on MLL2 splicing was also explored (Supplementary Figure 7). b, The cSNVs and somatic mutations found in MEF2B in all FL and DLBCL cases sequenced are shown with the same symbols. Only the amino acids with variants in at least two patients are labelled. cSNVs were most prevalent in the first two protein-coding exons of MEF2B (exons 2 and 3). The crystal structure of MEF2 bound to EP300 supports the idea that two of the mutated sites (L67 and Y69) are important in the interaction between these proteins (Supplementary Figure 8 and Supplementary Discussion).

Table 2 | Summary of types of MLL2 somatic mutations

Sample type                     FL      DLBCL   DLBCL cell-line   Centroblast
Truncation                      18      4       7                 0
Indel with frameshift           22      8       6                 0
Splice site                     4       2       0                 0
SNV                             3       2       2                 0
Any mutation/number of cases    31/35   12/37   10/17             0/8
Percentage                      89      32      59                0

Novel CNVs and SVs are rare in medulloblastoma

The repertoire of focally amplified or deleted genes seems to be very limited in medulloblastoma. We detected expected gains2 of MYC, MYCN and OTX2 in subgroups 3 and 4, but no novel recurrent amplifications (Fig. 1, Supplementary Fig. 12 and Supplementary Table 7). In keeping with recent reports9, high-level amplification of MYCN in subgroup-3 sample no. 16 (sample numbering as in Fig. 1) was generated by chromothripsis; although chromothripsis was observed infrequently (n = 2/37 of the discovery cohort; Supplementary Fig. 13). Focal homo- or heterozygous deletions of genes previously implicated in medulloblastoma were also detected (for example, PTCH1, PTEN; Fig. 1)10,11 but novel recurrent focal deletions were rare. Three subgroup-4 tumours (nos 11–13) and one unclassified tumour deleted DDX31, AK8 and TSC1 at chromosome 9q34.14 in concert with OTX2 amplification, suggesting that these alterations are cooperative (P < 0.0005, Fisher's exact test). The breakpoint in this deletion occurs in DDX31, and two samples contained a missense mutation (subgroup 4, no. 15) and complex rearrangement (unidentified case SJMB026) in this gene, suggesting that DDX31 is the target of these alterations (Supplementary Fig. 14).

Over 50% of SVs detected by WGS broke the coding region of at least one gene, but less than 2% (n = 6/314, excluding two tumours with excessive SVs) encode potential in-frame fusion proteins (Supplementary Fig. 15); none affect the same gene or signal pathway. Therefore, fusion proteins are likely to be an uncommon transforming mechanism in medulloblastoma.

Although germline mutations in TP53, PTCH1, APC and CREBBP predispose to medulloblastoma11–14, only 23 mutations previously associated with cancer were detected in discovery cohort germ lines. Only one of these, a known case of Turcot's syndrome, was accompanied by a somatic mutation (germline APC Y935*/somatic deletion; WNT subgroup no. 11; Supplementary Table 8). Thus, inherited forms of medulloblastoma seem to be rare in our cohort.

Novel mutations in medulloblastoma subgroups

Because SVs and CNVs are unlikely to drive most medulloblastomas, we investigated whether recurrent (more than two samples) somatic SNVs and/or indels might target discrete genes and pathways. This analysis identified 49 genes, across all 93 tumours, which were targeted by non-silent, recurrent, somatic mutations; 84% (n = 41/49) have not yet been implicated in medulloblastoma (Supplementary Tables 9 and 10). Several of these congregated in single disease subgroups and converged on specific cell pathways (Fig. 1, Supplementary Fig. 8 and Supplementary Table 11).

Histone methylation is deregulated in subgroups 3 and 4

The H3K27 trimethyl mark (H3K27me3) represses lineage-specific genes in stem cells15 (Supplementary Fig. 8). H3K27me3 is written by the polycomb repressive complex 2 (PRC2) that includes the methylase EZH2 (refs 16, 17) and is erased during differentiation by the demethylase KDM6A18. As H3K27me3 is erased, chromatin remodellers recruited to H3K4me3 promote differentiation, for example, CHD7 (refs 19, 20). This process is tightly controlled during development and deregulated in cancers; EZH2 is mutated in lymphomas21 and upregulated in breast22 and prostate cancer23, while biallelic inactivation of KDM6A (chromosome Xp11.2) or KDM6A and its paralogue UTY (chromosome Yq11), occurs in adult female and male cancers, respectively24. Hypergeometric distribution analyses revealed selective mutation of histone modifiers in subgroup-3 and -4 medulloblastomas (Supplementary Table 11). Six subgroup-4, one subgroup-3, and

one unclassified medulloblastoma contained novel inactivating mutations in KDM6A (Figs 1 and 2 and Supplementary Figs 8 and 16). The single female with a KDM6A splice-site mutation showed a deletion of the second allele that escapes X inactivation25 (subgroup 4, no. 15), and 57% (n = 4/7) of KDM6A-mutant male medulloblastomas deleted chromosome Y, compared with only 6% (n = 3/51) of male, KDM6A wild-type tumours (P < 0.005, Fisher's exact test; Fig. 1). Thus, a two-hit model of KDM6A-UTY tumour suppression seems to operate in subgroup-4 medulloblastomas. Notably, mutations in six other KDM family members (KDM1A, KDM3A, KDM4C, KDM5A, KDM5B and KDM7A) were detected exclusively in subgroup-3 and -4 tumours, implicating broad disruption of lysine demethylation in these medulloblastomas (Fig. 1, Supplementary Table 11 and Supplementary Fig. 16).

Subgroup-3 and -4 medulloblastomas also gained and overexpressed EZH2 (chromosome 7q35-34), which writes H3K27me3, and contained novel inactivating mutations in effectors and regulators of the H3K4me3 mark (Fig. 2a and Supplementary Fig. 8). Gain of chromosome 7q was significantly enriched among subgroup-3 and -4 medulloblastomas (P < 0.005, Fisher's exact test) and correlated directly with EZH2 expression. Indeed, EZH2 was the eighth most significantly overexpressed gene on chromosome 7 among subgroup-3 and -4 medulloblastomas that gained chromosome 7q relative to those with diploid chromosome 7 (P < 0.005, Bonferroni correction). Nonsense and frameshift mutations were detected in CHD7 in four subgroup-3 and -4 tumours. ZMYM3 (chromosome Xq13.1), which participates in a protein complex with KDM1A to regulate gene expression at the H3K4me3 mark27, was targeted by novel frameshift, nonsense and missense mutations in three male subgroup-4 medulloblastomas. All three tumours with mutations in ZMYM3 also mutated KDM6A (subgroup 4, nos 19, 20) or KDM1A (subgroup 4, no. 21), suggesting that these alterations are cooperative. Remarkably, KDM6A, CHD7 and ZMYM3 mutations were confined to subgroups 3 and 4, and clustered in samples with sub-median EZH2 expression levels (Fig. 2a; P < 0.05, Fisher's exact test). These data suggest that subgroup-3 and -4 medulloblastomas retain a stem-like epigenetic state by aberrantly writing (EZH2 upregulation) or preserving (KDM6A-UTY inactivation) H3K27me3, or disrupting H3K4me3-associated transcription (CHD7 and ZMYM3 inactivation). Indeed, human and mouse subgroup-3 and -4 medulloblastomas contained significantly more H3K27me3 than did WNT- or SHH-subgroup tumours (Fig. 2b). Thus, gain of EZH2 and loss of KDM6A probably maintains H3K27me3 in subgroup-3 and -4 medulloblastomas.

Finally, we looked to see if the differential expression of H3K27me3 among medulloblastoma subgroups reflects ancestral chromatin marking in the progenitors that generate these tumours (Fig. 2b). Relatively low levels of H3K27me3 were detected in LRLPs and committed GNPs, which generate WNT- and SHH-subgroup medulloblastomas, respectively3–5, potentially explaining why mutations that preserve this epigenetic mark are absent from these tumours. We recently showed that subgroup-3 medulloblastomas arise from a rare fraction of cerebellar progenitors6. We are currently investigating whether these progenitors are found among the H3K27me3-positive cells seen in the external germinal layer (Fig. 2b).

Figure 2 | Deregulation of H3K27me3 in subgroup-3 and -4 human and mouse medulloblastoma. a, Top row, SNP profiles of chromosome 7 (Chr 7) copy number in medulloblastomas (samples as Fig. 1; asterisk indicates subgroup-3 cases). Second row, expression of EZH2. Subgroup-3 and -4 tumours are ordered left to right by expression level, dagger indicates median expression point (Bonferroni-corrected P value of EZH2 expression versus chromosome 7 gain). Third row, mutation status of KDM6A, CHD7 and ZMYM3 (P value, Fisher's exact test mutations versus EZH2 expression). Fourth row, H3K27me3 immunohistochemistry (numbers indicate colorimetry, P value ANOVA). GP4-7 indicates case subgroup-4, no. 7. a.u., arbitrary units. N/A, not available. b, H3K27me3 expression in mouse Blbp-Cre;Ctnnb1+/lox(Ex3);Tp53flx/flx (WNT-subgroup), Ptch1+/−;Tp53−/− (SHH-subgroup) and Myc;Ink4c−/− (subgroup-3) medulloblastomas (right) and developing hindbrain (left). High-power views of E14.5 LRL (i) and upper rhombic lip (URL) (ii). EGL, external germinal layer; IGL, internal granule layer. Scale bar, 50 μm. White arrows in P7 cerebellum (CB) pinpoint H3K27me3-positive cells in the EGL.

Table 1 | Overview of cSNVs and confirmed somatic mutations in most frequently mutated genes

Gene            Cases (NS S T)   Total (NS S T)   Somatic cSNVs (RNA-seq cohort)*   P (raw)        q              NS SP    T SP    Skew (M, WT, both)†
MLL2‡           16  8  17        17  8  18        10                                6.85 × 10⁻⁸    8.50 × 10⁻⁷    0.834    14.4    WT
TNFRSF14 (G)‡   7   1  7         8   1  7         11                                6.85 × 10⁻⁸    8.50 × 10⁻⁷    7.52     118     Both
SGK1 (G)‡       18  6  6         37  10 6         9                                 6.85 × 10⁻⁸    8.50 × 10⁻⁷    19.5     61.7    –
BCL10‡          2   0  4         3   0  4         4                                 6.85 × 10⁻⁸    8.50 × 10⁻⁷    3.62     112     WT
GNA13 (G)‡      21  1  2         33  1  2         5                                 6.85 × 10⁻⁸    8.50 × 10⁻⁷    24.1     25.7    Both
TP53 (G)‡       20  2  1         23  3  1         22                                6.85 × 10⁻⁸    8.50 × 10⁻⁷    15.6     14.1    Both
EZH2 (G)‡       33  0  0         33  0  0         33                                6.85 × 10⁻⁸    8.50 × 10⁻⁷    11.4     0.00    Both
BTG2‡           12  6  1         14  6  1         2                                 6.85 × 10⁻⁸    8.50 × 10⁻⁷    23.9     35.1    –
BCL2 (G)‡       42  45 0         96  105 0        43                                9.35 × 10⁻⁸    8.50 × 10⁻⁷    3.78     0.00    M
BCL6‡§          11  2  0         12  2  0         2                                 9.35 × 10⁻⁸    8.50 × 10⁻⁷    0.175    0.00    M
CIITA‡§         5   3  0         6   3  0         2                                 9.35 × 10⁻⁸    8.50 × 10⁻⁷    0.086    0.00    –
FAS‡            2   0  4         3   0  4         2                                 1.52 × 10⁻⁷    1.17 × 10⁻⁶    2.54     66.5    WT
BTG1‡           11  6  2         11  7  2         10                                1.52 × 10⁻⁷    1.17 × 10⁻⁶    17.5     52.5    Both
MEF2B (G)‡      20  2  0         20  2  0         10                                2.05 × 10⁻⁷    1.47 × 10⁻⁶    14.2     0.00    M
IRF8‡           11  5  3         14  5  3         3                                 4.55 × 10⁻⁷    3.03 × 10⁻⁶    8.82     28.2    WT
TMEM30A‡        1   0  4         1   0  4         4                                 6.06 × 10⁻⁷    3.79 × 10⁻⁶    0.785    65.0    WT
CD58‡           2   0  3         2   0  3         2                                 2.42 × 10⁻⁶    1.43 × 10⁻⁵    2.29     69.2    –
KLHL6‡          10  2  2         12  2  2         4                                 1.00 × 10⁻⁵    5.26 × 10⁻⁵    5.42     16.4    –
MYD88 (A)‡      13  2  0         14  2  0         9                                 1.00 × 10⁻⁵    5.26 × 10⁻⁵    12.4     0.00    WT
CD70‡           5   0  1         5   0  2         3                                 1.70 × 10⁻⁵    8.48 × 10⁻⁵    7.08     44.0    –
CD79B (A)‡      7   2  1         9   2  1         5                                 2.00 × 10⁻⁵    9.52 × 10⁻⁵    10.9     18.3    M
CCND3‡          7   1  2         7   1  2         6                                 2.80 × 10⁻⁵    1.27 × 10⁻⁴    6.55     36.3    WT
CREBBP‡         20  7  4         24  7  4         9                                 1.00 × 10⁻⁴    4.35 × 10⁻⁴    2.72     6.04    Both
HIST1H1C‡       9   0  0         10  0  0         6                                 1.80 × 10⁻⁴    7.50 × 10⁻⁴    11.9     0.00    Both
B2M‡            7   0  0         7   0  0         4                                 3.90 × 10⁻⁴    1.56 × 10⁻³    16.6     0.00    WT
ETS1‡           10  1  0         10  1  0         4                                 4.10 × 10⁻⁴    1.58 × 10⁻³    5.76     0.00    WT
CARD11‡         14  3  0         14  3  0         3                                 1.90 × 10⁻³    7.04 × 10⁻³    3.37     0.00    Both
FAT2‡§          2   1  0         2   1  0         2                                 6.30 × 10⁻³    2.25 × 10⁻²    0.128    0.00    –
IRF4‡§          9   4  0         26  5  0         5                                 7.00 × 10⁻³    2.41 × 10⁻²    0.569    0.00    Both
FOXO1‡          8   4  0         10  4  0         4                                 7.60 × 10⁻³    2.53 × 10⁻²    4.02     0.00    –
STAT3           9   0  0         9   0  0         4                                 2.19 × 10⁻²    6.08 × 10⁻²    –        –       Both
RAPGEF1         8   3  0         10  3  0         3                                 2.98 × 10⁻²    7.45 × 10⁻²    –        –       WT
ABCA7           12  3  0         15  3  0         2                                 7.76 × 10⁻²    1.67 × 10⁻¹    –        –       WT
RNF213          10  8  0         10  8  0         2                                 7.87 × 10⁻²    1.67 × 10⁻¹    –        –       –
MUC16           17  12 0         39  25 0         2                                 8.32 × 10⁻²    1.73 × 10⁻¹    –        –       –
HDAC7           8   4  0         8   4  0         2                                 8.94 × 10⁻²    1.82 × 10⁻¹    –        –       WT
PRKDC           7   3  0         7   4  0         2                                 1.06 × 10⁻¹    2.05 × 10⁻¹    –        –       –
SAMD9           9   2  0         9   2  0         2                                 1.79 × 10⁻¹    3.01 × 10⁻¹    –        –       –
TAF1            10  0  0         10  0  0         2                                 3.03 × 10⁻¹    4.74 × 10⁻¹    –        –       –
PIM1            20  19 0         33  34 0         11                                3.40 × 10⁻¹    5.23 × 10⁻¹    –        –       WT
COL4A2          8   2  0         8   2  0         2                                 7.64 × 10⁻¹    8.99 × 10⁻¹    –        –       –
EP300           8   7  1         8   7  1         3                                 9.54 × 10⁻¹    1.00           –        –       WT

Individual cases with non-synonymous (NS), synonymous (S) and truncating (T) mutations and the total number of mutations of each class are shown separately because some genes contained multiple mutations in the same case. The P values indicated in bold are the upper limit on the P value for that gene determined with the approach described in ref. 19 (see Supplementary Methods), q is the Benjamini-corrected q value, and NS SP and T SP refer to selective pressure estimates from this model for the acquisition of non-synonymous or truncating mutations, respectively. Genes marked A or G were found to have mutations significantly enriched in ABC or GCB cases, respectively (P < 0.05, Fisher's exact test).
* Additional somatic mutations identified in larger cohorts and insertion/deletion mutations are not included in this total.
† Both indicates that we observed separate cases in which skewed expression was seen but where this skew was not consistent for the mutant or wild-type allele.
‡ Genes significant at a false discovery rate of 0.03. SNVs in BCL2 and previously confirmed hot spot mutations in EZH2 and CD79B are probably somatic in these samples based on published observations of others.
§ Selective pressure estimates are both < 1 indicating purifying selection rather than positive selection acting on this gene.

SGK1 encodes a phosphatidylinositol-3-OH kinase (PI(3)K)-regulated kinase with functions including regulation of FOXO transcription factors25, regulation of NF-κB by phosphorylating IκB kinase26, and negative regulation of NOTCH signalling27. SGK1 also resides within a region of chromosome 6 commonly deleted in DLBCL (Fig. 1)5. The mechanism by which SGK1 and GNA13 inactivation may contribute to lymphoma is unclear, but the strong degree of apparent selection towards their inactivation and their overall high mutation frequency (each mutated in 18 of 106 DLBCL cases) suggests that their loss contributes to B-cell NHL. Certain genes are known to be mutated more commonly in GCB DLBCLs (for example, TP53 (ref. 28) and EZH2 (ref. 13)). Here, both SGK1 and GNA13 mutations were found only in GCB cases (P = 1.93 × 10⁻³ and 2.28 × 10⁻⁴, Fisher's exact test; n = 15 and 18, respectively) (Fig. 2). Two additional genes (MEF2B and TNFRSF14) with no previously described role in DLBCL showed a similar restriction to GCB cases (Fig. 2).

Inactivating MLL2 mutations

MLL2 showed the most significant evidence for selection and the largest number of nonsense SNVs. Our RNA-seq analysis indicated that 26.0% (33/127) of cases carried at least one MLL2 cSNV. To address the possibility that variable RNA-seq coverage of MLL2 failed to capture some mutations, we PCR-amplified the entire MLL2 locus (~36 kilobases) in 89 cases (35 primary FLs, 17 DLBCL cell lines, and 37 DLBCLs). Of these cases 58 were among the RNA-seq cohort. Illumina amplicon re-sequencing (Supplementary Methods) revealed 78 mutations, confirming the RNA-seq mutations in the overlapping cases and identifying 33 additional mutations. We confirmed the somatic status of 46 variants using Sanger sequencing (Supplementary Table 10), and showed that 20 of the 33 additional mutations were insertions or deletions (indels). Three SNVs at splice sites were also detected, as were 10 new cSNVs that had not been detected by RNA-seq.

The somatic mutations were distributed across MLL2 (Fig. 3a). Of these, 37% (n = 29/78) were nonsense mutations, 46% (n = 36/78) were indels that altered the reading frame, 8% (n = 6/78) were point mutations at splice sites and 9% (n = 7/78) were non-synonymous amino acid substitutions (Table 2). Four of the somatic splice site mutations had effects on MLL2 transcript length and structure. For example, two heterozygous splice site mutations resulted in the use of a novel splice donor site and an intron retention event.

Approximately half of the NHL cases we sequenced had two MLL2 mutations (Supplementary Table 10). We used bacterial artificial

Novel mutations in WNT-subgroup medulloblastomas

WNT-subgroup medulloblastomas contained mutations in epigenetic regulators that are different to those seen in subgroup-3 and -4 disease. CTNNB1, the principal effector of the WNT pathway, forms a transcription factor with the T-cell factor/lymphoid enhancer factor (TCF/LEF)28. The carboxy terminus of CTNNB1 then recruits a series of protein complexes that remodel chromatin and promote transcription at WNT-responsive genes (Supplementary Fig. 8). These include: histone acetyltransferases (for example, CREBBP and TRRAP-TIP60 complexes)28,29; ATPases of the SWI/SNF family (for example, SMARCA4)30; and the mediator complex that coordinates RNA polymerase II placement (for example, MED13)31. As expected, >70% (n = 8/11) of WNT-subgroup medulloblastomas contained mutations that stabilize CTNNB1 (Fig. 1 and Supplementary Fig. 8; P < 0.0001, Fisher's exact test)32,33. A single subgroup-3 case (no. 5)
RESEARCH ARTICLE
also showed a mutation in CTNNB1, but this mutation has not been reported in cancer, did not upregulate nuclear CTNNB1 (Fig. 1) and is of unclear relevance. Remarkably, six WNT-subgroup medulloblastomas showed mutations in chromatin modifiers that are recruited to TCF/LEF WNT-responsive genes by CTNNB1 (Fig. 1 and Supplementary Fig. 8). Four WNT-subgroup tumours contained heterozygous missense mutations in the helicase domain of SMARCA4 (P < 0.002, Fisher's exact test), two samples, including one with a SMARCA4 mutation (no. 5), contained nonsense mutations in CREBBP (WNT-subgroup enrichment, P < 0.02, Fisher's exact test), and missense mutations in TRRAP and MED13 were detected in a single WNT-subgroup medulloblastoma each. Thus, in addition to stabilization of CTNNB1, the development of WNT-subgroup medulloblastoma may require disruption of chromatin remodelling at WNT-responsive genes.

A small number of WNT-subgroup medulloblastomas lack mutations in CTNNB1 or APC, suggesting that alternative mechanisms drive aberrant WNT signals in these tumours. Three WNT-subgroup medulloblastomas in our series contained wild-type CTNNB1 (nos 1, 10 and 11; Fig. 1). Sample no. 11 inactivated APC as the sole case of Turcot's syndrome in our study, but this tumour and sample no. 10 also contained novel missense mutations in CDH1 (R63G, V329F; WNT-subgroup enrichment, P < 0.05, Fisher's exact test; Fig. 1). CDH1 sequesters CTNNB1 at the cell membrane34, and mutations that disrupt this interaction promote WNT signalling in adult cancers35,36. The functional consequences of CDH1(R63G) and CDH1(V329F) remain to be determined, but their restriction to WNT-subgroup tumours, mutual exclusivity with CTNNB1 mutations, and adjacency to residues mutated in breast cancer (http://www.sanger.ac.uk/genetics/CGP/cosmic/), suggest they might promote aberrant WNT signals in medulloblastoma.

We showed previously in mice that mutant Ctnnb1 initiates WNT-subgroup medulloblastoma by arresting the migration of LRLPs from the embryonic dorsal brainstem to the pontine grey nucleus (PGN)5. Therefore, to test whether disruption of CDH1 might substitute for mutant CTNNB1 in medulloblastoma, we used short hairpin (sh)RNAs to knockdown Cdh1 in embryonic day (E)14.5 mouse LRLPs (Fig. 3a–c). Deletion of Cdh1 expression upregulated Tcf/Lef-mediated gene transcription in LRLPs and more than doubled their self-renewal capacity (Fig. 3b). Furthermore, in utero electroporation of LRLPs with Cdh1 shRNAs impeded their migration from the dorsal brainstem to the PGN with an efficiency similar to that of mutant Ctnnb1 (Fig. 3d, e; see Supplementary Methods). These data support the hypothesis that CDH1 suppresses the formation of WNT-subgroup medulloblastoma by regulating WNT-signals in LRLPs.

WNT-subgroup medulloblastomas were also enriched for novel, recurrent somatic missense mutations in the DEAD-box RNA helicase DDX3X at chromosome Xp11.3 (P < 0.0001, Fisher's exact test; Fig. 1). DDX3X regulates several critical cell processes including chromosome segregation37, cell cycle progression38, gene transcription and translation39. Previously reported cancer-associated mutations in DDX3X disrupt the ATPase activity of the protein, but seven of eight mutations identified in our series clustered in the DEAD-box domain (Supplementary Information and Supplementary Fig. 8). Structural modelling predicts that these mutations interfere with nucleic acid binding, possibly altering specificity and/or affinity for RNA substrates, rather than inactivating DDX3X (Supplementary Figs 17–22). Indeed, the wild-type allele of DDX3X that escapes X inactivation25 was retained by two of three DDX3X-mutant female medulloblastomas, and knockdown of Ddx3x halved the self-renewal rate of mouse LRLPs, suggesting that this protein is important for the proliferation and/or maintenance of the LRLP lineage (Fig. 3b).

To understand better the role of DDX3X in WNT-subgroup medulloblastoma, we used our in utero migration assay to assess the impact of Ddx3x shRNAs, mutant Ddx3xT275M (identified in WNT-subgroup sample no. 9), or mutant Ddx3xG325E (WNT sample

NATURE | VOL 488 | 2 AUGUST 2012

chromosome (BAC) clone sequencing in eight FL cases to show that in all eight cases the mutations were in trans, affecting both MLL2 alleles. This observation is consistent with the notion that there is a complete, or near-complete, loss of MLL2 in the tumour cells of such patients.

With the exception of two primary FL cases and two DLBCL cell lines (Pfeiffer and SU-DHL-9), the majority of MLL2 mutations seemed to be heterozygous. Analysis of Affymetrix 500k SNP array data from two FL cases with apparent homozygous mutations revealed that both tumours showed copy number neutral loss of heterozygosity (LOH) for the region of chromosome 12 containing MLL2 (Supplementary Methods). Thus, in addition to bi-allelic mutation, LOH is a second, albeit less common mechanism by which MLL2 function is lost.

MLL2 was the most frequently mutated gene in FL, and among the most frequently mutated genes in DLBCL (Fig. 2). We confirmed MLL2 mutations in 31 of 35 FL patients (89%), in 12 of 37 DLBCL patients (32%), in 10 of 17 DLBCL cell lines (59%) and in none of the eight normal centroblast samples we sequenced. Our analysis predicted that the majority of the somatic mutations observed in MLL2 were inactivating (91% disrupted the reading frame or were truncating point mutations), indicating to us that MLL2 is a tumour suppressor of significance in NHL.

Recurrent point mutations in MEF2B
Our selective pressure analysis also revealed genes with stronger pressure for acquisition of amino acid substitutions than for nonsense

Figure 2 | Overview of mutations and potential cooperative interactions in NHL. This heat map displays possible trends towards co-occurrence (red) and mutual exclusion (blue) of somatic mutations and structural rearrangements. Colours were assigned by taking the minimum value of a left- and right-tailed Fisher's exact test. To capture trends a P-value threshold of 0.3 was used, with the darkest shade of the colour indicating those meeting statistical significance (P ≤ 0.05). The relative frequency of mutations in ABC (blue), GCB (red), unclassifiable (black) DLBCLs and FL (yellow) cases is shown on the left. Genes were arranged with those having significant (P < 0.05, Fisher's exact test) enrichment for mutations in ABC cases (blue triangle) towards the top (and left) and those with significant enrichment for mutations in GCB cases (red triangle) towards the bottom (and right). The total number of cases in which each gene contained either cSNVs or confirmed somatic mutations is shown at the top. The cluster of blue squares (upper-right) results from the mutual exclusion of the ABC-enriched mutations (for example, MYD88, CD79B) from the GCB-enriched mutations (for example, EZH2, GNA13). Presence of structural rearrangements involving the two oncogenes BCL6 and BCL2 (indicated as BCL6s and BCL2s) was determined with FISH techniques using break-apart probes (Supplementary Methods).
[Figure 2, heat map. Label: Cases. Genes on the axes: MYD88, CD79B, BCL6s, TNFAIP3, CARD11, FAS, TMEM30A, CD58, CD70, STAT3, ETS1, HIST1H1C, CCND3, KLHL6, BTG1, BTG2, IRF8, B2M, EP300, CREBBP, MLL2, FOXO1, TNFRSF14, MEF2B, TP53, BCL2, SGK1, GNA13, EZH2, BCL2s. Triangle labels: ABC enrichment, GCB enrichment.]

[Figure 3, MLL2 and MEF2B mutation maps. a, MLL2 domains: PHD, HMG box, COG5141, FYRN, FYRC, SET; scale in bp. b, MEF2B domains: MADS box, MEF2; scale in bp; labelled variants: K4E, Y69C, Y69H, N81K, N81Y, D83G, D83V, D83A.]

[Figure 3, LRLP assays. Panel labels: Olig3, Wnt1, TCF, Ctnnb1*, Phase, DAPI, Merge. shRNA constructs: None, Vector, Control, Cdh1, Smarca4, Ddx3x, Mll2, Gabrg1, Kdm6a. Axes: percentage expression of controls (real-time qPCR); relative distance travelled.]

Figure 3 | Summary and effect of somatic mutations affecting MLL2 and MEF2B. a, Re-sequencing the MLL2 locus in 89 samples revealed mainly nonsense (red circles) and frameshift-inducing indel mutations (orange triangles; inverted triangles for insertions and upright triangles for deletions). A smaller number of non-synonymous somatic mutations (green circles) and point mutations or deletions affecting splice sites (yellow stars) were also observed. All of the non-synonymous point mutations affected a residue within either the catalytic SET domain, the FYRC domain (FY-rich carboxy-terminal domain) or PHD zinc finger domains. The effect of these splice-site mutations on MLL2 splicing was also explored (Supplementary Figure 7). b, The cSNVs and somatic mutations found in MEF2B in all FL and DLBCL cases sequenced are shown with the same symbols. Only the amino acids with variants in at least two patients are labelled. cSNVs were most prevalent in the first two protein-coding exons of MEF2B (exons 2 and 3). The crystal structure of MEF2 bound to EP300 supports the idea that two of the mutated sites (L67 and Y69) are important in the interaction between these proteins (Supplementary Figure 8 and Supplementary Discussion)50.
mutations. One such gene was MEF2B, which had not previously been linked to lymphoma. We found that 20 (15.7%) cases had MEF2B cSNVs and 4 (3.1%) cases had MEF2C cSNVs. All cSNVs detected by RNA-seq affected either the MADS box or MEF2 domains. To determine the frequency and scope of MEF2B mutations, we Sanger-sequenced exons 2 and 3 in 261 primary FL samples; 259 DLBCL primary tumours; 17 cell lines; 35 cases of assorted NHL (IBL, composite FL and PBMCL); and eight non-malignant centroblast samples. We also used a capture strategy (Supplementary Methods) to sequence the entire MEF2B coding region in the 261 FL samples, revealing six additional variants outside exons 2 and 3. We thus identified 69 cases (34 DLBCL, 12.67%; and 35 FL, 15.33%) with MEF2B cSNVs or indels, failing to observe novel variants in other NHL and non-malignant samples. Of the variants, 55 (80%) affected residues within the MADS box and MEF2 domains encoded by exons 2 and 3 (Supplementary Table 11; Fig. 3b). Each patient generally had a single MEF2B variant and we observed relatively few (eight in total, 10.7%) truncation-inducing SNVs or indels. Non-synonymous SNVs were by far the most common type of change observed, with 59.4% of detected variants affecting K4, Y69, N81 or D83. In 12 cases MEF2B mutations were shown to be somatic, including representative mutations at each of K4, Y69, N81 and D83 (Supplementary Table 12). We did not detect mutations in ABC cases, indicating that somatic mutations in MEF2B have a role unique to the development of GCB DLBCL and FL (Fig. 2).

Figure 3 | Genes mutated in WNT-subgroup medulloblastomas regulate LRLPs. a, b, Isolated Olig3+/Wnt1+ LRLPs were transduced in b with mutant Ctnnb1 (above hashed line) or the indicated shRNA-RFP (red fluorescence protein) constructs (below hashed line). LRLPs were also transduced (+) or not (−) with a Tcf/Lef-enhanced green fluorescence (Tcf) reporter. Numbers on right show percentage clonal 2° to 3° passage neurosphere formation (± standard deviation (s.d.)). N/A, not applicable. Scale bar, 10 μm. c, Knockdown of genes targeted by shRNA relative to control transduced cells. Data show mean ± s.d. d, Immunofluorescence of P1 mouse hindbrains electroporated in utero at E14.5 with GFP (to control for equivalence of electroporation between embryos and control) and the indicated construct. High-power views of indicated areas are shown right. Cells targeted by Ddx3x shRNA are present 48 h after electroporation but ablated by P1. Scale bars, 200 μm. e, Heatmap showing the distribution of GFP+/RFP+ cells in electroporated mice at P1. Median distance migrated by cells and P values of migration distance and cell number relative to controls is shown. ****P < 0.00005; ***P < 0.0005; **P < 0.005; *P < 0.05. Red and green text reports significant increase or decrease, respectively, relative to control.

no. 8) on LRLPs. Remarkably, although Ddx3x shRNAs were expressed abundantly in E14.5 brainstem cells within 48 h of electroporation, ≤0.5% of Ddx3x-shRNA-positive cells were present by postnatal day (P)1, confirming the critical importance of this gene to maintain the LRLP lineage (Fig. 3d, e). In contrast, mice electroporated with either mutant Ddx3xT275M or Ddx3xG325E consistently contained

NATURE | VOL 476 | 18 AUGUST 2011

Table 2 | Summary of types of MLL2 somatic mutations

Sample type                     FL      DLBCL   DLBCL cell-line   Centroblast
Truncation                      18      4       7                 0
Indel with frameshift           22      8       6                 0
Splice site                     4       2       0                 0
SNV                             3       2       2                 0
Any mutation/number of cases    31/35   12/37   10/17             0/8
Percentage                      89      32      59                0
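The counts in Table 2 can be cross-checked against the totals and percentages quoted in the text (78 mutations; 37% nonsense, 46% frameshift indels, 8% splice site, 9% non-synonymous). The short sketch below only re-adds the published counts; the variable names are illustrative, and rounding to the nearest whole percent is an assumption about how the quoted figures were derived.

```python
# Counts transcribed from Table 2 (MLL2 somatic mutations by sample type).
TABLE2 = {
    "Truncation":            {"FL": 18, "DLBCL": 4, "DLBCL cell-line": 7, "Centroblast": 0},
    "Indel with frameshift": {"FL": 22, "DLBCL": 8, "DLBCL cell-line": 6, "Centroblast": 0},
    "Splice site":           {"FL": 4,  "DLBCL": 2, "DLBCL cell-line": 0, "Centroblast": 0},
    "SNV":                   {"FL": 3,  "DLBCL": 2, "DLBCL cell-line": 2, "Centroblast": 0},
}
# (cases with any MLL2 mutation, cases sequenced) per sample type.
CASES = {"FL": (31, 35), "DLBCL": (12, 37), "DLBCL cell-line": (10, 17), "Centroblast": (0, 8)}


def percent(part, whole):
    """Percentage rounded to the nearest whole number (assumed rounding rule)."""
    return round(100 * part / whole)


type_totals = {kind: sum(row.values()) for kind, row in TABLE2.items()}
total = sum(type_totals.values())
type_percent = {kind: percent(n, total) for kind, n in type_totals.items()}
case_percent = {sample: percent(hit, n) for sample, (hit, n) in CASES.items()}

if __name__ == "__main__":
    print("total mutations:", total)
    print("by type:", type_totals, type_percent)
    print("cases with any mutation (%):", case_percent)
```

Under these assumptions the row sums reproduce the 29, 36, 6 and 7 mutations given in the text, and truncating, frameshift and splice-site mutations together account for 71 of 78 (91%), consistent with the stated share of inactivating mutations.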

~50% more labelled cells at P1 than did controls, although these cells migrated normally (Fig. 3d, e and data not shown). Thus, mutations in DDX3X may contribute to WNT-subgroup medulloblastoma by increasing LRLP proliferation rather than perturbing the migration of their daughter cells. Notably, comparable in utero knockdown of Mll2, Gabrg1 and Kdm6a that were selectively mutated in non-WNT medulloblastomas had no apparent impact on LRLPs; supporting the value of our assay for assessing WNT-subgroup specific mutations and underscoring the importance of cell context for functional studies of genes mutated in cancer subgroups.

PIK3CA mutations promote WNT-subgroup medulloblastoma
Cancer-associated, activating mutations in PIK3CA were detected in a single case each of WNT-subgroup (PIK3CA(Q546K)), SHH-subgroup (PIK3CA(H1047R)) and subgroup-4 (PIK3CA(N345K)) medulloblastoma (Fig. 1 and Supplementary Fig. 23). Although PIK3CA mutations are common in adult cancers40 and reported in medulloblastoma41, their role in tumorigenesis remains controversial. In particular it is not known if these mutations initiate or progress cancer. To test this, we generated mice that express a conditional allele of the Pik3caE545K mutation. Mice harbouring Pik3caE545K or Pik3caE545K and Tp53flx/flx were bred with Blbp-Cre, which drives efficient recombination in LRLPs5. Blbp-Cre;Pik3caE545K mice, with or without Tp53flx/flx, survived tumour free for a median of 212 days with no evidence of aberrant LRLP migration (Fig. 4a and data not shown). In stark contrast, 100% (n = 11/11) of Blbp-Cre;Ctnnb1+/lox(Ex3);Tp53+/flx;Pik3caE545K mice developed WNT-subgroup medulloblastomas by 3 months of age; only 4% (n = 2/54) of Blbp-Cre;Ctnnb1+/lox(Ex3);Tp53+/flx mice develop WNT-subgroup medulloblastoma by 11 months (Fig. 4a, b). Pik3ca wild-type and mutant mouse medulloblastomas displayed similar classic histologies and nuclear Ctnnb1, but Pik3caE545K mutant tumours contained greater AKT pathway activity as measured by pS6 and p4EBP1 immunostaining. Thus mutations in PIK3CA probably activate the AKT pathway to progress, rather than initiate, WNT-subgroup medulloblastoma.

SHH-subgroup medulloblastomas
Four of thirteen SHH-subgroup medulloblastomas contained expected biallelic inactivating alterations in SUFU or PTCH1. What drives aberrant SHH signals in the remaining cases remains unclear. These tumours contained mutations in MLL2, TP53 and PTEN that have been reported previously in medulloblastoma42; but these mutations occur in other subgroups and are not known to activate SHH signals. Two SHH-subgroup tumours (nos 11 and 12) contained identical novel T48M mutations in the GABAA (γ-aminobutyric acid, subtype A) receptor, γ1, which is predicted to be deleterious (Fig. 1 and Supplementary Table 9). Disruption of GABAA receptors can enhance neural stem cell proliferation43, suggesting that these mutations might deregulate the proliferation of GNPs that generate SHH-subgroup medulloblastomas.

Discussion
We have identified several, new, recurrent, somatic mutations in specific subgroups of medulloblastoma. Alterations affecting EZH2, KDM6A, CHD7 and ZMYM3 seem to disrupt chromatin marking of genes in subgroup-3 and -4 tumours. Further epigenetic studies of medulloblastoma will be required to uncover the identity of these genes, but evidence suggests these may include OTX2, MYC and MYCN44,45. As amplification of these genes was detected almost exclusively in subgroup-3 and -4 tumours that lacked mutations in KDM6A, CHD7 or ZMYM3, it is tempting to speculate that these genetic alterations target common transforming pathways. A recent study detected recurrent mutations in three other chromatin remodellers in medulloblastoma42: SMARCA4, MLL2 and MLL3, but this study did not include details of tumour subgroup. Here, we show that mutations in SMARCA4, CREBBP, TRRAP and MED13 are enriched in WNT-subgroup medulloblastomas; thereby uncovering potential cooperative mutations in chromatin remodellers and their binding-partner oncogene, CTNNB1. Thus, disruptions in the epigenetic machinery of medulloblastoma are likely to be subgroup specific and may cooperate with other oncogenic mutations. The low incidence of MLL2 mutations detected in our study relative to previous work42 probably reflects differences in study populations (see Supplementary Results).

Although medulloblastoma is more prevalent in males, especially with subgroup-3 and -4 disease46, the reason for this sex bias is unknown. One potential explanation is the location of medulloblastoma oncogenes or tumour suppressor genes on chromosome X47. Three of the most recurrently mutated genes detected in our study are located on chromosome X, of which two (ZMYM3 and KDM6A) were observed almost exclusively in males. Mutation in these genes might explain some of the male sex bias in medulloblastoma. The third mutated X chromosome gene, DDX3X, is more likely to be a WNT-subgroup medulloblastoma oncogene. Three of four female medulloblastomas carried heterozygous mutations in DDX3X that escape X inactivation25, and our functional data indicate that mutations in this gene provide a proliferative advantage to LRLPs that generate these tumours.

Our findings also have important implications for drug development. Inhibitors of the epigenetic machinery, especially those that maintain H3K27me3 (for example, EZH2 methylase) may be useful treatments for subgroup-3 and -4 disease. These tumours include the most aggressive forms of medulloblastoma, for which treatment options are limited. Mutations that activate PIK3CA and DDX3X in WNT-subgroup tumours might also be targeted with novel therapeutic strategies48,49. Future clinical trials of the drugs that target these mutant proteins must recruit the appropriate patient populations, as we demonstrate that mutations show subgroup specificity in medulloblastoma. Our accurate mouse models of WNT-subgroup, SHH-subgroup and subgroup-3 medulloblastoma should help with future studies of the biological and therapeutic importance of the novel genetic alterations described in this study.

[Figure 4 panels. a, survival curves: tumour-free survival (%) against time (days), 0 to 600. b, stains: H&E, Ctnnb1, pS6(Ser 235/236), p4EBP1(Thr 37/46).]

Figure 4 | Pik3caE545K accelerates but does not initiate WNT-subgroup medulloblastoma. a, Tumour-free survival of mice of the indicated genotype. All mice carry the Blbp-cre allele. Log rank P < 0.0001. b, Haematoxylin and eosin (H&E) and immunohistochemical stains of indicated tumours. Scale bar, 50 μm.

Table 1 | Overview of cSNVs and confirmed somatic mutations in most frequently mutated genes

Gene            Cases          Total          Somatic cSNVs        P (raw)         q               NS SP    T SP    Skew
                NS   S    T    NS   S    T    (RNA-seq cohort)*                                                     (M, WT, both)†
MLL2 ‡          16   8    17   17   8    18   10                   6.85 × 10^-8    8.50 × 10^-7    0.834    14.4    WT
TNFRSF14^G ‡    7    1    7    8    1    7    11                   6.85 × 10^-8    8.50 × 10^-7    7.52     118     Both
SGK1^G ‡        18   6    6    37   10   6    9                    6.85 × 10^-8    8.50 × 10^-7    19.5     61.7    –
BCL10 ‡         2    0    4    3    0    4    4                    6.85 × 10^-8    8.50 × 10^-7    3.62     112     WT
GNA13^G ‡       21   1    2    33   1    2    5                    6.85 × 10^-8    8.50 × 10^-7    24.1     25.7    Both
TP53^G ‡        20   2    1    23   3    1    22                   6.85 × 10^-8    8.50 × 10^-7    15.6     14.1    Both
EZH2^G ‡        33   0    0    33   0    0    33                   6.85 × 10^-8    8.50 × 10^-7    11.4     0.00    Both
BTG2 ‡          12   6    1    14   6    1    2                    6.85 × 10^-8    8.50 × 10^-7    23.9     35.1    –
BCL2^G ‡        42   45   0    96   105  0    43                   9.35 × 10^-8    8.50 × 10^-7    3.78     0.00    M
BCL6 ‡§         11   2    0    12   2    0    2                    9.35 × 10^-8    8.50 × 10^-7    0.175    0.00    M
CIITA ‡§        5    3    0    6    3    0    2                    9.35 × 10^-8    8.50 × 10^-7    0.086    0.00
FAS ‡           2    0    4    3    0    4    2                    1.52 × 10^-7    1.17 × 10^-6    2.54     66.5    WT
BTG1 ‡          11   6    2    11   7    2    10                   1.52 × 10^-7    1.17 × 10^-6    17.5     52.5    Both
MEF2B^G ‡       20   2    0    20   2    0    10                   2.05 × 10^-7    1.47 × 10^-6    14.2     0.00    M
IRF8 ‡          11   5    3    14   5    3    3                    4.55 × 10^-7    3.03 × 10^-6    8.82     28.2    WT
TMEM30A ‡       1    0    4    1    0    4    4                    6.06 × 10^-7    3.79 × 10^-6    0.785    65.0    WT
CD58 ‡          2    0    3    2    0    3    2                    2.42 × 10^-6    1.43 × 10^-5    2.29     69.2    –
KLHL6 ‡         10   2    2    12   2    2    4                    1.00 × 10^-5    5.26 × 10^-5    5.42     16.4    –
MYD88^A ‡       13   2    0    14   2    0    9                    1.00 × 10^-5    5.26 × 10^-5    12.4     0.00    WT
CD70 ‡          5    0    1    5    0    2    3                    1.70 × 10^-5    8.48 × 10^-5    7.08     44.0    –
CD79B^A ‡       7    2    1    9    2    1    5                    2.00 × 10^-5    9.52 × 10^-5    10.9     18.3    M
CCND3 ‡         7    1    2    7    1    2    6                    2.80 × 10^-5    1.27 × 10^-4    6.55     36.3    WT
CREBBP ‡        20   7    4    24   7    4    9                    1.00 × 10^-4    4.35 × 10^-4    2.72     6.04    Both
HIST1H1C ‡      9    0    0    10   0    0    6                    1.80 × 10^-4    7.50 × 10^-4    11.9     0.00    Both
B2M ‡           7    0    0    7    0    0    4                    3.90 × 10^-4    1.56 × 10^-3    16.6     0.00    WT
ETS1 ‡          10   1    0    10   1    0    4                    4.10 × 10^-4    1.58 × 10^-3    5.76     0.00    WT
CARD11 ‡        14   3    0    14   3    0    3                    1.90 × 10^-3    7.04 × 10^-3    3.37     0.00    Both
FAT2 ‡§         2    1    0    2    1    0    2                    6.30 × 10^-3    2.25 × 10^-2    0.128    0.00    –
IRF4 ‡§         9    4    0         5    0    5                    7.00 × 10^-3    2.41 × 10^-2    0.569    0.00    Both
FOXO1 ‡         8    4    0    10   4    0    4                    7.60 × 10^-3    2.53 × 10^-2    4.02     0.00    –
STAT3           9    0    0    9    0    0    4                    2.19 × 10^-2    6.08 × 10^-2    –        –       Both
RAPGEF1         8    3    0    10   3    0    3                    2.98 × 10^-2    7.45 × 10^-2    –        –       WT
ABCA7           12   3    0    15   3    0    2                    7.76 × 10^-2    1.67 × 10^-1    –        –       WT
RNF213          10   8    0    10   8    0    2                    7.87 × 10^-2    1.67 × 10^-1    –        –       –
MUC16           17   12   0    39   25   0    2                    8.32 × 10^-2    1.73 × 10^-1    –        –       –
HDAC7           8    4    0    8    4    0    2                    8.94 × 10^-2    1.82 × 10^-1    –        –       WT
PRKDC           7    3    0    7    4    0    2                    1.06 × 10^-1    2.05 × 10^-1    –        –       –
SAMD9           9    2    0    9    2    0    2                    1.79 × 10^-1    3.01 × 10^-1    –        –       –
TAF1            10   0    0    10   0    0    2                    3.03 × 10^-1    4.74 × 10^-1    –        –       –
PIM1            20   19   0    33   34   0    11                   3.40 × 10^-1    5.23 × 10^-1    –        –       WT
COL4A2          8    2    0    8    2    0    2                    7.64 × 10^-1    8.99 × 10^-1    –        –       –
EP300           8    7    1    8    7    1    3                    9.54 × 10^-1    1.00            –        –       WT

Individual cases with non-synonymous (NS), synonymous (S) and truncating (T) mutations and the total number of mutations of each class are shown separately because some genes contained multiple mutations in the same case. The P values indicated in bold are the upper limit on the P value for that gene determined with the approach described in ref. 19 (see Supplementary Methods), q is the Benjamini-corrected q value, and NS SP and T SP refer to selective pressure estimates from this model for the acquisition of non-synonymous or truncating mutations, respectively. Genes with a superscript of either A or G were found to have mutations significantly enriched in ABC or GCB cases, respectively (P < 0.05, Fisher's exact test).
* Additional somatic mutations identified in larger cohorts and insertion/deletion mutations are not included in this total.
† Both indicates that we observed separate cases in which skewed expression was seen but where this skew was not consistent for the mutant or wild-type allele.
‡ Genes significant at a false discovery rate of 0.03. SNVs in BCL2 and previously confirmed hot spot mutations in EZH2 and CD79B are probably somatic in these samples based on published observations of others.
§ Selective pressure estimates are both < 1 indicating purifying selection rather than positive selection acting on this gene.

SGK1 encodes a phosphatidylinositol-3-OH kinase (PI(3)K)-regulated kinase with functions including regulation of FOXO transcription factors25, regulation of NF-κB by phosphorylating IκB kinase26, and negative regulation of NOTCH signalling27. SGK1 also resides within a region of chromosome 6 commonly deleted in DLBCL (Fig. 1)5. The mechanism by which SGK1 and GNA13 inactivation may contribute to lymphoma is unclear, but the strong degree of apparent selection towards their inactivation and their overall high mutation frequency (each mutated in 18 of 106 DLBCL cases) suggests that their loss contributes to B-cell NHL. Certain genes are known to be mutated more commonly in GCB DLBCLs (for example, TP53 (ref. 28) and EZH2 (ref. 13)). Here, both SGK1 and GNA13 mutations were found only in GCB cases (P = 1.93 × 10^-3 and 2.28 × 10^-4, Fisher's exact test; n = 15 and 18, respectively) (Fig. 2). Two additional genes (MEF2B and TNFRSF14) with no previously described role in DLBCL showed a similar restriction to GCB cases (Fig. 2).

Inactivating MLL2 mutations
MLL2 showed the most significant evidence for selection and the largest number of nonsense SNVs. Our RNA-seq analysis indicated that 26.0% (33/127) of cases carried at least one MLL2 cSNV. To address the possibility that variable RNA-seq coverage of MLL2 failed to capture some mutations, we PCR-amplified the entire MLL2 locus (~36 kilobases) in 89 cases (35 primary FLs, 17 DLBCL cell lines, and 37 DLBCLs). Of these cases 58 were among the RNA-seq cohort. Illumina amplicon re-sequencing (Supplementary Methods) revealed 78 mutations, confirming the RNA-seq mutations in the overlapping cases and identifying 33 additional mutations. We confirmed the somatic status of 46 variants using Sanger sequencing (Supplementary Table 10), and showed that 20 of 33 additional mutations were insertions or deletions (indels). Three SNVs at splice sites were also detected, as were 10 new cSNVs that had not been detected by RNA-seq.

The somatic mutations were distributed across MLL2 (Fig. 3a). Of these, 37% (n = 29/78) were nonsense mutations, 46% (n = 36/78) were indels that altered the reading frame, 8% (n = 6/78) were point mutations at splice sites and 9% (n = 7/78) were non-synonymous amino acid substitutions (Table 2). Four of the somatic splice site mutations had effects on MLL2 transcript length and structure. For example, two heterozygous splice site mutations resulted in the use of a novel splice donor site and an intron retention event.

Approximately half of the NHL cases we sequenced had two MLL2 mutations (Supplementary Table 10). We used bacterial artificial

METHODS SUMMARY
Human tumour and matched blood samples were obtained with informed consent through an institutional review board approved protocol at St Jude Children's Research Hospital. WGS and analysis of WGS data were performed
as previously described50. Details of sequence coverage, custom capture and other validation procedures are provided in Supplementary Information (Supplementary Tables 12–15). Immunohistochemistry and immunofluorescence of human and mouse tissues were performed using routine techniques and primary antibodies of the appropriate tissues as described (Supplementary Methods). Medulloblastoma mRNA and DNA profiles were generated using Affymetrix U133v2 and SNP 6.0 arrays, respectively (Supplementary Methods). Real-time PCR with reverse transcriptase (RT–PCR) analysis of genes targeted in mouse LRLPs by shRNAs were performed as described previously32. LRLPs were isolated and transduced with indicated lentiviruses in stem cell cultures or targeted in utero with shRNAs or mutant cDNA sequences by electroporation as described5 (Supplementary Information). Mice harbouring a Cre-inducible Pik3caE545K allele were generated using homologous recombination: a lox-puro-STOP-lox cassette was introduced immediately upstream of the exon containing the initiation codon, exon 9 was replaced with an exon containing the E545K mutation. Pik3caE545K mice were bred with Blbp-Cre;Ctnnb1lox(Ex3)/lox(Ex3) and Tp53flx/flx mice to generate progeny of the appropriate genotype and subjected to clinical surveillance.

Published online 20 June 2012.

1. Central Brain Tumor Registry of the United States. Statistical report: primary brain

28. Mosimann, C., Hausmann, G. & Basler, K. β-Catenin hits chromatin: regulation of Wnt target gene activation. Nature Rev. Mol. Cell Biol. 10, 276–286 (2009).
29. Hecht, A., Vleminckx, K., Stemmler, M. P., van Roy, F. & Kemler, R. The p300/CBP acetyltransferases function as transcriptional coactivators of β-catenin in vertebrates. EMBO J. 19, 1839–1850 (2000).
30. Barker, N. et al. The chromatin remodelling factor Brg-1 interacts with β-catenin to promote target gene activation. EMBO J. 20, 4935–4943 (2001).
31. Carrera, I., Janody, F., Leeds, N., Duveau, F. & Treisman, J. E. Pygopus activates Wingless target gene transcription through the mediator complex subunits Med12 and Med13. Proc. Natl Acad. Sci. USA 105, 6644–6649 (2008).
32. Thompson, M. C. et al. Genomics identifies medulloblastoma subgroups that are enriched for specific genetic alterations. J. Clin. Oncol. 24, 1924–1931 (2006).
33. Kool, M. et al. Integrated genomics identifies five medulloblastoma subtypes with distinct genetic profiles, pathway signatures and clinicopathological features. PLoS ONE 3, e3088 (2008).
34. Orsulic, S., Huber, O., Aberle, H., Arnold, S. & Kemler, R. E-cadherin binding prevents β-catenin nuclear localization and β-catenin/LEF-1-mediated transactivation. J. Cell Sci. 112, 1237–1245 (1999).
35. Risinger, J. I., Berchuck, A., Kohler, M. F. & Boyd, J. Mutations of the E-cadherin gene in human gynecologic cancers. Nature Genet. 7, 98–102 (1994).
36. Becker, K.-F. et al. E-Cadherin gene mutations provide clues to diffuse type gastric carcinomas. Cancer Res. 54, 3845–3852 (1994).
37. Pek, J. W. & Kai, T. DEAD-box RNA helicase Belle/DDX3 and the RNA interference pathway promote mitotic chromosome segregation. Proc. Natl Acad. Sci. USA 108, 12007–12012 (2011).
38. Lai, M. C., Chang, W. C., Shieh, S. Y. & Tarn, W. Y. DDX3 regulates cell growth through translational control of cyclin E1. Mol. Cell. Biol. 30, 5444–5453 (2010).
39. Schröder, M. Human DEAD-box protein 3 has multiple functions in gene regulation and cell cycle control and is a prime target for viral manipulation. Biochem. Pharmacol. 79, 297–306 (2010).
40. Samuels, Y. et al. High frequency of mutations of the PIK3CA gene in human cancers. Science 304, 554 (2004).
41. Broderick, D. K. et al. Mutations of PIK3CA in anaplastic oligodendrogliomas, high-grade astrocytomas, and medulloblastomas. Cancer Res. 64, 5048–5050 (2004).
42. Parsons, D. W. et al. The genetic landscape of the childhood cancer medulloblastoma. Science 331, 435–439 (2011).
43. Andäng, M. et al. Histone H2AX-dependent GABAA receptor regulation of stem cell proliferation. Nature 451, 460–464 (2008).
44. Pasini, D. et al. Coordinated regulation of transcriptional repression by the RBP2 H3K4 demethylase and Polycomb-Repressive Complex 2. Genes Dev. 22, 1345–1355 (2008).
45. Khan, A., Shover, W. & Goodliffe, J. M. Su(z)2 antagonizes auto-repression of Myc in Drosophila, increasing Myc levels and subsequent trans-activation. PLoS ONE 4, e5076 (2009).
46. Northcott, P. A. et al. Medulloblastoma comprises four distinct molecular variants. J. Clin. Oncol. 29, 1408–1414 (2011).
47. Spatz, A., Borg, C. & Feunteun, J. X-chromosome genetics and human cancer. Nature Rev. Cancer 4, 617–629 (2004).
48. Lindqvist, L. et al. Selective pharmacological targeting of a DEAD box RNA helicase. PLoS One 3, e1583 (2008).
49. Engelman, J. A. Targeting PI3K signalling in cancer: opportunities, challenges and limitations. Nature Rev. Cancer 9, 550–562 (2009).
50. Zhang, J. et al. The genetic basis of early T-cell precursor acute lymphoblastic leukaemia. Nature 481, 157–163 (2012).
Received 13 January; accepted 2 May 2012.

chromosome (BAC) clone sequencing in eight FL cases to show that in all eight cases the mutations were in trans, affecting both MLL2 alleles. This observation is consistent with the notion that there is a complete, or near-complete, loss of MLL2 in the tumour cells of such patients. With the exception of two primary FL cases and two DLBCL cell lines (Pfeiffer and SU-DHL-9), the majority of MLL2 mutations seemed to be heterozygous. Analysis of Affymetrix 500k SNP array data from two FL cases with apparent homozygous mutations revealed that both tumours showed copy number neutral loss of heterozygosity (LOH) for the region of chromosome 12 containing MLL2 (Supplementary Methods). Thus, in addition to bi-allelic mutation, LOH is a second, albeit less common mechanism by which MLL2 function is lost.

MLL2 was the most frequently mutated gene in FL, and among the most frequently mutated genes in DLBCL (Fig. 2). We confirmed MLL2 mutations in 31 of 35 FL patients (89%), in 12 of 37 DLBCL patients (32%), in 10 of 17 DLBCL cell lines (59%) and in none of the eight normal centroblast samples we sequenced. Our analysis predicted that the majority of the somatic mutations observed in MLL2 were inactivating (91% disrupted the reading frame or were truncating point mutations), indicating to us that MLL2 is a tumour suppressor of significance in NHL.

Figure 2 | Overview of mutations and potential cooperative interactions in NHL. This heat map displays possible trends towards co-occurrence (red) and mutual exclusion (blue) of somatic mutations and structural rearrangements. Colours were assigned by taking the minimum value of a left- and right-tailed Fisher's exact test. To capture trends a P-value threshold of 0.3 was used, with the darkest shade of the colour indicating those meeting statistical significance (P ≤ 0.05). The relative frequency of mutations in ABC (blue), GCB (red), unclassifiable DLBCLs (black) and FL (yellow) cases is shown on the left. Genes were arranged with those having significant (P < 0.05, Fisher's exact test) enrichment for mutations in ABC cases (blue triangle) towards the top (and left) and those with significant enrichment for mutations in GCB cases (red triangle) towards the bottom (and right). The total number of cases in which each gene contained either cSNVs or confirmed somatic mutations is shown at the top. The cluster of blue squares (upper-right) results from the mutual exclusion of the ABC-enriched mutations (for example, MYD88, CD79B) from the GCB-enriched mutations (for example, EZH2, GNA13). Presence of structural rearrangements involving the two oncogenes BCL6 and BCL2 (indicated as BCL6s and BCL2s) was determined with FISH techniques using break-apart probes (Supplementary Methods).
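The colour assignment described in the Figure 2 legend (minimum of a left- and right-tailed Fisher's exact test, a trend threshold of 0.3 and a significance threshold of 0.05) can be sketched in a few lines. This is only an illustration under stated assumptions: the 2 × 2 table layout, the function names and the example counts are hypothetical and are not taken from the paper or its Supplementary Methods.

```python
from math import comb


def fisher_tails(a, b, c, d):
    """One-sided Fisher's exact P values for the 2x2 table [[a, b], [c, d]].

    Assumed layout (hypothetical): a = cases mutated in both genes,
    b = gene A only, c = gene B only, d = neither.
    Returns (left, right): left = P(X <= a), right = P(X >= a).
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    total = comb(n, row1)

    def pmf(k):
        # Hypergeometric probability of k co-mutated cases with fixed margins
        return comb(col1, k) * comb(n - col1, row1 - k) / total

    k_min = max(0, row1 + col1 - n)
    k_max = min(row1, col1)
    left = sum(pmf(k) for k in range(k_min, a + 1))
    right = sum(pmf(k) for k in range(a, k_max + 1))
    return left, right


def classify(a, b, c, d, trend=0.3, significant=0.05):
    """Return (direction, strength, p) following the legend's thresholds."""
    left, right = fisher_tails(a, b, c, d)
    p = min(left, right)
    if p > trend:
        return "none", "none", p
    direction = "co-occurrence" if right <= left else "mutual exclusion"
    strength = "significant" if p <= significant else "trend"
    return direction, strength, p


if __name__ == "__main__":
    # Hypothetical counts, for illustration only
    print(classify(0, 10, 10, 10))  # two genes never mutated together
    print(classify(8, 2, 2, 18))    # two genes usually mutated together
```

A small right-tail value points to co-occurrence (red in the heat map) and a small left-tail value to mutual exclusion (blue); pairs whose smaller tail exceeds 0.3 would be left uncoloured.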
[Figure 2 heat map. Gene labels, in order: MYD88, CD79B, BCL6s, TNFAIP3, CARD11, FAS, TMEM30A, CD58, CD70, STAT3, ETS1, HIST1H1C, CCND3, KLHL6, BTG1, BTG2, IRF8, B2M, EP300, CREBBP, MLL2, FOXO1, TNFRSF14, MEF2B, TP53, BCL2, SGK1, GNA13, EZH2, BCL2s. Colour-scale bins include P = 0.1–0.05 and P < 0.05.]

to sequence the entire MEF2B coding region in the 261 FL samples, revealing six additional variants outside exons 2 and 3. We thus identified 69 cases (34 DLBCL, 12.67%; and 35 FL, 15.33%) with MEF2B cSNVs or indels, failing to observe novel variants in other NHL and non-malignant samples. Of the variants 55 (80%) affected residues within the MADS box and MEF2 domains encoded by exons 2 and 3 (Supplementary Table 11; Fig. 3b). Each patient generally had a single MEF2B variant and we observed relatively few (eight in total, 10.7%) truncation-inducing SNVs or indels. Non-synonymous SNVs were by far the most common type of change observed, with 59.4% of detected variants affecting K4, Y69, N81 or D83. In 12 cases MEF2B mutations were shown to be somatic, including representative mutations at each of K4, Y69, N81 and D83 (Supplementary Table 12). We did not detect mutations in ABC cases, indicating that somatic mutations in MEF2B have a role unique to the development of GCB DLBCL and FL (Fig. 2).

Table 2 | Summary of types of MLL2 somatic mutations

Type \ Sample                  FL      DLBCL   DLBCL cell-line   Centroblast
Truncation                     18      4       7                 0
Indel with frameshift          22      8       6                 0
Splice site                    4       2       0                 0
SNV                            3       2       2                 0
Any mutation/number of cases   31/35   12/37   10/17             0/8
Percentage                     89      32      59                0

Supplementary Information is linked to the online version of the paper at www.nature.com/nature.

Acknowledgements This research was supported as part of the St Jude Children's Research Hospital, Washington University Pediatric Cancer Genome Project. This work was supported by grants from the National Institutes of Health (R01CA129541, P01CA96832 and P30CA021765; R.J.G.), the Collaborative Ependymoma Research Network (CERN), Musicians against Childhood Cancer (MACC), The Noyes Brain Tumour Foundation, and by the American Lebanese Syrian Associated Charities (ALSAC). We are grateful to S. Temple for the gift of reagents and the staff of the Hartwell Center for Bioinformatics and Biotechnology and ARC at St Jude Children's Research Hospital for technical assistance.

Author Contributions G.R., M.P., T.A.K., C.L., X.C., L.D., T.N.P., E.H., L.W., X.Z., N.Ch., R.H., N.Cu., R.T., J.W., G.W., M.R., X.H., J.B., P.G., J.M., J.E., B.V., A.O.-T., T.L., S.Po., S.Pa., D.Z., D.K., S.J.B. and D.F. contributed to the design and conduct of experiments and to the writing. R.K., M.F.R., R.S.F., L.L.F., D.J.D., K.O. and E.R.M. contributed to experimental design and to the writing. A.G., D.W.E., C.C.L., E.B., T.H., S.G. and R.C. provided clinical expertise. R.K.W., J.R.D., J.Z. and R.J.G. conceived the research and contributed to the design, direction and reporting of the study.

Author Information Sequence and SNP array data were deposited in dbGaP under accession number phs000409 and in the Sequence Read Archive (SRA) under accession number SRP008292. Reprints and permissions information is available at www.nature.com/reprints. This paper is distributed under the terms of the Creative Commons Attribution-Non-Commercial-Share Alike licence, and is freely available to all readers at www.nature.com/nature. The authors declare no competing financial interests. Readers are welcome to comment on the online version of this article at www.nature.com/nature. Correspondence and requests for materials should be addressed to R.J.G. (Richard.Gilbertson@stjude.org) or J.Z. (Jinghui.Zhang@stjude.org).

Recurrent point mutations in MEF2B


Our selective pressure analysis also revealed genes with stronger pressure for acquisition of amino acid substitutions than for nonsense

48 | NATURE | VOL 488 | 2 AUGUST 2012
18 AUGUST 2011 | VOL 476 | NATURE | 301
NATURE REPRINT COLLECTION Epigenetics

ARTICLE
PUBLISHED ONLINE: 30 SEPTEMBER 2012 | DOI: 10.1038/NCHEMBIO.1084
First published in Nature Chemical Biology 8, 890–896 (2012); doi:10.1038/nchembio.1084

A selective inhibitor of EZH2 blocks H3K27 methylation and kills mutant lymphoma cells
Sarah K Knutson1,2, Tim J Wigle1,2, Natalie M Warholic1, Christopher J Sneeringer1, Christina J Allain1, Christine R Klaus1, Joelle D Sacks1, Alejandra Raimondi1, Christina R Majer1, Jeffrey Song1, Margaret Porter Scott1, Lei Jin1, Jesse J Smith1, Edward J Olhava1, Richard Chesworth1, Mikel P Moyer1, Victoria M Richon1, Robert A Copeland1, Heike Keilhack1, Roy M Pollock1 & Kevin W Kuntz1*
EZH2 catalyzes trimethylation of histone H3 lysine 27 (H3K27). Point mutations of EZH2 at Tyr641 and Ala677 occur in subpopulations of non-Hodgkin's lymphoma, where they drive H3K27 hypertrimethylation. Here we report the discovery of EPZ005687, a potent inhibitor of EZH2 (Ki of 24 nM). EPZ005687 has greater than 500-fold selectivity against 15 other protein methyltransferases and has 50-fold selectivity against the closely related enzyme EZH1. The compound reduces H3K27 methylation in various lymphoma cells; this translates into apoptotic cell killing in heterozygous Tyr641 or Ala677 mutant cells, with minimal effects on the proliferation of wild-type cells. These data suggest that genetic alteration of EZH2 (for example, mutations at Tyr641 or Ala677) results in a critical dependency on enzymatic activity for proliferation (that is, the equivalent of oncogene addiction), thus portending the clinical use of EZH2 inhibitors for cancers in which EZH2 is genetically altered.

Trimethylation of H3K27 is a transcriptionally repressive epigenetic mark that has been causally associated with a number of hematologic and solid human cancers. Methylation of H3K27 is catalyzed by polycomb repressive complex 2 (PRC2), containing the enzymatic subunit EZH2 or EZH1 (refs. 1,2). Reversal of H3K27 methylation is catalyzed by the histone demethylases UTX and JMJD3 (refs. 3–7). Several molecular mechanisms leading to a hypertrimethylated state of H3K27 are seen among human cancers. For example, EZH2 itself and other PRC2 subunits are amplified and/or overexpressed in subsets of several human cancers including breast, prostate and lymphoma8–13. Loss-of-function mutations in the demethylase UTX are found in subsets of myeloma, renal and esophageal cancers14, and overexpression of the PRC2-associated protein PHF19 is observed in a number of solid tumors15. Most recently, point mutations at Tyr641 (Y641F, Y641N, Y641S and Y641H) have been identified in 8–24% of non-Hodgkin lymphomas in several studies16–18. The mutation status for EZH2 is found to always be heterozygous in primary tumor samples from these patients. Although these mutations were originally characterized as loss-of-function mutations, our group later demonstrated that the mutations in fact change the substrate specificity of EZH2 (ref. 19). The wild-type enzyme is most efficient as a monomethyltransferase and wanes in catalytic efficiency for the second and especially the third methylation reaction. In contrast, all of the mutant enzymes show the exact opposite order of substrate use; they are essentially inactive as monomethyltransferases but are effective at catalyzing the reaction from mono- to dimethyl and are very efficient at catalyzing the reaction from di- to trimethyl.
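The opposite substrate preferences of wild-type and Tyr641-mutant EZH2 described above suggest why a cell carrying both enzymes could accumulate more H3K27me3 than a cell carrying either alone. The toy model below is a minimal sketch of that reasoning, not the authors' kinetic analysis: the three rate constants per enzyme are hypothetical numbers chosen only to mirror the qualitative ordering in the text, and the heterozygous cell is assumed to carry half a dose of each enzyme.

```python
def simulate(k1, k2, k3, t_end=5.0, dt=0.001):
    """Euler integration of me0 -> me1 -> me2 -> me3 with first-order steps.

    k1, k2, k3 are hypothetical rate constants (arbitrary units) for the
    first, second and third methylation of H3K27.
    Returns the final fractions (me0, me1, me2, me3).
    """
    me0, me1, me2, me3 = 1.0, 0.0, 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        f1 = k1 * me0 * dt  # unmethylated -> monomethyl
        f2 = k2 * me1 * dt  # monomethyl -> dimethyl
        f3 = k3 * me2 * dt  # dimethyl -> trimethyl
        me0 -= f1
        me1 += f1 - f2
        me2 += f2 - f3
        me3 += f3
    return me0, me1, me2, me3


# Hypothetical rate constants, NOT measured values:
WT = (1.0, 0.3, 0.05)   # best at the first step, weak at the third
MUT = (0.0, 0.5, 1.0)   # inactive at the first step, best at the third
# Assumption: a heterozygous cell has half a dose of each enzyme
HET = tuple((w + m) / 2 for w, m in zip(WT, MUT))

if __name__ == "__main__":
    for name, rates in (("wild-type", WT), ("mutant only", MUT), ("heterozygous", HET)):
        print(f"{name:13s} final H3K27me3 fraction = {simulate(*rates)[3]:.3f}")
```

With these illustrative constants the mutant-only case makes no trimethyl mark because it cannot start the chain, the wild-type case stalls at the dimethyl state, and the mixed case converts the most substrate to H3K27me3.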
We, and subsequently others20, demonstrated that lymphoma cells heterozygous for these Tyr641 mutants show hypertrimethylation of H3K27 compared to EZH2 wild-type lymphoma cells; the hypertrimethylation results from the enzymatic coupling between wild-type (to drive monomethylation) and mutant (to drive di- and trimethylation) EZH2 in the heterozygous cells. Additionally, a heterozygous EZH2 mutation within the SU(VAR)3-9, enhancer of zeste, trithorax (SET) domain at Ala677 (A677G) is seen both in the Pfeiffer cell line and in primary patient samples18,21. Further investigation of this mutation indicates that it also results in increased H3K27me3 while decreasing H3K27me2 in vitro, similar to the Tyr641 mutations. However, at the biochemical level, the substrate specificity of this enzyme differs from that seen in the Tyr641 mutants. Specifically, in vitro assays demonstrate that the A677G mutant efficiently catalyzes all three H3K27 methylation steps, whereas the Tyr641 mutants preferentially catalyze the reaction from di- to trimethyl21. This is another example of a heterozygous change-of-function point mutation within the EZH2 SET domain observed in lymphoma.

On the basis of this enzymatic coupling and the resultant hypertrimethylation of H3K27, we hypothesized that the hypertrimethylated H3K27 phenotype drives the lymphomagenic proliferation in these EZH2 mutant-bearing cells; the cells thus depend on EZH2 enzymatic activity for proliferation and survival. This hypothesis has not been adequately tested, however, until now.

In this paper, we report the discovery of a potent and selective small-molecule inhibitor of EZH2, EPZ005687 (4). The ability of this compound to directly and selectively inhibit PRC2 enzymatic activity distinguishes it from DZNep, a compound that has been used previously to probe cellular EZH2 function. DZNep is an inhibitor of S-adenosylhomocysteine (SAH) hydrolase and is thought to inhibit and cause the degradation of the PRC2 complex by an indirect mechanism involving an increase in the cellular concentration of SAH, an inhibitory byproduct of cellular methyltransferase reactions22,23. Interpretation of cellular phenotypes caused by DZNep is complicated by DZNep's ability to reduce methylation at multiple histone residues targeted by protein methyltransferases (PMTs) other than EZH2.
In contrast, treatment of cells with EPZ005687 resulted in concentration-dependent ablation of H3K27 methylation without major decreases in any other histone methyl marks. When the compound was applied to lymphoma cells bearing an EZH2 Tyr641 or Ala677 mutation, concentration-dependent cell killing was observed. Unlike the potent cell killing seen for mutant-bearing lymphoma cell lines, EPZ005687 had minimal effects on

1Epizyme, Inc., Cambridge, Massachusetts, USA. 2These authors contributed equally to this work. *e-mail: kkuntz@epizyme.com
NATURE CHEMICAL BIOLOGY | VOL 8 | NOVEMBER 2012 | www.nature.com/naturechemicalbiology


the of lymphoma cellwas lines containing wild-type andproliferation selective DOT1L inhibitor, used to elucidate the EZH2. causal 34 Thus, represents a chemical probe molecule for testing role ofEPZ005687 DOT1L enzymatic activity in MLL -rearranged leukemia . theIn dependency ofwork, cancerwe cell lines on EZH2 enzymatic the present identified EPZ005687 as aactivity. potent The and data reported here substantial support the hypothesis selective inhibitor ofprovide wild-type and mutant EZH2for containing PRC2 (described above) that EZH2 mutant bearing lymphomas critically enzymatic activity. We showed that the compound selectively inhibdepend on EZH2 enzymatic activity proliferation andinto survival. its H3K27 methylation in cells and for that this translated selective cell killing for lymphoma cells that contain heterozygous EZH2 mutations at Tyr641 or Ala677. These data established a critical RESULTS and identification unique dependency on PRC2 enzymatic activity for the lymHit and optimization of target potency phoma cell lines that bear these mutations. This dependency High-throughput screening of EZH2 a 175,000-compound subset of a is equivalent to the library conceptagainst of oncogene addiction, in which cells chemical diversity recombinant wild-type PRC2, 24 become abnormally dependent on biochemical activity of a under balanced assay conditions , the yielded inhibitors of varying specific oncogene product for growth, survival or both, such that chemotypes with half-maximum inhibitory concentration (IC 50) ablation the cytotoxic in the genetically altered cells values inof the 3-oncogene to 30-M is range. The hits were divided into clusters butthe inconsequential to growth of normal cells. The present on basis of structural similarity, and an additional 5,000results comprovide representing a compelling25 foundation for mined the clinical use of selective pounds, clusters, were from the remainder of EZH2 inhibitors for the treatment mutant-bearing lymphomas. 
the compound library and screened.of The majority of the hits proved The compound represents a poor chemical biological probe for to becurrent promiscuous inhibitors or had physicochemical properin vitro and we do not suggest this compound ties (hadexperiments, poor solubility or were redox active,that irreversible inhibiitselfor could form forming). the basis for patientthis treatment. Pharmacological tors aggregate However, hit expansion identified optimization of compounds such as EPZ005687 a pyridone-containing chemotype, 1 (Fig. 1), holds whichgreat had promise an IC50 for620 thisnM eventual outcome. of for wild-type PRC2. Early attempts to use 1 in cellular Genetic alterations in EZH2 and other PRC2 subunits are not assays quickly identified poor solubility as a liability of this chemical limitedAto the Tyr641 and Ala677 observed lymseries. survey of vectors around themutations template showed thatin amines phoma. A broad of genetic of PRC2 hasled been were tolerated in spectrum the 4-position of the alterations phenyl ring (2 ), which to documented in a range of hematologic solid tumors. Notably, large improvements in solubility with a and slight increase in potency. in myeloid and T-cell leukemia, mutations EZH2 We made amalignancies variety of 5,6-fused heteroaryl ring systems,in and the and other PRC2 components lead tocompared a loss of function of the comindazole showed improved potency to the pyrazolopyri47 plex45 dine (3 versus 2). that Increasing the size of the lipophilic group off the . The fact both activating and inactivating mutations of 1-position of the indazole led to improved potency and provided EZH2 are associated with malignancy is remarkable and reflects the EPZ005687 (4 ). PRC2 A comprehensive exploration of the optimization complex role of target genes in cell fate decisions. 
of these inhibitors through iterative structure-activity EPZ005687 is shown here to be an equally potent relationship inhibitor of studies to yield and EPZ005687 and related compounds will suggestbe preboth wild-type Tyr641 or Ala677 mutants of EZH2, sented in full in a separate publication. Subsequent tothis the inhibition discovery ing that pharmacologically optimized inhibitors with of EPZ005687, two patent applications were published containing profile may be useful in the treatment of a number of human canEZH2 inhibitors with structures function similar to those described here25,26. cers wherein gain-of-enzymatic of PRC2 drives disease.
Determination of inhibitor IC50 values in the PMT panel. Values for enzymes in the As illustrated in Figure 2a, EPZ005687 showed concentration-dehistone methyltransferase panel were determined under balanced assay conditions pendent inhibition of PRC2 enzymatic activity with an IC50 value with both SAM and protein or peptide substrate present at concentrations equal 24 of 54 5 nM. K Dual titration of the compound and the substrate to their respective m values . Where a peptide was used as a methyl-accepting substrate, the peptide is referred to here by the histone and residue numbers S -adenosylmethionine (SAM) yielded Michaelis-Menten plots that that it represents. Forthe example, peptide H3:16 30 refers to a peptide representing were best fit by steady-state equation for competitive inhibihistone H3 residues 16 through 30. All reactions were run at 25 C in a 50-l tion, yielding a K i value for EPZ005687 of 24 7 nM. Consistent volume with 2% (v/v) DMSO in the final reaction. Flag- and His-tagged CARM1 with competitive inhibition, the IC EPZ005687 inhibition 50 for at (residues 2585) expressed in 293 cells was assayed a final concentration of of PRC2 positive peptide linear corresponding dependence to on SAMH3:16 concentration 0.25 nMshowed against a a biotinylated histone 30 with aFig. monomethylated His-tagged Dot1L (residues 416) expressed in sub( 2b). Dual Arg26. titration of compound and 1oligonucleosome Escherichia coli was assayed at a final concentration 0.25were nM against strate resulted in Michaelis-Menten plots of that best chicken described erythrocyte oligonucleosomes. His-tagged EHMT2 (residues 9131193) expressed by the steady-state equation for noncompetitive inhibition, and the in E. coli was assayed at a final concentration of 0.1 nM against a biotinylated IC EPZ005687towas of the oligonucleosome 50 ofcorresponding peptide H3:1independent 15. His-tagged EHMT1 (residues 9511235) substrate concentration (Fig. at 2c expressed in E. 
coli was assayed a). final concentration of 0.1 nM against a biotinylated to H3:1 15.EPZ005687 Full-length glutathione S-transferase The peptide above corresponding data suggested that binds in the SAM (GST)-tagged PRMT1 expressed in Spodoptera frugiperda cells was assayed within pocket of the EZH2 SET domain. Definitive proof of binding at a final concentration of 0.75 nM against biotinylated peptide corresponding the SAM pocket of the enzyme requires structural confirmation to H4:3650. GST-tagged PRMT3 (residues 2531) expressed in E. coli was by crystallographic or NMR however, the multisubunit assayed at a final concentration of 0.5methods; nM against a biotinylated peptide with the nature enzymatically active PRC2 presents a challenge with sequence of biotin-aminohexyl-GGRGGFGGRGGFGGRGGFG-amide. Flag-tagged full-length PRMT5 expressed in 293 cells was assayed at a final respect to structural biology. Indeed, though the concentration structure of the of 1.5 nM against a biotinylated peptide corresponding to H4:1 15. His-tagged embryonic ectoderm development (EED) subunit of PRC2 has been PRMT6 (residues 2375) expressed in 293 cells was assayed at 27 29 a final concentradetermined by high-resolution crystallography , there have been tion of 1 nM against a peptide corresponding to H4:N3650 with monomethylated no literature reports of intact or assayed EZH2incrystal structures. Lys44. Full-length PRMT8 expressed inPRC2 E. coli was a final concentration Our efforts to generate apo or cocrystal structures the entire of 1.5 own nM against a biotinylated peptide corresponding to H4:3145.of Full-length SETD7 expressed inwith E. coli EPZ005687 was assayed at a final concentration of 1 Additionally, nM against a PRC2 complex were unsuccessful. biotinylated peptide corresponding to H3:1 15. Full-length SMYD3 in biophysical methods to confirm binding to theFlag-tagged EZH2 subunit was expressed in E. 
coli and assayed at a final concentration of 50 nM against isolation was not possible owing to the poor solubility, recombinant histone H4. His-tagged full-length SMYD2 was assayed at aunstable final contertiary and complete peptide absence of enzymatic activity of centration structure of 1 nM against a biotinylated corresponding to H4:36 50. Flag30 and His-tagged full-lengthTherefore, WHSC1 was a expressed in 293 cells and analysis assayed at a was the isolated subunit. Yonetani-Theorell

ARTICLE
N N

O chicken erythrocyte oligonucleosomes. Flagfinal concentration of 2.5 nM against N tagged full-length WHSC1L1 was expressed in S. frugiperda cells and was assayed N O N N N O at a final concentration of 4 nM against chicken erythrocyte oligonucleosomes.

Cell culture. Lymphoma cell lines OCI-LY19 (ACC-528), WSU-DLCL2 (ACC-575) O and HN Karpas422 (ACC-32) were obtained from Deutsche Sammlung von O O HN O Mikroorganismen und Zellkulturen. Toledo (CRL-2631), HT (CRL-2260), Pfeiffer HN and SUDHL6 (CRL-2959) cell lines HN were obtained from American (CRL-2632) Type Culture Collection. DOHH2 (HTL99022) was obtained from Banca Biologica e Cell Factory. SUDHL6 and Karpas422 cell lines were cultured in RPMI plus 20% (2)plus 10% (v/v) FBS. 1) other cell lines were cultured in RPMI (v/v) FBS, and ( all Analysis of long-term proliferation and cell cycle. Proliferation and cell cycle analysis were performed as previously described34, with slight exceptions. For the 11-d proliferation assay, plating N densities were determined for each cell lineNon the O basis of linear log-phase growth. CellsOwere countedN and split back to the original N N days 4 and 7. Viable cell plating N density in fresh medium with EPZ005687 on counts and IC50 calculations were performed as previously described34, and LCC calculations were performed as described in Supplementary Methods. O 12-well Ocell HN O HN For cycle, O WSU-DLCL2 cells were plated in plates at a density of 1 105 cells per ml. Cells were incubated with EPZ005687 at 0.2 M, 0.67 M, HN HN 2 M and 6 M, in a total of 2 ml, over a course of 10 d. All remaining cell cycle analysis was performed as previously described34.
PRC2 Received 19KMarch 2012;(4) EPZ005687 i = 80 nM 2012; accepted 13 July published online 30 September 2012 PRC2 Ki = 24 nM (3) PRC2 Ki = 310 nM PRC2 Ki = 180 nM

METHODS characterization of EPZ005687 Biochemical

complex containing the Enhancer of Zeste protein. Genes Dev. 16, 28932905 (2002). 2. Cao, R. et al. Role of histone H3 lysine 27 methylation in Polycomb-group silencing. Science 298, 1039 1043 (2002 ). performed to determine whether EPZ005687 bound in a mutually 3. Agger, K. et al. UTX and JMJD3 are histone H3K27 demethylases involved in exclusive fashion with SAH. In a previous report, we HOX gene regulation and development. Nature 449, 731 734demonstrated (2007). that SAH EZH2 SAM-competitive aas Ki 4. Hong , S. inhibits et al. Identifi cationin of a JmjC domain-containingmanner UTX and with JMJD3 of 7.5 M (ref. 31 ).demethylases The structural similarity between and histone H3 lysine 27 . Proc. Natl. Acad. Sci. USA 104SAM , 18439 18444overlapping (2007). SAH implies binding sites for these two ligands, and 5. Lee , M.G. et al. H3K27 regulates polycomb recruitment this inference isDemethylation confirmed of by crystallographic analysis of SAM and H2A ubiquitination. Science 318, 447450 (2007). and SAH complexes of a number of PMTs (reviewed in ref. 32). 6. Lan, F. et al. A histone H3 lysine 27 demethylase regulates animal posterior Figure 2d shows a Yonetani-Theorell development . Nature 449, 689694 (2007).plot of the reciprocal of reac7. De Santa, F.as et al. The histone lysine-27 demethylase Jmjd3 links tion velocity a function ofH3 SAH concentration at several different inflammation to inhibition of polycomb-mediated gene fit silencing Cell 130, of EPZ005687 concentrations. The data were best with . a series 10831094 (2007). parallel lines, indicative of mutually exclusive binding of the two 8. Kleer, C.G. et al. EZH2 is a marker of aggressive breast cancer and promotes inhibitors. these data epithelial suggest cells that EPZ005687 inhibits neoplastic Overall, transformation of breast . Proc. Natl. Acad. Sci. USA EZH2 binding the 100,by 11606 11611in (2003 ).SAM pocket. 9. EPZ005687 Varambally, S. et The polycomb group protein EZH2 is involved is al. 
a potent and selective inhibitor of PRC2 in activity. prostate of cancer . Nature 419, 624 629 (2002 ). Weprogression tested theofactivity the compound against a panel of 15 other 10. Kirmizis, A. et al. Silencing of human polycomb target genes is associated human PMTs and 6 EZH2 enzymes with point mutations in the SET with methylation of histone H3 Lys 27. Genes Dev. 18, 15921605 (2004). domain at, Tyr641 or Ala677. As illustrated the ligand affinity map 11. Bracken A.P. et al. EZH2 is downstream of the in pRB-E2F pathway, essential (Fig. ), EPZ005687 >500-fold selectivity against of (the tested for3 proliferation and had amplifi ed in cancer . EMBO J. 22 , 5323all 5335 2003 ). 12. Simon , J.A.the & Lange , C.A. Roles of the EZH2 histone PRC2 methyltransferase PMTs, with exception of the closely related complex in concancer epigenetics . Mutat. 647, 21 29selectivity (2008). taining EZH1 in place ofRes. EZH2. The of EPZ005687 was 13. Velichutina, I. et al. EZH2-mediated epigenetic silencing in germinal center B further evaluated by measuring its ability to displace radioligands cells contributes to proliferation and lymphomagenesis. Blood 116, 52475255 from 77 (2010 ). human ion channels and G proteincoupled receptors. At a concentration of 10 M, EPZ005687 did not displace radio14. van Haaften, G. et al. Somatic mutations of the histone H3K27 demethylase gene UTX inmost human . Nat. Genet. 41, 521 523 (2009). for only four ligands from ofcancer the targets tested. Radioligands 15. Wang , S., Robertson , G.P . &more Zhu, J.than A novel homologue of Drosophila targets were displaced by 50human % (Supplementary Results, polycomblike gene is up-regulated in multiple cancers. Gene 343, 6978 (2004). Supplementary Table 1 ), and the lowest IC extrapolated for any of 50 16. Morin, R.D. et al. Somatic mutations altering EZH2 (Tyr641) in follicular and these 1.5 M, indicating a selectivity of >60-fold. difftargets use largewas B-cell lymphomas of germinal-center origin . 
Nat. Genet. 42, EPZ005687 also showed ~50-fold selectivity for EZH2 over 181185 (2010). 1 17. Lohr, J.G. et al. Discovery and of somatic in diffuse EZH1-containing PRC2 ( prioritization Gbinding > 2 kcal molmutations ; Supplementary large B-cell lymphoma (DLBCL) by whole-exome sequencing . Proc. Natl. was Fig. 1 and Supplementary Table 1). The affinity of EPZ005687 Acad. Sci. USA 109, 38793884 (2012). similar (within a two-fold range) of for PRC2 complexes 18. Morin , R.D. et al. Frequent mutation histone-modifying genescontaining in wild-type and lymphoma Tyr641 mutant EZH2. In ( contrast, the compound non-Hodgkin . Nature 476 , 298303 2011). had significantly affinity for the A677Gplus mutant 19. Sneeringer , C.J. et greater al. Coordinated activities of wild-type mutantenzyme EZH2 drive tumor-associated hypertrimethylation of lysine 27 on histone H3 (5.4-fold; P < 0.05). These findings were consistent across several (H3K27) in human B-cell lymphomas . Proc. Natl. Acad. Sci.Taken USA 107 , hundred compounds within this chemotype series. together 2098020985 (2010). with our previous demonstration that the Kact SAM is unaffected m of 20. Yap , D.B . et al. Somatic mutations at EZH2 Y641 dominantly through a 19,33 by mechanism the mutations , this observation implies that the structural of selectively altered PRC2 catalytic activity, to increase H3K27 trimethylation . Blood 117 2451 2459 (2011 ). recognition elements of, the indazole series do not differ between

Figure 1 | Chemical structures of PRC2 inhibitors. Wild-type EZH2-containing PRC2 Ki values shown are the mean of at least two independent experiments, with each experiment run in duplicate.

NATURE CHEMICAL BIOLOGY | VOL 8 | NOVEMBER 2012 | www.nature.com/naturechemicalbiology
NATURE REPRINT COLLECTION Epigenetics

ARTICLE

NATURE CHEMICAL BIOLOGY DOI: 10.1038/NCHEMBIO.1084


by analyzing the supernatant and boiled magnetic beads by SDS-PAGE. The PRC2 complex was pulled down intact regardless of whether or not EPZ005687 was bound, and the supernatant was not enriched for any displaced subunit relative to the DMSO control (Supplementary Fig. 2).

Intracellular inhibition of H3K27 methylation
We next tested the ability of EPZ005687 to block methylation of the PRC2 substrate H3K27 within lymphoma cells. Figure 4a and Supplementary Figure 3 illustrate a typical western blot against H3K27me3 at increasing concentrations of EPZ005687 for the EZH2 wild-type lymphoma cell line OCI-LY19 and demonstrate a clear concentration-dependent inhibition of H3K27me3. Quantification of H3K27me3 by ELISA yielded an IC50 of 80 ± 30 nM for the blot in Figure 4a. Similar results were obtained for additional EZH2 wild type, EZH2 Tyr641 and Ala677 mutant lymphoma cell lines as well as for cell lines of other cancer types, including breast and prostate cancer. Thus we conclude that the compound is cell permeable and inhibits methylation of the physiologically relevant substrate of PRC2.

The exquisite selectivity of EPZ005687 for PRC2, demonstrated in biochemical assays (Fig. 3), is recapitulated within the cellular milieu. This is illustrated for the wild-type lymphoma line OCI-LY19 and the Y641F mutant-bearing lymphoma line WSU-DLCL2 in Figure 4b and c, respectively. In these experiments, histones were isolated from cells after treatment with or without a high concentration of EPZ005687 (5.6 µM) for 4 d and probed for a broad panel of histone post-translational modifications (Figure 4b,c and Supplementary Fig. 4). The only histone methyl marks decreased by compound treatment are those at H3K27. In the wild-type OCI-LY19 cell line, both H3K27me3 and H3K27me2 are greatly reduced by compound treatment. The EZH2 mutant WSU-DLCL2 cell line, however, showed a decrease only in the H3K27me3 mark; it was not possible to observe a decrease in H3K27me2 owing to the already undetectable dimethylation in Tyr641 mutant cell lines. To our surprise, the amount of monomethylated H3K27 seemed to be unaffected by compound treatment in both cell types, suggesting that H3K27 monomethylation may be carried out by enzymes other

Figure 3 | Ligand affinity maps of EPZ005687 across the family trees of human lysine methyltransferases and arginine methyltransferase enzymes show EPZ005687 is a selective and potent inhibitor of EZH2 and EZH1 enzymes. The Ki of EPZ005687 was measured across a panel of recombinant lysine methyltransferase (KMT; left) and arginine methyltransferase (RMT; right) enzymes at balanced conditions24 of both the SAM and peptide or protein substrates. The EZH2 Tyr641 and Ala677 mutant enzymes are indicated above wild-type EZH2. The Ki values were converted to pKi values and used to generate red circles of proportional sizes to indicate the extent of inhibition as shown in the legend. Larger circles correlate to increased potency versus the enzymes, and gray circles indicate that inhibition was not measurable at concentrations up to 50 µM of EPZ005687.
Enzymes shown in the maps: EZH2 (wild type and the Y641N, Y641S, Y641H, Y641F, Y641C and A677G mutants), EZH1, PRMT1, PRMT3, PRMT5, PRMT6, PRMT8, CARM1, DOT1L, SMYD2, SMYD3, SETD7, EHMT1, EHMT2, WHSC1 and WHSC1L1. Legend: Ki (M), from 10^-9 to >5 × 10^-5.

Acknowledgments
We thank D. Johnston and A. Basavapathruni for performing DOT1L and WHSC1 enzyme selectivity assays, K. Kuplast for help with the LCC calculations, A. Santospago for preparation of assay plates and R. Gould for helpful discussions.

Author contributions
L.J. made the enzymes. K.W.K. and E.J.O. designed compounds including EPZ005687. T.J.W., C.R.M. and C.J.S. performed the enzyme inhibition assays, and T.J.W. performed substrate competitions, Yonetani-Theorell analysis and the in vitro EZH2 pull-down assay. S.K.K., N.M.W., C.J.A., C.R.K., J.S. and J.D.S. performed the intracellular inhibition of H3K27 methylation ELISA. S.K.K. and N.M.W. performed the western blotting of all methyl marks and proliferation assays. S.K.K., N.M.W. and J.J.S. performed gene expression and cell cycle experiments. S.K.K., T.J.W., K.W.K., A.R., J.J.S., M.P.S., R.M.P., R.C., M.P.M., V.M.R., R.A.C. and H.K. designed studies and interpreted results. S.K.K., T.J.W., K.W.K. and R.A.C. wrote the paper.

Competing financial interests
The authors declare competing financial interests: details accompany the online version of the paper.

Additional information
Supplementary information, chemical compound information and chemical probe information is available in the online version of the paper. Reprints and permissions information is available online at http://www.nature.com/reprints/index.html. Correspondence and requests for materials should be addressed to K.W.K.

21. McCabe, M.T. et al. Mutation of A677 in histone methyltransferase EZH2 in human B-cell lymphoma promotes hypertrimethylation of histone H3 on lysine 27 (H3K27). Proc. Natl. Acad. Sci. USA 109, 2989–2994 (2012).
22. Miranda, T.B. et al. DZNep is a global histone methylation inhibitor that reactivates developmental genes not silenced by DNA methylation. Mol. Cancer Ther. 8, 1579–1588 (2009).
23. Tan, J. et al. Pharmacologic disruption of Polycomb-repressive complex 2-mediated gene repression selectively induces apoptosis in cancer cells. Genes Dev. 21, 1050–1063 (2007).
24. Copeland, R.A. Evaluation of enzyme inhibitors in drug discovery. A guide for medicinal chemists and pharmacologists (John Wiley & Sons, 2005).
25. Duquenne, C. et al. Indazoles. International patent application PCT WO2011140325 (2011).
26. Burgess, J. et al. Azaindazoles. International patent application PCT WO2012005805 (2012).
27. Xu, C. et al. Binding of different histone marks differentially regulates the activity and specificity of polycomb repressive complex 2 (PRC2). Proc. Natl. Acad. Sci. USA 107, 19266–19271 (2010).
28. Han, Z. et al. Structural basis of EZH2 recognition by EED. Structure 15, 1306–1315 (2007).
29. Margueron, R. et al. Role of the polycomb protein EED in the propagation of repressive histone marks. Nature 461, 762–767 (2009).
30. Yonetani, T. & Theorell, H. Studies on liver alcohol hydrogenase complexes. 3. Multiple inhibition kinetics in the presence of two competitive inhibitors. Arch. Biochem. Biophys. 106, 243–251 (1964).
31. Richon, V.M. et al. Chemogenetic analysis of human protein methyltransferases. Chem. Biol. Drug Des. 78, 199–210 (2011).
32. Chapman, P.B. et al. Improved survival with vemurafenib in melanoma with BRAF V600E mutation. N. Engl. J. Med. 364, 2507–2516 (2011).
33. Wigle, T.J. et al. The Y641C mutation of EZH2 alters substrate specificity for histone H3 lysine 27 methylation states. FEBS Lett. 585, 3011–3014 (2011).
34. Daigle, S.R. et al. Selective killing of mixed lineage leukemia cells by a potent small-molecule DOT1L inhibitor. Cancer Cell 20, 53–65 (2011).
35. Ben-Porath, I. et al. An embryonic stem cell-like gene expression signature in poorly differentiated aggressive human tumors. Nat. Genet. 40, 499–507 (2008).
36. Dornan, D. et al. Therapeutic potential of an anti-CD79b antibody-drug conjugate, anti-CD79b-vc-MMAE, for the treatment of non-Hodgkin lymphoma. Blood 114, 2721–2729 (2009).
37. Renan, M.J. How many mutations are required for tumorigenesis? Implications from human cancer data. Mol. Carcinog. 7, 139–146 (1993).
38. Kaelin, W.G. Jr. Choosing anticancer drug targets in the postgenomic era. J. Clin. Invest. 104, 1503–1506 (1999).
39. Li, R. & Stafford, J.A. Kinase Inhibitor Drugs (John Wiley & Sons, Inc., 2009).
40. Tsai, J. et al. Discovery of a selective inhibitor of oncogenic B-Raf kinase with potent antimelanoma activity. Proc. Natl. Acad. Sci. USA 105, 3041–3046 (2008).
41. Kwak, E.L. et al. Anaplastic lymphoma kinase inhibition in non-small-cell lung cancer. N. Engl. J. Med. 363, 1693–1703 (2010).
42. Copeland, R.A. Protein methyltransferase inhibitors as personalized cancer therapeutics. Drug Discov. Today Ther. Strateg. published online, doi:10.1016/j.ddstr.2011.08.001 (16 September 2011).
43. Copeland, R.A., Solomon, M.E. & Richon, V.M. Protein methyltransferases as a target class for drug discovery. Nat. Rev. Drug Discov. 8, 724–732 (2009).
44. Vedadi, M. et al. A chemical probe selectively inhibits G9a and GLP methyltransferase activity in cells. Nat. Chem. Biol. 7, 566–574 (2011).
45. Ernst, T. et al. Inactivating mutations of the histone methyltransferase gene EZH2 in myeloid disorders. Nat. Genet. 42, 722–726 (2010).
46. Nikoloski, G. et al. Somatic mutations of the histone methyltransferase gene EZH2 in myelodysplastic syndromes. Nat. Genet. 42, 665–667 (2010).
47. Ntziachristos, P. et al. Genetic inactivation of the polycomb repressive complex 2 in T cell acute lymphoblastic leukemia. Nat. Med. 18, 298–301 (2012).

Figure 2 | EPZ005687 is a SAM-competitive inhibitor of EZH2 enzyme activity. (a) Inhibition of EZH2 when activity is assessed under balanced conditions24 for both SAM and peptide substrates using a Flashplate assay to measure the transfer of a tritiated methyl group from SAM to the peptide. The data are fit to a standard Langmuir isotherm for inhibition, and the IC50 of EPZ005687 was calculated to be 54 ± 5 nM with a Hill slope of 1. The data shown are the average and s.d. of seven independent duplicate runs. (b) Plot of IC50 values of EPZ005687 as a function of SAM concentration relative to the Km of SAM ([SAM]/Km) measured using a Flashplate assay similar to the IC50 measurements described above. These values show a linear relationship, as expected for SAM-competitive inhibition with a Ki of 24 ± 7 nM (s.d. of three experiments). (c) Plot of IC50 values of EPZ005687 as a function of chicken erythrocyte oligonucleosome concentration relative to the Km of nucleosome ([Nucleosome]/Km) measured using a filter-binding microplate assay to measure the transfer of tritiated methyl groups from SAM to the oligonucleosome. As expected for a noncompetitive inhibitor with respect to this substrate, the IC50 is unaffected as the concentration of oligonucleosome is increased. The mean and standard error of three experiments are shown. (d) Yonetani-Theorell analysis of SAH and EPZ005687 indicates that they are mutually exclusive inhibitors of PRC2. Assays were performed by combining several concentrations of SAH and EPZ005687 and yielded a series of parallel lines in a plot of 1/velocity as a function of SAH concentration for several concentrations of EPZ005687 tested. The mean and standard error of three experiments are shown.
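The two kinetic relationships named in the Figure 2 caption can be sketched numerically. The Python snippet below is an illustrative sketch only, not the authors' analysis code. It assumes the standard Cheng-Prusoff relation for a competitive inhibitor (the linear plot in panel b) and the textbook Yonetani-Theorell equation for two inhibitors (the parallel lines in panel d). The function names, the SAH and EPZ005687 constants used in the parallel-line check and the concentration grid are hypothetical; only the Ki of 24 nM is taken from the caption.

```python
import math


def ic50_competitive(ki, substrate_over_km):
    """Cheng-Prusoff relation for a competitive inhibitor: IC50 = Ki * (1 + [S]/Km)."""
    return ki * (1.0 + substrate_over_km)


def reciprocal_velocity(v0, i, j, ki, kj, alpha=math.inf):
    """Yonetani-Theorell equation for two inhibitors I and J.

    1/v = (1/v0) * (1 + [I]/Ki + [J]/Kj + [I][J] / (alpha * Ki * Kj))
    alpha = infinity means I and J cannot bind at the same time.
    """
    cross = 0.0 if math.isinf(alpha) else (i * j) / (alpha * ki * kj)
    return (1.0 + i / ki + j / kj + cross) / v0


def slope_vs_i(v0, j, ki, kj, alpha=math.inf, i_lo=0.0, i_hi=1000.0):
    """Slope of the 1/v versus [I] line at a fixed concentration of J."""
    lo = reciprocal_velocity(v0, i_lo, j, ki, kj, alpha)
    hi = reciprocal_velocity(v0, i_hi, j, ki, kj, alpha)
    return (hi - lo) / (i_hi - i_lo)


if __name__ == "__main__":
    # Ki = 24 nM is quoted in the caption; the [SAM]/Km ratios are illustrative.
    for ratio in (0.5, 1.0, 2.0, 4.0):
        print(f"[SAM]/Km = {ratio}: predicted IC50 = {ic50_competitive(24.0, ratio):.0f} nM")

    # Hypothetical apparent constants (nM): I = SAH, J = EPZ005687.
    for epz in (0.0, 50.0, 100.0):
        print(f"[EPZ005687] = {epz} nM: slope = {slope_vs_i(1.0, epz, 5000.0, 50.0):.2e}")
```

With [SAM] equal to its Km (balanced conditions), the first relation predicts an IC50 of 2 × Ki = 48 nM, which is close to the 54 ± 5 nM reported for panel a. In the second part, the slope does not depend on the EPZ005687 concentration when alpha is infinite, which is the parallel-line pattern described for mutually exclusive inhibitors; a finite alpha would make the lines converge instead.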

wild-type and Tyr641 mutations. However, the enhanced affinity for the A677G mutant leads us to surmise that EPZ005687 may engage additional interactions as a result of this mutation. Similarly, the significantly (P < 0.05) diminished affinity of the compound for EZH1-containing PRC2, which contains the identical Suz12, EED and RbAp48 subunits, likewise suggests that compound affinity is affected by key recognition elements of binding within these closely related catalytic subunits and not by the other three members of the holoenzyme complex. In aggregate, the SAM-competitive inhibition modality, mutually exclusive binding with SAH and impact on binding affinity of A677G or EZH1 substitution for wild-type EZH2 in the PRC2 complex strongly lead us to infer that the binding site for EPZ005687 is contained within the catalytic EZH2 or EZH1 subunit of the PRC2 complex and is likely to overlap with the binding site for SAM. We have further demonstrated that EPZ005687 is a direct inhibitor of PRC2 enzymatic activity and does not function by disrupting the protein-protein interactions among the PRC2 subunits. This was shown by performing a magnetic Flag pulldown of the wild-type PRC2 complex containing a Flag-tagged EED subunit with and without saturating concentrations of EPZ005687 and


a selective DOT1L inhibitor, and was used to elucidate the causal role of DOT1L enzymatic activity in MLL-rearranged leukemia34. In the present work, we identified EPZ005687 as a potent and selective inhibitor of wild-type and mutant EZH2-containing PRC2 enzymatic activity. We showed that the compound selectively inhibits H3K27 methylation in cells and that this translated into selective cell killing for lymphoma cells that contain heterozygous EZH2 mutations at Tyr641 or Ala677. These data established a critical and unique dependency on PRC2 enzymatic activity for the lymphoma cell lines that bear these EZH2 mutations. This dependency is equivalent to the concept of oncogene addiction, in which cells become abnormally dependent on the biochemical activity of a specific oncogene product for growth, survival or both, such that ablation of the oncogene is cytotoxic in the genetically altered cells but inconsequential to growth of normal cells. The present results provide a compelling foundation for the clinical use of selective EZH2 inhibitors for the treatment of mutant-bearing lymphomas. The current compound represents a chemical biological probe for in vitro experiments, and we do not suggest that this compound itself could form the basis for patient treatment. Pharmacological optimization of compounds such as EPZ005687 holds great promise for this eventual outcome.

Genetic alterations in EZH2 and other PRC2 subunits are not limited to the Tyr641 and Ala677 mutations observed in lymphoma. A broad spectrum of genetic alterations of PRC2 has been documented in a range of hematologic and solid tumors. Notably, in myeloid malignancies and T-cell leukemia, mutations in EZH2 and other PRC2 components lead to a loss of function of the complex45–47. The fact that both activating and inactivating mutations of EZH2 are associated with malignancy is remarkable and reflects the complex role of PRC2 target genes in cell fate decisions. EPZ005687 is shown here to be an equally potent inhibitor of both wild-type and Tyr641 or Ala677 mutants of EZH2, suggesting that pharmacologically optimized inhibitors with this inhibition profile may be useful in the treatment of a number of human cancers wherein gain-of-enzymatic function of PRC2 drives disease.

METHODS
Determination of inhibitor IC50 values in the PMT panel. Values for enzymes in the histone methyltransferase panel were determined under balanced assay conditions with both SAM and protein or peptide substrate present at concentrations equal to their respective Km values24. Where a peptide was used as a methyl-accepting substrate, the peptide is referred to here by the histone and residue numbers that it represents. For example, peptide H3:16–30 refers to a peptide representing histone H3 residues 16 through 30. All reactions were run at 25 °C in a 50-µl volume with 2% (v/v) DMSO in the final reaction. Flag- and His-tagged CARM1 (residues 2–585) expressed in 293 cells was assayed at a final concentration of 0.25 nM against a biotinylated peptide corresponding to histone H3:16–30 with a monomethylated Arg26. His-tagged Dot1L (residues 1–416) expressed in Escherichia coli was assayed at a final concentration of 0.25 nM against chicken erythrocyte oligonucleosomes. His-tagged EHMT2 (residues 913–1193) expressed in E. coli was assayed at a final concentration of 0.1 nM against a biotinylated peptide corresponding to H3:1–15. His-tagged EHMT1 (residues 951–1235) expressed in E. coli was assayed at a final concentration of 0.1 nM against a biotinylated peptide corresponding to H3:1–15. Full-length glutathione S-transferase (GST)-tagged PRMT1 expressed in Spodoptera frugiperda cells was assayed at a final concentration of 0.75 nM against biotinylated peptide corresponding to H4:36–50. GST-tagged PRMT3 (residues 2–531) expressed in E. coli was assayed at a final concentration of 0.5 nM against a biotinylated peptide with the sequence biotin-aminohexyl-GGRGGFGGRGGFGGRGGFG-amide. Flag-tagged full-length PRMT5 expressed in 293 cells was assayed at a final concentration of 1.5 nM against a biotinylated peptide corresponding to H4:1–15. His-tagged PRMT6 (residues 2–375) expressed in 293 cells was assayed at a final concentration of 1 nM against a peptide corresponding to H4:N36–50 with monomethylated Lys44. Full-length PRMT8 expressed in E. coli was assayed at a final concentration of 1.5 nM against a biotinylated peptide corresponding to H4:31–45. Full-length SETD7 expressed in E. coli was assayed at a final concentration of 1 nM against a biotinylated peptide corresponding to H3:1–15. Full-length Flag-tagged SMYD3 was expressed in E. coli and assayed at a final concentration of 50 nM against recombinant histone H4. His-tagged full-length SMYD2 was assayed at a final concentration of 1 nM against a biotinylated peptide corresponding to H4:36–50. Flag- and His-tagged full-length WHSC1 was expressed in 293 cells and assayed at a final concentration of 2.5 nM against chicken erythrocyte oligonucleosomes. Flag-tagged full-length WHSC1L1 was expressed in S. frugiperda cells and was assayed at a final concentration of 4 nM against chicken erythrocyte oligonucleosomes.

Cell culture. Lymphoma cell lines OCI-LY19 (ACC-528), WSU-DLCL2 (ACC-575) and Karpas422 (ACC-32) were obtained from Deutsche Sammlung von Mikroorganismen und Zellkulturen. Toledo (CRL-2631), HT (CRL-2260), Pfeiffer (CRL-2632) and SUDHL6 (CRL-2959) cell lines were obtained from American Type Culture Collection. DOHH2 (HTL99022) was obtained from Banca Biologica e Cell Factory. SUDHL6 and Karpas422 cell lines were cultured in RPMI plus 20% (v/v) FBS, and all other cell lines were cultured in RPMI plus 10% (v/v) FBS.

Analysis of long-term proliferation and cell cycle. Proliferation and cell cycle analysis were performed as previously described34, with slight exceptions. For the 11-d proliferation assay, plating densities were determined for each cell line on the basis of linear log-phase growth. Cells were counted and split back to the original plating density in fresh medium with EPZ005687 on days 4 and 7. Viable cell counts and IC50 calculations were performed as previously described34, and LCC calculations were performed as described in Supplementary Methods. For cell cycle, WSU-DLCL2 cells were plated in 12-well plates at a density of 1 × 10^5 cells per ml. Cells were incubated with EPZ005687 at 0.2 µM, 0.67 µM, 2 µM and 6 µM, in a total of 2 ml, over a course of 10 d. All remaining cell cycle analysis was performed as previously described34.

Received 19 March 2012; accepted 13 July 2012; published online 30 September 2012

References
1. Kuzmichev, A., Nishioka, K., Erdjument-Bromage, H., Tempst, P. & Reinberg, D. Histone methyltransferase activity associated with a human multiprotein complex containing the Enhancer of Zeste protein. Genes Dev. 16, 2893–2905 (2002).
2. Cao, R. et al. Role of histone H3 lysine 27 methylation in Polycomb-group silencing. Science 298, 1039–1043 (2002).
3. Agger, K. et al. UTX and JMJD3 are histone H3K27 demethylases involved in HOX gene regulation and development. Nature 449, 731–734 (2007).
4. Hong, S. et al. Identification of JmjC domain-containing UTX and JMJD3 as histone H3 lysine 27 demethylases. Proc. Natl. Acad. Sci. USA 104, 18439–18444 (2007).
5. Lee, M.G. et al. Demethylation of H3K27 regulates polycomb recruitment and H2A ubiquitination. Science 318, 447–450 (2007).
6. Lan, F. et al. A histone H3 lysine 27 demethylase regulates animal posterior development. Nature 449, 689–694 (2007).
7. De Santa, F. et al. The histone H3 lysine-27 demethylase Jmjd3 links inflammation to inhibition of polycomb-mediated gene silencing. Cell 130, 1083–1094 (2007).
8. Kleer, C.G. et al. EZH2 is a marker of aggressive breast cancer and promotes neoplastic transformation of breast epithelial cells. Proc. Natl. Acad. Sci. USA 100, 11606–11611 (2003).
9. Varambally, S. et al. The polycomb group protein EZH2 is involved in progression of prostate cancer. Nature 419, 624–629 (2002).

Figure 4 | EPZ005687 specifically inhibits H3K27 methylation in lymphoma cells. (a) The wild-type EZH2 lymphoma cell line OCI-LY19 shows a dose-dependent decrease in H3K27me3 after treatment with EPZ005687 for 96 h. (b,c) A wild-type lymphoma cell line, OCI-LY19 (b), and a mutant lymphoma cell line, WSU-DLCL2 (c), show the specificity of H3K27 methylation inhibition by EPZ005687 across a broad panel of histone methylation marks. Quantification of methylation changes is represented in the bar graphs to the right of each panel of western blots. Representative western blots (n = 1) were normalized to corresponding total H3 and expressed as percent change in EPZ005687-treated versus DMSO-treated cells.

Figure 5 | EPZ005687 decreases proliferation in mutant but not wild-type EZH2 lymphoma cells. (a–c) Wild-type OCI-LY19 (WT) cells (a), WSU-DLCL2 (Y641F) (b) and Pfeiffer (A677G) (c) cells were treated with EPZ005687 over an 11-d time course, and proliferation was measured at the indicated time points. The viable cell count (y axis) in each panel is presented on a logarithmic scale as the mean of triplicates ± s.e.m. The proliferation IC50 and LCC values are listed in Supplementary Table 2.
Legend: concentration of EPZ005687 (µM): DMSO, 0.011, 0.034, 0.10, 0.31, 0.93, 2.8, 8.3.

than EZH2-containing PRC2, such as EZH1-containing PRC2 (as described above). A modest increase in H3K27 acetylation was also observed upon treatment of the OCI-LY19 cells with compound. Additionally, a slight increase in H3K36me2 was seen in the WSU-DLCL2 cells treated with EPZ005687. The degree of interplay between these two methylation marks may be dependent on cell context, as the increase in H3K36me2 was not observed in the OCI-LY19 cell line upon inhibition of EZH2.

Impact of EPZ005687 on cell growth
Having established that EPZ005687 can enter cells and selectively affect H3K27 methylation, we investigated the impact of PRC2 inhibition on cell growth in wild-type and mutant lymphoma cell lines. We studied the effects of varying concentrations of EPZ005687 on three lymphoma lines, OCI-LY19, WSU-DLCL2 and Pfeiffer. These cell lines respectively contain wild-type EZH2, EZH2 Y641F and EZH2 A677G. Increasing concentrations of compound from 0.011 µM to 8.3 µM had a minimal effect on proliferation of OCI-LY19 cells over the course of 11 d (Fig. 5a). In contrast, EPZ005687 had a notable effect on proliferation of the EZH2 Y641F-bearing cell line (Fig. 5b). In all of the EZH2 Y641F-bearing cell lines tested (described below), there was a consistent and reproducible latency period of 4 d over which the compound seemed to have little impact on cell growth followed by a period from 4–11 d in which the impact of the compound was fully realized. A time course of H3K27me3 inhibition in cells treated with EPZ005687 (Supplementary Fig. 5) demonstrated that the diminution of H3K27me3 was apparent within 24 h but was not fully realized until day 4 and beyond. Remarkably, when the potent and selective DOT1L inhibitor EPZ004777 (ref. 34) was applied to MLL-rearranged leukemia cell lines, a similar latency period in the inhibition of H3K79me2 methylation and cellular proliferation was observed. This delay may be a common feature of inhibitors of PMT enzymatic activity. The latency of the antiproliferative effect was shorter in the Pfeiffer cell line, which contains EZH2 A677G (ref. 21), and this cell line was found to be particularly sensitive to EZH2 inhibition by EPZ005687 (Fig. 5c).

Antiproliferative compounds may affect reduction of cell growth by either causing cell stasis or cell killing. Historically, the effects of such compounds have been quantitatively compared using their IC50 values, which report on the concentration of compound required to reduce the rate of cell growth (or, more typically, the cell number at a specified time point) by half of the untreated control value. We have found the use of IC50 values inadequate to differentiate between cytostatic and cytotoxic effects of compound treatment of cells. Therefore, we propose a new metric for quantifying the effects of antiproliferative compounds on cell growth, the lowest cytotoxic concentration (LCC). The LCC is defined as the concentration of inhibitor at which the proliferative rate becomes zero and represents the crossover point between cytostasis and cytotoxicity. Additional information on the calculation of LCC is presented in the Supplementary Methods.

The IC50 and LCC values for EPZ005687 treatment of multiple wild-type and mutant (EZH2 Y641F, EZH2 Y641N and EZH2 A677G) lymphoma cell lines are summarized in Supplementary Table 3. These data make clear the differential effects of the compound on wild-type and mutant-bearing cells. Though some modest cytostatic effects were observed in wild-type lymphoma cells, the compound showed robust cell killing only for the Tyr641 mutant and EZH2 A677G-bearing lymphoma lines. The wild-type cell lines had LCC values greater than the highest concentration used in the proliferation assay (>25 µM). In contrast, the LCC values for the Tyr641 mutant cell lines were all in the low- to mid-micromolar range, and the LCC for the EZH2 A677G mutant cell line was even more potent (36 nM). Clearly, the presence of heterozygous mutations in the EZH2 SET domain is a key driver of sensitivity to


compound in these lymphoma cells. We believe these data strongly support the notion that the enzymatic activity of PRC2 becomes uniquely required for cell growth and survival of lymphoma cells bearing mutant EZH2; these data therefore point to the change-of-function mutations in EZH2 as causal genetic drivers of lymphomagenesis in these cells.

Impact of EPZ005687 on cell cycle and gene expression
To further explore the mechanism of action of EPZ005687 in

Figure 6 | Inhibition of EZH2 by EPZ005687 results in accumulation in the G1 phase of the cell cycle in an EZH2 Tyr641 mutant lymphoma cell line. (a) Treatment of WSU-DLCL2 cells with EPZ005687 for 4 d results in a dose-dependent increase of accumulation in G1. (b,c) Prolonged exposure of EPZ005687 leads to increases in the sub-G1 population after 7 d (b) and 10 d (c) at the higher doses. Graphs represent the mean of duplicates ± s.e.m.

population of cells, whereas the lower doses resulted in a continued increase of the G1 population (Fig. 6b). The prolonged exposure to EPZ005687 for 10 d led to WSU-DLCL2 cells progressing further toward the sub-G1 population, as seen with both the 2-µM and 6-µM doses (Fig. 6c).

Gene set enrichment analysis (GSEA) of transcriptional profiling data from WSU-DLCL2 cells treated with a high and low dose (6 µM and 1.5 µM, respectively) of EPZ005687 revealed a negative enrichment of cell cycle gene sets as early as 24 h after addition of EPZ005687 (Supplementary Fig. 6a and Supplementary Table 4). These data further complement the cell cycle analysis showing a progression toward G1 accumulation upon treatment of EPZ005687 (Fig. 6). Additional GSEA showed strong enrichment of PRC2-regulated gene sets in WSU-DLCL2 cells treated with EPZ005687. Using a centroblast-repressed gene signature, in which the chosen genes were identified by chromatin immunoprecipitation to be bound by EZH2 and marked with H3K27me3 in centroblast cells relative to naive B cells13, a strong enrichment of this gene set was observed with the higher dose of EPZ005687 at all time points (Supplementary Fig. 6b and Supplementary Table 5). Expression of these genes upon EZH2 inhibition may lead to a more naive B cell or a more differentiated phenotype. Upregulation of a PRC2-repressed gene signature35 is also significantly enriched in the EPZ005687-treated WSU-DLCL2 cells (P < 0.01 across all time points; Supplementary Fig. 6c and Supplementary Table 6), suggesting that small-molecule inhibition of EZH2 can lead to increased expression of known repressed targets of EZH2 (ref. 36). Taken together, the GSEA data strongly suggest that small-molecule inhibition of EZH2 in a Tyr641 mutant lymphoma cell line can lead to derepression of known EZH2 target genes and affect genes specifically repressed by the EZH2 Tyr641 mutant.

DISCUSSION
Chemical probes are increasingly proving indispensable for a molecular understanding of the biology and physiology of cellular processes in normal and disease states. In human cancers, multiple genetic alterations are commonly associated with the genetic instability that leads to transformation of cells to a hyperproliferative malignant phenotype.
It has been estimated that aS.K.K., minimum mutant-bearing lymphoma, we performed cell cycle analysis and tive, M.P.M., V.M.R., R.A.C. and H.K. designed studies and interpreted results. T.J.W., small-molecule DOT1L inhibitor. Cancer Cell 20, 5365 Y641F (2011). K.W.K. and R.A.C. wrote the paper. of five separate genetic alterations must be accumulated to effect transcriptional profiling in WSU-DLCL2 (EZH2 ) mutant lym35. Ben-Porath, I. et al. An embryonic stem cell-like gene expression signature in such transformation37. Because of the genetic instability of cancer phoma cells treated aggressive with EPZ005687. To investigate poorly differentiated human tumors . Nat. Genet. 40, the 499cell 507 killCompeting financial interests ). ing(2008 in mutant lymphoma, WSU-DLCL2 cells were treated with cells, many genetic alterations are observed that do not substanThe authors declare competing financial interests: details accompany the online version 36 . Dornan , D . et al. Th erapeutic potential of an anti-CD79b antibody drug EPZ005687 at concentrations ranging from 0.2 M to 6 M, and cell tially affect cancer transformation or proliferation in a causal manof the paper. conjugate, antiCD79b-vc-MMAE, for the treatment of non-Hodgkin cycle analysis was performed by flow cytometry at 4-, 7- and 10-d ner; such mutations have been referred to a passenger mutations lymphoma. Blood 114, 27212729 (2009). time points treatment (Fig. and Supplementary to distinguish them from the true driver mutations that have a 37. Renan , M.J.after How many mutations are 5 required for tumorigenesis? Table 3). Additional information 38 After 4 d, the from G1 phase the data cell. cycle increased, with causal role in tumorigenesis . Hence,information a major hurdle to the develImplications human of cancer Mol. Carcinog. 7, 139 146 correlative (1993). Supplementary information, chemical compound and chemical probe 38. Kaelin, W .Gthe . Jr. 
Choosing drug in the postgenomic is available in the online version of the paper.on Reprints and permissions decreases in S as wellanticancer as the G2 /Mtargets phases (Fig. 6a). By 7era d,. the information opment of new cancer treatments based molecular targeting J. Clin. Invest. , 15031506(6 (1999 ). led to an increase in the sub-G1 information available online http://www.nature.com/reprints/index.html . mutahighest dose of104 EPZ005687 M) has been is the ability toatdistinguish passenger from driver 39. Li, R. & Stafford, J.A. Kinase Inhibitor Drugs (John Wiley & Sons, Inc., 2009). Correspondence and requests for materials should be addressed to K.W.K.

tions. The use of selective inhibitors of genetically altered enzymes and antagonists of altered receptors has proven valuable in making such distinctions. Over the past decade, numerous kinase inhibitors have become available to the chemical biology community and have been used to probe the impact of selective kinase inhibition on cancer cells39. These studies provide a basis for establishing specific genetic alterations as drivers of particular human cancers and pave the way for the development of targeted therapeutic agents for patients that may be identified by the presence of the specific genetic change. Two contemporary examples of this are provided by the recent US Food and Drug Administration approval of vemurafenib to specifically treat melanoma patients carrying the BRAFV600E mutant32,40 and of crizotinib to specifically treat lung cancer patients with a chromosomal translocation of the ALK gene41. These drugs exemplify a paradigm shift in the clinical treatment of cancer, with increasing reliance on a molecular understanding of the underlying disease and the use of drugs targeted to the genetic alterations that drive a particular individuals cancer. This paradigm has been referred to as personally targeted cancer therapeutics42. The PMTs represent a large class of epigenetic enzymes that have a paramount role in the control of gene transcription. Several examples of genetic alterations in specific PMTs have been reported in association with different human cancers43. It thus seems timely to begin to probe the driver status of genetic alterations in PMTs by the use of potent, selective small-molecule inhibitors of specific PMTs. Indeed, a number of specific PMT inhibitors have begun to be reported in the literature42. For example, UNC0638 has been identified as a G9A and GLP inhibitor that modulates H3K9 methylation in cells44, and EPZ004777, a potent

2012 | www.nature.com/naturechemicalbiology NATURE CHEMICAL BIOLOGY | VOL 8 | NOVEMBER NATURE REPRINT COLLECTION Epigenetics

NATURE CHEMICAL BIOLOGY DOI: 10.1038/NCHEMBIO.1084


and selective DOT1L inhibitor, was used to elucidate the causal role of DOT1L enzymatic activity in MLL-rearranged leukemia34. In the present work, we identified EPZ005687 as a potent and selective inhibitor of wild-type and mutant EZH2containing PRC2 enzymatic activity. We showed that the compound selectively inhibits H3K27 methylation in cells and that this translated into selective cell killing for lymphoma cells that contain heterozygous EZH2 mutations at Tyr641 or Ala677. These data established a critical and unique dependency on PRC2 enzymatic activity for the lymphoma cell lines that bear these EZH2 mutations. This dependency is equivalent to the concept of oncogene addiction, in which cells become abnormally dependent on the biochemical activity of a specific oncogene product for growth, survival or both, such that ablation of the oncogene is cytotoxic in the genetically altered cells but inconsequential to growth of normal cells. The present results provide a compelling foundation for the clinical use of selective EZH2 inhibitors for the treatment of mutant-bearing lymphomas. The current compound represents a chemical biological probe for in vitro experiments, and we do not suggest that this compound itself could form the basis for patient treatment. Pharmacological optimization of compounds such as EPZ005687 holds great promise for this eventual outcome. Genetic alterations in EZH2 and other PRC2 subunits are not limited to the Tyr641 and Ala677 mutations observed in lymphoma. A broad spectrum of genetic alterations of PRC2 has been documented in a range of hematologic and solid tumors. Notably, in myeloid malignancies and T-cell leukemia, mutations in EZH2 and other PRC2 components lead to a loss of function of the complex4547. The fact that both activating and inactivating mutations of EZH2 are associated with malignancy is remarkable and reflects the complex role of PRC2 target genes in cell fate decisions. 
EPZ005687 is shown here to be an equally potent inhibitor of both wild-type and Tyr641 or Ala677 mutants of EZH2, suggesting that pharmacologically optimized inhibitors with this inhibition profile may be useful in the treatment of a number of human cancers wherein gain-of-enzymatic function of PRC2 drives disease.
METHODS

Determination of inhibitor IC50 values in the PMT panel. Values for enzymes in the histone methyltransferase panel were determined under balanced assay conditions with both SAM and protein or peptide substrate present at concentrations equal to their respective Km values²⁴. Where a peptide was used as a methyl-accepting substrate, the peptide is referred to here by the histone and residue numbers that it represents. For example, peptide H3:16–30 refers to a peptide representing histone H3 residues 16 through 30. All reactions were run at 25 °C in a 50-µl volume with 2% (v/v) DMSO in the final reaction. Flag- and His-tagged CARM1 (residues 2–585) expressed in 293 cells was assayed at a final concentration of 0.25 nM against a biotinylated peptide corresponding to histone H3:16–30 with a monomethylated Arg26. His-tagged Dot1L (residues 1–416) expressed in Escherichia coli was assayed at a final concentration of 0.25 nM against chicken erythrocyte oligonucleosomes. His-tagged EHMT2 (residues 913–1193) expressed in E. coli was assayed at a final concentration of 0.1 nM against a biotinylated peptide corresponding to H3:1–15. His-tagged EHMT1 (residues 951–1235) expressed in E. coli was assayed at a final concentration of 0.1 nM against a biotinylated peptide corresponding to H3:1–15. Full-length glutathione S-transferase (GST)-tagged PRMT1 expressed in Spodoptera frugiperda cells was assayed at a final concentration of 0.75 nM against biotinylated peptide corresponding to H4:36–50. GST-tagged PRMT3 (residues 2–531) expressed in E. coli was assayed at a final concentration of 0.5 nM against a biotinylated peptide with the sequence biotin-aminohexyl-GGRGGFGGRGGFGGRGGFG-amide. Flag-tagged full-length PRMT5 expressed in 293 cells was assayed at a final concentration of 1.5 nM against a biotinylated peptide corresponding to H4:1–15. His-tagged PRMT6 (residues 2–375) expressed in 293 cells was assayed at a final concentration of 1 nM against a peptide corresponding to H4:N36–50 with monomethylated Lys44. Full-length PRMT8 expressed in E. coli was assayed at a final concentration of 1.5 nM against a biotinylated peptide corresponding to H4:31–45. Full-length SETD7 expressed in E. coli was assayed at a final concentration of 1 nM against a biotinylated peptide corresponding to H3:1–15. Full-length Flag-tagged SMYD3 was expressed in E. coli and assayed at a final concentration of 50 nM against recombinant histone H4. His-tagged full-length SMYD2 was assayed at a final concentration of 1 nM against a biotinylated peptide corresponding to H4:36–50. Flag- and His-tagged full-length WHSC1 was expressed in 293 cells and assayed at a final concentration of 2.5 nM against chicken erythrocyte oligonucleosomes. Flag-tagged full-length WHSC1L1 was expressed in S. frugiperda cells and was assayed at a final concentration of 4 nM against chicken erythrocyte oligonucleosomes.

Cell culture. Lymphoma cell lines OCI-LY19 (ACC-528), WSU-DLCL2 (ACC-575) and Karpas422 (ACC-32) were obtained from Deutsche Sammlung von Mikroorganismen und Zellkulturen. Toledo (CRL-2631), HT (CRL-2260), Pfeiffer (CRL-2632) and SUDHL6 (CRL-2959) cell lines were obtained from American Type Culture Collection. DOHH2 (HTL99022) was obtained from Banca Biologica e Cell Factory. SUDHL6 and Karpas422 cell lines were cultured in RPMI plus 20% (v/v) FBS, and all other cell lines were cultured in RPMI plus 10% (v/v) FBS.

Analysis of long-term proliferation and cell cycle. Proliferation and cell cycle analysis were performed as previously described³⁴, with slight exceptions. For the 11-d proliferation assay, plating densities were determined for each cell line on the basis of linear log-phase growth. Cells were counted and split back to the original plating density in fresh medium with EPZ005687 on days 4 and 7. Viable cell counts and IC50 calculations were performed as previously described³⁴, and LCC calculations were performed as described in Supplementary Methods. For cell cycle, WSU-DLCL2 cells were plated in 12-well plates at a density of 1 × 10⁵ cells per ml. Cells were incubated with EPZ005687 at 0.2 µM, 0.67 µM, 2 µM and 6 µM, in a total of 2 ml, over a course of 10 d. All remaining cell cycle analysis was performed as previously described³⁴.
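Because cells in the 11-d proliferation assay are split back to the original plating density on days 4 and 7, a raw count taken after a split reflects only the growth since that split. The following Python sketch illustrates one common way to convert such counts into split-adjusted totals. It is an illustration under stated assumptions, not the procedure of ref. 34: the function name, the example counts, the final count on day 11 and the premise that every split restores exactly the original plating density are all hypothetical.

```python
def split_adjusted_counts(plating_density, counts):
    """Convert per-interval viable counts into split-adjusted totals.

    plating_density: viable cells per ml seeded on day 0 and restored at
        every split (assumption: each split returns exactly to this value).
    counts: viable cells per ml measured at each count day, in order,
        for example days 4, 7 and 11 (hypothetical schedule).
    Returns one split-adjusted value per count day.
    """
    if plating_density <= 0:
        raise ValueError("plating_density must be positive")

    adjusted = []
    cumulative_fold = 1.0
    for measured in counts:
        # Fold expansion since the most recent split (or since plating).
        fold = measured / plating_density
        # Chain the fold changes so earlier growth is not discarded.
        cumulative_fold *= fold
        adjusted.append(plating_density * cumulative_fold)
    return adjusted


if __name__ == "__main__":
    # Hypothetical counts for one compound concentration.
    example = split_adjusted_counts(1e5, [8e5, 4e5, 2e5])
    for day, value in zip((4, 7, 11), example):
        print(f"day {day}: {value:.3g} split-adjusted viable cells per ml")
```

Multiplying the fold changes rather than summing raw counts keeps treated and untreated cultures comparable even though both are reset to the same density at each split.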

Received 19 March 2012; accepted 13 July 2012; published online 30 September 2012

References

1. Kuzmichev, A., Nishioka, K., Erdjument-Bromage, H., Tempst, P. & Reinberg, D. Histone methyltransferase activity associated with a human multiprotein complex containing the Enhancer of Zeste protein. Genes Dev. 16, 2893–2905 (2002).
2. Cao, R. et al. Role of histone H3 lysine 27 methylation in Polycomb-group silencing. Science 298, 1039–1043 (2002).
3. Agger, K. et al. UTX and JMJD3 are histone H3K27 demethylases involved in HOX gene regulation and development. Nature 449, 731–734 (2007).
4. Hong, S. et al. Identification of JmjC domain-containing UTX and JMJD3 as histone H3 lysine 27 demethylases. Proc. Natl. Acad. Sci. USA 104, 18439–18444 (2007).
5. Lee, M.G. et al. Demethylation of H3K27 regulates polycomb recruitment and H2A ubiquitination. Science 318, 447–450 (2007).
6. Lan, F. et al. A histone H3 lysine 27 demethylase regulates animal posterior development. Nature 449, 689–694 (2007).
7. De Santa, F. et al. The histone H3 lysine-27 demethylase Jmjd3 links inflammation to inhibition of polycomb-mediated gene silencing. Cell 130, 1083–1094 (2007).
8. Kleer, C.G. et al. EZH2 is a marker of aggressive breast cancer and promotes neoplastic transformation of breast epithelial cells. Proc. Natl. Acad. Sci. USA 100, 11606–11611 (2003).
9. Varambally, S. et al. The polycomb group protein EZH2 is involved in progression of prostate cancer. Nature 419, 624–629 (2002).
10. Kirmizis, A. et al. Silencing of human polycomb target genes is associated with methylation of histone H3 Lys 27. Genes Dev. 18, 1592–1605 (2004).
11. Bracken, A.P. et al. EZH2 is downstream of the pRB-E2F pathway, essential for proliferation and amplified in cancer. EMBO J. 22, 5323–5335 (2003).
12. Simon, J.A. & Lange, C.A. Roles of the EZH2 histone methyltransferase in cancer epigenetics. Mutat. Res. 647, 21–29 (2008).
13. Velichutina, I. et al. EZH2-mediated epigenetic silencing in germinal center B cells contributes to proliferation and lymphomagenesis. Blood 116, 5247–5255 (2010).
14. van Haaften, G. et al. Somatic mutations of the histone H3K27 demethylase gene UTX in human cancer. Nat. Genet. 41, 521–523 (2009).
15. Wang, S., Robertson, G.P. & Zhu, J. A novel human homologue of Drosophila polycomblike gene is up-regulated in multiple cancers. Gene 343, 69–78 (2004).
16. Morin, R.D. et al. Somatic mutations altering EZH2 (Tyr641) in follicular and diffuse large B-cell lymphomas of germinal-center origin. Nat. Genet. 42, 181–185 (2010).
17. Lohr, J.G. et al. Discovery and prioritization of somatic mutations in diffuse large B-cell lymphoma (DLBCL) by whole-exome sequencing. Proc. Natl. Acad. Sci. USA 109, 3879–3884 (2012).
18. Morin, R.D. et al. Frequent mutation of histone-modifying genes in non-Hodgkin lymphoma. Nature 476, 298–303 (2011).
19. Sneeringer, C.J. et al. Coordinated activities of wild-type plus mutant EZH2 drive tumor-associated hypertrimethylation of lysine 27 on histone H3 (H3K27) in human B-cell lymphomas. Proc. Natl. Acad. Sci. USA 107, 20980–20985 (2010).
20. Yap, D.B. et al. Somatic mutations at EZH2 Y641 act dominantly through a mechanism of selectively altered PRC2 catalytic activity, to increase H3K27 trimethylation. Blood 117, 2451–2459 (2011).
21. McCabe, M.T. et al. Mutation of A677 in histone methyltransferase EZH2 in human B-cell lymphoma promotes hypertrimethylation of histone H3 on lysine 27 (H3K27). Proc. Natl. Acad. Sci. USA 109, 2989–2994 (2012).
22. Miranda, T.B. et al. DZNep is a global histone methylation inhibitor that reactivates developmental genes not silenced by DNA methylation. Mol. Cancer Ther. 8, 1579–1588 (2009).
23. Tan, J. et al. Pharmacologic disruption of Polycomb-repressive complex 2-mediated gene repression selectively induces apoptosis in cancer cells. Genes Dev. 21, 1050–1063 (2007).
24. Copeland, R.A. Evaluation of enzyme inhibitors in drug discovery. A guide for medicinal chemists and pharmacologists (John Wiley & Sons, 2005).
25. Duquenne, C. et al. Indazoles. International patent application PCT WO2011140325 (2011).
26. Burgess, J. et al. Azaindazoles. International patent application PCT WO2012005805 (2012).
27. Xu, C. et al. Binding of different histone marks differentially regulates the activity and specificity of polycomb repressive complex 2 (PRC2). Proc. Natl. Acad. Sci. USA 107, 19266–19271 (2010).
28. Han, Z. et al. Structural basis of EZH2 recognition by EED. Structure 15, 1306–1315 (2007).
29. Margueron, R. et al. Role of the polycomb protein EED in the propagation of repressive histone marks. Nature 461, 762–767 (2009).
30. Yonetani, T. & Theorell, H. Studies on liver alcohol dehydrogenase complexes. 3. Multiple inhibition kinetics in the presence of two competitive inhibitors. Arch. Biochem. Biophys. 106, 243–251 (1964).
31. Richon, V.M. et al. Chemogenetic analysis of human protein methyltransferases. Chem. Biol. Drug Des. 78, 199–210 (2011).
32. Chapman, P.B. et al. Improved survival with vemurafenib in melanoma with BRAF V600E mutation. N. Engl. J. Med. 364, 2507–2516 (2011).
33. Wigle, T.J. et al. The Y641C mutation of EZH2 alters substrate specificity for histone H3 lysine 27 methylation states. FEBS Lett. 585, 3011–3014 (2011).
34. Daigle, S.R. et al. Selective killing of mixed lineage leukemia cells by a potent small-molecule DOT1L inhibitor. Cancer Cell 20, 53–65 (2011).
35. Ben-Porath, I. et al. An embryonic stem cell-like gene expression signature in poorly differentiated aggressive human tumors. Nat. Genet. 40, 499–507 (2008).
36. Dornan, D. et al. Therapeutic potential of an anti-CD79b antibody-drug conjugate, anti-CD79b-vc-MMAE, for the treatment of non-Hodgkin lymphoma. Blood 114, 2721–2729 (2009).
37. Renan, M.J. How many mutations are required for tumorigenesis? Implications from human cancer data. Mol. Carcinog. 7, 139–146 (1993).
38. Kaelin, W.G. Jr. Choosing anticancer drug targets in the postgenomic era. J. Clin. Invest. 104, 1503–1506 (1999).
39. Li, R. & Stafford, J.A. Kinase Inhibitor Drugs (John Wiley & Sons, Inc., 2009).
40. Tsai, J. et al. Discovery of a selective inhibitor of oncogenic B-Raf kinase with potent antimelanoma activity. Proc. Natl. Acad. Sci. USA 105, 3041–3046 (2008).
41. Kwak, E.L. et al. Anaplastic lymphoma kinase inhibition in non-small-cell lung cancer. N. Engl. J. Med. 363, 1693–1703 (2010).
42. Copeland, R.A. Protein methyltransferase inhibitors as personalized cancer therapeutics. Drug Discov. Today Ther. Strateg. published online, doi:10.1016/j.ddstr.2011.08.001 (16 September 2011).
43. Copeland, R.A., Solomon, M.E. & Richon, V.M. Protein methyltransferases as a target class for drug discovery. Nat. Rev. Drug Discov. 8, 724–732 (2009).
44. Vedadi, M. et al. A chemical probe selectively inhibits G9a and GLP methyltransferase activity in cells. Nat. Chem. Biol. 7, 566–574 (2011).
45. Ernst, T. et al. Inactivating mutations of the histone methyltransferase gene EZH2 in myeloid disorders. Nat. Genet. 42, 722–726 (2010).
46. Nikoloski, G. et al. Somatic mutations of the histone methyltransferase gene EZH2 in myelodysplastic syndromes. Nat. Genet. 42, 665–667 (2010).
47. Ntziachristos, P. et al. Genetic inactivation of the polycomb repressive complex 2 in T cell acute lymphoblastic leukemia. Nat. Med. 18, 298–301 (2012).

Acknowledgments

We thank D. Johnston and A. Basavapathruni for performing DOT1L and WHSC1 enzyme selectivity assays, K. Kuplast for help with the LCC calculations, A. Santospago for preparation of assay plates and R. Gould for helpful discussions.

Author contributions

L.J. made the enzymes. K.W.K. and E.J.O. designed compounds including EPZ005687. T.J.W., C.R.M. and C.J.S. performed the enzyme inhibition assays, and T.J.W. performed substrate competitions, Yonetani-Theorell analysis and the in vitro EZH2 pull-down assay. S.K.K., N.M.W., C.J.A., C.R.K., J.S. and J.D.S. performed the intracellular inhibition of H3K27 methylation ELISA. S.K.K. and N.M.W. performed the western blotting of all methyl marks and proliferation assays. S.K.K., N.M.W. and J.J.S. performed gene expression and cell cycle experiments. S.K.K., T.J.W., K.W.K., A.R., J.J.S., M.P.S., R.M.P., R.C., M.P.M., V.M.R., R.A.C. and H.K. designed studies and interpreted results. S.K.K., T.J.W., K.W.K. and R.A.C. wrote the paper.

Competing financial interests

The authors declare competing financial interests: details accompany the online version of the paper.

Additional information

Supplementary information, chemical compound information and chemical probe information is available in the online version of the paper. Reprints and permissions information is available online at http://www.nature.com/reprints/index.html. Correspondence and requests for materials should be addressed to K.W.K.


Finding the right antibody for the right application just got easier!

Antibodypedia is a free online resource that helps you to compare and select antibodies. Independent, with data curated with the assistance of an international advisory board, Antibodypedia lets you: search for antibodies that have proved themselves effective for specific applications; discover research employing particular antibodies; and publish antibody validation data from your own experiments.

Find the right antibody for the right application by visiting: www.antibodypedia.com

Personalized Therapeutics The Power of Epigenetics


Epizyme is creating personalized therapeutics for patients with genetically defined cancers based on breakthrough discoveries in the field of epigenetics.

MAPPING THE HMTome The Power of Personalized Therapeutics

[Figure: map of the HMTome, with callouts for EZH2 and DOT1L.]

EZH2: In pre-clinical development for patients with genetically defined lymphomas and solid tumors

DOT1L: In Phase I development for patients with MLL-r, a genetically defined type of acute leukemia

Epizyme mapped the HMTome, a therapeutically important class of enzymes known as histone methyltransferases (HMTs) that are proven drivers of diseases such as cancer. The HMTome includes two major families: lysine methyltransferases (KMTs) and arginine methyltransferases (RMTs). Epizyme is creating small molecule HMT inhibitors as personalized therapeutics for the treatment of patients with genetically defined cancers.

To learn more about moving science forward in genetically defined cancers, visit

www.epizyme.com

www.nature.com/reprintcollections/epigenetics
