Molecular classification of breast cancer: What the pathologist needs to know

Breast cancer (BC) is a heterogeneous disease featuring distinct histological, molecular and clinical phenotypes. Although traditional classification systems utilising clinicopathological and a few molecular markers are well established and validated, they remain insufficient to reflect the diverse biological and clinical heterogeneity of BC. Advancements in high-throughput molecular techniques and bioinformatics have contributed to an improved understanding of BC biology, refinement of molecular taxonomies and the development of novel prognostic and predictive molecular assays. Application of such technologies is already underway and is expected to change the way we manage BC. Despite the enormous amount of work that has been carried out to develop and refine BC molecular prognostic and predictive assays, molecular testing is still evolving. Pathologists should be aware of the new technology and be ready for the challenge. In this review, we provide an update on the application of molecular techniques with regard to BC diagnosis, prognosis and outcome prediction. The current contribution of the emerging technology to our understanding of BC is also highlighted.


INTRODUCTION
Historically, breast cancer (BC) was classified based on clinicopathological features, mainly tumour stage and grade. Other morphological features such as histological type, proliferation status and lymphovascular invasion are also recognised as important prognostic variables that reflect tumour biology (1,2). Over time, knowledge about BC biology has significantly increased and has led to the understanding that BC represents a heterogeneous group of tumours and that tumour behaviour and response to therapy are determined by the underlying biological features. The expression of oestrogen receptor (ER), progesterone receptor (PgR) and human epidermal growth factor receptor 2 (HER2), which were originally identified as predictive of response to systemic therapy, is now recognised as the main determinant of BC biology and can be used to refine BC molecular and prognostic taxonomy. More recently, molecular data arising from a variety of high-throughput techniques have been used to refine BC stratification and to develop prognostic and predictive classifications with the aim of individualised therapy.
Although molecular taxonomy of BC based on gene expression profiling, proteomics, DNA copy number alteration and chromosomal changes, mutation status, methylation and microRNAs has been expanding for many years and has increased our knowledge of BC biology, its clinical application remains limited. The introduction of next generation sequencing (NGS) or massively parallel sequencing (3) appears to have opened new avenues for decoding BC molecular complexity, refining molecular classification and identifying new therapeutic targets. These molecular techniques hold promise for improving diagnosis and the prediction of outcome and behaviour, and for aiding the selection of therapies for individual patients (4). However, their clinical utility is still under investigation (5).
Pathologists are currently using conventional and novel molecular techniques in routine practice to help diagnose morphologically challenging entities, to assess the expression of hormone receptors and HER2 status in every BC and to help oncologists to refine the

Using molecular biomarkers in the diagnosis of breast lesions
In addition to prognosis and treatment response prediction, molecular biomarkers are frequently used in the diagnosis of challenging breast lesions: to differentiate between benign and malignant entities and between in situ and invasive tumours, to subtype certain lesions and to determine the tissue of origin of less differentiated malignant tumours. The technique most frequently used for this purpose is immunohistochemistry (IHC), often with a panel of biomarkers (6,7). IHC plays a useful role in diagnosing spindle cell lesions, identifying myoepithelial cells, differentiating between ductal and lobular phenotypes and between hyperplastic epithelial proliferation and neoplastic clonal epithelial proliferation, and in classifying papillary lesions. Cytokeratins can be used to detect small nodal metastases or subtle invasive carcinomas such as invasive lobular carcinomas. IHC is also helpful in recognising metastases to the breast and mammary carcinomas metastasising to extramammary tissues. Different antibodies are useful for different tumours: PAX8 and WT1 for ovarian carcinoma; TTF1 for thyroid and pulmonary adenocarcinoma; melan-A, HMB45 and S100 for melanoma; and lymphoid markers for lymphoma. Specific genetic translocations are also helpful for the diagnosis of certain breast lesions (see below) and for the exclusion of specific soft tissue tumours when identified on a biopsy as a component of other mammary-specific lesions; for instance, the pure stromal component of a malignant phyllodes tumour must be differentiated from other soft tissue sarcomas that may have different management strategies (8).

Companion diagnostics in breast cancer
The ability to predict an individual's response to a specific therapy is the main aim of modern precision medicine. A molecular diagnostic tool in the field of cancer therapy was first used in the 1970s to predict the response of BC to the selective oestrogen receptor modulator tamoxifen, based on the expression of ER (9). Currently, several targeted cancer therapies are utilised in standard oncological care and this field is expanding. As a result, the concept of "companion diagnostics" has emerged; a companion diagnostic can be defined as a diagnostic test used as a companion to a therapeutic drug to determine its applicability to a specific patient. Currently, the US Food and Drug Administration (FDA)-approved companion diagnostics utilised in BC test for the presence of HER2 protein overexpression or gene amplification. Despite not being considered companion diagnostics by the FDA, ER and PgR testing is mandatory for effective hormone therapy decision making and can be considered a companion diagnostic in BC. Although prognostic multigene assays are not companion diagnostics per se, as they are not linked to a particular drug, they can result in changes in clinical decisions and treatment course based on their outcome predictions (Table 1).

HORMONE RECEPTOR TESTING:
Hormone receptor status is determined by the tumour cells' expression of nuclear receptors for oestrogen (ER) and progesterone (PgR).
Biochemical ligand-binding assays were initially used to detect ER and PgR, but they required fresh tissue and were technically challenging, and IHC assays have therefore become routine. Different scoring methods are in use for determining the level of expression, but the most widely used systems are the Allred score and the histochemical score (H-score), both of which combine the proportion and the intensity of staining into an overall score. However, the currently agreed cut-off of positivity of ER and PgR for management purposes relies on proportion scoring alone and is 1% (10). Patients with BC showing any nuclear expression of hormone receptor in invasive tumour cells above the cut-off are likely to respond to hormone therapy and are therefore potential candidates for this therapy. However, for diagnostic purposes, i.e. determination of the mammary origin of a metastatic carcinoma, a more stringent definition of positivity is often used at the pathologist's discretion. Although current guidelines indicate that IHC is used for determination of hormone receptor status in BC (10), ER and PgR are component genes of some multigene assays, including Oncotype DX. Information regarding hormone receptor status obtained from these assays can be used as an additional quality measure for the assessment methods. Discrepancy between results should trigger a reflex test.
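The two scoring systems and the 1% cut-off mentioned above can be illustrated with a minimal Python sketch. The bin edges of the Allred proportion score and the 1/2/3 intensity weights of the H-score follow the commonly quoted definitions and are not given in this review; the function names are hypothetical, and treating exactly 1% as positive is an assumption.

```python
def h_score(pct_weak: float, pct_moderate: float, pct_strong: float) -> float:
    """H-score: percentage of cells at each intensity, weighted 1/2/3 (range 0-300)."""
    parts = (pct_weak, pct_moderate, pct_strong)
    if min(parts) < 0 or sum(parts) > 100:
        raise ValueError("percentages must be non-negative and sum to at most 100")
    return 1 * pct_weak + 2 * pct_moderate + 3 * pct_strong


def allred_proportion_score(pct_positive: float) -> int:
    """Proportion score 0-5, using the commonly quoted bin edges (assumption)."""
    if pct_positive <= 0:
        return 0
    if pct_positive < 1:
        return 1
    if pct_positive <= 10:
        return 2
    if pct_positive <= 33:
        return 3
    if pct_positive <= 66:
        return 4
    return 5


def allred_score(pct_positive: float, intensity: int) -> int:
    """Allred score: proportion score (0-5) plus average intensity score (0-3)."""
    if intensity not in (0, 1, 2, 3):
        raise ValueError("intensity must be 0, 1, 2 or 3")
    proportion = allred_proportion_score(pct_positive)
    # With no stained cells there is no intensity to add.
    return 0 if proportion == 0 else proportion + intensity


def is_receptor_positive(pct_positive: float, cutoff_pct: float = 1.0) -> bool:
    """Management cut-off based on proportion only (1% in the text)."""
    return pct_positive >= cutoff_pct


if __name__ == "__main__":
    print(h_score(10, 20, 30))        # weighted sum of the three percentages
    print(allred_score(50, 2))        # proportion score plus intensity score
    print(is_receptor_positive(0.5))  # below the 1% cut-off
```

The sketch makes the difference between the systems explicit: the Allred score adds two ordinal scores, whereas the H-score is a weighted sum of percentages, and the management decision uses neither but only the proportion of positive cells.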
HER2 TESTING:
HER2 is overexpressed in 12% to 20% of BC, most often because of HER2 gene amplification. Because of its predictive value, guideline recommendations for its assessment (11) and their updated versions (12,13) have been published to provide guidance on HER2 testing in BC.
Key aspects of these guidelines include a recommendation that all BC be tested for HER2 using IHC, followed by in situ hybridisation (ISH) in borderline positive IHC cases, using validated tests. It should be recognised that both IHC and ISH represent an attempt to convert a continuous biological variable into a dichotomous category; borderline or equivocal cases therefore exist, and a reflex test is recommended to reduce the proportion of these cases. The use of the updated definition of HER2 positivity has reduced the proportion of these borderline cases (12,13).
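The IHC-first, reflex-ISH workflow described above can be sketched as a small decision function. This is an illustrative sketch only: the 0/1+/2+/3+ IHC categories and their mapping to negative, equivocal and positive follow the commonly used convention and are not spelled out in this review, the ISH result is reduced to a single hypothetical boolean, and the function name is an assumption.

```python
from typing import Optional


def her2_status(ihc_score: int, ish_amplified: Optional[bool] = None) -> str:
    """Combine an IHC score (0-3) with an optional reflex ISH result.

    ish_amplified is None when ISH has not been performed.
    """
    if ihc_score not in (0, 1, 2, 3):
        raise ValueError("IHC score must be 0, 1, 2 or 3")
    if ihc_score == 3:
        return "positive"
    if ihc_score in (0, 1):
        return "negative"
    # IHC 2+ is the borderline category: the result depends on the reflex test.
    if ish_amplified is None:
        return "equivocal: reflex ISH required"
    return "positive" if ish_amplified else "negative"


if __name__ == "__main__":
    for ihc, ish in [(3, None), (1, None), (2, None), (2, True), (2, False)]:
        print(ihc, ish, "->", her2_status(ihc, ish))
```

The point of the sketch is the third branch: a continuous variable forced into categories always leaves a borderline group, and the reflex test exists to resolve exactly that group.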

Ki67 PROLIFERATION INDEX:
The Ki67 proliferation index has been investigated as a BC prognostic and predictive factor in various settings (14). Ki67 is assessed in routine practice using IHC; however, its analytic validity remains a matter of debate, and the lack of formal inter- and intra-laboratory standardisation hampers its use in routine practice for management decisions (15). Ki67 can be used in routine practice to (i) determine the proliferation status in poorly fixed specimens, or (ii) stratify grade 2 tumours into two prognostically distinct classes (16), akin to the molecular grade index (17). Ki67 is also used as a component of some prognostic tools (18) (20).
Importantly, some special type mammary carcinomas show specific translocations which characterise these tumours and can be used as a diagnostic adjunct. Secretory carcinoma of the breast is characterised by a balanced translocation of genetic material between chromosomes 12 and 15 (t(12;15)), creating a new gene in which the 5' region of ETV6 is fused to the 3' region of NTRK3, producing the ETV6-NTRK3 fusion gene (57).
Mucoepidermoid carcinoma, which is a rare type of metaplastic BC, is characterised by a translocation between chromosomes 11 and 19 (t(11;19)(q21;p13)), creating a novel fusion product between mucoepidermoid carcinoma translocated 1 (MECT1) and a member of the Mastermind-like gene family (MAML2): the MECT1-MAML2 fusion gene (58). Adenoid cystic carcinomas, as well as cylindromas, show a specific translocation, t(6;9)(q22-23;p23-24), creating the MYB-NFIB fusion gene (59). In a study of breast adenoid cystic carcinoma mixed with high-grade triple-negative BC components, the MYB-NFIB fusion gene was detected in both tumour subtypes, and it was postulated that the progression from adenoid cystic carcinoma to high-grade triple-negative BC of no special type may involve the selection of neoplastic clones and/or the acquisition of additional genetic alterations, with enrichment of mutations affecting certain genes such as FGFR1 (21).
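As a rough illustration of how a Ki67 index is derived and then used to split tumours into two classes, the sketch below computes the percentage of positively stained nuclei and applies a caller-supplied cut-off. No cut-off value is given in this review, so the threshold is a required parameter rather than a default; the function names and the example numbers are hypothetical.

```python
def ki67_index(positive_nuclei: int, total_nuclei: int) -> float:
    """Percentage of counted invasive tumour nuclei that stain for Ki67."""
    if total_nuclei <= 0:
        raise ValueError("total_nuclei must be positive")
    if not 0 <= positive_nuclei <= total_nuclei:
        raise ValueError("positive_nuclei must lie between 0 and total_nuclei")
    return 100.0 * positive_nuclei / total_nuclei


def stratify_by_ki67(index_pct: float, cutoff_pct: float) -> str:
    """Dichotomise a Ki67 index; the cut-off is laboratory-specific (assumption)."""
    return "high proliferation" if index_pct >= cutoff_pct else "low proliferation"


if __name__ == "__main__":
    index = ki67_index(positive_nuclei=50, total_nuclei=200)
    # The cut-off below is an arbitrary placeholder, not a recommended value.
    print(index, stratify_by_ki67(index, cutoff_pct=10.0))
```

Because the class boundary is an explicit argument, the sketch also shows why the standardisation problem matters: the same tumour changes class when a laboratory counts differently or chooses a different cut-off.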

Molecular classification of breast cancer
BC has been classified based on the expression of biomarkers using a variety of techniques, concepts and applications. Based on the expression of individual biomarkers, BC can be classified into ER positive and ER negative, HER2 positive and HER2 negative.
Although this appears to be a simplified molecular classification system, it remains the most important and informative molecular BC taxonomy to date for clinical management in routine practice (15). These two markers, with or without the addition of other biomarkers, namely PgR and Ki67, can be used in combination to provide further important prognostic information (22,23). For instance, the response of ER positive HER2 negative tumours to hormone therapy is different to that of ER positive HER2 positive tumours. Despite the predictive and prognostic value of hormone receptors and HER2, complex molecular classifications based on multiple markers utilising high-throughput techniques have attracted attention as a novel method for molecular taxonomy. Molecular classification of BC was initially investigated using loss of heterozygosity (LOH) analysis, karyotyping and comparative genomic hybridisation (CGH), which identified key genomic alterations including losses, gains and amplifications of genomic DNA (24)(25)(26)(27). This provided the early framework for a molecular classification system that stratified BC into distinct classes. Global gene expression profiling (GEP) studies of BC using unsupervised clustering techniques have provided a more established molecular classification system and identified distinct clusters or intrinsic subtypes based on the quantitative expression of several genes (transcriptome profiles) (28,29). Subsequent class discovery studies have also reported an association between molecular intrinsic subtypes and patient outcome, and have shown that these classes are associated with distinct biological pathways, making them potential candidates for targeted therapy.
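The single-marker taxonomy described above, in which ER and HER2 status are combined, amounts to a two-by-two grouping. A minimal sketch of that grouping and of a tally across a cohort follows; the function name, the group labels and the example cohort are hypothetical and serve only as illustration.

```python
from collections import Counter
from typing import Iterable, Tuple


def receptor_group(er_positive: bool, her2_positive: bool) -> str:
    """Combine ER and HER2 status into one of four groups."""
    er = "ER positive" if er_positive else "ER negative"
    her2 = "HER2 positive" if her2_positive else "HER2 negative"
    return f"{er}, {her2}"


def tally_groups(cohort: Iterable[Tuple[bool, bool]]) -> Counter:
    """Count how many tumours in a cohort fall into each ER/HER2 group."""
    return Counter(receptor_group(er, her2) for er, her2 in cohort)


if __name__ == "__main__":
    # Hypothetical cohort of (ER status, HER2 status) pairs.
    cohort = [(True, False), (True, False), (True, True), (False, True), (False, False)]
    for group, count in sorted(tally_groups(cohort).items()):
        print(f"{group}: {count}")
```

Adding PgR or Ki67 as further inputs would subdivide these four groups, which is how the combined-marker approaches mentioned in the text extend the basic scheme.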
In the pioneer GEP study by Perou and colleagues in 2000 (28) ... To overcome the problems of the need for fresh tissue, the availability of microarray-based technology, cost and assay reproducibility, other techniques such as RT-PCR and IHC coupled with tissue microarrays using a smaller set of genes have been introduced to replicate this molecular taxonomy and to identify intrinsic subtypes in routine practice. Two main approaches have been identified. The first approach was based on identifying, from microarray-based studies, the minimum set of genes that can reliably identify the GEP-defined classes. One successful example is the PAM50 ... was based on classifying BC into seven distinct molecular classes using the 10 biomarkers, followed by incorporation of clinicopathological variables to identify distinct prognostic groups within each of the classes (33). Using the NPI+ formulae, which incorporate molecular features and clinicopathological parameters, a stratification of patient outcome superior to that of the traditional NPI was achieved (33).
Although the identification of the intrinsic subtype-based molecular classification of BC has attracted attention, improved our understanding of BC biology and increased hope for the refinement of BC therapy prediction, its application in routine practice has been less successful. Targeted therapy of BC still relies on ER and HER2 regardless of the molecular class of the tumour; for instance, HER2 positive BC patients are candidates for HER2 targeted therapy regardless of whether the intrinsic class is HER2-enriched or luminal.
Despite the limited clinical applicability, GEP has opened new avenues for the refinement of BC molecular prognostication, as it has led to the introduction of molecular multigene assays that aim to identify subgroups of BC associated with outcome or with a specific response to therapy (41). This approach is based on the identification of a set of genes (gene signature) that can be used collectively to identify tumours with specific biological or clinical features. The term "genomic signature" refers to the expression of a set of genes in a biological sample measured using microarray technology, while "metagene" refers to a single aggregate measure of the expression of a group of genes that usually show coordinated expression in a set of samples, defined by a mathematical combination of the genes of interest. Most of these multigene assays have been used in BC to stratify clinically relevant groups into low and high risk subgroups to guide further treatment.
Although these molecular classification systems have provided fascinating new insights into BC biology, and may complement conventional variables with greater prognostic and predictive power, we still have a long way to go in terms of delivering truly personalised medicine, and further work is needed.
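The notion of a metagene as a "mathematical combination" of co-expressed genes can be made concrete with a short sketch. The review does not specify the combination; the mean of per-gene z-scores used here is one common choice and is an assumption, as are the gene names, the expression values and the zero threshold used to split samples into low and high risk.

```python
from statistics import mean, pstdev
from typing import Dict, List, Sequence


def metagene_scores(expression: Dict[str, List[float]], genes: Sequence[str]) -> List[float]:
    """Per-sample metagene: mean of the z-scored expression of the chosen genes.

    expression maps each gene name to its values across the same ordered samples.
    """
    n_samples = len(expression[genes[0]])
    z_rows = []
    for gene in genes:
        values = expression[gene]
        if len(values) != n_samples:
            raise ValueError(f"{gene} has a different number of samples")
        mu, sd = mean(values), pstdev(values)
        # A gene with no variation carries no information; score it as 0.
        z_rows.append([(v - mu) / sd if sd > 0 else 0.0 for v in values])
    # Average across genes for each sample (column-wise mean).
    return [mean(column) for column in zip(*z_rows)]


def risk_group(score: float, threshold: float = 0.0) -> str:
    """Dichotomise a metagene score; the threshold is a hypothetical placeholder."""
    return "high risk" if score > threshold else "low risk"


if __name__ == "__main__":
    # Hypothetical expression values for three samples.
    expression = {"GENE_A": [1.0, 2.0, 3.0], "GENE_B": [2.0, 4.0, 6.0]}
    scores = metagene_scores(expression, ["GENE_A", "GENE_B"])
    print([risk_group(s) for s in scores])
```

Standardising each gene before averaging prevents a single highly expressed gene from dominating the aggregate, which is why z-scores are used here rather than raw values. Commercial assays use their own validated algorithms, which this sketch does not attempt to reproduce.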
The first multigene prognostic assay was developed by van't Veer et al. (42), who used a class prediction approach utilising a 70-gene set associated with the likelihood of metastasis within 5 years. This 70-gene signature was validated in a subsequent study (42) and was later commercially marketed as the MammaPrint assay (Agilent, Amsterdam, the Netherlands ... immune-related and proliferative classes (51). These subtypes showed many significant genomic features at the mRNA and protein/phosphoprotein level, with 1,277 genes differentially expressed between ILC subtypes. However, no difference between these ILC subtypes was identified in terms of somatic mutations or DNA copy-number alterations. As expected, the proliferative subtype was associated with the worst outcome whilst the reactive-like subtype was associated with the best outcome (51). At the DNA level, ILC cases were significantly enriched for CDH1 mutations and mutations affecting TBX3 and FOXA1.
GATA3 mutations appeared to be the second most discriminant event between ILC and ductal NST carcinoma after CDH1 mutations. In addition, homozygous losses of the PTEN locus (10q23) and PTEN mutations were more frequent in ILC (51).
Analysis of the pure mucinous BC subtype indicated that these tumours show a relatively low level of genetic instability and tend to cluster together homogeneously and preferentially, separately from ductal NST carcinomas. They less frequently harbour gains of 1q and 16p and losses of 16q and 22q than grade- and ER-matched ductal NST carcinomas, and no pure mucinous carcinoma displayed concurrent 1q gain and 16q loss, a hallmark genetic feature of low-grade ductal NST carcinoma (52). Pure invasive micropapillary carcinoma, which has a characteristic morphological appearance with a so-called inside-out growth pattern, shows specific copy number aberrations (53), high cyclin D1 expression, high proliferation rates and MYC (8q24) amplification (54) compared to ER-matched and grade-matched ductal NST carcinomas.
Special subtypes that belong to the basal-like subgroup include carcinomas with medullary features, as well as metaplastic carcinomas and salivary-gland-like tumours such as adenoid cystic carcinoma. Adenoid cystic carcinoma forms an interesting paradox, as it sits within the basal-like group, which is generally regarded as having a poor prognosis, yet its clinical behaviour is generally indolent. This underscores the astonishing heterogeneity that can occur even within individual intrinsic subtypes. In a previous study of acinic cell carcinomas (ACCs) of the breast using massively parallel sequencing (55), our group found that the most frequently mutated gene was TP53, with complex patterns of gains and losses similar to those of common forms of triple negative BC. Additional somatic mutations affecting breast cancer-related genes found in ACCs included PIK3CA, mTOR, CTNNB1, BRCA1, ERBB4, ERBB3, INPP4B and FGFR2. Using an NGS approach, our group also demonstrated that microglandular adenosis/atypical microglandular adenosis, particularly when associated with triple negative BC, harboured at least one somatic non-synonymous mutation, with identical TP53 mutations and similar patterns of gene CNAs in microglandular adenosis and in the associated triple negative BC. Clonal shifts in the progression from microglandular adenosis to atypical microglandular adenosis and/or to triple negative BC were also observed. On the other hand, pure microglandular adenosis lacked clonal non-synonymous somatic mutations and displayed limited copy number alterations (56). Importantly, these findings, in conjunction with others, underscore the significance of microglandular adenosis in clinical diagnosis. In another study of infiltrating epitheliosis using the same techniques (57), we demonstrated a high prevalence of somatic mutations affecting PI3K pathway genes, suggesting that these lesions may be neoplastic rather than hyperplastic.
The landscape of somatic genetic alterations found in infiltrating epitheliosis is similar to that of radial scars/complex sclerosing lesions, suggesting that they may represent one end of this spectrum of lesions.
There is also strong evidence that the considerable molecular heterogeneity of BC is already present at the pre-invasive level, with genomic, transcriptomic and phenotypic similarities found between ductal carcinoma in situ (DCIS) and coexisting invasive carcinoma (58). Similar to invasive BC, frequent genetic and genomic events have been reported in DCIS, and several studies have provided detailed descriptions of DCIS genomic, transcriptomic and proteomic profiles. However, to date there are relatively few molecular data that can be used to predict the risk of progression to invasive tumour or the risk of recurrence.

Next generation sequencing (NGS)
The introduction of NGS or massively parallel sequencing (MPS) has revolutionised BC genetics and genomics and is expected to assist in the personalised treatment of BC patients. Common approaches to NGS include whole-genome sequencing (sequences the complete genome of a sample), whole-exome sequencing, targeted exome sequencing ... (36). These drugs target specific molecular abnormalities, including mutated protein kinases and amplified or rearranged genes. BCs that carry any of these abnormalities, particularly if they harbour the sensitising genomic abnormality, are expected to respond to the corresponding targeted therapies. For example, HER2 gene-amplified BCs benefit from HER2-targeted therapies.
Future perspectives: As a natural extension of the increasing application of high-throughput sequencing technology, the list of cancer driver genes is growing, and a considerable number of these are potentially targetable. This may also help to understand the mechanisms underlying treatment failure. Furthermore, the identification of targets holds great potential for monitoring clonal evolution in response to treatment and, hence, the early detection of treatment failure. Application of such technologies is already underway and is expected to result in further refinement of BC prognostication and prediction of response to specific therapies.
In conclusion: Molecular testing has become increasingly important in the prevention, diagnosis, and treatment of BC. Despite the enormous amount of work that has been carried out to develop and refine BC molecular classification, it is still in evolution. With the increasing use of more sophisticated high-throughput techniques such as NGS, large amounts of data will continue to emerge, which could potentially lead to identification of novel therapeutic targets and allow more precise classification systems that can predict outcome and response to therapy.