Further evidence to support bimodality of oestrogen receptor expression in breast cancer

Although oestrogen receptor (ER)‐negative breast cancers (BCs) do not respond to hormone therapy, the response of ER‐positive BCs is reported to be variable, which may suggest a dose‐dependent effect. The aim of this study was to assess the pattern of ER expression in BCs at the protein (immunohistochemistry) and transcriptome (microarray‐based gene expression) levels.


Introduction
Global gene expression studies of breast cancer (BC) have demonstrated that oestrogen receptor (ER) is the main determinant of BC molecular profiles, and that ER-positive and ER-negative BCs are different diseases. 1,2 Although several randomized clinical trials have demonstrated that ER-negative BCs do not respond to endocrine therapy, the effect of endocrine treatment being restricted to ER-positive BCs, the response of ER-positive BCs is variable. Only two-thirds of ER-positive BC patients treated with endocrine therapy respond. 3 Several studies have reported a correlation between the level of ER expression and the response to endocrine therapy, and clinical trials have indicated that only 50% of BCs with an Allred score of 4-6 respond to endocrine treatments, as compared with 75-80% of BCs with a score of 7 or 8. 3-6 Before the use of immunohistochemistry, ER expression was measured by the use of radiolabelled ligand-binding assays (LBAs) with freshly frozen tumour samples, which showed a continuum of values. For more than two decades, ER expression has been assessed with immunohistochemistry, and is used to predict the response to endocrine therapy. 4,7 In addition to ER expression being a prognostic marker, some authors reported that the linear distribution of ER expression reported with LBAs was also observed with immunohistochemistry. 4,5 However, other authors have challenged the concept of linear ER expression, and they have provided evidence that ER expression is essentially bimodal. 8,9 Identifying genes with bimodal expression patterns from large-scale gene expression profiling data has provided new insights into the distribution of the expression of key genes. At the transcriptomic level, there are two different classes of genes.
The first class is composed of those with a Gaussian or continuous distribution: two small groups of tumours have very high expression and very low expression respectively, with the rest, which represent the majority, falling somewhere in between. The second class is composed of genes with a bimodal distribution of expression. This class has a majority of tumours with either high levels of gene expression or no expression, and relatively few tumours fall in between. 10 Previous studies have reported significant correlations between ESR1 expression and clinical ER status. 11 ESR1 has a high bimodality index score, and it can be used to classify samples into two distinct expression states. 12 The aim of this study was to assess the pattern of ER expression, both at the protein (immunohistochemistry) and transcriptome (microarray-based gene expression) levels, in order to obtain a comprehensive understanding of the clinical and biological value of tumours expressing low/intermediate levels of ER. To achieve this, we reviewed three cohorts of primary BCs: (i) 3649 core biopsies performed at Nottingham City Hospital, with immunohistochemical (IHC) staining for ER; (ii) 1892 cases prepared as tissue microarrays (TMAs), with IHC staining for ER; and (iii) 1980 BC cases that were included in the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) study, with determination of ESR1 mRNA levels. 13
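The bimodality index mentioned above can be illustrated with a short calculation. This is a minimal sketch, assuming the commonly cited definition for a two-component normal mixture with a shared standard deviation: the standardized distance between the two component means, weighted by the mixing proportion. The function name and the parameter values are hypothetical and are not estimates from this study.

```python
import math


def bimodality_index(pi, mu1, mu2, sigma):
    """Bimodality index for a two-component normal mixture (assumed definition).

    pi    : proportion of samples in the first component (0 < pi < 1)
    mu1   : mean of the first component
    mu2   : mean of the second component
    sigma : standard deviation shared by both components
    """
    if not 0 < pi < 1:
        raise ValueError("pi must lie strictly between 0 and 1")
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    delta = abs(mu1 - mu2) / sigma            # standardized distance between modes
    return math.sqrt(pi * (1 - pi)) * delta   # weighted by how balanced the modes are


# Illustrative values only: two well-separated modes, 20% of samples in the lower one.
print(round(bimodality_index(0.2, 6.0, 10.0, 1.0), 2))
```

Under this definition, the index grows when the two modes are further apart relative to their spread, and shrinks when almost all samples fall into a single mode.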

Patient cohorts
This study was based on three patient cohorts, and was approved by Nottingham Research Ethics Committee 2 under the title 'Development of a molecular genetic classification of breast cancer'. ER was assessed by immunohistochemistry in the first and second cohorts, and by determination of mRNA levels in the third cohort. Regarding the level of ER expression, weakly positive cases were defined as those with ER expression in 1-10% of the tumour cells, and intermediately positive cases were defined as those with ER expression in 11-69% of the tumour cells. Negative cases were defined as those with ER expression in <1% of the tumour cells, according to the American Society of Clinical Oncology/College of American Pathologists guidelines, 14 and highly positive cases were defined as those with ER expression in ≥70% of the tumour cells.
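The four expression categories defined above can be written as a small classifier. This is a minimal sketch; the function name is hypothetical, and the handling of non-integer values between 10% and 11% (assigned here to the intermediate category) is an assumption, because the text defines the categories on whole percentages.

```python
def er_category(percent_positive):
    """Map the percentage of ER-positive tumour cells to an expression category.

    Cut-offs follow the text: <1% negative, 1-10% weakly positive,
    11-69% intermediately positive, >=70% highly positive.
    """
    if not 0 <= percent_positive <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if percent_positive < 1:
        return "negative"
    if percent_positive <= 10:
        return "weakly positive"
    if percent_positive < 70:
        # Assumption: fractional values between 10 and 11 fall here as well.
        return "intermediately positive"
    return "highly positive"


for value in (0, 5, 40, 70, 100):
    print(value, er_category(value))
```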
The first cohort comprised a consecutive series of symptomatic and screen-detected BC patients who had ER status assessed on a preoperative core needle biopsy (CNB) at Nottingham City Hospital in routine practice between March 2008 and November 2014 (3649 cases). CNBs were fixed, processed and stained according to a standardized protocol, as previously described. 15 The primary antibodies used were 1D5 (Dako, Cambridge, UK), diluted 1:100, and a prediluted SP1 clone (Roche, Welwyn Garden City, UK).
The second cohort comprised 1892 patients diagnosed between 1988 and 1998 whose BC tissues were prepared as TMAs as part of the previously published Nottingham Tenovus Primary Breast Carcinoma Series. 16 This is a consecutive well-characterized series of early-stage primary operable BC patients aged ≤70 years. In this cohort, patient outcome, including regional and distant events, survival, and time to the event, was recorded and annually updated. This included BC-specific survival. The clinical details of the patients, including age and menopausal status, and the tumour details, including tumour size, grade, stage, lymphovascular invasion, and lymph node status, were also available. TMA sections of this series were stained with the ER antibody SP1 clone (Dako), by use of a 1:150 dilution and 30 min of incubation. Levels of IHC expression were assessed, by visual inspection under a microscope, as the percentage of invasive tumour cells with positive staining for ER.
The third cohort comprised 1980 BC cases that were included in the METABRIC cohort. 13 In that study, the extracted and purified DNA probes were hybridized to Affymetrix SNP 6.0 arrays (Affymetrix, Santa Clara, CA, USA), with the quality control criteria established by AROS Applied Biotechnology (Aarhus, Denmark). The Illumina Total prep RNA amplification kit (Ambion, Warrington, UK) was used for extraction and purification of total RNA. The generated biotinylated cRNA was hybridized with Illumina Human HT-12 v3 Expression Beadchips from the same manufacturer. 17 The resulting gene expression data from the microarray experiments were statistically analysed with the Linear Models for Microarray Data (LIMMA) software package, which is compatible with the Affymetrix data. 18 The primary statistical output of the LIMMA software package used in the analysis comprised base 2 logarithms of fold changes between the normal and tested samples (log2 FC), average expression measurements, t-test values, P-values, adjusted P-values, and log-odds of differential expression (B-values). In this cohort, the survival data and the ER immunostaining data were available for 262 cases from Nottingham.

Immunohistochemistry
Surgical excision specimens of 55 cases from the first cohort that showed low/intermediate ER expression on CNBs were immunostained to confirm the level of ER expression. These 55 cases comprised 39 from the low ER expression group (ER score of 1-10%) and 16 from the intermediate ER expression group (ER score of 11-69%). The IHC technique was applied to full-face sections, which are considered to be the 'gold standard' for ER assessment with the standard protocol. In brief, antigen retrieval was carried out with citrate buffer (pH 6.0) for 20 min in a microwave oven. Manual IHC staining was performed with the Novolink Max Polymer Detection Kit (Ref. RE7280-K) (Leica Biosystems, Newcastle, UK). Peroxidase blocking was performed by applying a peroxidase block for 5 min. The optimized primary antibody, i.e. EP1 anti-ER rabbit monoclonal antibody (Dako; Ref. M3643), was applied and incubated for 30 min at room temperature, and this was followed by incubation with Novolink polymer for 30 min and enzyme substrate for 5 min. Novolink haematoxylin was then added to each slide for 6 min. Finally, the slides were dehydrated and mounted with coverslips by the use of DPX (Leica Microsystems, Newcastle upon Tyne, UK). Nuclear ER staining was then scored according to percentage and the H-score system by two pathologists (A.A.M. and E.A.R.).

Statistical analysis
For IHC expression, in the first and second cohorts, the statistical tests were performed with IBM SPSS version 22 (SPSS, Chicago, IL, USA). The data were categorized into three groups: the first group included cases with ER expression scored as <1%; the second group included cases with ER expression scored as 1-69%; and the third group included cases with ER expression scored as ≥70%, which is similar to the cut-off point used by Collins et al. 8 In the second cohort, survival curves were also analysed with the Kaplan-Meier method, with significance being determined with the log rank (LR) test. The associations of ER expression subcategories with the clinicopathological variables, and with progesterone receptor (PR) and HER2 as basic prognostic and predictive markers, were evaluated with the chi-square test. For all statistical tests, a P-value of <0.05 was considered to be significant.
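The Kaplan-Meier method referred to above can be sketched in a few lines. This is an illustrative, standard-library implementation of the product-limit estimator and is not the SPSS procedure used in the study; the function name and the toy follow-up data are hypothetical.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  : follow-up time for each patient
    events : 1 if the event (e.g. BC-specific death) occurred, 0 if censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)          # everyone leaving the risk set at time t
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            survival *= 1 - d / at_risk   # conditional survival past time t
            curve.append((t, survival))
        at_risk -= exits[t]               # remove events and censored cases
    return curve


# Toy data: four patients, the second one censored at time 2.
print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
```

Censored patients contribute to the number at risk until their last follow-up, but they do not cause a step in the curve.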
For transcriptomic data of the third cohort, 1980 BC cases were used to demonstrate the ESR1 gene expression distribution pattern. However, patient outcome and immunostaining of ER were available for only 262 cases. These cases were subdivided into three subgroups according to the changes in the distribution of the curve obtained with SPSS. The first cut-off point was 6.5 and the second cut-off point was 8.6. The high expression level group (group 3) included those cases with an ESR1 expression level of >8.6. Gene expression was compared with protein level and patient outcome. Survival curves were analysed with the Kaplan-Meier method, with significance being determined by the LR test, and a P-value of <0.05 considered to be significant.
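The assignment of cases to the three ESR1 mRNA groups can be sketched as follows. The cut-off values 6.5 and 8.6 come from the text; the function name is hypothetical, and the treatment of values exactly equal to a cut-off is an assumption, because the text only states that group 3 contains values of >8.6.

```python
LOW_CUTOFF = 6.5    # first cut-off point from the text
HIGH_CUTOFF = 8.6   # second cut-off point from the text


def esr1_group(expression):
    """Return the ESR1 mRNA group (1 = negative, 2 = intermediate, 3 = highly positive)."""
    if expression > HIGH_CUTOFF:
        return 3
    if expression < LOW_CUTOFF:
        return 1
    # Assumption: values equal to either cut-off are placed in the intermediate group.
    return 2


for value in (5.9, 6.5, 7.8, 8.6, 10.2):
    print(value, esr1_group(value))
```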

Results
Analysis of the 3649 cases assessed on CNBs showed that the majority of cases (92.2%) were either strongly positive (≥70%) or negative (<1%) for ER, whereas weakly positive (1-10%) cases (2.7%) and intermediately positive (11-69%) cases (5.1%) were infrequent. Figure 1 shows the bimodal distribution of ER expression, whereby those with <1% expression represented 22.4% of cases, and strongly positive cases represented 69.8% of cases. To further assess the existence and frequency of tumours with low/intermediate expression, 55 cases were immunostained by the use of full-face sections of excision specimens. Of those 55 cases, 26 [...]. Figure 2 shows cases with negative, low, intermediate and highly positive expression.
The frequency distribution of ER staining, based on the percentage of ER-positive tumour cells, in the TMA series of 1892 cases is shown in Figure 3. Similarly to those in CNBs, this series showed bimodality of expression, with the completely negative and strongly positive cases representing 89.2% of the cases. Weak and intermediate expression of ER was infrequent, representing 1.6% and 9.2% of the cases, respectively.
Owing to the small number of cases in the groups with weakly positive and intermediate expression, they were combined, and the data were analysed accordingly: negative group (<1%), intermediate group (1-69%), and highly positive group (≥70%). The associations of these subgroups and other clinicopathological variables, i.e. PR and HER2 status, are shown in Table 1. The association of the ER subgroup with HER2 was highly significant; 25% and 22% of the negative and intermediate cases were HER2-positive, as compared with 6% of the strongly positive cases. PR is an ER-dependent protein. 19 In this study, we assessed the frequency of expression of PR in the TMA series by using the previously published immunohistochemically stained PR, 16 and this showed a bimodal distribution similar to that of ER (Figure 4).
Outcome analysis showed significant associations between ER expression groups and patient outcome. As expected, the difference in patient outcome between the negative and highly positive groups was highly significant (P < 0.001, LR = 22.05). Consistent with the IHC results, analysis of the METABRIC cases (n = 1980) showed a bimodal distribution of ESR1 mRNA (Figure 6), with 18.2% of cases in group 1 (representing the negative group) and 72% in group 3 (representing the highly positive group), but only 9.8% in group 2 (representing the intermediate group). There was a significant positive correlation between ESR1 mRNA and protein expression levels (P < 0.001). The majority (79%) of ER immunohistochemically negative cases were in the negative ESR1 mRNA group. In addition, 97.1% of the strongly ER immunohistochemically positive cases were in the mRNA highly positive group. Interestingly, 77.8% of the immunohistochemically intermediate cases (1-69%) correlated with the negative and strongly positive mRNA groups (Table 2). Outcome analysis showed significant differences between the groups (Figure 7; P = 0.001, LR = 13.28). No significant difference between the negative group and the intermediate group was identified (P = 0.74, LR = 0.10). However, the difference between the highly positive group and the intermediate group regarding patient outcome was significant (P = 0.01, LR = 6.52).
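The reported log rank statistics can be related to their P-values with a short calculation. This is a sketch under an assumption that the source does not state explicitly: each LR value is treated as a chi-square statistic with degrees of freedom equal to the number of groups compared minus one. The function name is hypothetical, and only the closed-form cases for one and two degrees of freedom are handled.

```python
import math


def chi2_sf(stat, df):
    """Upper-tail probability of a chi-square statistic for df = 1 or df = 2."""
    if stat < 0:
        raise ValueError("statistic must be non-negative")
    if df == 1:
        return math.erfc(math.sqrt(stat / 2))
    if df == 2:
        return math.exp(-stat / 2)
    raise ValueError("only df = 1 or df = 2 are handled in this sketch")


# (comparison, reported LR statistic, assumed degrees of freedom)
reported = [
    ("IHC: negative vs highly positive", 22.05, 1),
    ("mRNA: all three groups", 13.28, 2),
    ("mRNA: negative vs intermediate", 0.10, 1),
    ("mRNA: highly positive vs intermediate", 6.52, 1),
]

for label, lr, df in reported:
    print(f"{label}: LR = {lr}, df = {df}, P ~ {chi2_sf(lr, df):.4f}")
```

Under these assumptions, the computed values are close to the P-values given in the text; small differences are to be expected because the LR statistics are reported to two decimal places.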

Discussion
Whether ER expression in BC follows a bimodal or a linear distribution remains a subject of debate. The current study provides evidence that the distribution of ER expression is bimodal at both the protein and gene expression levels by using immunohistochemistry and gene expression microarray technology, respectively. At the protein level, our data are consistent with the results reported by Collins et al., in which 99% of 817 cases were completely negative or positive. 8 [...] 23 The Schnitt group have commented on the reproducibility of the continuity of expression seen with the LBA when a large population-based study was tested with immunohistochemistry. 9 They described the relationship between the real quantity of ER protein in the malignant cell nuclei and the apparent amount of ER demonstrated by immunohistochemistry as a highly complex process, and related this to the effect of the pre-analytical factors on the IHC results. 24 The pre-analytical factors and the overall sensitivity of immunohistochemistry were considered to be the main explanation for the bimodality of the ER distribution. It has been demonstrated that IHC results can vary according to pre-analytical factors such as tissue fixation 25,26 and antigen retrieval. Umemura et al. demonstrated a positive linear correlation between the LBA and a low-sensitivity IHC assay. This correlation was reduced by the use of a highly sensitive IHC test with non-linear correlation, and the biochemical assays resulted in a shift of the low-ER cases towards the higher end of ER positivity. 27 This finding was used to explain the low frequency of the low-ER cases.
Interestingly, although the techniques and the testing level used to evaluate ER distribution were totally different, when the 1980 cases of the METABRIC BC dataset were evaluated in our study, the similarity between the transcriptomic and proteomic levels was high. Both IHC and mRNA levels determined with large datasets showed the bimodal distribution. Therefore, the explanation that the bimodal distribution is a result of pre-analytical factors or the highly sensitive IHC test has to be reconsidered. The bimodal gene expression for ESR1 in 123 patients was presented previously by Wang et al. 12 Their method was also applied to the MDA133 breast cancer microarray dataset. 28 This study has demonstrated a strong correlation between ER protein levels (immunohistochemistry) and ESR1 mRNA levels as assessed by gene expression microarrays. Interestingly, a large proportion of the immunohistochemically intermediate cases were related to the cases with strong ESR1 expression. However, this can be explained either by the use of different cut-offs in each cohort or by the differential translation of ER in some cases. The use of 70% as a cut-off to define the intermediate group in this study may have resulted in an underestimation of the level of ER positivity defined according to ESR1 mRNA. This study also did not assess the role of variable ER expression in the BC-associated stroma on the strength of the association between ER IHC and ESR1 mRNA levels. At both the proteomic and transcriptomic level, our results showed no significant difference between the ER-negative and the intermediate ER expression groups regarding patients' outcome. Therefore, our findings raise the clinical question of whether cases with low positive ER expression are actually positive cases, or are ER-negative cancers that have been misclassified as borderline weakly ER-positive.
In this study, ~50% of the low/intermediate immunohistochemically ER-positive group became ER-negative when full-face sections of excision specimens were immunostained. Although this may reflect a proportion of cases with false-positive ER expression on core biopsies, 29 further research is needed to evaluate this important group of cases. In this study, HER2 overexpression was more frequent in the groups with low and intermediate ER expression than in the group with strong ER expression. The high frequency of HER2 overexpression in these tumours may explain their aggressive behaviour, and indicate that their response to therapy may be more similar to that of ER-negative tumours than to that of strongly ER-positive tumours. Some of these tumours would be classified in the luminal B or HER2-enriched classes; however, further study is warranted to investigate this point.
In conclusion, our study provides further evidence to support the concept that ER expression in BC is essentially bimodal, with the vast majority of cases being either ER-negative or strongly ER-positive. The biological and clinical significance of the intermediate-expression group, and particularly of the low-expression subgroup, needs further investigation.

Conflicts of interest
The authors have no conflicts of interest.

Author contributions
A. A. Muftah: writing of the manuscript, IHC staining and scoring, and data collection, analysis, and interpretation. M. Aleskandarany: data analysis and interpretation, contribution to writing, and review of the manuscript. S. N. Sonbul: data analysis and interpretation. C. C. Nolan and M. D. Rodriguez: help with laboratory work. C. Caldas: provided data. A. R. Green: data collection, contribution to writing, and review of the manuscript. I. O. Ellis: data analysis and interpretation. E. A. Rakha: generation of the hypothesis and design of the study, data interpretation, contribution to writing, and review and approval of the manuscript. All authors approved the final version.