Diffuse Reflection Infrared Fourier Transform Spectroscopy and Partial Least Squares Regression Analysis for Temperature Prediction of Irreversible Thermochromic Paints

The temperature of the internal components of a jet engine is a crucial control parameter for ensuring component life and efficiency. For thermal analysis of these components, irreversible thermochromic paints (TPs) have been developed at Rolls-Royce plc to evaluate the surface temperature of engine components where measurement is otherwise impossible. Thermochromic paints change color with increasing temperature, and the resulting color corresponds to the maximum temperature experienced by the surface of the engine component during testing. To improve the reliability and reproducibility of temperature measurement by TPs, this work explored the potential use of diffuse reflection infrared Fourier transform spectroscopy (DRIFTS) combined with partial least squares regression (PLSR) analysis. The prediction outcomes of the raw and pre-processed datasets were compared and discussed. The major contributors to the prediction models were the change in the property of the surface M–OH bonds, the structural change of the inorganic pigments and fillers, and their solid-state reaction at higher temperatures. The results showed improved reliability of the prediction model after the combined pre-processing treatments, with a reported RMSEC of 4.5 °C and RMSECV of 13.0 °C using three latent variables.


Introduction
Temperature measurement is important for ensuring component life and improving the performance efficiency of machinery and engines operating at high temperature. To acquire the surface thermal information of turbine jet engine components, irreversible thermochromic paints (TPs) have been developed at Rolls-Royce plc (RR). [1][2] The surface thermal profile is obtained from a temperature-dependent color change, which is mostly due to (i) thermal decomposition of pigments, (ii) structural changes of the inorganic pigments and fillers, and (iii) high-temperature solid-state reaction of inorganic pigments resulting in new spinel species. [3][4][5][6] The series of TPs at RR incorporates various thermochromic pigments, such as ultramarine violet (Al 6 Na 6 O 24 S (3)(4) Si 6 ) and cobalt ammonium phosphate (CoH 4 NO 4 P), to obtain optimal color changes on the applied surface. The surface temperature is determined by applying TP to engine components before an engine test and comparing the changed colors on the surfaces of the disassembled components with the calibrated color-temperature reference of the TP.
Although the use of TPs provides detailed thermal information of the surfaces of the engine components with a temperature range from 300 to 1200°C, the performance capability of TP depends on its color change that is examined by visual inspection. Hence, there has been a continuous effort to improve the precision of the temperature measurement, where it is currently limited to the visible color change of TP. One approach to refine existing TP technology is the use of non-contact and nondestructive spectroscopic surface analysis techniques. Specifically, temperature measurement using TP may improve by capturing spectroscopic information outside of the visible spectrum and correlating the chemical information to the temperature at which TP is exposed. The spectral information collected from a sample surface can be analyzed with multivariate analysis to highlight the most important spectral information related to the variable factors of interest. The use of chemometrics techniques in combination with spectroscopy, namely, diffuse reflection infrared Fourier transform spectroscopy (DRIFTS), has been widely applied in many fields such as geological analysis, art conservation, and food science to monitor the chemical change of samples to discover otherwise elusive information. 7-10 Data collection by DRIFTS is also both fast and nondestructive, which is beneficial for inspection and temperature measurement of TPs on post-test engine components.
A widely used data processing method for spectroscopy is partial least squares regression (PLSR). One of the main benefits of PLSR is that it enables correlation of observed information in a dataset with the independent variable(s) to develop a quantitative prediction model. It is a rapid, interpretive method that utilizes the most important chemical information from a material of interest, and its statistical analysis allows the user to comprehend the most significant contributing factors to the predicted outcome. The use of DRIFTS combined with chemometrics technique to determine the temperature of TP has never been reported. Therefore, the present work explores the potential application of DRIFTS-PLSR analysis to predict the temperature of a TP after thermal treatment. The raw and pre-processed datasets were compared to improve the prediction model and the statistical information obtained from PLSR was used to interpret the spectral dataset in relation to the thermal behavior of the TP.

Partial Least Squares Analysis
Spectral information is often processed using multivariate analysis to reveal relationships between the sample and changing variables or processing methods. Partial least squares regression works by regressing the dataset one latent variable at a time, a process expressed as
$$X = t p^{\mathsf{T}} + E$$
where X is the original matrix of the spectral data, t and p are the score values and loading spectrum of a latent variable (LV), or a dimension of the dataset, respectively, and E is the residual matrix obtained after regression of each LV, which is used as the matrix for the succeeding LV. 11 In spectroscopic applications, t values are often used to describe the level of similarity or difference of a sample within the whole population. They are useful for observing how the collected spectra trend within the developed model. The vector p is used to determine which component(s) of a spectrum contributed to the development of the model in each LV. 12 For a further explanation, refer to the paper by Wold et al. 11

Pre-Processing
To acquire the most relevant information of interest, it is common practice to treat the dataset with pre-processing algorithms. This step often excludes unwanted information such as baseline shifts and noise to highlight more relevant information, which depends on the physical and chemical nature of the sample as well as the end goal of the analysis. In this work, the following pre-processing treatments were applied to maximize the use of chemical information of the TP from the spectra: standard normal variate transformation (SNV), autoscale, and orthogonal signal correction (OSC). These treatments were applied in that order to the X-block (spectral data), and the Y-block (temperature of TP) was mean-centered.
Standard normal variate transformation is a process in which the spectra are normalized by the standard deviation of the intensity values at each spectral index, expressed as
$$\mathrm{SNV}_i = \frac{x_i - \bar{x}}{\sqrt{\sum (x_i - \bar{x})^2 / (n - 1)}}$$
where SNV_i is the transformed intensity value at each ith wavenumber index, x_i is the original intensity value at i, x̄ is the average intensity of all spectra at i, and n is the number of spectra collected. In the context of a spectroscopic application, SNV normalizes each spectrum to its standard deviation, so that each spectrum weighs equally in the model. 13 Autoscale was then applied after the SNV treatment. Autoscale is arguably one of the most used pre-processing methods; it mean-centers the dataset and normalizes it to its new standard deviation
$$MC_i = x_i - \bar{x}_i, \qquad AC_i = \frac{MC_i}{\sqrt{\sum MC_i^2 / (n - 1)}}$$
where MC_i is the new spectral intensity after the mean-centering process and AC_i is the new spectral intensity after the autoscaling process at each spectral index. The mean-centered spectrum possesses a unique identity within the dataset, and by normalizing it by the standard deviation of the mean-centered spectra, the values at all wavenumbers of a spectrum hold equal importance. Ultimately, SNV normalizes the spectral dataset per spectrum and autoscale normalizes it per wavenumber. Finally, OSC, which is a powerful tool to reduce background noise, 14 was applied to the dataset. After this process, the first LV contains the maximal covariance between the changes in the spectra and the increasing temperature, thereby extracting the spectral information that is most correlated with increasing temperature. The resulting dataset contains a highly linear relationship between the X-block and the Y-block. Orthogonal signal correction works by finding the information in X that is orthogonal (unrelated) to the trend of Y and removing it; this step yields an updated orthogonal score vector, t*, whose predicted response is calculated in the same manner as in PLSR.
Applying OSC after SNV and autoscale, which together highlight the spectral information relevant to prediction, can drastically improve the prediction outcome compared with that obtained from the raw data.
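The per-spectrum and per-wavenumber normalizations described above can be illustrated with a minimal sketch. It is written in plain Python rather than the Matlab/PLS_Toolbox environment used in this work, it assumes the conventional row-wise definition of SNV, and the function names and toy intensity values are hypothetical. OSC is omitted because its implementation details are not given in the text.

```python
from statistics import mean, stdev


def snv(spectra):
    """Row-wise scaling: center each spectrum on its own mean and divide
    by its own sample standard deviation (conventional SNV, assumed here)."""
    scaled = []
    for spectrum in spectra:
        m, s = mean(spectrum), stdev(spectrum)
        scaled.append([(value - m) / s for value in spectrum])
    return scaled


def autoscale(spectra):
    """Column-wise scaling: mean-center each wavenumber channel across all
    spectra and divide by that channel's sample standard deviation."""
    channels = list(zip(*spectra))
    means = [mean(channel) for channel in channels]
    stds = [stdev(channel) for channel in channels]
    return [
        [(value - m) / s for value, m, s in zip(spectrum, means, stds)]
        for spectrum in spectra
    ]


# Hypothetical toy data: three spectra measured at four wavenumber channels.
toy_spectra = [
    [0.10, 0.40, 0.35, 0.20],
    [0.30, 0.90, 0.60, 0.50],
    [0.05, 0.20, 0.45, 0.10],
]
preprocessed = autoscale(snv(toy_spectra))  # SNV first, then autoscale
```

A constant spectrum or a constant channel has zero standard deviation and would need separate handling; the sketch does not guard against that case.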

Statistical Analysis
To evaluate the quality and fitness of the developed model, it is important to examine its statistics to identify any outliers, trends, or biases introduced by the model development. Q-residuals and Hotelling's T² are two statistics used to find data points that could significantly affect the model quality because of their unusual properties. 15 The Q-residual evaluates how well a data point fits the developed model using the residual matrix of each LV. An abnormal Q-residual value for a data point relative to the sample population may indicate a systematic or random error from data collection or pre-processing. Hotelling's T² indicates how well a sample correlates with the prediction model. Note that, unlike the Q-residual statistic, Hotelling's T² incorporates the score matrix, T, which is used to explain the nature and behavior of the samples within the model, rather than evaluating the residual vector of each LV. 15 Ultimately, Hotelling's T² describes the variation of the samples within the model. The outcomes of these statistical methods were used to describe the models of the raw and pre-processed data and to compare the effect of pre-processing.
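As a rough illustration of the two statistics described above, the sketch below computes Q-residuals from the part of each spectrum that the model does not reproduce and Hotelling's T² from the scores. It assumes mean-centered, mutually orthogonal score vectors; the function names, data layout (lists of rows), and toy numbers are hypothetical and not taken from PLS_Toolbox.

```python
def q_residuals(X, T, P):
    """Q for each sample: sum of squared residuals e = x - t P^T.

    X: n samples x m variables, T: n x A scores, P: m x A loadings.
    """
    q_values = []
    for x_row, t_row in zip(X, T):
        reconstructed = [
            sum(t * p for t, p in zip(t_row, p_row)) for p_row in P
        ]
        q_values.append(
            sum((x - r) ** 2 for x, r in zip(x_row, reconstructed))
        )
    return q_values


def hotelling_t2(T):
    """T² for each sample: squared scores weighted by the inverse variance
    of each LV (scores assumed mean-centered and orthogonal)."""
    n = len(T)
    variances = [sum(t ** 2 for t in column) / (n - 1) for column in zip(*T)]
    return [
        sum(t ** 2 / v for t, v in zip(t_row, variances)) for t_row in T
    ]


# Hypothetical one-LV model of three samples with two variables.
X_toy = [[-1.0, 0.1], [0.0, 0.0], [1.0, -0.1]]
T_toy = [[-1.0], [0.0], [1.0]]
P_toy = [[1.0], [0.0]]
q_toy = q_residuals(X_toy, T_toy, P_toy)   # approximately [0.01, 0.0, 0.01]
t2_toy = hotelling_t2(T_toy)               # [1.0, 0.0, 1.0]
```

A sample with a large Q lies far from the model plane, whereas a sample with a large T² lies far from the center of the population within the model plane.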

Thermochromic Paint
Samples of TPs were prepared at Rolls-Royce plc, Derby, United Kingdom. TP calibration coupons were prepared by cold-spray painting the TP on coupons (15 × 25 × 2 mm) of Nimonic75, which were cured at 300°C for an hour and left to cool at room temperature for an hour. Each coupon was then heated to the desired temperature in a Thermal Cycling furnace (Carbolite, UK) in air for 3 minutes and subsequently cooled to room temperature. A set of calibration coupons was prepared in the range of 470-1260°C in 10°C increments. The thickness of the TP layer on the coupons was measured with a PosiTector 6000 (DeFelsko, USA) to ensure consistent thickness for all coupons (see Table S1, Supplementary Material).
A gradual color change of TP was obtained by heating a bowtie-shaped Nimonic75 sample with a customized induction heater (Bowyer Engineering, UK). Three bowtie samples were prepared by spray painting the TP onto a bowtie-shaped Nimonic75 substrate (50 × 180 × 2 mm), and the samples were cured at 300°C for an hour. The bowtie samples were cooled for an hour at room temperature and then heated for 3 minutes to a maximum temperature of 1200°C at the narrowest point of the sample, where thermocouples were welded (Figs. 1 and 2). The bowtie samples were used for line-scan measurement to test the prediction model on a surface with a gradual and continuous color change.

Diffuse Reflection Infrared Fourier Transform Spectroscopy
Diffuse reflection infrared Fourier transform spectroscopy allows the collection of structural information from the surface of TP in a nondestructive manner with minimal sample preparation. It is also beneficial for the analysis of non-smooth surfaces, as the process considers both absorption and scattering of the incident light by the surface using the Kubelka-Munk transformation, which is approximately proportional to the absorption spectrum
$$f(R_\infty) = \frac{(1 - R_\infty)^2}{2 R_\infty} = \frac{K}{S}$$
where R∞ is the reflectance of the surface, and the transformed value is the ratio of the absorption coefficient, K, to the scattering coefficient, S. 16 The IR spectra of the calibration coupons and bowtie samples were collected using an Alpha FT-IR spectrometer (Bruker, Germany) equipped with the UP-DRIFT module (Bruker, Germany). The spectra were obtained in the Kubelka-Munk mode with atmospheric compensation from 400 to 4000 cm⁻¹, with 32 sample scans and a resolution of 4 cm⁻¹. Each spectrum is the average of three measurements at different locations on the coupon to simulate random sample collection. The line measurement of the bowtie was taken by obtaining a spectrum every 5 mm along the center line (Fig. 2). Data processing and PLSR analysis were executed using Matlab 2017b (The MathWorks Inc., USA) and PLS_Toolbox 7.0 (Eigenvector).
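The Kubelka-Munk transformation mentioned above, f(R∞) = (1 - R∞)² / (2R∞) = K/S, can be sketched in a few lines. In this work the spectrometer software applied the transformation during acquisition; the function name and the input check below are illustrative assumptions only.

```python
def kubelka_munk(reflectance):
    """Convert a diffuse reflectance value R (0 < R <= 1) into the
    Kubelka-Munk function (1 - R)^2 / (2 R), which equals K / S."""
    if not 0.0 < reflectance <= 1.0:
        raise ValueError("reflectance must lie in the interval (0, 1]")
    return (1.0 - reflectance) ** 2 / (2.0 * reflectance)


# A perfectly reflecting surface gives 0; stronger absorption gives larger values.
km_values = [kubelka_munk(r) for r in (1.0, 0.5, 0.1)]
```

For example, a reflectance of 0.5 gives (0.5)² / 1.0 = 0.25.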

Spectroscopic Analysis
To develop a prediction model from spectroscopic data, understanding the original dataset (i.e., the chemical change with increasing temperature) is crucial to optimize the model specifically for the materials of interest. The raw spectra of the calibration coupons, which were thermally treated in increments of 10°C, are presented in Figure 4a. In the spectral region above 3000 cm⁻¹, there are two different bands associated with -OH stretching: one at 3357 cm⁻¹, assigned to the surface M-OH stretching from ultramarine violet, silica, and alumina pigments (Fig. 3b-d), 17,18 and one at 3242 cm⁻¹, from the -OH group present in the paint resin (Fig. 3e). The intensity of the M-OH band of the inorganic pigments increases at around 580°C, reaches its maximum at 610°C, and decreases at higher temperatures. This is attributed to an increase in M-OH binding sites on the surface of the pigments upon heating, after which the band decreases due to thermal dehydroxylation. 19 Further, the band maximum shifted to 3617 cm⁻¹ at around 860°C, which corresponds to the structural change of the alumina pigments leading to altered vibrational activities of Al-OH stretching on the surface. 17,20 The M-OH stretch band completely disappears by 980°C, which is also observed in the corresponding O-H bending peak at 1628 cm⁻¹.
The sharp peaks at 3073, 3052, 2959, 2926, and 2854 cm⁻¹ are attributed to stretching modes of C-H bonds of the resin mixture, which burns off by 620°C. In the same region, there is an N-H stretching band from cobalt ammonium phosphate (Fig. 3a), which overlaps with the spectra of the resin. The impact of these organic peaks on the prediction model is discussed later.
After the disappearance of the M-OH bands at around 980°C, the alumina, silica, and ultramarine pigments show spectroscopic activities of thermal decomposition, solid-state reaction, and structural transformation. One of the two most notable peaks is observed in the region of 1200-1350 cm⁻¹ after the disappearance of the organo-silicone resin. These intense peaks can be assigned to the overlapping P-O, Si-O-Si, and Si-O-Al bands from the inorganic pigments and fillers, which remain in the TP above 1000°C. 19,21,22 The peaks remaining at higher temperature are attributed to the silicate glass that melted on the coupon surface, which was observed under SEM (see Figure S2).
As described above, the pre-processing method was applied to improve the prediction outcome. The pre-processed spectra (Fig. 4b) are difficult to interpret, but some key features can be linked back to the raw spectra, such as the change in the M-OH band in the region of 3000-3500 cm⁻¹. It must be noted that the pre-processing method highlighted the shift in the baseline for the samples treated above 1000°C. Another strong feature in the pre-processed spectra is the presence of the silica peaks at 1991, 1869, and 1793 cm⁻¹, which were more pronounced in the TP spectra in the higher temperature region (>1100°C).

Temperature Prediction Model Development
After pre-processing, the spectral data were used to create a model to predict the temperature to which the TP had been exposed. The quality of the calibration curve from the pre-processed data was evaluated by the root mean squared error of calibration (RMSEC), which is reported in Figure 5. Six LVs were used for the model developed with the raw dataset and three LVs for the model developed with the pre-processed dataset. The developed models were validated using the Venetian blinds cross-validation method with 10 data splits and one sample per blind. This means that every tenth spectrum was left out, the model was developed on the remaining spectra, and the model was then tested on the spectra that had been left out. The summary of the results is provided in Table I. The RMSEC of the model developed with the pre-processed dataset improved significantly, to 7.3°C as compared to 48.5°C for the raw spectra. The root mean squared error of cross-validation (RMSECV) is 49.3°C for the raw data model and 9.07°C for the pre-processed model. Another notable result is the difference in the LV1 Y-block cumulative value, which is 99.6% for the pre-processed data as compared to 11.9% for the raw data. This is the major effect of OSC, in which the spectral data were orthogonalized to the Y-block variable to exclude as much unrelated spectral data as possible in the first LV. This also reduced the number of LVs needed to obtain lower error values, which contributed to the robustness of the model. Lastly, the improvement in the linearity of the calibration curve is especially remarkable in the temperature window above 1100°C, where very few spectral activities are observed in the raw data.
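The Venetian blinds scheme described above (10 splits, one sample per blind) and the error metric can be sketched as follows. This is only an illustration in plain Python: it assumes the spectra are ordered by calibration temperature, the function names are hypothetical, and the error is divided by the number of samples, which may differ from the degrees-of-freedom convention used by the software in this work.

```python
import math


def venetian_blinds(n_samples, n_splits=10):
    """Yield (train, test) index lists; fold k holds out samples
    k, k + n_splits, k + 2 * n_splits, and so on."""
    for k in range(n_splits):
        test = list(range(k, n_samples, n_splits))
        held_out = set(test)
        train = [i for i in range(n_samples) if i not in held_out]
        yield train, test


def rmse(reference, predicted):
    """Root mean squared error between reference and predicted values."""
    squared = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Calibration temperatures from 470 to 1260 °C in 10 °C steps (80 coupons).
temperatures = list(range(470, 1261, 10))
folds = list(venetian_blinds(len(temperatures)))
first_test = [temperatures[i] for i in folds[0][1]]
# first_test == [470, 570, 670, 770, 870, 970, 1070, 1170]
```

RMSEC would be obtained by applying `rmse` to the predictions of a model fitted on all coupons, and RMSECV by pooling the held-out predictions from all ten folds before applying `rmse`.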

Statistical Analysis
After developing the prediction model, it is important to identify any outliers that may affect the quality of the model and prediction. The Q-residuals and Hotelling's T² are two statistics used to identify data points that could be considered outliers or that could affect the model significantly. In Figure 6a, the reported Hotelling's T² values of the pre-processed data are more closely gathered than those of the raw data model, except for the spectra from the higher temperature coupons. This suggests that after pre-processing, the samples below about 1100°C are relatively more compatible with the new model than the samples of higher temperature. Similarly, the overall Q-residual values of the pre-processed data decrease, meaning that the samples became better fitted to the model. As for the robustness of the pre-processed data, it is seen in Fig. 6b and d that the Y-studentized residual values of the pre-processed data are more randomly spread than those of the raw data, which show increasing and decreasing trends in certain temperature windows. The more randomly spread Y-studentized residuals of the pre-processed data suggest that the data became more responsive to temperature-dependent spectral behavior, improving the robustness and fitness of the prediction model.

Scores and Loadings
Scores and loadings provide useful insight, as they show which behavior of the dataset has been captured and used to generate the outcome. For simplicity of interpretation and comparison, only the scores and loadings of the first three LVs are presented. The loading spectra show the main spectral activities that are correlated with the increasing temperature. In Figure 7a, the most notable features are the decrease in the intensities of the (Si-, Al-)OH bands at 3357 and 1620 cm⁻¹, as well as of the P-O, Si-O-Si, and Si-O-Al region from 1200 to 1350 cm⁻¹. The same phenomenon is observed in the region below 1000 cm⁻¹, where the peak intensity decreases over the temperature window. These activities are positively correlated, which suggests that dehydroxylation and the breakdown of the silicate-based pigments are the largest contributors to the development of the prediction model. LV2 (Fig. 7b) also shows similar spectral behavior; however, the silica peaks at 1991, 1869, and 1795 cm⁻¹ show a negative correlation relative to the M-OH peak. This suggests that the increase in the silica peaks, which is due to the shift of the baseline from the morphology change of the surface at higher temperatures (>1100°C), makes a major contribution to the model development. Likewise, the increase in the intensity at 812 cm⁻¹ appears to have the same effect, as it is also negatively correlated with the M-OH peak, whose band intensity decreases with increasing temperature. Lastly, the score and loading spectrum from LV3 (Fig. 7c and f), which captured only 1.57% of the total information, show the spectral activities of the sharp peaks from 2854 to 3073 cm⁻¹ as well as at 1430 cm⁻¹, which correspond to the organic resin components, whose rapid decomposition is seen between 470 and 600°C.
Further, the M-OH band in LV3 shows a negative correlation with the M-O peaks, indicating that LV3 also included the structural change of the inorganic pigments and an increase in hydroxyl sites in the silica, alumina, and ultramarine violet upon heating to around 600°C.
The score plots and loading spectra of the pre-processed data show significant differences as compared to the raw dataset (Fig. 7g-l). The most notable difference is seen in the LV1 score plot (Fig. 7j), which shows great linearity with the increasing temperature of TP, accounting for 99.58% of the Y-block. Orthogonal signal correction extracted the linear relationship of the given samples and variables; however, applying SNV and autoscale before OSC further improved the prediction model. As seen in the LV1 loading plot (Fig. 7g), the pre-processing treatment highlights the information of the M-OH bands and the P-O, Si-O-Si, and Si-O-Al peaks, whose relative intensities are much closer than in the raw data. This is the result of the OSC process, in which the new weight vector is calculated to include the maximum covariance of the X-block and Y-block, and the loading shows the most important spectral features of the TP. In Figure 7k, LV2 contributed 0.32% to the prediction model, where spectral activities are captured in three main temperature windows: 470 to 590°C, 590 to 1080°C, and 1080 to 1260°C. These temperature windows are assigned to the decomposition of the organic resins, the M-OH band change and baseline shifting, and the activity of inorganic components in the far spectral region (<1000 cm⁻¹), respectively. In LV3, although important spectral features such as the Si-O peak are observed, the loading spectrum was modified beyond comprehensible interpretation.
Figure 4. DRIFT spectra of TP collected from the UP-DRIFT apparatus. Each presented spectrum was an average of three spectra per coupon, and every third spectrum (i.e., temperature increment of 30°C) is presented for clarity. (a) Raw spectra and (b) spectra after SNV-autoscale-OSC pre-processing treatment.
Figure 5. The calibration curves of (a) raw spectral dataset and (b) pre-processed spectral dataset. Linearity of calibration curves (R² values) were reported to be 0.933 and 0.997 for the raw and pre-processed dataset, respectively.
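To make the roles of the scores and loadings discussed in this section more concrete, the sketch below extracts a single LV from a mean-centered X-block and Y-block using the standard single-response (PLS1) step: a weight vector in the direction of maximum covariance with y, scores t as projections onto it, loadings p from regressing X on t, and the residual E = X - t pᵀ that would be passed to the next LV. It is a plain-Python illustration with hypothetical names and toy numbers, not the PLS_Toolbox routine used in this work.

```python
import math


def pls1_component(X, y):
    """Extract one latent variable from mean-centered X (n x m) and y (n).

    Returns the score vector t, the loading vector p, and the residual
    matrix E = X - t p^T.
    """
    n, m = len(X), len(X[0])
    # Weights: direction in spectral space with maximum covariance with y.
    w = [sum(X[i][j] * y[i] for i in range(n)) for j in range(m)]
    norm = math.sqrt(sum(value ** 2 for value in w))
    w = [value / norm for value in w]
    # Scores: projection of every spectrum onto the weight vector.
    t = [sum(X[i][j] * w[j] for j in range(m)) for i in range(n)]
    t_dot_t = sum(value ** 2 for value in t)
    # Loadings: regression of every wavenumber channel on the scores.
    p = [sum(X[i][j] * t[i] for i in range(n)) / t_dot_t for j in range(m)]
    # Residuals: what this LV leaves unexplained.
    E = [[X[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
    return t, p, E


# Hypothetical mean-centered toy data: three samples, three channels.
X_demo = [[-1.0, -2.0, 0.5], [0.0, 1.0, -1.0], [1.0, 1.0, 0.5]]
y_demo = [-10.0, 0.0, 10.0]
t_demo, p_demo, E_demo = pls1_component(X_demo, y_demo)
```

Plotting `t_demo` against temperature corresponds to a score plot, and plotting `p_demo` against wavenumber corresponds to a loading spectrum; the third channel, which has zero covariance with `y_demo`, receives zero weight.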

Model Testing
To determine the applicability of the model, a separate set of calibration coupons made from a different batch of the same TP was prepared and used for the prediction test. The results are reported in Table II. Comparison of the raw and pre-processed data showed an improved prediction outcome for the pre-processed data. The improvement is especially notable in the temperature region of 640 to 940°C, where the difference between the treated and predicted temperatures is within 25°C. This is most likely due to the various spectral activities that take place within this temperature window, as previously discussed. However, the prediction model still struggles to produce an accurate outcome for coupons heated at 1140 and 1240°C, which is probably due to the lack of spectral activity of the TP at high temperature, as observed in the raw spectra.
The prediction model was further tested on a bowtie substrate heated by the resistance heater up to 1200°C. The spectra for temperature prediction were collected every 5 mm along the bowtie, where the lowest temperatures were at 0 and 130 mm and the highest (1200°C) was at 90 to 95 mm. Through this test, the applicability of the model to a surface with unassigned temperature and a gradual color change was investigated (Fig. 8). First, the result shows an inaccurate outcome at the edges of the bowtie at 0 and 130 mm, where the predicted temperatures are below the curing temperature. This is due to a high concentration of organic resin components still present in the lower temperature region of the surface, which is seen in the raw spectra of the painted bowtie sample (see Figure S2). The high concentration of resin residue lowered the predicted temperatures, which must be taken into consideration for real applications. Secondly, the pre-processed data show three distinct "phases" of temperature change: a sharp change from 0 to 20 mm, a gradual change from 20 to 60 mm, and another sharp change from 60 to 95 mm. This is explained by the LV3 score plot (Fig. 7l), where the scores of the samples between 700 and 900°C appear randomly scattered; hence, the prediction outcome is inconsistent in the corresponding region on the bowtie. Although the highest predicted temperature is 1206.9°C at 85 mm, which is very close to the targeted temperature of 1200°C, this nonlinear prediction outcome along the bowtie surface indicates that the model still struggles in the region where the spectral information is not fully utilized.
Figure 7. Loadings and score plots of (a-f) raw and (g-l) pre-processed datasets. Only the first three LVs of each dataset were presented for the simplicity of comparison. The spectral region of atmospheric CO₂ was excluded for both models.

Conclusion
This work demonstrated the interpretation of the temperature prediction model and the spectral contributions of the TP that improved the accuracy of the prediction outcome. The prediction model became more robust after pre-processing of the spectral data, and the accuracy of prediction improved drastically. Although this was the very first step in exploring the potential use of DRIFTS and PLSR analysis to acquire thermal information from TPs, the result was promising for further development and application. Advancement of the present work will lead to automation of the temperature reading of TP, and the method may be applicable to thermal analysis of other coating materials.

Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: We thank EPSRC (EP/L015633/1, EP/S022236/1, and EP/R025282/1), Rolls-Royce plc, and Thermal Paint Department for support.

Supplemental Material
The supplemental material mentioned in the text, consisting of figures and tables, is available in the online version of the journal.