ICA-based denoising for ASL perfusion imaging

.


Introduction
The measurement of cerebral perfusion is an indispensable tool in clinical practice across a broad range of acute and chronic pathologies, such as stroke and dementia (Grade et al., 2015;Albers et al., 2018;Wolters et al., 2017). A number of methodologies can be used, each with its own advantages and disadvantages; arterial spin labeling (ASL) (Detre et al., 1992) MRI's key advantage is that it does not require the administration of an exogenous contrast agent. Instead, ASL generates an image by tracing the accumulation of magnetically labeled blood water in the brain, deriving a perfusion image by subtracting a magnetically labeled image from a control unlabeled image. The use of multiple post-labeling delays (PLD) in the acquisition allows the estimation of arterial transit time (ATT) values, which may not only improve the accuracy of the quantification of cerebral blood flow (CBF) (Wang et al., 2013;Okell et al., 2013), but may also provide relevant risk stratification information, for instance, in patients with carotid steno-occlusive arterial disease (Alsop et al., 2015).
The main drawback of ASL is that the generated image has an intrinsically low signal to noise ratio (SNR). Multiple measurements are routinely acquired and averaged to compensate for this (Alsop et al., 2015), though at a penalty of increased scan duration and, hence, opportunity for motion artefacts, whose adverse effect are further enhanced by the image subtraction process (Grade et al., 2015). Thus, in acute clinical settings, patient factors, and, in particular, increased patient movement (Carone et al., 2017) may limit this strategy as a means of improving SNR.
Methods have been proposed in the post-processing stage to remove structured noise arising from subtraction errors due to motion or other sources of difference between images while preserving as much signal as possible. Existing approaches include applying filters or removing image volumes deemed to be corrupted (Shirzadi et al., 2018;Tan et al., 2009). However, both strategies have their limitations. Filter thresholds are often chosen arbitrarily without reference to the imaging data. Removing entire volumes inevitably causes loss of signal, and risks being counterproductive in the presence of limited measurements.
The challenges of low SNR and the presence of structured noise are not unique to ASL. BOLD-based functional MRI (fMRI) shares the same issues. Using Independent Component Analysis (ICA) in post-processing BOLD fMRI data has been shown to reliably separate signal from artefacts or structured noise (Thomas et al., 2002), allowing a significant improvement over results obtained with more traditional post-processing (Stone et al., 2002;Kochiyama et al., 2005;McKeown et al., 2005;Zou et al., 2009). The utility of ICA to improve SNR has also been explored in diffusion-weighted imaging (Arfanakis et al., 2002), and dynamic susceptibility contrast-MRI (Calamante et al., 2004). It has shown promising preliminary results when applied to pre-clinical ASL data (Wells et al., 2010).
This study investigates the use of ICA-based denoising on clinical ASL data acquired in acute ischemic stroke patients. Its performance is compared to two other denoising strategies: aCompCor (Behzadi et al., 2007), a Principal Component-based method; and, Enhancement of Automated Blood Flow Estimates (ENABLE) (Shirzadi et al., 2018), an algorithm based on the removal of corrupted volumes.

Patients and MRI data acquisition
Consecutive patients presenting with a clinical stroke syndrome within 24 h of symptom onset (using the last seen well principle), regardless of age, were recruited into a prospective observational cohort study following informed consent or agreement from a representative according to protocols approved by UK National Research Ethics Service committees (ref: 12/SC/0292 and 13/SC/0362). Each subject was scanned at presentation, 24 h, a week, and a month later, whenever clinically possible. Exclusion criteria included the presence of a contraindication for MRI and a severely impaired conscious level (score >1 on question 1a of the National Institute for Health Stroke Scale).

MRI data acquisition
All scans were acquired using a 3.0T Siemens Verio scanner (Siemens Healthcare, Erlangen, Germany). ASL data were acquired using the following protocol: pseudo-continuous ASL; single-shot EPI readout; TR/ TE 5386/14 m s; 3.4x3.4 Â 4.5 mm; 24 slices using a matrix size 64x64; alternating control and label pairs acquired after 1.8 s of labeling at 5 different post-labeling delays: 400, 800, 1200, 1600, 2000 m s, varied in a looped fashion and repeated 6 times (total volumes acquired 60); background suppression (WET presaturation and two global inversion pulses with timings calculated as per Okell et al. (2013) 6 ); and, total acquisition time ¼ 4 min 30 s. A calibration image with identical readout parameters, but with no background suppression or ASL labelling, was automatically collected within the same scan to allow quantification of CBF in absolute units. The vendor "pre-scan normalize" functionality used to remove the effects of receive coil non-uniformity.
In all patients and at all time points, a high-resolution T1-weighted structural image (magnetization prepared rapid acquisition gradient echo (MPRAGE); 1.8 Â 1.8 Â 1.0 mm; FoV ¼ 228 mm; TR ¼ 2040 m s; TE ¼ 4.55 m s; and, a total acquisition time ¼ 3 min 58 s) was also acquired.
Tissue segmentation of the structural T1-weighted image using FMRIB's Automated Segmentation Tool (FAST (Zhang et al., 2001)) defined gray matter partial volume estimates, which were registered into perfusion image space. Gray matter masks were generated using a partial volume estimate (PVE) threshold of !70% unless otherwise specified. Registration between perfusion and structural image was carried out using the BBR (Boundary-Based Registration) option of the FLIRT tool (Jenkinson et al., 2002;Jenkinson and Smith, 2001), which also allowed for simultaneous distortion correction using separately acquired fieldmaps.
Registration between structural and standard-space (MNI152-2 mm standard brain) was carried out using the FNIRT tool (Jenkinson et al., 2012).

ICA-based denoising
Each 4D ASL dataset after control-label subtraction (ASL-sub) was processed using single-subject spatial-ICA decomposition with automatic dimensionality estimation using the Multivariate Exploratory Linear Optimised Decomposition of Independent Components (MELODIC) tool (Beckmann and Smith, 2004).

Manual Independent Component classification and artefactual component regression
Two independent raters manually classified the Independent Components (ICs) following a standardized procedure. Components were deemed as most likely representing signal ( Fig. 1) when at least two of the following characteristics were present: spatial maps consistent with the expected location of perfusion signal (i.e. gray matter). a time course congruous with the variation in post-labeling delays across the acquisition. most of the signal in the power spectrum at frequencies (in cycles per scan) corresponding to the number of repetitions or its multiple.
If a component had none or only one of these characteristics, it was labeled as noise (Fig. 2), and regressed out of the data using a nonaggressive approach so that only the unique variance related to the artefacts was removed . Disagreements were resolved by reference to a third rater.

Automated Independent Component classification and artefactual component regression
The automated classification and regression of components were obtained using a customized version of FMRIB's ICA-based X-noiseifier (FIX) (Salimi-Khorshidi et al., 2014). FIX extracts multiple spatial and temporal features for each component, each describing a different aspect of the data. These are fed into a multi-level classifier. Once trained using manually classified training datasets, FIX can then automatically classify new datasets (Salimi-Khorshidi et al., 2014). A threshold is applied to FIX output to determine the binary classification of any given component. Changing the threshold shifts the balance between true positive ratio (TPR) and true negative ratio (TNR). For the purpose of this study, the optimal threshold was defined as the one that would grant the highest TNR while preserving a TPR above 90% (see Supplementary Table 1).
In order to optimize FIX for ASL data, it was custom-modified to identify as signal those ICs that had higher power in the frequencies that matched the specific ASL sequence PLD cycle frequency (6 cycles per scan). For more details on how FIX features were modified see Supplementary Table 2. In addition, FIX was automated to use the average motion parameters between the corresponding label and control images as motion parameters for each subtracted image. This was required as motion correction was performed before label-control subtraction, whilst ICA was run on the subtracted data.
The labels generated during manual classification were used to create the required training dataset. When denoising ASL data obtained from a specific subject, all the labels derived from the same subject were excluded from the training dataset (leave-one-subject-out approach). As above, components were automatically regressed out of the data using the non-aggressive approach.
2.4. Alternative denoising methods used for comparison 2.4.1. aCompCor aCompCor (Behzadi et al., 2007) is a Principal Component Analysis (PCA) based method developed to reduce noise in both BOLD and ASL based fMRI. Significant Principal Components are derived from noise regions-of-interest (ROI) composed primarily of white matter and cerebrospinal fluid. These components are then included as nuisance parameters within general linear models for BOLD and perfusion-based fMRI time-series data. aCompCor (available on https://nipype.readthedo cs.io/en/latest/interfaces/generated/nipype.algorithms.confounds.html ) was applied to the data used in this study after the pre-processing step described above in accordance with the user guide instructions.

Enhancement of Automated Blood Flow Estimates (ENABLE)
ENABLE is a multiparametric automated algorithm that identifies and removes poor quality difference images in multiple post-labeling delay (PLD) ASL as a means to improve the SNR. The quality criteria that ENABLE simultaneously implements are: the temporal contrast to noise ratio (tCNR), defined as the temporal mean of the 4D ASL subtracted dataset (ASL-sub) in the gray matter divided by the standard deviation of ASL-sub in non-brain voxels); the proportion of ASL-sub voxels in the gray matter significantly greater than zero; the coefficient of variation (CoV, defined as the ASL-sub spatial standard deviation divided by its spatial mean in the gray matter; and, the temporal SNR (tSNR, defined as the spatial mean of ASL-sub divided by its standard deviation in the gray matter) (Shirzadi et al., 2018). ENABLE (available on https://asl-docs.re adthedocs.io/en/latest/index.html) was applied to the data used in this study after the pre-processing step described above in accordance to the user guide instructions (Shirzadi et al., 2018).

Evaluation of the effects of denoising
Effects on ASL-sub variance: to test the effect of correction on ASLsub, a voxel-wise variable, %ΔSTD map (Carone et al., 2017;Khalili-Mahani et al., 2013) was calculated for every scan obtained. %ΔSTD map was defined as: where STD is the standard deviation of each voxel over the volumes of the 4D ASL subtracted dataset. The %ΔSTD maps were then registered to the MNI152-2 mm standard brain (using the non-linear transformation matrix obtained from registering the perfusion data to the MNI152-2 mm standard brain) and averaged to generate an intensity map. ΔSTD maps were also thresholded (25%), binarized, and averaged to generate a probability map that would highlight those areas the ASL-sub variance was more frequently reduced across subjects.
Effects on the perfusion analysis: gray matter CBF, ATT and their respective intrasession variances were estimated before and after denoising using a spatially regularized Bayesian inference method (BASIL) that produces an estimate of both CBF and its associated variance at every voxel (Chappell et al., 2009). Mean CBF and mean ATT and their respective intrasession variances were compared in the gray matter across the different denoising strategies using an ANOVA for repeated measurements with multiple comparisons. In addition, mean CBF and its intrasession variance were computed before and after ICA-based denoising using different gray matter PVE thresholds (50, 70 and 90) to explore the effects of denoising in regions with increasingly smaller partial volume effects.
BASIL-generated z-statistics for the CBF fit were used as a marker of goodness-of-fit to the model parameters (Chappell et al., 2009;MacIntosh et al., 2010). The number of gray matter voxels having a z value < 2 (approximately < 5% confidence in fitted value) was compared across the different denoising strategies using an ANOVA for repeated measurements with multiple comparisons.
Effects on SNR estimates: SNR was defined to allow the direct comparison of the results with previous work using ENABLE (Shirzadi et al., 2018). CBF-SNR and ATT-SNR were estimated for each dataset by dividing (voxel-wise) the gray matter CBF (or ATT) values by the estimated standard deviation. SNR estimates were compared across different denoising strategies using an ANOVA for repeated measurements with multiple comparisons.
Effects on repeatability: to understand the effect of denoising on repeatability of CBF measurements, each single session ASL dataset acquired was split into epochs comprising of one repetition for each postlabelling delay, before and after denoising. The gray matter CBF was estimated for each epoch. Repeatability was assessed using the coefficient of variation of these estimates (defined as the standard deviation divided by the mean; the lower the coefficient of variation, the higher the repeatability). The results were compared across different denoising strategies using an ANOVA for repeated measurements with multiple comparisons.

Evaluating the effects of varying the number of repetitions
Epochs were generated for each ASL dataset acquired, before and after manual ICA-based denoising. An increasing number of repetitions were removed in a stepwise manner to understand how varying the number of repetitions acquired may impact SNR (the first repetitions to be removed were the last acquired). CBF-SNR was then estimated for each epoch and compared using an ANOVA for repeated measurements with multiple comparisons.

Statistical analysis software
All statistical analysis was performed using Prism 8 (GraphPad, California, USA).

ICs classification and artefactual component regression
The mean number of single-subject ICs estimated by MELODIC was 19 (range 7-23).
The mean number of ICs manually classified as signal was 7 (range 3-14; inter-rater agreement ratio was 92%). The mean number of ICs classified as signal by FIX was 7 (range 2-16).

ENABLE
The mean number of volumes per patient scan classified as poor and subsequently removed from analysis was 1.7 (range 0-7).

Evaluation of the effects of denoising
Effect of correction on ASL-sub variance: following ICA-based denoising, individual average maps of %ΔSTD showed more frequent and pronounced changes around the brain edges and in the periventricular areas with an average decrease of 30-35% in the ASL-sub variance (Fig. 3, right). When using aCompCor or ENABLE, the changes were similar although less frequent and less pronounced (Fig. 3, middle and  left).
Effects on the perfusion analysis: the mean CBF and ATT values and their mean intra-session variances, before and after denoising are reported in Table 2 and Figs. 4 and 5. Results from the multiple comparisons are listed in Tables 3 and 4. ICA-based denoising led to the greatest changes in the perfusion analysis. Compared to both raw data and data denoised using the other methods, mean CBF was significantly higher (mean difference vs raw data: 5.0 ml/100 g/min, p < 0.001: vs aCompCor 1.8 ml/100 g/min, p < 0.01; vs ENABLE 2.8 ml/100 g/min, p < 0.001; Fig. 4, Table 3), whilst the mean CBF intra-session variance was lower (mean difference vs raw data: 21%, p < 0.001, vs aCompCor 9%, p < 0.01; vs ENABLE 18%, p < 0.001; Fig. 4, Table 3). There was no significant difference in either metric when the effects of manual and automated IC classification were compared. The use of aCompCor led to more modest changes in mean CBF (mean difference vs raw data: 3.2 ml/100 g/min; p < 0.001) and its intra-session variance (mean difference vs raw data: 17%, p < 0.001; Fig. 4, Table 3). The use of ENABLE resulted in an increase in mean CBF only (mean difference vs raw data: 1.5 ml/100 g/min, p < 0.05, Fig. 4, Table 3).
The use of ICA-based denoising and aCompCor led to similar changes in ATT (mean difference vs raw data: 0.08s, p < 0.001) and ATT intrasession variance (mean difference vs raw data: 28%, p < 0.001; Fig. 5, Table 4). There was no significant difference in either metric when the effects of manual and automated IC classification were compared. The use of ENABLE led to an increase in ATT intra-session variance only (mean relative difference: 5%, p < 0.001, Fig. 5, Table 4).
The changes in mean CBF and its intra-session variance following ICA-based denoising were significant across all gray matter PVE thresholds tested (see Supplementary Fig. 1).
All denoising strategies led to a significant decrease in the number of poorly fitted voxels (z value < 2) (Gardener and Jezzard, 2015). Compared to the raw data, this effect was greater following ICA-based denoising (14% decrease, p < 0.001, Supplementary Fig. 2). There was no significant difference between manual and automated ICA-based denoising. The effects of aCompCor and ENABLE were more modest but still significant (aCompCor: 10% decrease, p < 0.001; ENABLE: 8% decrease, p < 0.05).
Effects on SNR estimates: all denoising strategies led to a significant increase in both CBF-and ATT-SNR (p < 0.001 Figs. 4 and 5; Tables 2  and 3). The highest increase in SNR compared to raw data was observed following ICA-based denoising (40% increase in CBF-SNR, p < 0.001, Fig. 4, Table 2; manual and automated). There was no significant difference between manual and automated ICA-based denoising (Figs. 4 and 5, Tables 2 and 3). The effects of aCompCor were more modest (20%  Fig. 3. Spatial pattern of change in ASL-sub variance after denoising. Probability map (top), representing areas where the ASL-sub variance was affected more frequently in patients. Intensity map (bottom), representing average change in ASL-sub variance.
increase in CBF-SNR, p < 0.001, Fig. 4, Table 2) whilst the use of ENABLE led to the smallest increase (4% increase in CBF-SNR, p < 0.001, Fig. 4, Tables 2 and 3). Effect of correction on repeatability: repeatability was unaltered following the use of either aCompCor or ENABLE when compared to raw data. However, ICA-based denoising yielded significantly higher repeatability (p < 0.05, Fig. 6). There was no significant difference in repeatability between manual and automated ICA-based denoising (Fig. 6).

Evaluating the effects of varying the number of repetitions
Decreasing the number of repetitions in the raw ASL data led to a drop in CBF-SNR, ranging from 4% (one repetition removed, p < 0.001) to 21% (four repetitions removed, p < 0.001). A similar effect was observed when removing repetitions from the manual ICA denoised ASL data. However, the CBF-SNR calculated in denoised data incorporating only 3, 4, or 5 repetitions was greater than the CBF-SNR calculated from the raw data with all repetitions included (p < 0.001; Fig. 7).

Discussion
This study shows that the use of an ICA-based denoising approach results in lower CBF and ATT variance, increased SNR, and improved repeatability. These changes went beyond the effects of approaches using Principal Component Analysis (aCompCor) and removal of corrupt image volumes (ENABLE). Automating ICA denoising achieved near identical results to manual denoising across a broad range of metrics. Intriguingly, following ICA-based denoising, it was possible to obtain higher SNR levels compared to the full raw dataset using only 50% of the data acquired, setting up the possibility of reducing acquisition times in acute stroke patients.
strategies. This was, at least in part, a reflection of the reduction in the number of poorly fitted voxels; the Bayesian CBF and ATT priors used in BASIL are both less than the overall mean CBF and ATT, regardless of denoising strategy. As data fidelity improves, the prior has less influence over the overall CBF and ATT estimates (Chappell et al., 2009;Gardener and Jezzard, 2015). Both component-based denoising approaches were superior to ENABLE presumably due to their ability to remove noise across multiple volumes without having to discard entire volumes containing both signal and noise. In settings where a small number of repetitions are acquired such as in this acute stroke study, ENABLE is disadvantaged as a denoising strategy as the concurrent removal of signal with noise has a greater marginal effect on the total signal available for analysis.
The comparative performance of the two component-based approaches was in line with previous studies comparing the two in BOLD fMRI where the denoising effects of aCompCor were more limited, especially in the presence of motion (Pruim et al., 2015;Parkes et al., 2018). This difference in effect appears to be related to inherent features of the approaches where different "rotation" criteria are used for determining the orientation of the basis vectors in their common solution space (James et al., 2009). PCA-based strategies are better in removing the contributions of random noise; whilst ICA-based approaches are better suited in dealing with structured noise such as motion, respiratory and cardiac noise (James et al., 2009;Biswal and Ulmer, 1999), major contributors of noise in acute stroke patients (Carone et al., 2017).
While manual IC classification has been widely used as the gold standard (De Martino et al., 2007;Rummel et al., 2013), it is time-consuming, operator dependent and requires expert knowledge to separate signal and noise characteristics (Carone et al., 2017;Salimi--Khorshidi et al., 2014). The customized version of FIX used here was able to achieve results comparable to manual ICA denoising, demonstrating that it is possible to automate this approach, overcoming these limitations. Moving to an automated approach may be challenging where the ASL data has been acquired with different scanners and/or different acquisition protocols. While the inherent structure of multi-PLD ASL data was exploited to help separate signal and noise components in this study, ICA denoising should also be beneficial in single-PLD ASL in the same it has been for BOLD fMRI. This requires further study. The customized version of FIX for ASL and our training datasets will be made available (URL).

Conclusion
ICA can be used to separate signal from noise in ASL data. The removal of artefactual components improves data quality without increasing acquisition duration. In fact, this study suggests that the acquisition time could be reduced by 50% without penalty to data quality, something that merits further investigation. Independent component classification and regression can be carried out either manually, following simple criteria, or automatically, thorough the use of FIX customized for ASL.