SDSS-IV MaNGA: Excavating the fossil record of stellar populations in spiral galaxies


We perform a ‘fossil record’ analysis for ≈800 low-redshift spiral galaxies, using Starlight applied to integral field spectroscopic observations from the SDSS-IV MaNGA survey to obtain fully spatially resolved high-resolution star formation histories (SFHs). From the SFHs, we are able to build maps indicating the present-day distribution of stellar populations of different ages in each galaxy. We find small negative mean age gradients in most spiral galaxies, especially at high stellar mass, which reflects the formation times of stellar populations at different galactocentric radii. We show that the youngest (<10^8.5 yr) populations exhibit significantly more extended distributions than the oldest (>10^9.5 yr), again with a strong dependence on stellar mass. By interpreting the radial profiles of ‘time slices’ as indicative of the size of the galaxy at the time those populations had formed, we are able to trace the simultaneous growth in mass and size of the spiral galaxies over the last 10 Gyr. Despite finding that the evolution of the measured light-weighted radius is consistent with inside-out growth in the majority of spiral galaxies, an equivalent mass-weighted radius has changed little over the same time period. Since radial migration effects are likely to be small, we conclude that the growth of discs in spiral galaxies has occurred predominantly through an inside-out mode (with the effect greatest in high-mass galaxies), but this has not had anywhere near as much impact on the distribution of mass within spiral galaxies.


INTRODUCTION
Understanding how, when, and where galaxies built their mass is key to cosmology and astronomy. Analysis of the evolution of the masses and sizes of galaxies has generally been limited to comparisons of different galaxy populations at different redshifts. Studies done in this manner have shown that galaxies have grown in radius whilst building their mass (e.g. Trujillo et al. 2007; van der Wel et al. 2008; van Dokkum et al. 2008, 2013; Patel et al. 2013; Papovich et al. 2015; Whitney et al. 2019), giving rise to the concept of "inside-out" formation. It is thought that such growth in the most massive galaxies has been due to some combination of multiple minor mergers (Naab et al. 2009; Furlong et al. 2017), gas accretion (Conselice et al. 2013), and quasar feedback (Fan et al. 2008). These approaches have given us a good insight into how the average properties of galaxies have evolved over cosmic time, but because we cannot track the evolution of any individual system in this way, it is difficult to go beyond such global properties. Although some studies of galaxies at different redshifts have managed to show inside-out growth in disk-like galaxies (e.g. Trujillo et al. 2006; Patel et al. 2013), most are restricted to the highest mass galaxies, so this picture of inside-out growth is normally limited to early-type galaxies.
E-mail: Thomas.Peterken@nottingham.ac.uk
An alternative approach which is more suited to late-type galaxies is to explore the stellar populations in different regions of a galaxy, particularly through studying how the mean stellar age varies with radius. This method requires high-quality spectral data at multiple locations across the face of a galaxy, and so has only been undertaken in detail for large numbers of galaxies since the advent of integral-field spectroscopic surveys such as the Calar-Alto Legacy Integral Field Array (CALIFA; Sánchez et al. 2012), Sydney-AAO Multi-object Integral field spectrograph (SAMI; Croom et al. 2012), and Mapping Nearby Galaxies at APO (MaNGA; Bundy et al. 2015) surveys. Using such a "fossil record" approach applied to integral-field spectroscopic data has revealed that most galaxies exhibit negative age gradients, with younger outskirts than centres (e.g. Mehlert et al. 2003; Sánchez-Blázquez et al. 2014; González Delgado et al. 2015; Goddard et al. 2017), or earlier formation times of the central regions (e.g. Ibarra-Medel et al. 2016), providing more evidence for a dominant "inside-out" growth mode occurring in galaxies of all Hubble types. This is also backed up by Sacchi et al. (2019) for the case of NGC 7793, who find that broad-band observations of resolved stellar populations in this nearby spiral galaxy indicate a clear gradient in stellar age.
By applying stellar population modelling methods to integral-field spectroscopic data from the CALIFA survey, Cid Fernandes et al. (2013, 2014), Pérez et al. (2013), González Delgado et al. (2017), and García-Benito et al. (2019) have shown that it is possible to reveal much more about a galaxy's history by deriving full star-formation histories rather than mean ages. We have shown previously that such analyses of the spatial variation in stellar populations of spiral galaxies can help us understand the structure of the spiral arms and bars, but here we investigate how such approaches can also help us study the evolution and growth of populations of galaxies.
Comparative studies of the masses and sizes of galaxies at different lookback times are most effective for measuring the growth of early-type galaxies, since these are typically the most massive and luminous objects at any given redshift and so are easy to identify. By contrast, a fossil record analysis acts as a complementary approach best suited to (but by no means limited to; see e.g. Lacerna et al. 2020) studying the growth of late-type galaxies, as such galaxies have in general had continued growth over the last several Gyr. This extended star formation in late-type galaxies can be traced using fossil record methods, provided that care is taken to ensure that older populations can be detected when the flux may be dominated by the younger and brighter populations. Of course, there exists a population of spiral galaxies which are passive (see for example Masters et al. 2010; Fraser-McKelvie et al. 2016), contrary to the well-known relation between morphology and star formation rate (Tully et al. 1982; Baldry et al. 2004), so a morphological classification does not always define the extent of the star-formation history of each galaxy. However, for consistency, we have chosen to study a galaxy population selected on morphology rather than colour, to better understand how this well-defined galaxy class has evolved over time.
Here, we perform full spectral fitting of spiral galaxies from the MaNGA survey (Bundy et al. 2015) and measure spatially-resolved star formation histories, to uncover their formation sequences. This paper is structured as follows. In §2, we outline the data we use from the MaNGA survey. In §3 we describe how a sample of spiral galaxies from the MaNGA target list was selected, and in §4 we detail the spectral fitting method employed (with some tests of this method outlined in Appendix A). We then describe how the derived star-formation histories are processed in §5. The mean age and metallicity gradients are derived in §6. In §7 and §8 we analyse the star-formation histories and spatially-resolved stellar populations in more detail, and infer the evolution of the mass-size relation in §9. Finally, we discuss the interpretation and context of the results in §10.

MaNGA
MaNGA (Bundy et al. 2015) is part of the fourth generation of the Sloan Digital Sky Survey (SDSS-IV; Blanton et al. 2017). By the survey's completion in 2020, MaNGA will have acquired 2.5 arcsec resolution integral-field spectroscopic observations of more than 10,000 galaxies in the redshift range 0.01 < z < 0.15 (Yan et al. 2016b). The survey makes use of the BOSS spectrograph (Smee et al. 2013) on the 2.5-metre SDSS telescope (Gunn et al. 2006) at the Apache Point Observatory, which has a spectral resolution of R ≈ 2000 and covers a large wavelength range of 3600-10300 Å. The calibration of the raw data is described by Yan et al. (2016a), with the datacubes then reduced using MaNGA's data reduction pipeline (DRP; Law et al. 2016). For each target galaxy, observations are taken out to at least either 1.5 R_e or 2.5 R_e (to form the "Primary" and "Secondary" samples respectively; Law et al. 2015), where R_e is the elliptical half-light radius measured photometrically by the NASA-Sloan Atlas (Blanton et al. 2011). This is achieved using integral-field units of five different sizes, from the 127-fibre IFUs with a diameter of 32 arcsec, to the 19-fibre IFUs of 12 arcsec diameter (Drory et al. 2015).
The MaNGA target selection was chosen to obtain a flat distribution in log(stellar mass) (Wake et al. 2017). Neither the Primary nor Secondary samples are therefore volume-limited; instead the high-mass galaxies are over-represented while the low-mass galaxies are under-represented. The Primary+ ("colour-enhanced") sample is an extended Primary sample with an oversampling of the "green valley" galaxies (Wake et al. 2017), so it is also unrepresentative in this way. However, since the sample selection in all cases is well-defined (Wake et al. 2017), a weighting has been determined for each galaxy to correct for these selection biases and form a representative volume-limited sample (referred to as the "Primary+ sample weighting" throughout this paper).
In this work, we make use of some of the analysis outputs of MaNGA's data analysis pipeline (DAP; Westfall et al. 2019). Specifically, we use the measured stellar velocities v, deprojected radii R, and emission line spectra (which are themselves described in detail by Belfiore et al. 2019), all of which are derived using full spectral modelling. The data we use here are from the internal MaNGA product launch 8 (MPL-8) data release, which contains completed observations of 6778 galaxies.

Galaxy Zoo
We also make use of the morphological classifications of each MaNGA galaxy provided by volunteer "citizen scientists" as part of Galaxy Zoo (Lintott et al. 2008, 2011). The second phase of the project (Galaxy Zoo 2, hereafter GZ2; Willett et al. 2013) includes publicly-available detailed classifications of galaxies based on SDSS DR7 imaging. The users' classifications are weighted and combined to obtain a consensus fraction for each answer to each question for each galaxy, using methods described by Willett et al. (2013) and Hart et al. (2016). We use the redshift-debiased and user-weighted probabilities (which we denote as p_classification) from the Hart et al. (2016) catalogue.

SAMPLE SELECTION
A sample of spiral galaxies was drawn from the MPL-8 data release using the recommendations of Willett et al. (2013, Table 3); see also Masters et al. (2019) for another recent implementation. We first remove the 45 galaxies in the matched MPL-8/GZ2 catalogues that more than 50% of GZ2 users have classified as having some form of star or artifact in the image. To filter out elliptical galaxy morphologies, we select the 4201 galaxies with p_{features or disk} > 0.43 and at least 20 classifications in this question, as recommended by Willett et al. (2013).
Since we are interested in the variation in stellar population properties across the face of each spiral galaxy, we remove edge-on galaxies from this sample. This cut can be made either with the GZ2 classifications (specifying p_{not edge-on} > 0.8) following Willett et al. (2013), or with an axis ratio cut (requiring b/a ≥ 0.4) following Hart et al. (2017). To select only face-on galaxies, we choose galaxies that satisfy the Willett et al. (2013) criterion and have an axis ratio of b/a ≥ 0.5 (corresponding to an inclination of i ≤ 60°, assuming the galaxies can be modelled as thin, intrinsically circular disks). We used this higher axis ratio cut compared to that used by Hart et al. (2017) to ensure that we have selected only galaxies for which the radial structure is clearly resolvable with MaNGA. Of the 5902 MPL-8 galaxies for which GZ2 classifications are available, this leaves a sample of 1686 close-to-face-on disky galaxies. Of these, 1314 galaxies satisfy the Willett et al. (2013) requirement for spiral galaxies of p_spiral > 0.8 and at least 20 individual classifications in this question.
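As a rough illustration of the face-on cut described above, the following sketch converts an axis ratio to an inclination under the thin circular disk assumption (cos i = b/a) and applies both criteria. The function names and array inputs are hypothetical and are not taken from any MaNGA or Galaxy Zoo software.

```python
import numpy as np

def inclination_deg(axis_ratio):
    """Inclination of an infinitely thin, intrinsically circular disk: cos(i) = b/a."""
    axis_ratio = np.clip(np.asarray(axis_ratio, dtype=float), 0.0, 1.0)
    return np.degrees(np.arccos(axis_ratio))

def is_face_on(p_not_edge_on, axis_ratio, p_min=0.8, ba_min=0.5):
    """Combine the GZ2 vote-fraction cut with the axis-ratio cut quoted in the text."""
    p = np.asarray(p_not_edge_on, dtype=float)
    ba = np.asarray(axis_ratio, dtype=float)
    return (p > p_min) & (ba >= ba_min)

# Hypothetical example: three galaxies with different vote fractions and axis ratios.
mask = is_face_on([0.9, 0.9, 0.5], [0.6, 0.4, 0.9])
```

Under this assumption an axis ratio of 0.5 maps to an inclination of 60 degrees, which is the limit quoted in the text; a disk of finite thickness would give a slightly different correspondence.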
We then remove 109 galaxies which have flags for bad or questionable-standard data in the MaNGA DRP, or for which the MaNGA MPL-8 DAP data products are unavailable. To ensure consistency in the spatial resolution relative to the galaxy size, we remove galaxies which are part of MaNGA's Secondary sample. For the final sample of spiral galaxies, we therefore select only those 795 which are in the Primary+ MaNGA sample, for which MaNGA observations extend to at least 1.5 R_e. The median redshift of galaxies in our sample (weighted by the MaNGA Primary+ sample weighting) is z = 0.026, and 75% of the (weighted) sample are at redshifts z < 0.03.

SPECTRAL FITTING
Using a similar technique to that employed in Peterken et al. (2019a) and Peterken et al. (2019b), we fit each spectrum in each galaxy using Starlight (Cid Fernandes et al. 2005). We first de-redshift and subtract the emission-line spectrum using the MaNGA DAP (Westfall et al. 2019; Belfiore et al. 2019), and then fit using E-MILES (Vazdekis et al. 2016) single stellar population (SSP) templates. The de-redshifted and emission-subtracted MaNGA spectra are rebinned onto a linear wavelength scale (as required by Starlight) before fitting. Starlight then uses an iterative method to find the best-fit linear combination of the input templates, and returns the relative weights given to each SSP template in the fit, along with line-of-sight velocity v, velocity dispersion σ, and the amount of dust reddening A_V.
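The de-redshifting and rebinning step can be sketched as follows. This is a minimal illustration only: it uses plain linear interpolation (which is not flux-conserving), ignores the (1 + z) scaling of the flux density, and the 1 Å step and function name are assumptions rather than the settings used in this work.

```python
import numpy as np

def deredshift_and_rebin(wave_obs, flux, z, step=1.0):
    """Shift a spectrum to the rest frame and resample it onto a linear wavelength grid."""
    wave_rest = np.asarray(wave_obs, dtype=float) / (1.0 + z)
    start = np.ceil(wave_rest[0])
    stop = np.floor(wave_rest[-1])
    n_pix = int(np.floor((stop - start) / step)) + 1
    wave_lin = start + step * np.arange(n_pix)          # uniform grid inside the data range
    flux_lin = np.interp(wave_lin, wave_rest, flux)     # simple linear interpolation
    return wave_lin, flux_lin

# Hypothetical input: a flat spectrum on a logarithmically sampled observed-frame grid.
wave_obs = np.geomspace(3700.0, 9000.0, 2000)
wave_lin, flux_lin = deredshift_and_rebin(wave_obs, np.ones_like(wave_obs), z=0.026)
```

A production pipeline would normally use a flux-conserving resampling scheme and propagate the error spectrum and mask onto the new grid as well.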

Template stellar population spectra
We use a combination of 9 ages (log(age/years) = 7.85, 8.15, 8.45, 8.75, 9.05, 9.35, 9.65, 9.95, 10.25) Girardi et al. (2000, "Padova") isochrones, and Milky-Way [α/Fe] ("baseFe"). To sample the full star-formation histories, we also include the younger templates of Asa'd et al. (2017) covering 6 ages (log(age/years) = 6.8, 6.9, 7.0, 7.2, 7.4, 7.6) and the two recommended metallicities ([M/H] = −0.41, +0.00), which are generated using the same method as the E-MILES set of Vazdekis et al. (2016), but with the earlier Bertelli et al. (1994) version of the Padova isochrones. Combining these libraries allows us to exploit the high spectral resolution of both MaNGA and E-MILES templates, while still being able to fully fit the whole of the star-formation histories of star-forming regions without combining different libraries produced in completely different ways.

Starlight configuration
We use Starlight in a "long fit" mode to prioritise robustness over computation time, based on the recommendations from extensive testing of Starlight by Ge et al. (2018) and Cid Fernandes (2018). We limit the fit to the wavelength range of 3541.4 to 8950.4 Å, where the raw E-MILES templates have a constant FWHM of 2.51 Å. To ensure that the model and measured spectra have consistent resolution, we degraded each of the SSP templates to the wavelength-dependent resolution of the median spaxel spectrum from all galaxies in our sample, using the line spread function measured by the DAP (Westfall et al. 2019; Belfiore et al. 2019).
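Matching the template resolution to the data is commonly done by convolving with a Gaussian whose width is the quadrature difference of the two line spread functions. The sketch below assumes a single target FWHM for simplicity, whereas the text describes a wavelength-dependent target resolution; the function name and the 4.0 Å target value are purely illustrative.

```python
import numpy as np

FWHM_TO_SIGMA = 1.0 / (2.0 * np.sqrt(2.0 * np.log(2.0)))

def degrade_resolution(wave, flux, fwhm_in, fwhm_out):
    """Broaden a spectrum on a linear wavelength grid from fwhm_in to fwhm_out (Angstrom)."""
    flux = np.asarray(flux, dtype=float)
    if fwhm_out <= fwhm_in:
        return flux.copy()                      # already at or below the target resolution
    step = wave[1] - wave[0]
    sigma_pix = np.sqrt(fwhm_out**2 - fwhm_in**2) * FWHM_TO_SIGMA / step
    half = int(np.ceil(5 * sigma_pix))
    x = np.arange(-half, half + 1)
    kernel = np.exp(-0.5 * (x / sigma_pix) ** 2)
    kernel /= kernel.sum()                      # normalise so total flux is conserved
    padded = np.pad(flux, half, mode="edge")    # avoid edge darkening
    return np.convolve(padded, kernel, mode="valid")

# Hypothetical example: a single narrow line broadened from 2.51 to 4.0 Angstrom FWHM.
wave = np.arange(4000.0, 4200.0, 0.5)
line = np.zeros_like(wave)
line[200] = 1.0
broadened = degrade_resolution(wave, line, fwhm_in=2.51, fwhm_out=4.0)
```

Handling a wavelength-dependent line spread function would require a kernel whose width varies from pixel to pixel, which this constant-kernel sketch does not attempt.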
Since the DAP robustly models the emission lines (Belfiore et al. 2019), we use Starlight in its "NOCLIP" mode to ensure that all of the diagnostic absorption lines are fully fitted. To ensure that the star-formation history of each spaxel (defined here as the mass weights assigned to each SSP template divided by the time interval between that template and the next-youngest one) is measured as fully as possible, we require Starlight to retain at least 97% of its fit's total light during the "EX0" phase of reducing the number of templates used in the final fit (i.e. EX0s method option = CUMUL, EX0s Threshold = 0.03). This configuration helps to recover the presence of older stellar populations even when their flux has been obscured by the presence of a younger population, for example. The light weights we use from the fits are those contributed by each template at 4020 Å (as used by Cid Fernandes et al. 2005), and we allow the sum of weights at this wavelength to be between 50% and 150% of the input spectrum. In all subsequent analysis, we use either the mass weights (using the implicit mass-to-light ratios included in the E-MILES SSP models and assuming a flat ΛCDM cosmology with H_0 = 70 km/s/Mpc) or only compare the spatial variation of flux weights of a specific age, rendering the exact choice of reference wavelength irrelevant.
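The definition of the star-formation history given above (mass weight divided by the time interval to the next-youngest template) can be written down directly. In this sketch the interval for the youngest template is measured from zero age, which is an assumption made here because the text does not specify it; the function name is also hypothetical.

```python
import numpy as np

def sfr_from_mass_weights(log_age, mass_weights):
    """Star-formation rate per template: mass formed divided by the time interval
    between each template and the next-youngest one (ages sorted youngest first)."""
    age = 10.0 ** np.asarray(log_age, dtype=float)
    if np.any(np.diff(age) <= 0):
        raise ValueError("log_age must be strictly increasing")
    # Interval to the next-youngest template; the youngest is measured from t = 0 (assumption).
    dt = np.diff(age, prepend=0.0)
    return np.asarray(mass_weights, dtype=float) / dt

# The nine E-MILES template ages quoted in the text.
log_age = np.array([7.85, 8.15, 8.45, 8.75, 9.05, 9.35, 9.65, 9.95, 10.25])
intervals = np.diff(10.0 ** log_age, prepend=0.0)
# Hypothetical mass weights corresponding to a constant rate of 2 solar masses per year.
sfr = sfr_from_mass_weights(log_age, 2.0 * intervals)
```

Because the templates are logarithmically spaced in age, equal mass weights correspond to very different star-formation rates at young and old ages, which is why the division by the interval matters.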
From the Starlight mass weights, a measure of the total stellar mass within the MaNGA FOV can be readily calculated. Reassuringly, we find that these stellar masses agree well with those measured by the NASA-Sloan Atlas (Blanton et al. 2011), but with a small offset likely due to the difference in FOV limitations. We discuss this comparison further in §9. Although it is possible that the Blanton et al. (2011) stellar mass measurements are more robust than the Starlight-derived measurements, the consistency between the two measurements is close enough to allow us to use either one. However, in measuring the mass growth in §9, we are limited to using the Starlight measurements. Therefore, for consistency, any quoted galaxy stellar mass measurements are those measured by Starlight unless stated otherwise. The E-MILES library contains stellar mass loss predictions for each of the SSP templates, allowing a measurement of the current mass and an initial mass at time of formation for each population contained within each spectrum. Unless otherwise stated, the mass weightings used in this work are the present-day masses of each template, to avoid reliance on the mass loss predictions. In any case, we find that all results presented here are entirely unaffected by this distinction.

Treatment of dust extinction
Starlight has the capacity to fit a general dust law with extinction A_V, and also to include an extra extinction YA_V which is applied only to specified templates in the fit. This could, for example, allow for the possibility that the youngest stellar components are affected by dust extinction to a greater extent than those populations which would be expected to be free of their birth clouds. The exact values of YA_V measured by Starlight would in that case be an interesting parameter to model and investigate. However, in practice, we found that this extra degree of freedom caused Starlight's fits to be drawn towards negative extinctions when we included a YA_V term for all populations younger than 10^7.05 years. This is likely due to the combination of the limited wavelength range for which these youngest templates dominate the spectrum due to their extreme colours, and the lack of any significant spectral information beyond their continuum shape. To the best of our knowledge, the YA_V parameter in Starlight has not successfully been applied to any real spectral fitting to date.
We therefore include a single Calzetti et al. (2000) dust law in the fit, which has the same A_V for all templates. We allow A_V to vary in the range of −1 ≤ A_V ≤ 8, and we find that over 90% of the spaxel fits are within the range 0.1 ≤ A_V ≤ 0.8.
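For reference, the Calzetti et al. (2000) attenuation curve can be evaluated as in the sketch below, using the published piecewise form with R_V = 4.05 (valid for roughly 0.12 to 2.2 μm, which covers the fitted wavelength range). The function names are hypothetical, and the way Starlight applies the law internally may differ in detail.

```python
import numpy as np

R_V = 4.05  # total-to-selective attenuation of the Calzetti et al. (2000) law

def calzetti_k(wave_angstrom):
    """Calzetti et al. (2000) attenuation curve k(lambda) for 0.12 to 2.2 micron."""
    lam = np.asarray(wave_angstrom, dtype=float) / 1.0e4   # Angstrom to micron
    inv = 1.0 / lam
    blue = 2.659 * (-2.156 + 1.509 * inv - 0.198 * inv**2 + 0.011 * inv**3) + R_V
    red = 2.659 * (-1.857 + 1.040 * inv) + R_V
    return np.where(lam < 0.63, blue, red)

def attenuate(wave_angstrom, flux, a_v):
    """Apply a single dust screen with V-band attenuation a_v to a spectrum."""
    a_lambda = a_v * calzetti_k(wave_angstrom) / R_V
    return np.asarray(flux, dtype=float) * 10.0 ** (-0.4 * a_lambda)

# Hypothetical example: a flat spectrum reddened with A_V = 0.5.
wave = np.linspace(3541.4, 8950.4, 500)
reddened = attenuate(wave, np.ones_like(wave), a_v=0.5)
```

Since k(λ) decreases with wavelength, a positive A_V suppresses the blue end of the spectrum more than the red end, and a negative A_V (which the fit range permits) does the opposite.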

Kinematics
We use the stellar velocity dispersion σ measured by the MaNGA DAP (Westfall et al. 2019) as an initial kinematic guess for the Starlight fit to each de-redshifted input spectrum, but allow this to vary as a free parameter in the range of σ = 20 to 900 km/s. Unlike other spectral fitting tools such as pPXF (Cappellari & Emsellem 2004), Starlight is not fine-tuned for measuring stellar kinematics, and we do not expect Starlight's exact measurements of σ to impact the measured SFHs. Similarly, although each spectrum is de-redshifted individually using the DAP's stellar velocity v measurements, we leave the velocity as a free parameter in the range of v = −600 to 600 km/s to allow Starlight to find its best possible fit, using v = 0 as the initial guess. In practice, we find that Starlight's fits are consistent with v = 0, with little deviation in σ from the DAP measurements.
In the tests outlined in §4.6 and described in more detail in Appendix A, we find that setting these kinematic parameters to be fixed or variable has no effect on Starlight's ability to measure stellar populations. Therefore, to accommodate any wavelength calibration offsets between the DAP and the SSP templates, and any uncertainties in the measurement of the spectral resolution, we allow the values of σ and v in the fit to vary.

Ignoring the youngest stellar populations
We do not expect the star formation rate or chemical evolution to vary significantly within the last 10^8 years in the majority of cases (see e.g. Schönrich & Binney 2009), but initial tests with Starlight revealed that there was a significant correlation between the weights assigned to templates of 10^9.5 years and those of 10^7.2 years, resulting in a sharp peak in the SFH at ∼10^7 years. This effect was found to be present in all locations of all galaxies regardless of signal-to-noise or the strength of dust extinction, and often implied a galaxy SFR at least an order of magnitude greater than at any previous time in its history, of up to ∼25 M_⊙/yr. Cid Fernandes & González Delgado (2010) showed that this phenomenon seems to be related to the known "UV upturn" seen in old stellar populations, which is normally attributed to horizontal branch stars in the planetary nebula phase; see Yi (2008) for a review. This excess of blue light is not accounted for in the old SSP template spectra, so Starlight is forced to attribute it to another population.
We first attempted to mitigate this effect by fitting only from 3700 Å instead of 3541.4 Å, but found this had no effect on the derived star-formation histories, and we chose not to increase this lower wavelength limit further to avoid impacting the valuable Balmer absorption series. We then performed another fit with Starlight in which we combined all of the templates younger than 10^7.5 years for each metallicity into a single template representing a flat SFR over that time interval, and used these two templates in the fit instead of the original eight over this time interval. Comparing the Starlight results of the two approaches shows that enforcing a flat SFR in the youngest templates has no noticeable effect on the SFH at ages ≥ 10^7.5 years (< 0.1 dex change in the measured SFR at any lookback time t ≥ 10^7.5 years), but the weights assigned to these new templates still exhibited correlation with those assigned to older stellar populations. Similarly, when we compared Starlight fits using only those SSP templates of ages older than 10^7.5 years, we found that the excess of hot stars was simply assigned to whichever stellar population was youngest. The rest of the star-formation histories were unaffected, indicating that older stellar populations are reliably measured regardless of how the youngest populations are treated in the fit. We concluded that the youngest stellar population available in the Starlight fit would always have a "cross-talk" effect with populations of age 10^9.5 years. The flux assigned to the youngest populations will always be a combination of the "true" flux from stars of that age and a spurious contribution from the hot stars present but not modelled in older populations.
It may be possible to effectively separate these two effects when stellar population models are able to fully model the hot stellar remnants or other factors responsible for the UV upturn. However, for the purposes of this work, the weights and fluxes of stellar populations younger than 30 Myr are fundamentally unreliable, so we include these SSP templates in the fit but then ignore these populations entirely and do not use their weights in deriving the Starlight-measured star formation histories. The SFHs are not likely to have varied over this time period (Schönrich & Binney 2009), but such young stellar populations are clearly present in many galaxies, so including these SSPs in the fit while ignoring their weights in subsequent analysis allows the spectrum to be fully modelled. Limiting the measurements of the derived SFHs to exclude the region younger than 30 Myr does not limit the results from our analyses.
Based on this, we advise users of Starlight and other stellar population fitting software to carefully consider the effects of attempting to measure star-formation histories to young ages without accounting for the inability of current SSP models to include the UV upturn. Cautious interpretation of all derived SFHs is essential to determine which parts of a SFH are likely to be correctly measured. We do see, however, that the older populations are almost entirely unaffected by how the youngest populations are modelled, so their measurements are reliable.

Effects of low signal-to-noise
Many authors spatially bin neighbouring spaxels of integral-field spectroscopic data to create regions with approximately constant signal-to-noise ratio (SNR) before fitting. However, since we wish to retain the full spatial information of the stellar populations, and therefore fit each spectrum independently instead of binning, we must ensure that the Starlight fits in regions with low SNR are reliable. Ge et al. (2018) showed that Starlight may exhibit bias in the fitting of spectra with low SNR, but Cid Fernandes (2018) contends that these effects are not significant in most physical applications and with the robust Starlight configuration used here.
In Appendix A, we outline a series of tests to measure the effect of low signal-to-noise ratio on Starlight's recovered fits, and its ability to recover a stellar population of known age or a star-formation history of known shape using the configuration described above. We find that when spectra of a given signal-to-noise ratio are combined, the fit to the combined spectrum is consistent with the fits to the individual spectra it contains, showing that Starlight is self-consistent in low signal-to-noise regions. We also find that Starlight is able to recover the age of a known stellar population with a signal-to-noise ratio as low as 5. Similarly, with the configuration outlined in §4.2, we find that Starlight can reliably measure the shape of a known SFH in such low signal-to-noise conditions, indicating that we are able to detect the presence of older stellar populations when obscured by brighter younger populations. These tests imply that, assuming the E-MILES model spectra are accurate representations of the stellar populations they represent, we expect Starlight to be able to recover the true SFHs under all the conditions analysed in the remainder of this paper. Notwithstanding this robustness, to ensure that the low signal-to-noise regions of the galaxy are not affecting our results in ways we do not anticipate, at all stages of our analysis we weight spaxels by their flux or mass, so that the central regions with good fits are up-weighted and low signal-to-noise regions are down-weighted.

TIME-SLICING
From the SSP template weights obtained in the Starlight fits, we are able to reconstruct the star formation history (SFH) and metallicity distributions at every location in each galaxy in the spiral sample. From the SFHs, it is straightforward to reconstruct an image of the total flux (or mass) emitted by (or contained in) stars of any given age. To ensure that we are not over-interpreting small-scale noise in the age distributions of weights assigned to individual templates, we first smooth the SFHs before any analysis is done on these images. We have smoothed by 0.3 dex in age, but smoothing by any factor between 0.2 and 0.5 dex does not affect results significantly. As an illustration, Figure 1 shows an animation of a single galaxy (MaNGA plate-IFU 8329-12701) from the spiral sample, stepping through stellar population ages from 17 Gyr down to 30 Myr, highlighting the wealth of information contained in the spatially-resolved SFHs available using Starlight and MaNGA. Such animations can be made for any of the galaxies in the sample, but here we show an example of a galaxy observed using the largest-sized (127-fibre) IFU to demonstrate the amount of information potentially available through such time slicing.
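One way to picture the smoothing and time-slicing steps is the sketch below, which smooths a cube of template weights along the age axis and then extracts the image belonging to a chosen age. Treating "0.3 dex" as the standard deviation of a Gaussian kernel in log(age) is an assumption made here for illustration (the text does not state the kernel shape), and the function names and array layout are hypothetical.

```python
import numpy as np

def smooth_in_log_age(log_age, weights, width_dex=0.3):
    """Smooth template weights along the age axis (axis 0) with a Gaussian in log(age).

    Each template's weight is redistributed over its neighbours so that the total
    weight in every spaxel is conserved.
    """
    log_age = np.asarray(log_age, dtype=float)
    offset = log_age[:, None] - log_age[None, :]
    kernel = np.exp(-0.5 * (offset / width_dex) ** 2)
    kernel /= kernel.sum(axis=0, keepdims=True)   # columns sum to one
    return np.tensordot(kernel, np.asarray(weights, dtype=float), axes=(1, 0))

def time_slice(log_age, smoothed, target_log_age):
    """Image of the weights held by the template closest in age to target_log_age."""
    index = int(np.argmin(np.abs(np.asarray(log_age) - target_log_age)))
    return smoothed[index]

# Hypothetical cube: 9 template ages on a 4 x 5 spaxel grid with random weights.
log_age = np.array([7.85, 8.15, 8.45, 8.75, 9.05, 9.35, 9.65, 9.95, 10.25])
cube = np.random.default_rng(0).random((9, 4, 5))
smoothed = smooth_in_log_age(log_age, cube)
image_1gyr = time_slice(log_age, smoothed, 9.0)
```

Stepping `target_log_age` from the oldest to the youngest template and displaying each returned image in turn would give a frame-by-frame sequence of the kind described for Figure 1.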
It is worth emphasising that we can only measure the current location of stars in the galaxy, so we can only treat a "time slice" at any given stellar age as an approximation of the structure of the galaxy at that time, since we cannot undo the effects of dynamical heating or radial mixing and migration. However, Martínez-Lombilla et al. (2019) showed that the shape of vertical colour gradients seen in edge-on disk galaxies implies that radial migration occurs at a slower rate than the intrinsic growth of the galactic disk. Simulations of galactic disks also suggest that stellar populations are in general equally likely to migrate inwards or outwards (Avila-Reese et al. 2018), and only by distances small enough to have a minor effect on the radial distribution of populations (Avila-Reese et al. 2018; Navarro et al. 2018; Barros et al. 2020), so here we assume that the current distribution of a given stellar population is, to a first approximation, representative of the distribution of star formation in the galaxy at the corresponding lookback time.
Clearly this assumption does not hold true for spiral structures, since such distributions will become diluted rapidly with the disk's rotation. However, for the youngest stellar populations, we showed in Peterken et al. (2019b) that interpreting spatially-resolved star-formation histories in this "time-slicing" approach can help to understand spiral arms and bars. Mallmann et al. (2018) also showed that a similar approach can be used to understand the properties of AGN, and other studies with CALIFA showed that this approach can offer clues to the history of a galaxy's radial profile (Cid Fernandes et al. 2013; Pérez et al. 2013; González Delgado et al. 2014).

MEAN AGES AND METALLICITIES
A first-order measurement of the SFHs resolved across the face of a galaxy is that of the mean age or, with a similar calculation, of spatially-resolved metallicity. Using the mass weights assigned to each SSP template by Starlight in the fits for each spaxel spectrum, we derive mass-weighted mean age and metallicity (specifically log(age/yr)_mass and log(Z/Z_⊙)_mass respectively) maps. We then plot the light-weighted median of all spaxels' mean log(age) and log(metallicity) within radial bins of width 0.045 R_e (where R_e is the elliptical Petrosian effective radius measurement from the NSA) against the elliptical galactocentric radius R (in units of R_e), and find a best-fit straight line to these data using a least-squares fit. The fitting is only performed out to 1.2 R_e to avoid the edges of the hexagonal-shaped IFU FOVs and to ensure consistency between galaxies. From these best-fit lines, we obtain a mean age and metallicity gradient, and a characteristic age and metallicity value of the stellar populations located at 1 R_e, a measure which Sánchez et al. (2016) showed to be representative of the galaxy as a whole.
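The gradient measurement described above can be sketched in three steps: a mass-weighted mean log(age) per spaxel, a flux-weighted median in radial bins of 0.045 R_e out to 1.2 R_e, and a least-squares straight-line fit. All function names and the flattened per-spaxel array layout are assumptions for illustration; the same code applies to metallicity by swapping the input values.

```python
import numpy as np

def mass_weighted_mean(template_values, mass_weights):
    """Mean of a template property (e.g. log age) per spaxel; weights shape (n_template, n_spaxel)."""
    w = np.asarray(mass_weights, dtype=float)
    v = np.asarray(template_values, dtype=float)[:, None]
    return (v * w).sum(axis=0) / w.sum(axis=0)

def weighted_median(values, weights):
    """Value at which the cumulative weight first reaches half of the total."""
    order = np.argsort(values)
    v, w = np.asarray(values)[order], np.asarray(weights)[order]
    cumulative = np.cumsum(w)
    return v[np.searchsorted(cumulative, 0.5 * cumulative[-1])]

def radial_gradient(radius_re, spaxel_values, flux, bin_width=0.045, r_max=1.2):
    """Fit a straight line to flux-weighted medians in radial bins.

    Returns the gradient (per R_e) and the fitted value at 1 R_e.
    """
    radius_re = np.asarray(radius_re, dtype=float)
    keep = radius_re < r_max
    r, v, f = radius_re[keep], np.asarray(spaxel_values)[keep], np.asarray(flux)[keep]
    bin_index = np.floor(r / bin_width).astype(int)
    centres, medians = [], []
    for k in np.unique(bin_index):
        sel = bin_index == k
        centres.append((k + 0.5) * bin_width)
        medians.append(weighted_median(v[sel], f[sel]))
    slope, intercept = np.polyfit(centres, medians, 1)   # least-squares straight line
    return slope, slope + intercept

# Hypothetical galaxy with a known linear age profile and uniform flux.
radius = np.linspace(0.0, 1.5, 3000)
gradient, value_at_re = radial_gradient(radius, 9.8 - 0.2 * radius, np.ones_like(radius))
```

Using a weighted median within each bin, rather than a mean, keeps a few poorly fitted spaxels from dominating the binned profile.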
The distributions of age gradients and ages at 1 R_e are shown in Figure 2, and the equivalent metallicity measurements in Figure 3. We find that a majority (approximately 60%) of the spiral sample exhibit slight negative age gradients, implying younger outskirts. This agrees with the general picture found by others (Sánchez-Blázquez et al. 2014; González Delgado et al. 2015; Zheng et al. 2017; Goddard et al. 2017) and is usually taken to be evidence for inside-out formation being dominant in the most massive galaxies. When the sample is split into three mass bins (of M < 10^9.71 M_⊙, 10^9.71 < M < 10^10.22 M_⊙, and M > 10^10.22 M_⊙; see footnote 3), we find that approximately 80% of the highest-mass galaxies exhibit negative age gradients while only 50% of the lowest-mass galaxies do. This difference suggests that inside-out formation is more dominant in high-mass galaxies. We find that most (≈ 60-80%) galaxies in all mass bins exhibit slight negative metallicity gradients, and Figure 3 highlights a strong mass-metallicity correlation too, as first suggested by Lequeux et al. (1979).
³ The mass bins are chosen such that a volume-limited sample of spiral galaxies selected using the method described in §3 would contain equal numbers of galaxies in each bin, as determined using the "EWEIGHT" sample weighting for the Primary+ MaNGA sample.
Figure 4. Time since 95% of the total stellar mass within 1.2 R e had been assigned in the Starlight fits (T 95 ) for galaxies of different present-day stellar mass. All spiral galaxies with high present-day mass built the bulk of their mass at early times, but most low-mass galaxies were building their mass more recently. The transparency of each point is defined by the galaxy's MaNGA Primary+ sample weighting described in §2.1.

MASS BUILDUP TIMES
Measuring only a mass-weighted mean age or metallicity does not make use of all of the available information in the age distribution of SSP template weights. From a full spectrum fitting approach, it is also possible to use the width of the distribution in stellar age, as well as its mean value. To this end, from a given smoothed SFH, we define the time T 95 by which 95% of the total stellar mass of that spectrum was built up. We measure a T 95 for all light within R < 1.2 R e of each galaxy. We find that T 95 correlates with the total stellar mass of the galaxy, as shown in Figure 4: all galaxies with present-day stellar masses within 1.2 R e of M ≥ 2 × 10^10 M⊙ formed the bulk of their mass at least 5 Gyr ago, while most of those with stellar masses M ≤ 10^10 M⊙ were still building their mass as recently as ≈ 2 Gyr ago. This effect is reflected in the known relation between the stellar mass and star formation rates in galaxies, and the results shown here agree well with other fossil record studies (Thomas et al. 2010; Pacifici et al. 2016), empirical modelling (Rodríguez-Puebla et al. 2017; Behroozi et al. 2019), and theoretical modelling (Henriques et al. 2015; Hill et al. 2017), including previous analysis of MaNGA galaxies (Ibarra-Medel et al. 2016). There is a population of low-mass spiral galaxies with large values of T 95 , but no equivalent population of high-mass galaxies with small build-up times, highlighting that low-mass spiral galaxies have had more varied histories than their high-mass counterparts, as found by Ibarra-Medel et al. (2016).
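As an illustration of the T 95 definition, the sketch below accumulates mass from the oldest age bin to the youngest and returns the age of the first bin at which the chosen fraction of the total is reached. It is a simplified, assumption-laden version: it works on discrete age bins without interpolation or temporal smoothing, and the function name and inputs are hypothetical.

```python
import numpy as np

def buildup_time(ages, mass_formed, fraction=0.95):
    """Lookback time by which `fraction` of the total mass was in place.

    ages        : lookback age of each bin (any order)
    mass_formed : stellar mass assigned to each bin
    """
    ages = np.asarray(ages, dtype=float)
    mass = np.asarray(mass_formed, dtype=float)
    order = np.argsort(ages)[::-1]               # oldest population first
    cumulative = np.cumsum(mass[order]) / mass.sum()
    idx = np.searchsorted(cumulative, fraction)  # first bin reaching threshold
    idx = min(idx, len(ages) - 1)                # guard against rounding
    return ages[order][idx]
```

Because the result can only take the values of the age bins themselves, a discrete calculation of this kind will tend to cluster at the template ages unless the SFH is smoothed or interpolated first.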
Using the spatial information available with MaNGA, we are also able to measure how the local value of T 95 varies with galactic radius R in galaxies of different masses, using the same total stellar mass bins as in Figures 2 and 3. In Figure 5, T 95 for each spaxel in the sample of spiral galaxies plotted against the galactocentric radius shows that the stellar populations currently at the centres of high-mass galaxies formed on average significantly earlier (by ≈ 0.7 dex or a factor of 5) than those in low-mass galaxies. By contrast, the galaxy's outskirts built up at approximately the same time regardless of the mass of the host galaxy. At ≈ 1 R e , the discrepancy in T 95 is much less, at ≈ 0.3 dex (or a factor of 2).
Figure 5. Colours denote the galaxy's total present-day mass. Solid lines represent a weighted running median, and dashed lines are one-third and two-third weighted percentiles. The outskirts of galaxies of all masses built up at approximately similar times, but the centres of massive galaxies formed significantly earlier than those of low-mass galaxies. The apparent horizontal feature in the high-mass data points at ∼ 7 Gyr is an artefact: there is a large number of spaxels which have not reached T 95 by 8.9 Gyr but have by the next oldest SSP at 4.5 Gyr, causing an apparent cluster between these two ages.
Figure 6. Distribution of gradients of T 95 vs. galactic radius R for galaxies of different masses. Most galaxies show evidence for inside-out formation, and the effect is strongest in high-mass galaxies.
To quantify this effect, we obtained the radial profiles of T 95 for each individual galaxy. We find that these profiles are well described by a straight line in radius vs. log(T 95 ), so we calculate a best-fit straight line, weighting spaxels by their flux. We find that the majority of galaxies (> 80%) in each mass bin show a negative gradient, as we show in Figure 6, implying younger outskirts than galactic centres. Assuming that the stellar populations of any given age have not significantly migrated since their birth, this is evidence for inside-out growth occurring in the great majority of spiral galaxies. We find the strongest evidence in the highest-mass galaxies, for which > 90% exhibit negative gradients in T 95 . These results are consistent with the mean age gradient analysis of §6, which is not surprising since both approaches are measures of the age distributions contained within the derived SFHs. However, directly determining a quantity such as T 95 returns something much closer to a physical measurement of how the mass of the galaxy has built up over time.
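A flux-weighted straight-line fit of this kind might look like the following sketch. It assumes that "weighting spaxels by their flux" means each spaxel's squared residual is multiplied by its flux; since `numpy.polyfit` applies its `w` argument to the unsquared residuals, the square root of the flux is passed. The function name is hypothetical.

```python
import numpy as np

def t95_gradient(radius, t95, flux):
    """Flux-weighted linear fit of log10(T95) against radius (in R_e).

    Returns (gradient in dex per R_e, log10(T95) at R = 0).
    """
    radius = np.asarray(radius, dtype=float)
    log_t95 = np.log10(np.asarray(t95, dtype=float))
    # polyfit minimises sum((w * residual)**2), so w = sqrt(flux)
    # weights each squared residual by the spaxel flux.
    weights = np.sqrt(np.asarray(flux, dtype=float))
    slope, intercept = np.polyfit(radius, log_t95, 1, w=weights)
    return slope, intercept
```

A negative gradient returned by this function corresponds to outskirts that assembled their mass more recently than the centre.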
When a 90%, 75% or 50% threshold was used instead of the 95% threshold shown here, we found no change to the qualitative conclusions. The higher 95% threshold was used to ensure that the buildup time of more galaxies and spaxels was within the range 0.8 ≲ T 95 ≲ 5 Gyr, where spectral fitting methods are most sensitive, and to avoid saturation at either extreme of the stellar age range we are able to measure.

CONCENTRATION OF STELLAR COMPONENTS
Another more physically-motivated way to expand beyond measuring mean age gradients to infer the radial build-up of spiral galaxies is to analyse the spatial extent of individual stellar populations of different ages. We showed in Peterken et al. (2019b) that it is possible to measure such distributions directly using time-slicing techniques. The animation in Figure 1 suggests systematic variation in how concentrated the stellar populations are in one particular spiral galaxy. Older populations are most centrally-concentrated in the bulge regions of the galaxy while the younger populations make up the more extended disk. This illustrates the general consensus that the cores of galaxies have older ages than the surrounding disks.
To quantify the variation in spatial extent of different stellar populations in the full galaxy sample, we choose to measure a concentration of each stellar population in each spiral galaxy. A concentration can be defined in a number of ways (for example, as defined by Conselice 2003), which often require a larger FOV than MaNGA offers in order to measure a background flux. Here we define the concentration c of a population of stellar age t as
$$c(t) = \frac{\bar{m}_{r \leq k_{\rm in} R_e}(t)}{\bar{m}_{r \leq k_{\rm out} R_e}(t)}, \qquad k_{\rm in} < k_{\rm out}, \qquad (1)$$
where m_{r≤kR_e}(t) is the mean mass contained in all spaxels within k × R e using the R e elliptical Petrosian radius values of each galaxy from the NASA-Sloan Atlas (Blanton et al. 2011). This measure ensures that the extent of each population is scaled by the size of the present-day galaxy, and only requires data from within the MaNGA footprint. This definition of c(t) also means that a completely uniform (i.e. radially flat) distribution has a value of c = 1, with c < 1 indicating a distribution which rises with radius in the inner region.
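The behaviour of such a concentration measure can be illustrated with the sketch below, which assumes that c(t) is the ratio of the mean spaxel mass within an inner radius to that within an outer radius. The function name and the parameters `k_in` and `k_out` are hypothetical; the values used in any call are purely illustrative and are not taken from this work.

```python
import numpy as np

def concentration(radius, mass, k_in, k_out):
    """Ratio of mean spaxel mass within k_in*R_e to that within k_out*R_e.

    radius : spaxel galactocentric radius in units of R_e
    mass   : mass of a single time slice in each spaxel
    k_in, k_out : inner and outer limits (assumed, with k_in < k_out)
    """
    radius = np.asarray(radius, dtype=float)
    mass = np.asarray(mass, dtype=float)
    inner = mass[radius <= k_in].mean()
    outer = mass[radius <= k_out].mean()
    return inner / outer
```

With this form a radially flat time slice gives exactly c = 1, a centrally peaked slice gives c > 1, and a slice whose mass rises with radius gives c < 1, matching the properties described above.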
The concentration c of stellar populations of different ages in each galaxy in the full sample is shown in Figure 7. There is a clear trend of older populations being most centrally-concentrated (with typical values of c ≈ 2.5 at t ≥ 2 Gyr), and the younger stars in all galaxies exhibiting the most spatially extended distributions (with c ≈ 1 at t ≤ 0.1 Gyr). This is unsurprising since this is simply a different way of presenting and interpreting the same effects as in §7, but in a manner that utilises more of the temporal information available to illustrate how radial gradients in mass-to-light ratios (e.g. García-Benito et al. 2019) are created. We find that there is a strong dependence of c(t) on total (current) galactic stellar mass. Using the same mass bins as in Figures 2 and 3, we find that in the highest-mass galaxies, the oldest (≳ 6 Gyr) stellar populations are almost three times more concentrated than the youngest populations (≲ 0.1 Gyr), while in the lowest-mass galaxies this ratio is less than two.
Figure 7. Each galaxy's line is weighted by its MaNGA Primary+ weighting. The heavy line shows the weighted median of all galaxies, and the dashed lines indicate the weighted one-third and two-third percentiles. Top: All galaxies. Bottom: The same, but with galaxies coloured by their total (present-day) stellar mass. The youngest stellar populations are more spatially extended than the oldest populations in all galaxies, with the effect strongest in higher-mass galaxies.
By repeating this analysis using the mean 4020Å flux instead of the mass in the definition of c(t) (i.e. replacing m_{r≤kR_e}(t) with f_{r≤kR_e}(t) in Equation 1), the results are unchanged. This is unsurprising since the radial variation in mass-to-light ratio is unlikely to be significant for any single time slice t.
This analysis reinforces the conclusion that inside-out growth is the primary formation mode in the majority of spiral galaxies, and that the effect is strongest in higher-mass galaxies.

MASS-SIZE DISTRIBUTION
Although the mass buildup times in §7 and the variation in concentration in §8 both show evidence for inside-out formation being the dominant growth mechanism in spiral galaxies, these analyses are still not directly comparable measurements to those used in most studies over different redshifts. Previously, observational evidence for inside-out formation in galaxies has come from analysing how the masses and sizes of galaxies increase simultaneously over time, by measuring these properties of different populations at different redshifts (e.g. Maltby et al. 2010;van Dokkum et al. 2013;Patel et al. 2013;Papovich et al. 2015;Whitney et al. 2019). This comparison is something that can be directly made using time-slicing methods with integral field spectroscopy for a single galaxy population, to understand how the total mass and size growth has occurred over time.

Deriving half-light and mass measurements
At each stellar age t, we define the stellar mass to be the sum of the masses in all populations with ages ≥ t within 1.2 R e , using the temporally-smoothed distribution of weights from Starlight. We can also define a measurement r l (t) of the light size of a time slice t as being the radius containing half of all the light emitted by stars older than t within 1.2 R e (using R e elliptical Petrosian radius measurements from the NSA). This definition is used because the limited fields of view of the MaNGA observations prevent us from reliably measuring a sky background, which rules out a conventional direct half-light radius measurement.
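The size definition above can be sketched as follows, assuming a per-spaxel array of (elliptical) radii in units of R e and a two-dimensional array of light per spaxel and per age bin. The function name, array layout, and the linear interpolation of the cumulative light curve are illustrative assumptions rather than the procedure used here.

```python
import numpy as np

def half_light_radius(radius, light, ages, t, r_max=1.2):
    """Radius enclosing half the light within r_max from stars older than t.

    radius : (n_spaxel,) galactocentric radius in units of R_e
    light  : (n_spaxel, n_age) light contributed by each age bin
    ages   : (n_age,) age of each bin, in the same units as t
    """
    radius = np.asarray(radius, dtype=float)
    light = np.asarray(light, dtype=float)
    ages = np.asarray(ages, dtype=float)
    # Light in each spaxel from populations already in place at time t.
    per_spaxel = light[:, ages >= t].sum(axis=1)
    inside = radius <= r_max
    order = np.argsort(radius[inside])
    r_sorted = radius[inside][order]
    cumulative = np.cumsum(per_spaxel[inside][order])
    # Radius at which the enclosed light reaches half of the total.
    return np.interp(0.5 * cumulative[-1], cumulative, r_sorted)
```

Passing a mass array in place of `light` gives the analogous half-mass size, and evaluating the function over a grid of t traces the size of successive time slices.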
To ensure that the radius and stellar mass measurements defined here using the Starlight fits are reliable, we compare these measurements for the present-day galaxy (i.e. t = 0) with the known size and mass measurements of the galaxies in the NSA (see Figure 8). We find that r l is a good proxy for the NSA elliptical Petrosian half-light measurements, with an offset of ≈ 0.2 dex which is a consequence of both the limited MaNGA FOV and the difference in wavelengths used. (The NSA radii are measured in the r band imagery, but the measurements for the Starlight outputs are done on a model 4020Å image, which would be located in the g band.) We also find that the total stellar masses determined by Starlight are consistent with the photometry-derived masses in the NSA. Both mass measurements assume the same IMF, so a small observed offset is likely due to MaNGA's limited FOV. and others have shown that, unlike the early-type galaxies, the mass-size relation for spiral galaxies is weak. However, using the Starlight-derived measurements of the galaxies' masses and sizes, we find no strong mass-size trend in the present-day sample of spiral galaxies at all; a Spearman rank test gives a p-value of p = 0.84 for the measured data, and similar for the NSA values. This lack of a relation may indicate that the Galaxy Zoo classifications for low-mass galaxies under the conservative selection criteria used here may be slightly biased so that the smaller low-mass galaxies are less likely to be classified as spirals.
Figure 8. We find no mass-size relation for the sample of spiral galaxies at the present day when using the Starlight- or NSA-measured parameters (red and blue respectively). The transparency of the points indicates the relative Primary+ MaNGA sample weighting for each galaxy.

Evolution of the mass-half-light-radius plane
Having reassured ourselves that our mass and r l radius measurements are appropriate proxies for the photometric measurements in the present-day galaxies, we can now explore how the mass-size plane changes over time. The upper panels of the animation in Figure 9 show the evolution of the mass-r l plane over the last ≈ 10 Gyr. Figure 10 also shows the distribution of galaxies in the mass-r l plane at four different redshifts. The measurements shown use Starlight's current mass measurements of each SSP template (see §4.2 for the distinction between current and initial mass weights). In reality, the mass loss of each population will have been gradual over the galaxy's evolution rather than instantaneous as this approach implies. However, as stated in §4.2, by instead adopting the initial mass (and therefore assuming that no mass loss occurs at all), we find no significant change to these results. The "reality" would of course be between these two extremes. However, since the two cases reach near-identical results, we present here only the results for the current mass template weightings to avoid uncertainties in modelling time-dependent mass loss estimates separately for each SSP at each time-step.
Assuming an absence of significant systematic radial migration effects, we find that the growth in r l of these galaxies has only occurred over the last ≈ 3 Gyr, while the bulk of the growth in mass occurred before this. We also find that galaxies generally have not changed their relative mass group, instead growing in mass and size at the same rates as those of similar masses and sizes. This cohort behaviour implies that, although every galaxy has had a unique formation history, tracing the average evolution of a galaxy population (e.g. by measuring galaxy properties at different redshifts) is representative of how most galaxies have evolved over the same time period.

Mass dependence
By splitting the galaxy sample into subsamples of different mass bins as before, we find that over the last 10 Gyr, the low-mass galaxies have grown significantly more in mass (≈ 0.17 dex) but less in light radius r l (≈ 0.05 dex) than the high-mass galaxies (≈ 0.14 dex growth in mass, ≈ 0.1 dex in r l ). The small ≈ 0.03 dex difference in mass growth between the samples, combined with the mass dependence of T 95 seen in §7, indicates that the low-mass galaxies have only built up slightly more mass relative to the high-mass galaxies, but that this growth occurred later. We also see this downsizing effect in the "turnup" time (at which galaxies stop growing significantly in mass and start growing in light radius r l ), which occurred earlier in high-mass than in low-mass galaxies (≈ 3.5 Gyr ago compared to ≈ 1 Gyr ago).

Evolution of the mass-half-mass-radius plane
While the light-weighted radius measurements are directly comparable to the size evolution of galaxy populations observed at different redshifts, the mass distribution of a galaxy is more fundamental to its build-up. In Figures 9 and 10, we therefore also show the evolution of the mass-size plane but using a half-mass size r m (equivalently defined as the radius containing half of the stellar mass within 1.2 R e due to the limitations of the MaNGA FOV) using the SSP template mass-to-light ratios. We find that despite increases in the observed light size r l of the galaxy population, the corresponding increase in the mass size r m of the same galaxies is minimal; we find an increase of 0.05 dex in size for almost all galaxies, even in those with low present-day stellar masses. This weak evolution is in agreement with the results presented by Suess et al. (2019a,b), who used an entirely independent approach to show that the half-mass radius does not evolve significantly compared to the evolution of the half-light radius.
Figure 9. The left column shows the mass-size plane for the light radius (r l , top) and mass radius (r m , bottom) measurements. The right column indicates the overall change in each galaxy's mass and size (in dex) from the first frame of the animation. The redshift of each galaxy is accounted for, such that in any given frame the star-formation history of each galaxy is sampled at the difference between the frame age and the lookback time implied by the galaxy's redshift.
Figure 10. The distributions of galaxies at individual snapshots in Figure 9 showing the mass-r l (upper panels) and mass-r m (lower panels) planes at selected redshifts z. The corresponding lookback times t are also indicated. As fiducials, the grey contours and circle markers indicate the distribution of galaxies and the mean positions of each mass bin at z = 0, while the magenta contours and coloured diamond points indicate the distribution and mean positions at each redshift's time slice.
The physical size growth of spiral galaxies over the last 10 Gyr has therefore been extremely small, at typically only 10% growth. Such an increase in mass radius -however slight -requires a radial increase in the regions of ongoing star formation. Since younger stellar populations dominate the light of a spiral galaxy at any time slice or lookback time, the increase in measured radius in observations of the same galaxies becomes significant. A small amount of star formation in the outskirts of the galaxies will contribute a large amount to the light while contributing comparatively little to the bulk of the galaxy, causing a strong mass-to-light gradient. Direct measurements of the growth of galaxies from observations therefore produce an overestimate of the underlying mass growth rate. This effect has also been recently quantified for cosmological galaxy catalogs from CANDELS (Suess et al. 2019a,b), who showed that the half-light radius growth of galaxies, both star-forming and quiescent, previously reported in many works is significantly weaker for the half-mass radius.
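Growth quoted in dex converts to a multiplicative factor as 10^dex. The short sketch below (with hypothetical helper names) makes the conversion explicit for the values quoted in this work: 0.05 dex is a growth of roughly 12%, of the same order as the ≈ 10% quoted above, while 0.3 dex and 0.7 dex correspond to factors of about 2 and 5.

```python
import numpy as np

def dex_to_factor(dex):
    """Multiplicative factor corresponding to a change given in dex."""
    return 10.0 ** np.asarray(dex, dtype=float)

def dex_to_percent(dex):
    """Percentage growth corresponding to a change given in dex."""
    return 100.0 * (dex_to_factor(dex) - 1.0)

for dex in (0.05, 0.1, 0.3, 0.7):
    print(f"{dex:4.2f} dex = x{dex_to_factor(dex):.2f} "
          f"({dex_to_percent(dex):.0f}% growth)")
```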
Interestingly, it has been reported by Frankel et al. (2019) that the structure of stellar populations seen in the Milky Way provides evidence for a slower growth in half-mass radius than in the half-light radius, and the evidence presented here, as well as that from high-redshift surveys (see above), suggests that this feature is common in the growth of spiral galaxies. This slow size growth of spiral galaxies seems to be in tension with predictions from semi-analytical models and hydrodynamical simulations of galaxy evolution in the context of the ΛCDM cosmology (see Avila-Reese et al. 2018 for a discussion, and references therein).

Limitations of the data
Due to the limited FOV of the MaNGA observations, we are unable to measure a true half-light (or half-mass) radius for any given "time slice", since we do not have any background in the images. We are able to confirm in Figure 8 that the radius of half of the light (or mass) contained within 1.2 R e is a good proxy for the present-day galaxy, but we have no way of confirming this at other stellar population ages. However, since we find that the oldest populations are most concentrated, the measured sizes in the earlier age-steps in the mass-size evolution are likely to be closer to the true sizes. The observed increase in size is therefore a conservative estimate of the real change. Since little of a galaxy's mass is located outside 1 R e (e.g. Pérez et al. 2013), we expect that the mass radius r m measurements are likely to be close to true half-mass radii.

Stellar population models and spectral fitting
Although we show in Appendix A that Starlight can measure stellar populations if the models used to do so are correct, this work assumes that the model spectrum templates of the E-MILES (Vazdekis et al. 2016) and Asa'd et al. (2017) libraries are representative of the true observed stellar populations. There are a number of unresolved problems in the field of stellar population modelling; see Conroy (2013) for a comprehensive review. For example, in §4.5, we described a correlation between weights assigned to populations of ≳ 10^9.5 years and those of ≲ 10^7.2 years due to a deficiency in the SSP templates. This is likely to be related to the UV upturn problem, due to the presence of hot stars in old stellar populations (Yi 2008) which is not accounted for in the SSP models. There is also uncertainty surrounding the shape of the IMF and ongoing debate on whether it varies between and within galaxies (La Barbera et al. 2013; Alton et al. 2017; Vaughan et al. 2018; Parikh et al. 2018). In principle, any variation of the IMF over cosmic time is likely to affect our analysis too.
We also assume here that stellar metallicity is a one-dimensional parameter. In reality, the individual elemental abundances can vary from star to star. Further time-slicing work can be done to measure the simultaneous change in star-formation histories and metallicity evolution, including variation in α-enhanced metals, but this is beyond the scope of this project. Although we are confident that the fitting methods used here can recover the distributions of stellar population ages and metallicities, the degeneracy between metallicity and [α/Fe] is harder to assess. However, in Peterken et al. (2019b), we found that removing the extra metallicity dimension appears to have little effect on the derived star-formation history.
This work has also assumed a single Calzetti et al. (2000) extinction model which affects every stellar population contained within a single spectrum equally. As we state in §4, we expect that younger stellar populations are instead likely to be affected by a greater amount of extinction, but we are unable to resolve this difference in non-parametric fitting using Starlight. How this deficiency affects the measured star-formation histories is not known.
Notwithstanding these shortcomings and assumptions used in the fitting process, the resulting star-formation histories tell a consistent story of inside-out formation in spiral galaxies with no noticeable artifacts, and the coherent structures visible in the time-slicing of galaxy 8329-12701 shown in Figure 1 give confidence in the fitting method for the purposes described here. The clear inside-out formation reported here might even be underestimated: Ibarra-Medel et al. (2019) have recently shown that any intrinsic signature of inside-out growth is diminished by the instrumental/observational setting and the stellar population modelling, mainly the age resolution of the SSP templates.

Effects of radial mixing and mergers
Time-slicing methods can only reveal the current locations of different stellar populations in a galaxy. In this work, we have interpreted these present-day distributions to be indicative of the radial distributions of star formation at the age of the stellar population, and make no attempt to correct for the effects of radial migration or mergers. Fortunately, simulations suggest that the radial distributions of stellar populations in a galactic disk are not significantly altered by radial migration (Avila-Reese et al. 2018;Navarro et al. 2018;Barros et al. 2020), indicating that the assumptions made here are at least approximately valid.
High-resolution simulations of Milky Way-like galaxies show that radial migration has no preferential direction, with most stars being scattered inwards or outwards by typically no more than 1-2 kpc (Avila-Reese et al. 2018). Stars are equally likely to move in either direction over their lives (Sellwood & Binney 2002; Avila-Reese et al. 2018), with observations implying that any observed growth resulting from migration occurs more slowly than the intrinsic growth of the disk (Martínez-Lombilla et al. 2019). Under radial migration, an initially centrally-concentrated distribution of stars is likely to become slightly less concentrated over time, an effect which is observed in the stellar metallicity distributions of the solar neighbourhood in the Milky Way (e.g. Feltzing et al. 2019; Frankel et al. 2018). A galaxy with a radial distribution of star formation that does not vary over time would therefore be measured by time-slicing methods to have slightly decreased in radius over the same time frame, since the oldest populations have had more time to disperse and would appear at larger radii. The measured variations in the spatial distribution of stellar populations of different ages in §8 are therefore likely to be a close lower limit on the true variation of the sizes of spiral galaxies over the same time period. Similarly, the recovered change in light size r l in §9 is a slightly conservative but representative estimate of how the galaxy evolved over the same time period.

CONCLUSIONS
We have derived spatially-resolved star formation histories for a sample of 795 low-redshift spiral galaxies using Starlight applied to integral-field spectroscopic observations from SDSS-IV MaNGA. From this fossil record analysis, we have built maps indicating the regions in which stellar populations of different ages are located in any given galaxy. We analysed the radial profiles of these "time slices" to extract the historical growth of the population of spiral galaxies. The main findings are:
• Using E-MILES single stellar population template spectra, the star formation histories measured by Starlight are unreliable for the youngest populations used in the fit (in this case those younger than 3 × 10^7 years). We found evidence that this is related to the UV upturn (Yi 2008) and a solution to this problem requires population models to include the presence of hot old stars (whatever their nature) in the oldest population templates. However, despite this degeneracy between the oldest and youngest template weights, the derived star-formation histories of the stellar populations older than 3 × 10^7 years are trustworthy.
• We have quantified evidence for inside-out galaxy growth in three different ways, which all indicate that such a growth mode is dominant in the majority of spiral galaxies, and is most significant in high-mass galaxies:
- The mass-weighted mean age gradient of spiral galaxies tends to be slightly negative; the outskirts are younger than the centres in ≈ 60% of all spiral galaxies. This fraction rises to 80% for galaxies with stellar mass M > 10^10.22 M⊙.
- By measuring a time T 95 by which 95% of the stellar mass had built up in each location of the galaxy, we find that T 95 decreases with radius in the majority of galaxies. Gradients in T 95 are steepest in the highest-mass galaxies.
- The concentration c of each "time slice" was found for each galaxy. The youngest stellar populations (younger than ≈ 10^8.5 years) are more radially extended than the oldest (≈ 10^10 years old) populations in all cases, and this effect is most significant in high-mass galaxies.
• By considering the simultaneous increase in stellar mass and the increase in light radius with the addition of ever-younger stellar populations, we found that the mass-size distribution of spiral galaxies evolves with very little change in rank; galaxies grow in mass and size at similar rates to other galaxies with similar masses and sizes. This suggests that a "like for like" approach when comparing the sizes and masses of distributions of galaxies at different redshifts is representative of how the individual galaxies themselves have evolved.
• We found that over the last 10 Gyr, galaxies with high present-day stellar masses have grown their half-light size by approximately twice the amount that low-mass galaxies have, although low-mass galaxies have grown slightly more in mass.
• However, when the half-mass radius of the galaxies was used instead, we found that spiral galaxies have barely altered their radial mass distributions over the same time period. Although galaxies appear to grow in (light) size over cosmic time, we show that this is an overestimate of their actual physical growth. This apparent discrepancy is due to a small amount of star formation occurring in the outskirts being able to dominate a galaxy's light while contributing very little to the physical bulk of the galaxy.

ACKNOWLEDGEMENTS
Funding for the Sloan Digital Sky Survey IV has been provided by the Alfred P. Sloan Foundation, the U.S. Department of Energy Office of Science, and the Participating Institutions. SDSS-IV acknowledges support and resources from the Center for High-Performance Computing at the University of Utah. The SDSS web site is www.sdss.org.

APPENDIX A
It is common to spatially bin neighbouring spaxels of integral-field spectroscopic data to create regions with a minimum signal-to-noise ratio (SNR) before fitting. However, since we wish to retain and measure the full spatial information of the stellar populations, and therefore fit each spectrum independently instead of binning, we require the Starlight fits in regions with low SNR to be reliable. Ge et al. (2018) showed that Starlight may exhibit bias in the fitting of spectra with low SNR, but Cid Fernandes (2018) contends that these effects are not significant in most physical applications and with the robust Starlight configuration used here. To assess this conclusion, we have performed a series of tests laid out here.
Figure A1. The mean age of spaxels in different signal-to-noise ratio bins (blue) in a spiral galaxy compared to the mean age of the spectrum of all spaxels combined (black squares). There is no significant bias in the average age measured by Starlight compared with varying SNR.

A1 Average fits of regions with low signal-to-noise
To test how the signal-to-noise ratio (SNR) of a spectrum in a MaNGA datacube may affect the fitting results from Starlight, we combined spectra of a single galaxy (plate-IFU 8329-12701) within different SNR bins to form a single integrated spectrum for each bin. In combining spectra from the MaNGA datacube, emission lines were removed and the spaxel spectra de-redshifted and interpolated onto a common wavelength base before summing. Each single spaxel's SNR was then defined as the median value over the fitting wavelength range of the ratio between the spaxel's flux spectrum and the reciprocal of the square root of the inverse variance spectrum (as measured by the MaNGA DRP).
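The per-spaxel SNR definition above amounts to the median of flux × √(inverse variance) over the fitted wavelengths, which can be sketched as below. The function name and the wavelength-limit parameters are hypothetical, and the masking of pixels with zero inverse variance is an added assumption.

```python
import numpy as np

def spaxel_snr(flux, ivar, wave, wave_min, wave_max):
    """Median signal-to-noise ratio of one spaxel over the fitting range.

    The noise is 1 / sqrt(ivar), so flux / noise = flux * sqrt(ivar).
    Pixels with ivar <= 0 (assumed to be masked) are ignored.
    """
    flux, ivar, wave = (np.asarray(a, dtype=float)
                        for a in (flux, ivar, wave))
    sel = (wave >= wave_min) & (wave <= wave_max) & (ivar > 0)
    return np.median(flux[sel] * np.sqrt(ivar[sel]))
```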
We chose to combine spaxel spectra in SNR bins of width 2, centred on every even value. A single spectrum was created by combining all spectra from spaxels with SNR between 3 and 5, another from spaxels with SNR between 5 and 7, etc., up to a spectrum comprising the sum of all spaxels with a SNR between 29 and 31. Each of the combined spectra has a signal-to-noise ratio greater than 60, and most are greater than ∼ 200. These combined spectra were then fit using Starlight with an identical configuration to that of the science fitting to see how their measured star-formation histories varied from the average of their constituent parts. In the absence of any systematic bias in Starlight, it would be expected that the average SFH measured in all individually fitted spaxels in a given SNR bin should be the same as the SFH measured from the average spectrum of those spaxels, regardless of the actual variation in SFH shape between spaxels in a given bin.
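The binning scheme described above can be sketched as follows, assuming the spaxel spectra have already been cleaned of emission lines, de-redshifted, and interpolated onto a common wavelength grid. The function name, the dictionary output, and the half-open bin edges are illustrative assumptions.

```python
import numpy as np

def stack_by_snr(spectra, snr, centres=range(4, 31, 2), half_width=1.0):
    """Sum spaxel spectra in SNR bins of width 2 centred on even values.

    spectra : (n_spaxel, n_pixel) array on a common wavelength grid
    snr     : (n_spaxel,) signal-to-noise ratio of each spaxel
    Returns {bin centre: summed spectrum} for every non-empty bin.
    """
    spectra = np.asarray(spectra, dtype=float)
    snr = np.asarray(snr, dtype=float)
    stacks = {}
    for centre in centres:
        # Half-open interval [centre - 1, centre + 1), e.g. 3 <= SNR < 5.
        sel = (snr >= centre - half_width) & (snr < centre + half_width)
        if sel.any():
            stacks[centre] = spectra[sel].sum(axis=0)
    return stacks
```

Each stacked spectrum would then be fit with the same configuration as the individual spaxels, so that the SFH of the stack can be compared with the mean SFH of its constituent spaxels.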
We find that the light-weighted mean age of the summed spectrum is always within ∼ 0.2 dex of the mean age of all individual spaxels (separately fitted) in a given SNR bin, indicating that signal-to-noise effects do not significantly bias the average results (see Figure A1). However, we find that the full star-formation histories of the summed spectra are most discrepant for the bins of larger SNR, as shown in Figure A2. This is likely due to small systematics (e.g. sky subtraction or flux calibration) dominating over random noise when summing spectra of already-high signal-to-noise ratios. In summing high-SNR spectra, the modest gain in combined SNR is outweighed by the increase in systematic errors when considering the fine detail required to measure a SFH. The fact that these effects have less significance in measuring the average properties highlights the level of extra complexity involved in measuring SFHs over mean ages.
Figure A2. The star-formation histories of spaxels (blue) in three different signal-to-noise ratio bins compared to the SFH of the spectrum of all spaxels combined (black). Starlight shows worse performance at higher signal-to-noise ratios. (Vertical axis: SFR / spaxel; recovered panel title: Signal-to-noise = 25-27.)

A2 Recovery of a single stellar population of known age
To further examine whether the above effects are due to systematics in the spectra rather than in Starlight, we tested how well Starlight is able to return the age and metallicity of a single stellar population with known parameters. We can create spectra representing single stellar populations of any age and metallicity by interpolating over the grid of E-MILES SSP template spectra. We produced spectra representing 200 ages and three metallicities using a bilinear interpolation in 2D log space of the 66 SSPs used in the fitting. We then degraded these spectra to signal-to-noise ratios of 5, 10, 15, and 20 by adding Gaussian noise, and also applied a dust extinction with $A_V = 0.2$ using a Calzetti et al. (2000) extinction curve. We blurred these 2400 individual spectra to the MaNGA LSF, and then applied Starlight using the same SSPs and Starlight configuration as described in the main text to compare how well the populations are recovered under different circumstances.

Figure A3. Distributions of the measured mass weights given by Starlight in fits of stellar populations of known age, for a signal-to-noise ratio of 5 and $A_V = 0.2$. Input spectra are interpolated from the grid of E-MILES SSPs at $Z = Z_\odot$. The recovered weight distributions are smoothed by 0.3 dex in age, and the green line indicates equality between input and output ages. The noise in the weights assigned to templates older than the input SSP age is due to the Starlight configuration used: specifically, we force 30% of all templates to be assigned a flux weight for consistency with the main text. This small amount of noise in flux weights is then amplified in the conversion to mass weights. (Axes: input age against output age, both in years, from $10^{7}$ to $10^{10}$.)
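The noise-degradation step described in this subsection can be sketched as follows. This is a minimal illustration, not the procedure used here: it assumes a single wavelength-independent Gaussian noise level set by the median flux, which is only one possible way to reach a target SNR, and the function name and seed argument are hypothetical.

```python
import random
import statistics

def degrade_to_snr(flux, target_snr, seed=0):
    """Add Gaussian noise so that median(flux) / sigma equals target_snr."""
    # Assumption: constant noise level across all wavelengths.
    sigma = statistics.median(flux) / target_snr
    rng = random.Random(seed)  # seeded so that the example is reproducible
    noisy = [f + rng.gauss(0.0, sigma) for f in flux]
    return noisy, sigma
```

Returning the noise level alongside the noisy spectrum allows a matching error (or inverse variance) spectrum to be supplied to the fitting code.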
The 0.3-dex temporally-smoothed distributions of mass weights measured from the Starlight output of these known SSPs are shown in Figure A3 for a signal-to-noise ratio of 5 and $Z = Z_\odot$. Results for a signal-to-noise ratio of 10, 15, or 20 were not noticeably improved, and providing an input metallicity of $Z = 10^{-0.625}\,Z_\odot$ or $Z = 10^{-1.25}\,Z_\odot$ did not change the results. Starlight is able to recover the mean age of input stellar populations of all ages older than ≈ $10^{8}$ years even with an input spectrum signal-to-noise ratio of 5, highlighting the diagnostic power of using such a long wavelength range to model the large-scale continuum in fitting.
Figure A3 also highlights a tendency for Starlight to include small levels of older populations. This is a consequence of the combination of a strong trend in mass-to-light ratio with stellar population age and the robust Starlight configuration used. As Starlight fits the input spectrum (i.e. in light space), we force it to assign a weight to at least 30% of all SSP spectra to ensure full SFH recovery in science cases. In the case of a single input stellar population, this will result in small spurious weights being given to other templates, and this noise becomes amplified in old populations when considering mass weights.
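The amplification of small light weights into larger mass weights can be illustrated with a short sketch. The mass-to-light ratios used in the usage note are hypothetical round numbers chosen only to show the effect; they are not taken from the E-MILES models.

```python
def light_to_mass_fractions(light_weights, mass_to_light):
    """Convert light (flux) weights to normalised mass fractions."""
    masses = [x * ml for x, ml in zip(light_weights, mass_to_light)]
    total = sum(masses)
    return [m / total for m in masses]
```

For example, with a young population carrying 99% of the light at a hypothetical mass-to-light ratio of 0.05, and a spurious 1% light weight on an old template with a hypothetical ratio of 2.0, `light_to_mass_fractions([0.99, 0.01], [0.05, 2.0])` assigns roughly 29% of the mass to the old template.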
It is unclear how much the boundary between the two template libraries contributes to Starlight's inability to recover stellar populations younger than ≈ $10^{7.5}$ years, or whether this is purely due to the lack of diagnostic spectral information at these ages.

A2.1 Effects of kinematics and dust
To test whether these results are improved when Starlight does not also have to model the kinematics of the input spectrum, we repeated these tests, but with the velocity and dispersion fixed to the known input values. The results were entirely unchanged.
We also performed these same tests with $A_V = 0$ and $A_V = 0.8$ (instead of $A_V = 0.2$ as used above) to check that Starlight is still able to recover populations in low- and high-extinction environments, and found the results to be unchanged here too.
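For reference, applying a Calzetti et al. (2000) curve at a chosen $A_V$ can be sketched as below. The coefficients and $R_V = 4.05$ are those of the published curve, which is defined between 0.12 and 2.2 μm; the function names and the choice to raise an error outside that range are assumptions of this example, not a description of the code used here.

```python
R_V = 4.05  # Calzetti et al. (2000) total-to-selective attenuation

def calzetti_k(wave_um):
    """Calzetti et al. (2000) k(lambda) for a wavelength in microns."""
    x = 1.0 / wave_um
    if 0.12 <= wave_um < 0.63:
        return 2.659 * (-2.156 + 1.509 * x - 0.198 * x**2 + 0.011 * x**3) + R_V
    if 0.63 <= wave_um <= 2.2:
        return 2.659 * (-1.857 + 1.040 * x) + R_V
    raise ValueError("wavelength outside the 0.12-2.2 micron validity range")

def attenuate(wave_angstrom, flux, a_v):
    """Dim a spectrum by A(lambda) = A_V * k(lambda) / R_V magnitudes."""
    return [
        f * 10 ** (-0.4 * a_v * calzetti_k(w * 1e-4) / R_V)
        for w, f in zip(wave_angstrom, flux)
    ]
```

Calling `attenuate` with `a_v` set to 0, 0.2, or 0.8 reproduces the three extinction cases tested above, with bluer wavelengths dimmed more strongly than redder ones.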

A3 Recovery of a known star-formation history
Finally, to simulate the effects of low SNR on a spectrum comprising multiple stellar populations (such as is found in real galaxies), we created spectra of three different star-formation histories (SFHs) using the template SSPs. The different SFHs reflect different cases:
A: a flat SFH, where the star-formation rate is defined as $\mathrm{SFR}(t) = 0.1\,M_\odot\,\mathrm{yr}^{-1}$ for all lookback times $t$;
B: a peaked and then declining SFH (as seen in many galaxies) with a star-formation rate represented by $\mathrm{SFR}(t) = 0.2 + N^{1.5}_{9.2}(\log(t))\,M_\odot\,\mathrm{yr}^{-1}$, where $N^{\sigma}_{\mu}(x)$ denotes a Gaussian function of $x$ centred on $x = \mu$ with standard deviation of $\sigma$ dex;
C: a declining and rejuvenating SFH with a star-formation rate represented by $\mathrm{SFR}(t) = 1.2 - N^{0.8}_{8.1}(\log(t))\,M_\odot\,\mathrm{yr}^{-1}$.
In building these SFHs, we assign weights to each SSP assuming that they represent all star formation between their nominal age and the next-youngest SSP. We also included two different metallicity distributions, neither of which varies with stellar population age:
X: a flat distribution over the range of metallicities in the E-MILES templates, where the relative flux of each SSP of any given age is defined by $F_i = 1$;
Y: a peaked distribution where the relative flux of each SSP is defined at each age $t$ by $F_i = N^{0.3}_{-0.71}(\log(Z_i/Z_\odot))$.
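The construction of the three input SFHs and the corresponding SSP weights can be sketched as follows. Several choices here are assumptions rather than details given above: the Gaussian is taken to have unit area, the logarithm is taken to be base 10 with $t$ in years, the youngest SSP is taken to cover all star formation back to $t = 0$, and the star-formation rate is evaluated at each SSP's nominal age rather than integrated over the interval.

```python
import math

def gaussian(x, mu, sigma):
    """Gaussian centred on mu with standard deviation sigma (unit area assumed)."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def sfr_a(t):
    """SFH A: flat."""
    return 0.1

def sfr_b(t):
    """SFH B: peaked and then declining."""
    return 0.2 + gaussian(math.log10(t), 9.2, 1.5)

def sfr_c(t):
    """SFH C: declining and rejuvenating."""
    return 1.2 - gaussian(math.log10(t), 8.1, 0.8)

def ssp_mass_weights(ages, sfr):
    """Mass formed between each SSP's nominal age and the next-youngest SSP."""
    weights = []
    previous = 0.0  # assumption: the youngest SSP covers lookback times from 0
    for age in sorted(ages):
        weights.append(sfr(age) * (age - previous))  # rectangle-rule estimate
        previous = age
    return weights
```

For the flat case, `ssp_mass_weights([1e7, 1e8, 1e9], sfr_a)` gives weights proportional to the width of each age interval, so that logarithmically spaced old templates carry most of the mass.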
These spectra were then degraded to different SNRs of S/N = 3, 5, 7, 10, 15, 20, and 30. When each of these 42 spectra of known SFHs was fit using Starlight in the same configuration used in the main text, we found that the general shape of $\mathrm{SFR}(t)$ is recovered in all cases, as shown in Figure A4. For an SNR greater than 5 (the lowest typically found in the outskirts of MaNGA galaxies in the Primary or Primary+ samples; see Yan et al. 2016b), the derived SFHs show very good agreement with the input SFHs, with a value of $\mathrm{SFR}(t)$ within 0.15 dex of the known value at all lookback times $t$. This is also shown in Figure A5, where a simple $\chi^2$ goodness-of-fit measurement is obtained between the input SFH and each of the recovered SFHs. Starlight is able to recover SFHs A and B for all cases with S/N ≥ 10, and while the measurement of SFH C is worse, it is recovered equally well for S/N ≥ 7 in metallicity distribution X and S/N ≥ 15 for metallicity distribution Y.
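The two comparisons quoted above can be sketched as follows. The exact form of the $\chi^2$ statistic is not specified beyond being "simple", so the unweighted sum of squared residuals used here is an assumption, as are the function names.

```python
import math

def max_dex_offset(sfr_input, sfr_recovered):
    """Largest absolute logarithmic offset (in dex) between two SFHs."""
    return max(
        abs(math.log10(out / inp))
        for inp, out in zip(sfr_input, sfr_recovered)
    )

def simple_chi2(sfr_input, sfr_recovered):
    """Unweighted sum of squared residuals (assumed form of the statistic)."""
    return sum((out - inp) ** 2 for inp, out in zip(sfr_input, sfr_recovered))
```

A recovered SFH would satisfy the agreement criterion quoted above when `max_dex_offset` returns a value below 0.15, with both SFHs sampled at the same lookback times.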

A4 Implications
We assume throughout this work that the E-MILES model spectra are accurate representations of the stellar populations they represent. A full test of whether this is indeed the case (as conducted by e.g. Ge et al. 2019) is beyond the scope of this study, but the tests shown here imply that if this is true, we expect Starlight to be able to recover the true SFHs under all the conditions analysed in this work. In fact, we find evidence that fitting spaxels individually, rather than summing spectra from neighbouring spaxels, may be the most robust approach, since it prevents systematics from dominating and compromising the ability to measure a SFH. Notwithstanding this robustness, to guard against the low signal-to-noise regions of the galaxy affecting our results in ways we do not anticipate, we weight spaxels by their flux or mass at all stages of the science analysis shown in this work, so that the central regions (with higher SNR and therefore probably good fits) are emphasised and low signal-to-noise regions are down-weighted.

Figure A4. The measured SFHs for different input spectrum signal-to-noise ratios (coloured lines) compared to the input SFH shape (black line) for each SFH (left to right) and metallicity distribution (top and bottom). Recovered SFHs are smoothed by 0.3 dex in age. The red-shaded region indicates that for which SSP weights are ignored in science cases (see main text). The general shape is recovered well in all cases, particularly for signal-to-noise ratios greater than 5. (Horizontal axis: age in yr, from $10^{7}$ to $10^{10}$.)

Figure A5. A $\chi^2$ goodness-of-fit measurement of the input to the measured SFH in Figure A4, for each of the three SFH shapes (line styles) and metallicity distributions (top and bottom) for different input signal-to-noise ratios. Increasing S/N above 7 does very little to improve Starlight's ability to recover the SFH.