The functional brain networks that underlie Early Stone Age tool manufacture

After 800,000 years of making simple Oldowan tools, early humans began manufacturing Acheulian handaxes around 1.75 million years ago. This advance is hypothesized to reflect an evolutionary change in hominin cognition and language abilities. We used a neuroarchaeology approach to investigate this hypothesis, recording brain activity using functional near-infrared spectroscopy as modern human participants learned to make Oldowan and Acheulian stone tools in either a verbal or nonverbal training context. Here we show that Acheulian tool production requires the integration of visual, auditory and sensorimotor information in the middle and superior temporal cortex, the guidance of visual working memory representations in the ventral precentral gyrus, and higher-order action planning via the supplementary motor area, activating a brain network that is also involved in modern piano playing. The right analogue to Broca's area, which has linked tool manufacture and language in prior work 1,2 , was only engaged during verbal training. Acheulian toolmaking, therefore, may have more evolutionary ties to playing Mozart than quoting Shakespeare.

The advent of Acheulian stone-tool technologies 1.75 million years ago is likely to have coincided with changes in early human cognition. Using functional near-infrared spectroscopy neuroimaging, modern Acheulian toolmakers are shown to use the same brain network as is involved in playing the piano.

Given this, the functional brain activity of modern humans as they reproduce the stone toolmaking behaviours of extinct hominins can shed light on the functional brain activity of these first hominin toolmakers. This is, of course, an inexact science. For instance, we cannot know to what extent the cognitive operations of modern humans resemble those of early humans, nor can we pinpoint the effect of modern culture and formal education on the cognitive operations of modern humans during toolmaking tasks. We can assume, however, that extinct hominins, producing the same tool types and using the same operational sequence as modern humans, probably possessed at least the minimum cognitive operations that modern humans use to complete the task 10 . Thus, the functional brain activity of modern humans can tentatively be used to infer the functional brain activity of earlier human species. Here, we examine functional brain activity as modern humans learned to make Early Stone Age (Oldowan and Acheulian) tools to shed light on the brain networks and cognitive skills that were needed to complete these tasks (Fig. 1a-c).
Acheulian stone-tool manufacture is hypothesized to require more cognitive control and working memory than Oldowan tool manufacture 11 . This is because shaping a stone into a handaxe and maintaining a sharp edge along the entire piece (see Fig. 1d) requires the toolmaker to proceed through a series of complex action sequences that have an ambiguous goal hierarchy 12,13 . Nevertheless, activation of working memory neural circuits has been purportedly absent during replicative Oldowan and Acheulian tool production experiments 1,14 (but see Supplementary Fig. 1 and Supplementary Discussion). This may reflect the challenges of using neuroimaging techniques, like functional magnetic resonance imaging (fMRI), to capture real-time brain activity during the act of making stone tools. For example, two fMRI studies have attempted to simulate tool production by having participants observe videos of the toolmaking process 11,15 , rather than actually knapping. These studies might have underestimated the role of working memory circuits because participants did not have to hold complex action sequences actively in mind during the imaging task.
Researchers have also hypothesized that a special co-evolutionary relationship exists between toolmaking and language, because the earliest stage of stone toolmaking skill transmission appears to improve with verbal instructions 16 . Also, studies using positron emission tomography (PET), fMRI and functional transcranial Doppler ultrasonography have revealed that both behaviours activate overlapping brain regions and present similar cerebral blood flow lateralization signatures 1,14,15,17 . This suggests that language may have piggy-backed on the motor and hierarchical processing subserved by the ventral precentral and inferior frontal gyrus (IFG), brain areas critically involved in Acheulian tool manufacture 2 . Because the learning context was not carefully controlled in these neuroimaging studies, however, this overlap could be the product of learning to knap by receiving verbal instructions from an interactive teacher. It is possible, for instance, that participants in these studies relied upon internal speech, recalled verbally delivered instructions, or enlisted specific language-based behavioural strategies because they learned with language instruction 18 . This may not mimic the learning context that existed during the Pleistocene epoch, when hominins probably did not possess modern language or the cognitive elements required for interactive teaching.
In the present study, we tested these hypotheses by examining the brain networks that underlie Early Stone Age toolmaking. Firstly, we used image-based functional near-infrared spectroscopy (fNIRS), a cutting-edge neuroimaging technique that measures changes in oxygenated and deoxygenated haemoglobin (oxy-Hb, deoxy-Hb) in the cortex. This approach produces reconstructed images of localized functional brain activity that can be directly compared to fMRI results 19,20 . Because fNIRS is less influenced by motion artefacts than fMRI, it was possible to use fNIRS to measure real-time, localized cortical activity as people made Oldowan and Acheulian tools. We predicted that fNIRS would detect a relative increase in the activation of brain areas involved in cognitive control and working memory during Acheulian tool production when contrasted with Oldowan tool production.
Secondly, we carefully controlled the learning context. We taught 31 participants to make both types of tools across seven learning sessions (Fig. 1e). During individual training sessions, fifteen of the participants learned to knap stone through verbal instruction by watching videos of a skilled knapper's actions as he demonstrated and explained how to knap (his face was not visible); sixteen of the participants learned to knap through nonverbal instruction using the same instructional videos, but with the sound turned off. Brain activity was measured while participants completed a motor baseline task that involved striking two rocks together without attempting to make flakes, as well as during an Oldowan task and an Acheulian task. We predicted that the two learning groups would show different neural activation patterns, with selectively greater activation in language-specific brain areas, including the right IFG, in the verbal instruction condition.
A two-way analysis of variance with task (Oldowan, Acheulian) and group (verbal, nonverbal) as factors (see Supplementary Discussion) replicated the Acheulian-biased activation in the left ventral precentral gyrus (PrG) from previous PET research 1 (Fig. 2a). This area forms part of the visual working memory (VWM) network 19 (see overlap between dark green and red in Fig. 2a). Working memory is not a uniquely human feature, but modern humans have been argued to possess an 'enhanced working memory' that did not evolve until the Late Pleistocene epoch, much later than the onset of Acheulian tool manufacture 21 . Our findings suggest that even stone tool industries as ancient as the early Acheulian required working memory.
The analysis also revealed novel areas of activation associated with Acheulian toolmaking, including middle and superior temporal areas (Fig. 2b, c), as well as the supplementary motor area (SMA) (Fig. 2d). The temporal areas are involved in complex sound processing, auditory short-term memory and the integration of visual, auditory and sensorimotor information in relation to tool use 22-24 . The SMA forms the cognitive control centre of a medial premotor system, the function of which is to plan complex action sequences, especially those requiring bimanual coordination 25 . The superior temporal gyrus, middle temporal gyrus and SMA are connected by white fibre tracts that coalesce at the insular cortex 26 , which plays a notable role in guiding behaviour through attentional modulation 27 . Although the insular cortex lies too deep for fNIRS to record changes in blood oxygenation there, this area has been implicated in stone-tool production in previous work 15 .
Acheulian toolmaking depends on the execution of a skilled striking platform set-up to plan the direction, shape and size of a series of flakes that will effectively thin and shape the piece 28,29 . The activation of bilateral temporal areas during the Acheulian task may signify that participants were holding the varying sounds of impact in mind to judge whether a platform was successfully prepared for the removal of a flake. The ability to plan and execute a flexible sequence of actions to make a handaxe could be accomplished by integrating the working memory component of the left ventral PrG with the complex motor planning of the SMA and the auditory feedback and multimodal processing of the superior temporal gyrus and middle temporal gyrus via the insular circuit. Notably, this cognitive network is nearly identical to one that is active when trained pianists play the piano 30 , consistent with our proposal that this network is essential for audiomotor integration. The relatively weak Oldowan activation in this network is also informative. In the Oldowan task, each strike is an independent event that attempts to create a flake with a sharp edge; there is little need to actively hold a long chain of actions in mind to meet the overarching goal of the task.
Fig. 1. The lithic reduction processes of early Homo (a) were replicated by 31 modern human subjects while we used functional near-infrared spectroscopy (b) to record regional brain activity from portions of the frontal, parietal, and temporal cortices of the brain (c). d,e, Both Oldowan (d,e, left) and Acheulian (d,e, right) tools from the archaeological record (d) were reproduced by the participants in the study (e). Image in a reproduced with permission from Mark Boulton/Alamy Stock Photo.

The ANOVA showed four clusters where the instruction context had an effect on cortical activation during the toolmaking tasks (Supplementary Table 2). Post hoc tests identified two areas where the Acheulian task significantly varied by group. A large cluster that includes the right temporal pole and pars orbitalis was activated in the nonverbal group and suppressed in the verbal group (Mann-Whitney U = 55.0; P = 0.009; Fig. 2e). The right temporal pole is a multimodal association cortex involved with semantic processing 31 and has strong connections to pars orbitalis and the insula 32 . The right orbital portion of the prefrontal cortex is known to be involved in decision-making and reward-related feedback 33 . This may indicate that the nonverbal group relied more extensively on auditory and visuo-spatial feedback while planning actions related to handaxe production. Post-experiment interviews support this claim. Only participants in the nonverbal group emphasized sound and tactile sensation as important to their thought process while knapping. Their descriptions also mentioned visuo-spatial imagery more often than descriptions produced by the verbal group. The second cluster, pars triangularis of the right IFG, had significantly higher activation in the verbal group than the nonverbal group during the Acheulian task (Mann-Whitney U = 198.0; P = 0.001; Fig. 2f).
This right hemisphere analogue to Broca's area participates in language functions, such as syntactic and sentence processing, especially in relation to context 34 , as well as some nonlanguage functions, such as response inhibition 35 . This suggests that participants who received verbal instruction may have engaged in inner speech during the Acheulian task, which is supported by post-experiment interviews (Supplementary Fig. 2). Notably, this cluster overlapped with the IFG cluster from previous work that led to the conclusion that language may have co-opted the neural circuits involved in toolmaking 1 (see yellow region in Fig. 2f). If language evolved by co-opting the motor areas of the brain that were used first for Early Stone Age tool manufacture, then we should observe activation of the right IFG in both groups as a result of the complex knapping task. Because this area shows elevated activation only among the verbal group participants, language instruction in the modern learning context appears to be responsible for right IFG activation in this and previous studies. Caution is urged, therefore, when interpreting results of neuroarchaeological studies that do not control for spoken language in the learning context.
Unique cortical areas recruited during the Oldowan task include the hand representation portions of the primary sensorimotor cortex in both hemispheres (Fig. 3a, b). This suggests the involvement of a lateral premotor system, which is dependent on external visual input to recognize and assign significance to external objects 25 . This is unsurprising, as the only goal of the Oldowan task is to visually identify ideal platforms and remove flakes until the core is exhausted. An evaluation of the video footage captured during the experiment and participant responses during an exit interview reveal that the absence of activation in these hand areas during the Acheulian task might have resulted from participants using the leg rather than the hand as a support for the core. Participants also took their time to evaluate progress more often during the Acheulian task than during the Oldowan task, which could have resulted in less activation in the hand motor areas.
The Oldowan task also appears to come under increased cognitive control when it has been learned in the absence of verbal instruction (Fig. 3c). For example, it is only in the nonverbal group that the left middle frontal gyrus (MFG), or frontal eye field, is activated (Mann-Whitney U = 33.0; P < 0.001). This area, which was also activated in a previous study 1 (see yellow cluster in Fig. 3c), forms part of the dorsal visual attention network 36 . That this network is recruited only in the nonverbal condition suggests that learning to produce simple flakes without language requires increased attention to visuo-spatial demands.

NATURE HUMAN BEHAVIOUR
When learned verbally, Oldowan tool production elicits activity in the left dorsal PrG (Fig. 3d), an area that also is activated when passively reading action words related to the arm 37 .
Considered together, our findings suggest that Oldowan tool manufacture relies on the coordination of visual attention and motor control to successfully remove simple flakes. It would not be surprising to find that a homologous cognitive network is active in wild chimpanzees when they skilfully crack nuts with stone tools 38 , or even in capuchin monkeys when they strike two stones together, which can sometimes lead to unintentional flakes similar to those made by early hominins 39 . In summary, results of this experiment point to cognitive abilities that were more ape-like than human-like among hominin toolmakers prior to 1.8 Ma.
Acheulian tool manufacture, in addition, requires the integration of higher-order motor planning, working memory and auditory feedback mechanisms to attend to information from multiple modalities as the toolmaker coordinates the different goals required by this more complex task. We propose that, like the processing of an auditory speech stream, Acheulian knapping requires the knapper to discriminate between knapping sounds and to assign meaning to those sounds based on how they relate to the hierarchy of goals involved in making a handaxe (for example, how does this strike and its associated sound get me closer to setting up an ideal platform to remove a flake that will be long and thin enough to remove this nearby convexity; how does this strike and its associated sound relate to the overall shape of the handaxe that I am trying to achieve). Thus, the knapping of Acheulian tools may have played a role in fine-tuning this function in the superior temporal gyrus, perhaps facilitating the evolution of neural connections involved in speech perception. Interestingly, the Acheulian technocomplex coincides in timing with the evolution of a derived middle ear anatomy in Homo that was more attuned to human speech frequencies 40,41 . Together, fossil and neuroarchaeological evidence now show that a major shift in hominin auditory processing occurred after Homo diverged from Australopithecus and Paranthropus and before the appearance of H. heidelbergensis.
The adoption of the Acheulian toolkit by early Homo also coincides in time with a more unpredictable environment, an increase in brain and body size, and a more diverse diet that relied upon tool-assisted hunting and foraging of large game animals and tough, fibrous plant products 42 . As reliable food items became scarcer in this unpredictable environment, those individuals who were capable of holding multiple modes of information in mind to guide and coordinate their motor behaviours probably experienced higher reproductive success because of their enhanced ability to produce complex tools. We speculate that this ability allowed these individuals and their offspring greater access to a diverse set of food resources.
Our findings do not neatly overlap with prior claims of a technological origin for language. There is more support for a working memory hypothesis, as the VWM plays an active role in the network identified here that today allows modern humans to perform such behaviours as skilfully playing a musical instrument. Our data suggest that this cognitive network was probably necessary for early Homo to make Acheulian handaxes and might also have been important for other learned, complex behaviours. Additionally, a larger working memory capacity may have led to more complex imitative abilities, as has been suggested previously 43 . We propose that selection for this integrated, multimodal network around 1.8 Ma in response to an unpredictable environment marked a turning point in the evolution of the hominin brain, leading to the expansion of prefrontal and temporal cortices 3 , a more complex cognitive toolkit, and the evolution of a new species of Homo.

Methods
Experimental design, participants and procedure. An a priori power analysis was performed for sample size determination based on data from a pilot study, comparing verbal with nonverbal instruction. Beta values from fifteen 20-s intervals of knapping were extracted from a channel that overlies anterior Broca's area. The effect size (Cohen's d = 1.13) was considered to be large using Cohen's criteria 44 . With α = 0.05 and power = 0.80, the projected sample size needed with this effect size is approximately 14 subjects per group 45 .
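The sample-size projection above can be approximated with a short calculation. The following is a minimal sketch, assuming a two-sided, independent two-sample t-test; the text does not state which test or software produced the estimate, so the function name `n_per_group` and the small-sample correction term are illustrative choices rather than the authors' procedure.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-sided two-sample t-test.

    Returns the plain normal-approximation value and a value with a common
    small-sample correction (adding z_alpha**2 / 4). Both are approximations.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value, two-sided
    z_beta = NormalDist().inv_cdf(power)           # quantile for target power
    n_normal = 2 * (z_alpha + z_beta) ** 2 / d ** 2
    n_corrected = n_normal + z_alpha ** 2 / 4
    return n_normal, n_corrected


# Effect size, alpha and power as reported in the text.
n_normal, n_corrected = n_per_group(d=1.13, alpha=0.05, power=0.80)
print(ceil(n_normal), ceil(n_corrected))
```

Under these assumptions the corrected estimate rounds up to 14 participants per group, in line with the reported figure; an exact calculation based on the noncentral t distribution may differ slightly.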
Participants were recruited for the study through posted flyers that advertised for individuals interested in learning to make stone tools. Anyone interested in participating in the study received an online questionnaire that determined their eligibility to participate. They were screened for knapping experience, handedness, neurological, psychiatric and physical handicaps, and drug use. Only individuals with no prior experience making stone tools were asked to participate. Because of  46 , the subjects were tested using the Benton Neuropsychology Clinic Handedness test during the screening process to determine their laterality quotient 47 . Only subjects who fell within the range of +75 to +100 points, or extreme right-handedness, were included in the experiment.
After positively demonstrating right-hand dominance and consenting to participate, subjects were asked about their psychiatric and neurologic history. Individuals who had experienced traumatic brain injury (including stroke, anoxia and hypoxia, brain tumour, infections of the brain and so on), loss of consciousness, a history of seizures or severe learning disability were not included. Individuals with serious psychiatric disorders, such as autism, were excluded from the study. Additionally, the Drug Abuse Screen Test (DAST-10) was included to quantify the degree of drug abuse problems of potential subjects 48 . Individuals with a recent history of drug abuse show impairments in cognitive tasks 49 . Only individuals who received a score of 2 or lower were permitted to participate. The study was approved by the IRB and Human Subjects Office at the University of Iowa (IRB ID: 201304789), and all subjects signed an informed consent document before participating.
Participants were divided into two groups based on their performance during a manual dexterity test so that dexterity levels were equally distributed across groups.
One group received verbal instruction while learning how to knap stone (n = 15; 8 females, 7 males), and the other group received nonverbal instruction only (n = 16; 8 females, 8 males). Manual dexterity was measured using the Minnesota Manual Dexterity Test (MMDT). This test assesses the manual dexterity required to place sixty round pegs with the dominant hand in specific places on a board 50 . While it is often used by physical and occupational therapists to determine baseline progress data from an injured patient, the MMDT has also proven to be a reliable and valid method for obtaining measures of manual dexterity in healthy adults 50,51 . For the final sample of included participants, the nonverbal group averaged 182.4 ± 17.5 s to place all sixty pegs in the holes on the board in three iterations, whereas the verbal group averaged 182.7 ± 16.9 s. There was no significant difference in dexterity between the two groups on the basis of this assignment (t = 0.06, P = 0.95). Males, who averaged 181.4 ± 14.2 s, and females, who averaged 183.6 ± 19.5 s, also did not significantly differ from each other in their dexterity scores (t = − 0.34, P = 0.74).
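The group comparisons above can be checked roughly from the reported summary statistics. This is a minimal sketch, assuming that the ± values are standard deviations and that a pooled-variance (Student's) t-test was used; neither is stated in the text, and the helper `pooled_t` is a hypothetical name. Because the inputs are rounded, the results only approximate the reported t values.

```python
from math import sqrt


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's t statistic and degrees of freedom from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))  # standard error of the difference
    return (mean1 - mean2) / se, df


# Verbal (n = 15) versus nonverbal (n = 16) completion times in seconds.
t_group, df_group = pooled_t(182.7, 16.9, 15, 182.4, 17.5, 16)

# Males (n = 15) versus females (n = 16) completion times in seconds.
t_sex, df_sex = pooled_t(181.4, 14.2, 15, 183.6, 19.5, 16)

print(round(t_group, 2), round(t_sex, 2), df_group)
```

Both statistics are small in magnitude relative to any conventional critical value at 29 degrees of freedom, which agrees with the reported absence of a significant dexterity difference between groups or sexes.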
After screening and group assignment, participants attended their first practice session. One participant dropped out of the study halfway through this first session. Four additional participants were withdrawn after their first neuroimaging session because they had dark or thick hair that interfered with our ability to obtain high-quality fNIRS signals. Finally, two subjects withdrew from the study before the final neuroimaging session for personal reasons. The final sample had 31 participants (nonverbal, n = 16; verbal, n = 15; n = 16 females; n = 15 males; age = 24.0 ± 8.1 years (mean ± s.d.)) who completed the entirety of the experiment.
The participants individually attended seven 60-min knapping practice sessions, during which they learned how to knap stone tools by watching instructional videos. We chose video instruction rather than in-person instruction to ensure that every subject received the exact same instructions at the same rate and also to control for interactive teaching, as there is currently not enough evidence to confirm that early Homo was capable of interactive teaching. The videos featured an expert knapper with over 12 years of experience. His face was not visible in the frame, although his hands, lap and torso were visible. This prevented the nonverbal group from picking up on any verbal cues that were communicated by the face. Both groups watched the same instruction videos; however, the nonverbal group watched a silent version. Each practice session proceeded in the following order: (1) a 10-min instruction video; (2) 20 min of practice; (3) the same 10-min instruction video; and (4) 20 min more to practice. Subjects were not able to manipulate the video in any way, for example, by pausing it. All the debitage created while knapping fell on a large tarpaulin mat. After the participants completed a core or core tool and were ready to move on to another rock, the core/core tool and its corresponding debitage were collected, bagged and labelled with the rock number and other pertinent information for further analysis.
Each practice session introduced a new goal for them to meet, or reviewed and refined skills already introduced. The skills and tool types learned during practice sessions 1 and 2 were comparable to the skills and tool types of Oldowan simple tool production. This is a quick and expedient method of obtaining a sharp flake to use as a tool 52 . They learned how to recognize ideal striking angles on the raw material and tried to create flakes. They continued to practice making expedient flakes during the second practice session. The second video taught them how to recognize the best raw material for flaking. Subjects learned which materials fracture easily by trial and error. This was also communicated verbally to the verbal group. Practice sessions 3-7 introduced and reviewed skills involved in the production of the early Acheulian technocomplex, which involves a more efficient removal of flakes and the intentional shaping of a large cutting tool 53 . The third practice session video featured alternate flaking around a square edge as the main goal for this session, which is an important skill for making bifaces. The instruction video for the fourth practice session introduced core bifaces and the instructor in the video demonstrated biface manufacture at a very slow rate. In the fifth practice session, the video began to focus more on primary thinning of a piece to remove large convexities. The sixth instruction video presented information on how to shape and refine a biface by trimming. Finally, the subjects were presented with an instruction video during the seventh practice session that focused on the entire process of bifacial reduction so that they could continue to practice the skills they learned from previous sessions.
For all practice and neuroimaging sessions, subjects were required to wear safety goggles, leather work gloves and lap pads. They were also given the choice to wear a facemask to block out small particles of airborne silicates.
In addition to the training sessions, participants attended three 90-min neuroimaging sessions after the first, fourth and seventh training sessions, during which they were video recorded and brain activity was observed using the TechEn CW6 system. They sat in a small room surrounded by black curtains. The experiment program was designed with EPrime software. The presentation of stimuli was synchronized with the CW6 system. Set-up involved measuring the participant's head to ensure the proper cap size and measuring 10-20 landmarks to ensure proper cap placement on the head. Hair was cleared at each optode site. The 10-20 landmarks and positions of the sources and detectors on the head were then digitized.
Each imaging session consisted of (1) a motor baseline task made up of nine 40-s blocks of activity segregated by 20-s rest periods to observe activation of motor-related brain areas while striking rocks together without the added element of actual knapping; (2) an Oldowan toolmaking task that was segregated into five 1-min blocks of activity with 15-s resting periods in between each block; and (3) an Acheulian toolmaking task segregated into fifteen 1-min blocks, separated by 15-s rest periods. The order of the tasks was not randomized during each imaging session nor was the length of resting periods; therefore, there is some possibility that habituation effects affected our results. These limitations should be addressed in future studies.
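The block structure of the three tasks can be laid out as a simple timing table. The sketch below assumes that each task starts with an activity block and that rest periods occur only between blocks; the text does not specify whether a rest period precedes the first block or follows the last one, and the names `block_onsets` and `tasks` are illustrative.

```python
def block_onsets(n_blocks, block_s, rest_s):
    """Return (onset, offset) times in seconds for each activity block."""
    period = block_s + rest_s
    return [(i * period, i * period + block_s) for i in range(n_blocks)]


# (number of blocks, block length in s, rest length in s), as described above.
tasks = {
    "motor_baseline": (9, 40, 20),
    "oldowan": (5, 60, 15),
    "acheulian": (15, 60, 15),
}

schedule = {name: block_onsets(*spec) for name, spec in tasks.items()}

# Total active time, and time from first onset to last offset, per task.
active_s = {name: n * block for name, (n, block, _rest) in tasks.items()}
span_s = {name: blocks[-1][1] for name, blocks in schedule.items()}

for name in tasks:
    print(name, active_s[name], span_s[name])
```

Onset tables of this kind are what a block-design regressor would typically be built from; the unequal block counts across tasks are visible directly in `active_s`.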
To eliminate the possibility of linguistic contamination, the experiment was designed so that all instructions were given through a silent video with timing of events indicated by different tones, and subjects were instructed to not talk during the experiment. Subjects were told at the beginning of the experiment to perform the same activity that they viewed in the instruction videos, which preceded each new task or event. Instructions also included training on the meanings of different tones they would hear throughout the session that would signal whether to stop or start an action. Only data from the final neuroimaging session are included here, because this was the first point when more than 90% of the surveyed participants were able to identify the different goals of the Oldowan and Acheulian tasks.
At each practice and neuroimaging session, subjects were presented with three or four local, granitic rocks of varying sizes that were naturally rounded for use as hammerstones. A goal of the training was to introduce the subjects to different qualities, shapes and types of rock to fracture so that they would learn to select the blank of highest quality and the most workable edges from the three choices that they were always provided. Thus, a variety of unheated cherts from the Midwestern United States, Texas, and California were obtained from collectors in Missouri and Texas, although most of the material was Burlington chert, a fine-to mediumgrained stone that is easy to flake 54 . Prior to being made available for the subjects to knap, each stone was assigned a unique, identifying label, weighed on a digital scale and assigned a measurement of volume by the water displacement method. Spalls and cobbles ranged between 69.6 and 3,000.0 g in mass (mean = 676.8 g) and had a volume between 20 and 1,200 cm 3 (mean = 284.3 cm 3 ). Generally, smaller pre-made spalls of chert with edges of very acute angles were provided in the first two practice sessions. By the third and fourth practice sessions, the participants could choose from medium-sized spalls without cortex that had edges with more difficult angles, as well as rounded cobbles with cortex but with one or more flakes already removed to help them get started. A mix of small-to medium-sized spalls and cobbles were available to choose from for the Oldowan task during the neuroimaging sessions. Larger, more challenging pieces, many with square edges, were provided for the fifth, sixth and seventh practice sessions and the Acheulian task during the neuroimaging sessions.
Behavioural data acquisition and processing. A key issue when comparing different groups in neuroimaging studies that measure changes over learning is that participants might learn at different rates depending upon their group assignment. To examine this possibility, digital callipers were used to take measurements on cores and flake debris from both knapping tasks during the final neuroimaging session to determine whether one of the learning groups produced stone tools with greater skill than the other group (see Supplementary Discussion). All core and debitage pieces were collected after the completion of each finished core during the neuroimaging session. Any debitage that passed through a 6.35 mm screen was discarded. The remaining pieces were labelled and measured. Each piece was weighed to the nearest tenth of a gram and allocated to a metric size category continuum as defined by the smallest of a series of nested squares on centimetre graph paper into which the piece would completely fit (that is, 1 cm 2 , 2 cm 2 , 3 cm 2 ,..., and so on). The maximum thickness was recorded for each piece. All non-core debitage was coded as a flake (either complete, proximal or distal) or nonflake debitage shatter 55 . Any flakes with an intact striking platform underwent measurements for the maximum platform width and thickness.
These measurements were applied to a total of 5,757 debitage pieces that correspond to 72 cores, which were reduced by 30 of the participants in the study LETTERS NATURE HUMAN BEHAVIOUR (debitage output from the final neuroimaging session for one participant was not available for analysis). Relative knapping skill as determined by the debitage was measured using the following variables. The first set of variables that was measured corresponded to flake and platform shape. Platform shape, determined by the ratio of maximum platform width to platform thickness, is a common method used to measure knapping skill 18,28,56 , as platform shape contributes to the size and shape of the overall flake. The ratio of flake size to flake mass was also included to determine flake shape differences between the groups 18,56 . A larger ratio in both cases indicates a flake that is both relatively thin and elongated, which demonstrates the knapper's ability to remove desired flake tools in the case of the Oldowan task and long, thinning flakes for shaping the core tool in the case of the Acheulian task. We calculated the relative platform area ((platform width × platform thickness)/flake size) with the expectation that knappers of a higher skill level would produce smaller, thinner platforms relative to the size of the rest of the flake 28 .
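The flake and platform shape variables described above reduce to three ratios per flake. The following is a minimal sketch of how they might be computed; the `Flake` class, its field names and the measurement units (millimetres for platforms, the nested-square category for flake size) are assumptions made for illustration, and the example values are invented rather than taken from the study.

```python
from dataclasses import dataclass


@dataclass
class Flake:
    size: float                # size category (assumed: nested-square value)
    mass_g: float              # mass in grams
    platform_width: float      # maximum platform width (assumed mm)
    platform_thickness: float  # maximum platform thickness (assumed mm)

    @property
    def platform_shape(self) -> float:
        """Ratio of maximum platform width to platform thickness."""
        return self.platform_width / self.platform_thickness

    @property
    def size_to_mass(self) -> float:
        """Ratio of flake size to flake mass."""
        return self.size / self.mass_g

    @property
    def relative_platform_area(self) -> float:
        """(platform width x platform thickness) / flake size."""
        return self.platform_width * self.platform_thickness / self.size


# Hypothetical flake, for illustration only.
example = Flake(size=4.0, mass_g=8.0, platform_width=12.0, platform_thickness=3.0)
print(example.platform_shape, example.size_to_mass, example.relative_platform_area)
```

On the reasoning given in the text, a more skilled knapper would be expected to show larger values of the first two ratios and smaller values of the relative platform area.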
The second set of variables that was measured corresponded to the efficient use of raw material, as inefficient use of raw material is indicative of low skill level 57 . We examined the proportion of intended flakes to unintended shatter fragments, on both low-quality and high-quality material 18,56 , with the expectation that the assemblages of relatively more skilled knappers would include a higher percentage of flakes than the assemblages of less skilled knappers, demonstrating better control of the material. We also examined the proportion of whole flakes to flake fragments. Previous experimental research demonstrated that the assemblages of skilled knappers included more flake fragments than the assemblages of less skilled knappers, perhaps a result of skilled knappers striking the core at a higher velocity 56 . A clear sign of knapping skill in the case of the Oldowan task is the level of reduction of the cobble into usable flakes 56 . We measured this by determining the proportions of the original cobble's mass that ended up as flake, shatter and unexploited core mass, with the expectation that the more skilled knappers would have a larger percentage of flake mass and a smaller percentage of unexploited core mass. Finally, we examined the relative number of missed strikes on cores and debitage (total number of missed strikes/original cobble mass), which can be observed as incipient cones of percussion, micro-flake scars or battered edges and hammerstone marks 18 . While it is impossible to get an exact count of missed strikes by looking at the lithics alone, a higher number of missed strikes in one group than in the other would indicate less skill and less manual control.
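As a rough illustration of the raw-material efficiency variables, the Python sketch below computes the mass proportions and the relative number of missed strikes. The function names are hypothetical, and the sketch assumes that the original cobble mass can be approximated by the sum of the recovered flake, shatter and core masses, which ignores the fine fraction that passed through the screen.

```python
def mass_proportions(flake_g: float, shatter_g: float, core_g: float) -> dict:
    """Share of the cobble's mass ending up as flake, shatter and unexploited core.

    Assumption for illustration: total mass = flake + shatter + core.
    """
    total = flake_g + shatter_g + core_g
    if total <= 0:
        raise ValueError("total mass must be positive")
    return {
        "flake": flake_g / total,
        "shatter": shatter_g / total,
        "core": core_g / total,
    }


def missed_strike_rate(missed_strikes: int, cobble_mass_g: float) -> float:
    """Total number of missed strikes divided by the original cobble mass."""
    if cobble_mass_g <= 0:
        raise ValueError("cobble mass must be positive")
    return missed_strikes / cobble_mass_g
```

Normalizing missed strikes by cobble mass, as the text describes, keeps larger cobbles from appearing less skilfully worked simply because they offer more surface to strike.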
Forty-nine core tools (attempted bifaces) from the Acheulian task were analysed. In addition to taking the measurements described above, we classified core tools as bifaces if they had two opposing faces and at least one bifacial edge. A bifacial edge is defined as any sharp edge that has been created by removing flakes near the same location that run across opposite planes of the stone. This would require the knapper to strike off one flake and then flip the piece over and use the newly created angle to remove a second flake, a technique known as alternate flaking. The proportion of successful bifaces was determined by dividing each group's total number of successful bifaces by the group's total number of attempted bifaces. The maximum breadth and thickness of each successful biface were recorded with digital callipers. The ratio of biface breadth to thickness is informative about the level of biface refinement: a refined handaxe should have a larger breadth relative to thickness, which would present as a larger ratio 28 .
At the conclusion of the experiment, participants were asked questions related to their experience in the experiment and their answers were recorded. Specifically, they were asked what they thought the goals were for the knapping tasks and whether or not they believed they achieved these goals. They were asked to explain how the two knapping tasks differed from each other, at what point in the experiment they understood there were differences between the two knapping tasks and whether or not they used different strategies to achieve the different goals of the two tasks. They also explained what they were generally thinking about while knapping and whether or not these thoughts included language. Finally, they were asked for their opinion on whether language would be beneficial for learning to knap. Some of their answers have been summarized in Supplementary Fig. 2.
Designing the fNIRS cap to record from target regions of interest. Prior to the study, we identified a set of regions of interest (ROIs) reported in three stone knapping studies that involve either PET or fMRI 1,14,15 . To further investigate the supposed involvement of the ventrolateral prefrontal cortex during the transition to bifacial flaking, we also included coordinates from supplementary table 2 from ref. 58 , which averages the coordinates for the ventrolateral prefrontal cortex reported in six other studies. Similarly, to test for the involvement of the dorsolateral prefrontal cortex during Early Stone Age tool manufacture, coordinates for dorsolateral prefrontal cortex activation were compiled from refs 59,60 .
Next, we used previously described methods 19 to design a custom optode geometry to record from these ROIs. This involved digitizing candidate source and detector locations on an EasyCAP (Brain Products GmbH, Germany) using a Polhemus Patriot Motion Tracking System (Colchester, Vermont, USA) and projecting these positions onto an adult atlas available in AtlasViewerGUI in the HOMER2 software package (http://homer-fnirs.org/) 61 . Final adjustments to the optode geometry were made after performing Monte Carlo simulations to create a sensitivity distribution for each source-detector pair (that is, the sensitivity of each source-detector pair to detecting changes in absorption of near-infrared light) and visually inspecting whether these sensitivity volumes overlapped with the target ROIs. The end result was an optode geometry that recorded from all ROIs, including regions along the central sulcus, lateral prefrontal, superior temporal and inferior parietal cortex.
Image acquisition and processing. fNIRS data were acquired at 25 Hz with a TechEn CW6 system with wavelengths of 690 nm and 830 nm. Light was delivered to a customized cap via fibre-optic cables. The probe geometry had 12 sources and 24 detectors, creating 36 channels with a source-detector separation of 3 cm and two short source-detector channels with a separation of 1 cm (see Fig. 1c for optode coverage). HOMER2 software was used to demean and convert the data into optical density units. A targeted principal component analysis was applied to data from the three tasks mentioned above to eliminate noise and motion artifacts 62 . We used a general linear model to obtain beta values (β) for oxy-Hb and deoxy-Hb measurements in every channel for all conditions in every task for each subject. Signals from the short source-detector channels were regressed from the rest of the channels to account for effects from superficial layers of the head.
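One common way to account for superficial signal is to include the short-separation signal as a nuisance regressor in the general linear model. The Python sketch below illustrates that idea with ordinary least squares in NumPy. It is not HOMER2's implementation: the function name, the single task regressor and the absence of a haemodynamic response model are simplifying assumptions made for illustration.

```python
import numpy as np


def glm_betas(signal, task_regressor, short_channel):
    """Fit signal = b0 + b_task * task + b_short * short by least squares.

    Returns the array [b0, b_task, b_short]. Including the short-channel
    signal as a column lets the fit absorb superficial contributions, so
    b_task reflects what remains after they are regressed out.
    """
    signal = np.asarray(signal, dtype=float)
    design = np.column_stack([
        np.ones(len(signal)),                      # intercept
        np.asarray(task_regressor, dtype=float),   # condition of interest
        np.asarray(short_channel, dtype=float),    # nuisance regressor
    ])
    betas, *_ = np.linalg.lstsq(design, signal, rcond=None)
    return betas
```

In a full analysis, the task regressor would typically be a condition time course convolved with a haemodynamic response function, with one column per condition.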
The image reconstruction process is summarized briefly here (see refs 19,20 for a more extensive explanation of this process). For each subject, the 10-20 head landmarks from the session that had the best symmetry were chosen as the reference. The landmarks from the other two sessions were linearly transformed to fit this reference set of landmarks, and the resulting transformation matrices were applied to the corresponding source and detector positions. AtlasViewerGUI (available within HOMER2) was used to project the points onto an adult atlas using a relaxation algorithm. The projected geometry was used to run Monte Carlo simulations on the basis of a GPU-dependent Monte Carlo algorithm 63 for each session and subject. This resulted in sensitivity profiles (100 million photons) for each channel of the probe geometry for each session and subject. Head volumes and sensitivity profiles of channels were converted to NIfTI images. Subject-specific head volumes were skull-stripped and transformed to the head volume in the native atlas space using an affine transform (BRAINSFit in Slicer3D). The transformation matrix obtained was applied to the sensitivity profiles to move them to the transformed head volume space (BRAINSResample in Slicer3D). Sensitivity profiles for all channels were thresholded to include voxels with an optical density (OD) of greater than 0.0001 (ref. 19 ). These profiles were summed to create a session- and subject-specific mask, and these masks were then summed across all sessions and subjects. The voxels shared by all sessions and subjects were used to create an intersection mask across participants.
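The thresholding and masking steps at the end of this procedure can be sketched with NumPy as follows. This is a simplified illustration rather than the pipeline used in the study: the function names and array layout are assumptions, and the arrays are assumed to be already registered to a common atlas space.

```python
import numpy as np

OD_THRESHOLD = 1e-4  # threshold stated in the text (0.0001)


def session_mask(sensitivity_profiles, threshold=OD_THRESHOLD):
    """Combine thresholded channel profiles into one session-specific mask.

    sensitivity_profiles: array of shape (n_channels, ...voxel dims...).
    A voxel is kept if at least one channel exceeds the threshold there.
    """
    profiles = np.asarray(sensitivity_profiles, dtype=float)
    return (profiles > threshold).any(axis=0)


def intersection_mask(masks):
    """Keep only voxels present in every session/subject mask."""
    stacked = np.stack([np.asarray(m, dtype=bool) for m in masks])
    return stacked.all(axis=0)
```

Restricting later analyses to the intersection mask ensures that every voxel entering the group statistics was sampled in all sessions and subjects.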
The β coefficients obtained for each channel, condition (within each task) and subject for oxy-Hb and deoxy-Hb were combined with the forward model results obtained from the Monte Carlo simulations to create voxel-based changes in oxy-Hb and deoxy-Hb concentration using image-reconstruction methods that have been previously described 20 . In brief, the image reconstruction problem can be formulated as the following generic equation:

$$Y = L X \tag{1}$$

where Y contains the channel-wise β coefficients, L is the forward (sensitivity) matrix obtained from the Monte Carlo simulations and X contains the voxel-wise concentration changes. Inverting L to solve for X results in an ill-conditioned and under-determined solution that might be subject to rounding errors. An alternative is to use Tikhonov regularization 64 . In this case, the above system can be replaced by a regularized system. The solution is given by the Gauss-Markov equation

$$X = (L^{T} L + \lambda I)^{-1} L^{T} Y \tag{2}$$

where λ is a regularization parameter that determines the amount of regularization and I is the identity operator.

The solution to Equation (2) can be found by minimizing the cost function 65

$$\min_{X} \; |L X - Y|^{2} + \lambda\,|X - X_{0}|^{2} \tag{3}$$

where the size of the regularized solution is measured by the norm λ × |X − X_0|² and X_0 is an a priori estimate of X, which is set to zero when no prior information is available. Here, X is determined for each chromophore and condition separately. Once Equation (3) is solved, a voxel-wise estimate of the concentration data is available. In this way, the best estimate of the channel-wise concentration data for each condition (from the general linear model) has been combined with information from the photon migration results to create an estimate of the voxel-wise concentration data for each chromophore, for each condition, and for each subject.
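The regularized inversion described above can be illustrated with a short NumPy sketch. It is a minimal example under stated assumptions, not the reconstruction code used in the study: the function name is hypothetical, `Y` is an assumed name for the channel-wise data vector, and `L` is treated as a dense matrix small enough to solve directly.

```python
import numpy as np


def tikhonov_solve(L, Y, lam, X0=None):
    """Minimize |L X - Y|^2 + lam * |X - X0|^2 over X.

    Setting the gradient to zero gives the normal equations
        (L^T L + lam * I) X = L^T Y + lam * X0,
    which reduce to the Gauss-Markov form when X0 = 0.
    """
    L = np.asarray(L, dtype=float)
    Y = np.asarray(Y, dtype=float)
    n_vox = L.shape[1]
    if X0 is None:
        X0 = np.zeros(n_vox)  # no prior information available
    lhs = L.T @ L + lam * np.eye(n_vox)
    rhs = L.T @ Y + lam * np.asarray(X0, dtype=float)
    # Solve the linear system rather than forming an explicit inverse.
    return np.linalg.solve(lhs, rhs)
```

Solving the normal equations with `np.linalg.solve` avoids computing the matrix inverse explicitly, which is generally the more numerically stable choice; the regularization term λI is what makes the under-determined system solvable at all.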
The resultant β maps were intersected with the intersection mask to restrict analyses to the voxels that were common to all sessions and subjects. Consequently, β maps were obtained for each condition (within each task) and subject for oxy-Hb and deoxy-Hb concentration levels.
Statistical analysis. The haemodynamic responses of the verbal and nonverbal groups and the Oldowan and Acheulian tasks were compared using two-way ANOVA tests for both the oxy-Hb and deoxy-Hb signals, conducted with the 3dMVM function in AFNI (Analysis of Functional NeuroImages) 66 . Resultant functional images of main effects and interactions were corrected for family-wise errors using the 3dClustSim function (corrected at α = 0.05, corresponding to a cluster size threshold of > 27 voxels). We analysed the highest-order effect in each spatially unique cluster; therefore, main effect areas that overlapped with areas where an interaction occurred between group and task were interpreted on the basis of the interaction effect.
Using the coordinates for the centre of mass of activation for each effect, we extracted the β values in these areas for the Oldowan and Acheulian tasks and the verbal and nonverbal groups. In cases of a significant interaction, the averaged β values of task and group were compared using the Wilcoxon signed-rank test and Mann-Whitney U-test, respectively. We also compared β values from the knapping conditions to the motor baseline conditions using the Wilcoxon signed-rank test to identify significant clusters that were unique to stone knapping and not simply general motor regions. Only those significant clusters where post hoc tests determined knapping activation to be significantly higher than motor baseline activation were included in the final results discussed in the main text. Because the motor baseline task did not control for auditory stimulation while clicking rocks together, temporal cortex clusters were included in the final results, even if the signal in these regions was not significantly higher than the motor baseline signal.
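The two post hoc comparisons differ in design: task is a within-subject factor, so the comparison is paired, whereas group is a between-subject factor, so the samples are independent. The Python sketch below shows how such tests could be run with SciPy. The function name and the input arrays are hypothetical, and this is not the analysis script used in the study.

```python
import numpy as np
from scipy import stats


def post_hoc_tests(beta_oldowan, beta_acheulian, beta_verbal, beta_nonverbal):
    """Return p-values for the paired (task) and unpaired (group) comparisons.

    beta_oldowan / beta_acheulian: one value per participant, same order,
    so the Wilcoxon signed-rank test can pair them.
    beta_verbal / beta_nonverbal: values from two independent groups,
    compared with the Mann-Whitney U-test.
    """
    task = stats.wilcoxon(np.asarray(beta_oldowan, dtype=float),
                          np.asarray(beta_acheulian, dtype=float))
    group = stats.mannwhitneyu(np.asarray(beta_verbal, dtype=float),
                               np.asarray(beta_nonverbal, dtype=float),
                               alternative="two-sided")
    return {"task_p": task.pvalue, "group_p": group.pvalue}
```

Both tests are rank-based, so they make no assumption that the β values are normally distributed.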
To test for differences in knapping skill between the verbal and nonverbal groups, Kolmogorov-Smirnov tests, Mann-Whitney U-tests and two-sided Student's t-tests were used for each variable related to the debitage and bifaces, with results considered significant at P < 0.05.

Data availability. The datasets generated during the current study are available from the corresponding authors upon reasonable request.