Words go together like ‘bread and butter’: The rapid, automatic acquisition of lexical patterns

While it is possible to express the same meaning in different ways (‘bread and butter’ versus ‘butter and bread’), we tend to say things in the same way. As much as half of spoken discourse is made up of formulaic language, or linguistic patterns. Despite its prevalence, little is known about how the processing system treats novel patterns and how rapidly a sensitivity to them arises in natural contexts. To address this, we monitored native English speakers’ eye movements when reading short stories containing existing (conventional) patterns (‘time and money’), seen once, and novel patterns (‘wires and pipes’), seen 1-5 times. Subsequently readers saw both existing and novel phrases in the reversed order (‘money and time’; ‘pipes and wires’). In 4-5 exposures, much like existing lexical patterns, novel ones demonstrate a processing advantage. Sensitivity to lexical patterns, including the co-occurrence of lexical items and the order in which they occur, arises rapidly and automatically during natural reading. This has implications for language learning and is in line with usage-based models of language processing.


Formulaic language processing
There is a growing body of research demonstrating that a range of formulaic language is processed more quickly than novel or non-recurring language by adult native speakers when other factors like length and single-word frequency are controlled for: idioms such as 'break the ice' (e.g. Swinney & Cutler, 1979; Siyanova-Chanturia, Conklin & Schmitt, 2011); binomials such as 'salt and pepper' (e.g. Carrol & Conklin, 2020; Siyanova-Chanturia, Conklin & van Heuven, 2011); and lexical bundles such as 'did you hear' (e.g. Arnon & Snider, 2010; Tremblay, Derwing, Libben & Westbury, 2011). For example, in an eye-tracking study, native English speakers' fixations while reading sentences containing binomials were shorter for binomials than their less frequent reversed forms (Siyanova-Chanturia, Conklin & van Heuven, 2011). This advantage was not solely attributable to predictability (as assessed in a cloze task), which led to the conclusion that the faster reading reflected something over and above the predictability of the upcoming word, and that the lexical pattern itself is entrenched in memory. An ERP study with native English speakers demonstrated that binomials elicited larger P300s and smaller N400s compared to infrequent but strongly associated words and semantic violations (Siyanova-Chanturia, Conklin, Caffarra, Kaan, & van Heuven, 2017). This indicates that frequent lexical patterns are characterized by the pre-activation of a mental 'template' that uniquely matches the unfolding configuration (increased P300) as well as a reduced processing load and easier semantic integration (decreased N400). However, research to date has investigated existing linguistic patterns; little is known about the acquisition of novel patterns (i.e. sequences of words, or 'patterns', that have never been encountered before), particularly in natural contexts.
A similar processing advantage has been found in young children for formulaic language. Bannard and Matthews (2008) showed that native English-speaking two- and three-year-olds were more accurate at repeating frequent sequences ('sit in your chair') than matched infrequent ones ('sit in your truck'). In addition, three-year-olds were faster at repeating the first three words if they were part of a frequent sequence. Evidence in second language (L2) acquisition is more mixed; some studies show a processing advantage for formulaic language while others do not (for an overview and discussion see Conklin, 2019).
A focus in the L2 literature has been on the role of input in the implicit learning and processing of formulaic language. For example, Northbrook and Conklin (2019) showed that Japanese junior high school learners of English engaging in a phrasal judgment task (i.e. 'is this an acceptable phrase in English or not') responded significantly faster and more accurately to lexical bundles that occurred in their textbooks (e.g. 'do you play') than to matched lexical bundles that they had not encountered in their texts (e.g. 'did you hear').
Further, their response times were sensitive to the frequency of occurrence in their textbooks, with faster response times for more frequent textbook lexical bundles. An emphasis in L2 research has been on the amount and type of input that is needed for formulaic language to be learned (e.g. Pellicer-Sánchez, 2017; Sonbul & Schmitt, 2013; Webb, Newton & Chang, 2013). However, these studies have either presented non-native speakers with existing multiword items (generally using a pre-test to determine which are/are not known, which may not be a wholly accurate representation of whether items have been encountered previously), or with non-words that may unduly draw attention to the items. In addition, many of the studies involved explicit tasks (e.g. making a judgement about an item), like the Northbrook and Conklin (2019) study described above, that may not reflect the more natural processing of formulaic language.

Acquiring linguistic patterns
Based on research with native speakers, it has been hypothesized that linguistic knowledge should be rapidly acquired, even from a single exposure, although repeated exposures should improve acquisition and retention (Ullman, 2015). Indeed, young children can acquire the meaning of a word in their native language after a single exposure (Carey, 1978), particularly if there is strong contextual support for the word-object mapping (Baldwin, 1993), or a restrictive syntactic environment (Gleitman, Cassidy, Nappa, Papafragou & Trueswell, 2005).
Research on adults in their native language has demonstrated that they are sensitive to the co-occurrence statistics of mappings between novel objects and words (Vouloumanos, 2008).
Both children and adults track statistical regularities in a range of linguistic input. For example, children use the transitional probabilities between novel syllables to segment speech into word-like units (Saffran, Aslin & Newport, 1996). Adults apply statistical computation to syntactic acquisition (Thompson & Newport, 2007) as well as being sensitive to the frequency of alternative structures and alternative meanings of ambiguous words (Duffy, Morris & Rayner, 1988; Trueswell, 1996).
Taken together, research demonstrates children's and adults' remarkable ability to quickly extract regularities from the linguistic input in their native language. Arguably, the types of linguistic phenomena that have been studied, like those just mentioned, are all important for successful communication. In contrast, a change in word order in many binomials does not change the overall understanding of the sentence, although it may change the emphasis placed on the two entities, as in the following: 'Many countries agree on the importance of the separation of church and state vs. state and church'. While not altering the overall meaning, the second lexical combination sounds less natural to native speakers (Siyanova-Chanturia, Conklin & van Heuven, 2011), and occurs less frequently in English (101 vs. 4 occurrences per 100 million words in the British National Corpus, 2007).
Alongside this, because we produce words one at a time, language can be seen as sequential. In many cases, syntax determines word order. However, for binomials, there is no syntactic reason that 'church' should occur before 'state'. A range of semantic and phonological factors have been proposed to explain why binomials become 'frozen' in a particular order (Benor & Levy, 2006; Fenk-Oczlon, 1989; Mollin, 2012). Although considerable work has been done to generate a set of semantic and phonological constraints to account for word order in binomials, this has not yielded a complete explanation. For one, there are often a number of competing constraints at work, and it is unclear how they interact.
Second, there are frequent counter-examples for each of the constraints (e.g. male before female: 'man and wife' but 'bride and groom'). This suggests that much of what contributes to the order in binomials is conventionalization resulting from repeated usage over time.
Binomial ordering can therefore be conceptualized in terms of a trade-off between purely linguistic knowledge (e.g. phonological, semantic, syntactic and other constraints) and item-specific experience, where the most frequent items 'polarize' toward one word order or the other (Morgan & Levy, 2016). Although the initial preference for one order over the other may be determined by linguistic constraints, increasing frequency of exposure should lead to entrenchment of a particular order in memory. The current study investigates how this entrenchment occurs for novel, non-conventionalized binomials that do not have an established word order. We focus our research on binomials because they should be relatively easy to learn; object-noun-phrase conjunctions, verb-phrase conjunctions and subject-noun-phrase conjunctions are learned fairly early by English-speaking children (4;0, 4;5 and 4;9 respectively) (Ardery, 1979). This means that we could see effects that might not be as apparent for more complex structures.
An important question is whether we can find evidence for a sensitivity to linguistic regularities when the input closely resembles 'real-world' input. Thus, do speakers acquire linguistic patterns in natural contexts (e.g. while reading normally), and if so, how many occurrences are needed for a processing advantage to emerge that is akin to that of existing patterns? Much of the research on statistical language learning has been carried out in fairly contrived situations in which participants see and/or hear sets of repeated visual and/or auditory stimuli (e.g. a few images paired with a novel word). In real-world or more natural situations, it might not be possible to keep track of all of the available statistical information (Vouloumanos, 2008), in particular because memory and attention might constrain statistical learning (Kareev, 1995; Turk-Browne, Junge & Scholl, 2005). The current research explores whether a sensitivity to linguistic patterns is apparent when the input and design of the study mimic a fairly authentic reading experience; participants are simply asked to read three stories containing existing binomials ('time and money') and novel binomials ('wires and pipes') in their forward and reversed forms while their eye movements are monitored.

Research questions and hypotheses
In the current study, we address the following two research questions:
1. Is the language processing system sensitive to novel linguistic patterns in input that simulates a real-world context?
2. How many occurrences are needed for novel patterns to demonstrate a processing advantage in a real-world context?
We expect that existing binomials will demonstrate the well-documented processing advantage over their reversed forms; reading times should be significantly shorter for the whole phrase and the initial words of existing binomials should 'prime' the reader for the final word. More specifically, reading 'time and' should prime the reader for 'money', more than 'money and' does for 'time'. Thus, the final word of the binomial should be read more quickly when it is in its conventional, forward form than when it is in its reversed form. If the processing system is not sensitive to new lexical patterns, then novel binomials should behave similarly in their forward and reversed forms, and we should see no effect of repetition. This means that reading 'x and y' should be no different than reading 'y and x', regardless of how many times 'x and y' is encountered. However, we expect that lexical patterns are quickly and automatically registered while reading short stories and thus a pattern should emerge for the novel binomials whereby they become more like existing binomials over time. That is, on the first encounter there should be no difference between 'x and y' and 'y and x', but as the former is encountered more, it should become 'preferred' relative to the latter and thus read more quickly. In other words, novel binomials should be read more quickly with each additional encounter, both compared to the previous encounters and relative to the reversed form.

Methods
Participants. Forty undergraduate native speakers of English from a British university participated for course credit.

Materials.
A full set of items with data about their characteristics is available in Appendix A. The stimuli were 12 existing and 25 novel binomials. Existing binomials were identified from previous studies, were all 'noun and noun', were highly frequent and had a highly conventionalized order (occurrences in the BNC per 100 million words: forward M = 352, SD = 306; reversed M = 23, SD = 27; ratio forward to reversed M = 49:1, SD = 90.3; forward and reversed frequencies differed significantly, t(11) = 9.23, p < .001).
Novel binomials were 21 'noun and noun' and 4 'verb and verb' word pairs joined by the conjunction 'and'. These were created by taking common nouns and verbs that were from the same semantic field and which displayed no clear word order preference (e.g. 'goats and pigs', 'wires and pipes', 'write and phone'). 1 When creating the novel binomials, candidates were rejected when either word was the first part of an existing binomial. For example, 'cup' occurs in the binomial 'cup and saucer'; thus, an invented binomial like 'cup and mug' was not considered an appropriate item, as encountering the word 'cup' might lead to prediction/expectation of the existing binomial. We confirmed that none of our items formed part of an established phrase in the BNC. To ensure that the novel binomials were 'plausible' phrases in English, they occurred at least once in either order in the BNC, but not more than 11 times. This demonstrates that none of the binomials were frequent phrases in the same way as the existing binomials. As the novel items did not have 'forward' or 'reversed' forms, we arbitrarily assigned each pair a 'forward' version, which was counterbalanced over two presentation lists for the main experiment. Overall, mean BNC frequency for 'forward' phrases was 3.8 (SD = 2.7) and for 'reversed' phrases was 3.4 (SD = 2.9); the ratio of forward to reversed forms was 1.5:1 (SD = 1.2), and the forward and reversed frequencies did not differ significantly (t(24) = 0.80, p = .43). The words in each novel binomial were matched for length (in characters and syllables) and frequency, and any differences were accounted for by including single word characteristics in the analyses.
Both existing and novel binomials were assessed for the association strength of the content words using the Edinburgh Associative Thesaurus (Kiss, Armstrong, Milroy & Piper, 1973). Table 1 summarizes the characteristics of the stimuli, with further details provided in Appendix A.
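The stimulus frequencies in this study are expressed on the Zipf scale (van Heuven et al., 2014). As a rough illustration of how a raw corpus count relates to that scale, the sketch below assumes the commonly cited definition, Zipf = log10(frequency per million words) + 3, without any smoothing for unseen items. The function name `zipf_from_count` is hypothetical, and the example counts are the 'church and state' figures quoted earlier (101 vs. 4 occurrences per 100 million words); this is not the authors' own computation.

```python
import math


def zipf_from_count(count: float, corpus_size_words: float) -> float:
    """Convert a raw corpus count to a Zipf value.

    Assumes Zipf = log10(frequency per million words) + 3, with no
    smoothing, so the count must be strictly positive.
    """
    if count <= 0:
        raise ValueError("count must be positive (no smoothing is applied)")
    per_million = count / (corpus_size_words / 1_000_000)
    return math.log10(per_million) + 3.0


# Illustrative values: occurrences per 100 million words quoted in the text
# for 'church and state' (forward) and 'state and church' (reversed).
CORPUS_SIZE = 100_000_000
forward_zipf = zipf_from_count(101, CORPUS_SIZE)
reversed_zipf = zipf_from_count(4, CORPUS_SIZE)

print(f"forward:  {forward_zipf:.2f}")
print(f"reversed: {reversed_zipf:.2f}")
```

On this definition, a phrase occurring once per million words has a Zipf value of 3, so a lower value for the reversed order reflects its rarity on a logarithmic scale.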
Table 1. Item characteristics for all existing and novel binomials. Mean values are provided with standard deviation in brackets and range underneath. Frequency is expressed on the Zipf scale (van Heuven, Mandera, Keuleers & Brysbaert, 2014) from 1 to 7; 2 association strength is measured on a scale from 0 to 100; word length is measured in characters (letters) and syllables.
3.4 (0.4), 2.9-4.1; 2.2 (0.5), 1.3-3.0; 40 (28), 0-81; 4.2 (1.1), 3-7; 1.1 (0.3), 1-2; 5.1 (0.6), 4.0-6.2; 5.3 (1.4), 3-8; 1.4 (0.5), 1-2; 4.9 (0.4), 4.0-5.6
Three short stories of approximately 1,500 words each were created, containing a different set of existing and novel binomials. Existing binomials appeared once in their existing/forward form and novel binomials appeared between one and five times in their forward form, with five novel binomials at each frequency of occurrence (i.e. five appeared once only = 5 occurrences; five appeared twice only = 10 occurrences; five appeared three times only = 15 occurrences; five appeared four times only = 20 occurrences; five appeared five times only = 25 occurrences; for a total of 75 occurrences of novel binomials). The reversed form for both existing and novel binomials occurred once in each story after all occurrences of the corresponding forward form. Presentation was across two counterbalanced lists, such that all items appeared on both lists, but what was classified as 'forward' for the novel binomials alternated across lists (e.g. 'wires and pipes' = forward on list one; 'pipes and wires' = forward on list two). There was no effect of list in any analysis, so results were collapsed across lists. Crucially, the lack of an effect of list confirms that there was no a priori word order preference for the items.
To assess the predictability of the invented binomials, the stories were presented in a cloze task to a matched group of native English speakers who did not take part in the main experiment. Each story was presented with the second word of all binomials removed and participants were asked to fill in the gaps, e.g. 'We realized that a whole load of grass and ________ had blocked the guttering' (target = 'leaves'). Items were also presented separately in isolation, with participants asked to fill in the gap with the word they thought best fitted, e.g. 'grass and ______'. This provided a measure of both the stand-alone cloze probability and contextually defined cloze probability for all items. In total 65 participants took part in the norming, seeing either the forward or reversed novel binomial, either in context or as a stand-alone item. Inclusion of either the stand-alone or contextual cloze values made no difference to any of our analyses, so they are not considered in detail in the results section.
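Cloze probability is conventionally the proportion of respondents who supply the target word for a gap. The sketch below illustrates that calculation for the 'grass and ______' example; the response list is invented for illustration, and the choices to ignore letter case and to drop blank responses from the denominator are assumptions, not details reported for this norming study.

```python
from collections import Counter


def cloze_probability(responses, target):
    """Proportion of non-blank responses that match the target word.

    Matching ignores case and surrounding whitespace (an assumption
    made for this sketch).
    """
    cleaned = [r.strip().lower() for r in responses if r.strip()]
    if not cleaned:
        return 0.0
    return Counter(cleaned)[target.strip().lower()] / len(cleaned)


# Hypothetical responses to the stand-alone frame 'grass and ______'.
example_responses = ["leaves", "weeds", "Leaves", "moss", "leaves "]
p_target = cloze_probability(example_responses, "leaves")

print(f"cloze probability for 'leaves': {p_target:.2f}")
```

The same function could be applied separately to the in-context and stand-alone responses to obtain the two cloze measures described above.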
Procedure. Participants were seated in front of a computer monitor with their head movements stabilized using a desk-mounted chinrest. Eye movements were recorded monocularly at a sample rate of 500 Hz with an EyeLink 1000+ system from SR Research.
Before the experiment, accuracy was verified using a nine-point calibration and validation grid. During the experiment, between each screen a fixation point appeared to allow for trial-by-trial drift checking and recalibration if required. Participants were instructed to read the texts as normally as possible for comprehension and to press the spacebar when they had finished. In the texts, neither existing nor novel binomials appeared at the beginning or end of a line or across a line break. Following each story a series of yes/no comprehension questions appeared to ensure that participants had attended to the text. The system was recalibrated before each story.

Results
Analysis. Data were 'cleaned' prior to analysis. Following standard practice, individual fixations that were shorter than 100 ms or longer than 800 ms were removed. Trials that experienced track loss or that were discontinued early were removed. For items where this was the case, all occurrences of the item were subsequently removed from the dataset (e.g. if the first encounter with a phrase was missing, all subsequent encounters, including the reversed form, were removed). For the first analysis (existing and novel binomials in forward and reversed forms) 6.9% of the data was excluded and for the second (effects of repetition on novel binomials) 10.5% of the data was excluded. As this is a somewhat conservative procedure, we also analyzed the data without these exclusions and the pattern of results remained the same.
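The two cleaning steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the data layout (dictionaries with `participant`, `item`, `encounter` and `valid` keys) and both function names are hypothetical, and the 100 ms and 800 ms cut-offs are treated as inclusive bounds.

```python
def clean_fixations(durations_ms, lower=100, upper=800):
    """Keep fixations between the lower and upper cut-offs (inclusive)."""
    return [d for d in durations_ms if lower <= d <= upper]


def drop_incomplete_items(trials):
    """Remove every encounter of an item for a participant if any one
    encounter was lost (e.g. track loss or an early-terminated trial)."""
    bad = {(t["participant"], t["item"]) for t in trials if not t["valid"]}
    return [t for t in trials if (t["participant"], t["item"]) not in bad]


# Hypothetical fixation durations in milliseconds.
kept_durations = clean_fixations([60, 100, 250, 800, 950])

# Hypothetical trial records: participant 1 lost the second encounter
# with 'wires and pipes', so all of that item's encounters are dropped.
example_trials = [
    {"participant": 1, "item": "wires and pipes", "encounter": 1, "valid": True},
    {"participant": 1, "item": "wires and pipes", "encounter": 2, "valid": False},
    {"participant": 1, "item": "goats and pigs", "encounter": 1, "valid": True},
    {"participant": 2, "item": "wires and pipes", "encounter": 1, "valid": True},
]
kept_trials = drop_incomplete_items(example_trials)

print(kept_durations)
print([(t["participant"], t["item"]) for t in kept_trials])
```

Dropping the whole item rather than the single lost trial keeps the repetition counts for each remaining item intact, which is why the procedure is conservative.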
All analyses were conducted using R (version 3.5.3) and RStudio (version 1.1.463).
Linear mixed effects models were constructed and analyzed using the lme4 (version 1.1-21; Bates, Maechler, Bolker & Walker, 2015) and lmerTest (version 3.1-0; Kuznetsova, Brockhoff & Christensen, 2016) packages. Effects plots (Figure 1) were produced using the effects package (version 3.1-2; Fox, 2003). Linear mixed effects models were constructed for: 1) the whole phrase, and 2) each of the content words (word 1 and word 3) individually. Since the conjunction 'and' (word 2) was skipped more than 65% of the time in all conditions, this was not subjected to further analysis. Separate models were constructed for the measures first pass reading time (first pass RT: the sum of all fixations prior to leaving the word or phrase) and total reading time (total RT: the sum of all fixations on the word or phrase throughout the trial). First pass reading time is considered an 'early' eye-tracking measure, indexing highly automatic word recognition and lexical access processes, while total reading time is a 'late' measure indexing initial lexical retrieval and subsequent integration of a word. Because they tap into different aspects of processing, we may see different effects in them (Conklin, Pellicer-Sánchez & Carrol, 2018). Duration measures were log-transformed to reduce skewing. For word level analyses, skipped items (words which received no fixations during first pass reading) were discounted from any subsequent analysis. In all models we adopted the maximal random effects structure justified by the design (Barr, Levy, Scheepers & Tily, 2013). Two analyses were carried out. First, we compared reading measures for existing binomials and novel binomials in their forward and reversed forms. The existing binomials occurred only once before their reversed form, while the novel binomials occurred between one and five times. For these analyses only the first occurrence is compared to its reversed form. For the second analysis, we compared the different number of occurrences (1-5) of the novel binomials.
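To make the two reading measures concrete, the sketch below derives first pass RT and total RT for one region from a time-ordered list of fixations. It is a simplified illustration: the region labels, the function name `reading_times` and the example durations are hypothetical; first pass is taken as the first unbroken run of fixations on the region, without checking whether the region was skipped on the first pass; and the natural logarithm is assumed for the log transform, since the base is not specified in the text.

```python
import math


def reading_times(fixations, region):
    """Return (first_pass_rt, total_rt) in ms for one region.

    `fixations` is a list of (region_label, duration_ms) tuples in
    temporal order. First pass RT sums the first unbroken run of
    fixations on the region; total RT sums every fixation on it.
    """
    first_pass = 0
    entered = False
    for label, duration in fixations:
        if label == region:
            first_pass += duration
            entered = True
        elif entered:
            break  # the reader has left the region for the first time
    total = sum(duration for label, duration in fixations if label == region)
    return first_pass, total


# Hypothetical fixation sequence: two fixations on the phrase, a move
# past it, then a regression back to the phrase.
example_fixations = [
    ("pre", 210),
    ("phrase", 180),
    ("phrase", 150),
    ("post", 200),
    ("phrase", 120),
]
first_pass_rt, total_rt = reading_times(example_fixations, "phrase")
log_first_pass = math.log(first_pass_rt)
log_total = math.log(total_rt)

print(first_pass_rt, total_rt)
```

In this example the regression back to the phrase adds to total RT but not to first pass RT, which is why the two measures can show different effects.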
Research question 1. Is the language processing system sensitive to novel linguistic patterns in input that simulates a real-world context?: A comparison of existing and novel binomials in forward and reversed forms. Here we compare existing binomials and the first occurrence of novel binomials in their forward form to their reversed form, which is the last occurrence (see Table 2). Model outputs for the whole phrase, word 1 ('time'/'wires') and word 3 ('money'/'pipes') are reported in Table 3. All models included fixed effects of type (existing vs. novel) and direction (forward vs. reverse) as well as word level fixed effects for length (in letters) and frequency (on the Zipf scale). 2 We included number of repetitions as a covariate in all models including novel phrases to account for the difference in number of encounters on the subsequently encountered reverse form, and random intercepts for subject and item were included, as well as by-subject random slopes for the effects of binomial status * direction (analysis 1). We also added phrase frequency (on the Zipf scale), association strength, cloze probability and context-specific cloze probability one by one in a stepwise fashion and compared the resulting models with the original model using log-likelihood tests.
None of these improved any of the models. For word level analyses any words that received no fixations during first pass reading were excluded. Final models were checked for collinearity and no issues were observed (all VIFs < 5).
Note: For word 1 and word 3 values, words that received no fixations are discounted; values for the whole phrase include trials where either word 1 or word 3 (but not both) were skipped.
Table 3. Linear mixed effects model for existing vs. novel binomials in forward and reversed forms for whole phrase reading times, word 1 and word 3.
As can be seen in Tables 2 and 3, a clear picture emerges where existing binomials like 'time and money' are read more quickly than their reversed form 'money and time'.
Table 3 shows that for the whole phrase the forward form is faster in first pass RT (t = -2.61, p = .011) and total RT (t = -3.64, p < .001). It also demonstrates that the locus of this effect is the final word, which is read more quickly in the forward form for first pass RT (t = -3.44, p < .001) and total RT (t = -3.19, p = .002). This pattern of results is consistent with the literature on formulaic language more broadly, and binomials in particular, and offers further evidence that existing lexical patterns are read more quickly than less frequent formulations.
However, in previous studies formulaic language has generally been presented in isolation or in single sentences. The current results demonstrate that the processing advantage persists when readers need to integrate the meaning of binomials in sentences as well as into longer stretches of discourse.
For novel binomials, the pattern is somewhat different (see Table 3). Forward forms are not read more quickly than reversed forms for first pass RT (t = -0.33, p = .744), and the only significant difference occurs in total RT at the phrase level, such that once a phrase has been seen in its forward form (e.g. 'wires and pipes'), the reversed form ('pipes and wires') is actually read more quickly (t = 3.24, p = .001).
The advantage for the reversed form of novel binomials is in conflict with our predictions, whereby we expected no difference between an initial encounter with 'x and y' and a subsequent encounter with 'y and x', but is in line with the prediction that multiple encounters with 'x and y' lead to incrementally shorter reading times.What these results point to is that after repeatedly encountering a lexical pattern like 'wires and pipes', the processing system has become sensitive to the combination of the words 'wires' and 'pipes'.
This means that seeing 'pipes and' speeds up the reading of 'wires', even though the lexical items do not appear in what has been established as their canonical order.This finding is contrary to what happens with existing binomials, which have been encountered many times in the canonical order over a participant's lifetime.An important question is whether the processing system is sensitive to not only the co-occurrence of lexical items, but also to the order in which they appear.The next analysis will explore this more fully.

Research question 2. How many occurrences are needed for novel patterns to demonstrate a processing advantage in a real-world context?: Looking at effects of repetition on novel binomials. Table 4 provides a summary of phrase and individual word reading times for novel binomials as a function of number of occurrences, as well as for the final reversed form. Each number of occurrences (1-5) had five items and values are cumulative. This means that the first occurrence includes all 25 items, regardless of whether they were subsequently repeated; 20 of these were then seen a second time; 15 of these a third time; 10 of these a fourth time; and 5 of these a fifth time. All 25 were then seen in their reversed form. We constructed linear mixed effects models to investigate the effect of repetition (Table 5). We first included a fixed effect of occurrence (with levels: 1, 2, 3, 4, 5 and Reversed). The first encounter was coded as the baseline and we used the difflsmeans function in the lmerTest package to compare whether each subsequent encounter was significantly different from the first encounter, and from the reversed form. Random intercepts for subject and items were included, as well as by-subject random slopes for order of occurrence. We constructed a separate model including fixed effects of number of repetitions and direction (forward/reversed) to see whether reading a binomial more times led to an effect on both the forward form (Do more encounters lead to faster reading time?) and the reversed form (Is the reversed form slower for novel binomials that have been encountered more times, compared to ones that have been encountered fewer times?). Random intercepts for subject and items were included, as well as by-subject random slopes for the effect of number of repetitions * direction.
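The cumulative item counts follow directly from the design of five novel binomials at each repetition level from one to five. The short sketch below reproduces that arithmetic; the constant and function names are hypothetical and serve only as a check on the counts stated in the text.

```python
ITEMS_PER_LEVEL = 5   # novel binomials assigned to each repetition level
MAX_OCCURRENCES = 5   # repetition levels run from 1 to 5


def items_with_kth_occurrence(k):
    """Number of novel binomials that have a k-th forward occurrence,
    i.e. all items whose repetition level is at least k."""
    return ITEMS_PER_LEVEL * (MAX_OCCURRENCES - k + 1)


items_per_occurrence = {
    k: items_with_kth_occurrence(k) for k in range(1, MAX_OCCURRENCES + 1)
}
total_forward_occurrences = sum(items_per_occurrence.values())
total_items = ITEMS_PER_LEVEL * MAX_OCCURRENCES

print(items_per_occurrence)
print(total_forward_occurrences, total_items)
```

The shrinking counts at later occurrences mean that comparisons at the fifth occurrence rest on far fewer items than those at the first.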
Table 4. Mean word and phrase level reading times in milliseconds (standard deviation in brackets) for first pass reading time and total reading time for novel binomials as a function of number of occurrences, as well as the reversed forms.

First pass reading
Whole phrase: 366 (188), 381 (210), 350 (194), 328 (165), 328 (212)
Note: For word 1 and word 3 values, words that received no fixations are discounted; values for the whole phrase include trials where either word 1 or word 3 (but not both) were skipped.
Table 5 shows the analysis of the effect of number of encounters on whole phrase reading times. While the difference between occurrences one and two is not significant, subsequent repetitions all lead to significantly shorter reading times compared to occurrence one: occurrence three in first pass and total RT, occurrence four in first pass and total RT and occurrence five in first pass but not total RT. Importantly, after the lexical pattern has been seen several times, there is a processing advantage for it compared to the subsequently encountered reversed form. Comparison of each level with the reversed form showed that for first pass RT, a significant advantage for the forward form emerged after three encounters: occurrence three (t = -2.10, p = .040), occurrence four (t = -3.67, p < .001) and occurrence five (t = -2.93, p = .004). For total RT this was the case for occurrence four (t = -3.20, p = .001) but not occurrence five (t = -0.02, p > .05). It is likely that this did not reach significance at five occurrences because of reduced item power: five occurrences only had 5 items, while four occurrences had 10 and three had 15. 3 On the whole, it appears that the processing system is not only sensitive to the co-occurrence of lexical items, but also to the canonical form or order once this has been read multiple times, and this effect is particularly evident in first pass RT. 4 The overall effect of repetition can be seen in Figure 1, which shows that first pass RT (top row, left panel) and total RT (bottom row, left panel) decrease as the number of forward occurrences increases. This effect is confirmed by the analysis in Table 6, where there is a significant overall effect of repetition for novel forward phrases in first pass RT and total RT.
For reversed forms, number of repetitions indicates how many times each phrase was seen prior to the reversed form being encountered. Figure 1 shows that this had no effect on first pass RT (top row, right panel), but for total RT there was a greater cost for reversed phrases when the forward phrase had been encountered more times. In other words, seeing a phrase more times in the forward form leads to slower reading for the reversed form, compared to phrases that were not seen as many times in the forward form. This is confirmed by the interaction of repetitions and direction reported in Table 6 for first pass RT and total RT.
Table 6. Linear mixed effects model for whole phrase reading times for novel binomials as a function of number of repetitions * direction.

Discussion
It is thought that linguistic knowledge should be rapidly acquired. As we discussed in the Introduction, there is considerable evidence that this is true for elements of a native language that are fundamental to successful communication. However, important questions have remained unanswered. With regard to our first research question (whether the language processing system is sensitive to novel linguistic patterns when the input simulates a real-world context), we found that reading times for novel binomials become faster as the number of occurrences increases. It appears that the co-occurrence of the lexical items 'wires' and 'pipes' is recorded in memory, making the final reversed form faster than the initial encounter with the forward form in total reading time. Addressing our second research question about the number of exposures that are required, we see that in as few as four to five exposures, a novel pattern is read more quickly than its reversed form. In other words, it develops the well-attested pattern of existing binomials in which the forward form is read more quickly than the reversed form. Importantly, this advantage emerges over the course of reading short stories on a computer. Thus, the sensitivity to lexical patterns arises during normal reading.
Our findings are in line with usage-based approaches, which put a premium on linguistic input, meaning that experience of and exposure to language results in high-frequency, repetitive sequences of words being stored in memory (e.g. Bybee, 2002, 2013; Tomasello, 2003). In a usage-based approach, when language users encounter utterances, they store them and look for patterns amongst them (Tomasello, 2003). Importantly, the first time a sequence of words is encountered, it leaves an imprint in memory that is strengthened by subsequent exposure (Logan, 1988). In this view, development of language knowledge, as well as more specific aspects like literacy, is primarily driven by exposure. Thus, when individual units and sequences of language have a high degree of repetition, this will lead to 'the conventionalization of categories and associations, as well as to the automation of sequences' (Bybee, 2013, p. 50). This means that our experiences with language determine memory representations, which do not exist in isolation, but are formed and function in dynamic networks that are continually updated to reflect the nature of ever-changing linguistic experience.
The current results are in line with accounts of language processing which hold that linguistic skill is based on experience with language and that acquisition is reliant on finding patterns (Tomasello, 2003). Thus, similar to the well-established sensitivity to regularities that are seemingly important for communication, we find a sensitivity to novel binomials. In a usage-based approach, the frequency of a pattern is important for being able to acquire it.
When a reader encounters 'wires and pipes', the utterance is registered. Further occurrences are also registered and regularities can be detected. After a novel binomial has been seen once, the association between the two words leads to one word priming the other (Bybee, 2002). This would explain why the reversed 'pipes and wires' is faster than the first encounter with the forward 'wires and pipes'. However, the sequential order of the pattern is also recorded, which means that after four to five occurrences the forward form of the binomial is faster than the reversed form.
Our findings are also consistent with theories of memory and learning associations. Thus, whenever two items, like 'wires' and 'pipes', are simultaneously active in memory, the strength of the association between them increases and they become more likely to activate each other. There are a number of models that detail how this occurs in memory and how associated items cue retrieval of each other (e.g. Search of Associative Memory, Raaijmakers & Shiffrin, 1981). To account for the current results, such models need to explain order effects. For example, the results could be explained in terms of the strength of association, which might be stronger from 'wires' to 'pipes' than from 'pipes' to 'wires'. Alternatively, transitional probabilities might be encoded in memory. This would be in line with eyetracking research showing that the transitional probabilities of words have a measurable influence on fixation durations (McDonald & Shillcock, 2003).
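To make the notion of order-sensitive transitional probabilities concrete, the following is a minimal sketch rather than a model used in the study. It estimates the probability of the second noun given the first noun plus 'and' from a token list. The function name `binomial_tp` and the toy sentences are hypothetical illustrations and are not the experimental stimuli.

```python
from collections import Counter

def binomial_tp(tokens, first, second, link="and"):
    """Estimate P(second | first + link) from a list of tokens.

    Divides the number of times `first link second` occurs by the number
    of times `first link` occurs with any continuation.
    Returns 0.0 if `first link` never occurs.
    """
    frames = Counter()  # counts of "first link <any word>"
    hits = Counter()    # counts of "first link second"
    for a, b, c in zip(tokens, tokens[1:], tokens[2:]):
        if b == link:
            frames[a] += 1
            hits[(a, c)] += 1
    if frames[first] == 0:
        return 0.0
    return hits[(first, second)] / frames[first]

# Toy input (hypothetical): five forward forms, two unrelated
# continuations of "pipes and", and one reversed form.
sentences = (
    ["the wires and pipes ran along the wall ."] * 5
    + ["the pipes and cables were new ."] * 2
    + ["the pipes and wires were old ."]
)
tokens = " ".join(sentences).split()

forward = binomial_tp(tokens, "wires", "pipes")   # "wires and" -> "pipes"
backward = binomial_tp(tokens, "pipes", "wires")  # "pipes and" -> "wires"
print(f"forward TP = {forward:.2f}, backward TP = {backward:.2f}")
```

In this toy input the forward transitional probability is higher than the backward one, which is the kind of asymmetry an order-sensitive account would need to encode. The actual values depend entirely on the invented sentences and say nothing about the reading-time data.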
Finally, the current results support the view that lexical patterns are acquired automatically. It is thought that when a person attends to a particular input, some of its attributes, like frequency, which require little or no intentional processing, are automatically encoded in memory (Hasher & Zacks, 1984). Thus, when readers attend to a story, the frequency of lexical patterns is encoded. It is exactly this frequency of occurrence that allows for novel lexical patterns to become entrenched in memory and to elicit a processing advantage. The pattern of results is compatible with a Hebbian view of learning in which inputs into the system produce a pattern of activity, and acquisition occurs based on these patterns of activation (Wennekers, Garagnani & Pulvermüller, 2006). Thus, analogous to the Hebbian view that 'cells that fire together wire together', here we see that 'words that occur together wire together'. Relevant for the current results, Hebbian models can have internal order that determines the spatio-temporal sequences of activity patterns (Wennekers et al., 2006), termed 'phase sequences' by Hebb (1949).
In sum, this research demonstrates that binomials, a type of formulaic language, can be acquired rapidly and automatically from input that approximates a natural reading situation. However, important questions remain. Noun- and verb-phrase conjunctions are learned early in life (Ardery, 1979). Future work will need to investigate whether other types of formulaic language and more 'difficult' structures can be acquired in natural contexts and how much input is needed. In language acquisition – both L1 and L2 – an important question is how durable or long-lasting learning is. With regard to binomials, would the processing advantage that emerges after four to five occurrences of a novel pattern be evident a day later, a week later, etc.? Further, is there a difference in the durability of the word associations ('wires-pipes') and the effect of word order ('wires and pipes' vs. 'pipes and wires')? Again, these are questions for future research.

In addition, it would be of interest to better explore the effect at four and five occurrences, where it begins to emerge. In the current study, because we did not want to overwhelm the stories with repeated novel binomials, we elected to have five items at every level of repetition. This meant that while all 25 items contributed reading times at one occurrence, only 5 did at five occurrences, resulting in much less item power at the highest number of occurrences. Future research should more carefully study the emerging processing advantage at four and five occurrences. Additionally, the processing pattern for novel binomials was more clear-cut in first pass RT than total RT, suggesting that the establishment of a novel pattern might have an effect in terms of early processes such as lexical access/expectancy, but not for later processes like meaning integration and/or semantic processing. This contrasts with what has been demonstrated for established binomials in eye-tracking, where effects are seen in both early and late measures (Carrol & Conklin, 2020), and in ERP, where there is evidence for activation of a lexical 'template' (increased P300) and a reduced processing load and easier semantic integration (decreased N400) (Siyanova-Chanturia et al., 2017). It would be of interest to compare novel and existing binomials in an ERP paradigm to determine whether early stages of acquisition are characterized by easier lexical access (i.e. an increased P300 is evident for both existing and novel binomials) and only at later stages is semantic integration less effortful (i.e. a decreased N400 is only evident for existing binomials).

Finally, we have focused on adults in their L1. Research will need to address whether binomials and other types of formulaic language can be learned in natural contexts by less proficient speakers such as children in their L1 or L2 and adults in their L2. Thus, while we demonstrate that linguistic patterns are rapidly and automatically learnable in natural contexts, many open questions remain to be addressed by future research.

ENDNOTES
1 Care was taken to avoid the hypothesized semantic and phonological constraints on binomial word order (e.g. Benor & Levy, 2006; Fenk-Oczlon, 1989; Mollin, 2012; Morgan & Levy, 2016). We ensured that our items did not violate the constraints considered in Morgan and Levy (2016, p. 389). Crucially, the novel binomials were presented across counterbalanced lists to minimize any specific ordering effects, i.e. if an order for a novel pair was 'preferred', the effect of this would be offset by the item appearing in the 'dispreferred' order on the other list. Notably, there was no effect of list in any of the analyses, demonstrating that there was no particular word order preference for our items.
2 The Zipf scale is a logarithmic scale designed to express relative frequency, taking into account the size of the corpus. A Zipf score of 1 represents a frequency of 1 per 100 million words; 2 represents a frequency of 10 per 100 million words; 3 represents 100 per 100 million words, and so on.
3 The study design meant that there were fewer items at 5 repetitions: All 25 items contribute to reading times at the 1st occurrence, only 20 go on to have a 2nd occurrence, only 15 go on to have a 3rd occurrence, 10 a 4th occurrence, and 5 a 5th occurrence. Thus, there is much less item power in the 5th occurrence versus the 1st (25 items vs. 5). Notably, an effect was still evident: The fifth occurrence had shorter reading times (first pass and total RT) compared to the initial presentation. For first pass RT, there were shorter reading times for the third, fourth and fifth repetition relative to the reversed form. While increasing the number of items at later repetitions would help to address the issue of item power, this would have the disadvantage of increasing the salience of the repetition manipulation. Having 10 items at 5 repetitions (and at the other levels as well) would necessitate 150 repetitions of novel binomials in the texts.
4 It is important to note that the pattern for first pass RT is more clear-cut, suggesting that the effect may be an early one that is not evident in later processing. In other words, the establishment of a pattern might have an effect in terms of lexical access/expectancy, but not in terms of meaning integration/overall semantic processing. Thus, it shows up in early eyetracking measures (first pass RT) but not in later ones (total RT).
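The arithmetic behind endnotes 2 and 3 can be checked with a short sketch. It assumes that the Zipf value is the base-10 logarithm of occurrences per billion words, which reproduces the anchor points given in endnote 2, and that each repetition level (one to five) contains the same number of items, as described in endnote 3. The function names are hypothetical and were not part of the study's analysis code.

```python
import math

def zipf_value(count, corpus_size):
    """Zipf value, assumed here to be log10(occurrences per billion words)."""
    per_billion = count * 1e9 / corpus_size
    return math.log10(per_billion)

def items_at_occurrence(k, items_per_level, max_reps=5):
    """Number of items that reach a k-th occurrence.

    An item assigned to repetition level r appears r times, so it
    contributes a k-th occurrence only if r >= k.
    """
    return items_per_level * (max_reps - k + 1)

def total_presentations(items_per_level, max_reps=5):
    """Total forward presentations of novel binomials across all levels."""
    return items_per_level * sum(range(1, max_reps + 1))

# Anchor points from endnote 2 (frequency per 100 million words).
for count in (1, 10, 100):
    print(count, "per 100 million -> Zipf", zipf_value(count, 100_000_000))

# Design in endnote 3: five items per level.
print([items_at_occurrence(k, 5) for k in range(1, 6)])

# Ten items per level, as considered in endnote 3.
print(total_presentations(10))
```

Under these assumptions, doubling the items per level from five to ten doubles the number of novel-binomial presentations in the texts, which is the trade-off between item power and salience of the manipulation that endnote 3 describes.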

APPENDIX A
A full list of the existing and novel binomials used in the experiment, with raw frequencies and ratio of forward to backward occurrences, Zipf frequencies (scale from 1 to 7 with 7 being highest frequency) and association strength (from 0 to 100).

Figure 1. Overall effect of increased number of repetitions on forward and reversed forms of

Table 2. Mean reading times in milliseconds with standard deviation in brackets for existing binomials and the first occurrence of novel binomials in their forward and reversed forms for first pass reading time and total reading time. Differences between forward and reversed forms are shown, with p-values based on the Differences of Least Squares Means extracted from the models reported in Table 3 (*p < .05, **p < .01, ***p < .001).

Table 5. Linear mixed effects model for whole phrase reading times for novel binomials as a function of number of encounters (baseline = first encounter).