The PHaVE List: A pedagogical list of phrasal verbs and their most frequent meaning senses

As researchers and practitioners are becoming more aware of the importance of multi-word items in English, there is little doubt that phrasal verbs deserve teaching attention in the classroom. However, there are thousands of phrasal verbs in English, and so the question for practitioners is which phrasal verbs to focus attention upon. Phrasal verb dictionaries typically try to be comprehensive, and this results in a very large number of phrasal verbs being listed, which does not help practitioners in selecting the most important ones to teach or test. There are phrasal verb lists available (Gardner and Davies, 2007; Liu, 2011), but these have a serious pedagogical shortcoming in that they do not account for polysemy. Research indicates that phrasal verbs are highly polysemous, having on average 5.6 meaning senses, although many of these are infrequent and peripheral. Thus practitioners also need guidance about which meaning senses are the most useful to address in instruction or tests. In response to this need, the PHrasal VErb Pedagogical List (PHaVE List) was developed. It lists the 150 most frequent phrasal verbs, and provides information on their key meaning senses, which cover 75%+ of the occurrences in the Corpus of Contemporary American English. The PHaVE List gives the percentage of occurrence for each of these key meaning senses, along with definitions and example sentences written to be accessible for second language learners, in the style of the General Service List (West, 1953). A users’ manual is also provided, indicating how to use the list appropriately.


I Introduction
There are several reasons why English phrasal verbs 1 (PVs) are important to learn. The first is that they have been found to be very frequent in language use. For example, based on a corpus search of the British National Corpus (BNC), Gardner and Davies (2007) estimate that learners will encounter, on average, one PV in every 150 words of English they are exposed to. Biber, Johansson, Leech, Conrad, and Finegan (1999) estimate that PVs occur almost 2000 times per million words. Furthermore, PVs carry a large number of meanings and functions. Gardner and Davies (2007) found that the most frequent English PVs had 5.6 meaning senses on average. These meaning senses often cannot be conveyed by a single word equivalent, or may carry connotations that their single word equivalent does not have (Cornell, 1985). More importantly, using PVs is crucial to fluent English and to sounding native-like. Because PVs are widely used in spoken informal discourse, failure to use PVs in such situations is likely to make language sound unnatural and non-idiomatic (Siyanova & Schmitt, 2007). However, PVs may be seen as an unnatural construction by some learners whose first language (L1) lacks such a structure. Their syntactic peculiarity (some PVs allow for particle movement, others do not) and semantic complexity (some PVs have meanings that are highly idiomatic and opaque) make them particularly difficult to learn and prone to avoidance (Dagut & Laufer, 1985; Hulstijn & Marchena, 1989; Laufer & Eliasson, 1993). Finally, they are composed of two or more orthographic words, which means that instead of recognizing them as single semantic units, unaware learners may attempt to decode the meanings of their individual components, and therefore misinterpret them. In short, PVs are both very important and very difficult to learn. This makes it all the more necessary to include them in the curriculum.
However, as with individual words, the decision of which items to include must be made, often based on frequency criteria. Two corpus studies (Gardner & Davies, 2007; Liu, 2011) have established lists of the most frequent PVs in English, thereby identifying the most useful items to be taught. However, no information other than PV frequency and ranking order was provided, which makes these two lists inadequate for teachers and learners. The lack of semantic information, especially in the case of polysemous 2 items, means that teachers and learners are left to make their own judgements as to which meaning senses should be taught or learned. As a consequence, as both Gardner and Davies (2007) and Liu (2011) point out in their recommendations, research is needed to determine the most frequent meaning senses of these most frequent PVs. Just as priority should be given to the PVs that occur most frequently in language, priority should also be given to those meaning senses that occur most frequently for any individual PV. Therefore, the purpose of the present study is to narrow the scope of meaning senses of the most frequent PVs to be acquired, based on the frequency of occurrence in a large representative corpus of English (the Corpus of Contemporary American English, or COCA). This resulted in the creation of the PHrasal VErb Pedagogical List (the PHaVE List). 3 This article presents the PHaVE List, reports its development, and discusses its pedagogical potential to inform English language teaching, material development, and testing.

II Phrasal verb frequency lists
1 The rationale behind frequency lists
Whilst many English language teachers and researchers now recognize the importance of multiword knowledge in developing learners' proficiency (Moon, 1997; Wray, 2002; Schmitt, 2004), one of the main problems that teachers have to face is deciding which multiword items should be included in a syllabus and taught to learners. A frequency count appears to be the most sensible parameter to consider in making this decision (Liu, 2011), and indeed is consistent with the idea that language teaching should reflect authentic language use. In addition, actual frequency of occurrence is a more reliable indicator of usefulness than pure intuition (Hunston, 2002; Schmitt, 2010). Estimates of the number of PVs in English vary. For instance, according to McCarthy and O'Dell (2004), there are more than 5000 PVs and related noun and adjective forms currently in use in English. According to Gardner and Davies (2007), there are a total of 12,508 PV lemmas in the BNC. Both figures are substantial and far too large to teach in full, clearly indicating the need to establish frequency lists of PVs in order to help teachers make an informed choice in their pedagogical selection. This was pointed out as early as 1985 by Cornell, who speculated that without any attempt to select PVs, 'their discovery may be uncomfortably similar, from the learner's point of view, to the opening of Pandora's box' (p. 277); hence the need for selection and gradation prior to teaching, 'even at the risk of controversial inclusions and omissions'. Before the first attempt at a PV frequency list was made, teachers were left with little but their own intuition to select the few PVs to be dealt with in the classroom. However, as Darwin and Gray (1999) point out, their intuitions may not be correct. One corpus-based frequency study of English PVs was carried out by Biber et al. (1999). However, due to the limited number of PVs they addressed (31), it will not be discussed here. Instead, we will focus our attention on two more recent and comprehensive corpus-based frequency studies of PVs.
2 Gardner and Davies' (2007) frequency list
Gardner and Davies (2007) carried out a BNC search consisting of queries to identify every instance where a lexical verb was followed by an adverbial particle, with varying degrees of adjacency between the two. The outcomes were lemmatized so that all inflectional forms of the same verb were counted together (e.g. pick, picked, picking). Strikingly, they found that the top 20 lexical verbs found in PV constructions (e.g. go, look) account for 53.7% of all PVs in the BNC. Furthermore, these 20 lexical verbs, combined with only eight particles (out, up, on, back, down, in, over, and off), account for more than half (50.4%) of the PVs in the BNC. Looking at individual PV lemmas (e.g. pick up, go on), they found that only 25 make up nearly one-third of all PV occurrences in the corpus, and 100 make up more than one-half (51.4%). Although Gardner and Davies' main purpose was to investigate some characteristics of PVs, the inventory of the most frequent 100 PVs they studied could be considered a worthwhile list of high-frequency (and therefore useful) PVs in English. However, as noted by Liu (2011) and the authors themselves, the final 'list' has several shortcomings. Two stand out: (1) it contains only PVs formed from the top 20 PV-producing lexical verbs, thus potentially excluding other highly frequent PVs, and (2) these PVs may not be as frequent in varieties of English other than British English, given that the BNC was the only data source.

3 Liu's (2011) frequency list
Liu examined all the PVs already included in Biber et al.'s (1999) and Gardner and Davies' (2007) lists, noting a high degree of overlap between the two lists, with only four of Biber et al.'s 31 PVs not in Gardner and Davies' list of 100 PVs. In addition to searching the 104 combined PVs in the COCA, he queried the COCA and the BNC for the other most common PVs, using four recent comprehensive PV dictionaries as a search list guide. The total search was 8847 PVs (5933 extracted from the dictionaries, and 2914 extracted as a 'by-product' of his own query method). The criterion for inclusion in his list was 10 tokens per million words, a threshold he justified on three grounds. The resulting list comprised the PVs from the two earlier lists (104), plus an additional 48 PVs. Liu notes that whilst these 152 most frequent PVs comprise only 1.2% of the total 12,508 PV lemmas in the BNC, they cover 62.95% of all the total 512,305 PV occurrences, which 'helps demonstrate the representativeness and hence the usefulness of these most frequently used PVs' (p. 668). He also notes that the most common PVs appear to be rather similar between American and British English. Despite the fact that the BNC and COCA cover different time periods (from the 1980s to 1993 and 1990 to the present, respectively), no substantial difference was found between the two corpora, suggesting that PV use has remained relatively stable over the past decades and may remain so over the next ones. Because he combined look around with look round and turn around with turn round (the different forms being simply a result of usage variation), the number of PVs in Liu's list falls from 152 to a final total of 150.

III Phrasal verbs and polysemy
1 How polysemous are the most frequent phrasal verbs?
One particularly interesting finding in Gardner and Davies' study is that PVs are highly polysemous lexical items, with the PVs on their list having 5.6 meaning senses on average. This means that, in reality, the learning load of PVs is probably greater than that of most other words or word combinations in English. This 5.6 meaning sense average figure suggests that mastering the most frequent PVs in English does not entail knowing only 100 or 150 form-meaning links, but between 560 and 840. However, while PVs are undoubtedly highly polysemous, there are reasons to question Gardner and Davies' exact polysemy figures. First, WordNet, the electronic database used by Gardner and Davies to recognize distinctions between different meaning senses of the same word forms, seems to yield redundant meaning senses (i.e. what constitutes a single meaning sense comes up as two different entries). A quick search using only one example given by Gardner and Davies, put out, is enough to illustrate this (the seventh and eighth meaning senses are the same baseball sporting term):
• S: (v) trouble, put out, inconvenience, disoblige, discommode, incommode, bother (to cause inconvenience or discomfort to) 'Sorry to trouble you, but…'
• S: (v) put out (put out considerable effort) 'He put out the same for seven managers'
• S: (v) smother, put out (deprive of the oxygen necessary for combustion) 'smother fires'
• S: (v) stretch out, put out, extend, hold out, stretch forth (thrust or extend out) 'He held out his hand'; 'point a finger'; 'extend a hand'; 'the bee exserted its sting'
• S: (v) douse, put out (put out, as of a candle or a light) 'Douse the lights'
• S: (v) put out (be sexually active) 'She is supposed to put out'
• S: (v) put out, retire (cause to be out on a fielding play)
• S: (v) put out (retire) 'he was put out at third base on a long throw from left field'
• S: (v) publish, bring out, put out, issue, release (prepare and issue for public distribution or sale) 'publish a magazine or newspaper'
• S: (v) anesthetize, anaesthetize, anesthetise, anaesthetise, put under, put out (administer an anesthetic drug to) 'The patient must be anesthetized before the operation'; 'anesthetize the gum before extracting the teeth'
Second, it also seems to omit some important meaning senses. For instance, look up yields only one meaning sense ('seek information from'), ignoring 'raise one's eyes' (as in he looked up from his book) and 'improve' (as in things are looking up). Therefore, although WordNet may be used as a tool to discover the various meaning senses of a word or word combination, it certainly cannot be used as the sole data source for PV meaning sense counts. This points out the limits of electronic databases: a manual count, although undoubtedly more time-consuming, would have yielded a more accurate number. However, to our knowledge, this 5.6 figure is the only estimate of the number of meaning senses of the most frequent PVs currently available in the literature. Another figure could be obtained by counting the number of meaning sense entries of the most frequent PVs in dictionaries. However, as we will explain below, such a procedure may also yield inconsistent numbers.

2 Polysemy in dictionaries
The primary purpose of dictionaries is to provide exhaustive information by including all the meaning senses associated with a particular form that learners are likely to encounter. In concrete terms, this means that dictionaries may contain quite a large number of meaning senses for each PV. As we discovered, this is especially the case in PV dictionaries. For instance, go on has 22 meaning sense entries in the Collins COBUILD Phrasal Verbs Dictionary (3rd edition, 2012):
3. To go on means to happen.
4. If you go on to do something, you do it after you have finished something else.
5. If you go on, you continue to the next part or stage of something.
6. If you go on in a particular direction, you continue to travel or move in that direction.
7. If you go on, you go to another place, having visited a first place.
8. You say that land or a road goes on for a particular distance, when you are talking about how big or long it is.
9. If a period of time goes on, it passes.
10. If someone goes on, they continue talking.
11. If someone goes on, they continue talking to you about the same thing, often in an annoying way.
12. You say Go on to someone to encourage them to do something.
13. You say Go on to someone to show that you do not believe what they have said.
14. You say Go on to someone to agree to something they suggest.
15. If you go on something that you have noticed or heard, you base an opinion or judgment on it.
16. If a light, machine, or other device goes on, it begins operating.
17. If an object goes on, it fits onto or around another object.
18. If something, especially money, goes on something else, it is spent or used on that thing.
19. When an actor or actress goes on, they walk onto a stage.
20. If you go on a drug, you start taking it.
21. If you say that someone is going on a particular age, you mean that they are nearly that age.
22. If you are gone on someone, you are in love with them.
As we can see, the Collins COBUILD dictionary covers a very large range of meaning senses, some of which seem to overlap to various degrees. The resulting effect, while relatively comprehensive, seems to be counter-productive from a pedagogical perspective: learners may easily feel overwhelmed by the amount of information included within a single entry. They may struggle to find the information they need.
Furthermore, there appears to be a clear lack of consistency between some of the most established English dictionaries or lexical databases. For instance, give out has six meaning senses in the Collins COBUILD, the first being 'if you give out a large number of things, you give them to a lot of people'; three meaning senses on Oxford Dictionaries online (British and World English), the first being 'be completely used up'; four meaning senses on WordNet, the first being 'give off, send forth, or discharge; as of light, heat, or radiation, vapor, etc'; and one meaning sense on Cambridge Dictionaries online (British English), being 'if a machine or part of your body gives out, it stops working'. This example illustrates the fact that not only do dictionaries differ in the number of meaning senses they present, but also in the order in which they present them.
Thus, dictionaries (paper and online versions) and online databases have the following shortcomings:
1. They may contain an overwhelming amount of information under each PV entry.
2. They may exclude important meaning senses.
3. They are not consistent in the way they present meaning senses, which makes it difficult for teachers and learners to decide which meaning senses should be prioritized for teaching and learning.
This suggests that whilst dictionaries may be good as reference sources, they are clearly limited for pedagogical purposes. Teachers and learners need a more pedagogically-oriented source of reference that will be helpful to them in two ways: by containing a more condensed amount of information, and by providing the right type of information (i.e. the meaning senses that occur the most frequently).
In conclusion, corpus-based frequency studies of PVs have found that a restricted number of PVs account for a large proportion of all PV occurrences in English. This is good news because it suggests that teaching and learning only these most frequent PVs, besides being more manageable than teaching and learning all the PVs, is highly profitable. However, as dictionaries and lexical databases show, many of these most frequent PVs have multiple meaning senses. Because dictionaries and lexical databases appear to be inadequate tools as far as decisions about which meaning senses to teach/learn are concerned, the need for a pedagogical list of PVs, based on frequency criteria, is now evident. The following section deals with the methodology adopted to develop such a list.

1 Choosing the items
The PVs analysed in this study are those included on Liu's (2011) list of the 150 most frequently used PVs in American and British English, which is to date the most recent corpus study investigating PV frequency. The list contains all the items previously identified by Biber et al. (1999) and Gardner and Davies (2007), with an additional 48 items extracted by Liu from the COCA using statistical procedures such as the chi-square and dispersion tests. Liu presents it as 'a comprehensive list of the most common PVs in American and British English, one that complements those offered by the two previous studies with more necessary items and more detailed usage information' (p. 661). The list has the advantage of including items that have been identified and extracted by three different studies involving different procedures and corpora, which increases our confidence that those items which made the final list are indeed the most frequent PVs in English. Two different corpora (BNC and COCA) including a wide range of genres and registers were analysed by Liu, and thus two different varieties of English, making the list useful to learners of British English as well as to learners of American English. It could be argued that, considering the huge number of PVs in English (see above), including only 150 PVs is not enough and more items should be added. However, we decided to limit our pedagogical list to 150 items for two reasons. The first is that, as seen previously, these 150 most frequent PVs already cover 62.95% of all the total 512,305 PV occurrences in the BNC. This suggests that learning only these PVs (at least in the first instance) is highly efficient and beneficial. In making his list, Liu searched a total of 8847 PVs, which is a very substantial number; among these, only the final 150 had at least 10 tokens per million words in either the COCA or the BNC, which suggests that the rest of the PVs may be simply too infrequent to be worth including on the list. 
The second reason is that the pedagogical purpose of the PHaVE List is paramount. Therefore, one point we had to constantly keep in mind was to make our list as practical and usable for practitioners as possible. For this reason, it could not be too long. As Liu points out, this is a prerequisite for a frequency list to be 'truly meaningful' (2011, p. 667). It is worth noting that the final PHaVE List contains 38 pages, which might already be considered at the limits of practicality.
2 What information to give?
a Meaning senses. After choosing the items, the next step was deciding what type of information should be included on the PHaVE List. Since the process of learning a word usually starts with establishing its form-meaning link (Schmitt, 2010), the most obvious type of information to include was meaning. Moreover, as Cornell (1985) interestingly points out, many PVs have no exact single word equivalent because they carry connotations that their single word equivalents do not have. We have thus sought to mention these connotations in our definitions whenever applicable, since knowing a word is not only knowing its form-meaning relationship, but also being aware of its connotations, semantic restrictions and prosody (Schmitt, 2010).
We have already discussed our main purpose for creating the PHaVE List, which is to reduce the total number of meaning senses to be acquired to a manageable number based on frequency criteria. Therefore, a decision had to be made as to which meaning senses were frequent enough to be included in our list and which meaning senses were not. Although this entailed that the meaning senses included in our list did not account for all PV occurrences in the corpus and in day-to-day English usage, the assumption was that they should account for a large majority of occurrences. Conversely, those not included in our list should only represent a very small fraction of the combined occurrences, making them unsuitable for inclusion in the sense that the effort undertaken to learn them would yield rather little benefit in comparison to learning their more frequent counterparts. Keeping this cost-benefit equilibrium in mind, some form of compromise had to be found between including enough meaning senses in our list for it to provide an adequate coverage of PV occurrences, and keeping it concise enough for it to be manageable for practitioners. Indeed, enumerating five or six different meaning senses for each item would make the PHaVE List of little added value compared to dictionaries, whose aim is to provide exhaustive information. In comparison, the PHaVE List aims to provide teachers and learners with only the most essential information that should be targeted for explicit teaching/learning.
b Meaning sense frequency percentages. In concrete terms, this need for compromise translated into having to decide on a coverage percentage that would determine inclusion or non-inclusion of meaning senses in our list, i.e. all meaning senses needed to reach this percentage in order to be included. For instance, let us take the PV show up with the following meaning sense distribution: Meaning Sense 1: 81%; Meaning Sense 2: 16.5%; Meaning Sense 3: 2.5%.
It appears that very little coverage is gained from the last two meaning senses in comparison to the first one, representing by itself a coverage of 81%. However, for the sake of consistency, a similar coverage threshold needed to be used for all the items. After careful examination of the data yielded by the corpus search, we settled upon a threshold of 75% as optimal, i.e. the meaning senses included in the PHaVE List for each item should account for at least 75% of all occurrences of this PV in our corpus search. Although it can be argued that the remaining uncovered 25% (one-fourth) is not a negligible proportion of the total, the underlying rationale of the PHaVE List to reduce overall meaning senses to a manageable number drove this decision.
However, in numerous cases, the primary meaning sense did not reach 75% coverage. Therefore, in addition to this 'upper-end' threshold, the need for a 'lower-end' threshold became progressively evident as we collected the data. This is because many meaning senses represent such a small proportion of the total that they are not worth including in the list. We therefore set the lower threshold as 10% for a meaning sense to be included in the list, i.e. all the meaning senses included in the PHaVE List account for at least 10% (one-tenth) of a PV's total occurrences in our corpus search. Indeed, it seems sensible that those meaning senses accounting for less than 10% of coverage are not worth prioritizing for explicit attention. This means that if the 75% threshold was not reached by the primary meaning sense, additional senses were included if they added at least 10% coverage. This continued until the 75% total coverage threshold was reached, or until meaning senses with at least 10% coverage were exhausted. In order to provide teachers and learners with an idea of the relative importance of the meaning senses for each PV, the allocated meaning sense percentages were included next to each definition, e.g. 'Make an appearance at a social or professional gathering (81%)'. This idea of including a percentage number for each meaning sense was inspired by the General Service List (GSL) compiled by West (1953), a list which has had a wide influence for many years in the field of ESL/EFL. The GSL contained 2000 headwords considered to be of the greatest general service to learners of English, listed alphabetically with brief definitions and example sentences. A frequency number was given for each headword, and a percentage number was given for each meaning sense, representing the relative frequency of that meaning sense in the total number of occurrences of the word. Below is an example (1953, p. 12):

AGREE, v. 672
(1) (consent) He agreed to give it a trial He was asked to do it, and he agreed 20%
(2) (concur in an opinion, be of one mind) He agreed that it should be given a trial He agreed with Jones on (as to, about) the proposed new building; _in opposing the plan 65%
(3) (be in harmony) Birds in their nests agree The figures don't agree 13%
c Example sentences. Based on the pedagogical purpose of the list, we decided that each meaning sense definition reported in the PHaVE List would be illustrated by an example sentence (e.g. 'She didn't show up at the meeting'). Example sentences are widely used in English learners' dictionaries as they are believed to strongly facilitate comprehension of the definitions. They are also used in the GSL (see above). They are usually considered very helpful because they 'perform a useful backup to the explicit grammatical designation, in clarifying in real language data what is stated abstractly and generally' (Jackson, 1985, p. 58). We created each example sentence ourselves in order to avoid possible copyright issues that could arise from using extracts from the COCA. Nevertheless, many were modelled on sentences from various sources found on the internet as well as from the COCA itself, with the aim of producing sentences that were as natural and authentic as possible. Finally, the example sentences were entered into the 'Vocabprofile' section of the Compleat Lexical Tutor (Cobb, n.d.) in order to make sure that they did not contain highly infrequent words likely to be unknown to learners.
d Ordering. Finally, the ordering of the items was the same as the ordering used in Liu's list, i.e. by frequency order. This is because such an ordering could allow users to instantly see which PVs are the most frequent among the listed PVs. Likewise, the ordering of each PV's meaning senses is based on frequency ranking. A list of items by alphabetical order (Appendix 1) and another by frequency order (Appendix 2), as well as the full PHaVE List and a Users' Manual, are provided in the Supplementary Materials to be found on the journal website. This allows users to access the list via both frequency and alphabetical orders.
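The inclusion rule set out under 'Meaning sense frequency percentages' (a 75% cumulative coverage target combined with a 10% floor for each meaning sense) can be illustrated with a minimal Python sketch. The function name `select_senses`, its parameters, and the second distribution are hypothetical and are not part of the PHaVE List materials. The sketch also assumes that the 10% floor applies to every meaning sense, including the primary one.

```python
def select_senses(percentages, coverage_target=75.0, sense_floor=10.0):
    """Return the sense percentages kept under the two-threshold rule.

    Senses are considered from most to least frequent. A sense is kept only
    if it reaches the per-sense floor, and selection stops once the senses
    already kept reach the coverage target.
    """
    kept = []
    covered = 0.0
    for pct in sorted(percentages, reverse=True):
        if covered >= coverage_target or pct < sense_floor:
            break
        kept.append(pct)
        covered += pct
    return kept


# Distribution reported in the text for "show up".
print(select_senses([81.0, 16.5, 2.5]))
# Hypothetical distribution in which the primary sense falls short of 75%.
print(select_senses([50.0, 30.0, 12.0, 8.0]))
```

Under these assumptions, show up keeps only its primary meaning sense, whereas the hypothetical item keeps its first two senses, because together they already exceed the 75% target.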
3 Sources
a Dictionaries. Prior to the corpus search, a preliminary list of the different meaning senses of each PV was made, using a wide range of well-known and established English dictionaries (in print and online) and one lexical database. These were:
It is worth noting that the level of specificity at which these dictionaries distinguished between meaning senses could vary to a large extent. For instance, phrasal verb dictionaries tend to make much more refined distinctions than general dictionaries, and thus include many more entries under each phrasal verb. Therefore, we attempted to synthesize the information we found in all these dictionaries, in order to reach a level of specificity that best captured the level adopted by the majority.
We worded our definitions with the goal of encapsulating the various instances of meaning senses in the corpus as closely as possible. We also had in mind the purpose of the study and the potential users of the list, and so we made an effort to keep them relatively concise and simple. All in all, each definition on our list can be considered as a synthesis of the various definitions we found in dictionaries, adjusted to what we found in the corpus.
b The corpus. The corpus chosen for the purposes of the present study was the COCA (Davies, 2008), described as follows on the COCA homepage:
The Corpus of Contemporary American English (COCA) is the largest freely-available corpus of English, and the only large and balanced corpus of American English. The corpus was created by Mark Davies of Brigham Young University, and it is used by tens of thousands of users every month (linguists, teachers, translators, and other researchers). COCA is also related to other large corpora that we have created. The corpus contains more than 450 million words of text and is equally divided among spoken, fiction, popular magazines, newspapers, and academic texts. It includes 20 million words each year from 1990-2012 and the corpus is also updated regularly (the most recent texts are from summer 2012). Because of its design, it is perhaps the only corpus of English that is suitable for looking at current, ongoing changes in the language. (April 2014)
The COCA thus offers the four following advantages: it is very large, it is balanced across several genres and discourse types, it is regularly updated, and it is freely accessible. Aside from these advantages, the COCA was used by Liu (2011) to establish his list of the 150 most frequent English PVs (our reference list), which made it an obvious choice to also use in our study. All five sections (spoken, fiction, popular magazines, newspapers, academic texts) of the COCA were considered and given equal weight in the process of calculating meaning sense frequency percentages. The main reason for this choice was that the purpose of the study was to provide a list which would be useful to a wide range of learners from various backgrounds and interests, with various types of exposure to English. Just as in the GSL, the PHaVE List aims to be of general usefulness for people using English for a variety of reasons and through exposure to various media. The reported frequency counts should be able to reflect meaning sense frequencies from natural exposure to English through various sources. Although isolating the academic section could potentially have provided university students or lecturers with more relevant information than combining all sections, the fact that PVs largely and predominantly occur outside academic texts (Liu, 2011) makes the creation of an academic meaning sense list of little value.

Corpus analysis procedure
As Liu (2011) rightly points out, querying for PVs in a corpus is a challenging task. The first step is to enter the lexical verb in square brackets into the COCA interface, so as to yield the tokens of the various forms of the verb (for instance, make/makes/making/made for the lemma make). However, taking the PV go in as an example, simply entering the lexical verb lemma in the form of [verb] plus its particle (i.e. [go] in) could generate tokens that are not actually PVs. For instance, 'we went there in March' contains [go] + in, but the combination does not work as a PV, since in works as a preposition in the time adverbial phrase 'in March', and not as an adverbial particle (AVP) of go. The simple procedure to avoid such tokens is to enter the verb lemma in the form of [verb] in the WORD(S) box, and then the particle followed by .[RP*] (i.e. AVP.[RP*]) in the COLLOCATES box below, so as to yield adverbial particles only (RP being the search code for adverbial particles in the COCA). For instance, the search code for the PV go in would be:
WORD(S) [go]
COLLOCATES in.[RP*]
Another issue to consider was the number of intervening words between the lexical verb and the adverbial particle. Since Gardner and Davies (2007) and Liu (2011) limited their searches to PVs separated by a maximum of two intervening words (e.g. turn the company around), we applied the same limit to our own search. As Gardner and Davies (2007, pp. 344-345) note, PVs separated by three or more intervening words are rare and a search for them will yield 'many false PVs'. It is worth mentioning that despite all these search tools, each PV entry produced a small number of false tokens and errors, which were discarded.
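The logic of this search can be illustrated with a minimal sketch in Python. It does not query the COCA; it only mimics the two constraints described above (the particle must be tagged RP, and at most two words may intervene) on hand-tagged token lists. The function name find_pv, the (word, lemma, tag) token format, and the tags assigned to the example sentences are all assumptions made for illustration.

```python
def find_pv(tokens, verb_lemma, particle, max_intervening=2):
    """Return (verb_index, particle_index) pairs for candidate PV tokens.

    tokens: list of (word, lemma, tag) triples; the tags are hand-assigned
    here for illustration and are not taken from the COCA itself.
    """
    hits = []
    for i, (_, lemma, tag) in enumerate(tokens):
        # Only lexical verb forms of the target lemma can head the PV.
        if lemma != verb_lemma or not tag.startswith("V"):
            continue
        # The particle may follow directly or after up to two words.
        window = tokens[i + 1 : i + 2 + max_intervening]
        for offset, (word, _, w_tag) in enumerate(window):
            # Keep the match only if the word is tagged as an
            # adverbial particle (RP), not as a preposition.
            if word.lower() == particle and w_tag == "RP":
                hits.append((i, i + 1 + offset))
                break
    return hits


# 'in' is a preposition here, so no PV should be found.
not_pv = [("we", "we", "PP"), ("went", "go", "VVD"), ("there", "there", "RL"),
          ("in", "in", "II"), ("March", "March", "NP")]

# 'in' is an adverbial particle directly after the verb.
adjacent = [("she", "she", "PP"), ("went", "go", "VVD"), ("in", "in", "RP"),
            ("quietly", "quietly", "RR")]

# Two intervening words between verb and particle.
separated = [("they", "they", "PP"), ("turned", "turn", "VVD"),
             ("the", "the", "AT"), ("company", "company", "NN1"),
             ("around", "around", "RP")]

print(find_pv(not_pv, "go", "in"))
print(find_pv(adjacent, "go", "in"))
print(find_pv(separated, "turn", "around"))
```

Even with such constraints, tagging errors in a real corpus mean that some false tokens slip through, which is why manual inspection of the concordance lines remained necessary.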
For each of the 150 PVs analysed in this study, a random sample of 100 concordance lines was examined by the first author. The randomized sample included concordance lines extracted from various genres and years, drawing from the entire corpus. As it can be reasonably argued that a single sample of 100 concordance lines is not large enough to allow for reliable meaning sense frequency percentages, a second random sample of 100 concordance lines was analysed to confirm the results. Percentages obtained in the first sample were compared to those obtained in the second sample. This enabled us to see how reliable the initial percentages were, and to obtain more representative final percentages by averaging the two. As it transpired, there was almost always a very strong degree of similarity between the two random samples. The difference between percentages very seldom went beyond 10 percentage points, and in most cases was within five percentage points. The ranking order of the meaning senses between samples was almost always the same. In the rare exceptions, the difference in distribution between two meaning senses was so small that even a small increase or decrease in percentages could reverse the ranking order. Overall, this consistency gives us confidence that the average percentages included in the PHaVE List reflect a true picture of the meaning sense occurrences in the COCA.
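The arithmetic of the two-sample procedure can be sketched as follows in Python. The sense labels and the per-sample counts below are invented for illustration (they are not the study's raw data), and the function names are hypothetical; only the procedure itself (percentages per sample, comparison, averaging) follows the description above.

```python
from collections import Counter


def sense_percentages(labels):
    """Percentage of concordance lines assigned to each meaning sense."""
    counts = Counter(labels)
    return {sense: 100.0 * n / len(labels) for sense, n in counts.items()}


def average_samples(pct_a, pct_b):
    """Final percentage per sense: the mean of the two samples."""
    senses = set(pct_a) | set(pct_b)
    return {s: (pct_a.get(s, 0.0) + pct_b.get(s, 0.0)) / 2 for s in senses}


def max_discrepancy(pct_a, pct_b):
    """Largest gap between the samples, in percentage points."""
    senses = set(pct_a) | set(pct_b)
    return max(abs(pct_a.get(s, 0.0) - pct_b.get(s, 0.0)) for s in senses)


# Hypothetical sense labels for two random samples of 100 lines each.
sample_1 = (["plan"] * 35 + ["exercise"] * 22 + ["develop"] * 14
            + ["succeed"] * 12 + ["other"] * 17)
sample_2 = (["plan"] * 31 + ["exercise"] * 24 + ["develop"] * 16
            + ["succeed"] * 13 + ["other"] * 16)

pct_1 = sense_percentages(sample_1)
pct_2 = sense_percentages(sample_2)
final = average_samples(pct_1, pct_2)
gap = max_discrepancy(pct_1, pct_2)

for sense, pct in sorted(final.items(), key=lambda kv: -kv[1]):
    print(f"{sense}: {pct}%")
print(f"largest gap between samples: {gap} percentage points")
```

Averaging two samples of 100 lines explains why some final percentages end in .5: a sense that occurs 12 times in one sample and 13 times in the other is reported as 12.5%.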

Inter-rater reliability
Another step taken to increase confidence in the final percentages was an inter-rater reliability check on a small sample of five PVs from our list. These were selected across the list by a ranking criterion: the 10th, the 20th, the 30th, the 40th, and the 50th most frequent English PVs in Liu's list (2011): grow up, look up, stand up, turn around, move on. All these items were independently searched and analysed by a 24-year-old educated native speaker of English, currently doing a PhD in Mathematics. Prior to his corpus search, we gave him instructions on how to use the COCA, what to query, and what information to look for. We deliberately gave him no instructions as to how meaning sense groupings should be made or how to differentiate between two meaning senses, so that he would not be influenced by the first author's judgements. After an initial trial, he indicated that he was very comfortable with the procedure. This procedure was exactly the same as the one undertaken by the first author: the same search codes were used, and two random samples of 100 concordance lines were analysed. Percentages were compared and similarity of judgements was assessed. Table 1 shows the first author's and the second rater's percentages for the nine meaning senses found for all five PVs.
As we can see, the percentages of the six meaning senses for grow up, look up, stand up, and turn around are very similar, with a maximum discrepancy of three percentage points. Similarly, the percentages for Meaning Sense 1 ('start doing or discussing something new (job, activity, etc.)') and Meaning Sense 3 ('forget about a difficult experience and move forward mentally/emotionally') for move on are very close, together making up about two-thirds of the total occurrences. The one meaning sense with a larger discrepancy was Meaning Sense 2 ('leave a place and go somewhere else'), with 28% vs. 18.5%. This was partly caused by Rater 2 grouping this and other similar (but less frequent) meaning senses differently from the first author. This shows that even with a careful manual analysis, it is sometimes difficult to differentiate between overlapping meaning senses. However, the big picture is that the two raters identified the same meaning senses. What really matters for a pedagogical list is agreement on which meaning senses should be presented as the most important and frequent, even if the percentages of occurrence are not exactly the same. Also, the discrepancy concerned a secondary meaning sense (Meaning Sense 2) making up only around one-quarter of the occurrences; for the vast majority of the occurrences (around two-thirds), there was close agreement. The inter-rater reliability data thus proved satisfactory in these terms, and indicate that the PHaVE List offers useful information about meaning sense percentages that does not hinge on one individual's subjective judgements.
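The two kinds of agreement distinguished here (same meaning senses identified vs. same percentages) can be made explicit with a small Python sketch. Only the 28% and 18.5% figures for Meaning Sense 2 of move on come from the text, and their assignment to a particular rater is an assumption; the values for Meaning Senses 1 and 3 are placeholders chosen to sum to roughly two-thirds, and the five-point tolerance and function name are hypothetical.

```python
def compare_raters(rater_a, rater_b, tolerance=5.0):
    """Compare two raters' percentages for the meaning senses of one PV.

    Returns whether both raters identified the same set of senses, and
    the senses whose percentages differ by more than `tolerance` points.
    """
    same_senses = set(rater_a) == set(rater_b)
    shared = set(rater_a) & set(rater_b)
    flagged = {
        sense: abs(rater_a[sense] - rater_b[sense])
        for sense in shared
        if abs(rater_a[sense] - rater_b[sense]) > tolerance
    }
    return same_senses, flagged


# move on: sense 2 figures are from the text (rater assignment assumed);
# senses 1 and 3 are placeholder values, not the published Table 1 data.
first_author = {1: 40.0, 2: 28.0, 3: 24.0}
second_rater = {1: 42.0, 2: 18.5, 3: 25.0}

same_senses, flagged = compare_raters(first_author, second_rater)
print("same senses identified:", same_senses)
print("discrepancies above tolerance:", flagged)
```

Under these assumed values, the raters agree on the inventory of senses while only Meaning Sense 2 exceeds the tolerance, which mirrors the pattern reported above.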

The PHaVE List: A sample
The main result of this study, and indeed its 'end-product', is the PHaVE List itself. Therefore, we will first illustrate the list with an extracted sample of PVs with one, two, three, and four meaning senses. The complete PHaVE List and Users' Manual can be found in the Supplementary Materials section of the journal's website.

LOOK UP
1. Raise one's eyes (88%)
He looked up from his book and shook his head.

TAKE OFF
3. Leave the ground and rise into the air (14%)
The plane took off at 7am.

WORK OUT
1. Plan, devise or think about STH carefully or in detail (33%)
We still need to work out the details of the procedure.
2. Exercise in order to improve health or strength (23%)
He works out at the gym 5 times a week.
3. (+ well/badly) Happen or develop in a particular way (15%)
Everything worked out well in the end.
4. Prove to be successful (12.5%)
Despite our efforts, it just didn't work out.
We can see that the PHaVE List is presented in an obvious and consistent format, with a clear frequency ordering of the PVs and of the meaning senses. Some PVs have literal meaning senses (look up), others have figurative meaning senses (show up), yet others have both (take off). The meaning sense percentages are indicated next to the definitions. The PVs contained in the example sentences are in bold and underlined to make them maximally noticeable. The example sentences provide a clear context and help disambiguate the definitions. In some cases, connotations are included in the definitions (e.g. 'Leave or depart, esp. suddenly or hastily'). In other cases, semantic preferences (e.g. 'Remove STH (esp. piece of clothing or jewellery from one's body)') or collocations (e.g. '(+ well/badly) Happen or develop in a particular way') are included. The second meaning sense of stand up is also interesting because it shows that a PV can be part of a larger phrase or chunk (stand up and say STH) with a very specific meaning associated with it ('Make public knowledge a privately held position'). This pattern ([PV] and do/say STH) was found in three other cases in the study: go out, come out and sit back. Although they were frequent enough to be included on the PHaVE List, these meaning senses were for the most part missing from dictionary entries and WordNet. This suggests that corpora remain the best tools for uncovering language patterns, especially colloquial and situation-specific ones.

Meaning sense distribution
Based on our upper- and lower-threshold criteria, the total number of meaning senses included in the PHaVE List is 288. This is a far more manageable number than the totals which could be derived from Gardner and Davies' and Liu's

Applications
Just like any existing frequency list, the PHaVE List has a number of practical applications. For language teaching practitioners (e.g. teachers, syllabus designers, materials writers, and testers), the PHaVE List provides one means of handling a difficult aspect of one of the most challenging features of the English language: polysemy. Because many PVs are polysemous and may have up to 10 or 15 meaning senses, it is impossible to deal with all of them in the classroom or in textbooks. Therefore, the list offers the possibility of prioritizing their most frequent, and thus most important meaning senses, thereby allowing for a more systematic approach to tackling polysemous PVs. It is hoped that the PHaVE List will contribute to a more principled integration of PVs into language instruction and syllabi. In addition, the PHaVE List can provide useful information for testing and assessment purposes. There may be uncertainty with polysemous items about which meaning senses should be tested. The list presents meaning sense frequency percentages and ranking orders, allowing test-makers to make informed decisions as to which meaning senses should be tested, depending on language proficiency level. It is worth pointing out that the list does not imply that infrequent PVs and meaning senses should be completely discarded and are not worth learning. They should also be given explicit attention, but at much later stages of second language (L2) learning. Importantly, it should be borne in mind that the senses/uses of the phrasal verbs in the list vary in semantic transparency, and that teachers may want to take this into account in their cost-benefit analysis: the less transparent, abstract senses of the listed phrasal verbs probably require more investment of teaching time than the more transparent, concrete senses. In other words, factors other than frequency/utility can inform pedagogic decisions as to where learners need help.
In order to provide practitioners with a summary of the most essential information they want to know about the list, a PHaVE List Users' Manual can be found along with the list itself in the Supplementary Materials section of the journal website. Because we anticipate possible misunderstandings and misuses, the manual also serves as a means to establish what the PHaVE List is and what it is not, and how it might be used appropriately. The PHRASE List Users' Guide by Martinez and Schmitt (2012) was used as a model for this purpose.

Limitations
Because the meaning sense frequency percentages were derived from a corpus, it is unlikely that they are 100% reflective of all language use and individual language exposure. They are inherently an artefact of the various texts which the corpus contains. The PHaVE List is derived from the COCA, which has many advantages: it is very large, it is very recent and regularly updated, and it is balanced across several genres and discourse types. However, it mostly reflects American English. The most common meaning sense found for a particular PV may be different in other varieties of English, although Liu (2011) found that there was not much difference between the PVs in American and British English. In addition, because the COCA combines several sources (popular magazines, newspapers, academic texts, TV broadcasts, etc.), the list may not reflect individual experiences and exposure types. For instance, someone who uses English mainly for reading finance newspapers may not find the list very reflective of their own use.
Furthermore, the meaning sense percentages should be seen as estimates, and not as fixed, exact absolutes. Using a different corpus, or making somewhat different judgements about how to group overlapping meaning senses, may lead to slightly different meaning sense percentages. Nevertheless, the meaning senses identified and their rank ordering can be used with confidence.
Overall, users should remain aware of the fact that the PHaVE List aims to be of general service and usefulness. It is precisely for this reason, however, that it should prove useful to a wide range of English language teaching professionals and students.

VI Conclusions and possibilities for future research
In conclusion, this study shows that the vast majority of the most frequent PVs in English are polysemous, and that, on average, around two meaning senses account for at least 75% of all the occurrences of a single PV in the COCA. This suggests that although PVs may have a lot of meaning senses, only a limited number of meaning senses is usually enough to cover the majority of all their occurrences. This is good news for both learners and teachers. The fact that PVs are polysemous is clearly not a new finding, but this study shows just how pervasive polysemy is among the most frequent PVs in English. Despite this, as Gardner and Davies (2007) have already pointed out, it is surprising to find so many empirical studies on PVs that make no distinction between frequency of word form and frequency of word meaning.
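The coverage idea behind this finding can be illustrated with a short Python sketch that counts how many meaning senses, taken in descending order of frequency, are needed to reach a target coverage. The percentages are those shown for WORK OUT and LOOK UP in the sample above; the simple cumulative rule and the function name are assumptions for illustration and do not reproduce the study's full upper- and lower-threshold criteria.

```python
def senses_to_reach(percentages, target=75.0):
    """Number of most frequent senses needed to reach `target` coverage.

    Returns (number_of_senses, cumulative_percentage), or None if the
    listed senses never reach the target.
    """
    total = 0.0
    for n, pct in enumerate(sorted(percentages, reverse=True), start=1):
        total += pct
        if total >= target:
            return n, total
    return None


work_out = [33.0, 23.0, 15.0, 12.5]  # percentages from the sample entry
look_up = [88.0]                     # single listed sense in the sample

print(senses_to_reach(work_out))
print(senses_to_reach(look_up))
```

For WORK OUT, the first three senses cover 71% of occurrences, so a fourth is needed to pass 75%; for LOOK UP, a single sense suffices.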
Possible avenues for future research are manifold. For instance, previous research has investigated learners' knowledge of PVs (Schmitt & Redwood, 2011), but we know nothing about how well they know the different meaning senses of polysemous PVs. Is this knowledge likely to be determined by meaning sense frequency? Does it match the PHaVE List's percentages? In addition to this, our meaning sense frequency information could also be used to determine the effect of meaning sense frequency in PV processing for both native and non-native speakers.