Bringing CASE in from the Cold: the Teaching and Learning of Thinking

Thinking Science is a 2-year program of professional development for teachers and thinking lessons for students in junior high school science classes. This paper presents research on the effects of Thinking Science on students’ levels of cognition in Australia. The research is timely, with a general capability focused on critical thinking in the newly implemented F-10 curriculum in Australia. The design of the research was a quasi-experiment with pre- and post-intervention cognitive tests conducted with participating students (n = 655) from nine cohorts in seven high schools. Findings showed significant cognitive gains compared with an age-matched control group over the length of the program. Noteworthy is a correlation between baseline cognitive score and school Index of Community Socio-Educational Advantage (ICSEA). We argue that the teaching of thinking be brought into the mainstream arena of educational discourse and that the principles from evidence-based programs such as Thinking Science be universally adopted.

problematic, particularly when curriculum documents specify them as core to teaching and learning programs. Familiar to many teachers and educators is Bloom's Taxonomy of thinking skills, which suggests a hierarchy of thinking patterns, from remembering through to synthesising, evaluating and creating. Practitioners may recognise that, according to Bloom's Taxonomy, more difficult questions for students tend to be those requiring explanations, understanding and application of concepts rather than recall. Demands for students to be able to demonstrate a deep understanding of science subjects have led to the call for "less about what and more about how" (Leyser 2014, p. 45), which resonates with the need for classroom teachers to stimulate in students the higher levels of thinking indicated by Bloom's Taxonomy.
The focus of this paper is the impact on Australian students of a cognitive acceleration or thinking program involving both a teacher professional learning program and a classroom intervention. Over 2 years, the professional learning was targeted at school teachers of science to develop their theoretical understanding and pedagogy in teaching thinking skills to their students. We used the Cognitive Acceleration through Science Education (CASE) program that was originally developed at King's College, London, in the United Kingdom (UK) and published commercially as Thinking Science (Adey et al. 2001). The Thinking Science intervention has accumulated significant evidence of effects, both on students' cognitive development and school achievement over the last three decades (for example, Adey and Shayer 1990; Babai and Levit-Dori 2009; Endler and Bond 2000; Oliver et al. 2012; Shayer 1999). The findings have shown that it is possible to improve high school students' achievement in science, with evidence of long-term and far-transfer effects (Shayer et al. 1981; Shayer 2000a, b).
While the Thinking Science program was developed some time ago, general support for and knowledge of the importance of developing thinking skills in students remains high in England. For example, in a recent review of the English national curriculum, the Department for Education acknowledges that "improving students' thinking and reasoning skills is of high interest to teachers" (Department for Education 2012, para 1). In Australia, "critical and creative thinking" is a cross-curricular general capability in the newly implemented F-10 national curriculum (Australian Curriculum, Assessment and Reporting Authority 2012). The Australian Curriculum clearly states that the development of thinking skills, together with the imparting of knowledge, are the primary purposes of education and that critical and creative thinking are embedded across all learning areas. However, in Australia, there are few professional learning programs for teachers to support their implementation of this new cross-curricular general capability, and there is uncertainty as to what is meant by critical and creative thinking. Some schools have used this opportunity to implement "brain-based" programs in order to develop thinking in students in the absence of evidence (see, for example, Stephenson's 2009 commentary on Brain Gym®). In the research presented in this paper, we focus on the teaching of critical thinking and the development of students' cognitive skills, not creative thinking skills.
The purpose of this paper is threefold: (1) to describe the implementation of the Cognitive Acceleration through Science Education (CASE) or Thinking Science program in Australia; (2) to detail the relationship between the cognitive levels of students and the school's Index of Community Socio-Educational Advantage (ICSEA); and (3) to present data showing the impact of the program on students' cognitive development.

The Challenges for Thinking Programs
Prevalent in educational institutions are a number of myths regarding classroom-based thinking programs, activities and approaches that are supposedly related to research on the brain (Adey 2012; OECD 2007). For example, teachers and curriculum materials often arrange lessons around different learning styles that students might have, including visual, auditory or kinaesthetic; or around multiple intelligences, including logical-mathematical, spatial, linguistic, musical or interpersonal intelligences. Even when a person's preferred learning style is used, there is no evidence of educational improvement (Pashler et al. 2008). Mainstream psychology has consistently provided considerably more evidence to support a high correlation between different aspects of intelligence, or a general intelligence quotient, g, rather than multiple intelligences (Visser et al. 2006). By contrast, thinking programs that include the development of metacognition in students have been shown to be effective in raising student achievement (Higgins et al. 2005, 2007; McGuinness 1999).
Despite our concerns about teachers' use of classroom pedagogies for which there is little evidence (Stephenson 2009), detailed analyses of the large body of literature in the field of education indicate that a limited number of programs do improve students' thinking and the performance of students on cognitive and curriculum-based tests (Higgins et al. 2007). One of these programs is the Philosophy for Children (P4C) program developed in the USA by Matthew Lipman (1976), which engages children in philosophical inquiry in a collaborative manner to ensure the development and growth of "reasonableness". By reasonableness, Vansieleghem and Kennedy (2011) claim that the "emphasis is on analytical reasoning as a guarantee for critical thinking" (p. 177). The P4C program requires students to participate in non-judgmental dialogue, thinking, listening and reflecting, activities that are quite different from the passive listening and copying of notes that often result from a traditional didactic approach to teaching and learning.
An example of a program involving the stimulation of cognition at the tertiary level that is supported by published evidence was introduced by the Physics Nobel Laureate, Carl Wieman. Wieman criticises teaching and learning that is dominated by the memorisation of facts and information and suggests teachers address key pedagogical strategies: "reducing cognitive load … addressing beliefs and stimulating and guiding thinking" (Wieman 2007, p. 13). Large effect sizes were reported when comparisons were made between student learning outcomes from a traditional lecture and a teaching and learning program grounded in Wieman's application of cognitive psychology and physics education. The conclusion that "deliberate practice teaching strategies can improve both learning and engagement in a large introductory physics course" (Deslauriers et al. 2011, p. 864) augurs well for improving learning at the tertiary level.

The Theory and Pedagogy of the Thinking Science Program
The Thinking Science intervention used in this research involves 30 "thinking" lessons delivered over 2 years, usually about one every 2 weeks during school term. In the UK, the program is implemented in year 7 and year 8, the first 2 years of secondary school, when students are between 11 and 13 years of age. Each thinking lesson focuses on a specific reasoning pattern (or schema), including controlling variables, ratio and proportionality, compensation and equilibrium, correlation, probability, classification, formal models of thinking and compound variables. Groups of lessons spiral through increasing levels of complexity related to the reasoning patterns.
The theoretical framework underpinning Thinking Science was strongly influenced by the developmental psychology of Piaget (Shayer 2003) and the socio-cultural psychology of Vygotsky (Moll 1990). While both schools of psychology informed and grounded the development of the curriculum materials and methodology of the cognitive acceleration programs, much has been learned about learning since the 1930s when Piaget and Vygotsky conducted their research. For example, Vygotsky's work gave us the notion of the Zone of Proximal Development, ZPD, and the need to be teaching "ahead of development" of the child (Vygotsky 1986, p. 188), which is critical to Thinking Science pedagogy. Through the professional development, teachers learn to become aware of their students' levels of cognition so they can pitch classroom activities beyond students' current thinking capabilities, stimulating new ways of thinking without going too far beyond what students are capable of in a given context. Moreover, Piaget's descriptions of cognitive development as the stages in the formation of intelligence, and the reasoning patterns underpinning scientific thinking (Piaget 1950; Shayer 2003), provide the basis on which the 30 Thinking Science lessons spiral through increasing layers of complexity, as well as the specific scientific nature of the problems in each of the Thinking Science classroom activities.
Extrapolating from the underpinning theory briefly described above, Thinking Science lessons each have five central stages or pillars: (1) concrete preparation, (2) cognitive conflict, (3) social construction, (4) metacognition and (5) bridging (Shayer 2003). Concrete preparation involves the teacher describing the problem, setting the scene and clarifying the vocabulary relevant to the thinking lesson. For example, in a lesson exploring the relationship between the variables of electric current and thickness of wire, some exploratory "talk" about what is meant by current helps focus the students' thinking on what to measure rather than on the nature of an electric current. Data are collected during this phase, and students and teachers often refer to this as the "doing" part of the lesson.
Cognitive conflict is a deliberately introduced, non-intuitive element of the lesson that is surprising for the students because it does not make sense when they use their current thinking patterns to try to understand the phenomenon. Imagine a beaker full of water. Students all agree that the beaker is full, but when asked what might happen if anything else (for example, salt) were added to the beaker, most elementary and even high school students generally predict that the water will overflow. By gradually sprinkling salt into the beaker of water, the teacher shows that this does not happen, even after several more pinches of salt are added. This sort of event, the "cognitive conflict", triggers a response, causes disquiet or "cognitive dissonance" in students and serves to engage the minds of students to make sense of the experience. In this case, students need to think and talk about an abstract idea or model for explaining the phenomenon. Cognitive conflict is considered the driver of cognitive growth because a mental struggle is required for the students to move beyond their current ways of thinking. For example, one activity in Thinking Science initially helps students to establish a relationship between two variables and then presents them with data where no relationship can be identified. Student cognition is stimulated by this moderately difficult intellectual challenge, which is accompanied by group questioning, discussion and problem solving drawing on the Piagetian idea of equilibration and the Vygotskian idea of a ZPD.
Social construction occurs as students work together in small groups in an attempt to solve the challenge then share the development of ideas and explanations in a whole class discussion. We use the term social construction to describe periods of focused small group activity in the classroom, where students construct, share, develop and discuss meaning(s). Importantly, language is used to mediate meaning and make sense of the activities and problems presented throughout the lesson. The large, whole class setting is used for all groups to listen, contribute their group's ideas and for individuals to refine and develop their own understanding. Teachers are pivotal in facilitating the whole class discussion, asking for contributions from all groups. At various points throughout the lesson, teachers ask specific metacognitive questions to develop students' abilities to reflect on their own and each other's thinking.
Metacognition involves students becoming aware of how they, and others, were thinking when they discussed and/or solved the problem, and of what they have learned that differs from what they understood and could do prior to the lesson. Used in cognitive acceleration programs, metacognition brings "thinking about thinking" (Frith 2012) into the classroom discourse and requires time during the lessons for teachers to draw out students' problem-solving strategies and have them reflect on their errors as well as on altered or new thinking patterns. It includes developing knowledge about when and how to use different strategies in problem solving, with teachers asking explicit questions about how students plan, monitor and evaluate their thinking strategies. For classroom purposes, Larkin (2006) suggested that it is helpful in "being able to reflect on one's own thinking" (p. 7), and teachers play an active role in helping students to develop this skill.
Finally, the bridging or transfer part of a Thinking Science lesson is used by teachers to relate the reasoning pattern to an everyday science lesson, or real life. For example, having worked through the lessons on probability in Thinking Science, teachers might discuss with students the probability of getting lung cancer from smoking, or they might actively transfer the thinking patterns learnt to genetics when students are solving Mendelian genetics problems that require an understanding of probability.
Sometimes the pillars of cognitive acceleration are discernible as discrete and sequential within a particular lesson, although frequently they are highly integrated. Anecdotal evidence suggests that as teachers become skilled at using the pillars, they adopt them in their regular science lessons and provide opportunities for students to draw on the problem-solving strategies and ways of thinking developed during the Thinking Science lessons. Metacognition and the transfer of metacognitive skills to other lessons have been identified as "two of the most significant concepts in the field of teaching thinking" (Leat and Lin 2003, p. 386).

The Impact of Thinking Science on Students' Cognitive Development
In the original trial and experimentation with the CASE intervention in the UK, students in CASE schools achieved statistically significantly higher results than their peers in control schools in the British General Certificate of Secondary Education (GCSE), the national examination taken when students are 16 years of age, 3 years after the intervention. Moreover, the statistically significant finding was found not only in the science subject area, but also in mathematics and in English language. The improved student achievement in subjects other than science has been attributed to CASE having an effect on general intellectual growth, or perhaps "a fundamental effect on students' general ability to learn, and that they can then turn this generally enhanced learning ability to bear on all school subjects" (Shayer 2000a, p. 9), as well as on science-related thinking skills (Adey and Shayer 1994). Improved cognitive ability was evident across the full ability range, with independent meta-analyses and reviews supporting these findings (Higgins et al. 2005; McGuinness 1999). Summaries can be found in Shayer and Adey (2002) and Shayer (1999).
Due to the reported impact of cognitive acceleration on student cognition and achievement in science, programs have been developed in other subject areas, including mathematics (Adhami 2007, 2010), technology (Backwell and Hamaker 2004) and the arts (Gouge and Yates 2002). Moreover, a series of Let's Think! programmes based on the same theory and pillars have been developed for primary school-aged children (e.g., Adey et al. 2002; Venville et al. 2003). The collection of cognitive acceleration programs has been reported in a meta-analysis to show a mean effect size of 0.61 (Topping 2004, in Higgins et al. 2005). Cognitive acceleration programmes have also been successfully adapted to educational contexts in countries outside the UK including China (Hu et al. 2011), Malawi (Mbano 2003), Finland (Hautamäki et al. 2002), Oregon (USA) (Endler and Bond 2000), Pakistan (Iqbal and Shayer 2000) and Ireland (Gallagher 2008; McCormack 2009). In a trial in Israel, a compacted intervention using a small number of the CASE lessons was effective in promoting year 9 students' "reasoning abilities and attainment in science, particularly in regard to the control of variables" (Babai and Levit-Dori 2009, p. 445). The hypothesis that intelligence is modifiable and can be "enhanced by appropriate curriculum intervention" (Oliver et al. 2012, p. 212) resonates with findings about neuroplasticity and learning.

Purpose and Research Questions
In 2010, the authors initiated a medium-scale cognitive acceleration intervention in Australia using the Thinking Science professional learning materials and classroom "thinking" lessons from the UK. The intervention involved 6 days of out-of-class professional learning with participating teachers, together with in-class observation and feedback. Due to the school structure in the Australian state in which the research was conducted, the Thinking Science lessons were implemented with students when they were in years 8 and 9 (12 to 14 years of age), compared with the typical years 7 and 8 in the UK, when students are about 6 months younger. The thinking lessons were incorporated alongside the standard curriculum, with students participating in a thinking lesson about every 2 weeks as a replacement of a regular science lesson over the 2-year period of year 8 and year 9.
The purpose of the research presented in this paper was to determine the effect on participating high school students of implementing the Cognitive Acceleration through Science Education (CASE) or Thinking Science program in the educational context of Australia. More specifically, the research question was as follows: What was the effect of the cognitive acceleration program on participating students' cognitive development over the 2-year program? To inform the potential expansion of the intervention within Australia, we also were interested in how the program impacted students in different schools, the general range of cognitive development evident in Australian school students, and the degree to which students' cognitive development correlated with the socio-educational status of their school.

Research Design and Methods
The design of this research was a quasi-experiment with 62 teachers and 655 students from seven high schools, including nine cohorts of students participating in the Thinking Science intervention and 120 students forming the comparison group. Mixed methods of data collection were used including cognitive testing of students prior to and after the Thinking Science intervention, and qualitative surveys and focus group interviews with teachers participating in the Thinking Science intervention. Data from the interviews are not presented here.

Participants
Data were collected in seven high schools whose administration and science teachers volunteered to participate in the Thinking Science intervention. The data collection involved 62 teachers and 655 students when they were in year 8 and year 9 (ages 12-14) over the period when Thinking Science was implemented in their science lessons. The schools included one small rural school and one regional school, with the remaining schools located in a state capital city. Five schools were government funded, and two were private schools. One of the government schools was an academically selective school. Table 1 provides an overview of the participating schools.
Australian schools are identified with a value of the Index of Community Socio-Educational Advantage (ICSEA) developed by the Australian Curriculum, Assessment and Reporting Authority (ACARA). Variables used to determine the ICSEA are derived from the Australian Bureau of Statistics (ABS) and include location of the school (rural, regional or metropolitan), parental education, occupation and income, proportion of students with languages other than English and proportion of Indigenous students. In determining the ICSEA, information about students' family background is used to identify variables which have "the strongest association with student performance" (http://www.acara.edu.au/verve/_resources/Fact_Sheet_-_About_ICSEA.pdf) as measured by national tests at years 3, 5, 7, and 9. The average ICSEA value is 1000, and the standard deviation is 100 points. Schools' ICSEA values are reported publicly on the Australian Government My School Web site (www.myschool.edu.au) and are subject to small changes in value reflecting changes in the school population from year to year. Additionally, the Web site provides detail of the distribution of students within each school in four quartiles of relative advantage/disadvantage. This provides more nuanced information about the students' family backgrounds with respect to their socio-educational status. The participating schools are representative of a range of ICSEA values, as shown in Table 1.
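Because ICSEA is a standardised scale (mean 1000, standard deviation 100), a school's value can be placed relative to the national distribution with simple arithmetic. The sketch below is illustrative only and assumes an approximately normal distribution of ICSEA values across schools, which ACARA's documentation does not guarantee:

```python
from statistics import NormalDist

# ICSEA is scaled to a mean of 1000 and a standard deviation of 100 (ACARA).
ICSEA_MEAN, ICSEA_SD = 1000, 100

def icsea_percentile(value: float) -> float:
    """Approximate percentile rank of a school's ICSEA value,
    assuming an approximately normal distribution across schools."""
    z = (value - ICSEA_MEAN) / ICSEA_SD
    return NormalDist().cdf(z) * 100

# A school with ICSEA 1100 sits one standard deviation above the mean,
# i.e. at roughly the 84th percentile under the normality assumption.
print(round(icsea_percentile(1100), 1))
```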

Quantitative Measure of the Cognitive Levels of Participating Students
Piagetian Science Reasoning Tasks (SRT) were used to measure the levels of thinking, from early concrete to formal operations, in the school population. SRTs were developed to assess the non-verbal, general reasoning capability of students. The history, development, validity and reliability of these Piagetian-based and Rasch-scaled tasks have been described by Shayer et al. (1976a, b), Shayer and Wylam (1978), Shayer and Adhami (2007) and Shayer (2008). Results from these studies using the SRT detail the levels of thinking in the school-aged population and the distribution of levels of thinking at different ages, and provide a reference point for researchers and educators. The tests arose from the interviews conducted by Piaget in seeking to elicit the reasons for children thinking in a particular way and categorising their thinking patterns within a developmental or Piagetian framework. Data from the SRT have been correlated with other non-verbal reasoning tasks to establish reliability (Shayer et al. 1976) and were used to determine the effectiveness of the Thinking Science intervention in England (Adey and Shayer 1990, 1994; Shayer and Adey 1992).
The cognitive level of a sample of 10,000 students aged between 9 and 14 years was determined using the SRT (Shayer et al. 1976). From these data, early adolescence was identified as being a period of "rapid development in concrete thinking" (p. 164), with approximately 20 % of children using formal operations (Shayer et al. 1976; Andrich and Styles 1994). Other Rasch-scaled tests have been developed which both measure the thinking levels of students and correlate well with the SRT, including Bond's Logical Operations Test (BLOT) (Endler and Bond 2006) and Raven's Matrices (Styles 2008). Data from both Bond's and Shayer's work suggest there exists in schools "a broad range of cognitive development evident at average ages 13, 15 and 17 years, but that range decreased little (if at all) over the five years of high school" (Endler and Bond 2000, p. 3). More recently, students' scores on the SRT have been highly correlated with scores on the Essential Secondary Science Assessment (ESSA) test, used in the Australian state of New South Wales (Millar, pers. comm.) (see http://www.schools.nsw.edu.au/learning/7-12assessments/essa/). Raven's Matrices attempt to measure the reasoning ability component of general intelligence, g, where the task is to identify a missing element of a picture. Tests show that this sort of abstract procedural reasoning loads more highly onto measures of g than any other measurement (p=.83), including verbal reasoning, processing speed and working memory (Kaufman et al. 2011). Results on the SRT and Raven's Matrices are highly correlated, with the Raven's providing a "finer level of scale" (Styles 2008, p. 96) than the SRT, and both providing information about cognitive development using a non-verbal reasoning task.
As procedural reasoning is a factor loading onto g, so general reasoning ability is a predictor of scientific reasoning (Shayer et al. 1976;Shayer and Wylam 1978). Higher levels of reasoning are associated with higher levels of academic performance in school (Shayer 2000). Such reasoning ability may not reflect instructional quality or maturation, being neither "tied to age [nor] curriculum exposure" (Wiliam 2007, p. 5). In contrast, when science (defined in terms of knowledge) is tested, scores reflect instructional quality and opportunity to learn among other variables. Moreover, similar reasoning patterns may not always be reflected in similar patterns of knowledge content. A comparative study of college level physics students in the USA and China showed few differences in the distribution of reasoning despite quite different approaches to school education in the respective countries and very large differences in levels of content knowledge (Bao et al. 2009).
Researchers working with teachers in the study reported in this paper determined that use of the BLOT or ESSA tests as measures would exclude many students from the data collection due to the literacy demands of these tasks. By contrast, the SRT uses familiar laboratory apparatus to show students the activities of pouring water, weighing small items on a scale, using a ruler and balancing a beam, activities that can be readily demonstrated by teachers in the participating science classes. Because of the demonstrations, the literacy demands of the SRT on the students are low. To standardise the process, teachers were provided with a video and PowerPoint presentation prepared by the School of Isolated and Distance Education in Western Australia, initially for use with students in remote parts of Australia. Piagetian SRTs were used to determine students' levels of cognitive development before and after the Thinking Science intervention. The SRT (volume and heaviness) was administered to all participating year 8 students prior to the implementation of the Thinking Science program, and a different SRT (equilibrium and balance) was administered on completion of the full program at the end of the second year. Teachers administered the tests in their science classes using the available video, PowerPoint and classroom equipment. The test papers were scored independently by researchers.
Only twice-tested students from each participating school were included in the data set. All test papers were scored by two researchers, with the numeric scores resulting in very high rates of scorer agreement. Any uncertainty in the "extended answers" or explanations was discussed until a consensus was reached. For example, for one question in the baseline test, students needed to calculate a numerical score and give an explanation of that answer in order to be scored as "correct". On this item, less than 1 % of papers marked contained answers that were problematic for the scorers; in instances where classes of students had not completed one or more questions (such as omitting the last page), or where consensus could not be reached on what the students had written, the papers were not included in the comparative data set.
Published data with control and experimental groups are available for researchers to use for comparative purposes, particularly in the absence of particular populations. We drew on these data (Adey and Shayer 1990) in an earlier study to determine the effect of a pilot study with one school cohort (Oliver et al. 2012). The control data served as a comparison to gauge the effect of the intervention. The comparison data were drawn from a population of age-matched students who did not participate in the Thinking Science intervention but were twice-tested at equivalent time points at the start and end of the program. As children mature, their levels of cognition increase (Shayer et al. 1976); it was thus necessary to compare cognitive gains rather than raw scores, so that the differences observed reflect the effectiveness of the program rather than normal maturation. Cognitive gains made by the participating students were compared with those of students who did not experience the intervention using a t test of significance. To determine the effect of the intervention, effect sizes were calculated as suggested by Allen and Bennett (2008), and Cohen's d was used to indicate the magnitude of the differences in cognitive gain between the Thinking Science and comparison groups. Using Cohen's (1988) conventions as a guide, d of .20 can be considered small, d of .50 is medium and d of .80 is large.
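The effect-size calculation described above can be sketched as follows. This is an illustrative implementation of Cohen's d with a pooled standard deviation, not the study's analysis, and the gain scores are invented for demonstration:

```python
import math

def cohens_d(group1, group2):
    """Cohen's d for two independent groups, using the pooled
    standard deviation as the denominator."""
    n1, n2 = len(group1), len(group2)
    m1, m2 = sum(group1) / n1, sum(group2) / n2
    # Sample variances (Bessel-corrected)
    v1 = sum((x - m1) ** 2 for x in group1) / (n1 - 1)
    v2 = sum((x - m2) ** 2 for x in group2) / (n2 - 1)
    pooled_sd = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    return (m1 - m2) / pooled_sd

# Hypothetical cognitive-gain scores (in SRT levels), NOT the study's data:
intervention_gains = [1.2, 0.8, 1.5, 1.1, 0.9, 1.3]
comparison_gains = [0.5, 0.7, 0.4, 0.6, 0.8, 0.5]

d = cohens_d(intervention_gains, comparison_gains)
# Interpreted against Cohen's (1988) conventions:
# .20 small, .50 medium, .80 large.
```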

Findings
The findings are structured into two main sections. The first section presents findings with regard to the relationship between the cognitive levels of Australian students and the socio-educational status of their schools, as well as the range of cognition evident within a particular school at the start of the intervention. The second section presents findings related to the effect of the intervention on students' cognitive development. Figure 1 presents the data from the participating schools with the mean baseline score for the year 8 students and the schools' Index of Community Socio-Educational Advantage (ICSEA). These data are taken from the large data set of year 8 student tests collected at the start of the intervention. The correlation between the students' levels of thinking and the school ICSEA value was positive (r=0.71).
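The school-level correlation reported above is a Pearson product-moment coefficient. A minimal sketch of the computation is below; the (ICSEA, baseline score) pairs are hypothetical stand-ins, not the study's data, which reported r = 0.71:

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient for paired data."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical school-level pairs (ICSEA value, mean baseline SRT score):
icsea = [950, 980, 1000, 1030, 1060, 1100, 1150]
baseline = [4.1, 4.3, 4.6, 4.5, 5.0, 5.2, 5.6]

r = pearson_r(icsea, baseline)  # positive: higher ICSEA, higher baseline
```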

Cognitive Levels of Students in Australian Schools
Students' levels of thinking were determined using a Piagetian SRT. Figure 2 shows the range of levels of cognitive development within one cohort of year 8 students at one school (school 5). Table 2 presents the data on cognitive gains from students in each of the nine cohorts in the seven participating schools and the comparison sample as reported by Adey and Shayer (1990). A total of 654 students were twice tested from the initial schools' sample of more than 1200 year 8 students. These students started at a lower mean cognitive level compared with the comparison population, but made greater cognitive gains over the intervention period.

The Effect of the Intervention on Australian Students' Cognitive Development
The mean gains made in each cohort, and overall, are significant at the .05 level when compared with the comparison group, with one exception (school 1, cohort 1b). The overall mean effect size of 0.56, relative to the gain made by the comparison group, falls within what Hattie (2009) described as "worthwhile" and is comparable to the gains reported in an earlier pilot case study (Oliver et al. 2012). The Thinking Science intervention had a differential impact on students from different school cohorts, with effect sizes ranging from 0.2 to 0.995 (Table 2). The smallest effect was found with cohort 1b in school 1, a small rural school, and the largest in school 7, an academically selective school.

Discussion
The findings (Fig. 1) show that participating Australian students' levels of thinking were closely correlated with their schools' Index of Community Socio-Educational Advantage (ICSEA), and we speculate that this finding reflects a degree of social inequity. While we return to ICSEA later in this discussion, it is not within the scope of this study to explore the conundrum of inequity in depth; rather, we present the data as one variable that may enable intervention programs such as Thinking Science to be successfully implemented, sustained and developed in schools. The overall effect size of 0.56 for the quasi-experiment in this Australian study (Table 2) indicates that the Thinking Science intervention, based on CASE professional development and classroom thinking activities, was broadly successful. Very few studies report effect sizes of this magnitude (see Hu et al. 2011, who reported increasing effect sizes over the duration of a "Learn to Think" intervention program). A meta-analysis undertaken at the Evidence for Policy and Practice Information and Co-ordinating Centre (EPPI-Centre) at the Institute of Education, University College London, concluded that the cognitive acceleration family of interventions had clear benefits (Higgins et al. 2004). The findings presented in this paper are also consistent with similar research conducted elsewhere in the world: the same program was shown to be effective in raising students' levels of cognition and scholastic performance (e.g. Endler and Bond 2000; Hautamäki et al. 2002).
We speculate that key to the success of Thinking Science are the cognitive conflicts set within a specific reasoning pattern, the pedagogy that drives the discussion of ideas in student groups, and metacognition. Used together, these instructional strategies have the capacity to improve students' reasoning ability. The results of the pilot study conducted in one school in Australia and reported earlier (Oliver et al. 2012) demonstrated that participating students' achievement in science between years 7 and 9 improved more than that of other students in the state of Western Australia, as measured by the state-wide Monitoring Standards in Education tests (WAMSE, see http://www.scsa.wa.edu.au/internet/Years_K10/WAMSE).
Improving the thinking of teenagers has consequences for their performance in school and beyond, in terms of equity, economics and life course (OECD 2010). The teenage years are of particular interest to educators as they include the second period of considerable intellectual growth (Andrich and Styles 1994; Styles 2008), a finding more recently confirmed using brain imaging by Dosenbach et al. (2010) and Ramsden et al. (2011). It is from adolescence that the development of formal operations is manifested in reasoning. The goal of CASE, through its rich pedagogy, is to develop formal operational thinking in all students regardless of their maturation or schooling. We are encouraged by the findings of this study because they demonstrate the effects of teaching thinking to this age group on students who show varying degrees of aptitude for, and attitude towards, their learning: students are not set on a specific intellectual trajectory. The results of this study show that interventions like Thinking Science and the P4C programs can make a difference to students' thinking skills and cognitive capacity, and subsequently to their scholastic achievement and attitudes (Trickey and Topping 2006; Topping and Trickey 2007).
It is interesting to note that students in schools considered more "disadvantaged", as indicated by the ICSEA in this data set, do on average make gains compared with the comparison group, but not as great as those of students in more "advantaged" schools (Table 2). While specific data are not reported here, a positive "teacher effect" was identified in the data in a low SES school, where some teachers showed greater fidelity to the program. Such fidelity was determined through researcher visits to the school during the provision of in-school professional development and the use of rubrics during classroom observations to document teacher and student activity during the lessons (developed by Lecky, pers. comm.). This latter instrument enabled teachers to develop classroom coaching skills as they assumed greater responsibility for the program in school, becoming more expert teachers and sharing the leadership of the professional development.
Other studies on cognitive acceleration interventions have shown the effects of individual teachers on students' thinking (Venville et al. 2003), a finding consistent with the non-homogeneous impact across the schools participating in this study. Indeed, teachers exert considerable effect on students' learning and gains in achievement (Taylor et al. 2010). Understanding the impact of high-quality teaching is a likely driver of policy development and the monitoring of teaching standards.
One possible explanation for the findings in this study is that the schools with higher ICSEA values had greater stability in terms of student population, staffing and participation in the professional learning opportunities. There was high attrition from the data set, with the school with the smallest gain having the greatest attrition of both students and teachers involved in the professional learning program. Conversely, the school with the greatest gain had the lowest attrition of both students and teachers. Schools experience different and changing priorities, with varying rates of student attendance and teaching staff turnover. Such factors inevitably impact on the effectiveness of an intervention program and raise questions about scalability and sustainability (see Lee and Krajcik 2012 for a discussion and overview), not to mention some of the less tractable problems of social equity, resource allocation and access to what we might call "high quality teaching".
The findings presented in this paper are, nevertheless, educationally significant: in optimal conditions, with a stable student population, high rates of school attendance and a science department that embedded the intervention practices into its teaching and learning program, the effects were clear, with students showing a large gain in their levels of reasoning compared with students in other schools. Other influences not considered in this study, such as the school environment, families, peers and other resources, may also support interventions such as this one in schools and shape their impact on individual students.
It is important to comment on the limitations of the data used for the comparison group in this study. While the comparison group was matched with the participating students for age and for the duration between the pre- and post-testing time points, it differed in time, that is, the years in which the students participated in the research, and in space, that is, the location in which they lived. Such differences in the "starting points" between the experimental and control groups have been addressed through a long-term study of the cognitive levels of children in the UK. These data show that, compared with an age-matched cohort tested 30 years earlier, fewer of today's early adolescents use formal operations than their counterparts in 1976 (Shayer and Ginsburg 2009). In contrast with the received wisdom of the Flynn effect, Shayer documented that present-day students leaving primary school are less capable of reasoning than the previous generation. Given that a large proportion of science curriculum documents internationally depend on students being able to use formal operations, for example, to conceptualise atoms and molecules, these findings do not bode well for the science education of adolescents today. The findings presented by Shayer and colleagues underscore the need for curriculum documents and educators to recognise the importance of stimulating and developing adolescent students' cognition. The case for a CASE intervention appears to be compelling.

Implications
The findings from this study show that the Thinking Science intervention, comprising a highly prescribed pedagogy for which the participating teachers were given professional development and a highly prescribed program of thinking lessons, did improve participating Australian students' levels of cognition as measured by the Piagetian reasoning tasks. The findings raise questions about the types of professional development and interventions that should be supported within schools, and the degree to which teachers should be "reined in" to specific pedagogies and scripted lessons that have demonstrated success.
There is real tension between implementing an educational intervention with fidelity (Andrews 2012) and allowing teachers the "freedom, space, and resources to create next [best] practice" (Hargreaves and Fullan 2012, p. 51). Once teachers have participated in the Thinking Science professional development, they can apply the core theoretical elements of cognitive challenge, the construction of explanations, and accounting for the thinking to any of their regular science lessons. For example, one teacher we observed working with a low literacy year 9 class introduced a lesson by asking students what they thought would happen if salt was added to a full beaker of water. Instead of using the regular classroom materials and work cards to support student learning, the teacher applied the central ideas of Thinking Science pedagogy and developed a lesson appropriate to, and challenging for, these students. He gradually sprinkled salt into the water, which acted as a hook for the lesson, and the cognitive conflict experienced enabled a rich discussion to emerge about the nature of matter and solubility. However, we do not know whether applying the theory and adapting regular lessons to include cognitive conflict, social construction and metacognition is sufficient to improve students' cognition in the same way that we know rigidly following the scripted Thinking Science lessons does.
Can we provide adequate professional learning to enable teachers to "adapt materials in ways that align to standards and support learning goals" (Penuel and Fishman 2012, p. 295), in a way that reflects the reality of the "complex interaction between the innovation content, the local working conditions and sense making by the school team" (März and Kelchtermans 2013, p. 15)? Indeed, further research is needed to explore the rationale for teachers' choices about how to develop professionally, while at the same time offering teachers professional learning opportunities in highly effective intervention programs. Given that "differences in teacher effectiveness account for a large proportion of differences in student outcomes" (Jensen and Reichl 2011, p. 6), programs that do make a "difference in educational improvement to the most disadvantaged students" (ACARA, Australian Curriculum, Assessment and Reporting Authority 2012, p. 2) need to be supported by policy makers and administrators. Universities have a role to play in disseminating evidence of best practice, supporting teacher development and informing policy direction (Connor et al. 2014; Lee and Krajcik 2012).
The findings presented in this study on the implementation of the CASE or Thinking Science intervention with nine cohorts of students in seven schools in Australia are consistent with data from the suite of cognitive acceleration programs, and together they merit the consideration and adoption of these programs in schools across many countries and cultures throughout the world (Adey 2005). We argue that the "why" of changing teaching practice needs to be at the heart of the debate about improving science education, followed by the "how" and the "what" it takes to get us there. Based on the findings of our research, we suggest that there is a moral imperative to bring CASE back from the cold and situate its theory, practice and impact in the current debate about pedagogy.

Conclusion
The overall impact of the Thinking Science intervention on the cognition of 654 students in seven high schools in Australia was positive, represented by an effect size of 0.56 when compared with a comparison group. The findings indicate that the Thinking Science intervention had a differential impact across schools, with effect sizes ranging from 0.2 to 0.995. Thinking Science and other cognitive acceleration programs have contributed to "a growing body of accumulated evidence that they are effective at improving pupils' performance on cognitive and curriculum tests when they are researched in school settings. Their effect is relatively greater than most other researched educational interventions" (Higgins et al. 2005, p. 4). Overall, the findings presented in this paper support the wider implementation of cognitive acceleration pedagogy in Australian schools to support the general capability of critical thinking in the Australian Curriculum. There is, however, tension between the need to implement an intervention such as Thinking Science with fidelity and the professional freedom of teachers.