The Causal Effect of Public Service Motivation on Ethical Behavior in the Public Sector: Evidence from a Large-Scale Survey Experiment

Public service motivation (PSM) and ethical behavior are central concerns in public administration. Yet, experimental evidence on the causes of ethical behavior and the causal effects of PSM remains scarce, curtailing our understanding of both. This article draws on a novel survey experimental design to improve this understanding. The design is based on a simple insight: asking about PSM can render salient PSM-oriented identities of respondents. By randomizing the order of PSM and outcome questions, PSM may be exogenously activated among survey respondents, and the causal effects of this activation assessed. Drawing on this design and a sample of over 5,000 Chilean central government employees (the largest experimental PSM survey sample to date), we find that PSM activation enhances willingness to report ethical problems to management. This provides the first experimental evidence that PSM may promote ethical behavioral intent, and suggests that activating public employees' PSM can benefit public sector ethics.


Introduction
Understanding how ethical decision making among public servants can be strengthened is a central concern in public administration (Menzel 2015). Unethical behavior by public servants is widespread in both developing and developed countries, according to survey evidence (e.g., Kolthoff et al. 2010; Meyer-Sahling, Schuster, and Mikkelsen 2018; OPM 2012). It can undermine trust in government and foster corruption, among many other ills (Perry 2015; Vigoda-Gadot 2007).
How can public sector institutions encourage ethical decision making among public servants? Our article looks at one potential lever: activating public servants' public service motivation (PSM), that is, their "orientation to delivering services to people with a purpose to do good for others and society" (Perry and Hondeghem 2008, vii). Public employees often have greater PSM than their private sector counterparts (see, e.g., Pedersen 2013). PSM may thus be a potent lever for public managers to harness to encourage ethical decisions and behavior. Understanding PSM more generally is also of scholarly weight. It is, as is public sector ethics, a core concern in public administration, with two recent meta-studies counting over 400 public administration studies on the two topics (Menzel 2015; Ritz, Brewer, and Neumann 2016).
Despite their centrality in public administration, however, experimental evidence on ethics and PSM remains scarce. In a recent review of 109 articles on ethics in public administration (Menzel 2015), for instance, none employed experimental designs. Although PSM has seen more experimental research (e.g., Bellé 2014; Bellé and Cantarelli 2015; Christensen et al. 2013; Clerkin et al. 2009; Esteve et al. 2015; Esteve, van Witteloostuijn, and Boyne 2015; Neumann 2016; Tepe 2016), most of this research has not sought to experimentally manipulate PSM itself, and thus to estimate its causal effect.1 To our knowledge, only three experiments which manipulate PSM have been conducted so far (Bellé 2013; Christensen and Wright 2018; Pedersen 2015). As detailed below, however, methodological concerns limit how much can be learnt from these three studies alone.
The lack of experimental evidence limits our understanding of both the causes of unethical conduct and the consequences of PSM. Experimental findings about unethical behavior in other disciplines, such as economics and management, have been strikingly different from those in observational studies in public administration (Bellé and Cantarelli 2017). The reliance on observational designs in public administration studies to date thus implies, according to a recent meta-analysis, that we lack a "solid understanding of the causal mechanisms underlying unethical behaviour in public organizations" (Bellé and Cantarelli 2017, 328).
Assessments of the validity of observational studies of PSM are similarly skeptical. Much of the PSM research endeavor is motivated by positive statistical associations between PSM and a range of favorable attitudes and behavior: performance, work motivation, job satisfaction, and organizational commitment, among many others (e.g., Alonso and Lewis 2001; Andersen and Serritzlew 2012; Brewer and Selden 2000; Kim 2005; Naff and Crum 1999). Yet, although these observational associations are suggestive, they are at risk of suffering from omitted variable biases, reverse causality, and other threats to validity (Wright and Grant 2010; see also Antonakis et al. 2010).2,3 As a result, rigorous evidence about both the causes of ethical behavior in the public sector and the causal implications of PSM remains highly circumscribed. This article seeks to address this gap. By experimentally estimating the effect of PSM on ethical behavioral intent, it can further our understanding of these two core concepts in public administration.
To assess the causal effects of PSM, the article draws on a novel survey experimental question order design. The design is based on a simple insight: asking about PSM can render salient PSM-oriented identities of respondents. By randomly assigning respondents to groups that are asked PSM questions either before or after outcome questions, PSM may be exogenously activated in respondents, and the causal effects of this activation assessed. In contrast to prior experimental research, which has largely focused on students, we undertake this experiment with public servants and one of the largest original survey samples on PSM to date: over 5,000 public employees in 11 central government institutions in Chile. Our findings provide the first experimental evidence for the importance of PSM in encouraging ethics among public servants and, to our knowledge, the first experimental evidence on the causes of ethical behavioral intent in the public sector.
To derive these findings, our article first reviews the literature on the relationship between PSM and public sector ethics and on the scarce existing experimental designs to assess causal effects of PSM. Subsequently, we derive our hypotheses and develop our experimental design to test them. Thereafter, we delineate our survey sample and data. Last, we present our results, followed by a discussion and conclusion.

PSM, Ethical Behavior, and the Reporting of Ethical Problems
The past years have seen a marked increase in studies of ethics in public administration (see Menzel 2015). Ethical behavior is commonly understood as behavior "that is subject to or judged according to generally accepted moral norms of behaviour" (Reynolds and Cerani 2007, 1610) and that thus "reach[es] or exceed[s] some minimal moral standard ..., such as being honest, obeying the law, and whistle-blowing" (Reynolds and Cerani 2007, 1610).
A subset of this literature has explored the relationship between PSM and ethical behavior and decision making, underscoring its substantive importance in the field (e.g., Brewer and Selden 1998; Caillier 2015; Kwon 2014; Lim Choi 2004; Maesschalck, van der Wal, and Huberts 2008; Stazyk and Davis 2015; Vandenabeele and Kjeldsen 2011; Wright, Hassan, and Park 2016). Yet, with the exception noted in the subsequent section, all of these studies are observational and, as detailed below for the case of ethical reporting, potentially vulnerable to biases.
1 Instead, PSM serves as a moderator, dependent variable, or nonexperimentally manipulated independent variable in most prior experimental studies.
2 This risk should not be taken lightly. As Wright and Grant (2010) noted several years ago, omitted variable, reverse causality, and common source bias all threaten the validity of inferences from partial correlations between PSM and other variables. To illustrate these potential biases with the example of PSM consequences: observational PSM studies infer effects from partially correlating a presumably favorable attitude or trait (such as PSM) with other presumably favorable attitudes, traits, or behavior (such as work motivation). Responses to both, and thus correlations between them, may be caused by social desirability and common source biases (cf. Kim and Kim 2016). Moreover, work motivation and other favorable attitudes and behavior may cause PSM on their own. Perhaps most importantly, a range of other, omitted favorable attitudes and behavior are likely to correlate with both PSM and other positive outcomes. In conjunction, these biases imply that partial correlations may not enable valid inferences about PSM (cf. Wright and Grant 2010).
3 Methods with stronger claims for causal identification in observational studies, in particular instrumental variables and regression discontinuity designs, of course exist (Antonakis et al. 2010). To our knowledge, these, however, have not been employed in studies of PSM or public sector ethics. This may not surprise: finding suitable instruments or discontinuities in PSM and public sector ethics is challenging, thus putting a premium on a research agenda that experimentally manipulates PSM to assess its causal effects.
This is, of course, not to say that prior studies do not provide helpful insights into the relationship between PSM and ethical behavior. Not least, they provide several theoretical rationales to link the four component dimensions of PSM (self-sacrifice, commitment to public values, attraction to public service, and compassion)4 to ethical behavior. These theoretical rationales also imply that we could plausibly expect PSM activation to shape ethical attitudes, decision making, and behavior.
Most intuitively, self-sacrifice offers clear and direct linkages to ethics. Unethical behavior is often driven by greed and furthering one's self-interest (Wang and Murnighan 2011). A willingness to sacrifice one's own interest thus leads to an expectation of more ethical decisions and behavior (Wright, Hassan, and Park 2016). For instance, corruption is frequently defined as the abuse of public office for private gain, leading to a clear expectation that employees willing to forego private gains for the common good are less prone to corruption (cf. Kwon 2014). Congruent with this line of reasoning, self-sacrifice is found to exert the strongest effect of all PSM dimensions on ethical conduct in at least some studies (Lim Choi 2004).
Self-sacrifice, however, is not the only PSM dimension that may be expected to affect ethical decision making favorably. "High road" ethical decision-making depends on values which are embedded in one's internal moral compass and grounded in personal integrity, reflection, and virtue (Stazyk and Davis 2015). A commitment to public values, a second important PSM dimension, is thus likely to lead to ethical behavior: employees with higher levels of PSM act more consistently with their own values when they behave ethically (Stazyk and Davis 2015).5 Next to this normative rationale, attraction to public service may foster ethical decision making. Individuals who value public service often view contributing to the common good, tackling social problems, and helping one's community as primary work rewards (Crewson 1997; Rainey 1982). As such, those attracted to public service are more likely to engage in a range of prosocial and ethical behaviors as they find them more rewarding (e.g., Brewer and Selden 1998; Stazyk and Davis 2015; Wright, Hassan, and Park 2016). In fact, experimental studies in other disciplines have shown that individuals who are less focused on their own self-interest tend to behave more honestly (e.g., Winterich, Mittal, and Morales 2014).
Last, an affective motive could be at play: compassion could curb unethical behavior. Although, to our knowledge, prior public administration studies have not explicitly theorized an affective mechanism, individuals who feel sympathetic to the welfare of others may feel emotionally more compelled to behave ethically (or at least altruistically; see, e.g., DeSteno 2015).
All four PSM dimensions, and PSM more generally, may thus be plausibly associated with ethical decision making and behavior. Our empirical expectation, which is formalized into hypotheses further below, is thus that PSM activation affects ethical behavioral intent positively. Our article focuses on one form of ethical behavioral intent in particular: the willingness of public servants to report ethical problems to management. The motivation for this choice is two-fold. First, as reflected in the recent increase in scholarly attention on the topic, ethical reporting plays an important role in the prevention of corruption and unethical behavior in public sectors (e.g., Brewer and Selden 1998; Caillier 2015; Hassan, Wright, and Yukl 2014; Kaptein et al. 2005). Huberts and De Graaf (2008, 645), for instance, found based on Dutch corruption cases that close colleagues often realize that corrupt officials are overstepping formal boundaries but, in the cases where corruption ensues, "decided not to report anything or speak to their superiors." Similarly, a national business survey in the United States found that less than 55% of employees who observed misconduct reported it to management (Ethics Resource Center 2005). Assessing the causal effect of PSM on ethical reporting intent is thus important in its own right.
Second, observational studies on the effect of PSM on ethical reporting have yielded inconclusive findings to date. Three studies found a significant positive effect (Caillier 2017; Vandenabeele and Kjeldsen 2011; Wright, Hassan, and Park 2016). A fourth study found no robust effect in a full model specification with a slightly different set of controls (Caillier 2015).
That PSM is not invariably associated with ethical reporting suggests, on the one hand, that the two are conceptually distinct. Their relationship thus cannot be taken for granted and is, hence, worth studying empirically. On the other hand, these inconsistent findings underscore the fragility of drawing inferences about the effects of PSM on ethical behavior and decision making from partial correlations. These might or might not become insignificant with additional controls (among other threats to validity). As such, prior inconsistent findings also underscore the utility of causally and experimentally identifying the effect of PSM on ethical behavioral intent.
With this in mind, the next section briefly reviews prior studies which have sought to experimentally manipulate PSM to contextualize our own experimental design.

Experimental Studies on the Causal Effects of PSM
Since its initial conceptualization by Perry and Wise (1990), research on PSM has grown exponentially to more than three hundred studies (Ritz, Brewer, and Neumann 2016). Existing quantitative studies are based overwhelmingly on partial correlations. This may not surprise. Assessing the causal effect of PSM requires random manipulation of PSM. Yet, PSM has been found to be in good part a function of antecedents which can usually not be randomly assigned, such as gender, age, education, or parental socialization (Ritz, Brewer, and Neumann 2016). In other words, many of the antecedents of PSM are not dynamic, suggesting PSM might at least be in part a relatively stable trait, settled early in life (cf. Brewer 2003; Houston 2006; Karl and Peat 2004). At the same time, however, panel studies suggest that the PSM of individuals can change over time (e.g., Kjeldsen 2013). Public sector institutions in particular can play a role in these changes, socializing public servants into (or out of) PSM (see Perry 2000; Vandenabeele 2007; Vogel and Kroll 2016). This suggests that at least part of an individual's PSM is amenable to change due to more proximate factors, and may thus be experimentally manipulated.
Three studies have sought to experimentally manipulate PSM to date (Bellé 2013; Christensen and Wright 2018; Pedersen 2015; see also Bellé 2014; Grant 2007, 2008). They offer two solutions to random assignment of PSM: PSM cultivation and PSM activation (Bellé 2013; Grant 2007, 2008; Pedersen 2015; Wright and Grant 2010). PSM cultivation refers to the fostering and increase of PSM. Experiments may, for instance, randomly assign employees to treatments that bring them into contact with program beneficiaries in a way that highlights meaningful impact and appreciation of the employees' work, or to self-persuasion interventions to commit them to public service (Bellé 2013; Christensen and Wright 2018; Grant 2007, 2008).6 By contrast, PSM activation refers to active engagement of existing levels of PSM. Activation renders salient "PSM as the motivational basis for action" (Pedersen 2015, 736).
Our own experimental design builds on the insight from these studies that PSM may be experimentally manipulated and activated through low-intensity survey treatments. At the same time, we depart from prior studies with regard to how PSM is experimentally manipulated, with a view to facilitating greater replicability and applicability.
Consider, first, the validity and applicability limitations of Pedersen's (2015) design. Pedersen (2015) shows that law students in a Danish university indicate a greater willingness to spend more time on a future university survey if the survey's purpose is public service-oriented, relative to a control group for which no survey purpose is stipulated. This effect is larger for respondents with greater PSM. Pedersen (2015) thus usefully underscores that low-intensity survey treatments may activate PSM. Generalizing from and applying this design to other contexts and outcomes is thorny, however. The study focuses on the willingness of students to fill out future surveys for a university researcher, rather than attitudes and behaviors undertaken on the job by public servants in state institutions. In addition, the design does not attempt to check for the possibility of social desirability bias (SDB). Both higher PSM scores and higher willingness to fill out future surveys in response to public service appeals may be due to SDB, not least as recent studies suggest PSM scores are prone to SDB (cf. Kim and Kim 2016). As such, the design may have simply tested the effect of activating SDB rather than PSM.7
External validity and replicability limitations also extend to Bellé's (2013) otherwise remarkable study. Bellé (2013) shows in a field experiment that nurses whose PSM was exogenously manipulated through beneficiary contact or self-persuasion perform better. These inferences, however, are based on data from 90 nurses in a single hospital in Italy, who volunteered to help with a humanitarian emergency in a former war zone. Arguably, if PSM could ever be expected to affect performance, it is among staff in social sectors who volunteer for a humanitarian emergency. Generalizability to harder cases of more ordinary public sector work and to behavior other than job performance is thus anything but certain. Moreover, Bellé's (2013) approach is not easily replicable, particularly when seeking a larger number of respondents. It requires the careful design of a field or lab experiment to randomize beneficiary contact, or a survey-based or field-based self-persuasion intervention, for which the risk of confounding (effects of self-persuasion on variables other than PSM) is likely to be high (cf. Dafoe, Zhang, and Caughey 2018).
Last, in the study which is arguably closest to our own, Christensen and Wright (2018) assess, with a group of US undergraduate students from a religious university, whether a series of prosocial primes and self-persuasion exercises inspired by Bellé (2013) and Arieli et al. (2014) enhance ethical behavior. In three separate studies, they find no effect. As the authors themselves note, however, limitations in their sample, priming intervention, and outcome measure may explain their null findings. In two of three studies, their intervention was ineffective in priming the PSM of participants, plausibly as students at a religious university already receive frequent prosocial primes in their university environment, as the sample size was small relative to the present study,8 and as the prime did not relate to specific work outcomes.
In the third study, only 18 students engaged in unethical behavior across treatment and control groups, complicating the identification of treatment effects.
In sum, existing works provide some initial evidence for a causal effect of PSM and suggest that PSM may be experimentally manipulated. In light of their methodological and external validity limitations, however, our understanding of the causal implications of PSM remains limited relative to the otherwise sizable body of works on PSM. This also holds for the relationship between PSM and ethical behavior. The next section thus develops a novel experimental design, and associated hypotheses, to manipulate PSM and assess its effect on ethical behavioral intent.

Hypotheses and Experimental Design
In our experimental design, we build on studies of PSM activation (Pedersen 2015), social identity theory (Akerlof and Kranton 2000; Brown 1986; Tajfel and Turner 2004), as well as insights from question order survey experiments in other disciplines (e.g., Cohn, Fehr, and Maréchal 2014).
Social identity theory suggests that public servants have multiple identities, based on social categories into which they classify themselves (cf. Brewer et al. 2002). Identities are tied to norms prescribing appropriate behavior. Identities shape behavior as individuals experience disutility if they deviate in their behavior from what their identities prescribe, and utility if they comply. Decision-making in this sense is identity fulfillment (cf. March and Olsen 1989). The extent to which a given identity guides behavior in a specific situation depends on the relative weight (salience) individuals attach to that identity in that moment, that is, on the extent to which a given situation renders salient (i.e., activates) each identity (Stets and Burke 2000).9 As a result, public service-oriented behavior and behavioral intent by public employees may be encouraged by rendering more salient (that is, activating) public service identities of public employees (Pedersen 2015; Vandenabeele and Perry 2008; Vandenabeele 2007).
In our experimental design, we render salient (that is, activate) PSM through a novel design, which relies on question order randomization. Question order survey experiments have made important contributions to other disciplines (e.g., Cohn, Fehr, and Maréchal 2014) and have been a staple in the survey methodology literature (see Oldendick 2008 for an overview). Yet, somewhat curiously, they have hardly been drawn on in public administration, where they have mostly focused on citizen satisfaction surveys (Andersen and Hjortskov 2016; Van de Walle and Van Ryzin 2011).
The underlying rationale of question order survey experiments is simple: "preceding questions provide the context in which the respondent answers an item, and changing this context can make a large difference in survey results" (Oldendick 2008, 2). Although survey responses can be shaped by a range of distinct question order effects (see, for instance, McFarland 1981; Strack 1992; Moore 2002), these effects typically rely on the common intuition that earlier questions prime respondents to think about the issues covered in earlier questions when answering subsequent questions.
Our particular question order design draws on this logic and a simple insight informed by social identity theory: asking about public service motivation can render salient, and thus activate, public service motivation. Its intuition is straightforward. A standard PSM battery requires respondents to answer 16 successive questions about PSM. This implies that respondents spend time answering questions concerning their commitment to public values, their willingness to make sacrifices, their compassion, and their attraction to public service.10 In other words, asking about PSM serves as a reminder to respondents that PSM-related values exist and, to the extent respondents actually hold these values, matter to respondents' identity at work (cf., e.g., Cohn, Fehr, and Maréchal 2014). This engagement with PSM may be expected to make more salient (that is, activate) the PSM-founded aspects of the respondents' identities as public officials.11
If asking about public service motivation activates public service motivation, however, then this activation may be randomly assigned, and the causal effect of this activation assessed, by randomizing whether respondents are asked PSM questions before or after outcome variables. To assess the causal effect of this activation, in the treatment condition, respondents are asked a battery of PSM questions before being asked about the outcome variable of interest, in our case their willingness to report ethical problems to management. By contrast, in the control condition, respondents are asked the PSM battery only after the outcome variable of interest. PSM is thus only activated in the treatment group, not the control group.12
With this experimental activation design, our core expectation (PSM has a positive effect on the willingness to report ethical problems to management) can be translated into two testable hypotheses:
Hypothesis 1 (H1): Activating PSM will make respondents more willing to report ethical problems to management.
Hypothesis 2 (H2): Activating PSM will have a larger effect on ethical reporting for respondents with higher levels of PSM.
The rationale for adding H2 is straightforward. Activating PSM is predicated on the assumption that respondents, in fact, possess PSM or, in social identity terms, that respondents have a public service-oriented identity that can be rendered salient (cf. Stets and Burke 2000). Where respondents have little or no PSM, there is little PSM to activate in the treatment group. The effect of activating PSM on the outcome variable of interest should thus be larger among respondents with high levels of PSM.13
The observable implications of these hypotheses are intuitive. For H1 to be true, there should be a significant difference in the outcome variable between respondents in the treatment and control group. For H2 to be true, the difference in the outcome variable between treatment and control groups should be larger for high-PSM respondents than for low-PSM respondents.
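The observable implications of the two hypotheses can be illustrated with a minimal sketch in Python. It is an illustration under stated assumptions rather than the estimation strategy reported in this article: the field names (`treated`, `psm`, `report`), the toy data, the PSM cutoff, and the use of a permutation test are all hypothetical choices made for exposition. H1 corresponds to a difference in mean reporting intent between treatment and control; H2 corresponds to that difference being larger among high-PSM than among low-PSM respondents.

```python
import random
from statistics import mean


def diff_in_means(rows, outcome="report"):
    """Treatment-minus-control difference in the mean outcome (H1)."""
    treated = [r[outcome] for r in rows if r["treated"]]
    control = [r[outcome] for r in rows if not r["treated"]]
    return mean(treated) - mean(control)


def effect_by_psm(rows, cutoff, outcome="report"):
    """Treatment effect among high- and low-PSM respondents (H2)."""
    high = [r for r in rows if r["psm"] >= cutoff]
    low = [r for r in rows if r["psm"] < cutoff]
    return diff_in_means(high, outcome), diff_in_means(low, outcome)


def permutation_p_value(rows, outcome="report", n_perm=2000, seed=0):
    """Two-sided p-value obtained by reshuffling treatment labels."""
    rng = random.Random(seed)
    observed = abs(diff_in_means(rows, outcome))
    labels = [r["treated"] for r in rows]
    values = [r[outcome] for r in rows]
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        treated = [v for v, t in zip(values, labels) if t]
        control = [v for v, t in zip(values, labels) if not t]
        if abs(mean(treated) - mean(control)) >= observed:
            extreme += 1
    return extreme / n_perm


# Hypothetical toy data: outcome on the 0-4 agreement scale, PSM as a mean score.
toy_rows = [
    {"treated": True, "psm": 3.5, "report": 4},
    {"treated": True, "psm": 1.0, "report": 2},
    {"treated": False, "psm": 3.5, "report": 3},
    {"treated": False, "psm": 1.0, "report": 2},
]

h1_effect = diff_in_means(toy_rows)
high_effect, low_effect = effect_by_psm(toy_rows, cutoff=2.0)
p_value = permutation_p_value(toy_rows)
print(h1_effect, high_effect, low_effect, p_value)
```

Splitting the sample at a cutoff is only one way to express H2; an interaction between the treatment indicator and the PSM score in a regression model would be a common alternative.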
Asking about PSM may, of course, activate not only PSM, but also SDB. In other words, being asked a series of PSM questions might make respondents respond to subsequent questions in a more socially desirable way, not least as prior list experimental research underscores the risk of SDB in the measurement of PSM (Kim and Kim 2016). To at least partially address this risk, we add a placebo question in randomized order to the outcome question, that is, a question that would be affected by SDB but could, contrary to the outcome variable of interest, not be affected by a greater salience of public service-oriented identities. Inspired by Paulhus' (1984) SDB scale, our SDB check asked respondents whether they had usually accepted constructive criticism at work in the past. Because activating PSM should not affect past experience, we should not see an effect of our treatment on this variable. However, because accepting constructive criticism (as opposed to arguing or being vengeful with colleagues) is socially desirable in most settings, we should expect our treatment to have an effect on the SDB question if increased social desirability bias is at play. As with other SDB checks, this check, of course, cannot fully rule out that our findings are driven by SDB. Instead, it only disentangles, to some extent, whether asking about PSM activates a general tendency of respondents for socially desirable answers.14
11 Our treatment is unlikely to cultivate PSM. Answering a battery of questions is unlikely to change people's work-related identities and core attitudes.
12 We randomized the order of PSM dimensions within the question order experiment to not give undue weight to any one dimension in particular in the PSM activation.
13 H2 might appear to contradict Linos' (2018) argument that PSM-oriented messages are less effective at attracting applicants to public sector institutions as potential applicants are already aware of the public service-oriented nature of public sector work. In Linos' (2018) treatment, however, PSM messages shape behavior of potential future public sector workers by informing them of salient characteristics of the organization that seeks to attract them. By contrast, our treatment does not provide new information to public employees about the public service-oriented nature of the organization they currently work for. Rather, it renders salient the PSM-founded aspects of their work-related identity. Having noted this, our treatment relates to Linos (2018) in that asking about PSM might plausibly also remind respondents of the public service-oriented nature of the organization they work for. This might strengthen perceived person-organization fit for high-PSM individuals in particular, which in turn might encourage more ethical behavior (cf.
15,1 Of the 15,706 employees in our survey frame, 5,742 employees completed the survey, yielding a response rate of 37%. Of these 5,742 respondents, 974 chose not to reply to either our reporting measure or at least one of our 16 PSM questions and were excluded from the subsequent analysis. Excluding these respondents yields 4,763 survey responses and a response rate of 30%.17,18 Our respondents are roughly representative of public employees in the eleven surveyed central government institutions in terms of gender, albeit slightly younger and with a greater share of graduates with vocational (rather than university) degrees. They are also roughly representative of employees in Chile's central government as a whole in terms of gender, albeit slightly younger (appendix 1). As detailed below in the robustness checks, our results remain significant when excluding individual subgroups, giving us no reason to believe that a fully representative sample of survey respondents would have yielded different results.
Ahead of survey authorization, the survey was presented in-person by one of the authors to the heads (or their representatives) of the participating institutions in September 2016.The leadership of all participating institutions endorsed the survey and encouraged participation from their employees.
To ensure measurement validity, the survey was extensively pre-tested prior to its implementation, including through revisions of the survey items with high-level and technical staff of Chile's Civil Service Agency and 10 face-to-face cognitive interviews with public employees in a range of institutions and levels of hierarchy to ensure the meaning of survey questions was well understood.As the PSM battery was developed in English, all survey questions were translated and subsequently back-translated between Spanish and English to avoid translation issues and ensure congruence between the meaning of translated questions in Spanish and the existing literature in English.These duties of care enhance confidence that respondents understood the items-including the PSM battery-in the intended fashion (see appendix 2 for the Spanish translation of the PSM survey items).
To measure our dependent variable, willingness to report ethical problems, we duplicate the single item used in Wright, Hassan, and Park (2016): "I feel comfortable reporting ethical problems to upper management." Answer options to this and all PSM battery items were on a five-point Likert scale (0-4), ranging from "strongly disagree" to "strongly agree." Following Wright, Hassan, and Park (2016, 652), we thus only measure the willingness to report. This focus on behavioral intent rather than (past) behavior is deliberate: activating PSM during a survey response cannot affect the frequency of prior ethical reporting.
In addition to enabling meaningful experimentation in our setting, relying on behavioral intent measures of ethical reporting follows common practice in the literature (e.g., Caillier 2012; Vandenabeele and Kjeldsen 2011; Wright, Hassan, and Park 2016; exceptions are, e.g., Brewer and Selden 1998; Caillier 2017). Nonetheless, of course, it is a limitation of our study: ethical behavioral intent and ethical behavior need not stand in a one-to-one relation. In meta-analyses, however, they are
15 The survey experiment in this article was embedded in a larger survey on civil service management and bureaucratic attitudes and behavior in Chile. The 11 institutions that participated in the survey were invited to this end by Chile's Civil Service Agency.
16 The results we report below do not take account of the resulting nested data structure. Given our experimental setup, which included simple randomization of our treatment for all respondents, our findings are not at risk of bias from differences in the institutional setting of our respondents. Consequently, we opt for simplicity in our models. Having noted this, we also ran our models using a fixed effects specification, which sustained our conclusions (see appendix 6).
17 Because H1 can be tested without relying on responses to the PSM battery, we are able to test this hypothesis using a larger sample (comprising 5,050 responses
To measure PSM, we draw on the international PSM measurement scale developed in Kim et al. (2013). Although PSM measurement is subject to an ongoing discussion (see Perry and Vandenabeele 2015), Kim et al.'s dimensions are considered the "current authority" in at least some works (Prebble 2016, 2). We thus replicate Kim et al.'s 4 dimensions and 16 items: attraction to public service (APS), commitment to public values (CPV), compassion (COM), and self-sacrifice (SS) (table 1).19
We used block order randomization on Qualtrics to randomly assign respondents to a treatment group in which the PSM battery preceded the outcome question, and a control group in which the outcome question preceded the PSM battery. Balance tests suggest that randomization in our experiment was successful in relation to observable characteristics: treatment and control groups are not significantly different in age, gender, education, and years of service in the public sector (appendix 3).
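The logic of random assignment followed by a balance test can be illustrated with a minimal sketch. It assumes simple (coin-flip) assignment and a simulated age covariate; the function names `assign_order` and `welch_t`, the seeds, and the simulated ages are hypothetical and do not reproduce the Qualtrics procedure or the tests in appendix 3.

```python
import random
import statistics
from math import sqrt


def assign_order(n_respondents: int, seed: int = 0) -> list[int]:
    """Return 1 (PSM battery first) or 0 (outcome question first) per respondent."""
    rng = random.Random(seed)
    return [rng.randint(0, 1) for _ in range(n_respondents)]


def welch_t(a: list[float], b: list[float]) -> float:
    """Welch t-statistic for the difference in means between two groups."""
    se = sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.fmean(a) - statistics.fmean(b)) / se


# Hypothetical covariate: age, simulated independently of the assignment.
rng = random.Random(1)
ages = [rng.gauss(42, 10) for _ in range(5000)]
treated = assign_order(len(ages), seed=2)

age_treated = [x for x, t in zip(ages, treated) if t == 1]
age_control = [x for x, t in zip(ages, treated) if t == 0]
t_stat = welch_t(age_treated, age_control)

print(f"n treated = {len(age_treated)}, n control = {len(age_control)}, t = {t_stat:.2f}")
```

Under successful randomization, the t-statistic for a pre-treatment covariate should be small in absolute value, which is what a balance table summarizes across covariates.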
To further enhance our confidence in the assumption that PSM is a meaningful construct in the Chilean setting, we developed a measurement model for our PSM construct.20 This model will also enable us further below to test H2.21 Table 2 shows the path coefficients and means in the measurement model.22 The model fits the data reasonably well (χ² = 1172.510 [df = 100, p < .001], comparative fit index [CFI] = 0.975, root mean square error of approximation [RMSEA] = 0.044), giving us some reassurance that PSM is a meaningful construct in Chile. To further examine the scale properties of the four PSM dimensions, we tested the scale reliability of the individual dimensions. Cronbach's alpha for each dimension was appreciably above standard benchmarks (0.87 for APS, 0.83 for CPV, 0.80 for COM, and 0.84 for SS). To test the internal discriminant validity, we compared the four-dimensional model to a one-dimensional alternative. The former performs significantly better in terms of fit (Δχ² = 607.44 [df = 4, p < .001], ΔCFI = 0.310, ΔRMSEA = 0.114). Furthermore, following Kim et al. (2013, 91), we tested the correlation between the four dimensions. These fall between 0.264 (between CPV and SS) and 0.670 (between APS and COM). In no case does the 95% confidence interval include 1.000.
Thus, the scale properties of the PSM construct show acceptable validity and reliability. This also implies that we need not make any changes to the scale developed by Kim et al. (2013). With our PSM scale validated, we proceed to estimating our results.
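For readers less familiar with scale reliability, Cronbach's alpha for a dimension with k items can be computed as k/(k-1) × (1 − sum of item variances / variance of the summed score). The sketch below applies this standard formula to simulated 0-4 responses; the data-generating values, the helper `likert`, and all variable names are assumptions for illustration, not our survey data.

```python
import random
import statistics


def cronbach_alpha(items: list[list[float]]) -> float:
    """items[i][j] is respondent i's answer to item j of one dimension."""
    k = len(items[0])
    item_vars = [statistics.variance([row[j] for row in items]) for j in range(k)]
    total_var = statistics.variance([sum(row) for row in items])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def likert(x: float) -> int:
    """Round to the nearest point of the 0-4 answer scale."""
    return min(4, max(0, round(x)))


# Simulated four-item dimension: one shared latent trait plus item-specific noise.
rng = random.Random(0)
data = []
for _ in range(2000):
    trait = rng.gauss(3.0, 0.8)
    data.append([likert(trait + rng.gauss(0.0, 0.6)) for _ in range(4)])

alpha = cronbach_alpha(data)
print(f"Cronbach's alpha = {alpha:.2f}")
```

Alpha approaches 1 as items share more variance; four identical items yield exactly 1.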

Results
Congruent with prior studies, PSM and willingness to report ethical problems are significantly correlated in our survey at the 0.001 level (r = 0.12). To assess whether this association is causal, we turn to our survey experiment.
With our randomized treatment, we can assess our two core hypotheses. Our first hypothesis suggests that activating PSM will make respondents more willing to report ethical problems to management. If this were true, we should observe an average treatment effect (ATE) of our question order experiment. Figure 1 shows that this is, in fact, the case. The figure shows estimates from an ordinary least squares (OLS) model regressing willingness to report on our PSM battery treatment (est. = 0.200, p two-sided < .001; see model 1 in table A4.1 in appendix 4 for further details). Respondents whose PSM is activated prior to answering how willing they would be to report ethical problems to management have a significantly higher willingness to report. In other words, just being reminded of PSM values through a PSM battery increases our respondents' average willingness to report ethical problems to management. This effect of activating PSM is also substantively relevant: the willingness to report ethical problems shifts upwards by 0.200 points on a 0-4 scale.23
Our second hypothesis posited that activating PSM will have a larger effect on ethical reporting for respondents with higher levels of PSM. The intuition for this was straightforward: if PSM activation is causally related to willingness to report, we would expect respondents with higher levels of PSM to be more affected by our treatment. By contrast, we would expect a smaller, or perhaps no, treatment effect on respondents with lower levels of PSM because the activation of identities that are only weakly developed in respondents should have weaker implications for behavioral intent.24
19 Note that Kim et al.'s (2013) APS scale differs from Perry's (1996) attraction to policy making, as well as other scales in the literature, in order to strengthen discriminant validity vis-a-vis CPV.
20 This is necessary not least as applying the PSM construct outside North America (where it was developed) has occasionally proven to be difficult (e.g., Kim et al. 2013).
21 All analyses for the measurement model were conducted using the lavaan package for R (Rosseel 2012). Because our variables are, strictly speaking, ordinal, and because some variables show signs of skew, we estimated our measurement model using robust diagonally weighted least squares. Using ordinary maximum likelihood estimation does not qualitatively change our results.
22 We give each dimension scale by fixing one path coefficient to one. All path coefficients are significant at a 0.001 level. The model is fitted to 4,768 of our 5,742 respondents using listwise deletion.
To test this hypothesis, we estimated nonlinear treatment effects at varying levels of PSM. The result is shown in figure 2 (using three natural splines, see model 4 in table A4.1 in appendix 4). We find that the treatment only has a positive significant effect at high levels of PSM, yet not a significant effect at low levels of PSM. Our analysis thus suggests that our treatment only affects respondents who have some level of PSM for our treatment to activate.25
Our results hold throughout a range of robustness checks, which address several important internal and external validity threats: SDB, consistency bias, satisficing, attrition, an effect of asking about ethical reporting on PSM, and sensitivity of our findings to the exclusion of specific subgroups or of specific Chilean state institutions.
24 Theoretically, asking about PSM may also remind respondents without any PSM that they do not hold any PSM-related values (in other words, it might activate their lack-of-public-service-oriented identity), which in turn might curb their willingness to report ethical problems. We do not find empirical support for this theoretical possibility in our data, however.
Considering the welfare of others is very important
Self-sacrifice
SS1 I am prepared to make sacrifices for the good of society
SS2 I believe in putting civic duty before self
SS3 I am willing to risk personal loss to help society
SS4 I would agree to a good plan to make a better life for the poor, even if it costs me money
First, social desirability bias (SDB) could be at play. As noted in the discussion of our experimental design, asking about PSM might simply make our respondents answer questions in a more desirable way. Previous survey experimental research on the causal effects of PSM had not addressed this threat to validity. As noted above, to assess whether our findings are affected by SDB, we asked our respondents whether they had usually accepted constructive criticism at work in the past. Although not ruling out altogether that our findings are caused by SDB, this allows us to assess one SDB channel, the respondent's general tendency for socially desirable answers, which may drive our findings. We tested whether our treatment affects this variable (appendix 5). The effect of our treatment on the SDB question is insignificant.26
For respondents at specific PSM levels, SDB could, of course, still be at play. In particular, SDB could be higher for respondents with greater PSM, who might be more prone to responding in a socially desirable manner (cf. Kim and Kim 2016). This would threaten the validity of our inference about H2. To address this threat to validity, we estimated the effect of our treatment on the SDB question for varying levels of PSM (using three natural splines to capture nonlinear effects). As illustrated in appendix 5, there is no significant positive treatment effect on the SDB question at any level of PSM. Our SDB check thus does not give us any indication that either H1 or H2 is affected by SDB.
We can also rule out that our results stem from satisficing (respondents' shortcutting cognitive response processes by selecting the first response option they encounter) or consistency bias (respondents seeking to provide answers that are consistent with earlier responses). If either of these two biases were at play, high-PSM respondents in the treatment group should select high ethical reporting, whereas low-PSM respondents in the treatment group should select low ethical reporting. As the control group did not receive the 16-item battery prior to the ethical reporting question, by contrast, satisficing and consistency biases could be expected to be less pronounced in the control group (if they were at play). Yet, as figure 2 underscores, the treatment only has a positive significant effect at high levels of PSM, yet not a significant negative effect at low levels of PSM, which satisficing or consistency biases would presuppose. Neither satisficing nor consistency biases are thus compatible with our results.
26 Paulhus' (1984) index is measured on a seven-point frequency scale and subsequently commonly rescored such that the two highest scores (in casu those who claim to have been the most willing to accept criticism at work) are assigned a value of 1; the remaining answers are assigned a value of 0. Following this practice, we estimated a logit model predicting the rescored variable using only our order treatment. The resulting estimate is similarly insignificant (est. = −0.06, p two-sided = .392). Estimating an OLS model on the original variable gives similar insignificant results (est. = −0.06, p two-sided = .170). We re-estimated both models using robust GLM methods and found, again, insignificant effects of the treatment on the SDB check.
Furthermore, we rule out attrition as a rival explanation. Respondents in our treatment group had to answer 16 PSM questions before the ethical reporting question. Nonrandom attrition in the treatment group due to these additional pre-outcome survey questions might bias our inference. Less motivated respondents in the treatment group might be more inclined to drop out of the survey before the ethical reporting question due to having to answer sixteen additional questions before it. Less motivated respondents might also be less willing to report ethical problems. To address this concern, we compared nonresponse rates to the ethical reporting question in the treatment and control groups. There are no statistically significant differences, giving us no reason to believe that attrition is biasing our findings.27
As a final threat to internal validity, we also rule out that our results stem from an effect of asking about ethical reporting on PSM, rather than asking about PSM on ethical reporting. This might be of concern as our control group receives, contrary to the treatment group, the ethical reporting question prior to the PSM battery. In this instance, the moderator variable in our analysis (figure 2), that is, PSM, could be affected by the treatment. To address this concern, we estimated a system of equations, which incorporates this potential effect.28 This model, similarly, supports H1 and H2 (see appendix 6).
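As an illustration of how the ATE in a question order experiment can be read off an OLS model, the following sketch shows that, with a binary treatment indicator, the OLS slope equals the difference in mean outcomes between treatment and control. The simulated outcome, the assumed shift of 0.2, and the names `ols_slope`, `treat`, and `outcome` are hypothetical; the sketch does not reproduce the models in appendix 4.

```python
import random
import statistics


def ols_slope(x: list[float], y: list[float]) -> float:
    """Slope of a bivariate OLS regression of y on x (with intercept)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


rng = random.Random(0)
n = 4000
treat = [rng.randint(0, 1) for _ in range(n)]  # 1 = PSM battery asked first

# Hypothetical 0-4 outcome with an assumed upward shift of 0.2 under treatment.
outcome = [min(4, max(0, round(rng.gauss(2.6 + 0.2 * t, 1.2)))) for t in treat]

mean_treated = statistics.fmean(y for y, t in zip(outcome, treat) if t == 1)
mean_control = statistics.fmean(y for y, t in zip(outcome, treat) if t == 0)
diff_in_means = mean_treated - mean_control
slope = ols_slope(treat, outcome)

print(f"difference in means = {diff_in_means:.3f}, OLS slope = {slope:.3f}")
```

Because assignment is random, this difference can be interpreted as the effect of question order rather than of pre-existing differences between groups.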
A range of rival explanations and threats to the internal validity of our results can thus be addressed. Beyond internal validity, we assess, as a last duty of care, whether our data provide suggestive evidence for generalizability. We do so by assessing whether our findings remain robust to the exclusion of specific groups of respondents, in particular specific institutions or specific age and education groups. Assessing sensitivity to the exclusion of age and education groups also speaks to the aforementioned concern with survey representativeness. As noted, our online survey respondents are, on average, slightly younger and more likely to hold vocational degrees than the survey population.
To rule out that our findings are driven by specific groups, we re-estimate ATEs, excluding in each iteration one specific subgroup. As detailed in appendix 6, ATEs remain significant when excluding individual institutions from the sample; when using a fixed (institutional) effects specification; and when excluding specific age or education groups from the sample (see appendix 7). This suggests that our results do not stem from individual institutions or demographic subgroups, but rather have relevance across institutions and across demographic groups.
In sum, our validity checks enhance confidence in our findings: activating PSM in our respondents does make them more willing to report ethical problems to management, and this effect is larger the higher the respondents' level of PSM, that is, the more PSM respondents have to activate. Moreover, this effect is not limited to individual groups of respondents or institutions, but holds across them; we find no evidence that rival explanations (satisficing, SDB, consistency bias, attrition, and an effect of asking about ethical reporting on PSM) account for these findings. Our experimental evidence thus suggests that activating PSM has a causal effect on ethical behavioral intent in the public sector.
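The leave-one-group-out logic of this robustness check can be sketched as follows. The institution labels, the simulated outcome, and the function names `ate` and `leave_one_group_out` are hypothetical, and the sketch reports point estimates only rather than the significance tests in appendices 6 and 7.

```python
import random
import statistics


def ate(rows: list[dict]) -> float:
    """Difference in mean outcome between treated and control respondents."""
    treated = [r["y"] for r in rows if r["treat"] == 1]
    control = [r["y"] for r in rows if r["treat"] == 0]
    return statistics.fmean(treated) - statistics.fmean(control)


def leave_one_group_out(rows: list[dict], key: str) -> dict:
    """Re-estimate the ATE once per group, each time dropping that group."""
    groups = sorted({r[key] for r in rows})
    return {g: ate([r for r in rows if r[key] != g]) for g in groups}


# Simulated respondents in four hypothetical institutions, assumed effect of 0.2.
rng = random.Random(0)
institutions = ["A", "B", "C", "D"]
rows = []
for _ in range(4000):
    t = rng.randint(0, 1)
    rows.append({
        "institution": rng.choice(institutions),
        "treat": t,
        "y": rng.gauss(2.6 + 0.2 * t, 1.2),
    })

results = leave_one_group_out(rows, "institution")
for group, estimate in results.items():
    print(f"excluding {group}: ATE = {estimate:.3f}")
```

If the estimate stays similar whichever group is dropped, no single group is driving the overall result.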

Discussion and Conclusion
Our findings have important implications for the scholarly understanding of PSM and ethical behavior in the public sector, and for survey research on PSM.
Substantively, our findings provide the first experimental evidence for the importance of PSM in encouraging ethical behavioral intent among public servants. Experimental evidence thus suggests that PSM may benefit not only performance, as prior studies had suggested (Bellé 2013; see also Bellé 2014), but also integrity in public sectors. With these findings, our article also validates and triangulates prior observational studies on the relationship between PSM and ethical behavior, most of which had identified positive statistical associations (e.g., Brewer and Selden 1998; Caillier 2015; Kwon 2014; Lim Choi 2004; Stazyk and Davis 2015; Vandenabeele and Kjeldsen 2011; Wright, Hassan, and Park 2016). PSM thus appears to hold genuine promise for better functioning public sectors.
For practitioners, our findings suggest in particular that activating PSM among public employees is both feasible through low-intensity treatments (cf. Bellé 2013; Pedersen 2015) and beneficial for public sector ethics.29 Public organizations might wish to take this lesson to heart. It provides suggestive evidence that PSM reminders to employees may enhance public sector integrity.
27 Specifically, a logit model predicting nonresponse to our reporting item using our order treatment returns a tiny and insignificant estimate (est. = 0.001, p two-sided = .887).
28 We estimated a model with two equations and three natural splines, using the systemfit package for the R environment (Henningsen and Hamann 2007). One equation replicates the model discussed in the main text, and the other equation models a treatment effect of ethical reporting on PSM. We do indeed find a substantively weak but statistically significant effect of our treatment on PSM (est. = 0.017, p two-sided = .015). Our findings remain robust to modeling this effect.
Next to contributing to the scarce experimental literature on PSM, our article also contributes to addressing the dearth of experimental evidence on ethics in public sectors (see Menzel 2015). Reliance on observational studies has led scholars to conclude that we lack a "solid understanding" of the causes of ethical behavior in public organizations (Bellé and Cantarelli 2017, 328). This article provides experimental evidence to enhance our understanding of one such cause: PSM.
Beyond contributing to our understanding of PSM and ethical behavior, our findings also have important implications for survey research on PSM and experimental research in public administration more generally. Our results suggest that whether PSM questions precede or follow outcome questions shapes the size of correlations between PSM and outcome variables. Inconsistent findings in meta-analyses (e.g., Ritz, Brewer, and Neumann 2016) about the consequences of PSM for specific outcomes may thus be partially due to survey design effects: whether PSM questions preceded or succeeded outcome questions in surveys may affect whether and how strongly PSM correlates with other variables. To avoid survey design effects, scholars designing PSM surveys should thus either randomize the order of PSM and outcome questions or ensure outcome questions consistently precede PSM survey questions to disentangle correlation and activation.
For experimental research in public administration more generally, our findings also suggest the importance of large sample sizes to avoid type II errors for effects that are theoretically important but of limited size. In our experiment, Cohen's d is 0.157. Detecting an effect of this size with a power of 0.95 requires at least 2,098 respondents (see appendix 8 for a distribution of statistical power across sample sizes). Most prior experimental studies in public administration would have been too underpowered to detect this effect (see James, Jilke, and Van Ryzin 2017).
Although our article thus contributes importantly to the literatures on PSM and ethical behavior in the public sector, and advances experimental evidence and approaches in public administration more generally (see, e.g., Baekgaard et al. 2015; James, Jilke, and Van Ryzin 2017), several limitations remain and point to important avenues for future research.
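The order of magnitude of this sample size requirement can be checked with a standard normal-approximation formula for a two-group comparison, n per group ≈ 2(z₁₋α/₂ + z_power)² / d². The sketch below assumes a two-sided test at α = 0.05 and equal group sizes; these settings and the function name `n_per_group` are assumptions, and the calculation underlying appendix 8 may differ.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d: float, alpha: float = 0.05, power: float = 0.95) -> int:
    """Approximate respondents needed per group to detect effect size d."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value of the two-sided test
    z_power = z.inv_cdf(power)          # quantile for the desired power
    return ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)


total = 2 * n_per_group(0.157)
print(f"approximate total sample size: {total}")
```

With d = 0.157, the approximation yields a total of roughly 2,100 respondents, in line with the figure reported above; small differences are to be expected from rounding of d and from the approximation itself.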
First, we only assess the effect of PSM on one form of ethical behavioral intent by public servants: willingness to report ethical problems. In doing so, we follow observational studies of PSM and ethical reporting, which had all similarly assessed intent, not behavior (e.g., Caillier 2012; Vandenabeele and Kjeldsen 2011; Wright, Hassan, and Park 2016). Intentions, of course, often predict behavior (e.g., Ajzen and Fishbein 1980), and meta-analyses suggest that ethical behavioral intent is closely associated with ethical behavior (Armitage and Conner 2001; Hertz and Krettenauer 2016). Whether these behavioral intentions translate into actual behavior of public servants in our case, and to other ethical behavior beyond ethical reporting, however, remains for future research to assess. Our design would lend itself to assessing this: PSM questions could be exogenously asked before or after ethical behavioral tasks, such as honesty games (cf. Cohn, Fehr, and Maréchal 2014; Gächter and Schulz 2016).
Second, although our SDB check gives us no reason to believe that SDB accounts for our findings, we cannot, as noted above, conclusively rule out this possibility. Even if SDB did account for our effects, however, this need not invalidate the appeal of PSM activation. It would imply that public sector organizations can draw on PSM messages to signal socially desirable behavior to public employees.
Third, we cannot disentangle with our survey design which PSM dimension contributed (most) to the effect on ethical behavioral intent. We randomized the order of the 16-item battery and the outcome question (next to randomizing the order of the four PSM dimensions within the 16-item battery to not give undue weight to any one dimension). Future research could adapt our survey design to assess the effects of individual PSM dimensions, by randomizing whether individual PSM dimensions precede or succeed outcome questions.
Substantively, our survey experimental design also cannot provide any evidence on how long-lived the effect of PSM activation on ethical behavior is. As our control group receives the PSM battery after the outcome variable, PSM is also activated in our control group during the survey. A second-wave survey or behavioral game at a later point in time thus would not be able to identify any effect of PSM activation. This points to the utility of field experiments or iterative surveys, with high-intensity PSM treatments at the outset, to complement the evidence presented in this article and assess how long-lived the effects of PSM activation on ethical behavior in public sectors are.
Moreover, like prior survey experimental research on PSM activation (Pedersen 2015), our survey experimental test drew inferences in part based on an assumption of convergent validity. We showed that asking about PSM enhances the willingness of public servants to report ethical problems, and provided a range of additional pieces of evidence that are compatible with the assertion that this effect is due to PSM activation, and not SDB, consistency bias, satisficing, or other threats to validity. Our article, however, does not directly test that PSM activation mediates the effect of asking about PSM on ethical reporting. We thus encourage scholars who embed question order randomization in future PSM research to include PSM activation checks with outcome questions.
Last, like any study, our study comes with external validity limitations. We do have some confidence that our findings are generalizable across OECD governments: we survey central government employees rather than students; draw on the, to our knowledge, largest survey experimental PSM sample to date, with around 5,000 respondents; survey Chile, an OECD member with limited public sector corruption; and find significant effects across institutions and demographic groups within the central government. Whether our results do, in fact, travel to other countries, and to other outcome variables beyond ethical reporting, remains an empirical question.
Our article thus leaves fertile ground for future research to experimentally study the causes of ethical behavior and causal effects of PSM. More broadly, it also points to a research agenda on identity activation of public servants. In our study, a low-intensity survey treatment activated the PSM-founded aspects of the respondents' identities as public officials. Future studies could equally use low-intensity treatments, such as question order experiments, to activate other identities of public servants. From bureaucratic rule-following identities to professional (e.g., lawyers, policemen) identities in public service, among many others, research opportunities to study which identities can be activated and what behavioral consequences they have are manifold.

Figure 1. Average Treatment Effect: PSM Activation and Ethical Reporting
Figure 2.
Downloaded from https://academic.oup.com/jpart/advance-article-abstract/doi/10.1093/jopart/muy071/5245901 by University of Nottingham user on 05 January 2019
Kim 2012).14
14 SDB may be driven not only by the respondent's general tendency for socially desirable responses, but also by what particular forms of behavior are socially desirable in a given situation. A PSM treatment may indicate to respondents that ethical behavior itself is socially desirable, and our SDB check cannot rule out that this alternative SDB mechanism accounts for our findings. Moreover, SDB may be more pronounced for future (ethical) intent than past (accepting criticism) behavior. In addition, SDB checks more generally are widely used but nonetheless contested in the literature (Uziel 2010). We thus cannot rule out conclusively SDB as the underlying mechanism with our SDB check. We return to this limitation of our study in the conclusion.
Our survey was conducted in the Chilean central government, with support and authorization from the Chilean Civil Service Agency (Dirección Nacional del Servicio Civil). Chile's central government is a propitious environment for inferring about the causes of ethical behavior in OECD contexts. Like other OECD countries, Chile has very limited public sector corruption, ranking 24 out of 176 in Transparency International's Corruption Perception Index (Transparency International 2017). Findings might thus travel to other OECD country settings. Moreover, our sample of central government employees offers important external validity advantages over prior experimental ethics research, which has overwhelmingly drawn on students (Bellé and Cantarelli 2017; Christensen and Wright 2018).
Our survey was conducted online on Qualtrics between November 2016 and May 2017. The survey frame comprised all employees in 11 central government institutions: the Treasury, Economic Development Agency (CORFO), Civil Service Agency (DNSC), Attorney General (MP), Social Security Administration (IPS), Planning Directorate in the Ministry of Public Works (MOP), Solidarity and Social Investment Fund (FOSIS), Directorate for Libraries, Archives and Museums (DIBAM), Legal Medical Service (SML), National Fishery Service (SERNAPESCA), and National Health Fund (FONASA).

Table 1. Survey Items in the PSM Construct

Table 2. PSM Measurement Model