Belief Formation in a Signalling Game without Common

Using belief elicitation, the paper investigates the process of belief formation and evolution in a signalling game in which a common prior is not induced. Both prior and posterior beliefs of Receivers about Senders' types are elicited, as well as beliefs of Senders about Receivers' strategies. In the experiment, subjects often start with diffuse uniform beliefs and update them in view of observations. However, the speed of updating is influenced by the strength of initial beliefs. An interesting result is that beliefs about the prior distribution of types are updated more slowly than posterior beliefs, which incorporate Senders' strategies. In the medium run, for some specifications of game parameters, this leads to outcomes that differ significantly from the outcomes of the game in which a common prior is induced. It is also shown that elicitation of beliefs does not considerably change the pattern of play in this game.


Introduction
When making a decision in a situation involving uncertainty, individuals may form beliefs about the probabilities of various outcomes of uncertain events. Within game theory, the Harsanyi (1967) approach to games with incomplete information postulates that players' beliefs about the events describing their incomplete information are derived from a commonly known probability distribution. If this probability distribution is not known to the players, how do they form beliefs about it (and about other players' behaviour) and update them with experience?
This paper reports on an experiment in which the process of forming and updating beliefs is explored. Individuals play a signalling game in which one player, the Sender, has a piece of private information (type) and can send a message to another player, the Receiver. The Receiver sees the message but not the type of the Sender and takes an action. The payoffs of both players depend on the Sender's type, the message, and the action. To take an appropriate action, the Receiver needs to form beliefs about the Sender's type based on the message the Sender sends.
The Receiver can get an idea about the appropriate action by inferring something about the Sender's type from the message sent. This inference may not be straightforward and the Receiver's prior beliefs about the distribution of types are important to form beliefs about type based on message.
Prior beliefs about types can be explicitly induced by specifying the probabilities of the possible types of the Sender. Without explicitly induced prior beliefs, players can learn from observations if the game is repeated often enough. Drouvelis, Müller, and Possajennikov (2012) (henceforth DMP) investigated how behaviour in the signalling game can differ depending on whether the probabilities of the Sender's types are known or not known before a series of interactions starts. The reason for the possible difference is that without explicitly induced common prior beliefs about types, players can hold different prior beliefs and thus initially employ different strategies. Path dependence can then lead to possibly different medium to long run outcomes, even if learning from observations allows players to approximate the probabilities of the Sender's types.
In this paper, it is further investigated how beliefs are initially formed and updated in such situations. This is important because a model of behaviour in a game with uncertainty cannot be complete without specifying beliefs and their updating. Indeed, predictions about behaviour in DMP were derived based on a belief updating process (first applied to signalling games, albeit only for beliefs about strategies, in Brandts and Holt, 1996). However, whether beliefs are really updated in the way the model suggests could not be answered without direct observations of them.
In the experiment reported in this paper, subjects made choices in a signalling game and also reported their beliefs at regular intervals, both about the Sender's type and about the strategies of the players. Belief elicitation was incentivised. Belief elicitation procedures have been used in experiments before (e.g. Nyarko and Schotter, 2002 and Costa-Gomes and Weizsäcker, 2008). Rutström and Wilcox (2009) discuss the methodological issues of the influence of the belief elicitation procedure on the actual play. Whether belief elicitation affected play is tested in this paper (it does not appear so). While there are several procedures for eliciting beliefs, reviewed e.g. in Palfrey and Wang (2009), the most common one, the quadratic scoring rule, is used in the experiment reported.
Beliefs are elicited both about the type of the Sender and about strategies of the players. Sender's type is determined exogenously by a random device, thus it represents an "objective" uncertainty. Strategies of the players, on the other hand, are likely to be determined endogenously within the game. The strategic uncertainty is, thus, "subjective" and may depend on the models the players use to determine the behaviour of the opponent. Drawing from psychological research, Nickerson (2004, Ch. 8) argues that beliefs about "objective" uncertainty take more time to be revised than beliefs about an individual's performance. Since in the experiment both types of beliefs are observed, it is possible to check whether some beliefs are updated faster than others in an interactive setting.
Without more explicit information about the resolution of uncertainty, "the principle of insufficient reason" (e.g. Sinn, 1980 and references therein) states that if there is no reason to believe that one event is more likely than another, then they should be assigned equal probability. In the signalling game context, the principle is more applicable to beliefs about Sender's type. Beliefs about strategies can also be subject to this principle; however, some reasoning can be used to determine which strategy is more likely to be played by the opponent.
Thus, the main research questions of this paper are whether initial beliefs (both about types and about strategies) are close to being uniform, how beliefs are updated, and whether some beliefs are updated faster than others. The data suggest that beliefs about Sender's types indeed start close to being uniform; even beliefs about strategies are not far from the uniform distribution. Beliefs are updated as observations accumulate in the natural direction of the frequency of events. However, updating is not as fast as simple frequency count would suggest, indicating that initial beliefs have a sizeable weight in the updating process. Beliefs about strategies indeed appear to be updated faster than beliefs about types.
Given these properties of belief updating, the play in the game exhibits differences between the situations with known probabilities of Sender's types and unknown ones, due to path dependence in one of the treatments. This happens because, if initial beliefs about types are not updated fast, play starting from uniform initial beliefs is taken to a different equilibrium than play starting from known correct probabilities of Sender's types. In the other treatments, in the long run there is no noticeable difference in behaviour between cases of known probabilities of types and unknown ones. Therefore the uncovered process of belief formation and updating sometimes has important consequences for long-run outcomes, and the paper identifies situations where this process matters and where it does not matter.

The Signalling Game and Belief Elicitation
Individuals were asked to play the signalling game given by the payoff tables below. In the game, the type of the Sender (Player 1) is determined randomly, with the probability of Type $t_1$ being p and that of Type $t_2$ being 1 − p. Three values of p are considered: p = 1/4, p = 1/2, and p = 3/4. The Sender, knowing the type, chooses one of two messages, $m_1$ or $m_2$. The Receiver (Player 2) observes the message sent by the Sender but not the Sender's type and takes one of two actions, $a_1$ or $a_2$. Payoffs depend on the Sender's type and message and on the Receiver's action. The first number in a cell in the tables is the payoff of the Sender and the second number is the payoff of the Receiver. For each of the values of p, the game has two separating equilibria, $[(m_1, m_2), (a_2, a_1)]$ and $[(m_2, m_1), (a_1, a_2)]$, where the first element is the message of the Sender if type $t_1$, the second is the message if the Sender is type $t_2$, the third element is the action of the Receiver after receiving message $m_1$, and the last element is the action after receiving message $m_2$.
Apart from the differences in the value of p, the other treatment difference in the experiment is that in some treatments this value is commonly known to the players, while in the other treatments the value is not revealed to them. In this way it can be investigated how the information about the probability of Sender's type affects the adjustment towards equilibrium.
The payoffs in the game were chosen so that a naive adjustment process, discussed in Brandts and Holt (1996) and extended in DMP to situations without a commonly known prior distribution, converges to the equilibrium $[(m_2, m_1), (a_1, a_2)]$ in the treatment with p = 1/4 and known, while in the other treatments the process converges to the equilibrium $[(m_1, m_2), (a_2, a_1)]$. This happens if beliefs about types are not updated or are updated only slowly.
The naive process starts with a belief that the strategy of the opponent is uniform. With such a belief, both types of the Sender prefer to play $m_1$. If p = 1/4, the best response of the Receiver to the uniform strategy of the Sender is $a_1$ against both messages. Type 1 Sender then switches to $m_2$ and in response the Receiver switches to $a_2$ against $m_2$. The equilibrium $[(m_2, m_1), (a_1, a_2)]$ is reached. If p = 1/2 or p = 3/4, the best response of the Receiver against the uniform belief about the strategy of the Sender is $a_2$ against both messages. Now it is Type 2 Sender that would want to switch to $m_2$, and then the Receiver switches to $a_1$ in response to $m_2$. The equilibrium $[(m_1, m_2), (a_2, a_1)]$ is reached.
If p is unknown, naive beliefs are that each type is equally likely. In this case the process will start like the process described above with p = 1/2. If this belief about the value of p is not updated, or is updated very slowly, the play can follow the adjustment path to the equilibrium $[(m_1, m_2), (a_2, a_1)]$, as if p = 1/2 were known.
DMP show that there are no statistically detected differences in the observed play between treatments in which the value of p is known or not for p = 1/2 or p = 3/4. For p = 1/4, there are differences in play depending on whether this value of p is known or not, although they are not as clean as predicted by the naive adjustment theory. One possible explanation is that the overall direction of adjustment depends on the speed of belief revision about the type, relative to the speed of belief revision about the strategies. If the adjustment of type beliefs is much slower than that of the beliefs about the strategies, the path in the previous paragraph is followed. On the other hand, if type beliefs are revised faster, the Receiver may realize sooner that Type 1 is less likely than Type 2 and follow the adjustment path for p = 1/4.
In DMP, beliefs were not elicited, although it was shown that the behaviour in the initial periods of treatments without a commonly known value of p was not statistically different from the behaviour in the treatment with known value p = 1/2. While this provides indirect evidence for the naive theory of belief formation, to understand better the initialization and adjustment of beliefs it is important to observe them directly, as noted in Nyarko and Schotter (2002).
To perform this direct check on the formation and adjustment of beliefs, in this paper beliefs are elicited during the course of play, as in Nyarko and Schotter (2002). The novel angle is that since the signalling game under consideration involves a genuinely random move (with an unknown distribution), players have to form and update beliefs about uncertain events that are conceptually different. The random move by Nature is an objective uncertainty, with a stationary distribution. By contrast, the strategic uncertainty about the strategies of the opponent is random only from the point of view of the player, and its distribution may be changing as the opponent learns how to play the game. Nickerson (2004, Ch. 8) reports psychological evidence about different speeds of belief formation depending on whether uncertainty is objective or about a person's performance. Nevertheless, the evidence is not about behaviour in a strategic situation, and the analysis presented in this paper is a further step towards understanding how players in a game deal with such different kinds of uncertainty.
In the experiment belief elicitation is incentivised via a quadratic scoring rule, as e.g. in Nyarko and Schotter (2002) and Costa-Gomes and Weizsäcker (2008). While this works only for risk-neutral players, payoffs are sufficiently low and there are many periods so that risk-neutrality is not an implausible assumption. The choice of the quadratic scoring rule was also motivated by the consideration of relative simplicity of its explanation.
In contrast to other papers that used belief elicitation, in the experiment beliefs are elicited not every period but every few periods. This is done in an effort to concentrate subjects' efforts on this task rather than making it routine. It also allows subjects to gain more observations on which to base their guess. Although it reduces the number of observations, the likely extra effort for the task and the better basis for the guess may be sufficient to hope that the reported beliefs are a good representation of real ones.

Experiment and Belief Elicitation Design
The design of the experiment in DMP is followed, with the addition of belief elicitation. The signalling game is described in the previous section. Subjects were assigned the role of either Sender or Receiver, and made corresponding decisions.
Belief elicitation was based on the following procedure. Suppose that a player has beliefs about a binary random variable X. The beliefs are that X = 1 with probability q and X = 0 with probability 1 − q. A player is asked to report q. The quadratic scoring procedure gives the payoff in rule (1), where I(·) is the indicator function that takes value 1 if its argument is true and 0 otherwise. Given this payoff, and assuming risk-neutrality, it is optimal to report the true belief q (see e.g. Palfrey and Wang, 2009).
The experiment contains treatments with and without the known prior probabilities of Sender's types. In treatments in which the probabilities are not known, Receivers are asked about their beliefs about the Sender's type before the message is received (prior beliefs) and after they receive the message (posterior beliefs). In treatments in which the value of p is known, Receivers are asked only about their posterior beliefs. In all treatments, Senders are asked about the probability of the Receiver's actions after they have sent the message.
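As an illustration of why truthful reporting is optimal for a risk-neutral player, the sketch below assumes one standard binary form of the quadratic scoring rule, scale · (1 − (I(X = 1) − report)²). This particular form, the function names, and the grid of candidate reports are assumptions made for illustration; they are not taken from the paper's rule (1).

```python
def qsr_payoff(report, outcome, scale=50.0):
    """Payoff for reporting probability `report` that X = 1 when X = outcome.

    Assumed form (not the paper's exact rule): scale * (1 - (I(X=1) - report)^2).
    """
    indicator = 1.0 if outcome == 1 else 0.0
    return scale * (1.0 - (indicator - report) ** 2)


def expected_payoff(report, true_q, scale=50.0):
    """Expected payoff of a risk-neutral player whose true belief is true_q."""
    return (true_q * qsr_payoff(report, 1, scale)
            + (1.0 - true_q) * qsr_payoff(report, 0, scale))


def best_report(true_q, grid_size=101, scale=50.0):
    """Grid search over the reports 0.00, 0.01, ..., 1.00."""
    grid = [k / (grid_size - 1) for k in range(grid_size)]
    return max(grid, key=lambda r: expected_payoff(r, true_q, scale))


if __name__ == "__main__":
    for q in (0.25, 0.5, 0.75):
        # The maximising report coincides with the true belief q.
        print(q, best_report(q))
```

Under this assumed form the expected payoff is a concave quadratic in the report with its maximum at the true belief, which is what the grid search recovers.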
In the treatments in which the value of p is unknown, prior beliefs about the Sender's type represent beliefs about an event that is independent of the opponent's actions. On the other hand, posterior beliefs of Receivers and beliefs of Senders about Receiver's actions concern events that are affected by the actions of the opponent. Formation and adjustment of beliefs may be different depending on the distinction between "objective" events and events influenced by the opponent.
In the experiment, beliefs were elicited according to rule (1) with A = 50. An experimental session lasted 36 periods. Beliefs were elicited in Period 1 (initial beliefs), and then every 5 periods (i.e. in periods 1, 6, 11, 16, 21, 26, 31, 36), about the events described in the previous paragraphs. See the instructions (in Appendix A) for more details.
The decisions not to elicit beliefs every period and to set A = 50 were made for several reasons. To give subjects enough incentive to think about beliefs, payoffs for getting them right are comparable with those from playing the game. The subjects could get a maximum of 50 points from correctly predicting the type or the action of the other player, while in the game 50 was the second-highest payoff. Due to budgetary constraints, such high payoffs for beliefs were not possible if beliefs were elicited every period. Facing the trade-off between paying less every period and having a higher payment every few periods, the latter option was chosen since it gives the subjects more incentive to take the belief reporting task seriously. Also, subjects had more observations between the periods of belief elicitation and thus could have a better basis on which to form their view of probabilistic events.
The treatment differences are the value of p (p = 1/4, 2/4, 3/4), and whether this value is known or not (K or N). In the sequel a treatment is denoted Xy, with X = K if p is known and X = N if not, and y = 1 if p = 1/4, y = 2 if p = 2/4, and y = 3 if p = 3/4. Thus a treatment without a commonly known value of p and with p = 1/4 is denoted N1, and similarly for the other treatments.
The length of the sessions was 36 periods, to allow enough opportunities for learning while not being so long as to bore subjects. The sessions lasted approximately 90-100 minutes. In each session, the roles of Sender and Receiver were assigned randomly at the beginning. Then 8 or 16 participants were randomly matched within groups consisting of 4 Senders and 4 Receivers. The matching protocol and the type assignment were the same as in DMP. Points were converted to pounds at the rate of £0.05 for 10 points.
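The session parameters stated above (36 periods, elicitation in Period 1 and every 5 periods thereafter, and £0.05 per 10 points) can be checked with a few lines of arithmetic. The function names below are hypothetical helpers introduced only for this illustration.

```python
def elicitation_periods(session_length=36, gap=5):
    """Periods in which beliefs are elicited: Period 1, then every `gap` periods."""
    return list(range(1, session_length + 1, gap))


def points_to_pounds(points):
    """Convert points to pounds at £0.05 per 10 points, i.e. £1 per 200 points."""
    return points / 200


if __name__ == "__main__":
    periods = elicitation_periods()
    print(periods)                               # the eight elicitation periods
    print(points_to_pounds(50))                  # value of one maximal belief payoff
    print(points_to_pounds(50 * len(periods)))   # upper bound for one elicited belief series
```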
The new (with respect to DMP) set of experiments was done in the Centre for Decision Research and Experimental Economics (CeDEx) laboratory at the School of Economics at the University of Nottingham in February-March 2009. There were 3 sessions in each of treatments N1 and K1, since these treatments are likely to produce the most interesting treatment difference. The number of sessions was chosen to have enough observations for the non-parametric tests below. For each of the other treatments, one session was run. In each session 16 subjects participated, divided into two matching groups of 4 Senders and 4 Receivers, thus making two largely independent observations per session (one session, in treatment K3, had only 8 participants).
In the best equilibrium of the game, and with best predictions, a subject could earn £16.28. The uniformly random strategy, together with the uniform prediction, would have earned on average £10.16 per player. The average earnings were in fact £11.72 per subject, higher than under uniform play and prediction but well below the payoff in the best equilibrium with the best predictions.
The main aim of the experiment was to explore the way the beliefs are formed and updated. Since beliefs are elicited directly, one can formulate two hypotheses concerning beliefs, one for their initialization and the other for updating.
Hypothesis 1 Initial beliefs are uniform.
The hypothesis is based on the principle of insufficient reason (Sinn, 1980, provides a relatively recent analysis of it). If it is rejected, then subjects apparently initialized their beliefs differently, discerning some reasons for doing so. The hypothesis is more likely to hold for beliefs about the prior distribution of types, since strategic considerations can lead to different beliefs about actions of Receivers and strategies of Senders.
Hypothesis 2 Beliefs are updated with experience. The subjective probability of experienced outcomes increases.
There are several ways to operationalize the hypothesis, since there are many ways to update beliefs in the direction of experienced outcomes. The details of hypothesis operationalization are left for the next section.
The third hypothesis is a composite hypothesis controlling for the possible differences in behaviour depending on whether beliefs are elicited or not.
Hypothesis 3 The behaviour in the experiment with belief elicitation is not different from the behaviour without belief elicitation.
The hypothesis compares the data from the new experiment with the data on the same game but without belief elicitation in DMP. There, it was found that there are differences in behaviour between treatments N1 and K1, and there are no differences between treatments with known and unknown prior for other values of p. The hypothesis checks whether the presence of belief elicitation makes subjects behave more or less strategically, so that the patterns of play differ in the present experiment.
The hypothesis serves as a check on procedures. Players may behave differently depending on whether they are asked about their beliefs or not. If the hypothesis is not rejected, then belief elicitation does not appear to change the way the game is played.

Behaviour with and without eliciting beliefs
To begin, behaviour in the experiment with belief elicitation is analysed and compared with the behaviour without the elicitation of beliefs. Thus, Hypothesis 3 is analysed first. Figure 1 shows the average strategies in treatments with p = 1/4, both in the new experiment with belief elicitation (solid lines) and in the experiment without belief elicitation (dotted lines) from DMP. The solid and dotted lines of the same colour are rather close to each other in each panel. Thus the differences in play between the cases in which beliefs are elicited and in which they are not appear minimal.
Table 1 shows the results of non-parametric tests based on matching groups as independent observations for the latter part of the sessions (Periods 21-36), when behaviour is more stable. In the table, "b" refers to the treatments with elicited beliefs and "nb" to the treatments without belief elicitation. The first two rows of the table indeed confirm that there are no statistically significant differences between the corresponding treatments in the proportions of the times with which strategies are played. Figure 1 also shows that for p = 1/4 there is a difference between the treatment in which p is known and the treatment in which p is unknown. This difference is preserved in the new set of experiments with belief elicitation, and is also confirmed by non-parametric statistical tests in Table 1.
Strategies in treatments with p = 1/2 and p = 3/4 are similar and thus the data for these [...]
Note to Table 1: $H_0$: $\mathrm{Prop}_{N1b} \geq \mathrm{Prop}_{K1b}$ for $m_1|t_2$ and $a_1|m_1$. ** p < 0.05; *** p < 0.01.
[Figure panel: Receivers: $a_1|m_2$]
Although the use of messages as Type 2 Sender and the use of actions as Receiver after message $m_2$ appear erratic in the figure, this is a consequence of rather few observations as Type 2 and after message $m_2$. In these treatments, Senders are more often Type 1, and as such they overwhelmingly play $m_1$, which is almost exclusively answered by $a_2$.
The two left panels of Figure 2 capture this from many observations of such behaviour. Thus even if there are apparent differences in some panels, the overall trend appears similar in all panels, and the differences are small in the panels that are based on more observations. For some strategies ($m_1|t_2$ and $a_1|m_1$), non-parametric tests detect significant differences between treatments with and without belief elicitation, while for other strategies such differences are not detected. For the comparison between treatments with known and unknown value of p, no differences are found. Thus the results are mixed but overall the differences in behaviour in treatments with p = 1/2 or p = 3/4 appear small.
Result 1 Belief elicitation does not change the behaviour in treatments with p = 1/4. There are differences in behaviour between treatments with known and unknown prior probability of types if p = 1/4. If p = 1/2 or p = 3/4, the results are more mixed. While there are no differences in behaviour between treatments with known and unknown probability of types, for some strategies there are differences in behaviour depending on whether beliefs are elicited, while for other strategies there are no such differences.

Initial beliefs
For treatments in which the actual probability of Sender's type was not revealed to the subjects, the most natural guess, based on the principle of insufficient reason, is that each of the two types is equally likely. Figure 3 presents the histogram of 40 observations of reported initial prior beliefs about the Sender's type in the N treatments.
Most of the reported beliefs lie within the interval 46-55%, i.e. close to 0.5 probability of Type 1. The average reported prior belief about the type is given in Table 2. The one-sample t-test and the Wilcoxon signed-rank test do not reject the hypothesis that the mean and the median reported belief, respectively, are equal to 0.5. Thus the prior belief about the Sender's type is centred on 0.5 and, according to the histogram, is concentrated on this value.
The reported posterior beliefs of Receivers about the types of Senders in Period 1 in the N treatments and in treatment K1 are given in Table 3.
Table 3: Initial posterior beliefs about the type of the Sender. Columns: N treatments, $t_1|m_1$ (30 obs) and $t_1|m_2$ (10 obs); Treatment K1, $t_1|m_1$ (20 obs) and $t_1|m_2$ (4 obs).
The initial posterior beliefs about types in the N treatments are also not far from 0.5, although the standard deviation is higher than for the initial prior beliefs. The Wilcoxon-Mann-Whitney test does not find a significant difference between the posterior beliefs for the two different messages, and the signed-rank test for paired observations does not detect a significant difference between the reported prior and posterior beliefs about types.
The last two columns of the table report posterior beliefs about the Sender's type in treatment K1. In this treatment there is also no significant difference between the type beliefs after the two messages. Recall that in the K1 treatment the common prior p = 0.25 is induced. Although the average reported posterior beliefs are higher, they are not significantly different from 0.25 by the signed-rank test.
Thus there is little evidence that the average initial posterior beliefs of Receivers take into account the possible separation of types of Sender by messages. The reported beliefs are consistent with Senders pooling, including with the possibility of both types of Senders choosing one of the two messages uniformly randomly.
The beliefs of Senders about actions of Receivers in Period 1 are reported by message for the N treatments ($a_1|m_1$, 30 obs; $a_1|m_2$, 10 obs) and for treatment K1 ($a_1|m_1$, 20 obs; $a_1|m_2$, 4 obs).
In the N treatments, the average beliefs of Senders are quite close to 0.5, although they are heterogeneous (the standard deviation is high). Non-parametric tests do not find a significant difference in these beliefs by message, or from the uniform belief 0.5 on action $a_1$. Note though that these beliefs are not very accurate: the last row of the table shows the proportions of actions actually played by Receivers, and they are much lower than the beliefs reported by Senders.
In treatment K1, Senders report beliefs that action $a_1$ is going to be taken more often than action $a_2$ by Receivers. These beliefs are sensible because, knowing that Type 2 is more likely, the Receiver indeed gets a higher payoff by choosing $a_1$. These beliefs also reflect to some extent the actual proportion of choices of action $a_1$. It appears that Senders did make some adjustment for strategic considerations of Receivers already in Period 1 if the common prior probability of Type 1, p = 1/4, was induced. With an unknown prior, though, Senders' beliefs are close to a 50-50 chance of the Receiver taking either action.
Result 2 Initial beliefs of Receivers about the prior probability of Sender's types are close to uniform in treatments with an unknown value of p. Initial posterior beliefs of Receivers about the Sender's type are not different from the initial prior beliefs about the type. Initial beliefs of Senders about the actions of Receivers are close to uniform in the treatments with an unknown value of p but put more weight on $a_1$ in treatment K1.
To see whether subjects took the reporting of beliefs in Period 1 seriously, one can check whether the reported beliefs are consistent with the chosen message or action. In the experiment, Receivers play a best response to the reported posterior beliefs in Period 1 76% of the time. For Senders it is not possible to determine whether their choice of message is indeed a best response because they are not asked for their beliefs about the action of the Receiver in response to the non-chosen message. One possibility is to consider whether there are no beliefs about the action after the non-chosen message that would make the message played consistent with best response. Since one can often find beliefs making the choice of message consistent with best response, only 5% of messages and reported beliefs of Senders in Period 1 are clearly inconsistent with best response. Alternatively, one can assume that in Period 1 Senders have the same beliefs about Receivers' action after both messages. If this assumption is adopted, 70% of Senders' chosen messages and reported beliefs in Period 1 are consistent with best response.

Belief adjustment
Beliefs about the prior probability of the Sender's type
Table 5 shows the evolution of the average belief of Receivers about the prior (i.e. before seeing the message sent to them) probability of the Sender's type for the three N treatments. Figure 4 illustrates these beliefs graphically.
Starting from the beliefs about the probability of type $t_1$ close to 0.5 for all three treatments, reported beliefs generally move in the right direction (downwards for p = 1/4 and upwards for p = 3/4), although movements for p = 1/2 and p = 3/4 are more erratic because they are based on fewer observations (8 subjects in each of N2 and N3) than for p = 1/4 (24 subjects). Non-parametric tests for the N1 treatment confirm that beliefs in the last period are different from those in the first period. Thus it appears that beliefs are adjusted in the direction of experienced outcomes.
To analyse further the process of belief adjustment, several models of belief evolution based on observations are compared. These models of empirical beliefs are:
• Baseline. Beliefs are equal to the proportion of the times the Sender was type $t_1$ in a given Receiver's set of observations. Let $A_1^{\tau}$ be the count for type $t_1$ and $A_2^{\tau}$ the count for type $t_2$ in period $\tau$. If type $t_i$ is observed in period $\tau$, then $A_i^{\tau+1} = A_i^{\tau} + 1$ and $A_j^{\tau+1} = A_j^{\tau}$ for $j \neq i$. The initial counts are $A_1^0 = A_2^0 = 0$.
• Forgetting (Cheung and Friedman, 1997). This process behaves like the baseline process except that the counts are discounted: $A_i^{\tau+1} = \gamma A_i^{\tau} + 1$, $A_j^{\tau+1} = \gamma A_j^{\tau}$ for $j \neq i$. If $\gamma < 1$, then observations further back in the past have less weight in the total count, i.e. they are getting "forgotten".
• Initial strength (Brandts and Holt, 1996). This process is like the baseline process except that the initial counts are not 0 but $A_1^0 = A_2^0 = A$, where A is estimated from the data. Larger values of A would mean that new observations have less weight compared with the initial beliefs, i.e. beliefs are updated more slowly.
• Forgetting and initial strength. The process combines both the forgetting parameter γ and the initial beliefs strength A.
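The four models in the list above differ only in the forgetting parameter and the initial counts, so they can be sketched with a single function. The function name, argument names, and the short observation history are hypothetical and serve only to illustrate the updating rule; they are not taken from the paper or its data.

```python
def empirical_beliefs(observations, gamma=1.0, initial_strength=0.0):
    """Belief that the Sender is type t1 after each observation.

    observations: sequence of 1 (type t1 observed) or 2 (type t2 observed).
    gamma: forgetting parameter (1.0 means no forgetting).
    initial_strength: common initial count A for both types (0.0 is the baseline).
    """
    count_1 = count_2 = float(initial_strength)
    beliefs = []
    for observed_type in observations:
        # Discount both old counts, then add 1 to the count of the observed type.
        count_1 = gamma * count_1 + (1.0 if observed_type == 1 else 0.0)
        count_2 = gamma * count_2 + (1.0 if observed_type == 2 else 0.0)
        beliefs.append(count_1 / (count_1 + count_2))
    return beliefs


if __name__ == "__main__":
    history = [2, 2, 1, 2]  # hypothetical sequence of observed types
    print(empirical_beliefs(history))                         # baseline
    print(empirical_beliefs(history, gamma=0.5))              # forgetting
    print(empirical_beliefs(history, initial_strength=14.7))  # initial strength
```

With a large initial strength the belief stays close to 0.5 after a few observations, whereas the baseline jumps straight to the observed frequency.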
The forgetting parameter γ and the initial beliefs strength A are estimated by minimizing the sum of squared errors (SSE) between the beliefs predicted by the model and the reported beliefs. The results of the estimations and the obtained minimized SSE scores are reported below.
The table also contains the SSE scores for two other benchmark models. One is the one previous period model, where beliefs are equal to the observation from the previous period (i.e. equal to 1 if the Sender was type $t_1$ in the previous period and 0 otherwise). Another model, reported in the last column, is the one that predicts probability 0.5 all the time.
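A minimal sketch of such an SSE-based estimation is given below, using a plain grid search. The grid values, the function names, the convention of predicting 0.5 before any evidence, and the synthetic "reported" beliefs (generated from the model itself with γ = 1 and A = 10) are all assumptions for illustration; the paper does not state which optimization routine was used.

```python
def predicted_belief(observations, gamma, strength):
    """Model belief in type t1 after the given history of observed types (1 or 2)."""
    count_1 = count_2 = float(strength)
    for observed_type in observations:
        count_1 = gamma * count_1 + (1.0 if observed_type == 1 else 0.0)
        count_2 = gamma * count_2 + (1.0 if observed_type == 2 else 0.0)
    total = count_1 + count_2
    # Assumption: with no evidence at all, predict the uniform belief 0.5.
    return count_1 / total if total > 0 else 0.5


def sse(data, gamma, strength):
    """data: list of (history, reported_belief) pairs."""
    return sum((predicted_belief(history, gamma, strength) - reported) ** 2
               for history, reported in data)


def fit(data, gammas, strengths):
    """Return (minimal SSE, gamma, strength) over the parameter grid."""
    return min((sse(data, g, a), g, a) for g in gammas for a in strengths)


if __name__ == "__main__":
    # Synthetic example: reports generated by the model with gamma=1, A=10.
    history = [2, 2, 1, 2, 2, 2, 1, 2, 2, 2]
    data = [(history[:n], predicted_belief(history[:n], 1.0, 10.0))
            for n in (0, 5, 10)]
    print(fit(data, gammas=[0.8, 0.9, 1.0], strengths=[0.0, 5.0, 10.0, 15.0]))
```

Because the synthetic reports come from the model itself, the search recovers the generating parameters with zero error; with real reported beliefs the minimized SSE would be positive.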
It can be seen from the table that the baseline model and the forgetting model do not improve much on the 50-50 prediction. However, models with the initial strength of beliefs do better, and the one with forgetting is not very different from the one without forgetting. It appears that the best model is the one with the strength of initial beliefs $A_{Pr} = 14.7$. Since each new observation has weight 1, the value 14.7 indicates how slowly the beliefs about the "objective" probability of the Sender's type change.
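To give a sense of scale for this estimate, the worked calculation below applies the initial strength model without forgetting ($\gamma = 1$). The symbols $n$ (number of observations), $k$ (number of times type $t_1$ was observed) and $b_n$ (the resulting belief), as well as the example of exactly 9 type-$t_1$ draws in 36 periods, are illustrative assumptions rather than figures taken from the data.

```latex
% Belief about type t_1 after n observations, k of which were type t_1,
% starting from initial counts A^0_1 = A^0_2 = A and with no forgetting:
\[
  b_n = \frac{A + k}{2A + n}
      = \frac{2A}{2A + n}\cdot\frac{1}{2} + \frac{n}{2A + n}\cdot\frac{k}{n}.
\]
% With A = 14.7 and n = 36, the initial belief of 1/2 keeps the weight
\[
  \frac{2A}{2A + n} = \frac{29.4}{65.4} \approx 0.45,
\]
% so a hypothetical Receiver who saw k = 9 draws of type t_1 (frequency 1/4) would hold
\[
  b_{36} = \frac{14.7 + 9}{29.4 + 36} = \frac{23.7}{65.4} \approx 0.36.
\]
```

Under these assumptions the predicted belief remains well above the empirical frequency of 0.25 even after all 36 periods.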

Posterior beliefs about types and beliefs about strategies
The model with the strength on initial beliefs seems to fit the data best among the considered models for the prior beliefs about types. If this model also explains the evolution of posterior beliefs about types or beliefs about strategies, one can compare the different speeds of belief revision since the parameter A can be seen as a measure of this speed.
For treatments N1 and K1, for which there are more observations, the evolution of the average posterior beliefs of Receivers is given in the following table.
[Table: average posterior beliefs of Receivers by period, with columns $t_1|m_1$ and $t_1|m_2$ for each of Treatment N1 and Treatment K1.]
One can see from these numbers that there is type separation in treatment K1, where one of the separating equilibria is played, while the picture is much more mixed in treatment N1. Indeed, few matching groups converged clearly to either of the separating equilibria in this treatment. Figure 5 shows the evolution of posterior beliefs graphically, together with the predictions of the best adjustment model (dotted lines, labelled N1e and K1e), which is explained in more detail below.
Posterior beliefs appear to start close to 0.5 (although lower for the K1 treatment) and then move generally in the direction of experienced outcomes (which are reflected in the dotted lines representing an empirically based adjustment model). Non-parametric tests show that there are differences in the reported posterior beliefs in Period 1 and in Period 36 for most of the comparisons (except for beliefs about $t_1|m_2$ in treatment N1). Subjects seem to learn something about the posterior beliefs over time.
To see which adjustment model fits best, the same models as for the prior beliefs about types were considered.
The model with an initial strength of beliefs has the lowest SSE score. An interesting observation is that the estimated strength parameter of this model, $A_{Ps} = 5.8$, is considerably lower than the corresponding parameter for the prior beliefs about types, $A_{Pr} = 14.7$. It appears that posterior beliefs about types are updated faster than prior ones, possibly because posterior beliefs incorporate beliefs about strategies as well, which are updated faster than beliefs about the "objective" uncertain process.
The table also reports the proportion of choices that were best responses to reported beliefs (column "Reported") or that would be best responses to beliefs predicted by the model. Receivers chose a best response to their reported beliefs 80% of the time, while if their beliefs had followed the best adjustment model, their actions would have been best responses 76% of the time. This is close to 80%, so the adjustment model reflects the reported beliefs to some extent.
Footnote: The score for the forgetting model is very similar to that of the baseline model; the score for the initial strength model without forgetting is similar to that of this model with forgetting. Thus only the baseline and the full (initial strength and forgetting) scores are reported.
Senders in the experiment reported beliefs about Receivers' actions in response to the message sent. For treatments N1 and K1, the following table presents the evolution of the average beliefs about strategies.
[Table: average beliefs of Senders about Receivers' actions by period, with columns $a_1|m_1$ and $a_1|m_2$ for each of Treatment N1 and Treatment K1.]
There is again a clearer separation of beliefs about Receivers' responses for treatment K1 than for treatment N1, because play in the K1 treatment converges to one of the separating equilibria, while in the N1 treatment there is no convergence in most of the matching groups. Figure 6 illustrates the evolution of average reported beliefs, together with the predictions of the best adjustment model (dotted lines, labelled N1e and K1e). The best adjustment model is explained below.
Strategy beliefs also start close to 0.5 in treatment N1 and from a higher value in treatment K1. Then they move to some extent in the direction of experienced outcomes, although this movement is less clear than for the prior or posterior beliefs about types. Indeed, non-parametric tests detect a statistically significant difference between reported strategy beliefs in periods 1 and 36 only for beliefs about $a_1|m_2$ in treatment K1. Nevertheless, the adjustment models above can be applied to strategy beliefs as well.
The lowest SSE score is again achieved by the model with an initial strength of beliefs. The estimated strength parameter of this model, $A_{St} = 6.6$, is close to the corresponding parameter from the estimation of posterior beliefs, $A_{Ps} = 5.8$, and is lower than the corresponding parameter for the prior beliefs, $A_{Pr} = 14.7$. It seems that on average beliefs about strategies are updated faster than beliefs about the prior probability of the Sender's type.
The observations can be summarized in the following result:
Result 3. Beliefs adjust towards observed realizations of the relevant events. Across subjects, the model with a weight on initial beliefs explains the reported beliefs better than the other models. The weight on initial beliefs is larger for beliefs about the prior probability of the Sender's type than for beliefs about the posterior probability of the type or about the Receiver's strategies.
The last part of the result resembles the psychological evidence in Nickerson (2004, Ch. 8) that beliefs about a person's performance are updated faster than beliefs about an "objectively" uncertain process. The prior probability of types is "objectively" uncertain, while the posterior probability of types and the probability of a given action of the Receiver depend on the behaviour of the players. In the strategic situation under consideration, beliefs about probabilities of events that depend on players' decisions are updated faster, which is represented by a lower weight on initial beliefs about such events.
As observed above, Receivers played a best response to their beliefs 80% of the time. For Senders, it is not possible to determine whether their messages are fully consistent with their reported beliefs because beliefs about the Receiver's action after the non-chosen message were not elicited. Only 5% of Senders' messages and reported beliefs in all periods and all treatments are inconsistent with having some beliefs after the non-chosen message that would make the chosen message a best response to the reported beliefs.
It is also worth noting that subjects' payoffs from belief statements were 36-37 points on average (depending on treatment). Reporting belief 0.5 would have earned a subject 38 points for sure, while reporting beliefs corresponding to the baseline model of empirical beliefs (i.e. reporting the empirical frequencies of types or actions observed so far) would have earned 39-40 points. It appears that subjects tried to make guesses but their attempts were not very successful.

Conclusion
In a situation in which probabilistic information is not provided, subjects learn about it from experience. The results reported in this paper show that, roughly, belief adjustment starts from a uniform distribution and adjusts towards experienced outcomes. The model that fits the observed data best is the one with some weight on initial beliefs, with beliefs incorporating new observations slowly.
The paper uses a novel approach in that beliefs are elicited only in some periods. This allowed subjects to have more data between elicitation rounds and thus yielded smoother reported beliefs. It may also make belief elicitation less prominent for the subjects, thus helping to keep their behaviour close to that in a comparable experiment without belief elicitation. Subjects also often played a best response to their beliefs, showing that the belief reporting and strategy choice tasks were taken seriously.
There are some differences in the adaptation of beliefs about impersonal events (the determination of types) and about strategies. Subjects may have an initial belief about the impersonal process and change it in the direction of the observed frequencies slowly. For strategies the influence of the initial belief is weaker. Strategies are conscious choices of the opponent and it may make sense to realize that the opponent is also learning thus pre-conceived ideas about his or her behavior should get less weight.
The analysis in the paper focusses on the models that fit data better on average. Subjects may be heterogeneous in their initial beliefs and update them using different parameters or even processes. While the extension to heterogeneous subjects is potentially interesting, it would require more data collected for each subject. The present analysis is a step towards understanding the process of belief formation and updating in aggregate.
The results of the paper advance the understanding of belief formation processes and discriminate across alternative models of belief formation. This is done here using the example of a signalling game, for which the importance of the common prior assumption is also demonstrated. With the theory of belief adjustment proposed in this paper, it may be easier to understand behaviour in other economic situations involving uncertainty as well.
A Instructions for the treatment with unknown value of p
Please read these instructions carefully. Please do not talk to other people taking part in the experiment and remain quiet throughout. If you have a question, please raise your hand. We will come to you to answer it.
In this experiment you can earn an amount of money, depending on which decisions you and other participants make. The experiment consists of 36 rounds, in each of which you can earn Points. Your payout at the end of the experiment is equal to the sum of Points you earn in all rounds, converted to pounds. For every 10 Points you will be paid 5p.

Description of the experiment
Participants are assigned the role of either "A-participant" or of "B-participant". In each round of the experiment, all participants are matched randomly in pairs, one from each role. A random draw determines the type of the A-participant, which can be either "Type 1" or "Type 2". The random draw is such that with an X% chance the A-participant is of Type 1, and with a (100 − X)% chance of Type 2. There is a new random draw each round, and the value of X is constant over all rounds of the experiment. After the random draw, the A-participant is informed about his/her type and decides between options "C" and "D". After that, the B-participant is informed about which option was chosen by the A-participant, but not about the type of the A-participant, and chooses between options "E" and "F". The payoffs of the two participants are determined according to the tables overleaf on page 2.
In some rounds of the experiment, the B-participant is asked to predict the type of the matched A-participant, both before and after the A-participant has chosen an option, and the A-participant is asked to predict the option that will be chosen by the matched B-participant. You are asked "What is the chance that the participant is of Type 1 / chooses option E" and "What is the chance that the participant is of Type 2 / chooses option F". You answer with two numbers Y and Z between 0% and 100%, and the sum of the two numbers should be 100. The points you earn depend on your prediction and on the actual type or option chosen by the participant according to the formulas overleaf on page 3.
[In the treatments with known value of p, X was explicitly given, e.g. 75. In the last paragraph, the word "before" was deleted, i.e. the B-participant was asked only after the A-participant has chosen an option.]

Payoffs from the choice of options
The payoffs of both participants depend on the A-participant's type, the option chosen by the A-participant and the option chosen by the B-participant.

The A-participant's payoffs
The payoffs of the A-participant (in blue) in each round are given in the following two tables (along with the B-participant's payoffs in red). For the A-participant of Type 1, payoffs are given by the table on the left, and for the A-participant of Type 2, by the table on the right.
The B-participant's payoffs
The payoffs of the B-participant (in red) in each round are given in the following two tables (along with the A-participant's payoff in blue). If the A-participant chose option "C", the payoffs are given by the [...]
The payoffs of both participants depend on the prediction and on the actual type of, or option actually chosen by, the matched participant.

The A-participant's payoffs
If an A-participant predicts that the chance that the B-participant chooses option "E" is E% and the chance that the B-participant chooses option "F" is F% = (100 − E)%, the points earned are [formula not reproduced]
Note that you get the maximum 50 points when you predict, for example, that the chance of Type 1 is 100% and Type 1 actually happens, or that the chance of Type 1 is 0% and Type 2 actually happens. You get 0 points if your prediction is completely wrong. You get an intermediate number of points if you predict that the chance of each type or of each action is between 0% and 100%. The formulas are designed in such a way that you maximize your expected payoff from your prediction if you state your true belief about the chance of the type of the A-participant, or of the action about to be chosen by the B-participant.

Summary
To give you an overall picture of the rules, the timing of events in each round can be summarized as follows:
1. The computer randomly matches participants in pairs.
2. The computer randomly determines the A-participant's type. With an X% chance the A-participant is of Type 1 and with a (100 − X)% chance of Type 2. The value of X is constant over all rounds of the experiment.

3. The A-participant is informed about his/her type. Then the A-participant chooses between options "C" and "D".
4. The B-participant is informed about the choice of the A-participant, but not about his/her type. Then the B-participant chooses between options "E" and "F".
5. Payoffs result as described in the tables above.
6. In some rounds, the participants are asked to predict the type of, or the option that will be chosen by, the matched participant. Payoffs for these predictions are added to the payoffs above.

Number of rounds, role assignment and matching
The experiment consists of 36 rounds. The role of either the A-participant or the B-participant will be randomly assigned to each participant in the room at the beginning of the experiment. You will then keep the same role during the entire experiment.
In each round the computer will randomly match one A-participant and one B-participant from a group of eight subjects. The matching is completely random, meaning that there is no relation between the participant you have been matched with last round (or any other previous round) and the participant with whom you are matched in the current round.

B Data and tests
B.1 Tests reported in Table 1

B.1.1 Data for the tests reported in Table 1
The following tables show the proportions of strategies observed in periods 21-36 in each matching group (MG) of each treatment and the total proportions by treatment. Recall that "b" refers to treatments with belief elicitation and "nb" to treatments without belief elicitation. The results of the tests are reported in the main text.

Proportions of Senders
B.1.2 Tests for robustness of the results reported in Table 1

All periods
The following tables show the proportions of strategies observed in periods 1-36 in each matching group (MG) of each treatment and the total proportions by treatment. Recall that "b" refers to treatments with belief elicitation and "nb" to treatments without belief elicitation.
$H_0$: $\mathrm{Prop}_{N1b} \geq \mathrm{Prop}_{K1b}$ for $m_1|t_2$ and $a_1|m_1$. ** p < 0.05; *** p < 0.01.

Last eight periods (Periods 29-36)
The following tables show the proportions of strategies observed in periods 29-36 in each matching group (MG) of each treatment and the total proportions by treatment.
$H_0$: $\mathrm{Prop}_{N1b} \geq \mathrm{Prop}_{K1b}$ for $m_1|t_2$ and $a_1|m_1$. ** p < 0.05; *** p < 0.01.

Proportions of Senders
B.2 Tests for treatments with p = 1/2 and p = 3/4
The following tables show the proportions of strategies observed in periods 21-36 in each matching group (MG) of each treatment and the total proportions by treatment. Recall that "b" refers to treatments with belief elicitation and "nb" to treatments without belief elicitation. The results of the tests are: