Information Theoretical Analysis of Unfair Rating Attacks Under Subjectivity

Ratings provided by advisors can help an advisee to make decisions, e.g., which seller to select in e-commerce. Unfair rating attacks—where dishonest ratings are provided to mislead the advisee—impact the accuracy of decision making. Current literature focuses on specific classes of unfair rating attacks, which does not provide a complete picture of the attacks. We provide the first formal study that addresses all attack behavior that is possible within a given system. We propose a probabilistic modeling of rating behavior, and apply information theory to quantitatively measure the impact of attacks. In particular, we can identify the attack with the worst impact. In the simple case, honest advisors report the truth straightforwardly, and attackers rate strategically. In real systems, the truth (or an advisor’s view on it) may be subjective, making even honest ratings inaccurate. Although there exist methods to deal with subjective ratings, whether subjectivity influences the effect of unfair rating attacks was an open question. We discover that subjectivity decreases the robustness against attacks.


Dongxia Wang, Tim Muller, Jie Zhang, and Yang Liu

Index Terms: Unfair rating attacks, worst-case attacks, robustness, subjective rating, trust systems.

I. INTRODUCTION
Users can help each other make decisions by sharing their opinions, especially when direct experience or evidence is insufficient. Ratings are a discrete form of such shared information. Rating mechanisms are popularly applied in existing online systems, such as trust systems, recommender systems, e-commerce systems and security systems [1]-[4].
Not all ratings accurately reflect reality. Malicious advisors (attackers) may deliberately provide fake or unreliable ratings to impact the decisions of some other users (advisees). This is known as an unfair rating attack. Unfair rating attacks reduce the accuracy of decision making based on ratings.
Many approaches have been proposed in the literature to improve the robustness of trust systems against unfair rating attacks. What is typically proposed is to estimate trustworthiness of advisors, based on which ratings are discounted or filtered. We argue that being aware of advisors' honesty is not sufficient to have a complete picture of unfair rating attacks. How advisors behave when they are dishonest also needs to be studied. There are also some approaches which aim to propose countermeasures for the existing types of attacks. However, attackers can always adapt their behaviour or strategies to the countermeasures. In such an arms race, the system designer will be one step behind the attackers.
We propose a probabilistic modelling of a rating by an arbitrary advisor, which allows us to consider various possible strategies of a random attacker. Given the uncertainty in the behaviour of the attacker, we propose to investigate the worst-case scenario that he can cause for an advisee. From a security perspective, a secure system should be prepared for worst-case attacks. Deciding which unfair rating attack is worse requires the right metric.
An advisee aims to get information about the observed facts leading to a rating. From the advisee's perspective, an attack hinders his learning about the facts. How much information about the facts a rating provides can be quantified by a measure from information theory, namely, information leakage [5]. The less information ratings leak about the facts, the greater the impact of the attack, since the advisee is more hindered in his learning. The worst case for the advisee is therefore the attack that minimises the information leakage about the facts. We say that a system is more robust than another system when, in all situations, the maximum impact of an attack is lower in the former system than in the latter.
Dishonesty is not the only element that reduces the quality of ratings in realistic scenarios. Honest advisors can be subjective in rating, or have different preferences from an advisee. The same observed fact may cause different honest advisors to provide different ratings. More importantly, an honest advisor may come to a different conclusion than an advisee would have, given the same facts.
In the literature, dishonesty and subjectivity are usually treated separately, or with one as a special case of the other. It was an open question whether subjectivity changes the effects of dishonesty or unfair rating attacks. We compare the difficulty of achieving strong unfair rating attacks in rating with and without subjectivity, and find that the existence of subjectivity makes attacks easier. We then introduce an ordering of subjectivity, based on which we prove that a higher degree of subjectivity means being less robust against unfair rating attacks.
Methods to mitigate the negative effect of subjectivity on ratings' accuracy have been proposed. Since subjectivity decreases robustness of rating systems, we study whether these methods improve the robustness. The first method is for advisors to rate individual features of a target (feature-based rating) instead of only providing an overall rating. We compare the restrictions that feature-based rating imposes on achieving ultimate attacks to those of overall rating. We find that feature-based rating does not necessarily improve robustness compared to overall rating, and may even worsen it. Clustering advisors based on their behaviour is another way to discern differences in subjectivity. We alter the rating model to allow clustering, and find that clustering increases expected information leakage (and hence robustness) regardless of attackers' strategies. There exist different ways to deal with clusters, e.g., excluding seemingly dishonest clusters or exploiting all clusters. We find that clusters should only be excluded in extreme cases.
Our main contribution is a formal way to measure the amount of information a rating carries. Attackers can rate in different ways, which affects how informative a rating is. Our measure allows us to 1) compare the impact of different attacks, 2) identify the circumstances under which an attacker can eliminate all information, and 3) find the behaviour that minimises the information. We first present a measure for objective rating, and then a more involved version for subjective rating. Our measure allows us to formally reason about the interplay between subjectivity and dishonesty, which has not been done before. Furthermore, it also allows us to formally reason about approaches that deal with both subjectivity and dishonesty. In particular, we look at feature-based rating and at clustering.
The work in this paper mainly consists of two parts. In the first part, we study attacks where honest advisors are assumed to be objective in rating, an essential scenario (see footnote 1). We propose a probabilistic rating model and an information-leakage based quantification method, as a basis of the study on unfair rating attacks throughout the paper. We find the worst-case attack strategies. In the second part, we study the effects of attacks when honest advisors can be subjective in any way, emphasizing a comparison with the earlier results. We also study whether the existing methods of dealing with subjectivity influence robustness against attacks.

II. RELATED WORK
We survey related approaches that solely deal with unfair rating attacks, and also those that consider both dishonest ratings and subjective ratings.

A. Dealing With Unfair Rating Attacks
Unfair rating attacks, also known as misleading feedback attacks, reduce the accuracy of rating-based decision making. They are among the most popular types of attacks in trust and reputation systems [7]. Various approaches have been proposed to diminish the effect of unfair rating attacks. Most of them rely on estimation of the trustworthiness of advisors (called recommender trust or feedback reputation) to judge the quality of ratings. Typically, ratings are discounted/filtered based on trustworthiness of advisors, before being aggregated. Trustworthiness of an advisor can be evaluated based on various considerations, e.g., his similarity with the advisee [8]-[11], the time of rating [12], [13], and consistency between his previous ratings and the observed outcomes [14].
Footnote 1: The work in this part has been published in [6].
Similarity is a popular criterion for evaluating advisors and their ratings. Weng et al. determine the credibility of an advisor by measuring the statistical correlation between its ratings and an advisee's own experiences regarding the same targets [9]. A local table is built for each advisor, to store their past ratings and the advisee's experiences. Ratings are weighted with the advisors' credibility values, which need to exceed a threshold set by the advisee's own confidence. In [8], Zhang and Cohen propose to use both local and global ratings to estimate the trustworthiness of an advisor. Whether a rating is local or global is determined by whether it refers to the target under evaluation or other targets in the system. For example, ratings about other sellers from an advisor are useful when there are few sellers with whom both the buyer and the advisor have interacted. Liu et al. propose an approach called iCLUB [11], where ratings are clustered based on their similarity. Advisors whose ratings are in the same clusters as an advisee's ratings are considered reliable by the advisee. Ratings from other advisors are considered unfair and are filtered. Note that the clustering does not distinguish whether filtered ratings are dishonest or subjective. Liu et al. also propose to use Dempster-Shafer theory to combine information from both local and global ratings to identify trustworthy advisors [10].
Alternatively, the time domain of rating can be employed. In [12], Yang et al. propose to detect suspicious ratings and also time intervals where attacks are more likely. The detection results help decide to what degree advisors can be trusted. Highly suspicious ratings are removed. In [15], a technique called CUSUM [16] is employed to detect suspicious time intervals where attacks very likely happen. To avoid mistakenly selecting normal but deviating ratings as suspicious, the correlations among advisors are then learned to identify which ratings are from colluding malicious advisors (which are assumed to have large correlation). The Expectation Maximization algorithm and a hypothesis test method are applied to resist random and coordinated malicious rating attacks in [17].
Additionally, in [13], three aspects of rating behaviour are considered to evaluate advisors: the time when ratings are provided, similarity between an advisor and the advisee, and also confidence of the advisor. For example, from the time aspect, ratings provided more recently are considered more reliable. From the confidence aspect, ratings from a more experienced advisor are considered more convincing. Fuzzy logic is applied to fuse these three aspects. Finally, in [14], Yu et al. propose a reinforcement-learning based approach. An advisor's trustworthiness is updated after each interaction of the advisee based on whether its ratings are consistent with the observed behaviour of the target.
Accurate evaluation of advisors' honesty is crucial when it is the only criterion to assess ratings, but it is difficult to achieve. If the evaluation mechanism is majority rule, malicious advisors can choose targets that are rated by only a few honest advisors, and make their false ratings the majority. In this way, they can reduce the reputation of those honest advisors and increase their own, deceiving advisees. This kind of behaviour is known as the RepTrap attack [18]. In fact, as we showed in our previous work [6], even if accurate evaluation of advisors' honesty is given, trust models may still perform poorly under strong attacks. Dishonest advisors can pick strategies, some of which can be much more serious than others. It is insufficient to focus only on whether an advisor is honest while ignoring their strategies. We aim to find and study the most serious attack strategies.
In approaches based on advisor reputation, ratings from dishonest advisors are usually discounted or filtered, possibly resulting in a loss of useful information. Some approaches try to make use of dishonest ratings. BLADE [19] and HABIT [20] aim to extract useful information from (dishonest) ratings, as long as there is statistical correlation between the ratings and the advisee's own experience. For example, if an advisor always rates with a negative bias, HABIT would correct this bias when using his ratings. BLADE records the correlation by building behaviour functions for advisors, which serve to interpret their ratings. For instance, if an advisor is detected to always badmouth a reputable seller, his ratings would be reversed. These approaches indicate the importance of correlation between ratings and the underlying facts/truth. Regardless of the forms of attacks, if ratings are sufficiently correlated with the facts, there can be a way to uncover them. However, if there is little correlation, the truth can hardly be learned. Based on this reasoning, we propose to measure the impact of an attack by quantifying how much information it provides about the truth. That means we do not use criteria such as heuristic perceptions of attacks or the direct effects of attacks on a specific system. Our measurement is general and is not confined to specific systems.
Instead of judging the honesty of advisors before dealing with their ratings, some approaches directly detect and filter unfair ratings, using statistical methods for example. Weng et al. propose an entropy-based method to measure the deviation of ratings from an advisee's own experience [21]. Ratings that deviate too much are removed. This methodology is sometimes called endogenous filtering, but it is a highly problematic approach [22]. Alternatively, contextual information can be used for filtering. For example, Wang et al. propose a detection-based method for web service recommendation systems [4]. They aim to identify malicious ratings and also find the corresponding advisors' IP addresses. The server would then refuse ratings from these advisors.
Many defense mechanisms have assumptions about attackers' rating behaviour. For example, bad-mouthing and ballot-stuffing attacks are the most popularly studied unfair rating attacks. Sometimes, more complex attacks are analysed. Feng et al. study three types of attacks, namely RepBad, RepSelf and RepTrap [23]. Jiang et al. propose a trust model based on evolutionary computation (named MET) to cope with four types of attacks and their combinations [24]. Liu et al. study attacks that come from a cyber competition where human participants compete to break down a trust system [13]. Being able to resist well-known attacks is useful; however, it cannot ensure robustness against future attack strategies. In fact, assuming attackers' behavior makes defense passive, as attackers can adapt their strategies, especially when they are aware of the system design. With this in mind, we take an active perspective: we want to figure out the worst case that attackers can cause. From a security view, a secure (robust) system should be prepared for the worst-case attacks.
Finally, we note that instead of dealing with dishonest ratings, some approaches aim to discourage advisors from rating dishonestly. Zhang et al. propose to reward reputable advisors by making sellers provide products with lower prices but increased quality [25] to them. In [26], a limited inventory of each seller is assumed, where buyers compete with each other to get the purchase. Buyers that report truthful ratings are assigned higher scores, giving them more opportunities to transact with reputable sellers.

B. Dealing With Attacks Under Subjective Rating
So far, when we use the term "unfair ratings", we mean ratings that are deliberately provided by strategic advisors. In some works, "unfair ratings" refer to any ratings that indicate divergent opinions with an advisee, even if the divergence comes from conflicting interests or views between an honest advisor and the advisee, e.g., subjective ratings [27]. Here, we distinguish two kinds of ratings using different terms: "subjective ratings" are from honest advisors with divergent opinions, while "unfair ratings" still denotes ratings from attackers.
Subjectivity is typically unavoidable in realistic rating systems. Both subjectivity and dishonesty may cause biased ratings, impacting rating-based decision making. Considering their analogous negative effect on the accuracy of ratings, some researchers treat them equally without distinguishing the motivation of advisors [9], [13], [27]. Some others propose to differentiate subjective but honest advisors from dishonest ones. In [28], Fang et al. propose to use a clustering scheme for each advisee to identify his advisors as subjective or dishonest. Advisors with similar subjectivity are clustered in the same group. An advisee can make use of ratings from both its subjective groups of advisors and also dishonest groups, if the dishonest advisors have a fixed behaviour pattern. Noorian et al. propose a two-layer filtering approach where the first layer excludes malicious advisors, and the second layer discerns the dispositions of the remaining advisors [29].
Interestingly, subjectivity may change the nature of unfair rating attacks. For example, Noorian et al. [30] consider an attack where dishonest advisors disguise themselves as honest-but-subjective. The effects of subjectivity and dishonesty are not additive. Hence, we formally study how subjectivity influences the robustness against dishonest behaviour.

III. PRELIMINARIES
We briefly introduce some concepts from information theory, which support our work throughout this paper.
Shannon entropy is used to measure the expected amount of information carried in a random variable, which is determined by the uncertainty of the random variable [31]:
Definition 1 (Shannon Entropy): The Shannon entropy of a discrete random variable $X$ is given by:
$$H(X) = -\sum_{x} P(x) \log P(x)$$
The Shannon entropy is maximal when all possible outcomes of $X$ are equiprobable. The base of the logarithm is set to 2, without loss of generality.
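Definition 1 can be illustrated numerically. The following is a minimal sketch, assuming a distribution is given as a plain list of probabilities; the function name `shannon_entropy` and the two example distributions are illustrative and do not come from the paper.

```python
import math

def shannon_entropy(dist):
    """Entropy in bits of a discrete distribution given as a list of probabilities."""
    # Outcomes with probability 0 are skipped, i.e. 0 * log(0) is treated as 0.
    return -sum(p * math.log2(p) for p in dist if p > 0)

uniform = [0.25, 0.25, 0.25, 0.25]   # all outcomes equiprobable
skewed = [0.7, 0.1, 0.1, 0.1]        # one outcome dominates

print(shannon_entropy(uniform))
print(shannon_entropy(skewed))
```

The uniform distribution over four outcomes attains the maximum of log2(4) bits, and the skewed one stays strictly below it.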
Conditional entropy measures the expected amount of information in one random variable when another random variable is known [31]:
Definition 2 (Conditional Entropy): The conditional entropy of a discrete random variable $X$ under $Y$ is:
$$H(X|Y) = -\sum_{y} P(y) \sum_{x} P(x|y) \log P(x|y)$$
For brevity, we leave out the cases where only one of $X$ and $Y$ is continuous. Note that $0 \le H(X|Y) \le H(X)$.
Information leakage measures the gain of information about one random variable when another random variable is known. This definition coincides with mutual information [5]:
Definition 3 (Information Leakage): The information leakage of $X$ by knowing $Y$ is:
$$I(X;Y) = H(X) - H(X|Y)$$
Only independent random variables do not leak information about each other, and vice versa:
$$H(X) - H(X|Y) = 0 \iff X \text{ and } Y \text{ are independent}$$
A crucial theorem for proving inequalities is Jensen's inequality [32]. Applied to probabilities, it states that the uniform distribution has the maximal entropy, and that distributions closer to the uniform distribution have more entropy:
Theorem 1 (Jensen's Inequality): For a convex function $f$ and positive weights $a_i$:
$$f\left(\frac{\sum_i a_i x_i}{\sum_j a_j}\right) \le \frac{\sum_i a_i f(x_i)}{\sum_j a_j}$$
where equality holds iff $x_1 = x_2 = \dots = x_n$ or $f$ is linear. Two instances of convex functions are $x \log x$ and $-\log(x)$.
We introduce some shorthands which will be used throughout the paper. Given a variable $X$, the lower-case $x$ denotes one of its outcomes, and moreover $P(x)$ means $P(X = x)$. $\forall x$ means for any $x$ in $X$'s outcome set. We typically omit the domain of such variables and, for example, write $\sum_x$ to denote the summation over all outcomes of $X$. Since $x \log(x)$ is a common term, we introduce the shortcut $f(x) = x \log(x)$. For practical reasons, we let $f(0) = 0 \log(0) = 0$.
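Conditional entropy and information leakage can be computed directly from a joint distribution. The sketch below assumes the joint distribution is given as a nested list `joint[x][y]`; the function names and the two example distributions are illustrative choices, not taken from the paper. It uses the convention that terms with zero probability contribute nothing.

```python
import math

def entropy(dist):
    """Shannon entropy in bits; zero-probability outcomes contribute 0."""
    return -sum(p * math.log2(p) for p in dist if p > 0)

def information_leakage(joint):
    """Leakage of X by knowing Y, H(X) - H(X|Y), for joint[x][y] = P(X=x, Y=y)."""
    p_x = [sum(row) for row in joint]          # marginal of X
    p_y = [sum(col) for col in zip(*joint)]    # marginal of Y
    h_x_given_y = 0.0
    for y, prob_y in enumerate(p_y):
        if prob_y > 0:
            # Distribution of X conditioned on Y = y.
            conditional = [joint[x][y] / prob_y for x in range(len(joint))]
            h_x_given_y += prob_y * entropy(conditional)
    return entropy(p_x) - h_x_given_y

independent = [[0.25, 0.25], [0.25, 0.25]]   # Y says nothing about X
identical = [[0.5, 0.0], [0.0, 0.5]]         # Y determines X

print(information_leakage(independent))
print(information_leakage(identical))
```

The independent pair leaks nothing, while the fully dependent pair leaks the whole bit of entropy in X.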

IV. QUANTIFYING ATTACKS UNDER OBJECTIVE RATING
In this section, we quantitatively study the effects (impact) of unfair rating attacks. Specifically, we consider the scenario where an arbitrarily selected advisor is rating a given subject. One important aim is to find the worst-case attack strategies that the advisor can undertake, for this specific rating. For now, we assume that honest advisors' ratings are objective, meaning that they are equal to the observed facts. We do not assume specific behaviour for attackers. The probabilistic rating model we propose considers any possible degrees of honesty and behavior of an arbitrary advisor, within the restriction of our assumptions. The work in this section has been published in [6].

A. Modeling Objective Rating
A rating process consists of an advisor who reports a rating, based on his observation of a fact about the target under evaluation, and an advisee who wants to use ratings to make decisions regarding the target. Taking an e-commerce system as an example, a buyer can use ratings from other buyers about, e.g., the quality of a product or the reliability of sellers to decide whether to buy the product, or which seller to choose. We consider a set-up with a single advisor in a single rating. Random variables $R$ and $O$ represent the rating and the observed fact behind the rating. The exact meaning of $O$ depends on the purpose of a system.
As a simple example, when rating whether an app is malware, $O$ has two outcomes, "Yes" and "No". Outcomes of $O$ are discrete and finite, and w.l.o.g. are labeled $0, \dots, n-1$ ($n > 1$). For now, we assume that the rating options are the same as the possible outcomes of $O$, so they also belong to the list $0, \dots, n-1$. For an advisee, without $R$, $O$ is assumed to have maximum uncertainty, meaning its prior distribution is uniform and $H(O) = \log(n)$. We characterize an arbitrary advisor's behaviour, considering both him being honest (with probability $p$, denoted as $H$), and being dishonest (or strategic, with probability $1 - p$, denoted as $\neg H$). The probability that an advisor is honest can be interpreted in a Bayesian framework as representing the knowledge of an advisee about the honesty of the advisor in question. In a frequentist interpretation, the value $p$ could indicate that there is a population of advisors where a fraction of size $p$ is honest, and a fraction of size $1 - p$ is malicious, and we randomly select an advisor from this population.
Given an observation $O = i$, an honest advisor always reports the truth, i.e., $P(R = i \mid O = i, H) = 1$. How a dishonest advisor rates can be characterized by the conditional probabilities $P(R = j \mid O = i, \neg H)$, $j \in \{0, \dots, n-1\}$, all of which form an $n \times n$ matrix denoted as $\alpha$, with $\alpha_{i,j} = P(R = j \mid O = i, \neg H)$ and $\sum_j \alpha_{i,j} = 1$. The subscripts $i, j$ also denote the row and column index of the entry $\alpha_{i,j}$, with both starting from 0.
Whatever behaviour an attacker exhibits, there exists a matrix $\alpha$ that describes it. Initially, we assume that attackers do not tell the truth, i.e., $\forall i,\ \alpha_{i,i} = 0$. This rating set-up is the naive rating model, shown in Figure 1(a).
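The model can be sketched as a single table of P(R = j | O = i) that mixes the honest and the dishonest case: with probability p the rating equals the observation, and with probability 1 - p it follows the attacker's row of the strategy matrix. The function name `rating_channel`, the value p = 0.6 and the example strategy below are assumptions made for illustration only.

```python
def rating_channel(p, alpha):
    """Return channel[i][j] = P(R=j | O=i) for honesty probability p.

    alpha[i][j] is the attacker's probability of rating j after observing i;
    each row of alpha must sum to 1.
    """
    n = len(alpha)
    for row in alpha:
        if abs(sum(row) - 1.0) > 1e-9:
            raise ValueError("each row of alpha must sum to 1")
    return [[(p if i == j else 0.0) + (1 - p) * alpha[i][j] for j in range(n)]
            for i in range(n)]

# Naive rating model with n = 3: the attacker never reports the truth,
# so the diagonal of alpha is zero (hypothetical example values).
naive_alpha = [[0.0, 0.5, 0.5],
               [0.3, 0.0, 0.7],
               [1.0, 0.0, 0.0]]

channel = rating_channel(0.6, naive_alpha)
for row in channel:
    print(row)
```

In the naive model the diagonal of the resulting table is exactly p, because only honest advisors ever report the observed outcome.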

B. Ultimate Attacks
For an advisee to deduce the truth behind a rating, the rating needs to be correlated in some way to the observation. If the rating is completely independent of the observation, then there is no way to learn the truth. We name attacks which cause this extreme case ultimate attacks. The strategy to achieve ultimate attacks in the naive rating model is provided in Theorem 2:
Theorem 2: In the naive rating model, rating $R$ is independent of observation $O$ iff $p = \frac{1}{n}$ and $\alpha_{i,j} = \frac{1}{n-1}$ ($i \neq j$).
Proof: If variables $O$ and $R$ are independent, then $P(R = j \mid O = j) = P(R = j \mid O = i)$ for all $j, i \in \{0, \dots, n-1\}$ with $i \neq j$. The equation can be rewritten as $p = (1-p)\alpha_{i,j}$. Since $\sum_j \alpha_{i,j} = 1$, namely $(n-1)\frac{p}{1-p} = 1$, we get $p = \frac{1}{n}$ and $\alpha_{i,j} = \frac{1}{n-1}$. On the other hand, when $p = \frac{1}{n}$ and $\alpha_{i,j} = \frac{1}{n-1}$, $P(R = j \mid O = i) = \frac{1}{n}$ holds for all $j, i$, which implies the independence between $O$ and $R$.
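Theorem 2 can be checked with exact arithmetic. The sketch below assumes an attacker who lies uniformly over the n - 1 wrong ratings, and tests independence by checking that every row of P(R = j | O = i) is identical, which is equivalent to independence when every observation has non-zero prior probability. The helper names and the choice n = 4 are illustrative.

```python
from fractions import Fraction

def naive_channel(p, n):
    """P(R=j | O=i) when attackers never tell the truth and lie uniformly."""
    lie = Fraction(1, n - 1)
    return [[p if i == j else (1 - p) * lie for j in range(n)] for i in range(n)]

def is_independent(channel):
    """R is independent of O iff all rows of the channel coincide."""
    return all(row == channel[0] for row in channel)

n = 4
at_threshold = naive_channel(Fraction(1, n), n)   # p = 1/n
above_threshold = naive_channel(Fraction(1, 2), n)

print(is_independent(at_threshold))
print(is_independent(above_threshold))
```

At p = 1/n every entry equals 1/n, so the rating reveals nothing; for any other p the diagonal differs from the off-diagonal entries and the rows no longer coincide.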
Intuitively, we expect that lower values of $p$ (the advisor is more probably an attacker) should make it easier to hide $O$. However, Theorem 2 implies that when $p < \frac{1}{n}$, the observation cannot be perfectly hidden, whereas for $p = \frac{1}{n}$, it can. Therefore, we need to alter the naive rating model to accommodate the case $p < \frac{1}{n}$. When $p < \frac{1}{n}$, the independence of $O$ and $R$ implies that $\sum_{j \neq i} \alpha_{i,j} < 1$, which is impossible in the naive rating model. This is caused by the fact that the advisor is forced to lie if he is strategic. Therefore, we must allow strategic/dishonest advisors to report the truth with non-zero probability. In fact, it is natural that strategic advisors may sometimes report the truth, as part of the deceit. Consider a real-world scenario: in a card game with only one Ace, one King and one Queen, the highest card wins. Alice asks her (dishonest) opponent Bob what his card is. If Bob always lies, then when he states Queen and Alice has the King, Alice knows that Bob actually has the Ace. Thus, as a strategic player, Bob should sometimes report the truth to deceive Alice.
It is sometimes assumed that dishonesty implies not telling the truth.The above argument establishes that not allowing attackers to tell the truth would be a modelling error.
Therefore, we introduce an alternative rating option $\alpha_{j,j}$ (e.g., $\alpha_{0,0}$ when $j = 0$) for a dishonest advisor, as depicted in the proper rating model in Figure 1(b). Now the strategy to achieve ultimate attacks changes as follows:
Theorem 3: In the proper rating model in Figure 1(b), rating $R$ is independent of observation $O$ iff $p \le \frac{1}{n}$ and $\alpha_{i,j} = \frac{p}{1-p} + \alpha_{j,j}$ ($i \neq j$).
Proof: If $O$ and $R$ are independent, then $\forall i, j$ with $i \neq j$, $P(R = j \mid O = j) = P(R = j \mid O = i)$. The equation can be rewritten as $p + (1-p)\alpha_{j,j} = (1-p)\alpha_{i,j}$, or $\alpha_{i,j} = \frac{p}{1-p} + \alpha_{j,j}$. Taking $i$ fixed and summing over all $j \neq i$ on both sides, we get $1 - \alpha_{i,i} = (n-1)\frac{p}{1-p} + \sum_{j \neq i} \alpha_{j,j}$, which rearranges to $\sum_j \alpha_{j,j} = \frac{1-np}{1-p}$. Since $\sum_j \alpha_{j,j} \ge 0$, we get $\frac{1-np}{1-p} \ge 0$ and $p \le \frac{1}{n}$. On the other hand, if $p \le \frac{1}{n}$ and $\alpha_{i,j} = \frac{p}{1-p} + \alpha_{j,j}$, then $P(R = j \mid O = i) = p + (1-p)\alpha_{j,j}$ holds for all $j, i$, which means $O$ and $R$ are independent.
When $\sum_j \alpha_{j,j} = 0$, we get $\alpha_{i,j} = \frac{p}{1-p}$. As $\sum_j \alpha_{i,j} = 1$, we get $p = \frac{1}{n}$ and $\alpha_{i,j} = \frac{1}{n-1}$, in which case Theorem 3 reduces to Theorem 2. Note that $\sum_j \alpha_{j,j} > 0$ may occur in ultimate attacks, which implies dishonest advisors may sometimes report the truth without leaking information.
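An ultimate attack in the proper rating model can be constructed explicitly from the condition in Theorem 3. The sketch below is one possible construction: it assumes the truth-telling mass, whose total must equal (1 - np)/(1 - p), is spread evenly over the diagonal. Any other non-negative split with the same total would also satisfy the theorem. Function names and the example values n = 4, p = 1/10 are illustrative.

```python
from fractions import Fraction

def ultimate_attack(p, n):
    """Build an attacker strategy alpha that makes R independent of O (needs p <= 1/n)."""
    if p > Fraction(1, n):
        raise ValueError("no ultimate attack exists for p > 1/n")
    # Assumption: the diagonal mass (1 - n*p)/(1 - p) is shared equally.
    diag = (1 - n * p) / (1 - p) / n
    shift = p / (1 - p)
    return [[diag if i == j else shift + diag for j in range(n)] for i in range(n)]

def channel(p, alpha):
    """P(R=j | O=i) = p*[i==j] + (1-p)*alpha[i][j]."""
    n = len(alpha)
    return [[(p if i == j else 0) + (1 - p) * alpha[i][j] for j in range(n)]
            for i in range(n)]

n, p = 4, Fraction(1, 10)
alpha = ultimate_attack(p, n)
table = channel(p, alpha)

print(alpha[0])
print(table[0] == table[1] == table[2] == table[3])
```

With p = 1/n the diagonal mass is zero and the construction coincides with the uniform lying strategy of Theorem 2.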
It is common for trust and reputation systems, as well as some security-related systems, to have the precondition that at least half of the participants are honest. Theorem 3 suggests that this requirement may be too strong. It indicates that $R$ and $O$ cannot be independent when $p > \frac{1}{n}$, which means that, for $n > 2$ and $\frac{1}{2} > p > \frac{1}{n}$, in the frequentist interpretation, even if the advisee selects an advisor from a population that contains more attackers than honest advisors, the advisee would still learn from the rating. The larger $n$ becomes, the larger the fraction of attackers in the population is allowed to be.

C. Minimizing Information Leakage
Ultimate attacks are the worst-case attacks since an advisee cannot learn anything about the truth. Although ultimate attacks cannot be achieved when $p > \frac{1}{n}$, some strategies should still be better at hiding the observations than others. To capture this, we quantitatively measure how much a rating is correlated with or dependent on the observation, using information leakage (Definition 3 in Section III). The information leakage between $R$ and $O$ measures how much information $R$ provides about $O$. There is zero information leakage iff $R$ and $O$ are independent, i.e., under ultimate attacks. The less information $R$ leaks, the better $O$ is hidden. The impact of an attack can then be quantified by information leakage. An attack has a larger impact than another attack when its information leakage is less.
Below, we aim to find attack strategies with the largest impact for $p > \frac{1}{n}$, namely the strategies that minimise the information leakage. We may refer to these strategies as the worst-case strategies. The attacker partially controls $R$ given $O$, so $H(O|R)$ is a variable. $H(O)$ is not controlled by the attacker, so it is a constant.
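To make the comparison of impacts concrete, the sketch below computes the leakage between O and R for two attacker strategies under a uniform prior on O. Both strategies, the names `constant_attack` and `uniform_lie_attack`, and the values n = 3 and p = 0.5 are illustrative assumptions rather than examples from the paper.

```python
import math

def channel(p, alpha):
    """P(R=j | O=i) = p*[i==j] + (1-p)*alpha[i][j]."""
    n = len(alpha)
    return [[(p if i == j else 0.0) + (1 - p) * alpha[i][j] for j in range(n)]
            for i in range(n)]

def leakage(ch):
    """Mutual information between O and R in bits, with a uniform prior on O."""
    n = len(ch)
    p_r = [sum(ch[i][j] for i in range(n)) / n for j in range(n)]  # marginal of R
    return sum(ch[i][j] / n * math.log2(ch[i][j] / p_r[j])
               for i in range(n) for j in range(n) if ch[i][j] > 0)

# Attacker always reports rating 0, whatever was observed.
constant_attack = [[1.0, 0.0, 0.0],
                   [1.0, 0.0, 0.0],
                   [1.0, 0.0, 0.0]]
# Attacker picks one of the two wrong ratings uniformly at random.
uniform_lie_attack = [[0.0, 0.5, 0.5],
                      [0.5, 0.0, 0.5],
                      [0.5, 0.5, 0.0]]

p = 0.5
leak_constant = leakage(channel(p, constant_attack))
leak_uniform = leakage(channel(p, uniform_lie_attack))
print(leak_constant, leak_uniform)
```

Under these assumptions the uniform-lie attacker leaks less than the constant attacker, so it has the larger impact in the sense defined above.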
Definition 4: The level strategy is the strategy where $\forall j,\ \alpha_{j,j} = 0$ and $\forall i \neq j,\ \alpha_{i,j} = \frac{1}{n-1}$.
The level strategy minimises information leakage:
Theorem 4: For $p \ge \frac{1}{n}$, the level strategy minimises the information leakage of $O$ given $R$.
Proof: Let [...] Inequality 2 is derived based on Jensen's inequality (Theorem 1 in Section III). Inequality 3 is derived based on the property that $f(x)$ is superlinear and $p \ge \frac{1}{n}$. Finally, note that applying the level strategy from Definition 4 to term 1 yields term 3. Hence, the level strategy minimises information leakage. When $p = \frac{1}{n}$, the level strategy leads to zero information leakage, as proven in Theorem 2.
Now, we have found the worst-case attack strategies for advisors with honesty degree $p \in [0, 1]$. Specifically, when $p < \frac{1}{n}$, the strategy requires a dishonest advisor to report the truth sometimes. When $p \ge \frac{1}{n}$, the strategy requires the advisor to uniformly choose a dishonest rating. Moreover, an ultimate attack with zero information leakage can only be achieved when $p \le \frac{1}{n}$. An advisee can still get some information about the truth when $p > \frac{1}{n}$. To illustrate our results, we plot the information leakage of $O$ in the worst-case attacks, as a function of $p$ (with $n = 3$ and $n = 10$) and of $n$ (with $p = 0.2$ and $p = 0.8$), in Figure 2. The figure shows that when $p \le \frac{1}{n}$ (equivalently, $n \le \frac{1}{p}$), the information leakage is zero. When the difference between $p$ and $\frac{1}{n}$ increases, the information leakage increases.
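Theorem 4 can be probed numerically. The sketch below compares the leakage of the level strategy against randomly sampled attacker strategies for one choice of n and p with p >= 1/n. It is a sanity check under stated assumptions, not a proof: the sampling scheme, the seed, and the values n = 4 and p = 0.4 are arbitrary. The closed form in `level_leakage` is derived here from the symmetry of the level strategy (each row of the channel is a permutation of p and n - 1 copies of (1 - p)/(n - 1)) and is not quoted from the paper.

```python
import math
import random

def channel(p, alpha):
    """P(R=j | O=i) = p*[i==j] + (1-p)*alpha[i][j]."""
    n = len(alpha)
    return [[(p if i == j else 0.0) + (1 - p) * alpha[i][j] for j in range(n)]
            for i in range(n)]

def leakage(ch):
    """Mutual information between O and R in bits, with a uniform prior on O."""
    n = len(ch)
    p_r = [sum(ch[i][j] for i in range(n)) / n for j in range(n)]
    return sum(ch[i][j] / n * math.log2(ch[i][j] / p_r[j])
               for i in range(n) for j in range(n) if ch[i][j] > 0)

def level_strategy(n):
    """Definition 4: never tell the truth, lie uniformly."""
    return [[0.0 if i == j else 1.0 / (n - 1) for j in range(n)] for i in range(n)]

def level_leakage(p, n):
    """Derived closed form: log n + p log p + (1-p) log((1-p)/(n-1)), for 0 < p <= 1."""
    tail = (1 - p) * math.log2((1 - p) / (n - 1)) if p < 1 else 0.0
    return math.log2(n) + p * math.log2(p) + tail

def random_strategy(n, rng):
    """Sample a row-stochastic matrix; truth-telling is allowed (proper model)."""
    rows = []
    for _ in range(n):
        weights = [rng.random() for _ in range(n)]
        total = sum(weights)
        rows.append([w / total for w in weights])
    return rows

n, p = 4, 0.4
rng = random.Random(0)
best = leakage(channel(p, level_strategy(n)))
others = [leakage(channel(p, random_strategy(n, rng))) for _ in range(1000)]
print(best, min(others))
```

None of the sampled strategies leaks less than the level strategy, and the closed form returns zero exactly at p = 1/n, matching the threshold discussed above.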
We have validated our information-theory based definition of the worst-case unfair rating attacks in [6] (see "Robustness Analysis" section). Three popular trust models, BLADE [19], TRAVOS [33] and MET [24], were considered. We showed that even when correct $p$ values are provided, these models show poor accuracy in trust evaluation under the worst-case attacks. This is in line with our theoretical results: when no information exists in ratings (ultimate attacks), the truth cannot be deduced, and when there exists minimal information (other worst-case attacks), the truth may be derived if the strategies are known (see ITC in [6]). In this paper, we do not present these simulations, as we want to focus on our new studies in Sections V and VI.
For attacks with a non-zero amount of information, there is no straightforward relation between the amount of information leakage and the accuracy of trust evaluation, or other types of decision making. This means that more information leakage does not necessarily lead to more accurate trust evaluation. Different models or mechanisms may react differently to the same attack or to attacks with the same amount of information. For example, bad-mouthing ratings are oppositely related to the truth (see their information leakage in [34]). They are filtered in some approaches, but are made use of in some others (see Section II). Models that are designed against some specific types of attacks may show lower accuracy under some other attacks, even if the latter have more information leakage. Therefore, we do not build an explicit connection between the amount of information leakage and the accuracy of decision making, but observe the fact that information leakage puts an upper bound on the accuracy that is achievable. The worst-case attacks result in the tightest upper bound for accuracy. Hence, we argue that the worst-case attacks should be considered when designing a secure system.

V. QUANTIFYING ATTACKS UNDER SUBJECTIVE RATING
In reality, the deviation of ratings from the observed facts may not only come from dishonest intentions of advisors. Given the same observation, different honest advisors may also report different ratings. A very important reason is that their opinions regarding the observation are subjective. Subjectivity means "based on or influenced by personal feelings, tastes, or opinions" (Oxford Dictionary). Different advisors may have different subjective preferences. For example, they may put emphasis on different features, either when grading a target, or when suggesting an option. One honest user may rate a site unsafe due to excessive advertisements, whereas another honest user rates it safe, since it does not operate malware and delivers the promised functionality. Even if they emphasize the same features, they may have different expectations. In the example, when two honest users both take the amount of advertisements as the criterion for safety, one user may find it excessive and rate the site unsafe, whereas the other may find it acceptable and rate the site safe.
Our interest in subjectivity concerns the extent to which honest ratings determine how the advisee would observe the truth. In the extreme case of the objective rating scenario from Section IV, an honest rating determines the truth completely. For example, if an honest advisor reports software as malware, then it is malware, and the advisee considers it as malware. In the subjective scenarios, there is no one-to-one link between the two. For example, a hotel room can be "good enough" or "not good enough" to an advisee, but advisors provide ratings in the form of 1 to 5 stars. In this example, a low star rating likely implies "not good enough" whereas a high star rating likely implies "good enough".
Subjectivity may reduce the usefulness of ratings. For example, a positive honest rating about a clean hotel in a bad neighbourhood may be found misleading by a user who cares about location rather than cleanliness. Both subjective ratings and dishonest ratings introduce inaccuracy. Some researchers treat them the same without differentiating the underlying motivation of advisors [9], [13], while others study them orthogonally by distinguishing subjective advisors from dishonest ones [28]. However, it was an open question whether (and if so, how) subjectivity influences the effects of unfair rating attacks. Below, we formally study this issue.

A. Modelling Subjective Rating
Given an observation, an honest but subjective advisor may choose among various ratings, not only the rating that equals the observation. To include subjectivity, the objective rating model (e.g., as depicted in Figure 1(b)) needs to be extended.
We still use the notations p, α, O, R, H, ¬H with the same meaning as in the objective rating model. However, here we allow ratings to have a different number of options than the observations, with the outcomes of O being $0, \ldots, n_o - 1$ and those of R being $0, \ldots, n_r - 1$. We introduce an $n_o \times n_r$ matrix σ to characterize the rating behaviour of an honest advisor, with $\sigma_{d,r} = P(R = r \mid O = d, H)$. The subscripts d and r also denote the row and column index of the entry $\sigma_{d,r}$, with both starting from 0. In the objective case, σ would be an identity matrix. The probability of receiving r when the truth is d is $P(R = r \mid O = d) = p \cdot \sigma_{d,r} + (1 - p) \cdot \alpha_{d,r}$. The matrix μ is defined as a shorthand for P(R|O), $\mu = p\sigma + (1 - p)\alpha$. We formally define subjective rating as follows:

Definition 5: σ-subjective rating is a rating function $f_\sigma$ with

Function $f_\sigma$ defines how an advisor's attributes p, σ, α decide its behaviour, i.e., the link between R and O. Note that the objective rating model in Figure 1(b) is a special case of $f_\sigma$ in which σ is an identity matrix I. We refer to it as $f_I$, which is an objective rating function.
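The link between R and O described above is a mixture of honest and strategic behaviour, weighted by p. A minimal sketch of that mixture follows; the function name `rating_channel` and the example matrices are hypothetical and only serve to illustrate a case where $n_o = 2$ and $n_r = 3$.

```python
def rating_channel(p, sigma, alpha):
    """Return mu with mu[d][r] = p * sigma[d][r] + (1 - p) * alpha[d][r].

    sigma[d][r] is the honest behaviour P(R=r | O=d, H) and alpha[d][r] the
    attacker strategy P(R=r | O=d, not H); both are n_o x n_r row-stochastic.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    if len(sigma) != len(alpha) or any(len(s) != len(a) for s, a in zip(sigma, alpha)):
        raise ValueError("sigma and alpha must have the same shape")
    return [[p * s + (1.0 - p) * a for s, a in zip(s_row, a_row)]
            for s_row, a_row in zip(sigma, alpha)]

# Hypothetical example: two observations, three rating options.
example_sigma = [[0.8, 0.2, 0.0],
                 [0.1, 0.3, 0.6]]
example_alpha = [[0.0, 0.0, 1.0],
                 [1.0, 0.0, 0.0]]
example_mu = rating_channel(0.7, example_sigma, example_alpha)
```

Each row of the result is again a probability distribution over ratings, since it is a convex combination of two such distributions.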
A rating model for $f_\sigma$ is presented in Figure 3(a). There are four outcomes of O. R also has four options. $\sigma_{0,0}$ denotes the conditional probability of an honest advisor reporting 0 when 0 is the observed fact.
In Figure 4, we depict four examples of σ matrices that define the rating behaviour of honest advisors: $\sigma_b$, $\sigma_c$, $\sigma_d$ and $\sigma_e$. Recall that rows are values for O, and columns are values for R. In the objective rating model, a value for O corresponds to the ground truth, but in the subjective rating model, it corresponds to how the advisee would view the truth. Also unlike the objective rating model, there is no one-to-one correspondence between R of honest advisors and O, e.g., r = 0 may correspond to both o = 1 and o = 2, and they may potentially mean different things. In our example, the two are equinumerous.
Matrix $\sigma_b$ is the identity matrix, denoting objective rating. Ratings and opinions correspond in $\sigma_b$. Even if they do not, as long as an honest rating (R) completely determines what the opinion of the advisee would be (O), e.g., r = 2 determines o = 2, the rating is actually objective. Matrix $\sigma_c$ is close to matrix $\sigma_b$, signifying low subjectivity. Values of R determine values of O to some degree, since for a given value of R, there is one highly probable value for O (and two improbable alternatives).
For matrix $\sigma_d$, the honest ratings do not determine the would-be opinion of the advisee, since R probably equals 1, in which case O is equally likely to be 0, 1 or 2. Matrix $\sigma_d$ is considered to be highly subjective. A possible alternative view would be to look instead at the extent to which the advisee's view on the truth (O) determines honest ratings (R). Matrices $\sigma_b$ and $\sigma_c$ remain (nearly) objective in this view. Matrix $\sigma_d$ could naively be considered less subjective than $\sigma_c$, since the value of R is determined to probably be a specific value given an o. However, the value of R cannot be determined by O or vice versa, as they are independent ($\forall o, r:\ p(r) = p(r \mid o)$). To measure the extent to which the would-be opinion determines the honest rating, we would have to normalise the probability by dividing by the prior probability of the rating. Note that $p(o \mid r) = \frac{p(r \mid o)}{n_o \, p(r)}$, meaning that the conditional probabilities $p(o \mid r)$ for different o given an r are determined by the probabilities $p(r \mid o)$ in a column; hence these two views actually coincide.
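The normalisation step above can be sketched in a few lines. This is an illustrative sketch, assuming a uniform prior on O (which is what the factor $n_o$ in the formula implies). The matrix `sigma_d_like` is inferred from the description of $\sigma_d$ in the text (identical rows, rating 1 almost certain); the actual matrix is shown in Figure 4 and is not reproduced here.

```python
def posterior_given_rating(sigma):
    """post[r][o] = p(o | r) = p(r | o) / (n_o * p(r)), uniform prior on O.

    Returns None for a rating r that honest advisors never give.
    """
    n_o, n_r = len(sigma), len(sigma[0])
    post = []
    for r in range(n_r):
        column = [sigma[o][r] for o in range(n_o)]
        p_r = sum(column) / n_o  # prior probability of rating r
        post.append([c / (n_o * p_r) for c in column] if p_r > 0.0 else None)
    return post

# Inferred from the text: R is independent of O and almost always equals 1.
sigma_d_like = [[0.01, 0.98, 0.01]] * 3
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

flat = posterior_given_rating(sigma_d_like)   # every rating leaves O uniform
sharp = posterior_given_rating(identity)      # every rating pins O down
```

For the independent matrix every posterior is uniform over O, even though each row concentrates on a single rating, which is the point made about $\sigma_d$ above.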
Matrices $\sigma_b$, $\sigma_c$ and $\sigma_d$ are unlikely to be the actual matrices of advisors, as they are extreme cases. Matrix $\sigma_e$ is a more realistic example of a (highly) subjective rating. If the value of R is i, then the most probable value for O is also i. Notice that the reverse is not true, since the advisor is most likely to rate R = 0 whenever the advisee would have opinion O = 1. Finally, the rating R = 1 weakly determines that O is probably 1, but it strongly determines that O probably is not 2. In the model we define in Section V-D, we take this into account when defining a partial order of subjectivity.

B. Information Leakage
Based on Definition 3 in Section III, we can compute the information leakage of subjective rating $f_\sigma$:

Proposition 2: Given an attack strategy α, the information leakage of rating $f_\sigma$ is:

Proposition 2 implies that a given attack may show a different impact (or information leakage) under different subjective rating behaviour (σ).
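The dependence of the leakage on σ can be illustrated numerically. The sketch below assumes that information leakage is the mutual information I(O; R) in bits, with a uniform prior on O and channel $\mu = p\sigma + (1 - p)\alpha$; it is not a transcription of Proposition 2. The matrix `subjective` is a hypothetical example, not one of the matrices in Figure 4.

```python
import math

def information_leakage(p, sigma, alpha):
    """I(O;R) in bits for mu = p*sigma + (1-p)*alpha and a uniform prior on O."""
    n_o, n_r = len(sigma), len(sigma[0])
    mu = [[p * sigma[o][r] + (1.0 - p) * alpha[o][r] for r in range(n_r)]
          for o in range(n_o)]
    p_r = [sum(mu[o][r] for o in range(n_o)) / n_o for r in range(n_r)]
    total = 0.0
    for o in range(n_o):
        for r in range(n_r):
            if mu[o][r] > 0.0:
                total += mu[o][r] / n_o * math.log2(mu[o][r] / p_r[r])
    return total

# One fixed attack (uniform lying), evaluated under two honest behaviours.
attack = [[0.0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]]
objective = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
subjective = [[0.7, 0.3, 0.0], [0.5, 0.4, 0.1], [0.2, 0.1, 0.7]]  # hypothetical

if __name__ == "__main__":
    print(information_leakage(0.5, objective, attack))
    print(information_leakage(0.5, subjective, attack))
```

With these example values, the same attack leaks less information when honest advisors are subjective than when they are objective.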
On the other hand, given a subjectivity matrix σ, is it possible for attackers to find a strategy α such that no information is leaked (the ultimate attack, or the worst case for an advisee)? And if it is possible, under which conditions on the values of p, $n_o$, $n_r$ and σ? We study these questions in Section V-C. In particular, we investigate whether subjectivity changes the conditions to achieve ultimate attacks, compared to objective rating. In Section V-D we quantitatively study the relationship between the degree of subjectivity and the impact of attacks (quantitative robustness comparison).

C. Ultimate Attacks
Ultimate attacks mean that there is zero information leakage of the observed fact O (refer to the definition in Section IV-B). No matter how sophisticated a system is, the ratings are completely useless under ultimate attacks. Fortunately, the circumstances in which an attacker can perform an ultimate attack are rare. Yet, in some settings they are rarer than in others. Hence, we can use the difficulty of performing an ultimate attack as a proxy for the robustness of a system.
Theorem 5 proves that it is possible for an attacker to completely hide information if $1 - p \geq 1 - \frac{1}{\sum_r \sigma_{o^*,r}}$, where $\sigma_{o^*,r} = \max_o \sigma_{o,r}$ (meaning that, in a frequentist interpretation, the fraction of attackers is big enough), and that the corresponding strategy depends on the values of p and σ. There already exist methods to distinguish subjectivity differences of honest advisors, e.g., by clustering [28], [29]. The distribution of cluster memberships determines how probable certain subjective behaviours are, which coincides with the meaning of parameter σ. In order not to underestimate the power of the attacker, we must assume that the attacker also has access to p and σ. Furthermore, it is likely that the attacker can arrive at the same result for p and σ, e.g., by performing the same computations, assuming ratings are public knowledge.
It is obvious that $\sum_r \sigma_{o^*,r} \leq n_r$, with equality only if all $\sigma_{o^*,r}$ equal 1. The identity matrix has all maximal elements equal to 1, and thus it is easy to see that this generalises the results from Section IV. Even if there are not enough attackers to perform an ultimate attack on an objective-rating system ($\frac{1}{n_r} < p$), there may be sufficiently many attackers to do so on a system with subjectivity, namely when $\frac{1}{n_r} < p \leq \frac{1}{\sum_r \sigma_{o^*,r}}$. The introduction of subjectivity makes it easier for attackers to completely hide information, thus leaving a system less robust.
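One way to see why the condition on p involves the column maxima of σ is to construct an attack explicitly. The sketch below is one possible construction consistent with the stated condition, not necessarily the one used in the proof of Theorem 5: it picks a common target distribution m over ratings with $m_r \geq p \cdot \max_o \sigma_{o,r}$ and solves $\mu = p\sigma + (1 - p)\alpha$ for α. The function names and the matrix `sigma` are hypothetical, and a uniform prior on O is assumed when measuring leakage.

```python
import math

def ultimate_attack(p, sigma):
    """Return alpha such that all rows of p*sigma + (1-p)*alpha coincide, or None.

    Feasible when p * sum_r max_o sigma[o][r] <= 1 (and p < 1).
    """
    n_o, n_r = len(sigma), len(sigma[0])
    col_max = [max(sigma[o][r] for o in range(n_o)) for r in range(n_r)]
    slack = 1.0 - p * sum(col_max)
    if p >= 1.0 or slack < -1e-12:
        return None
    # Common rating distribution shared by every observation.
    m = [p * c + slack / n_r for c in col_max]
    return [[max(0.0, (m[r] - p * sigma[o][r]) / (1.0 - p)) for r in range(n_r)]
            for o in range(n_o)]

def leakage_bits(p, sigma, alpha):
    """I(O;R) in bits for mu = p*sigma + (1-p)*alpha, uniform prior on O."""
    n_o, n_r = len(sigma), len(sigma[0])
    mu = [[p * sigma[o][r] + (1.0 - p) * alpha[o][r] for r in range(n_r)]
          for o in range(n_o)]
    p_r = [sum(mu[o][r] for o in range(n_o)) / n_o for r in range(n_r)]
    return sum(mu[o][r] / n_o * math.log2(mu[o][r] / p_r[r])
               for o in range(n_o) for r in range(n_r) if mu[o][r] > 0.0)

sigma = [[0.7, 0.3, 0.0], [0.5, 0.4, 0.1], [0.2, 0.1, 0.7]]  # hypothetical
alpha = ultimate_attack(0.5, sigma)  # column maxima sum to 1.8, so p <= 1/1.8 works
```

For the identity matrix the same function only succeeds when p is at most 1/3, which illustrates how subjectivity widens the range of p in which information can be hidden completely.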
Further, $o^*$ in $\sigma_{o^*,r}$ denotes the observation under which reporting r is the most probable. The value of $\sigma_{o^*,r}$ reflects the subjectivity difference behind reporting r. The smaller $\sigma_{o^*,r}$ is, the more uniformly the values of $\sigma_{o,r}$ are distributed over o, and intuitively, the more probable it is that multiple o are reported as the observation behind r, which indicates more subjectivity difference. Theorem 5 implies that, with more subjectivity difference, the fraction of attackers necessary in the population (from which we select our advisor) to make a rating completely useless decreases.
We can compare the examples from Figure 4, and compute the largest honesty probability p (equivalently, the smallest attacker probability 1 − p) for which an advisor is able to perform the ultimate attack. For $\sigma_b$, it is $p \leq \frac{1}{1+1+1} \approx 0.333$; for $\sigma_c$, $p \leq \frac{1}{0.97+0.96+0.96} \approx 0.346$; for $\sigma_d$, $p \leq \frac{1}{0.01+0.98+0.01} = 1$; and for $\sigma_e$, $p \leq \frac{1}{0.7+0.4+0.7} \approx 0.556$. In the case of $\sigma_d$, as O and R are already independent, attackers are not even necessary to obtain zero information.
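The arithmetic above can be reproduced directly from the column maxima quoted in the text. The sketch below is only a convenience for checking those four numbers; the helper name is hypothetical, and the full matrices from Figure 4 are not needed because only the largest entry of each column enters the bound.

```python
def honesty_threshold(column_maxima):
    """Largest p allowing an ultimate attack: 1 / sum_r max_o sigma[o][r], capped at 1."""
    total = sum(column_maxima)
    if total <= 0.0:
        raise ValueError("column maxima must sum to a positive value")
    return min(1.0, 1.0 / total)

# Column maxima as quoted in the text for the matrices of Figure 4.
thresholds = {
    "sigma_b": honesty_threshold([1.0, 1.0, 1.0]),
    "sigma_c": honesty_threshold([0.97, 0.96, 0.96]),
    "sigma_d": honesty_threshold([0.01, 0.98, 0.01]),
    "sigma_e": honesty_threshold([0.7, 0.4, 0.7]),
}

if __name__ == "__main__":
    for name, value in thresholds.items():
        print(f"{name}: p <= {value:.3f}")
```

The ordering of the thresholds follows the informal ordering of subjectivity: the less the honest ratings determine the observation, the more honest advisors an ultimate attack can tolerate.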

D. Quantitative Robustness Comparison
Besides ultimate attacks, we also want to investigate how the impact of attacks changes as the degree of subjectivity increases or decreases. First, we create an ordering of subjectivity, to be able to say that one advisor is more subjective than another. Our ordering is not complete, so when an advisor is more subjective than another advisor in one aspect, but less so in another aspect, then the two advisors may be incomparable.
We define the ordering $\sqsubseteq$ on matrices σ, which describe subjective rating behaviour, and take the natural extension to rating functions: $\sigma \sqsubseteq \sigma'$ iff $f_\sigma \sqsubseteq f_{\sigma'}$. There are some properties that any reasonable ordering of subjectivity matrices must have:
1) for $\sigma_b$, $\sigma_c$, $\sigma_d$, $\sigma_e$ from Figure 4:
2) the relation must be reflexive and transitive (i.e., it is a preorder). Anti-symmetry is not required, since two different matrices may be equally subjective.
3) an objective matrix $I_o$ (see footnote 8) is a maximal element (i.e., $\forall \sigma:\ I_o \sqsubseteq \sigma$).
4) the uniform matrix U is a supremum (i.e., $\forall \sigma:\ \sigma \sqsubseteq U$): honest ratings are unrelated to the truth.
The definition of subjectivity assumes a ranking ($\pi_i$) of which ratings are more appropriate for which observations, in which case the more objective scenario should assign more probability to more appropriate ratings. It may be the case that some ratings are more likely to be provided a priori, which can be corrected by normalising the terms, i.e., dividing by the prior probability. The prior probability is proportional to $\sum_i \sigma_{i,j}$, which we denote as $\sigma_j$. Formally:

Definition 6: Let σ and σ′ be n × n subjectivity matrices. A row $\sigma_j$ is less subjective than $\sigma'_j$, denoted $\sigma_j \sqsubseteq \sigma'_j$, if there exists a permutation $\pi_j$ s.t. $\pi(\sigma)_{i,j}$ and $\pi(\sigma')_{i,j}$ are nonincreasing over i, and

Then, a subjectivity matrix σ is less subjective than σ′, denoted $\sigma \sqsubseteq \sigma'$, when for all j, $\sigma_j \sqsubseteq \sigma'_j$.
Proposition 3: The relationship $\sqsubseteq$ is reflexive and transitive, and for all σ, $I \sqsubseteq \sigma \sqsubseteq U$.
We rely on the observation that sometimes probability mass can move from one value to another, whilst decreasing information leakage:

Lemma 1: Given $\mu_{i,j} \geq \frac{\mu_j}{n}$ and $\mu_{i',j} \leq \frac{\mu_j}{n}$, we can define $\mu^*$ equal to μ except at i, j and i′, j where

For other $j^\dagger$, $\mu_{i,j^\dagger} = \mu^*_{i,j^\dagger}$, so the remaining sums are:

Using the order of subjectivity of reporting, we can formalise the notion that increasingly subjective reporting makes it easier for an attacker to decrease information leakage:

Theorem 6: For any ratings $f \sqsubseteq f'$, for any

Proof: The ranking of the rows $\pi_i$ from Definition 6 may not rank the values of μ. Wlog, there exists $\alpha^*$, s.t. $\mu^* = p\sigma + (1 - p)\alpha^*$ where $\pi_i$ ranks the values of

The $\alpha^*$ can be obtained by applying Lemma 1 to move probability from overly high ranked values to overly low ranked values; since σ follows the ranking, the resulting $\mu^*$ has the property that $\mu^* - p\sigma \geq 0$ and thus $\alpha^* = \frac{\mu^* - p\sigma}{1 - p} \geq 0$.
It remains to prove that there exists μ s.t.

Footnote 8: A matrix where every row and column has a single element equal to 1.

VI. ROBUSTNESS OF EXISTING APPROACHES TO DEAL WITH SUBJECTIVITY
We consider two types of approaches to deal with subjectivity: feature-based rating, which is popularly applied in practice to help resolve conflicting emphasis on features in overall rating, and clustering advisors, which is proposed in the literature to distinguish advisors with different subjectivity. These approaches aim to mitigate the influence of subjectivity, so it is interesting to study whether they improve the robustness against unfair rating attacks.

A. Feature-Based Rating
Feature-based rating refers to settings where advisors need to rate each feature of an observation (or a target) instead of providing an overall rating. For example, on Booking.com or Expedia.com, consumers can score multiple features of a hotel after their stay, such as cleanliness, comfort, location, facilities, staff and value for money. Compared with overall rating, distinguishing features helps avoid subjective ratings induced by different emphasis on features. Also, if all ratings are honest, potential consumers are presented with a more comprehensive view of the hotel.
The rating function $f_\sigma$ modelled in Figure 3(a) does not distinguish features, and we call it the overall subjective rating function. The model of a feature-based rating can be directly derived from $f_\sigma$. To better illustrate this, we first reformulate the model of $f_\sigma$. There, O represents an observation regarding a target, which contains all features of the observation that advisors care about. We can split all features into two groups according to whether the advisee cares about them, with O′ and O″ representing the groups of features that the advisee does and does not care about, respectively. We assume that cared features and uncared features are independent.

In a feature-based rating scenario, there are no uncared features. Hence, only variables O′ and R remain in the reformulated rating model (see Figure 3(b)). In this way, the only difference between a feature-based rating and $f_\sigma$ in Figure 3(a) is that variable O becomes O′. As long as $p(r \mid o')$ remains the same for every o′ and r, the feature-based rating has the same amount of information leakage as the overall rating. This implies that, when a given overall rating framework is changed to a corresponding feature-based rating framework, the information an advisee can learn may remain the same. For instance, if subjectivity differences over feature O′ remain unchanged, which means $\forall o', r:\ \varsigma_{o',r} = \sigma_{o',r}$, then attackers can choose the same strategy $\beta_{o',r} = \alpha_{o',r}$ to leak the same amount of information. By choosing a proper β, attackers may even cause less information leakage in feature-based rating. For ultimate attacks, if $\sum_r \varsigma_{o^*,r} < \sum_r \sigma_{o^*,r}$, the feature-based rating relaxes the condition for achieving ultimate attacks.
In summary, although it helps reduce subjectivity induced by different emphasis, feature-based rating does not necessarily improve robustness compared with overall rating. Intuitively, although subjectivity due to different emphasis is reduced in feature-based rating, subjectivity induced by differing expectations gets redistributed, as every advisor is forced to consider each feature separately. Hence, it is hard to judge whether subjectivity difference becomes smaller in feature-based rating.

B. Clustering Advisors
Thus far, we have taken an approach where we use a single matrix to model the subjectivity of all advisors. This is trivially sufficient when reasoning about a single advisor, as we did in Section V. There are two other cases where a single matrix is sufficient to model subjectivity for all advisors. The first is the rather unrealistic case where all advisors have the same subjective rating behaviour. The second is the case where we cannot distinguish the subjectivity of different advisors.
While these three cases are interesting, they are insufficient. Generally, we have multiple advisors with different kinds of rating behaviour, about which we have some historical data. Hence, in this section, we introduce a model that helps to reason about advisors with different subjectivity matrices.
1) Modelling: To deal with the general case, we investigate the popular clustering approach [27], [28]. Therein, advisors are assigned to clusters based on their (past) behaviour, and each cluster has its own behaviour model. Not only can the subjectivity matrices differ from cluster to cluster, we also allow the p-value to differ, to model clustering based on degrees of honesty. Finally, we assume that the attacker knows which cluster he is in. This implies that his strategy matrix is chosen to minimise total information leakage.
Assume there are k advisors in total, where $R_i$ represents the rating of the i-th advisor, and $R = R_0, \ldots, R_{k-1}$. Let there be several clusters, distinguished by superscript symbols (such as † and §) that denote association to a cluster. Quantities carrying a cluster's symbol belong to that cluster: the ratings R generated by all of its advisors, the probability p of its advisors' honesty, its subjectivity matrix σ and its strategy matrix α. The random variable $C_i$ dictates to which cluster the i-th advisor belongs, and $C = C_0, \ldots, C_{k-1}$. In other words, $p(r_i \mid d, C_i = c) = p\sigma + (1 - p)\alpha = \mu$, where p, σ, α and μ are those of cluster c. We use $c_j$ to denote the j-th advisor in cluster c. Clustering is typically based on previous behaviour of the advisors, and thus not related to what the observed facts are; so C is independent of O.
2) Robustness of Clustering: The crucial question is whether clustering increases the robustness of the system, which is equivalent to asking whether clustering increases the expected information leakage. Clustering indeed increases expected information leakage, except in the case where C has no impact on the relationship between O and R (Theorem 7). Therefore, clustering increases the expected information leakage. No matter what the attacker's behaviour is, more information is expected after clustering. However, we strongly conjecture that the benefits of clustering are even greater: clustering always outperforms not clustering.
A naive interpretation of the conjecture is that for all C, $I(O; R \mid C = c) \geq I(O; R)$. Not only does Theorem 7 not suffice to prove this version, it turns out to be false. As a counterexample, assume that for one cluster $c^{\S}$, its $\sigma^{\S}$-matrix has no information leakage, i.e., it is a cluster in which useless advisors are put. There is a non-zero probability that all advisors are put in this cluster, namely when the user is unfortunate enough to select only useless advisors. In this case, $I(O; R \mid C = c) = 0 < I(O; R)$. In the remainder of this section, we will nuance our claim that clustering always outperforms not clustering.
Intuitively, the reason the naive interpretation fails is that moving an advisor from one cluster to another changes the expected behaviour of a randomly chosen advisor. It only makes sense to compare a given clustering to a situation where the expectation of the behaviour is the same. We introduce the universal cluster for that purpose. The universal cluster, $c^{\S}$, has the desired property that $p(o \mid r, c^{\S}) = \ldots$, so each cell in the universal behaviour matrix is a weighted average of the corresponding cells in the matrices of the clusters. Similarly, $\forall o$, … + … . Now, $\alpha^{\S}$ may not minimise the information leakage for the given p and σ. Let $\alpha^{\$}$ be the attacker strategy that minimises information leakage. Let $c^{\$}$ be the minimal universal cluster, where $p^{\$} = p^{\S}$, $\sigma^{\$} = \sigma^{\S}$, and $\alpha^{\$}$ is the matrix that minimises $I(O; R \mid c^{\$})$.
Using the new terminology, we can express our conjecture as $I(O; R \mid c) \geq I(O; R \mid c^{\$})$. Note that it suffices to prove a simpler inequality to prove the conjecture:

Proof: First, note that it suffices to prove the statement for $c^{\S}$, since $I(O; R \mid c^{\$}) \leq I(O; R \mid c^{\S})$ by definition. Second, without loss of generality, we reorder the list of advisors such that each cluster's advisors occur consecutively. Then, if we can prove the proposition for two clusters, we can replace any pair of clusters by its universal cluster, and inductively apply the proposition. It remains to prove the proposition for two clusters:

3) Whether to Exclude Clusters: We discussed the robustness of clustering advisors. There exist various ways to deal with clusters. Some researchers choose to exclude clusters where the advisors are considered dishonest [29], while others propose to learn from clusters even where advisors are strategic [28]. We study the impact of these different ways on robustness.
Theorem 8 states that if a cluster provides no information about O, then it does not impact the correlation between O and other clusters. Corollary 1 implies that if a cluster has no information about O, then it can be completely excluded without making an advisee lose any information. Recall that the condition for a cluster to have zero information leakage is that the probability of an advisor being honest, p, is below a threshold (Theorem 3 or 5). The value of p is not obvious to an advisee, and clustering mechanisms may not estimate p accurately. If a cluster is considered dishonest and gets excluded while its real p value is above the threshold, then a user loses useful information. Similarly, if a cluster is considered very reliable while its p value is actually below the threshold, then an advisee gets no information from it.

VII. CONCLUSION
In this paper, we proposed a quantitative measurement of unfair rating attacks based on information theory. How much information ratings provide about the truth determines the impact of attacks. We studied the scenario in which an arbitrary advisor is rating a given subject, and we found the worst-case strategies against a user (meaning those causing the minimal information leakage) that an individual attacker can undertake in his rating.
We first considered the scenario where rating is assumed to be objective. A probabilistic rating model was built to reason about the possible rating behaviour of an arbitrary advisor who can have any degree of honesty. We found that if we select an advisor randomly from a population, that advisor can hide the truth completely (perform the ultimate attack) if the population contains at least n − 1 times more attackers than honest advisors, where n is the number of rating options. In that case, some of the attackers need to report the truth. Otherwise, the truth can still be learned from the ratings even if more than half of the advisors are strategic.
Considering that subjective rating is typically unavoidable in reality, we improved the proposed rating model in several ways: 1) given an observation, we allow honest advisors to choose different ratings; 2) we allow the options of ratings to differ from the options of observations; 3) we distinguish the features of an observation that an advisee cares about from those it does not care about.
We found that the introduction of subjectivity makes it easier for attackers to completely hide the truth. We also introduced an ordering of subjectivity, and found that more subjective rating makes a system less robust against unfair rating attacks. Since subjectivity decreases robustness, we studied whether existing methods of distinguishing subjectivity difference would improve robustness. Splitting ratings up, such that individual features are rated, may mitigate or exacerbate the problem. Clustering advisors with similar subjectivity, however, improves robustness. Clusters with sufficiently many attackers may be ignored or blocked without consequence.

Fig. 1. Objective rating models with n options of observable facts and ratings.

Fig. 2. The minimal information leakage of O varies with p and n (n > 1).
Formally, the independence of cared and uncared features is written $O' \perp\!\!\!\perp O''$. Outcomes of O′ (or O″) are the true values of the corresponding features, e.g., the score of a hotel's location. The total number of outcomes of O′ (O″) is denoted as $n_{O'}$ ($n_{O''}$). As there are three variables in rating, the 2D behaviour matrices σ, α in overall rating now become 3D matrices with dimensions $n_{O'} \times n_{O''} \times n_r$. Figure 3(b) presents the reformulated subjective rating model. Specifically, $\sigma_{0,1,3}$ means the probability that an honest advisor reports 3 when he observes 0 for O′ and 1 for O″. The subjective rating behaviour regarding features O′ can be characterized by the conditional probabilities $p(r \mid o')$. Let $\sigma_{o',r} = \frac{1}{n_{O''}} \sum_{o''} \sigma_{o',o'',r}$ and $\alpha_{o',r} = \frac{1}{n_{O''}} \sum_{o''} \alpha_{o',o'',r}$. Then, $p(r \mid o') = \sum_{o''} p(r, o'' \mid o') = p \cdot \sigma_{o',r} + (1 - p) \cdot \alpha_{o',r}$. The information leakage of both O′ and O″ is $I(O', O''; R)$. An advisee is interested in the information of O′ but not O″: $I(O'; R)$, which is determined by $\sigma_{o',r}$ and $\alpha_{o',r}$.
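The averaging over the uncared features can be written out concretely. The following is a small sketch, assuming that the uncared feature is uniformly distributed and independent of the cared feature, which is what the factor one over the number of uncared outcomes expresses. The function name and the example tensor are hypothetical.

```python
def marginalise_uncared(behaviour):
    """Collapse a 3D behaviour tensor indexed [cared][uncared][rating] to 2D.

    Each output cell is the average over the uncared outcomes, i.e. the
    behaviour as seen by an advisee who only cares about the first index.
    """
    result = []
    for per_cared in behaviour:
        n_uncared = len(per_cared)
        n_r = len(per_cared[0])
        result.append([sum(per_cared[u][r] for u in range(n_uncared)) / n_uncared
                       for r in range(n_r)])
    return result

# Hypothetical honest behaviour: 2 cared outcomes, 2 uncared outcomes, 2 ratings.
sigma_3d = [
    [[1.0, 0.0], [0.5, 0.5]],   # cared outcome 0
    [[0.0, 1.0], [0.2, 0.8]],   # cared outcome 1
]
sigma_cared = marginalise_uncared(sigma_3d)
```

The same function applies to the attacker tensor, after which the two resulting matrices can be mixed with weights p and 1 − p exactly as in the overall rating model.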
In the feature-based model, variable O becomes O′. To be distinguished from $\sigma_{o',r}$, $\alpha_{o',r}$ in the reformulated model, we use $\varsigma_{o',r}$ and $\beta_{o',r}$ to characterize subjective and strategic behaviour in a feature-based rating, respectively. $\varsigma_{o',r}$ and $\beta_{o',r}$ determine the information leakage of O′. Theorem 5 still holds: the condition to achieve ultimate attacks in feature-based rating is $p \leq \frac{1}{\sum_r \varsigma_{o^*,r}}$, with $\varsigma_{o^*,r} = \max_o \varsigma_{o,r}$. According to the definition of information leakage, the information leakage is unchanged as long as $p(r \mid o')$ remains the same for any o′.

Theorem 7: $I(O; R \mid C) \geq I(O; R)$, with equality iff C and O are conditionally independent under R.

Proof: First, note: $I(O; R \mid C) = H(O \mid C) - H(O \mid R, C)$, and $I(O; R) = H(O) - H(O \mid R)$. Since O and C are independent, it suffices to prove $H(O \mid R, C) \leq H(O \mid R)$. This is a known property of conditional entropy, and equality holds only if C and O are conditionally independent under R.
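The inequality can be checked on a small numerical example. The sketch below assumes a single advisor, a uniform prior on O, and a cluster label C drawn independently of O, so that $I(O; R \mid C)$ is the weighted average of the per-cluster leakages while $I(O; R)$ is the leakage of the averaged channel. The two clusters (one always truthful, one always reporting the opposite) are hypothetical.

```python
import math

def leakage_bits(mu):
    """I(O;R) in bits for channel mu[o][r] = P(R=r | O=o), uniform prior on O."""
    n_o, n_r = len(mu), len(mu[0])
    p_r = [sum(mu[o][r] for o in range(n_o)) / n_o for r in range(n_r)]
    return sum(mu[o][r] / n_o * math.log2(mu[o][r] / p_r[r])
               for o in range(n_o) for r in range(n_r) if mu[o][r] > 0.0)

def mixed_channel(weights, channels):
    """Channel seen when the cluster label is unknown: weighted average of cells."""
    n_o, n_r = len(channels[0]), len(channels[0][0])
    return [[sum(w * ch[o][r] for w, ch in zip(weights, channels))
             for r in range(n_r)] for o in range(n_o)]

def leakage_with_clusters(weights, channels):
    """I(O;R|C) = sum_c P(C=c) * I(O;R|C=c), valid when C is independent of O."""
    return sum(w * leakage_bits(ch) for w, ch in zip(weights, channels))

truthful = [[1.0, 0.0], [0.0, 1.0]]
inverting = [[0.0, 1.0], [1.0, 0.0]]
weights = [0.5, 0.5]

with_clusters = leakage_with_clusters(weights, [truthful, inverting])
without_clusters = leakage_bits(mixed_channel(weights, [truthful, inverting]))
```

In this example each cluster alone reveals O completely, yet the mixture of the two is uninformative, so knowing the cluster label recovers one full bit that is otherwise lost.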