A Similarity Measure Based on Bidirectional Subsethood for Intervals

With a growing number of areas leveraging interval-valued data, including the modeling of human uncertainty (e.g., in cybersecurity), the capacity to accurately and systematically compare intervals for reasoning and computation is increasingly important. In practice, well-established set-theoretic similarity measures, such as the Jaccard and Sørensen–Dice measures, are commonly used, while a wide range of possible measures has been explored axiomatically. This article identifies, articulates, and addresses an inherent and so far undiscussed limitation of popular measures: their tendency to be subject to aliasing, where they return the same similarity value for very different pairs of intervals. Aliasing risks counter-intuitive results and poor automated reasoning in real-world applications that depend on systematically comparing interval-valued system variables or states. Given this, we introduce new axioms establishing desirable properties for robust similarity measures and then put forward a novel set-theoretic similarity measure based on the concept of bidirectional subsethood, which satisfies both the traditional and the new axioms. The proposed measure is designed to be sensitive to variation in the size of intervals, thus avoiding aliasing. This article provides a detailed theoretical exploration of the proposed measure and systematically demonstrates its behavior using an extensive set of synthetic and real-world data. Specifically, the measure is shown to return robust outputs that follow intuition, which is essential for real-world applications. For example, we show that it is bounded below and above by the Jaccard and Sørensen–Dice similarity measures, respectively (when the minimum t-norm is used). Finally, we show that a dissimilarity or distance measure, which satisfies the properties of a metric, can easily be derived from the proposed similarity measure.


I. INTRODUCTION
SIMILARITY measures (SMs) are widely used in many areas, including decision making, data aggregation, approximate reasoning, and machine learning. Various SMs have come into use to capture likeness among objects, though each of the measures has its own strengths and weaknesses. Similarity is commonly represented by a nonnegative real number, often between 0, meaning objects are not similar at all, and 1, meaning they are identical. SMs are generally symmetrical, though, for certain objects, similarity can be better modeled by unidirectional or asymmetric functions [1]. In addition, SMs can be transformed to capture distance using various functions [2], showing dissimilarity among objects (e.g., color images as in [3]).
Interval-valued data have recently gained much interest for the modeling of uncertainty and vagueness, particularly in the modeling of survey data [4], the representation of symbolic data [5], and the capture of natural language expressions [6], as they offer a simple and efficient representation of uncertain, vague, and imprecise information. In such areas, intervals are often compared through various SMs. Among them, the Jaccard [7] and the Sørensen–Dice [8] SMs (the latter henceforth referred to as the Dice SM for convenience) are the most commonly used set-theoretic SMs in practice. These two measures provide symmetrical similarity, growing from a minimum (0) to a maximum value (1) in response to an increasing degree of overlap between two closed intervals.
Nevertheless, both of these, and indeed most set-theoretic SMs, frequently suffer from a so far undiscussed pitfall, best described as aliasing. Aliasing occurs when the same similarity value is generated for very different interval pairs. Fig. 1 shows an example of such interval pairs, for all of which the Jaccard and Dice SMs give the same similarity of 0.33 and 0.50, respectively. SMs returning such identical results for different pairs of intervals is at best counter-intuitive; at worst, it leads to incorrect inference in real-world applications. These measures exhibit aliasing because their sensitivity to changes in the relative size of the intervals is limited; that is, they are largely driven by the size of the intersection and union. However, it is reasonable to expect that similarity would vary with respect to both the overlap and the mutual similarity in size of the intervals.
We initially put forward the underpinnings of a new set-theoretic SM in [9] for pairs of (closed) intervals that considered their reciprocal overlapping ratios to capture similarity with a high degree of sensitivity.
The rest of this article builds on our initial work [9] while using the term subsethood rather than overlapping ratio as they are equivalent in practice. Moreover, subsethood is commonly used and well understood in the literature. The major contributions of this article are as follows.
1) Aliasing is identified and articulated as a risk and potential shortcoming affecting many popular SMs (see Footnote 1).
2) Going beyond the common axioms for being an SM, five new axioms are introduced along with their justifications to expand the axiomatic definition toward robust SMs. Furthermore, an axiomatic definition of subsethood for intervals is also provided (see Appendix).
3) A new set-theoretic SM is proposed that is designed to be sensitive to the variation in the size of intervals, thus avoiding aliasing.
4) The new SM is explored theoretically, showing its conformity with the expanded axiomatic definition, that is, both common and new axioms of a robust SM.
5) A new distance measure (DM) is derived from the new SM for estimating distance or dissimilarity between intervals, and it is proved to be a metric.
6) The utility of the new SM is demonstrated in the context of both synthetic and real-world cases. Using both synthetic and real-world interval-valued data, we demonstrate its intuitive behavior in line with popular SMs (specifically, being bounded by the Jaccard and Dice SMs), highlighting how the new SM follows expected and intuitive behavior while still addressing pitfalls of existing SMs (e.g., by avoiding aliasing).
The rest of this article is structured as follows. Section II briefly reviews SMs, including the Jaccard and Dice, as well as DMs. Section III discusses an axiomatic definition toward robust SMs for (closed) intervals, followed by proposing a new set-theoretic SM, which supports all axioms, and then derives a new distance metric from the new SM. In Section IV, we explore the behavior of the new SM with respect to the Jaccard and Dice measures, using both synthetic and real-world interval-valued datasets. Finally, Section V concludes this article. Table I presents a list of acronyms and notation used in this article to assist the reader.
Footnote 1: Beyond the Jaccard and Dice measures, other set-theoretic SMs, such as the Szymkiewicz-Simpson coefficient [10], Otsuka coefficient [11], Sokal-Sneath coefficient [12], and simple-matching coefficient [13], are also subject to aliasing for intervals.

II. BACKGROUND
We review SMs generally, including their most common properties arising from their axiomatic definition, followed by a detailed review of the Jaccard and Dice measures as the most widely used SMs for intervals in the literature. Next, we briefly discuss DMs and the properties required for the latter being a metric, followed by a review of subsethood, which is employed later in this article to derive a new SM. We focus throughout on measures applied to intervals in this article as the latter provide the foundation for the later extension to SMs for more complex data types, including fuzzy sets.

A. Similarity Measures
An SM S(a, b) → [0, 1] is a real-valued function that determines how two objects, a and b, are alike. Generally, the similarity between two objects is bounded by 0 and 1, where 0 means that both objects are completely different and 1 means that they are identical. The four common properties of an SM for sets a, b, and c are as follows [14].
[A3] Reflexivity:
The aforementioned properties are mirrored in (and in many cases arise from) the axiomatic definitions associated with a wide variety of established SMs. Going beyond these standard and generic properties, we now provide a more detailed review of both the Jaccard and Dice SMs as the most common SMs used in practice.
1) Jaccard SM: The Jaccard SM [7] is one of the most widely used set-theoretic SMs, and it satisfies all of the aforementioned properties. Generally, the Jaccard similarity of two crisp sets $a$ and $b$ is defined as the ratio of the cardinality (see Footnote 2) of their intersection and the cardinality of their union
$$S_J(a, b) = \frac{|a \cap b|}{|a \cup b|}. \tag{1}$$
Using the crisp set difference operation [15], (1) can be written as
$$S_J(a, b) = \frac{|a \cap b|}{|a \cap b| + |a \setminus b| + |b \setminus a|} \tag{2}$$
where $a \setminus b$ is the set of items that are in $a$ but not in $b$ and $b \setminus a$ is the set of items that are in $b$ but not in $a$. Equation (2) can also be derived from Tversky's parameterized ratio model of similarity [1] by setting the nonnegative factors $\alpha$ and $\beta$ to 1 and letting $f$ be a cardinality function. Note that this alternative form of the Jaccard SM at (2) is relevant for showing its relationship with the Dice and other SMs, detailed in Section III. Beyond crisp sets, the Jaccard SM is used to estimate the similarity of intervals or sets of intervals, for example in data fusion [16], [17], and that of fuzzy sets [18].
Footnote 2: The cardinality of a set $a$ is defined as the number of elements within $a$ [15]. In this article, we use the terms "cardinality" and "size" of a set interchangeably. For intervals, we also use "width."
A closed interval $a$ is a set of real numbers characterized by two endpoints $a^-$ and $a^+$ with $a^- < a^+$ (see Footnote 3). The interval $a$ is often represented as $[a^-, a^+]$ and its cardinality, size, or width is $|a| = |a^+ - a^-|$. For comparing the intervals $a$ and $b$, the Jaccard SM is expressed as
$$S_J(a, b) = \frac{|a \cap b|}{|a \cup b|} \tag{3}$$
where $|a| \neq 0$ and $|b| \neq 0$. $|a \cap b|$ is the size of the intersection between $a$ and $b$ and $|a \cup b|$ is the size of the interval segment(s) covering both $a$ and $b$. Hence, $S_J(a, b) = 1$ when $a$ and $b$ are completely overlapping and $S_J(a, b) = 0$ when they are not overlapping at all. Similar to (2), we can rewrite (3) as
$$S_J(a, b) = \frac{|a \cap b|}{|a \cap b| + |a \setminus b| + |b \setminus a|} \tag{4}$$
where $|a \setminus b|$ is the size of the interval segment of $a$ that is not overlapping with $b$ and $|b \setminus a|$ is the size of the interval segment of $b$ that is not overlapping with $a$.
A fuzzy set [21] is defined as a set where the set's elements have membership ranging between 0 and 1. Formally, a type-1 fuzzy set $A$ on a discrete and finite universe of discourse $X$ is written as [22]
$$A = \{(x, \mu_A(x)) \mid x \in X\} \tag{5}$$
where $\mu_A(x) \in [0, 1]$ is the membership grade of the element $x$ in $A$. For two type-1 fuzzy sets $A$ and $B$ on the discrete and finite universe $X$, the Jaccard SM can be written as [23]
$$S_J(A, B) = \frac{\sum_{i=1}^{n} \min\big(\mu_A(x_i), \mu_B(x_i)\big)}{\sum_{i=1}^{n} \max\big(\mu_A(x_i), \mu_B(x_i)\big)} \tag{6}$$
where $\mu_A(x_i)$ and $\mu_B(x_i)$ are the membership grades of $x_i$ in $A$ and $B$, respectively. Equation (6) yields a value of 1 when the fuzzy sets are identical and 0 when they are disjoint. It is noted that the Jaccard SM has been further extended for interval-valued fuzzy sets [24], [25] and type-2 fuzzy sets [26], [27], though this is not discussed further here.
Footnote 3: Note that $a$ is also known as a continuous [19] or convex [20] interval.
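The interval form of the Jaccard SM can be sketched in a few lines of Python. This is a minimal illustration only: it assumes intervals are given as `(lower, upper)` tuples of nonzero width, the helper names `width`, `intersection_width`, `jaccard`, and `jaccard_from_differences` are hypothetical, and the size of the union is taken as $|a| + |b| - |a \cap b|$. The example intervals are chosen for illustration and are not taken from the article's figures.

```python
def width(interval):
    """Width |a| of a closed interval given as a (lower, upper) tuple."""
    lower, upper = interval
    return upper - lower


def intersection_width(a, b):
    """Width of the overlap of a and b; 0.0 when they do not overlap."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def jaccard(a, b):
    """Jaccard SM as size of intersection over size of union, cf. (3)."""
    inter = intersection_width(a, b)
    union = width(a) + width(b) - inter
    return inter / union


def jaccard_from_differences(a, b):
    """Equivalent form based on the nonoverlapping segments, cf. (4)."""
    inter = intersection_width(a, b)
    a_only = width(a) - inter  # size of a \ b
    b_only = width(b) - inter  # size of b \ a
    return inter / (inter + a_only + b_only)


# Illustrative intervals (assumed values, not from the article).
a, b = (0.0, 2.0), (1.0, 3.0)
print(jaccard(a, b), jaccard_from_differences(a, b))
```

Both functions return the same value for any pair of intervals, which mirrors the equivalence of the two formulations.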
2) Dice SM: The Dice SM [8] is closely related to the Jaccard SM. To assess the similarity between two sets, it considers the ratio of the cardinality of their intersection and the average of their cardinalities. Like the Jaccard similarity, it produces outputs in [0, 1]. Specifically, for two crisp sets $a$ and $b$, the Dice similarity is expressed as
$$S_D(a, b) = \frac{2\,|a \cap b|}{|a| + |b|} \tag{7}$$
where $|a|$ is the cardinality of the set $a$. We can rewrite (7) by applying the crisp set difference operation [15]
$$S_D(a, b) = \frac{|a \cap b|}{|a \cap b| + \frac{1}{2}\big(|a \setminus b| + |b \setminus a|\big)}. \tag{8}$$
Note that (8) can also be obtained from Tversky's ratio model [1] when both nonnegative factors $\alpha$ and $\beta$ are 0.5. The alternative expressions of Jaccard at (2) and Dice at (8) show clearly that the averaging operation in the denominator of (8) results in the Dice similarity always being equal to (when the sets are identical) or larger than the Jaccard similarity. We expand on this in Section III.
In [16] and [17], the Dice similarity is used along with the Jaccard similarity for intervals. By following (4), the Dice similarity for two intervals $a$ and $b$ can be expressed as
$$S_D(a, b) = \frac{2\,|a \cap b|}{|a| + |b|} = \frac{|a \cap b|}{|a \cap b| + \frac{1}{2}\big(|a \setminus b| + |b \setminus a|\big)} \tag{9}$$
where $|a| \neq 0$ and $|b| \neq 0$. While less frequently used for fuzzy sets than Jaccard, the Dice SM is, for example, used in [28] and [29] for trapezoidal fuzzy numbers in the context of solving multicriteria decision-making problems.
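To make the relationship between the two measures concrete, the following Python sketch computes both for a few interval pairs. It is a hedged illustration: intervals are assumed to be `(lower, upper)` tuples of nonzero width, the function names are hypothetical, and the example pairs are chosen for illustration rather than taken from the article.

```python
def width(interval):
    """Width of a closed interval given as a (lower, upper) tuple."""
    return interval[1] - interval[0]


def intersection_width(a, b):
    """Width of the overlap of a and b; 0.0 when they do not overlap."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def jaccard(a, b):
    """Jaccard SM: size of intersection over size of union."""
    inter = intersection_width(a, b)
    return inter / (width(a) + width(b) - inter)


def dice(a, b):
    """Dice SM: size of intersection over the average interval width."""
    return 2.0 * intersection_width(a, b) / (width(a) + width(b))


# Illustrative pairs (assumed values): partial overlap, subset, identical.
pairs = [
    ((0.0, 2.0), (1.0, 3.0)),
    ((0.0, 1.0), (0.0, 4.0)),
    ((0.0, 1.0), (0.0, 1.0)),
]
for a, b in pairs:
    print(a, b, round(jaccard(a, b), 3), round(dice(a, b), 3))
```

For each of these pairs the Dice value is at least as large as the Jaccard value, with equality for the identical pair.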

B. Distance Measures
A DM $D(a, b) \to \mathbb{R}^{+}$ is a real-valued function that determines how far apart two objects $a$ and $b$ are. A DM is a metric when it satisfies the following properties for sets $a$, $b$, and $c$ [30].
[B1] Nonnegativity: $D(a, b) \geq 0$.
The Jaccard DM is complementary to the Jaccard SM and is a distance metric [31]. It is simply obtained by $D_J(a, b) = 1 - S_J(a, b)$. The Dice DM is also obtained by subtracting the Dice SM from 1, i.e., $D_D(a, b) = 1 - S_D(a, b)$. However, it is not a distance metric, but is often referred to as a semimetric as it satisfies all of the preceding properties except the triangle inequality [31]. Note that a DM is often within the range [0, 1] (0 for identical objects and 1 for dissimilar objects) when it is derived from an SM bounded by [0, 1].
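The difference between the two complement-based DMs can be illustrated with a small Python sketch. It assumes intervals are `(lower, upper)` tuples, uses hypothetical function names, and relies on three intervals chosen purely for illustration: two adjacent unit intervals and the interval spanning both. For this particular triple, the Dice DM violates the triangle inequality while the Jaccard DM satisfies it.

```python
def width(interval):
    """Width of a closed interval given as a (lower, upper) tuple."""
    return interval[1] - interval[0]


def intersection_width(a, b):
    """Width of the overlap of a and b; 0.0 when they do not overlap."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def jaccard_distance(a, b):
    """Complement of the Jaccard SM."""
    inter = intersection_width(a, b)
    return 1.0 - inter / (width(a) + width(b) - inter)


def dice_distance(a, b):
    """Complement of the Dice SM."""
    return 1.0 - 2.0 * intersection_width(a, b) / (width(a) + width(b))


# Assumed example: a and b only touch at a point, c covers both.
a, b, c = (0.0, 1.0), (1.0, 2.0), (0.0, 2.0)

for name, dist in (("Jaccard", jaccard_distance), ("Dice", dice_distance)):
    direct = dist(a, b)
    via_c = dist(a, c) + dist(c, b)
    print(name, "direct:", direct, "via c:", via_c, "triangle holds:", direct <= via_c)
```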

C. Subsethood
Subsethood between two crisp sets $a$ and $b$ is a relation that indicates the degree to which $a$ is a subset of $b$ [24]. It is defined as
$$S_h(a, b) = \frac{|a \cap b|}{|a|} \tag{10}$$
where $|a \cap b|$ is the cardinality of the intersection of sets $a$ and $b$, and $|a|$ is the cardinality of set $a$ (see Footnote 4). From (10), it is clear that subsethood is smaller when more elements of set $a$ are not part of set $b$ and is larger when more elements of set $a$ are part of set $b$. In general, (10) is bounded on the interval [0, 1], where 1 means that $a$ is a subset of $b$ ($a \subseteq b$) and 0 means $a$ is not a subset of $b$.
In a similar manner, the degree of subsethood of two closed intervals $a$ and $b$ can be defined as
$$S_h(a, b) = \frac{|a \cap b|}{|a|} \tag{11}$$
where $|a \cap b|$ is the size of the intersection between $a$ and $b$ and $|a| \neq 0$ (see Footnote 5).
A binary notion of subsethood for fuzzy sets is first defined in [21], where for two fuzzy sets $A$ and $B$ on the universe $X$
$$A \subseteq B \iff \mu_A(x) \leq \mu_B(x) \quad \forall x \in X.$$
This definition is inherently crisp ($A$ is or is not a subset of $B$), which is incoherent with respect to fuzzy set theory. Hence, many alternatives for fuzzy subsethood have been introduced. Using the set-theoretic approach, the degree to which $A$ is a subset of $B$ on a finite $X$ is defined as [33]
$$S_h(A, B) = \frac{\sum_{i=1}^{n} \min\big(\mu_A(x_i), \mu_B(x_i)\big)}{\sum_{i=1}^{n} \mu_A(x_i)}$$
where the numerator is a measure of the cardinality of the intersection of the membership functions of $A$ and $B$, and $\sum_{i=1}^{n} \mu_A(x_i)$ is a measure of the cardinality of $A$. Alternatively, fuzzy subsethood can be expressed using a fuzzy implication operator $I$.
Furthermore, fuzzy subsethood is characterized by different axiomatizations. Initially, four axioms are proposed to define it as a binary fuzzy relation in [35]. Subsequently, Sinha and Dougherty [36] offer another set of axioms for fuzzy subsethood. We note that these axioms (except the first two axioms) are equivalent to those in [35]. However, in [37], it is argued that some fuzzy subsethoods do not maintain all axioms, and new ones are proposed for fuzzy sets $A, B, C \in X$; these involve the fuzzy set $P$ with $\mu_P(x) = \frac{1}{2}$ and the complement $A^c$ of $A$. These axioms are also considered in [38] for defining new fuzzy subsethoods and their application to cluster validity. In addition, axioms [C2] and [C3] are altered in [39] to introduce a new fuzzy DI-subsethood obtained by aggregating implication functions.
Footnote 4: Equation (10) can also be derived from Tversky's ratio model [1] by setting $\alpha = 1$ and $\beta = 0$.
Footnote 5: It is noted that subsethood is also known as inclusion [32] and overlapping ratio [9] as it captures the overlapping ratio between intervals.
In many cases, fuzzy subsethood axioms are extended for interval-valued, intuitionistic, and type-2 fuzzy sets. Specifically, in [40]-[42], key axioms for interval-valued fuzzy set subsethood are presented. Furthermore, in [43], an interval-valued fuzzy strong S-subsethood measure is defined by aggregating implication functions. The extension of fuzzy subsethood for intuitionistic [44] and type-2 [45] fuzzy sets is not discussed here.
Remark 1: In this article, we build on the existing literature on fuzzy subsethood, particularly its key axioms, to develop the main properties of subsethood for intervals (see Appendix).
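As a hedged illustration of the subsethood notions reviewed in this subsection, the sketch below implements the interval form (size of the intersection divided by the size of the first interval) and the set-theoretic fuzzy form on a finite universe (sum of pointwise minima divided by the sum of the memberships of the first set). Intervals are assumed to be `(lower, upper)` tuples, fuzzy sets are assumed to be lists of membership grades over a shared universe, and all names and example values are hypothetical.

```python
def width(interval):
    """Width of a closed interval given as a (lower, upper) tuple."""
    return interval[1] - interval[0]


def intersection_width(a, b):
    """Width of the overlap of a and b; 0.0 when they do not overlap."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def subsethood(a, b):
    """Degree to which interval a is a subset of interval b."""
    return intersection_width(a, b) / width(a)


def fuzzy_subsethood(mu_a, mu_b):
    """Set-theoretic degree to which fuzzy set A is a subset of fuzzy set B.

    mu_a and mu_b hold membership grades over the same finite universe.
    """
    overlap = sum(min(x, y) for x, y in zip(mu_a, mu_b))
    return overlap / sum(mu_a)


# Assumed example: a lies entirely inside b, so the relation is asymmetric.
a, b = (1.0, 2.0), (0.0, 4.0)
print(subsethood(a, b), subsethood(b, a))

# Assumed membership grades on a three-element universe.
print(fuzzy_subsethood([0.2, 0.5, 1.0], [0.4, 0.5, 0.6]))
```

The asymmetry of the two interval values is what the bidirectional measure later in the article combines into a single symmetric similarity.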

D. Interaction of Subsethood and SMs for Fuzzy Sets
The relationship between subsethood and SMs has been explored for fuzzy sets. In particular, Zeng and Li [46] and Li et al. [32] establish interchangeability between fuzzy subsethood and fuzzy set SMs based on their axiomatic definitions. Zeng and Guo [41] also study a similar relationship for interval-valued fuzzy sets. Furthermore, a set of axioms is proposed in [47] to define the properties of a fuzzy set SM, which has been used extensively in earlier studies. In addition, three different SMs for fuzzy values are proposed and compared in [48], where the key properties coincide with the axioms defined in [47]. Besides, in [49] and [50], the same set of axioms is considered to design fuzzy set SMs for comparing images with a restricted equivalence function [51].
Remark 2: A variety of works have focused on definitions and extensions of subsethood across the fuzzy set literature, including for type-1 [33], interval-valued [40], and type-2 [45] fuzzy sets. In this article, we focus on SMs for intervals rather than fuzzy sets, as intervals provide the direct underpinnings for a straightforward subsequent extension (via the alpha-cut decomposition representation) to fuzzy sets.
In the next section, we first introduce an expanded axiomatic definition for robust SMs, followed by proposing a new SM for closed intervals that fulfills said definition and provides robust comparison of intervals.

III. TOWARD ROBUST SET-THEORETIC SMS
As outlined in Section I, aliasing poses a so far undiscussed challenge to the robust comparison of intervals. In order to address this and move toward robust SMs, we first put forward an axiomatic definition for robust SMs for (closed) intervals. Then, building on this, in Section III-B, we define a novel set-theoretic SM for intervals [9] based on their bidirectional subsethood, showing that it follows the axiomatic definition put forward. From this SM, we derive a new DM for intervals along with a proof showing that the measure is a metric. We also define the properties of subsethood for intervals with respect to the wider literature; for conciseness, this is provided in the Appendix.

A. Expanded Axiomatic Definition for Robust SMs on Closed Intervals
A real-valued function S : a × b → [0, 1] is defined as an SM for (closed) intervals if it maintains the following axioms.
The aliasing example discussed in Section I calls for the axiom [P6]. Axiom [P7] is presented for intervals when one interval is a proper subset of the other. Here, the similarity should be less than 1 as the intervals are not equal. Axiom [P8] is introduced for intervals that are scaled up by a factor. In this case, the ratio of overlapping and nonoverlapping segments stays the same while scaling; thus, the similarity between the intervals should remain constant. Finally, axiom [P9] suggests higher similarity for equal-sized intervals when their overlap increases.

B. SM Based on Bidirectional Subsethood
As noted in Section I, the motivation behind the new set-theoretic SM is to establish an SM that is sensitive to potentially (very) different cardinalities of the sets being compared, in particular, when one is a subset of the other. Thus, the proposed SM takes into consideration the reciprocal subsethood (i.e., the overlapping ratio) of both sets/intervals being compared, in order to estimate their overall similarity.
Definition 1: The bidirectional subsethood based SM $S_{S_h}$ for a pair of intervals $a$ and $b$ is the t-norm of their reciprocal subsethoods $S_h(a, b)$ and $S_h(b, a)$, i.e.,
$$S_{S_h}(a, b) = \star\big(S_h(a, b), S_h(b, a)\big) \tag{13}$$
where $\star$ is a t-norm.
In this article, we use $\star$ to refer to either the minimum t-norm or the product t-norm, and use "∧" and "Π" to indicate the minimum and product t-norms, respectively. In the future, we will explore other t-norms beyond minimum and product.
Theorem 1: Consider a minimum (∧) or product (Π) t-norm $\star$ and the subsethood for intervals $S_h$. Then, $S_{S_h}(a, b) = \star\big(S_h(a, b), S_h(b, a)\big)$ is an SM for intervals $a$ and $b$ satisfying all axioms [P1]-[P9].
Proof: The proofs are given per axiom in Section III-C.
Similar to (4) and (9), we can rewrite (13) as
$$S_{S_h}(a, b) = \star\left(\frac{|a \cap b|}{|a \cap b| + |a \setminus b|}, \frac{|a \cap b|}{|a \cap b| + |b \setminus a|}\right) \tag{14}$$
where $|a \setminus b|$ is the size of the nonoverlapping segment(s) of $a$ with respect to $b$ and vice versa for $|b \setminus a|$. Also, $|a| \neq 0$ and $|b| \neq 0$.
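Definition 1 translates directly into code. The following Python sketch is an illustration under stated assumptions rather than a reference implementation: intervals are `(lower, upper)` tuples of nonzero width, the t-norm is passed as a two-argument function (the built-in `min` for the minimum t-norm and `operator.mul` for the product t-norm), and the names `subsethood` and `similarity` as well as the example intervals are hypothetical.

```python
from operator import mul


def width(interval):
    """Width of a closed interval given as a (lower, upper) tuple."""
    return interval[1] - interval[0]


def intersection_width(a, b):
    """Width of the overlap of a and b; 0.0 when they do not overlap."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def subsethood(a, b):
    """Degree to which interval a is a subset of interval b."""
    return intersection_width(a, b) / width(a)


def similarity(a, b, t_norm=min):
    """Bidirectional subsethood based SM: t-norm of both subsethoods."""
    return t_norm(subsethood(a, b), subsethood(b, a))


# Assumed examples: a partially overlapping pair and a nested pair.
partial = ((0.0, 2.0), (1.0, 3.0))
nested = ((1.0, 2.0), (0.0, 4.0))
for a, b in (partial, nested):
    print(a, b, "min:", similarity(a, b), "product:", similarity(a, b, mul))
```

For the nested pair, one subsethood equals 1, so both t-norms return the same value; for the partially overlapping pair, the product t-norm yields the smaller result.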

C. Properties of the Proposed SM
We now explore the properties of the proposed bidirectional subsethood based SM $S_{S_h}(a, b)$.
Theorem 2 (Boundedness): $0 \leq S_{S_h}(a, b) \leq 1$.
Proof: All t-norms are bounded by 0 and 1 [52]. This is also true for the subsethood $S_h$ (see Appendix). Consequently, $S_{S_h}(a, b)$ is always within [0, 1], thus addressing the axiom [P1].

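Theorem 9 (Scaling-invariance): For two interval pairs $\{a_1, b_1\}$ and $\{a_2, b_2\}$ with $a_2 = n \times a_1$ and $b_2 = n \times b_1$,
$$S_{S_h}(a_1, b_1) = S_{S_h}(a_2, b_2)$$
where $n > 0$ is a scaling factor.
Proof: For the pair $\{a_1, b_1\}$,
$$S_{S_h}(a_1, b_1) = \star\left(\frac{|a_1 \cap b_1|}{|a_1|}, \frac{|a_1 \cap b_1|}{|b_1|}\right).$$
Again, for the pair $\{a_2, b_2\}$,
$$S_{S_h}(a_2, b_2) = \star\left(\frac{|a_2 \cap b_2|}{|a_2|}, \frac{|a_2 \cap b_2|}{|b_2|}\right).$$
Given that $a_2 = n \times a_1$ and $b_2 = n \times b_1$, we have $|a_2| = n|a_1|$, $|b_2| = n|b_1|$, and $|a_2 \cap b_2| = n|a_1 \cap b_1|$. The factor $n$ therefore cancels in both subsethood ratios, so that $S_{S_h}(a_2, b_2) = S_{S_h}(a_1, b_1)$. Here, the new SM follows the axiom [P8].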
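Scaling invariance can be checked numerically with a short Python sketch. It is an illustration only: intervals are assumed to be `(lower, upper)` tuples, scaling multiplies both endpoints by a positive factor, and the helper names, the base pair, and the factors are hypothetical choices.

```python
import math
from operator import mul


def width(interval):
    """Width of a closed interval given as a (lower, upper) tuple."""
    return interval[1] - interval[0]


def intersection_width(a, b):
    """Width of the overlap of a and b; 0.0 when they do not overlap."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def similarity(a, b, t_norm=min):
    """Bidirectional subsethood based SM for a given t-norm."""
    inter = intersection_width(a, b)
    return t_norm(inter / width(a), inter / width(b))


def scale(interval, n):
    """Multiply both endpoints of an interval by the factor n > 0."""
    return (n * interval[0], n * interval[1])


# Assumed base pair and scaling factors.
a, b = (1.0, 3.0), (2.0, 6.0)
for n in (2, 3, 4, 5):
    scaled_a, scaled_b = scale(a, n), scale(b, n)
    for t_norm in (min, mul):
        same = math.isclose(similarity(a, b, t_norm), similarity(scaled_a, scaled_b, t_norm))
        print(n, t_norm.__name__, same)
```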

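Theorem 11: $S_{S_h}(a, b)$ is bounded by the Jaccard and Dice SMs when $\star$ is the minimum (∧) t-norm. That is,
$$S_J(a, b) \leq S_{S_h}^{\wedge}(a, b) \leq S_D(a, b).$$
Proof: For the interval pair $\{a, b\}$, consider the formulations of the SMs at (4), (9), and (14).
To prove this theorem, we consider four cases: 1) $a = b$; 2) $a \cap b = \emptyset$; 3) $a \subset b$; and 4) $a \cap b \neq \emptyset$ with $a \not\subset b$ and $b \not\subset a$.
Case 1: If $a = b$, then all three measures yield a similarity of 1.
Case 2: If $a \cap b = \emptyset$, then $|a \cap b| = 0$ and all three measures yield a similarity of 0.
Case 3: If $a \subset b$, there is no nonoverlapping segment of $a$; hence, $|a \setminus b| = 0$. Conversely, there is a nonoverlapping segment of $b$ with respect to $a$; thus, $|b \setminus a| \neq 0$. In this case, the three SMs can be simplified to
$$S_J(a, b) = \frac{|a \cap b|}{|a \cap b| + |b \setminus a|}, \quad S_{S_h}^{\wedge}(a, b) = \frac{|a \cap b|}{|a \cap b| + |b \setminus a|}, \quad S_D(a, b) = \frac{|a \cap b|}{|a \cap b| + \frac{1}{2}|b \setminus a|}$$
so that $S_J(a, b) = S_{S_h}^{\wedge}(a, b) \leq S_D(a, b)$.
Case 4: If $a \cap b \neq \emptyset$, $a \not\subset b$, and $b \not\subset a$ ($a$ and $b$ are partially overlapping), then assume the case $|a| \leq |b|$. It implies that $|a| - |a \cap b| \leq |b| - |a \cap b| \Rightarrow |a \setminus b| \leq |b \setminus a|$. The three SMs are
$$S_J(a, b) = \frac{|a \cap b|}{|a \cap b| + |a \setminus b| + |b \setminus a|}, \quad S_{S_h}^{\wedge}(a, b) = \frac{|a \cap b|}{|a \cap b| + |b \setminus a|}, \quad S_D(a, b) = \frac{|a \cap b|}{|a \cap b| + \frac{1}{2}\big(|a \setminus b| + |b \setminus a|\big)}.$$
It is true that $S_J(a, b) \leq S_{S_h}^{\wedge}(a, b)$, since $|a \setminus b| \geq 0$. Again, it is clear that $S_{S_h}^{\wedge}(a, b) \leq S_D(a, b)$, as $\frac{1}{2}\big(|a \setminus b| + |b \setminus a|\big) \leq |b \setminus a|$. Hence, $S_J(a, b) \leq S_{S_h}^{\wedge}(a, b) \leq S_D(a, b)$. Note that for the case $|b| \leq |a|$, the same procedure can be used to prove the aforementioned relation.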
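The bound of Theorem 11 can also be checked empirically. The Python sketch below enumerates all intervals with integer endpoints between 0 and 5 (an arbitrary grid chosen for illustration) and counts the pairs for which the proposed SM with the minimum t-norm falls outside the range spanned by the Jaccard and Dice SMs. The function names and the small numerical tolerance are assumptions of this sketch.

```python
from itertools import combinations, product


def width(interval):
    """Width of a closed interval given as a (lower, upper) tuple."""
    return interval[1] - interval[0]


def intersection_width(a, b):
    """Width of the overlap of a and b; 0.0 when they do not overlap."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def jaccard(a, b):
    inter = intersection_width(a, b)
    return inter / (width(a) + width(b) - inter)


def dice(a, b):
    return 2.0 * intersection_width(a, b) / (width(a) + width(b))


def similarity_min(a, b):
    """Bidirectional subsethood based SM with the minimum t-norm."""
    inter = intersection_width(a, b)
    return min(inter / width(a), inter / width(b))


# All intervals (lower, upper) with lower < upper on an assumed integer grid.
intervals = list(combinations(range(6), 2))
tolerance = 1e-12
violations = [
    (a, b)
    for a, b in product(intervals, repeat=2)
    if not (jaccard(a, b) - tolerance <= similarity_min(a, b) <= dice(a, b) + tolerance)
]
print("pairs checked:", len(intervals) ** 2, "violations:", len(violations))
```

A grid check of this kind does not replace the proof, but it is a quick sanity test when reimplementing the measures.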

D. DM Based on Bidirectional Subsethood
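A new DM $D_{S_h}(a, b)$ can easily be derived from the $S_{S_h}$ measure at (13) by taking its complement, capturing the dissimilarity between both intervals
$$D_{S_h}(a, b) = 1 - S_{S_h}(a, b) = 1 - \star\big(S_h(a, b), S_h(b, a)\big). \tag{15}$$
Alternatively, using (14), (15) can be written as
$$D_{S_h}(a, b) = 1 - \star\left(\frac{|a \cap b|}{|a \cap b| + |a \setminus b|}, \frac{|a \cap b|}{|a \cap b| + |b \setminus a|}\right).$$
Note that this alternative form of the proposed $D_{S_h}(a, b)$ measure can now directly be used in pattern recognition problems with sets/intervals, such as classification and clustering. We next discuss the essential properties of the $D_{S_h}(a, b)$ measure in terms of it being a metric.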
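Theorem 12: $D_{S_h}(a, b)$ is a metric.
Proof: To prove that the DM $D_{S_h}(a, b)$ is a metric, we need to show that it satisfies the following properties for the intervals $a$, $b$, and $c$, where $\star$ is the minimum (∧) or product (Π) t-norm.
(a) $D_{S_h}(a, b) \geq 0$.
(b) $D_{S_h}(a, b) = 0$ if and only if $a = b$.
(c) $D_{S_h}(a, b) = D_{S_h}(b, a)$.
(d) $D_{S_h}(a, c) \leq D_{S_h}(a, b) + D_{S_h}(b, c)$.
We provide proofs for all the aforementioned properties (a)-(d) in the following. If $a = b$, then $S_{S_h}(a, b)$ is always 1, so that $D_{S_h}(a, b) = 0$. For the triangle inequality over the intervals $a$, $b$, and $c$, we consider $a \subseteq b \subseteq c$. It implies that $|a| \leq |b| \leq |c|$. To prove this property, we apply the formulation of the $S_{S_h}$ measure at (14).
Case 1: When all three intervals are equal ($a = b = c$), $D_{S_h}(a, b) = D_{S_h}(b, c) = D_{S_h}(a, c) = 0$, thus satisfying the triangle inequality.
Case 2: When $a \subset b \subset c$, it implies that $|a| < |b| < |c|$. Now, we can calculate the similarity and distance for each pair of intervals by applying (14).
(i) For $a \subset b$: $S_{S_h}(a, b) = \frac{|a|}{|b|}$ and $D_{S_h}(a, b) = 1 - \frac{|a|}{|b|}$.
(ii) For $b \subset c$: $S_{S_h}(b, c) = \frac{|b|}{|c|}$ and $D_{S_h}(b, c) = 1 - \frac{|b|}{|c|}$.
(iii) For $a \subset c$ (by the transitive property of subsets [15]): $S_{S_h}(a, c) = \frac{|a|}{|c|}$ and $D_{S_h}(a, c) = 1 - \frac{|a|}{|c|}$.
Now, we have to show that
$$D_{S_h}(a, c) \leq D_{S_h}(a, b) + D_{S_h}(b, c).$$
By placing the distance of each pair of intervals in the aforementioned inequality, we get the following:
$$1 - \frac{|a|}{|c|} \leq \left(1 - \frac{|a|}{|b|}\right) + \left(1 - \frac{|b|}{|c|}\right)$$
which rearranges to $|a|\,(|c| - |b|) \leq |b|\,(|c| - |b|)$. It is true as $|a| < |b|$ and $|b| < |c|$, thereby satisfying the triangle inequality.
Example satisfying triangle inequality: Consider three intervals $a$, $b$, and $c$ in Fig. 2 where $b, c \subset a$ and $b \cap c = \emptyset$. Table II presents the similarity and distance between each pair of the three intervals using the $S_{S_h}$ and $D_{S_h}$ measures, respectively, with the minimum (∧) and product (Π) t-norms. Note that the same similarity and distance results are obtained using both t-norms. From the distance results, it is observed that for the interval pair $\{a, c\}$, the distance (0.58) is less than the summed distances of the pairs $\{a, b\}$ and $\{b, c\}$. This relation is also maintained for the interval pairs $\{a, b\}$ and $\{b, c\}$, thereby demonstrating that the $D_{S_h}$ measure meets the triangle inequality.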
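Theorem 13: $D_{S_h}(a, b)$ follows the property of transitivity. That is, if $a \subseteq b \subseteq c$, then $D_{S_h}(a, b) \leq D_{S_h}(a, c)$.
Proof: For $a \subseteq b \subseteq c$, applying (14) gives $S_{S_h}(a, b) = \frac{|a|}{|b|}$ and $S_{S_h}(a, c) = \frac{|a|}{|c|}$. Thus, $D_{S_h}(a, b) = 1 - \frac{|a|}{|b|}$ and $D_{S_h}(a, c) = 1 - \frac{|a|}{|c|}$. As $b \subseteq c$, it follows that $|b| \leq |c|$. Therefore, $\frac{|a|}{|b|} \geq \frac{|a|}{|c|}$, which implies that $D_{S_h}(a, b) \leq D_{S_h}(a, c)$.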
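The distance measure and the properties discussed in this subsection can be exercised with a small Python sketch. It is a hedged illustration: intervals are assumed to be `(lower, upper)` tuples, the function names are hypothetical, and the two triples of intervals (a nested chain, and two disjoint intervals contained in a third) are chosen for illustration rather than taken from the article's figures or tables.

```python
from operator import mul


def width(interval):
    """Width of a closed interval given as a (lower, upper) tuple."""
    return interval[1] - interval[0]


def intersection_width(a, b):
    """Width of the overlap of a and b; 0.0 when they do not overlap."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def distance(a, b, t_norm=min):
    """Complement of the bidirectional subsethood based SM."""
    inter = intersection_width(a, b)
    return 1.0 - t_norm(inter / width(a), inter / width(b))


def triangle_holds(x, y, z, t_norm=min, tolerance=1e-12):
    """Check the triangle inequality for every ordering of three intervals."""
    d_xy, d_yz, d_xz = distance(x, y, t_norm), distance(y, z, t_norm), distance(x, z, t_norm)
    return (
        d_xz <= d_xy + d_yz + tolerance
        and d_xy <= d_xz + d_yz + tolerance
        and d_yz <= d_xy + d_xz + tolerance
    )


# Assumed nested chain: a inside b inside c.
a, b, c = (2.0, 3.0), (1.0, 4.0), (0.0, 6.0)
# Assumed configuration with two disjoint intervals q and r inside p.
p, q, r = (0.0, 10.0), (1.0, 3.0), (5.0, 9.0)

for t_norm in (min, mul):
    print(t_norm.__name__, triangle_holds(a, b, c, t_norm), triangle_holds(p, q, r, t_norm))
    # Transitivity: for the nested chain, a is at least as far from c as from b.
    print(t_norm.__name__, distance(a, b, t_norm) <= distance(a, c, t_norm))
```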

IV. DEMONSTRATION AND ANALYSIS
We now demonstrate and analyze the behavior of the proposed SM $S_{S_h}$ with the minimum (∧) and product (Π) t-norms in the context of the $S_J$ and $S_D$ SMs for a set of synthetic examples in the first part and with a real dataset in the last part. Here, we alter different features of interval pairs to explore how well these three measures perform or follow intuitive results. In particular, we focus on the following aspects.
1) Propensity to exhibit aliasing in response to variations in interval sizes.
2) Behavior with respect to intervals where one is a complete subset of the other.
3) Behavior with respect to intervals of equal sizes and overlapping ratio.
4) Response to variations in interval size while maintaining the same level of subsethood.
5) Response to a linear increase in the overlap of intervals.

A. Synthetic Dataset Based Demonstration
For each of the aforementioned cases, a series of synthetic intervals is proposed and visualized.
1) Experiment on Aliasing Propensity: In Fig. 3, four different pairs of intervals $\{a, b\}$ are considered where all pairs have an intersection of equal size. The similarity results for the pairs using the three SMs are shown in Table III. The $S_J$ and $S_D$ measures are subject to aliasing, providing the same similarity of 0.15 and 0.26, respectively, for all pairs.
Indeed, both measures provide, unexpectedly, identical similarities for pairs of intervals when the size of the union of their nonoverlapping segments remains constant. On the contrary, the $S_{S_h}$ measure with both minimum and product t-norms ($S_{S_h}^{\wedge}$ and $S_{S_h}^{\Pi}$) yields a different degree of similarity for all cases, thereby exhibiting its aliasing-free and robust behavior. The reason is that, in contrast to the $S_J$ and $S_D$ measures, the $S_{S_h}$ measure captures the changes in the size of both input intervals, which affect their reciprocal subsethood and thus the overall similarity. Note that, as shown in Theorem 11, the results of the $S_{S_h}^{\wedge}$ measure are bounded by the $S_J$ and $S_D$ measures.
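The aliasing effect described above can be reproduced with a short Python sketch. The three interval pairs below are assumptions chosen for illustration (they are not the pairs of Fig. 3): each pair has an intersection of size 1 and nonoverlapping segments of total size 4, but the individual interval widths differ. All function names are hypothetical.

```python
from operator import mul


def width(interval):
    """Width of a closed interval given as a (lower, upper) tuple."""
    return interval[1] - interval[0]


def intersection_width(a, b):
    """Width of the overlap of a and b; 0.0 when they do not overlap."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def jaccard(a, b):
    inter = intersection_width(a, b)
    return inter / (width(a) + width(b) - inter)


def dice(a, b):
    return 2.0 * intersection_width(a, b) / (width(a) + width(b))


def similarity(a, b, t_norm=min):
    """Bidirectional subsethood based SM for a given t-norm."""
    inter = intersection_width(a, b)
    return t_norm(inter / width(a), inter / width(b))


# Assumed pairs: same intersection and union sizes, different widths.
pairs = [
    ((0.0, 3.0), (2.0, 5.0)),
    ((0.0, 4.0), (3.0, 5.0)),
    ((0.0, 5.0), (4.0, 5.0)),
]
for a, b in pairs:
    print(
        a, b,
        round(jaccard(a, b), 3),
        round(dice(a, b), 3),
        round(similarity(a, b), 3),
        round(similarity(a, b, mul), 3),
    )
```

For these pairs, the Jaccard and Dice columns are constant, whereas both variants of the subsethood based SM return a different value for each pair.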
2) Experiment With Interval Pairs When One Interval is a Complete Subset of the Other: Two separate cases are considered with different sets of interval pairs, where in both cases one interval is a complete subset of the other: 1) increasing the degree of overlap between both intervals by increasing the size of the smaller interval and 2) decreasing the degree of overlap by increasing the size of the larger interval. Fig. 4(a) shows five interval pairs with $b \subset a$, where $b$ covers 10%, 20%, 30%, 40%, and 50% of $a$. On the contrary, in Fig. 4(b), the degree of overlap is decreased by increasing the size of the larger interval. Intuitively, their mutual similarity should be at most $\frac{|a \cap b|}{|a|}$ for each pair. From the results, we see that both the $S_{S_h}$ (with minimum and product t-norms) and $S_J$ measures perform according to the intuition, whereas the $S_D$ measure exceeds this expected limit.
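The subset case can be sketched in Python as follows. The endpoints are assumptions made for illustration: the larger interval is fixed at width 10 and the smaller interval covers 10% to 50% of it, loosely mirroring the setup described for Fig. 4(a). The function names are hypothetical.

```python
from operator import mul


def width(interval):
    """Width of a closed interval given as a (lower, upper) tuple."""
    return interval[1] - interval[0]


def intersection_width(a, b):
    """Width of the overlap of a and b; 0.0 when they do not overlap."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def jaccard(a, b):
    inter = intersection_width(a, b)
    return inter / (width(a) + width(b) - inter)


def dice(a, b):
    return 2.0 * intersection_width(a, b) / (width(a) + width(b))


def similarity(a, b, t_norm=min):
    """Bidirectional subsethood based SM for a given t-norm."""
    inter = intersection_width(a, b)
    return t_norm(inter / width(a), inter / width(b))


a = (0.0, 10.0)  # assumed larger interval
rows = []
for k in range(1, 6):
    b = (0.0, float(k))  # b is a subset of a covering k * 10% of it
    limit = intersection_width(a, b) / width(a)  # expected upper limit
    rows.append((limit, jaccard(a, b), dice(a, b), similarity(a, b), similarity(a, b, mul)))
    print(rows[-1])
```

In this setup, the Jaccard SM and both variants of the subsethood based SM coincide with the covered fraction, while the Dice SM lies above it.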

3) Experiment With Interval Pairs of Equal Size and Equal Overlapping Ratio: In Fig. 5, five interval pairs are shown where the intervals are of equal size and the size of their intersection is varied to 10%, 20%, 30%, 40%, and 50% of their size. Table V provides the results for all pairs using the three SMs. In each pair, the subsethood is equal in both directions, and it is intuitive to expect the similarity to be the same as this subsethood (as both intervals are of equal size). In this case, the $S_{S_h}$ with minimum t-norm ($S_{S_h}^{\wedge}$) and $S_D$ measures follow the intuition, whereas the $S_{S_h}$ with product t-norm ($S_{S_h}^{\Pi}$) and $S_J$ measures do not.
For the fourth aspect (response to variations in interval size while maintaining the same level of subsethood), interval pairs are considered as in Fig. 6, where both endpoints of $a$ and $b$ are gradually multiplied by a factor $n \in \{2, 3, 4, 5\}$ to generate new interval pairs. The degree of subsethood stays the same across the pairs. Adapting the definition from [53], an SM is invariant if its similarity output remains constant regardless of scaling the interval endpoints by a factor. Table VI shows the similarity for all pairs using the three SMs, where $n$ is the factor applied to the interval endpoints. The results demonstrate the scaling invariance property for the given pairs of intervals for all SMs.

5) Experiment on Linearly Increased Overlap: Adapting the definition from [53], an SM on intervals is linear if its similarity output varies linearly with a linear change in the size of the overlap/intersection of the intervals. In Fig. 7(a), the overlap between two intervals of equal size is gradually increased in 10% steps. The corresponding similarity outputs for the pairs and all three SMs are shown graphically in Fig. 7(b). Results show that all three SMs provide higher similarity for increased overlap. Particularly, the $S_{S_h}$ with minimum t-norm ($S_{S_h}^{\wedge}$) and $S_D$ measures exhibit linearity in the similarity results, whereas the $S_{S_h}$ with product t-norm ($S_{S_h}^{\Pi}$) and $S_J$ measures display convexity (similarity increases increasingly rapidly with an increase in the size of the intersection).
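The linear and convex responses can be checked with a short Python sketch based on second differences: for equally spaced overlap steps, a linear curve has second differences of zero and a convex curve has positive ones. The setup is an assumption made for illustration (two intervals of width 10 whose overlap grows from 0% to 100% in 10% steps), and all names are hypothetical.

```python
def measures(a, b):
    """Return the four similarity values for intervals a and b."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    width_a, width_b = a[1] - a[0], b[1] - b[0]
    s_ab, s_ba = inter / width_a, inter / width_b
    return {
        "jaccard": inter / (width_a + width_b - inter),
        "dice": 2.0 * inter / (width_a + width_b),
        "min": min(s_ab, s_ba),
        "product": s_ab * s_ba,
    }


def second_differences(values):
    """Discrete second differences of an equally spaced sequence."""
    return [values[i + 1] - 2.0 * values[i] + values[i - 1] for i in range(1, len(values) - 1)]


a = (0.0, 10.0)  # assumed fixed interval of width 10
curves = {name: [] for name in ("jaccard", "dice", "min", "product")}
for k in range(11):
    b = (10.0 - k, 20.0 - k)  # same width, overlapping a by k units
    for name, value in measures(a, b).items():
        curves[name].append(value)

for name, values in curves.items():
    diffs = second_differences(values)
    print(name, "linear" if all(abs(d) < 1e-9 for d in diffs) else "nonlinear")
```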
In summary, case 1 shows that the $S_J$ and $S_D$ measures are subject to aliasing and return the same similarity for very different interval pairs whenever the size of the union of their nonoverlapping segments remains constant (regardless of any changes in their size). In case 2, when one interval is a complete subset of the other, proportionately increasing or decreasing their degree of overlap results in the $S_{S_h}$ and $S_J$ measures returning sensible results, whereas the $S_D$ measure overestimates the similarity. In case 3, the $S_{S_h}$ with minimum t-norm ($S_{S_h}^{\wedge}$) and $S_D$ measures meet the expectation when intervals are of equal size and their intersection is changed proportionately, whereas the $S_J$ measure underestimates the similarity. In case 4, we multiply the interval endpoints by a factor, maintaining the same degree of subsethood. Here, all three measures show scaling invariance in the results as expected. Finally, in case 5, for linear increases in the size of overlap between intervals of equal size, the $S_{S_h}$ with minimum t-norm ($S_{S_h}^{\wedge}$) and $S_D$ measures exhibit linearity in the similarity results, whereas the $S_{S_h}$ with product t-norm ($S_{S_h}^{\Pi}$) and $S_J$ measures display convexity. These experiments demonstrate that the proposed SM shows behavior in line with expectation, which is not the case for the other, commonly used measures.

B. Real-World Example
In this part, we have used a real-world dataset to review the behavior of the three SMs [S S h with minimum (∧) and product (Π) t-norms, S J , and S D ], particularly in respect to the aliasing issue. The dataset used for this demonstration is the temperature data of different standard areas (districts) of the UK for the year 2016 from the UK Met Office. The entire dataset is available in [54]. The map of the standard areas used by the UK Met Office is shown in Fig. 8  The intervals are constructed by taking the minimum and the maximum temperature of each area for every season, as shown in Table VII. We have applied the S S h [with minimum (∧) and product (Π) t-norms], S J and S D measures to estimate similarities between all possible pairs of the nine areas for all seasons; here, the results for autumn and spring seasons are reported, which are shown in Tables VIII-XI. We do not include results for other two seasons-winter and summer-as they follow the same pattern.
Regarding similarity based on seasonal temperature, one can expect higher similarity for pairs of areas geographically close to each other (i.e., located in the same latitude), whereas expecting lower similarity for remote areas as it is generally recognized that latitude (i.e., distance from the equator) is an important factor affecting variations in temperature. Hence, in this real-world example, the districts in the Northern UK (e.g., SE and SW) are likely to be more similar as they are in the South (e.g., ML and EA). By the same token, a district in the north will be highly dissimilar to a district in the south. Generally, all three SMs appear to meet up with such expectations. For instance, considering autumn temperatures between SN and the other eight areas from geographically close to far off, three SMs appear to generate higher to lower similarity as temperature-range differs more.
However, in several cases, the S J and S D measures suffer from aliasing, returning the same similarity for different ranges of temperature. As mentioned earlier, both measures provide identical similarity for pairs of areas as long as the sizes of the intersection and union of their temperature ranges remain constant. Tables VIII and IX show that they produce the same similarity (i.e., 0.7396 and 0.8503, respectively) for the pairs (EEN versus EA) and (ENW versus ESC) in spring, even though these pairs have different temperature ranges. Analogously, we also observe identical similarities for the pairs (SW versus ML) and (EA versus ENW), and for the pairs (ML versus ENW) and (ESW versus EEN), in autumn. In contrast, the S S h measure with minimum and product t-norms (i.e., S S h ∧ and S S h Π ) responds to the variation in temperature range and yields a distinct outcome for each of these pairs, as shown in Tables X and XI. This real-world example of the aliasing inherent to the S J and S D measures highlights the potential for misleading inference in practice. In particular, when clustering the areas with respect to temperature, some areas may be placed in the same group despite having different temperature ranges. This can be an issue when real-world interval data are compared for the purpose of grouping, ranking, and decision making. Therefore, an SM that avoids this aliasing, such as the proposed measure, is desirable.
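The kind of check performed on the temperature tables above can be sketched as a pairwise scan that flags pairs of areas sharing a Jaccard value while differing under the minimum t-norm variant of the proposed SM. The area labels and temperature ranges below are hypothetical placeholders, not the Met Office data, and the subsethood form (intersection width divided by the width of the first interval) is an assumption of this sketch.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical (min, max) seasonal temperatures in degrees Celsius.
# These are illustrative values only, NOT the UK Met Office data.
ranges = {
    "A": (2.0, 10.0),
    "B": (4.0, 12.0),
    "C": (3.0, 8.0),
    "D": (4.0, 7.0),
}


def overlap(a, b):
    """Width of the intersection of two closed intervals (0 if disjoint)."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def jaccard(a, b):
    inter = overlap(a, b)
    return inter / ((a[1] - a[0]) + (b[1] - b[0]) - inter)


def s_sh_min(a, b):
    """Assumed form: minimum of the two directional subsethood values."""
    inter = overlap(a, b)
    return min(inter / (a[1] - a[0]), inter / (b[1] - b[0]))


def aliased_groups(intervals):
    """Group area pairs by Jaccard value; keep groups the subsethood SM separates."""
    by_jaccard = defaultdict(list)
    for (name_a, a), (name_b, b) in combinations(intervals.items(), 2):
        key = round(jaccard(a, b), 6)
        by_jaccard[key].append(((name_a, name_b), round(s_sh_min(a, b), 6)))
    return {
        j: members
        for j, members in by_jaccard.items()
        if len({s for _, s in members}) > 1
    }


result = aliased_groups(ranges)
for j, members in result.items():
    print(f"Jaccard = {j}: {members}")
```

With these placeholder ranges, the pairs (A, B) and (C, D) share a Jaccard value of 0.6 but receive 0.75 and 0.6 under the minimum t-norm variant. The pairs (A, D) and (B, D) also share a Jaccard value, but they are not flagged because the interval widths involved are identical, so all measures agree.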

V. CONCLUSION
The contributions of this article center on the identification and articulation of potential shortcomings, specifically aliasing, affecting popular set-theoretic SMs. Such shortcomings can lead to misleading results, in turn affecting real-world applications dependent on the robust comparison of intervals. To address this limitation, this article developed the underpinnings of robust SMs by putting forward five new axioms that complement the traditional axioms associated with the most common SMs. Building on this set of nine axioms, this article established a new set-theoretic SM for intervals based on their bidirectional subsethood. The new SM was shown to avoid shortcomings, such as aliasing, while delivering intuitive results (e.g., being bounded above and below by the Jaccard and Dice SMs), facilitating its potential use in real-world applications. As part of the development of the new SM, the article also put forward the definition of subsethood for intervals and proofs of its mathematical properties (see Appendix). Finally, a corresponding dissimilarity or distance measure (DM) was derived from the new SM, which was also proven to be a metric.
At an experimental level, the article provided a detailed investigation contrasting the behavior of the proposed SM with that of the Jaccard and Dice SMs, using both synthetic and real-world interval-valued data. The analyses confirmed that the new measure exhibits desirable behavior while maintaining all essential features of an SM. In particular, the new SM is resilient to aliasing and provides desirable results with respect to all key features (e.g., linearity and scaling invariance), whereas popular SMs are shown to produce counter-intuitive results in some cases.
In the future, we plan to use this new measure for assessing similarity between discontinuous intervals and to develop corresponding extensions of the SM for type-1, interval-valued, and type-2 fuzzy sets using the α-cut decomposition representation. We also aim to explore it for capturing the mutual agreement of interval-valued evidence for aggregation as well as clustering and classification of real-world interval-valued datasets.

APPENDIX
DEFINITION OF SUBSETHOOD FOR CLOSED INTERVALS
Building on the axiomatic definitions of subsethood in the literature (see Section II-C), the key properties of subsethood for intervals are captured in the following theorem.