1. The Thurstone Model

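Note: This article is an excerpt, lightly reorganized, from my graduation research report "The Exploration of Pairwise Comparison in Football Application". It is intended for study and discussion only.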

Introduction

The method of paired comparisons, originally proposed by Thurstone [1], is a cornerstone of psychometrics and preference modeling. It provides a statistical framework for inferring psychological scales from pairwise choices or judgments.

Thurstone’s model is a method for sensory difference testing that ranks various stimuli on a perceptual scale for comparative analysis. In this model, a set of stimuli denoted $O_1, O_2, \ldots, O_n$ is compared quantitatively based on the sensations $X_i$ they evoke in individuals. These sensations are assumed to be normally distributed across a population, with each $X_i$ lying on a sensation continuum $S$.

The sensation continuum $S$ serves as the quantitative scale on which these sensations are placed, with each sensation $X_i$ having an associated mean value $S_i$. This mean $S_i$ represents the average perceived intensity or preference of stimulus $O_i$ across the population, situating each stimulus on the continuum according to the mean of the sensation it evokes.

The formulation below is a concise mathematical summary of Frederick Mosteller’s [2] treatment of the model, restating his discussion as a structured set of assumptions and equations.

Model Foundation

Let $X_i$ and $X_j$ be the sensations evoked by the $i$th and $j$th stimuli, respectively, in an individual $I$. Across a population of individuals, assume:

  1. $X_i$ and $X_j$ are jointly normally distributed with parameters:
    • Mean of $X_i$: $\mu(X_i) = S_i$ for $i = 1, 2, \ldots, n$.
    • Variance of $X_i$: $\mathrm{Var}(X_i) = \sigma^2(X_i) = \sigma_{X_i}^2$ for all $i = 1, 2, \ldots, n$, taken to be a common value $\sigma^2$.
  2. The correlation between $X_i$ and $X_j$ is denoted by $\rho(X_i, X_j) = \rho_{ij} = \rho$ and is assumed to be the same for all pairs $(i, j)$ where $i, j = 1, 2, \ldots, n$ and $i \neq j$.

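For each $i, j = 1, 2, \ldots, n$ with $i \neq j$, a multivariate normal (or Gaussian) distribution for the 2-dimensional random vector $(X_i, X_j)$ can be defined as follows,

$$(X_i, X_j) \sim N(\mu, \Sigma)$$

where the joint mean $\mu$ is

$$\mu = \begin{bmatrix} S_i \\ S_j \end{bmatrix}$$

and the joint covariance matrix $\Sigma$ is

$$\Sigma = \begin{bmatrix} \sigma^2 & \rho\sigma^2 \\ \rho\sigma^2 & \sigma^2 \end{bmatrix}$$

with $\sigma^2$ representing the common variance for all sensations and $\rho\sigma^2$ indicating the covariance between any two sensations $X_i$ and $X_j$, determined by the correlation coefficient $\rho$.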

Remark:
Thurstone’s original Case V assumed zero correlation ($\rho = 0$) among stimuli, but Mosteller [2] demonstrated that Case V also holds if all pairwise correlations are merely equal, i.e., $\rho_{ij} = \rho$ for all $i \neq j$. This relaxed condition is more reasonable in practical data analysis.


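To explain the Model Foundation further, recall that the covariance of two random variables $X_i$ and $X_j$ is defined as:

$$\mathrm{Cov}[X_i, X_j] = E[(X_i - \mu_i)(X_j - \mu_j)]$$

where $\mu_i = E[X_i]$ and $\mu_j = E[X_j]$.

The correlation coefficient between $X_i$ and $X_j$ is defined as:

$$\rho_{ij} = \frac{\mathrm{Cov}[X_i, X_j]}{\sqrt{\mathrm{Var}[X_i]\,\mathrm{Var}[X_j]}}$$

where $\mathrm{Var}[X_i] = \sigma_{X_i}^2$ and $\mathrm{Var}[X_j] = \sigma_{X_j}^2$.

Since $\mathrm{Var}[X_i] = \mathrm{Var}[X_j] = \sigma^2$, the denominator becomes

$$\sqrt{\mathrm{Var}[X_i]\,\mathrm{Var}[X_j]} = \sqrt{\sigma_{X_i}^2 \sigma_{X_j}^2} = \sigma^2$$

Thus,

$$\rho_{ij} = \frac{\mathrm{Cov}[X_i, X_j]}{\sigma^2}$$

so that $\mathrm{Cov}[X_i, X_j] = \rho\sigma^2$.

The joint covariance matrix is defined as:

$$\Sigma = \begin{bmatrix} \sigma_{X_i}^2 & \mathrm{Cov}[X_i, X_j] \\ \mathrm{Cov}[X_j, X_i] & \sigma_{X_j}^2 \end{bmatrix}$$

Because $\mathrm{Cov}[X_i, X_j] = \mathrm{Cov}[X_j, X_i] = \rho\sigma^2$ and both variances equal $\sigma^2$, this simplifies to the previous form.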
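The relationship between the correlation and the covariance matrix can be checked numerically. The following is a minimal sketch using only plain Python; the function names `covariance_matrix` and `correlation_from_covariance` are illustrative and do not come from the original report.

```python
import math


def covariance_matrix(sigma2, rho):
    """Build the 2x2 covariance matrix of (X_i, X_j).

    Assumes a common variance sigma2 and a common correlation rho,
    so the off-diagonal covariance is rho * sigma2.
    """
    cov = rho * sigma2
    return [[sigma2, cov], [cov, sigma2]]


def correlation_from_covariance(matrix):
    """Recover rho_ij = Cov[X_i, X_j] / sqrt(Var[X_i] * Var[X_j])."""
    var_i, cov_ij = matrix[0]
    _, var_j = matrix[1]
    return cov_ij / math.sqrt(var_i * var_j)


sigma_matrix = covariance_matrix(sigma2=0.2, rho=0.5)
recovered_rho = correlation_from_covariance(sigma_matrix)
print(sigma_matrix, recovered_rho)
```

Building the matrix from $(\sigma^2, \rho)$ and then recovering $\rho$ from it is a round trip, so the recovered value should equal the input correlation up to floating-point error.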


With the foundation in place, one issue remains to be addressed when building the model. As Figure 1 illustrates, an individual may experience $X_2 < X_1$ even though $S_1 < S_2$. This variability is due to individual differences in perception and is captured by the standard deviation ($\sigma$) of the sensations.

Figure 1: Sensations can be ordered opposite to their means because of individual differences in perception

When comparing two stimuli, individuals are required to make a definite choice between them, so there are no ties. This binary (yes/no) outcome is crucial for determining the probability $p_{ij}$ that stimulus $i$ is perceived as stronger than stimulus $j$.

The assumption is that in an idealized scenario, researchers know the exact proportion of times one stimulus is perceived as stronger than another. This knowledge allows them to precisely calculate the spacing of stimuli on the sensation continuum, reflecting not just the order in which stimuli are ranked but also the magnitude of differences between them.
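The proportion described above can be approximated by simulation. The sketch below is a hypothetical illustration, not part of the original report: it draws correlated normal sensations for a population of simulated individuals and counts how often $X_i > X_j$. The function name `simulate_preference` and the parameter values are assumptions chosen for the demonstration.

```python
import math
import random


def simulate_preference(s_i, s_j, sigma2, rho, n_trials, seed=0):
    """Estimate P(X_i > X_j) by drawing correlated normal sensations.

    X_i and X_j share the variance sigma2 and have correlation rho.
    The pair is built from two independent standard normals z1, z2.
    """
    rng = random.Random(seed)
    sigma = math.sqrt(sigma2)
    wins = 0
    for _ in range(n_trials):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        x_i = s_i + sigma * z1
        # rho * z1 + sqrt(1 - rho^2) * z2 has unit variance and
        # correlation rho with z1.
        x_j = s_j + sigma * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
        if x_i > x_j:
            wins += 1
    return wins / n_trials


# Stimulus 1 has the lower mean, yet it still "wins" in a sizeable
# share of individual comparisons.
share = simulate_preference(s_i=0.0, s_j=0.3, sigma2=1.0, rho=0.5, n_trials=50_000)
print(share)
```

Even though the first stimulus has the smaller mean, the estimated share is well above zero, which is the reversal effect shown in Figure 1.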

Objective

Our aim is to determine the relative spacings of a set of stimuli, denoted by $S_1, S_2, \ldots, S_n$, on a sensation scale. These spacings are derived from the probabilities $p_{ij}$, which represent the frequency with which the sensation evoked by stimulus $i$ is preferred over that evoked by stimulus $j$.

Note:
The scale values $S_i$ are only determined up to a linear transformation (i.e., they are relative positions), reflecting the inherent arbitrariness of the zero point and scale in psychological measurement.

Model Formulation 1

Let $X_i$ and $X_j$ denote the sensations evoked by the $i$th and $j$th stimuli, respectively, in an individual. These sensations are assumed to be jointly normally distributed across a population, with means $S_i$ and $S_j$ and a shared variance $\sigma^2$. The correlation between any two sensations $X_i$ and $X_j$ is denoted by $\rho$ and is assumed to be equal across all pairs of stimuli.

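The probability of preference $p_{ij}$, i.e., the probability that sensation $X_i$ is greater than $X_j$, is calculated as:

$$p_{ij} = P(X_i > X_j) = \frac{1}{\sqrt{2\pi}\,\sigma(d_{ij})} \int_{0}^{\infty} \exp\left\{-\frac{[d_{ij} - (S_i - S_j)]^2}{2\sigma^2(d_{ij})}\right\} \, \mathrm{d}d_{ij}$$

where

  • $d_{ij} = X_i - X_j$ is the difference in sensations between stimuli $i$ and $j$.
  • $\sigma^2(d_{ij}) = 2\sigma^2(1 - \rho)$ defines the variance of the difference in sensations.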

The integral calculates the area under the curve of the PDF of $d_{ij}$ over $d_{ij} > 0$, where the PDF is centered at the mean difference in sensations $S_i - S_j$ and scaled by the variance of $d_{ij}$. The variance is normalized by setting $2\sigma^2(1 - \rho) = 1$.

With the normalized variance, $p_{ij}$ simplifies to:

$$p_{ij} = \frac{1}{\sqrt{2\pi}} \int_{-(S_i - S_j)}^{\infty} e^{-\frac{1}{2}y^2} \, \mathrm{d}y$$

Given $p_{ij}$, the difference $(S_i - S_j)$ can be solved for using the standard normal distribution table. By setting $S_1 = 0$ as a reference point (a baseline), the relative mean sensations $S_i$ for all stimuli can be calculated.
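Both directions of this relationship can be sketched with Python's standard library, where `statistics.NormalDist` provides the standard normal CDF and its inverse in place of a printed table. The function names and the input probabilities below are illustrative assumptions, and the sketch assumes the normalized case $\sigma^2(d_{ij}) = 1$.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def preference_probability(s_i, s_j):
    """p_ij = Phi(S_i - S_j), assuming the variance of d_ij is normalized to 1."""
    return STD_NORMAL.cdf(s_i - s_j)


def scale_values_from_reference(p_vs_first):
    """Recover scale values with S_1 = 0 as the baseline.

    p_vs_first holds p_{i1} for i = 2, ..., n, each strictly between 0 and 1.
    Since p_{i1} = Phi(S_i - S_1) and S_1 = 0, S_i = Phi^{-1}(p_{i1}).
    """
    return [0.0] + [STD_NORMAL.inv_cdf(p) for p in p_vs_first]


# Hypothetical proportions of times stimuli 2 and 3 beat stimulus 1.
scales = scale_values_from_reference([0.6179, 0.8413])
print(scales)
print(preference_probability(scales[2], scales[1]))
```

Note that `inv_cdf` is undefined at exactly 0 or 1, so observed proportions of 0% or 100% cannot be placed on the scale this way without some adjustment.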

Example 1

Consider a football scenario where we want to assess which of two players, Player A and Player B, might perform better in terms of goal scoring in a specific match based on their past performance. Assume their goal-scoring abilities are normally distributed sensations $X_A$ and $X_B$. $S_A$ and $S_B$ represent the average number of goals scored per match by Player A and Player B, respectively. Both have the same variability in their performance ($\sigma^2$), and $\rho = 0.5$ due to similar conditions.

Let:

  • $S_A = 0.8$
  • $S_B = 0.5$
  • $\sigma^2 = 0.2$
  • $\rho = 0.5$

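So, $\sigma^2(d_{AB}) = 2 \times 0.2 \times (1 - 0.5) = 0.2$, and $\sigma(d_{AB}) = \sqrt{0.2} \approx 0.447$.

To find $p_{AB} = P(X_A > X_B)$, note that the variance of the difference is 0.2 rather than 1, so the mean difference must be standardized before using the unit-variance formula:

$$S_A - S_B = 0.3 \quad\Longrightarrow\quad z = \frac{S_A - S_B}{\sigma(d_{AB})} = \frac{0.3}{\sqrt{0.2}} \approx 0.671, \qquad p_{AB} = \frac{1}{\sqrt{2\pi}} \int_{-z}^{\infty} e^{-\frac{1}{2}y^2} \, \mathrm{d}y$$

Using a standard normal distribution table, $p_{AB} = 1 - \Phi(-0.671) = \Phi(0.671) \approx 0.749$. (If the scale were already normalized so that $\sigma(d_{AB}) = 1$, the same difference of 0.3 would give $1 - \Phi(-0.3) = 1 - 0.3821 = 0.6179$.)
So, Player A has roughly a 74.9% chance of scoring more than Player B.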
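The arithmetic of Example 1 can be checked with a short script. This is a minimal sketch using only the Python standard library, with illustrative variable names. It evaluates the probability in two ways: treating $S_A - S_B = 0.3$ as already expressed in units where $\sigma(d_{ij}) = 1$, and dividing by the $\sigma(d_{ij}) = \sqrt{0.2}$ implied by the stated variance and correlation.

```python
import math
from statistics import NormalDist

s_a, s_b = 0.8, 0.5      # mean goals per match for Player A and Player B
sigma2, rho = 0.2, 0.5   # common variance and correlation from the example

var_d = 2 * sigma2 * (1 - rho)  # variance of d = X_A - X_B
sd_d = math.sqrt(var_d)

phi = NormalDist().cdf  # standard normal CDF

# Case 1: the difference 0.3 is read directly on a unit-variance scale.
p_unit_scale = phi(s_a - s_b)

# Case 2: the difference is standardized by the actual sd of d.
p_standardized = phi((s_a - s_b) / sd_d)

print(round(var_d, 4), round(p_unit_scale, 4), round(p_standardized, 4))
```

The first value, about 0.618, corresponds to the table lookup $1 - \Phi(-0.3)$. The second, about 0.749, is what results when $\sigma^2 = 0.2$ and $\rho = 0.5$ are used as given, since the difference then has standard deviation $\sqrt{0.2}$ rather than 1.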

Application:
The Thurstone model is widely used in product taste testing, consumer preference studies, psychological ranking tasks, and any context requiring conversion of paired choice data into a latent scale.

Model Derivation 1

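Assume two normally distributed random variables $X_i$ and $X_j$ with means $S_i$ and $S_j$ and a common variance $\sigma^2$. The correlation between $X_i$ and $X_j$ is $\rho$. Define $d_{ij} = X_i - X_j$.

  • Mean of $d_{ij}$:
$$\mu_{d_{ij}} = E[X_i - X_j] = S_i - S_j$$
  • Variance of $d_{ij}$:
$$\sigma^2(d_{ij}) = \mathrm{Var}[X_i - X_j] = \sigma^2 + \sigma^2 - 2\rho\sigma^2 = 2\sigma^2(1 - \rho)$$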

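A random variable $Z \sim N(\mu, \sigma^2)$ has PDF:

$$f(z) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{-\frac{(z - \mu)^2}{2\sigma^2}\right\}$$

Apply this to $d_{ij}$:

$$f(d_{ij}) = \frac{1}{\sqrt{2\pi\sigma^2(d_{ij})}} \exp\left\{-\frac{[d_{ij} - \mu_{d_{ij}}]^2}{2\sigma^2(d_{ij})}\right\}$$

Compute the probability $p_{ij} = P(d_{ij} > 0)$:

$$p_{ij} = \int_{0}^{\infty} f(d_{ij}) \, \mathrm{d}(d_{ij}) = \frac{1}{\sqrt{2\pi}\,\sigma(d_{ij})} \int_{0}^{\infty} \exp\left\{-\frac{[d_{ij} - (S_i - S_j)]^2}{2\sigma^2(d_{ij})}\right\} \, \mathrm{d}(d_{ij})$$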

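Now, to normalize the variance, set:

$$\sigma^2(d_{ij}) = 2\sigma^2(1 - \rho) = 1$$

The standardized variable below then follows the standard normal $N(0, 1)$:

$$y = \frac{d_{ij} - (S_i - S_j)}{\sigma(d_{ij})} = d_{ij} - (S_i - S_j)$$

Under this change of variable, $\mathrm{d}y = \mathrm{d}(d_{ij})$ and the lower limit $d_{ij} = 0$ maps to $y = -(S_i - S_j)$, so the integral becomes:

$$p_{ij} = \frac{1}{\sqrt{2\pi}} \int_{-(S_i - S_j)}^{\infty} \exp\left\{-\frac{y^2}{2}\right\} \, \mathrm{d}y$$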
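The derivation can be sanity-checked numerically by integrating the PDF of $d_{ij}$ over $(0, \infty)$ and comparing the result with the standard normal CDF. The sketch below uses a plain trapezoid rule. The function names, step count, and truncation width are assumptions made for illustration, not part of the original derivation.

```python
import math
from statistics import NormalDist


def difference_pdf(d, mean_d, var_d):
    """Normal PDF of d_ij with mean S_i - S_j and variance sigma^2(d_ij)."""
    return math.exp(-((d - mean_d) ** 2) / (2 * var_d)) / math.sqrt(2 * math.pi * var_d)


def upper_tail_by_trapezoid(mean_d, var_d, steps=20_000, width=10.0):
    """Approximate the integral of the PDF from 0 to infinity.

    The infinite upper limit is truncated `width` standard deviations
    above max(mean_d, 0), where the remaining tail mass is negligible.
    """
    upper = max(mean_d, 0.0) + width * math.sqrt(var_d)
    h = upper / steps
    total = 0.5 * (difference_pdf(0.0, mean_d, var_d) + difference_pdf(upper, mean_d, var_d))
    for k in range(1, steps):
        total += difference_pdf(k * h, mean_d, var_d)
    return total * h


def upper_tail_closed_form(mean_d, var_d):
    """P(d_ij > 0) = Phi((S_i - S_j) / sigma(d_ij))."""
    return NormalDist().cdf(mean_d / math.sqrt(var_d))


numeric = upper_tail_by_trapezoid(mean_d=0.3, var_d=1.0)
closed = upper_tail_closed_form(mean_d=0.3, var_d=1.0)
print(numeric, closed)
```

With $\sigma^2(d_{ij}) = 1$ the closed form reduces to $\Phi(S_i - S_j)$, which is the final integral above, and the two printed values should agree to several decimal places.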

Conclusion

To sum up, the Thurstone model [2] is based on the idea of a continuum where each item (or stimulus) has a location that represents its "strength" or preference level. The probability of preferring stimulus $i$ over $j$ ($p_{ij}$) is modeled as the area under the standard normal curve to the right of $-(S_i - S_j)$, which equals $\Phi(S_i - S_j)$ once the variance of the difference is normalized to 1. This model inherently assumes that the differences in perceptions are normally distributed.

The Thurstone framework includes several "cases":

  • Case V: Assumes equal variances and equal (often zero) correlations (as discussed here).
  • Cases III and IV: Allow for unequal variances and/or unequal correlations; they are mathematically more general but computationally more complex [1,2].

The Thurstone model, despite its careful probabilistic treatment of preference, is limited by its dependence on a normal distribution and a constant correlation between stimuli. These assumptions may not hold in real-world scenarios. In addition, the normal integral has no closed form, which makes the model more computationally demanding and less suitable for large datasets or environments requiring quick decisions.

The simplified equation represents the prototype of the Bradley-Terry model [3], which resolves these issues by simplifying the distribution assumptions into a simple ratio-based model, enhancing both computational efficiency and applicability.

References

  • [1] L.L. Thurstone. Psychophysical analysis. American Journal of Psychology, 38:368–389, 1927.
  • [2] Frederick Mosteller. Remarks on the method of paired comparisons: I. The least squares solution assuming equal standard deviations and equal correlations. Psychometrika, 16(1):3–9, 1951.
  • [3] Ralph Allan Bradley. Some statistical methods in taste testing and quality evaluation. Biometrics, 9(1):22–38, 1953.
  • [4] J. P. Guilford. Psychometric Methods. McGraw-Hill Book Co., 1936.


1. The Thurstone Model
http://neurowave.tech/2023/12/01/11-1-Thurston/
Author
Artin Tan
Published on
December 1, 2023
Updated on
August 2, 2025