2. The Bradley-Terry Model

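Disclaimer: This article is an excerpt, lightly reorganized, from my graduation research report "The Exploration of Pairwise Comparison in Football Application". It is intended for study and discussion only.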

Introduction

The core Bradley-Terry model was proposed by Ralph Allan Bradley [1]. It reformulates Thurstone's model [2] by replacing the normal distribution with the logistic distribution, whose density is a squared hyperbolic secant. The formulation below focuses on the Bradley-Terry model and intentionally omits Thurstone's model, which was detailed earlier.

Model Formulation 2

Recall the setting of Thurstone's model for sensory difference testing: stimuli $O_1, O_2, \ldots, O_n$ evoke sensations $X_i$ that are compared quantitatively on a sensation scale $S$, and the quantity of interest is the probability $p_{ij}$ that stimulus $i$ is preferred over stimulus $j$.

Building upon Thurstone's theoretical foundation, the Bradley-Terry model replaces the normal density with the logistic density (squared hyperbolic secant density), so the probability of preference $p_{ij}$ is redefined as:

$$p_{ij} = P(X_i > X_j) = \frac{1}{4}\int_{-(S_i - S_j)}^{\infty} \operatorname{sech}^2\!\left(\frac{y}{2}\right)\,dy, \tag{9}$$

where $S_i = \log_e(\pi_i)$ and $S_j = \log_e(\pi_j)$ denote the log-transformed strengths of stimuli $i$ and $j$, that is, the stimulus strengths measured on Thurstone's perceptual scale.

This simplifies to the Bradley-Terry model:

$$p_{ij} = \frac{\pi_i}{\pi_i + \pi_j}. \tag{10}$$

The probability $p_{ij}$ that stimulus $i$ is preferred over stimulus $j$ is thus expressed in terms of their relative strengths, or perceived intensities, $\pi_i$ and $\pi_j$.

Example 2

Let’s apply this model to assess which of two football players, Player A and Player B, is more likely to be perceived as the better player based on their performances in a season. Assume the strengths of the players are quantified based on their contribution scores, such as goals, assists, and defensive actions, over the season.

Player A's strength, $\pi_A$, is 20 (goals + assists + key defensive actions).
Player B's strength, $\pi_B$, is 15.

Calculate the probability $p_{AB}$ that Player A is preferred over Player B:

$$p_{AB} = \frac{\pi_A}{\pi_A + \pi_B} = \frac{20}{20 + 15} = \frac{20}{35} \approx 0.571$$

This means there is a 57.1% chance that Player A is perceived as the better player compared to Player B based on their performance metrics.
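The arithmetic above can be reproduced in a few lines of Python. This is only a minimal sketch: the function name `bt_probability` is a hypothetical helper chosen for illustration, not something taken from the original report.

```python
def bt_probability(pi_i: float, pi_j: float) -> float:
    """Probability that item i is preferred over item j (equation 10)."""
    if pi_i <= 0 or pi_j <= 0:
        raise ValueError("strengths must be positive")
    return pi_i / (pi_i + pi_j)


p_ab = bt_probability(20.0, 15.0)  # Player A preferred over Player B
p_ba = bt_probability(15.0, 20.0)  # Player B preferred over Player A

print(round(p_ab, 3), round(p_ba, 3))  # 0.571 0.429
```

The two probabilities sum to one, which reflects the fact that the basic model forces a choice between the two players.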

Model Derivation 2 (Method 1)

The logistic probability density function (PDF) with location parameter μ and scale parameter s is given by:

$$f(x; \mu, s) = \frac{e^{-(x-\mu)/s}}{s\left(1 + e^{-(x-\mu)/s}\right)^2}$$

Simplifying for μ=0 and s=1:

$$f(x; 0, 1) = \frac{e^{-x}}{\left(1 + e^{-x}\right)^2}. \tag{11}$$

The sech function is defined as the reciprocal of the hyperbolic cosine (cosh) function:

$$\operatorname{sech}(x) = \frac{1}{\cosh(x)} = \frac{2}{e^{x} + e^{-x}}.$$

Now, observe that the denominator of equation (11) can be related to the cosh function:

$$1 + e^{-x} = 2e^{-x/2}\left(\frac{e^{x/2} + e^{-x/2}}{2}\right) = 2e^{-x/2}\cosh\!\left(\frac{x}{2}\right)$$

Then, rewrite equation (11):

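$$f(x; 0, 1) = \frac{e^{-x}}{\left(2e^{-x/2}\cosh\!\left(\frac{x}{2}\right)\right)^2} = \frac{1}{4}\cdot\frac{1}{\cosh^2\!\left(\frac{x}{2}\right)} = \frac{1}{4}\operatorname{sech}^2\!\left(\frac{x}{2}\right).$$

Therefore, the reformulated Thurstone model in equation (9), which is the core Bradley-Terry model, is built on the standard logistic distribution, whose density is $\frac{1}{4}\operatorname{sech}^2(x/2)$.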
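The identity between the two forms of the density can also be checked numerically. The snippet below is a small sanity check rather than a proof; the function names and the sample points are arbitrary choices made for illustration.

```python
import math


def logistic_pdf(x: float) -> float:
    """Standard logistic density, equation (11)."""
    return math.exp(-x) / (1.0 + math.exp(-x)) ** 2


def sech2_density(x: float) -> float:
    """Squared hyperbolic secant form: (1/4) * sech^2(x / 2)."""
    return 0.25 / math.cosh(x / 2.0) ** 2


# Compare both forms at a handful of arbitrary points.
xs = [-5.0, -1.0, 0.0, 0.5, 3.0]
max_gap = max(abs(logistic_pdf(x) - sech2_density(x)) for x in xs)
print(max_gap)  # expected to be at floating-point rounding level
```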

Considering the difference in the log-transformed strengths:

$$\Delta S = S_i - S_j = \log_e(\pi_i) - \log_e(\pi_j), \tag{12}$$

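the probability that $X_i > X_j$ is the integral of this logistic density over the upper tail starting at $-\Delta S$:

$$p_{ij} = \int_{-\Delta S}^{\infty} \frac{1}{4}\operatorname{sech}^2\!\left(\frac{x}{2}\right)\,dx.$$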

The cumulative distribution function (CDF) of the logistic distribution is given by:

$$F(x) = \frac{1}{1 + e^{-x}} \tag{13}$$

where μ=0 and s=1.
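As a quick sanity check (an added step, not part of the original derivation), differentiating the CDF in (13) recovers the density in (11):

$$\frac{d}{dx}F(x) = \frac{d}{dx}\left(1 + e^{-x}\right)^{-1} = \frac{e^{-x}}{\left(1 + e^{-x}\right)^2} = f(x; 0, 1).$$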

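Because this density is symmetric about zero, the probability above equals the CDF evaluated at $\Delta S$. Substituting equation (12) into the logistic CDF (13) gives the probability $p_{ij}$:

$$p_{ij} = F(\Delta S) = \frac{1}{1 + e^{-\left(\log_e(\pi_i) - \log_e(\pi_j)\right)}} \tag{14}$$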

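Using the logarithm property $\log_e(a) - \log_e(b) = \log_e(a/b)$:

$$p_{ij} = \frac{1}{1 + e^{-\log_e(\pi_i/\pi_j)}}$$

Since $e^{-\log_e(x)} = \frac{1}{x}$, applying this with $x = \pi_i/\pi_j$ gives:

$$p_{ij} = \frac{1}{1 + \frac{1}{\pi_i/\pi_j}} = \frac{1}{1 + \frac{\pi_j}{\pi_i}} = \frac{\pi_i}{\pi_i + \pi_j}$$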
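The result of Method 1 can be cross-checked by integrating the logistic density numerically and comparing against the closed-form ratio. This is a rough sketch under stated assumptions: the midpoint rule, the cutoff `upper=40.0` standing in for infinity, and the step count `n` are all illustrative choices, and `bt_by_integration` is a hypothetical helper name.

```python
import math


def sech2_density(y: float) -> float:
    """Logistic density written as (1/4) * sech^2(y / 2)."""
    return 0.25 / math.cosh(y / 2.0) ** 2


def bt_by_integration(pi_i: float, pi_j: float,
                      upper: float = 40.0, n: int = 100_000) -> float:
    """Approximate p_ij by integrating the density from -(S_i - S_j) to `upper`.

    `upper` is an assumed finite stand-in for infinity; the tail mass
    beyond 40 is far below double-precision resolution.
    """
    delta_s = math.log(pi_i) - math.log(pi_j)
    lower = -delta_s
    h = (upper - lower) / n
    # Midpoint rule: evaluate the density at the centre of each sub-interval.
    total = sum(sech2_density(lower + (k + 0.5) * h) for k in range(n))
    return total * h


numeric = bt_by_integration(20.0, 15.0)
closed_form = 20.0 / (20.0 + 15.0)
print(abs(numeric - closed_form) < 1e-6)  # True
```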

Method 2

Lecture Notes 24 [3] provide a detailed derivation of the core Bradley-Terry model. Let $p_{ij}$ denote the probability that item $i$ is preferred over item $j$, with the outcome of each comparison modeled as an independent Bernoulli random variable. The log-odds of $p_{ij}$ is given by

$$\log\!\left(\frac{p_{ij}}{1 - p_{ij}}\right) = S_i - S_j,$$

where $S_i = \log(\pi_i)$ and $S_j = \log(\pi_j)$ represent the log-transformed strengths of items $i$ and $j$, respectively.

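Exponentiating both sides yields

$$\frac{p_{ij}}{1 - p_{ij}} = e^{S_i - S_j}.$$

Rearrange the equation to solve for $p_{ij}$:

$$p_{ij} = (1 - p_{ij})\,e^{S_i - S_j} \quad\Longrightarrow\quad p_{ij} + p_{ij}\,e^{S_i - S_j} = e^{S_i - S_j}$$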

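which simplifies to

$$p_{ij} = \frac{e^{S_i - S_j}}{1 + e^{S_i - S_j}} = \frac{e^{\log(\pi_i) - \log(\pi_j)}}{1 + e^{\log(\pi_i) - \log(\pi_j)}}$$

Using the property that $e^{\log(x)} = x$,

$$p_{ij} = \frac{\pi_i/\pi_j}{1 + \pi_i/\pi_j} = \frac{\pi_i}{\pi_i + \pi_j} = \frac{e^{S_i}}{e^{S_i} + e^{S_j}}$$

which matches the form stated in equation (10).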
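The equivalence of the log-odds form and the ratio form can be illustrated with the strengths from the football example. This is a minimal sketch; `bt_from_log_strengths` is a hypothetical helper name, and the values 20 and 15 are reused from the example purely for illustration.

```python
import math


def bt_from_log_strengths(s_i: float, s_j: float) -> float:
    """Logistic function of the log-strength difference S_i - S_j."""
    return 1.0 / (1.0 + math.exp(-(s_i - s_j)))


pi_a, pi_b = 20.0, 15.0
s_a, s_b = math.log(pi_a), math.log(pi_b)

p_logit = bt_from_log_strengths(s_a, s_b)     # log-odds (additive) form
p_ratio = pi_a / (pi_a + pi_b)                # ratio (multiplicative) form
log_odds = math.log(p_ratio / (1 - p_ratio))  # should equal S_a - S_b

print(abs(p_logit - p_ratio) < 1e-12)         # True
print(abs(log_odds - (s_a - s_b)) < 1e-12)    # True
```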

Conclusion

To sum up, the log transformation $S_i = \log(\pi_i)$ is used primarily to turn multiplicative relationships into additive ones: on the log scale, the log-odds of preference is simply the difference $S_i - S_j$. The ratio form $\pi_i/(\pi_i + \pi_j)$ expresses the same relationship on the original multiplicative scale and is straightforward to compute.

However, the Bradley-Terry model [1] has a limitation: it requires a clear preference between any two items, which does not always match real-world applications. Next, we will introduce the Rao-Kupper model [4], which addresses this by allowing a third outcome: a tie, where neither A nor B is clearly preferred.

References

  • [1] Ralph Allan Bradley. Some statistical methods in taste testing and quality evaluation. Biometrics, 9(1):22–38, 1953.
  • [2] Frederick Mosteller. Remarks on the method of paired comparisons: I. The least squares solution assuming equal standard deviations and equal correlations. Psychometrika, 16(1):3–9, 1951.
  • [3] STATS 200: Introduction to Statistical Inference. Lecture 24: The Bradley-Terry model. https://web.stanford.edu/class/archive/stats/stats200/stats200.1172/Lecture24.pdf, 2016. [Online; accessed 9-April-2024]
  • [4] P. V. Rao and L. L. Kupper. Ties in paired-comparison experiments: A generalization of the Bradley-Terry model. Journal of the American Statistical Association, 62(317):194–204, Mar 1967. Available: https://www.jstor.org/stable/2282923

2. The Bradley-Terry Model
http://neurowave.tech/2023/12/11/11-2-Bradley-Terry/
Author: Artin Tan
Published: December 11, 2023
Updated: August 2, 2025