4. The Davidson Model

The Davidson Model

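Disclaimer: this article is an excerpt, with some reorganisation, from my graduation research report "The Exploration of Pairwise Comparison in Football Application". It is intended for study and discussion only.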

Introduction

Roger R. Davidson [3] introduced a scaling constant $v$ and, building on Luce’s Choice Axiom [4] and the geometric mean, established a model in which all outcome probabilities share a single denominator. Both the Rao-Kupper model [1] and the Davidson model are extensions of the Bradley-Terry model [2], but they differ in how ties enter the outcome probabilities.

In the search for Maximum Likelihood Estimators (MLEs), the existence and uniqueness of solutions were demonstrated based on Ford’s Assumption [5].

Luce’s Choice Axiom

Luce’s Choice Axiom (Lemma)
Given a finite set of treatments $\{1, 2, \ldots, t\}$ with associated worth $\pi_i \ge 0$ for each treatment $i$ and $\sum_{i=1}^{t} \pi_i = 1$, Luce’s Choice Axiom can be concisely formalized as:

For $i, j \in \{1, 2, \ldots, t\}$ and $i \ne j$,

$$\frac{p(i \mid i,j)}{p(j \mid i,j)} = \frac{\pi_i}{\pi_j},$$

where $p(l \mid i,j)$ is the probability of choosing treatment $l$ from the pair $\{i, j\}$, with $l = i$ or $l = j$, under the condition that $p(i \mid i,j) \ne 0$ and $p(j \mid i,j) \ne 1$.

Non-empty Subset Preference

Non-empty Subset Preference (Assumption)
The model assumes that, in every division of the treatments into two non-empty groups, at least one treatment in one group is preferred at least once over some treatment in the other group. This assumption is critical for ensuring that the likelihood function behaves properly and that a global maximum exists.

Model Formulation 4

Consider pairwise comparison among $t$ treatments, where each treatment $i$ has an inherent preference (or stimulus strength) $\pi_i \ge 0$, normalised so that $\sum_{i=1}^{t} \pi_i = 1$.

When there is no clear preference, the tie probability $p(0 \mid i,j)$ is taken to be proportional to the geometric mean of the two individual preference probabilities:

$$p(0 \mid i,j) = v\sqrt{p(i \mid i,j)\,p(j \mid i,j)},$$

where $v \ge 0$ acts as a scaling constant. Combining this with the ratio from Luce’s Choice Axiom, $p(i \mid i,j)/p(j \mid i,j) = \pi_i/\pi_j$, the adapted model is:

$$p(l \mid i,j) = \frac{\pi_l}{\pi_i + \pi_j + v\sqrt{\pi_i \pi_j}}, \quad l = i, j, \qquad p(0 \mid i,j) = \frac{v\sqrt{\pi_i \pi_j}}{\pi_i + \pi_j + v\sqrt{\pi_i \pi_j}},$$

ensuring the total probability constraint:

$$p(i \mid i,j) + p(j \mid i,j) + p(0 \mid i,j) = 1.$$

Here $p(i \mid i,j)$ is the probability that $i$ is preferred over $j$, $p(j \mid i,j)$ is the probability that $j$ is preferred over $i$, and $p(0 \mid i,j)$ is the probability of a tie.

It is also important to note that the Bradley-Terry model is a special case of both the Rao-Kupper model, when the threshold parameter $\theta = 1$, and the Davidson model, when the scaling constant $v = 0$.
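As a minimal sketch of the three outcome probabilities, the formulas can be evaluated directly. The function name `davidson_probs` and the input values below are illustrative assumptions, not part of the original report.

```python
import math

def davidson_probs(pi_i, pi_j, v):
    """Return (P(i preferred), P(j preferred), P(tie)) under the Davidson model."""
    if pi_i <= 0 or pi_j <= 0 or v < 0:
        raise ValueError("strengths must be positive and v non-negative")
    tie_weight = v * math.sqrt(pi_i * pi_j)  # v times the geometric mean of the strengths
    denom = pi_i + pi_j + tie_weight         # denominator shared by all three outcomes
    return pi_i / denom, pi_j / denom, tie_weight / denom

# Hypothetical strengths and scaling constant, chosen only for illustration.
p_win, p_loss, p_tie = davidson_probs(0.6, 0.4, 0.5)

# With v = 0 the tie probability vanishes and the Bradley-Terry model is recovered.
bt_win, bt_loss, bt_tie = davidson_probs(0.6, 0.4, 0.0)
```

Because the three values share one denominator, they sum to one by construction, and setting `v = 0` gives `pi_i / (pi_i + pi_j)` for the first outcome.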

The Log-likelihood function:

$$\ln L(\pi, v) = \frac{1}{2}\sum_{i=1}^{t} s_i \ln \pi_i + T \ln v - \sum_{i<j} r_{ij} \ln\left(\pi_i + \pi_j + v\sqrt{\pi_i \pi_j}\right)$$

where

  • $s_i = 2w_i + t_i$, $i = 1, \ldots, t$, is a score for treatment $i$ that counts each win twice and each tie once.
  • $w_{ij}$ is the number of times treatment $i$ is preferred over $j$, and $w_{ji}$ the number of times $j$ is preferred over $i$; $w_i = \sum_j w_{ij}$.
  • $t_{ij}$ is the number of times neither treatment is preferred; $t_i = \sum_j t_{ij}$.
  • $T = \sum_{i<j} t_{ij}$ is the total number of ties across all treatment comparisons.
  • $W$ is the matrix of wins $[w_{ij};\ i, j = 1, \ldots, t]$.
  • $r_{ij}$ is the number of independent responses for the comparison of treatments $i$ and $j$, $r_{ij} = w_{ij} + w_{ji} + t_{ij}$.
  • $r_{ii} = w_{ii} = t_{ii} = 0$.

Each treatment is paired with the others, and responses are recorded independently for each pairwise comparison. The total number of comparisons is $N = \sum_{i<j} r_{ij}$.
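The quantities defined above can be computed from a win matrix and a symmetric tie matrix. The sketch below assumes that data layout; the function name `sufficient_stats` and the three-team results are hypothetical.

```python
def sufficient_stats(wins, ties):
    """Compute s_i, r_ij, T and N from the win and tie counts.

    wins[i][j] is the number of times i was preferred over j;
    ties[i][j] == ties[j][i] is the number of ties between i and j.
    """
    t = len(wins)
    # s_i = 2 * w_i + t_i: each win counts twice, each tie once.
    s = [sum(2 * wins[i][j] + ties[i][j] for j in range(t) if j != i)
         for i in range(t)]
    # r_ij = w_ij + w_ji + t_ij, with r_ii = 0.
    r = [[0 if i == j else wins[i][j] + wins[j][i] + ties[i][j]
          for j in range(t)] for i in range(t)]
    T = sum(ties[i][j] for i in range(t) for j in range(i + 1, t))
    N = sum(r[i][j] for i in range(t) for j in range(i + 1, t))
    return s, r, T, N

# Hypothetical results for three teams (row team preferred over column team).
wins = [[0, 3, 4],
        [1, 0, 2],
        [1, 2, 0]]
ties = [[0, 2, 1],
        [2, 0, 3],
        [1, 3, 0]]
s, r, T, N = sufficient_stats(wins, ties)
```

For this data the scores are `[17, 11, 10]`, with `T = 6` and `N = 19`. The scores always sum to `2 * N`, since every comparison contributes either one win (counted twice) or one tie (counted once for each side).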

The maximum likelihood estimates (MLEs) $(p, \hat{v})$ of the parameters $(\pi, v)$ are obtained by solving:

$$\frac{s_i}{p_i} - g_i(p, \hat{v}) = 0, \quad i = 1, \ldots, t, \qquad \frac{T}{\hat{v}} - h(p, \hat{v}) = 0,$$

with the functions:

$$g_i(p, \hat{v}) = \sum_{j} \frac{r_{ij}\left(2 + \hat{v}\sqrt{p_j / p_i}\right)}{p_i + p_j + \hat{v}\sqrt{p_i p_j}}, \qquad h(p, \hat{v}) = \sum_{i<j} \frac{r_{ij}\sqrt{p_i p_j}}{p_i + p_j + \hat{v}\sqrt{p_i p_j}},$$

where

  • $p = (p_1, \ldots, p_t)$.
  • $\pi = (\pi_1, \ldots, \pi_t)$.

The Existence and Uniqueness of Solutions:
Following Ford’s [5] Assumption of Non-empty Subset Preference for the Bradley-Terry model, the maximisation of $L(\pi, v)$ over the region $\{\pi_i > 0,\ \sum_i \pi_i = 1,\ 0 < v < \infty\}$ is analysed under a restriction on the matrix $W$. This setup requires $T > 0$ and sets $L(\pi, v) = 0$ on the boundary, which gives a uniformly continuous extension of $L$ to the closure of the region and establishes the existence and uniqueness of the maximum.

  • For $t = 2$, explicit solutions are given by $p_i = \frac{w_i}{w_1 + w_2}$ and $\hat{v} = \frac{T}{\sqrt{w_1 w_2}}$.
  • For $t > 2$, iterative methods are needed.
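One simple iterative scheme is a fixed-point iteration on the two stationarity equations: update $p_i \leftarrow s_i / g_i(p, v)$, renormalise so the strengths sum to one, then update $v \leftarrow T / h(p, v)$. The sketch below is only an illustration under stated assumptions: the function name `fit_davidson`, the starting values, the fixed iteration count and the data are all hypothetical, every team is assumed to have at least one win or tie, $T > 0$ is assumed, and convergence is not proven here.

```python
import math

def fit_davidson(wins, ties, iters=500):
    """Fixed-point iteration for the Davidson MLEs (p, v_hat)."""
    t = len(wins)
    r = [[0 if i == j else wins[i][j] + wins[j][i] + ties[i][j]
          for j in range(t)] for i in range(t)]
    s = [sum(2 * wins[i][j] + ties[i][j] for j in range(t) if j != i)
         for i in range(t)]
    T = sum(ties[i][j] for i in range(t) for j in range(i + 1, t))

    p = [1.0 / t] * t  # assumed starting point: equal strengths
    v = 1.0            # assumed starting value for the scaling constant
    for _ in range(iters):
        # Update each strength from s_i / p_i = g_i(p, v), then renormalise.
        new_p = []
        for i in range(t):
            g_i = sum(
                r[i][j] * (2 + v * math.sqrt(p[j] / p[i]))
                / (p[i] + p[j] + v * math.sqrt(p[i] * p[j]))
                for j in range(t) if j != i
            )
            new_p.append(s[i] / g_i)
        total = sum(new_p)
        p = [x / total for x in new_p]
        # Update the scaling constant from T / v = h(p, v).
        h = sum(
            r[i][j] * math.sqrt(p[i] * p[j])
            / (p[i] + p[j] + v * math.sqrt(p[i] * p[j]))
            for i in range(t) for j in range(i + 1, t)
        )
        v = T / h
    return p, v

# Hypothetical results for three teams (row team preferred over column team).
wins = [[0, 3, 4],
        [1, 0, 2],
        [1, 2, 0]]
ties = [[0, 2, 1],
        [2, 0, 3],
        [1, 3, 0]]
p_hat, v_hat = fit_davidson(wins, ties)
```

A quick sanity check is to run the same function on two-team data and compare the result with the explicit $t = 2$ solution.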

Example 4

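Consider a football tournament among three teams: Team A, Team B, and Team C. The team strengths are $\pi_A = 0.5$, $\pi_B = 0.3$, $\pi_C = 0.2$, and the scaling constant is $v = 0.1$. To demonstrate the model’s application, we calculate the probabilities for one of the matches, Team A vs. Team B. Here $\sqrt{\pi_A \pi_B} = \sqrt{0.15} \approx 0.3873$, so the common denominator is $0.5 + 0.3 + 0.1 \times 0.3873 \approx 0.8387$, and the model formulas give:

  • Team A winning: $p(A \mid A,B) = \frac{\pi_A}{\pi_A + \pi_B + v\sqrt{\pi_A \pi_B}} \approx 59.6\%$
  • Team B winning: $p(B \mid A,B) = \frac{\pi_B}{\pi_A + \pi_B + v\sqrt{\pi_A \pi_B}} \approx 35.8\%$
  • Draw: $p(0 \mid A,B) = \frac{v\sqrt{\pi_A \pi_B}}{\pi_A + \pi_B + v\sqrt{\pi_A \pi_B}} \approx 4.6\%$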

Model Derivation 4

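The geometric mean is defined as:

$$G = \left(\prod_{i=1}^{n} x_i\right)^{\frac{1}{n}}$$

Thus, when $n = 2$, the geometric mean is $\sqrt{x_1 x_2}$. Taking $x_1 = p(i \mid i,j)$ and $x_2 = p(j \mid i,j)$ and scaling by a constant $v \ge 0$ gives the tie equation.

From the Lemma of Luce’s Choice Axiom, express $p(j \mid i,j)$ in terms of $p(i \mid i,j)$:

$$p(j \mid i,j) = p(i \mid i,j)\,\frac{\pi_j}{\pi_i}$$

The probability of a tie, $p(0 \mid i,j)$, is defined as:

$$p(0 \mid i,j) = v\sqrt{p(i \mid i,j)\,p(j \mid i,j)} = v\sqrt{p(i \mid i,j)\left(p(i \mid i,j)\,\frac{\pi_j}{\pi_i}\right)} = v\,p(i \mid i,j)\sqrt{\frac{\pi_j}{\pi_i}}$$

Given the total probability equation:

$$\begin{aligned}
1 &= p(i \mid i,j) + p(j \mid i,j) + p(0 \mid i,j) \\
  &= p(i \mid i,j) + p(i \mid i,j)\,\frac{\pi_j}{\pi_i} + v\,p(i \mid i,j)\sqrt{\frac{\pi_j}{\pi_i}} \\
  &= p(i \mid i,j)\left(1 + \frac{\pi_j}{\pi_i} + v\sqrt{\frac{\pi_j}{\pi_i}}\right)
\end{aligned}$$

Finally, solving for $p(i \mid i,j)$:

$$p(i \mid i,j) = \frac{1}{1 + \frac{\pi_j}{\pi_i} + v\sqrt{\frac{\pi_j}{\pi_i}}} = \frac{\pi_i}{\pi_i + \pi_j + v\sqrt{\pi_i \pi_j}}$$

Similarly, it is easy to obtain $p(j \mid i,j)$ and $p(0 \mid i,j)$.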

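Next, we find the maximum likelihood estimates. The likelihood $L$ for all the observed outcomes is the product:

$$L = \prod_{i<j} \left[p(i \mid i,j)\right]^{w_{ij}} \left[p(j \mid i,j)\right]^{w_{ji}} \left[p(0 \mid i,j)\right]^{t_{ij}}$$

The log-likelihood $\ln L$ is:

$$\begin{aligned}
\ln L(\pi, v) &= \sum_{i<j} \left\{ w_{ij} \ln p(i \mid i,j) + w_{ji} \ln p(j \mid i,j) + t_{ij} \ln p(0 \mid i,j) \right\} \\
&= \sum_{i<j} \left( w_{ij} \ln \pi_i + w_{ji} \ln \pi_j + t_{ij} \ln\left(v\sqrt{\pi_i \pi_j}\right) \right) - \sum_{i<j} r_{ij} \ln\left(\pi_i + \pi_j + v\sqrt{\pi_i \pi_j}\right),
\end{aligned}$$

since $r_{ij} = w_{ij} + w_{ji} + t_{ij}$.

By using the properties of logarithms $\ln ab = \ln a + \ln b$ and $\ln \sqrt{a} = \frac{1}{2} \ln a$:

$$t_{ij} \ln\left(v\sqrt{\pi_i \pi_j}\right) = t_{ij}\left(\ln v + \frac{1}{2} \ln \pi_i + \frac{1}{2} \ln \pi_j\right)$$

Substitute this back into $\ln L$:

$$\begin{aligned}
\ln L(\pi, v) &= \sum_{i<j} \left( w_{ij} \ln \pi_i + w_{ji} \ln \pi_j + t_{ij}\left(\ln v + \frac{1}{2} \ln \pi_i + \frac{1}{2} \ln \pi_j\right) \right) - \sum_{i<j} r_{ij} \ln\left(\pi_i + \pi_j + v\sqrt{\pi_i \pi_j}\right) \\
&= \sum_{i<j} \left( w_{ij} \ln \pi_i + w_{ji} \ln \pi_j \right) + \sum_{i<j} \left( \frac{1}{2} t_{ij} \ln \pi_i + \frac{1}{2} t_{ij} \ln \pi_j \right) + T \ln v - \sum_{i<j} r_{ij} \ln\left(\pi_i + \pi_j + v\sqrt{\pi_i \pi_j}\right) \\
&= \sum_{i=1}^{t} \left( \sum_{j \ne i} w_{ij} \ln \pi_i \right) + \sum_{i=1}^{t} \left( \frac{1}{2} \sum_{j \ne i} t_{ij} \ln \pi_i \right) + T \ln v - \sum_{i<j} r_{ij} \ln\left(\pi_i + \pi_j + v\sqrt{\pi_i \pi_j}\right)
\end{aligned}$$

The last step collects the contributions to each $\pi_i$ from all pairings in which $i$ is involved, either as the first or as the second treatment of the pair, across all $j \ne i$. Relabelling the indices in the second sum gives:

$$\sum_{i<j} w_{ij} \ln \pi_i + \sum_{i<j} w_{ji} \ln \pi_j = \sum_{i<j} w_{ij} \ln \pi_i + \sum_{i>j} w_{ij} \ln \pi_i = \sum_{i=1}^{t} \left( \sum_{j \ne i} w_{ij} \ln \pi_i \right)$$

The same argument applies to the $t_{ij}$ terms, using $t_{ij} = t_{ji}$. Thus,

$$\ln L(\pi, v) = \frac{1}{2}\sum_{i=1}^{t} s_i \ln \pi_i + T \ln v - \sum_{i<j} r_{ij} \ln\left(\pi_i + \pi_j + v\sqrt{\pi_i \pi_j}\right)$$

since $T = \sum_{i<j} t_{ij}$ and $s_i = 2w_i + t_i = \sum_{j \ne i} 2w_{ij} + \sum_{j \ne i} t_{ij}$.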
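The algebra above can be checked numerically by evaluating the log-likelihood both ways: pair by pair from the outcome probabilities, and in the compact form with $s_i$ and $T$. This is a sketch only; the function names and the count data are hypothetical, while the strengths and $v$ reuse the values from the example section.

```python
import math

def loglik_pairwise(pi, v, wins, ties):
    """Sum of w_ij ln p(i|i,j) + w_ji ln p(j|i,j) + t_ij ln p(0|i,j) over pairs i < j."""
    t = len(pi)
    total = 0.0
    for i in range(t):
        for j in range(i + 1, t):
            tie_weight = v * math.sqrt(pi[i] * pi[j])
            denom = pi[i] + pi[j] + tie_weight
            total += wins[i][j] * math.log(pi[i] / denom)
            total += wins[j][i] * math.log(pi[j] / denom)
            total += ties[i][j] * math.log(tie_weight / denom)
    return total

def loglik_compact(pi, v, wins, ties):
    """0.5 * sum_i s_i ln pi_i + T ln v - sum_{i<j} r_ij ln(denominator)."""
    t = len(pi)
    s = [sum(2 * wins[i][j] + ties[i][j] for j in range(t) if j != i)
         for i in range(t)]
    T = sum(ties[i][j] for i in range(t) for j in range(i + 1, t))
    value = 0.5 * sum(s[i] * math.log(pi[i]) for i in range(t)) + T * math.log(v)
    for i in range(t):
        for j in range(i + 1, t):
            r_ij = wins[i][j] + wins[j][i] + ties[i][j]
            value -= r_ij * math.log(pi[i] + pi[j] + v * math.sqrt(pi[i] * pi[j]))
    return value

# Hypothetical counts for three teams; strengths and v as in the example section.
wins = [[0, 3, 4], [1, 0, 2], [1, 2, 0]]
ties = [[0, 2, 1], [2, 0, 3], [1, 3, 0]]
pi = [0.5, 0.3, 0.2]
direct = loglik_pairwise(pi, 0.1, wins, ties)
compact = loglik_compact(pi, 0.1, wins, ties)
```

The two values agree up to floating-point rounding. Both functions require $v > 0$ because of the $\ln v$ term.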

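To find the MLE, we take the partial derivatives of $\ln L$ with respect to each parameter $\pi_i$ and $v$, and set them to zero.

For example, under the constraint $\sum_{i=1}^{t} \pi_i = 1$, set $t = 2$: there is only one pair (i.e., $i < j$ becomes $1 < 2$), so $T = t_{12}$ and the log-likelihood simplifies to:

$$\ln L(\pi_1, \pi_2, v) = \frac{1}{2} s_1 \ln \pi_1 + \frac{1}{2} s_2 \ln \pi_2 + T \ln v - r_{12} \ln\left(\pi_1 + \pi_2 + v\sqrt{\pi_1 \pi_2}\right)$$

Given that $\pi_1 + \pi_2 = 1$, we can substitute $\pi_2 = 1 - \pi_1$. Therefore, the log-likelihood becomes:

$$\begin{aligned}
\ln L(\pi_1, v) &= \frac{1}{2} s_1 \ln \pi_1 + \frac{1}{2} s_2 \ln(1 - \pi_1) + T \ln v - r_{12} \ln\left(\pi_1 + (1 - \pi_1) + v\sqrt{\pi_1 (1 - \pi_1)}\right) \\
&= \frac{1}{2} s_1 \ln \pi_1 + \frac{1}{2} s_2 \ln(1 - \pi_1) + T \ln v - r_{12} \ln\left(1 + v\sqrt{\pi_1 (1 - \pi_1)}\right)
\end{aligned}$$

Derivative with respect to $\pi_1$: after differentiating the full log-likelihood and substituting the stationarity condition in $v$ (derived next), the tie terms cancel, so the condition for $\pi_1$ coincides with that of the reduced likelihood:

$$L = \pi_1^{w_{12}} (1 - \pi_1)^{w_{21}} v^{t_{12}}, \qquad \ln L = w_{12} \ln \pi_1 + w_{21} \ln(1 - \pi_1) + t_{12} \ln v, \qquad \frac{\partial \ln L}{\partial \pi_1} = \frac{w_{12}}{\pi_1} - \frac{w_{21}}{1 - \pi_1} = 0,$$

which gives $\pi_1 = \frac{w_{12}}{w_{12} + w_{21}}$.

Derivative with respect to $v$:

$$\frac{\partial \ln L}{\partial v} = \frac{T}{v} - \frac{r_{12}\sqrt{\pi_1 (1 - \pi_1)}}{1 + v\sqrt{\pi_1 (1 - \pi_1)}} = 0$$

Then solving for $v$:

$$v = \frac{T}{(r_{12} - T)\sqrt{\pi_1 (1 - \pi_1)}}$$

where $r_{12} = w_{12} + w_{21} + T$, and under the assumption that $T > 0$. Substituting $r_{12} - T = w_{12} + w_{21}$ and $\sqrt{\pi_1 (1 - \pi_1)} = \frac{\sqrt{w_{12} w_{21}}}{w_{12} + w_{21}}$, the MLEs are:

$$p_1 = \frac{w_{12}}{w_{12} + w_{21}}, \qquad p_2 = \frac{w_{21}}{w_{12} + w_{21}}, \qquad \hat{v} = \frac{T}{\sqrt{w_{12} w_{21}}}$$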
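As a check on the $t = 2$ result, the closed-form estimates can be plugged into both partial derivatives of the full log-likelihood, which should then vanish. The sketch below uses hypothetical function names and hypothetical counts (6 wins, 3 losses, 2 ties).

```python
import math

def davidson_mle_two(w12, w21, ties):
    """Closed-form MLEs (p1, p2, v_hat) for two treatments."""
    if w12 <= 0 or w21 <= 0 or ties <= 0:
        raise ValueError("both win counts and the tie count must be positive")
    p1 = w12 / (w12 + w21)
    v_hat = ties / math.sqrt(w12 * w21)
    return p1, 1.0 - p1, v_hat

def score_v(p1, v, w12, w21, ties):
    """Partial derivative of ln L with respect to v, with pi_2 = 1 - pi_1."""
    r12 = w12 + w21 + ties
    q = math.sqrt(p1 * (1.0 - p1))
    return ties / v - r12 * q / (1.0 + v * q)

def score_p1(p1, v, w12, w21, ties):
    """Partial derivative of the full ln L with respect to pi_1."""
    r12 = w12 + w21 + ties
    s1 = 2 * w12 + ties
    s2 = 2 * w21 + ties
    q = math.sqrt(p1 * (1.0 - p1))
    return (s1 / (2.0 * p1) - s2 / (2.0 * (1.0 - p1))
            - r12 * v * (1.0 - 2.0 * p1) / (2.0 * q * (1.0 + v * q)))

# Hypothetical record: treatment 1 preferred 6 times, treatment 2 preferred 3 times, 2 ties.
p1, p2, v_hat = davidson_mle_two(6, 3, 2)
grad_v = score_v(p1, v_hat, 6, 3, 2)
grad_p1 = score_p1(p1, v_hat, 6, 3, 2)
```

Both gradients are zero up to rounding at $(p_1, \hat{v}) = (2/3,\ 2/\sqrt{18})$, which is consistent with the tie terms cancelling in the $\pi_1$ equation once the $v$ equation holds.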

Conclusion

To sum up, in the Rao-Kupper model [1], the probabilities $p_{ij}$ and $p_{ji}$ each have a denominator influenced by the opposing stimulus strength, $\pi_j$ and $\pi_i$ respectively. The Davidson model [3], on the other hand, standardizes the denominators of $p(i \mid i,j)$ and $p(j \mid i,j)$ by incorporating the geometric mean of the two stimulus strengths, $\sqrt{\pi_i \pi_j}$. All outcome probabilities in the Davidson model therefore share the same denominator, which simplifies its development and analysis.

However, these models still have shortcomings in real-life applications, for example in quantifying defensive and offensive abilities. The Maher model [6], which is based on the Poisson distribution, addresses these issues and calculates the expected number of goals scored by each team.

References

[1] P. V. Rao and L. L. Kupper. Ties in paired-comparison experiments: A generalization of the Bradley-Terry model. Journal of the American Statistical Association, 62(317):194–204, Mar 1967.

[2] Ralph Allan Bradley. Some statistical methods in taste testing and quality evaluation. Biometrics, 9(1):22–38, 1953.

[3] Roger R. Davidson. On extending the Bradley-Terry model to accommodate ties in paired comparison experiments. Journal of the American Statistical Association, 65(329):317–328, 1970.

[4] R.D. Luce. Individual Choice Behavior: A Theoretical Analysis. Wiley, 1959.

[5] L. R. Ford, Jr. Solution of a ranking problem from binary comparisons. The American Mathematical Monthly, 64(8):28–33, 1957. Part 2: To Lester R. Ford on His Seventieth Birthday.

[6] M. J. Maher. Modelling association football scores. Statistica Neerlandica, 36:109–118, 1982.



4. The Davidson Model
http://neurowave.tech/2023/12/11/11-4-Davidson/
Author: Artin Tan
Published: 11 December 2023
Updated: 2 August 2025