Despite these difficulties, we can calculate a TSR from previous matches for each of the two teams in a Premier League contest. We can further use historical TSR matchups from previous seasons to establish the relationship between each team’s pre-game TSR and the actual outcome of the game.
Logistic regression, which models an outcome that either happens or doesn’t, such as a home victory, is one route to determining match probabilities from each team’s game day TSR.
When Sunderland travelled to Tottenham on April 7th 2014, their TSR over the last 30 games was 0.43; over the last 10 it was slightly better, at 0.44. Their hosts’ figures were 0.56 and 0.51, respectively.
Using historical, out-of-sample results from five previous seasons from soccer-data.co.uk and a logistic regression, the chance of a Spurs win at home to Sunderland based on 30-game averages was 68%.
The calculation involves two steps. The home and away constants it uses are obtained by running a logistic regression on matches played during the five most recent seasons of Premiership games, excluding 2013/14. The predictor variables were each team’s TSR from their previous 30 league games, and the outcome was whether or not the result was a home win.
First, calculate F = (8.19*HTSR) - (6.44*ATSR) - 1.08,
where HTSR is the TSR of the home side over the previous 30 matches and ATSR is that of the away team.
Then, to convert F into the probability of the home team winning the game, take:
Home Win Probability = exp(F)/(1 + exp(F)) = 0.68
The 30 game HTSR for Spurs was 0.56 and Sunderland’s ATSR was 0.43, giving Tottenham a likely winning probability of 0.68 when these numbers are put into the equations above.
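The two steps can be sketched in a few lines of Python. This is a minimal illustration using the 30-game constants quoted in the text; the function and variable names are assumptions made for the example, not part of the original model.

```python
import math

# Constants quoted in the text for the 30-game TSR model.
HOME_COEF = 8.19
AWAY_COEF = 6.44
INTERCEPT = -1.08


def home_win_probability(htsr, atsr):
    """Return the modelled home win probability from 30-game TSRs."""
    # Step 1: the linear predictor F.
    f = HOME_COEF * htsr - AWAY_COEF * atsr + INTERCEPT
    # Step 2: the logistic transform, exp(F) / (1 + exp(F)).
    return math.exp(f) / (1 + math.exp(f))


# Spurs (home, 0.56) against Sunderland (away, 0.43).
p = home_win_probability(0.56, 0.43)
print(round(p, 2))  # 0.68
```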
This compares with a much lower home win probability of 0.53 when judged on TSR over just the previous 10 games. For the 10-game model, the home and away constants used to determine the value of F are 7.11 and 5.12 respectively, and the lone constant is 1.23, applied with the same signs as in the 30-game formula. These slightly different constants are again derived from five seasons of Premiership data, but using each team’s TSR over the previous 10 games.
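To compare the two windows side by side, the same calculation can be parameterised by the set of constants. The sketch below assumes the 10-game constants enter F with the same signs as the 30-game ones (home term added, away term and lone constant subtracted), which the text implies but does not state outright. The names MODELS and home_win_probability are hypothetical.

```python
import math

# (home constant, away constant, lone constant) per TSR window, as quoted
# in the text. Assumption: the away term and the lone constant are
# subtracted in both models, mirroring the 30-game formula.
MODELS = {
    30: (8.19, 6.44, 1.08),
    10: (7.11, 5.12, 1.23),
}


def home_win_probability(htsr, atsr, window):
    """Return the modelled home win probability for the given TSR window."""
    home_coef, away_coef, constant = MODELS[window]
    f = home_coef * htsr - away_coef * atsr - constant
    return 1 / (1 + math.exp(-f))  # equivalent to exp(F) / (1 + exp(F))


# Tottenham v Sunderland TSRs (home, away) for each window.
tsr = {30: (0.56, 0.43), 10: (0.51, 0.44)}
for window, (htsr, atsr) in tsr.items():
    p = home_win_probability(htsr, atsr, window)
    print(f"{window}-game TSR: {p:.3f}")
```

With the rounded TSRs given above, the results land close to the 0.68 and 0.53 quoted in the text; any small gap is plausibly down to rounding in the published TSRs and constants.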