Background

The PGA (Professional Golfers’ Association) Tour has been around since 1929 and is the main organizer of professional golf events in the United States. Each year, countless amateurs and professionals compete for a “tour card,” which allows them to play in tour events throughout the season. Currently, there are 215 active tour cards, but that number can change based on a variety of factors. To maintain their PGA Tour membership, players must participate in 15 events during the 44-tournament season. The PGA Tour is a very competitive landscape in which some golfers spend years on the Korn Ferry Tour fighting for their shot at a tour card. The new PGA Tour U now gives a fast pass to top-performing collegiate athletes. The PGA Tour is arguably the most competitive golf league and produces some of the best golfers from around the world. I wanted to look at this data to find the most important factors in success on the PGA Tour and to see how amateurs can use this information to improve in those areas.

Data

The data used for this analysis covers the PGA Tour from 2015 through 2018 and is organized by golfer for each season they played in that window. It includes key statistics such as total money earned, wins, and other performance indicators. The data dictionary below includes several “Strokes Gained” columns. Strokes Gained compares a player’s performance on each shot to the performance of other players on the same shot, with the goal of determining which parts of a player’s game are helping or hurting their overall score relative to the field. Data can be found here

Data Dictionary

Column              Meaning
Player Name         Player’s name
Rounds              Rounds played
Fairway Percentage  Percentage of fairways hit
Year                Year of PGA Tour season
Average Distance    Average distance of the player’s drives (yards)
gir                 Percentage of greens hit in regulation
Average Putts       Number of putts per round
Average Scrambling  Percentage of pars after a missed GIR
Average Score       Average strokes per round
Points              FedEx Cup points in the season
Wins                Number of wins in the season
Top 10              Number of top 10 finishes in the season
Average SG Putts    Average Strokes Gained: Putting
Average SG Total    Average Strokes Gained: Total
SG:OTT              Strokes Gained: Off the Tee
SG:APR              Strokes Gained: Approach
SG:ARG              Strokes Gained: Around the Green
Money               Money won during the season (USD)

      Avg Distance  Fairway Percentage  Average Putts  gir
mean  292.6473      60.97703            29.09233       65.89779
max   319.7000      76.88000            30.54000       73.52000

These summary statistics for PGA Tour golfers are vastly different from those of a typical casual golfer who plays on the occasional weekend. For example, a golfer who averages a bogey per hole drives the ball, on average, 80 yards shorter than those on tour.

Analysis

Which were the top 10 money-earning seasons across the four years?

Player and Year        Money (USD)  Wins  Top 10
Jordan Spieth 2015     12,030,465   4     14
Justin Thomas 2017     9,921,560    4     9
Jordan Spieth 2017     9,433,033    3     8
Jason Day 2015         9,403,330    3     8
Dustin Johnson 2016    9,365,185    2     12
Dustin Johnson 2017    8,732,193    3     7
Justin Thomas 2018     8,694,821    3     8
Dustin Johnson 2018    8,457,352    3     10
Hideki Matsuyama 2017  8,380,570    3     7
Justin Rose 2018       8,130,678    2     8

In this list of the top 10 single-season money totals, three golfers appear more than once. Dustin Johnson appears three times, and Justin Thomas and Jordan Spieth each appear twice. Jordan and Justin were 22 and 24, respectively, at the time of their top-performing seasons. Both of these golfers won the FedEx Cup that year, which has the largest purse of any event on tour. Dustin Johnson, although he was already in his early thirties at the time of his highest-earning season, was one of the top performers for three years straight.

Who has won the most money between 2015 and 2018?

Player Name       Money (USD)
Dustin Johnson    32,064,197
Jordan Spieth     29,795,504
Jason Day         25,514,084
Justin Thomas     25,021,311
Justin Rose       19,924,028
Brooks Koepka     19,600,737
Hideki Matsuyama  19,020,620
Rickie Fowler     18,805,427
Bubba Watson      17,386,516
Patrick Reed      17,331,519

Across the 2015 through 2018 PGA Tour seasons, there is a wide gap in prize money between the top-earning golfer and the tenth. In fact, Dustin Johnson won nearly double what the tenth-ranked golfer, Patrick Reed, won over that span. This data honestly surprised me, as I thought these totals would be larger over four seasons. Golf feels like a high-paying sport, yet in other sports there are professional athletes making well over $30 million a year. This difference in earnings is one of the leading arguments for golfers to leave the PGA Tour and join LIV Golf.
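As a rough sketch of how a career-window total like the one above can be built, the snippet below sums season earnings per player and ranks the result. It only uses the ten rows from the top-10 seasons table earlier in this article, so the sums are partial and will not match the four-season totals (for example, Dustin Johnson's 2015 season is missing). The function name `total_money_by_player` is my own and is not taken from the original analysis.

```python
from collections import defaultdict

# (player, year, money in USD) rows copied from the top-10 seasons table.
# Only seasons that made that table are included, so the sums are partial.
SEASONS = [
    ("Jordan Spieth", 2015, 12_030_465),
    ("Justin Thomas", 2017, 9_921_560),
    ("Jordan Spieth", 2017, 9_433_033),
    ("Jason Day", 2015, 9_403_330),
    ("Dustin Johnson", 2016, 9_365_185),
    ("Dustin Johnson", 2017, 8_732_193),
    ("Justin Thomas", 2018, 8_694_821),
    ("Dustin Johnson", 2018, 8_457_352),
    ("Hideki Matsuyama", 2017, 8_380_570),
    ("Justin Rose", 2018, 8_130_678),
]


def total_money_by_player(rows):
    """Sum money per player and return (player, total) pairs, highest first."""
    totals = defaultdict(int)
    for player, _year, money in rows:
        totals[player] += money
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


ranking = total_money_by_player(SEASONS)
for player, money in ranking:
    print(f"{player:<18}{money:>12,}")
```

With the full season-level dataset, the same grouping step would reproduce the four-season ranking shown in the table.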

How does average driving distance change each year?

The 2015 and 2016 seasons had very similar driving data, with an average distance around 290 yards. In 2017 the distribution shifted right, with a slight increase in average driving distance off the tee. In 2018 the increase becomes obvious, at almost 10 yards. That year, manufacturers rolled out new drivers, with Callaway, TaylorMade, and Ping topping the rankings. These drivers included major changes that increase forgiveness, meaning the distance retained on drives that are not hit on the center of the face. Adding distance on mishits reduces the number of drives that drag down a golfer’s average.

How does average driving distance affect fairway percentage?

Many amateur and casual golfers are caught up in trying to hit their drives farther, yet there is another metric that may have a bigger impact on scoring than distance. Fairway percentage is the percentage of holes where your drive lands in the fairway rather than the rough. By increasing this number, you will have a better lie and fewer obstacles while trying to hit the green. Although increasing your driver distance can be beneficial, you run the risk of losing accuracy. Looking at the graph, there is a distinct negative relationship between driving distance and fairway percentage. As driving distance increases, golfers on tour tend to find the fairway less often.
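One way to put a number on a relationship like this is the Pearson correlation coefficient, where a value near -1 indicates a strong negative linear relationship. The sketch below is not the method used to produce the graph; it is a minimal standard-library illustration, and the four data points are hypothetical values I made up to show the calculation, not rows from the PGA Tour dataset.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / math.sqrt(spread_x * spread_y)


# Hypothetical illustration values, NOT taken from the dataset.
average_distance = [280.0, 290.0, 300.0, 310.0]  # yards
fairway_percentage = [68.0, 64.0, 61.0, 56.0]    # percent of fairways hit

r = pearson_r(average_distance, fairway_percentage)
print(f"Pearson r = {r:.3f}")
```

Running the same function on the Average Distance and Fairway Percentage columns would give a single figure to go with the scatter plot.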

Does GIR or FW% have more of an effect on scoring?

Although so much importance is placed on driving and getting off the tee, approach shots into the green are also an important aspect of the game and have a large impact on scoring. A green in regulation (GIR) is recorded when the ball reaches the putting green in at least two strokes fewer than par. For example, hitting the green in 2 strokes on a par 4 would be a GIR. GIR% is the percentage of holes on which a GIR was recorded. Looking at the two graphs, it is apparent that a higher GIR% is more closely tied to better scoring than a high fairway percentage is.
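The GIR rule is simple enough to write down directly. The sketch below encodes the definition (on the green in par minus two strokes or fewer) and computes GIR% for a handful of holes. The function names and the six sample holes are hypothetical and only there to illustrate the rule.

```python
def is_gir(par, strokes_to_reach_green):
    """True when the green is reached in (par - 2) strokes or fewer."""
    return strokes_to_reach_green <= par - 2


def gir_percentage(holes):
    """GIR% for a list of (par, strokes_to_reach_green) pairs."""
    if not holes:
        raise ValueError("need at least one hole")
    greens_hit = sum(is_gir(par, strokes) for par, strokes in holes)
    return 100.0 * greens_hit / len(holes)


# Hypothetical holes: (par, strokes needed to reach the putting surface).
sample_holes = [(4, 2), (4, 3), (3, 1), (5, 3), (5, 4), (4, 2)]

print(is_gir(4, 2))  # on a par 4 in two strokes: a GIR
print(f"GIR% = {gir_percentage(sample_holes):.1f}")
```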

Does Total Strokes Gained affect money won?

Strokes Gained is a metric that measures a golfer’s performance compared to the rest of the field. It can be applied to separate areas of the game such as Off the Tee, Approach, Putting, etc. It is a good metric for golfers to use to understand their areas of strength and areas for improvement.
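To make the idea concrete, here is a small sketch of one common way strokes gained is computed for a single shot: the field's expected strokes to hole out before the shot, minus the expected strokes after it, minus the one stroke taken. The baseline numbers below are hypothetical placeholders I chose for illustration; real baselines come from tour-wide shot data, and this is not necessarily how the columns in this dataset were produced.

```python
def strokes_gained(expected_before, expected_after, strokes_taken=1):
    """Strokes gained on a shot relative to the field baseline."""
    return expected_before - expected_after - strokes_taken


# One par-4 hole as (category, expected strokes before, expected strokes after).
# All expected values are hypothetical field averages, not real tour baselines.
hole = [
    ("Off the Tee", 4.1, 2.9),  # drive finds the fairway
    ("Approach", 2.9, 2.0),     # approach reaches the green
    ("Putting", 2.0, 1.0),      # first putt leaves a short one
    ("Putting", 1.0, 0.0),      # second putt is holed
]

by_category = {}
for category, before, after in hole:
    gained = strokes_gained(before, after)
    by_category[category] = by_category.get(category, 0.0) + gained

total = sum(by_category.values())
for category, gained in by_category.items():
    print(f"{category:<12}{gained:+.2f}")
print(f"{'Total':<12}{total:+.2f}")
```

In this made-up example the golfer gains on the drive, gives a little back on the approach, and breaks even putting, which is exactly the kind of breakdown that shows where a game is strong or weak.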

Drive for show, putt for dough?

What skills on the course translate to the most money won?

One of the oldest sayings in golf is “Drive for show, putt for dough.” But is it true? Using the Strokes Gained metric, we can see how different skills around the course translate to money won on tour. Of these three metrics, SG: Around the Green has the least impact on money won. Surprisingly, putting does not have as large an impact as expected, though money won still grows exponentially once a golfer moves into positive strokes gained. As PGA Tour golfers approach +1 strokes gained Off the Tee, the expected amount of money won in a season increases rapidly. This is a surprising result, as the saying puts such an emphasis on putting over driving.

Text Analysis

To go along with the data from the previous analysis, I was curious to look at the Wikipedia pages of those golfers to see if there are any trends to analyze. The text data was scraped from the complete Wikipedia pages of 294 golfers and analyzed for trends across the pages.

Which golfers have the most positive sentiment on their Wikipedia page?

Golfer              Positivity
Rory McIlroy        222
Jordan Spieth       155
Retief Goosen       131
Lee Westwood        124
Ernie Els           121
Dustin Johnson      116
Phil Mickelson      112
Luke Donald         109
Padraig Harrington  109
Tiger Woods         101

Rory McIlroy is the clear leader in positivity score, which is calculated by subtracting the number of negative words from the number of positive words on a golfer’s page. Rory is often touted as having the best swing in golf, and he has had a lot of success in his career. Tiger Woods has also found a lot of success, winning 82 tournaments and 15 majors. His 82 wins are tied for the most ever, a record that will likely not be broken for a long time. Tiger Woods’ Wikipedia page has 4,687 words, which is a little over 100 more than Rory McIlroy’s. Although Tiger has the highest word count, his positivity score is less than half of Rory’s. Tiger has had a long and successful career, yet his Wikipedia page’s score has likely taken a hit due to controversy and his life away from the golf course.
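The scoring rule described here (positive words minus negative words) can be sketched in a few lines. The word lists below are a tiny hypothetical lexicon I made up for illustration; the actual analysis would have used a full sentiment lexicon, which is not named in this article.

```python
import re

# Hypothetical mini-lexicon for illustration only.
POSITIVE_WORDS = {"won", "success", "best", "victory", "great"}
NEGATIVE_WORDS = {"lost", "injury", "controversy", "missed", "scandal"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def positivity_score(text):
    """Number of positive words minus number of negative words."""
    words = tokenize(text)
    positive = sum(word in POSITIVE_WORDS for word in words)
    negative = sum(word in NEGATIVE_WORDS for word in words)
    return positive - negative


sample = ("He won the event with his best round of the year, "
          "a great victory despite an early injury.")
print(positivity_score(sample))
```

Because the score is a raw difference rather than a rate, longer pages have more room to score higher or lower, which is worth keeping in mind when comparing golfers.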

What are the most common words across all golfer Wikipedia pages?

Looking at the graph, a lot of these words are not surprising to see, as Wikipedia covers golfers’ early and professional lives. One thing that was interesting to me in this graph is that “tour” is found in these Wikipedia pages almost twice as often as its counterpart “PGA.” I was really surprised that there was such a difference until I thought about how the word may be used in a sentence. For example, it was used in this context on Jordan Spieth’s Wikipedia page: “for leading the tour in scoring average.” It also appears in “Korn Ferry Tour” and in the names of other amateur or minor golf leagues.
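A small word-count sketch shows why “tour” can outnumber “PGA.” The sample sentence is hypothetical apart from the quoted Jordan Spieth phrase, and the stop-word list is a short stand-in I picked for this example rather than the one used in the analysis.

```python
import re
from collections import Counter

# Hypothetical, deliberately short stop-word list.
STOP_WORDS = {"the", "in", "for", "a", "an", "on", "and", "he", "his", "of", "to"}


def word_counts(text):
    """Count lowercase word tokens, skipping stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(word for word in words if word not in STOP_WORDS)


sample = ("He earned his PGA Tour card on the Korn Ferry Tour "
          "and won an award for leading the tour in scoring average.")

counts = word_counts(sample)
print(counts.most_common(3))
print(counts["tour"], counts["pga"])
```

Here “tour” is counted once alongside “PGA,” once in “Korn Ferry Tour,” and once on its own, so it ends up well ahead of “PGA” even in a single sentence.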

Does winning more money increase the word count of a golfer’s Wikipedia page?

This graph is really interesting to look at, as there is a positive relationship between money won and the word count of the respective Wikipedia page. One drawback of this graph is that it does not reflect the total money won by each golfer over their career, which would be a better basis for the correlation. For example, Tiger Woods and Ernie Els won a majority of their tournaments before the 2015-2018 window, so they have lower money totals in this period than golfers who were in their prime at the time.

Conclusion

In conclusion, the PGA Tour is a highly competitive and challenging league for professional golfers. The data analyzed in this article highlights important aspects of success on the PGA Tour, such as driving, strokes gained, and competitive finishes. Key takeaways include the dominance of three golfers in these four seasons: Dustin Johnson, Jordan Spieth, and Justin Thomas. Additionally, the data highlights the importance of distance and accuracy when driving, which can be complemented by other skills like putting. Overall, this analysis provides valuable insights for aspiring golfers and fans alike, as it can help them understand some of the data professional golfers look at when improving aspects of their game. Keep this up and you may have a Wikipedia page of your own one day!