D’Artagnan Capital Fund Final Analysis

Author

Conall Kellner

Reason For Analysis

As a finance major at Xavier University about to join the D’Artagnan Capital Fund (DCF), I wanted to know more about its choice of holdings. Because Xavier is a Catholic university, the fund should support Catholic teachings and morals in everything it does. Using this as the benchmark for the main portion of my analysis, I looked at how well the DCF’s holdings follow Catholic morals, focusing on the holdings’ climate scores and on the performance of higher-rated versus lower-rated companies. I then draw a conclusion about any bias toward companies with worse climate ratings and discuss whether the DCF should make changes to stay closer to Catholic teaching.

Another portion of the analysis covers sentiment: the emotional, positive, and negative words found in each company’s history. For this portion I primarily looked for bias in the way Wikipedia writes company histories. Bias would include older companies, or more profitable companies, having more positive words in their histories.

What is the D’Artagnan Capital Fund

The D’Artagnan Capital Fund is an actively-managed opportunities fund that focuses on investments in the large-cap and greater equity universe through a bottom-up valuation approach. Equities presented in the fund are researched extensively by sector analysts with the direction of portfolio managers who are responsible for their respective sectors. Through rigorous peer review of valuation models, research, and investment rationales, we seek to continuously outperform our benchmark, the S&P 500 Total Return index, on a risk-adjusted basis while remaining within our compliance by selecting the most mispriced equities in the universe that we can choose from.

The excerpt above was pulled directly from the DCF’s website. If you would like to investigate the fund further, here is the link to its home page: https://www.xavier.edu/capital-fund/.

The Data

The DCF currently has holdings in 41 publicly traded companies on the New York Stock Exchange (NYSE) and the Nasdaq. I gathered this data by using RStudio to scrape financial information about these 41 companies from Google Finance. Because stock prices and financial data change often, note that all of this data was taken on April 18th, 2024 at 10:30 am EST. The following is a list of all the fields I gathered from Google Finance, with a description of what each column contains.

Note that some values will show as null, NA, or missing. This does not necessarily mean that the company has no data for that column. For instance, some companies are shown with an NA company history simply because it was not listed on Google Finance. A null in the stock information, however, may mean that the field does not apply to that stock. For example, Tesla does not issue dividends and therefore has no dividend yield in this data.

If you would like to use the data for your own research, the following link can be used to download it. https://myxavier-my.sharepoint.com/:x:/g/personal/kellnerm3_xavier_edu/EdFnsRsdYdVDjj32a8NUKS8BpHrI2cCizU0DylmluDkEJQ?download=1.

Data Dictionary

(All prices are in USD.)

  • Company name - Name of the publicly traded company and/or stock.

  • Exchange - The public exchange that the stock is traded on, either NYSE or Nasdaq.

  • Climate Score - A score provided by CDP (formerly the Carbon Disclosure Project) that rates a company on its climate transparency and performance. Not all companies had a score available in Google Finance, so some are null. Here is a link to their website if you want more information: https://www.cdp.net/en/

  • CEO - The company’s current Chief Executive Officer.

  • Founded - The date the company was founded. Some dates include only the year, while others are specified down to the day.

  • Headquarters - The location of the company’s headquarters. Google Finance did not list a headquarters for some companies, so some values are null.

  • Stock Price - The stock price as of April 18th, 2024 at 10:30 am EST.

  • Previous close - The stock price at the previous day’s close.

  • Day range - The high and low of the stock price from the previous day (April 17th, 2024).

  • Year range - The high and low of the stock price from the previous year (2023).

  • Market capitalization - The total dollar value of the company’s shares outstanding, calculated as the share price times the total shares outstanding. Shares outstanding is not included in this data but can be found by dividing market capitalization by share price.

  • Average volume - The average number of shares traded each day over the past 30 days.

  • P/E ratio - The ratio of the current share price to the trailing-twelve-month earnings per share (EPS). This signals whether a stock is priced higher or lower, relative to its earnings, than other stocks.

  • Dividend Yield Percentage - The ratio of annual dividends to current share price that estimates the dividend return of stock.

  • Employees - The total number of employees who work for the company.

  • Company History - This is a paragraph that describes the history of the company provided by Google.

  • Website - This is the company’s website that can be used to further research the company or stock.

  • Page ID - The URL that was used to scrape each company’s page from Google Finance. Following the link will take you to that company’s Google Finance page.
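The relationships between several of these columns can be sketched in a few lines. The following Python is an illustration only, not the R code used to build the data set; the function names and the sample figures are hypothetical and are not values from the DCF data.

```python
def shares_outstanding(market_cap, share_price):
    """Recover shares outstanding from market cap and share price."""
    if share_price <= 0:
        raise ValueError("share price must be positive")
    return market_cap / share_price


def pe_ratio(share_price, trailing_eps):
    """Price divided by trailing-twelve-month earnings per share."""
    if trailing_eps == 0:
        raise ValueError("EPS of zero gives an undefined P/E ratio")
    return share_price / trailing_eps


def dividend_yield_pct(annual_dividend, share_price):
    """Annual dividend per share as a percentage of the share price."""
    return annual_dividend / share_price * 100


# Hypothetical company: $500 billion market cap, $250 share price,
# $10 trailing EPS, and a $5 annual dividend per share.
print(shares_outstanding(500_000_000_000, 250.0))
print(pe_ratio(250.0, 10.0))
print(dividend_yield_pct(5.0, 250.0))
```

A company that pays no dividend, such as the Tesla example above, would simply have no value to pass as the annual dividend, which is why its yield appears as null in the data.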

Primary Analysis

To reiterate, this analysis looks for any bias in performance measures based on a company’s climate score.

Initial Count of Climate Scores

First, to get a sense of how many companies fall within each climate score group, I will show a bar graph of the number of companies at each climate score.

Just from this graph, it is apparent that the DCF does not invest much in companies with lower climate scores: over 50% of its holdings are in companies with scores of A or B. This might seem like the end of my analysis, since I have found a simple answer to my question of how the DCF invests. However, I still want to investigate how these companies perform on average when grouped by climate score, to discover trends in the overall market.

Stock Price by Climate Scores

The following graph is a box plot that shows the median and spread of stock prices for each climate score. Stock price is a good measure of how a company is doing.

While the median stock price for each climate score is roughly the same, as shown by the solid black line, there is some variance in prices among the higher-rated climate scores. There are also two clear outliers on this graph: NVIDIA and Costco. NVIDIA has the highest stock price on the entire graph, with a B-rated climate score. Costco has the lowest climate rating of all the holdings but the second-highest stock price. It is concerning that the lowest-rated company has the second-highest stock price, since this could reflect a trend in the stock market as a whole of worse-rated companies performing better. Overall, though, this first graph suggests that higher-scored companies do perform better than lower-scored companies, based on this limited data set.

Market Capitalization by Climate Score

The graph below illustrates the average market capitalization grouped by climate score. Market capitalization is a good measure of the size of a company, so I use it to determine whether larger companies have worse climate scores.

Once again, the A-rated companies tend to be the largest. Costco is a large company, which makes it seem that D-rated companies also tend to be large, but without more D-rated companies to compute a true average, it is hard to tell. For this analysis, though, the A-rated companies do tend to be larger.

Trading Volume by Climate Score

Trading volume is the average number of shares of a stock traded per day over thirty days. This next graph shows how popular each climate score is in terms of trading. A higher average volume means a stock is traded more often, which indicates it is more popular. High trading volume can also be a negative sign, for example when a company makes a wrong move that leads shareholders to sell their stock. For this analysis, however, we will use it as a measure of a stock’s popularity, on the assumption that the more popular a company is, the better it must be performing, because more people want shares of its stock and few would invest in a poorly performing company.

This box plot further supports that higher rated companies perform better than lower rated ones. For transparency, four companies were excluded from this graph for having extremely high trading volumes that skewed the data drastically.

Company Age by Climate Score

Now that much of this portion of the analysis is complete, and it seems that higher-rated companies perform better (at least based on this sample), I wanted to find out whether there is any correlation between the age of a company and its climate score. To do this, climate score was converted to a numeric value: A = 1, B = 2, C = 3, and D = 4. Any company that was not scored was removed.

A company’s age does seem to correlate with its climate score. The C-rated companies tend to be the oldest: the line slopes up as it approaches the C scores and reaches its highest point at C.

Conclusion of my Climate Score Analysis

Based on the analysis above and my initial question about the DCF’s environmental awareness, it can be concluded that the DCF does select its holdings with the environment in mind. I am not in the DCF yet and do not know all of its requirements for choosing a holding. From an outside perspective, climate score does not seem to be a requirement, but the fund does seem to keep it in mind. Again, being at a Catholic university, should the DCF add the requirement of investing in companies with a climate score above a B to remain consistent with Catholic teaching? I believe it should, and based on this analysis, its holdings largely seem to reflect that already.

Some trends in the analysis above seem to point toward lower-scored companies performing better. Further analysis with a larger sample size could show whether D-rated companies do perform better, but since there was only one D-rated company in this data set, it is impossible to draw a conclusion about that. What can be concluded is that the DCF’s holdings tend to be higher-scored companies, and that these perform better than the lower-scored companies in the fund.

Secondary Analysis

After scraping Google Finance for information on the companies that the D’Artagnan Capital Fund (DCF) holds, I wanted to investigate the company histories given by Google Finance, which it cites as taken from Wikipedia. Since Wikipedia can be edited by the public, I was curious whether this causes any bias in these company histories. I looked for bias using the following three methods of analysis.

  1. Are there any emotional or positive/negative words that are used more heavily in certain company histories than in others? If so, why?

  2. What is the basic word count of each company’s history, and is there any correlation between the age of a company and the length of its history?

  3. Is there a correlation between the positivity score and the age of a company, that is, do companies’ histories have higher or lower positivity scores depending on their age?

The Lexicon I Used

To begin any analysis of this unstructured text, I had to clean the text and choose a lexicon. I used the NRC lexicon, which has 13,872 observations (rows) and 2 variables (columns). Each row pairs a word (column 1) with a sentiment (column 2), which is one of: trust, fear, anger, anticipation, disgust, joy, sadness, surprise, negative, or positive. Much of this analysis was done by matching the words in each company history against the words in the lexicon.

Using the DCF holdings data from above, I split each company history into individual words and removed any words that are not worth analyzing (stop words), such as: the, and, is, could, would, etc.
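The cleaning step can be sketched roughly as below. This is an illustrative Python version rather than the R code used for the report; the stop-word list is a small hypothetical sample (a real analysis would use a much fuller list), and the example sentence is made up.

```python
import re

# Small illustrative stop-word list, not the list used in the analysis.
STOP_WORDS = {"the", "and", "is", "could", "would",
              "a", "an", "of", "in", "to", "for"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z']+", text.lower())


def remove_stop_words(words):
    """Keep only the words that are not in the stop-word list."""
    return [w for w in words if w not in STOP_WORDS]


# Made-up company history sentence.
history = "The company is a leader in the research of cancer and disease."
words = remove_stop_words(tokenize(history))
print(words)
```

What remains after this step is the list of words that can be compared against the lexicon.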

Analysis 1

For my first line of analysis, I made a new data frame by joining the words in the company histories with the NRC lexicon. Using this, I was able to see the count of each emotion, along with the positive and negative counts, for each company’s history. From those positive and negative counts I made a new variable called “positivity”, which subtracts the negative count from the positive count to get an overall positivity score. I used these scores to graph which companies have the highest positivity scores.

This gives us a brief look at the positivity score of each company. It’s clear that not all of the companies have the same positivity score, or even close to it; there is a lot of variability. Could this be counted as bias from Wikipedia? It’s possible, but if we read the actual company histories, it’s clear why we got this result. The NRC lexicon has over thirteen thousand entries, each assigning a word an emotion or a positive/negative rating. Take AbbVie, for example, a medical company searching for cures for many diseases and cancers. Because of this, the words “disease” and “cancer” show up often in its history. The NRC lexicon marks these words as negative, which makes AbbVie’s positivity score negative. The opposite is true for the highest-scored company, United Airlines Holdings. Its history talks a lot about its merger with Continental Airlines, which in turn brings in mentions of increased stock prices, the world’s largest airline, and other positive words and phrases. So this graph alone does not answer whether Wikipedia shows bias in its company histories.
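The join and the positivity calculation can be sketched as follows. This is an illustrative Python version, not the R code used for the report; the toy lexicon is a hypothetical stand-in for the NRC lexicon, and only the (word, sentiment) row layout is taken from the description above.

```python
from collections import Counter

# Hypothetical stand-in for the NRC lexicon: one (word, sentiment) row
# per pairing, so a word can appear on several rows.
TOY_LEXICON = [
    ("cancer", "negative"), ("cancer", "fear"), ("disease", "negative"),
    ("largest", "positive"), ("growth", "positive"), ("growth", "joy"),
]


def sentiment_counts(words, lexicon):
    """Inner-join words to lexicon rows and count each sentiment."""
    by_word = {}
    for word, sentiment in lexicon:
        by_word.setdefault(word, []).append(sentiment)
    counts = Counter()
    for word in words:
        # Words missing from the lexicon contribute nothing.
        counts.update(by_word.get(word, []))
    return counts


def positivity(counts):
    """Positive word count minus negative word count."""
    return counts["positive"] - counts["negative"]


# Made-up cleaned word list for one company history.
words = ["cancer", "disease", "growth", "cancer", "company"]
counts = sentiment_counts(words, TOY_LEXICON)
print(dict(counts), positivity(counts))
```

In this made-up example the negative words outnumber the positive ones, so the score is negative even though nothing in the text is critical of the company.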

Analysis 2

For my next line of analysis, I created a scatter plot that shows the correlation, if any, between a company’s age and its history’s word count. This tests whether Wikipedia shows any bias toward a certain age range in the amount of information given in a company’s history.

There is clearly not much correlation between age and word count. Most companies have a word count of around 50 or 100, with an almost equal number of older and younger companies at each. This analysis also did not show any bias by Wikipedia in terms of age.

Analysis 3

My final test for bias in Wikipedia’s company histories investigates whether a company’s age affects its positivity score. I did this by fitting a regression line and finding the confidence interval at each point. The gray portion of the graph below represents that confidence interval.

This graph shows that the average positivity score for older companies is slightly higher than that of younger companies. This means that Wikipedia may show some bias in terms of age and the positivity of a company’s history.

Conclusion of my Sentiment Analysis

Overall, Wikipedia does not seem to show a significant amount of bias in the writing of its company histories. While there may be some in terms of age and positivity scores, it is hard to tell. As noted in my primary analysis, this is a small sample of the stock market, and a larger, more diverse sample could show different trends. Based on my analysis, however, Wikipedia shows no significant bias.