Spreadsheets are ubiquitous in corporate finance, and they have many features well suited to financial analysis. However, when investment analysts use spreadsheets for purposes they were not designed for, problems follow. Analysts often press them into service as reporting tools, which they are not, and it is becoming increasingly common for analysts to build very elaborate and complicated spreadsheets to collect huge amounts of data over time.
Today, private equity firms are swimming in oceans of data. The incoming data is diverse – balance sheets, income statements, statements of cash flow, operating metrics, debt schedules, budgets vs. actuals, segment data, monthly/quarterly/annual data points, and so on. What firms are increasingly finding is that the costs associated with data analysis go far beyond the cost of the data itself.
The spreadsheet tools that analysts currently use were developed in simpler times, when data was often difficult (or expensive) to access. To manage portfolio companies' data efficiently and make that data ready for analysis, a number of steps need to be taken. These steps include, but are not limited to: (i) collecting and structuring the data, (ii) cleaning and standardizing the data, (iii) linking the data with other data sets, (iv) generating aggregations or new data attributes, (v) storing it in a database, and (vi) making it available for investment decisions.
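To make steps (i)–(vi) concrete, here is a minimal sketch in R using the tidyverse and DBI. The workbook name, sheet layout, and column names are hypothetical; they only illustrate the flow from a raw reporting pack to a queryable database, not a production pipeline.

```r
# Minimal sketch of steps (i)-(vi); file names, sheets, and columns are
# hypothetical stand-ins for a real portfolio-company reporting pack.
library(readxl)   # (i)  collect: read raw spreadsheet tabs
library(dplyr)    # (ii)-(iv) clean, link, aggregate
library(tidyr)
library(DBI)      # (v)  store in a database
library(RSQLite)

# (i) Collect and structure: one row per company, period, and line item
income <- read_excel("portfolio_pack.xlsx", sheet = "income_statement") |>
  pivot_longer(-c(company, period), names_to = "line_item", values_to = "value")

# (ii) Clean and standardize: consistent types and period labels
income <- income |>
  mutate(period = as.Date(period), value = as.numeric(value))

# (iii) Link with other data sets, e.g. a static table of segments
segments <- read_excel("portfolio_pack.xlsx", sheet = "segments")
income   <- left_join(income, segments, by = "company")

# (iv) Generate aggregations or new data attributes
ebitda_by_segment <- income |>
  filter(line_item %in% c("revenue", "ebitda")) |>
  group_by(segment, period, line_item) |>
  summarise(value = sum(value), .groups = "drop")

# (v) Store it in a database ...
con <- dbConnect(SQLite(), "portfolio.db")
dbWriteTable(con, "income_long", income, overwrite = TRUE)
dbWriteTable(con, "ebitda_by_segment", ebitda_by_segment, overwrite = TRUE)

# (vi) ... and make it available for investment decisions
dbGetQuery(con, "SELECT * FROM ebitda_by_segment LIMIT 5")
dbDisconnect(con)
```

The long ("tidy") layout chosen in step (i) is deliberate: once every figure sits in a company/period/line-item row, linking, aggregating, and storing the data become routine operations rather than bespoke spreadsheet work.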
I created this course partly in service of future investment analysts who may be interested in financial modeling programming. In class I use a simple prescription: think big, start small, and grow fast. Students start by collecting data from a portfolio of insurance companies and storing it in a powerful, easy-to-access format. Once they have collected all the data from spreadsheets and PDFs, they build, test, and iterate data-driven solutions (e.g., dynamic dashboards and hard-copy reports), then rapidly scale them to achieve maximum impact. The central goal is to teach students how to get financial data and formulations out of spreadsheets, however foreign or daunting that may have seemed at the outset.¹
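The spreadsheet side of that collection work is covered by the sketch above; for the PDF side, the hedged example below pulls a reported figure out of a filing with pdftools and stringr. The file name, page layout, and the "gross written premium" label are assumptions, and real filings usually need more careful parsing.

```r
# Hypothetical example: extract a premium figure from a PDF filing.
# The file name and page layout are assumptions; real filings differ.
library(pdftools)  # pdf_text() returns one character string per page
library(stringr)
library(tibble)

pages <- pdf_text("insurer_annual_report.pdf")

# Split the first page into lines and keep those mentioning premiums
lines <- str_split(pages[[1]], "\n")[[1]]
premium_lines <- str_subset(lines, regex("gross written premium", ignore_case = TRUE))

# Pull the first number on each matching line into a small tidy table
tibble(
  raw   = premium_lines,
  value = as.numeric(str_replace_all(str_extract(premium_lines, "[0-9][0-9,.]*"), ",", ""))
)
```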
Based on previous experience, there is strong demand for students who can apply the tidy data modeling techniques² covered in class. At a time when value creation matters more than ever in corporate deals, private equity firms have a significant opportunity to use data science to their advantage. Low-investment, high-ROI approaches can meet the objectives of limited holding periods and limited investment budgets. Digital levers based on data science can not only generate significant value during the holding period but also form a crucial part of the investment thesis for the subsequent buyer.
[1] Walkthrough: Business Description
[2] Walkthrough: Security Description
[3] Walkthrough: Capitalization
[4.1] Walkthrough: Valuation - Financial Performance
[4.2] Walkthrough: Valuation - Trading Levels
[4.3] Walkthrough: Valuation - M&A Comparables
[4.4] Walkthrough: Valuation - Public Comparables
[4.5] Walkthrough: Valuation - WACC DCF
It is becoming increasingly common for investment management firms to collect huge amounts of data over time. The incoming data is diverse – balance sheets, income statements, statements of cash flow, operating metrics, debt schedules, budgets vs. actuals, segment data, monthly/quarterly/annual data points, and so on. For private markets investing, the challenge right now is to manage portfolio companies' data efficiently and make that data ready for analysis; before a single insight can be derived from any operating company, a number of steps need to be taken. These steps include, but are not limited to: (i) collecting and structuring the data, (ii) cleaning and standardizing the data, (iii) linking the data with other data sets, (iv) generating aggregations or new data attributes, (v) storing it in a distributed database, and (vi) making it available for investment decisions.
For example, firms are likely to end up spending several times more on data integration than on collecting data from operating companies; building data feeders is therefore likely the least expensive part of decision analytics and the due diligence process.
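To close the loop on step (vi), here is a small, hedged sketch of pulling the stored data back into R for a dashboard or report. The database file and table names follow the earlier sketch and are assumptions, not a prescribed schema.

```r
# Hedged sketch of step (vi): query the stored data back for analysis.
# "portfolio.db" and "ebitda_by_segment" follow the earlier sketch.
library(DBI)
library(RSQLite)
library(ggplot2)

con <- dbConnect(SQLite(), "portfolio.db")

# Pull one aggregated series back out of the database
ebitda <- dbGetQuery(
  con,
  "SELECT segment, period, value FROM ebitda_by_segment WHERE line_item = 'ebitda'"
)
dbDisconnect(con)

# RSQLite stores Date columns as days since 1970-01-01; convert back
ebitda$period <- as.Date(ebitda$period, origin = "1970-01-01")

# A quick chart that could feed a dynamic dashboard or a hard-copy report
ggplot(ebitda, aes(period, value, colour = segment)) +
  geom_line() +
  labs(title = "EBITDA by segment", x = NULL, y = "EBITDA")
```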
I spent over 10 years of my career developing spreadsheet applications in the financial sector. Along the way, I published a very large collection of mathematical, statistical, financial, and engineering libraries for Microsoft Excel. Considerable effort went into making the backtesting libraries fast and memory efficient. These libraries were also used by multiple investment banks and asset managers to quickly collect fundamental data from a growing variety of online sources, helping them make better-informed investment decisions before the rest of the market caught on. Coupled with more in-depth analytics, including a very fast hash table with a tiny memory footprint, these tools let money managers capitalize on a new and increasingly essential source of information.↩︎
Tidy data principles provide the framework used for collecting, cleaning, organizing, and analyzing data with the R programming language. I always advise students interested in private markets investing to learn some coding and statistical learning. Learning to code in a language like R will open new avenues for their analytical work, vastly improve their productivity, and protect the future of their careers as the forces of automation and artificial intelligence rapidly advance in the corporate world.↩︎