Workshop 1, Business Analytics for Decision Making

Author

Alberto Dorantes, Ph.D.

Published

February 4, 2024

Abstract

In this workshop we introduce business analytics. We start with a review of financial statements, financial ratios, percentage calculation methods, and then we practice bascis data analytic skills using real financial data

1 Introduction to Business Analytics

There is not a formal definition for Data Science. However, it can be seen as the intersection between statistics, mathematics, computer science and any knowledge domain that demands analysis of big volumes of data such as Banking, Finance, Economics, Marketing, Demographics, Genetics, Astronomy, etc. If the domain is a business discipline, then we can call it Business Analytics, or just Data Science for Business.

Data Science includes not only the data management - data collection, data merging and data preparation- but also data analysis tailored to understand a problem in a specific domain, and then be able to do predictions, and finally improve decision making in organizations. For data analysis we can use techniques based on a) statistical modeling, and also b) machine learning. Actually, machine learning is a combination of statistical, mathematical and artificial intelligence methods.

Focusing on the business domain, we can classify business analytics in the following:

Descriptive Analytics
Diagnostic Analytics
Predictive Analytics
Prescriptive Analytics

Descriptive analytics refers to analysis of historical business/economic data to better understand a current business situation. Diagnostic analytics refers to the ability of finding business insights - mainly business opportunities and threats- after analyzing present and past business data. Predictive analytics refers to the ability of creating possible future scenarios based on historical data and assumptions related to organizational variables. Prescriptive analytics refers to the ability of identifying/proposing a specific strategic business plan according to the previous descriptive, diagnostic and predictive analysis.

For descriptive analytics the main techniques are data management, statistical analysis and data visualization. For predictive analytics the main techniques are statistical modeling and machine learning.

Machine learning is a very dynamic and changing field. Machine learning refers to automatic algorithms that can learn from facts (data), identify patterns, and then provide a solution to a problem or come up with forecast of variables.

Machine learning is a combination of artificial intelligence techniques along with statistical and mathematical methods. The main purpose of machine learning is to process data and come up with models to a) understand patterns, and b) predict variables. Then, machine learning is a set of algorithms that receive data, select a model, train the model with the data in order to calibrate the model, and then execute the model and come up with insights and/or predictions.

The machine learning techniques we will focus in this course will be mostly statistic techniques applied to Marketing and Finance.

Successful data-driven companies apply Data Science to keep competitive in the market. Those organizations combine a variety of internal and external data sources in an analytics engine, translate the data into quantifiable and actionable insights to make effective decisions for the organization.

2 Review of basic business concepts

2.1 Understanding Financial Statements

In my note, Basics of Financial Statements, I explain in detail the key elements of the most essential financial statements, their interrelationships, and the primary financial insights they offer. The most relevant statements include:

Balance Sheet: Provides a snapshot of a firm’s financial position (its assets, liabilities and equity) at a specific point in time.
Income Statement: Reports a firm’s financial performance over a period, including revenues, expenses, and net income.
Statement of Cash Flows: Tracks cash inflows and outflows from operating, investing, and financing activities.
Statement of Shareholders’ Equity: Shows changes in equity accounts over time.

Understanding these financial statements is essential for business analytics and decision-making.

Let’s review the main elements of the income statement:

\[ S_t-COGS_t-SGAE_t-DEP_t =EBIT_t \]

Where:

$S_t$= Sales during the period t (from t-1 to t)

$COGS_t$= Cost of Good Sold during period t

$SGAE_t$= Sales and General Administrative Expenses during period t

$DEP_t$= Depreciation durint period t

$EBIT_t$= Earnings Before Interest and Taxes

From EBIT we estimate the final Net Income, which is the final economic wealth generated by the firm after all expenses:

\[ NI_t = EBIT_t-IE_t-T_t \]

Where:

$NI_t$ = Net income during period t

$IE_t$ = Interest Expenses during period t (also called financial expenses)

$T_t$ = Income Tax during period t

In some cases, depending on the industry and country, other income/expenses are considered after EBIT. For example, in the US, the income/expense that is not related to business operations are usually added after EBIT to calculate net income. In the excercise below we will add 2 variables that represent income or expenses that are not related to daily operations: otherincome and extraordinaryitems.

2.2 Financial ratios and their importance

Financial ratios are crucial for evaluating a company’s financial health and performance. They allow analysts to compare companies across industries and track performance over time. Key ratio categories include:

Liquidity Ratios: Assess a company’s ability to meet short-term obligations.
Profitability Ratios: Measure a company’s ability to generate profits from its operations.
Leverage Ratios: Evaluate the extent of a company’s financial leverage.
Efficiency Ratios: Indicate how effectively a company utilizes its assets.

A financial ratio is an indicator that usually measures something related to business performance. A financial ratio is the result of dividing 2 numbers (values of financial statement variables) to get a proportion.

In this workshop I will review a few financial ratios that we use in the exercise.

Return on Assets (ROA)

As its name suggest, ROA at period t measures how much net income a firm generates during the period t for $1.00 invested in total assets at the beginning of the period t (at t-1):

\[ ROA_t=\frac{NI_t}{TA_{t-1}} \]

Where

$NI_t$ = Net Income generated during the period t

$TA_{t-1}$= Total Assets at the beginnig of period t (or end of period t-1)

DuPont decomposition of ROA

In the decade of 1920, Donaldson Brown, executive at DuPont company, came up with the idea of decomposing important financial ratios such as ROA and ROE (Return on Equity) when DuPont was acquiring a stake in General Motors. That was the beginning of the DuPont analysis.

Let’s review the decomposition of ROA according to DuPont analysis:

If we do the following multiplication:

\[ ROA_t=\left(\frac{NI_t}{TA_{t-1}}\right)*\left(\frac{S_t}{S_t}\right) \] We do not modify the result since Sales divided by itself is one.

We can re-express ROA as:

\[ ROA_t=\left(\frac{NI_t}{S_t}\right)*\left(\frac{S_t}{TA_{t-1}}\right) \] These factors are: \[ ProfitMargin=\frac{NI_t}{S_t} \]

\[ AssetTurnOver=\frac{S_t}{TA{t-1}} \]

Then, ROA can be expressed as the multiplication of Profit Margin and Asset Turn Over.

Profit margin tells how much net income the firm gets for each $1.00 of sale. Asset turn over indicates how many times is total sales with respect to total assets. For example, if ATO=2 this means that the company sold the equivalent to 2 times its asset value.

This decomposition helps us understand whether profitability (Profit Margin) or efficiency (Asset Turn Over) is driving ROA!

Then, how does a firm can increase its ROA?

Improve profit margin, or
Improve asset turn over, or
Improve both!

2.3 Interpretation and Business Strategy

By analyzing DuPont ratios, we can infer strategic insights about firms, such as:

High profit margins indicating differentiation strategy
High asset turnover suggesting low-cost strategy

But, how can we judge whether 10% or 30% of profit margin is high, medium, or low? or whether an ATO greater than 1 is good or not. We need to compare these ratios of a firm against the corresponding average ratios of the firm’s industry.

In the exercise below we will calculate financial ratios for many real US firms, calculate the average for each industry so that we can have a good starting point to judge some values for financial ratios.

2.4 Review of percentages in business

The calculation and use of percentages is very common in our dainly live and the business world. Although most people have a good intuition about percentages and how they are calculated, it is easy to make mistakes when calculating business percentage measures.

There are two types of percentage measures we have to pay attention when calculating them:

Average of Percentages or Ratios
Percentage Growth measures

2.4.1 Average of Percentages / Ratios

Some people can easily make mistakes when calculating the average of percentages or ratios.

Here is an example:

Imagine that a firm only sells 2 products: A and B. The following table shows the total sales, cost and operating gross profit margin for these 2 products:

Product	Units Sold	Unit Price	Sales	Unit Cost (COGS)	Cost	Gross profit	Gross profit margin
A	10,000	$1.00	$10,000	$0.40	$4,000	$6,000	60%
B	1000	$90	$90,000	$81.00	$81,000	$9,000	10%
						Average=	?

If you report the gross profit margin for this company at the end of the year, how would you calculate it?

We can identify 3 methods:

Calculate the arithmetic average of gross profit margin of each product. In this case, the arithmetic average would be ( (60%+10%) / 2) = 35%
Calculate a weighted average of gross profit margin, assigning weights according to product value sales. In this case, product A sold only 10% of total sales, while product B sold 90%. Then, the weighted average would be: 0.10 (60%) + 0.90 (10%) = 15%
Calculate the weighted average as the total gross profit divided by total sales: ($15,000 / $100,000) = 15%

The last two methods end up with the same result. Which is the correct average gross profit margin for the firm?

Rule of thumb: When calculating average of percentages or ratios, DO NOT CALCULATE ARITHMETIC AVERAGE! You have to calculate WEIGHTED AVERAGE, and the easiest way is to SUM all values of the numerator divided by the SUM of all values of the denominator.

2.4.2 Percentage growth measures

When calculating the average percentage growth per period using historical percentage growth rates, it is easy to make the mistake of calculating the arithmetic average of these historical percentage growth rates.

Let’s do an example.

Imagine you have only 2 periods and you want to calculate the average return of an investment per period:

Returns over time
Period	Investment value (at the end of the period)	Simple period Return (R)
0 (today)	$100	NA
1	$50	-0.50
2	$75	+0.50

The simple period return is a percentage growth rate of the investment in the period.

Calculating the average simple return of this investment:

\[ \bar{R}=\frac{-0.5+0.5}{2}=0% \]

Then, the arithmetic (simple) average return is 0%, while I end up with $75, losing 25% of my initial investment ($100) over the first 2 periods. If I lost 25% of my initial investment over 2 periods, then the average return per period might be a midpoint between 0 and 25%. The accurate average return of an investment over time (multi-periods) is the “Geometric Mean” return.

The total return of the investment in the whole period -also called the holding-period return (HPR)- can be calculated as:

\[ HPR=\left(1+R_{1}\right)\left(1+R_{2}\right)...\left(1+R_{N}\right)-1 \]

Using the example, the HPR for this investment is:

\[ HPR=\left(1-0.50\right)\left(1+0.50\right)-1=0.75 - 1 = -0.25 \]

And the formula for the geometric average of returns will be:

\[ \bar{R_{g}}=\sqrt[N]{\left(1+R_{1}\right)\left(1+R_{2}\right)...\left(1+R_{N}\right)}-1 \]

Calculating the geometric average return for this investment:

\[ \bar{R_{g}}=\sqrt[2]{\left(1-0.5\right)\left(1+0.5\right)}-1= -0.13397 \]

Then, the right average return per year is about -13.4% and the HPR for the 2 years is -25%.

However, if we use continuously compounded returns (r) instead of arithmetic/simple returns (R), then the arithmetic mean of r is an accurate measure that can be converted to simple returns to get the geometric mean, which is the accurate mean return. Let’s do the same example using continuously compounded returns:

Continuously compounded returns
Period	Investment value (at the end)	Continuously compounded return (r)
0	$100	NA
1	$50	=log(50)-log(100)=-0.6931
2	$75	=log(75)-log(50)=+0.4054

If we sum the 2 cc returns (r) we get the correct holding-return for the 2 periods (but in continuously compounded):

\[ hpr = r_1+r_2 = -0.6931 + 0.4054 = -0.2877 \]

Finally, we convert from continuously compounded to simple returns by applying the exponential function and subtracting 1:

\[ HPR = e^{-0.2877}-1=-0.25 = -25\% \]

As illustrated, the continuously compounded returns are additive, where the simple historical returns are not additive - they are multiplicative.

We can get the average cc return with simple arithmetic mean:

\[ \bar{r}=\frac{r_1+r_2}{2}=-0.14385 \]

We convert from cc return to simple return by appying the exponential function minus 1:

\[ \bar{R}=e^{-0.14385}-1=-0.1339=-13.39\% \]

This is actually the same result we got with the geometric mean of returns!

It is very recommended to calculate continuously compounded returns (cc returns) and using cc returns instead of simple returns for data analysis, statistics and econometric models.

One way to calculate cc returns is by subtracting the natural log of the current investment value (at t) minus the natural log of the previous investment value (at t-1):

\[ r_{t}=log(InvestmentValue_{t})-log(InvestmentValue_{t-1}) \]

This is also called as the difference of the log value, which is a percentage measure of how much the investment changed from period t-1 to t.

We can also calculate cc returns as the log of the current investment value (at t) divided by the previous investment value (at t-1):

\[ r_{t}=log\left(\frac{InvestmentValue_{t}}{InvestmentValue_{t-1}}\right) \]

cc returns are usually represented by small r, while simple returns are represented by CAPITAL R.

But why do we use natural logarithm to calculate cc returns? First we need to remember what is a natural logarithm.

2.4.3 Reviewing the concept of natural logarithm

The natural logarithm of a number is the exponent that the number e (=2.71…) needs to be raised to get another number. For example, let’s name x=natural logarithm of a stock price p. Then:

\[ e^x = p \]

The way to get the value of x that satisfies this equality is actually getting the natural log of p:

\[ x = log_e(p) \]

Then, we have to remember that the natural logarithm is actually an exponent that you need to raise the number e to get a specific number.

The natural log is the logarithm of base $e$ (=2.71…). The number $e$ is an irrational number (it cannot be expressed as a division of 2 natural numbers), and it is also called the Euler constant. Leonard Euler (1707-1783) took the idea of the logarithm from the great mathematician Jacob Bernoulli, and discovered very astonishing features of the $e$ number. Euler is considered the most productive mathematician of all times. Some historians believe that Jacob Bernoulli discovered the number $e$ around 1690 when he was playing with calculations to know how an amount of money grows over time with an interest rate.

How the number $e$ is related to the grow of financial amounts over time?

Here is a simple example:

If I invest $100.00 with an annual interest rate of 50%, then the end balance of my investment at the end of the first year (at the beginning of year 2) will be:

\[ I_2=100*(1+0.50)^1 \]

If the interest rate is 100%, then I would get:

\[ I_2=100*(1+1)^1=200 \]

Then, the general formula to get the final amount of my investment at the beginning of year 2, for any interest rate R is:

\[ I_2=I_1*(1+R)^1 \]

The (1+R) is the growth factor of my investment.

In Finance, the investment amount is called principal. If the interests are calculated (compounded) each month instead of each year, then I would end up with a higher amount at the end of the year.

Monthly compounding means that a monthly interest rate is applied to the amount to get the interest of the month, which is added to the investment (principal). Then, for month 2 the principal will be higher than the initial investment. At the end of month 2 the interest will be calculated using the updated principal amount.

Putting in simple math terms, the final balance of an investment at the end of year 1 (beginning of year 2) when doing monthly compounding will be:

\[ I_2=I_1*\left(1+\frac{R}{N}\right)^{1*N} \]

For monthly compounding, N=12, so the monthly interest rate is equal to the annual interest rate R divided by N (R/N). Then, with an annual rate of 100% and monthly compounding (N=12):

\[ I_2=100*\left(1+\frac{1}{12}\right)^{1*12}=100*(2.613..) \]

In this case, the growth factor is $(1+1/12)^{12}$, which is approximately equal to 2.61.

Instead of compounding each month, if the compounding is every moment, then we are doing a continuously compounded rate.

If we do a continuously compounding for the previous example, then the growth factor for one year becomes the astonishing Euler constant e:

Let’s do an example for a compounding of each second (1 year has 31,536,000 seconds). The investment at the end of the year 1 (or at the beginning of year 2) will be:

\[ I_2=100*\left(1+\frac{1}{31536000}\right)^{1*31536000}=100*(2.718282..)\cong100*e^1 \]

Now we see that $e^1$ is the GROWTH FACTOR after 1 year if we do the compounding of the interests every moment with an interest rate of 100%!

We can generalize to any other annual interest rate R, so that $e^R$ is the growth factor for an annual nominal rate R when the interests are compounded every moment.

When compounding every instant, we use small r instead of R for the interest rate. Then, the growth factor will be: $e^r$

Then we can do a relationship between this growth rate and an effective equivalent rate:

\[ \left(1+EffectiveRate\right)=e^{r} \]

If we apply the natural logarithm to both sides of the equation:

\[ ln\left(1+EffectiveRate\right)=ln\left(e^r\right) \]

Since the natural logarithm function is the inverse of the exponential function, then:

\[ ln\left(1+EffectiveRate\right)=r \]

In the previous example with a nominal rate of 100%, when doing a continuously compounding, then the effective rate will be:

\[ \left(1+EffectiveRate\right)=e^{r}=2.7182 \]

\[ EffectiveRate=e^{r}-1 \] Doing the calculation of the effective rate for this example:

\[ EffectiveRate=e^{1}-1 = 2.7182.. - 1 = 1.7182 = 171.82\% \]

Then, when compounding every moment, starting with a nominal rate of 100% annual interest rate, the actual effective annual rate would be 171.82%!

3 Introduction to Data Analytics

Public firms are firms that issue shares to the public. In the US, you can get any historical financial statement of public US firms from [https://www.sec.gov/edgar/searchedgar/companysearch.html||the SEC EDGAR site]. This is the US Security and Exchange Commision (SEC), Company Filings Database. Among many other responsibilities, the SEC monitors and regulates the financial market (similar to the “Comisión Nacional Bancaria y de Valores” in Mexico).

In the SEC EDGAR (Electronic Data Gathering, Analysis, and Retrieval) database any body can find (for free) real financial statement and other corporate information of public US firms. For this exercise I created a dataset using Economatica, a leading financial software. An important data source of Economatica is SEC EDGAR.

I did some programming for data collection, data cleaning, and data management with financial statement variables of all US public firms from the NASDAQ and the New York exchange (around 3,600 firms). This information can be found in the SEC database in the 10-K reports, which are annual statements (Filings) of US firms.

Download the W1-template. This template has 2 Sheets:

• findata: it has real annual data of several financial statement variables for all US public from 2000 to 2023. This file is a dataset with a “panel data” structure, also called “long” format. This long format is required before we start creating Dashboards to summarize the data by different dimensions such as year, industry, firm, etc.

• usfirm: this is a catalog of all US public firms with general information such as company name, industry, status (active/cancelled), etc.

• data dictionary: dictionary that explains the meaning of each variable (column) of the findata dataset

The data is in thousands US dollars. I constructed this dataset following an “ETL” (Extraction, Transformation, Loading) process.

Your task is to use this integrated dataset to do the following Dashboards:

3.1 Dupont Dashboard

Create new columns in the Data Sheet for EBIT, TotalRevenue, Net Income, Shareholder Equity, and PreviousTotalAssets (at t-1). Consider the following:

EBIT (Earnings Before Interest and Taxes) = revenue - cogs - sgae - depreciation

totalRevenue = revenue + otherincome

netIncome = EBIT + otherincome + extraordinaryitems - finexp - incometax

previousTotalAssets = totalassets at the beginning of the year = totalassets in the previous year (at t-1)

The “OTHERINCOME” column represents all revenue different than operational Sales (for example, financial income, extraordinary sales, etc). Then, to to compute EBIT and those ratios related with it, consider ONLY revenue (Not OtherIncome) as Sales. To compute Net Income, ATO, and those ratios related with it, consider Sales + OtherIncome as Total Revenue. For the ratios that end up with invalid result, leave empty spaces (use the conditional function IFERROR or SI.ERROR to do this).

Create a Dashboard to see the following financial ratios BY YEAR:

Average ROA (AVGROA),

Average Profit Margin (AVGPM),

Average Turn Over (AVGATO),

Average Operating ROA (AVGROABIT),

Average Operating Profit Margin (AVGOPPM),

Average Operating Asset Turnover (AVGOPATO).

For the the Operating ratios (AVGROABIT AVGOPPM, AVGOPATO) use only revenue (without including Otherincome nor Extraordinaryitems) since other income is usually not part of daily regular operations of a company.

You have to include SLICERS (Segmentación de Datos) to select a) Active vs Cancelled firms, b) one or more company names, and c) one or more Industries.

Notes:

To compute ROA, ROABIT and ATO, use Total Assets at the beginning of the year, that is the same as Total Assets at the end of the previous year (use Total Assets at t-1), as it is stated in the Note about the basics of Financial Analysis.

For all average ratios, remember that you have to use calculated fields.

Remember, it’s usually NOT correct to calculate arithmetic averages for ratios or percentages!

In the same Dashboard design and create a graph that illustrate the behavior of ROABIT over the years

3.2 Financial Leverage Dashboard

This dashboard has to have the following ratios:

Average debt-to-equity ratio and Average debt-to-asset ratio per year. You have to include the same SLICERS you used for the DuPont Dashboard.

Before doing the Dashboards, you have to calculate some variables that you might need. You also have to add the firm name, status and industry as a new columns to the dataset using the “buscarv” (vlookup) function.

With these Dashboards you will be able to analyze the Dupont and financial leverage ratios by Industry for the US public firms over time. Now you can have a better benchmark for analyzing financial ratios for other companies.

Once you have your 2 Dashboards, copy and paste with values in the corresponding Sheets and you will receive feedback and grading of your results.

You have to submit your Excel file with all your work through Canvas