Classification statistics with FFTs

Nathaniel Phillips

In this vignette, I'll go over the basics of classification statistics.

What is a classification algorithm?

A classification algorithm is any algorithm that classifies cases into one of multiple classes based on multiple cues. For example, in a breast cancer dataset, we might want to classify women (cases) as either having breast cancer (class 1) or not having breast cancer (class 2) based on measurements on breast masses (cues).
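
As a minimal sketch of this definition, the snippet below classifies two cases into one of two classes using two cues. The cue names (`thickness`, `cell_size`), the thresholds, and the two example cases are all made up for illustration; they are not taken from any real breast cancer dataset.

```python
from dataclasses import dataclass


@dataclass
class Case:
    # Hypothetical cues measured on a breast mass (illustrative names only)
    thickness: int
    cell_size: int


def classify(case: Case) -> str:
    """Toy rule that maps the cues of one case to one of two classes."""
    # The thresholds are arbitrary values chosen for the example
    if case.thickness > 5 and case.cell_size > 3:
        return "cancer"
    return "no cancer"


# Two invented cases, one on each side of the rule
cases = [Case(thickness=8, cell_size=6), Case(thickness=2, cell_size=1)]
decisions = [classify(c) for c in cases]
print(decisions)
```

Any function with this shape, cues in and a class label out, counts as a classification algorithm in the sense used here.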

How do you evaluate a classification algorithm?

You can evaluate a classification algorithm based on the number and type of errors it makes. What counts as an error? It depends on the data. We assume that every case in a dataset has a true classification value. For example, patient A might truly have breast cancer, while patient B might truly not have breast cancer. A perfect classification algorithm, one that makes no errors, will correctly classify patient A as having breast cancer and patient B as not having breast cancer.

               No Cancer   Yes Cancer
Decide "No"    Cell 1      Cell 2
Decide "Yes"   Cell 3      Cell 4
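
To make the four cells of the table above concrete, here is a small sketch that counts each decision/truth combination. The labels "hits", "misses", "false alarms", and "correct rejections" are standard signal-detection terms that I am assuming here, since the text does not name the cells, and the eight cases are invented for illustration.

```python
from collections import Counter


def confusion_counts(decisions, truths):
    """Count the four decision/truth combinations of a 2 x 2 table."""
    counts = Counter(zip(decisions, truths))
    return {
        "correct_rejections": counts[(False, False)],  # decide "No",  no cancer
        "misses": counts[(False, True)],               # decide "No",  yes cancer
        "false_alarms": counts[(True, False)],         # decide "Yes", no cancer
        "hits": counts[(True, True)],                  # decide "Yes", yes cancer
    }


# Made-up data: True means cancer (truth) or decide "Yes" (decision)
truths = [True, True, True, False, False, False, False, False]
decisions = [True, True, False, True, False, False, False, False]

table = confusion_counts(decisions, truths)
n_errors = table["misses"] + table["false_alarms"]
print(table)
print("errors:", n_errors)
```

The two off-diagonal cells (misses and false alarms) are the two types of error; their sum is the total number of errors.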

To understand this, consider 100 women, where 20 women

While all classification algorithms make classification decisions based on statistics, they differ both in how many cues they use and in how cue information is combined to make a decision.
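
The sketch below illustrates that difference with two toy rules applied to the same case. Both rules, the cue names (`a`, `b`, `c`), the weights, and the thresholds are hypothetical and chosen only for illustration; neither is meant as the implementation of any particular algorithm.

```python
def sequential_rule(cues):
    """Check cues one at a time and stop as soon as one of them decides."""
    if cues["a"] > 5:
        return True       # decided after looking at one cue
    if cues["b"] <= 2:
        return False      # decided after looking at two cues
    return cues["c"] > 4  # the last cue decides either way


def weighted_sum_rule(cues, weights, threshold):
    """Combine every cue into one score and compare it to a threshold."""
    score = sum(weights[name] * value for name, value in cues.items())
    return score > threshold


# One invented case with three cue values
cues = {"a": 3, "b": 4, "c": 6}
weights = {"a": 2, "b": 1, "c": 1}  # arbitrary integer weights

seq_decision = sequential_rule(cues)
sum_decision = weighted_sum_rule(cues, weights, threshold=18)
print(seq_decision, sum_decision)
```

The sequential rule may stop after a single cue, while the weighted sum always uses all of them, so the two rules can disagree on the same case.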