Simple Linear Regression Assessment For Breast Cancer Synthetic Data

Isabelle Rennenberg



West Chester University

Table of Contents

  • Breast Cancer Data
  • All Contents of Data
  • Cancer Cell Mitoses
  • Numerical Variable Analysis
  • Plot of Mitoses and Cell Thickness
  • Regression Output
  • Regression Equation
  • Conclusions
  • Output Reference

Breast Cancer Data

  • This presentation uses synthetic data
  • The data set includes 11 variables (10 numeric, 1 categorical)
  • There are 600 observations in total
  • The variables used for this analysis were:
    • Mitoses (cell division) as the independent variable
    • Clump thickness (thickness of the layers of cells) as the dependent variable
  • The aim of this assessment was to determine whether there is a relationship between mitoses (number of cell divisions) and the thickness of cell clumps

All Contents of Data


Cancer Cell Mitoses



Numerical Variable Analysis

Variable          Mean
Mitoses           2.09
Cell Thickness    5.41

Plot of Mitoses and Cell Thickness



  • As the number of cell divisions increases, there are fewer cells with a small number of layers.

Regression Output



  • The p-value for mitoses was highly significant (p < 0.0001).
    • This indicates that cell division is related to clump thickness. Because the estimated slope is positive, more cell division is associated with thicker cell clumps.
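The p-value for a slope in simple linear regression comes from a t-test on the slope estimate. The sketch below shows that calculation in plain Python. The eight data points are hypothetical toy values chosen only for illustration (they are not the presentation's data), and the function name `slope_t_test` is also an assumption made for this example.

```python
import math

# Hypothetical toy values for illustration only; NOT the presentation's data.
mitoses = [1, 1, 2, 3, 4, 5, 8, 10]
thickness = [3, 4, 5, 5, 6, 7, 9, 10]


def slope_t_test(x, y):
    """Return (slope, intercept, t statistic) for a least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares, then the standard error of the slope
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    se_slope = math.sqrt(sse / (n - 2) / sxx)
    return slope, intercept, slope / se_slope


slope, intercept, t_stat = slope_t_test(mitoses, thickness)
print(f"slope={slope:.4f}, intercept={intercept:.4f}, t={t_stat:.2f}")
```

The t statistic is compared against a t distribution with n - 2 degrees of freedom to obtain the p-value. Assuming all 600 observations were used in the fit, that would be 598 degrees of freedom for this analysis.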

Regression Equation

  • \(y = 4.27157 + 0.54383\,X_{mitoses}\)
    • For each one-unit increase in mitoses, predicted clump thickness increases by about 0.54 units
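To make the interpretation of the slope concrete, here is a minimal sketch that evaluates the fitted equation at a few mitoses values. The coefficients are the ones reported in this presentation; the function name `predict_thickness` is hypothetical and used only for this illustration.

```python
# Coefficients as reported in the regression equation
INTERCEPT = 4.27157
SLOPE = 0.54383


def predict_thickness(mitoses: float) -> float:
    """Predicted clump thickness for a given mitoses value."""
    return INTERCEPT + SLOPE * mitoses


# Each one-unit step in mitoses adds SLOPE to the prediction
for m in (1, 2, 3):
    print(m, round(predict_thickness(m), 5))

step = predict_thickness(2) - predict_thickness(1)
print("change per unit of mitoses:", round(step, 5))
```

The change between consecutive predictions is always the slope (about 0.54), which is why the effect is additive rather than multiplicative.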

Conclusions

  • For the synthetic breast cancer data, a simple linear regression was performed.
  • The independent variable, mitoses, was evaluated to see whether an increase or decrease in cell division was associated with a change in clump thickness (the dependent variable)
  • A statistically significant relationship was found between mitoses and clump thickness (p < 0.0001)
    • Furthermore, increased cell division was associated with thicker cell clumps
    • Increased cell division and a greater number of cell layers are both associated with a higher likelihood that cells are not benign
  • The final regression equation was: \(y = 4.27157 + 0.54383\,X_{mitoses}\)
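
As a quick consistency check using only values reported in this presentation: a least-squares line passes through the point of means, so substituting the mean of mitoses (2.09) into the equation should give approximately the mean cell thickness (5.41). Both means are rounded to two decimals, so only approximate agreement is expected.

\[
4.27157 + 0.54383 \times 2.09 \approx 5.408 \approx 5.41
\]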

Output Reference
