Worksheet 2: Introduction to Reading Affymetrix Microarray Data in R

Author

Fatma Sayed Ibrahim Abdelati

Learning Goals

By the end of this exercise, you should be able to:

  • Install and load Bioconductor packages.

  • Locate and list Affymetrix CEL files.

  • Read and explore CEL files in R.

  • Display and inspect the resulting AffyBatch object.


Step 1: Install and Load Required Packages

Install and load the Bioconductor packages needed for microarray analysis.

Hint: Use the BiocManager::install() and library() functions. Required packages: affy, affydata, hgu133acdf.

BiocManager::install("package_name")
library(package_name)

Tasks:

  • Install the required packages.

  • Which package contains example CEL files?

  • Load affy, affydata, and hgu133acdf.
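
The template above can be filled in as follows. This is only a sketch of one possible approach: it assumes an internet connection and a working CRAN mirror, and the variable names pkgs and missing_pkgs are introduced here for illustration. BiocManager itself comes from CRAN, so it is installed first if it is missing.

```r
# Install BiocManager from CRAN only if it is not already available.
if (!requireNamespace("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager")
}

# Packages named in this worksheet.
pkgs <- c("affy", "affydata", "hgu133acdf")

# Find which of them are not installed yet (hypothetical helper variable).
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!installed]

# Install only what is missing.
if (length(missing_pkgs) > 0) {
  BiocManager::install(missing_pkgs)
}

# Attach the packages for the current session.
library(affy)
library(affydata)
library(hgu133acdf)
```

Checking with requireNamespace() first avoids reinstalling packages every time the script is run.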


Step 2: Locate and List All CEL Files

Find the path where affydata stores its example CEL files. Then create a list of all CEL files in that folder.

Hint: Use system.file() and affy::list.celfiles().

system.file("celfiles", package = "package_name")
list.celfiles(path = your_path, full.names = TRUE)

Tasks:

  • Store the file path of the celfiles folder from affydata into a variable named path_affydata.

  • Print this variable to confirm the path.

  • Store the list of CEL files in a variable called cel_files.

  • Display the contents of cel_files.

  • How many CEL files are found?
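
One way these tasks might be approached is sketched below, assuming affy and affydata are already installed. Note that system.file() returns an empty string when the folder or package cannot be found, so printing the path is a useful check before listing files.

```r
library(affy)

# Folder inside the installed affydata package that holds the example CEL files.
path_affydata <- system.file("celfiles", package = "affydata")
print(path_affydata)  # "" here would mean the folder was not found

# Full paths of all CEL files in that folder.
cel_files <- list.celfiles(path = path_affydata, full.names = TRUE)
print(cel_files)

# Number of CEL files found.
length(cel_files)
```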


Step 3: Read the CEL Files into R

Read one CEL file to create an AffyBatch object.

Hint: Use affy::ReadAffy().

ReadAffy(filenames = your_list[index])

Tasks:

  • Read only the first CEL file from your list.

  • Store it in a variable called data_affy.

  • Print data_affy to see what it contains.
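
A minimal sketch of this step, assuming the packages from Step 1 are installed. The file list is rebuilt here so the snippet runs on its own. Printing the object may look up the chip's CDF annotation; that is presumably why hgu133acdf is among the required packages, although this is an inference and not something stated above.

```r
library(affy)
library(affydata)

# Rebuild the list of example CEL files.
cel_files <- list.celfiles(
  path = system.file("celfiles", package = "affydata"),
  full.names = TRUE
)

# Read only the first file; cel_files[1] is a single file path.
data_affy <- ReadAffy(filenames = cel_files[1])

# Printing shows a short summary of the object, not the raw values.
data_affy
```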


Step 4: Explore the AffyBatch Object

Examine what information is stored in the object and access the raw intensity values.

Hint: Use general R inspection functions. The raw data is stored inside the assayData slot. You can access it using the exprs() function.

class(data_affy)
slotNames(data_affy)
exprs(data_affy)
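
The matrix returned by exprs() has one row per probe and one column per array, so printing it in full floods the console. The sketch below shows one way to look at only its shape and first rows. It assumes the packages from Step 1 are installed; the variable name intensities is introduced here for illustration.

```r
library(affy)
library(affydata)

# Rebuild data_affy so the snippet runs on its own.
cel_files <- list.celfiles(
  path = system.file("celfiles", package = "affydata"),
  full.names = TRUE
)
data_affy <- ReadAffy(filenames = cel_files[1])

# Extract the raw intensity matrix (hypothetical variable name).
intensities <- exprs(data_affy)

dim(intensities)                 # rows = probes, columns = arrays
head(intensities)                # first few rows only
summary(as.vector(intensities))  # overall distribution of raw intensities
```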

Tasks:

  • What is the class of data_affy?

  • Which slots are available in the object?

  • Which slot contains the raw intensity values?

  • Use exprs(data_affy) to extract the intensity matrix.

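  • Compare the intensity ranges between arrays (this requires an object holding more than one array, for example one read from all files in cel_files). Are they similar or very different?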

  • Based on your observation, why might normalization be necessary before comparing samples?
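
The comparison between arrays can be sketched with a small helper that summarises each column of an intensity matrix. The helper name intensity_summary and the toy matrix are hypothetical and only illustrate the idea; the toy values are made up and are not real microarray data. The second part applies the same helper to all example CEL files, assuming affy and affydata are installed.

```r
# Hypothetical helper: min, median and max for each array (column)
# of a probes-by-arrays intensity matrix.
intensity_summary <- function(mat) {
  t(apply(mat, 2, function(x) {
    c(min = min(x), median = median(x), max = max(x))
  }))
}

# Toy matrix with made-up values (NOT real data): the second array
# is uniformly twice as bright as the first.
toy <- cbind(array_1 = c(50, 100, 200, 400),
             array_2 = c(100, 200, 400, 800))
intensity_summary(toy)

# Same summary on the real example data, if the packages are available.
if (requireNamespace("affy", quietly = TRUE) &&
    requireNamespace("affydata", quietly = TRUE)) {
  library(affy)
  cel_files <- list.celfiles(
    path = system.file("celfiles", package = "affydata"),
    full.names = TRUE
  )
  all_affy <- ReadAffy(filenames = cel_files)  # all arrays in one AffyBatch
  print(intensity_summary(exprs(all_affy)))
}
```

In the toy matrix the two arrays differ only by an overall scale factor, the kind of technical difference that normalization is meant to remove before samples are compared.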