I created a new folder called COAD and stored all 60 files in it. I then changed them to TXT files and opened them.
https://portal.gdc.cancer.gov/files/536f5a77-0087-457d-ac95-6d1a9abad8cb, UUID: 536f5a77-0087-457d-ac95-6d1a9abad8cb, case: TCGA-AA-3516
https://portal.gdc.cancer.gov/files/ed52de66-66fa-44ce-b679-cf641b0d92cd, UUID: ed52de66-66fa-44ce-b679-cf641b0d92cd, case: TCGA-AA-3516
https://portal.gdc.cancer.gov/files/b28090c5-c42d-4836-9bb1-ce906d3ead95, UUID: b28090c5-c42d-4836-9bb1-ce906d3ead95, case: TCGA-AA-3854
https://portal.gdc.cancer.gov/cases/57cdaa1c-4e94-4a28-ab3b-300c0457555f, UUID: 49e29c69-d9d7-4496-9f24-26f42c8b6d8e, case: TCGA-A6-2674
https://portal.gdc.cancer.gov/files/08ed32e4-fb94-4bc0-8715-83ee2143a13d, UUID: 08ed32e4-fb94-4bc0-8715-83ee2143a13d, case: TCGA-AA-A00J
https://portal.gdc.cancer.gov/files/6e571f71-d5fb-42f3-a35b-554c5ab76587, UUID: 6e571f71-d5fb-42f3-a35b-554c5ab76587, case: TCGA-AA-A01G
https://portal.gdc.cancer.gov/files/8b12a000-f588-4a78-a9eb-f06041a65789, UUID: 8b12a000-f588-4a78-a9eb-f06041a65789, case: TCGA-A6-6780
https://portal.gdc.cancer.gov/files/02734d4d-fc8f-4ef7-ac82-1b4d7184cc5e, UUID: 02734d4d-fc8f-4ef7-ac82-1b4d7184cc5e, case: TCGA-CK-4950
https://portal.gdc.cancer.gov/files/6466a8b1-d1e2-4195-a353-0800576c13c8, UUID: 6466a8b1-d1e2-4195-a353-0800576c13c8, case: TCGA-G4-6322
https://portal.gdc.cancer.gov/files/bc47f01c-1994-4ff8-a356-94d9679b66ee, UUID: bc47f01c-1994-4ff8-a356-94d9679b66ee, case: TCGA-AA-3947
https://portal.gdc.cancer.gov/files/b045ee79-82a6-4636-a875-1a58603d89ff, UUID: b045ee79-82a6-4636-a875-1a58603d89ff, case: TCGA-A6-A566
https://portal.gdc.cancer.gov/files/c383ba2c-b00a-4bd2-82cb-b3f04c2a8172, UUID: c383ba2c-b00a-4bd2-82cb-b3f04c2a8172, case: TCGA-AA-3877
https://portal.gdc.cancer.gov/files/b52775aa-273e-484e-82c7-c625f09415fa, UUID: b52775aa-273e-484e-82c7-c625f09415fa, case: TCGA-A6-3809
https://portal.gdc.cancer.gov/files/7b15a87a-805c-4b8a-84de-549cec9c44e3, UUID: 7b15a87a-805c-4b8a-84de-549cec9c44e3, case: TCGA-AA-3684
https://portal.gdc.cancer.gov/files/b4f3dbbb-2686-4896-9e60-5bef6c9150b4, UUID: b4f3dbbb-2686-4896-9e60-5bef6c9150b4, case: TCGA-AA-3692
https://portal.gdc.cancer.gov/files/0b16e2bd-3ec7-4901-9ff0-a389670e5019, UUID: 0b16e2bd-3ec7-4901-9ff0-a389670e5019, case: TCGA-D5-6534
https://portal.gdc.cancer.gov/files/a6690007-f347-49c3-a0ba-28e01d131971, UUID: a6690007-f347-49c3-a0ba-28e01d131971, case: TCGA-A6-3809
https://portal.gdc.cancer.gov/files/a1742cf6-c3c5-43e7-879c-489494460e78, UUID: a1742cf6-c3c5-43e7-879c-489494460e78, case: TCGA-AA-A00N
https://portal.gdc.cancer.gov/files/d5be795d-beb6-4def-bda8-f485ee45bfc1, UUID: d5be795d-beb6-4def-bda8-f485ee45bfc1, case: TCGA-A6-2674
https://portal.gdc.cancer.gov/files/46306072-c59c-4b4b-963c-9c4e778ff34b, UUID: 46306072-c59c-4b4b-963c-9c4e778ff34b, case: TCGA-A6-6780
https://portal.gdc.cancer.gov/files/a938cb2c-c8e8-4395-915b-37e1e279a4da, UUID: a938cb2c-c8e8-4395-915b-37e1e279a4da, case: TCGA-G4-6302
https://portal.gdc.cancer.gov/files/7fec7c90-fd2e-4ee2-ba1a-77f85920771f, UUID: 7fec7c90-fd2e-4ee2-ba1a-77f85920771f, case: TCGA-DM-A282
https://portal.gdc.cancer.gov/files/2c3fd34c-70d1-4331-9628-260b77329b53, UUID: 2c3fd34c-70d1-4331-9628-260b77329b53, case: TCGA-F4-6704
https://portal.gdc.cancer.gov/files/4168a720-521e-47ff-afb5-4abe3e815490, UUID: 4168a720-521e-47ff-afb5-4abe3e815490, case: TCGA-AA-3950
https://portal.gdc.cancer.gov/files/ecc90bd1-f594-41ea-ba4b-d42f4c64880b, UUID: ecc90bd1-f594-41ea-ba4b-d42f4c64880b, case: TCGA-A6-6781
https://portal.gdc.cancer.gov/files/8736ed27-2141-48d9-b677-b1a0e14d4b50, UUID: 8736ed27-2141-48d9-b677-b1a0e14d4b50, case: TCGA-CA-6717
https://portal.gdc.cancer.gov/files/3b8d04cd-d658-46ba-adca-079fee531e17, UUID: 3b8d04cd-d658-46ba-adca-079fee531e17, case: TCGA-AA-3821
https://portal.gdc.cancer.gov/files/b27da518-d023-4f9c-a9ab-5cd68ee37870, UUID: b27da518-d023-4f9c-a9ab-5cd68ee37870, case: TCGA-CK-4951
https://portal.gdc.cancer.gov/files/e7005df6-f78b-4e47-abe7-61ae6a2ee026, UUID: e7005df6-f78b-4e47-abe7-61ae6a2ee026, case: TCGA-AA-A01R
https://portal.gdc.cancer.gov/files/e3598d14-292c-41cc-9b59-4497fa078272, UUID: e3598d14-292c-41cc-9b59-4497fa078272, case: TCGA-D5-6930
https://portal.gdc.cancer.gov/files/f1185347-ad15-43ae-9ef3-d5343b31a0fc, UUID: f1185347-ad15-43ae-9ef3-d5343b31a0fc, case: TCGA-A6-6654
https://portal.gdc.cancer.gov/files/0d53cb1c-97c4-4088-9e43-029de88fd66d, UUID: 0d53cb1c-97c4-4088-9e43-029de88fd66d, case: TCGA-DM-A1D4
https://portal.gdc.cancer.gov/files/a74bbce0-7f3d-434e-b294-7fa45e5b3a60, UUID: a74bbce0-7f3d-434e-b294-7fa45e5b3a60, case: TCGA-A6-2684
https://portal.gdc.cancer.gov/files/47554e4e-cd13-4b92-80be-e1940f9a950f, UUID: 47554e4e-cd13-4b92-80be-e1940f9a950f, case: TCGA-A6-5657
https://portal.gdc.cancer.gov/files/de60dbd7-8a93-47a5-b1ea-a3f95beade8a, UUID: de60dbd7-8a93-47a5-b1ea-a3f95beade8a, case: TCGA-F4-6854
https://portal.gdc.cancer.gov/files/70883b31-d130-4efd-a7c6-169c8d4a253d, UUID: 70883b31-d130-4efd-a7c6-169c8d4a253d, case: TCGA-AD-A5EJ
https://portal.gdc.cancer.gov/files/042bda3d-77aa-4522-8a97-c121711a760e, UUID: 042bda3d-77aa-4522-8a97-c121711a760e, case: TCGA-AG-3582
https://portal.gdc.cancer.gov/files/b6388e09-7ed5-4041-97bb-4427ba5571ba, UUID: b6388e09-7ed5-4041-97bb-4427ba5571ba, case: TCGA-AY-6197
https://portal.gdc.cancer.gov/files/54394c0b-6ae3-4b48-8e89-350ad5349611, UUID: 54394c0b-6ae3-4b48-8e89-350ad5349611, case: TCGA-AA-3554
https://portal.gdc.cancer.gov/files/f7e21d61-19b6-4e99-887f-463d4419628c, UUID: f7e21d61-19b6-4e99-887f-463d4419628c, case: TCGA-AG-4015
https://portal.gdc.cancer.gov/files/b4114885-38cd-4e8a-874b-b78da8d95e2c, UUID: b4114885-38cd-4e8a-874b-b78da8d95e2c, case: TCGA-CM-6171
https://portal.gdc.cancer.gov/files/f9fda40d-67e4-4cb9-859c-ddc2ea84b7e4, UUID: f9fda40d-67e4-4cb9-859c-ddc2ea84b7e4, case: TCGA-CM-6170
https://portal.gdc.cancer.gov/files/b4aebb2a-d0b8-43d8-bd1f-78af2065d8f9, UUID: b4aebb2a-d0b8-43d8-bd1f-78af2065d8f9, case: TCGA-AA-3846
https://portal.gdc.cancer.gov/files/6a750710-5ed9-4d24-b2bf-3a4e3211878f, UUID: 6a750710-5ed9-4d24-b2bf-3a4e3211878f, case: TCGA-CM-6677
https://portal.gdc.cancer.gov/files/93d1a78f-423e-4560-b4d3-ee4a89ac922b, UUID: 93d1a78f-423e-4560-b4d3-ee4a89ac922b, case: TCGA-RU-A8FL
https://portal.gdc.cancer.gov/files/2e632fd9-fa17-4290-9601-a5d462cf152c, UUID: 2e632fd9-fa17-4290-9601-a5d462cf152c, case: TCGA-AZ-4323
https://portal.gdc.cancer.gov/files/7239b026-2587-489d-81fe-7bc657b7523c, UUID: 7239b026-2587-489d-81fe-7bc657b7523c, case: TCGA-CM-6164
https://portal.gdc.cancer.gov/files/90e86a26-fffa-4c38-b2e0-bf0704ee3615, UUID: 90e86a26-fffa-4c38-b2e0-bf0704ee3615, case: TCGA-AZ-4315
https://portal.gdc.cancer.gov/files/9ff11fe0-037c-405e-95c3-dc4a15413db8, UUID: 9ff11fe0-037c-405e-95c3-dc4a15413db8, case: TCGA-G4-6311
https://portal.gdc.cancer.gov/files/b8eed826-6051-4358-9b3d-44d1553dd9ad, UUID: b8eed826-6051-4358-9b3d-44d1553dd9ad, case: TCGA-AA-3522
https://portal.gdc.cancer.gov/files/c172bc07-d4f0-41be-a558-49abc81065c2, UUID: c172bc07-d4f0-41be-a558-49abc81065c2, case: TCGA-AA-3667
https://portal.gdc.cancer.gov/files/260edc5e-1ca6-4b07-b96d-59594d03ac54, UUID: 260edc5e-1ca6-4b07-b96d-59594d03ac54, case: TCGA-AA-A00U
https://portal.gdc.cancer.gov/files/0c5c1a38-7e9c-4b43-810d-0761c3af49b1, UUID: 0c5c1a38-7e9c-4b43-810d-0761c3af49b1, case: TCGA-AA-3506
https://portal.gdc.cancer.gov/files/7024ba0c-be56-4907-9254-cdb2579e536e, UUID: 7024ba0c-be56-4907-9254-cdb2579e536e, case: TCGA-NH-A8F7
https://portal.gdc.cancer.gov/files/031cf2a5-74e0-4b5f-98bd-da60628c0854, UUID: 031cf2a5-74e0-4b5f-98bd-da60628c0854, case: TCGA-AA-3680
https://portal.gdc.cancer.gov/files/91991ecf-cc54-4110-8a4e-9236bf8aa072, UUID: 91991ecf-cc54-4110-8a4e-9236bf8aa072, case: TCGA-A6-4105
https://portal.gdc.cancer.gov/files/47aceec1-a01d-419f-9689-c46284c79bcb, UUID: 47aceec1-a01d-419f-9689-c46284c79bcb, case: TCGA-D5-6922
https://portal.gdc.cancer.gov/files/ce84c955-63db-473a-a6d7-0e3daad6efd4, UUID: ce84c955-63db-473a-a6d7-0e3daad6efd4, case: TCGA-AA-3524
https://portal.gdc.cancer.gov/files/a071fc45-61ea-4815-93bf-be34980e59ee, UUID: a071fc45-61ea-4815-93bf-be34980e59ee, case: TCGA-AA-3855
https://portal.gdc.cancer.gov/files/8b275144-b885-4fb0-af39-fea1e48a970a, UUID: 8b275144-b885-4fb0-af39-fea1e48a970a, case: TCGA-AA-A00Q
First rename the “.count” files to “.txt”, then unzip each one by opening it.
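The rename-and-unzip step could also be scripted rather than done by hand. The sketch below is only an illustration under assumptions: the folder `COAD_demo`, the file name `sample1.htseq.counts.gz`, and the `.counts.gz` extension are hypothetical stand-ins, and the toy file is created inside the example so that it runs on its own.

```r
# Sketch: decompress *.counts.gz files and save them as *.txt (hypothetical names).
demo_dir <- file.path(tempdir(), "COAD_demo")
dir.create(demo_dir, showWarnings = FALSE)

# Create one small gzipped file so the example is self-contained.
gz_path <- file.path(demo_dir, "sample1.htseq.counts.gz")
con <- gzfile(gz_path, "w")
writeLines(c("ENSG00000000005.5\t47", "ENSG00000000419.11\t1212"), con)
close(con)

# Decompress every .counts.gz file in the folder into a .txt file.
gz_files  <- list.files(demo_dir, pattern = "\\.counts\\.gz$", full.names = TRUE)
txt_files <- sub("\\.counts\\.gz$", ".txt", gz_files)
for (i in seq_along(gz_files)) {
  writeLines(readLines(gzfile(gz_files[i])), txt_files[i])
}
basename(txt_files)
```

Pointing `demo_dir` at the real download folder, and adjusting the pattern to the real extension, would process all 60 files in one pass.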
setwd('~/Desktop/COAD_Data/')
The working directory was changed to /Users/ashleynoriega/Desktop/COAD_Data inside a notebook chunk. The working directory will be reset when the chunk is finished running. Use the knitr root.dir option in the setup chunk to change the working directory for notebook chunks.
COAD_files <- c("9e8b528b-1172-4c07-a09b-ebb23cf2310c.htseq.txt", "bda1a9a4-a14f-4463-81d2-a4fcca65d6f1.htseq.txt",
"5697212f-b3fd-479f-84b0-ec0aae54534a.htseq.txt", "7f9a629b-12ed-48cc-8d5c-1c2f5db9cf1f.htseq.txt",
"15864159-be88-41c8-bdef-c2c5927cb1a1.htseq.txt", "649b19e1-96e2-4b55-951d-3b6ee9f4b91f.htseq.txt",
"86679663-dfc5-46ad-8cf9-c7954c4b339b.htseq.txt", "28004569-048d-4f8c-99aa-7a8c69a98dcc.htseq.txt", "911f6378-8a25-4570-9d3b-80f5b5bfc085.htseq.txt", "d5dca54e-d7e9-4328-b2ca-1d191a2b8b4c.htseq.txt", "f3895ae4-1228-49b3-9342-3c3b86cb5243.htseq.txt", "f590941d-19dc-427a-95b6-942c97ea8333.htseq.txt", "55aa6d16-3598-42ca-8844-0fe84739ef66.htseq.txt", "0e7094cf-4c79-43f4-8b72-9de259e5e18f.htseq.txt", "9a62fe1f-36ec-4e8e-b3d9-bdfc62f71905.htseq.txt", "d2587070-cb7d-440d-ae49-52f5077248e6.htseq.txt", "7800bdb2-aa8b-43e0-8e45-1b968872b34e.htseq.txt", "2bcd2efd-4fd6-40ee-86a4-867ae82711b0.htseq.txt", "424d8e5f-9fc6-470b-ad2c-b4447b0eb07e.htseq.txt", "934f9dc6-1260-4268-b022-870f1e37dd6f.htseq.txt", "0fa55c0e-6f8f-44a6-82fe-9a42495d3484.htseq.txt", "c8544a8a-4352-438d-94d4-3495af2e9a78.htseq.txt", "dade0b16-ecc3-43b3-b328-3819a8fc18c6.htseq.txt", "e875ae4e-4645-4e84-b0ff-9c9a694717a9.htseq.txt", "debd6982-7c27-42e8-b778-20afcc78a5f3.htseq.txt", "17c88994-9e8e-4f16-8c41-34e98a0d8c52.htseq.txt", "7f5a924a-ddf3-45ff-be1f-5b5909305f46.htseq.txt", "fa73bdce-67fb-42aa-883f-635f0e7bcdc6.htseq.txt", "abe20df7-6b97-4397-8864-881bac27e92c.htseq.txt", "62f84581-4c7d-4c8e-835c-9304bcec3106.htseq.txt", "3abbd2b5-04db-4fe0-8dd1-ea2b48caa4c1.htseq.txt", "087666cd-47ae-4f56-b947-d6aa1c25e8a7.htseq.txt",
"c14f98e2-8e9b-49f4-a244-3d06c6cb7126.htseq.txt", "13abc91e-fbfc-4c55-bf54-fbd134979ccc.htseq.txt",
"6ae2dd6c-2a39-411f-a1fc-11e0e6e82165.htseq.txt", "8f77f4f4-b184-40c7-8ab8-2f95b13620b5.htseq.txt",
"168e5cb2-7390-45ad-ad04-c9aa4416e950.htseq.txt", "0ed65bdf-cb92-47c1-8aeb-42518ce639b8.htseq.txt", "4e7c6811-88e4-4bb7-a88f-7491dfa6d072.htseq.txt", "7fb73a84-867a-4c28-aa02-93068efffb7b.htseq.txt", "b53f9a9d-b24d-410d-b3e9-f2a8bf22ca27.htseq.txt", "f7ce175f-763e-4a55-97e3-0381d889b0eb.htseq.txt", "f346f2d2-285c-455c-ba34-ea8eec3fa881.htseq.txt", "e53e1a83-1979-4e12-bbb7-79b37d0cfe03.htseq.txt", "a26d49db-2309-46a0-a3ed-275378d484e7.htseq.txt", "a3f88a5d-7169-465b-bb80-e5999590681c.htseq.txt", "c264fe3b-482b-44ec-83a4-73df565663ff.htseq.txt", "bd2dfab3-88a8-4673-ba36-3daf252d0b4d.htseq.txt", "7261b656-c79c-4581-a503-15b653e2b5d2.htseq.txt", "ee4dcccc-514b-4cc6-ae63-6ed3e7519a40.htseq.txt", "f596eabc-e39a-4e35-9fc6-edade04eb785.htseq.txt", "bf9c448b-bdc9-4f74-b13a-374e6add7939.htseq.txt", "564daa81-cfef-45b6-94a0-3249b2724d9b.htseq.txt", "82e00e45-734c-471f-ba97-79ec3b7e0baa.htseq.txt", "9c52ed00-325f-4664-8873-327bcaa5ea74.htseq.txt", "fabefb10-5546-4017-8ea1-29982a10fb3c.htseq.txt", "32a115cf-570f-4ad9-a123-8e1970062f51.htseq.txt", "05eef9f8-a246-403a-b0be-07d274b6f93a.htseq.txt", "5c18c6a8-9ad2-43a8-a3a0-83d8fc0cc257.htseq.txt", "43b292be-5d63-4523-a43f-666d20039208.htseq.txt")
read.delim(COAD_files[1], nrows = 60)
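As an alternative to typing all 60 file names, the vector could be built from the folder contents. This is a sketch with a hypothetical demo folder and toy file names. One caveat: `list.files()` returns names in alphabetical order, so any labels assigned later by position would have to follow that order rather than the order used in `COAD_files`.

```r
# Sketch: build the file vector from the folder instead of typing it.
# The demo folder and file names are hypothetical.
demo_dir <- file.path(tempdir(), "COAD_list_demo")
dir.create(demo_dir, showWarnings = FALSE)
invisible(file.create(file.path(demo_dir, c("b.htseq.txt", "a.htseq.txt", "notes.md"))))

found <- list.files(demo_dir, pattern = "\\.htseq\\.txt$")
found          # alphabetical order; files not matching the pattern are skipped
length(found)  # compare against the number of samples expected
```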
Use edgeR to combine the 60 text files into a single count matrix.
I spoke to Professor Craig on 12/4, and it is OK not to change the knitr root directory. I can just setwd() to the folder on my desktop, since the files were downloaded locally.
setwd('~/Desktop/COAD_Data/')
The working directory was changed to /Users/ashleynoriega/Desktop/COAD_Data inside a notebook chunk. The working directory will be reset when the chunk is finished running. Use the knitr root.dir option in the setup chunk to change the working directory for notebook chunks.
library(edgeR)
x <- readDGE(COAD_files, columns=c(1,2)) #joins my 60 files and creates a dataset
Meta tags detected: __no_feature, __ambiguous, __too_low_aQual, __not_aligned, __alignment_not_unique
class(x)
[1] "DGEList"
attr(,"package")
[1] "edgeR"
dim(x)
[1] 60487 60
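One thing that may be worth checking at this point is whether the count files have a header row. `readDGE()` passes extra arguments on to `read.delim()`, whose default is `header = TRUE`. If the files have no header (an assumption to confirm with `readLines(COAD_files[1], n = 3)`), the first gene of every file would be read as column names and dropped from the matrix. The toy example below uses made-up gene names to show the effect.

```r
# Sketch: effect of header = TRUE on a file with no header row (toy data).
tmp <- tempfile(fileext = ".txt")
writeLines(c("geneA\t10", "geneB\t20", "geneC\t30"), tmp)

with_header <- read.delim(tmp)                  # default header = TRUE
no_header   <- read.delim(tmp, header = FALSE)  # keeps every row

nrow(with_header)  # 2: the "geneA" line was used as column names
nrow(no_header)    # 3
```

If the real files turn out to be headerless, adding `header = FALSE` to the `readDGE()` call and comparing `dim(x)` before and after would show whether a row was being lost.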
names(x) #accessor function
[1] "samples" "counts"
str(x) #displays the structure of x compactly; an alternative to summary() that works well for the contents of lists
Formal class 'DGEList' [package "edgeR"] with 1 slot
..@ .Data:List of 2
.. ..$ :'data.frame': 60 obs. of 4 variables:
.. .. ..$ files : chr [1:60] "9e8b528b-1172-4c07-a09b-ebb23cf2310c.htseq.txt" "bda1a9a4-a14f-4463-81d2-a4fcca65d6f1.htseq.txt" "5697212f-b3fd-479f-84b0-ec0aae54534a.htseq.txt" "7f9a629b-12ed-48cc-8d5c-1c2f5db9cf1f.htseq.txt" ...
.. .. ..$ group : Factor w/ 1 level "1": 1 1 1 1 1 1 1 1 1 1 ...
.. .. ..$ lib.size : num [1:60] 9.02e+07 3.68e+07 4.30e+07 1.12e+08 3.63e+07 ...
.. .. ..$ norm.factors: num [1:60] 1 1 1 1 1 1 1 1 1 1 ...
.. ..$ : num [1:60487, 1:60] 47 1212 1176 121 166 ...
.. .. ..- attr(*, "dimnames")=List of 2
.. .. .. ..$ Tags : chr [1:60487] "ENSG00000000005.5" "ENSG00000000419.11" "ENSG00000000457.12" "ENSG00000000460.15" ...
.. .. .. ..$ Samples: chr [1:60] "9e8b528b-1172-4c07-a09b-ebb23cf2310c.htseq" "bda1a9a4-a14f-4463-81d2-a4fcca65d6f1.htseq" "5697212f-b3fd-479f-84b0-ec0aae54534a.htseq" "7f9a629b-12ed-48cc-8d5c-1c2f5db9cf1f.htseq" ...
x$samples
samplenames <- substring(colnames(x), 1, nchar(colnames(x))) #start = 1 and stop = full length, so the names are kept unchanged
samplenames
[1] "9e8b528b-1172-4c07-a09b-ebb23cf2310c.htseq" "bda1a9a4-a14f-4463-81d2-a4fcca65d6f1.htseq"
[3] "5697212f-b3fd-479f-84b0-ec0aae54534a.htseq" "7f9a629b-12ed-48cc-8d5c-1c2f5db9cf1f.htseq"
[5] "15864159-be88-41c8-bdef-c2c5927cb1a1.htseq" "649b19e1-96e2-4b55-951d-3b6ee9f4b91f.htseq"
[7] "86679663-dfc5-46ad-8cf9-c7954c4b339b.htseq" "28004569-048d-4f8c-99aa-7a8c69a98dcc.htseq"
[9] "911f6378-8a25-4570-9d3b-80f5b5bfc085.htseq" "d5dca54e-d7e9-4328-b2ca-1d191a2b8b4c.htseq"
[11] "f3895ae4-1228-49b3-9342-3c3b86cb5243.htseq" "f590941d-19dc-427a-95b6-942c97ea8333.htseq"
[13] "55aa6d16-3598-42ca-8844-0fe84739ef66.htseq" "0e7094cf-4c79-43f4-8b72-9de259e5e18f.htseq"
[15] "9a62fe1f-36ec-4e8e-b3d9-bdfc62f71905.htseq" "d2587070-cb7d-440d-ae49-52f5077248e6.htseq"
[17] "7800bdb2-aa8b-43e0-8e45-1b968872b34e.htseq" "2bcd2efd-4fd6-40ee-86a4-867ae82711b0.htseq"
[19] "424d8e5f-9fc6-470b-ad2c-b4447b0eb07e.htseq" "934f9dc6-1260-4268-b022-870f1e37dd6f.htseq"
[21] "0fa55c0e-6f8f-44a6-82fe-9a42495d3484.htseq" "c8544a8a-4352-438d-94d4-3495af2e9a78.htseq"
[23] "dade0b16-ecc3-43b3-b328-3819a8fc18c6.htseq" "e875ae4e-4645-4e84-b0ff-9c9a694717a9.htseq"
[25] "debd6982-7c27-42e8-b778-20afcc78a5f3.htseq" "17c88994-9e8e-4f16-8c41-34e98a0d8c52.htseq"
[27] "7f5a924a-ddf3-45ff-be1f-5b5909305f46.htseq" "fa73bdce-67fb-42aa-883f-635f0e7bcdc6.htseq"
[29] "abe20df7-6b97-4397-8864-881bac27e92c.htseq" "62f84581-4c7d-4c8e-835c-9304bcec3106.htseq"
[31] "3abbd2b5-04db-4fe0-8dd1-ea2b48caa4c1.htseq" "087666cd-47ae-4f56-b947-d6aa1c25e8a7.htseq"
[33] "c14f98e2-8e9b-49f4-a244-3d06c6cb7126.htseq" "13abc91e-fbfc-4c55-bf54-fbd134979ccc.htseq"
[35] "6ae2dd6c-2a39-411f-a1fc-11e0e6e82165.htseq" "8f77f4f4-b184-40c7-8ab8-2f95b13620b5.htseq"
[37] "168e5cb2-7390-45ad-ad04-c9aa4416e950.htseq" "0ed65bdf-cb92-47c1-8aeb-42518ce639b8.htseq"
[39] "4e7c6811-88e4-4bb7-a88f-7491dfa6d072.htseq" "7fb73a84-867a-4c28-aa02-93068efffb7b.htseq"
[41] "b53f9a9d-b24d-410d-b3e9-f2a8bf22ca27.htseq" "f7ce175f-763e-4a55-97e3-0381d889b0eb.htseq"
[43] "f346f2d2-285c-455c-ba34-ea8eec3fa881.htseq" "e53e1a83-1979-4e12-bbb7-79b37d0cfe03.htseq"
[45] "a26d49db-2309-46a0-a3ed-275378d484e7.htseq" "a3f88a5d-7169-465b-bb80-e5999590681c.htseq"
[47] "c264fe3b-482b-44ec-83a4-73df565663ff.htseq" "bd2dfab3-88a8-4673-ba36-3daf252d0b4d.htseq"
[49] "7261b656-c79c-4581-a503-15b653e2b5d2.htseq" "ee4dcccc-514b-4cc6-ae63-6ed3e7519a40.htseq"
[51] "f596eabc-e39a-4e35-9fc6-edade04eb785.htseq" "bf9c448b-bdc9-4f74-b13a-374e6add7939.htseq"
[53] "564daa81-cfef-45b6-94a0-3249b2724d9b.htseq" "82e00e45-734c-471f-ba97-79ec3b7e0baa.htseq"
[55] "9c52ed00-325f-4664-8873-327bcaa5ea74.htseq" "fabefb10-5546-4017-8ea1-29982a10fb3c.htseq"
[57] "32a115cf-570f-4ad9-a123-8e1970062f51.htseq" "05eef9f8-a246-403a-b0be-07d274b6f93a.htseq"
[59] "5c18c6a8-9ad2-43a8-a3a0-83d8fc0cc257.htseq" "43b292be-5d63-4523-a43f-666d20039208.htseq"
colnames(x) <- samplenames
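Because the `substring()` call above keeps every character, the sample names still end in ".htseq". If shorter labels are wanted, one option (a sketch, not part of the original analysis) is to strip that suffix with `sub()`, shown here on the first two names from the output above.

```r
# Sketch: drop the ".htseq" suffix so only the 36-character UUID remains.
samplenames_demo <- c("9e8b528b-1172-4c07-a09b-ebb23cf2310c.htseq",
                      "bda1a9a4-a14f-4463-81d2-a4fcca65d6f1.htseq")
short_names <- sub("\\.htseq$", "", samplenames_demo)
short_names
nchar(short_names)  # 36 characters each
```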
group <- as.factor(rep(c("CMS", "ADENOCARCINOMA"), each = 30)) #first 30 files are CMS, last 30 are ADENOCARCINOMA
x$samples$group <- group
x$samples
DF<-x$samples #for my own visualization purposes
{
  if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
  BiocManager::install("Homo.sapiens")
  library(Homo.sapiens)
  install.packages("gsubfn")
  library(gsubfn)
}
First install Homo.sapiens, then use a script to remove the version suffix (the decimal point and the digits after it) from all 60487 ENSEMBL gene IDs.
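The version-stripping step can be tried on a few IDs first. The sketch below applies the same regular expression used in this notebook to two gene IDs from the data and to one of the HTSeq meta tags. The meta tag has no version suffix, so it passes through unchanged.

```r
# Sketch: strip the ".<digits>" version suffix from ENSEMBL gene IDs.
ids <- c("ENSG00000000005.5", "ENSG00000000419.11", "__no_feature")
stripped <- gsub("\\.[0-9]*$", "", ids)
stripped
```

Since rows such as `__no_feature` are not genes, it may be worth checking whether they are still present among `rownames(x)` before any downstream analysis.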
library(Homo.sapiens)
#library(stringr)
library(gsubfn)
geneid <- rownames(x)
#geneid_test <- c("ENSG00000000005",
# "ENSG00000000419",
# "ENSG00000000457",
# "ENSG00000000938")
#geneid <- str_remove(geneid, "[.]") removes decimals only
geneid <- gsub("\\.[0-9]*$", "", geneid) #remove decimals and numbers after decimals
genes <- select(Homo.sapiens, keys=geneid, columns=c("SYMBOL", "TXCHROM"),
keytype="ENSEMBL")
'select()' returned 1:many mapping between keys and columns
head(genes)
genes <- genes[!duplicated(genes$ENSEMBL),]
x$genes <- genes
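The duplicate-removal step keeps the first annotation row for each ENSEMBL ID. A small check like the one below (toy IDs and symbols, all hypothetical) illustrates that behaviour and confirms that the annotation rows line up with the keys in the same order, which is what attaching them to the count matrix relies on.

```r
# Sketch: keep the first match per key and confirm the row order (toy data).
keys <- c("ENSG1", "ENSG2", "ENSG3")
genes_demo <- data.frame(ENSEMBL = c("ENSG1", "ENSG2", "ENSG2", "ENSG3"),
                         SYMBOL  = c("A", "B1", "B2", NA),
                         stringsAsFactors = FALSE)

genes_demo <- genes_demo[!duplicated(genes_demo$ENSEMBL), ]
stopifnot(identical(genes_demo$ENSEMBL, keys))  # one row per key, same order
genes_demo
```

The same `identical()` check could be run on the real objects by comparing `genes$ENSEMBL` with `geneid`.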
x
An object of class "DGEList"
$samples
55 more rows ...
$counts
Samples
Tags 9e8b528b-1172-4c07-a09b-ebb23cf2310c.htseq
ENSG00000000005.5 47
ENSG00000000419.11 1212
ENSG00000000457.12 1176
ENSG00000000460.15 121
ENSG00000000938.11 166
Samples
Tags bda1a9a4-a14f-4463-81d2-a4fcca65d6f1.htseq
ENSG00000000005.5 4
ENSG00000000419.11 710
ENSG00000000457.12 236
ENSG00000000460.15 211
ENSG00000000938.11 140
Samples
Tags 5697212f-b3fd-479f-84b0-ec0aae54534a.htseq
ENSG00000000005.5 2
ENSG00000000419.11 702
ENSG00000000457.12 552
ENSG00000000460.15 320
ENSG00000000938.11 93
Samples
Tags 7f9a629b-12ed-48cc-8d5c-1c2f5db9cf1f.htseq
ENSG00000000005.5 10
ENSG00000000419.11 432
ENSG00000000457.12 803
ENSG00000000460.15 605
ENSG00000000938.11 473
Samples
Tags 15864159-be88-41c8-bdef-c2c5927cb1a1.htseq
ENSG00000000005.5 3
ENSG00000000419.11 641
ENSG00000000457.12 311
ENSG00000000460.15 182
ENSG00000000938.11 130
Samples
Tags 649b19e1-96e2-4b55-951d-3b6ee9f4b91f.htseq
ENSG00000000005.5 14
ENSG00000000419.11 1151
ENSG00000000457.12 246
ENSG00000000460.15 202
ENSG00000000938.11 52
Samples
Tags 86679663-dfc5-46ad-8cf9-c7954c4b339b.htseq
ENSG00000000005.5 14
ENSG00000000419.11 3675
ENSG00000000457.12 1901
ENSG00000000460.15 1436
ENSG00000000938.11 862
Samples
Tags 28004569-048d-4f8c-99aa-7a8c69a98dcc.htseq
ENSG00000000005.5 37
ENSG00000000419.11 2278
ENSG00000000457.12 835
ENSG00000000460.15 697
ENSG00000000938.11 687
Samples
Tags 911f6378-8a25-4570-9d3b-80f5b5bfc085.htseq
ENSG00000000005.5 2
ENSG00000000419.11 1934
ENSG00000000457.12 745
ENSG00000000460.15 464
ENSG00000000938.11 138
Samples
Tags d5dca54e-d7e9-4328-b2ca-1d191a2b8b4c.htseq
ENSG00000000005.5 3
ENSG00000000419.11 707
ENSG00000000457.12 366
ENSG00000000460.15 206
ENSG00000000938.11 80
Samples
Tags f3895ae4-1228-49b3-9342-3c3b86cb5243.htseq
ENSG00000000005.5 10
ENSG00000000419.11 1282
ENSG00000000457.12 624
ENSG00000000460.15 267
ENSG00000000938.11 979
Samples
Tags f590941d-19dc-427a-95b6-942c97ea8333.htseq
ENSG00000000005.5 0
ENSG00000000419.11 727
ENSG00000000457.12 356
ENSG00000000460.15 240
ENSG00000000938.11 378
Samples
Tags 55aa6d16-3598-42ca-8844-0fe84739ef66.htseq
ENSG00000000005.5 1
ENSG00000000419.11 2949
ENSG00000000457.12 892
ENSG00000000460.15 823
ENSG00000000938.11 389
Samples
Tags 0e7094cf-4c79-43f4-8b72-9de259e5e18f.htseq
ENSG00000000005.5 10
ENSG00000000419.11 219
ENSG00000000457.12 95
ENSG00000000460.15 106
ENSG00000000938.11 320
Samples
Tags 9a62fe1f-36ec-4e8e-b3d9-bdfc62f71905.htseq
ENSG00000000005.5 7
ENSG00000000419.11 1503
ENSG00000000457.12 566
ENSG00000000460.15 389
ENSG00000000938.11 235
Samples
Tags d2587070-cb7d-440d-ae49-52f5077248e6.htseq
ENSG00000000005.5 124
ENSG00000000419.11 2070
ENSG00000000457.12 886
ENSG00000000460.15 283
ENSG00000000938.11 2117
Samples
Tags 7800bdb2-aa8b-43e0-8e45-1b968872b34e.htseq
ENSG00000000005.5 2
ENSG00000000419.11 604
ENSG00000000457.12 215
ENSG00000000460.15 255
ENSG00000000938.11 228
Samples
Tags 2bcd2efd-4fd6-40ee-86a4-867ae82711b0.htseq
ENSG00000000005.5 6
ENSG00000000419.11 518
ENSG00000000457.12 215
ENSG00000000460.15 119
ENSG00000000938.11 159
Samples
Tags 424d8e5f-9fc6-470b-ad2c-b4447b0eb07e.htseq
ENSG00000000005.5 5
ENSG00000000419.11 2300
ENSG00000000457.12 1445
ENSG00000000460.15 831
ENSG00000000938.11 2183
Samples
Tags 934f9dc6-1260-4268-b022-870f1e37dd6f.htseq
ENSG00000000005.5 11
ENSG00000000419.11 627
ENSG00000000457.12 518
ENSG00000000460.15 401
ENSG00000000938.11 227
Samples
Tags 0fa55c0e-6f8f-44a6-82fe-9a42495d3484.htseq
ENSG00000000005.5 2
ENSG00000000419.11 1012
ENSG00000000457.12 468
ENSG00000000460.15 187
ENSG00000000938.11 534
Samples
Tags c8544a8a-4352-438d-94d4-3495af2e9a78.htseq
ENSG00000000005.5 433
ENSG00000000419.11 3532
ENSG00000000457.12 771
ENSG00000000460.15 449
ENSG00000000938.11 99
Samples
Tags dade0b16-ecc3-43b3-b328-3819a8fc18c6.htseq
ENSG00000000005.5 6
ENSG00000000419.11 3445
ENSG00000000457.12 840
ENSG00000000460.15 523
ENSG00000000938.11 891
Samples
Tags e875ae4e-4645-4e84-b0ff-9c9a694717a9.htseq
ENSG00000000005.5 0
ENSG00000000419.11 757
ENSG00000000457.12 234
ENSG00000000460.15 232
ENSG00000000938.11 333
Samples
Tags debd6982-7c27-42e8-b778-20afcc78a5f3.htseq
ENSG00000000005.5 0
ENSG00000000419.11 1519
ENSG00000000457.12 869
ENSG00000000460.15 317
ENSG00000000938.11 526
Samples
Tags 17c88994-9e8e-4f16-8c41-34e98a0d8c52.htseq
ENSG00000000005.5 11
ENSG00000000419.11 1875
ENSG00000000457.12 650
ENSG00000000460.15 325
ENSG00000000938.11 742
Samples
Tags 7f5a924a-ddf3-45ff-be1f-5b5909305f46.htseq
ENSG00000000005.5 11
ENSG00000000419.11 757
ENSG00000000457.12 332
ENSG00000000460.15 237
ENSG00000000938.11 139
Samples
Tags fa73bdce-67fb-42aa-883f-635f0e7bcdc6.htseq
ENSG00000000005.5 8
ENSG00000000419.11 959
ENSG00000000457.12 307
ENSG00000000460.15 246
ENSG00000000938.11 324
Samples
Tags abe20df7-6b97-4397-8864-881bac27e92c.htseq
ENSG00000000005.5 3
ENSG00000000419.11 392
ENSG00000000457.12 341
ENSG00000000460.15 335
ENSG00000000938.11 342
Samples
Tags 62f84581-4c7d-4c8e-835c-9304bcec3106.htseq
ENSG00000000005.5 1
ENSG00000000419.11 1144
ENSG00000000457.12 358
ENSG00000000460.15 320
ENSG00000000938.11 199
Samples
Tags 3abbd2b5-04db-4fe0-8dd1-ea2b48caa4c1.htseq
ENSG00000000005.5 1
ENSG00000000419.11 2901
ENSG00000000457.12 731
ENSG00000000460.15 494
ENSG00000000938.11 1845
Samples
Tags 087666cd-47ae-4f56-b947-d6aa1c25e8a7.htseq
ENSG00000000005.5 36
ENSG00000000419.11 3725
ENSG00000000457.12 1188
ENSG00000000460.15 741
ENSG00000000938.11 89
Samples
Tags c14f98e2-8e9b-49f4-a244-3d06c6cb7126.htseq
ENSG00000000005.5 2
ENSG00000000419.11 378
ENSG00000000457.12 171
ENSG00000000460.15 230
ENSG00000000938.11 440
Samples
Tags 13abc91e-fbfc-4c55-bf54-fbd134979ccc.htseq
ENSG00000000005.5 100
ENSG00000000419.11 2292
ENSG00000000457.12 831
ENSG00000000460.15 874
ENSG00000000938.11 489
Samples
Tags 6ae2dd6c-2a39-411f-a1fc-11e0e6e82165.htseq
ENSG00000000005.5 31
ENSG00000000419.11 4884
ENSG00000000457.12 765
ENSG00000000460.15 628
ENSG00000000938.11 284
Samples
Tags 8f77f4f4-b184-40c7-8ab8-2f95b13620b5.htseq
ENSG00000000005.5 4
ENSG00000000419.11 1593
ENSG00000000457.12 575
ENSG00000000460.15 368
ENSG00000000938.11 376
Samples
Tags 168e5cb2-7390-45ad-ad04-c9aa4416e950.htseq
ENSG00000000005.5 76
ENSG00000000419.11 1247
ENSG00000000457.12 274
ENSG00000000460.15 239
ENSG00000000938.11 158
Samples
Tags 0ed65bdf-cb92-47c1-8aeb-42518ce639b8.htseq
ENSG00000000005.5 3
ENSG00000000419.11 1853
ENSG00000000457.12 673
ENSG00000000460.15 437
ENSG00000000938.11 271
Samples
Tags 4e7c6811-88e4-4bb7-a88f-7491dfa6d072.htseq
ENSG00000000005.5 1
ENSG00000000419.11 506
ENSG00000000457.12 270
ENSG00000000460.15 184
ENSG00000000938.11 918
Samples
Tags 7fb73a84-867a-4c28-aa02-93068efffb7b.htseq
ENSG00000000005.5 19
ENSG00000000419.11 1464
ENSG00000000457.12 271
ENSG00000000460.15 303
ENSG00000000938.11 93
Samples
Tags b53f9a9d-b24d-410d-b3e9-f2a8bf22ca27.htseq
ENSG00000000005.5 1
ENSG00000000419.11 1331
ENSG00000000457.12 743
ENSG00000000460.15 422
ENSG00000000938.11 437
Samples
Tags f7ce175f-763e-4a55-97e3-0381d889b0eb.htseq
ENSG00000000005.5 72
ENSG00000000419.11 4749
ENSG00000000457.12 877
ENSG00000000460.15 536
ENSG00000000938.11 446
Samples
Tags f346f2d2-285c-455c-ba34-ea8eec3fa881.htseq
ENSG00000000005.5 7
ENSG00000000419.11 1954
ENSG00000000457.12 422
ENSG00000000460.15 283
ENSG00000000938.11 97
Samples
Tags e53e1a83-1979-4e12-bbb7-79b37d0cfe03.htseq
ENSG00000000005.5 252
ENSG00000000419.11 2538
ENSG00000000457.12 581
ENSG00000000460.15 521
ENSG00000000938.11 208
Samples
Tags a26d49db-2309-46a0-a3ed-275378d484e7.htseq
ENSG00000000005.5 26
ENSG00000000419.11 3001
ENSG00000000457.12 875
ENSG00000000460.15 462
ENSG00000000938.11 39
Samples
Tags a3f88a5d-7169-465b-bb80-e5999590681c.htseq
ENSG00000000005.5 4
ENSG00000000419.11 2746
ENSG00000000457.12 732
ENSG00000000460.15 542
ENSG00000000938.11 888
Samples
Tags c264fe3b-482b-44ec-83a4-73df565663ff.htseq
ENSG00000000005.5 104
ENSG00000000419.11 5777
ENSG00000000457.12 684
ENSG00000000460.15 634
ENSG00000000938.11 304
Samples
Tags bd2dfab3-88a8-4673-ba36-3daf252d0b4d.htseq
ENSG00000000005.5 10
ENSG00000000419.11 980
ENSG00000000457.12 193
ENSG00000000460.15 309
ENSG00000000938.11 76
Samples
Tags 7261b656-c79c-4581-a503-15b653e2b5d2.htseq
ENSG00000000005.5 2
ENSG00000000419.11 2981
ENSG00000000457.12 658
ENSG00000000460.15 845
ENSG00000000938.11 459
Samples
Tags ee4dcccc-514b-4cc6-ae63-6ed3e7519a40.htseq
ENSG00000000005.5 78
ENSG00000000419.11 1846
ENSG00000000457.12 1368
ENSG00000000460.15 415
ENSG00000000938.11 283
Samples
Tags f596eabc-e39a-4e35-9fc6-edade04eb785.htseq
ENSG00000000005.5 12
ENSG00000000419.11 478
ENSG00000000457.12 98
ENSG00000000460.15 95
ENSG00000000938.11 112
Samples
Tags bf9c448b-bdc9-4f74-b13a-374e6add7939.htseq
ENSG00000000005.5 4
ENSG00000000419.11 850
ENSG00000000457.12 277
ENSG00000000460.15 315
ENSG00000000938.11 67
Samples
Tags 564daa81-cfef-45b6-94a0-3249b2724d9b.htseq
ENSG00000000005.5 19
ENSG00000000419.11 202
ENSG00000000457.12 117
ENSG00000000460.15 61
ENSG00000000938.11 91
Samples
Tags 82e00e45-734c-471f-ba97-79ec3b7e0baa.htseq
ENSG00000000005.5 26
ENSG00000000419.11 5155
ENSG00000000457.12 728
ENSG00000000460.15 626
ENSG00000000938.11 46
Samples
Tags 9c52ed00-325f-4664-8873-327bcaa5ea74.htseq
ENSG00000000005.5 24
ENSG00000000419.11 557
ENSG00000000457.12 454
ENSG00000000460.15 175
ENSG00000000938.11 70
Samples
Tags fabefb10-5546-4017-8ea1-29982a10fb3c.htseq
ENSG00000000005.5 15
ENSG00000000419.11 4147
ENSG00000000457.12 679
ENSG00000000460.15 764
ENSG00000000938.11 477
Samples
Tags 32a115cf-570f-4ad9-a123-8e1970062f51.htseq
ENSG00000000005.5 37
ENSG00000000419.11 2843
ENSG00000000457.12 1259
ENSG00000000460.15 869
ENSG00000000938.11 438
Samples
Tags 05eef9f8-a246-403a-b0be-07d274b6f93a.htseq
ENSG00000000005.5 42
ENSG00000000419.11 844
ENSG00000000457.12 137
ENSG00000000460.15 133
ENSG00000000938.11 24
Samples
Tags 5c18c6a8-9ad2-43a8-a3a0-83d8fc0cc257.htseq
ENSG00000000005.5 179
ENSG00000000419.11 1307
ENSG00000000457.12 571
ENSG00000000460.15 307
ENSG00000000938.11 136
Samples
Tags 43b292be-5d63-4523-a43f-666d20039208.htseq
ENSG00000000005.5 140
ENSG00000000419.11 1101
ENSG00000000457.12 407
ENSG00000000460.15 191
ENSG00000000938.11 85
60482 more rows ...
$genes
60482 more rows ...
cpm <- cpm(x)
lcpm <- cpm(x, log=TRUE)
L <- mean(x$samples$lib.size) * 1e-6
M <- median(x$samples$lib.size) * 1e-6
c(L, M)
[1] 64.23804 58.76902
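To make the CPM and log-CPM values easier to interpret, the sketch below computes CPM by hand on a toy matrix (made-up counts) and then approximates the smallest possible log-CPM value from the mean library size. It assumes edgeR's default prior count of 2. edgeR's own calculation also adjusts the library sizes slightly, so treat the result as an approximation rather than the exact formula.

```r
# Sketch: CPM by hand on toy counts (2 genes x 2 samples).
counts_demo <- matrix(c(10, 0, 90, 200), nrow = 2,
                      dimnames = list(c("geneA", "geneB"), c("s1", "s2")))
lib_size <- colSums(counts_demo)
cpm_demo <- t(t(counts_demo) / lib_size) * 1e6
colSums(cpm_demo)  # each column sums to one million

# Approximate log-CPM of a zero count, assuming prior count 2.
L_millions <- 64.23804             # mean library size in millions, from the output above
approx_min <- log2(2 / L_millions)
round(approx_min, 4)               # -5.0054
```

This should agree with the minimum reported by `summary(lcpm)` for genes with a count of zero.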
summary(lcpm)
9e8b528b-1172-4c07-a09b-ebb23cf2310c.htseq bda1a9a4-a14f-4463-81d2-a4fcca65d6f1.htseq
Min. :-5.0054 Min. :-5.005
1st Qu.:-5.0054 1st Qu.:-5.005
Median :-5.0054 Median :-5.005
Mean :-2.5302 Mean :-2.573
3rd Qu.:-0.7721 3rd Qu.:-1.020
Max. :17.9542 Max. :18.478
5697212f-b3fd-479f-84b0-ec0aae54534a.htseq 7f9a629b-12ed-48cc-8d5c-1c2f5db9cf1f.htseq
Min. :-5.005 Min. :-5.0054
1st Qu.:-5.005 1st Qu.:-5.0054
Median :-5.005 Median :-3.4138
Mean :-2.548 Mean :-2.3687
3rd Qu.:-1.079 3rd Qu.:-0.6434
Max. :18.160 Max. :19.0973
15864159-be88-41c8-bdef-c2c5927cb1a1.htseq 649b19e1-96e2-4b55-951d-3b6ee9f4b91f.htseq
Min. :-5.005 Min. :-5.0054
1st Qu.:-5.005 1st Qu.:-5.0054
Median :-5.005 Median :-5.0054
Mean :-2.406 Mean :-2.4792
3rd Qu.:-0.591 3rd Qu.:-0.8821
Max. :18.390 Max. :18.3537
86679663-dfc5-46ad-8cf9-c7954c4b339b.htseq 28004569-048d-4f8c-99aa-7a8c69a98dcc.htseq
Min. :-5.0054 Min. :-5.0054
1st Qu.:-5.0054 1st Qu.:-5.0054
Median :-4.6634 Median :-4.6140
Mean :-2.4026 Mean :-2.4599
3rd Qu.:-0.7838 3rd Qu.:-0.7757
Max. :18.3240 Max. :18.0427
911f6378-8a25-4570-9d3b-80f5b5bfc085.htseq d5dca54e-d7e9-4328-b2ca-1d191a2b8b4c.htseq
Min. :-5.0054 Min. :-5.0054
1st Qu.:-5.0054 1st Qu.:-5.0054
Median :-4.5570 Median :-5.0054
Mean :-2.4067 Mean :-2.4745
3rd Qu.:-0.6097 3rd Qu.:-0.6094
Max. :17.9832 Max. :18.1525
f3895ae4-1228-49b3-9342-3c3b86cb5243.htseq f590941d-19dc-427a-95b6-942c97ea8333.htseq
Min. :-5.00536 Min. :-5.0054
1st Qu.:-5.00536 1st Qu.:-5.0054
Median :-4.29934 Median :-5.0054
Mean :-2.20443 Mean :-2.3872
3rd Qu.:-0.06656 3rd Qu.:-0.5109
Max. :17.62822 Max. :18.2714
55aa6d16-3598-42ca-8844-0fe84739ef66.htseq 0e7094cf-4c79-43f4-8b72-9de259e5e18f.htseq
Min. :-5.005 Min. :-5.0054
1st Qu.:-5.005 1st Qu.:-5.0054
Median :-4.694 Median :-5.0054
Mean :-2.678 Mean :-2.4202
3rd Qu.:-1.498 3rd Qu.:-0.3006
Max. :19.004 Max. :18.0102
9a62fe1f-36ec-4e8e-b3d9-bdfc62f71905.htseq d2587070-cb7d-440d-ae49-52f5077248e6.htseq
Min. :-5.0054 Min. :-5.00536
1st Qu.:-5.0054 1st Qu.:-5.00536
Median :-4.2625 Median :-4.55846
Mean :-2.3000 Mean :-2.21997
3rd Qu.:-0.2363 3rd Qu.:-0.02803
Max. :18.3430 Max. :17.69040
7800bdb2-aa8b-43e0-8e45-1b968872b34e.htseq 2bcd2efd-4fd6-40ee-86a4-867ae82711b0.htseq
Min. :-5.005 Min. :-5.0054
1st Qu.:-5.005 1st Qu.:-5.0054
Median :-4.644 Median :-5.0054
Mean :-2.912 Mean :-2.3752
3rd Qu.:-1.374 3rd Qu.:-0.3785
Max. :18.862 Max. :17.9383
424d8e5f-9fc6-470b-ad2c-b4447b0eb07e.htseq 934f9dc6-1260-4268-b022-870f1e37dd6f.htseq
Min. :-5.0054 Min. :-5.0054
1st Qu.:-5.0054 1st Qu.:-5.0054
Median :-4.1417 Median :-3.9604
Mean :-2.4586 Mean :-2.5190
3rd Qu.:-0.7395 3rd Qu.:-0.7787
Max. :19.0773 Max. :18.7200
0fa55c0e-6f8f-44a6-82fe-9a42495d3484.htseq c8544a8a-4352-438d-94d4-3495af2e9a78.htseq
Min. :-5.005 Min. :-5.0054
1st Qu.:-5.005 1st Qu.:-5.0054
Median :-5.005 Median :-4.5364
Mean :-2.310 Mean :-2.4309
3rd Qu.:-0.129 3rd Qu.:-0.6687
Max. :17.739 Max. :17.9586
dade0b16-ecc3-43b3-b328-3819a8fc18c6.htseq e875ae4e-4645-4e84-b0ff-9c9a694717a9.htseq
Min. :-5.0054 Min. :-5.0054
1st Qu.:-5.0054 1st Qu.:-5.0054
Median :-4.6905 Median :-5.0054
Mean :-2.3374 Mean :-2.5190
3rd Qu.:-0.2584 3rd Qu.:-0.8769
Max. :18.1117 Max. :18.2798
debd6982-7c27-42e8-b778-20afcc78a5f3.htseq 17c88994-9e8e-4f16-8c41-34e98a0d8c52.htseq
Min. :-5.005 Min. :-5.0054
1st Qu.:-5.005 1st Qu.:-5.0054
Median :-4.529 Median :-4.4684
Mean :-2.669 Mean :-2.2833
3rd Qu.:-1.269 3rd Qu.:-0.1722
Max. :19.271 Max. :18.0424
7f5a924a-ddf3-45ff-be1f-5b5909305f46.htseq fa73bdce-67fb-42aa-883f-635f0e7bcdc6.htseq
Min. :-5.005 Min. :-5.0054
1st Qu.:-5.005 1st Qu.:-5.0054
Median :-5.005 Median :-5.0054
Mean :-2.315 Mean :-2.4626
3rd Qu.:-0.347 3rd Qu.:-0.7878
Max. :18.253 Max. :18.5223
abe20df7-6b97-4397-8864-881bac27e92c.htseq 62f84581-4c7d-4c8e-835c-9304bcec3106.htseq
Min. :-5.0054 Min. :-5.0054
1st Qu.:-5.0054 1st Qu.:-5.0054
Median :-5.0054 Median :-5.0054
Mean :-2.4819 Mean :-2.4094
3rd Qu.:-0.6892 3rd Qu.:-0.6332
Max. :18.2755 Max. :18.1225
3abbd2b5-04db-4fe0-8dd1-ea2b48caa4c1.htseq 087666cd-47ae-4f56-b947-d6aa1c25e8a7.htseq
Min. :-5.0054 Min. :-5.0054
1st Qu.:-5.0054 1st Qu.:-5.0054
Median :-4.4363 Median :-4.5625
Mean :-2.3126 Mean :-2.4787
3rd Qu.:-0.3237 3rd Qu.:-0.9039
Max. :17.6880 Max. :18.0167
c14f98e2-8e9b-49f4-a244-3d06c6cb7126.htseq 13abc91e-fbfc-4c55-bf54-fbd134979ccc.htseq
Min. :-5.0054 Min. :-5.0054
1st Qu.:-5.0054 1st Qu.:-5.0054
Median :-5.0054 Median :-4.6001
Mean :-2.5007 Mean :-2.2739
3rd Qu.:-0.6715 3rd Qu.:-0.3063
Max. :18.2731 Max. :17.8027
6ae2dd6c-2a39-411f-a1fc-11e0e6e82165.htseq 8f77f4f4-b184-40c7-8ab8-2f95b13620b5.htseq
Min. :-5.0054 Min. :-5.0054
1st Qu.:-5.0054 1st Qu.:-5.0054
Median :-4.5434 Median :-4.4560
Mean :-2.3342 Mean :-2.4084
3rd Qu.:-0.4192 3rd Qu.:-0.7874
Max. :17.8651 Max. :17.8969
168e5cb2-7390-45ad-ad04-c9aa4416e950.htseq 0ed65bdf-cb92-47c1-8aeb-42518ce639b8.htseq
Min. :-5.0054 Min. :-5.0054
1st Qu.:-5.0054 1st Qu.:-5.0054
Median :-5.0054 Median :-4.5027
Mean :-2.4576 Mean :-2.3667
3rd Qu.:-0.6493 3rd Qu.:-0.5842
Max. :18.3474 Max. :17.8840
4e7c6811-88e4-4bb7-a88f-7491dfa6d072.htseq 7fb73a84-867a-4c28-aa02-93068efffb7b.htseq
Min. :-5.0054 Min. :-5.0054
1st Qu.:-5.0054 1st Qu.:-5.0054
Median :-5.0054 Median :-5.0054
Mean :-2.3788 Mean :-2.5250
3rd Qu.:-0.5019 3rd Qu.:-0.8667
Max. :18.2679 Max. :18.4535
b53f9a9d-b24d-410d-b3e9-f2a8bf22ca27.htseq f7ce175f-763e-4a55-97e3-0381d889b0eb.htseq
Min. :-5.0054 Min. :-5.0054
1st Qu.:-5.0054 1st Qu.:-5.0054
Median :-4.4286 Median :-4.5018
Mean :-2.3446 Mean :-2.3837
3rd Qu.:-0.5695 3rd Qu.:-0.5533
Max. :17.7747 Max. :18.0439
f346f2d2-285c-455c-ba34-ea8eec3fa881.htseq e53e1a83-1979-4e12-bbb7-79b37d0cfe03.htseq
Min. :-5.005 Min. :-5.005
1st Qu.:-5.005 1st Qu.:-5.005
Median :-5.005 Median :-5.005
Mean :-2.374 Mean :-2.494
3rd Qu.:-0.463 3rd Qu.:-0.882
Max. :18.269 Max. :18.606
a26d49db-2309-46a0-a3ed-275378d484e7.htseq a3f88a5d-7169-465b-bb80-e5999590681c.htseq
Min. :-5.005 Min. :-5.0054
1st Qu.:-5.005 1st Qu.:-5.0054
Median :-5.005 Median :-4.0762
Mean :-2.505 Mean :-1.9487
3rd Qu.:-1.076 3rd Qu.: 0.8739
Max. :17.950 Max. :18.1564
c264fe3b-482b-44ec-83a4-73df565663ff.htseq bd2dfab3-88a8-4673-ba36-3daf252d0b4d.htseq
Min. :-5.0054 Min. :-5.0054
1st Qu.:-5.0054 1st Qu.:-5.0054
Median :-4.5239 Median :-5.0054
Mean :-2.2732 Mean :-2.4576
3rd Qu.:-0.3071 3rd Qu.:-0.7238
Max. :17.8054 Max. :18.4775
7261b656-c79c-4581-a503-15b653e2b5d2.htseq ee4dcccc-514b-4cc6-ae63-6ed3e7519a40.htseq
Min. :-5.0054 Min. :-5.0054
1st Qu.:-5.0054 1st Qu.:-5.0054
Median :-5.0054 Median :-4.6205
Mean :-2.3392 Mean :-2.3583
3rd Qu.:-0.4668 3rd Qu.:-0.4248
Max. :17.7787 Max. :17.9569
f596eabc-e39a-4e35-9fc6-edade04eb785.htseq bf9c448b-bdc9-4f74-b13a-374e6add7939.htseq
Min. :-5.005 Min. :-5.0054
1st Qu.:-5.005 1st Qu.:-5.0054
Median :-5.005 Median :-5.0054
Mean :-2.635 Mean :-2.5172
3rd Qu.:-0.981 3rd Qu.:-0.9961
Max. :18.298 Max. :18.2876
564daa81-cfef-45b6-94a0-3249b2724d9b.htseq 82e00e45-734c-471f-ba97-79ec3b7e0baa.htseq
Min. :-5.0054 Min. :-5.0054
1st Qu.:-5.0054 1st Qu.:-5.0054
Median :-5.0054 Median :-4.5039
Mean :-2.5218 Mean :-2.4147
3rd Qu.:-0.6848 3rd Qu.:-0.7359
Max. :18.2312 Max. :17.8176
9c52ed00-325f-4664-8873-327bcaa5ea74.htseq fabefb10-5546-4017-8ea1-29982a10fb3c.htseq
Min. :-5.0054 Min. :-5.005
1st Qu.:-5.0054 1st Qu.:-5.005
Median :-5.0054 Median :-4.399
Mean :-2.3284 Mean :-2.282
3rd Qu.:-0.2542 3rd Qu.:-0.272
Max. :18.0456 Max. :17.898
32a115cf-570f-4ad9-a123-8e1970062f51.htseq 05eef9f8-a246-403a-b0be-07d274b6f93a.htseq
Min. :-5.00536 Min. :-5.005
1st Qu.:-5.00536 1st Qu.:-5.005
Median :-4.51407 Median :-5.005
Mean :-2.18955 Mean :-2.610
3rd Qu.:-0.05016 3rd Qu.:-1.036
Max. :17.82851 Max. :18.150
5c18c6a8-9ad2-43a8-a3a0-83d8fc0cc257.htseq 43b292be-5d63-4523-a43f-666d20039208.htseq
Min. :-5.0054 Min. :-5.0054
1st Qu.:-5.0054 1st Qu.:-5.0054
Median :-5.0054 Median :-5.0054
Mean :-2.4734 Mean :-2.4267
3rd Qu.:-0.6384 3rd Qu.:-0.6233
Max. :18.7800 Max. :18.3746
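TRUE is the number of genes with a zero count in exactly 9 samples. The 9 is carried over from the vignette, which has 9 samples; with 60 samples here, counting genes that are unexpressed throughout all samples would need ==ncol(x) instead, so the 413 below is not that number.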
table(rowSums(x$counts==0)==9)
FALSE TRUE
60074 413
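A minimal sketch of counting genes with zero counts in every sample, using a made-up toy matrix rather than my data. The names toy, n_zero and all_zero are hypothetical; the point is only that the comparison value is the number of columns, not a fixed 9.

```r
# Toy count matrix (hypothetical): 4 genes x 3 samples
toy <- matrix(c(0, 0, 0,
                5, 0, 2,
                0, 0, 1,
                0, 0, 0),
              nrow = 4, byrow = TRUE,
              dimnames = list(paste0("gene", 1:4), paste0("s", 1:3)))

# Number of samples in which each gene has a zero count
n_zero <- rowSums(toy == 0)

# TRUE = the gene has a zero count in every sample
all_zero <- n_zero == ncol(toy)
table(all_zero)
```

On the real object the same test would presumably be table(rowSums(x$counts==0)==ncol(x)), run before filtering.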
keep.exprs <- filterByExpr(x, group=group)
x <- x[keep.exprs,, keep.lib.sizes=FALSE]
dim(x)
[1] 19105 60
One sample is a potential outlier (green line). It could be removed for future analysis, but I spoke to Professor Craig on 12/4 and we agreed to leave it in, since the vignette has a normalisation step.
We also agreed on 12/4 that I should stop working on the color issue. I understand that the “Paired” palette only offers 12 colors, so the color scheme repeats every 13th sample. I tried increasing the number of colors with colorRampPalette but was unsuccessful.
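For reference, a sketch of one way to get 60 colors, not run on my data. It asks brewer.pal for the 12 colors "Paired" actually has and lets colorRampPalette interpolate between them. The names base_cols, n_samples and col60 are made up, and the rainbow() fallback is only there so the sketch runs without RColorBrewer installed.

```r
# Base palette: the 12 "Paired" colors if RColorBrewer is available,
# otherwise a base-R fallback so the sketch still runs
base_cols <- if (requireNamespace("RColorBrewer", quietly = TRUE)) {
  RColorBrewer::brewer.pal(12, "Paired")
} else {
  grDevices::rainbow(12)
}

n_samples <- 60  # assumed to match ncol(x)

# colorRampPalette() returns a function; calling it with n gives n interpolated colors
col60 <- grDevices::colorRampPalette(base_cols)(n_samples)
length(col60)
```

Interpolated neighbours can look very similar, so with 60 density lines the legend may still be hard to read.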
lcpm.cutoff <- log2(10/M + 2/L)
library(RColorBrewer)
#library(colorRamps)
nsamples <- ncol(x)
col <- brewer.pal(nsamples, "Paired") #results in the error message: n too large, allowed maximum for palette Paired is 12. Returning the palette you asked for with that many colors
n too large, allowed maximum for palette Paired is 12
Returning the palette you asked for with that many colors
#nb.cols = 60
#col <- colorRampPalette(brewer.pal(nsamples, "Paired"))(nb.cols) #colorRampPalette is a constructor function that builds palettes with arbitrary number of colors by interpolating existing palette
par(mfrow=c(1,2)) #1 row, 2 columns
plot(density(lcpm[,1]), col=col[1], lwd=2, ylim=c(0,0.26), las=2, main="", xlab="")
title(main="A. Raw data", xlab="Log-cpm")
abline(v=lcpm.cutoff, lty=3)
for (i in 2:nsamples){
den <- density(lcpm[,i])
lines(den$x, den$y, col=col[i], lwd=2)
}
legend("topright", samplenames, text.col=col, bty="n")
lcpm <- cpm(x, log=TRUE)
plot(density(lcpm[,1]), col=col[1], lwd=2, ylim=c(0,0.26), las=2, main="", xlab="")
title(main="B. Filtered data", xlab="Log-cpm")
abline(v=lcpm.cutoff, lty=3)
for (i in 2:nsamples){
den <- density(lcpm[,i])
lines(den$x, den$y, col=col[i], lwd=2)
}
legend("topright", samplenames, text.col=col, bty="n")
x <- calcNormFactors(x, method = "TMM")
x$samples$norm.factors
[1] 0.8247877 0.8701067 0.9744718 0.3338411 1.0672340 0.9853437 0.8355374 1.0685271 1.1041480
[10] 1.1621714 1.3190864 1.1616288 0.5656024 1.0079779 1.1311621 1.3455788 0.2548806 1.1728192
[19] 0.5164198 0.4158854 1.1552465 1.0977806 1.1538364 1.0380711 0.5054850 1.2962443 1.1705252
[28] 0.9900169 0.9259593 1.1222135 1.2985756 1.0598233 0.9471138 1.3391984 1.3043419 1.1144424
[37] 1.0697018 1.1921660 1.1054413 0.9399911 1.2384865 1.2243347 1.0760378 0.9933192 1.1164186
[46] 1.4176459 1.3689476 0.8989757 1.3130426 1.0789261 0.7851273 1.0110826 0.9751891 1.1994225
[55] 1.2667583 1.3476310 1.4869359 1.0917179 0.8579364 1.0468322
x2 <- x
x2$samples$norm.factors <- 1
x2$counts[,1] <- ceiling(x2$counts[,1]*0.05)
x2$counts[,2] <- x2$counts[,2]*5
par(mfrow=c(1,1)) #makes boxplot look less cramped
lcpm <- cpm(x2, log=TRUE)
boxplot(lcpm, las=2, col=col, main="")
title(main="A. Example: Unnormalised data",ylab="Log-cpm")
x2 <- calcNormFactors(x2)
x2$samples$norm.factors
[1] 0.04889808 4.36314201 0.99857375 0.36344409 1.08814631 1.00048470 0.91165862 1.07196030
[9] 1.10500025 1.21054683 1.28772624 1.16063702 0.60882746 1.04998828 1.13203522 1.31787908
[17] 0.28023587 1.16142488 0.53329090 0.46494919 1.14274272 1.09741972 1.16406362 1.04925131
[25] 0.51082602 1.30832124 1.17758030 1.00128465 0.97668711 1.11752935 1.28315406 1.05679584
[33] 0.99263908 1.36510564 1.33873108 1.14293994 1.10193770 1.21035319 1.10148876 0.97629109
[41] 1.25905436 1.25325781 1.12605092 1.02503014 1.11956858 1.41609997 1.38704078 0.91182127
[49] 1.31684398 1.09702295 0.85028060 1.03939580 1.01768934 1.20164571 1.27684217 1.35509884
[57] 1.51897054 1.10244902 0.85428854 1.05218206
This step forces the sample distributions to even out, which may not be a good thing since one sample is a potential outlier.
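To see how the normalisation factors enter the cpm values, here is a minimal sketch on made-up numbers. It assumes the edgeR convention that cpm is computed against the effective library size, lib.size * norm.factors. The objects toy_counts and toy_nf are hypothetical, and the log transform and prior count are left out.

```r
# Made-up counts: 3 genes x 2 samples, s2 sequenced 10x deeper than s1
toy_counts <- matrix(c(10, 20, 70,
                       100, 200, 700),
                     nrow = 3,
                     dimnames = list(paste0("gene", 1:3), c("s1", "s2")))

toy_lib <- colSums(toy_counts)     # library sizes: 100 and 1000
toy_nf  <- c(s1 = 0.5, s2 = 2)     # made-up normalisation factors
eff_lib <- toy_lib * toy_nf        # effective library sizes

# cpm without and with the normalisation factors
cpm_raw  <- t(t(toy_counts) / toy_lib) * 1e6
cpm_norm <- t(t(toy_counts) / eff_lib) * 1e6
cpm_norm
```

A factor below 1 shrinks the effective library size and raises that sample's cpm values, and a factor above 1 does the opposite, which is how the boxplots get pulled into line.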
lcpm <- cpm(x2, log=TRUE)
boxplot(lcpm, las=2, col=col, main="")
title(main="B. Example: Normalised data",ylab="Log-cpm")
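I spoke to Professor Craig on 12/4: it is OK to ignore the warning below since I am only comparing 2 subsets of colon cancer. brewer.pal() needs n to be at least 3 and group has only 2 levels, so it warns and returns 3 colors. The vignette also colors by a second factor, lane, which I do not have for my data.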
lcpm <- cpm(x, log=TRUE)
par(mfrow=c(1,1)) #1 row, 1 column
col.group <- group
levels(col.group) <- brewer.pal(nlevels(col.group), "Set1") #n= number of different colors in a palette with the min being 3
minimal value for n is 3, returning requested palette with 3 different levels
col.group <- as.character(col.group)
#col.lane <- lane did not have lanes for my data
#levels(col.lane) <- brewer.pal(nlevels(col.lane), "Set2")
#col.lane <- as.character(col.lane)
plotMDS(lcpm, labels=group, col=col.group)
title(main="A. Sample groups")
#plotMDS(lcpm, labels=lane, col=col.lane, dim=c(3,4))
#title(main="B. Sequencing lanes")
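The two-level coloring can be sketched without the brewer.pal warning by always requesting at least 3 colors and indexing by group. This is an illustration on a made-up factor, not what I ran; group_toy, pal and col_group are hypothetical names, and the fallback colors are only there so the sketch runs without RColorBrewer.

```r
# Made-up two-level grouping factor
group_toy <- factor(c("CMS", "CMS", "ADENOCARCINOMA", "ADENOCARCINOMA"))

# Request at least 3 colors so brewer.pal() does not warn
pal <- if (requireNamespace("RColorBrewer", quietly = TRUE)) {
  RColorBrewer::brewer.pal(max(3, nlevels(group_toy)), "Set1")
} else {
  c("red", "blue", "green")  # fallback palette
}

# One color per sample, picked by the integer code of its group level
col_group <- pal[as.integer(group_toy)]
col_group
```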
An HTML page will be generated and opened in a browser if launch=TRUE.
suppressPackageStartupMessages(library(Glimma))
glMDSPlot(lcpm, labels=paste(group, sep="_"),
groups=x$samples[,c(1,2)], launch=FALSE)
design <- model.matrix(~0+group) #removes intercept from the factor group
#design <- model.matrix(~group) keeps the intercept for the factor group, but model contrasts are more straightforward without an intercept
colnames(design) <- gsub("group", "", colnames(design))
design
ADENOCARCINOMA CMS
1 0 1
2 0 1
3 0 1
4 0 1
5 0 1
6 0 1
7 0 1
8 0 1
9 0 1
10 0 1
11 0 1
12 0 1
13 0 1
14 0 1
15 0 1
16 0 1
17 0 1
18 0 1
19 0 1
20 0 1
21 0 1
22 0 1
23 0 1
24 0 1
25 0 1
26 0 1
27 0 1
28 0 1
29 0 1
30 0 1
31 1 0
32 1 0
33 1 0
34 1 0
35 1 0
36 1 0
37 1 0
38 1 0
39 1 0
40 1 0
41 1 0
42 1 0
43 1 0
44 1 0
45 1 0
46 1 0
47 1 0
48 1 0
49 1 0
50 1 0
51 1 0
52 1 0
53 1 0
54 1 0
55 1 0
56 1 0
57 1 0
58 1 0
59 1 0
60 1 0
attr(,"assign")
[1] 1 1
attr(,"contrasts")
attr(,"contrasts")$group
[1] "contr.treatment"
Since I am only comparing CMS and Adenocarcinoma, I will only have 1 pairwise comparison.
library(limma)
contr.matrix <- makeContrasts(
ADENOCARCINOMAvsCMS = ADENOCARCINOMA-CMS,
levels = colnames(design))
contr.matrix
Contrasts
Levels ADENOCARCINOMAvsCMS
ADENOCARCINOMA 1
CMS -1
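What the no-intercept design and the contrast amount to can be sketched for a single gene with base R. The values in y_toy are made up, and lm.fit stands in for the per-gene linear model; with ~0+group the coefficients are the group means, and the contrast is their difference.

```r
# Made-up log-expression values for one gene in 6 samples
group_toy <- factor(c("CMS", "CMS", "CMS",
                      "ADENOCARCINOMA", "ADENOCARCINOMA", "ADENOCARCINOMA"))
y_toy <- c(1, 2, 3, 5, 6, 7)

# No-intercept design: one indicator column per group
design_toy <- model.matrix(~0 + group_toy)
colnames(design_toy) <- levels(group_toy)

# Least-squares fit; each coefficient is that group's mean
group_means <- lm.fit(design_toy, y_toy)$coefficients

# Contrast ADENOCARCINOMA - CMS applied to the coefficients
contrast_toy <- c(ADENOCARCINOMA = 1, CMS = -1)
logfc <- sum(contrast_toy * group_means[names(contrast_toy)])
logfc
```

A positive value therefore means higher expression in ADENOCARCINOMA than in CMS.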
## Remove heteroscedasticity from count data
Each black dot represents a gene. The red curve is the estimated mean-variance trend used to compute the voom weights.
par(mfrow=c(1,2))
v <- voom(x, design, plot=TRUE) #voom converts raw counts to log-CPM values by extracting library sizes and normalisation factors from x
v
An object of class "EList"
$genes
19100 more rows ...
$targets
55 more rows ...
$E
Samples
Tags 9e8b528b-1172-4c07-a09b-ebb23cf2310c.htseq
ENSG00000000005.5 -0.6452887
ENSG00000000419.11 4.0286248
ENSG00000000457.12 3.9851413
ENSG00000000460.15 0.7096682
ENSG00000000938.11 1.1642341
Samples
Tags bda1a9a4-a14f-4463-81d2-a4fcca65d6f1.htseq
ENSG00000000005.5 -2.829508
ENSG00000000419.11 4.473258
ENSG00000000457.12 2.886263
ENSG00000000460.15 2.725081
ENSG00000000938.11 2.134993
Samples
Tags 5697212f-b3fd-479f-84b0-ec0aae54534a.htseq
ENSG00000000005.5 -4.064736
ENSG00000000419.11 4.069690
ENSG00000000457.12 3.723166
ENSG00000000460.15 2.937516
ENSG00000000938.11 1.160230
Samples
Tags 7f9a629b-12ed-48cc-8d5c-1c2f5db9cf1f.htseq
ENSG00000000005.5 -1.820221
ENSG00000000419.11 3.544018
ENSG00000000457.12 4.437616
ENSG00000000460.15 4.029445
ENSG00000000938.11 3.674683
Samples
Tags 15864159-be88-41c8-bdef-c2c5927cb1a1.htseq
ENSG00000000005.5 -3.468722
ENSG00000000419.11 4.049228
ENSG00000000457.12 3.007011
ENSG00000000460.15 2.235675
ENSG00000000938.11 1.751829
Samples
Tags 649b19e1-96e2-4b55-951d-3b6ee9f4b91f.htseq
ENSG00000000005.5 -1.1743179
ENSG00000000419.11 5.1369998
ENSG00000000457.12 2.9131449
ENSG00000000460.15 2.6294792
ENSG00000000938.11 0.6819466
Samples
Tags 86679663-dfc5-46ad-8cf9-c7954c4b339b.htseq
ENSG00000000005.5 -2.786013
ENSG00000000419.11 5.199731
ENSG00000000457.12 4.248928
ENSG00000000460.15 3.844348
ENSG00000000938.11 3.108387
Samples
Tags 28004569-048d-4f8c-99aa-7a8c69a98dcc.htseq
ENSG00000000005.5 -1.553263
ENSG00000000419.11 4.371787
ENSG00000000457.12 2.924414
ENSG00000000460.15 2.663967
ENSG00000000938.11 2.643134
Samples
Tags 911f6378-8a25-4570-9d3b-80f5b5bfc085.htseq
ENSG00000000005.5 -5.2805208
ENSG00000000419.11 4.3152962
ENSG00000000457.12 2.9396157
ENSG00000000460.15 2.2570859
ENSG00000000938.11 0.5112933
Samples
Tags d5dca54e-d7e9-4328-b2ca-1d191a2b8b4c.htseq
ENSG00000000005.5 -2.408949
ENSG00000000419.11 5.250282
ENSG00000000457.12 4.301365
ENSG00000000460.15 3.473694
ENSG00000000938.11 2.114612
Samples
Tags f3895ae4-1228-49b3-9342-3c3b86cb5243.htseq
ENSG00000000005.5 -2.674378
ENSG00000000419.11 4.258048
ENSG00000000457.12 3.219863
ENSG00000000460.15 1.996700
ENSG00000000938.11 3.869207
Samples
Tags f590941d-19dc-427a-95b6-942c97ea8333.htseq
ENSG00000000005.5 -6.376198
ENSG00000000419.11 4.130606
ENSG00000000457.12 3.101561
ENSG00000000460.15 2.533696
ENSG00000000938.11 3.187952
Samples
Tags 55aa6d16-3598-42ca-8844-0fe84739ef66.htseq
ENSG00000000005.5 -5.645779
ENSG00000000419.11 5.295513
ENSG00000000457.12 3.570967
ENSG00000000460.15 3.454883
ENSG00000000938.11 2.374738
Samples
Tags 0e7094cf-4c79-43f4-8b72-9de259e5e18f.htseq
ENSG00000000005.5 -1.143810
ENSG00000000419.11 3.241950
ENSG00000000457.12 2.041301
ENSG00000000460.15 2.198582
ENSG00000000938.11 3.788053
Samples
Tags 9a62fe1f-36ec-4e8e-b3d9-bdfc62f71905.htseq
ENSG00000000005.5 -2.844269
ENSG00000000419.11 4.802950
ENSG00000000457.12 3.394772
ENSG00000000460.15 2.854320
ENSG00000000938.11 2.128424
Samples
Tags d2587070-cb7d-440d-ae49-52f5077248e6.htseq
ENSG00000000005.5 0.06657207
ENSG00000000419.11 4.12233363
ENSG00000000457.12 2.89854696
ENSG00000000460.15 1.25377506
ENSG00000000938.11 4.15471639
Samples
Tags 7800bdb2-aa8b-43e0-8e45-1b968872b34e.htseq
ENSG00000000005.5 -3.518050
ENSG00000000419.11 4.399620
ENSG00000000457.12 2.911566
ENSG00000000460.15 3.157201
ENSG00000000938.11 2.996072
Samples
Tags 2bcd2efd-4fd6-40ee-86a4-867ae82711b0.htseq
ENSG00000000005.5 -2.288229
ENSG00000000419.11 4.029531
ENSG00000000457.12 2.762875
ENSG00000000460.15 1.912198
ENSG00000000938.11 2.328744
Samples
Tags 424d8e5f-9fc6-470b-ad2c-b4447b0eb07e.htseq
ENSG00000000005.5 -3.874298
ENSG00000000419.11 4.834002
ENSG00000000457.12 4.163623
ENSG00000000460.15 3.365842
ENSG00000000938.11 4.758697
Samples
Tags 934f9dc6-1260-4268-b022-870f1e37dd6f.htseq
ENSG00000000005.5 -1.706603
ENSG00000000419.11 4.063307
ENSG00000000457.12 3.788035
ENSG00000000460.15 3.419091
ENSG00000000938.11 2.599558
Samples
Tags 0fa55c0e-6f8f-44a6-82fe-9a42495d3484.htseq
ENSG00000000005.5 -4.763380
ENSG00000000419.11 3.898399
ENSG00000000457.12 2.786598
ENSG00000000460.15 1.465439
ENSG00000000938.11 2.976738
Samples
Tags c8544a8a-4352-438d-94d4-3495af2e9a78.htseq
ENSG00000000005.5 2.2408036
ENSG00000000419.11 5.2673893
ENSG00000000457.12 3.0724378
ENSG00000000460.15 2.2930928
ENSG00000000938.11 0.1175401
Samples
Tags dade0b16-ecc3-43b3-b328-3819a8fc18c6.htseq
ENSG00000000005.5 -4.544553
ENSG00000000419.11 4.505505
ENSG00000000457.12 2.470112
ENSG00000000460.15 1.787053
ENSG00000000938.11 2.555099
Samples
Tags e875ae4e-4645-4e84-b0ff-9c9a694717a9.htseq
ENSG00000000005.5 -5.921106
ENSG00000000419.11 4.643996
ENSG00000000457.12 2.952338
ENSG00000000460.15 2.939981
ENSG00000000938.11 3.460437
Samples
Tags debd6982-7c27-42e8-b778-20afcc78a5f3.htseq
ENSG00000000005.5 -7.372309
ENSG00000000419.11 4.197072
ENSG00000000457.12 3.391733
ENSG00000000460.15 1.938303
ENSG00000000938.11 2.667980
Samples
Tags 17c88994-9e8e-4f16-8c41-34e98a0d8c52.htseq
ENSG00000000005.5 -3.003489
ENSG00000000419.11 4.346009
ENSG00000000457.12 2.818355
ENSG00000000460.15 1.819463
ENSG00000000938.11 3.009197
Samples
Tags 7f5a924a-ddf3-45ff-be1f-5b5909305f46.htseq
ENSG00000000005.5 -1.751115
ENSG00000000419.11 4.290426
ENSG00000000457.12 3.102534
ENSG00000000460.15 2.617107
ENSG00000000938.11 1.849445
Samples
Tags fa73bdce-67fb-42aa-883f-635f0e7bcdc6.htseq
ENSG00000000005.5 -2.408306
ENSG00000000419.11 4.410370
ENSG00000000457.12 2.768674
ENSG00000000460.15 2.449675
ENSG00000000938.11 2.846306
Samples
Tags abe20df7-6b97-4397-8864-881bac27e92c.htseq
ENSG00000000005.5 -3.366592
ENSG00000000419.11 3.442602
ENSG00000000457.12 3.241795
ENSG00000000460.15 3.216222
ENSG00000000938.11 3.246014
Samples
Tags 62f84581-4c7d-4c8e-835c-9304bcec3106.htseq
ENSG00000000005.5 -5.454801
ENSG00000000419.11 4.120738
ENSG00000000457.12 2.446066
ENSG00000000460.15 2.284417
ENSG00000000938.11 1.600482
Samples
Tags 3abbd2b5-04db-4fe0-8dd1-ea2b48caa4c1.htseq
ENSG00000000005.5 -5.844178
ENSG00000000419.11 5.073443
ENSG00000000457.12 3.085574
ENSG00000000460.15 2.520686
ENSG00000000938.11 4.420656
Samples
Tags 087666cd-47ae-4f56-b947-d6aa1c25e8a7.htseq
ENSG00000000005.5 -1.37511256
ENSG00000000419.11 5.29828123
ENSG00000000457.12 3.64998907
ENSG00000000460.15 2.96936577
ENSG00000000938.11 -0.08112134
Samples
Tags c14f98e2-8e9b-49f4-a244-3d06c6cb7126.htseq
ENSG00000000005.5 -3.735749
ENSG00000000419.11 3.506472
ENSG00000000457.12 2.364387
ENSG00000000460.15 2.790945
ENSG00000000938.11 3.725321
Samples
Tags 13abc91e-fbfc-4c55-bf54-fbd134979ccc.htseq
ENSG00000000005.5 -0.3984018
ENSG00000000419.11 4.1132525
ENSG00000000457.12 2.6501190
ENSG00000000460.15 2.7228611
ENSG00000000938.11 1.8857116
Samples
Tags 6ae2dd6c-2a39-411f-a1fc-11e0e6e82165.htseq
ENSG00000000005.5 -1.815983
ENSG00000000419.11 5.460732
ENSG00000000457.12 2.786995
ENSG00000000460.15 2.502506
ENSG00000000938.11 1.359022
Samples
Tags 8f77f4f4-b184-40c7-8ab8-2f95b13620b5.htseq
ENSG00000000005.5 -4.099968
ENSG00000000419.11 4.368090
ENSG00000000457.12 2.898779
ENSG00000000460.15 2.255627
ENSG00000000938.11 2.286613
Samples
Tags 168e5cb2-7390-45ad-ad04-c9aa4416e950.htseq
ENSG00000000005.5 1.352327
ENSG00000000419.11 5.379763
ENSG00000000457.12 3.195602
ENSG00000000460.15 2.998821
ENSG00000000938.11 2.403278
Samples
Tags 0ed65bdf-cb92-47c1-8aeb-42518ce639b8.htseq
ENSG00000000005.5 -4.712731
ENSG00000000419.11 4.335951
ENSG00000000457.12 2.875449
ENSG00000000460.15 2.253054
ENSG00000000938.11 1.564723
Samples
Tags 4e7c6811-88e4-4bb7-a88f-7491dfa6d072.htseq
ENSG00000000005.5 -4.648538
ENSG00000000419.11 3.750918
ENSG00000000457.12 2.845984
ENSG00000000460.15 2.293977
ENSG00000000938.11 4.609635
Samples
Tags 7fb73a84-867a-4c28-aa02-93068efffb7b.htseq
ENSG00000000005.5 -0.8970334
ENSG00000000419.11 5.3337569
ENSG00000000457.12 2.9023728
ENSG00000000460.15 3.0631171
ENSG00000000938.11 1.3644589
Samples
Tags b53f9a9d-b24d-410d-b3e9-f2a8bf22ca27.htseq
ENSG00000000005.5 -5.751942
ENSG00000000419.11 4.041932
ENSG00000000457.12 3.201284
ENSG00000000460.15 2.385903
ENSG00000000938.11 2.436234
Samples
Tags f7ce175f-763e-4a55-97e3-0381d889b0eb.htseq
ENSG00000000005.5 -0.375647
ENSG00000000419.11 5.658004
ENSG00000000457.12 3.221699
ENSG00000000460.15 2.511878
ENSG00000000938.11 2.246960
Samples
Tags f346f2d2-285c-455c-ba34-ea8eec3fa881.htseq
ENSG00000000005.5 -2.367437
ENSG00000000419.11 5.658257
ENSG00000000457.12 3.448480
ENSG00000000460.15 2.872878
ENSG00000000938.11 1.333003
Samples
Tags e53e1a83-1979-4e12-bbb7-79b37d0cfe03.htseq
ENSG00000000005.5 2.116966
ENSG00000000419.11 5.446587
ENSG00000000457.12 3.320462
ENSG00000000460.15 3.163350
ENSG00000000938.11 1.840730
Samples
Tags a26d49db-2309-46a0-a3ed-275378d484e7.htseq
ENSG00000000005.5 -1.557761
ENSG00000000419.11 5.265786
ENSG00000000457.12 3.488282
ENSG00000000460.15 2.567628
ENSG00000000938.11 -0.981901
Samples
Tags a3f88a5d-7169-465b-bb80-e5999590681c.htseq
ENSG00000000005.5 -4.475837
ENSG00000000419.11 4.777617
ENSG00000000457.12 2.870923
ENSG00000000460.15 2.437718
ENSG00000000938.11 3.149466
Samples
Tags c264fe3b-482b-44ec-83a4-73df565663ff.htseq
ENSG00000000005.5 -0.08499537
ENSG00000000419.11 5.70387514
ENSG00000000457.12 2.62655223
ENSG00000000460.15 2.51712185
ENSG00000000938.11 1.45794392
Samples
Tags bd2dfab3-88a8-4673-ba36-3daf252d0b4d.htseq
ENSG00000000005.5 -1.499968
ENSG00000000419.11 5.045088
ENSG00000000457.12 2.703904
ENSG00000000460.15 3.381510
ENSG00000000938.11 1.365102
Samples
Tags 7261b656-c79c-4581-a503-15b653e2b5d2.htseq
ENSG00000000005.5 -4.885539
ENSG00000000419.11 5.334355
ENSG00000000457.12 3.155572
ENSG00000000460.15 3.516193
ENSG00000000938.11 2.636454
Samples
Tags ee4dcccc-514b-4cc6-ae63-6ed3e7519a40.htseq
ENSG00000000005.5 -0.528445
ENSG00000000419.11 4.027512
ENSG00000000457.12 3.595314
ENSG00000000460.15 1.875639
ENSG00000000938.11 1.324139
Samples
Tags f596eabc-e39a-4e35-9fc6-edade04eb785.htseq
ENSG00000000005.5 -0.4005062
ENSG00000000419.11 4.8580127
ENSG00000000457.12 2.5776894
ENSG00000000460.15 2.5330664
ENSG00000000938.11 2.7694188
Samples
Tags bf9c448b-bdc9-4f74-b13a-374e6add7939.htseq
ENSG00000000005.5 -2.9336310
ENSG00000000419.11 4.6286114
ENSG00000000457.12 3.0127879
ENSG00000000460.15 3.1979402
ENSG00000000938.11 0.9732596
Samples
Tags 564daa81-cfef-45b6-94a0-3249b2724d9b.htseq
ENSG00000000005.5 0.2422946
ENSG00000000419.11 3.6186704
ENSG00000000457.12 2.8334093
ENSG00000000460.15 1.8994068
ENSG00000000938.11 2.4725922
Samples
Tags 82e00e45-734c-471f-ba97-79ec3b7e0baa.htseq
ENSG00000000005.5 -1.804916
ENSG00000000419.11 5.799060
ENSG00000000457.12 2.975948
ENSG00000000460.15 2.758334
ENSG00000000938.11 -0.993678
Samples
Tags 9c52ed00-325f-4664-8873-327bcaa5ea74.htseq
ENSG00000000005.5 -0.425799
ENSG00000000419.11 4.082319
ENSG00000000457.12 3.787628
ENSG00000000460.15 2.414818
ENSG00000000938.11 1.099043
Samples
Tags fabefb10-5546-4017-8ea1-29982a10fb3c.htseq
ENSG00000000005.5 -2.416932
ENSG00000000419.11 5.646898
ENSG00000000457.12 3.037201
ENSG00000000460.15 3.207244
ENSG00000000938.11 2.528229
Samples
Tags 32a115cf-570f-4ad9-a123-8e1970062f51.htseq
ENSG00000000005.5 -1.647989
ENSG00000000419.11 4.596644
ENSG00000000457.12 3.421828
ENSG00000000460.15 2.887235
ENSG00000000938.11 1.899625
Samples
Tags 05eef9f8-a246-403a-b0be-07d274b6f93a.htseq
ENSG00000000005.5 1.5670755
ENSG00000000419.11 5.8796382
ENSG00000000457.12 3.2609724
ENSG00000000460.15 3.2183805
ENSG00000000938.11 0.7723944
Samples
Tags 5c18c6a8-9ad2-43a8-a3a0-83d8fc0cc257.htseq
ENSG00000000005.5 2.000466
ENSG00000000419.11 4.865221
ENSG00000000457.12 3.671236
ENSG00000000460.15 2.777068
ENSG00000000938.11 1.605383
Samples
Tags 43b292be-5d63-4523-a43f-666d20039208.htseq
ENSG00000000005.5 1.851908
ENSG00000000419.11 4.822736
ENSG00000000457.12 3.388138
ENSG00000000460.15 2.298683
ENSG00000000938.11 1.135335
19100 more rows ...
$weights
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 0.4272084 0.3605236 0.3757352 0.365797 0.3693936 0.3605236 0.4570683 0.4672378 0.4540638
[2,] 1.9814015 1.6365197 1.7819632 1.719739 1.7436574 1.6501657 2.0277074 2.0368915 2.0243354
[3,] 1.6643834 1.1684340 1.3145104 1.246240 1.2706888 1.1800283 1.8183637 1.8583854 1.8042999
[4,] 1.3834658 0.9685114 1.0782455 1.026371 1.0448994 0.9770827 1.5681360 1.6255947 1.5498688
[5,] 1.3694188 0.9598946 1.0679714 1.016793 1.0351053 0.9683706 1.5534252 1.6119818 1.5353716
[,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18]
[1,] 0.3605236 0.4174268 0.3751305 0.4282511 0.3605236 0.3974921 0.4756471 0.3605236 0.3605236
[2,] 1.3060271 1.9565953 1.7784959 1.9834135 1.4385854 1.8910097 2.0431154 1.5704385 1.6320811
[3,] 0.9377362 1.6032037 1.3103639 1.6709795 1.0217228 1.4673786 1.8891483 1.1145441 1.1646677
[4,] 0.8002510 1.3223929 1.0749748 1.3900854 0.8613773 1.2010650 1.6720149 0.9289599 0.9657247
[5,] 0.7943371 1.3091158 1.0648054 1.3759544 0.8544378 1.1890175 1.6579150 0.9210325 0.9571388
[,19] [,20] [,21] [,22] [,23] [,24] [,25] [,26] [,27]
[1,] 0.4352165 0.3667904 0.4186382 0.4481102 0.5034813 0.3605236 0.4378621 0.4486748 0.3693138
[2,] 1.9967627 1.7263329 1.9598408 2.0166399 2.0545965 1.6045624 2.0017922 2.0173736 1.7431256
[3,] 1.7094138 1.2529634 1.6107185 1.7764873 1.9617136 1.1414146 1.7237786 1.7791217 1.2701433
[4,] 1.4336004 1.0314693 1.3298629 1.5140091 1.7983482 0.9485683 1.4502539 1.5173906 1.0444864
[5,] 1.4193715 1.0218322 1.3164920 1.4999295 1.7871833 0.9404314 1.4358184 1.5032717 1.0346972
[,28] [,29] [,30] [,31] [,32] [,33] [,34] [,35] [,36]
[1,] 0.3821114 0.3636339 0.4156818 0.5566926 0.5692988 0.4457222 0.6160439 0.5910201 0.5422713
[2,] 1.8186144 1.7054115 1.9519111 2.0455284 2.0508520 1.8436462 2.0553194 2.0556041 2.0374975
[3,] 1.3587262 1.2316779 1.5924075 1.7025964 1.7535783 1.1491360 1.8988201 1.8304581 1.6396777
[4,] 1.1130911 1.0153172 1.3116768 1.5372851 1.5961086 1.0233400 1.7812708 1.6886882 1.4686227
[5,] 1.1022734 1.0059897 1.2983333 1.1829062 1.2325189 0.8208803 1.4281382 1.3216210 1.1280796
[,37] [,38] [,39] [,40] [,41] [,42] [,43] [,44] [,45]
[1,] 0.4351421 0.5651271 0.4583116 0.4546091 0.5482909 0.5684231 0.4612994 0.507252 0.5436882
[2,] 1.7951817 2.0493867 1.8918037 1.8778143 2.0416245 2.0506310 1.9008018 2.003378 2.0385513
[3,] 1.0987680 1.7365313 1.2115461 1.1929927 1.6677198 1.7499956 1.2266409 1.466563 1.6462312
[4,] 0.9815781 1.5772147 1.0754848 1.0599818 1.4980710 1.5923585 1.0883208 1.298794 1.4756170
[5,] 0.7932221 1.2158480 0.8549097 0.8448113 1.1506285 1.2290082 0.8634054 1.004397 1.1333250
[,46] [,47] [,48] [,49] [,50] [,51] [,52] [,53] [,54]
[1,] 0.576914 0.5909329 0.4342706 0.5367139 0.5938889 0.3807075 0.4489648 0.3806337 0.5663092
[2,] 2.052757 2.0556004 1.7909115 2.0333376 2.0557268 1.4513570 1.8561201 1.4508110 2.0499284
[3,] 1.780728 1.8301464 1.0946825 1.6140829 1.8385894 0.8728752 1.1650257 0.8726089 1.7413566
[4,] 1.628880 1.6883568 0.9782159 1.4413704 1.6996040 0.7953596 1.0366278 0.7951382 1.5829164
[5,] 1.263292 1.3212579 0.7909791 1.1076346 1.3335985 0.6665224 0.8295810 0.6663713 1.2205580
[,55] [,56] [,57] [,58] [,59] [,60]
[1,] 0.4445163 0.5513897 0.5991062 0.3690418 0.4772951 0.4619017
[2,] 1.8390051 2.0430719 2.0559479 1.3656913 1.9465166 1.9026135
[3,] 1.1432791 1.6813825 1.8531617 0.8321907 1.3093027 1.2296981
[4,] 1.0184223 1.5124736 1.7195119 0.7611163 1.1586773 1.0909190
[5,] 0.8176537 1.1624749 1.3555422 0.6433334 0.9098613 0.8651225
19100 more rows ...
$design
ADENOCARCINOMA CMS
1 0 1
2 0 1
3 0 1
4 0 1
5 0 1
55 more rows ...
Each black dot is a gene. The blue line is the average log2 residual standard deviation estimated by the empirical Bayes algorithm.
vfit <- lmFit(v, design)
vfit <- contrasts.fit(vfit, contrasts=contr.matrix)
efit <- eBayes(vfit)
plotSA(efit, main="Final model: Mean-variance trend") #plots log2 residual standard deviations against mean log-CPM values
A quick look at how many genes are down-regulated, up-regulated, or not statistically significant. The adjusted p-value cutoff is 5% by default.
summary(decideTests(efit))
ADENOCARCINOMAvsCMS
Down 1474
NotSig 15810
Up 1821
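The 5% cutoff is applied to adjusted p-values, not raw ones. A minimal base-R sketch of that step, assuming the Benjamini-Hochberg adjustment that decideTests uses by default; the p-values in p_toy are made up.

```r
# Made-up raw p-values for 5 genes
p_toy <- c(0.001, 0.01, 0.02, 0.045, 0.30)

# Benjamini-Hochberg adjustment controls the false discovery rate
p_adj <- p.adjust(p_toy, method = "BH")

# Genes called significant at the 5% cutoff
sig <- p_adj < 0.05
data.frame(p_toy, p_adj, sig)
```

Note that the fourth gene has a raw p-value below 0.05 but is not significant after adjustment.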
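treat is a stricter definition of significance: it tests whether the log-FC exceeds the threshold (here lfc=1) rather than whether it differs from zero. It could be overcorrecting for my data, since I no longer have any down-regulated or up-regulated genes.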
tfit <- treat(vfit, lfc=1) #p-values calculated from empirical Bayes moderated t-statistics with a minimum log-FC requirement.
dt <- decideTests(tfit)
#dt <- decideTests(efit) #for testing purposes
summary(dt)
ADENOCARCINOMAvsCMS
Down 0
NotSig 19105
Up 0
I don’t have any DE genes if tfit is used. If efit is used, I have 3295 DE genes.
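A possible follow-up, not run on my data: compare how the number of DE genes changes with the treat threshold, since lfc=1 may simply be too strict here. The sketch below uses simulated data so it runs on its own. Everything in it (expr_sim, grp, the thresholds tried) is made up for illustration, and the smaller thresholds log2(1.2) and log2(1.5) are an assumption on my part about what would be reasonable to try.

```r
library(limma)

set.seed(1)
n_genes <- 200
grp <- factor(rep(c("A", "B"), each = 5))

# Simulated log-expression: the first 20 genes are shifted by 3 in group B
expr_sim <- matrix(rnorm(n_genes * 10), nrow = n_genes)
expr_sim[1:20, grp == "B"] <- expr_sim[1:20, grp == "B"] + 3

design_sim <- model.matrix(~0 + grp)
colnames(design_sim) <- levels(grp)
contr_sim <- makeContrasts(BvsA = B - A, levels = design_sim)
fit_sim <- contrasts.fit(lmFit(expr_sim, design_sim), contr_sim)

# Number of DE genes with eBayes (threshold 0) and with treat at several thresholds
lfc_grid <- c(none = 0, fc1.2 = log2(1.2), fc1.5 = log2(1.5), fc2 = 1)
n_de <- sapply(lfc_grid, function(lfc) {
  tf <- if (lfc == 0) eBayes(fit_sim) else treat(fit_sim, lfc = lfc)
  sum(decideTests(tf) != 0)
})
n_de
```

On my data the same loop would presumably use vfit in place of fit_sim.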
de.common <- which(dt[,1]!=0)
length(de.common)
[1] 0
If efit is used, the first 20 genes are: “DPM1”, “CFH”, “LAS1L”, “CFTR”, “TMEM176A”, “DBNDD1”, “TFPI”, “SLC7A2”, “ARF5”, “POLDIP2”, “ARHGAP33”, “UPF1”, “MCUB”, “POLR2J”, “THSD7A”, “LIG3”, “SPPL2B”, “IBTK”, “PDK2”, “REX1BD”
head(tfit$genes$SYMBOL[de.common], n=20)
character(0)
My diagram only has 1 circle because I only have 1 pairwise comparison.
vennDiagram(dt[,1], circle.col=c("turquoise", "salmon"))
write.fit(tfit, dt, file="results.txt")
ADENOCARCINOMA.vs.CMS <- topTreat(tfit, coef=1, n=Inf)
head(ADENOCARCINOMA.vs.CMS)
If efit is used, the plot will have red, black and blue genes. Since tfit is used, all genes are black.
plotMD(tfit, column=1, status=dt[,1], main=colnames(tfit)[1],
xlim=c(-8,13))
To open the HTML page in a browser, set launch=TRUE.
library(Glimma)
glMDPlot(tfit, coef=1, status=dt, main=colnames(tfit)[1],
side.main="ENSEMBL", counts=lcpm, groups=group, launch=FALSE)
I installed heatmap.plus because heatmap.2 did not work for my data.
library(gplots)
library(heatmap.plus)
ADENOCARCINOMA.vs.CMS.topgenes <- ADENOCARCINOMA.vs.CMS$ENSEMBL[1:100]
i <- which(v$genes$ENSEMBL %in% ADENOCARCINOMA.vs.CMS.topgenes)
mycol <- colorpanel(1000,"blue","white","red")
#par("mar") OUTPUT SHOULD BE [1] 5.1 4.1 4.1 2.1
par(cex.main=0.8,mar=c(1,1,1,1)) #mar=c(1,1,1,1) shrinks the plot margins so the figure region is large enough
heatmap.plus(lcpm[i,], col=bluered(20),cexRow=1,cexCol=0.2, margins = c(10,10), main = "HeatMap") #changed the margins to have a more legible heatmap