Final Project Checkpoint 4
Initial QC Steps
By: Fatma Gazeyoglu
To recap the study behind my paper: according to the data stored under PRJNA994299, 16 pigs were challenged with swine influenza virus H1N2. Samples were collected from 3 pigs per group at 2 days post inoculation (2 dpi), from 3 pigs per group at 5 dpi, and from the remaining 4 pigs at the end of the in vivo experiment (9 dpi). Together with one extra lung sample, this adds up to a total of 33 samples. All of them were sequenced on an Illumina MiSeq.
| # | | | | Collection_Date | isolation_source | | replicate | |
|---|---|---|---|---|---|---|---|---|
| 1 | 265 | 7.54 M | 3.90 Mb | 2020-08-10 | BALF | 13_BALF_EVOLSI2020 | 10 | 13_BALF_EVOLSI2020 |
| 2 | 270 | 27.69 M | 14.17 Mb | 2020-08-09 | BALF | 12_BALF_EVOLSI2020 | 9 | 12_BALF_EVOLSI2020 |
| 3 | 271 | 3.69 M | 2.30 Mb | 2020-08-08 | BALF | 11_BALF | 8 | 11_BALF |
| 4 | 268 | 7.06 M | 3.65 Mb | 2020-08-07 | BALF | 10_BALF_EVOLSI2020 | 7 | 10_BALF_EVOLSI2020 |
| 5 | 268 | 11.03 M | 5.67 Mb | 2020-08-06 | BALF | 09_BALF_EVOLSI2020 | 6 | 09_BALF_EVOLSI2020 |
| 6 | 271 | 6.68 M | 3.52 Mb | 2020-08-05 | BALF | 06_BALF | 5 | 06_BALF |
| 7 | 239 | 4.59 M | 2.20 Mb | 2020-08-10 | MDCK cells | H1N2_EVOLSI2020 | 33 | H1N2_EVOLSI2020 |
| 8 | 269 | 6.34 M | 3.34 Mb | 2020-08-02 | nasal swab | animal16_9_NS | 32 | animal16_9_NS |
| 9 | 263 | 7.80 M | 4.14 Mb | 2020-08-01 | nasal swab | animal16_5_NS_EVOLSI2020 | 31 | animal16_5_NS_EVOLSI2020 |
| 10 | 269 | 5.33 M | 2.80 Mb | 2020-08-04 | BALF | 05_BALF | 4 | 05_BALF |
| 11 | 261 | 20.83 M | 10.92 Mb | 2020-08-10 | nasal swab | animal16_4_NS_EVOLSI2020 | 30 | animal16_4_NS_EVOLSI2020 |
| 12 | 270 | 15.99 M | 8.21 Mb | 2020-08-10 | nasal swab | animal16_3_NS | 29 | animal16_3_NS |
| 13 | 267 | 3.58 M | 2.25 Mb | 2020-08-07 | nasal swab | animal15_5_NS_EVOLSI2020 | 28 | animal15_5_NS_EVOLSI2020 |
| 14 | 249 | 6.97 M | 4.40 Mb | 2020-08-06 | nasal swab | animal15_3_NS | 27 | animal15_3_NS |
| 15 | 257 | 11.39 M | 5.91 Mb | 2020-08-04 | nasal swab | animal14_4_NS | 26 | animal14_4_NS |
| 16 | 266 | 10.94 M | 5.71 Mb | 2020-08-03 | nasal swab | animal14_3_NS | 25 | animal14_3_NS |
| 17 | 266 | 12.17 M | 6.34 Mb | 2020-08-09 | nasal swab | animal13_5_NS_EVOLSI2020 | 24 | animal13_5_NS_EVOLSI2020 |
| 18 | 259 | 2.12 M | 1.17 Mb | 2020-08-10 | nasal swab | animal13_4_NS | 23 | animal13_4_NS |
| 19 | 272 | 4.90 M | 2.62 Mb | 2020-08-09 | nasal swab | animal13_3_NS | 22 | animal13_3_NS |
| 20 | 265 | 4.53 M | 2.47 Mb | 2020-08-09 | nasal swab | animal12_5_NS_EVOLSI2020 | 21 | animal12_5_NS_EVOLSI2020 |
| 21 | 272 | 9.35 M | 4.90 Mb | 2020-08-03 | BALF | 04_BALF | 3 | 04_BALF |
| 22 | 265 | 2.37 M | 1.27 Mb | 2020-08-10 | nasal swab | animal12_4_NS | 20 | animal12_4_NS |
| 23 | 274 | 7.69 M | 4.00 Mb | 2020-08-09 | nasal swab | animal12_3_NS | 19 | animal12_3_NS |
| 24 | 264 | 3.58 M | 1.90 Mb | 2020-08-01 | nasal swab | animal8_4_NS | 18 | animal8_4_NS |
| 25 | 262 | 2.63 M | 1.42 Mb | 2020-08-01 | nasal swab | animal7_4_NS | 17 | animal7_4_NS |
| 26 | 271 | 27.96 M | 14.42 Mb | 2020-08-01 | nasal swab | animal7_3_NS | 16 | animal7_3_NS |
| 27 | 263 | 7.06 M | 3.64 Mb | 2020-08-01 | nasal swab | animal6_4_NS | 15 | animal6_4_NS |
| 28 | 263 | 2.94 M | 1.89 Mb | 2020-08-01 | nasal swab | animal5_5_NS | 14 | animal5_5_NS |
| 29 | 260 | 4.41 M | 2.30 Mb | 2020-08-01 | nasal swab | animal5_4_NS | 13 | animal5_4_NS |
| 30 | 266 | 9.12 M | 4.69 Mb | 2020-08-01 | nasal swab | animal5_3_NS | 12 | animal5_3_NS |
| 31 | 262 | 5.09 M | 3.28 Mb | 2020-08-10 | BALF | 14_BALF_EVOLSI2020 | 11 | 14_BALF_EVOLSI2020 |
| 32 | 267 | 15.17 M | 7.77 Mb | 2020-08-02 | BALF | 02_BALF | 2 | 02_BALF |
| 33 | 269 | 9.59 M | 4.91 Mb | 2020-08-01 | BALF | 01_BALF | 1 | 01_BALF |
The samples were first run through FastQC and then trimmed with Trimmomatic. The table below shows some of the general statistics reported by MultiQC after trimming.
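For reference, the FastQC, Trimmomatic, and MultiQC steps could be scripted roughly as follows. This is only a sketch under several assumptions: the directory names, the raw file naming (`<run>_1.fastq.gz` / `<run>_2.fastq.gz`), the unpaired output names, and the trimming steps (`SLIDINGWINDOW:4:20`, `MINLEN:36`) are illustrative placeholders and not the settings actually used for this checkpoint. It also assumes a `trimmomatic` wrapper command is on the PATH (as installed by conda); otherwise the call would be `java -jar trimmomatic.jar`.

```python
import shlex


def qc_commands(run_id, raw_dir="raw", trim_dir="trimmed", qc_dir="qc"):
    """Build the per-run command lines as argument lists (nothing is executed).

    All directory names and trimming parameters are hypothetical defaults.
    """
    r1 = f"{raw_dir}/{run_id}_1.fastq.gz"
    r2 = f"{raw_dir}/{run_id}_2.fastq.gz"
    # Paired outputs are named so FastQC/MultiQC report them as <run>_trim.1 / <run>_trim.2
    p1 = f"{trim_dir}/{run_id}_trim.1.fastq.gz"
    p2 = f"{trim_dir}/{run_id}_trim.2.fastq.gz"
    u1 = f"{trim_dir}/{run_id}_unpaired.1.fastq.gz"
    u2 = f"{trim_dir}/{run_id}_unpaired.2.fastq.gz"
    return [
        # 1. FastQC on the raw reads
        ["fastqc", "-o", qc_dir, r1, r2],
        # 2. Paired-end trimming (assumed parameters, see lead-in)
        ["trimmomatic", "PE", r1, r2, p1, u1, p2, u2,
         "SLIDINGWINDOW:4:20", "MINLEN:36"],
        # 3. FastQC again on the trimmed, still-paired reads
        ["fastqc", "-o", qc_dir, p1, p2],
    ]


def multiqc_command(qc_dir="qc"):
    """MultiQC aggregates every FastQC report found in qc_dir."""
    return ["multiqc", qc_dir, "-o", qc_dir]


if __name__ == "__main__":
    # Print the commands for one run; pass each list to subprocess.run(..., check=True) to execute.
    for cmd in qc_commands("SRR25266111") + [multiqc_command()]:
        print(shlex.join(cmd))
```

Building the commands as lists keeps the script easy to dry-run: the printed lines can be checked by eye before anything is executed on all 33 runs.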
| Sample Name | % Dups | % GC | Length | M Seqs |
|---|---|---|---|---|
| SRR25266111_trim.1 | 61.0% | 44% | 134 bp | 0.0 |
| SRR25266111_trim.2 | 59.8% | 44% | 133 bp | 0.0 |
| SRR25266112_trim.1 | 75.3% | 44% | 136 bp | 0.1 |
| SRR25266112_trim.2 | 73.1% | 44% | 135 bp | 0.1 |
| SRR25266113_trim.1 | 45.9% | 44% | 135 bp | 0.0 |
| SRR25266113_trim.2 | 43.7% | 44% | 134 bp | 0.0 |
| SRR25266114_trim.1 | 60.2% | 44% | 135 bp | 0.0 |
| SRR25266114_trim.2 | 58.3% | 44% | 135 bp | 0.0 |
| SRR25266115_trim.1 | 66.5% | 44% | 135 bp | 0.0 |
| SRR25266115_trim.2 | 65.1% | 44% | 135 bp | 0.0 |
| SRR25266116_trim.1 | 57.6% | 44% | 136 bp | 0.0 |
| SRR25266116_trim.2 | 56.3% | 44% | 136 bp | 0.0 |
| SRR25266117_trim.1 | 49.1% | 43% | 123 bp | 0.0 |
| SRR25266117_trim.2 | 49.1% | 43% | 123 bp | 0.0 |
| SRR25266118_trim.1 | 64.3% | 46% | 135 bp | 0.0 |
| SRR25266118_trim.2 | 63.5% | 46% | 135 bp | 0.0 |
| SRR25266119_trim.1 | 60.7% | 44% | 133 bp | 0.0 |
| SRR25266119_trim.2 | 58.7% | 44% | 132 bp | 0.0 |
| SRR25266120_trim.1 | 54.3% | 44% | 135 bp | 0.0 |
| SRR25266120_trim.2 | 53.5% | 44% | 135 bp | 0.0 |
| SRR25266121_trim.1 | 72.0% | 44% | 132 bp | 0.1 |
| SRR25266121_trim.2 | 69.3% | 44% | 131 bp | 0.1 |
| SRR25266122_trim.1 | 71.3% | 44% | 136 bp | 0.1 |
| SRR25266122_trim.2 | 69.8% | 44% | 135 bp | 0.1 |
| SRR25266123_trim.1 | 49.2% | 44% | 134 bp | 0.0 |
| SRR25266123_trim.2 | 43.0% | 44% | 132 bp | 0.0 |
| SRR25266124_trim.1 | 50.1% | 44% | 125 bp | 0.0 |
| SRR25266124_trim.2 | 44.2% | 44% | 123 bp | 0.0 |
| SRR25266125_trim.1 | 63.7% | 44% | 130 bp | 0.0 |
| SRR25266125_trim.2 | 62.8% | 44% | 130 bp | 0.0 |
| SRR25266126_trim.1 | 68.3% | 44% | 134 bp | 0.0 |
| SRR25266126_trim.2 | 66.8% | 44% | 133 bp | 0.0 |
| SRR25266127_trim.1 | 66.7% | 44% | 134 bp | 0.0 |
| SRR25266127_trim.2 | 65.3% | 44% | 134 bp | 0.0 |
| SRR25266128_trim.1 | 44.5% | 43% | 131 bp | 0.0 |
| SRR25266128_trim.2 | 44.0% | 43% | 131 bp | 0.0 |
| SRR25266129_trim.1 | 59.0% | 45% | 136 bp | 0.0 |
| SRR25266129_trim.2 | 56.4% | 45% | 135 bp | 0.0 |
| SRR25266130_trim.1 | 53.8% | 45% | 133 bp | 0.0 |
| SRR25266130_trim.2 | 52.8% | 45% | 133 bp | 0.0 |
| SRR25266131_trim.1 | 66.2% | 44% | 137 bp | 0.0 |
| SRR25266131_trim.2 | 64.3% | 44% | 136 bp | 0.0 |
| SRR25266132_trim.1 | 39.9% | 43% | 134 bp | 0.0 |
| SRR25266132_trim.2 | 39.4% | 43% | 134 bp | 0.0 |
| SRR25266133_trim.1 | 61.1% | 44% | 138 bp | 0.0 |
| SRR25266133_trim.2 | 59.4% | 44% | 137 bp | 0.0 |
| SRR25266134_trim.1 | 51.1% | 44% | 133 bp | 0.0 |
| SRR25266134_trim.2 | 51.2% | 44% | 133 bp | 0.0 |
| SRR25266135_trim.1 | 53.4% | 46% | 132 bp | 0.0 |
| SRR25266135_trim.2 | 52.9% | 46% | 132 bp | 0.0 |
| SRR25266136_trim.1 | 78.5% | 45% | 137 bp | 0.1 |
| SRR25266136_trim.2 | 76.0% | 45% | 136 bp | 0.1 |
| SRR25266137_trim.1 | 61.8% | 45% | 133 bp | 0.0 |
| SRR25266137_trim.2 | 61.1% | 45% | 133 bp | 0.0 |
| SRR25266138_trim.1 | 38.8% | 44% | 131 bp | 0.0 |
| SRR25266138_trim.2 | 33.4% | 44% | 129 bp | 0.0 |
| SRR25266139_trim.1 | 54.3% | 44% | 131 bp | 0.0 |
| SRR25266139_trim.2 | 53.3% | 44% | 131 bp | 0.0 |
| SRR25266140_trim.1 | 64.6% | 44% | 134 bp | 0.0 |
| SRR25266140_trim.2 | 63.2% | 44% | 134 bp | 0.0 |
| SRR25266141_trim.1 | 53.5% | 43% | 131 bp | 0.0 |
| SRR25266141_trim.2 | 39.4% | 44% | 125 bp | 0.0 |
| SRR25266142_trim.1 | 68.7% | 44% | 135 bp | 0.1 |
| SRR25266142_trim.2 | 67.5% | 44% | 134 bp | 0.1 |
| SRR25266143_trim.1 | 60.8% | 44% | 136 bp | 0.0 |
| SRR25266143_trim.2 | 59.4% | 44% | 135 bp | 0.0 |
The GC content (43 to 46% across all samples) looks somewhat low, but influenza sequencing data tends to have a GC content of around 46%, so I am not worried about it for the time being.
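The GC check above can also be done programmatically instead of by eye. The sketch below parses rows in the same format as the MultiQC table and lists any sample whose GC content deviates from the expected ~46% by more than a tolerance. The function names and the tolerance values are my own assumptions for illustration, not MultiQC defaults, and only a handful of rows from the table are included to keep the example short.

```python
# A few rows copied from the MultiQC general statistics table above.
ROWS = """\
| Sample Name | % Dups | % GC | Length | M Seqs |
|---|---|---|---|---|
| SRR25266111_trim.1 | 61.0% | 44% | 134 bp | 0.0 |
| SRR25266117_trim.1 | 49.1% | 43% | 123 bp | 0.0 |
| SRR25266118_trim.1 | 64.3% | 46% | 135 bp | 0.0 |
| SRR25266129_trim.1 | 59.0% | 45% | 136 bp | 0.0 |
"""


def parse_gc(table_text):
    """Return {sample name: GC percent} from Markdown table rows."""
    gc = {}
    for line in table_text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Skip the header and separator rows: only data rows have a GC cell like "44%".
        if len(cells) != 5 or not cells[2].endswith("%"):
            continue
        gc[cells[0]] = float(cells[2].rstrip("%"))
    return gc


def outside_tolerance(gc, expected=46.0, tolerance=5.0):
    """Samples whose GC differs from `expected` by more than `tolerance` points.

    expected=46.0 comes from the text; the tolerance is a hypothetical threshold.
    """
    return {name: value for name, value in gc.items()
            if abs(value - expected) > tolerance}


if __name__ == "__main__":
    gc = parse_gc(ROWS)
    print("GC range:", min(gc.values()), "to", max(gc.values()))
    print("Outside tolerance:", outside_tolerance(gc))
```

With a tolerance of 5 percentage points none of the example rows is flagged; tightening it to 2 points flags only the 43% sample.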