This table is based on the spreadsheet that Dr. Yellman emailed us and the roster from Canvas. I think I have everyone paired up correctly, but sorry if I messed up your name!
```
## Source: local data frame [6 x 5]
##
##        Last First   Gene Score
##       (chr) (chr) (fctr) (dbl)
## 1      ABAY     J  POLD1   9.5
## 2 ABU-REZEQ     Y  RPPH1   9.5
## 3      AKIN     L   DHX9   9.5
## 4       ALI     H CDC14A  10.0
## 5  AMAEFULE     B CDC14A   9.5
## 6   AMATONG     A  SIRT2  10.0
## Variables not shown: Comments (chr)
```
Source: https://docs.google.com/spreadsheets/d/19tar8AlOZrigXCZcdKpKcZrE7DEWm180JguHrrAej94/pub
```
## Source: local data frame [6 x 6]
##
##                Full      Last First FirstName         Course    Role
##               (chr)     (chr) (chr)     (chr)         (fctr)  (fctr)
## 1 YASMEEN ABU-REZEQ ABU-REZEQ     Y   YASMEEN BCH339N(53902) Student
## 2        LUCAS AKIN      AKIN     L     LUCAS BCH339N(53902) Student
## 3         HASAN ALI       ALI     H     HASAN BCH339N(53902) Student
## 4    BRYAN AMAEFULE  AMAEFULE     B     BRYAN BCH339N(53902) Student
## 5   ALBERTO AMATONG   AMATONG     A   ALBERTO BCH339N(53902) Student
## 6      NORMA ARROYO    ARROYO     N     NORMA BCH339N(53902) Student
```
Source: https://utexas.instructure.com/courses/1143172/users
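```r
library(dplyr)    # inner_join, select, mutate_each, arrange
library(stringr)  # str_to_title

# Pair each score with its roster entry; the shared columns are Last and First
GD <- inner_join(gene.dos, canvas) %>%
  select(Gene, First = FirstName, Last, Score, Full, Comments) %>%
  mutate_each(funs(str_to_title), First, Last, Full) %>%  # e.g. "AKIN" -> "Akin"
  arrange(Gene, Last)

# Drop the scores and full names before publishing the table
df <- select(GD, -Score, -Full)
knitr::kable(df)
```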
| Gene | First | Last | Comments |
|---|---|---|---|
| ACTB | Bria | Lacour | this is just a start on this part of the project, many pieces of information lacking, 6 sections are incomplete |
| BLM | Nicolas | Garza | be more specific about the biochemical function, there should be orthologs in all species, nice multiple alignment with highlighting |
| BLM | Christine | Pham | nice list of post-translational modifications, now add information about their functions, alignments should be merged into a single multiple alignment with conserved amino acids highlighted |
| CDC14A | Hasan | Ali | do the colors in the multiple alignment properly highlight the conserved residues? |
| CDC14A | Bryan | Amaefule | interactions with substrates are not usually stable, protein sequence alignment should show all orthologs in one alignment, highlighting conserved residues |
| CDC14A | Kevin | Chan | nicely organized and well done, but alignments should be unified |
| CDC14A | Brian | Duong | biochemical function is phosphatase, don’t include descriptions of interactions with specific substrates; alignment needs to be a multiple amino acid sequence alignment with conserved residues highlighted |
| CDC14A | Ashley | Vu | need standard gene names in the paralogs and orthologs tables; alignment graph should have an amino acid sequence multiple alignment with conserved residues highlighted; this alignment is just a screen shot that is not of much use |
| CDC20 | Santiago | Sanchez | organize your information using the headings that I listed in the assignment page, Alignments are poorly labeled in bottom graph. Which line is which protein? |
| CDC20 | Marc | St. Cyr | nice project; has all information well presented, but in docx that are downloadable and not a webpage format |
| CDC20 | Allison | Teng | need to make the site easy to access, not just with a UT e-mail |
| CDC20 | Korynne | Ward | need to sort out the functional information for this protein; the orthologs table doesn’t list what the ortholog is, It just states which one is the closest; also, does not have protein sequence alignments for nearest orthologs (has only one alignment) |
| CDK1 | Brandon | Bartos | protein forms complexes with different cyclins during different stages of the cell cycle, enzymatic function: kinase!, list a few post-translational modifications of CDK itself, alignments should be unified |
| CDK1 | Bradley | Billac | no files posted, apparently, please contact instructor, something strange: I don’t see the documents |
| CDK1 | Christopher | Caywood | missing part 2 entirely, and need to use the headings, this gene dossier is undeveloped |
| CDK1 | Elizabeth | Mays | protein alignment should be unified into a single multiple alignment, with colors highlighting conserved amino acids |
| CENPA | Brianna | Barry | could use more detail and clear explanations for the post-translational modifications; alignments should be better organized |
| CENPA | Jeong | Lee | nice list of post-translational modifications; specifically, protein localizes to centromeres; ortholog proteins not actually listed, just says if they have them or not |
| CYCS | Pranav | Bhamidipati | alignments need to be unified into a single multiple alignment |
| CYCS | Erin | Kim | S. pombe and E. coli must have orthologs, what is the function of this protein: why do cells have an electron transport chain?, don’t copy the BLAST window, show alignments with highlighting |
| CYCS | Megan | Weijiang | protein alignment should be unified into a single multiple alignment, with colors highlighting conserved amino acids |
| DHX9 | Lucas | Akin | distinguish between biochemical and cellular functions, alignments should be summarized, conserved residues highlighted, are you sure E. coli does not have orthologs? |
| DHX9 | Aman | Jaiswal | nice site, nice alignment, but highlighting of conserved amino acids with color would be useful, are there really that many paralogs? Just wondering |
| DHX9 | Yoori | Kim | RNA transcripts are not necessarily splice variants, should find orthologs in more species, including the yeasts, and are all those paralogs real? |
| DHX9 | Quynh-Nhi | Nguyen | cellular or biological roles are its functions in the cell, which you have listed in a description, but put the information under the correct heading; alignment should be a multiple alignment with highlighting of conserved amino acids |
| DYNC1H1 | Allen | Gwo | there are additional cellular roles; can all those paralogs be real?, use standard gene names in the orthologs table, alignment is unclear, those sequences are not lined up |
| DYNC1H1 | Benjamin | Hong | need to use the organizational headings and order I requested; post-translation modification information is minimal, are there really so many paralogs in human? Use highlighting to indicate conserved protein alignment regions |
| DYNC1H1 | Jae Hee | Kim | nice visual display, the complex this protein is in is dynein, so it definitely does not work alone, multiple alignment should highlight conserved residues |
| DYNC1H1 | Rachel | Wagstaff | dyneins do function in a complex with other protein subunits (light, intermediate and heavy chain subunits); structural image is not visible, nice alignment |
| EIF2S1 | Christina | Hull | distinguish between biochemical and cellular functions, there must be many other complex members (not substrates), what are the functions of the post-translational modifications?, organize the alignment, no raw BLAST pictures |
| EIF2S1 | Ali Afaq | Khawaja | need gene names for the orthologs, need to highlight conserved amino acids in the protein alignments, notes on post-translational modifications? |
| EIF2S1 | Michael | Lin | protein sequence alignment is small and hard to read, there are no labels on the alignment to indicate which proteins are aligned |
| EIF2S1 | Henry | Nguyen | functional information is minimal; no alignments completed |
| HSP90 | Jonathan | Peltier | I want to see all of the information you gather put on a google website, you have added an extra category for protein function, apparently does not function in a complex but you don’t state that |
| HSP90 | Christine | Rafie | orthologs and paralogs section missing (replaced by a map of paris?) |
| HSP90 | Esther | Shu | orthologs/paralogs section is not in webpage format, but requires downloading files from a listing, alignments also need to be presented in a multiple alignment with highlighting of conserved amino acids |
| MAPK1 | Homero | Dominguez | paralogs/orthologs needs better organization and model organism information and more information on biological role, but nice project so far |
| MAPK1 | Mark | Lerma | alignment graphs are unlabeled/unorganized (what does each of the lines stand for? How do I read this?) |
| MAPK1 | Chase | Meyer | need to present the multiple protein sequence alignment and highlight conserved amino acid residues |
| MAPK1 | Layla | Nejad | use the headings that were provided, alignment should be a multiple alignment of the protein sequences, with conserved regions highlighted, the alignment provided has no useful information |
| mTOR | Meghan | Baker | no alignment |
| mTOR | Anna | Brown | nice alignment, with highlights |
| mTOR | Katherine | Cook | this protein should have orthologs in all eukaryotes; protein alignments should be a multiple alignment with conserved amino acids highlighted |
| MVK | Sarah | Gorring | orthologs table is poorly organized, and lacking orthologs from fruit fly, I don’t know what the paralog list represents or how to interpret the splicing information; your alignments must be presented, not just as a link |
| MVK | Ju-Yun | Kuo | should distinguish between biochemical activity and cellular functions, there are post-translational modifications, find them; need gene or protein names in ortholog list, alignment needs work |
| MYH1 | Christine | Lam | cellular or biological roles are its functions in the cell, such as vesicle transport, chromosome movement by microtubules, etc.; alignment graphs are unlabeled/unorganized (what does each of the lines stand for? How do I read this?) |
| MYH1 | Steven | Mai | meaning of the splicing information is unclear; no alignments |
| MYH1 | Marina | Martins | protein alignment should be unified into a single multiple alignment, with colors highlighting conserved amino acids |
| POLD1 | Amita | Kulshreshtha | need more detail on the biochemical and cellular functions, need gene or protein names of orthologs, are all those paralogs real? |
| POLD1 | Karine | Ovadia-Combe | nice project; mouse should have an ortholog, also list the human gene; I still have not seen your alignments, there appears to be a problem with seeing it |
| POLG1 | Simon | Gonzalez-Esteva | need specific information about post-translational modifications, nice multiple alignment with highlighting |
| POLG1 | Hunter | Ratliff | the cut-and-pasted splice variant information is not very useful; show me a multiple protein alignment, don’t just provide a bunch of links |
| POLG1 | James | Tran | protein biochemical function: DNA polymerase; alignment should be a multiple sequence alignment with conserved amino acids highlighted |
| POLG1 | Aubrey | Trapp | alignments are as raw text files downloadable from website, very inconvenient, should list function of the post-translational modifications, if known |
| PPP2R4 | Brant | Campodonico | protein is not a cis-prolyl isomerase, does not function individually, is in a trimeric complex, alignments need to be unified into a single multiple alignment, with conserved amino acids highlighted |
| PPP2R4 | Rhedda | Onihana | paralogs should be in table form, they are hard to read, protein alignment should be a single multiple alignment with highlights marking conserved amino acids |
| PSMC3 | Kevin | Gian | transcripts are not equivalent to splice variants, I don’t think those paralogs can all be real, look at definition of a paralog, bring alignments together into a multiple alignment |
| PSMC3 | Jose | Guerra | nice site, but please organize in the order I specified so I can look through it more easily, please list the gene or protein name in the orthologs table, also those paralogs cannot all be correct, can they? |
| RAD51 | Julia | Chernis | need gene names in the ortholog table, also has an E. coli ortholog, RecA |
| RAD51 | Aidana | Omarbekova | nice project, need to highlight conserved amino acids in the multiple sequence alignment |
| RAD51 | Alan | Te | orthologs table needs gene names; no alignments |
| RAN | Maytee | Chantharayukhont | no alignment |
| RAN | Tri | Do | nice project, well organized, good multiple alignment, but you will want to highlight conserved amino acids in the alignment |
| RAN | Kyoung | Oe | don’t sprinkle hyperlinks all over your project, alignment needs highlighting to show conserved amino acids |
| RAN | Collin | Pullara | where is paralogs table?, alignment not done (has “search strategy” file instead), list functions of post-translational modifications |
| REC8 | Norma | Arroyo | no orthologs |
| RPL28 | Roald | Menodiado | could use better description of biochemical and biological cellular functions; what are the post-translational modifications of RPL28 itself?, orthologs and paralogs need gene names, alignments should show amino acid sequences and conserved regions, all in one multiple alignment |
| RPL28 | Christine | Nguyen | sequence alignments should be a multiple protein sequence alignment with conserved amino acids highlighted; Website says “*still trying to figure out how to properly run BLAST” |
| RPPH1 | Yasmeen | Abu-Rezeq | nicely organized, C. elegans and E. coli have orthologs, protein activity is catalytic (it is the catalytic subunit of RNase P) |
| RPPH1 | Jessica | Yi | need names of protein orthologs |
| RPPH1, RPP30 | William | Eveland | isoforms are not paralogs, orthologs/paralogs need standard gene or protein names, show multiple alignments together |
| SIRT2 | Alberto | Amatong | I’m not sure you are getting the closest orthologs, check those protein identities |
| SIRT2 | Christopher | Jackson | many extra documents clutter the site, so I can’t tell where the important information might be, many sources are simply copied to the site, orthologs need gene names |
| SIRT2 | Meg | Maddox | not all transcripts are splicing variants; has the protein sequences but no alignments |
| SPO11 | Benjamin | Corona | there are post-translational modifications of this protein; missing part 2 |
| SPO11 | Dallas | Miller | alignment should be a multiple protein sequence alignment with conserved amino acids highlighted |
| TCP1 | Caroline | Chen | are those paralogs really paralogs?, complex also includes CCT6A, CCT6B, CCT7; alignments should be unified |
| TCP1 | Alexa | Johnson | nice project so far, need to find the S. pombe ortholog, also I think the E. coli GroEL and GroES proteins are distant orthologs- look into that |
| TERT | Jaeeun | Go | transcripts are not equivalent to splice variants, orthologs have some errors, they are just repeated protein names, you are doing a bit too much cutting and pasting |
| TERT | Haley | Hendrix | need an explanation for the splicing variants, are there really 6 paralogs in human? Use standard gene names in the tables |
| TERT | Hyun | Jung | are the multiple protein sequence alignments lined up correctly? Would like to see more orthologous proteins in that alignment |
| TERT | Peyton | Sarmiento | protein information is minimal, almost certainly it has some post-translational modifications; alignments should show the protein sequences, with conserved amino acids highlighted |
| TOP2A | Arno | Dunstatter | splicing isoforms are not paralogs, need gene name on first page, show multiple alignments together |
| TOP2A | Catherine | Mortensen | much of part 1 is unfinished; no protein sequence alignments |
| TOP2A | Simon | Yu | the cut-and-pasted splice variant information is not very useful; show me a multiple protein alignment, don’t just provide a bunch of links |
| NA | Hyun-Young | Lee | looks good so far, but need to set preferences so that I can see the site more easily, not just with a UT e-mail address |
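
Because `inner_join()` keeps only rows whose keys appear in both tables, anyone whose name is spelled differently in the spreadsheet and the roster silently drops out of the table above. The following is a minimal sketch of how unmatched names could be listed, assuming the join key is the `Last` and `First` pair. The rows are toy data modeled on the previews, and `key()`, `unmatched_scores`, and `unmatched_roster` are hypothetical names, not part of the original script. It uses base R so it runs without extra packages; with dplyr, `anti_join(gene.dos, canvas, by = c("Last", "First"))` should give the same result for the first check.

```r
# Toy stand-ins for the two tables (hypothetical rows modeled on the previews)
gene.dos <- data.frame(
  Last  = c("ABAY", "ABU-REZEQ", "AKIN"),
  First = c("J", "Y", "L"),
  Gene  = c("POLD1", "RPPH1", "DHX9"),
  stringsAsFactors = FALSE
)
canvas <- data.frame(
  Last      = c("ABU-REZEQ", "AKIN", "ARROYO"),
  First     = c("Y", "L", "N"),
  FirstName = c("YASMEEN", "LUCAS", "NORMA"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: build one comparable key per row from Last and First
key <- function(df) paste(df$Last, df$First, sep = "|")

# Rows in the score sheet with no roster match (these would be dropped by the join)
unmatched_scores <- gene.dos[!(key(gene.dos) %in% key(canvas)), ]

# Roster entries with no score row
unmatched_roster <- canvas[!(key(canvas) %in% key(gene.dos)), ]

print(unmatched_scores)
print(unmatched_roster)
```

Any row that shows up in either result is worth checking by hand before concluding that a student is missing from the table.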
Hunter Ratliff
Email: HunterRatliff1@gmail.com
Twitter: @HunterRatliff1
Copyright (C) 2015 Hunter Ratliff
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.