This table is based on the spreadsheet that Dr. Yellman emailed us and the class roster from Canvas. I think I have everything paired up correctly, but sorry if I messed up your name!


Read Spreadsheet From Dr. Yellman

## Source: local data frame [6 x 5]
## 
##        Last First   Gene Score
##       (chr) (chr) (fctr) (dbl)
## 1      ABAY     J  POLD1   9.5
## 2 ABU-REZEQ     Y  RPPH1   9.5
## 3      AKIN     L   DHX9   9.5
## 4       ALI     H CDC14A  10.0
## 5  AMAEFULE     B CDC14A   9.5
## 6   AMATONG     A  SIRT2  10.0
## Variables not shown: Comments (chr)

Source: https://docs.google.com/spreadsheets/d/19tar8AlOZrigXCZcdKpKcZrE7DEWm180JguHrrAej94/pub


Read Class Roster From Canvas

## Source: local data frame [6 x 6]
## 
##                Full      Last First FirstName         Course    Role
##               (chr)     (chr) (chr)     (chr)         (fctr)  (fctr)
## 1 YASMEEN ABU-REZEQ ABU-REZEQ     Y   YASMEEN BCH339N(53902) Student
## 2        LUCAS AKIN      AKIN     L     LUCAS BCH339N(53902) Student
## 3         HASAN ALI       ALI     H     HASAN BCH339N(53902) Student
## 4    BRYAN AMAEFULE  AMAEFULE     B     BRYAN BCH339N(53902) Student
## 5   ALBERTO AMATONG   AMATONG     A   ALBERTO BCH339N(53902) Student
## 6      NORMA ARROYO    ARROYO     N     NORMA BCH339N(53902) Student

Source: https://utexas.instructure.com/courses/1143172/users
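
The roster above carries a Last, First (initial), and FirstName column alongside Full. How those columns were actually built is not shown here, so the following is only a minimal sketch of one way to derive them in base R. The helper name `split_full_name` is hypothetical, and it assumes the final whitespace-separated word is the surname, which would mis-split a multi-word surname.

```r
# Hypothetical helper (not from the original analysis): split "FIRST LAST"
# into the columns shown in the roster printout.
# Assumption: the last whitespace-separated word is the surname.
split_full_name <- function(full) {
  full <- trimws(full)
  last <- sub("^.*\\s+", "", full)                   # text after the final space
  first_name <- trimws(sub("\\s+\\S+$", "", full))   # text before the final word
  data.frame(
    Full = full,
    Last = last,
    First = substr(first_name, 1, 1),                # first initial only
    FirstName = first_name,
    stringsAsFactors = FALSE
  )
}

# Names taken from the roster rows printed above
roster <- split_full_name(c("YASMEEN ABU-REZEQ", "LUCAS AKIN", "HASAN ALI"))
print(roster)
```

Any name with more than two words should be checked by hand, since this rule cannot tell a two-word first name from a two-word surname.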


Join the two tables

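library(dplyr)    # inner_join, select, mutate_each, arrange
library(stringr)  # str_to_title

# Pair each spreadsheet row with its roster entry; with no `by`, the join uses
# the columns the two tables share (Last and First in the output above)
GD <- inner_join(gene.dos, canvas) %>%
  select(Gene, First=FirstName, Last, Score, Full, Comments) %>%
  mutate_each(funs(str_to_title), First, Last, Full) %>%  # e.g. ABU-REZEQ -> Abu-Rezeq
  arrange(Gene, Last)

# Drop the scores and full names before printing the table
df <- select(GD, -Score, -Full)

knitr::kable(df)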
Gene First Last Comments
ACTB Bria Lacour this is just a start on this part of the project, many pieces of information lacking, 6 sections are incomplete
BLM Nicolas Garza be more specific about the biochemical function, there should be orthologs in all species, nice multiple alignment with highlighting
BLM Christine Pham nice list of post-translational modifications, now add information about their functions, alignments should be merged into a single multiple alignment with conserved amino acids highlighted
CDC14A Hasan Ali do the colors in the multiple alignment properly highlight the conserved residues?
CDC14A Bryan Amaefule interactions with substrates are not usually stable, protein sequence alignment should show all orthologs in one alignment, highlighting conserved residues
CDC14A Kevin Chan nicely organized and well done, but alignments should be unified
CDC14A Brian Duong biochemical function is phosphatase, don’t include descriptions of interactions with specific substrates; alignment needs to be a multiple amino acid sequence alignment with conserved residues highlighted
CDC14A Ashley Vu need standard gene names in the paralogs and orthologs tables; alignment graph should have an amino acid sequence multiple alignment with conserved residues highlighted; this alignment is just a screen shot that is not of much use
CDC20 Santiago Sanchez organize your information using the headings that I listed in the assignment page, Alignments are poorly labeled in bottom graph. Which line is which protein?
CDC20 Marc St. Cyr nice project; has all information well presented, but in docx that are downloadable and not a webpage format
CDC20 Allison Teng need to make the site easy to access, not just with a UT e-mail
CDC20 Korynne Ward need to sort out the functional information for this protein; the orthologs table doesn’t list what the ortholog is, It just states which one is the closest; also, does not have protein sequence alignments for nearest orthologs (has only one alignment)
CDK1 Brandon Bartos protein forms complexes with different cyclins during different stages of the cell cycle, enzymatic function: kinase!, list a few post-translational modifications of CDK itself, alignments should be unified
CDK1 Bradley Billac no files posted, apparently, please contact instructor, something strange: I don’t see the documents
CDK1 Christopher Caywood missing part 2 entirely, and need to use the headings, this gene dossier is undeveloped
CDK1 Elizabeth Mays protein alignment should be unified into a single multiple alignment, with colors highlighting conserved amino acids
CENPA Brianna Barry could use more detail and clear explanations for the post-translational modifications; alignments should be better organized
CENPA Jeong Lee nice list of post-translational modifications; specifically, protein localizes to centromeres; ortholog proteins not actually listed, just says if they have them or not
CYCS Pranav Bhamidipati alignments need to be unified into a single multiple alignment
CYCS Erin Kim S. pombe and E. coli must have orthologs, what is the function of this protein: why do cells have an electron transport chain?, don’t copy the BLAST window, show alignments with highlighting
CYCS Megan Weijiang protein alignment should be unified into a single multiple alignment, with colors highlighting conserved amino acids
DHX9 Lucas Akin distinguish between biochemical and cellular functions, alignments should be summarized, conserved residues highlighted, are you sure E. coli does not have orthologs?
DHX9 Aman Jaiswal nice site, nice alignment, but highlighting of conserved amino acids with color would be useful, are there really that many paralogs? Just wondering
DHX9 Yoori Kim RNA transcripts are not necessarily splice variants, should find orthologs in more species, including the yeasts, and are all those paralogs real?
DHX9 Quynh-Nhi Nguyen cellular or biological roles are its functions in the cell, which you have listed in a description, but put the information under the correct heading; alignment should be a multiple alignment with highlighting of conserved amino acids
DYNC1H1 Allen Gwo there are additional cellular roles; can all those paralogs be real?, use standard gene names in the orthologs table, alignment is unclear, those sequences are not lined up
DYNC1H1 Benjamin Hong need to use the organizational headings and order I requested; post-translational modification information is minimal, are there really so many paralogs in human? Use highlighting to indicate conserved protein alignment regions
DYNC1H1 Jae Hee Kim nice visual display, the complex this protein is in is dynein, so it definitely does not work alone, multiple alignment should highlight conserved residues
DYNC1H1 Rachel Wagstaff dyneins do function in a complex with other protein subunits (light, intermediate and heavy chain subunits); structural image is not visible, nice alignment
EIF2S1 Christina Hull distinguish between biochemical and cellular functions, there must be many other complex members (not substrates), what are the functions of the post-translational modifications?, organize the alignment, no raw BLAST pictures
EIF2S1 Ali Afaq Khawaja need gene names for the orthologs, need to highlight conserved amino acids in the protein alignments, notes on post-translational modifications?
EIF2S1 Michael Lin protein sequence alignment is small and hard to read, there are no labels on the alignment to indicate which proteins are aligned
EIF2S1 Henry Nguyen functional information is minimal; no alignments completed
HSP90 Jonathan Peltier I want to see all of the information you gather put on a Google website, you have added an extra category for protein function, apparently does not function in a complex but you don’t state that
HSP90 Christine Rafie orthologs and paralogs section missing (replaced by a map of Paris?)
HSP90 Esther Shu orthologs/paralogs section is not in webpage format, but requires downloading files from a listing, alignments also need to be presented in a multiple alignment with highlighting of conserved amino acids
MAPK1 Homero Dominguez paralogs/orthologs needs better organization and model organism information and more information on biological role, but nice project so far
MAPK1 Mark Lerma alignment graphs are unlabeled/unorganized (what does each of the lines stand for? How do I read this?)
MAPK1 Chase Meyer need to present the multiple protein sequence alignment and highlight conserved amino acid residues
MAPK1 Layla Nejad use the headings that were provided, alignment should be a multiple alignment of the protein sequences, with conserved regions highlighted, the alignment provided has no useful information
mTOR Meghan Baker no alignment
mTOR Anna Brown nice alignment, with highlights
mTOR Katherine Cook this protein should have orthologs in all eukaryotes; protein alignments should be a multiple alignment with conserved amino acids highlighted
MVK Sarah Gorring orthologs table is poorly organized, and lacking orthologs from fruit fly, I don’t know what the paralog list represents or how to interpret the splicing information; your alignments must be presented, not just as a link
MVK Ju-Yun Kuo should distinguish between biochemical activity and cellular functions, there are post-translational modifications, find them; need gene or protein names in ortholog list, alignment needs work
MYH1 Christine Lam cellular or biological roles are its functions in the cell, such as vesicle transport, chromosome movement by microtubules, etc.; alignment graphs are unlabeled/unorganized (what does each of the lines stand for? How do I read this?)
MYH1 Steven Mai meaning of the splicing information is unclear; no alignments
MYH1 Marina Martins protein alignment should be unified into a single multiple alignment, with colors highlighting conserved amino acids
POLD1 Amita Kulshreshtha need more detail on the biochemical and cellular functions, need gene or protein names of orthologs, are all those paralogs real?
POLD1 Karine Ovadia-Combe nice project; mouse should have an ortholog, also list the human gene; I still have not seen your alignments, there appears to be a problem with seeing it
POLG1 Simon Gonzalez-Esteva need specific information about post-translational modifications, nice multiple alignment with highlighting
POLG1 Hunter Ratliff the cut-and-pasted splice variant information is not very useful; show me a multiple protein alignment, don’t just provide a bunch of links
POLG1 James Tran protein biochemical function: DNA polymerase; alignment should be a multiple sequence alignment with conserved amino acids highlighted
POLG1 Aubrey Trapp alignments are as raw text files downloadable from website, very inconvenient, should list function of the post-translational modifications, if known
PPP2R4 Brant Campodonico protein is not a cis-prolyl isomerase, does not function individually, is in a trimeric complex, alignments need to be unified into a single multiple alignment, with conserved amino acids highlighted
PPP2R4 Rhedda Onihana paralogs should be in table form, they are hard to read, protein alignment should be a single multiple alignment with highlights marking conserved amino acids
PSMC3 Kevin Gian transcripts are not equivalent to splice variants, I don’t think those paralogs can all be real, look at definition of a paralog, bring alignments together into a multiple alignment
PSMC3 Jose Guerra nice site, but please organize in the order I specified so I can look through it more easily, please list the gene or protein name in the orthologs table, also those paralogs cannot all be correct, can they?
RAD51 Julia Chernis need gene names in the ortholog table, also has an E. coli ortholog, RecA
RAD51 Aidana Omarbekova nice project, need to highlight conserved amino acids in the multiple sequence alignment
RAD51 Alan Te orthologs table needs gene names; no alignments
RAN Maytee Chantharayukhont no alignment
RAN Tri Do nice project, well organized, good multiple alignment, but you will want to highlight conserved amino acids in the alignment
RAN Kyoung Oe don’t sprinkle hyperlinks all over your project, alignment needs highlighting to show conserved amino acids
RAN Collin Pullara where is paralogs table?, alignment not done (has “search strategy” file instead), list functions of post-translational modifications
REC8 Norma Arroyo no orthologs
RPL28 Roald Menodiado could use better description of biochemical and biological cellular functions; what are the post-translational modifications of RPL28 itself?, orthologs and paralogs need gene names, alignments should show amino acid sequences and conserved regions, all in one multiple alignment
RPL28 Christine Nguyen sequence alignments should be a multiple protein sequence alignment with conserved amino acids highlighted; Website says “*still trying to figure out how to properly run BLAST”
RPPH1 Yasmeen Abu-Rezeq nicely organized, C. elegans and E. coli have orthologs, protein activity is catalytic (it is the catalytic subunit of RNase P)
RPPH1 Jessica Yi need names of protein orthologs
RPPH1, RPP30 William Eveland isoforms are not paralogs, orthologs/paralogs need standard gene or protein names, show multiple alignments together
SIRT2 Alberto Amatong I’m not sure you are getting the closest orthologs, check those protein identities
SIRT2 Christopher Jackson many extra documents clutter the site, so I can’t tell where the important information might be, many sources are simply copied to the site, orthologs need gene names
SIRT2 Meg Maddox not all transcripts are splicing variants; has the protein sequences but no alignments
SPO11 Benjamin Corona there are post-translational modifications of this protein; missing part 2
SPO11 Dallas Miller alignment should be a multiple protein sequence alignment with conserved amino acids highlighted
TCP1 Caroline Chen are those paralogs really paralogs?, complex also includes CCT6A, CCT6B, CCT7; alignments should be unified
TCP1 Alexa Johnson nice project so far, need to find the S. pombe ortholog, also I think the E. coli GroEL and GroES proteins are distant orthologs; look into that
TERT Jaeeun Go transcripts are not equivalent to splice variants, orthologs have some errors, they are just repeated protein names, you are doing a bit too much cutting and pasting
TERT Haley Hendrix need an explanation for the splicing variants, are there really 6 paralogs in human? Use standard gene names in the tables
TERT Hyun Jung are the multiple protein sequence alignments lined up correctly? Would like to see more orthologous proteins in that alignment
TERT Peyton Sarmiento protein information is minimal, almost certainly it has some post-translational modifications; alignments should show the protein sequences, with conserved amino acids highlighted
TOP2A Arno Dunstatter splicing isoforms are not paralogs, need gene name on first page, show multiple alignments together
TOP2A Catherine Mortensen much of part 1 is unfinished; no protein sequence alignments
TOP2A Simon Yu the cut-and-pasted splice variant information is not very useful; show me a multiple protein alignment, don’t just provide a bunch of links
NA Hyun-Young Lee looks good so far, but need to set preferences so that I can see the site more easily, not just with a UT e-mail address
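
Since an inner join silently drops any row that has no partner in the other table, one way to double-check the pairing is to list the rows left over on each side. The sketch below does this in base R (`merge` for the inner join) so it runs without dplyr; in dplyr, `anti_join` gives the same leftovers. The toy rows are copied from the first few lines of the two printouts above, and `key_of`, `only_in_sheet`, and `only_in_roster` are hypothetical names used only for illustration.

```r
# Toy data copied from the printed heads above (Score and Comments omitted)
gene.dos <- data.frame(
  Last  = c("ABAY", "ABU-REZEQ", "AKIN", "ALI"),
  First = c("J", "Y", "L", "H"),
  Gene  = c("POLD1", "RPPH1", "DHX9", "CDC14A"),
  stringsAsFactors = FALSE
)
canvas <- data.frame(
  Last      = c("ABU-REZEQ", "AKIN", "ALI", "ARROYO"),
  First     = c("Y", "L", "H", "N"),
  FirstName = c("YASMEEN", "LUCAS", "HASAN", "NORMA"),
  stringsAsFactors = FALSE
)

# Inner join: keep only rows whose (Last, First) pair appears in both tables
matched <- merge(gene.dos, canvas, by = c("Last", "First"))

# Build one comparable key per row, then list the rows the join drops
key_of <- function(df) paste(df$Last, df$First, sep = "|")
only_in_sheet  <- gene.dos[!(key_of(gene.dos) %in% key_of(canvas)), ]
only_in_roster <- canvas[!(key_of(canvas) %in% key_of(gene.dos)), ]

print(only_in_sheet)   # spreadsheet rows with no roster match
print(only_in_roster)  # roster rows with no spreadsheet match
```

The leftovers in this toy run come only from truncating both tables to a few rows, so they say nothing about the real data. Matching on last name plus first initial also assumes that pair is unique in the class; two students sharing both would be cross-matched.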

Contact

Hunter Ratliff

Email: HunterRatliff1@gmail.com
Twitter: @HunterRatliff1

Copyright (C) 2015 Hunter Ratliff

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program.  If not, see <http://www.gnu.org/licenses/>.

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.