Aim:
This exercise familiarizes students with the basics of using the BLAST (Basic Local Alignment Search Tool) algorithm for sequence comparison and helps them understand the output it generates. The major objectives are:
– Understanding the basic principle behind BLAST
– Running a BLAST search
– Interpreting BLAST results
– Evaluating the significance of BLAST results
– Customizing BLAST for specific needs
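To make the first objective concrete, the basic principle behind BLAST can be sketched as "seed and extend": find short exact word matches between query and subject, then extend each match in both directions until the score drops too far. The sketch below is a simplified illustration, not the NCBI implementation. The function names, the word size of 4, the match/mismatch scores, and the x_drop cutoff are all assumptions chosen for readability. The real program adds refinements such as longer default words, neighborhood words for proteins, and gapped extension.

```python
def index_words(subject, word_size):
    """Map every word (k-mer) of the subject to the positions where it starts."""
    index = {}
    for i in range(len(subject) - word_size + 1):
        index.setdefault(subject[i:i + word_size], []).append(i)
    return index


def _extend(query, subject, qi, si, step, match, mismatch, x_drop):
    """Walk in one direction (step is +1 or -1) without gaps.

    Returns (best_gain, best_length): the highest score gain seen and the
    number of positions needed to reach it. The walk stops when the running
    gain falls more than x_drop below the best gain.
    """
    best_gain = gain = 0
    best_len = length = 0
    while 0 <= qi < len(query) and 0 <= si < len(subject):
        gain += match if query[qi] == subject[si] else mismatch
        length += 1
        if gain > best_gain:
            best_gain, best_len = gain, length
        elif best_gain - gain > x_drop:
            break
        qi += step
        si += step
    return best_gain, best_len


def seed_and_extend(query, subject, word_size=4, match=1, mismatch=-3, x_drop=5):
    """Return the best ungapped local match as
    (score, query_start, subject_start, length), or None if no word matches."""
    index = index_words(subject, word_size)
    best = None
    for q in range(len(query) - word_size + 1):
        # Every exact word hit is a seed for an ungapped extension.
        for s in index.get(query[q:q + word_size], []):
            right_gain, right_len = _extend(
                query, subject, q + word_size, s + word_size, 1,
                match, mismatch, x_drop)
            left_gain, left_len = _extend(
                query, subject, q - 1, s - 1, -1, match, mismatch, x_drop)
            score = word_size * match + left_gain + right_gain
            hit = (score, q - left_len, s - left_len,
                   left_len + word_size + right_len)
            if best is None or hit[0] > best[0]:
                best = hit
    return best


if __name__ == "__main__":
    # Toy sequences (not from the exercise) sharing the segment ACGTACGT.
    print(seed_and_extend("TTACGTACGTTT", "GGGGACGTACGTCCCC"))
```

Indexing the words first is what makes the search fast: only positions that share an exact word are ever extended, so most of the database is never aligned in detail.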
Procedure:
Go to the NCBI BLAST website (https://blast.ncbi.nlm.nih.gov/Blast.cgi).
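Select the type of BLAST search you want to perform, e.g., nucleotide BLAST (blastn) for a nucleotide query against a nucleotide database, or protein BLAST (blastp) for a protein query against a protein database.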
Enter your query sequence in the appropriate field. You can either enter the sequence directly or upload a file containing the sequence.
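Select the database you want to search against. Several databases are available, such as the nucleotide collection (nt) for nucleotide searches and the non-redundant protein sequences (nr) for protein searches.
Set any additional parameters for the BLAST search under “Algorithm parameters”, such as the expect threshold (the maximum E-value reported), the word size, or the maximum number of target sequences.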
Click the “BLAST” button to start the search.
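Wait for the search to complete. Most searches finish within seconds to a few minutes, but long queries, large databases, or a busy server can make a search take considerably longer.
Review the results of the BLAST search. The results list the alignments between your query sequence and the database sequences, best-scoring first, and report for each hit its score, query coverage, E-value, and percent identity. The lower the E-value, the less likely it is that the match arose by chance.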
Use the BLAST search results to identify similar sequences and learn more about their function and biological significance.
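When evaluating the significance of a hit, it can help to see how the E-value relates to the alignment score. In Karlin-Altschul statistics, the expected number of chance hits with raw score S is E = K·m·n·e^(−λS), where m is the query length and n the database length. The bit score is S' = (λS − ln K) / ln 2, so that E = m·n·2^(−S'). The sketch below implements these two formulas. The function names are hypothetical, and the values of lam and k are placeholders of the kind printed in the statistics section of a BLAST report. The real values depend on the scoring matrix and gap costs, and BLAST also uses length-corrected ("effective") values of m and n that are not modelled here.

```python
import math


def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to a normalized bit score."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def expect_value(raw_score, query_len, db_len, lam, k):
    """Expected number of chance alignments scoring at least raw_score."""
    return k * query_len * db_len * math.exp(-lam * raw_score)


if __name__ == "__main__":
    # Placeholder statistical parameters (assumptions, not read from a report).
    LAM, K = 0.267, 0.041
    query_len = 139          # hypothetical query length in residues
    db_len = 100_000_000     # hypothetical database size in residues
    for raw in (40, 80, 160):
        e = expect_value(raw, query_len, db_len, LAM, K)
        print(f"raw={raw:4d}  bits={bit_score(raw, LAM, K):7.1f}  E={e:.3g}")
```

Two properties follow directly from the formula: the E-value falls exponentially as the score rises, and it grows in proportion to the database size, so the same alignment is less significant in a larger database.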
>example_sequence_1
ATGAAACTTTTCTTGATTTTGCTTGTTTTGCCCCTGGCCTCTTGCTTTTTCACATGTAATAGTAATGCTA
ATCTCTCTATGTTACAATTAGGTGTTCCTGACAATTCTTCAACTATTGTTACGGGTTTATTGCCAACTCA
TTGGTTTTGTGCTAATCAGAGTACATCTGTTTACTCAGCCAATGGTTTCTTTTATATTGATGTTGGTAAT
CACCGTAGTGCTTTTGCGCTCCATACTGGTTATTATGATGCTAATCAGTATTATATTTATGTTACTAATG
AAATAGGCTTAAATGCTTCTGTTACTCTTAAGATTTGTAAGTTTAGTAGAAACACTACTTTTGATTTTTT
AAGTAATGCTTCTAGTTCTTTTGACTGTATAGTTAATTTGTTATTTACAGAACAGTTAGGTGCGCCTTTG
GGCATAACTATATCTGGTGAAACTGTGCGTCTGCATTTATATAATGTAACTCGTACTTTTTATGTGCCAG
CAGCTTATAAACTTACTAAACTTAGTGTTAAATGTTACTTTAACTATTCCTGTGTTTTTAGTGTTGTCAA
CGCCACCGTTACTGTGAATGTCACCACACATAATGGCCGTGTAGTTAACTACACTGTTTGTGATGATTGT
AATGGTTATACTGATAACATATTTTCTGTTCAACAGGATGGCCGCATTCCTAATGGTTTCCCTTTTAATA
ATTGGTTTTTGTTAACTAATGGTTCCACACTAGTGGACGGGGTCTCTAGACTTTATCAACCACTCCGTTT
AACTTGTTTATGGCCTGTACCTGGTCTTAAATCTTCAACTGGTTTTGTTTATTTTAATGCCACTGGTTCT
GATGTTAATTGTAACGGCTATCAACATAATTCTGTTGTTGATGTTATGCGTTACAATCTTAACTTCAGTG
CTAATTCTTTGGACAATCTCAAGAGTGGTGTTATAGTTTTTAAAACTTTACAGTACGATGTTTTGTTTTA
TTGTAGTAATTCTTCCTCAGGTGTTCTTGACACCACAATACCTTTTGGCCCGTCCTCTCAACCTTATTAC
TGTTTTATAAACAGCACTATCAACACTACTCATGTTAGCACTTTTGTGGGTATTTTACCACCCACTGTGC
GTGAAATTGTTGTTGCTAGAACTGGCCAGTTTTATATTAATGGTTTTAAGTATTTCGATTTGGGTTTCAT
AGAAGCTGTCAATTTTAATGTCACGACTGCTAGCGCCACAGATTTTTGGACGGTTGCATTTGCTACTTTT
GTTGATGTTTTGGTTAATGTTAGTGCAACTAACATTCAAAACTTACTTTATTGCGATTCTCCATTTGAAA
AGTTGCAGTGTGAGCACTTGCAGTTTGGATTGCAGGATGGTTTTTATTCTGCAAATTTTCTTGATGATAA
TGTTTTGCCTGAGACTTATGTTGCACTCCCCATTTATTATCAACACACGGACATAAATTTTACTGCAACT
GCATCTTTTGGTGGTTCTTGTTATGTTTGTAAACCACACCAGGTTAATATATCTCTTAATGGTAACACTT
CAGTGTGTGTTAGAACATCTCATTTTTCAATTAGGTATATTTATAACCGCGTTAAGAGTGGTTCACCAGG
TGACTCTTCATGGCACATTTATTTAAAGAGTGGCACTTGTCCATTTTCTTTTTCTAAGTTAAATAATTTT
>hypothetical_protein
MLKSASSLVRSFIRPQTFRLCSSSSTTQGSPSVSSDDEPVILENNPYTKEPRKCLLCSTGVELDYKNSRL
LQQFVSTFSGRVYDRHITGLCDENKKKLIEAIAKSRRAGFMPIFVKDPKYTRDPKLFDPLKPIRPHSFA
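Before pasting a sequence such as the two examples above into the query field, it helps to strip stray whitespace, wrap it as FASTA, and check whether it is a nucleotide or a protein sequence, since that decides between nucleotide BLAST and protein BLAST. The following is a minimal sketch of that check. The helper names, the 90% threshold, and the simple letter-counting heuristic are assumptions made for illustration and are not how NCBI itself detects the sequence type.

```python
NUCLEOTIDE_LETTERS = set("ACGTUN")

# Usual starting points: blastn for nucleotide queries, blastp for protein queries.
SUGGESTED_PROGRAM = {"nucleotide": "blastn", "protein": "blastp"}


def clean_sequence(raw):
    """Remove all whitespace and convert the sequence to upper case."""
    return "".join(raw.split()).upper()


def guess_sequence_type(raw, threshold=0.9):
    """Guess 'nucleotide' or 'protein' from the fraction of A/C/G/T/U/N letters."""
    seq = clean_sequence(raw)
    if not seq:
        raise ValueError("empty sequence")
    fraction = sum(ch in NUCLEOTIDE_LETTERS for ch in seq) / len(seq)
    return "nucleotide" if fraction >= threshold else "protein"


def to_fasta(name, raw, width=70):
    """Format a sequence as FASTA: a '>' header line, then fixed-width lines."""
    seq = clean_sequence(raw)
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Short prefixes of the two example sequences, with spaces as in pasted text.
    examples = {
        "example_sequence_1": "ATGAAACTTTTCTTGATTTTG CTTGTTTTGCCCCTGGCC",
        "hypothetical_protein": "MLKSASSLVRSFIRPQTFRL CSSSSTTQGSPSVSSDDEPV",
    }
    for name, raw in examples.items():
        kind = guess_sequence_type(raw)
        print(name, "->", kind, "->", SUGGESTED_PROGRAM[kind])
        print(to_fasta(name, raw), end="")
```

A letter-frequency heuristic can misclassify short or unusual sequences, for example a short peptide rich in alanine, cysteine, glycine, and threonine, so the guess should be checked against what is known about the sequence.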