Sometimes we need to convert genetic sequences stored in tabular format into plain-text FASTA format, and often we need to go the other way, from FASTA to tabular format. DNA sequences held in a table can be used as a vector in R, which lets us perform operations such as extracting part of the DNA from each sequence. So, I have written two functions that perform these conversions.
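As a quick illustration of why the tabular form is convenient, here is a minimal sketch of a per-sequence operation in base R. The table `dna_table`, its column names, and the sequences are made up for this example; they are not taken from the sample files.

```r
# Hypothetical example table: names in the first column, sequences in the second
dna_table <- data.frame(
  name = c("seq1", "seq2", "seq3"),
  sequence = c("ATGCGTACGTTA", "GGCATTACGGAT", "TTAGCCGATACG"),
  stringsAsFactors = FALSE
)

# substr() is vectorised, so one call extracts bases 1 to 6 of every sequence
dna_table$first_six <- substr(dna_table$sequence, 1, 6)

# nchar() returns the length of each sequence
dna_table$length <- nchar(dna_table$sequence)

print(dna_table)
```

Because the sequences sit in an ordinary character column, any vectorised string function can be applied to all of them at once, without looping over records.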
Both functions can be sourced into RStudio directly from GitHub.
# devtools provides source_url(), which runs an R script straight from a URL
library(devtools)
library(tidyverse)
source_url("https://raw.githubusercontent.com/lrjoshi/FastaTabular/master/fasta_and_tabular.R")
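For readers who want to see the idea behind such a conversion, the following is a rough base R sketch of a FASTA-to-table-to-FASTA round trip. It is not the code from the sourced script: `fasta_to_table_sketch()` and `table_to_fasta_sketch()` are hypothetical helpers written for this illustration, and they assume names in the first column and sequences in the second.

```r
# Hypothetical helper (not the sourced function): read a FASTA file into a
# two-column data frame, joining sequences that span several lines.
fasta_to_table_sketch <- function(fasta_path) {
  lines <- trimws(readLines(fasta_path))
  lines <- lines[nzchar(lines)]              # drop blank lines
  is_header <- startsWith(lines, ">")
  record_id <- cumsum(is_header)             # each ">" line starts a new record
  seq_names <- sub("^>", "", lines[is_header])
  # Fixed factor levels keep records aligned even if a header has no sequence
  groups <- factor(record_id[!is_header], levels = seq_len(sum(is_header)))
  seqs <- vapply(split(lines[!is_header], groups),
                 paste, character(1), collapse = "")
  data.frame(name = seq_names, sequence = unname(seqs),
             stringsAsFactors = FALSE)
}

# Hypothetical helper: write a table back out as FASTA, assuming names are in
# the first column and sequences in the second.
table_to_fasta_sketch <- function(seq_table, fasta_path) {
  records <- paste0(">", seq_table[[1]], "\n", seq_table[[2]])
  writeLines(records, fasta_path)
}

# Round trip on a small made-up file in a temporary location
fasta_file <- tempfile(fileext = ".fasta")
writeLines(c(">seq1", "ATGC", "GTAC", ">seq2", "GGCATTAC"), fasta_file)

dna_table <- fasta_to_table_sketch(fasta_file)
print(dna_table)

out_file <- tempfile(fileext = ".fasta")
table_to_fasta_sketch(dna_table, out_file)
round_trip <- fasta_to_table_sketch(out_file)
```

The sketch writes only to temporary files, so it does not overwrite anything in the working directory. Note that it writes each sequence on a single line rather than wrapping it at a fixed width.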