14 Sep
Language: C++ only. You may use only if/else statements, for (while) loops, arrays, functions, and vectors.
#include <iostream>
#include <vector>
#include <string>
#include <fstream>
#include <cctype>
#include <algorithm>
mRNA molecules are created from a single strand of DNA through base-pair matching, with one exception: instead of thymine (T), mRNA has a base called uracil (U).
For example, if the DNA strand has the bases ATGCCCGTTA, its corresponding mRNA is UACGGGCAAU. A is paired with U, T is paired with A, C is paired with G, and G is paired with C.
After the mRNA is transcribed, it is read in codons, which are triplets of bases. Each codon has a specific meaning and corresponds to a specific amino acid. The amino acids are then chained in sequence to create a protein.
How do we indicate the start and end of an amino acid sequence? Some codons are reserved for this: the AUG codon marks the start of protein synthesis, while three other codons mark the end: UAG, UGA, and UAA.
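The codon rules above can be sketched in a few lines. This is only an illustration: the helper names `splitIntoCodons`, `isStartCodon`, and `isStopCodon` are hypothetical and are not part of the assignment template.

```cpp
#include <string>
#include <vector>
using namespace std;

// Hypothetical helper: cut an mRNA string into consecutive triplets,
// beginning at index `start`. Trailing bases that do not fill a whole
// triplet are dropped.
vector<string> splitIntoCodons(const string& mrna, size_t start) {
    vector<string> codons;
    for (size_t i = start; i + 3 <= mrna.size(); i += 3) {
        codons.push_back(mrna.substr(i, 3));
    }
    return codons;
}

// AUG is the only start codon named in the text.
bool isStartCodon(const string& codon) {
    return codon == "AUG";
}

// UAG, UGA and UAA are the three stop codons named in the text.
bool isStopCodon(const string& codon) {
    return codon == "UAG" || codon == "UGA" || codon == "UAA";
}
```

For instance, `splitIntoCodons("AUGGACUAAG", 0)` gives the three codons AUG, GAC, and UAA; the final G is left over and discarded.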
The first file ecoli.fa is a FASTA file which contains the DNA sequence data. Here is an excerpt from the file:
>Chromosome dna_rm:chromosome chromosome:ASM584v2:Chromosome:1:4641652:1 REF
AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTC
TGATAGCAGCTTCTGAACTGGTTACCTGCCGTGAGTAAATTAAAATTTTATTGACTTAGG
TCACTAAATACTTTAACCAATATAGGCATAGCGCACAGACAGATAAAAATTACAGAGTAC
ACAACATCCATGAAACGCATTAGCACCACCATTACCACCACCATCACCATTACCACAGGT
AACGGTGCGGGCTGACGCGTACAGGAAACACAGAAAAAAGCCCGCACCTGACAGTGCGGG
CTTTTTTTTTCGACCAAAGGTAACGAGGTAACAACCATGCGAGTGTTGAAGTTCGGCGGT
ACATCAGTGGCAAATGCAGAACGTTTTCTGCGTGTTGCCGATATTCTGGAAAGCAATGCC
AGGCAGGGGCAGGTGGCCACCGTCCTCTCTGCCCCCGCCAAAATCACCAACCACCTGGTG
GCGATGATTGAAAAAACCATTAGCGGCCAGGATGCTTTACCCAATATCAGCGATGCCGAA
CGTATTTTTGCCGAACTTTTGACGGGACTCGCCGCCGCCCAGCCGGGGTTCCCGCTGGCG
CAATTGAAAACTTTCGTCGATCAGGAATTTGCCCAAATAAAACATGTCCTGCATGGCATT
AGTTTGTTGGGGCAGTGCCCGGATAGCATCAACGCTGCGCTGATTTGCCGTGGCGAGAAA
ATGTCGATCGCCATTATGGCCGGCGTATTAGAAGCGCGCGGTCACAACGTTACTGTTATC
GATCCGGTCGAAAAACTGCTGGCAGTGGGGCATTACCTCGAATCTACCGTCGATATTGCT
GAGTCCACCCGCCGTATTGCGGCAAGCCGCATTCCGGCTGATCACATGGTGCTGATGGCA
GGTTTCACCGCCGGTAATGAAAAAGGCGAACTGGTGGTGCTTGGACGCAACGGTTCCGAC
TACTCTGCTGCGGTGCTGGCTGCCTGTTTACGCGCCGATTGTTGCGAGATTTGGACGGAC
GTTGACGGGGTCTATACCTGCGACCCGCGTCAGGTGCCCGATGCGAGGTTGTTGAAGTCG
ATGTCCTACCAGGAAGCGATGGAGCTTTCCTACTTCGGCGCTAAAGTTCTTCACCCCCGC
The second file in the project folder is a CSV file named codon_table.csv which contains the codon list. Here is an excerpt from the file:
| Codon | AA.Abv | AA.Code | AA.Name |
| UUU | Phe | F | Phenylalanine |
| UUC | Phe | F | Phenylalanine |
| UUA | Leu | L | Leucine |
| UUG | Leu | L | Leucine |
| CUU | Leu | L | Leucine |
In the above table, AA.Abv is the abbreviation of the amino acid, AA.Code is the code for the amino acid, and AA.Name is the full name of the amino acid. There are 64 codons in the file. One amino acid can be represented by multiple codons, all of which produce the same amino acid. For example, both the UUU and UUC codons are translated as phenylalanine.
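The many-to-one relationship between codons and amino acids can be illustrated with a small lookup table. This is a minimal sketch, assuming each row is stored as a {codon, AA.Code} pair in a vector of vectors; `sampleCodons` and `lookupCode` are hypothetical names, and only the five rows from the excerpt are included.

```cpp
#include <string>
#include <vector>
using namespace std;

// Five of the 64 rows, stored as {codon, AA.Code} pairs (illustrative data
// taken from the excerpt above).
vector<vector<string>> sampleCodons = {
    {"UUU", "F"}, {"UUC", "F"}, {"UUA", "L"}, {"UUG", "L"}, {"CUU", "L"}
};

// Hypothetical helper: linear search for a codon.
// Returns the AA.Code of the first matching row, or "" if the codon is absent.
string lookupCode(const vector<vector<string>>& table, const string& codon) {
    for (size_t i = 0; i < table.size(); i++) {
        if (table[i][0] == codon) {
            return table[i][1];
        }
    }
    return "";
}
```

Both `lookupCode(sampleCodons, "UUU")` and `lookupCode(sampleCodons, "UUC")` return `"F"`, which is the phenylalanine example from the text. A linear search is enough here because the table has only 64 rows.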
Write a function transcribe(dna_string) that creates the mRNA string from the DNA string. Each base in dna_string must be matched to its corresponding mRNA base. The DNA string might contain characters other than A, T, C, and G; these should be ignored. The only valid matchings are A→U, T→A, G→C, and C→G.
string transcribe(string dna_string){
//this function must take the DNA string and construct a new mRNA string
//then return the mRNA string
}
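One possible way to fill in this stub is sketched below. It is not the official solution. It assumes that "strange characters" means anything other than the uppercase letters A, T, G, and C, so lowercase letters, N, and whitespace are all skipped; if lowercase bases should count, they would need to be uppercased first.

```cpp
#include <string>
using namespace std;

// Sketch of transcribe: build the mRNA by pairing each DNA base.
// Assumption: only uppercase A, T, G, C are valid; everything else is skipped.
string transcribe(string dna_string) {
    string mrna;
    for (size_t i = 0; i < dna_string.size(); i++) {
        char base = dna_string[i];
        if (base == 'A') {
            mrna += 'U';
        } else if (base == 'T') {
            mrna += 'A';
        } else if (base == 'G') {
            mrna += 'C';
        } else if (base == 'C') {
            mrna += 'G';
        }
        // any other character adds nothing to the result
    }
    return mrna;
}
```

With the example from the introduction, `transcribe("ATGCCCGTTA")` returns `"UACGGGCAAU"`.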
Write a function translate which accepts the mRNA string as a parameter and creates a string vector of proteins. Each item in the vector is a string that consists of the amino acid codes of the protein. The function must return the protein vector as a result.
Each protein’s amino acid sequence starts with M (Methionine), which is the starting amino acid, and ends with Stop. So the function should:
look for mRNA sequences that start with the AUG codon;
detect the end (UAG, UGA, or UAA codon);
in between, identify the corresponding amino acids for the codons to construct the protein;
save the protein string in the vector (use the push_back function).
vector<string> translate(string mrna_string) {
//create a protein vector and return it.
}
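The steps listed above can be sketched as follows. The scanning rule is an assumption, since the assignment does not spell it out: slide one base at a time until an AUG is found, read whole triplets from there until a stop codon, then resume the search right after that stop codon. A protein that reaches the end of the input without a stop codon is discarded. Traced by hand on the FASTA excerpt, this rule appears to reproduce the first two proteins of the sample output (MDGTHLILKStop and MKLVISVSRVCLFLMSHVLStop). `mycodons` and `findAA` mirror the template so that the block stands alone.

```cpp
#include <string>
#include <vector>
using namespace std;

// {codon, AA.Code} rows, as in the template (filled by readCsvFile there).
vector<vector<string>> mycodons;

// Same linear lookup as the template's findAA; "" means "codon not found".
string findAA(string codon) {
    for (size_t i = 0; i < mycodons.size(); i++) {
        if (mycodons[i][0] == codon) {
            return mycodons[i][1];
        }
    }
    return "";
}

// Sketch of translate under the scanning assumption described above.
vector<string> translate(string mrna_string) {
    vector<string> proteins;
    size_t i = 0;
    while (i + 3 <= mrna_string.size()) {
        if (mrna_string.substr(i, 3) != "AUG") {
            i++;  // not a start codon: slide forward by one base
        } else {
            string protein;
            bool stopped = false;
            size_t j = i;
            // read triplets from the start codon until a stop codon
            while (j + 3 <= mrna_string.size() && !stopped) {
                string codon = mrna_string.substr(j, 3);
                protein += findAA(codon);  // the stop row contributes "Stop"
                if (codon == "UAG" || codon == "UGA" || codon == "UAA") {
                    stopped = true;
                }
                j += 3;
            }
            if (stopped) {
                proteins.push_back(protein);  // keep only terminated proteins
            }
            i = j;  // resume after the stop codon (or at the end of input)
        }
    }
    return proteins;
}
```

Because the search resumes after the stop codon, an AUG inside a protein simply adds another M to it and does not start a second, overlapping protein.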
Use the following print and main functions to connect the steps and print the resulting protein vector:
void print_protein_list(vector<string> list) {
    for(string line : list)
    {
        cout << line << endl;
    }
    cout << list.size() << " proteins listed" << endl;
}
int main() {
    string dnastring = readFastaFile("ecoli.fa");
    readCsvFile("codon_table.csv");
    string mrnastring = transcribe(dnastring);
    vector<string> protein_list = translate(mrnastring);
    print_protein_list(protein_list);
    return 0;
}
The first few lines of the output should look like this:
0 -> MDGTHLILKStop
1 -> MKLVISVSRVCLFLMSHVLStop
2 -> MVVVVVMVSIATPDCACPLCLFSGVDCHARKKKLVSIAPLLVRSQLQAAMStop
3 -> MGYSRYGLHKNGLKTALSGGGSAPRATALTFESSStop
4 -> MTIARPAFStop
5 -> MELRWQLStop
6 -> MRRRHDRRTNARLTTLStop
7 -> MDAGRSPRATLQQLQLQDGPSLPRKDEAAISRSGGVVMGVAGQGLGNGLIFMAFRSSWSMRVTTVGTTSAStop
8 -> MRStop
9 -> MSStop
10 -> MDLDFLPNDLGDRHCLADRStop
11 -> MSSLRQVMASPIASQTLTAATATATRRPRStop
12 -> MHRRHNGStop
….
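Each line of the sample output carries an index prefix such as `0 -> `, which the print function given in the assignment does not produce. If that prefix is wanted, a variant along these lines might be used. Both `formatProteinLine` and `print_protein_list_indexed` are hypothetical names, and the exact format is inferred from the sample lines only.

```cpp
#include <iostream>
#include <string>
#include <vector>
using namespace std;

// Hypothetical helper: build one output line in the "index -> protein" form
// seen in the sample output.
string formatProteinLine(size_t index, const string& protein) {
    return to_string(index) + " -> " + protein;
}

// Hypothetical variant of print_protein_list that adds the index prefix.
void print_protein_list_indexed(const vector<string>& list) {
    for (size_t i = 0; i < list.size(); i++) {
        cout << formatProteinLine(i, list[i]) << endl;
    }
    cout << list.size() << " proteins listed" << endl;
}
```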
The full contents of codon_table.csv:
| Codon | AA.Abv | AA.Code | AA.Name |
| UUU | Phe | F | Phenylalanine |
| UUC | Phe | F | Phenylalanine |
| UUA | Leu | L | Leucine |
| UUG | Leu | L | Leucine |
| CUU | Leu | L | Leucine |
| CUC | Leu | L | Leucine |
| CUA | Leu | L | Leucine |
| CUG | Leu | L | Leucine |
| AUU | Ile | I | Isoleucine |
| AUC | Ile | I | Isoleucine |
| AUA | Ile | I | Isoleucine |
| AUG | Met | M | Methionine |
| GUU | Val | V | Valine |
| GUC | Val | V | Valine |
| GUA | Val | V | Valine |
| GUG | Val | V | Valine |
| UCU | Ser | S | Serine |
| UCC | Ser | S | Serine |
| UCA | Ser | S | Serine |
| UCG | Ser | S | Serine |
| CCU | Pro | P | Proline |
| CCC | Pro | P | Proline |
| CCA | Pro | P | Proline |
| CCG | Pro | P | Proline |
| ACU | Thr | T | Threonine |
| ACC | Thr | T | Threonine |
| ACA | Thr | T | Threonine |
| ACG | Thr | T | Threonine |
| GCU | Ala | A | Alanine |
| GCC | Ala | A | Alanine |
| GCA | Ala | A | Alanine |
| GCG | Ala | A | Alanine |
| UAU | Tyr | Y | Tyrosine |
| UAC | Tyr | Y | Tyrosine |
| UAA | Ochre | Stop | ‘’ |
| UAG | Amber | Stop | ‘’ |
| CAU | His | H | Histidine |
| CAC | His | H | Histidine |
| CAA | Gln | Q | Glutamine |
| CAG | Gln | Q | Glutamine |
| AAU | Asn | N | Asparagine |
| AAC | Asn | N | Asparagine |
| AAA | Lys | K | Lysine |
| AAG | Lys | K | Lysine |
| GAU | Asp | D | Aspartic-acid |
| GAC | Asp | D | Aspartic-acid |
| GAA | Glu | E | Glutamic-acid |
| GAG | Glu | E | Glutamic-acid |
| UGU | Cys | C | Cysteine |
| UGC | Cys | C | Cysteine |
| UGA | Opal | Stop | ‘’ |
| UGG | Trp | W | Tryptophan |
| CGU | Arg | R | Arginine |
| CGC | Arg | R | Arginine |
| CGA | Arg | R | Arginine |
| CGG | Arg | R | Arginine |
| AGU | Ser | S | Serine |
| AGC | Ser | S | Serine |
| AGA | Arg | R | Arginine |
| AGG | Arg | R | Arginine |
| GGU | Gly | G | Glycine |
| GGC | Gly | G | Glycine |
| GGA | Gly | G | Glycine |
| GGG | Gly | G | Glycine |
The starter code template:
#include <iostream>
#include <vector>
#include <string>
#include <fstream>
#include <cctype>
#include <algorithm>
using namespace std;

string readFastaFile(string path);
void readCsvFile(string path);
string transcribe(string dnaString);
vector<string> translate(string mrnaString);

//global vector of vectors (2-dimensional vector): each row is {codon, AAcode}
vector<vector<string>> mycodons;

string readFastaFile(string path) {
    string dnaString;
    fstream newfile;
    newfile.open(path, ios::in); //open the file for reading
    if (newfile.is_open()) { //checking whether the file is open
        string line;
        getline(newfile, line); //skip the FASTA header line
        while (getline(newfile, line)) { //read each sequence line and append it
            dnaString += line;
        }
        newfile.close(); //close the file object.
    }
    //cout << dnaString.substr(0,100) << endl;
    return dnaString;
}

void readCsvFile(string path) {
    ifstream inFile;
    string codon, Abv, AAcode, name;
    inFile.open(path);
    //read the first entry as column headers
    getline(inFile, codon, ',');
    getline(inFile, Abv, ',');
    getline(inFile, AAcode, ',');
    inFile >> name;
    //read entries until a read fails (testing eof() before reading can add a bogus last row)
    while (getline(inFile, codon, ',') && getline(inFile, Abv, ',') &&
           getline(inFile, AAcode, ',') && (inFile >> name)) {
        vector<string> temp;
        codon.erase(std::remove_if(codon.begin(), codon.end(), ::isspace), codon.end()); //clear white space characters
        temp.push_back(codon);
        AAcode.erase(std::remove_if(AAcode.begin(), AAcode.end(), ::isspace), AAcode.end()); //clear white space characters
        temp.push_back(AAcode); //the entry consists of codon and AAcode
        mycodons.push_back(temp); //add the entry to the mycodons 2D vector
    }
}

//search the AAcode for a specific codon; returns "" if the codon is not found
string findAA(string codon) {
    for (size_t i = 0; i < mycodons.size(); i++) {
        if (mycodons[i][0] == codon) {
            return mycodons[i][1];
        }
    }
    return "";
}

//convert DNA string to mRNA string
string transcribe(string dna_string) {
    // your code comes here
}

//search for codons in the mRNA string and convert them to equivalent AAcodes
//each protein starts with AUG codon and ends with UAG, UGA or UAA codons
//return the list of proteins as a vector
vector<string> translate(string mrna_string) {
    // your code comes here
}

void print_protein_list(vector<string> list) {
    for (string line : list) {
        cout << line << endl;
    }
    cout << list.size() << " proteins listed" << endl;
}

int main() {
    string dnastring = readFastaFile("ecoli.fa");
    readCsvFile("codon_table.csv");
    string mrnastring = transcribe(dnastring);
    vector<string> protein_list = translate(mrnastring);
    print_protein_list(protein_list);
    return 0;
}
