Commit 96f1be8

fix typos
1 parent 2d54bc9 commit 96f1be8

1 file changed

Lines changed: 25 additions & 20 deletions

docs/src/rosalind/08-prot.md

@@ -27,37 +27,39 @@
 ```
 
 ### DIY solution
-Let's first tackle this problem by writing our own solution.
+Let's tackle this problem by writing our own solution,
+and then seeing how we can solve it with functions already available in BioJulia.
 
 First, we will check that this is a coding region by verifying that the string starts with a start codon (`AUG`).
 If not, we can still convert the string to protein,
-but we'll throw an error.
+but we'll throw a warning to alert the user.
 There may be a frame shift,
 in which case the returned translation will be incorrect.
 
 We'll also do a check that the string is divisible by three.
 If it is not, this will likely mean that there was a mutation in the string
 (addition or deletion).
-Again, we can still convert as much of the the string as possible.
-However, we should alert the user that this result may be incorrect!
+Again, we can still convert as much of the string as possible.
+However, we should alert the user that the result may be incorrect!
 
-We need to convert this string of DNA to a string of proteins using the RNA codon table.
+Next, we'll need to convert this string of mRNA to a string of proteins using the RNA codon table.
 We can convert the RNA codon table into a dictionary,
 which can map over our codons.
 Alternatively, we could also import this from the BioSequences package,
 as this is already defined [there](https://github.com/BioJulia/BioSequences.jl/blob/b626dbcaad76217b248449e6aa2cc1650e95660c/src/geneticcode.jl#L132).
 
-Then, we'll break the string into codons by slicing at every three characters.
-These codons can be matched to the strings into the RNA codon table to get the corresponding amino acid.
-We'll append this amino acid to a string.
+Then, we'll break the string into codons by slicing it every three characters.
+These codons can be matched against the RNA codon table to get the corresponding amino acid.
+We'll join all these amino acids together to form the final string.
 
-We'll need to deal with any three-character strings that don't match a codon.
-This likely means that there was a mutation in the input DNA string!
+Lastly, we'll need to deal with any three-character strings that don't match a codon.
+This likely means that there was a mutation in the input mRNA string!
 If we get a codon that doesn't match,
 we can return "X" for that amino acid,
 and continue translating the rest of the string.
-However, if we get a string X's,
-that will definitely signal to us that there was some kind of frame shift.
+If we get a string of X's,
+that should signal to the user that there was some kind of frame shift.
+
 
 Now that we have established an approach,
 let's turn this into code!
@@ -68,7 +70,7 @@ using Test
 rna = "AUGGCCAUGGCGCCCAGAACUGAGAUCAAUAGUACCCGUAUUAACGGGUGA"
 
 # note: this can be created by hand
-# or it can be accessed using
+# or it can be accessed from the BioSequences package (see link above)
 codon_table = Dict{String,Char}(
 "AAA" => 'K', "AAC" => 'N', "AAG" => 'K', "AAU" => 'N',
 "ACA" => 'T', "ACC" => 'T', "ACG" => 'T', "ACU" => 'T',
@@ -127,21 +129,24 @@ translate(rna"AUGGCCAUGGCGCCCAGAACUGAGAUCAAUAGUACCCGUAUUAACGGGUGA")
 
 ```
 
-This function is straightforward to use.
-However, there are also additional parameters for us to use.
+This function is straightforward to use,
+especially in the case where the input mRNA has no ambiguous codons
+and is divisible by 3.
+However, there are also additional parameters available for handling other types of strings.
 
 For instance, the function defaults to using the standard genetic code.
 However, if a user wishes to use another codon chart
 (for example, yeast or invertebrate),
 there are others available on [BioSequences.jl](https://github.com/BioJulia/BioSequences.jl/blob/b626dbcaad76217b248449e6aa2cc1650e95660c/src/geneticcode.jl#L130) to choose from.
 
-By default `allow_ambiguous_codons` is `true`.
-However, if a user is giving the function a mRNA string with ambiguous codons that may not be found in the standard genetic code,
-these codons will be translated to the most narrow amino acid which covers all
+By default, `allow_ambiguous_codons` is `true`.
+If a user gives the function a mRNA string with ambiguous codons that may not be found in the standard genetic code,
+these codons will be translated to the narrowest amino acid which covers all
 non-ambiguous codons encompassed by the ambiguous codon.
-By default, ambiguous codons will cause an error.
+If this option is turned off,
+ambiguous codons will cause an error.
 
 Additionally, `alternative_start` is `false` by default.
-If set to true, the starting codon will be Methionine regardless of the starting codon.
+If set to true, the starting amino acid will be Methionine regardless of what the first codon is.
 
 Similar to our function, the BioSequences function also throws an error if the input mRNA string is not divisible by 3.
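
For reference, the DIY approach described in the first hunk (start-codon check, length check, codon lookup with "X" as the fallback) could be sketched roughly as follows. This is not the code from `08-prot.md`: the function name `diy_translate`, the abbreviated `DEMO_CODONS` table, and the choice to mark stop codons with `'*'` and end translation there are all assumptions made for illustration.

```julia
# Hypothetical, abbreviated codon table: only the codons needed for the demo.
# A complete table has 64 entries. '*' marks a stop codon (an assumption here).
const DEMO_CODONS = Dict{String,Char}(
    "AUG" => 'M', "GCC" => 'A', "GCG" => 'A', "CCC" => 'P',
    "AGA" => 'R', "ACU" => 'T', "ACC" => 'T', "GAG" => 'E',
    "AUC" => 'I', "AUU" => 'I', "AAU" => 'N', "AAC" => 'N',
    "AGU" => 'S', "CGU" => 'R', "GGG" => 'G', "UGA" => '*',
)

# Translate an mRNA string codon by codon.
# Warns (rather than errors) on a missing start codon or a trailing partial codon,
# and returns 'X' for any three-character slice that is not in the table.
function diy_translate(mrna::AbstractString; table = DEMO_CODONS)
    startswith(mrna, "AUG") ||
        @warn "Sequence does not start with AUG; the reading frame may be shifted."

    leftover = length(mrna) % 3
    leftover == 0 ||
        @warn "Length is not divisible by 3; ignoring the last $leftover base(s)."

    protein = IOBuffer()
    for i in 1:3:(length(mrna) - leftover)
        codon = mrna[i:i+2]            # slice out three characters
        aa = get(table, codon, 'X')    # unknown codon -> 'X'
        aa == '*' && break             # assumed behaviour: stop at a stop codon
        write(protein, aa)
    end
    return String(take!(protein))
end

diy_translate("AUGGCCAUGGCGCCCAGAACUGAGAUCAAUAGUACCCGUAUUAACGGGUGA")  # "MAMAPRTEINSTRING"
diy_translate("AUGNNNGCC")                                            # "MXA"
```

Using `@warn` instead of `error` keeps the partial translation available to the caller, which matches the wording change in this commit ("throw a warning to alert the user").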

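
The keyword arguments discussed in the last hunk can be tried out with a short sketch like the one below. It assumes BioSequences.jl is installed and that `translate` accepts the `code`, `allow_ambiguous_codons`, and `alternative_start` keywords as described in the text; the variable names are illustrative only, and exact behaviour may differ between package versions.

```julia
using BioSequences

mrna = rna"AUGGCCAUGGCGCCCAGAACUGAGAUCAAUAGUACCCGUAUUAACGGGUGA"

# Default call: standard genetic code.
protein = translate(mrna)

# Same sequence with a different codon chart (yeast mitochondrial code).
yeast_protein = translate(mrna; code = BioSequences.yeast_mitochondrial_genetic_code)

# An ambiguous codon: GCN stands for GCA, GCC, GCG and GCU.
# With the default allow_ambiguous_codons = true this call is accepted.
ambiguous_protein = translate(rna"AUGGCN")

# alternative_start = true: the first amino acid is methionine
# even though the first codon here is GUG rather than AUG.
alt_protein = translate(rna"GUGGCC"; alternative_start = true)

println(protein)
println(yeast_protein)
println(ambiguous_protein)
println(alt_protein)
```

Comparing `protein` and `yeast_protein` is a quick way to see where the two codon charts disagree for a given sequence.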