Commit 6a35d32

fix typos
1 parent ae2cf2c commit 6a35d32

1 file changed

+17
-12
lines changed

cookbook/03-blast.md

Lines changed: 17 additions & 12 deletions
@@ -6,12 +6,16 @@ rss_descr = "Using NCBIBlast.jl to run BLAST searches"
 +++
 
 # Introduction to BLAST
-A BLAST search allows you to query a sequence (either nucleotide or protein) against an entire database of sequences.
+A BLAST search allows you to query a sequence (either nucleotide or protein) against an entire database of sequences.
+It can be helpful for quickly compare unknown sequences to databases of established reference sequences for purposes such as species identity or assignment gene function.
 
 More information about how to use BLAST can be found in its [manual](https://www.ncbi.nlm.nih.gov/books/NBK569856/).
 
-BLAST's can be run from the BLAST web page [here](https://blast.ncbi.nlm.nih.gov/Blast.cgi).
-A user can simply copy in a nucleotide sequence and search for a best match against all of the databases in NCBI!
+BLAST searches can be run from the command line interface (CLI) or through BLAST web page [here](https://blast.ncbi.nlm.nih.gov/Blast.cgi).
+A user can simply copy in a nucleotide sequence and search for the best match in NCBI!
+While searching from the website is fast and straightforward,
+it only searches against the NCBI databases.
+The CLI allows users to query against both NCBI databases and custom databases.
 
 `NCBIBlast.jl` is a thin wrapper around the BLAST command line tool,
 allowing users to run the tool within Julia.
@@ -34,7 +38,7 @@ Note: [BioTools BLAST](https://biojulia.dev/BioTools.jl/stable/blast/) is a depr
 
 The keywords used in the tool are sent to the shell for running BLAST.
 
-As stated on the Github [docs](https://github.com/BioJulia/NCBIBlast.jl), the julia call
+As stated on the GitHub [docs](https://github.com/BioJulia/NCBIBlast.jl), the Julia call
 
 ```
 blastn(; query = "a_file.txt", db="mydb", out="results.txt")
@@ -58,9 +62,9 @@ More directions on building a BLAST database locally can be found [here](https:/
 
 ## Example: Building a local BLAST database and running the BLAST search
 
-For our first example, we will replicate the example on the NCBIBlast.jl Github.
+For our first example, we will replicate the example on the `NCBIBlast.jl` Github.
 
-First, we will build a local database using a fasta file found in the NCBIBlast github repository ([link here](https://github.com/BioJulia/NCBIBlast.jl/blob/main/test/example_files/dna2.fasta)).
+First, we will build a local database using a FASTA file found in the NCBIBlast github repository ([link here](https://github.com/BioJulia/NCBIBlast.jl/blob/main/test/example_files/dna2.fasta)).
 
 ```
 makeblastdb(; in="assets/dna2.fasta", dbtype="nucl")
@@ -94,7 +98,7 @@ io = IOBuffer();
 blastn(buf; stdout=io, db="assets/dna2.fasta", outfmt="6");
 seek(io, 0);
 ```
-The command `seek(io,0)` moves the cursor of to the start of the captured object (index 0) so it can be read into a dataframe.
+The command `seek(io,0)` moves the cursor to the start of the captured object (index 0) so it can be read into a dataframe.
 
 
 ```
@@ -113,8 +117,8 @@ This output tells us that the query sequence (`Query_1` is the default name sinc
 There is 100% identity on a region that is 38 nucleotides long.
 There are 0 mismatches or gap openings.
 The match starts at index 1 on the query sequence, and ends at index 82.
-This region matches a region on the `Test1` that spans from index 82 to 119.
-The E-value is `5.64e-18`, meaning that it is extremely unlikely that this match occured simply due to chance.
+This region matches a region in the `Test1` sequence spanning from index 82 to 119.
+The E-value is `5.64e-18`, meaning that it is extremely unlikely that this match occurred simply due to chance.
 
 Here is a description of the E-value from the NCBI [website](https://blast.ncbi.nlm.nih.gov/doc/blast-help/FAQ.html):
 > The Expect value (E) is a parameter that describes the number of
@@ -151,7 +155,8 @@ We should see that the query fasta is a direct hit to the _mecA_ gene
 For this BLAST search, I will search against the `core_nt` database,
 which is a faster, smaller, and more focused subset of the traditional `nt` (nucleotide) database.
 This newer database is the default as of August 2024.
-It seeks to reduce redundancy and reduce storage when downloading the database. More information about it can be found [here](https://ncbiinsights.ncbi.nlm.nih.gov/2024/07/18/new-blast-core-nucleotide-database/).
+It seeks to reduce redundancy and storage requirements when downloading the database.
+More information about it can be found [here](https://ncbiinsights.ncbi.nlm.nih.gov/2024/07/18/new-blast-core-nucleotide-database/).
 
 General information about the different kinds of BLAST databases is also available [here](https://www.nlm.nih.gov/ncbi/workshops/2023-08_BLAST_evol/databases.html).
 
@@ -204,10 +209,10 @@ Because of this, the first row in the results is not necessarily a better match
 even though it appears first.
 
 To verify the first hit, we can look up the GenBankID of the first hit: `CP026646.1`.
-The NCBI [page](https://www.ncbi.nlm.nih.gov/nuccore/CP026646.1/) listing this sample confirms that this sample was phenotyped as _S. aureus_.
+The NCBI [page](https://www.ncbi.nlm.nih.gov/nuccore/CP026646.1/) for this sample confirms that this sample was phenotyped as _S. aureus_.
 Our query matches from indices 46719 to 46580.
 When we use the Graphics feature to visualize gene annotations, we see that there is a clear match to _mecA_.
 
 ![BLAST Graphics](assets/mecA_BLAST.png)
 
-Overall, this confirms that our BLAST worked correctly!
+Overall, this confirms that our BLAST worked as corrected!
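
The cookbook text touched by this commit reads the tabular results produced with `outfmt="6"` field by field (identity, alignment length, mismatches, start and end indices, E-value). As a minimal sketch of how those fields line up, the snippet below splits one result line into BLAST's default tabular columns using only Base Julia. The helper `parse_outfmt6_line` and the sample row are hypothetical illustrations, not part of `NCBIBlast.jl` and not output from the cookbook's searches.

```julia
# Column order of BLAST's default tabular output (-outfmt 6).
const OUTFMT6_COLUMNS = (:qseqid, :sseqid, :pident, :length, :mismatch, :gapopen,
                         :qstart, :qend, :sstart, :send, :evalue, :bitscore)

# Hypothetical helper (not an NCBIBlast.jl function):
# convert one tab-separated result line into a NamedTuple.
function parse_outfmt6_line(line::AbstractString)
    fields = split(chomp(line), '\t')
    if length(fields) != length(OUTFMT6_COLUMNS)
        throw(ArgumentError("expected 12 tab-separated fields, got $(length(fields))"))
    end
    vals = (
        String(fields[1]),           # query sequence ID
        String(fields[2]),           # subject (database) sequence ID
        parse(Float64, fields[3]),   # percent identity
        parse(Int, fields[4]),       # alignment length
        parse(Int, fields[5]),       # number of mismatches
        parse(Int, fields[6]),       # number of gap openings
        parse(Int, fields[7]),       # alignment start in the query
        parse(Int, fields[8]),       # alignment end in the query
        parse(Int, fields[9]),       # alignment start in the subject
        parse(Int, fields[10]),      # alignment end in the subject
        parse(Float64, fields[11]),  # E-value
        parse(Float64, fields[12]),  # bit score
    )
    return NamedTuple{OUTFMT6_COLUMNS}(vals)
end

# Made-up row for illustration only.
line = "query_a\tsubject_b\t98.500\t200\t3\t0\t1\t200\t51\t250\t1e-50\t350"
hit = parse_outfmt6_line(line)
println(hit.sseqid, " identity=", hit.pident, " evalue=", hit.evalue)
```

Returning a `NamedTuple` keeps each value addressable by its BLAST column name (for example `hit.qstart`), which makes it easy to check a prose description of a hit against the raw row. For whole result sets, reading the captured output into a dataframe, as the cookbook does, is likely more convenient.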
