Commit 756269a

add slight edit about MSAs and mapping
1 parent d20b9bd commit 756269a

1 file changed (+9, -7 lines)

cookbook/02-alignments.md

Lines changed: 9 additions & 7 deletions
@@ -6,18 +6,20 @@ rss_descr = "Align a gene against a reference genome using BioAlignments.jl"
 As mentioned in the previous [tutorial]("../01-sequences.md"), in this chapter, we will learn about alignments.
 We will explore pairwise alignment as a tool to compare two copies of the _mecA_ gene found on NCBI.

-# Pairwise Alignment
+# `BioAlignments` Implements Only Pairwise Alignment

 On the most basic level, aligners use algorithms to "line up" sequences
 and look for regions of similarity.

-BioAlignments implements only pairwise alignment.
+`BioAlignments` implements pairwise alignment.
 Pairwise alignment differs from multiple sequence alignment (MSA) because
 it only aligns two sequences, while MSAs align any number of sequences.
+There is not currently a MSA package in Julia.

 Pairwise alignment also assumes that the two sequences are roughly homologous.
 For example, you may use it to align two versions of the same gene.
 It is not used to map reads to a genome -- mapping would be a better solution for that.
+If mapping is your goal, you can use a mapper like `minimap2` and parse the result with `PairwiseMappingFormat.jl`.

 # Running the Alignment
 There are two main parameters for determining how we want to perform our alignment:
@@ -26,7 +28,7 @@ alignment type and score/cost model.
 The alignment type specifies the alignment range (local vs global alignment)
 and the score/cost model explains how to score matches/mismatches in the sequences that are being compared.

-### Alignment Types
+## Alignment Types
 Currently, four types of alignments are supported:
 - `GlobalAlignment`: global-to-global alignment
 - Aligns sequences end-to-end
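
As a rough illustration of the two parameters this hunk describes (alignment type and score model), the sketch below runs the same pair of sequences through `pairalign` twice, changing only the alignment type. The sequences and the gap penalties are made up for the example and are not the _mecA_ data or the cookbook's chosen values.

```julia
using BioSequences
using BioAlignments

# Hypothetical toy sequences (not the mecA genes); seq_b lacks one "T".
seq_a = dna"ACGTTGCA"
seq_b = dna"ACGTGCA"

# Score model: EDNAFULL substitution matrix plus affine gap penalties.
# The gap values are arbitrary choices for this sketch.
scoremodel = AffineGapScoreModel(EDNAFULL, gap_open=-5, gap_extend=-1)

# The alignment type is the first argument, the score model the last.
global_res = pairalign(GlobalAlignment(), seq_a, seq_b, scoremodel)
local_res  = pairalign(LocalAlignment(), seq_a, seq_b, scoremodel)

println("global score: ", score(global_res))
println("local score:  ", score(local_res))
```

A local alignment is free to drop poorly matching ends, so its score should never fall below the global score under the same model.
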
@@ -65,7 +67,7 @@ The alignment type should be selected based on what is already known about the s
 - Are we looking at two sequences from wildly divergent organisms?


-### Cost Model
+## Cost Model

 The cost model provides a way to calculate penalties for differences between the two sequences,
 and then finds the alignment that minimizes the total penalty.
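
To make "minimizes the total penalty" concrete, here is a minimal sketch in plain Julia of a unit-cost edit distance computed by dynamic programming. It is not how `BioAlignments` is implemented, and the costs (match = 0, mismatch = insertion = deletion = 1) and the function name `edit_distance` are assumptions chosen for illustration.

```julia
# Unit-cost edit distance: match = 0, mismatch = insertion = deletion = 1.
# Illustrative only; not the BioAlignments implementation.
function edit_distance(a::AbstractString, b::AbstractString)
    x, y = collect(a), collect(b)
    m, n = length(x), length(y)
    # dp[i + 1, j + 1] holds the minimal cost of aligning the first i
    # symbols of `a` with the first j symbols of `b`.
    dp = zeros(Int, m + 1, n + 1)
    dp[:, 1] .= 0:m   # a prefix of `a` against nothing: i deletions
    dp[1, :] .= 0:n   # nothing against a prefix of `b`: j insertions
    for i in 1:m, j in 1:n
        substitution = dp[i, j] + (x[i] == y[j] ? 0 : 1)
        deletion     = dp[i, j + 1] + 1
        insertion    = dp[i + 1, j] + 1
        # Keep whichever choice gives the smallest total penalty so far.
        dp[i + 1, j + 1] = min(substitution, deletion, insertion)
    end
    return dp[m + 1, n + 1]
end

println(edit_distance("ACGT", "AGT"))   # one base is missing in the second string
```

Every cell keeps only the cheapest way to reach it, so the bottom-right cell ends up holding the minimal total penalty over all possible alignments.
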
@@ -107,7 +109,7 @@ Due to the similarity in the genes we are comparing, it makes the most sense to

 In this first example, we'll align two strings that contain the genes.

-## Running Alignment on BioSequences Object
+## Aligning BioSequences Object

 ```julia
 using BioAlignments
@@ -122,7 +124,7 @@ res = pairalign(GlobalAlignment(), mecA, mecA1, scoremodel)
 ```


-## Running Alignment on FASTX files
+## Aligning FASTX files
 In this next example, we'll repeat the same alignment,
 but read in the files directly from the FASTA files containing the gene.
 Running the alignment on strings is straightforward with short sequences,
@@ -149,7 +151,7 @@ res_fasta = pairalign(GlobalAlignment(), mecA_fasta, mecA1_fasta, scoremodel)
 ```


-### Understanding How Alignments Are Represented
+# Understanding How Alignments Are Represented
 The output of an alignment is a series of `AlignmentAnchor` objects.
 This data structure gives information on the position of the start of the alignment,
 sections where nucleotides match, as well as where there may be deletions or insertions.
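
One possible way to look at the anchors described in this hunk is sketched below on a pair of made-up sequences (not the _mecA_ data). The `count_*` helpers are exported by `BioAlignments`. The field path `aln.a.aln.anchors` is an assumption about the package's internal layout rather than a documented accessor, so it may differ between versions.

```julia
using BioSequences
using BioAlignments

# Hypothetical toy sequences; `ref` lacks one "T" relative to `seq`.
seq = dna"ACGTTGCA"
ref = dna"ACGTGCA"

# Gap penalties are arbitrary values chosen for this sketch.
scoremodel = AffineGapScoreModel(EDNAFULL, gap_open=-5, gap_extend=-1)
res = pairalign(GlobalAlignment(), seq, ref, scoremodel)
aln = alignment(res)

# Summary counts over the aligned positions.
println("matches:    ", count_matches(aln))
println("mismatches: ", count_mismatches(aln))
println("insertions: ", count_insertions(aln))
println("deletions:  ", count_deletions(aln))

# Assumption: internal field access, not a stable public API.
anchors = aln.a.aln.anchors
for anchor in anchors
    # Each anchor records positions in the sequence, reference, and alignment,
    # plus the operation (start, match, insertion, deletion, ...) ending there.
    println(anchor)
end
```
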
