# Sequence Alignment
Suppose $M$ is a given alignment between $X$ and $Y$.
The goal is to find an alignment of minimum cost, called the *optimal alignment*.
The cost of $M$ is the sum of gap and mismatch costs:
- Gap Penalty: Every position of $X$ or $Y$ that is not matched in $M$ is a gap and incurs a fixed cost $\delta > 0$.
- Mismatch Cost: For each pair of letters $p,q$ in the alphabet, there is a mismatch cost $\alpha_{pq}$ for lining up $p$ and $q$. (We assume $\alpha_{pp}=0$.)
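
To make the cost definition concrete, here is a minimal sketch that scores one given alignment. It assumes the alignment $M$ is represented as a list of matched 0-based index pairs `(i, j)`, that every unmatched position pays a fixed gap penalty (called `delta` here), and that `alpha` is a function returning the mismatch cost; the names `alignment_cost` and `unit_mismatch` are hypothetical and chosen only for illustration.

```python
def unit_mismatch(p, q):
    """Illustrative mismatch cost: 0 for identical letters, 1 otherwise."""
    return 0 if p == q else 1


def alignment_cost(x, y, matching, delta, alpha):
    """Cost of an alignment given as matched (i, j) index pairs.

    Each unmatched position of x or y counts as one gap (cost delta);
    each matched pair contributes alpha(x[i], y[j]).
    """
    matched_x = {i for i, _ in matching}
    matched_y = {j for _, j in matching}
    gaps = (len(x) - len(matched_x)) + (len(y) - len(matched_y))
    mismatch = sum(alpha(x[i], y[j]) for i, j in matching)
    return delta * gaps + mismatch


# Align "AGT" with "AT" in two different ways, using delta = 2.
skip_g = alignment_cost("AGT", "AT", [(0, 0), (2, 1)], 2, unit_mismatch)
skip_t = alignment_cost("AGT", "AT", [(0, 0), (1, 1)], 2, unit_mismatch)
```

In the first alignment only `G` is left unmatched, so the cost is a single gap penalty. In the second, `G` is lined up with `T` (one mismatch) and the final `T` of `X` is a gap, so it costs more.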
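There are three possible situations we can encounter when comparing characters:
- Match (alignment): The characters are identical
- Mismatch: The characters differ
- Gap: A character is lined up with no character in the other sequence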
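
The three situations (match, mismatch, gap) translate directly into a dynamic-programming recurrence for the minimum alignment cost. The sketch below is one standard way to compute it, not necessarily the method these notes go on to use; the gap penalty `delta`, the cost function `alpha`, and the function names are assumptions made for illustration.

```python
def unit_mismatch(p, q):
    """Illustrative mismatch cost: 0 for identical letters, 1 otherwise."""
    return 0 if p == q else 1


def min_alignment_cost(x, y, delta, alpha):
    """Minimum alignment cost between x and y.

    opt[i][j] is the minimum cost of aligning x[:i] with y[:j].
    """
    m, n = len(x), len(y)
    opt = [[0] * (n + 1) for _ in range(m + 1)]
    # Aligning a prefix with the empty string costs one gap per letter.
    for i in range(1, m + 1):
        opt[i][0] = i * delta
    for j in range(1, n + 1):
        opt[0][j] = j * delta
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            opt[i][j] = min(
                alpha(x[i - 1], y[j - 1]) + opt[i - 1][j - 1],  # match or mismatch
                delta + opt[i - 1][j],  # x[i-1] is left unmatched (gap)
                delta + opt[i][j - 1],  # y[j-1] is left unmatched (gap)
            )
    return opt[m][n]


best = min_alignment_cost("AGT", "AT", 2, unit_mismatch)
```

The table has $(m+1)(n+1)$ entries and each is filled in constant time, so the sketch runs in $O(mn)$ time and space.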
## Types of Alignments
### Global vs Local Alignment
Sequence alignment is the arrangement of biological sequences to identify regions of similarity, which helps reveal structural and functional overlap.
- Sequences from a sample are often aligned with sequences of a reference genome to identify
- **Global Alignment:** Aims to align every residue in every sequence from start to end
- **Local Alignment:** Aims to align parts of the sequence which share the highest similarity
- Typically uses
![[Pasted image 20240220055256.png|200]]
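
One way to see the difference between the two modes is to compare their scoring recurrences. The sketch below is an assumption-laden illustration, not taken from these notes: it uses similarity scores (higher is better) rather than costs, with made-up values `match=2`, `mismatch=-1`, `gap=-2`. The global branch follows the classic Needleman-Wunsch recurrence and the local branch follows the Smith-Waterman recurrence, where a cell may reset to zero so that a poorly matching prefix is discarded. The function name `alignment_score` is hypothetical.

```python
def alignment_score(x, y, local=False, match=2, mismatch=-1, gap=-2):
    """Best similarity score for a global (default) or local alignment."""
    m, n = len(x), len(y)
    score = [[0] * (n + 1) for _ in range(m + 1)]
    if not local:
        # Global alignment must pay for leading gaps.
        for i in range(1, m + 1):
            score[i][0] = i * gap
        for j in range(1, n + 1):
            score[0][j] = j * gap
    best = 0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            s = match if x[i - 1] == y[j - 1] else mismatch
            cell = max(
                score[i - 1][j - 1] + s,  # pair x[i-1] with y[j-1]
                score[i - 1][j] + gap,    # gap in y
                score[i][j - 1] + gap,    # gap in x
            )
            if local:
                cell = max(cell, 0)  # a local alignment may start afresh here
            score[i][j] = cell
            best = max(best, cell)
    # Local: best cell anywhere; global: the full-length corner cell.
    return best if local else score[m][n]


# The shared region "ACGT" sits at opposite ends of the two sequences.
local_score = alignment_score("TTTTACGT", "ACGTCCCC", local=True)
global_score = alignment_score("TTTTACGT", "ACGTCCCC")
```

Here the local score rewards only the shared `ACGT` region, while the global score is dragged down because every remaining residue must also be paired or gapped.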
#### Operational Taxonomic Unit (OTU)
> https://www.cd-genomics.com/microbioseq/operational-taxonomic-unit-otu-and-otu-clustering.html
https://www.zymoresearch.com/blogs/blog/microbiome-informatics-otu-vs-asv
Ex: 16S rRNA-Seq,
### Multiple vs Pairwise Alignment
[Pairwise vs. Multiple Sequence Alignments: Which has better accuracy?](https://www.biostars.org/p/114718/#114779)
## Scoring/Alignment Matrices