If you’re interested in the underlying algorithm of a variant caller, it’s always a safe bet to check out the paper(s) describing it. Here are the papers for the two most commonly used variant callers:
SAMtools
1. The Sequence Alignment/Map format and SAMtools. Li et al. 2009
Although this is the first publication about SAMtools and, according to the SAMtools webpage, the recommended article for citing the tool, it is mostly about the SAM format. If you’re interested in the actual algorithm, check out the next paper, which is about Maq, the quasi-predecessor of SAMtools.
2. Mapping short DNA sequencing reads and calling variants using mapping quality scores. Li et al. 2008
As mentioned above, this article is about Maq, an aligner and variant caller developed by Heng Li. The tool is no longer updated, but since it shares some algorithms with SAMtools, the paper is definitely worth reading. Be sure to check out the supplement as well, as some important details are mentioned there (e.g. about quality calibration).
GATK
3. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. McKenna et al. 2010
This is a fairly detailed description of GATK, covering the architecture of the GATK suite plus some usage examples (see the figure below).
4. A framework for variation discovery and genotyping using next-generation DNA sequencing data. DePristo et al. 2011
This article presents an analysis workflow using GATK. It’s not really about algorithmic details; it’s more of a practical guide that presents an analysis framework and some results, using real-life data from probably the most sequenced human on this planet, whom we lovingly call NA12878.