How to interpret Tajima's D
Tajima’s D is a widely used statistic in population genetics. In simple terms, it quantifies how much the observed genetic diversity deviates from what would be expected under a neutral model of evolution. It compares two estimates of diversity, π and θ, and scales their difference by its expected standard deviation:

D = (π − θ) / √Var(π − θ)

π is the nucleotide diversity. It counts the nucleotide differences between each pair of sequences and averages that count over all pairs and all sites.

θ (Watterson’s θ) is the number of segregating sites, that is, the number of positions in the alignment that show variation, normalized by a factor that depends on the sample size.

Tajima’s D variation under population events:

During a population bottleneck, many rare variants are lost by genetic drift. This reduces the number of segregating sites, so Watterson’s θ decreases. The variants that survive tend to be at intermediate frequency, so π decreases less than θ, and Tajima’s D > 0.

During a population expansion, the number of individuals increases. Many recent mutati...
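
To make the roles of π and θ concrete, here is a minimal Python sketch that computes Tajima’s D from a small alignment. It is an illustration under stated assumptions, not a reference implementation: the function name `tajimas_d` and the toy alignments are hypothetical, the sequences are assumed to be aligned with no gaps or missing data, and π and θ are taken as totals over the alignment rather than per site (the per-site scaling cancels in the comparison). The constants a1, a2, b1, b2, c1, c2, e1 and e2 follow the standard normalization from Tajima’s original definition.

```python
from collections import Counter
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for equal-length aligned sequences.

    Assumes no gaps or missing data. Returns None when there are
    no segregating sites, because D is undefined in that case.
    """
    n = len(seqs)
    if n < 4:
        # The variance term is exactly zero for n = 2 and n = 3.
        raise ValueError("need at least 4 sequences")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must all have the same length")

    n_pairs = n * (n - 1) / 2
    seg_sites = 0          # S: number of segregating sites
    pairwise_diffs = 0.0   # differences summed over all pairs and sites
    for column in zip(*seqs):
        counts = Counter(column).values()
        if len(counts) > 1:
            seg_sites += 1
            # Number of sequence pairs that differ at this site.
            pairwise_diffs += (n * n - sum(c * c for c in counts)) / 2
    if seg_sites == 0:
        return None

    pi = pairwise_diffs / n_pairs  # mean pairwise differences

    # Sample-size-dependent constants.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    theta_w = seg_sites / a1  # Watterson's theta
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - theta_w) / sqrt(variance)


# Toy alignments (made up for illustration only).
rare_variants = ["AAAA", "TAAA", "ATAA", "AATA"]      # every variant is a singleton
balanced_variants = ["AAAA", "AAAA", "TTTA", "TTTA"]  # every variant is at 50%

print(tajimas_d(rare_variants))      # negative: pi is smaller than theta
print(tajimas_d(balanced_variants))  # positive: pi is larger than theta
```

In the first toy alignment every variant is carried by a single sequence, so it adds little to π but counts fully toward θ, and D comes out negative. In the second, each variant splits the sample in half, which maximizes its contribution to π, and D comes out positive, matching the intermediate-frequency pattern described for a bottleneck.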