1.3 Function filter.excess.het
Purpose: Remove loci with excessively high heterozygosity that are suspected to be bioinformatic artefacts (i.e., multilocus SNPs).
Input: A genlight object in which ‘ind.metrics’ contains a column named ‘pop’ and each individual is assigned to one population.
How it works: This function considers a locus to be ‘excessively heterozygous’ if its heterozygosity is > 0.5 and significantly exceeds 0.5 assuming Hardy-Weinberg (HW) proportions. The rationale is that an absolute heterozygosity cut-off (e.g., 0.5 or 0.6) may remove loci that conform to HW proportions but exceed the threshold through sampling error alone. The function first splits the genlight object by population and identifies loci whose heterozygosity is > 0.5. For each such locus it then performs a χ² test for heterozygote excess beyond what sampling variance would produce under HW proportions in that population (α = 0.05), and adjusts the p-values for false discovery rate with the R function p.adjust (Benjamini & Hochberg, 1995). Loci whose adjusted p-values are ≤ 0.05 in any population are considered excessively heterozygous and are removed from the input genlight object.
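The logic described above can be illustrated with a minimal sketch. It is written in standard-library Python rather than R and is not the source of filter.excess.het. All names (chisq_p_het_excess, bh_adjust, flag_excess_het, het_counts) are hypothetical. The sketch makes three assumptions that the description does not spell out: the χ² test compares observed heterozygote and homozygote counts against a 1:1 split (the maximum heterozygosity of 0.5 for a biallelic locus under HW proportions) with 1 degree of freedom; the Benjamini-Hochberg adjustment is applied within each population across the loci whose heterozygosity is > 0.5; and the adjusted p-values are compared against α = 0.05.

```python
import math


def chisq_p_het_excess(n_het, n):
    """P-value of a 1-df chi-square test of observed het/hom counts
    against an expected 1:1 split (assumed form of the test)."""
    expected = n / 2
    n_hom = n - n_het
    stat = (n_het - expected) ** 2 / expected + (n_hom - expected) ** 2 / expected
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    return math.erfc(math.sqrt(stat / 2))


def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, written to mirror what
    R's p.adjust(p, method = "BH") computes."""
    m = len(pvals)
    # Walk from the largest p-value down, keeping a running minimum.
    order = sorted(range(m), key=lambda i: pvals[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for steps_from_top, idx in enumerate(order):
        rank = m - steps_from_top  # 1-based rank in ascending order
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def flag_excess_het(het_counts, alpha=0.05):
    """Return the set of loci flagged in at least one population.

    het_counts: {population: {locus: (n_heterozygotes, n_genotyped)}}
    (a hypothetical input layout standing in for a genlight object).
    """
    flagged = set()
    for loci in het_counts.values():
        # Only loci with observed heterozygosity > 0.5 are tested.
        candidates = [loc for loc, (h, n) in loci.items() if n > 0 and h / n > 0.5]
        pvals = [chisq_p_het_excess(*loci[loc]) for loc in candidates]
        for loc, p_adj in zip(candidates, bh_adjust(pvals)):
            if p_adj <= alpha:
                flagged.add(loc)
    return flagged


# Made-up counts for illustration only.
het_counts = {
    "popA": {"snp1": (19, 20), "snp2": (11, 20), "snp3": (6, 20)},
    "popB": {"snp1": (9, 20), "snp2": (12, 20), "snp3": (10, 20)},
}
flagged = flag_excess_het(het_counts)
print(flagged)  # {'snp1'}
```

In this toy input, snp2 has heterozygosity above 0.5 in both populations (0.55 and 0.60) but is retained, because with 20 individuals that excess is well within sampling error. snp1 is removed on the strength of popA alone, which matches the "in any population" rule in the description.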