1.2 Function infer.sex
Purpose: Identify the genetic sex of individuals.
Input: The output of function filter.sex.linked (a list of
six elements), a user-specified parameter that declares the
sex-determination system of the species (‘zw’ or ‘xy’), and a seed
number.
How it works: This function uses the types of loci available in
the input (W-linked/Y-linked, Z-linked/X-linked and gametologous loci)
to assign one preliminary sex per type of sex-linked locus:
W-linked/Y-linked loci. For a ZW-system, it preliminarily assigns
‘M’ (male) to an individual if it has more loci with NA (i.e.,
missing data) than loci with a called genotype (i.e., ‘0’, ‘1’ or ‘2’),
and ‘F’ (female) otherwise. For an XY-system, the assignment is
reversed.
Z-linked/X-linked loci. It uses the matrix of genotypes for all
individuals to perform k-means clustering with two centers (using the
provided seed number). The rationale is that individuals are expected to
form two distinct clusters, one per sex. As a result, each individual is
assigned to one of two sex clusters. The individual with the most loci
scored as heterozygous is used to identify the sex of its cluster (‘M’
for ZW-system, and ‘F’ for XY-system), while the other cluster is
identified as the opposite sex.
Gametologs. It follows the same method as for Z-linked/X-linked loci:
it performs k-means clustering in which individuals are assigned to one
of two sex clusters, and it uses the individual with the most loci
scored as heterozygous to identify the sex of its cluster (‘F’ for
ZW-system, and ‘M’ for XY-system).
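The three rules above can be sketched in a few lines. The sketch is written in Python purely for illustration and is not the R code of infer.sex. The function names (w_linked_sex, two_means, cluster_sex) are hypothetical, missing data are represented as None, genotype ‘1’ is assumed to be the heterozygous call, and the genotype matrix passed to the clustering step is assumed to contain no missing values.

```python
import random


def w_linked_sex(genotypes, system="zw"):
    """Preliminary sex of one individual from its W-/Y-linked loci.

    genotypes: list holding None (missing) or a called genotype 0, 1 or 2.
    """
    missing = sum(g is None for g in genotypes)
    called = len(genotypes) - missing
    mostly_missing = missing > called
    if system == "zw":
        return "M" if mostly_missing else "F"
    return "F" if mostly_missing else "M"  # xy: the assignment is reversed


def two_means(rows, seed, n_iter=100):
    """Plain Lloyd's k-means with two centers; returns a 0/1 label per row."""
    rng = random.Random(seed)
    # Start from two randomly chosen individuals (reproducible via the seed).
    centers = [list(r) for r in rng.sample(rows, 2)]
    labels = None
    for _ in range(n_iter):
        new_labels = []
        for r in rows:
            d = [sum((a - b) ** 2 for a, b in zip(r, c)) for c in centers]
            new_labels.append(0 if d[0] <= d[1] else 1)
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for k in (0, 1):
            members = [r for r, lab in zip(rows, labels) if lab == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def cluster_sex(rows, seed, het_sex):
    """Name the two clusters using the most heterozygous individual.

    het_sex: sex given to the cluster of that individual, e.g. 'M' for
    Z-linked loci in a ZW-system or 'F' for gametologs in a ZW-system.
    """
    labels = two_means(rows, seed)
    het_counts = [sum(g == 1 for g in r) for r in rows]  # assumes 1 = het
    top = max(range(len(rows)), key=lambda i: het_counts[i])
    other = "F" if het_sex == "M" else "M"
    return [het_sex if lab == labels[top] else other for lab in labels]
```

For Z-linked loci in a ZW-system, for example, one would call cluster_sex(rows, seed, het_sex="M"), where rows holds one list of genotypes per individual.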
If a type of sex-linked locus is not available (e.g., zero gametologs),
the function assigns NA to that preliminary assignment. It then uses the
preliminary assignments to output a final sex assignment: ‘F’ or ‘M’ if
all preliminary assignments match, ‘*F’ or ‘*M’ if they do not.
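As a rough illustration of how preliminary assignments could be combined, the Python sketch below flags disagreement with an asterisk. It is not the R code of infer.sex. Two points are assumptions, because the text does not specify them: NA values (None here) are ignored when checking agreement, and the letter after the asterisk is taken from the most frequent preliminary assignment (ties go to the first one listed). The function name final_sex is hypothetical.

```python
from collections import Counter


def final_sex(prelims):
    """Combine preliminary assignments ('F', 'M' or None) into a final call."""
    calls = [p for p in prelims if p is not None]  # assumption: NAs are ignored
    if not calls:
        return None  # no sex-linked loci of any type were available
    counts = Counter(calls)
    if len(counts) == 1:
        return calls[0]  # all available assignments agree
    # Assumption: the flagged call carries the majority letter;
    # on a tie, most_common keeps the first assignment encountered.
    return "*" + counts.most_common(1)[0][0]
```

Under these assumptions, final_sex(["F", "M", "F"]) yields a flagged ‘*F’, which is the kind of row the user is expected to inspect by hand.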
Output: A table with the three preliminary sex assignments and the
final sex assignment per individual. The table also includes the raw
data on which the preliminary assignments were based: number of
W-linked/Y-linked loci with missing/called genotype, number of
Z-linked/X-linked loci scored as homozygous/heterozygous, and number of
gametologs scored as homozygous/heterozygous.
Recommended use: We created this function with the explicit
intent that a person inspects the final sex assignments for which not
all three preliminary assignments agree (denoted as ‘*M’ or ‘*F’). Some
individuals may have ambiguous genotypes for one type of sex-linked
loci, and given the nature of k-means clustering, they may be assigned
the wrong preliminary sex. We recommend that the user check the output
table to make a definitive final assignment, and that this function be
run straight after function filter.sex.linked.
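The manual check can start by pulling out only the flagged individuals. The Python sketch below shows the idea on a toy table. It is not part of the package, and the function name needs_review and the column names "id" and "final_sex" are hypothetical stand-ins for whatever the actual output table uses.

```python
def needs_review(rows, final_col="final_sex"):
    """Return the rows whose final assignment is flagged with '*'."""
    return [r for r in rows if str(r.get(final_col, "")).startswith("*")]


# Toy output table; identifiers and column names are made up for illustration.
table = [
    {"id": "ind1", "final_sex": "F"},
    {"id": "ind2", "final_sex": "*M"},
    {"id": "ind3", "final_sex": "M"},
]

flagged = needs_review(table)
for row in flagged:
    print(row["id"], row["final_sex"])
```

Each flagged row can then be compared against its raw locus counts before a definitive sex is recorded.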