A pipeline for analysis of allele specific expression from RNA-seq data reveals salinity-dependent response in Nile tilapia
  • Aurora Campo,
  • Moran Gershoni,
  • Adi Doron-Faigenboim,
  • Avner Cnaani
Aurora Campo
Agricultural Research Organization Volcani Center

Corresponding Author: ayla.bcn@gmail.com

Moran Gershoni
Agricultural Research Organization Volcani Center
Adi Doron-Faigenboim
Agricultural Research Organization Volcani Center
Avner Cnaani
Agricultural Research Organization Volcani Center

Abstract

Species living in a changing environment can adapt to changes in multiple environmental factors. Physiological acclimatization may be significantly influenced by heterozygosity, especially with regard to allelic variation and allele-specific expression (ASE) under different conditions. Data from RNA-seq experiments can be used to identify and quantify the expressed alleles, and thus to detect and characterize ASE and the regulation of gene expression. However, reads carrying the allele that matches the reference genome map preferentially, creating a mapping bias that prevents reliable estimation of allele depth unless the haplotypes of the experimental individuals are known. We developed a pipeline that identifies the alleles present in an RNA-seq dataset and quantifies them without this bias. The pipeline requires neither DNA sequencing nor prior knowledge of the haplotypes. The identified SNPs are substituted into the reference genome, creating two pseudogenomes with the alternative alleles from two independent samples of the experiment. The SNPs are then called again against each pseudogenome, yielding two SNP datasets that are averaged to calculate the allele depth. The final SNP calling file contains the SNP coordinates, the IDs of the genes containing the SNPs, the expressed genotypes, the unbiased allele depth, and the statistical tests for identifying ASE according to the experimental design, correlated with differentially expressed genes. The pipeline presented here can therefore estimate ASE in non-model organisms and can be applied to existing RNA-seq datasets to extend studies of gene expression regulation.
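
The substitution and averaging steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`make_pseudogenome`, `average_allele_depth`), the in-memory dictionary representation of the genome, and the 0-based SNP coordinates are assumptions made for illustration only. A real pipeline would operate on FASTA and VCF files with dedicated aligners and variant callers.

```python
from typing import Dict, Tuple

# Hypothetical types, for illustration only:
# a SNP is keyed by (chromosome, 0-based position).
SnpKey = Tuple[str, int]
Alleles = Tuple[str, str]        # (reference allele, alternative allele)
Depth = Tuple[float, float]      # (reference allele depth, alternative allele depth)


def make_pseudogenome(reference: Dict[str, str],
                      snps: Dict[SnpKey, Alleles]) -> Dict[str, str]:
    """Return a copy of the reference with each SNP's alternative allele substituted in."""
    pseudo = {chrom: list(seq) for chrom, seq in reference.items()}
    for (chrom, pos), (ref_allele, alt_allele) in snps.items():
        # Guard against coordinate errors before overwriting the base.
        if pseudo[chrom][pos] != ref_allele:
            raise ValueError(f"reference mismatch at {chrom}:{pos}")
        pseudo[chrom][pos] = alt_allele
    return {chrom: "".join(bases) for chrom, bases in pseudo.items()}


def average_allele_depth(depth_a: Dict[SnpKey, Depth],
                         depth_b: Dict[SnpKey, Depth]) -> Dict[SnpKey, Depth]:
    """Average per-allele depths from two SNP call sets, keeping only SNPs found in both."""
    averaged = {}
    for key in depth_a.keys() & depth_b.keys():
        ref_a, alt_a = depth_a[key]
        ref_b, alt_b = depth_b[key]
        averaged[key] = ((ref_a + ref_b) / 2, (alt_a + alt_b) / 2)
    return averaged


# Toy data: one chromosome, one SNP (G>A at position 2).
reference = {"chr1": "ACGTACGT"}
snps = {("chr1", 2): ("G", "A")}
pseudo = make_pseudogenome(reference, snps)

# Depths as they might come out of calling against each pseudogenome.
depth_a = {("chr1", 2): (30, 10)}
depth_b = {("chr1", 2): (20, 20), ("chr1", 5): (5, 5)}
averaged = average_allele_depth(depth_a, depth_b)
```

Restricting the average to SNPs present in both call sets is a design choice of this sketch; the abstract does not state how SNPs called against only one pseudogenome are handled.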