2.1. Sequence search, alignment and phylogenetic analysis
Cellulase sequences of S. quadricauda were identified in the genome
sequence data from our previous study [9] by protein-fold homology
analysis with Phyre2 [12] and by BLASTN similarity searches [13], with
Monoraphidium neglectum taken as the reference; their details are given
in Table 1. Other Scenedesmus sequences analyzed were taken from
PhycoCosm [14] or NCBI (https://www.ncbi.nlm.nih.gov/),
and their accession numbers are shown in Table 2. Conserved domains,
signal peptides, and GH-family assignments were identified with Prosite
patterns [15], DeepLoc [16] and PredAlgo [17]. The sequences were aligned and processed with
Clustal Omega [18] and visualized with ESPript 3.0 [19]. To construct the phylogenetic trees, all the
sequences were aligned with sequences from phylogenetically distant
β-1,4-endoglucanases, β-glucosidases or exocellulases (respectively)
from microalgae, fungi, plants, invertebrates and bacteria and processed
with Gblocks v0.91b before analysis in MEGA 6.06 [20,21]. The signal peptides of the enzymes were not included in
the phylogenetic analysis. The phylogenetic trees were built by the maximum
likelihood method in MEGA 6.06 with the model and the
restrictions suggested by the program. Branch support was assessed by
bootstrap analysis with 100 replicates.
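
For readers unfamiliar with the bootstrap step, the following is a minimal sketch of how pseudo-replicate alignments are typically generated by resampling alignment columns with replacement. It is an illustration only and does not reproduce MEGA's internal implementation. The function name `bootstrap_alignments`, the sequence labels and the toy sequences are hypothetical and were made up for this example.

```python
import random


def bootstrap_alignments(alignment, n_replicates=100, seed=0):
    """Yield pseudo-replicate alignments built by sampling columns with replacement.

    `alignment` maps a sequence name to its aligned sequence; all sequences
    must have the same length (gaps included).
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_cols = lengths.pop()
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    for _ in range(n_replicates):
        # Draw n_cols column indices with replacement; the same indices are
        # applied to every sequence so that columns stay intact.
        cols = [rng.randrange(n_cols) for _ in range(n_cols)]
        yield {name: "".join(seq[i] for i in cols)
               for name, seq in alignment.items()}


# Toy alignment with made-up sequences (not data from this study).
toy = {
    "seq_A": "MKT-LLVAG",
    "seq_B": "MKTALLVSG",
    "seq_C": "MRT-LIVAG",
}

replicates = list(bootstrap_alignments(toy, n_replicates=100))
print(len(replicates), len(replicates[0]["seq_A"]))
```

In a full analysis, a tree would be inferred from each replicate, and the support for a clade would be reported as the percentage of replicate trees in which that clade appears.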