Highly contiguous genomes of human clinical isolates of Giardia duodenalis reveal assemblage- and sub-assemblage-specific presence–absence variation in protein-coding genes
Klotz, Christian; Schmid, Marc William; Winter, Katja; Ignatius, Ralf; Weisz, Filip; Saghaug, Christina Skår; Langeland, Nina; Dawson, Scott; Lalle, Marco; Hanevik, Kurt; Cacciò, Simone M.; Aebischer, Toni
Journal article, Peer reviewed
Published version
Permanent link
https://hdl.handle.net/11250/3097836
Publication date
2023
Collections
- Department of Clinical Science [2441]
- Registrations from Cristin [10794]
Abstract
Giardia duodenalis (syn. G. intestinalis, G. lamblia) is a widespread gastrointestinal protozoan parasite with debated taxonomic status. Currently, eight distinct genetic sub-groups, termed assemblages A–H, are defined based on a few genetic markers. Assemblages A and B may represent distinct species and are both of human public health relevance. Genomic studies are scarce and the few reference genomes available, in particular for assemblage B, are insufficient for adequate comparative genomics. Here, by combining long- and short-read sequences generated by PacBio and Illumina sequencing technologies, we provide nine annotated reference genome sequences from new clinical isolates (four assemblage A and five assemblage B parasite isolates). The isolates chosen represent the currently accepted classification of sub-assemblages AI, AII, BIII and BIV. Synteny over the whole genome was generally high, but we report chromosome-level translocations as a feature that distinguishes assemblage A from B parasites. Orthologue gene group analysis was used to define gene content differences between assemblages A and B and to contribute a gene-set-based operational definition of the respective taxonomic units. Giardia is tetraploid, and higher allelic sequence heterogeneity (ASH) has so far been observed for assemblage B than for assemblage A. Notably, we report an extremely low ASH (0.002%) for one of the assemblage B isolates (a value even lower than that of the reference assemblage A isolate WB-C6). This challenges the view of low ASH being a notable feature that distinguishes assemblage A from B parasites, and the low ASH allowed assembly of the most contiguous assemblage B genome currently available for reference. In conclusion, the description of nine highly contiguous genome assemblies of new isolates of G. duodenalis assemblages A and B adds to our understanding of the genomics and species population structure of this widespread zoonotic parasite.