SeeCiTe: a method to assess CNV calls from SNP arrays using trio data
Journal article, Peer reviewed
MetadataVis full innførsel
OriginalversjonBioinformatics. 2021, 37 (13), 1876-1883. 10.1093/bioinformatics/btab028
Motivation Single nucleotide polymorphism (SNP) genotyping arrays remain an attractive platform for assaying copy number variants (CNVs) in large population-wide cohorts. However, current tools for calling CNVs are still prone to extensive false positive calls when applied to biobank scale arrays. Moreover, there is a lack of methods exploiting cohorts with trios available (e.g. nuclear family) to assist in quality control and downstream analyses following the calling. Results We developed SeeCiTe (Seeing CNVs in Trios), a novel CNV-quality control tool that postprocesses output from current CNV-calling tools exploiting child-parent trio data to classify calls in quality categories and provide a set of visualizations for each putative CNV call in the offspring. We apply it to the Norwegian Mother, Father and Child Cohort Study (MoBa) and show that SeeCiTe improves the specificity and sensitivity compared to the common empiric filtering strategies. To our knowledge, it is the first tool that utilizes probe-level CNV data in trios (and singletons) to systematically highlight potential artifacts and visualize signal intensities in a streamlined fashion suitable for biobank scale studies.