RT Journal Article
SR Electronic
T1 Statistical guidelines for quality control of next-generation sequencing techniques
JF Life Science Alliance
JO Life Sci. Alliance
FD Life Science Alliance LLC
SP e202101113
DO 10.26508/lsa.202101113
VO 4
IS 11
A1 Maximilian Sprang
A1 Matteo Krüger
A1 Miguel A Andrade-Navarro
A1 Jean-Fred Fontaine
YR 2021
UL https://www.life-science-alliance.org/content/4/11/e202101113.abstract
AB More and more next-generation sequencing (NGS) data are made available every day. However, the quality of this data is not always guaranteed. Available quality control tools require profound knowledge to correctly interpret the multiplicity of quality features. Moreover, it is usually difficult to know if quality features are relevant in all experimental conditions. Therefore, the NGS community would highly benefit from condition-specific data-driven guidelines derived from many publicly available experiments, which reflect routinely generated NGS data. In this work, we have characterized well-known quality guidelines and related features in big datasets and concluded that they are too limited for assessing the quality of a given NGS file accurately. Therefore, we present new data-driven guidelines derived from the statistical analysis of many public datasets using quality features calculated by common bioinformatics tools. Thanks to this approach, we confirm the high relevance of genome mapping statistics to assess the quality of the data, and we demonstrate the limited scope of some quality features that are not relevant in all conditions. Our guidelines are available at https://cbdm.uni-mainz.de/ngs-guidelines.