Machine learning ATR-FTIR spectroscopy data for the screening of collagen for ZooMS analysis and mtDNA in archaeological bone
Pal Chowdhury, Manasij; Choudhury, Kaustabh Datta; Bouchard, Geneviève Pothier; Riel-Salvatore, Julien; Negrino, Fabio; Benazzi, Stefano; Slimak, Ludovic; Frasier, Brenna; Szabo, Vicki; Harrison, Ramona; Hambrecht, George; Kitchener, Andrew C.; Wogelius, Roy A.; Buckley, Mike
Journal article, Peer reviewed
Accepted version
Date
2021
Abstract
Faunal remains from archaeological sites allow animal species to be identified, which improves our understanding of the relationships between humans and animals. This identification draws not only on morphological information but also on the ancient biomolecules (lipids, proteins, and DNA) that can be preserved in these remains for thousands and even millions of years. However, because of the cost and effort required for ancient biomolecular analysis, there has been considerable research into the development of accurate and efficient screening approaches for archaeological remains. FTIR spectroscopy is one such approach that has been considered for the screening of proteins, but its widespread use has been hindered by the fact that its predictive accuracy can vary widely depending on the extent of sample preservation and the instrument used. Further, screening methods for ancient DNA (aDNA) analysis are scarce. Here we present a new approach to substantially improve upon FTIR-based screening methods prior to ZooMS (Zooarchaeology by Mass Spectrometry) and aDNA analysis through the use of random forest-based machine learning. To do so, we use ATR-FTIR to examine three archaeological bone assemblages and analyse them by ZooMS (for taxonomic identification). Two of these are from Palaeolithic contexts, are dominated by terrestrial fauna, and include specimens with a variety of preservational conditions. The third consists of Holocene faunal remains with variable levels of preservation and is dominated by cetaceans. Using the Holocene faunal remains, we were able to evaluate ATR-FTIR-based screening for mtDNA as well as for ZooMS success. We report on the potential of machine learning in ATR-FTIR-based screening for ancient mtDNA analysis, and our machine learning models improve the accuracy of ATR-FTIR-based screening for ZooMS by 20–40%.
The results also suggest this approach potentially allows for a universal screening system, applicable across multiple sites and largely independent of the spectrometers used.