Data partitioning enables the use of standard SOAP Web Services in genome-scale workflows

Sztromwasser, Paweł; Puntervoll, Pål; Petersen, Kjell

dc.contributor.author	Sztromwasser, Paweł	eng
dc.contributor.author	Puntervoll, Pål	eng
dc.contributor.author	Petersen, Kjell	eng
dc.date.accessioned	2014-04-11T13:49:30Z
dc.date.available	2014-04-11T13:49:30Z
dc.date.issued	2011	eng
dc.identifier.issn	1613-4516	en_US
dc.identifier.other	http://journal.imbio.de/articles/pdf/jib-163.pdf	eng
dc.identifier.uri	https://hdl.handle.net/1956/7904
dc.description.abstract	Biological databases and computational biology tools are provided by research groups around the world and made accessible on the Web. Combining these resources is common practice in bioinformatics, but integrating heterogeneous and often distributed tools and datasets can be challenging. To date, this challenge has commonly been addressed pragmatically, through tedious and error-prone scripting. Recently, however, a more reliable technique has been proposed as the platform that would tie bioinformatics resources together, namely Web Services. Over the last decade, Web Services have spread widely in bioinformatics and earned the status of a recommended technology. However, in the era of high-throughput experimentation, a major concern regarding Web Services is their ability to handle large-scale data traffic. We propose a stream-like communication pattern for standard SOAP Web Services that enables an efficient flow of large data traffic between a workflow orchestrator and Web Services. We evaluated the data-partitioning strategy by comparing it with typical communication patterns on an example pipeline for genomic sequence annotation. The results show that data partitioning lowers the resource demands of services and increases their throughput, which in turn makes it possible to execute in-silico experiments at genome scale using standard SOAP Web Services and workflows. As a proof of principle, we annotated an RNA-seq dataset using a plain BPEL workflow engine.	en_US
dc.language.iso	eng	eng
dc.publisher	IMBio e.V.	en_US
dc.relation.ispartof	Throughput and robustness of bioinformatics pipelines for genome-scale data analysis (http://hdl.handle.net/1956/7906)	en_US
dc.rights	Attribution-NonCommercial-NoDerivs CC BY-NC-ND	eng
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/3.0	eng
dc.title	Data partitioning enables the use of standard SOAP Web Services in genome-scale workflows	en_US
dc.type	Peer reviewed
dc.type	Journal article
dc.description.version	publishedVersion	en_US
dc.rights.holder	Copyright 2011 The Authors	en_US
dc.source.articlenumber	163
dc.identifier.doi	https://doi.org/10.2390/biecoll-jib-2011-163
dc.identifier.cristin	834107
dc.source.journal	Journal of Integrated Bioinformatics
dc.source.volume	8
dc.source.issue	2

Associated file(s)

File name: Sztromwasser et al_JIB.pdf
Size: 1.320 MB
Format: PDF
Description: Published version

This item appears in the following collection(s)

Department of Informatics [917]

Unless otherwise stated, this item is licensed under Attribution-NonCommercial-NoDerivs CC BY-NC-ND