Department of Informatics
http://hdl.handle.net/1956/918
Fri, 23 Mar 2018 16:59:22 GMT2018-03-23T16:59:22ZGraph modification problems: Beyond the known boundaries
http://hdl.handle.net/1956/17399
Graph modification problems: Beyond the known boundaries
Agrawal, Akanksha
Doctoral thesis
Fri, 17 Nov 2017 00:00:00 GMThttp://hdl.handle.net/1956/173992017-11-17T00:00:00ZThe Atlantic salmon genome provides insights into rediploidization
http://hdl.handle.net/1956/17363
The Atlantic salmon genome provides insights into rediploidization
Lien, Sigbjørn; Koop, Ben F; Sandve, Simen Rød; Miller, Jason R.; Kent, Matthew Peter; Nome, Torfinn; Hvidsten, Torgeir Rhoden; Leong, Jong; Minkley, David R.; Zimin, Aleksey; Grammes, Fabian; Grove, Harald; Gjuvsland, Arne Bjørke; Walenz, Brian; Hermansen, Russell A.; von Schalburg, Kristian R.; Rondeau, Eric; Genova, Alex Di; Antony Samy, Jeevan Karloss; Vik, Jon Olav; Vigeland, Magnus Dehli; Caler, Lis; Grimholt, Unni; Jentoft, Sissel; Våge, Dag Inge; de Jong, Pieter J.; Moen, Thomas; Baranski, Matthew; Palti, Yniv; Smith, Douglas W.; Yorke, James A.; Nederbragt, Alexander J.; Tooming-Klunderud, Ave; Jakobsen, Kjetill Sigurd; Jiang, Xuanting; Fan, Dingding; Hu, Yan; Liberles, David A.; Vidal, Rodrigo; Iturra, Patricia; Jones, Steven J.M.; Jonassen, Inge; Maass, Alejandro; Omholt, Stig William; Davidson, William S
Journal article
The whole-genome duplication 80 million years ago of the common ancestor of salmonids (salmonid-specific fourth vertebrate whole-genome duplication, Ss4R) provides unique opportunities to learn about the evolutionary fate of a duplicated vertebrate genome in 70 extant lineages. Here we present a high-quality genome assembly for Atlantic salmon (Salmo salar), and show that large genomic reorganizations, coinciding with bursts of transposon-mediated repeat expansions, were crucial for the post-Ss4R rediploidization process. Comparisons of duplicate gene expression patterns across a wide range of tissues with orthologous genes from a pre-Ss4R outgroup unexpectedly demonstrate far more instances of neofunctionalization than subfunctionalization. Surprisingly, we find that genes that were retained as duplicates after the teleost-specific whole-genome duplication 320 million years ago were not more likely to be retained after the Ss4R, and that the duplicate retention was not influenced to a great extent by the nature of the predicted protein interactions of the gene products. Finally, we demonstrate that the Atlantic salmon assembly can serve as a reference sequence for the study of other salmonids for a range of purposes.
Sun, 01 May 2016 00:00:00 GMThttp://hdl.handle.net/1956/173632016-05-01T00:00:00ZA Comparative Study on Distributed Storage and Erasure Coding Techniques Using Apache Hadoop Over NorNet Core
http://hdl.handle.net/1956/17308
A Comparative Study on Distributed Storage and Erasure Coding Techniques Using Apache Hadoop Over NorNet Core
Vela, Maximiliano Matias
Master thesis
Both private and public sector organizations are constantly looking for new ways to keep their information safe and accessible at all times. Over the past few decades, replication has always been a reliable way to make sure data is constantly available, even though it has been proven to induce higher costs due to the additional required storage. Since the early 2000s, erasure codes have been developed as a means to drastically reduce the overhead, while enormously increasing efficiency and providing significant error-correcting capabilities. One of the most well-known erasure coding policies is Reed-Solomon (RS), a highly consistent, reliable, and efficient technique to store and recover data, currently used at Facebook's data centers. Other frequently mentioned policies are Pyramid codes, a variant of Locally Repairable Codes (LRCs) that make use of a pyramid-based scheme to generate additional parity groups for each level, and has been used at Microsoft's Windows Live servers. Apache Hadoop is an open-source distributed framework used for scalable processing that has recently introduced erasure coding policies to their storage capabilities. NorNet Core (or NorNet Core Testbed {website: www.nntb.no.}), a distributed academic network, will be used as the main scenario to measure, compare, and analyze these different erasure coding policies and their efficiency. Based on simulations of physically distributed storage, this thesis will show how minimal alterations in commonly known codes (such as RS codes) can converge in a Pyramid-based code that could severely enhance fault-tolerance and performance. Additionally, in a side-to-side comparison, it will be detailed how bigger codes (of higher dimension and length), more often than not, provide a more beneficial trade-off.
Wed, 24 Jan 2018 00:00:00 GMThttp://hdl.handle.net/1956/173082018-01-24T00:00:00ZA Comparative Analysis of MongoDB and Cassandra
http://hdl.handle.net/1956/17228
A Comparative Analysis of MongoDB and Cassandra
Bhamra, Kavita
Master thesis
NoSQL is a group of database technologies that emerged due to the limitations of relational databases. The number of NoSQL technologies has increased over time to encompass hundreds of different technologies. The many NoSQL databases are, unlike relational databases, not based on a standardised data model, query language, or a common way of thinking. Also, the NoSQL field brings forth some new concepts and challenges that were not present in the relational context. It can, therefore, be difficult to choose the right NoSQL technology for a particular application. This thesis analyses and compares two specific NoSQL database systems, MongoDB and Apache Cassandra, to simplify the selection process. There are several important factors to consider when deciding on a NoSQL technology. The purpose of this comparison is to outline what factors are useful to consider when selecting a technology. The discussion is mostly based on an extensive literature review. In the end, several factors were stressed, like the potential technology's data model, CAP classification, and available documentation, to name a few.
Wed, 03 Jan 2018 00:00:00 GMThttp://hdl.handle.net/1956/172282018-01-03T00:00:00ZSBMLmod: a Python-based web application and web service for efficient data integration and model simulation
http://hdl.handle.net/1956/17141
SBMLmod: a Python-based web application and web service for efficient data integration and model simulation
Schäuble, Sascha; Stavrum, Anne-Kristin; Bockwoldt, Mathias; Puntervoll, Pål; Heiland, Ines
Journal article
<p>Background: Systems Biology Markup Language (SBML) is the standard model representation and description language in systems biology. Enriching and analysing systems biology models by integrating the multitude of available data, increases the predictive power of these models. This may be a daunting task, which commonly requires bioinformatic competence and scripting.</p>
<p>Results: We present SBMLmod, a Python-based web application and service, that automates integration of high throughput data into SBML models. Subsequent steady state analysis is readily accessible via the web service COPASIWS. We illustrate the utility of SBMLmod by integrating gene expression data from different healthy tissues as well as from a cancer dataset into a previously published model of mammalian tryptophan metabolism.</p>
<p>Conclusion: SBMLmod is a user-friendly platform for model modification and simulation. The web application is available at http://sbmlmod.uit.no, whereas the WSDL definition file for the web service is accessible via http://sbmlmod.uit.no/SBMLmod.wsdl. Furthermore, the entire package can be downloaded from https://github.com/MolecularBioinformatics/sbml-mod-ws. We envision that SBMLmod will make automated model modification and simulation available to a broader research community.</p>
Sat, 24 Jun 2017 00:00:00 GMThttp://hdl.handle.net/1956/171412017-06-24T00:00:00ZImplementering og evaluering av en HL7 FHIR integrasjonsbuss for kommunikasjon mellom klientapplikasjoner og e-helsesystemer
http://hdl.handle.net/1956/16951
Implementering og evaluering av en HL7 FHIR integrasjonsbuss for kommunikasjon mellom klientapplikasjoner og e-helsesystemer
Haugen, Karina Øverland
Master thesis
Bakgrunn: Den økende digitaliseringen av helsetjenestene krever at informasjonen i de
elektroniske pasientjournalene er klinisk strukturerte ved hjelp av datastandarder og kliniske
terminologier for riktig prosessering av pasientinformasjonen. Pasientjournalene må også
være tilgjengelige når behovet er der. Dagens mangel på en felles standard for datastruktur
og datautveksling medfører at systemene og applikasjonene har problemer med å samhandle
og dele viktig pasientinformasjon, noe som påvirker effektiviteten hos helsevesenet.
Mål: Dette forskningsprosjektet vil evaluere og vurdere gevinstene med en tjenestearkitektur
som bruker en kjent datautvekslingsstandard, HL7 FHIR, som en integrasjonsbuss
mellom klientapplikasjoner og bakenforliggende e-helsesystemer.
Metoder: Vi implementerte en mindre tjenestearkitektur som består av en FHIR integrasjonsbuss
som kommuniserer med et openEHR basert pasientjournalsystem. Implementasjonen
er vurdert ved hjelp av medisinapplikasjonen LEMI som er integrert mot FHIR bussen
for henting av pasientinformasjon, medisininformasjon og ordineringsinformasjon.
Resultater: FHIR bussen tilbyr et grensesnitt basert på REST spesifikasjonen som abstraherer
den komplekse dokumentstrukturen til openEHR. Forespørslene sendt til FHIR bussen
krever ingen domenekunnskap om tjenestene som FHIR bussen bruker som datakilder. Ved
hjelp av profilering av FHIR ressursene kunne vi konvertere de obligatoriske datapunktene i
openEHR arketypene over til tilhørende datapunkter i FHIR ressurser uten store problemer.
Konklusjon: FHIR er strukturert for å forbedre data- og meldingsutveksling mellom applikasjoner
og systemer, og vår tjenestearkitektur med en FHIR integrasjonsbuss mellom
medisinapplikasjonen LEMI og openEHR pasientjournalsystemet demonstrerte de store integrasjonsgevinstene
som FHIR tilbyr. Vårt openEHR pasientjournalsystem demonstrerte
hvordan FHIR kan brukes for å abstrahere domene- og implementasjonskunnskapen som
trengs for å kunne prosessere openEHR informasjonen og utføre openEHR spørringer. FHIR
bussen gir også tilgang til klinisk strukturerte testdata som forhindrer at utviklerne etablerer
sine egne datastrukturer som ikke er kompatible med eksisterende e-helsesystemer.
Tue, 11 Jul 2017 00:00:00 GMThttp://hdl.handle.net/1956/169512017-07-11T00:00:00ZA model for optimal fleet composition of vessels for offshore wind farm maintenance
http://hdl.handle.net/1956/16878
A model for optimal fleet composition of vessels for offshore wind farm maintenance
Gutierrez-Alcoba, Alejandro; Ortega, Gloria; Hendrix, Eligius M.T.; Halvorsen-Weare, Elin Espeland; Haugland, Dag
Journal article
We present a discrete optimisation model that chooses an optimal fleet of vessels to support maintenance operations at Offshore Wind Farms (OFWs). The model is presented as a bi-level problem. On the first (tactical) level, decisions are made on the fleet composition for a certain time horizon. On the second (operational) level, the fleet is used to optimise the schedule of operations needed at the OWF, given events of failures and weather conditions.
Sun, 01 Jan 2017 00:00:00 GMThttp://hdl.handle.net/1956/168782017-01-01T00:00:00ZParameterized complexity of the spanning tree congestion problem
http://hdl.handle.net/1956/16868
Parameterized complexity of the spanning tree congestion problem
Bodlaender, Hans L.; Fomin, Fedor; Golovach, Petr; Otachi, Yota; van Leeuwen, Erik Jan
Journal article
We study the problem of determining the spanning tree congestion of a graph. We present some sharp contrasts in the parameterized complexity of this problem. First, we show that on apex-minor-free graphs, a general class of graphs containing planar graphs, graphs of bounded treewidth, and graphs of bounded genus, the problem to determine whether a given graph has spanning tree congestion at most k can be solved in linear time for every fixed k. We also show that for every fixed k and d the problem is solvable in linear time for graphs of degree at most d. In contrast, if we allow only one vertex of unbounded degree, the problem immediately becomes NP-complete for any fixed k≥8. Moreover, the hardness result holds for graphs excluding the complete graph on 6 vertices as a minor. We also observe that for k≤3 the problem becomes polynomially time solvable.
Sat, 01 Sep 2012 00:00:00 GMThttp://hdl.handle.net/1956/168682012-09-01T00:00:00ZParallel Matching and Clustering Algorithms on GPUs
http://hdl.handle.net/1956/16755
Parallel Matching and Clustering Algorithms on GPUs
Naim, Md.
Doctoral thesis
Sat, 17 Jun 2017 00:00:00 GMThttp://hdl.handle.net/1956/167552017-06-17T00:00:00ZOn Stable Marriages and Greedy Matchings
http://hdl.handle.net/1956/16754
On Stable Marriages and Greedy Matchings
Manne, Fredrik; Naim, Md.; Lerring, Håkon; Halappanavar, Mahantesh
Conference object
<p>Research on stable marriage problems has a long and mathematically rigorous history, while that of exploiting greedy matchings in combinatorial scientific computing is a younger and less developed research field. We consider the relationships between these two areas. In particular we show that several problems related to computing greedy matchings can be formulated as stable marriage problems and as a consequence several recently proposed algorithms for computing greedy matchings are in fact special cases of well known algorithms for the stable marriage problem.</p>
<p>However, in terms of implementations and practical scalable solutions on modern hardware, work on computing greedy matchings has made considerable progress. We show that due to this strong relationship many of these results are also applicable for solving stable marriage problems. This is further demonstrated by designing and testing efficient multicore as well as GPU algorithms for the stable marriage problem. Read More: http://epubs.siam.org/doi/abs/10.1137/1.9781611974690.ch10</p>
Fri, 01 Jan 2016 00:00:00 GMThttp://hdl.handle.net/1956/167542016-01-01T00:00:00ZCommunity Detection on the GPU
http://hdl.handle.net/1956/16753
Community Detection on the GPU
Naim, Md.; Manne, Fredrik; Halappanavar, Mahantesh; Tumeo, Antonino
Conference object
We present and evaluate a new GPU algorithm based on the Louvain method for community detection. Our algorithm is the first for this problem that parallelizes the access to individual edges. In this way we can fine tune the load balance when processing networks with nodes of highly varying degrees. This is achieved by scaling the number of threads assigned to each node according to its degree. Extensive experiments show that we obtain speedups up to a factor of 270 compared to the sequential algorithm. The algorithm consistently outperforms other recent shared memory implementations and is only one order of magnitude slower than the current fastest parallel Louvain method running on a Blue Gene/Q supercomputer using more than 500K threads.
Sun, 01 Jan 2017 00:00:00 GMThttp://hdl.handle.net/1956/167532017-01-01T00:00:00ZOptimizing Approximate Weighted Matching on Nvidia Kepler K40
http://hdl.handle.net/1956/16752
Optimizing Approximate Weighted Matching on Nvidia Kepler K40
Naim, Md.; Manne, Fredrik; Halappanavar, Mahantesh; Tumeo, Antonino; Langguth, Johannes
Conference object
Matching is a fundamental graph problem with numerous applications in science and engineering. While algorithms for computing optimal matchings are difficult to parallelize, approximation algorithms on the other hand generally compute high quality solutions and are amenable to parallelization. In this paper, we present efficient implementations of the current best algorithm for half-approximate weighted matching, the Suitor algorithm, on Nvidia Kepler K-40 platform. We develop four variants of the algorithm that exploit hardware features to address key challenges for a GPU implementation. We also experiment with different combinations of work assigned to a warp. Using an exhaustive set of 269 inputs, we demonstrate that the new implementation outperforms the previous best GPU algorithm by 10 to 100x for over 100 instances, and from 100 to 1000x for 15 instances. We also demonstrate up to 20x speedup relative to 2 threads, and up to 5x relative to 16 threads on Intel Xeon platform with 16 cores for the same algorithm. The new algorithms and implementations provided in this paper will have a direct impact on several applications that repeatedly use matching as a key compute kernel. Further, algorithm designs and insights provided in this paper will benefit other researchers implementing graph algorithms on modern GPU architectures.
Mon, 01 Jan 2018 00:00:00 GMThttp://hdl.handle.net/1956/167522018-01-01T00:00:00ZVaulteR - A pipeline for vault associated RNA detection from RNA- sequencing
http://hdl.handle.net/1956/16747
VaulteR - A pipeline for vault associated RNA detection from RNA- sequencing
Gao, Yue
Master thesis
Vaults are highly conserved ribonucleoprotein complexes of unknown function. They have so far been found to be present in high numbers among higher eukaryotes including mammals, amphibians, and avians, as well as lower eukaryotes including deuterostomes and the slime mold (Dictyostelium discoideum). The aim of this thesis is to design a pipeline for vault associated RNA detection from RNA-sequences. And especially try to detect vtRNA in the Salmon Louse. The genome of the atlantic salmon louse, a major parasite of salmonids, affecting the global aquaculture industry. The thesis presents three methods of detecting vtRNA, one way is to find the peaks in the alignment of reads and search for the high coverage sequences in Rfam to check the existence of vtRNA. Another way is by predicting the secondary structures of the high coverage sequences, drawing a dendrogram with hierarchical clusters according to the dissimilarity matrix of RNA secondary structures, and then analysing key features of secondary structures of the known vtRNA in order to filter the candidates. At last, the third method is by detecting motifs, such as A-Box and B-Box, in candidate sequences with the MEME Suite. The result of this thesis is a pipeline that can effectively detect vtRNA, and a set of novel candidate sequences which can probably act as vtRNA in the salmon louse genome.
Fri, 06 Oct 2017 00:00:00 GMThttp://hdl.handle.net/1956/167472017-10-06T00:00:00ZInteractive Exploration of Ligand Transportation through Protein Tunnels
http://hdl.handle.net/1956/16634
Interactive Exploration of Ligand Transportation through Protein Tunnels
Furmanova, Katarina; Jaresova, Miroslava; Byska, Jan; Jurcik, Adam; Parulek, Julius; Hauser, Helwig; Kozlikova, Barbora
Journal article
<p>Background: Protein structures and their interaction with ligands have been in the focus of biochemistry and structural biology research for decades. The transportation of ligand into the protein active site is often complex process, driven by geometric and physico-chemical properties, which renders the ligand path full of jitter and impasses. This prevents understanding of the ligand transportation and reasoning behind its behavior along the path.</p>
<p>Results: To address the needs of the domain experts we design an explorative visualization solution based on a multi-scale simplification model. It helps to navigate the user to the most interesting parts of the ligand trajectory by exploring different attributes of the ligand and its movement, such as its distance to the active site, changes of amino acids lining the ligand, or ligand “stuckness”. The process is supported by three linked views – 3D representation of the simplified trajectory, scatterplot matrix, and bar charts with line representation of ligand-lining amino acids.</p>
<p>Conclusions: The usage of our tool is demonstrated on molecular dynamics simulations provided by the domain experts. The tool was tested by the domain experts from protein engineering and the results confirm that it helps to navigate the user to the most interesting parts of the ligand trajectory and to understand the ligand behavior.</p>
Wed, 15 Feb 2017 00:00:00 GMThttp://hdl.handle.net/1956/166342017-02-15T00:00:00ZComparative Visualization of Protein Secondary Structures
http://hdl.handle.net/1956/16631
Comparative Visualization of Protein Secondary Structures
Kocincova, Lucia; Jaresova, Miroslava; Byska, Jan; Parulek, Julius; Hauser, Helwig; Kozlikova, Barbora
Journal article
<p>Background: Protein function is determined by many factors, namely by its constitution, spatial arrangement, and dynamic behavior. Studying these factors helps the biochemists and biologists to better understand the protein behavior and to design proteins with modified properties. One of the most common approaches to these studies is to compare the protein structure with other molecules and to reveal similarities and differences in their polypeptide chains.</p>
<p>Results: We support the comparison process by proposing a new visualization technique that bridges the gap between traditionally used 1D and 3D representations. By introducing the information about mutual positions of protein chains into the 1D sequential representation the users are able to observe the spatial differences between the proteins without any occlusion commonly present in 3D view. Our representation is designed to serve namely for comparison of multiple proteins or a set of time steps of molecular dynamics simulation.</p>
<p>Conclusions: The novel representation is demonstrated on two usage scenarios. The first scenario aims to compare a set of proteins from the family of cytochromes P450 where the position of the secondary structures has a significant impact on the substrate channeling. The second scenario focuses on the protein flexibility when by comparing a set of time steps our representation helps to reveal the most dynamically changing parts of the protein chain.</p>
Wed, 15 Feb 2017 00:00:00 GMThttp://hdl.handle.net/1956/166312017-02-15T00:00:00ZA domain-specific dialect for financial-economic calculations using reactive programming
http://hdl.handle.net/1956/16423
A domain-specific dialect for financial-economic calculations using reactive programming
Marheim, Christian Lundekvam
Master thesis
This thesis investigates to what extent reactive programming is suited for the implementation of systems that support banks in providing financial-economic advice to their private customers. To this end, we built a small such system in Hotdrink, as a proof-of-concept. This small system is a simplification of a much larger system that has been developed by Delfi Data, and which is overdue to be rewritten from scratch. Our system was evaluated on the commercial criteria readability, maintainability, extensibility, and efficiency. It turned out that readability and maintainability were quite high, even allowing nondevelopers to understand most of the code. However, extensibility is limited for financial calculations in general and especially for parallelization and encapsulation. With regard to commercialization, the solution was too immature to be evaluated on efficiency. Our conclusion is that reactive programming alone is too restrictive for solving financialeconomic problems. Notwithstanding the above, as a by-product we developed a prototype which sourceto-source compiles our new high-level domain-specific JavaScript dialect to reactive calculations.
Sat, 19 Aug 2017 00:00:00 GMThttp://hdl.handle.net/1956/164232017-08-19T00:00:00ZInteractive Dynamic Volume Illumination with Refraction and Caustics
http://hdl.handle.net/1956/16327
Interactive Dynamic Volume Illumination with Refraction and Caustics
Magnus, Jens Gåsemyr
Master thesis
In recent years, significant progress has been made in developing high-quality interactive methods for realistic volume illumination. However, refraction - despite being an important aspect of light propagation in participating media - has so far only received little attention. In this thesis, we present a novel approach for refractive volume illumination including caustics capable of interactive frame rates. By interleaving light and viewing ray propagation, our technique avoids memory-intensive storage of illumination information and does not require any precomputation. Propagation of refracted illumination is realized by employing a Semi-Lagrangian backward integration scheme, inspired by texture advection from the field of texture-based flow visualization. It is fully dynamic and all parameters such as light position and transfer function can be modified interactively without a performance penalty.
Tue, 15 Aug 2017 00:00:00 GMThttp://hdl.handle.net/1956/163272017-08-15T00:00:00ZVisual Analysis of Protein-Protein Interaction
http://hdl.handle.net/1956/16326
Visual Analysis of Protein-Protein Interaction
Horne, Marius Tendeland
Master thesis
Over the last decade there has been a steady increase in the focus of research
into Protein-Protein docking. The Docking software provides a plausible configuration
to a Protein-Protein Interaction. The docking will also provide analysis
and ranking of said plausible configuration of Protein-Protein Interaction. The
Docking softwares are getting more reliable, but there are still parameters that
the software can’t handle, and domain experts have to manually explore the
configurations to find and select the relevant ones, which is a time consuming
process. With our software, the time used by domain experts to explore the
configurations, will be reduced. This software provides a nice overview of the
connections between the two proteins in a Protein-Protein Interaction, and provides
3D visual aid to locate the spatial orientation of the contact zone in the
proteins, and the Amino Acid pairs in the contact zones spatial orientation to
each other.
Tue, 15 Aug 2017 00:00:00 GMThttp://hdl.handle.net/1956/163262017-08-15T00:00:00ZExploring graph parameters similar to tree-width and path-width
http://hdl.handle.net/1956/16325
Exploring graph parameters similar to tree-width and path-width
Nordstrand, Joakim Alme
Master thesis
In a recent paper appearing at IPEC 2015, ”Maximum matching width: new
characterization and fast algorithms for dominating set” [12], three similar treelike
parameters, tree-width, branch-width and maximum matching-width, are
studied using a common framework. Here we extend the work of that paper in
several ways. First we will answer one of the open problems by proving that
the ”last parameter” from the framework defining the three parameters is equal
to tree-width. Second we fill out the details in one of the proofs to ensure its
correctness. Third we define two new parameters, linear branch-width and linear
maximum matching-with. These are the ”path-like” variants of branch-width
and maximum matching-width respectively in the same sense that path-width is
the ”path-like” variant of tree-width. We show that linear branch-width is almost
identical to path-width, for any graph G we have pw(G) ≤ lbw(G) ≤ pw(G) + 1,
and that when a graph has linear maximum matching-width k then its pathwidth
is somewhere between k and 2k. We also explore the minimal forbidden
minors of graphs with linear branch-width less than 1,2 and 3 and with linear
maximum matching-width less than 1,2 and 3.
Tue, 04 Jul 2017 00:00:00 GMThttp://hdl.handle.net/1956/163252017-07-04T00:00:00ZProgram Transformations in Magnolia
http://hdl.handle.net/1956/16324
Program Transformations in Magnolia
Haugsbakk, Kristoffer
Master thesis
We explore program transformations in the context of the Magnolia
programming language. We discuss research and implementations of
transformation techniques, scenarios to put them to use in Magnolia,
interfacing with transformations, and potential workflows and tooling
that this approach to programming enables.; Vi utforsker program transformasjoner med tanke på programmeringsspråket Magnolia. Vi diskuterer forsking og implementasjoner av transformasjonsteknikker, sammenhenger der vi kan bruke dei i Magnolia, grensesnitt til transformasjoner, og potensielle arbeidsflyt og verktøy som denne tilnærmingen til programmering kan tillate og fremme.
Tue, 11 Jul 2017 00:00:00 GMThttp://hdl.handle.net/1956/163242017-07-11T00:00:00ZUsing 3D functionality available in current web-browsers to create and visualize geological models.
http://hdl.handle.net/1956/16264
Using 3D functionality available in current web-browsers to create and visualize geological models.
Malt, Øystein Ivar
Master thesis
This thesis investigates the possibility of using modern web technologies to develop accessible applications for interactive covisualization of geological data such as topography, seismic slices and measurements from wells. To avoid low-level development we specifically investigate X3DOM, which hides the details of graphics rendering in a high-level, declarative XML-like syntax. The thesis shows how to successfully implement a geoscientific application using X3DOM and the Angular web application framework.
Tue, 11 Jul 2017 00:00:00 GMThttp://hdl.handle.net/1956/162642017-07-11T00:00:00ZMachine Learning methods for mood disorder decision support
http://hdl.handle.net/1956/16259
Machine Learning methods for mood disorder decision support
Oleksy, Tomasz Artur
Master thesis
Tue, 11 Jul 2017 00:00:00 GMThttp://hdl.handle.net/1956/162592017-07-11T00:00:00ZGraphical Computing Solution for Industrial Plant Engineering
http://hdl.handle.net/1956/16258
Graphical Computing Solution for Industrial Plant Engineering
Lima, Marcelo Furtado
Master thesis
When preparing an engineering operation on an industrial plant, reliable and updated models of the plant must be available for correct decisions and planning. However, especially in the case of offshore oil and gas installations, it can hazardous and expensive to send an engineering party to assess and update the model of the plant. To reduce the cost and risk of modelling the plant, there are methods for quickly generating a 3D representation, such as LiDAR and stereoscopic reconstruction. However, these methods generate large files with no inherit cohesion. To address this, we propose to find a solution to efficiently transform point clouds from stereoscopic reconstruction into small mesh files that can be streamed or shared across teams. With that in mind, different techniques for treating point clouds and generating meshes were tested independently to measure their performance and effectiveness on an artifact-rich data set, such as the ones this work is aimed for. Afterwards, the techniques were combined into pipelines and compared with each other in terms of efficiency, file size output, and quality. With all results in place, the best solution from the ones tested was identified and validated with large real-world data sets.
Wed, 21 Jun 2017 00:00:00 GMThttp://hdl.handle.net/1956/162582017-06-21T00:00:00ZModel Checking with the Sweep-Line Method
http://hdl.handle.net/1956/16257
Model Checking with the Sweep-Line Method
Lilleskare, Andreas
Master thesis
Explicit-state model checking is a formal software verification technique that differs from peer review and unit testing, in that model checking does an exhaustive state space search. With model checking one takes a system model, traverse all reachable states, and check theses according to formal stated properties over the variables in the model. The properties can be expressed with linear temporal logic or computation tree logic, and can for example be that the value of some variable x should always be positive. When conducting an explicit state space exploration one is guaranteed that the complete state space is checked according to the given property. This is not the case in for instance unit testing, where only fragments of a system are tested. In the case that a property is violated, the model checking algorithm should present an error trace. The error trace represents an execution path of the model, demonstrating why it does not satisfy the property. The main disadvantage of model checking, is that the number of reachable states may grow exponentially in the number of variables. This is known as the state explosion problem. This thesis focuses on explicit-state model checking using the sweep-line method. To combat the state explosion problem, the sweep-line method exploits the notion of progress that a system makes, and is able to delete states from memory on-the-fly during the verification process. The notion of progress is captured by progress measures. Since the standard model checking algorithms rely upon having the whole state space in memory, they are not directly compatible with the sweep-line method. We survey differences of standard model checking algorithms and the sweep-line method, and present previous research on verifying properties and providing error traces with the sweep-line method. The new contributions of this thesis are as follows: (1) We develop a new general technique for providing an error trace for linear temporal logic properties, verified using the sweep-line method; (2) A new algorithm for verifying two key computation tree logic properties, on models limited to monotonic progress measures; (3) A unified library for the sweep-line method is implemented with the algorithms developed in this thesis, and the previous developed algorithms for verifying safety properties and linear temporal logic property checking. All algorithms implemented, are validated by checking properties on a model of a stop-and-wait communication protocol.
Wed, 21 Jun 2017 00:00:00 GMThttp://hdl.handle.net/1956/162572017-06-21T00:00:00ZUsing Heat and Ceilometer to create an elastic OpenStack grid
http://hdl.handle.net/1956/16256
Using Heat and Ceilometer to create an elastic OpenStack grid
Trippler, Niklas
Master thesis
Grid computing is a term for connecting computing resources together to solve large computational problems. Computational grids are used for a lot of computations within the high energy physics domain, where the amount of computing power required for some tasks is vastly more than a local computer can provide. This thesis investigates if cloud technology can be utilized to make an elastic computational grid, in order to get access to more resources that would otherwise be idle. Functional requirements were defined for creating a prototype capable of providing a virtualized environment that scales the amount of virtual machines up and down automatically based on the load on the system. A prototype was created to take advantage of the technology provided by cloud, and the prototype tested to see how it fulfills the functional requirements. Although one of the functional requirements was not achieved, the test results demonstrate that the technology has promising potential, but further work and testing needs to be done.
Wed, 21 Jun 2017 00:00:00 GMThttp://hdl.handle.net/1956/162562017-06-21T00:00:00ZVisualization of Biomolecular Structures: State of the Art Revisited
http://hdl.handle.net/1956/16221
Visualization of Biomolecular Structures: State of the Art Revisited
Kozlíková, Barbora; Krone, Michael; Falk, Martin; Lindow, Norbert; Baaden, Marc; Baum, Daniel S.; Viola, Ivan; Parulek, Julius; Hege, Hans-Christian
Journal article
Structural properties of molecules are of primary concern in many fields. This report provides a comprehensive overview on techniques that have been developed in the fields of molecular graphics and visualization with a focus on applications in structural biology. The field heavily relies on computerized geometric and visual representations of three-dimensional, complex, large and time-varying molecular structures. The report presents a taxonomy that demonstrates which areas of molecular visualization have already been extensively investigated and where the field is currently heading. It discusses visualizations for molecular structures, strategies for efficient display regarding image quality and frame rate, covers different aspects of level of detail and reviews visualizations illustrating the dynamic aspects of molecular simulation data. The survey concludes with an outlook on promising and important research topics to foster further success in the development of tools that help to reveal molecular secrets.
Fri, 18 Nov 2016 00:00:00 GMThttp://hdl.handle.net/1956/162212016-11-18T00:00:00ZFaster enumeration of minimal connected dominating sets in split graphs
http://hdl.handle.net/1956/16058
Faster enumeration of minimal connected dominating sets in split graphs
Skjørten, Ida Bredal
Master thesis
Graphs are mathematical objects that can be used to model many real world problems. An example is a roadmap, where the nodes in the graph represent cities and the edges of the graph represent roads. An interesting and important task is to find certain objects or node subsets with a specified property in a graph. It is exactly this type of question this thesis will study, and the objects that we are looking for are minimal connected dominating sets. In many applications the best known way to find a minimum object of a certain type can be to list all the minimal objects and then pick the smallest one among these minimal objects. Thus, knowing the number of such objects and being able to list them in reasonable time might help us get faster algorithms for hard problems. In 2015, Golovach, Heggernes and Kratsch gave an O(1.3803^n)-time algorithm for enumerating the mcds in split graphs. This algorithm gives the best known upper bound on the number of mcds in split graphs. Golovach et al also give the best known lower bound example on the number of mcds in split graphs, which has 1.3195^n mcds, so there is an albeit small gap between the lower and the upper bound. In this thesis we implement and test the mentioned enumeration algorithm, and with these tests we find a new lower bound example with 1.3195^n mcds. On the theoretical side, we improve the running time of the enumeration algorithm to O(1.3674^n), thereby proving that the maximum number of mcds in split graphs is at most 1.3674^n. Consequently, we narrow the gap between the known upper and lower bound on the maximum number of mcds in split graphs.
Tue, 20 Jun 2017 00:00:00 GMThttp://hdl.handle.net/1956/160582017-06-20T00:00:00ZCommunity Detection in Complex Networks
http://hdl.handle.net/1956/16057
Community Detection in Complex Networks
Lund, Herman Møyner
Master thesis
Tue, 20 Jun 2017 00:00:00 GMThttp://hdl.handle.net/1956/160572017-06-20T00:00:00ZParallel algorithms for matching under preference
http://hdl.handle.net/1956/16056
Parallel algorithms for matching under preference
Lerring, Håkon Heggernes
Master thesis
Tue, 20 Jun 2017 00:00:00 GMThttp://hdl.handle.net/1956/160562017-06-20T00:00:00ZNew Lower Bounds on the Maximum Number of Minimal Connected Vertex Covers
http://hdl.handle.net/1956/16055
New Lower Bounds on the Maximum Number of Minimal Connected Vertex Covers
Ryland, Ida
Master thesis
Graphs are important mathematical structures that are used to model many real-life problems. They can, for instance, be used to model relations between objects in a network. An important field of study in graph theory is the study of vertex covers. A vertex cover of a graph is a subset of vertices such that each edge of the graph is incident to at least one vertex in the vertex cover. In this thesis we are interested in listing all vertex covers of a certain type and deciding the maximum number of them. In theoretical computer science we speak about extremal problems where we seek to find how many different configurations of a certain type can occur in a graph, satisfying some properties. Closely related to this, are enumeration problems where the goal is to enumerate all solutions to a given problem. Enumeration of minimal vertex covers are well studied. However, the problem of enumerating minimal connected vertex covers has not been given the same attention. A recent paper by Golovach, Heggernes and Kratsch studies exactly this problem. The authors give an algorithm for enumerating all minimal connected vertex covers in time O(1.8668^n). This also provides an upper bound of 1.8668^n on the number of minimal connected vertex covers a graph can have. They also provide a lower bound, which is a graph that has 3^((n-1)/3) ~ 1.4422^n minimal connected vertex covers. This leaves a gap between the upper and lower bounds. In this thesis we implement and study the algorithm by Golovach et al. We theoretically prove that there exist graphs that have 1.51978^n minimal connected vertex covers. With these discoveries we have been able to increase the lower bound from 1.4422^n to 1.51978^n and by that narrowed the gap between the upper and lower bounds on the number of minimal connected vertex covers in graphs.
Tue, 20 Jun 2017 00:00:00 GMThttp://hdl.handle.net/1956/160552017-06-20T00:00:00ZPrecrec: fast and accurate precision-recall and ROC curve calculations in R
http://hdl.handle.net/1956/15892
Precrec: fast and accurate precision-recall and ROC curve calculations in R
Saito, Takaya; Rehmsmeier, Marc
Journal article
The precision–recall plot is more informative than the ROC plot when evaluating classifiers on imbalanced datasets, but fast and accurate curve calculation tools for precision–recall plots are currently not available. We have developed Precrec, an R library that aims to overcome this limitation of the plot. Our tool provides fast and accurate precision–recall calculations together with multiple functionalities that work efficiently under different conditions.
Sun, 01 Jan 2017 00:00:00 GMThttp://hdl.handle.net/1956/158922017-01-01T00:00:00ZRidge Extraction, and Illustrative Visualization of an FTLE Flow Field.
http://hdl.handle.net/1956/15843
Ridge Extraction, and Illustrative Visualization of an FTLE Flow Field.
Sivertsen, Stian
Master thesis
Illustrative visualization of a flow is an area
that is proving to be interesting and in later
years something that
have been somewhat explored, it can provide some
very interesting results and in this thesis we
find the required structures needed for such a
visualization. Lagrangian coherent structures are
now commonly used in flow visualization and we
establish a simplistic approach to creating them
in this thesis.
The problem of occlusion and the general
complexity of both the data and the visual results
are still a common thing when working with a
structured flow.
This thesis presents a solution that identifies
the complex and often large amounts of data that
is inherent in a flow and tries to find a good way
of
extracting the structures within. By using the
FTLE values found in the flow data we define these
structures so that we can see what is going on
in these complex systems, as well as combining it
with an illustrative approach that will provide an
interesting visual aspect to the integral
structures found in
flows. By using mathematical analysis of the raw
data we obtain the necessary tools we need to
define and extract ridges that
form LCS that we can then model. We use a simple
approach by focusing on the basic components of
such complex structures as well
as utilizing often long standing principles in an
interesting way.
Combining the strengths of both illustrative
visualization and the clarity of integral
structures we can create
a more visual sound model that loses some of its
cluttering complexities but is still capable of
generating a pleasing visual end result.
Fri, 01 Jun 2012 00:00:00 GMThttp://hdl.handle.net/1956/158432012-06-01T00:00:00ZFirewalls: Enforcement of Security Policy in networks
http://hdl.handle.net/1956/15842
Firewalls: Enforcement of Security Policy in networks
Nordås, Harald
Master thesis
Firewalls are set up to protect computer
networks. Originally, networks were just
coupled
together, in order to achieve connection. With
the ability to reach networks all over the
world, one started to denote this the Internet.
When the Internet evolved, the focus
of design was on connectivity, efficiency and
reliability, but not on security. Gradually,
the need to control the traffic gave rise to
enhanced security features of the routers
connecting the networks. In the 1990s this was
no longer sufficient, because people were
able to circumvent this simple barrier.
Dedicated devices then developed into more
advanced mechanisms. The concept of a digital
firewall was born.
Today a plethora of threats to a network
challenge the firewall and organisations who
are managing the firewall system. This thesis
will go into the many tasks involved in
managing a firewall system. We will look at
challenges but also suggest solutions to
some of the issues.
Thu, 20 Nov 2014 00:00:00 GMThttp://hdl.handle.net/1956/158422014-11-20T00:00:00ZMultiple Side Linear Equations. A New Tool For Solving Sparse Algebraic Equations in Finite Field
http://hdl.handle.net/1956/15841
Multiple Side Linear Equations. A New Tool For Solving Sparse Algebraic Equations in Finite Field
Renzengwangdui, X
Master thesis
A new technique of linearization for sparse multivariate polynomial equation system is presented. Applying Gluing algorithm on the newly constructed linear equation systems, therefore solve the original multivariate polynomial system.
Thu, 28 Jan 2010 00:00:00 GMThttp://hdl.handle.net/1956/158412010-01-28T00:00:00ZUtvikling og portering av Java ME applikasjoner
http://hdl.handle.net/1956/15840
Utvikling og portering av Java ME applikasjoner
Haugen, Thomas Rene Vervik
Master thesis
Ved utvikling av mobilapplikasjoner i Java ME møter man anderledes utfordringer enn ved mer tradisjonell utvikling av skrivebordsapplikasjoner og/eller webbaserte applikasjoner. Denne oppgaven tar for seg disse forskjellene og utfordringene og hvordan man håndterer dem
Mon, 02 Jun 2008 00:00:00 GMThttp://hdl.handle.net/1956/158402008-06-02T00:00:00ZWorkflow management systems
http://hdl.handle.net/1956/15839
Workflow management systems
Imsland, Geir Inge Struen
Master thesis
This master's thesis gives an insight to workflow technologies used to improve efficiency of business processes. Ways to use such technologies in order to help users through tasks in MIPS (Material Integrated Production System) are discussed. Description on how to implement a workflow management system for MIPS is provided. Workflow engine, workflow importer and user interface are the three most important parts discussed.
Sat, 01 Dec 2007 00:00:00 GMThttp://hdl.handle.net/1956/158392007-12-01T00:00:00ZExact methods for solving the small scale single vehicle pickup and delivery problem with time windows
http://hdl.handle.net/1956/15817
Exact methods for solving the small scale single vehicle pickup and delivery problem with time windows
Lakel, Yacine Mohamed
Master thesis
The single vehicle pickup and delivery problem
with time windows (1-PDPTW) is a wellknown
problem in transportation and combinatorics. In
practice, these routes are relatively
small with less then 30 stops. In this thesis, we
consider two exact methods to solve
this problem for small instance sizes. The first
is solving the corresponding MIP-model
using commercial linear programming software and
the second is a forward dynamic
programming algorithm. The motivation for this
thesis is to determine the usability of
exact methods as a heuristic to solve the multi
vehicle pickup and delivery problem.
Mon, 15 Aug 2016 00:00:00 GMThttp://hdl.handle.net/1956/158172016-08-15T00:00:00ZVisualization and Interaction with Medical Data in Immersive Environments
http://hdl.handle.net/1956/15816
Visualization and Interaction with Medical Data in Immersive Environments
Hammersland, Yngve Devik
Master thesis
Immersive visualization techniques are just starting to see limited adoption in medical applications.
The Visualization Group at the University of Bergen wish to expand its research
efforts into such immersive visualization techniques. A new immersive environment installed
at the University is meant to be utilized for this purpose.
This thesis presents a solution which enables the use of VolumeShop in the immersive
environments for presentation of volumetric data and for combination of 2D and 3D medical
imaging modalities. VolumeShop is a rapid prototyping environment for visualization
techniques, which is often used by our visualization group.
For general use of the immersive environment, motion tracking support needs to be
added to VolumeShop. This motion tracking support can then be utilized to realize the immersive
visualization pipeline. This pipeline computes correct perspective projection based
on the user's position. Additionally it implements intuitive gesture based interaction with
the data using a handheld interaction device.
A multimodal visualization pipeline is described and implemented, which enables the
visualization of 2D+time ultrasound images combined with MRI volumes. To realize this
pipeline, components for motion tracking of the ultrasound probe, for synchronization of
the time offset between the motion tracking and the ultrasound series, picking landmarks
in the ultrasound slice and MRI cross-section, and computation of the registration transformation
which spatially unifies the two of the modalities.
The result of this thesis is a general purpose motion tracking component as well as the
two described pipelines.
Mon, 08 Dec 2008 00:00:00 GMThttp://hdl.handle.net/1956/158162008-12-08T00:00:00ZComparison of OpenMP and Threading Building Blocks for expressing parallelism on shared-memory systems
http://hdl.handle.net/1956/15815
Comparison of OpenMP and Threading Building Blocks for expressing parallelism on shared-memory systems
Refsnes, Peder Rindal
Master thesis
The thesis offers a comparison of OpenMP and Intel Threading
Building blocks. The two are threading packages used to express
parallelism in programs. In addition, the thesis presents a
promising scalable algorithm for finding connected components in
graphs using disjoint set data structures.
Tue, 01 Feb 2011 00:00:00 GMThttp://hdl.handle.net/1956/158152011-02-01T00:00:00ZWebucator 3.0 - Brukerhåndtering og aksesskontroll for DPG 2.0
http://hdl.handle.net/1956/15814
Webucator 3.0 - Brukerhåndtering og aksesskontroll for DPG 2.0
Løvik, Kristian Skønberg
Master thesis
Oppgaven omhandler brukerhåndtering, sikkerhet og aksesskontroll i
Dynamic Presentation Generator og Webucator, som er sentrale systemer for
gjennomføringen av fjernundervisningskurs over internett ved Universitetet i
Bergen.
Mon, 18 Aug 2008 00:00:00 GMThttp://hdl.handle.net/1956/158142008-08-18T00:00:00ZWSDL Workshop. Semantic web application in HTML5 for the discovery, construction and analysis of workflows
http://hdl.handle.net/1956/15813
WSDL Workshop. Semantic web application in HTML5 for the discovery, construction and analysis of workflows
Nozal Canadas, Rafael Adolfo
Master thesis
WSDL-Workshop is a HTML5 web application for the discovery and exploration of web services and for analysing the compatibility between web services. This is the result of a mathematical model developed from WSDL1.1. The program provides a graphical user interface and let the user build a workflow composed of services described with WSDL1.1 and tells if: ∙An output is compatible with an input. ∙It is correct to link an output with an input. ∙It is correct to link a given operation after another. ∙If many operations correctly linked together still make sense as a group. In order to do that the WSDLs must have semantic annotations so the computer can recognize what is the purpose of certain data or operation. WSDL-Workshop uses the EDAM Ontology as a reference for semantic concepts. In the discovery aspect; for a given set of WSDLs you can find services by filtering by operation name, input or output names, or semantic annotations. For a given operation output it can also filter by WSDL which have inputs which are correct to link with that output. For a given operation it can also filter by operations which are correct to link after.
Wed, 20 Nov 2013 00:00:00 GMThttp://hdl.handle.net/1956/158132013-11-20T00:00:00ZKernelization of Vertex Cover by Structural Parameters
http://hdl.handle.net/1956/15802
Kernelization of Vertex Cover by Structural Parameters
Strømme, Torstein Jarl Fagerbakke
Master thesis
In the NP-complete problem Vertex Cover, one is given a graph G and an integer k and are asked whether there exists a vertex set S ⊆ V (G) with size at most k such that every edge of the graph is incident to a vertex in S. In this thesis we explore techniques to solve Vertex Cover using parameterized algorithms, with a particular focus on kernelization by structural parameters. We present two new polynomial kernels for Vertex Cover, one parameterized by the size of a minimum degree-2 modulator, and one parameterized by the size of a minimum pseudoforest modulator. A degree-2 modulator is a vertex set X ⊆ V (G) such that G-X has maximum degree two, and a pseudoforest modulator is a vertex set X ⊆ V (G) such that every connected component of G-X has at most one cycle. Specifically, we provide polynomial time algorithms that for an input graph G and an integer k, outputs a graph G' and an integer k' such that G has a vertex cover of size k if and only if G' has a vertex cover of size k'. Moreover, the number of vertices of G' is bounded by O(|X|^7) where |X| is the size of a minimum degree-2 modulator for G, or bounded by O(|X|^12) where |X| is the size a minimum pseudoforest modulator for G. Our result extends known results on structural kernelization for Vertex Cover.
Mon, 03 Aug 2015 00:00:00 GMThttp://hdl.handle.net/1956/158022015-08-03T00:00:00ZNonlinear feedback shift registers and generating of binary de Bruijn sequences
http://hdl.handle.net/1956/15801
Nonlinear feedback shift registers and generating of binary de Bruijn sequences
Vivelid, Christian Ebne
Master thesis
Mon, 21 Nov 2016 00:00:00 GMThttp://hdl.handle.net/1956/158012016-11-21T00:00:00ZParallel algorithms for computing k-connectivity
http://hdl.handle.net/1956/15786
Parallel algorithms for computing k-connectivity
Haakonsen, Joachim
Master thesis
Fri, 05 May 2017 00:00:00 GMThttp://hdl.handle.net/1956/157862017-05-05T00:00:00ZProgramming Language Technology for Niche Platforms
http://hdl.handle.net/1956/15648
Programming Language Technology for Niche Platforms
Hasu, Tero
Doctoral thesis
<p>Developers writing software for a niche platform are denied the luxury of a first-class vendor-supported integrated development environment and a large community crafting platform-tailored libraries, tools, and documentation. I outline a strategy for setting up a cross-platform software product line with cost-effective targeting of niche platforms in mind.</p>
<p>The product line setup strategy assumes little tool support from the platform vendor or third parties, instead relying on a suitably-designed, malleable general-purpose programming language for the necessary support. The required language support includes: program translation into the relevant vendor-favored languages; human-comprehensible translator output to allow for basic debugging irrespective of available tools; a component system for managing software assets and assembling products; static reasoning of facts about whole programs for the benefit of configuration management and building; and modifiability of the language from within (and perhaps also from without), to allow for purpose-oriented variability, and low-threshold implementation of abstractions over platform and product-line specific idioms.</p>
<p>I present a collection of technologies aimed at implementing such programming languages, and show a number of ways to apply such languages in ways that suit the niche platform application product line scenario. I use smartphone operating systems as an example platform ecosystem, and focus on error handling and prevention as an example concern that poses reuse, integration, and configuration management challenges in multi-platform codebases.</p>
Fri, 03 Mar 2017 00:00:00 GMThttp://hdl.handle.net/1956/156482017-03-03T00:00:00ZUtilizing the HTM algorithms for weather forecasting and anomaly detection
http://hdl.handle.net/1956/15383
Utilizing the HTM algorithms for weather forecasting and anomaly detection
Vivmond, Alexandre
Master thesis
Various studies have utilized different artificial neural networks (ANN) for weather forecasting. This thesis examines how well the official implementation of a novel online ANN called the Hierarchical Temporal Memory (HTM) can forecast the weather and detect anomalies in the weather data. Created by Numenta (www.numenta.com), the HTM emulates the brain's neocortical structures and processes to mimic its capabilities of memory retention. By using sparse distributed representations instead of binary representations as its foundation for information storage and representation, it is able to learn complex patterns in noisy data sets that can be used to make predictions and detect anomalies in streamed data. Numenta has officially implemented the theory of HTM in an open- source Python platform called NuPIC. Although there are slight differences between the theory of HTM and its implementation, the most important factor about NuPIC is the addition of several purely engineered algorithms. Two of the most notable additions, are an algorithm that enables NuPIC to make the final decisions in cases when more than one possible prediction is possible, and an algorithm that makes it possible to simultaneously input multiple metrics to NuPIC. The weather data that was to be predicted consisted of several weather factors, wind direction, wind speed, atmospheric pressure, precipitation, temperature, and relative humidity measurements spanning over a period of 12 years. Originally, the goal was to input the data sets simultaneously. However, because the functionality responsible for enabling this feature was malfunctioning at the time of the thesis work, every weather data set had to be input separately. The results showed that NuPIC was able to make decent forecasts, but was for the most part outperformed by a simple technique that made predictions by calculating the average of the last few days. The main reasons for this was due to the weather's lack of similarity between past and current conditions, and NuPIC's inability to generalize its knowledge in order to factor weather trends in its predictions. Although there is also a minor issue with the current engineered prediction algorithm, the results indicate that prediction is not NuPIC's strongest suit. NuPIC was completely unable to detect any noteworthy anomalies in the weather data, which again is most likely due to the weather data's chaotic nature. Despite the negative results, there were also some positive ones. An unrelated experiment that detected anomalies in the oil price, revealed that NuPIC was able to detect anomalies that were linked to major real world economic and/or geopolitical events. This indicates that the quality of NuPIC's results are highly dependant on the properties of the data set that it is given. Data sets that conform to NuPIC's strengths can lead to both decent predictions and anomaly detections, while those that do not produce poor results.
Thu, 17 Nov 2016 00:00:00 GMThttp://hdl.handle.net/1956/153832016-11-17T00:00:00ZThe Role of Humans in Complex ICT Systems
http://hdl.handle.net/1956/15375
The Role of Humans in Complex ICT Systems
Kazymyrova, Valentyna
Master thesis
In the modern world, companies, regardless of their type of business,
rely on information and communications technology (ICT) systems to
carry out their everyday operations. The ICT systems have been
developed over time to fit companies' changing needs. It is often
hard to determine when and how these systems were modified
because of constantly changing environments and stakeholders that
come and go. A large industry such as the petroleum business is
completely dependent on ICT systems and, as a result, it faces the
typical problems related to such systems.
This thesis models the interactions between applications used by
employees of a real engineering company. Analysis of the model
unveiled several classical problems of complex systems, such as
centricity, hidden relations, and closedness. The analysis shows that
insufficient understanding of the interdependencies between the
applications lead to unjustified actions that caused unpredictable
consequences.
It is argued that the interactions between humans and information
technology can never be excluded from the analysis of complex ICT
systems without damaging the quality and usefulness of the results.
Many real-life examples presented in this thesis show that humans
can be the source of errors, but they can also be a critically
important to rectify problems before the consequences become
intolerable. The role of humans in ICT systems is analyzed from a
bottom-up prospective with examples based on the author's
experience. The conclusions are supported by case studies from
everyday routines.
The thesis considers both theoretical and practical aspects of the
design, maintenance, and analysis of complex adaptive ICT systems.
Since it analyzes a real system, the thesis proposes several practical
improvements like openness, advanced human error validation, and
team diversity. While the thesis only studies the ICT system of a
single company, the recommendations should be of interest to other
companies as well.
Sun, 16 Oct 2016 00:00:00 GMThttp://hdl.handle.net/1956/153752016-10-16T00:00:00ZBeyond the question of fixed-parameter tractability
http://hdl.handle.net/1956/15340
Beyond the question of fixed-parameter tractability
Dregi, Markus Fanebust
Doctoral thesis
<p>Multivariate complexity is a prominent field that over the last decades has developed
a rich toolbox, not only to tackle seemingly intractable problems, but
also to describe the boundaries of tractability in a richer and more fine-grained
way. In this thesis we survey the research directions emerging after the question
of fixed-parameter tractability has been settled. That is, we define and exemplify
structural parameters, polynomial kernelizations, branching techniques, subexponential
time algorithms and parameterized approximation algorithms. In addition,
we display techniques for proving lower bounds for all of the above mentioned
directions. After this, we give new results within this parameterized framework
for several classic graph problems.</p>
<p>The problems studied in this thesis can naturally be divided into two groups; graph
modification problems and structural graph problems. With respect to graph
modification problems, we study problems where one is to remove a small set of
vertices in order to break the graph into small connected components. We also
study problems where, instead of deleting vertices, we are to add and/or remove a
small number of edges in order to obtain a graph that adheres to a specific set of
properties. We resolve several questions in the literature with respect to modifying
a graph to a threshold graph or to a chain graph. We prove that editing to such
graphs is NP-hard and, under the widely believed Exponential Time Hypothesis
(ETH), not solvable in 2o(√k) · nO(1) time. We also provide polynomial kernels and
subexponential time parameterized algorithms running in time 2O(√k log k) + nO(1)
for all three edge modification variants into both graph classes.</p>
<p>We also consider edge modifications into H-free graphs, where H is any finite set
of forbidden induced subgraphs, on bounded degree input graphs. We prove that
for a fixed maximum degree Δ, both edge editing and edge deletion to H-free
graphs in at most k operations, admit polynomial kernels with kO(Δ log Δ) vertices.
Then, via the framework of cross-compositions we prove that there is a finite set H,
such that completion to H-free graphs does not admit a polynomial kernelization
algorithm on bounded degree graphs, when parameterized by the bound on the
number operations k, unless NP ⊆ coNP/poly.</p>
<p>With respect to structural graph problems, we first provide several results for
bandwidth. We prove that, assuming ETH, there is no significant improvement
over the classic dynamic programming algorithm by Saxe [SIAM’80]. In particular, we prove that, assuming ETH, there is no f(b)no(b) time algorithm for deciding
whether the bandwidth of a graph is at most b. This result remains true when
restricted to trees of pathwidth at most 2. By the same reduction, we prove that
deciding whether the bandwidth of a graph is at most b, when parameterized
by b, is W[1]-hard when restricted to the same set of trees. Furthermore, we
provide the first approximation algorithm for computing the bandwidth of trees,
where the approximation factor depends solely on b. We then extend this result to
graphs of bounded treelength, a rich graph class containing among others chordal
graphs and graphs of bounded hyperbolicity. We also provide a characterization
of graphs of small bandwidth for the same graph classes. In particular, the
most general of these results states that a graph of bounded treelength can
only have high bandwidth if it has high local density or high pathwidth, or if it
contains a slight modification of a bandwidth obstruction introduced by Chung
and Seymour [Discrete Mathematics’89].</p>
<p>Finally, we provide a constant factor approximation algorithm for computing the
treewidth of a graph that runs in O(ckn) time. The algorithm either provides
a tree decomposition of width 5k + 4 or concludes that the treewidth of the
input graph is larger than k. This algorithm improves several known results,
like the one by Robertson and Seymour [JCTB’95] and Reed [STOC’92] and
Amir [Algorithmica’10]. We point out that there are many important problems in
the literature, for example computing a vertex cover, a dominating set or a steiner
tree of a graph, that can be solved in O(ckn) time if provided a tree decomposition
of width at most O(k). The algorithm presented, is the first that can provide such
algorithms with a tree decomposition of sufficiently small width, without being the
bottleneck of the composed algorithm: That is, an algorithm that first computes
a decomposition of width O(k) and then solves the problem in O(ckn) time.</p>
Fri, 06 Jan 2017 00:00:00 GMThttp://hdl.handle.net/1956/153402017-01-06T00:00:00ZWEBnm@ v2.0: Web server and services for comparing protein flexibility
http://hdl.handle.net/1956/13058
WEBnm@ v2.0: Web server and services for comparing protein flexibility
Tiwari, Sandhya Premnath; Fuglebakk, Edvin; Hollup, Siv Midtun; Skjærven, Lars; Cragnolini, Tristan; Grindhaug, Svenn Helge; Tekle, Kidane M; Reuter, Nathalie
Journal article
<p>Background: Normal mode analysis (NMA) using elastic network models is a reliable and cost-effective computational method to characterise protein flexibility and by extension, their dynamics. Further insight into the dynamics–function relationship can be gained by comparing protein motions between protein homologs and functional classifications. This can be achieved by comparing normal modes obtained from sets of evolutionary related proteins.</p>
<p>Results: We have developed an automated tool for comparative NMA of a set of pre-aligned protein structures. The user can submit a sequence alignment in the FASTA format and the corresponding coordinate files in the Protein Data Bank (PDB) format. The computed normalised squared atomic fluctuations and atomic deformation energies of the submitted structures can be easily compared on graphs provided by the web user interface. The web server provides pairwise comparison of the dynamics of all proteins included in the submitted set using two measures: the Root Mean Squared Inner Product and the Bhattacharyya Coefficient. The Comparative Analysis has been implemented on our web server for NMA, WEBnm@, which also provides recently upgraded functionality for NMA of single protein structures. This includes new visualisations of protein motion, visualisation of inter-residue correlations and the analysis of conformational change using the overlap analysis. In addition, programmatic access to WEBnm@ is now available through a SOAP-based web service. Webnm@ is available at http://apps.cbu.uib.no/webnma.</p>
<p>Conclusion: WEBnm@ v2.0 is an online tool offering unique capability for comparative NMA on multiple protein structures. Along with a convenient web interface, powerful computing resources, and several methods for mode analyses, WEBnm@ facilitates the assessment of protein flexibility within protein families and superfamilies. These analyses can give a good view of how the structures move and how the flexibility is conserved over the different structures.</p>
Tue, 30 Dec 2014 00:00:00 GMThttp://hdl.handle.net/1956/130582014-12-30T00:00:00ZTools and data services registry: a community effort to document bioinformatics resources
http://hdl.handle.net/1956/12891
Tools and data services registry: a community effort to document bioinformatics resources
Ison, Jon; Rapacki, Kristoffer; Ménager, Hervé; Kalaš, Matúš; et al., 69 authors
Journal article
Life sciences are yielding huge data sets that underpin scientific discoveries fundamental to improvement in human health, agriculture and the environment. In support of these discoveries, a plethora of databases and tools are deployed, in technically complex and diverse implementations, across a spectrum of scientific disciplines. The corpus of documentation of these resources is fragmented across the Web, with much redundancy, and has lacked a common standard of information. The outcome is
that scientists must often struggle to find, understand, compare and use the best resources for the
task at hand.
Here we present a community-driven curation effort, supported by ELIXIR––the European infrastructure for biological information––that aspires to a comprehensive and consistent registry of information about bioinformatics resources. The sustainable upkeep of this Tools and Data Services Registry is assured by a curation effort driven by and tailored to local needs, and shared amongst a network of engaged partners.
As of November 2015, the registry includes 1785 resources, with depositions from 126 individual registrations including 52 institutional providers and 74 individuals. With community support, the registry can become a standard for dissemination of information about bioinformatics resources: we welcome everyone to join us in this common endeavour. The registry is freely available at https://bio.tools.
Tue, 03 Nov 2015 00:00:00 GMThttp://hdl.handle.net/1956/128912015-11-03T00:00:00ZUsing registries to integrate bioinformatics tools and services into workbench environments
http://hdl.handle.net/1956/12836
Using registries to integrate bioinformatics tools and services into workbench environments
Ménager, Hervé; Kalaš, Matúš; Rapacki, Kristoffer; Ison, Jon
Journal article
The diversity and complexity of bioinformatics resources presents significant challenges to their localisation, deployment and use, creating a need for reliable systems that address these issues. Meanwhile, users demand increasingly usable and integrated ways to access and analyse data, especially within convenient, integrated “workbench” environments. Resource descriptions are the core element of registry and workbench systems, which are used to both help the user find and comprehend available software tools, data resources, and Web Services, and to localise, execute and combine them. The descriptions are, however, hard and expensive to create and maintain, because they are volatile and require an exhaustive knowledge of the described resource, its applicability to biological research, and the data model and syntax used to describe it. We present here the Workbench Integration Enabler, a software component that will ease the integration of bioinformatics resources in a workbench environment, using their description provided by the existing ELIXIR Tools and Data Services Registry.
Thu, 01 Jan 2015 00:00:00 GMThttp://hdl.handle.net/1956/128362015-01-01T00:00:00ZMultiPath TCP-communication (in NorNet Core)
http://hdl.handle.net/1956/12668
MultiPath TCP-communication (in NorNet Core)
Ingebretsen, Kristian Bøckmann; Selvik, Daniel
Master thesis
Technology is constantly evolving, and we are currently witnessing a digital revolution
with a tremendous growth of interconnected devices. The scale of the Internet and the
amount of transported data is constantly increasing. The need for reliability is also increasing,
as users are constantly on the move, and are more dependent on cloud services.
Utilizing the connectivity redundancy of devices is an interesting approach to deal with
this increasing resource demand. In addition to ISPs and data centers implementing redundancy,
lately we have also seen that end-user devices also do the same. The term
multihoming describes the practice of being connected to multiple networks simultaneously.
However, the most used transport protocols today, TCP and UDP, are not able to
effectively take advantage of multihoming, as they are single-path transport protocols.
In this thesis we will explore the newly proposed MultiPath TCP protocol, an
extension of TCP, which enables the concurrent use of multiple network interfaces. This
makes it possible for one data stream to be transferred over multiple paths on the Internet.
In order to evaluate the performance of this protocol, we will test it according to its design
goals set by the Internet Engineering Task Force, in the NorNet Core testbed.
Tue, 31 May 2016 00:00:00 GMThttp://hdl.handle.net/1956/126682016-05-31T00:00:00ZAlgorithms for Linearly Ordered Boolean Formulas
http://hdl.handle.net/1956/12667
Algorithms for Linearly Ordered Boolean Formulas
Egeland, Christian
Master thesis
This thesis considers a class of propositional boolean formulas on which various problems related to satisfiability are efficiently solvable by a dynamic programming algorithm. It mainly consists of two larger parts: the first part describes the class of boolean formulas we are interested in and how to find them, and the second part investigates whether this class of formulas have any practical implications.
Thu, 30 Jun 2016 00:00:00 GMThttp://hdl.handle.net/1956/126672016-06-30T00:00:00ZIncompleteness of the Inference System BNeg
http://hdl.handle.net/1956/12644
Incompleteness of the Inference System BNeg
Golid, Kjetil Midtgarden
Master thesis
Any propositional discourse can be represented as a propositional
theory in a specific form in such a way that the theory is inconsistent
if and only if the discourse is paradoxical. Propositional theories in
this form can be represented as directed graphs such that the
models of the theory correspond to the kernels of the digraph. This
thesis looks at Neg; a sound, refutationally complete, non-explosive
resolution system over such propositional theories. We investigate
the relation between various graph structures and clauses provable
by the resolution system from the corresponding theory. The main
results is a counter-example to a conjecture that a restricted version
BNeg of the system is refutationally complete.
Tue, 31 May 2016 00:00:00 GMThttp://hdl.handle.net/1956/126442016-05-31T00:00:00ZSelGenes: a tool for selecting marker genes in heterogeneous samples
http://hdl.handle.net/1956/12640
SelGenes: a tool for selecting marker genes in heterogeneous samples
Samdal, Kristian Brakstad
Master thesis
SelGenes is a tool for selecting marker genes for the dominating
cell type in heterogeneous
samples. Based on a framework from an existing algorithm, SelGenes
selects cell-type specific
marker genes for the dominating cell-type and uses these marker
genes to estimate cell-type
proportions in the sample and cell-type specific expression
proles. We test the performance of
SelGenes on a benchmark set and validate the marker genes against
an external database and
further apply SelGenes to a real data set containing gene
expression data from cancer samples.
Compared to an existing method the results from the test were
consistently better for SelGenes.
Tue, 31 May 2016 00:00:00 GMThttp://hdl.handle.net/1956/126402016-05-31T00:00:00ZUsing Smart Cards to Mitigate Security Threats on Mobile Devices
http://hdl.handle.net/1956/12614
Using Smart Cards to Mitigate Security Threats on Mobile Devices
Sivertsgård, Henrik Mathisen
Master thesis
This master thesis developed and tested the idea that smart cards are able to help mitigate security
threats on mobile devices that are handling sensitive data. Investigating the limitations of smart
cards is a fundamental part of the idea and we performed in-depth testing and analysis of what smart
cards are capable of. Our study shows that smart cards are limited by their low processing power, low
amount of memory and a severely lacking application programming language. These limitations affect
smart cards usefulness concerning cryptography and data processing. Additionally, lack of standard
support for smart cards in modern mobile operating systems is a limitation we investigate and propose
solutions for.
Despite these limitations, smart cards can still be a useful asset as they offer a secure execution
environment and are tamper resistant. Viable use cases include secure key generation, management and
storage, digital signing, encryption, strong authentication and the possibility to run small
specialized applets securely. More complex use cases are also possible, but require additional
external components and infrastructure to be realized. For instance, a smart card could be used as a
simple policy enforcement point, given that we had a trusted third party available, and a functioning
public key infrastructure in place.
We were able to construct an Android library and a smart card applet for secure communication, but
there still remains research on the topic. Not all functionality were implemented due to time
constraints and technical issues, but the framework foundations are in place so that extensions can be
quickly and easily implemented. Future work may include full scale testing of our framework,
additional development and testing on more technological advanced smart cards.
Tue, 31 May 2016 00:00:00 GMThttp://hdl.handle.net/1956/126142016-05-31T00:00:00ZStatistical and Algebraic Properties of DES
http://hdl.handle.net/1956/12526
Statistical and Algebraic Properties of DES
Fauskanger, Stian; Semaev, Igor A.
Lin, Dongdai; Wang, XiaoFeng; Yung, Moti
Chapter
D. Davies and S. Murphy found that there are at most 660 different probability distributions on the output from any three adjacent S-boxes after 16 rounds of DES. In this paper it is shown that there are only 72 different distributions for S-boxes 4, 5 and 6. The distributions from S-box triplets are linearly dependent and the dependencies are described. E.g. there are only 13 linearly independent distributions for S-boxes 4, 5 and 6. A coset representation of DES S-boxes which reveals their hidden linearity is studied. That may be used in algebraic attacks. S-box 4 can be represented by significantly fewer cosets than the other S-boxes and therefore has more linearity. Open cryptanalytic problems are stated.
Fri, 01 Jan 2016 00:00:00 GMThttp://hdl.handle.net/1956/125262016-01-01T00:00:00ZTowards Correct Modelling and Model Transformation in DPF
http://hdl.handle.net/1956/12352
Towards Correct Modelling and Model Transformation in DPF
Wang, Xiaoliang
Doctoral thesis
<p>Model-driven engineering (MDE) is a model-centric software development methodology. It promotes models as first-class entities in software de- velopment. Models are used to represent software along software devel- opment process and to finally generate software automatically by model transformations. Thus, the quality of software is highly dependent on models and model transformations. This thesis devotes to construct cor- rect models and model transformations in Diagram Predicate Framework (DPF), which provides a formal diagrammatic approach to (meta)modelling and model transformation based on category theory. </p>
<p>The thesis presents the DPF Model Editor, a graphical (meta)modelling editor for DPF which supports diagrammatic (meta)modelling. In addi- tion, we propose bounded verification approaches of models and model transformations respectively by using Alloy. Alloy consists of a model- ling language and the Alloy Analyzer to examine Alloy specifications. The verification approaches proposed in the thesis translate models and model transformations into Alloy specifications, which are passed to the Alloy Analyzer to verify whether the models and model transformations satisfy some desired properties. </p>
<p>Because of the inherent limitation of Alloy, the verification approaches also encounter a scalability problem: it may take quite long time or be- come intractable to verify larger models or complex model transforma- tions. To tackle the problem, the thesis also presents several techniques to optimize the approaches. The first technique splits models into submod- els and reduces the verification of the models into the verification of some submodels. The other two techniques are proposed for the verification of model transformations: one technique uses a modelling approach to sim- plify model transformations; while the other one optimizes the translation of model transformation rules. Experim at these tech-niques alleviate the scalability problem. </p>
Tue, 14 Jun 2016 00:00:00 GMThttp://hdl.handle.net/1956/123522016-06-14T00:00:00ZModel Checking Healthcare Workflows Using Alloy
http://hdl.handle.net/1956/12351
Model Checking Healthcare Workflows Using Alloy
Wang, Xiaoliang; Rutle, Adrian
Conference object
Workflows are used to organize business processes, and workflow management tools are used to guide users in which order these processes should be performed. These tools increase organizational efficiency and enable users to focus on the tasks and activities rather than complex processes. Workflow models represent real life workflows and consist mainly of a graph-based structure where nodes represent tasks and arrows represent the flows between these tasks. From workflow models, one can use model transformations to generate workflow software. The correctness of the software is dependent on the correctness of the models, hence verification of the models against certain properties like termination, liveness and absence of deadlock are crucial in safety critical domains like healthcare. In previous works we presented a formal diagrammatic framework for workflow modelling and verification which uses principles from model-driven engineering. The framework uses a metamodelling approach for the specification of workflow models, and a transformation module which creates DiVinE code used for verification of model properties. In this paper, in order to improve the scalability and efficiency of the verification, we introduce a new encoding of the workflow models using the Alloy specification language, and we present a bounded verification approach for workflow models based on relational logic. We automatically translate the workflow metamodel into a model transformation specification in Alloy. Properties of the workflow can then be verified against the specification; especially, we can verify properties about loops. We use a running example to explain the metamodelling approach and the encoding to Alloy.
Wed, 01 Jan 2014 00:00:00 GMThttp://hdl.handle.net/1956/123512014-01-01T00:00:00ZHigh-performance design patterns for modern Fortran
http://hdl.handle.net/1956/12340
High-performance design patterns for modern Fortran
Haveraaen, Magne; Morris, Karla; Rouson, Damian; Radhakrishnan, Hari; Carson, Clayton
Journal article
This paper presents ideas for using coordinate-free numerics in modern Fortran to achieve code flexibility in the partial differential
equation (PDE) domain. We also show how Fortran, over the last few decades, has changed to become a language well-suited for
state-of-the-art software development. Fortran’s new coarray distributed data structure, the language’s class mechanism, and its
side-effect-free, pure procedure capability provide the scaffolding on which we implement HPC software. These features empower
compilers to organize parallel computations with efficient communication. We present some programming patterns that support
asynchronous evaluation of expressions comprised of parallel operations on distributed data. We implemented these patterns using
coarrays and the message passing interface (MPI). We compared the codes’ complexity and performance. The MPI code is much
more complex and depends on external libraries. The MPI code on Cray hardware using the Cray compiler is 1.5–2 times faster than
the coarray code on the same hardware. The Intel compiler implements coarrays atop Intel’s MPI library with the result apparently
being 2–2.5 times slower than manually coded MPI despite exhibiting nearly linear scaling efficiency. As compilers mature and
further improvements to coarrays comes in Fortran 2015, we expect this performance gap to narrow.
Thu, 01 Jan 2015 00:00:00 GMThttp://hdl.handle.net/1956/123402015-01-01T00:00:00ZThe Ontology Lookup Service: bigger and better
http://hdl.handle.net/1956/12147
The Ontology Lookup Service: bigger and better
Côté, Richard G.; Reisinger, Florian; Martens, Lennart; Barsnes, Harald; Vizcaíno, Juan Antonio; Hermjakob, Henning
Journal article
The Ontology Lookup Service (OLS; http://www.ebi
.ac.uk/ols) has been providing several means to
query, browse and navigate biomedical ontologies
and controlled vocabularies since it first went into
production 4 years ago, and usage statistics
indicate that it has become a heavily accessed
service with millions of hits monthly. The volume
of data available for querying has increased 7-fold
since its inception. OLS functionality has been
integrated into several high-usage databases and
data entry tools. Improvements in the data model
and loaders, as well as interface enhancements
have made the OLS easier to use and capture
more annotations from the source data. In
addition, newly released software packages now
provide easy means to fully integrate OLS functionality
in external applications.
Tue, 11 May 2010 00:00:00 GMThttp://hdl.handle.net/1956/121472010-05-11T00:00:00ZEnumerating minimal connected dominating sets in graphs of bounded chordality
http://hdl.handle.net/1956/12088
Enumerating minimal connected dominating sets in graphs of bounded chordality
Golovach, Petr; Heggernes, Pinar; Kratsch, Dieter
Conference object
Listing, generating or enumerating objects of specified type is one of the principal tasks in algorithmics. In graph algorithms one often enumerates vertex subsets satisfying a certain property. We study the enumeration of all minimal connected dominating sets of an input graph from various graph classes of bounded chordality. We establish enumeration algorithms as well as lower and upper bounds for the maximum number of minimal connected dominating sets in such graphs. In particular, we present algorithms to enumerate all minimal connected dominating sets of chordal graphs in time O(1.7159^n), of split graphs in time O(1.3803^n), and of AT-free, strongly chordal, and distance-hereditary graphs in time O^*(3^{n/3}), where n is the number of vertices of the input graph. Our algorithms imply corresponding upper bounds for the number of minimal connected dominating sets for these graph classes.
Thu, 01 Jan 2015 00:00:00 GMThttp://hdl.handle.net/1956/120882015-01-01T00:00:00ZComputing cutwidth and pathwidth of semi-complete digraphs via degree orderings
http://hdl.handle.net/1956/12070
Computing cutwidth and pathwidth of semi-complete digraphs via degree orderings
Pilipczuk, Michal Pawel
Conference object
The notions of cutwidth and pathwidth of digraphs play a central role in the containment theory for tournaments, or more generally semi-complete digraphs, developed in a recent series of papers by Chudnovsky, Fradkin, Kim, Scott, and Seymour (Maria Chudnovsky, Alexandra Fradkin, and Paul Seymour, 2012; Maria Chudnovsky, Alex Scott, and Paul Seymour, 2011; Maria Chudnovsky and Paul D. Seymour, 2011; Alexandra Fradkin and Paul Seymour, 2010; Alexandra Fradkin and Paul Seymour, 2011; Ilhee Kim and Paul Seymour, 2012). In this work we introduce a new approach to computing these width measures on semi-complete digraphs, via degree orderings. Using the new technique we are able to reprove the main results of (Maria Chudnovsky, Alexandra Fradkin, and Paul Seymour, 2012; Alexandra Fradkin and Paul Seymour, 2011) in a unified and significantly simplified way, as well as obtain new results. First, we present polynomial-time approximation algorithms for both cutwidth and pathwidth, faster and simpler than the previously known ones; the most significant improvement is in case of pathwidth, where instead of previously known O(OPT)-approximation in fixed-parameter tractable time (Fedor V. Fomin and Michal Pilipczuk, 2013) we obtain a constant-factor approximation in polynomial time. Secondly, by exploiting the new set of obstacles for cutwidth and pathwidth, we show that topological containment and immersion in semi-complete digraphs can be tested in single-exponential fixed-parameter tractable time. Finally, we present how the new approach can be used to obtain exact fixed-parameter tractable algorithms for cutwidth and pathwidth, with single-exponential running time dependency on the optimal width.
Tue, 01 Jan 2013 00:00:00 GMThttp://hdl.handle.net/1956/120702013-01-01T00:00:00ZThe Proteomics Identifications database: 2010 update
http://hdl.handle.net/1956/12060
The Proteomics Identifications database: 2010 update
Vizcaíno, Juan Antonio; Côté, Richard G.; Reisinger, Florian; Barsnes, Harald; Foster, JM; Rameseder, Jonathan; Hermjakob, Henning; Martens, Lennart
Journal article
The Proteomics Identifications database (PRIDE, <a href="http://www.ebi.ac.uk/pride"target="blank">http://www.ebi.ac.uk/pride</a>) at the European Bioinformatics Institute has become one of the main repositories of mass spectrometry-derived proteomics data. For the last 2 years, PRIDE data holdings have grown substantially, comprising 60 different species, more than 2.5 million protein identifications, 11.5 million peptides and over 50 million spectra by September 2009. We here describe several new and improved features in PRIDE, including the revised submission process, which now includes direct submission of fragment ion annotations. Correspondingly, it is now possible to visualize spectrum fragmentation annotations on tandem mass spectra, a key feature for compliance with journal data submission requirements. We also describe recent developments in the PRIDE BioMart interface, which now allows integrative queries that can join PRIDE data to a growing number of biological resources such as Reactome, Ensembl, InterPro and UniProt. This ability to perform extremely powerful across-domain queries will certainly be a cornerstone of future bioinformatics analyses. Finally, we highlight the importance of data sharing in the proteomics field, and the corresponding integration of PRIDE with other databases in the ProteomExchange consortium.
Wed, 11 Nov 2009 00:00:00 GMThttp://hdl.handle.net/1956/120602009-11-11T00:00:00ZChromatin and epigenetic features of long-range gene regulation
http://hdl.handle.net/1956/12057
Chromatin and epigenetic features of long-range gene regulation
Harmston, Nathan; Lenhard, Boris
Journal article
The precise regulation of gene transcription during metazoan development is controlled by a complex system of interactions between transcription factors, histone modifications and modifying enzymes and chromatin conformation. Developments in chromosome conformation capture technologies have revealed that interactions between regions of chromatin are pervasive and highly cell-type specific. The movement of enhancers and promoters in and out of higher-order chromatin structures within the nucleus are associated with changes in expression and histone modifications. However, the factors responsible for mediating these changes and determining enhancer:promoter specificity are still not completely known. In this review, we summarize what is known about the patterns of epigenetic and chromatin features characteristic of elements involved in long-range interactions. In addition, we review the insights into both local and global patterns of chromatin interactions that have been revealed by the latest experimental and computational methods.
Thu, 13 Jun 2013 00:00:00 GMThttp://hdl.handle.net/1956/120572013-06-13T00:00:00ZSubexponential-time parameterized algorithm for Steiner tree on planar graphs
http://hdl.handle.net/1956/12056
Subexponential-time parameterized algorithm for Steiner tree on planar graphs
Pilipczuk, Marcin; Pilipczuk, Michal Pawel; van Leeuwen, Erik Jan; Sankowski, Piotr
Conference object
The well-known bidimensionality theory provides a method for designing fast, subexponential-time parameterized algorithms for a vast number of NP-hard problems on sparse graph classes such as planar graphs, bounded genus graphs, or, more generally, graphs with a fixed excluded minor. However, in order to apply the bidimensionality framework the considered problem needs to fulfill a special density property. Some well-known problems do not have this property, unfortunately, with probably the most prominent and important example being the Steiner Tree problem. Hence the question whether a subexponential-time parameterized algorithm for Steiner Tree on planar graphs exists has remained open. In this paper, we answer this question positively and develop an algorithm running in O(2^{O((k log k)^{2/3})}n) time and polynomial space, where k is the size of the Steiner tree and n is the number of vertices of the graph. Our algorithm does not rely on tools from bidimensionality theory or graph minors theory, apart from Baker's classical approach. Instead, we introduce new tools and concepts to the study of the parameterized complexity of problems on sparse graphs.
Tue, 01 Jan 2013 00:00:00 GMThttp://hdl.handle.net/1956/120562013-01-01T00:00:00ZThe EMBRACE web service collection
http://hdl.handle.net/1956/12054
The EMBRACE web service collection
Pettifer, Steve; Ison, Jon; Kalaš, Matúš; Thorne, David; McDermott, Philip; Jonassen, Inge; Ali, Liaquat; Fernandez, Jose M; Rodriguez, Jose M; Pisano, David G; Blanchet, Christophe; Uludag, Mahmut; Rice, Peter; Bartaseviciute, Edita; Rapacki, Kristoffer; Hekkelman, Maarten; Sand, Olivier; Stockinger, Heinz; Clegg, Andrew B; Bongcam-Rudloff, Eric; Salzemann, Jean; Breton, Vincent; Attwood, Teresa K; Cameron, Graham; Vriend, Gert
Journal article
The EMBRACE (European Model for Bioinformatics Research and Community Education) web service collection is the culmination of a 5-year project that set out to investigate issues involved in developing and deploying web services for use in the life sciences. The project concluded that in order for web services to achieve widespread adoption, standards must be defined for the choice of web service technology, for semantically annotating both service function and the data exchanged, and a mechanism for discovering services must be provided. Building on this, the project developed: EDAM, an ontology for describing life science web services; BioXSD, a schema for exchanging data between services; and a centralized registry (<a href="http://www.embraceregistry.net"target="blank">http://www.embraceregistry.net</a>) that collects together around 1000 services developed by the consortium partners. This article presents the current status of the collection and its associated recommendations and standards definitions.
Mon, 10 May 2010 00:00:00 GMThttp://hdl.handle.net/1956/120542010-05-10T00:00:00ZThe Genomic HyperBrowser: an analysis web server for genome-scale data
http://hdl.handle.net/1956/12053
The Genomic HyperBrowser: an analysis web server for genome-scale data
Sandve, Geir Kjetil; Gundersen, Sveinung; Johansen, Morten; Glad, Ingrid Kristine; Gunathasan, Krishanthi; Holden, Lars; Holden, Marit; Liestøl, Knut; Nygård, Ståle; Nygaard, Vegard; Paulsen, Jonas; Rydbeck, Halfdan; Trengereid, Kai; Clancy, Trevor; Drabløs, Finn; Ferkingstad, Egil; Kalaš, Matúš; Lien, Tonje Gulbrandsen; Rye, Morten Beck; Frigessi, Arnoldo; Hovig, Johannes Eivind
Journal article
The immense increase in availability of genomic scale datasets, such as those provided by the ENCODE and Roadmap Epigenomics projects, presents unprecedented opportunities for individual researchers to pose novel falsifiable biological questions. With this opportunity, however, researchers are faced with the challenge of how to best analyze and interpret their genome-scale datasets. A powerful way of representing genome-scale data is as feature-specific coordinates relative to reference genome assemblies, i.e. as genomic tracks. The Genomic HyperBrowser (<a href="http://hyperbrowser.uio.no"target="blank">http://hyperbrowser.uio.no</a>) is an open-ended web server for the analysis of genomic track data. Through the provision of several highly customizable components for processing and statistical analysis of genomic tracks, the HyperBrowser opens for a range of genomic investigations, related to, e.g., gene regulation, disease association or epigenetic modifications of the genome.
Tue, 30 Apr 2013 00:00:00 GMThttp://hdl.handle.net/1956/120532013-04-30T00:00:00ZVariants of plane diameter completion
http://hdl.handle.net/1956/12052
Variants of plane diameter completion
Golovach, Petr; Requile, Clement; Thilikos, Dimitrios M.
Conference object
The Plane Diameter Completion problem asks, given a plane graph G and a positive integer d, if it is a spanning subgraph of a plane graph H that has diameter at most d. We examine two variants of this problem where the input comes with another parameter k. In the first variant, called BPDC, k upper bounds the total number of edges to be added and in the second, called BFPDC, k upper bounds the number of additional edges per face. We prove that both problems are NP-complete, the first even for 3-connected graphs of face-degree at most 4 and the second even when k=1 on 3-connected graphs of face-degree at most 5. In this paper we give parameterized algorithms for both problems that run in O(n^{3})+2^{2^{O((kd)^2\log d)}} * n steps.
Thu, 01 Jan 2015 00:00:00 GMThttp://hdl.handle.net/1956/120522015-01-01T00:00:00ZTight bounds for parameterized complexity of Cluster Editing
http://hdl.handle.net/1956/12041
Tight bounds for parameterized complexity of Cluster Editing
Fomin, Fedor; Kratsch, Stefan; Pilipczuk, Marcin; Pilipczuk, Michal Pawel; Villanger, Yngve
Conference object
In the Correlation Clustering problem, also known as Cluster Editing, we are given an undirected graph G and a positive integer k; the task is to decide whether G can be transformed into a cluster graph, i.e., a disjoint union of cliques, by changing at most k adjacencies, that is, by adding or deleting at most k edges. The motivation of the problem stems from various tasks in computational biology (Ben-Dor et al., Journal of Computational Biology 1999) and machine learning (Bansal et al., Machine Learning 2004). Although in general Correlation Clustering is APX-hard (Charikar et al., FOCS 2003), the version of the problem where the number of cliques may not exceed a prescribed constant p admits a PTAS (Giotis and Guruswami, SODA 2006). We study the parameterized complexity of Correlation Clustering with this restriction on the number of cliques to be created. We give an algorithm that - in time O(2^{O(sqrt{pk})} + n+m) decides whether a graph G on n vertices and m edges can be transformed into a cluster graph with exactly p cliques by changing at most k adjacencies. We complement these algorithmic findings by the following, surprisingly tight lower bound on the asymptotic behavior of our algorithm. We show that unless the Exponential Time Hypothesis (ETH) fails - for any constant 0 <= sigma <= 1, there is p = Theta(k^sigma) such that there is no algorithm deciding in time 2^{o(sqrt{pk})} n^{O(1)} whether an n-vertex graph G can be transformed into a cluster graph with at most p cliques by changing at most k adjacencies. Thus, our upper and lower bounds provide an asymptotically tight analysis of the multivariate parameterized complexity of the problem for the whole range of values of p from constant to a linear function of k.
Tue, 01 Jan 2013 00:00:00 GMThttp://hdl.handle.net/1956/120412013-01-01T00:00:00ZMinimum Fill-in of Sparse Graphs: Kernelization and Approximation
http://hdl.handle.net/1956/12038
Minimum Fill-in of Sparse Graphs: Kernelization and Approximation
Fomin, Fedor; Geevarghese, Philip; Villanger, Yngve
Conference object
The Minimum Fill-in problem is to decide if a graph can be triangulated by adding at most k edges. The problem has important applications in numerical algebra, in particular in sparse matrix computations. We develop kernelization algorithms for the problem on several classes of sparse graphs. We obtain linear kernels on planar graphs, and kernels of size O(k^{3/2}) in graphs excluding some fixed graph as a minor and in graphs of bounded degeneracy. As a byproduct of our results, we obtain approximation algorithms with approximation ratios O(log{k}) on planar graphs and O(sqrt{k} log{k}) on H-minor-free graphs. These results significantly improve the previously known kernelization and approximation results for Minimum Fill-in on sparse graphs.
Sat, 01 Jan 2011 00:00:00 GMThttp://hdl.handle.net/1956/120382011-01-01T00:00:00ZFinding Induced Subgraphs via Minimal Triangulations
http://hdl.handle.net/1956/12037
Finding Induced Subgraphs via Minimal Triangulations
Fomin, Fedor; Villanger, Yngve
Conference object
Potential maximal cliques and minimal separators are combinatorial objects
which were introduced and studied in the realm of minimal triangulation problems in- cluding
Minimum Fill-in and Treewidth. We discover unexpected applications of these notions to the field of
moderate exponential algorithms. In particular, we show that given an n-vertex graph G together
with its set of potential maximal cliques, and an integer t, it is possible in time the number of
potential maximal cliques times O(nO(t)) to find a
maximum induced subgraph of treewidth t in G and for a given graph F of treewidth t, to
decide if G contains an induced subgraph isomorphic to F. Combined with an improved algorithm
enumerating all potential maximal cliques in time O(1.734601n ), this yields that both the problems
are solvable in time 1.734601n * nO(t) .
Fri, 01 Jan 2010 00:00:00 GMThttp://hdl.handle.net/1956/120372010-01-01T00:00:00ZApproximating Acyclicity Parameters of Sparse Hypergraphs
http://hdl.handle.net/1956/12036
Approximating Acyclicity Parameters of Sparse Hypergraphs
Fomin, Fedor; Golovach, Petr; Thilikos, Dimitrios
Conference object
The notions of hypertree width and generalized hypertree width were introduced by Gottlob, Leone, and Scarcello (PODS'99, PODS'01) in order to extend the concept of hypergraph acyclicity. These notions were further generalized by Grohe and Marx in SODA'06, who introduced the fractional hypertree width of a hypergraph. All these width parameters on hypergraphs are useful for extending tractability of many problems in database theory and artificial intelligence. Computing each of these width parameters is known to be an NP-hard problem. Moreover, the (generalized) hypertree width of an n-vertex hypergraph cannot be approximated within a logarithmic factor unless P=NP. In this paper, we study the approximability of (generalized, fractional) hyper treewidth of sparse hypergraphs where the criterion of sparsity reflects the sparsity of their incidence graphs. Our first step is to prove that the (generalized, fractional) hypertree width of a hypergraph is constant-factor sandwiched by the treewidth of its incidence graph, when the incidence graph belongs to some apex-minor-free graph class (the family of apex-minor-free graph classes includes planar graphs and graphs of bounded genus). This determines the combinatorial borderline above which the notion of (generalized, fractional) hypertree width becomes essentially more general than treewidth, justifying that way its functionality as a hypergraph acyclicity measure. While for more general sparse families of hypergraphs treewidth of incidence graphs and all hypertree width parameters may differ arbitrarily, there are sparse families where a constant factor approximation algorithm is possible. In particular, we give a constant factor approximation polynomial time algorithm for (generalized, fractional) hypertree width on hypergraphs whose incidence graphs belong to some H-minor-free graph class. This extends the results of Feige, Hajiaghayi, and Lee from STOC'05 on approximating treewidth of H-minor-free graphs.
Thu, 01 Jan 2009 00:00:00 GMThttp://hdl.handle.net/1956/120362009-01-01T00:00:00ZKernel(s) for Problems with No Kernel: On Out-Trees with Many Leaves
http://hdl.handle.net/1956/12035
Kernel(s) for Problems with No Kernel: On Out-Trees with Many Leaves
Fernau, Henning; Fomin, Fedor; Lokshtanov, Daniel; Raible, Daniel; Saurabh, Saket; Villanger, Yngve
Conference object
The {\sc $k$-Leaf Out-Branching} problem is to find an out-branching, that is a rooted oriented spanning tree, with at least $k$ leaves in a given digraph. The problem has recently received much attention from the viewpoint of parameterized algorithms. Here, we take a kernelization based approach to the {\sc $k$-Leaf-Out-Branching} problem. We give the first polynomial kernel for {\sc Rooted $k$-Leaf-Out-Branching}, a variant of {\sc $k$-Leaf-Out-Branching} where the root of the tree searched for is also a part of the input. Our kernel has cubic size and is obtained using extremal combinatorics. For the {\sc $k$-Leaf-Out-Branching} problem, we show that no polynomial kernel is possible unless the polynomial hierarchy collapses to third level by applying a recent breakthrough result by Bodlaender et al. (ICALP 2008) in a non-trivial fashion. However, our positive results for {\sc Rooted $k$-Leaf-Out-Branching} immediately imply that the seemingly intractable {\sc $k$-Leaf-Out-Branching} problem admits a data reduction to $n$ independent $O(k^3)$ kernels. These two results, tractability and intractability side by side, are the first ones separating {\it many-to-one kernelization} from {\it Turing kernelization}. This answers affirmatively an open problem regarding ``cheat kernelization'' raised by Mike Fellows and Jiong Guo independently.
Thu, 01 Jan 2009 00:00:00 GMThttp://hdl.handle.net/1956/120352009-01-01T00:00:00ZSubexponential Algorithms for Partial Cover Problems
http://hdl.handle.net/1956/12033
Subexponential Algorithms for Partial Cover Problems
Fomin, Fedor; Lokshtanov, Daniel; Raman, Venkatesh; Saurabh, Saket
Conference object
Partial Cover problems are optimization versions of fundamental and well studied problems like {\sc Vertex Cover} and {\sc Dominating Set}. Here one is interested in covering (or dominating) the maximum number of edges (or vertices) using a given number ($k$) of vertices, rather than covering all edges (or vertices). In general graphs, these problems are hard for parameterized complexity classes when parameterized by $k$. It was recently shown by Amini et. al. [{\em FSTTCS 08}\,] that {\sc Partial Vertex Cover} and {\sc Partial Dominating Set} are fixed parameter tractable on large classes of sparse graphs, namely $H$-minor free graphs, which include planar graphs and graphs of bounded genus. In particular, it was shown that on planar graphs both problems can be solved in time $2^{\cO(k)}n^{\cO(1)}$.
Thu, 01 Jan 2009 00:00:00 GMThttp://hdl.handle.net/1956/120332009-01-01T00:00:00ZFinding even subgraphs even faster
http://hdl.handle.net/1956/12030
Finding even subgraphs even faster
Goyal, Prachi; Misra, Pranabendu; Panolan, Fahad; Philip, Geevarghese; Saurabh, Saket
Conference object
Problems of the following kind have been the focus of much recent research in the realm of parameterized complexity: Given an input graph (digraph) on n vertices and a positive integer parameter k, find if there exist k edges(arcs) whose deletion results in a graph that satisfies some specified parity constraints. In particular, when the objective is to obtain a connected graph in which all the vertices have even degrees--where the resulting graph is Eulerian,the problem is called Undirected Eulerian Edge Deletion. The corresponding problem in digraphs where the resulting graph should be strongly connected and every vertex should have the same in-degree as its out-degree is called Directed Eulerian Edge Deletion. Cygan et al.[Algorithmica, 2014] showed that these problems are fixed parameter tractable (FPT), and gave algorithms with the running time 2^O(k log k)n^O(1). They also asked, as an open problem, whether there exist FPT algorithms which solve these problems in time 2^O(k)n^O(1). It was also posed as an open problem at the School on Parameterized Algorithms and Complexity 2014, Bedlewo, Poland. In this paper we answer their question in the affirmative: using the technique of computing representative families of co-graphic matroids we design algorithms which solve these problems in time 2^O(k)n^O(1). The crucial insight we bring to these problems is to view the solution as an independent set of a co-graphic matroid. We believe that this view-point/approach will be useful in other problems where one of the constraints that need to be satisfied is that of connectivity.
Thu, 01 Jan 2015 00:00:00 GMThttp://hdl.handle.net/1956/120302015-01-01T00:00:00ZMaximum matching width: New characterizations and a fast algorithm for dominating set
http://hdl.handle.net/1956/12029
Maximum matching width: New characterizations and a fast algorithm for dominating set
Jeong, Jisu; Sæther, Sigve Hortemo; Telle, Jan Arne
Conference object
We give alternative definitions for maximum matching width, e.g., a graph G has mmw(G) <= k if and only if it is a subgraph of a chordal graph H and for every maximal clique X of H there exists A,B,C \subseteq X with A \cup B \cup C=X and |A|,|B|,|C| <= k such that any subset of X that is a minimal separator of H is a subset of either A, B or C. Treewidth and branchwidth have alternative definitions through intersections of subtrees, where treewidth focuses on nodes and branchwidth focuses on edges. We show that mm-width combines both aspects, focusing on nodes and on edges. Based on this we prove that given a graph G and a branch decomposition of mm-width k we can solve Dominating Set in time O^*(8^k), thereby beating O^*(3^{tw(G)}) whenever tw(G) > log_3(8) * k ~ 1.893 k. Note that mmw(G) <= tw(G)+1 <= 3 mmw(G) and these inequalities are tight. Given only the graph G and using the best known algorithms to find decompositions, maximum matching width will be better for solving Dominating Set whenever tw(G) > 1.549 * mmw(G).
Thu, 01 Jan 2015 00:00:00 GMThttp://hdl.handle.net/1956/120292015-01-01T00:00:00ZFast biclustering by dual parameterization
http://hdl.handle.net/1956/12028
Fast biclustering by dual parameterization
Drange, Pål Grønås; Reidl, Felix; Villaamil, Fernando Sánchez; Sikdar, Somnath
Conference object
We study two clustering problems, Starforest Editing, the problem of adding and deleting edges to obtain a disjoint union of stars, and the generalization Bicluster Editing. We show that, in addition to being NP-hard, none of the problems can be solved in subexponential time unless the exponential time hypothesis fails. Misra, Panolan, and Saurabh (MFCS 2013) argue that introducing a bound on the number of connected components in the solution should not make the problem easier: In particular, they argue that the subexponential time algorithm for editing to a fixed number of clusters (p-Cluster Editing) by Fomin et al. (J. Comput. Syst. Sci., 80(7) 2014) is an exception rather than the rule. Here, p is a secondary parameter, bounding the number of components in the solution. However, upon bounding the number of stars or bicliques in the solution, we obtain algorithms which run in time O(2^{3*sqrt(pk)} + n + m) for p-Starforest Editing and O(2^{O(p * sqrt(k) * log(pk))} + n + m) for p-Bicluster Editing. We obtain a similar result for the more general case of t-Partite p-Cluster Editing. This is subexponential in k for a fixed number of clusters, since p is then considered a constant. Our results even out the number of multivariate subexponential time algorithms and give reasons to believe that this area warrants further study.
Thu, 01 Jan 2015 00:00:00 GMThttp://hdl.handle.net/1956/120282015-01-01T00:00:00ZParameterized complexity of secluded connectivity problems
http://hdl.handle.net/1956/12027
Parameterized complexity of secluded connectivity problems
Fomin, Fedor; Golovach, Petr; Karpov, Nikolay; Kulikov, Alexander S
Conference object
The Secluded Path problem introduced by Chechik et al. in [ESA 2013] models a situation where a sensitive information has to be transmitted between a pair of nodes along a path in a network. The measure of the quality of a selected path is its exposure, which is the total weight of vertices in its closed neighborhood. In order to minimize the risk of intercepting the information, we are interested in selecting a secluded path, i.e. a path with a small exposure. Similarly, the Secluded Steiner Tree problem is to find a tree in a graph connecting a given set of terminals such that the exposure of the tree is minimized. In this work, we obtain the following results about parameterized complexity of secluded connectivity problems. We start from an observation that being parameterized by the size of the exposure, the problem is fixed-parameter tractable (FPT). More precisely, we give an algorithm deciding if a graph G with a given cost function w:V(G)->N contains a secluded path of exposure at most k with the cost at most C in time O(3^{k/3}(n+m) log W), where W is the maximum value of w on an input graph G. Similarly, Secluded Steiner Tree is solvable in time O(2^{k}k^2 (n+m) log W). The main result of this paper is about "above guarantee" parameterizations for secluded problems. We show that Secluded Steiner Tree is FPT being parameterized by r+p, where p is the number of the terminals, l the size of an optimum Steiner tree, and r=k-l. We complement this result by showing that the problem is co-W[1]-hard when parameterized by r only. We also investigate Secluded Steiner Tree from kernelization perspective and provide several lower and upper bounds when parameters are the treewidth, the size of a vertex cover, maximum vertex degree and the solution size. Finally, we refine the algorithmic result of Chechik et al. by improving the exponential dependence from the treewidth of the input graph.
Thu, 01 Jan 2015 00:00:00 GMThttp://hdl.handle.net/1956/120272015-01-01T00:00:00ZB-chromatic number: Beyond NP-hardness
http://hdl.handle.net/1956/12026
B-chromatic number: Beyond NP-hardness
Panolan, Fahad; Philip, Geevarghese; Saurabh, Saket
Conference object
The b-chromatic number of a graph G, chi_b(G), is the largest integer k such that G has a k-vertex coloring with the property that each color class has a vertex which is adjacent to at least one vertex in each of the other color classes. In the B-Chromatic Number problem, the objective is to decide whether chi_b(G) >= k. Testing whether chi_b(G)=Delta(G)+1, where Delta(G) is the maximum degree of a graph, itself is NP-complete even for connected bipartite graphs (Kratochvil, Tuza and Voigt, WG 2002). In this paper we study B-Chromatic Number in the realm of parameterized complexity and exact exponential time algorithms. We show that B-Chromatic Number is W[1]-hard when parameterized by k, resolving the open question posed by Havet and Sampaio (Algorithmica 2013). When k=Delta(G)+1, we design an algorithm for B-Chromatic Number running in time 2^{O(k^2 * log(k))}*n^{O(1)}. Finally, we show that B-Chromatic Number for an n-vertex graph can be solved in time O(3^n * n^{4} * log(n)).
Thu, 01 Jan 2015 00:00:00 GMThttp://hdl.handle.net/1956/120262015-01-01T00:00:00ZAdditive Schwarz preconditioner for the finite volume element discretization of symmetric elliptic problems
http://hdl.handle.net/1956/12025
Additive Schwarz preconditioner for the finite volume element discretization of symmetric elliptic problems
Marcinkowski, Leszek; Rahman, Talal; Loneland, Atle; Valdman, Jan
Journal article
A symmetric and a nonsymmetric variant of the additive Schwarz preconditioner are proposed for the solution of a class of finite volume element discretization of the symmetric elliptic problem in two dimensions, with large jumps in the entries of the coefficient matrices across subdomains. It is shown that the convergence of the preconditioned generalized minimal residual iteration using the proposed preconditioners depends polylogarithmically, in other words weakly, on the mesh parameters, and that they are robust with respect to the jumps in the coefficients.
Fri, 25 Sep 2015 00:00:00 GMThttp://hdl.handle.net/1956/120252015-09-25T00:00:00ZQuick but odd growth of cacti
http://hdl.handle.net/1956/12024
Quick but odd growth of cacti
Kolay, Sudeshna; Lokshtanov, Daniel; Panolan, Fahad; Saurabh, Saket
Conference object
Let F be a family of graphs. Given an input graph G and a positive integer k, testing whether G has a k-sized subset of vertices S, such that G\S belongs to F, is a prototype vertex deletion problem. These type of problems have attracted a lot of attention in recent times in the domain of parameterized complexity. In this paper, we study two such problems; when F is either a family of cactus graphs or a family of odd-cactus graphs. A graph H is called a cactus graph if every pair of cycles in H intersect on at most one vertex. Furthermore, a cactus graph H is called an odd cactus, if every cycle of H is of odd length. Let us denote by C and C_{odd}, families of cactus and odd cactus, respectively. The vertex deletion problems corresponding to C and C_{odd} are called Diamond Hitting Set and Even Cycle Transversal, respectively. In this paper we design randomized algorithms with running time 12^{k}*n^{O(1)} for both these problems. Our algorithms considerably improve the running time for Diamond Hitting Set and Even Cycle Transversal, compared to what is known about them.
Thu, 01 Jan 2015 00:00:00 GMThttp://hdl.handle.net/1956/120242015-01-01T00:00:00ZOutput-Sensitive Filtering of Streaming Volume Data
http://hdl.handle.net/1956/12000
Output-Sensitive Filtering of Streaming Volume Data
Šoltészová, Veronika; Birkeland, Åsmund; Stoppel, Sergej; Viola, Ivan; Bruckner, Stefan
Journal article
Real-time volume data acquisition poses substantial challenges for the traditional visualization pipeline where data enhancement is typically seen as a pre-processing step. In the case of 4D ultrasound data, for instance, costly processing operations to reduce noise and to remove artefacts need to be executed for every frame. To enable the use of high-quality filtering operations in such scenarios, we propose an output-sensitive approach to the visualization of streaming volume data. Our method evaluates the potential contribution of all voxels to the final image, allowing us to skip expensive processing operations that have little or no effect on the visualization. As filtering operations modify the data values which may affect the visibility, our main contribution is a fast scheme to predict their maximum effect on the final image. Our approach prioritizes filtering of voxels with high contribution to the final visualization based on a maximal permissible error per pixel. With zero permissible error, the optimized filtering will yield a result that is identical to filtering of the entire volume. We provide a thorough technical evaluation of the approach and demonstrate it on several typical scenarios that require on-the-fly processing.
Mon, 01 Feb 2016 00:00:00 GMThttp://hdl.handle.net/1956/120002016-02-01T00:00:00ZA minimum requiring angle trisection
http://hdl.handle.net/1956/11996
A minimum requiring angle trisection
Steihaug, Trond; Rogers, D. G.
Journal article
Thu, 01 Jan 2009 00:00:00 GMThttp://hdl.handle.net/1956/119962009-01-01T00:00:00ZEditing to Eulerian Graphs
http://hdl.handle.net/1956/11951
Editing to Eulerian Graphs
Dabrowski, Konrad K.; Golovach, Petr; van' t Hof, Pim; Paulusma, Daniël
Conference object
We investigate the problem of modifying a graph into a connected graph in which the degree of each vertex satisfies a prescribed parity constraint. Let ea, ed and vd denote the operations edge addition, edge deletion and vertex deletion respectively. For any S subseteq {ea,ed,vd}, we define Connected Degree Parity Editing (S) (CDPE(S)) to be the problem that takes as input a graph G, an integer k and a function delta: V(G) -> {0,1}, and asks whether G can be modified into a connected graph H with d_H(v) = delta(v)(mod 2) for each v in V(H), using at most k operations from S. We prove that (*) if S={ea} or S={ea,ed}, then CDPE(S) can be solved in polynomial time; (*) if {vd} subseteq S subseteq {ea,ed,vd}, then CDPE(S) is NP-complete and W-hard when parameterized by k, even if delta = 0. Together with known results by Cai and Yang and by Cygan, Marx, Pilipczuk, Pilipczuk and Schlotter, our results completely classify the classical and parameterized complexity of the CDPE(S) problem for all S subseteq {ea,ed,vd}. We obtain the same classification for a natural variant of the cdpe(S) problem on directed graphs, where the target is a weakly connected digraph in which the difference between the in- and out-degree of every vertex equals a prescribed value. As an important implication of our results, we obtain polynomial-time algorithms for Eulerian Editing problem and its directed variant. To the best of our knowledge, the only other natural non-trivial graph class H for which the H-Editing problem is known to be polynomial-time solvable is the class of split graphs.
Wed, 01 Jan 2014 00:00:00 GMThttp://hdl.handle.net/1956/119512014-01-01T00:00:00ZConnecting Vertices by Independent Trees
http://hdl.handle.net/1956/11950
Connecting Vertices by Independent Trees
Basavaraju, Manu; Fomin, Fedor; Golovach, Petr; Saurabh, Saket
Conference object
We study the paramereteized complexity of the following connectivity problem. For a vertex subset U of a graph G, trees T1, . . . , Ts of G are completely independent spanning trees of U if each of them contains U , and for every two distinct vertices u, v ∈ U , the paths from u to v in T1, . . . , Ts are pairwise vertex disjoint except for end-vertices u and v. Then for a given s ≥ 2 and a parameter k, the task is to decide if a given n-vertex graph G contains a set U of size at least k such that there are s completely independent spanning trees of U . The problem is known to be NP-complete already for s = 2. We prove the following results: For s = 2 the problem is solvable in time 2O(k)nO(1). For s = 2 the problem does not admit a polynomial kernel unless NP ⊆ coNP /poly. For arbitrary s, we show that the problem is solvable in time f (s, k)nO(1) for some function f of s and k only.
Wed, 01 Jan 2014 00:00:00 GMThttp://hdl.handle.net/1956/119502014-01-01T00:00:00ZLargest chordal and interval subgraphs faster than 2n
http://hdl.handle.net/1956/11779
Largest chordal and interval subgraphs faster than 2n
Bliznets, Ivan; Fomin, Fedor; Pilipczuk, Michal Pawel; Villanger, Yngve
Journal article
We prove that in a graph with n vertices, induced chordal and interval subgraphs with the maximum number of vertices can be found in time O(2λn) for some λ< 1. These are the first algorithms breaking the trivial 2nnO(1) bound of the brute-force search for these problems.
Sat, 22 Aug 2015 00:00:00 GMThttp://hdl.handle.net/1956/117792015-08-22T00:00:00ZNon-Constructivity in Kan Simplicial Sets
http://hdl.handle.net/1956/11729
Non-Constructivity in Kan Simplicial Sets
Bezem, Marc; Coquand, Thierry; Parmann, Erik
Conference object
We give an analysis of the non-constructivity of the following basic result: if X and Y are
simplicial sets and Y has the Kan extension property, then Y X also has the Kan extension
property. By means of Kripke countermodels we show that even simple consequences of this
basic result, such as edge reversal and edge composition, are not constructively provable. We
also show that our unprovability argument will have to be refined if one strengthens the usual
formulation of the Kan extension property to one with explicit horn-filler operations.
Thu, 01 Jan 2015 00:00:00 GMThttp://hdl.handle.net/1956/117292015-01-01T00:00:00ZInvestigating Streamless Sets
http://hdl.handle.net/1956/11725
Investigating Streamless Sets
Parmann, Erik
Conference object
<p>In this paper we look at streamless sets, recently investigated by Coquand and Spiwack. A set is streamless if every stream over that set contain a duplicate. It is an open question in constructive mathematics whether the Cartesian product of two streamless sets is streamless.</p> <p>We look at some settings in which the Cartesian product of two streamless sets is indeed streamless; in particular, we show that this holds in Martin-Loef intentional type theory when at least one of the sets have decidable equality. We go on to show that the addition of functional extensionality give streamless sets decidable equality, and then investigate these results in a few other constructive systems.</p>
Thu, 01 Jan 2015 00:00:00 GMThttp://hdl.handle.net/1956/117252015-01-01T00:00:00ZCase Studies in Constructive Mathematics
http://hdl.handle.net/1956/11718
Case Studies in Constructive Mathematics
Parmann, Erik
Doctoral thesis
<p>The common theme in this thesis is the study of constructive provability: in particular we investigate aspects of ﬁnite sets and Kan simplicial sets from a constructive perspective. </p>
<p>There are numerous deﬁnitions of ﬁniteness which are classically equivalent but not constructively so. In other words, constructive mathematics is able to distinguish between more notions of ﬁniteness. We start by investigating some relationships between several ways of deﬁning ﬁniteness for sets of natural numbers. As a new result, we give strictly bounded a precise placement in a hierarchy of deﬁnitions of ﬁniteness. </p>
<p>We also investigate streamless sets, which constitutes another notion of ﬁnite- ness. Streamless sets require neither decidable equality nor that the set is a subset of an enumerable set, and they are as such more general than strictly bounded sets. It is an open problem whether the Cartesian product of two streamless sets is itself streamless. We show that this holds if at least one of the sets has decidable equality or is of bounded size. The problem remains open for the case where both streamless sets have undecidable equality and fail to be of bounded size. We also show that—in certain constructive systems—the addition of function extensionality makes equality within streamless sets decidable. </p>
<p>Another notion of ﬁniteness is Noetherian. Both streamless and Noetherian can be generalized to properties of binary relations, whereby such sets are those where equality is respectively streamless or Noetherian. We provide a proof that all Noetherian relations are streamless—notably, in a type system without inductively deﬁned equality. This result immediately entails that all Noetherian sets are streamless within that type system. </p>
<p>We proceed to investigate aspects of Kan simplicial sets, a notion coming from topology. Kan simplicial sets have recently caught the eye of the type theory community since they can be used to build models of Martin-L¨of type theory that validate the Univalence axiom. All known proofs of the following well-known theorem use classical logic: if simplicial sets X and Y are Kan simplicial sets then Y X is also a Kan simplicial set. This theorem plays an important role in the Kan simplicial set model of type theory. We investigate whether this theorem also holds constructively. The classical deﬁnition of the Kan property has at least two non-equivalent constructive interpretations, and we provide countermodels showing the constructive non-provability of the classical theorem above for both of these deﬁnitions of Kan simplicial sets. </p>
Fri, 22 Jan 2016 00:00:00 GMThttp://hdl.handle.net/1956/117182016-01-22T00:00:00ZGenerating software for MUB complementary sequence constructions
http://hdl.handle.net/1956/11670
Generating software for MUB complementary sequence constructions
Roodashty, Hanieh
Master thesis
This master thesis has been performed at the Department of Informatics, University of Bergen between February and November 2015. The work has been supervised by Professor Matthew G. Parker as a part of the research interest within complementary construction using mutually unbiased bases.
This project is an attempt in the line of the study to improve the set size of the complementary sequences while keeping the upper bound of PAPR as low as possible and also maintaining the pairwise distinguishability. To perform this task, seeding the recursive construction with optimal mutually-unbiased bases in dimension 2 and 3 was used in this study.
To use the MUB-based sequences in OFDM containing systems, we generated program codes that constructed distinct arrays and sequences for dimension 2 and 3 seeding by MUBs with and without linear offset. The codes were produced in MATLAB environment.
The codes for both dimensions have delivered satisfactory results. The results for lower iterations have also matched with the manually calculated values based on theory.
Various strategies were used to increase the software speed as well as to decrease the resource demand, but still to run the codes for higher iterations there needs advanced and professional computing solutions such as supercomputers.
It has been attempted to generate the codes with maximum possible flexibility so that they can be used for other dimensions with minor adjustments. The codes have also the capability of conversion to other programming languages.
Wed, 18 Nov 2015 00:00:00 GMThttp://hdl.handle.net/1956/116702015-11-18T00:00:00ZA 43k Kernel for Planar Dominating Set using Computer-Aided Reduction Rule Discovery
http://hdl.handle.net/1956/11669
A 43k Kernel for Planar Dominating Set using Computer-Aided Reduction Rule Discovery
Halseth, Johan Torås
Master thesis
In this thesis we explore the technique of Region
Decomposition for finding kernels for Planar Dominating Set.
We redefine some concepts used in earlier work, fixing some
ambiguities on the way. From those concepts we end up at
the 335k kernel upper bound from Alber et. We then build
on the results of Chen et al. to improve this upper bound to
55k.
In the last part of the thesis we make use of a computer
program to exhaustively search for reduced instances of
regions, to be used together with the Region Decomposition
technique. From the results from the computer program we
are able to conclude that any region used in a Region
Decomposition always can be reduced to an equivalent
region having 12 vertices or less, in addition to its to
endpoints. This let us arrive at a 43k kernel for Planar
Dominating Set.
Mon, 15 Feb 2016 00:00:00 GMThttp://hdl.handle.net/1956/116692016-02-15T00:00:00ZAxis patterning by BMPs: cnidarian network reveals evolutionary constraints
http://hdl.handle.net/1956/11601
Axis patterning by BMPs: cnidarian network reveals evolutionary constraints
Genikhovich, Grigory; Fried, Patrick; Prünster, M. Mandela; Schinko, Johannes B.; Gilles, Anna F.; Fredman, David; Meier, Karin; Iber, Dagmar; Technau, Ulrich
Journal article
BMP signaling plays a crucial role in the establishment of the dorso-ventral body axis in bilaterally symmetric animals. However, the topologies of the bone morphogenetic protein (BMP) signaling networks vary drastically in different animal groups, raising questions about the evolutionary constraints and evolvability of BMP signaling systems. Using loss-of-function analysis and mathematical modeling, we show that two signaling centers expressing different BMPs and BMP antagonists maintain the secondary axis of the sea anemone Nematostella. We demonstrate that BMP signaling is required for asymmetric Hox gene expression and mesentery formation. Computational analysis reveals that network parameters related to BMP4 and Chordin are constrained both in Nematostella and Xenopus, while those describing the BMP signaling modulators can vary significantly. Notably, only chordin, but not bmp4 expression needs to be spatially restricted for robust signaling gradient formation. Our data provide an explanation of the evolvability of BMP signaling systems in axis formation throughout Eumetazoa.
Sun, 01 Mar 2015 00:00:00 GMThttp://hdl.handle.net/1956/116012015-03-01T00:00:00ZEfficient CRISPR-Cas9-mediated generation of knockin human pluripotent stem cells lacking undesired mutations at the targeted locus
http://hdl.handle.net/1956/11600
Efficient CRISPR-Cas9-mediated generation of knockin human pluripotent stem cells lacking undesired mutations at the targeted locus
Merkle, Florian T.; Neuhausser, Werner M.; Santos, David; Valen, Eivind; Gagnon, James A.; Maas, Kristi; Sandoe, Jackson; Schier, Alexander F.; Eggan, Kevin
Journal article
The CRISPR-Cas9 system has the potential to revolutionize genome editing in human pluripotent stem cells (hPSCs), but its advantages and pitfalls are still poorly understood. We systematically tested the ability of CRISPR-Cas9 to mediate reporter gene knockin at 16 distinct genomic sites in hPSCs. We observed efficient gene targeting but found that targeted clones carried an unexpectedly high frequency of insertion and deletion (indel) mutations at both alleles of the targeted gene. These indels were induced by Cas9 nuclease, as well as Cas9-D10A single or dual nickases, and often disrupted gene function. To overcome this problem, we designed strategies to physically destroy or separate CRISPR target sites at the targeted allele and developed a bioinformatic pipeline to identify and eliminate clones harboring deleterious indels at the other allele. This two-pronged approach enables the reliable generation of knockin hPSC reporter cell lines free of unwanted mutations at the targeted locus.
Fri, 01 May 2015 00:00:00 GMThttp://hdl.handle.net/1956/116002015-05-01T00:00:00ZFreeContact: Fast and free software for protein contact prediction from residue co-evolution
http://hdl.handle.net/1956/10995
FreeContact: Fast and free software for protein contact prediction from residue co-evolution
Kaján, László; Hopf, Thomas A.; Kalaš, Matúš; Marks, Debora S.; Rost, Burkhard
Journal article
Background: 20 years of improved technology and growing sequences now renders residue-residue contact constraints in large protein families through correlated mutations accurate enough to drive de novo predictions of protein three-dimensional structure. The method EVfold broke new ground using mean-field Direct Coupling Analysis (EVfold-mfDCA); the method PSICOV applied a related concept by estimating a sparse inverse covariance matrix. Both methods (EVfold-mfDCA and PSICOV) are publicly available, but both require too much CPU time for interactive applications. On top, EVfold-mfDCA depends on proprietary software.
Results: Here, we present FreeContact, a fast, open source implementation of EVfold-mfDCA and PSICOV. On a test set of 140 proteins, FreeContact was almost eight times faster than PSICOV without decreasing prediction performance. The EVfold-mfDCA implementation of FreeContact was over 220 times faster than PSICOV with negligible performance decrease. EVfold-mfDCA was unavailable for testing due to its dependency on proprietary software. FreeContact is implemented as the free C++ library “libfreecontact”, complete with command line tool “freecontact”, as well as Perl and Python modules. All components are available as Debian packages. FreeContact supports the BioXSD format for interoperability.
Conclusions: FreeContact provides the opportunity to compute reliable contact predictions in any environment (desktop or cloud).
Wed, 26 Mar 2014 00:00:00 GMThttp://hdl.handle.net/1956/109952014-03-26T00:00:00ZIndependent Set on P5-free graphs, an empirical study
http://hdl.handle.net/1956/10946
Independent Set on P5-free graphs, an empirical study
Haug, Håvard Karim
Master thesis
We implement the recent polynomial time
algorithm for the independent set problem on
P5-free graphs, and study the performance of
this algorithm on graphs of size up to 50. Our
empirical results show that the algorithm is
probably much faster than its theoretical
running-time upperbound.
Fri, 20 Nov 2015 00:00:00 GMThttp://hdl.handle.net/1956/109462015-11-20T00:00:00ZParameterized Graph Modification Algorithms
http://hdl.handle.net/1956/10774
Parameterized Graph Modification Algorithms
Drange, Pål Grønås
Doctoral thesis
<p>Graph modification problems form an important class of algorithmic problems in computer science. In this thesis, we study edge modification problems towards classes related to chordal graphs, with the main focus on trivially perfect graphs and threshold graphs. We provide several new results in classical complexity, kernelization complexity, and subexponential parameterized complexity. In all cases we give positive and negative results—giving polynomial time algorithms as well as NP-hardness results, polynomial kernels as well as polynomial kernel impossibility results, and we give subexponential time algorithms, and show that many problems do not admit such algorithms unless the exponential time hypothesis fails.</p>
<p>Our main focus is on the subexponential time complexity of edge modification problems. For that to make sense, we first need to figure out whether or not we actually need super-polynomial time. We show that editing towards trivially perfect graphs, threshold graphs, and chain graphs are all NP-complete, resolving 15 year old open questions. When a problem is shown to be NP-complete, we study exactly how much exponential time is needed for an algorithm to solve it. We provide several subexponential time algorithms, for, e.g., editing towards chain graphs and threshold graphs, as well as completing towards trivially perfect graphs. We complement our results by showing that small alterations in the target graph classes yields much harder problems: Editing towards trivially perfect graphs and cographs is not possible in subexponential time unless the exponential time hypothesis fails.</p>
<p>A first step in our subexponential time algorithms, and an otherwise natural first step in dealing with NP-hard problems is offered by the toolbox of polynomial kernelization. In polynomial kernelizations, we are asked to design polynomial time compression algorithms that shrink the input instances to output instances bounded polynomially in a yes-solution. We provide polynomial kernels for all edge modification problems towards trivially perfect graphs, threshold graphs and chain graphs. In addition, we show that on bounded degree input graphs, we obtain polynomial kernels for any editing or deletion problem towards graph classes characterizable by a finite set of forbidden induced subgraphs. Finally, we show that we should not expect the same result for completion problems by proving that such a compression algorithm would imply the collapse of the polynomial hierarchy.</p>
Thu, 10 Dec 2015 00:00:00 GMThttp://hdl.handle.net/1956/107742015-12-10T00:00:00ZSecurity in cloud computing and virtual environments
http://hdl.handle.net/1956/10695
Security in cloud computing and virtual environments
Aarseth, Raymond
Master thesis
Cloud computing is a big buzzwords today. Just
watch the commercials on TV and I can promise
that you will hear the word cloud service at
least once. With the growth of cloud technology
steadily rising, and everything from cellphones
to cars connected to the cloud, how secure is
cloud technology? What are the caveats of using
cloud technology? And how does it all work? This
thesis will discuss cloud security and the
underlying technology called Virtualization to
better understand the security from a technical
point of view. We will show how some
vulnerabilities can be utilized to steal personal
information, and how an attacker can exploit
vulnerabilities to take control of a system.
Mon, 14 Sep 2015 00:00:00 GMThttp://hdl.handle.net/1956/106952015-09-14T00:00:00ZBioXSD: the common data-exchange format for everyday bioinformatics web services
http://hdl.handle.net/1956/10660
BioXSD: the common data-exchange format for everyday bioinformatics web services
Kalaš, Matúš; Puntervoll, Pål; Joseph, Alexandre; Bartaševičiūtė, Edita; Töpfer, Armin; Venkataraman, Prabakar; Pettifer, Steve; Bryne, Jan Christian; Ison, Jon; Blanchet, Christophe; Rapacki, Kristoffer; Jonassen, Inge
Journal article
<p>Motivation: The world-wide community of life scientists has access to a large number of public bioinformatics databases and tools, which are developed and deployed using diverse technologies and designs. More and more of the resources offer programmatic web-service interface. However, efficient use of the resources is hampered by the lack of widely used, standard data-exchange formats for the basic, everyday bioinformatics data types.</p>
<p>Results: BioXSD has been developed as a candidate for standard, canonical exchange format for basic bioinformatics data. BioXSD is represented by a dedicated XML Schema and defines syntax for biological sequences, sequence annotations, alignments and references to resources. We have adapted a set of web services to use BioXSD as the input and output format, and implemented a test-case workflow. This demonstrates that the approach is feasible and provides smooth interoperability. Semantics for BioXSD is provided by annotation with the EDAM ontology. We discuss in a separate section how BioXSD relates to other initiatives and approaches, including existing standards and the Semantic Web.</p>
<p>Availability: The BioXSD 1.0 XML Schema is freely available at http://www.bioxsd.org/BioXSD-1.0.xsd under the Creative Commons BY-ND 3.0 license. The http://bioxsd.org web page offers documentation, examples of data in BioXSD format, example workflows with source codes in common programming languages, an updated list of compatible web services and tools and a repository of feature requests from the community.</p>
Fri, 01 Jan 2010 00:00:00 GMThttp://hdl.handle.net/1956/106602010-01-01T00:00:00ZEDAM: an ontology of bioinformatics operations, types of data and identifiers, topics and formats
http://hdl.handle.net/1956/10659
EDAM: an ontology of bioinformatics operations, types of data and identifiers, topics and formats
Ison, Jon; Kalaš, Matúš; Jonassen, Inge; Bolser, Dan; Uludag, Mahmut; McWilliam, Hamish; Malone, James; Lopez, Rodrigo; Pettifer, Steve; Rice, Peter
Journal article
<p>Motivation: Advancing the search, publication and integration of bioinformatics tools and resources demands consistent machine-understandable descriptions. A comprehensive ontology allowing such descriptions is therefore required.</p>
<p>Results: EDAM is an ontology of bioinformatics operations (tool or workflow functions), types of data and identifiers, application domains and data formats. EDAM supports semantic annotation of diverse entities such as Web services, databases, programmatic libraries, standalone tools, interactive applications, data schemas, datasets and publications within bioinformatics. EDAM applies to organizing and finding suitable tools and data and to automating their integration into complex applications or workflows. It includes over 2200 defined concepts and has successfully been used for annotations and implementations.</p>
<p>Availability: The latest stable version of EDAM is available in OWL format from http://edamontology.org/EDAM.owl and in OBO format from http://edamontology.org/EDAM.obo. It can be viewed online at the NCBO BioPortal and the EBI Ontology Lookup Service. For documentation and license please refer to http://edamontology.org. This article describes version 1.2 available at http://edamontology.org/EDAM_1.2.owl.</p>
Tue, 01 Jan 2013 00:00:00 GMThttp://hdl.handle.net/1956/106592013-01-01T00:00:00ZEfforts towards accessible and reliable bioinformatics
http://hdl.handle.net/1956/10658
Efforts towards accessible and reliable bioinformatics
Kalaš, Matúš
Doctoral thesis
<p>The aim of the presented work was contributing to making scientific computing more accessible, reliable, and thus more efficient for researchers, primarily computational biologists and molecular biologists. Many approaches are possible and necessary towards these goals, and many layers need to be tackled, in collaborative community efforts with well-defined scope. As diverse components are necessary for the accessible and reliable bioinformatics scenario, our work focussed in particular on the following:</p>
<p>In the BioXSD project, we aimed at developing an XML-Schema-based data format compatible with Web services and programmatic libraries, that is expressive enough to be usable as a common, canonical data model that serves tools, libraries, and users with convenient data interoperability.</p>
<p>The EDAM ontology aimed at enumerating and organising concepts within bioinformatics, including operations and types of data. EDAM can be helpful in documenting and categorising bioinformatics resources using a standard “vocabulary”, enabling users to find respective resources and choose the right tools.</p>
<p>The eSysbio project explored ways of developing a workbench for collaborative data analysis, accessible in various ways for users with various tasks and expertise. We aimed at utilising the World-Wide-Web and industrial standards, in order to increase compatibility and maintainability, and foster shared effort.</p>
<p>In addition to these three main contributions that I have been involved in, I present a comprehensive but non-exhaustive research into the various previous and contemporary efforts and approaches to the broad topic of integrative bioinformatics, in particular with respect to bioinformatics software and services. I also mention some closely related efforts that I have been involved in.</p>
<p>The thesis is organised as follows: In the Background chapter, the field is presented, with various approaches and existing efforts. Summary of results summarises the contributions of my enclosed projects – the BioXSD data format, the EDAM ontology, and the eSysbio workbench prototype – to the broad topics of the thesis. The Discussion chapter presents further considerations and current work, and concludes the discussed contributions with alternative and future perspectives.</p>
<p>In the printed version, the three articles that are part of this thesis, are attached after the Discussion and References. In the electronic version (in PDF), the main thesis is optimised for reading on a screen, with clickable cross-references (e.g. from citations in the text to the list of References) and hyperlinks (e.g. for URLs and most References). A PDF viewer with “back“ function is recommended.</p>
Thu, 19 Nov 2015 00:00:00 GMThttp://hdl.handle.net/1956/106582015-11-19T00:00:00ZTowards a Secure Framework for mHealth. A Case Study in Mobile Data Collection Systems
http://hdl.handle.net/1956/10652
Towards a Secure Framework for mHealth. A Case Study in Mobile Data Collection Systems
Gejibo, Samson Hussien
Doctoral thesis
<p>The rapid growth in the mobile communications technology and wide cellular coverage created an opportunity to satisfy the demand for low-cost health care solutions. Mobile Health (a.k.a. mHealth) is a promising health service delivery concept that utilizes mobile communications technology to bridge the gap between remotely and sparsely populated communities and health care providers. So far, several mHealth applications have been developed and deployed in the field. Among those, a digital information gathering and dissemination system using mobile devices is the main focus of this work. This type of mHealth system is called Mobile Data Collection System (MDCS). Although MDCS succeeds over traditional paper form based data collection; it has also brought unique challenges such as data security in mobile communications technology. Despite MDCS are often used to collect sensitive health-related data, more work was needed to address security issues like confidentiality, integrity, availability and authentication to secure sensitive health related information in storage, data exchange and processing.</p>
<p>When we began this work, Java ME enabled feature phones, that dominated the scene for a decade, were the choice of most MDCS. At that time, in collaboration with our partner project, we proposed a secure custom protocol. The protocol has been implemented, tested, and integrated into our reference MDCS. We have confirmed the flexibility of our secure solution by retrofitting the existing openXdata system with user authentication, secure storage and communication solutions by modifying only a few lines of code in the client-server application.</p>
<p>However, in the past few years, the explosion of new mobile platforms and cloudbased services became game changer in our work. The move from feature phones to smartphones brought to the table the need to reevaluate, redesign, and port our earlier secure solution to smartphones based MDCS by considering the unique features and challenges of both smart phone clients and cloud-based server-side deployments.</p>
<p>In this dissertation, we analyze the challenges in securing mobile data collection systems deployed in remote areas, in resources-constrained environment, and in low project budget settings. We present a flexible and secure framework that offers user authentication both online and off-line, secure mobile storage, secure communication, and secure cloud storage. Besides, the framework provides data integrity, user account and data recovery, and multi-user management and is designed to be easily integrated in existing MDCS with minimal effort. Although fundamental security issues are conceptually identical in both old feature phone and current smartphone based solutions, our framework and the proposed solutions address the unique aspects of both mobile platforms. We also discuss the solution we designed for older Java ME based devices, and how they are still relevant. For this work, we collaborated with the open-source MDCS, openXdata and Open Data Kit (ODK).</p>
Thu, 05 Nov 2015 00:00:00 GMThttp://hdl.handle.net/1956/106522015-11-05T00:00:00ZInteractive Visual Analysis of Streaming Data
http://hdl.handle.net/1956/10587
Interactive Visual Analysis of Streaming Data
Smestad, Geir
Master thesis
Interactive Visual Analysis (IVA) has proven to be a robust set of methods for visually exploring complex data sets and generating hypotheses from data. Datasets and techniques where the temporal aspect is central has been an important area of study, both for the visualization field in general and for research on IVA. However, the challenge of handling streaming data sources for the purposes of decision support and analysis in real time, has been given comparatively little attention. This thesis presents a summary of the visualization literature addressing time-oriented and streaming data, with emphasis on Interactive Visual Analysis and its related techniques. We then explain the contemporary distinction between real-time data monitoring and retrospective data analysis, explore challenges that occur when a human user attempts to visually analyze data in real time, and use these observations to extend the scope of IVA such that it can be used to analyze streaming data in real time.
Tue, 23 Sep 2014 00:00:00 GMThttp://hdl.handle.net/1956/105872014-09-23T00:00:00ZThe genome sequence of Atlantic cod reveals a unique immune system
http://hdl.handle.net/1956/10530
The genome sequence of Atlantic cod reveals a unique immune system
Star, Bastiaan; Nederbragt, Alexander Johan; Jentoft, Sissel; Grimholt, Unni; Malmstrøm, Martin; Gregers, Tone Fredsvik; Rounge, Trine Ballestad; Paulsen, Jonas; Solbakken, Monica Hongrø; Sharma, Animesh; Wetten, Ola Frang; Lanzén, Anders; Winer, Roger; Knight, James; Vogel, Jan-Hinnerk; Aken, Bronwen; Andersen, Øivind; Lagesen, Karin; Tooming-Klunderud, Ave; Edvardsen, Rolf; Kirubakaran, G. Tina; Espelund, Mari; Nepal, Chirag; Previti, A. Christopher; Karlsen, Bård Ove; Moum, Truls; Skage, Morten; Berg, Paul Ragnar; Gjøen, Tor; Kuhl, Heiner; Thorsen, Jim; Malde, Ketil; Reinhardt, Richard; Du, Lei; Johansen, Steinar Daae; Searle, Steve; Lien, Sigbjørn; Nilsen, Frank; Jonassen, Inge; Omholt, Stig W; Stenseth, Nils Christian; Jakobsen, Kjetill Sigurd
Journal article
Atlantic cod (Gadus morhua) is a large, cold-adapted teleost that sustains long-standing commercial fisheries and incipient aquaculture. Here we present the genome sequence of Atlantic cod, showing evidence for complex thermal adaptations in its haemoglobin gene cluster and an unusual immune architecture compared to other sequenced vertebrates. The genome assembly was obtained exclusively by 454 sequencing of shotgun and paired-end libraries, and automated annotation identified 22,154 genes. The major histocompatibility complex (MHC) II is a conserved feature of the adaptive immune system of jawed vertebrates, but we show that Atlantic cod has lost the genes for MHC II, CD4 and invariant chain (Ii) that are essential for the function of this pathway. Nevertheless, Atlantic cod is not exceptionally susceptible to disease under natural conditions5. We find a highly expanded number of MHC I genes and a unique composition of its Toll-like receptor (TLR) families. This indicates how the Atlantic cod immune system has evolved compensatory mechanisms in both adaptive and innate immunity in the absence of MHC II. These observations affect fundamental assumptions about the evolution of the adaptive immune system and its components in vertebrates.
Thu, 01 Sep 2011 00:00:00 GMThttp://hdl.handle.net/1956/105302011-09-01T00:00:00ZJASPAR 2014: An extensively expanded and updated open-access database of transcription factor binding profiles
http://hdl.handle.net/1956/10452
JASPAR 2014: An extensively expanded and updated open-access database of transcription factor binding profiles
Mathelier, Anthony; Zhao, Xiaobei; Zhang, Allen W.; Parcy, François; Worsley-Hunt, Rebecca; Arenillas, David J.; Buchman, Sorana; Chen, Chih-yu; Chou, Alice; Ienasescu, Hans; Lim, Jonathan; Shyr, Casper; Tan, Ge; Zhou, Michelle; Lenhard, Boris; Sandelin, Albin; Wasserman, Wyeth W.
Journal article
JASPAR (http://jaspar.genereg.net) is the largest open-access database of matrix-based nucleotide profiles describing the binding preference of transcription factors from multiple species. The fifth major release greatly expands the heart of JASPAR—the JASPAR CORE subcollection, which contains curated, non-redundant profiles—with 135 new curated profiles (74 in vertebrates, 8 in Drosophila melanogaster, 10 in Caenorhabditis elegans and 43 in Arabidopsis thaliana; a 30% increase in total) and 43 older updated profiles (36 in vertebrates, 3 in D. melanogaster and 4 in A. thaliana; a 9% update in total). The new and updated profiles are mainly derived from published chromatin immunoprecipitation-seq experimental datasets. In addition, the web interface has been enhanced with advanced capabilities in browsing, searching and subsetting. Finally, the new JASPAR release is accompanied by a new BioPython package, a new R tool package and a new R/Bioconductor data package to facilitate access for both manual and automated methods.
Wed, 01 Jan 2014 00:00:00 GMThttp://hdl.handle.net/1956/104522014-01-01T00:00:00ZSkaping av meirverdi gjennom opne data om kollektivtrafikk
http://hdl.handle.net/1956/10429
Skaping av meirverdi gjennom opne data om kollektivtrafikk
Bergheim, Livar
Master thesis
Tema i denne oppgåva er opne data, med ei avgrensing
mot kollektivtrafikkdata. Djupnestudien
er retta mot verksemda til Skyss, det fylkeskommunale
kollektivtrafikkselskapet i Hordaland.
Omgrepet opne data" inneber at andre har løyve til å
vidarebruke dei i applikasjonar for
å nå dei ideelle målsetjingane som innovasjon, nyskaping,
gjennomsiktighet og effektivisering.
Utfordringane med opne data i denne typen verksemd er
kompleksiteten av dataelement ordna
i tid og rom. Samtidig er data dynamiske og nokre datasett
endrar seg i sanntid, t.d. prognosar
for kor mange minutt det er til dei neste bussane kjem til
ein gitt haldeplass.
I oppgåva reiser eg spørsmål om kva kvalitetar eit
programmeringsgrensesnitt (API) bør ha
med omsyn til å lette tilgangen til opne data og
funksjonaliteten knytt til desse.
Mitt standpunkt er som følgje av studien at web-API som
følgjer REST-arkitekturstilen, er
meir fleksible og lettare å ta i bruk og integrere med data
frå andre kjelder og andre applikasjonar.
Problemstillinga i oppgåva er todelt. For det første, korleis
ein kan gjere tilgjengeleg kollektivtrafikkdata
for vidarebruk på ein god måte. For det andre, i kva grad
kollektivtrafikkdata
let seg utnytte, eventuelt i kombinasjon med data frå andre
kjelder, til å lage nye applikasjonar/tenester
av verdi.
Gjennom drøftingar og utvikla app har eg vist døme på:
1) At å gjere kollektivtrafikkdata tilgjengeleg som opne data
kan gje meirverdi i form av nye
appar og tenester, innsyn og effektivisering.
2) Gjennom eit web-API er det relativt enkelt for utviklarar
med rett dugleik å bruke kunnskap
og kreativitet til nye applikasjonar.
3) Korleis ein kan legge til rette for eit godt web-API som
gjer det mogeleg og lettare for utviklarar
å bruke data.
Det har skjedd mykje dei siste åra når det gjeld teknologi
og kva data ein har om kollektivtrafikk.
Dette kan bidra til meir bruk av kollektivtrafikk, noko som er
miljømessig og politisk
ynskt.
Mon, 04 May 2015 00:00:00 GMThttp://hdl.handle.net/1956/104292015-05-04T00:00:00ZSMS One-Time Passwords, Security in Two-Factor Authentication
http://hdl.handle.net/1956/10426
SMS One-Time Passwords, Security in Two-Factor Authentication
Eide, Jan-Erik Lothe
Master thesis
In the past decade, the low price and ease of generating and sending large amounts of SMS have made it possible for many online services to create strong and affordable authentication systems. With the growth of smartphones on the market, authentication systems that use mobile phones have lost some of their security. These systems rely on mobile phones being independent, separated from personal computers. This thesis investigates weaknesses in authentication systems that sends vital information to mobile phones via SMS. We will show that services that rely on this type of authentication are vulnerable to attack. The intended audience for this thesis are computer scientists, professional and amateur software developers, but anyone with basic IT knowledge is encouraged to keep reading.
Fri, 29 May 2015 00:00:00 GMThttp://hdl.handle.net/1956/104262015-05-29T00:00:00ZImplementasjon av attributtbasert tilgangskontroll i elektroniske helsesystemer
http://hdl.handle.net/1956/10425
Implementasjon av attributtbasert tilgangskontroll i elektroniske helsesystemer
Kristensen, André Sæ
Master thesis
Tilgangskontroll er et av de viktigste temaene innenfor informasjonssikkerhet[10]. Sensitiv data bør bare kunne aksesseres av autoriserte brukere eller programmer. Denne oppgaven har som hovedmål å undersøke den nåværende tilgangskontrollen i to utbredte elektroniske helsesystemer: Open Medical Record System (OpenMRS) og Open Data Kit Aggregate (ODK Aggregate). Videre skal det utforskes muligheten for å implementere en mer fleksibel tilgangskontroll i overnevnte systemer med bruk av Attribute Based Access Control (ABAC). Før undersøkelsen blir utført skal det kartlegges eksisterende tilgangsmekanismer med deres fordeler og ulemper. Konklusjonen av arbeidet er at implementasjon av ABAC i de overnevnte systemene ikke kan gjøres på en optimal måte uten å måtte redesigne dem. Deler av den eksisterende tilgangskontrollen må derfor beholdes, og bare en begrenset ABAC kan implementeres. Dette er fordi systemene er utviklet for en annen type tilgangskontroll og det er blitt tatt valg under utviklingen som gjør at tilgangskontrollen er for tett integrert i applikasjonen. Basert på erfaringer med OpenMRS og ODK Aggregate ble det laget et elektronisk helsesystem med navnet Medical, der ABAC ble tatt med fra starten og implementert på en optimal måte. Medical prosjektet er et Proof of Concept (PoC) prosjekt som beviser hvor fleksibelt attributtbasert tilgangskontroll kan være når det blir tatt med i designfasen.
Thu, 28 May 2015 00:00:00 GMThttp://hdl.handle.net/1956/104252015-05-28T00:00:00ZA Survey of Linear-Programming Guided Branching Parameterized Algorithms for Vertex Cover, with Experimental Results
http://hdl.handle.net/1956/10402
A Survey of Linear-Programming Guided Branching Parameterized Algorithms for Vertex Cover, with Experimental Results
Urhaug, Tobias Sørensen
Master thesis
A survey of FPT algorithms for Vertex Cover, parameterized by an above guarantee parameter.
Mon, 01 Jun 2015 00:00:00 GMThttp://hdl.handle.net/1956/104022015-06-01T00:00:00ZMaximum number of objects in graph classes.
http://hdl.handle.net/1956/10394
Maximum number of objects in graph classes.
Hellestø, Marit Kristine Astad
Master thesis
The focus of this thesis is the study and
implementation of two exact exponential time
algorihms. These algorihms finds and lists the
number of minimal dominating sets and the number
of minimal subset feedback vertex sets in chordal
and split graphs.
Specifically the goal of this thesis is to study
the upper and lower bounds on the number of
minimal dominating sets and minimal subset
feedback vertex sets in chordal and split graphs,
to see whether it is possible to improve these
bounds. In fact it will be shown that there
exists a better lower bound on the number of
minimal subset feedback vertex sets in split
graphs.
Sun, 31 May 2015 00:00:00 GMThttp://hdl.handle.net/1956/103942015-05-31T00:00:00ZProjective Simulation compared to reinforcement learning
http://hdl.handle.net/1956/10391
Projective Simulation compared to reinforcement learning
Bjerland, Øystein Førsund
Master thesis
This thesis explores the model of projective simulation
(PS), a novel approach for an artificial intelligence (AI)
agent.
The model of PS learns by interacting with the
environment it is situated in, and allows for simulating
actions before real action is taken. The action selection
is based on a random walk through the episodic &
compositional memory (ECM), which is a network of
clips that represent previous experienced percepts. The
network takes percepts as inputs and returns actions.
Through the rewards from the environment, the clip
network will adjust itself dynamically such that the
probability of doing the most favourable action (.i.e most
rewarded) is increased in similar subsequent situations.
With a feature called generalisation, new internal clips
can be created dynamically such that the network will
grow to a multilayer network, which improves the
classification and grouping of percepts.
In this thesis the PS model will be tested on a large and
complex task, learning to play the classic Mario platform
game. Throughout the thesis the model will be
compared to the typical reinforcement algorithms (RL)
algorithms, Q-Learning and SARSA, by means of
experimental simulations.
A framework for PS was built for this thesis, and games
used in the previous papers that introduced PS were
used to validate the correctness of the framework.
Games are often used as a benchmark for learning
agents, a reason is that the rules of the experiment are
already defined and the evaluation can easily be
compared to human performance. The games that will
be used in this thesis are: The Blocking game, Mountain
Car, Pole Balancing and, finally, Mario. The results show
that the PS model is competitive to RL for complex
tasks, and that the evolving network will improve the
performance.
A quantum version of the PS model has recently been
proven to realise a quadratic speed-up compared to the
classical version, and this was one of the primary
reasons for the introduction of the PS model. This
quadratic speed-up is very promising as training AI is
computationally heavy and requires a large state space.
This thesis will, however, consider only the classical
version of the PS model.
Mon, 01 Jun 2015 00:00:00 GMThttp://hdl.handle.net/1956/103912015-06-01T00:00:00ZChoice of parameter for DP-based FPT algorithms: four case studies
http://hdl.handle.net/1956/10368
Choice of parameter for DP-based FPT algorithms: four case studies
Sæther, Sigve Hortemo
Doctoral thesis
<p>This thesis studies dynamic programming algorithms and structural parameters used when solving computationally hard problems. In particular, we look at algorithms that make use of structural decompositions to overcome difficulties of solving a problem, and find alternative runtime parameterizations for some of these problems.</p>
<p>The algorithms we look at make use of branch decompositions to guide the algorithm when doing dynamic programming. Algorithms of this type comprise of two parts; the first part computes a decomposition of the input, and the second part solves the given problem by dynamic programming over the computed decomposition. By altering what properties of an input instance these decompositions should exploit, the runtime of the complete algorithm will change. We look at four cases where altering the structural properties of the decomposition (i.e., changing what width measure for the decomposition to minimize), is used to improve an algorithm.</p>
<p>The first case looks at using branch decompositions of low maximum matchingwidth (mm-width) instead of tree-decompositions of low treewidth when solving Dominating Set. The result of this is an algorithm that is faster than the treewidth-algorithms on instances where the treewidth is at least 1.55 times the mm-width.</p>
<p>In the second case, we look at using branch decompositions of low splitmatching- width (sm-width) for cases when using tree-decompositions or kexpressions will not do. This study leads to new tractability results for Hamiltonian Cycle, Edge Dominating Set, Chromatic Number, and MaxCut for a class of dense graphs.</p>
<p>For the third case, we look at using branch decompositions of low Q-rank-width as an alternative to using branch decompositions of low rank-width for solving a large class of problems definable as [σ,ρ]-partition problems. This class consists of many domination-type problems such as Dominating Set and Independent Set. One of the results of using such an alternative branch decompositions is that we get an improved worst case runtime for Dominating Set parameterized by the clique-width cw; namely O ∗((cw)O(cw)) over the previous best O ∗(2O((cw)2)).</p>
<p>The fourth case looks at using branch decompositions of low projectionsatisfiable- width (ps-width) for solving #SAT and MaxSAT on CNF formulas. We define the notion of having low ps-width and show that by using a dynamic programming algorithm that makes use of the ps-width of a branch decomposition, we get new tractability results for #SAT and MaxSAT, and a framework unifying many previous tractability results.</p>
<p>We also show that deciding boolean-width of a graph is NP-hard and deciding mim-width of a graph is W[1]-hard. In fact, under the assumption NP =ZPP, we show that we cannot approximate mim-width to within a constant factor in polynomial time.</p>
Mon, 07 Sep 2015 00:00:00 GMThttp://hdl.handle.net/1956/103682015-09-07T00:00:00ZExact algorithms for MAX-2SAT and MAX-3SAT via multidimensional matrix multiplication
http://hdl.handle.net/1956/10348
Exact algorithms for MAX-2SAT and MAX-3SAT via multidimensional matrix multiplication
Petkevich, Yevgheni
Master thesis
In this thesis it is showed how an $O(n^{4-\epsilon})$ algorithm for the cube multiplication problem (that is defined in the thesis) would imply a faster than naive $O^{*}(2^{n(1-\frac{\epsilon}{4})})$ algorithm for the MAX-3SAT problem; this algorithm for MAX-3SAT is a generalization of the algorithm for the MAX-2SAT problem which was proposed by Ryan Williams; and cube multiplication, in turn, is defined as a generalization of the matrix multiplication problem for three-dimensional arrays. Approaches to find a faster than naive algorithm for cube multiplication are considered. Though no such algorithm was found using these approaches, it is showed how a variant of the Strassen algorithm for matrix multiplication could be found using the same approaches. Implementations of these approaches using computer programming and results of computational experiments are discussed.
Mon, 01 Jun 2015 00:00:00 GMThttp://hdl.handle.net/1956/103482015-06-01T00:00:00ZLocalizing Cell Towers from Crowdsourced Measurements
http://hdl.handle.net/1956/10302
Localizing Cell Towers from Crowdsourced Measurements
Rusvik, Johan Alexander Nordstrand
Master thesis
Today, several internet sites exist that aim to provide the locations and number of cellular network antennas worldwide. For example [1],[2] and [3]. What makes this task difficult to accomplish is the lack of information available about the whereabouts and number of antennas. Only in a few countries are correct locations for some cellular network antennas known. Otherwise, these sites base their knowledge about cellular network antenna locations on measurement data collected from crowdsourcing. OpenCellID uses a simple and primitive algorithm for estimating antenna locations based on such measurements. In this thesis we suggest an alternative approach to localize cellular network antennas based on data provided by OpenCellID. We start by giving an introduction to the problem, and give a brief overview of related work. This includes localization of mobile devices in addition to localization of cellular network antennas. We then present some background information for our algorithm development. Next we develop two similar algorithms for localizing cellular network antennas. One utilizes distance between measurements, the other utilizes Received Signal Strength (RSS) values among measurements. We experiment with the two algorithms on theoretical generated test data, and argue that utilizing RSS gives the most accurate estimated antenna locations. Next we present the OpenCellID data. We explore this data in detail before defining two subsets we will test our two algorithms on. One subset contains measurement data where correct antenna locations are known. The other contains measurement data for antennas in the Bergen City Center area. We then estimate cellular network antenna locations with our two algorithms for the two subsets. Our tests will show that utilizing RSS estimates more accurate antenna locations when correct antenna locations are known and can be compared to. We end the thesis by analyzing two measurement distribution patterns, and propose how the algorithms can be improved.
Mon, 01 Jun 2015 00:00:00 GMThttp://hdl.handle.net/1956/103022015-06-01T00:00:00ZA Model of Type Theory in Cubical Sets
http://hdl.handle.net/1956/10215
A Model of Type Theory in Cubical Sets
Bezem, Marcus Aloysius; Coquand, Thierry; Huber, Simon
Journal article
We present a model of type theory with dependent product, sum, and identity, in cubical sets. We describe a universe and explain how to transform an equivalence between two types into an equality. We also explain how to model propositional truncation and the circle. While not expressed internally in type theory, the model is expressed in a constructive metalogic. Thus it is a step towards a computational interpretation of Voevodsky's Univalence Axiom.
Wed, 01 Jan 2014 00:00:00 GMThttp://hdl.handle.net/1956/102152014-01-01T00:00:00ZMaximum number of edges in graph classes under degree and matching constraints
http://hdl.handle.net/1956/9951
Maximum number of edges in graph classes under degree and matching constraints
Måland, Erik Kvam
Master thesis
In extremal graph theory, we ask how large or
small a property of a graph can be, when the
graph has to satisfy certain constraints.
In this thesis, we ask how many edges a graph can
have with restrictions on its degree and matching
number, when the graph belongs to a given graph
class.
The solutions on general graphs and bipartite
graphs are known.
We present here the solution on split graphs,
disjoint union of split graphs and unit interval
graphs.
This is related to the determining the Ramsey
numbers of certain graph classes.
In addition, we present a characterization of
factor-critical chordal graphs in terms of
spanning subgraphs.; I ekstremal grafteori spør vi hvor stor eller
liten en grafs egenskap kan være, gitt at den må
tilfredsstille visse betingelser.
In denne oppgaven, spør vi hvor mange kanter en
graf kan ha når det blir satt restriksjoner på
dens grad og matching tall, og grafen må tilhøre
en gitt grafklasse.
Løsningen på generelle og bipartite grafer er
allerede kjent.
Vi presenterer her løsningen på split grafer,
disjunkt union av split grafer og enhetsintervall
grafer.
Dette er relatert til å bestemme Ramsey tall på
en spesiell samling av grafer.
I tillegg gir vi en karakterisering av faktor-
kritiske kordale grafer ved utspennende
subgrafer.
Tue, 12 May 2015 00:00:00 GMThttp://hdl.handle.net/1956/99512015-05-12T00:00:00ZCommunity Detection in Social Networks
http://hdl.handle.net/1956/9944
Community Detection in Social Networks
Fasmer, Erlend Eindride
Master thesis
Social networks usually display a hierarchy of communities
and it is the task of community detection algorithms to
detect these communities and preferably also their
hierarchical relationships. One common class of such
hierarchical algorithms are the agglomerative algorithms.
These algorithms start with one community per vertex in
the network and keep agglomerating vertices together to
form increasingly larger communities. Another common
class of hierarchical algorithms are the divisive algorithms.
These algorithms start with a single community comprising
all the vertices of the network and then split the network
into several connected components that are viewed as
communities.
We start this thesis by giving an introductory overview of
the field of com- munity detection in part I, including
complex networks, the basic groups of com- munity
definitions, the modularity function, and a description of
common com- munity detection techniques, including
agglomerative and divisive algorithms.
Then we proceed, in part II, with community detection
algorithms that have been implemented and tested, with
refined use of data structures, as part of this thesis. We
start by describing, implementing and testing against
benchmark graphs the greedy hierarchical agglomerative
community detection algorithm proposed by Aaron Clauset,
M. E. J. Newman, and Cristopher Moore in 2004 in the
article Finding community structure in very large networks
[5]. We continue with describing and implementing the
hierarchical divisive algorithm proposed by Filippo
Radicchi, Claudio Castellano, Federico Cecconi, Vittorio
Loreto, and Domenico Parisi in 2004 in the article Defining
and identifying communities in networks [28]. Instead of
testing this algorithm against benchmark graphs we
present a community detection web service that runs the
algorithm by Radicchi et al. on the collaboration networks
in the DBLP database of scientific publi- cations and co-
authorships in the area of computer science. We allow the
user to freely set the many parameters that we have
defined for this algorithm. The final judgment on the results
is measured by the modularity value or can be left to the
knowledgeable user. A rough description of the design of
the algorithms and of the web service is given, and all code
is available at GitHub [10] [9].
Lastly, a few improvements both to the algorithm by
Radicchi et al. and to the web service are presented.
Fri, 01 May 2015 00:00:00 GMThttp://hdl.handle.net/1956/99442015-05-01T00:00:00ZFast Method for Maximum-Flow Problem with Minimum-Lot Sizes
http://hdl.handle.net/1956/9903
Fast Method for Maximum-Flow Problem with Minimum-Lot Sizes
Ganeshan, Vithya
Master thesis
In transportation networks, such as pipeline networks for transporting natural gas, it is often impractical to send across amounts of flow below a certain threshold. Such lower threshold is referred as the minimum-lot size. The network flow is semi-continuous when a network has minimum-lot sizes. In other terms, the flow can be either zero or within the limits of minimum-lot and maximum capacity of the network. When a network includes minimum-lot constraints, the problem of finding maximum-flow becomes complex and exact methods tend to be too time consuming. Since it is not generally required that the solution methods provide the optimal solution, this master thesis proposes a fast (inexact) method to find near-optimum solution.Also, the proposed fast method is experimentally validated and compared with the other relevant approaches available in the literature, and the results are analyzed in detail.
Tue, 03 Mar 2015 00:00:00 GMThttp://hdl.handle.net/1956/99032015-03-03T00:00:00ZMolecular mechanisms of adaptation emerging from the physics and evolution of nucleic acids and proteins
http://hdl.handle.net/1956/9885
Molecular mechanisms of adaptation emerging from the physics and evolution of nucleic acids and proteins
Goncearenco, Alexander; Ma, Binguang; Berezovsky, Igor
Journal article
DNA, RNA and proteins are major biological macromolecules that coevolve and adapt to environments as components of one highly interconnected system. We explore here sequence/structure determinants of mechanisms of adaptation of these molecules, links between them, and results of their mutual evolution. We complemented statistical analysis of genomic and proteomic sequences with folding simulations of RNA molecules, unraveling causal relations between compositional and sequence biases reflecting molecular adaptation on DNA, RNA and protein levels. We found many compositional peculiarities related to environmental adaptation and the life style. Specifically, thermal adaptation of protein-coding sequences in Archaea is characterized by a stronger codon bias than in Bacteria. Guanine and cytosine load in the third codon position is important for supporting the aerobic life style, and it is highly pronounced in Bacteria. The third codon position also provides a tradeoff between arginine and lysine, which are favorable for thermal adaptation and aerobicity, respectively. Dinucleotide composition provides stability of nucleic acids via strong base-stacking in ApG dinucleotides. In relation to coevolution of nucleic acids and proteins, thermostability-related demands on the amino acid composition affect the nucleotide content in the second codon position in Archaea.
Wed, 01 Jan 2014 00:00:00 GMThttp://hdl.handle.net/1956/98852014-01-01T00:00:00ZCombining Aspect-Oriented and Strategic Programming
http://hdl.handle.net/1956/9807
Combining Aspect-Oriented and Strategic Programming
Kalleberg, Karl Trygve; Visser, Eelco
Journal article
Properties such as logging, persistence, debugging, tracing, distribution, performance monitoring and exception handling occur in most programming paradigms and are normally very difficult or even impossible to modularizewith traditional modularization mechanisms because they are cross-cutting. Recently, aspect-oriented programming has enjoyed recognition as a practical solution for separating these concerns. In this paper we describe an extension to the Stratego term rewriting language for capturing such properties. We show our aspect language offers a concise, practical and adaptable solution for dealing with unanticipated algorithm extension for forward data-flow propagation and dynamic type checking of terms. We briefly discuss some of the challenges faced when designing and implementing an aspect extension for and in a rule-based term rewriting system.
Sun, 01 Jan 2006 00:00:00 GMThttp://hdl.handle.net/1956/98072006-01-01T00:00:00ZInterfacing concepts: Why declaration style shouldn't matter
http://hdl.handle.net/1956/9805
Interfacing concepts: Why declaration style shouldn't matter
Bagge, Anya Helene; Haveraaen, Magne
Journal article
<p>A concept (or signature) describes the interface of a set of abstract types by listing the operations that should be supported for those types. When implementing a generic operation, such as sorting, we may then specify requirements such as “elements must be comparable” by requiring that the element type models the Comparable concept. We may also use axioms to describe behaviour that should be common to all models of a concept.</p>
<p>However, the operations specified by the concept are not always the ones that are best suited for the implementation. For example, numbers and matrices may both be addable, but adding two numbers is conveniently done by using a return value, whereas adding a sparse and a dense matrix is probably best achieved by modifying the dense matrix. In both cases, though, we may want to pretend we're using a simple function with a return value, as this most closely matches the notation we know from mathematics. This paper presents two simple concepts to break the notational tie between implementation and use of an operation: functionalisation, which derives a set of canonical pure functions from a procedure; and mutification, which translates calls using the functionalised declarations into calls to the implemented procedure.</p>
Fri, 17 Sep 2010 00:00:00 GMThttp://hdl.handle.net/1956/98052010-09-17T00:00:00ZHybrid visibility compositing and masking for illustrative rendering
http://hdl.handle.net/1956/9802
Hybrid visibility compositing and masking for illustrative rendering
Bruckner, Stefan; Rautek, Peter; Viola, Ivan; Roberts, Mike; Sousa, Mario Costa; Gröller, M. Eduard
Journal article
In this paper, we introduce a novel framework for the compositing of interactively rendered 3D layers tailored to the needs of scientific illustration. Currently, traditional scientific illustrations are produced in a series of composition stages, combining different pictorial elements using 2D digital layering. Our approach extends the layer metaphor into 3D without giving up the advantages of 2D methods. The new compositing approach allows for effects such as selective transparency, occlusion overrides, and soft depth buffering. Furthermore, we show how common manipulation techniques such as masking can be integrated into this concept. These tools behave just like in 2D, but their influence extends beyond a single viewpoint. Since the presented approach makes no assumptions about the underlying rendering algorithms, layers can be generated based on polygonal geometry, volumetric data, point-based representations, or others. Our implementation exploits current graphics hardware and permits real-time interaction and rendering.
Sun, 01 Aug 2010 00:00:00 GMThttp://hdl.handle.net/1956/98022010-08-01T00:00:00ZAxiom-Based Transformations: Optimisation and Testing
http://hdl.handle.net/1956/9801
Axiom-Based Transformations: Optimisation and Testing
Bagge, Anya Helene; Haveraaen, Magne
Journal article
Programmers typically have knowledge about properties of their programs that
aren't explicitly expressed in the code properties that may be very useful for,
e.g., compiler optimisation and automated testing. Although such information is
sometimes written down in a formal or informal specification, it is generally not
accessible to compilers and other tools. However, using the idea of concepts and
axioms in the upcoming C++ standard, we may embed axioms with program code.
In this paper, we sketch how such axioms can be interpreted as rewrite rules and test
oracles. Rewrite rules together with user-defined transformation strategies allow us
to implement program or library-specific optimisations.
Sat, 10 Oct 2009 00:00:00 GMThttp://hdl.handle.net/1956/98012009-10-10T00:00:00ZCurrent Trends for 4D Space-Time Topology for Semantic Flow Segmentation
http://hdl.handle.net/1956/9796
Current Trends for 4D Space-Time Topology for Semantic Flow Segmentation
Matković, Krešimir; Lež, Alan; Hauser, Helwig; Pobitzer, Armin; Theisel, Holger; Kuhn, Alexander; Otto, Mathias; Peikert, Ronald; Schindler, Benjamin; Fuchs, Raphael
Journal article
Sat, 01 Jan 2011 00:00:00 GMThttp://hdl.handle.net/1956/97962011-01-01T00:00:00ZA Diagrammatic Logic for Object-Oriented Visual Modeling
http://hdl.handle.net/1956/9728
A Diagrammatic Logic for Object-Oriented Visual Modeling
Diskin, Zinovy; Wolter, Uwe Egbert
Journal article
Formal generalized sketches is a graph-based specification format that borrows its main ideas from categorical and ordinary first-order logic, and adapts them to software engineering needs. In the engineering jargon, it is a modeling language design pattern that combines mathematical rigor and appealing graphical appearance. The paper presents a careful motivation and justification of the applicability of generalized sketches for formalizing practical modeling notations. We extend the sketch formalism by dependencies between predicate symbols and develop new semantic notions based on the Instances-as-typed-structures idea. We show that this new framework fits in the general patterns of the institution theory and is well amenable to algebraic manipulations. Keywords: Diagrammatic modeling; model management; generic logic; categorical logic; diagram predicate; categorical sketch
Fri, 21 Nov 2008 00:00:00 GMThttp://hdl.handle.net/1956/97282008-11-21T00:00:00ZFusing a Transformation Language with an Open Compiler
http://hdl.handle.net/1956/9726
Fusing a Transformation Language with an Open Compiler
Kalleberg, Karl Trygve; Visser, Eelco
Journal article
<p>Program transformation systems provide powerful analysis and transformation frameworks as well as concise languages for language processing, but instantiating them for every subject language is an arduous task, most often resulting in half-completed frontends. Compilers provide mature frontends with robust parsers and type checkers, but solving language processing problems in general-purpose languages without transformation libraries is tedious. Reusing these frontends with existing transformation systems is therefore attractive. However, for this reuse to be optimal, the functional logic found in the frontend should be exposed to the transformation system – simple data serialization of the abstract syntax tree is not enough, since this fails to expose important compiler functionality, such as import graphs, symbol tables and the type checker.</p>
<p>In this paper, we introduce a novel and general technique for combining term-based transformation systems with existing language frontends. The technique is presented in the context of a scriptable analysis and transformation framework for Java built on top of the Eclipse Java compiler. The framework consists of an adapter automatically extracted from the abstract syntax tree of the compiler and an interpreter for the Stratego program transformation language. The adapter allows the Stratego interpreter to rewrite directly on the compiler AST. We illustrate the applicability of our system with scripts written in Stratego that perform framework and library-specific analyses and transformations.</p>
Tue, 01 Apr 2008 00:00:00 GMThttp://hdl.handle.net/1956/97262008-04-01T00:00:00ZThe Second Rewrite Engines Competition
http://hdl.handle.net/1956/9725
The Second Rewrite Engines Competition
Durán, Francisco; Roldán, Manuel; Balland, Emilie; van den Brand, Mark; Eker, Steven; Kalleberg, Karl Trygve; Kats, Lennart C. L.; Moreau, Pierre-Etienne; Schevchenko, Ruslan; Visser, Eelco
Journal article
The Second Rewrite Engines Competition (REC) was celebrated as part of the 7th Workshop on Rewriting Logic and its Applications (WRLA 2008). In this edition of the competition participated five systems, namely ASF+SDF, Maude, Stratego/XT, TermWare, and Tom. We explain here how the competition was organized and conducted, and present its main results and conclusions.
Mon, 29 Jun 2009 00:00:00 GMThttp://hdl.handle.net/1956/97252009-06-29T00:00:00ZDomain-Specific Languages for Composable Editor Plugins
http://hdl.handle.net/1956/9721
Domain-Specific Languages for Composable Editor Plugins
Kats, Lennart C. L.; Kalleberg, Karl Trygve; Visser, Eelco
Journal article
Modern IDEs increase developer productivity by incorporating many different kinds of editor services. These can be purely syntactic, such as syntax highlighting, code folding, and an outline for navigation; or they can be based on the language semantics, such as in-line type error reporting and resolving identifier declarations. Building all these services from scratch requires both the extensive knowledge of the sometimes complicated and highly interdependent APIs and extension mechanisms of an IDE framework, and an in-depth understanding of the structure and semantics of the targeted language. This paper describes Spoofax/IMP, a meta-tooling suite that provides high-level domain-specific languages for describing editor services, relieving editor developers from much of the framework-specific programming. Editor services are defined as composable modules of rules coupled to a modular SDF grammar. The composability provided by the SGLR parser and the declaratively defined services allows embedded languages and language extensions to be easily formulated as additional rules extending an existing language definition. The service definitions are used to generate Eclipse editor plugins. We discuss two examples: an editor plugin for WebDSL, a domain-specific language for web applications, and the embedding of WebDSL in Stratego, used for expressing the (static) semantic rules of WebDSL.
Fri, 17 Sep 2010 00:00:00 GMThttp://hdl.handle.net/1956/97212010-09-17T00:00:00ZExploring Subexponential Parameterized Complexity of Completion Problems
http://hdl.handle.net/1956/9589
Exploring Subexponential Parameterized Complexity of Completion Problems
Drange, Pål Grønås; Fomin, Fedor; Pilipczuk, Michal Pawel; Villanger, Yngve
Journal article
<p>Let F be a family of graphs. In the F-Completion problem, we are given an n-vertex graph
G and an integer k as input, and asked whether at most k edges can be added to G so that the
resulting graph does not contain a graph from F as an induced subgraph. It appeared recently
that special cases of F-Completion, the problem of completing into a chordal graph known
as Minimum Fill-in, corresponding to the case of F = {C4,C5,C6, . . .}, and the problem of
completing into a split graph, i.e., the case of F = {C4, 2K2,C5}, are solvable in parameterized
subexponential time 2O(√ k log k)nO(1). The exploration of this phenomenon is the main motivation
for our research on F-Completion.</p>
<p>In this paper we prove that completions into several well studied classes of graphs without
long induced cycles also admit parameterized subexponential time algorithms by showing that:</p>
<p>The problem Trivially Perfect Completion is solvable in parameterized subexponential
time 2O(√k log k)nO(1), that is F-Completion for F = {C4, P4}, a cycle and a path on four
vertices.</p>
<p>The problems known in the literature as Pseudosplit Completion, the case where F =
{2K2,C4}, and Threshold Completion, where F = {2K2, P4,C4}, are also solvable in
time 2O(√k log k)nO(1).</p>
<p>We complement our algorithms for F-Completion with the following lower bounds:</p>
<p>For F = {2K2}, F = {C4}, F = {P4}, and F = {2K2, P4}, F-Completion cannot be
solved in time 2o(k)nO(1) unless the Exponential Time Hypothesis (ETH) fails.</p>
<p>Our upper and lower bounds provide a complete picture of the subexponential parameterized
complexity of F-Completion problems for F ⊆ {2K2,C4, P4}.</p>
Presented at the 31st International Symposium on Theoretical Aspects of Computer Science (STACS 2014)
Wed, 19 Feb 2014 00:00:00 GMThttp://hdl.handle.net/1956/95892014-02-19T00:00:00ZOn cutwidth parameterized by vertex cover
http://hdl.handle.net/1956/9561
On cutwidth parameterized by vertex cover
Cygan, Marek; Lokshtanov, Daniel; Pilipczuk, Marcin; Pilipczuk, Michal Pawel; Saurabh, Saket
Journal article
We study the CUTWIDTH problem, where the input is a graph G, and the
objective is find a linear layout of the vertices that minimizes the maximum number of
edges intersected by any vertical line inserted between two consecutive vertices. We
give an algorithm for CUTWIDTH with running time O(2knO(1)). Here k is the size of
a minimum vertex cover of the input graph G, and n is the number of vertices in G.
Our algorithm gives an O(2n/2nO(1)) time algorithm for CUTWIDTH on bipartite
graphs as a corollary. This is the first non-trivial exact exponential time algorithm for
CUTWIDTH on a graph class where the problem remains NP-complete. Additionally,
we show that CUTWIDTH parameterized by the size of the minimum vertex cover of
the input graph does not admit a polynomial kernel unless NP ⊆ coNP/poly. Our kernelization lower bound contrasts with the recent results of Bodlaender et al. (ICALP, Springer, Berlin, 2011; SWAT, Springer, Berlin, 2012) that both Treewidth and Pathwidth parameterized by vertex cover do admit polynomial kernels.
Tue, 01 Apr 2014 00:00:00 GMThttp://hdl.handle.net/1956/95612014-04-01T00:00:00ZParameterized complexity of Eulerian deletion problems
http://hdl.handle.net/1956/9520
Parameterized complexity of Eulerian deletion problems
Cygan, Marek; Pilipczuk, Marcin; Marx, Dániel; Pilipczuk, Michal Pawel; Schlotter, Ildikó
Journal article
We study a family of problems where the goal is to make a graph Eulerian,
i.e., connected and with all the vertices having even degrees, by a minimum
number of deletions. We completely classify the parameterized complexity of various
versions: undirected or directed graphs, vertex or edge deletions, with or without
the requirement of connectivity, etc. The collection of results shows an interesting
contrast: while the node-deletion variants remain intractable, i.e., W[1]-hard for
all the studied cases, edge-deletion problems are either fixed-parameter tractable or
polynomial-time solvable. Of particular interest is a randomized FPT algorithm for
making an undirected graph Eulerian by deleting the minimum number of edges,
based on a novel application of the color coding technique. For versions that remain
NP-complete but fixed-parameter tractable we consider also possibilities of polynomial kernelization; unfortunately, we prove that this is not possible unless
NP ⊆ coNP/poly.
Wed, 01 Jan 2014 00:00:00 GMThttp://hdl.handle.net/1956/95202014-01-01T00:00:00ZSolving the 2-disjoint connected subgraphs problem faster than 2ⁿ
http://hdl.handle.net/1956/9519
Solving the 2-disjoint connected subgraphs problem faster than 2ⁿ
Cygan, Marek; Pilipczuk, Marcin; Pilipczuk, Michal Pawel; Wojtaszczyk, Jakub Onufry
Journal article
The 2-DISJOINT CONNECTED SUBGRAPHS problem, given a graph along
with two disjoint sets of terminals Z1,Z2, asks whether it is possible to find disjoint
sets A1,A2, such that Z1 ⊆ A1, Z2 ⊆ A2 and A1,A2 induce connected subgraphs.
While the naive algorithm runs in O(2nnO(1)) time, solutions with complexity
of form O((2 − ε)n) have been found only for special graph classes (van ’t Hof
et al. in Theor. Comput. Sci. 410(47–49):4834–4843, 2009; Paulusma and van Rooij
in Theor. Comput. Sci. 412(48):6761–6769, 2011). In this paper we present an
O(1.933n) algorithm for 2-DISJOINT CONNECTED SUBGRAPHS in general case,
thus breaking the 2n barrier. As a counterpoise of this result we show that if we parameterize
the problem by the number of non-terminal vertices, it is hard both to
speed up the brute-force approach and to find a polynomial kernel.
Wed, 01 Oct 2014 00:00:00 GMThttp://hdl.handle.net/1956/95192014-10-01T00:00:00ZManaging spatial selections with contextual snapshots
http://hdl.handle.net/1956/9464
Managing spatial selections with contextual snapshots
Mindek, Peter; Gröller, Eduard; Bruckner, Stefan
Journal article
Spatial selections are a ubiquitous concept in visualization. By localizing particular features, they can be analysed and compared
in different views. However, the semantics of such selections often depend on specific parameter settings and it can be difficult
to reconstruct them without additional information. In this paper, we present the concept of contextual snapshots as an effective
means for managing spatial selections in visualized data. The selections are automatically associated with the context in which
they have been created. Contextual snapshots can also be used as the basis for interactive integrated and linked views, which
enable in-place investigation and comparison of multiple visual representations of data. Our approach is implemented as a
flexible toolkit with well-defined interfaces for integration into existing systems. We demonstrate the power and generality of our
techniques by applying them to several distinct scenarios such as the visualization of simulation data, the analysis of historical
documents and the display of anatomical data.
Mon, 01 Dec 2014 00:00:00 GMThttp://hdl.handle.net/1956/94642014-12-01T00:00:00ZScaling the scales - A suggested improvement to IBM's Intelligent Recommendation Algorithm
http://hdl.handle.net/1956/9211
Scaling the scales - A suggested improvement to IBM's Intelligent Recommendation Algorithm
Myrtveit, Magnar
Master thesis
Recommender systems appear in a large variety of applications, and their use has become very common in recent years. As a lot of money can be made by companies having a better recommender system than their competitors, much of the research behind the best recommendation algorithms is proprietary and has not been published. We suggest an improvement to a graph-based collaborative filtering recommendation algorithm developed at IBM Research and published in "KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery data mining" in 1999: the Intelligent Recommendation Algorithm (IRA). We start by giving an overview of the field of recommender systems, how they work, and how they can be utilized. Next we give a detailed description of the graph-theoretical ideas IRA is built upon, and take an in-depth look at how the algorithm works. We then present a suggested improvement to IRA, called Scaling the scales. In Scaling the scales, customers using differently sized ranges of the rating scale can still be used to predict each other. We give a short overview of the design behind the implementation of our recommender system, which is available at GitHub (https://github.com/Stadly/Scaling-the-scales), as well as BORA (https://bora-uib-no.pva.uib.no), where this thesis is published. The implementation uses a modular software design to allow for developing extensions to the recommendation algorithms we have implemented, as well as deploying other recommendation algorithms in our system. Next we compare the recommendation quality of IRA and Scaling the scales using leave-one-out cross-validation. This method works by hiding a known rating, predicting what this rating should be, and comparing the correct and predicted ratings. Experiments on four different datasets were run. Our results indicate that Scaling the scales give slightly smaller errors than IRA when predicting ratings. A bigger improvement is that Scaling the scales calculates predictions in many cases where IRA is unsuccessful. Lastly, ideas for further improvement of Scaling the scales are presented.
Thu, 20 Nov 2014 00:00:00 GMThttp://hdl.handle.net/1956/92112014-11-20T00:00:00ZThe index tracking problem with a limit on portfolio size
http://hdl.handle.net/1956/9185
The index tracking problem with a limit on portfolio size
Mutunge, Purity Kamene
Master thesis
For a passive fund manager tracking a benchmark, it is not uncommon to select some, and not all the assets in the index to his portfolio. In this thesis, we consider the problem of minimizing the tracking error under the mean--variance formulation which gives us a quadratic objective function. Our model includes a cardinality constraint, that puts a limit on the portfolio size. Our problem is a mixed integer nonlinear problem with a convex, quadratic objective function. For this NP-Hard problem, we apply continuous as well as Lagrangian relaxations. We illustrate a subgradient algorithm, modified to our problem. We also present two construction and three improvement heuristics to this problem. Our approaches are compared to the results of an exact and an interrupted solver and computational time is of interest. Our data sets range from 50-400 (500), with real constituent weights from S&P Dow Jones Indices for the largest set of index.
Wed, 19 Nov 2014 00:00:00 GMThttp://hdl.handle.net/1956/91852014-11-19T00:00:00ZClassifying and Measuring Student Problems and Misconceptions
http://hdl.handle.net/1956/9081
Classifying and Measuring Student Problems and Misconceptions
Rosbach, Alexander Hoem; Bagge, Anya Helene
Chapter
In this paper we report on an attempt to classify student problems and
mistakes, and measuring the frequency of particular problems in a firstsemester
programming course. We also propose a scheme for annotating
student hand-ins which may be useful both in grading and in future research.
Tue, 01 Jan 2013 00:00:00 GMThttp://hdl.handle.net/1956/90812013-01-01T00:00:00ZRate and power allocation for discrete-rate link adaptation
http://hdl.handle.net/1956/8985
Rate and power allocation for discrete-rate link adaptation
Gjendemsjø, Anders; Øien, Geir Egil; Holm, Henrik; Alouini, Mohamed-Slim; Gesbert, David; Hole, Kjell Jørgen; Orten, Pål
Journal article
Link adaptation, in particular adaptive coded modulation (ACM), is a promising tool for bandwidth-efficient transmission in a fading environment. The main motivation behind employing ACM schemes is to improve the spectral efficiency of wireless communication systems. In this paper, using a finite number of capacity achieving component codes, we propose new transmission schemes employing constant power transmission, as well as discrete- and continuous-power adaptation, for slowly varying flat-fading channels. We show that the proposed transmission schemes can achieve throughputs close to the Shannon limits of flat-fading channels using only a small number of codes. Specifically, using a fully discrete scheme with just four codes, each associated with four power levels, we achieve a spectral efficiency within 1 dB of the continuous-rate continuous-power Shannon capacity. Furthermore, when restricted to a fixed number of codes, the introduction of power adaptation has significant gains with respect to average spectral efficiency and probability of no transmission compared to a constant power scheme.
Wed, 09 Jan 2008 00:00:00 GMThttp://hdl.handle.net/1956/89852008-01-09T00:00:00ZA Hierarchical Splitting Scheme to Reveal Insight into Highly Self-Occluded Integral Surfaces
http://hdl.handle.net/1956/8965
A Hierarchical Splitting Scheme to Reveal Insight into Highly Self-Occluded Integral Surfaces
Brambilla, Andrea; Viola, Ivan; Hauser, Helwig
Journal article
In flow visualization, integral surfaces are of particular interest for
their ability to describe trajectories of massless particles. In areas
of swirling motion, integral surfaces can become very complex and
difficult to understand. Taking inspiration from traditional illustration
techniques, such as cut-aways and exploded views, we propose
a surface analysis tool based on surface splitting and focus+context
visualization. Our surface splitting scheme is hierarchical and at every
level of the hierarchy the best cut is chosen according to a surface
complexity metric. In order to make the interpretation of the resulting
pieces straightforward, cuts are always made along isocurves of
specific flow attributes. Moreover, a degree of interest can be specified,
so that the splitting procedure attempts to unveil the occluded
interesting areas. Through practical examples, we show that our approach
is able to overcome the lack of understanding originating from
structural occlusion.
Sun, 01 Jan 2012 00:00:00 GMThttp://hdl.handle.net/1956/89652012-01-01T00:00:00ZIntegrated Multi-aspect Visualization of 3D Fluid Flows
http://hdl.handle.net/1956/8963
Integrated Multi-aspect Visualization of 3D Fluid Flows
Brambilla, Andrea; Andreassen, Øyvind; Hauser, Helwig
Chapter
The motion of a fluid is affected by several intertwined flow aspects.
Analyzing one aspect at a time can only yield partial
information about the flow behavior. More details can be revealed by
studying their interactions. Our approach enables the investigation
of these interactions by simultaneously visualizing meaningful flow
aspects, such as swirling motion and shear strain. We adopt the notions
of relevance and coherency. Relevance identifies locations where
a certain flow aspect is deemed particularly important. The related
piece of information is visualized by a specific visual entity, placed at
the corresponding location. Coherency instead represents the homogeneity
of a flow property in a local neighborhood. It is exploited in
order to avoid visual redundancy and to reduce occlusion and cluttering.
We have applied our approach to three CFD datasets, obtaining
meaningful insights.
The definitive version is available at http://diglib.eg.org/
Tue, 01 Jan 2013 00:00:00 GMThttp://hdl.handle.net/1956/89632013-01-01T00:00:00ZIllustrative Flow Visualization: State of the Art, Trends and Challenges
http://hdl.handle.net/1956/8962
Illustrative Flow Visualization: State of the Art, Trends and Challenges
Brambilla, Andrea; Carnecky, Robert; Peikert, Robert; Viola, Ivan; Hauser, Helwig
Journal article
Flow visualization is a well established branch of scientific visualization
and it currently represents an invaluable resource to
many fields, like automotive design, meteorology and medical imaging.
Thanks to the capabilities of modern hardware, flow datasets
are increasing in size and complexity, and traditional flow visualization
techniques need to be updated and improved in order to deal
with the upcoming challenges. A fairly recent trend to enhance the
expressiveness of scientific visualization is to produce depictions of
physical phenomena taking inspiration from traditional handcrafted
illustrations: this approach is known as illustrative visualization, and
it is getting a foothold in flow visualization as well.
In this state of the art report we give an overview of the existing
illustrative techniques for flow visualization, we highlight which
problems have been solved and which issues still need further investigation,
and, finally, we provide remarks and insights on the current
trends in illustrative flow visualization.
The definitive version is available at http://diglib.eg.org/
Sun, 01 Jan 2012 00:00:00 GMThttp://hdl.handle.net/1956/89622012-01-01T00:00:00ZVisibility-oriented Visualization Design for Flow Illustration
http://hdl.handle.net/1956/8961
Visibility-oriented Visualization Design for Flow Illustration
Brambilla, Andrea
Doctoral thesis
<p>Flow phenomena are ubiquitous in our world and they affect many aspects of our daily life. For this reason, they are the subject of extensive studies in several research fields. In medicine, the blood flow through our vessels can reveal important information about cardiovascular diseases. The air flow around a vehicle and the motion of fluids in a combustion engine are examples of relevant flow phenomena in engineering disciplines. Meteorologists, climatologists and oceanographers are instead concerned with winds and water currents.</p>
<p>Thanks to the recent advancements in computational fluid dynamics and to the increasing power of modern hardware, accurate simulations of flow phenomena are feasible nowadays. The evolution of multiple flow attributes, such as velocity, temperature and pressure, can be simulated over large spatial and temporal domains (4D). The amount of data generated by this process is massive, therefore visualization techniques are often adopted in order to ease the analysis phase. The overall goal is to convey information about the phenomena of interest through a suitable representation of the data at hand. Due to the multivariate and multidimensional nature of the data, visibility issues (such as cluttering and occlusion), represent a significant challenge.</p>
<p>Flow visualization can greatly benefit from studying and addressing visibility issues already in the design phase. In this thesis we investigate and demonstrate the effectiveness of taking visibility management into account early in the design process. We apply this principle to three characteristic flow visualization scenarios: (1) The simultaneous visualization of multiple flow attributes. (2) The visual inspection of single and multiple integral surfaces. (3) The selection of seeding curves for constructing families of integral surfaces. Our techniques result in clutter- and occlusion-free visualizations, which effectively illustrate the key aspects of the flow behavior.</p>
<p>For demonstration purposes, we have applied our approaches to a number of application cases. Additionally, we have discussed our visualization designs with domain experts. They showed a genuine interest in our work and provided insightful suggestions for future research directions.</p>
Thu, 18 Dec 2014 00:00:00 GMThttp://hdl.handle.net/1956/89612014-12-18T00:00:00ZParsing in a Broad Sense
http://hdl.handle.net/1956/8938
Parsing in a Broad Sense
Zaytsev, Vadim; Bagge, Anya Helene
Chapter
Having multiple representations of the same instance is common in software language engineering: models can be visualised as graphs, edited as text, serialised as XML. When mappings between such representations are considered, terms “parsing” and “unparsing” are often used with incompatible meanings and varying sets of underlying assumptions. We investigate 12 classes of artefacts found in software language processing, present a case study demonstrating their implementations and state-of-the-art mappings among them, and systematically explore the technical research space of bidirectional mappings to build on top of the existing body of work and discover as of yet unused relationships.
Wed, 01 Jan 2014 00:00:00 GMThttp://hdl.handle.net/1956/89382014-01-01T00:00:00ZReflections on Courses for Software Language Engineering
http://hdl.handle.net/1956/8930
Reflections on Courses for Software Language Engineering
Bagge, Anya Helene; Lämmel, Ralf; Zaytsev, Vadim
Conference object
Software Language Engineering (SLE) has emerged as a field in computer science research and software engineering, but it has yet to become entrenched as part of the standard curriculum at universities. Many places have a compiler construction (CC) course and a programming languages (PL) course, but these are not aimed at training students in typical SLE matters such as DSL design and implementation, language workbenches, generalised parsers, and meta-tools. We describe our experiences with developing and teaching software language engineering courses at the Universities of Bergen and Koblenz-Landau. We reflect on lecture topics, assignments, development of course material, and other aspects and variation points in course design.
Wed, 01 Jan 2014 00:00:00 GMThttp://hdl.handle.net/1956/89302014-01-01T00:00:00ZSeparating Exceptional Concerns
http://hdl.handle.net/1956/8918
Separating Exceptional Concerns
Bagge, Anya Helene
Chapter
Traditional error handling mechanisms, including
exceptions, have several weaknesses that interfere with maintainability,
flexibility and genericity in software: Error code is
tangled with normal code; reporting is tangled with handling;
and generic code is locked into specific ways of reporting and
handling errors. We need to deal with errors in a declarative
way, where the concerns of errors, error reporting and error
handling are separated and dealt with individually by the
programmer.
Sun, 01 Jan 2012 00:00:00 GMThttp://hdl.handle.net/1956/89182012-01-01T00:00:00ZInferring Required Permissions for Statically Composed Programs
http://hdl.handle.net/1956/8916
Inferring Required Permissions for Statically Composed Programs
Hasu, Tero; Bagge, Anya Helene; Haveraaen, Magne
Chapter
Permission-based security models are common in smartphone
operating systems. Such models implement access control for sensitive
APIs, introducing an additional concern for application developers. It is
important for the correct set of permissions to be declared for an application,
as too small a set is likely to result in runtime errors, whereas
too large a set may needlessly worry users. Unfortunately, not all platform
vendors provide tools support to assist in determining the set of
permissions that an application requires.
We present a language-based solution for permission management. It
entails the specification of permission information within a collection of
source code, and allows for the inference of permission requirements for a
chosen program composition. Our implementation is based on Magnolia,
a programming language demonstrating characteristics that are favorable
for this use case. A language with a suitable component system supports
permission management also in a cross-platform codebase, allowing abstraction
over different platform-specific implementations and concrete
permission requirements. When the language also requires any “wiring”
of components to be known at compile time, and otherwise makes design
tradeoffs that favor ease of static analysis, then accurate inference
of permission requirements becomes possible.
Tue, 01 Jan 2013 00:00:00 GMThttp://hdl.handle.net/1956/89162013-01-01T00:00:00ZA Pretty Good Formatting Pipeline
http://hdl.handle.net/1956/8915
A Pretty Good Formatting Pipeline
Bagge, Anya Helene; Hasu, Tero
Chapter
Proper formatting makes the structure of a program apparent
and aids program comprehension. The need to format code arises in
code generation and transformation, as well as in normal reading and
editing situations. Commonly used pretty-printing tools in transformation
frameworks provide an easy way to produce indented code that is
fairly readable for humans, without reaching the level of purpose-built
reformatting tools, such as those built into IDEs. This paper presents a
library of pluggable components, built to support style-based formatting
and reformatting of code, and to enable further experimentation with
code formatting.
Tue, 01 Jan 2013 00:00:00 GMThttp://hdl.handle.net/1956/89152013-01-01T00:00:00ZWalk Your Tree Any Way You Want
http://hdl.handle.net/1956/8914
Walk Your Tree Any Way You Want
Bagge, Anya Helene; Lämmel, Ralf
Chapter
Software transformations in the Nuthatch style are described
as walks over trees (possibly graphs) that proceed in programmerdefined
steps which may observe join points of the walk, may observe
and affect state associated with the walk, may rewrite the walked tree,
may contribute to a built tree, and must walk somewhere, typically along
one branch or another. The approach blends well with OO programming.
We have implemented the approach in the Nuthatch/J library for Java.
Tue, 01 Jan 2013 00:00:00 GMThttp://hdl.handle.net/1956/89142013-01-01T00:00:00ZCommunity-driven development for computational biology at Sprints, Hackathons and Codefests
http://hdl.handle.net/1956/8861
Community-driven development for computational biology at Sprints, Hackathons and Codefests
Möller, Steffen; Afgan, Enis; Banck, Michael; Bonnal, Raoul J. P.; Booth, Timothy; Chilton, John; Cock, Peter J. A.; Gumbel, Markus; Harris, Nomi; Holland, Richard; Kalaš, Matúš; Kaján, László; Kibukawa, Eri; Powel, David R.; Prins, Pjotr; Quinn, Jacqueline; Sallou, Olivier; Strozzi, Francesco; Seemann, Torsten; Sloggett, Clare; Soiland-Reyes, Stian; Spooner, William; Steinbiss, Sascha; Tille, Andreas; Travis, Anthony J.; Guimera, Roman V.; Katayama, Toshiaki; Chapman, Brad A.
Journal article
<p>Background: Computational biology comprises a wide range of technologies and approaches. Multiple technologies can be combined to create more powerful workflows if the individuals contributing the data or providing tools for its interpretation can find mutual understanding and consensus. Much conversation and joint investigation are required in order to identify and implement the best approaches. Traditionally, scientific conferences feature talks presenting novel technologies or insights, followed up by informal discussions during coffee breaks. In multi-institution collaborations, in order to reach agreement on implementation details or to transfer deeper insights in a technology and practical skills, a representative of one group typically visits the other. However, this does not scale well when the number of technologies or research groups is large. Conferences have responded to this issue by introducing Birds-of-a-Feather (BoF) sessions, which offer an opportunity for individuals with common interests to intensify their interaction. However, parallel BoF sessions often make it hard for participants to join multiple BoFs and find common ground between the different technologies, and BoFs are generally too short to allow time for participants to program together.</p>
<p>Results: This report summarises our experience with computational biology Codefests, Hackathons and Sprints, which are interactive developer meetings. They are structured to reduce the limitations of traditional scientific meetings described above by strengthening the interaction among peers and letting the participants determine the schedule and topics. These meetings are commonly run as loosely scheduled “unconferences” (self-organized identification of participants and topics for meetings) over at least two days, with early introductory talks to welcome and organize contributors, followed by intensive collaborative coding sessions. We summarise some prominent achievements of those meetings and describe differences in how these are organised, how their audience is addressed, and their outreach to their respective communities.</p>
<p>Conclusions: Hackathons, Codefests and Sprints share a stimulating atmosphere that encourages participants to jointly brainstorm and tackle problems of shared interest in a self-driven proactive environment, as well as providing an opportunity for new participants to get involved in collaborative projects.</p>
Thu, 27 Nov 2014 00:00:00 GMThttp://hdl.handle.net/1956/88612014-11-27T00:00:00ZContinuous Levels-of-Detail and Visual Abstraction for Seamless Molecular Visualization
http://hdl.handle.net/1956/8875
Continuous Levels-of-Detail and Visual Abstraction for Seamless Molecular Visualization
Parulek, Julius; Jönsson, Daniel; Ropinski, Timo; Bruckner, Stefan; Ynnerman, Anders; Viola, Ivan
Journal article
Molecular visualization is often challenged with rendering of large molecular structures in real time. We introduce a novel approach that enables us to show even large protein complexes. Our method is based on the level-of-detail concept, where we exploit three different abstractions combined in one visualization. Firstly, molecular surface abstraction exploits three different surfaces, solvent-excluded surface (SES), Gaussian kernels and van der Waals spheres, combined as one surface by linear interpolation. Secondly, we introduce three shading abstraction levels and a method for creating seamless transitions between these representations. The SES representation with full shading and added contours stands in focus while on the other side a sphere representation of a cluster of atoms with constant shading and without contours provide the context. Thirdly, we propose a hierarchical abstraction based on a set of clusters formed on molecular atoms. All three abstraction models are driven by one importance function classifying the scene into the near-, mid- and far-field. Moreover, we introduce a methodology to render the entire molecule directly using the A-buffer technique, which further improves the performance. The rendering performance is evaluated on series of molecules of varying atom counts.
Tue, 06 May 2014 00:00:00 GMThttp://hdl.handle.net/1956/88752014-05-06T00:00:00ZPerceptually Uniform Motion Space
http://hdl.handle.net/1956/8852
Perceptually Uniform Motion Space
Birkeland, Åsmund; Turkay, Cagatay; Viola, Ivan
Journal article
Flow data is often visualized by animated particles inserted into a flow field. The velocity of a particle on the screen is typically linearly scaled by the velocities in the data. However, the perception of velocity magnitude in animated particles is not necessarily linear. We present a study on how different parameters affect relative motion perception. We have investigated the impact of four parameters. The parameters consist of speed multiplier, direction, contrast type and the global velocity scale. In addition, we investigated if multiple motion cues, and point distribution, affect the speed estimation. Several studies were executed to investigate the impact of each parameter. In the initial results, we noticed trends in scale and multiplier. Using the trends for the significant parameters, we designed a compensation model, which adjusts the particle speed to compensate for the effect of the parameters. We then performed a second study to investigate the performance of the compensation model. From the second study we detected a constant estimation error, which we adjusted for in the last study. In addition, we connect our work to established theories in psychophysics by comparing our model to a model based on Stevens' Power Law.
Wed, 07 May 2014 00:00:00 GMThttp://hdl.handle.net/1956/88522014-05-07T00:00:00ZTMM@: a web application for the analysis of transmembrane helix mobility
http://hdl.handle.net/1956/8844
TMM@: a web application for the analysis of transmembrane helix mobility
Skjærven, Lars; Jonassen, Inge; Reuter, Nathalie
Journal article
<p>Background: To understand the mechanism by which a protein transmits a signal through the cell
membrane, an understanding of the flexibility of its transmembrane (TM) region is essential.
Normal Mode Analysis (NMA) has become the method of choice to investigate the slowest
motions in macromolecular systems. It has been widely used to study transmembrane channels and
pumps. It relies on the hypothesis that the vibrational normal modes having the lowest frequencies
(also named soft modes) describe the largest movements in a protein and are the ones that are
functionally relevant. In particular NMA can be used to study dynamics of TM regions, but no tool
making this approach available for non-experts, has been available so far.</p><p>Results: We developed the web-application TMM@ (TransMembrane α-helical Mobility analyzer).
It uses NMA to characterize the propensity of transmembrane α-helices to be displaced. Starting
from a structure file at the PDB format, the server computes the normal modes of the protein and
identifies which helices in the bundle are the most mobile. Each analysis is performed independently
from the others and results can be visualized using only a web browser. No additional plug-in or
software is required. For users who would like to further analyze the output data with their
favourite software, raw results can also be downloaded.</p><p>Conclusion: We built a novel and unique tool, TMM@, to study the mobility of transmembrane
α-helices. The tool can be applied to for example membrane transporters and provides biologists
studying transmembrane proteins with an approach to investigate which α-helices are likely to
undergo the largest displacements, and hence which helices are most likely to be involved in the
transportation of molecules in and out of the cell.</p>
Mon, 02 Jul 2007 00:00:00 GMThttp://hdl.handle.net/1956/88442007-07-02T00:00:00ZA simple spreadsheet-based, MIAME-supportive format for microarray data: MAGE-TAB
http://hdl.handle.net/1956/8843
A simple spreadsheet-based, MIAME-supportive format for microarray data: MAGE-TAB
Rayner, Tim F.; Rocca-Serra, Philippe; Spellman, Paul T.; Causton, Helen C.; Farne, Anna; Holloway, Ele; Irizarry, Rafael A.; Liu, Junmin; Maier, Donald S.; Miller, Michael; Petersen, Kjell; Quackenbush, John; Sherlock, Gavin; Stoeckert, Christian J. Jr; White, Joseph; Whetzel, Patricia L.; Wymore, Farrell; Parkinson, Helen; Sarkans, Ugis; Ball, Catherine A.; Brazma, Alvis
Journal article
<p>Background: Sharing of microarray data within the research community has been greatly
facilitated by the development of the disclosure and communication standards MIAME and MAGEML
by the MGED Society. However, the complexity of the MAGE-ML format has made its use
impractical for laboratories lacking dedicated bioinformatics support.</p><p>Results: We propose a simple tab-delimited, spreadsheet-based format, MAGE-TAB, which will
become a part of the MAGE microarray data standard and can be used for annotating and
communicating microarray data in a MIAME compliant fashion.</p><p>Conclusion: MAGE-TAB will enable laboratories without bioinformatics experience or support
to manage, exchange and submit well-annotated microarray data in a standard format using a
spreadsheet. The MAGE-TAB format is self-contained, and does not require an understanding of
MAGE-ML or XML.</p>
Mon, 06 Nov 2006 00:00:00 GMThttp://hdl.handle.net/1956/88432006-11-06T00:00:00ZInteractively illustrating polymerization using three-level model fusion
http://hdl.handle.net/1956/8761
Interactively illustrating polymerization using three-level model fusion
Kolesar, Ivan; Parulek, Julius; Viola, Ivan; Bruckner, Stefan; Stavrum, Anne-Kristin; Hauser, Helwig
Journal article
<p>Background: Research in cell biology is steadily contributing new knowledge about many aspects of physiological processes, both with respect to the involved molecular structures as well as their related function. Illustrations of the spatio-temporal development of such processes are not only used in biomedical education, but also can serve scientists as an additional platform for in-silico experiments.</p>
<p>Results: In this paper, we contribute a new, three-level modeling approach to illustrate physiological processes from the class of polymerization at different time scales. We integrate physical and empirical modeling, according to which approach best suits the different involved levels of detail, and we additionally enable a form of interactive steering, while the process is illustrated. We demonstrate the suitability of our approach in the context of several polymerization processes and report from a first evaluation with domain experts.</p>
<p>Conclusion: We conclude that our approach provides a new, hybrid modeling approach for illustrating the process of emergence in physiology, embedded in a densely filled environment. Our approach of a complementary fusion of three systems combines the strong points from the different modeling approaches and is capable to bridge different spatial and temporal scales.</p>
Tue, 14 Oct 2014 00:00:00 GMThttp://hdl.handle.net/1956/87612014-10-14T00:00:00ZIdentifying elemental genomic track types and representing them uniformly
http://hdl.handle.net/1956/8724
Identifying elemental genomic track types and representing them uniformly
Gundersen, Sveinung; Kalaš, Matúš; Abul, Osman; Frigessi, Arnoldo; Hovig, Eivind; Sandve, Geir Kjetil
Journal article
<p>Background: With the recent advances and availability of various high-throughput sequencing technologies, data on many molecular aspects, such as gene regulation, chromatin dynamics, and the three-dimensional organization of DNA, are rapidly being generated in an increasing number of laboratories. The variation in biological context, and the increasingly dispersed mode of data generation, imply a need for precise, interoperable and flexible representations of genomic features through formats that are easy to parse. A host of alternative formats are currently available and in use, complicating analysis and tool development. The issue of whether and how the multitude of formats reflects varying underlying characteristics of data has to our knowledge not previously been systematically treated.</p>
<p>Results: We here identify intrinsic distinctions between genomic features, and argue that the distinctions imply that a certain variation in the representation of features as genomic tracks is warranted. Four core informational properties of tracks are discussed: gaps, lengths, values and interconnections. From this we delineate fifteen generic track types. Based on the track type distinctions, we characterize major existing representational formats and find that the track types are not adequately supported by any single format. We also find, in contrast to the XML formats, that none of the existing tabular formats are conveniently extendable to support all track types. We thus propose two unified formats for track data, an improved XML format, BioXSD 1.1, and a new tabular format, GTrack 1.0.</p>
<p>Conclusions: The defined track types are shown to capture relevant distinctions between genomic annotation tracks, resulting in varying representational needs and analysis possibilities. The proposed formats, GTrack 1.0 and BioXSD 1.1, cater to the identified track distinctions and emphasize preciseness, flexibility and parsing convenience.</p>
Fri, 30 Dec 2011 00:00:00 GMThttp://hdl.handle.net/1956/87242011-12-30T00:00:00ZThe male germ cell gene regulator CTCFL is functionally different from CTCF and binds CTCF-like consensus sites in a nucleosome composition-dependent manner
http://hdl.handle.net/1956/8717
The male germ cell gene regulator CTCFL is functionally different from CTCF and binds CTCF-like consensus sites in a nucleosome composition-dependent manner
Sleutels, Frank; Soochit, Widia; Bartkuhn, Marek; Heath, Helen; Dienstbach, Sven; Bergmaier, Philipp; Franke, Vedran; Rosa-Garrido, Manuel; van de Nobelen, Suzanne; Caesar, Lisa; van der Reijden, Michael I.J.A.; Bryne, Jan Christian; van Ijcken, Wilfred F.J.; Grootegoed, J. Anton; Delgado, M. Dolores; Lenhard, Boris; Renkawitz, Rainer; Grosveld, Frank; Galjart, Niels
Journal article
<p>Background: CTCF is a highly conserved and essential zinc finger protein expressed in virtually all cell types. In conjunction with cohesin, it organizes chromatin into loops, thereby regulating gene expression and epigenetic events. The function of CTCFL or BORIS, the testis-specific paralog of CTCF, is less clear.</p>
<p>Results: Using immunohistochemistry on testis sections and fluorescence-based microscopy on intact live seminiferous tubules, we show that CTCFL is only transiently present during spermatogenesis, prior to the onset of meiosis, when the protein co-localizes in nuclei with ubiquitously expressed CTCF. CTCFL distribution overlaps completely with that of Stra8, a retinoic acid-inducible protein essential for the propagation of meiosis. We find that absence of CTCFL in mice causes sub-fertility because of a partially penetrant testicular atrophy. CTCFL deficiency affects the expression of a number of testis-specific genes, including Gal3st1 and Prss50. Combined, these data indicate that CTCFL has a unique role in spermatogenesis. Genome-wide RNA expression studies in ES cells expressing a V5- and GFP-tagged form of CTCFL show that genes that are downregulated in CTCFL-deficient testis are upregulated in ES cells. These data indicate that CTCFL is a male germ cell gene regulator. Furthermore, genome-wide DNA-binding analysis shows that CTCFL binds a consensus sequence that is very similar to that of CTCF. However, only ~3,700 out of the ~5,700 CTCFL- and ~31,000 CTCF-binding sites overlap. CTCFL binds promoters with loosely assembled nucleosomes, whereas CTCF favors consensus sites surrounded by phased nucleosomes. Finally, an ES cell-based rescue assay shows that CTCFL is functionally different from CTCF.</p>
<p>Conclusions: Our data suggest that nucleosome composition specifies the genome-wide binding of CTCFL and CTCF. We propose that the transient expression of CTCFL in spermatogonia and preleptotene spermatocytes serves to occupy a subset of promoters and maintain the expression of male germ cell genes.</p>
Mon, 18 Jun 2012 00:00:00 GMThttp://hdl.handle.net/1956/87172012-06-18T00:00:00ZHighlights from the Eighth International Society for Computational Biology (ISCB) Student Council Symposium 2012
http://hdl.handle.net/1956/8636
Highlights from the Eighth International Society for Computational Biology (ISCB) Student Council Symposium 2012
Goncearenco, Alexander; Grynberg, Priscila; Botvinnik, Olga B.; Macintyre, Geoff; Abeel, Thomas
Journal article
Abstract
The report summarizes the scientific content of the annual symposium organized by the Student Council of the International Society for Computational Biology (ISCB) held in conjunction with the Intelligent Systems for Molecular Biology (ISMB) conference in Long Beach, California on July 13, 2012.
Fri, 14 Dec 2012 00:00:00 GMThttp://hdl.handle.net/1956/86362012-12-14T00:00:00ZFreeContact: fast and free software for protein contact prediction from residue co-evolution
http://hdl.handle.net/1956/8604
FreeContact: fast and free software for protein contact prediction from residue co-evolution
Kaján, László; Hopf, Thomas A.; Kalaš, Matúš; Marks, Debora S.; Rost, Burkhard
Journal article
<p>Background: 20 years of improved technology and growing sequences now renders residue-residue contact constraints in large protein families through correlated mutations accurate enough to drive de novo predictions of protein three-dimensional structure. The method EVfold broke new ground using mean-field Direct Coupling Analysis (EVfold-mfDCA); the method PSICOV applied a related concept by estimating a sparse inverse covariance matrix. Both methods (EVfold-mfDCA and PSICOV) are publicly available, but both require too much CPU time for interactive applications. On top, EVfold-mfDCA depends on proprietary software.</p>
<p>Results: Here, we present FreeContact, a fast, open source implementation of EVfold-mfDCA and PSICOV. On a test set of 140 proteins, FreeContact was almost eight times faster than PSICOV without decreasing prediction performance. The EVfold-mfDCA implementation of FreeContact was over 220 times faster than PSICOV with negligible performance decrease. EVfold-mfDCA was unavailable for testing due to its dependency on proprietary software. FreeContact is implemented as the free C++ library “libfreecontact”, complete with command line tool “freecontact”, as well as Perl and Python modules. All components are available as Debian packages. FreeContact supports the BioXSD format for interoperability.</p>
<p>Conclusions: FreeContact provides the opportunity to compute reliable contact predictions in any environment (desktop or cloud).</p>
Wed, 26 Mar 2014 00:00:00 GMThttp://hdl.handle.net/1956/86042014-03-26T00:00:00ZSketch-based Modelling and Conceptual Visualization of Geomorphological Processes for Interactive Scientific Communication
http://hdl.handle.net/1956/8570
Sketch-based Modelling and Conceptual Visualization of Geomorphological Processes for Interactive Scientific Communication
Natali, Mattia
Doctoral thesis
<p>Throughout this dissertation, solutions for rapid digitalization of ideas will be defined. More precisely, the focus is on interactive scientific sketching and communication of geology, where the result is a digital illustrative 3D model. Results are achieved through a sketch-based modelling approach which gives the user a more natural and intuitive modelling process, hence leading to a quicker definition of a geological illustration.</p>
<p>To be able to quickly externalize and communicate ones ideas as a digital 3D model, can be of importance. For instance, students may profit from explanations supported by interactive illustrations. Exchange of information and hypotheses between domain experts is also a targeted situation in our work. Furthermore, illustrative models are frequently employed in business, when decisional meetings take place for convincing the management that a project is worth to be funded.</p>
<p>An advantage of digital models is that they can be saved and they are easy to distribute. In contrast to 2D images or paper sketches, one can interact with digital 3D models, and they can be transferred on portable devices for easy access (for instance during geological field studies). Another advantage, compared to standard geological illustrations, is that if a model has been created with internal structures, it can be arbitrarily cut and inspected.</p>
<p>Different solutions for different aspects of subsurface geology are presented in this dissertation. To express folding and faulting processes, a first modelling approach based on cross-sectional sketches is introduced. User defined textures can be associated to each layer, and can then be deformed with sketch strokes, for communicating layer properties such as rock type and grain size.</p>
<p>A following contribution includes a simple and compact representation to model and visualize 3D stratigraphic models. With this representation, erosion and deposition of fluvial systems are easy to specify and display. Ancient river channels and other geological features, which are present in the subsurface, can be accessed by means of a volumetric representation.</p>
<p>Geological models are obtained and visualized by sequentially defining stratigraphic layers, where each layer represents a unique erosion or deposition event. Evolution of rivers and deltas is important for geologists when interpreting the stratigraphy of the subsurface, in particular because it changes the landscape morphology and because river deposits are potential hydrocarbon reservoirs.</p>
<p>Time plays a fundamental role in geological processes. Animations are well suited for communicating temporal change and a contribution in this direction is also given.</p>
<p>With the techniques developed in this thesis, it becomes possible to produce a range of geological scenarios. The focus is on enabling geologists to create their subsurface models by means of sketches, to quickly communicate concepts and ideas rather than detailed information.</p>
<p>Although the proposed techniques are simple to use and require little design effort, complex models can be realized.</p>
Fri, 19 Sep 2014 00:00:00 GMThttp://hdl.handle.net/1956/85702014-09-19T00:00:00ZElastic Grid Resources using Cloud Technologies
http://hdl.handle.net/1956/8567
Elastic Grid Resources using Cloud Technologies
Carlsen, Joachim Christoffer
Master thesis
A Large Ion Collider Experiment (ALICE) is one of four experiments at the
Large Hadron Collider (LHC) at CERN. The detectors in the ALICE experiment
produce data at a rate of 4 GB/s after being filtered and compressed online. The
data are stored and processed in a Grid system. A Grid system allows for sharing
globally distributed computing resources crossing administrative domains. The
ALICE collaboration have created its own Grid middleware called Alice Environment
(AliEn) to facilitate the processing and storage.
This project will examine a possible way of better utilizing AliEn computing
resources by using Cloud techniques, more specifically OpenStack[6] together
with the virtual appliance CernVM. Cloud techniques allow for adding and
removing virtual computing resources through an API, providing elasticity in a
computing center. This technique gives the possibility of removing the need for
physical dedicated AliEn computer resources, and instead make them disposable;
the virtual computing resources should only exist while needed.
This report will begin with a short general introduction and history of the
technologies used in this thesis, followed by an introduction to Grid technology and
AliEn. An introduction to Cloud technologies, OpenStack, and Virtual machines will
then follow. After introducing the main concepts and tools, a description of a testbed
and its setup process will be given, followed by an implementation of a prototype.
Lastly, a short performance test, evaluation of the prototype and conclusions will
follow.
Results show that implementing an elastic AliEn site using Cloud techniques is
indeed feasible. The solution give an overhead of ~2:30 minutes per AliEn job agent,
which is short compared to the lifespan of AliEn job agents, which normally is of 48
hours. Additionally, some possible ways of further reducing the overhead will be
described in this report.
Mon, 02 Jun 2014 00:00:00 GMThttp://hdl.handle.net/1956/85672014-06-02T00:00:00ZOn the feasibility of distributed systems for interactive visual analysis of omics data
http://hdl.handle.net/1956/8552
On the feasibility of distributed systems for interactive visual analysis of omics data
Farag, Yehia Mohamed
Master thesis
The purpose of this thesis is to discuss the
feasibility of developing a distributed
interactive visual analysis omics system
demonstrating how selected modules from the
standalone J-Express Modularized application can
be converted into a web based distributed system
maintaining the original application's
interactivity and functionality.
A distributed system is considered as the main
architectural design, where the client-side will
mainly be responsible for managing the interactive
visualisation of the generated results, and the
server-side is responsible for the remote
processing and storage of the data files in
addition to sharing the data resources between the
users.
A prototype system of an interactive visual data
analysis of omics data will be implemented as an
online service to non-technical life science
researchers. This provides the opportunity to
address the main challenges and limitations of
such systems, as well as outlining the additive
benefits to the omics-data analysis process
achieved by the use of distributed systems.
The main goal for this thesis can thus be
summarised as:
Investigate the feasibility of developing a web
distributed system that supports interactive
visualisation of omics data analysis based on
converting selected modules from J-Express
Modularized, and discuss the main benefits and
limitations of such a system.
From the study and the prototype evaluation, we
conclude that even with the current limitations
for the browser memory and computing capabilities,
it is still feasible to develop a browser based
distributed system for interactive visual analysis
of omics data. This system will achieve good
performance, scalability, besides being biologist-
friendly.
Mon, 02 Jun 2014 00:00:00 GMThttp://hdl.handle.net/1956/85522014-06-02T00:00:00ZSoftware modeling of the propagation of electromagnetic beams through different media
http://hdl.handle.net/1956/8549
Software modeling of the propagation of electromagnetic beams through different media
Bruket, Kjetil Rørvik
Master thesis
Propagation and focusing of electromagnetic beams through layered anisotropic medium is of interest in the field of optical data storage, where thin layers are mounted on glass substrates, display technology, where polarised light passes through liquid crystals, and in biological and material sciences, where objects are portrayed through thin sheet glass. In the case of computing a focused or diffracted field of a three-dimensional optical problem, it would be required to evaluate two-dimensional integrals with amplitude and phase functions. Obtaining numerical results for these kinds of problems, is a difficult task because of the rapidly oscillating integrands and the singularities in the Fresnel transmission and the reflection coefficients. If a layered medium is an anisotropic crystal, the numerical analysis is complicated significantly, as it gives rise to birefringence and mode coupling.
Research within this field is jointly being conducted by the Department of Engineering at the University College of Bergen [HIB] and the Department of Physics and Technology at the University of Bergen [UIB]. Their researchers have obtained both exact and asymptotic results for propagated and focused fields in uniaxial crystals, and they have been adopting various techniques to obtain numerical results. The double integrals, which can be used for obtaining the results, can be reduced to single integrals by means of applying parabolic approximations to the phase and amplitude functions, and even though the approximations tend to give numerical results, the procedure is time-consuming. As such obtaining numerical results from this sort of procedure might take a few minutes to several computing hours. To date, they have typically written software in Fortran, in order to achieve the numerical results, and this intermediary data is then used as input data for the MATLAB software, in order to present the final results.
This process of having to code one program for producing results that are passed to another, just to able to produce the final results, is convoluted, the use of different programs in series for producing the final result creating unnecessary work for the user. As such, the researchers have requested a singular software solution that will handle both the numerical calculations and the graphical presentation of the propagated electromagnetic fields from the initial data, which describes the characteristics of the medium and the type of electromagnetic wave.
Mon, 02 Jun 2014 00:00:00 GMThttp://hdl.handle.net/1956/85492014-06-02T00:00:00ZFast methods to solve the pooling problem
http://hdl.handle.net/1956/8548
Fast methods to solve the pooling problem
Kejriwal, Anisha
Master thesis
In pipeline transportation of natural gas, simple network flow
problems are replaced by hard ones when bounds on the flow
quality are imposed.
The sources, typically represented by gas wells, provide flow of
unequal compositions.
For instance, some sources may be richer in undesired
contaminants, such as CO_2, than others.
At the terminals, constraints on the relative content of the
contaminant may be imposed.
Flow streams are blended at junction points in the network,
where the relative CO_2-content becomes a weighted average of
the relative CO_2-content in entering streams.
To account for the quality bounds at the terminals, the quality
therefore must be traced from the sources via junction points to
the terminals.
The problem of allocating flow at minimum cost is referred to as
the pooling problem when the above-mentioned quality bounds
are imposed.
It is known that the pooling problem is NP-hard, which means
that it is very unlikely that exact solutions can be found in
instances of large scale.
Some exact methods, based on strong mathematical
formulations and intended for instances of small and medium
size, have recently been developed.
However, the literature offers few approaches to approximation
algorithms and other inexact methods dedicated for large
pooling problem instances.
This thesis focuses on the development of inexact or heuristic
techniques for the pooling problem.
The aim of these techniques is to find good feasible solutions for
large pooling problem instances at a reasonable computation
cost, and
the methods do not guarantee global optimality.
In order to achieve this, three approaches are discussed in this
thesis.
First, we propose an improvement heuristic which iteratively
reduces the total cost.
%new line
Since the quality of the solutions provided by the improvement
method depends upon good initial solutions,
we propose construction heuristic methods that give good
feasible solutions for the pooling problem.
The methods construct a sequence of sub-graphs, each of which
contains a single terminal, and an associated linear program for
optimizing the flow to the terminal.
The optimal solution to each linear program serves as a feasible
augmentation of total flow accumulated so far.
Finally, we combine both the above mentioned methods, such
that the solution given by the construction heuristic is used as
the starting solution by the improvement method.
Computational experiments indicate that all the heuristic
methods proposed in this thesis are faster compared to the
heuristics that were proposed earlier.
Since the exact solutions are not known in large instances, the
solutions given by the heuristic methods are compared to lower
bounds on the optimal objective function value.
In this thesis, we also propose a constraint generation algorithm,
that aims to compute lower bounds on the minimum cost fast.
Sat, 31 May 2014 00:00:00 GMThttp://hdl.handle.net/1956/85482014-05-31T00:00:00ZLinear dependencies between non-uniform distributions in DES
http://hdl.handle.net/1956/8545
Linear dependencies between non-uniform distributions in DES
Fauskanger, Stian
Master thesis
Davies and Murphy explained some non-uniform distributions of
the output from pairs and triplets of S-boxes in DES, and how they
are completely dependent on some key bits. There are linear
dependencies between these distributions. In this thesis, we
describe these linear dependencies. We also describe linear
dependencies between the distributions of the output from three
adjacent S-boxes after n rounds in DES. We have found all linear
dependencies between the distributions of the output from 5 of the
S-box triplets in full DES. The dependencies originates from
properties common to all S-boxes in DES.
Fri, 30 May 2014 00:00:00 GMThttp://hdl.handle.net/1956/85452014-05-30T00:00:00ZBioHackathon series in 2011 and 2012: penetration of ontology and linked data in life science domains
http://hdl.handle.net/1956/8517
BioHackathon series in 2011 and 2012: penetration of ontology and linked data in life science domains
Katayama, Toshiaki; Wilkinson, Mark D.; Aoki-Kinoshita, Kiyoko F.; Kawashima, Shuichi; Yamamoto, Yasunori; Yamaguchi, Atsuko; Okamoto, Shinobu; Kawano, Shin; Kim, Jin-Dong; Wang, Yue; Wu, Hongyan; Kano, Yoshinobu; Ono, Hiromasa; Bono, Hidemasa; Kocbek, Simon; Aerts, Jan; Akune, Yukie; Antezana, Erick; Arakawa, Kazuharu; Aranda, Bruno; Baran, Joachim; Bolleman, Jerven; Bonnal, Raoul J. P.; Buttigieg, Pier Luigi; Campbell, Matthew P.; Chen, Yi-an; Chiba, Hirokazu; Cock, Peter J. A.; Cohen, K. Bretonnel; Constantin, Alexandru; Duck, Geraint; Dumontier, Michel; Fujisawa, Takatomo; Fujiwara, Toyofumi; Goto, Naohisa; Hoehndorf, Robert; Igarashi, Yoshinobu; Itaya, Hidetoshi; Ito, Maori; Iwasaki, Wataru; Kalaš, Matúš; Katoda, Takeo; Kim, Taehong; Kokubu, Anna; Komiyama, Yusuke; Kotera, Masaaki; Laibe, Camille; Lapp, Hilmar; Lütteke, Thomas; Marshall, M. Scott; Mori, Takaaki; Mori, Hiroshi; Morita, Mizuki; Murakami, Katsuhiko; Nakao, Mitsuteru; Narimatsu, Hisashi; Nishide, Hiroyo; Nishimura, Yosuke; Nyström-Persson, Johan; Ogishima, Soichi; Okamura, Yasunobu; Okuda, Shujiro; Oshita, Kazuki; Packer, Nicki H; Prins, Pjotr; Ranzinger, Rene; Rocca-Serra, Philippe; Sansone, Susanna; Sawaki, Hiromichi; Shin, Sung-Ho; Splendiani, Andrea; Strozzi, Francesco; Tadaka, Shu; Toukach, Philip; Uchiyama, Ikuo; Umezaki, Masahito; Vos, Rutger; Whetzel, Patricia L.; Yamada, Issaku; Yamasaki, Chisato; Yamashita, Riu; York, William S.; Zmasek, Christian M.; Kawamoto, Shoko; Takagi, Toshihisa
Journal article
Abstract
The application of semantic technologies to the integration of biological data and the interoperability of bioinformatics analysis and visualization tools has been the common theme of a series of annual BioHackathons hosted in Japan for the past five years. Here we provide a review of the activities and outcomes from the BioHackathons held in 2011 in Kyoto and 2012 in Toyama. In order to efficiently implement semantic technologies in the life sciences, participants formed various sub-groups and worked on the following topics: Resource Description Framework (RDF) models for specific domains, text mining of the literature, ontology development, essential metadata for biological databases, platforms to enable efficient Semantic Web technology development and interoperability, and the development of applications for Semantic Web data. In this review, we briefly introduce the themes covered by these sub-groups. The observations made, conclusions drawn, and software development projects that emerged from these activities are discussed.
Wed, 05 Feb 2014 00:00:00 GMThttp://hdl.handle.net/1956/85172014-02-05T00:00:00ZWhole genome sequencing of the fish pathogen Francisella noatunensis subsp. orientalis Toba04 gives novel insights into Francisella evolution and pathogenecity
http://hdl.handle.net/1956/8516
Whole genome sequencing of the fish pathogen Francisella noatunensis subsp. orientalis Toba04 gives novel insights into Francisella evolution and pathogenecity
Sridhar, Settu; Sharma, Animesh; Kongshaug, Heidi; Nilsen, Frank; Jonassen, Inge
Journal article
<p>Background: Francisella is a genus of gram-negative bacterium highly virulent in fishes and human where F.
tularensis is causing the serious disease tularaemia in human. Recently Francisella species have been reported to
cause mortality in aquaculture species like Atlantic cod and tilapia. We have completed the sequencing and draft
assembly of the Francisella noatunensis subsp. orientalisToba04 strain isolated from farmed Tilapia. Compared to
other available Francisella genomes, it is most similar to the genome of Francisella philomiragia subsp. philomiragia,
a free-living bacterium not virulent to human.</p><p>Results: The genome is rearranged compared to the available Francisella genomes even though we found no
IS-elements in the genome. Nearly 16% percent of the predicted ORFs are pseudogenes. Computational pathway
analysis indicates that a number of the metabolic pathways are disrupted due to pseudogenes. Comparing the
novel genome with other available Francisella genomes, we found around 2.5% of unique genes present in
Francisella noatunensis subsp. orientalis Toba04 and a list of genes uniquely present in the human-pathogenic
Francisella subspecies. Most of these genes might have transferred from bacterial species through horizontal gene
transfer. Comparative analysis between human and fish pathogen also provide insights into genes responsible for
pathogenecity. Our analysis of pseudogenes indicates that the evolution of Francisella subspecies’s pseudogenes
from Tilapia is old with large number of pseudogenes having more than one inactivating mutation.</p><p>Conclusions: The fish pathogen has lost non-essential genes some time ago. Evolutionary analysis of the Francisella
genomes, strongly suggests that human and fish pathogenic Francisella species have evolved independently from
free-living metabolically competent Francisella species. These findings will contribute to understanding the
evolution of Francisella species and pathogenesis.</p>
Tue, 06 Nov 2012 00:00:00 GMThttp://hdl.handle.net/1956/85162012-11-06T00:00:00ZThe Weight Distributions of Several Classes of Cyclic Codes From APN Monomials
http://hdl.handle.net/1956/8430
The Weight Distributions of Several Classes of Cyclic Codes From APN Monomials
Li, Chunlei; Li, Nian; Helleseth, Tor; Ding, Cunsheng
Journal article
Let m ≥ 3 be an odd integer and p be an odd prime.
In this paper, a number of classes of three-weight cyclic codes C(1,e) over Fp, which have parity-check polynomial m1(x)me (x), are presented by examining general conditions on the
parameters p, m and e, where mi (x) is the minimal
polynomial of π−i over Fp for a primitive element π of
Fpm . Furthermore, for p ≡ 3 (mod 4) and a positive integer e satisfying (pk + 1) · e ≡ 2 (mod pm
− 1) for some positive integer k with gcd(m, k) = 1, the value distributions of the exponential sums T(a, b) = ∑
x∈Fpm
ωTr(ax+bxe )
and S(a, b, c) = ∑
x∈Fpm
ωTr(ax+bxe +cxs ), where s = (pm
− 1)/2,
are determined. As an application, the value distribution of S(a, b, c) is utilized to derive the
weight distribution of the cyclic codes C(1,e,s) with parity-check polynomial m1(x)me (x)ms (x).
In the case of p = 3 and even e satis- fying the above condition, the dual of the cyclic code
C(1,e,s)
has optimal minimum distance.
Fri, 01 Aug 2014 00:00:00 GMThttp://hdl.handle.net/1956/84302014-08-01T00:00:00ZOptimal ternary cyclic codes with minimum distance four and five
http://hdl.handle.net/1956/8429
Optimal ternary cyclic codes with minimum distance four and five
Li, Nian; Li, Chunlei; Helleseth, Tor; Ding, Cunsheng; Tang, Xiaohu
Journal article
Cyclic codes are an important subclass of linear codes and
have wide applications in data storage systems, communication
systems and consumer electronics. In this paper, two
families of optimal ternary cyclic codes are presented. The
first family of cyclic codes has parameters [3m−1,3m−1−2m,4] and contains a class of conjectured cyclic codes and
several new classes of optimal cyclic codes. The second family
of cyclic codes has parameters [3m−1,3m−2−2m,5]
and contains a number of classes of cyclic codes that are
obtained from perfect nonlinear functions over F3m, where
m > 1 and is a positive integer.
Sat, 01 Nov 2014 00:00:00 GMThttp://hdl.handle.net/1956/84292014-11-01T00:00:00ZSequences and Linear Codes from Highly Nonlinear Functions
http://hdl.handle.net/1956/8414
Sequences and Linear Codes from Highly Nonlinear Functions
Li, Chunlei
Doctoral thesis
Due to optimal nonlinearity and differential uniformity, perfect nonlinear
(PN) and almost perfect nonlinear (APN) functions are of great
importance in cryptography. It is interesting that they also define optimal
objects in other domains of mathematics and information theory.
This dissertation is devoted to exploring the application of highly
nonlinear functions, especially PN and APN functions, to the construction
of low-correlation sequences and optimal linear codes. For an
arbitrary odd prime p, there are only two basic classes of two-level
auto-correlation p-ary sequences with no subfield structures: the msequences
and the Helleseth-Gong sequences, where Helleseth-Gong
sequences are closely related to a class of p-ary perfect nonlinear functions.
Papers I and II are dedicated to investigating the cross-correlation
between the p-ary m-sequences and d-decimated Helleseth-Gong sequences
for some decimations d, and to constructing sequence families
with low correlation from them. Papers III-IV have focused on the study
of linear codes defined from highly nonlinear functions. Paper III utilizes
some highly nonlinear functions including PN and APN functions
to construct ternary cyclic codes with the optimal minimum (Hamming)
distance. Paper IV further investigates the weight distribution of some
optimal cyclic codes proposed in Paper III. Paper V examines the covering
radius of some linear codes defined from PN and APN functions
and presents a number of quasi-perfect linear codes.
Mon, 16 Jun 2014 00:00:00 GMThttp://hdl.handle.net/1956/84142014-06-16T00:00:00ZPractical Aspects of the Graph Parameter Boolean-width
http://hdl.handle.net/1956/8406
Practical Aspects of the Graph Parameter Boolean-width
Sharmin, Sadia
Doctoral thesis
Mon, 18 Aug 2014 00:00:00 GMThttp://hdl.handle.net/1956/84062014-08-18T00:00:00ZData clustering optimization with visualization
http://hdl.handle.net/1956/8362
Data clustering optimization with visualization
Guillaume, Fabien
Master thesis
This thesis study the possible applications of
Particle Swarm Optimization in Kernel Clustering,
Dynamic modeling and Artificial Neural Network.
Thu, 20 Mar 2014 00:00:00 GMThttp://hdl.handle.net/1956/83622014-03-20T00:00:00ZA polynomial-time solvable case for the NP-hard problem Cutwidth
http://hdl.handle.net/1956/8355
A polynomial-time solvable case for the NP-hard problem Cutwidth
Lilleeng, Simen
Master thesis
The Cutwidth problem is a notoriously hard problem, and its complexity is open on several interesting graph classes. Motivated by this fact we investigate the problem on superfragile graphs, a graph class on which the complexity of the Cutwidth problem is open. We give an algorithm that solves Cutwidth on superfragile graphs in O(n^2) time and O(n) space, thus resolving the complexity of the Cutwidth problem on superfragile graphs. We also explore the usefulness of the algorithm for cutwidth on threshold graphs by Heggernes, Lokshtanov, Mihai and Papadopoulos as an approximation algorithm for cutwidth on other graph classes. The Cutwidth problem is NP-hard for general graphs and a brute force algorithm would require O(n!) time. We give two faster algorithms solving the Cutwidth problem: One algorithm applying dynamic programming that runs in O^*(2^n) time and space, and one algorithm that runs in O^*(4^n) time and O(n\cdot log(n)) space by applying the divide-and-conquer technique. Finally we take a look at a similar problem called Optimal Linear Arrangement and suggest algorithms for solving the problem on threshold graphs and superfragile graphs in polynomial time.
Mon, 02 Jun 2014 00:00:00 GMThttp://hdl.handle.net/1956/83552014-06-02T00:00:00ZSo you've got IPv6 address space. Can you defend it?
http://hdl.handle.net/1956/8336
So you've got IPv6 address space. Can you defend it?
Sande, Mikal
Master thesis
Internet Protocol version 6 (IPv6) is the successor of Internet Protocol version 4 (IPv4). IPv6 will become the next standard networking protocol on the Internet. It brings with it a great increase in address space, changes to network operations, and new network security concerns. In this thesis we examine IPv6 from a security perspective. The security of IPv6 is important to all protocols that use IPv6 on the Internet. The goal of this thesis is to introduce the reader to existing IPv6 security challenges, demonstrate how IPv6 changes network security and show how IPv6 is being improved.
Thu, 29 May 2014 00:00:00 GMThttp://hdl.handle.net/1956/83362014-05-29T00:00:00ZVisual Cavity Analysis in Molecular Simulations
http://hdl.handle.net/1956/8305
Visual Cavity Analysis in Molecular Simulations
Parulek, Julius; Turkay, Cagatay; Reuter, Nathalie; Viola, Ivan
Journal article
Molecular surfaces provide a useful mean for analyzing interactions between biomolecules; such as identification and characterization of ligand binding sites to a host macromolecule. We present a novel technique, which extracts potential binding sites, represented by cavities, and characterize them by 3D graphs and by amino acids. The binding sites are extracted using an implicit function sampling and graph algorithms. We propose an advanced cavity exploration technique based on the graph parameters and associated amino acids. Additionally, we interactively visualize the graphs in the context of the molecular surface. We apply our method to the analysis of MD simulations of Proteinase 3, where we verify the previously described cavities and suggest a new potential cavity to be studied.
Tue, 12 Nov 2013 00:00:00 GMThttp://hdl.handle.net/1956/83052013-11-12T00:00:00ZImproving Parallel Sparse Matrix-vector Multiplication
http://hdl.handle.net/1956/8024
Improving Parallel Sparse Matrix-vector Multiplication
Tessem, Torbjørn Johnsen
Master thesis
Sparse Matrix-vector Multiplication (SMvM) is a mathematical technique encountered in many programs and computations and is often heavily used. Solving SMvM in parallel allows for bigger instances to be solved, and problems to be solved faster. Several strategies have been tried to improve parallel SMvM. Work has been done with regard to improved cache use, better load balance and reduced conflicts. The aim of the work conducted in this thesis is to develop new ideas and algorithms to speed-up parallel SMvM on a shared memory computer. We use a method inspired by the min-makespan problem to distribute elements more evenly. We introduce a hybrid algorithm that gives better cache efficiency, and we work with colouring algorithms to avoid write conflicts.
Thu, 19 Dec 2013 00:00:00 GMThttp://hdl.handle.net/1956/80242013-12-19T00:00:00ZSubstation Location in Offshore Wind Farms - A Planar Multi-Facility Location-Routing Problem
http://hdl.handle.net/1956/8017
Substation Location in Offshore Wind Farms - A Planar Multi-Facility Location-Routing Problem
Amland, Thomas
Master thesis
In offshore wind farms, two important parts of the design are to determine locations for substations and a cabling layout that connects every turbine to a substation. These problems are interconnected, as the cable layout depends on the choice of location for the substation. In this thesis we investigate how to set the location of substations such that the total cable cost is minimized.
Fri, 14 Mar 2014 00:00:00 GMThttp://hdl.handle.net/1956/80172014-03-14T00:00:00ZThroughput and robustness of bioinformatics pipelines for genome-scale data analysis
http://hdl.handle.net/1956/7906
Throughput and robustness of bioinformatics pipelines for genome-scale data analysis
Sztromwasser, Paweł
Doctoral thesis
<p>The post-genomic era has been heavily influenced by the rapid development of highthroughput
molecular-screening technologies, which has enabled genome-wide analysis
approaches on an unprecedented scale. The constantly decreasing cost of producing
experimental data resulted in a data deluge, which has led to technical challenges
in distributed bioinformatics infrastructure and computational biology methods. At the
same time, the advances in deep-sequencing allowed intensified interrogation of human
genomes, leading to prominent discoveries linking our genetic makeup with numerous
medical conditions. The fast and cost-effective sequencing technology is expected to
soon become instrumental in personalized medicine. The transition of the methodology
related to genome sequencing and high-throughput data analysis from the research
domain to a clinical service is challenging in many aspects. One of them is providing
medical personnel with accessible, robust, and accurate methods for analysis of
sequencing data.</p><p>The computational protocols used for analysis of the sequencing data are complex,
parameterized, and in continuous development, making results of data analysis sensitive
to factors such as the software used and the parameter values selected. However,
the influence of parameters on results of computational pipelines has not been systematically
studied. To fill this gap, we investigated the robustness of a genetic variant
discovery pipeline against changes of its parameter settings. Using two sensitivity
screening methods, we evaluated parameter influence on the identified genetic variants,
and found that the parameters have irregular effects and are inter-dependent. Only a
fraction of parameters were identified to have considerable impact on the results, suggesting
that screening parameter sensitivity can lead to simpler pipeline configuration.
Our results showed, that although a simple metric can be used to examine parameter
influence, more informative results are obtained using a criterion related to the accuracy
of pipeline results. Using the results of sensitivity screening, we have shown that
the influential pipeline parameters can be adjusted to effectively increase the accuracy
of variant discovery. Such information is invaluable for researchers tuning pipeline parameters,
and can guide the search for optimal settings for computational pipelines in
a clinical setting. Contrasting the two applied screening methods, we learned more
about specific requirements of robustness analysis of computational methods, and were
able to suggest a more tailored strategy for parameter screening. Our contributions
demonstrate the importance and the benefits of systematic robustness analysis of bioinformatics
pipelines, and indicate that more efforts are needed to advance research in
this area.</p><p>Web services are commonly used to provide interoperable, programmatic access to bioinformatics resources, and consequently, they are natural building blocks of bioinformatics
analysis workflows. However, in the light of the data deluge, their usability
for data-intensive applications has been questioned. We investigated applicability of
standard Web services to high-throughput pipelines, and showed how throughput and
performance of such pipelines can be improved. By developing two complementary approaches,
that take advantage of established and proven optimization mechanisms, we
were able to enhance Web service communication in a non-intrusive manner. The first
strategy increases throughput ofWeb service interfaces by a stream-like invocation pattern.
This additionally allows for data-pipelining between consecutive steps of a workflow.
The second approach facilitated peer-to-peer data transfer between Web services
to increase the capacity of the workflow engine. We evaluated the impact of the enhancements
on genome-scale pipelines, and showed that high-throughput data analysis
using standard Web service pipelines is possible, when the technology is used sensibly.
However, considering the contemporary data volumes and their expected growth,
methods capable of handling even larger data should be sought.</p><p>Systematic analysis of pipeline robustness requires intensive computations, which are
particularly demanding for high-throughput pipelines. Providing more efficient methods
of pipeline execution is fundamental for enabling such examinations on a largescale.
Furthermore, the standardized interfaces of Web services facilitate automated
executions, and are perfectly suited for coordinating large computational experiments.
I speculate that, provided wide adoption of Web service technology in bioinformatics
pipelines, large-scale quality control studies, such as robustness analysis, could be
automated and performed routinely on newly published computational methods. This
work contributes to realizing such a conception, providing technical basis for building
the necessary infrastructure and suggesting methodology for robustness analysis.</p>
Wed, 19 Feb 2014 00:00:00 GMThttp://hdl.handle.net/1956/79062014-02-19T00:00:00ZDirect data transfer between SOAP web services in Orchestration
http://hdl.handle.net/1956/7905
Direct data transfer between SOAP web services in Orchestration
Subramanian, Sattanathan; Sztromwasser, Paweł; Puntervoll, Pål; Petersen, Kjell
Conference object
In scientific data analysis, workflows are used to integrate
and coordinate resources such as databases and tools. Workflows
are normally executed by an orchestrator that invokes
component services and mediates data transport between
them. Scientific data are frequently large, and brokering
large data increases the load on the orchestrator and reduces
workflow performance. To remedy this problem, we demonstrate
how plain SOAP web services can be tailored to support
direct service-to-service data transport, thus allowing
the orchestrator to delegate the data-flow. We formally define
a data-flow delegation message, develop an XML schema
for it, and analyze performance improvement of data-flow
delegation empirically in comparison with the regular orchestration
using an example bioinformatics workflow.
Sun, 01 Jan 2012 00:00:00 GMThttp://hdl.handle.net/1956/79052012-01-01T00:00:00ZData partitioning enables the use of standard SOAP Web Services in genome-scale workflows
http://hdl.handle.net/1956/7904
Data partitioning enables the use of standard SOAP Web Services in genome-scale workflows
Sztromwasser, Paweł; Puntervoll, Pål; Petersen, Kjell
Journal article
Biological databases and computational biology tools are provided by research groups
around the world, and made accessible on the Web. Combining these resources is a com-
mon practice in bioinformatics, but integration of heterogeneous and often distributed tools
and datasets can be challenging. To date, this challenge has been commonly addressed in
a pragmatic way, by tedious and error-prone scripting. Recently however a more reliable
technique has been identified and proposed as the platform that would tie together bioinfor-
matics resources, namely Web Services. In the last decade the Web Services have spread
wide in bioinformatics, and earned the title of recommended technology. However, in the
era of high-throughput experimentation, a major concern regarding Web Services is their
ability to handle large-scale data traffic. We propose a stream-like communication pattern
for standard SOAP Web Services, that enables efficient flow of large data traffic between
a workflow orchestrator and Web Services. We evaluated the data-partitioning strategy
by comparing it with typical communication patterns on an example pipeline for genomic
sequence annotation. The results show that data-partitioning lowers resource demands of
services and increases their throughput, which in consequence allows to execute in-silico
experiments on genome-scale, using standard SOAP Web Services and workflows. As a
proof-of-principle we annotated an RNA-seq dataset using a plain BPEL workflow engine.
Sat, 01 Jan 2011 00:00:00 GMThttp://hdl.handle.net/1956/79042011-01-01T00:00:00ZA Comparison of Vertex and Edge Partitioning Approaches for Parallel Maximal Matching
http://hdl.handle.net/1956/7896
A Comparison of Vertex and Edge Partitioning Approaches for Parallel Maximal Matching
Sørnes, Alexander N
Master thesis
This thesis will compare two ways of distributing data for parallel graph
algorithms: vertex and edge partitioning, using a distributed memory system.
Previous studies on the parallelization of graphs has often been focused on a
vertex partitioning, where each processor is assigned a set V' $\subseteq$ V
where G = (V,E), yielding a one-dimensional partitioning. It has been shown,
however, that an edge partitioning (or 2D partitioning), where each processor is
assigned a set E' $\subseteq$ E, may yield a benefit in terms of a lower
communication volume.
The performance and scalability of vertex and edge partitionings are
compared by implementing the Karp-Sipser matching set algorithm for both
partitioning schemes. A matching set is a set E' $\subseteq$ E of independent
edges such that each vertex in V occurs at most once in E'.
We find that while the vertex partitioned algorithm gives a significantly higher speedup,
the increased performance of the edge partitioned algorithm on more dense graphs suggests that
if the graph framework is improved further, it could lead to the implementation of an
edge partitioned matching algorithm that offers better scalability and comparable matching quality
to a vertex partitioned matching algorithm.
An edge partitioning requires a rigorous framework for handling the
communication resulting when edges owned by multiple processors are incident on
the same vertex. Hopefully, the framework developed for representing an edge partitioned
graph facilitates the implementation of other parallel graph algorithms using an edge
partitioning approach.
Mon, 09 Dec 2013 00:00:00 GMThttp://hdl.handle.net/1956/78962013-12-09T00:00:00ZEnhancing Content Management in DPG
http://hdl.handle.net/1956/7794
Enhancing Content Management in DPG
Pino Arevalo, Ana Gabriela
Master thesis
This thesis analyzes the usability aspects of PCE
and implements a new Single Page Application that
attempts to solve this issues
Wed, 20 Nov 2013 00:00:00 GMThttp://hdl.handle.net/1956/77942013-11-20T00:00:00ZGrein. A New Non-Linear Cryptoprimitive
http://hdl.handle.net/1956/7779
Grein. A New Non-Linear Cryptoprimitive
Thorsen, Ole Rydland
Master thesis
In this thesis, we will study a new stream cipher, Grein, and a new cryptoprimitive used in this cipher. The second chapter gives a brief introduction to cryptography in general. The third chapter looks at stream ciphers in general, and explains the advantages and disadvantages of stream ciphers compared to block ciphers. In the fourth chapter the most important building blocks used in stream ciphers are explained. The reader is excepted to know elementary abstract algebra, as much of the results in this chapter depend on it. In the fifth chapter, the stream cipher Grain is introduced. In chapter six, the new stream cipher, Grein, is introduced. Here, we look at the different components used in the cipher, and how they operate together. In chapter seven, we introduce an alteration to the Grein cryptosystem, which hopefully have some advantages
Tue, 10 Dec 2013 00:00:00 GMThttp://hdl.handle.net/1956/77792013-12-10T00:00:00ZWSDL Workshop: Semantic web application in HTML5 for the discovery, construction and analysis of workfows
http://hdl.handle.net/1956/7724
WSDL Workshop: Semantic web application in HTML5 for the discovery, construction and analysis of workfows
Cañadas, Rafael Adolfo Nozal
Master thesis
<p>WSDL-Workshop is a HTML5 web application for the discovery and exploration
of web services and for analyzing the compatibility between web services. This
is the result of a mathematical model developed from WSDL1.1. The program
provides a graphical user interface and let the user build a work ow composed
of services described with WSDL1.1 and tells if:</p>
<p><ul><li>An output is compatible with an input.</li>
<li>It is correct to link an output with an input.</li>
<li> It is correct to link a given operation after another.</li>
<li>If many operations correctly linked together still make sense as a group.</li></ul></p>
<p>In order to do that the WSDLs must have semantic annotations so the
computer can recognize what is the purpose of certain data or operation. WSDLWorkshop
uses the EDAM Ontology as a reference for semantic concepts.</p>
<p>In the discovery aspect; for a given set of WSDLs you can nd services by
ltering by operation name, input or output names, or semantic annotations.
For a given operation output it can also lter by WSDL which have inputs which
are correct to link with that output. For a given operation it can also lter by
operations which are correct to link after.</p>
Tue, 01 Jan 2013 00:00:00 GMThttp://hdl.handle.net/1956/77242013-01-01T00:00:00ZTournaments and Optimality: New Results in Parameterized Complexity
http://hdl.handle.net/1956/7650
Tournaments and Optimality: New Results in Parameterized Complexity
Pilipczuk, Michal Pawel
Doctoral thesis
Fri, 22 Nov 2013 00:00:00 GMThttp://hdl.handle.net/1956/76502013-11-22T00:00:00ZDependencies: No Software is an Island
http://hdl.handle.net/1956/7540
Dependencies: No Software is an Island
Tellnes, Jørgen
Master thesis
In the past years, package managers, application frameworks and open-source libraries have made it vastly simpler and faster to get functioning software up and running, while cloud providers and external service providers have made it easier to get the application out into the hands of millions of users without large up-front costs.
While this recent technology development has made it possible for companies with limited resources to build impressive software and valuable services, the development has serious security implications which the current state of software development and systems engineering are not yet able to handle very well.
In this thesis, we will show that the security and availability of a system are largely determined by the surrounding "ecosystem" of dependencies, and that techniques to reduce the reliance on a system's dependencies-software libraries, services and infrastructures-are hugely beneficial.
The intended audience for this thesis are computer scientists, professional and amateur software developers, and system designers, but anyone with basic IT knowledge is encouraged to keep reading.
Tue, 15 Oct 2013 00:00:00 GMThttp://hdl.handle.net/1956/75402013-10-15T00:00:00ZTowards Privacy Managment of Information Systems
http://hdl.handle.net/1956/7539
Towards Privacy Managment of Information Systems
Drageide, Vidar
Master thesis
This masters thesis provides insight into the concept of privacy. It argues why privacy is important, and why developers and system owners should keep privacy in mind when developing and maintaining systems containing personal information. Following this, a strategy for evaluating the overall level of privacy in a system is defined. The strategy is then applied to parts of the cellphone system in an attempt to evaluate the privacy of traffic and location data in this system.
Tue, 02 Jun 2009 00:00:00 GMThttp://hdl.handle.net/1956/75392009-06-02T00:00:00ZA polynomial-time algorithm for LO based on generalized logarithmic barrier functions
http://hdl.handle.net/1956/7473
A polynomial-time algorithm for LO based on generalized logarithmic barrier functions
El Ghami, Mohamed; Ivanov, I.D.; Roos, C.; Steihaug, Trond
Peer reviewed; Journal article
Tue, 01 Jan 2008 00:00:00 GMThttp://hdl.handle.net/1956/74732008-01-01T00:00:00ZA type system for counting instances of software components
http://hdl.handle.net/1956/7467
A type system for counting instances of software components
Bezem, Marcus A.; Hovland, Dag; Truong, Anh Hoang
Peer reviewed; Journal article
We identify an abstract language for component software based on process algebra. Besides the usual operators for sequential, alternative and parallel composition, it has primitives for instantiating components and for deleting instances of components. We define an operational semantics for our language and give a type system in which types express quantitative information on the components involved in the execution of the expressions of the language. Included in this information is for each component the maximum number of instances that are simultaneously active during the execution of the expression. The type system is compositional by the novel use of ‘deficit types’. The type inference algorithm runs in time quadratic in the size of the input. We consider extensions of the language with loops and tail recursion, and with a scope mechanism. We illustrate the approach with some examples, one on UML diagram refinement and one on counting objects on the free store in C++.
Sun, 01 Jan 2012 00:00:00 GMThttp://hdl.handle.net/1956/74672012-01-01T00:00:00ZFiltering duplicate reads from 454 pyrosequencing data
http://hdl.handle.net/1956/7458
Filtering duplicate reads from 454 pyrosequencing data
Balzer, Susanne Mignon; Malde, Ketil; Grohme, Markus A.; Jonassen, Inge
Peer reviewed; Journal article
<p><strong>Motivation:</strong> Throughout the recent years, 454 pyrosequencing has
emerged as an efficient alternative to traditional Sanger sequencing
and is widely used in both de novo whole-genome sequencing and
metagenomics. Especially the latter application is extremely sensitive
to sequencing errors and artificially duplicated reads. Both are
common in 454 pyrosequencing and can create a strong bias in the
estimation of diversity and composition of a sample. To date, there
are several tools that aim to remove both sequencing noise
and duplicates. Nevertheless, duplicate removal is often based on
nucleotide sequences rather than on the underlying flow values,
which contain additional information.</p>
<p><strong>Results:</strong> With the novel tool JATAC, we present an approach towards
a more accurate duplicate removal by analysing flow values directly.
Making use of previous findings on 454 flow data characteristics,
we combine read clustering with Bayesian distance measures.
Finally, we provide a benchmark with an existing algorithm.</p>
<p><strong>Availability:</strong> JATAC is freely available under the General Public
License from <a href="http://malde.org/ketil/jatac/" target="blank">http://malde.org/ketil/jatac/</a></p>.
<p><strong>Supplementary information:</strong> Supplementary data are available at
Bioinformatics online</p>
Tue, 01 Jan 2013 00:00:00 GMThttp://hdl.handle.net/1956/74582013-01-01T00:00:00ZSystematic exploration of error sources in pyrosequencing flowgram data
http://hdl.handle.net/1956/7457
Systematic exploration of error sources in pyrosequencing flowgram data
Balzer, Susanne Mignon; Malde, Ketil; Jonassen, Inge
Peer reviewed; Journal article
<p><strong>Motivation:</strong> 454 pyrosequencing, by Roche Diagnostics, has
emerged as an alternative to Sanger sequencing when it comes to
read lengths, performance and cost, but shows higher per-base error
rates. Although there are several tools available for noise removal,
targeting different application fields, data interpretation would benefit
from a better understanding of the different error types.</p>
<p><strong>Results:</strong> By exploring 454 raw data, we quantify to what extent
different factors account for sequencing errors. In addition to the
well-known homopolymer length inaccuracies, we have identified
errors likely to originate from other stages of the sequencing process.
We use our findings to extend the flowsim pipeline with functionalities
to simulate these errors, and thus enable a more realistic simulation
of 454 pyrosequencing data with flowsim.</p>
<p><strong>Availability:</strong> The flowsim pipeline is freely available under the
General Public License from <a href="http://biohaskell.org/Applications/FlowSim" target="blank">http://biohaskell.org/Applications/FlowSim</a></p>
ISMB/ECCB 2011 PROCEEDINGS PAPERS COMMITTEE JULY 17 TO JULY 19, 2011, VIENNA, AUSTRIA.
Sat, 01 Jan 2011 00:00:00 GMThttp://hdl.handle.net/1956/74572011-01-01T00:00:00ZCharacteristics of 454 pyrosequencing data—enabling realistic simulation with flowsim
http://hdl.handle.net/1956/7456
Characteristics of 454 pyrosequencing data—enabling realistic simulation with flowsim
Balzer, Susanne Mignon; Malde, Ketil; Lanzén, Anders; Sharma, Animesh; Jonassen, Inge
Peer reviewed; Journal article
<p><strong>Motivation:</strong> The commercial launch of 454 pyrosequencing in 2005was a milestone in genome sequencing in terms of performance and
cost. Throughout the three available releases, average read lengths
have increased to ∼500 base pairs and are thus approaching read
lengths obtained from traditional Sanger sequencing. Study design
of sequencing projects would benefit from being able to simulate
experiments.</p>
<p><strong>Results:</strong> We explore 454 raw data to investigate its characteristics
and derive empirical distributions for the flow values generated by
pyrosequencing. Based on our findings, we implement Flowsim,
a simulator that generates realistic pyrosequencing data files of
arbitrary size from a given set of input DNA sequences. We finally
use our simulator to examine the impact of sequence lengths on the
results of concrete whole-genome assemblies, and we suggest its
use in planning of sequencing projects, benchmarking of assembly
methods and other fields.</p>
<p><strong>Availability:</strong> Flowsim is freely available under the General Public
License from <a href="http://blog.malde.org/index.php/flowsim/" target="blank">http://blog.malde.org/index.php/flowsim/</a></p>
ECCB 2010 CONFERENCE PROCEEDINGS SEPTEMBER 26 TO SEPTEMBER 29, 2010, GHENT, BELGIUM.
Fri, 01 Jan 2010 00:00:00 GMThttp://hdl.handle.net/1956/74562010-01-01T00:00:00ZCharacteristics of Pyrosequencing Data – Analysis, Methods, and Tools
http://hdl.handle.net/1956/7455
Characteristics of Pyrosequencing Data – Analysis, Methods, and Tools
Balzer, Susanne Mignon
Doctoral thesis
The introduction of this thesis provides background knowledge on the 454 sequencing
technology and a detailed review of the most relevant sequencing artifacts.
Chapter 1 puts the 454 sequencing technology into a historical context. Chapter
2 gives an overview of where 454 sequencing is applied, focusing on the most common
application areas. Chapter 3 provides a detailed description of how 454 sequencing
works, from library preparation to sequencing, imaging and data output.
Here, the distinction between the different detail levels of sequencing information is
crucial since data aggregation involves information loss. Chapter 4 describes where
errors and artifacts can arise, how they are manifested in the sequencing data, and
what impact they can have on downstream analyses. Finally, Chapter 5 puts the
contributions into their respective analytical contexts and discusses their relevance
for the research community.
The first paper, published in Bioinformatics in September 2010 and presented at
the European Conference on Computational Biology (ECCB) in Belgium the same
year, comprises of the exploration, modeling and simulation of 454 data. Under
the title “Characteristics of 454 pyrosequencing data – enabling realistic simulation
with Flowsim”, we present a detailed analysis of sequencing data and a simulation
tool that facilitates the design of sequencing projects. The tool can be used to
examine and quantify the impact of read length, coverage, sequencing errors and
signal degradation on genome assembly. Furthermore, it enables the testing and
benchmarking of known and novel algorithms, methods and tools in a number of
application areas such as whole genome assembly, read alignment, read correction,
single-nucleotide polymorphism (SNP) identification and metagenomics.
The second paper, “Systematic exploration of error sources in pyrosequencing
flowgram data”, was published in Bioinformatics in July 2011 and presented at
the Intelligent Systems for Molecular Biology (ISMB)/ECCB conference in Austria
the same year. We added several features and modules to the existing simulation
pipeline. Those were based on the observation of several error sources such as copy ing errors introduced through polymerase chain reaction (PCR), a method used in
454 sequencing for amplification of the templates. These errors appear as mutations
and are virtually impossible to distinguish from true sequence variants.
Similar to the second paper, the third paper, “Filtering duplicate reads from
454 pyrosequencing data”, focuses on a single error type, namely artificially duplicated
reads. Our JATAC tool enables removal of this artifact on the most detailed
sequencing data level, outperforming existing tools. The paper was published in
Bioinformatics in April 2013.
Mon, 17 Jun 2013 00:00:00 GMThttp://hdl.handle.net/1956/74552013-06-17T00:00:00ZQALM - a tool for automating quantitative analysis of LC-MS-MS/MS data
http://hdl.handle.net/1956/7369
QALM - a tool for automating quantitative analysis of LC-MS-MS/MS data
Lerøy, Kjartan
Master thesis
The goal of bioinformatics is to support science and research in the field of biology through the application of information technology. Proteomics is a field within biology that deals with the study of proteins. This paper describes QALM, an application developed to automate and simplify a specific type of proteomics analysis. QALM is first and foremost a proof of concept through which certain options for implementing such automation have been explored. Although a functional and usable application has been created, this should primarily be considered a stepping stone for similar applications in the future. Currently QALM is a desktop tool for importing and exporting data, inte- grating and communicating with external systems for the analysis of such data, and finally generating reports to present the results. It currently runs only un- der the Linux operating system, but it should be possible to change this fairly easily.
Mon, 31 May 2010 00:00:00 GMThttp://hdl.handle.net/1956/73692010-05-31T00:00:00ZPrediction of Polycomb/Trithorax Response Elements using Support Vector Machines
http://hdl.handle.net/1956/7222
Prediction of Polycomb/Trithorax Response Elements using Support Vector Machines
Bredesen, Bjørn Andre
Master thesis
Polycomb/Trithorax Response Elements (PREs) are epigenetic elements that can maintain established transcriptional states over multiple cell divisions. Sequence motifs in known PREs have enabled genome-wide PRE prediction by the PREdictor and jPREdictor, using combined motif occurrences for scoring sequence windows. The EpiPredictor predicts PREs by using the method of Support Vector Machines (SVM), which enables the construction of non-linear classifiers by use of kernel functions. Aspects of using SVMs for PRE prediction can be investigated, such as setting of SVM parameters, using SVM decision values for scoring and using alternative feature sets.
The PRE prediction implementation presented in this thesis, called PRESVM, uses SVM decision values to score sequence windows. PRESVM implements the feature sets used by (j)PREdictor and EpiPredictor, as well as feature sets using relative motif occurrence distances and periodic motif occurrence. Grid search and Particle Swarm Optimization are supported for setting SVM parameters.
For evaluating PRE predictions of multiple classifiers against experimental data sets, an application called PREsent has been implemented.
For a similar configuration for PRESVM and jPREdictor, PRESVM predicted a larger number of candidate PREs, which were more sensitive to but had lower Positive Predictive Values against experimental data considered than those of jPREdictor. A formal relationship was established between the PRESVM and jPREdictor decision functions for this configuration. The trade-offs make it difficult to conclude that either classifier is superior. Many configurations remain to be tested, and the results encourage further testing.
Mon, 03 Jun 2013 00:00:00 GMThttp://hdl.handle.net/1956/72222013-06-03T00:00:00ZRegulatory mechanisms of non-coding RNAs during zebrafish embryogenesis
http://hdl.handle.net/1956/7202
Regulatory mechanisms of non-coding RNAs during zebrafish embryogenesis
Nepal, Chirag
Doctoral thesis
<p>For many years, RNAs were thought to be intermediate products between DNA and
protein. The discovery of RNA interference (RNAi), a regulatory process that uses
small non-coding RNAs to regulate gene expression at the post-transcriptional level,
changed our view about RNAs. However, the discovery of microRNAs was the
realization of RNAs as the regulatory elements. In recent years, many highthroughput
sequencing studies have identified hundreds to thousands of various
kinds of non-coding RNAs. The existence and biological relevance of these noncoding
RNAs detected in large-scale analysis of human tissues have not yet been
characterized in a vertebrate animal in vivo. To gain insight into the existence and
biological relevance of these non-coding RNAs in vertebrate animal in vivo, we have
set out to generate the first global description of TSS usage during key stages of
vertebrate embryonic development at single nucleotide resolution. We have coupled
CAGE maps to protein-coding and non-coding transcripts by RNA sequencing
(providing a quantitative description of TSS usage on a genome scale) and anchored
to posttranslational histone modifications (H3K4me3) by ChIP sequencing.</p>
<p>We reveal an extraordinary dynamics of promoter usage that takes place during
development of the vertebrate embryo. We showed that the onset of transcription
and subsequent differentiation of the embryo is characterized by the developmentally
regulated appearance of 5’-ends of intragenic RNAs on many genes, and of an entire
hitherto unknown layer of RNA species overlapping known genes and having specific
signatures occurring in exons, introns and 3’-UTRs of developmentally active genes.
We characterize the pervasive production of intragenic processed RNAs including
exonic and intron-5’ end specific RNAs and provide the first indication for the
biological processes in which they may function. Notably, intron 5’ end associated
non-coding RNAs are active zygotically and restricted to genes that encode RNA
processing and the splicing proteins in both fish and human. We demonstrated
evidence that exonic RNAs are produced by a non-canonical posttranscriptional
mechanism independent of the gene 5’ end. We show the initiation landscape and
developmental dynamics of lincRNAs; we show the evolutionary conserved process
of developmentally regulated posttranscriptional processing of lincRNAs into
intragenic RNAs, which demonstrate the utility of zebrafish in studying mammalian
lincRNA processing.</p>
<p>The main aim of this work was to provide a (currently non-existent) annotation of miRNA promoters and characterize their common characteristics features at
transcription, post transcription and chromatin level. We describe the first genomewide
identification of miRNA promoters in zebrafish active during the early embryonic
developmental stages. We identified a small number of maternally transcribed
miRNAs, one MBT specific miRNA and the majority that are zygotically transcribed.
We report the first evidence of moRNAs in zebrafish and pufferfish that were
previously reported in human and Ciona intestinalis. We show evidence for
unexpected enrichment of pre-miRNA sites with promoter-associated histone
modification marks (H3K4me3 and H2A.Z) suggesting chromatin regulation and
potential involvement of transcription machinery in pre-miRNA processing,
suggesting co-transcriptional splicing of pre-miRNAs and pri-miRNA.</p>
<p>We have provided a catalogue of intermediate-sized non-coding RNAs in zebrafish,
by making RNA library enriched for intermediate-sized (50-500 nt) non-coding RNAs,
collected from zebrafish larvae (5-7 days post fertilization). In particular, we validated
most annotated snoRNAs and identified few hundreds of novel snoRNAs making the
most comprehensive annotations of zebrafish snoRNAs. Host genes for most
snoRNAs showed no evidence for independent transcription of snoRNAs, suggesting
they are co-transcribed by host genes. Interestingly, host (coding and non-coding)
genes require non-canonical transcription initiation machinery, as indicated by TCT
initiation signals, that is associated with translation machinery. 5’-end of many
snoRNAs overlaps with CAGE 5’-ends, suggesting either they are capped or
undergo post-transcriptional modification, which is also evolutionary conserved in
human snoRNAs. Small RNAs derived from snoRNAs are generated from most
snoRNAs and provide first evidence of sd-snoRNAs produced in oocytes, suggesting
their potential importance during early embryogenesis.</p>
Fri, 03 May 2013 00:00:00 GMThttp://hdl.handle.net/1956/72022013-05-03T00:00:00ZUtvidelse og formell sikkerhetsanalyse av Dynamic Presentation Generator
http://hdl.handle.net/1956/7191
Utvidelse og formell sikkerhetsanalyse av Dynamic Presentation Generator
Vines, Aleksander
Master thesis
Oppgaven omhandler sikkerhetsproblematikk i Dynamic
Presentation Generator og undersøker muligheten for
å bruke en single-sign-on via Mi side.
Thu, 02 May 2013 00:00:00 GMThttp://hdl.handle.net/1956/71912013-05-02T00:00:00ZSketch-based Storytelling for Cognitive Problem Solving: Externalization, Evaluation, and Communication in Geology
http://hdl.handle.net/1956/7175
Sketch-based Storytelling for Cognitive Problem Solving: Externalization, Evaluation, and Communication in Geology
Lidal, Endre Mølster
Doctoral thesis
<p>PROBLEM solving is an important part of all engineering and scientific activities. It
is present, for instance, when experts want to develop more fuel-efficient cars or
when they are searching for oil and gas in the subsurface. Many alternatives have to be
examined and evaluated before the optimal solution is found. Solving such problems is
not only performed inside the mind of the scientist, but it is also an interaction between
mind and scribbles, sketches, or visualizations on papers, on blackboards, and on computers.
For problem solving in expert teams, this externalization through sketches and
visualizations also plays an important communicative role.</p>
<p>This dissertation presents research for assisting the problem-solving process on the
computer, through novel technological advances in the fields of illustrative visualization
and sketch-based modeling. Specifically, it targets problems that are related to
evolutionary processes. Firstly, inspired by storytelling, the domain experts can express
their ideas for solution as stories. These stories are based on sketches that the
experts draw, utilizing a novel temporal-sketching interface inspired by a flip-over canvas
metaphor. Further, the dissertation describes a set of sketching proxy geometries,
such as the box-proxy geometry, that the experts can take advantage of when drawing
three-dimensional (3D) sketches. These proxy geometries support the task of mapping
a two-dimensional input (2D), e.g., a mouse or a digitizer tablet, to a 3D sketch. Solving
difficult problems require that many different solutions are evaluated to identify the
most optimal one. This dissertation introduces the story-tree, a tree-graph data structure
and visualization, which manages and provides access to an ensemble of alternative
stories. The story-tree also provides an interface where the stories can be evaluated and
compared. This playback of the stories is done through automatic animations of the
2D sketches. The third challenge addressed in this dissertation is to communicate the
optimal solution to decision-makers and laymen. By combining the animated 2D story
sketches with illustrative visualization techniques it is possible to automatically synthesize
and animate 3D models. These animations can be combined with new cutaway
visualization techniques to reveal features hidden inside such 3D models.</p>
<p>All of these contributions have been investigated in the context of the problemsolving
tasks relevant to the early phase of petroleum exploration. This phase is characterized
by having very little ground-through data available. Thus, a large solution
space needs to be explored. Even so, the geologists need to produce models that can
predict if petroleum is present. In addition to working with few data, the geologists also
work under heavy time constraints because of the competition between the oil companies
exploring the same area. The contributions from this dissertation have created
enthusiasm among the domain experts and already, a new research initiative has materialized from the work described in this dissertation. Based on the feedback from the
domain experts, we can conclude that the contributions presented in this dissertation
form a valuable step towards better tools for problem solving, involving the computer,
for the domain investigated here.</p>
Tue, 25 Jun 2013 00:00:00 GMThttp://hdl.handle.net/1956/71752013-06-25T00:00:00ZXHM: A system for detection of potential cross hybridizations in DNA microarrays
http://hdl.handle.net/1956/7174
XHM: A system for detection of potential cross hybridizations in DNA microarrays
Flikka, Kristian; Yadetie, Fekadu; Lægreid, Astrid; Jonassen, Inge
Journal article; Peer reviewed
<p>Background: Microarrays have emerged as the preferred platform for high throughput gene
expression analysis. Cross-hybridization among genes with high sequence similarities can be a
source of error reducing the reliability of DNA microarray results.</p> <p>Results: We have developed a tool called XHM (cross hybridization on microarrays) for
assessment of the reliability of hybridization signals by detecting potential cross-hybridizations on
DNA microarrays. This is done by comparing the sequences of the probes against an extensive
database representing the transcriptome of the organism in question. XHM is available online at
http://www.bioinfo.no/tools/xhm/.</p><p>Conclusions: Using XHM with its user-adjustable parameters will enable scientists to check their
lists of differentially expressed genes from microarray experiments for potential crosshybridizations.
This provides information that may be useful in the validation of the microarray
results.</p>
Fri, 27 Aug 2004 00:00:00 GMThttp://hdl.handle.net/1956/71742004-08-27T00:00:00ZNovice Difficulties with Language Constructs
http://hdl.handle.net/1956/7167
Novice Difficulties with Language Constructs
Rosbach, Alexander Hoem
Master thesis
Programming is a difficult skill to learn, and
programming courses have
high dropout rates. In this thesis we study the
problems that students have
during their first introductory programming course
at The University of
Bergen. We inspect the solutions that they submit
for the given assignments,
and look at the frequency of the different kinds
of mistakes in their work.
We present a problem taxonomy that we use to
classify the mistakes
found to be the most common, and conclude that a
significant part of
the problems are observable misconceptions. We
introduce a web-based
tool, Javis, that we have developed to aid the
students with these kinds of
problems.
Based on the experience and knowledge gained
during this work we
present a proposal of a grading by annotation
scheme. This scheme is specif-
ically designed to increase the quality of the
feedback given to students
on their submitted work and provide valuable
feedback to the teachers
regarding the problems that their students have.
Thu, 01 Aug 2013 00:00:00 GMThttp://hdl.handle.net/1956/71672013-08-01T00:00:00ZAn implementation of a Feedback Vertex Set algorithm.
http://hdl.handle.net/1956/7082
An implementation of a Feedback Vertex Set algorithm.
Sivertsen, Arvid Soldal
Master thesis
An implementation, improvements to implementation
and empirical results.
Feedback Vertex Set on undirected and unweighted
graphs.
Mon, 03 Jun 2013 00:00:00 GMThttp://hdl.handle.net/1956/70822013-06-03T00:00:00ZNetwork coding in Bluetooth networks
http://hdl.handle.net/1956/7061
Network coding in Bluetooth networks
Stenvoll, Roger
Master thesis
This thesis discusses the possibility to apply network coding to a Bluetooth piconet. A protocol is proposed. This protocol is based on using deterministic linear network coding. The proposed alphabet size is binary, and the encoding equation is a trivial parity check code. By using the proposed code the encoding scale easily by the number of source nodes in the network, and do not require exchange of coding equations. The encoding and decoding is performed using bitwise XOR of the packets, and does not require any pre-computed look-up table, nor a great amount of dedicated memory to store intermediate packets. Finally, the encoding and decoding is not computational hard. Network coding applied as the proposed protocol is only beneficial to the master node and the communication from the master node to the slave nodes. Furthermore, it does not give any erroneous protection. Information exchanged in the network will be easier available to all the slave nodes in the network, and would require system to maintain confidentiality if this is required. However, this is trivially achieved in a Bluetooth network without network coding as well, and the same countermeasure should be enforced in all Bluetooth network. Not only when applying network coding. A theoretical study of the proposed algorithm shows a gain in throughput, and reduced power consumption. These features are appreciated by computational and power challenged nodes. This efficiency is maximized when there are few source nodes in the network, and large frame sizes (DH5). The theoretical study is verified by a simulator designed to this purpose.
Thu, 01 Oct 2009 00:00:00 GMThttp://hdl.handle.net/1956/70612009-10-01T00:00:00Z