• norsk
    • English
  • norsk 
    • norsk
    • English
  • Logg inn
Vis innførsel 
  •   Hjem
  • Faculty of Mathematics and Natural Sciences
  • Department of Informatics
  • Department of Informatics
  • Vis innførsel
  •   Hjem
  • Faculty of Mathematics and Natural Sciences
  • Department of Informatics
  • Department of Informatics
  • Vis innførsel
JavaScript is disabled for your browser. Some features of this site may not work without it.

Optimizing Approximate Weighted Matching on Nvidia Kepler K40

Naim, Md.; Manne, Fredrik; Halappanavar, Mahantesh; Tumeo, Antonino; Langguth, Johannes
Conference object, Peer reviewed
Accepted version
Thumbnail
Åpne
Accepted version (16.09Mb)
Permanent lenke
https://hdl.handle.net/1956/16752
Utgivelsesdato
2018
Metadata
Vis full innførsel
Samlinger
  • Department of Informatics [746]
Originalversjon
https://doi.org/10.1109/hipc.2015.15
Sammendrag
Matching is a fundamental graph problem with numerous applications in science and engineering. While algorithms for computing optimal matchings are difficult to parallelize, approximation algorithms on the other hand generally compute high quality solutions and are amenable to parallelization. In this paper, we present efficient implementations of the current best algorithm for half-approximate weighted matching, the Suitor algorithm, on Nvidia Kepler K-40 platform. We develop four variants of the algorithm that exploit hardware features to address key challenges for a GPU implementation. We also experiment with different combinations of work assigned to a warp. Using an exhaustive set of 269 inputs, we demonstrate that the new implementation outperforms the previous best GPU algorithm by 10 to 100x for over 100 instances, and from 100 to 1000x for 15 instances. We also demonstrate up to 20x speedup relative to 2 threads, and up to 5x relative to 16 threads on Intel Xeon platform with 16 cores for the same algorithm. The new algorithms and implementations provided in this paper will have a direct impact on several applications that repeatedly use matching as a key compute kernel. Further, algorithm designs and insights provided in this paper will benefit other researchers implementing graph algorithms on modern GPU architectures.
Utgiver
IEEE
Opphavsrett
Copyright 2015 IEEE

Kontakt oss | Gi tilbakemelding

Personvernerklæring
DSpace software copyright © 2002-2019  DuraSpace

Levert av  Unit
 

 

Bla i

Hele arkivetDelarkiv og samlingerUtgivelsesdatoForfattereTitlerEmneordDokumenttyperTidsskrifterDenne samlingenUtgivelsesdatoForfattereTitlerEmneordDokumenttyperTidsskrifter

Min side

Logg inn

Statistikk

Besøksstatistikk

Kontakt oss | Gi tilbakemelding

Personvernerklæring
DSpace software copyright © 2002-2019  DuraSpace

Levert av  Unit