Show simple item record

dc.contributor.author | Bjerland, Øystein Førsund | eng
dc.date.accessioned | 2015-09-03T13:02:58Z
dc.date.available | 2015-09-03T13:02:58Z
dc.date.issued | 2015-06-01
dc.date.submitted | 2015-06-01 | eng
dc.identifier.uri | https://hdl.handle.net/1956/10391
dc.description.abstract | This thesis explores the model of projective simulation (PS), a novel approach for an artificial intelligence (AI) agent. The PS model learns by interacting with the environment it is situated in, and allows for simulating actions before real action is taken. Action selection is based on a random walk through the episodic and compositional memory (ECM), a network of clips that represent previously experienced percepts. The network takes percepts as input and returns actions. Through the rewards from the environment, the clip network adjusts itself dynamically such that the probability of taking the most favourable (i.e., most rewarded) action is increased in similar subsequent situations (see the sketch after this record). With a feature called generalisation, new internal clips can be created dynamically so that the network grows into a multilayer network, which improves the classification and grouping of percepts. In this thesis the PS model will be tested on a large and complex task: learning to play the classic Mario platform game. Throughout the thesis the model will be compared to typical reinforcement learning (RL) algorithms, Q-Learning and SARSA, by means of experimental simulations. A framework for PS was built for this thesis, and the games used in the papers that introduced PS were used to validate the correctness of the framework. Games are often used as benchmarks for learning agents, partly because the rules of the experiment are already defined and the evaluation can easily be compared with human performance. The games used in this thesis are: the Blocking game, Mountain Car, Pole Balancing and, finally, Mario. The results show that the PS model is competitive with RL on complex tasks, and that the evolving network improves performance. A quantum version of the PS model has recently been proven to realise a quadratic speed-up compared to the classical version, and this was one of the primary reasons for the introduction of the PS model. This quadratic speed-up is very promising, as training AI agents is computationally heavy and often involves large state spaces. This thesis will, however, consider only the classical version of the PS model. | en_US
dc.format.extent | 4088957 bytes | eng
dc.format.mimetype | application/pdf | eng
dc.language.iso | eng | eng
dc.publisher | The University of Bergen | en_US
dc.subject | Kunstig intelligens (artificial intelligence) | nob
dc.subject | Maskinlæring (machine learning) | nob
dc.subject | Simuleringsmetoder (simulation methods) | nob
dc.title | Projective Simulation compared to reinforcement learning | en_US
dc.type | Master thesis
dc.rights.holder | Copyright the Author. All rights reserved | en_US
dc.description.degree | Master i Informatikk (Master's in Informatics) | en_US
dc.description.localcode | MAMN-INF
dc.description.localcode | INF399
dc.subject.realfagstermer | http://data.ub.uio.no/realfagstermer/c007197
dc.subject.realfagstermer | http://data.ub.uio.no/realfagstermer/c003530
dc.subject.realfagstermer | http://data.ub.uio.no/realfagstermer/c007657
dc.subject.nus | 754199 | eng
fs.subjectcode | INF399
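
The abstract describes PS action selection as a random walk over a clip network whose edge weights (h-values) are damped toward their initial value and reinforced along rewarded transitions. The following minimal two-layer Python sketch illustrates that mechanism under stated assumptions: it is not the thesis framework, the names (PSAgent, gamma, reward_scale) are hypothetical, and the generalisation feature that grows the network into multiple layers is omitted.

import random
from collections import defaultdict

class PSAgent:
    """Minimal two-layer projective simulation agent (illustrative sketch).

    Percept clips connect directly to action clips; an action is chosen
    by a one-step random walk whose hopping probabilities are
    proportional to the edge weights (h-values).
    """

    def __init__(self, actions, gamma=0.01, reward_scale=1.0):
        self.actions = list(actions)
        self.gamma = gamma              # damping: h-values decay toward 1
        self.reward_scale = reward_scale
        # h-values default to 1, i.e. a uniform walk for unseen percepts
        self.h = defaultdict(lambda: 1.0)
        self.last_edge = None

    def act(self, percept):
        """Random walk from the percept clip to an action clip."""
        weights = [self.h[(percept, a)] for a in self.actions]
        action = random.choices(self.actions, weights=weights)[0]
        self.last_edge = (percept, action)
        return action

    def learn(self, reward):
        """Damp all h-values toward 1, then reinforce the traversed edge."""
        for edge in list(self.h):
            self.h[edge] -= self.gamma * (self.h[edge] - 1.0)
        if self.last_edge is not None:
            self.h[self.last_edge] += self.reward_scale * reward

In a game loop such as the Blocking game, one would call action = agent.act(percept), obtain a reward from the environment, and then call agent.learn(reward). The damping parameter controls how quickly the agent forgets, which matters when the environment's reward rules change over time.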


Files in this item


This item appears in the following collection(s)
