
Projective Simulation compared to reinforcement learning

Bjerland, Øystein Førsund
Master thesis
File
135269761.pdf (3.899 MB)
URI
https://hdl.handle.net/1956/10391
Date
2015-06-01
Collections
  • Department of Informatics [738]
Abstract
This thesis explores the model of projective simulation (PS), a novel approach to artificial intelligence (AI) agents. A PS agent learns by interacting with the environment it is situated in, and can simulate actions before real action is taken. Action selection is based on a random walk through the episodic and compositional memory (ECM), a network of clips that represent previously experienced percepts. The network takes percepts as inputs and returns actions. Through the rewards from the environment, the clip network adjusts itself dynamically so that the probability of taking the most favourable (i.e., most rewarded) action increases in similar subsequent situations. With a feature called generalisation, new internal clips can be created dynamically, so that the network grows into a multilayer network, which improves the classification and grouping of percepts. In this thesis the PS model is tested on a large and complex task: learning to play the classic Mario platform game. Throughout the thesis the model is compared, by means of experimental simulations, to two typical reinforcement learning (RL) algorithms, Q-Learning and SARSA. A framework for PS was built for this thesis, and the games used in the papers that introduced PS were used to validate the correctness of the framework. Games are often used as benchmarks for learning agents, partly because the rules of the experiment are already defined and the results can easily be compared to human performance. The games used in this thesis are the Blocking game, Mountain Car, Pole Balancing and, finally, Mario. The results show that the PS model is competitive with RL on complex tasks, and that the evolving network improves performance. A quantum version of the PS model has recently been proven to realise a quadratic speed-up compared to the classical version, and this was one of the primary reasons for the introduction of the PS model. The quadratic speed-up is promising because training AI agents is computationally heavy and involves large state spaces. This thesis, however, considers only the classical version of the PS model.
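For orientation, the sketch below illustrates the basic PS update the abstract describes. It is a minimal illustration assuming the standard two-layer PS scheme from the literature (h-values resting at 1, a damping parameter gamma, the reward added to the traversed edge); the class and parameter names are hypothetical, this is not the thesis's own framework, and the generalisation mechanism that grows the multilayer network is deliberately omitted.

```python
import random

class PSAgent:
    """Minimal two-layer projective simulation agent (illustrative sketch).

    Percept clips connect directly to action clips; each edge carries an
    h-value. Acting is a one-step random walk from the percept clip to an
    action clip, with hopping probabilities proportional to the h-values.
    Learning damps all h-values toward their resting value of 1 and then
    reinforces the edge that was actually traversed with the reward.
    """

    def __init__(self, actions, gamma=0.01):
        self.actions = list(actions)
        self.gamma = gamma        # damping (forgetting) rate; assumed value
        self.h = {}               # edge weights: (percept, action) -> h-value
        self.last_edge = None     # edge traversed by the most recent action

    def _h(self, percept, action):
        # Unvisited edges rest at h = 1, giving uniform initial behaviour.
        return self.h.get((percept, action), 1.0)

    def act(self, percept):
        # Random walk: pick action a with probability h(s, a) / sum_a' h(s, a').
        weights = [self._h(percept, a) for a in self.actions]
        action = random.choices(self.actions, weights=weights)[0]
        self.last_edge = (percept, action)
        return action

    def learn(self, reward):
        # Damp every stored edge toward 1, then reward the traversed edge.
        for edge in self.h:
            self.h[edge] -= self.gamma * (self.h[edge] - 1.0)
        if self.last_edge is not None:
            self.h[self.last_edge] = self._h(*self.last_edge) + reward
```

A typical interaction loop would call agent.act(percept) to obtain an action, step the environment, and then call agent.learn(reward), so that rewarded percept-action edges become progressively more likely in similar subsequent situations.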
Publisher
The University of Bergen
Copyright
Copyright the Author. All rights reserved
