Decision making in multiplayer environments: application in backgammon variants

Abstract

Tesauro’s TD-Gammon was the first major success of machine learning, and of artificial intelligence in general, demonstrating world-class performance against the human backgammon champion of the time. Even more impressively, the method required little expert knowledge, relying on self-play and on neural networks trained with reinforcement learning. However, apart from standard backgammon, there exist several as yet unexplored variants of the game, which use the same board and the same number of checkers and dice, but differ in the rules for moving the checkers, the starting positions, or the direction of movement. In this thesis we focus on three such variants that are popular in Greece and neighboring countries: Portes, Plakoto, and Fevga (collectively called Tavli). Motivated by the successful methods of TD-Gammon, we extend existing reinforcement learning methods and devise new ones for building artificially intelligent agents, and we show that expert-level play can also be achieved in these games. All the re ...

All items in the National Archive of Ph.D. Theses are protected by copyright.

DOI
10.12681/eadd/43622
Handle URL
http://hdl.handle.net/10442/hedi/43622
ND
43622
Alternative title
Λήψη αποφάσεων σε πολυπρακτορικά περιβάλλοντα: εφαρμογή σε παραλλαγές ταβλιού
Author
Papahristou, Nikolaos (Father's name: Eleftherios)
Date
2015
Degree Grantor
University of Macedonia Economic and Social Sciences
Committee members
Ρεφανίδης Ιωάννης
Σαμαράς Νικόλαος
Σακελαρίου Ηλίας
Στεφανίδης Γεώργιος
Σατρατζέμη Μαρία
Σιφαλέρας Άγγελος
Βεργίδης Κωνσταντίνος
Discipline
Natural Sciences; Computer and Information Sciences
Keywords
Reinforcement learning; Neural networks; Backgammon; Temporal difference learning; Self-play
Country
Greece
Language
English
Description
146 pp., im., tbls., fig., ch.