Measuring Game Temperature With UCT-Monte Carlo

If you have a question about this talk, please contact Carl Scheffler.

Exhaustive search in reasonably complex game trees, such as that of the board game Go, is extremely expensive. A particular Monte Carlo policy called UCT (upper confidence bounds applied to trees) has emerged over the last three years as a very promising lever on such reinforcement learning problems, but it has recently run into scaling problems when attempting to beat humans on large Go boards. It is a highly heuristic approach to games.
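For readers unfamiliar with UCT, the selection step at each tree node is commonly based on the UCB1 bandit rule: pick the move with the highest average reward plus an exploration bonus that shrinks as the move is visited more often. The following is a minimal sketch of that rule only, not the algorithm presented in the talk. The function name `ucb1_select`, the `(visits, reward_sum)` layout and the default constant `c` are assumptions made for illustration, and rewards are assumed to lie in [0, 1].

```python
import math


def ucb1_select(children, total_visits, c=math.sqrt(2)):
    """Return the index of the child with the highest UCB1 score.

    children: list of (visits, reward_sum) pairs, one per legal move
              (hypothetical layout, rewards assumed to lie in [0, 1]).
    total_visits: number of times the parent node has been visited.
    c: exploration constant (assumed default; tuned in practice).
    """
    best_index, best_score = None, -math.inf
    for i, (visits, reward_sum) in enumerate(children):
        if visits == 0:
            # An untried move is always selected first.
            return i
        mean = reward_sum / visits
        bonus = c * math.sqrt(math.log(total_visits) / visits)
        if mean + bonus > best_score:
            best_index, best_score = i, mean + bonus
    return best_index


if __name__ == "__main__":
    # Two equally visited moves: the one with the better average wins.
    print(ucb1_select([(100, 60.0), (100, 40.0)], 200))
```

In a full UCT search this rule is applied recursively from the root to a leaf, a random playout is run from there, and the result is added to the statistics of every node on the path.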

Combinatorial Game Theory, on the other hand, is a branch of pure mathematics that provides a sturdy framework for sub-divisible full-information games. It uses an abstract concept called “Temperature” to develop approximate strategies whose error relative to the perfect line of play is bounded. Unfortunately, it is extremely tedious to discover the temperature of games like Go using traditional exhaustive search.
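As a rough illustration of what Temperature means, consider the simplest hot game, a switch {a | b} with numbers a > b: its mean value is (a + b)/2 and its temperature is (a - b)/2, that is, how much a player gains by moving first. A temperature-based strategy then plays in the hottest subgame of a sum. The sketch below covers only this textbook special case and is not the method of the talk; the function names and the representation of a switch as a `(left, right)` pair are assumptions made for illustration.

```python
from fractions import Fraction


def switch_mean_and_temperature(left, right):
    """Mean and temperature of the switch {left | right}.

    Assumes left > right (a hot game); other cases are not handled here.
    """
    if left <= right:
        raise ValueError("expected a switch with left > right")
    left, right = Fraction(left), Fraction(right)
    return (left + right) / 2, (left - right) / 2


def hottest_component(switches):
    """Index of the subgame with the highest temperature in a sum of switches."""
    return max(
        range(len(switches)),
        key=lambda i: switch_mean_and_temperature(*switches[i])[1],
    )


if __name__ == "__main__":
    subgames = [(1, -1), (5, -3), (2, 0)]  # hypothetical sum of three switches
    print(switch_mean_and_temperature(5, -3))
    print(hottest_component(subgames))
```

For positions in real games such as Go the subgames are far more complex than switches, which is why computing their temperature by exhaustive search is so laborious.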

In this talk I will present preliminary results of an attempt to combine the two worlds of Monte Carlo planning and Combinatorial Game Theory to produce a UCT algorithm that measures Temperature while simultaneously searching for good moves in small (sub)games. There is a faint hope that this could lead to “divide-and-conquer” solutions for search in general AND/OR trees with bounded rewards.

This talk is about work in progress and is part of the preparation for my first-year report.

This talk is part of the Machine Learning Journal Club series.

© 2006-2025 Talks.cam, University of Cambridge.