 |
Tantrix Tournament Reports You can view reports as a guest, but please REGISTER to write reports, vote in polls or see commands in your own language.
|
| View previous topic :: View next topic |
| Author |
Message |
Pellepen
Joined: 14 Apr 2004 Posts: 154
|
Posted: Sun Jul 06, 2008 10:02 pm Post subject: Roll Out Competition |
|
|
I think some of us are participating in the Roll Out Competition:
http://www.tantrix.com/english/TantrixRollout.html
But is there anyone who is working on a program so we will have a winner in this competition?
Perhaps there is a lot of people working on it now?
Please let us know if there are some programmers out there somewhere who are capable of making a "roll out" program. |
|
| Back to top |
|
 |
fafa
Joined: 26 May 2006 Posts: 127 Location: Delft, Netherlands
|
Posted: Tue Jul 08, 2008 12:56 pm Post subject: |
|
|
i was a bit confused by this competition, because i thought the aim was to come up with a good, efficient roll out program, but as we aren't allowed to use any of the existing programs, it seems like a huge waste of time to start building something from the ground up, when dave (or anyone using the goodbot program as a basis) could much more easily do so.
So i think this should either be about guessing, and then, at a set deadline, you could just let goodbot play that particular game (from the given starting point) say, 1000 times. (which is the most basic way of determining the odds). Or it's about making a good, fast (because playing a game 1000 times probably isn't the most efficient way) roll-out program, but then i think we should be able to use goodbot as a basis.
But i'm very curious too if anyone is working on it. |
|
| Back to top |
|
 |
blick
Joined: 16 Aug 2003 Posts: 83 Location: Debrecen, Hungary
|
Posted: Tue Jul 08, 2008 6:25 pm Post subject: |
|
|
| I have been thinking about this problem, and of course, my first idea was goodbot as well. But, goodbot uses the same algorythm, so in every game, only luck would make difference, not the first move, or sg. But i don't have a better idea - yet. |
|
| Back to top |
|
 |
galca
Joined: 23 Nov 2004 Posts: 176
|
Posted: Tue Jul 08, 2008 7:22 pm Post subject: |
|
|
i think there is a conceptual difference between robot and rollout:
in rollout the tile sequence is known and this knowledge can be (and should be) used when the best move is chosen. robot program, however, decides on the best move over all possible tile sequences. These two "best moves" can be different: for example, the move which is best in the given position when we know the tile sequence can be too risky in general.
I am not trying to say that using existing robot is not an advantage (it is), but maybe it is not such a big advantage: there are a lot more things that a good rollout program can do to improve its decisions which i assume don't exist in the lobby robots.
As for the people working or planning to work on rollout program: i had some thoughts of turning my endgame program (which is used so far only by me and a couple of other players - AC's in fact - because of it's huge cheating potential) into rollout program, but i am not sure i'll have enough time for it in the nearest future. |
|
| Back to top |
|
 |
Zormac
Joined: 13 Aug 2003 Posts: 743 Location: Budapest, Hungary
|
Posted: Wed Jul 09, 2008 10:00 am Post subject: |
|
|
| galca wrote: | i think there is a conceptual difference between robot and rollout:
in rollout the tile sequence is known and this knowledge can be (and should be) used when the best move is chosen. robot program, however, decides on the best move over all possible tile sequences. These two "best moves" can be different: for example, the move which is best in the given position when we know the tile sequence can be too risky in general.
I am not trying to say that using existing robot is not an advantage (it is), but maybe it is not such a big advantage: there are a lot more things that a good rollout program can do to improve its decisions which i assume don't exist in the lobby robots.
As for the people working or planning to work on rollout program: i had some thoughts of turning my endgame program (which is used so far only by me and a couple of other players - AC's in fact - because of it's huge cheating potential) into rollout program, but i am not sure i'll have enough time for it in the nearest future. |
I disagree Galca, I think the order of the remaining tiles was never meant to be fix in rollout. That would be entirely pointless, given that it has nothing to do with a real game situation.
That's why I have also been thinking about turning either GoodBot or Oliver into a rollout checker program (since they had an automated match against each other, surely it can't be impossible to automate such a batch either). |
|
| Back to top |
|
 |
galca
Joined: 23 Nov 2004 Posts: 176
|
Posted: Wed Jul 09, 2008 12:02 pm Post subject: |
|
|
| Zormac wrote: | | galca wrote: |
I disagree Galca, I think the order of the remaining tiles was never meant to be fix in rollout. That would be entirely pointless, given that it has nothing to do with a real game situation.. |
|
Somehow I was sure it was clear from the roll out page, so i checked it now and it was not there of course... So i was close to disagree myself, but looked at the math wiki pages and now i am back to agreement with myself:
we want to estimate the probability of Mike winning the game from the given position. Lets say the tiles can come out in N different ways. Prob. theory says (i think it is called "law of total probability") that
Pr(Mike wins) = Pr(Mike wins| tile sequence 1)*Pr(tile sequence 1) + ... + Pr(Mike wins | tile sequence N)*Pr(tile sequence N)
So the task would be solved if we could find if the position is win or lose for every tile sequence. This is practically impossible for 15 tiles left positions, but playing it again and again can give a good approximation.
Letting goodbot playing itself "as is" is another way to estimate this probability but i have a gut feeling that it will be less accurate because of 1 reason: if tile sequence is not fixed, the robot will always play the same first move. So what happens if another move is played and what are the odds of winning in this case will not affect the results and it should! |
|
| Back to top |
|
 |
Zormac
Joined: 13 Aug 2003 Posts: 743 Location: Budapest, Hungary
|
Posted: Wed Jul 09, 2008 12:38 pm Post subject: |
|
|
| galca wrote: | | Zormac wrote: |
I disagree Galca, I think the order of the remaining tiles was never meant to be fix in rollout. That would be entirely pointless, given that it has nothing to do with a real game situation.. |
we want to estimate the probability of Mike winning the game from the given position. Lets say the tiles can come out in N different ways. Prob. theory says (i think it is called "law of total probability") that
Pr(Mike wins) = Pr(Mike wins| tile sequence 1)*Pr(tile sequence 1) + ... + Pr(Mike wins | tile sequence N)*Pr(tile sequence N)
So the task would be solved if we could find if the position is win or lose for every tile sequence. This is practically impossible for 15 tiles left positions, but playing it again and again can give a good approximation.
Letting goodbot playing itself "as is" is another way to estimate this probability but i have a gut feeling that it will be less accurate because of 1 reason: if tile sequence is not fixed, the robot will always play the same first move. So what happens if another move is played and what are the odds of winning in this case will not affect the results and it should! |
Yup, that's about it, with the small modification that the conditional probabilities are all 0 or 1, depending on whether the position with the given tile order is a winning position or not.
Now, 15 factorial sounds a big enough number not to even try to brute-force all positions, hence (I think) the suggestion on the original page to take a smaller sample (say, 500, or similar), which evaluates in a reasonable amount of time, repeat that observation and if the winning probabilities is within the statistic error margin, we consider it measured - I don't think that an accuracy beyond this is realistically achievable. And for this kind of measurement, the currently available robot tools should be perfectly suitable.
Or maybe I'm missing something from the original intention...
The point about the robot always making the same first move is OK... but, this being a game played with never knowing what comes out of the bag next, in any given position there is a single choice of moves to achieve what you are after. The bot has its own measure and will decide on that, in the same way, every time.
Note that this is the same for human players, however the terms "what you are after" and "your own measure" can change over time - in the 6th game of a KO match, you will rather go for a one tile loss that is 100% certain, if that's all you need to go through, rather than going for a risky win that can lose you the game by a bigger margin hence forcing you into tiebreak - in other games it can be precisely an outside chance of a big loop that you are after. Clearly, in the rollout, your goal is to maximise the chance to win the game, irrespective of margins (note that this is VERY different to robots' approach, which is probably to achieve the highest margin). Obviously, in endgames, where uncertainty disappears, all these strategies map down to finding the best position. |
|
| Back to top |
|
 |
galca
Joined: 23 Nov 2004 Posts: 176
|
Posted: Wed Jul 09, 2008 12:54 pm Post subject: |
|
|
| Zormac wrote: |
Yup, that's about it, with the small modification that the conditional probabilities are all 0 or 1, depending on whether the position with the given tile order is a winning position or not. |
correct, and we want to know is it 1 or 0 with the highest confidence level we can achieve
| Zormac wrote: |
Now, 15 factorial sounds a big enough number not to brute-force all positions, hence (I think) the suggestion on the original page to take a smaller sample (say, 500, or similar), which evaluates in a reasonable amount of time, repeat that observation and if the winning probabilities is within the statistic error margin, we consider it measured - I don't think that an accuracy beyond this is realistically achievable.
|
this is how i see it too (i never looked at the formulas, but statistically even 500 out of 151 can be enough)
| Zormac wrote: |
And for this kind of measurement, the currently available robot tools are perfectly suitable.
|
and this is exactly the point we disagree on.
If the robot makes a wrong first move, the total measurement worth nothing.
recursively, same is true for any other position deeper on the way, so any "wrong" decision there reduces the confidence. Same robot will make the same mistake over and over. As i see it, something needs to be done here - fixing tile sequences is one way to solve this problem, human expert checking all robot decisions is another but this is quite a lot for 500 games x 27 tiles depth!
EDITING: and, to better explain myself before anyone spends too much time to understand my point, here is an example:
suppose we want to know what is the winning probability from some position. There is a move, A, that is significantly better than any other move. Lets say it is the winning move in 90% of possible tile sequences. But for the remaining 10% if the player plays A he loses. However, another move, B, is winning move for half of the remaining sequences. Probability to win from the given position is then 95% However, any perfect robot which doesn't look at the sequence but rather tries to maximize the winning probability OVER MOVES, will ALWAYS play A and the rollout result will be 90%. Now, how will you feel if your rollout guess was 95%?
| Zormac wrote: |
Or maybe I'm missing something from the original intention... |
Would be nice to get Mike's reply to this...
----
Anyway, I would like to suggest the Bond's endgame search to be used for the rollout: it gives the best confidence for the endgame (last 13 tiles) - it finds 100% best result. It is not fast enough to be a robot (20 min per endgame worst case), but it is fast enough for rollout with less than 1 min per endgame in average. It can then evaluate 1000 positions in less than 1 day which i believe is acceptable speed. It can run stand-alone (on files) or can be integrated to another program. So, if tantrix headquaters would like to use it, please let me know - I'll be glad to help. |
|
| Back to top |
|
 |
Mikem
Joined: 14 Aug 2003 Posts: 22
|
Posted: Wed Jul 09, 2008 2:44 pm Post subject: Tiles must come out randomly |
|
|
For me, the whole point of a roll out is to analyse something which varies. If the sequence of tiles was fixed as suggested then this would be like solving a puzzle...certainly still difficult but not what Tantrix is about.
Perhaps it is unfortunate that applet gives the ability to control the tile sequence, with the confusion added to by the "Challenge Room".
But in theory Tantrix is like backgammon..ie there is no control over what the dice will say next, or in what order the tiles will come out. |
|
| Back to top |
|
 |
galca
Joined: 23 Nov 2004 Posts: 176
|
Posted: Wed Jul 09, 2008 3:17 pm Post subject: |
|
|
| If i understand you, Mike, the goal of rollout is not to find probability to win but to find the move with maximal probability to win? (and submit this number?) (the difference is explained above in my previous post) |
|
| Back to top |
|
 |
Mikem
Joined: 14 Aug 2003 Posts: 22
|
Posted: Wed Jul 09, 2008 3:34 pm Post subject: |
|
|
| Well, if it were possible I would play the game (ie roll out) the game from the position one million times using the two best players in the world. The question is what % would each player win. This is the type of decision you need to be evaluating constantly when a doubling cube is involved...as in the tournament we played last weekend http://www.tantrix.com/english/tournaments/Tantrix2008Vauban.html |
|
| Back to top |
|
 |
Zormac
Joined: 13 Aug 2003 Posts: 743 Location: Budapest, Hungary
|
Posted: Wed Jul 09, 2008 4:17 pm Post subject: |
|
|
| Mikem wrote: | | Well, if it were possible I would play the game (ie roll out) the game from the position one million times using the two best players in the world. The question is what % would each player win. This is the type of decision you need to be evaluating constantly when a doubling cube is involved...as in the tournament we played last weekend http://www.tantrix.com/english/tournaments/Tantrix2008Vauban.html |
Sure, but the question is the same, even if the two best players in the world are playing - what are they up to? Do they want to win each game? or do they want to maximise their TP score as they would in a tournament play? |
|
| Back to top |
|
 |
fafa
Joined: 26 May 2006 Posts: 127 Location: Delft, Netherlands
|
Posted: Wed Jul 09, 2008 4:36 pm Post subject: |
|
|
i agree with the non-fixed sequence here. Like Zormac said (as far as i can follow all this mathematical talk) if the sequence would be known, hypothetically (as mike pointed out, it would be quite a difficult puzzle) the chances of red winning would indeed be 100 or 0% (depending on whether or not red can find a series of moves which blue cannot counter.)
and i'm not sure i can follow the logic to get to that percentage of 95% in your example earlier, because if you would compute blue's chances of winning in the same way (if red plays A, he has 10% chance of winning, if red plays B, 50%) blue wins (10+90/2) 55% of the times, so the odds would be 95-55 so if move A gives red a 90% chance of winning, that is the optimal move, and should therefor always be played, countered by blue's optimal move etc.
the point made about bot making the same mistake every time is a valid one though, but then it's a matter of improving bot, which is a good idea then anyway, so we have a 'bestbot' to do the rollout with
And apparently (as mike makes the comparison with a cube-game) the aim is to win, and not to maximise margin. Which would then indeed (if bot does go for maximum margin, at risk of losing) require some modification of bot. |
|
| Back to top |
|
 |
galca
Joined: 23 Nov 2004 Posts: 176
|
Posted: Wed Jul 09, 2008 4:55 pm Post subject: |
|
|
| Fafa wrote: |
and i'm not sure i can follow the logic to get to that percentage of 95% in your example earlier, because if you would compute blue's chances of winning in the same way (if red plays A, he has 10% chance of winning, if red plays B, 50%) blue wins (10+90/2) 55% of the times, so the odds would be 95-55 so if move A gives red a 90% chance of winning, that is the optimal move, and should therefor always be played, countered by blue's optimal move etc. |
Not exactly
how many games red wins? for 90% of games, red will play A and win. For another 5% of sequences, red will play B and win.
Blue wins only when he has a winning counter move to whatever red plays, and it happens only on 5% of sequences. So 95% : 5%
This is the difference between "chances that position is a win" and "chances that a player can win it". Which are 95% and 90% in the example.
If we all agree on definition, we are looking for the "chances that a player can win the game from the given position". In this case, 90% is the correct answer and we stay with the "normal" player/robot decision, which is good.
In this case, the only question remaining is : do we trust GoodBot enough to run rollout now or we'll wait until the competition deadline to see if there is an improved robot? It would be interesting for example to compare GoodBot rollouts vs Oliver's : I think this can give us a feeling how reliable they are. (Of course, with the small changes of margin to %). |
|
| Back to top |
|
 |
Mikem
Joined: 14 Aug 2003 Posts: 22
|
Posted: Wed Jul 09, 2008 6:17 pm Post subject: |
|
|
| Yes sorry, this subject sprang from the idea of applying doubliing cube theory to Tantrix and I should have made it more clear that we do mean that both players must maximise their chance to win, as in in 'cube' games margins are not taken into account. |
|
| Back to top |
|
 |
Retro
Joined: 23 Aug 2006 Posts: 100 Location: Edinburgh, Scotland
|
Posted: Thu Jul 10, 2008 1:09 am Post subject: |
|
|
Obviously we need a robot to run through a sample of the 15! permutations of draw sequences; as well as Backgammon I believe this technique is also used in televised poker.
Another question is, how accurate does the robot's win% prediction need to be? We are allowed to guess in units of 0.1%, but if rollouts of 500 or 1000 samples were done multiple times we would get a different answer every time!
If Galca's program can do 1,000 runs in a day, this would estimate a win% to within a range of 5%* (with 90% confidence), which doesn't seem that good, i.e. if you run the program again with a different set of 1000 tile sequences, you would probably get a different answer (and if you get the same it would be coincidence).
To be 90% sure a rollout program has the 'true' win% to within 1% may require 27,000 runs** - to get it within 0.1% requires 2,700,000! Is this possible?
On another slightly cynical point, surely whoever develops the accepted rollout program has a distinct advantage in the competition, by being able to see in advance what result to expect?
---
Appendix:
I'm sure some select people may want to know where those figures came from, so here's my working:
Suppose the 'true' winning proportion is p. Then, on a rollout of n trials, the observed success rate is P, where P = (number of wins)/n.
nP has a binomial distribution with parameters (n,p), but as n will be very large we approximate nP to the normal distribution with mean np and variance npq, thus P~N(p,pq/n)
Calculations:
(*) If n=1000 and p is about 0.5, then the standard error in P is sqrt(0.25/1000)=0.0158 (1.58%). From statistical tables we can see the size of the equitailed 90% confidence interval is 3.29 this amount [Phi(.95)-Phi(.05)], or 5.2%.
(**) If we want to be 90% sure our guess P is within x(=0.01), then we must run enough trials such that the standard error is only x/3.29. We already know the standard error can be up to sqrt(0.25/n), so we can solve algebraically to find x=0.01 => n=27056, and x=0.001 => n=2.7e+6 |
|
| Back to top |
|
 |
Metallor
Joined: 01 Mar 2006 Posts: 178 Location: Auckland, NZ
|
Posted: Thu Jul 10, 2008 3:42 am Post subject: |
|
|
| Retro wrote: |
On another slightly cynical point, surely whoever develops the accepted rollout program has a distinct advantage in the competition, by being able to see in advance what result to expect?
|
But thats the whole point isnt it? Surely someone who has gone to the time and effort to develop a roll out piece of software should have a distinct advantage over someone who has just made a guess after thinking about it for a few minutes.
My programming capacity is quite limited, however i did think of an algorithm quite similar to you Ben. Run the game through the required number of times so that all possible permutations (suppose its 15!) are run and have the bots play out each tile permutation.
Another possibility is to have different bots playing each colour. For example, run the above algorithm with GB playing against itself, the Oliver playing against himself, then GB blue Oliver red, then GB red Oliver blue.
Another alternative is to chain Pekko to a computer desk and leave him playing tantrix until the roll out competition is over. |
|
| Back to top |
|
 |
galca
Joined: 23 Nov 2004 Posts: 176
|
Posted: Thu Jul 10, 2008 6:53 am Post subject: |
|
|
Thanks Ben for the confidence calculations!
Even with 1 min per the whole game (which is i think is not so much for a good robot) it will take about a month
and, since this subject was opened and i found myself looking at it, here is one of the possible endgame positions that can happen :
How much time will it take you to find 10 tiles win for Mike? |
|
| Back to top |
|
 |
lytnin
Joined: 28 Feb 2004 Posts: 366 Location: Windsor NSW, AUSTRALIA
|
Posted: Thu Jul 10, 2008 7:23 am Post subject: |
|
|
My solution has been PM'd to galca. So that others will also have a go.
I think this may even end up an 11 or 12 tile win. |
|
| Back to top |
|
 |
galca
Joined: 23 Nov 2004 Posts: 176
|
Posted: Thu Jul 10, 2008 7:41 am Post subject: |
|
|
this was fast! (and simpler than my solution)
but it is only 10 (and you already know why so I am removing it from this post )
Last edited by galca on Thu Jul 10, 2008 8:37 am; edited 2 times in total |
|
| Back to top |
|
 |
lytnin
Joined: 28 Feb 2004 Posts: 366 Location: Windsor NSW, AUSTRALIA
|
Posted: Thu Jul 10, 2008 7:53 am Post subject: |
|
|
Cool  |
|
| Back to top |
|
 |
galca
Joined: 23 Nov 2004 Posts: 176
|
Posted: Thu Jul 10, 2008 8:13 am Post subject: |
|
|
ok, this one is quite easy, but when this game was played (L8 of WTC 2006) Cat didn't spot her killer move (can imagine Matt's relief when another move was played: he wouldn't probably go to L4 otherwise ...)
well i hope it was so long time ago that I will be forgiven for publishing it here
and the goal is 10 tiles win again |
|
| Back to top |
|
 |
lytnin
Joined: 28 Feb 2004 Posts: 366 Location: Windsor NSW, AUSTRALIA
|
Posted: Thu Jul 10, 2008 8:30 am Post subject: |
|
|
PM'ing solution  |
|
| Back to top |
|
 |
Zormac
Joined: 13 Aug 2003 Posts: 743 Location: Budapest, Hungary
|
Posted: Thu Jul 10, 2008 11:23 am Post subject: |
|
|
| galca wrote: | ok, this one is quite easy, but when this game was played (L8 of WTC 2006) Cat didn't spot her killer move (can imagine Matt's relief when another move was played: he wouldn't probably go to L4 otherwise ...)
well i hope it was so long time ago that I will be forgiven for publishing it here
and the goal is 10 tiles win again |
Aaaargh... put'em back please those who aren't even awake at those hours had no chance to even look at the puzzles :S |
|
| Back to top |
|
 |
lytnin
Joined: 28 Feb 2004 Posts: 366 Location: Windsor NSW, AUSTRALIA
|
Posted: Fri Jul 11, 2008 1:18 am Post subject: |
|
|
Z , the puzzles haven't been removed, just my answer and response to what the final score could/would have been, so that others will get to have a look and put some thought in as well. I'm sure Galca will publish them in a short time.  |
|
| Back to top |
|
 |
|
|
You cannot post new topics in this forum You cannot reply to topics in this forum You cannot edit your posts in this forum You cannot delete your posts in this forum You cannot vote in polls in this forum
|
Powered by phpBB © 2001, 2005 phpBB Group
|