Detecting all the colors was pretty tough, and detecting the type of gem in each cell was even tougher; but now we need the software to make a decision and act on it. Specifically, of all the possible trigger/target combinations available on the board at the moment, which one is the best one to do next? I chose to go a pretty simplistic route on this, since I was, after all, writing this for the heck of it.
I do not have the software attempt to predict combos or other more advanced plays to maximize points. This is something that humans playing the game do to some extent without even thinking about it, but it becomes more challenging for software. Consider the picture below. I have highlighted the 4 possible moves by drawing a red blobby line between the trigger and target gems. Which move will generate the most points? If you said "the bottom-right one", you would be correct, because both the green and red gems would be removed, not to mention the bonus you get from doing a combo move like that. By the way, the software would call that move "6,5-7,5". The best move that the software sees from this board is "4,6-3,6" - swapping the white and yellow gems in the bottom-middle of the board. I'll explain why this is chosen shortly.
|Which move is best?|
- Always use paths with multipliers in them first - this is a no-brainer, because that has long-term positive effects on the entire game.
- If a path has a crosshair in it, it will destroy more gems than any other move on the board - except maybe a hypercube, but the software doesn't detect those, so that's moot.
- If a path has a flaming gem in it, its explosion will take out a 3x3 square of gems as well as the path gems themselves, yielding 9-10 gems destroyed on a 3-gem path.
- Longer paths are better than shorter ones, because they destroy more gems. Plus, they generate crosshairs, hypercubes, and flaming gems, so that's goodness too.
- Paths lower on the board have more opportunities to create combos than paths near the top. All else being equal, pick the lowest path on the board.
OK, great, we have rules of how to pick a path, but we've gotten a little ahead of ourselves. How do we find the damn things? In the software, I have a Grid object with 8x8 GridItem objects (aka Gems) in it. During the detection phase, I set the color and type of each of those gem objects. When I'm done, I have a software representation of the board as of the last screen capture. Then I look at each gem and check to see if it is the leading edge gem of one of the following patterns:
|The four path patterns in their basic form|
Anyway, in examining each gem to see if it is the leading edge of one of the patterns above, I actually examine it 4 times for each of the 4 patterns above - once with each possible 90-degree rotation of the pattern. Mirror-images are included as well, but they are done within each rotation, so it's only 16 (4x4) checks on each gem instead of 32 (4x4x2).
Once all the processing above is done, we have a full list of every path available for activating. For each path, we generate a score based on the 5 rules discussed above, and then sort them by score in descending order. Then we pick the best possible path and execute it by moving the mouse to the trigger, clicking, moving the mouse to the target, and clicking again. I do the mouse movement and clicking using the XLib API again: XSetInputFocus, XSync, XWarpPointer, XQueryPointer, and XSendEvent in various combinations that I worked out primarily through internet searching and a lot of trial and error.
As I was coding all of this, I went with the most expedient code possible, rather than the highest-performing. Again, it was a project being done for the heck of it, so spending a lot of time designing around performance was too much like work. Nevertheless, as I reached code completion, I started being concerned that the image processing and path detection would take so long that the software wouldn't even outscore me. After all, I can hit 350k pretty reliably every week, and my highest human score was 652k. Besides speed, I also have the human judgment that lets me pick the combo path automatically as we saw at the start of the post.
I needn't have worried: the screenshot, the origin detection, the color and state detection, the Grid-building, the path detection, and the scoring and decision process all takes 9 milliseconds to finish. In fact, this thing is so fast that it finds matches as gems fall that don't really exist. This causes it to have a high mistake rate, so I actually have it sleep (do nothing) for 50 milliseconds between passes to give Bejeweled a little time to catch up. Even so, it's still mighty fast.
Below are two videos, neither one using boosts. The first was captured while I was manually playing, and while it's not exactly a stellar run, it's a fairly representative game, finishing with a score of 166k. The second is a run using my software. It gets confused a couple of times, pausing for a full second while it reacquires the origin; and notice how it just ignores the hypercubes and makes a lot of stupid plays. And yet the score speaks for itself.
Playing by hand, for 166k
Playing by program, for 698k
Yeah, that'll do.