In Part 1, I explained my motives for writing software to play Bejeweled Blitz. In Part 2, I defined the terms and general outline of a program to automatically play Bejeweled Blitz. In Part 3, I'll start at the screen level and work all the way down to the pixel, showing how I detect the color and state of each gem on the grid.
Where to Begin
I run 64-bit Ubuntu at home, and my browser is Firefox. To capture and analyze the contents of the screen, I used XLib API calls. In X, every window is laid out in a hierarchy starting with the root window that holds the desktop, taskbars, and all the top-level application windows. So first I open my display with XOpenDisplay and store it for the life of the capture job, since it gets used throughout the process. Next, I wrote a function to search a given window for the word "Bejeweled" in its menu text (using XGetWMName). If found, it returns a handle to the window. Otherwise it uses XQueryTree to get an array of all the immediate children and recursively calls itself with each of them. Then it is just a simple matter of calling that function initially with the result of RootWindow(disp, DefaultScreen(disp)).
Now we drop down a level from screen to window: specifically, the Firefox browser in which Bejeweled is running. To get an image of that window, it is as easy as calling XGetWindowAttributes to see how big it is, and then XGetImage to get an XImage pointer that we can analyze pixel by pixel. To get a pixel, I use XGetPixel, passing the XImage that resulted from XGetImage, and the X/Y coordinates of the pixel I want. This returns me the RGB value of the pixel as a long integer, which I can break into separate color levels with a little ANDing and shifting.
The next problem is finding the grid origin. There are a lot of better ways to do this, but I chose the simple brute-force method. In the picture on the right, I've put a red box around the group of pixels that I search for to find the top-left corner of the grid. To do this, I simply examine every pixel until I find one that matches the first pixel in the red box. Then I check to see if every other pixel in my test section also matches. If so, I offset to where the grid origin is and proceed. Otherwise, I move to the next pixel and do it all over again. Not especially efficient, but it gets the job done. This spot on the board is key, because it is only there during game play, and it never changes color except for short time periods during hypercube and crosshair detonation.
Once I know the grid origin, I can divide the grid into an 8x8 array of cells, with each cell 40x40 pixels in size. At this point I had to get a little more creative, because of the dynamic nature of the gems and the board. The background changes color frequently in response to multiplier changes, power-ups, and game mode. The gems also spin when they're clicked on, making pixel-by-pixel identification impossible. The key here is to focus on what matters, and to eliminate that which doesn't.
Special Multiplier Processing
First I test each cell to see if it is a multiplier. Through a great deal of manual playing and taking screenshots, I discovered that the x2, x3, and x5 multipliers all had the exact same X shape on the gems, but the x4 was a little different - I haven't analyzed anything higher than x5. I decided the best way to handle this was to look for a set of pixels that were white, forming the X shape, but examine only those pixels that were common to both shapes. What I ended up with was the set of pixels depicted to the right, where the gray pixels are white only in x4 or only in x2/3/5. Then it was a simple matter of checking each pixel in the gem in the right positions to see if it was white. Anything else that wasn't in the must-be-white list could be ignored.
If the previous step determined that the cell is a multiplier, then I do a special color test on it, different from other gems. I check the color of a single pixel just above the "X". Based on that color, I know the color of the multiplier cell and I stop processing it further for color. In the picture on the right you can see the pixelated shape of the white X, as well as a red dot where I check the multiplier for color.
Sensing the Aura
The next bit of special processing is to determine whether the cell is flaming or is a crosshair. These are almost as important to detect as multipliers, because using them increases the chance of getting a multiplier: especially crosshairs, which will generate a multiplier on every use, so long as the multiplier time limit has expired. Crosshairs also require some special color processing later, so we need to know if the current cell is a crosshair before we start looking at color.
The way I detect crosshairs and flames is to look at the top-middle of the cell, actually bleeding over into the cell above it by one pixel and extending down two pixels into the current cell. This area is never occupied by gems at rest, so it is a good place to search for auras. Based on the average color in this region, I determine if we have a crosshair, a flame, or a normal gem in this cell.
Finally, I check the color. To do this, I only look at the middle 12x12 square of pixels, because this area is all gem (no background) for all colors and shapes, and never is corrupted by flaming aura. I take the simple average of each of the RGB values in the pixels to come up with the average color. For non-crosshair gems, I can be pretty precise because there is seldom any variation. Crosshairs pulsate, though, so I started by examining a bunch of frames of crosshairs and finding optimum RGB values. Then the program works outward from these values until the calculated average value fits into the range of one of the colors. In the picture on the right, we can see a flaming blue gem with a red border around its aura zone and its color zone.
Once I know the color and state, I can move on to the next cell, repeating the process until all 64 cells have been identified. If falling gems, hypercubes, or other temporary embellishments cause a cell to be undetected or mis-detected, it usually has little to no effect on the outcome of the game. There is enough going on at any given time that little pockets of misinformation can be absorbed.
The conclusion is in Part 4: Playing the Game
No comments:
Post a Comment