Skip to content
Go back

Death, taxes and optimization

A few months ago, I created a programming challenge, an invitation to do some old-fashioned programming from before the LLM era. The goal was to write a solver for Rushhour. Give it a try! It is a lot of fun, almost as much fun as playing the game!

I implemented two classical search algorithms: BFS and Aβˆ—A^{*}.

According to the Book πŸ“—, Aβˆ—A^{*} is cost-optimal if it uses an admissible heuristic, i.e., one that never overestimates the cost to reach the goal. The heuristic I used counts the number of blocking cars in front of the red car πŸš—. This heuristic is admissible because one would need at least that many moves to reach the goal. Right? Therefore, the solutions provided by Aβˆ—A^{*} should be cost-optimal.

So, what’s the point of expressing Rushhour as an optimization problem?

A few calculations will quickly make you reconsider. With a board of 6Γ—66\times6 and a maximum number of 5050 steps, plus variables for representing the cars and other auxiliary variables, you end up with thousands or tens of thousands of discrete variables and constraints, possibly including nonlinear or indicator constraints linking the car moves to their positions on the board.

In short, there is no practical point. It’s just for fun.

The main reason I wanted to do this is that when my son brought home this game from the library, the first thing I wondered was if I could formulate it as an optimization problem. So here we are.

If you prefer reading the code, you can jump directly to it here.

Representing the board and the cars

First, we must decide how to represent the board and the cars. The board can be modeled as a three-dimensional binary variable b(T,6,6)b(T, 6, 6), where TT is the maximum number of time steps and 11 represents occupied and 00 represents free. Each car can also be represented as a binary variable of shape (T,N)(T, N), N∈{2,3}N\in \{2,3\}.

I spent quite some time trying to figure out how to link car’s movement to the board, which is non-trivial in a declarative formulation. One cannot simply β€œassign” a variable based on the outcome of another variable.

It took me a 40-minute walk to come up with a different solution that doesn’t require complicated constraints. Rather than modeling the board and the cars separately, I model them together, adding one more dimension to the board b(T,C,S,S)b(T, C, S, S).

This approach simplifies the logic but it adds a lot of unnecessary variables, even if only a few rows and columns have active constraints.

I ended up combining both approaches by modeling the cars and the board as separate variables but operating on them as if the cars were an additional dimension of the board.

β†’ Parameter: T=50T=50, maximum number of time steps.

β†’ Parameter: S=6S=6, board size.

β†’ Parameter: CC, the set of cars in the current game instance, e.g, C={A,B,G,H,I,J,K,O,P,Q,X}C=\{A, B, G, H, I, J, K, O, P, Q, X\}.

β†’ Variable: b(T,S,S)∈{0,1}b(T, S, S) \in \{0, 1\}, the board.

β†’ Variable: cars(T,C,S)∈{0,1}\text{cars}(T, C, S) \in \{0, 1\}, the cars.

β†’ Constraint 1: The link between the board and the cars. This set of constraints is probably best understood by looking at the code.

b(t,i,j)=βˆ‘c∈Ccars(t,c,i)+βˆ‘c∈Ccars(t,c,j)\begin{align*} b(t, i, j) &= \sum_{c \in C} \text{cars}(t, c, i) \\ &+ \sum_{c \in C} \text{cars}(t, c, j) \end{align*}

For all t,i,jt, i, j if car cc is in row ii or column jj.

β†’ Constraint 2: Non-overlapping cars

βˆ‘c∈Cb(t,i,j)≀1forΒ allt,i,j\sum_{c \in C} b(t, i, j) \leq 1 \quad \text{for all}\quad t, i, j

The following figures illustrate valid and invalid car positions according to Constraint 2.

Car A board Car X board Superposition without overlap

Figure: Valid car positions - Car A board (left), Car X board (center), and superposition without overlap (right) at some time tkt_{k}

The following arrangement is not allowed:

Car A board with overlap Car X board Superposition with overlap

Figure: Unfeasible car positions - Car A board (left), Car X board (center), and superposition with overlap (right) at some time tkt_{k}

β†’ Constraint 3: Constant car size.

βˆ‘i=1Scar(t,c,i)=len(c),forΒ allt,c\sum_{i=1}^{S}\text{car}(t, c, i) = len(c), \quad \text{for all}\quad t, c

XX can leave the board; that’s the objective of the game. However, I consider the position in which XX is about to leave to be the last one.

Even if the number of blocks that make up a car is constrained to remain constant, it doesn’t mean that the blocks are automatically contiguous.

β†’ Variable: pin(T,C,Sβˆ’1)∈{0,1}\text{pin}(T, C, S - 1) \in \{0, 1\}

This variable indicates the starting position of the car: the leftmost and uppermost positions for horizontal and vertical cars, respectively.

β†’ Constraint 4: Cars can only start at one position.

βˆ‘i=1Sβˆ’1pin(c,t,i)=1,forΒ allt,c\sum_{i=1}^{S-1} \text{pin}(c, t, i) = 1, \quad \text{for all}\quad t,c

However, I suspect that this condition is redundant and can be removed.

β†’ Constraint 5: The blocks that make up a car must be contiguous.

cars(t,c,j)=βˆ‘k=0len(c)βˆ’1pin(t,c,jβˆ’k)\text{cars}(t, c, j) = \sum_{k=0}^{len(c)-1} \text{pin}(t, c, j-k) forΒ allt,cifΒ 1≀jβˆ’k≀Sβˆ’1\quad \text{for all}\quad t, c \quad \text{if } 1 \le j-k \leq S-1 \quad

Preventing β€œTunneling”

The crux of this formulation is the constraint that ensures cars of length two cannot β€œtunnel” through other cars. This is not an issue for cars of length three because the non-overlapping constraint guarantees that all moves are valid.

As shown in the figure below, the two moves in the middle (+1+1, +2+2) are not problematic because they are accounted for by the non-overlapping constraint. However, the move on the far right is problematic (pun intended πŸ˜‰) because the red car would go through the purple car.

Car starting position Car moved 1 tile right Car moved 2 tiles right Car moved 3 tiles right (tunneling)

Figure. A car that is two tiles long starts in the third row, first column. In a single time step, it moves right by 1, 2, and 3 tiles, respectively.

Model 0

This is the base model, the first one I implemented. It restricts cars from moving more than one tile per turn. This condition prevents tunneling. The objective function is the number of steps it takes to move car XX πŸš— to the exit.

This model solves a normalized version of the puzzle. The optimal value is the number of one-tile moves required to solve the puzzle. This number is equal to the number of one-tile moves in the optimal solution. Unfortunately, there is no one-to-one correspondence between one-tile moves and actual moves. The solution obtained by aggregating one-tile moves generally does not correspond to the optimal solution. You can think of this solution as an upper bound though.

Model 1

This model builds on Model 0. It introduces the variable nmoves\text{nmoves}, which indirectly counts the actual number of moves. A move is defined as a car that is different from the car in the previous turn\text{turn} and that moves any number of tiles. The objective function is the sum over nmoves\text{nmoves}. The resulting sequence is equivalent to the optimal one (although it must be compressed), but the objective value does not represent the actual number of moves (at least not directly). This objective function is not well-behaved, and the problem can become intractable for larger instances.

Model 2

This model uses the same objective function as in Model 0, but it allows cars to move more than one tile per turn. To prevent tunneling, a mask\text{mask} representing the tiles between the starting and the end positions of a single move is introduced. This constraint, which is only active if turn(t,c)=1\text{turn}(t, c)=1, is added to the board as if it were an additional dimension. The rest of the post describes this model in detail.

Definition of Model 2

β†’ Variable: z(T,C,S)∈{0,1}z(T, C, S) \in \{0, 1\}, auxiliary variable to count the number of tiles a car moves in a single turn.

β†’ Variable: turn(Tβˆ’1,C,S)∈{0,1}\text{turn}(T-1, C, S) \in \{0, 1\}, indicates which car can move at each time step.

β†’ Constraint 6: Only one car can move per time step.

βˆ‘c∈Cturn(t,c)≀1forΒ allt>1\sum_{c \in C} \text{turn}(t, c) \leq 1 \quad \text{for all}\quad t > 1

We will need a few tricks to represent Boolean conditions in a linear way.

z=a∧bz = a \wedge bz=a∨bz = a \vee b
1z≀az \leq aa≀za \leq z
2z≀az \leq ab≀zb \leq z
3zβ‰₯a+bβˆ’1z \geq a + b - 1z≀a+bz \leq a + b

Table: Tricks for representing AND and OR logical conditions. Source: jump.dev

β†’ Constraint 7: Cars can only move if it is their turn.

z(t,c,i)=cars(t,c,j)∨cars(tβˆ’1,c,j),t>1,c,jβˆ‘i=1Sz(t,c,i)≀len(c)+turn(t,c)β‹…len(c)\begin{align*} z(t, c, i) &= \text{cars}(t, c, j) \\ & \vee \text{cars}(t-1, c, j), \quad t>1, c, j\\ \sum_{i=1}^{S}z(t, c, i) &\leq len(c) + \text{turn}(t, c)\cdot len(c) \end{align*}

For all t>1t>1, and all cc.

Preventing tunneling

These constraints are rather convoluted, but I haven’t found a simpler way yet. Ideas are welcome!

The basic idea is to create a mask representing the tiles between the car’s starting and ending positions in the current turn. This mask is then added to the board as an additional dimension to ensure there are no obstacles in the path.

β†’ Variable: mask(W,T,C,S,S)∈{0,1}\text{mask}(W, T, C, S, S) \in \{0, 1\}, is the set of variables to implement the approach described above.

W∈{fwd,bwd,or,and1,and2,mask,mask_cond}W \in \{\text{fwd}, \text{bwd}, \text{or}, \text{and1}, \text{and2}, \text{mask}, \text{mask\_cond}\} is just an index to identify the different variables.

β†’ Constraint 8: Calculate the mask representing the car’s path as it moves through the current turn.

mask(fwd,t,c,i)=mask(fwd,t,c,iβˆ’1)+pin(t,c,i)i=1…Smask(bwd,t,c,i)=mask(bwd,t,c,iβˆ’1)+pin(t,c,i)i=Sβˆ’1…1mask(and1,t,c,i)=mask(fwd,tβˆ’1,c,i)∧mask(bwd,t,c,i)mask(and2,t,c,i)=mask(fwd,t,c,i)∧mask(bwd,tβˆ’1,c,i)mask(mask,t,c,i)=mask(and1,t,c,i)∨mask(and2,t,c,i)mask_cond(t,c,i)=mask(mask,t,c,i)βˆ’cars(t,c,i)\begin{align*} \text{mask}(\text{fwd}, t, c, i) &= \text{mask}(\text{fwd}, t, c, i-1)\\ &+ \text{pin}(t, c, i)\quad i=1\dots S\\ \text{mask}(\text{bwd}, t, c, i) &= \text{mask}(\text{bwd}, t, c, i-1)\\ &+\text{pin}(t, c, i)\quad i=S-1\dots 1\\ \text{mask}(\text{and1}, t, c, i) &= \text{mask}(\text{fwd}, t-1, c, i)\\ &\wedge \text{mask}(\text{bwd}, t, c, i)\\ \text{mask}(\text{and2}, t, c, i) &= \text{mask}(\text{fwd}, t, c, i)\\ &\wedge \text{mask}(\text{bwd}, t-1, c, i)\\ \text{mask}(\text{mask}, t, c, i) &= \text{mask}(\text{and1}, t, c, i)\\ &\vee \text{mask}(\text{and2}, t, c, i)\\ \text{mask\_cond}(t, c, i) &= \text{mask}(\text{mask}, t, c, i)\\ &-\text{cars}(t, c, i) \end{align*}

For all t>1,c,it>1, c, i. The last constraint is implemented using the auxiliary variable mask_diff\text{mask\_diff} and big-M constraints. The Boolean operations are implemented using the aforementioned tricks. See the code for details.

Success criterium

The goal of the game is to get the red car XX πŸš— to the exit.

Car X at the exit position

Figure: Car X at the exit position

β†’ Variable: exit(T)∈{0,1}\text{exit}(T) \in \{0, 1\}, indicates wheter car X is at the exit position at time step tt.

β†’ Constraint 9: Car X must be at the exit at some time step t<Tt < T.

exit(t)=cars(t,X,5)∧cars(t,X,6)forΒ allt1=βˆ‘t=1Texit(t)\begin{align*} \text{exit}(t) &= \text{cars}(t, X, 5) \wedge \text{cars}(t, X, 6) \quad \text{for all}\quad t\\ 1 &= \sum_{t=1}^{T} \text{exit}(t) \end{align*}

Objective function

β†’ Objective: Minimize the number of time steps it takes for car X to reach the exit.

Minβˆ‘t=1Ttβ‹…exit(t)\text{Min} \sum_{t=1}^{T} t \cdot \text{exit}(t)

Conclusions

A note on the solvers

I tried HiGHS and cuOpt which was recently open-sourced. However, I only had success with HiGHS. I didn’t investigate why cuOpt failed in these instances.

Based on this and previous experiences, my only conclusion is that HiGHS is a superb solver.

Open questions

I am convinced that this is one is not the best possible formulation. Ideas are welcome!

Final thoughts

I enjoy doing this kind of work because I always learn something new. If you compare the Branch and Bound (B&B) algorithm used by HiGHS with the Aβˆ—A^{*} algorithm, you will see that they are very similar. I learned that someone in the past tried to generalize these ideas β€”search algorithms, dynamic programming, and B&Bβ€” into a single concept. Check out this article.

And a final though on optimization, and why the title of this post is β€œDeath, taxes and optimization”.

Optimization is one of those beautiful, pervasive theories that has made an indelible mark on the history of mathematics, engineering, and many other disciplines since its inception. It will continue to be relevant, even if hidden behind something more showy and bombastic. Mathematical optimization is a fundamental discovery. It will continue to be relevant in the foreseeable future and will survive all the hype.

If those lines sound dramatic and opinionated, they are. They’re also true πŸ˜‰.


Share this post on:

Next Post
Who is up for a programming challenge?