Maxi/Mini-mize through Linear Programming
I often struggle to explain the field of Operations Research (OR) and the role of an OR professional in industry to my friends and family members. The popularity of other applied mathematics disciplines such as data science and machine learning further mystifies the identity of OR. This is my humble attempt to immerse you into the fascinating universe of Operations Research by hovering over a galaxy named mathematical optimization and exploring the planet of Linear Programming.
What is mathematical optimization?
Mathematical optimization is a branch of applied mathematics that deals with selecting the best out of all possible outcomes. The overarching question to model mathematically tends to be “how should I make a set of decisions within my limitations such that I get the maximum bang for the buck?” The highlighted words of the sentence essentially build the skeleton of optimization models.
Set of decisions formally termed as decision variables – e.g. which surgeon should perform what surgery in what operating room on what date at what time
Limitations formally termed as constraints – e.g. there are only 3 operating rooms that are open for 8 hours, there are only 2 assistants available between 5 surgeons, there are only 10 post-surgery beds, etc.
Maximum bang for the buck formally termed as objective function – e.g. maximize usage of operating rooms, or minimize patient wait time, etc.
A generalized optimization model looks like this in mathematical language:
When the problem in real world can be structured and quantified in this format, it can be solved using the modeling techniques of mathematical optimization.
Below are more examples from various industries that can be formulated as optimization models to either maximize or minimize an objective value:
Decide how many factories/warehouses to open and where should they be located to minimize operating cost
Decide how much of what product to produce at what factory to maximize profit
Decide how to route food deliveries given multiple origins and destinations to minimize customer wait times
Decide how to fill a trailer with multiple shipments to maximize trailer utilization
Decide how many operating rooms should be open and for what time intervals to maximize utilization
Schedule jobs for processors to minimize wait times
Create robust electricity networks to balance the load
There are different categories of optimization models based on the types of objective function, decision variables, and constraints. Below table is an overview of different categories.
Note: Non-linear program has a bit abstract definition – anything that is not linear in nature falls under non-linear program.
Notice the use of “program” while categorizing mathematical models. If you also assumed the popular meaning of program (computer program) while reading, you are not alone. However, history suggests that this usage is indeed anachronistic since the term “linear programming” was coined way before “programming” got associated with computers. Linear programs were initially utilized by the US military and they used the term program to refer to proposed training and logistics schedules. In this context, programming means a step by step procedure to solve an optimization problem.
One could wonder about the need of converting a real world problem to a math model. It may initially seem more intuitive to calculate each possible outcome and pick the best of all the outcomes. For example, you need to choose 3 furniture items out of 5 available items such that they remain at the top of your preference list and can also fit in your Uhaul. Say, you can fit 50 units of space in Uhaul and below are the specifications for each furniture item.
The brute way of finding the best set of three furniture items would be to calculate all the 10 combinations (5C3) and check which of the combinations abide the space limitation of 50 units. You can then select the combination that covers the maximum value of the sum of the preference scores. This may not seem complicated for such a small exercise. However, the number of combinations exponentially increases as the number of decisions and outcomes increase, making the brute method of calculation almost intractable. One could get smarter to use a heuristic that selects the items in the descending order of preference and ascending order of space. This heuristic will not provide the most optimal - the best - solution for all the cases, including the furniture problem. If an optimization model was created instead to solve the same problem, finding the best solution is guaranteed. (most of the time!)
The benefit of formulating real world problems in mathematical models is to leverage the algorithms exploiting model specific mathematical properties such that the optimal solutions are created in a reasonable amount of time and with limited computational powers. In practice, the math model (LP/MILP/QP) is formulated using a modeling language (JuMP/AMPL/GAMS) and is solved using solvers (CPLEX/Gurobi/Mosek) that can utilize algorithms best suited for the problem at hand. The image below shows the hierarchy of mathematical models, from formulation in modeling languages to solution through algorithms customized for each model type.
What is Linear programming?
The theory of linear programming was first introduced by a soviet mathematician and economist Leonid Kantorovich in 1939. It gained more traction when George Dantzig published the simplex algorithm and John von Neumann developed the theory of duality in 1947. We will briefly touch on the simplex algorithm (an algorithm to solve linear programs) and duality in later sections.
A linear program in standard format with continuous variables, linear constraints, and linear objective function can be written in the following manner:
An important concept in mathematical optimization models, which is easier to understand in the context of linear programming, is of feasible region. A feasible region is an area or a set of values that satisfies all the constraints (e.g. linear inequalities such as x1 + x2 <= 5). For the example shown in the figure above, x1 = 4 and x2 = 1 satisfies the first constraint but not the second one and so point (4,1) is not part of the feasible region. x1 = 3 and x2 = 2 on the other hand satisfies both constraints, therefore point (3,2) is part of the feasible region and could be an optimal solution. In summary, the optimal solution for any optimization model must be one of the values from the feasible region.
Let’s understand linear programming in more detail through an example.
Is there anything that Arya cannot do?
During her adventurous expedition to the West of the Westeros, Arya Stark – the Night King slayer – got mesmerized by the simple and mystical life of the anthropomorphic land she stumbled upon. There she met a dashing dragon Jaqen to fly in love with and started a store to sell copies of her precious weapons, Catspaw dagger (d) and Needle sword (s), as antiques. She needs 1 dragon fire and 3 magic spells to make one dagger. She needs 4 dragon fires and 2 magic spells to make one sword. Each weapon should be sold in its own leather cover that she sources from another shopkeeper in limited quantity, 6 for daggers and 10 for swords. She can have up to 24 dragon fires and 22 magic spells in a day. The dagger brings a profit of 5 gold coins while the sword brings 4 gold coins. How many daggers and swords should be made from the available resources in a day to maximize profits assuming that everything made gets sold? Arya has determined to solve this puzzle by writing a linear program.
Decision variables: d (how many daggers), s (how many swords)
maximize z = 5d + 4s
d + 4s <= 24 (1)
3d + 2s <= 22 (2)
0 <= d <= 6 (3)
0 <= s <= 10 (4)
First and second constraints are for the limited availability of dragon fire and magic spells while third and fourth constraints are for the limited availability of dagger and sword covers. The objective, z = 5d + 4s, is to maximize total profit made by selling daggers and swords.
How can she solve this linear program? She can try different combinations of d and s such that it satisfies all the constraints and maximizes the profit. But this can be very tedious and time consuming. So, she decided to visualize these equations and see how she can solve it.
The colored lines show the constraints with an equality (e.g. d + 4s = 24) and the inequality is translated into a feasible region under each equality line. To elaborate, the region below each line (due to <= inequality) is feasible for each constraint individually and the feasible region considering all constraints is the intersection of all constraint specific feasible regions. The answer to how many swords and daggers to make lies within the highlighted feasible region. Now, she adds a line representing the objective function that has a value of 10 to start with and moves it across the feasible region with other values such as 20, 30, 40, etc. to find a point that gives the maximum possible value (see the image below). In this case when the line is moved towards the farthest right point in the feasible region, maximum profit value of 40 (5*4 + 4*5) is achieved. This means that making 4 daggers and 5 swords each day brings Arya the maximum profit of 40 gold coins.
Did you notice that the optimal solution is found by simply solving two constraints related to dragon fire and magic spell? Other two constraints related to weapon covers are not an active part of the solution. This observation from the graphical method is important in answering following what-if questions.
What happens if Arya decides to increase the profit on the dagger from its current value of 5 gold coins? Will the solution of producing 4 daggers and 5 swords change? For what profit amount, would the solution change?
Let’s try to answer these questions starting with making the profit on dagger to 5.5 gold coins from 5 and keeping the rest of the parameters as they are. This does not change the feasible region since we are not changing constraints. As shown in the figure below, the line representing objective function (brown dashed line) slightly turned clockwise. It also shows that this much change in objective function did not change the solution from before and still stands at the intersection of dragon fire and magic spell constraints. Therefore, Arya can increase the profit by 2 gold coins overall without changing the “production plan”.
Now, for what profit amount, would the solution change? If we keep rotating the objective function in the clockwise direction, it will eventually coincide with the magic spell constraint and dagger cover constraint. At that time, the objective function becomes 6d + 4s. In other words, when Arya increases the profit on dagger from 5 to 6 gold coins, the solution with a maximum profit value of 44 is achieved - making 6 daggers and 2 swords. This implies that as she increases the profit on the dagger while keeping the same profit on the sword, it makes sense to make as many daggers as she can and then make swords with the remaining resources. That is why the solution is to make 6 daggers (the maximum she can make limited by the availability of dagger pockets) and then to make 2 swords. This also suggests that if she keeps increasing the profit on dagger from 6, the solution will not change since she cannot make more than 6 daggers anyway.
Let’s get back to the original problem to explore another interesting observation that is with the magic spell constraint. It seems that if Arya had more magic spells, meaning if the line 3d + 2s <= 22 could be shifted more towards right (3d + 2s <= 22 + c), then she might be able to produce more daggers which gives 1 extra gold coin of profit compared to swords. The question is, how many more magic spells should she have to increase profit?
Let’s say a Raven named Gundry practices black magic next door and is ready to sell Arya the magic spells at 1 gold coin per spell. How many more magic spells should she buy?
There are few possibilities:
1) She decides to buy a small number of magic spells c such that the solution is still at the intersection of dragon fire and magic spell constraints
This means that the solution found by solving below linear equations is (d, s) = (4 + 0.4c, 5 – 0.1c)
d + 4s = 24 (1)
3d + 2s = 22 + c (2)
As we assign the solution to the objective function z = 5d + 4s – c = 5(4+0.4c) + 4(5-0.1c) - c, the profit is 40 + 0.6c. A gain of 0.6 gold coins. Notice that, c is subtracted from the original objective function to reflect spending 1 gold coin per spell for c extra spells. Below figure shows how the solution changes minutely to still be at the same intersection when c is 1.
2) She decides to buy a large number of magic spells c such that the solution is at the intersection of dragon fire, magic spell, and dagger cover constraint
In this case, d is 6, s is 4.5 ([24 – 6]/4), and c is 5. With this solution the profit gained is 43 gold coins. 3 more gold coins in profit from the original problem.
3) She decides to buy more than 5 magic spells out of greed. Below figure shows how magic spell constraint is no longer a part of the solution intersection when c > 5. Therefore, keeping the solution at 6 daggers and 4.5 swords with declining profit (c increases in 6d + 4s – c with constant 6d + 4s). This says that Arya is buying more magic spells than she needs to make weapons.
Arya could also ask if the offer to purchase additional magic spells at 1 gold coin per spell is profitable. She can try various purchase values to make sure her profit is maintained with the extra purchase. However, wouldn’t she be tired with all these what-if analyses? Also, would she have to go through this exercise again if Gundry changes his offer? The original problem is perfect to get the answer of how many daggers and swords to make for given profit values and resources. But, when there is a sensitivity around the changes in resources, the original problem becomes a bit out of context. Could she frame the original problem slightly differently to answer slightly different questions around sensitivity?
The question would be what per unit price is worth for each of the resources, dragon fires (f), magic spells (m), dagger covers (pd), and sword covers (ps) (decision variables) to keep the profit at least to its current values (constraints)? For example, if the price that Arya can afford for a magic spell is 1 gold coin to retain the current profit value, then purchasing it for 1.5 could diminish her profits. To generalize, when the prices are > 0, profit can be increased only when extra amount is purchased at the rate below the price they are worth for. If the price is 0, purchasing extra amount of those resources will not change the solution and the profit value. These prices are often known as shadow prices.
Same data from the original problem:
Linear program written differently:
Decision variables: f (how many gold coins per dragon fire), m (how many gold coins per magic spell), pd (how many gold coins per dagger cover), ps (how many gold coins per sword cover)
maximize z = 5d + 4s minimize z’ = 24*f + 22*m + 6*pd + 10*ps
d + 4s <= 24 (1) 1f + 3m + 1pd + 0ps >= 5 (1)
3d + 2s <= 22 (2) 4f + 2m + 0pd + 1ps >= 4 (2)
f, m, pd, ps >=0 (3)
The first constraint says that a dagger must be worth at least its profit value of 5. Similarly, the second constraint says that a magic spell must be worth at least its profit value of 4.
This problem is impossible to solve through graphical method since it has four dimensions (f, m, pd, and ps). If the problem is solved through a clever algorithm simplex (more in later section), we get f = 0.2, m = 1.6, pd = 0, and ps = 0. This means that if Arya purchases magic spells from Gundry at 1 gold coin per spell, she will make profit because she can afford at max 1.6 gold coins for a magic spell. Notice that prices for the covers of sword and dagger are zero which means that their worth in increasing the profit is nothing. This confirms our observation of cover constraints in the original problem being passive for the optimal solution. The problem can now be plotted in two dimensions since ps and pd are 0.
Arya can validate the price dynamics by assuming the purchase cost to be 2 gold coins for a magic spell from Gundry and check how her profit changes. In this case assuming that the value of c does not change the active constraints, profit becomes 40 – 0.4c, a real decline!
The way Arya transformed the original linear program into a different format to answer questions around the sensitivity of the parameters is not a coincidence for the problem at hand. There is a beautiful term associated with this transformation, called duality. The original problem is called the primal problem and the transformed problem is called the dual problem. Every maximization primal problem has its minimization dual problem and vice versa. The primal-dual relationship provides two different perspectives of the same problem and one could derive two different insights from the same. In our example, the primal problem answered how many weapons to produce of each kind in order to maximize the profit and the dual problem answered what is the worth of each ingredient with respect to the profit values. Duality also sheds light on which constraints in the primal problem are crucial for the objective function, dragon fire and magic spell constraint in our example. In addition, the duals of certain primal problems are easier to solve.
If we notice closely, decision variables of the primal problem become constraints for the dual problem while constraints of the primal problem become decision variables of the dual problem.
Below is the general transformation between primal and dual problems with Arya’s example.
The primal dual relationship is also important to prove optimality (no other solution can be better than the optimal solution). This can be further understood through weak and strong duality theorems. (An entire article can be dedicated to this topic)
The graphical method of solving a linear program seems useful but what happens if there are more than 2 dimensions? It can get tricky to visualize in three dimensions and impossible as the number of dimensions increases. This is where algorithms prove their metal. Linear program of any dimension can be solved by various algorithms like simplex, revised simplex, interior point, etc. This book covers these algorithms in detail for further interest.
It is fairly simple to understand how simplex works graphically. A linear program in standard format with the maximizing objective function (z) needs to be converted to minimization (-z) for the algorithm to work. The algorithm utilizes the observation from graphical method that an optimal solution occurs at a vertex of a feasible region. It starts with a base vertex when the feasible region is not empty, let’s say (0,0) for Arya’s example as shown in the figure below. It then moves to the adjacent vertex that has lower objective value. In our example, there are two adjacent vertices (6,0) with the objective value of -30 and (0,6) with the objective value of -24. Since (6,0) has lower value (-30 compared to -24), the algorithm moves to (6,0). It keeps on traversing to adjacent vertices until it finds a point that has lower value compared to its adjacent vertices or finds the problem to be unbounded (infinitely minimizing). For our example, when the algorithm is at (4,5) vertex, both the adjacent vertices (6,2) and (0,6) have lower values than (4,5) and that is how the optimality is declared for (4,5) vertex.
There are many real world problems that can be formulated and solved as linear programs.
The diet problem - from various food items, select ones that satisfy all the nutritional requirements while minimizing the total cost of purchasing those items
Linear model fitting – Given the set of observations for various inputs, find weights of a function that fits the observations such that the error distance is minimized
Resource allocation – e.g. how many different kind of labor hours to allocate to produce different products such that the profit is maximized
Classification – Support vectors – find a hyperplane that separates two sets of outcomes. This hyperplane is used to classify a new input to a specific outcome such that the error of misclassification is minimized
That’s all for today. I hope the article at least provided enough context about OR and LP to have a laugh at the below meme :) Stay safe!
I deeply thank Isabella, Srijan, and Girish for reviewing this article. In addition, kudos to Charu for creating this knowledge-rich platform and providing the opportunity to contribute.