Modeling is a powerful way to solve complex problems. According to the book Modeling Languages in Mathematical Optimization (see the Resources section), models are used to:

  • Explain phenomena
  • Make predictions
  • Assess key factors
  • Identify extreme
  • Analyze trade-offs

In business, spreadsheet software like Microsoft® Excel is often the first choice for modeling problems. Now, spreadsheets are often very intuitive, but they can be limited in what they can ultimately do for large-scale problems. If you’re a developer, you might find it more expedient simply to write a script to solve a modeling problem, because then you can easily integrate the script into other systems. This article introduces the basics of linear optimization in Python using the Pyomo library.

You might be familiar with algebraic modeling languages such as AMPL, AIMMS, and GAMS. The advantage of using Pyomo is that its modeling objects are embedded within a high-level programming language with a rich set of supporting libraries. Another key advantage to doing scientific or mathematical computing in Python is that the language can easily translate algebraic notation into an expression in code. In addition, there are many mature libraries you can leverage. An example is NumPy (see the Resources section), a library that provides additional data structures such as a multi-dimensional array. The NumPy library is used in a code example later in the article because it answers the requirement for a multi-dimensional data structure.

Getting started

To start, install Pyomo. Pyomo is a central component of Coopr, a collection of Python software packages. Download the coopr_install script, which creates a Python virtual environment when you run it with the Python interpreter.

To create a relative directory named “coopr”:

noahs‑MacBook‑Air% python coopr_install

Use the following command to start Pyomo, which puts the relative directory and its executables in your path:

noahs‑MacBook‑Air% source coopr/bin/activate

Use the Pyomo "--help command to get help for using pyomo:

(coopr)noahs‑MacBook‑Air% pyomo ‑‑help

You need a compiler to use both the Virtual Python Environment builder (virtualenv) and Pyomo. On OS X, use the XCode Developer Tools command-line tools. On Linux, use the GNU Compiler Collection (GCC). Once this virtual environment is initialized, you can solve problems with Pyomo in one of two ways:

  • Use the Pyomo command-line tool:
    (coopr)noahs‑MacBook‑Air% pyomo ‑‑solver=glpk
  • Or, embed the initialization inside your script and run it via the Python interpreter:
    Listing 1. Invoking Pyomo inside a script
    #This is an optional code path that allows the script to be run outside of
    #pyomo command‑line.  For example:  python
    if name == 'main':
        #This replicates what the pyomo command‑line tools does
        from coopr.opt import SolverFactory
        opt = SolverFactory("glpk")
        instance = model.create()
        results = opt.solve(instance)
        #sends results to stdout

Pyomo assumes you have at least one solver installed. The GNU Linear Programming Kit (glpk) is the default. Refer to the installation instructions for the solver you want to use. Then make sure Pyomo is able to find this solver on its path.

Modeling strategies

There are two ways to create a data model with Pyomo: Abstract and concrete. The key difference is whether to separate the model and the data.

  • In an abstract model, you separate the model and the data.
  • In a concrete model, you define the data inside the model itself. For example, in a Pyomo script that uses a concrete model, everything is defined inside a single script file. For simpler problems, especially teaching problems, this can be an efficient approach. On the other hand, larger problems often have larger data sets, which makes it impractical to embed the data in the same script as the code needed to work on the data.

At a high level, a Pyomo model consists of variables, objectives, and constraints.

  • Variables are values that are calculated during the optimization.
  • Objective are Python functions that take data and variables, and is either maximized or minimized.
  • Constraints are a way of defining an expression that limits the values a variable can assume.

When creating a Pyomo model, you need to undertand the Python method of notating algebraic expressions. Here is a quick summary that compares expressions between a spreadsheet, algebra, and Python:

Figure 1. Table of notation comparisons
Table of notation comparisons

Pyomo application example: Wyndor

To illustrate a simple linear optimization problem using Pyomo, let’s take a product-mix problem from the book Introduction to Management Science (see the Resources section). The Wyndor factory produces doors and windows. Each of three plants with different available hours can produce either doors or windows. To determine how to maximize the profit available in the limited amount of hours of production, write the following script. It has three main sections: model, objective, and constraint.

  • The model — it’s a concrete model — instantiates the data of the problem, such as the hours available from plants.
  • The objective (which can be either to maximize or minimize) is to maximize profit.
  • The constraint uses a CapacityRule function, written using a Python idiom called list comprehension.

Here’s the script for the Wyndor optimization problem:

Listing 2. Wyndor Pyomo basic examples

"""Wyndor model from Hillier and Hillier *Introduction to Management Science*

One way to do it...
Original Author dlw
Modified ndg

To run this you need pyomo and the glpk solver installed, and need to be
inside of the virtual environment.
When these dependencies are installed you can solve this way (glpk is default):



#Explicit importing makes it more clear what a user needs to define
#versus what is included within the pyomo library
from coopr.pyomo import (ConcreteModel, Objective, Var, NonNegativeReals,
                              maximize, Constraint)

Products = ['Doors', 'Windows']
ProfitRate = {'Doors':300, 'Windows':500}
Plants = ['Door Fab', 'Window Fab', 'Assembly']
HoursAvailable = {'Door Fab':4, 'Window Fab':12, 'Assembly':18}
HoursPerUnit = {('Doors','Door Fab'):1, ('Windows', 'Window Fab'):2,
                ('Doors','Assembly'):3, ('Windows', 'Assembly'):2,
                ('Windows', 'Door Fab'):0, ('Doors', 'Window Fab'):0}

#Concrete Model
model = ConcreteModel()
#Decision Variables
model.WeeklyProd = Var(Products, within=NonNegativeReals)

model.obj = Objective(expr=
            sum(ProfitRate[i] * model.WeeklyProd[i] for i in Products),
            sense = maximize)

def CapacityRule(model, p):
    """User Defined Capacity Rule

    Accepts a pyomo Concrete Model as the first positional argument,
    and and a plant index as a second positional argument

    return sum(HoursPerUnit[i,p] * model.WeeklyProd[i] for i in Products)
    <= HoursAvailable[p]

This statement is what Pyomo needs to generate one constraint for each plant
model.Capacity = Constraint(Plants, rule = CapacityRule)

#This is an optional code path that allows the script to be run outside of
#pyomo command-line.  For example:  python
if __name__ == '__main__':

    #This replicates what the pyomo command-line tools does
    from coopr.opt import SolverFactory
    opt = SolverFactory("glpk")
    instance = model.create()
    results = opt.solve(instance)
    #sends results to stdout

Let’s run the script:

(coopr)noahs‑MacBook‑Air% pyomo

Pyomo returns the following output, which shows one feasible solution that achieves a $3,600 profit:

Listing 3. One feasible solution with $3,600 profit
    (coopr)noahs‑MacBook‑Air% pyomo  
[    0.00] Setting up Pyomo environment
[    0.00] Applying Pyomo preprocessing actions
[    0.00] Creating model
[    0.00] Applying solver
[    0.01] Processing results
    Number of solutions: 1
    Solution Information
      Gap: 0.0
      Status: feasible
      Function Value: 3600.0
    Solver results file: results.json
[    0.02] Applying Pyomo postprocessing actions
[    0.02] Pyomo Finished

Let’s find out which ratio of products you should produce. To do that, cat the output file results.json. The results show this:

"Variable": {
                "WeeklyProd[Doors]": {
                    "Id": 0, 
                    "Value": 2.0
                "WeeklyProd[Windows]": {
                    "Id": 1, 
                    "Value": 6.0

From this output, you see that producing six windows and two doors gives you a profit maximization of $3,600.

Another way you can solve the script is to use Python because of the conditional statement added. This invokes Pyomo inside of the script, instead of using Pyomo to invoke the script.

Heuristic optimization: The greedy randomized traveling salesman

The book In Pursuit of the Traveling Salesman: Mathematics at the Limits of Computation by Bill Cook contains a great example of a problem that doesn’t always scale well as a linear program. It’s the Traveling Salesman Problem, or TSP: Given a list of cities, find the shortest possible route that visits each city exactly once and returns to the original city. The worst-case running time that solves the traveling salesman problem increases exponentially with the number of cities.

Figure 2 is a graph that simulates the amount of time necessary to solve for every possible route, assuming it takes a few seconds to solve for a couple of cities. As you can see, it rapidly becomes impossible to solve the problem in your lifetime.

Figure 2. Traveling salesman exponential time to solve
Traveling salesman exponential time to solve

You need another strategy to find a good result. Often a good, but not a guaranteed optimal, solution is achieved by running simulations and then taking the best results of those simulations. An example of this approach is shown in the Python code below. Randomly select a starting city, and then greedily select the next city with the shortest distance. This simulation can then be run N, or multiple times, and the shortest path among the simulations will be run.

Listing 4. Greedy shortest-distance selections

noahs-MacBook-Air% python 20 
Running simulation 20 times
Shortest Distance: 128
Optimal Route: [
 ('PCG', 'GPS', 1), 
 ('GPS', 'URS', 1),
 ('URS', 'WFC', 0),
 ('WFC', 'MCK', 2), 
 ('MCK', 'SFO', 16), 
 ('SFO', 'ORCL', 20), 
 ('ORCL', 'HPQ', 12), 
 ('HPQ', 'GOOG', 6), 
 ('GOOG', 'AAPL', 11), 
 ('AAPL', 'INTC', 8), 
 ('INTC', 'CSCO', 6), 
 ('CSCO', 'EBAY', 0), 
 ('EBAY', 'SWY', 32), 
 ('SWY', 'CVX', 13)

Listing 5. TSP greedy random start

#!/usr/bin/env python
Traveling salesman solution with random start and greedy path selection
You can select how many iterations to run by doing the following:

python 20 #runs 20 times


import sys
from random import choice
import numpy as np
from routes import values

dt = np.dtype([('city_start', 'S10'), ('city_end', 'S10'), ('distance', int)])
data_set = np.array(values,dtype=dt)

def all_cities(mdarray):
    """Takes a multi-dimensional array in this format and finds unique cities

    array([["A", "A"],
    ["A", "B"]])

    cities = {}
    city_set = set(data_set['city_end'])
    for city in city_set:
        cities[city] = ""
    return cities

def randomize_city_start(all_cities):
    """Returns a randomized city to start trip"""

    return choice(all_cities)

def get_shortest_route(routes):
    """Sort the list by distance and return shortest distance route"""

    route = sorted(routes, key=lambda dist: dist[2]).pop(0)
    return route

def greedy_path():
    """Select the next path to travel based on the shortest, nearest path"""

    itinerary = []
    cities = all_cities(data_set)
    starting_city = randomize_city_start(cities.keys())
    #print "starting_city: %s" % starting_city
    cities_visited = {}
    #we want to iterate through all cities once
    count = 1
    while True:
        possible_routes = []
        distance = [] 
        #print "starting city: %s" % starting_city
        for path in data_set:
            if starting_city in path['city_start']:
                #we can't go to cities we have already visited
                if path['city_end'] in cities_visited:
                    #print "path: ", path

        if not possible_routes:
        #append this to itinerary
        route = get_shortest_route(possible_routes)
        #print "Route(%s): %s " % (count, route)
        count += 1
        #add this city to the visited city list
        cities_visited[route[0]] = count
        #print "cities_visited: %s " % cities_visited
        #reset the starting_city to the next city
        starting_city = route[1]
        #print "itinerary: %s" % itinerary

    return itinerary

def get_total_distance(complete_itinerary):

    distance = sum(z for x,y,z in complete_itinerary)
    return distance

def lowest_simulation(num):

    routes = {}
    for i in range(num):
        itinerary = greedy_path()
        distance = get_total_distance(itinerary)
        routes[distance] = itinerary
    shortest_distance = min(routes.keys())
    route = routes[shortest_distance]
    return shortest_distance, route

def main():
    """runs everything"""

    if len(sys.argv) == 2:
        iterations = int(sys.argv[1])
        print "Running simulation %s times" % iterations
        distance, route = lowest_simulation(iterations)
        print "Shortest Distance: %s" % distance
        print "Optimal Route: %s" % route
        #print "All Routes: %s" % data_set
        itinerary = greedy_path()
        print "itinerary: %s" % itinerary
        print "Distance: %s" % get_total_distance(itinerary)

if __name__ == '__main__':

The TSP script uses a NumPy multi-dimensional array as the source of data. If you want to run this example, you need to install NumPy (see the Resources section). The array is defined in a file called Here is what those routes look like.

Listing 6. NumPy multi-dimensional array containing city-to-city routes

values = [
("AAPL", "CSCO", 14),
("AAPL", "CVX", 44),
("AAPL", "EBAY", 14),
("AAPL", "GOOG", 14),
("AAPL", "GPS", 59),
("AAPL", "HPQ", 14),
("AAPL", "INTC", 8),
("AAPL", "MCK", 60),
("AAPL", "ORCL", 26),
("AAPL", "PCG", 59),
("AAPL", "SFO", 46),
("AAPL", "SWY", 37),
("AAPL", "URS", 60),
("AAPL", "WFC", 60),

("CSCO","AAPL" ,14),
("CSCO", "CVX", 43),
("CSCO", "EBAY",0),
("CSCO", "GOOG",21),
("CSCO", "GPS",67),
("CSCO", "HPQ",26),
("CSCO", "INTC",6),
("CSCO", "MCK",68),
("CSCO", "ORCL",37),
("CSCO", "PCG",68),
("CSCO", "SFO",57),
("CSCO", "SWY",32),
("CSCO", "URS",68),
("CSCO", "WFC",68),

("CVX","AAPL" ,44),
("CVX", "CSCO",43),
("CVX", "EBAY",43),
("CVX", "GOOG",36),
("CVX", "GPS",43),
("CVX", "HPQ",40),
("CVX", "INTC",41),
("CVX", "MCK",46),
("CVX", "ORCL",39),
("CVX", "PCG",44),
("CVX", "SFO",45),
("CVX", "SWY",13),
("CVX", "URS",44),
("CVX", "WFC",44),

("EBAY","AAPL" ,14),
("EBAY", "CSCO",0),
("EBAY", "CVX",43),
("EBAY", "GOOG",21),
("EBAY", "GPS",67),
("EBAY", "HPQ",26),
("EBAY", "INTC",6),
("EBAY", "MCK",68),
("EBAY", "ORCL",37),
("EBAY", "PCG",68),
("EBAY", "SFO",57),
("EBAY", "SWY",32),
("EBAY", "URS",68),
("EBAY", "WFC",68),

("GOOG", "CSCO",21),
("GOOG", "CVX",36),
("GOOG", "EBAY",21),
("GOOG", "GPS",48),
("GOOG", "HPQ",6),
("GOOG", "INTC",15),
("GOOG", "MCK",49),
("GOOG", "ORCL",16),
("GOOG", "PCG",48),
("GOOG", "SFO",36),
("GOOG", "SWY",32),
("GOOG", "URS",49),
("GOOG", "WFC",49),

("GPS","AAPL" ,59),
("GPS", "CSCO",67),
("GPS", "CVX",43),
("GPS", "EBAY",67),
("GPS", "GOOG",48),
("GPS", "HPQ",45),
("GPS", "INTC",62),
("GPS", "MCK",03),
("GPS", "ORCL",34),
("GPS", "PCG",01),
("GPS", "SFO",18),
("GPS", "SWY",53),
("GPS", "URS",01),
("GPS", "WFC",01),

("HPQ","AAPL" ,14),
("HPQ", "CSCO",26),
("HPQ", "CVX",40),
("HPQ", "EBAY",26),
("HPQ", "GOOG",6),
("HPQ", "GPS",45),
("HPQ", "INTC",20),
("HPQ", "MCK",46),
("HPQ", "ORCL",12),
("HPQ", "PCG",46),
("HPQ", "SFO",32),
("HPQ", "SWY",37),
("HPQ", "URS",46),
("HPQ", "WFC",46),

("INTC", "CVX",41),
("INTC", "EBAY",6),
("INTC", "GOOG",15),
("INTC", "GPS",62),
("INTC", "HPQ",20),
("INTC", "MCK",63),
("INTC", "ORCL",31),
("INTC", "PCG",62),
("INTC", "SFO",51),
("INTC", "SWY",32),
("INTC", "URS",63),
("INTC", "WFC",63),

("MCK", "AAPL",60),
("MCK", "CSCO",68),
("MCK", "CVX",46),
("MCK", "EBAY",68),
("MCK", "GOOG",49),
("MCK", "GPS",03),
("MCK", "HPQ",46),
("MCK", "INTC",63),
("MCK", "ORCL",34),
("MCK", "PCG",3),
("MCK", "SFO",16),
("MCK", "SWY",56),
("MCK", "URS",3),
("MCK", "WFC",2),

("ORCL", "AAPL",22),
("ORCL", "CSCO",37),
("ORCL", "CVX",39),
("ORCL", "EBAY",37),
("ORCL", "GOOG",16),
("ORCL", "GPS",34),
("ORCL", "HPQ",12),
("ORCL", "INTC",31),
("ORCL", "MCK",34),
("ORCL", "PCG",35),
("ORCL", "SFO",20),
("ORCL", "SWY",40),
("ORCL", "URS",35),
("ORCL", "WFC",35),

("PCG", "AAPL",59),
("PCG", "CSCO",68),
("PCG", "CVX",44),
("PCG", "EBAY",68),
("PCG", "GOOG",48),
("PCG", "GPS",01),
("PCG", "HPQ",46),
("PCG", "INTC",62),
("PCG", "MCK",03),
("PCG", "ORCL",35),
("PCG", "SFO",18),
("PCG", "SWY",54),
("PCG", "URS",01),
("PCG", "WFC",01),

("SFO", "AAPL",46),
("SFO", "CSCO",57),
("SFO", "CVX",45),
("SFO", "EBAY",57),
("SFO", "GOOG",36),
("SFO", "GPS",18),
("SFO", "HPQ",32),
("SFO", "INTC",51),
("SFO", "MCK",16),
("SFO", "ORCL",20),
("SFO", "PCG",18),
("SFO", "SWY",52),
("SFO", "URS",18),
("SFO", "WFC",18),

("SWY", "AAPL",37),
("SWY", "CSCO",32),
("SWY", "CVX",13),
("SWY", "EBAY",32),
("SWY", "GOOG",32),
("SWY", "GPS",53),
("SWY", "HPQ",37),
("SWY", "INTC",32),
("SWY", "MCK",56),
("SWY", "ORCL",40),
("SWY", "PCG",54),
("SWY", "SFO",52),
("SWY", "URS",54),
("SWY", "WFC",54),

("URS", "AAPL",60),
("URS", "CSCO",68),
("URS", "CVX",44),
("URS", "EBAY",68),
("URS", "GOOG",49),
("URS", "GPS",01),
("URS", "HPQ",46),
("URS", "INTC",63),
("URS", "MCK",03),
("URS", "ORCL",35),
("URS", "PCG",01),
("URS", "SFO",18),
("URS", "SWY",54),
("URS", "WFC",0),

("WFC", "AAPL",60),
("WFC", "CSCO",68),
("WFC", "CVX",44),
("WFC", "EBAY",68),
("WFC", "GOOG",49),
("WFC", "GPS",01),
("WFC", "HPQ",46),
("WFC", "INTC",63),
("WFC", "MCK",02),
("WFC", "ORCL",35),
("WFC", "PCG",01),
("WFC", "SFO",18),
("WFC", "SWY",54),
("WFC", "URS",0),



In this article, I covered the basics of doing optimization with Pyomo and Python. This should be enough to get you started on your own. In the next two articles in this series, I walk through some more detailed examples and cover some pragmatic aspects of building larger programs and dealing with scaling issues.


Special thanks to Dr. David L. Woodruff for reviewing this article.