Introduction to Minimum Cost Flow Optimization in Python

Minimum cost flow optimization minimizes the cost of moving flow through a network of nodes and edges. Nodes include sources (supply) and sinks (demand), with different costs and capacity limits. The aim is to find the least costly way to move volume from sources to sinks while adhering to all capacity limitations.

Applications

Applications of minimum cost flow optimization are vast and varied, spanning multiple industries and sectors. This approach is crucial in logistics and supply chain management, where it is used to minimize transportation costs while ensuring timely delivery of goods. In telecommunications, it helps to optimize the routing of data through networks to reduce latency and improve bandwidth utilization. The energy sector uses minimum cost flow optimization to distribute electricity efficiently through power grids, reducing losses and operational costs. Urban planning and infrastructure development also benefit from this optimization technique, as it assists in designing efficient public transportation systems and water distribution networks.

Example

Below is a simple flow optimization example:

The image above illustrates a minimum cost flow optimization problem with six nodes and eight edges. Nodes A and B serve as sources, each with a supply of 50 units, while nodes E and F act as sinks, each with a demand of 40 units. Every edge has a maximum capacity of 25 units, with variable costs indicated in the image. The objective of the optimization is to allocate flow on each edge to move the required units from nodes A and B to nodes E and F, respecting the edge capacities at the lowest possible cost.

Node F can only receive supply from node B. There are two paths: directly or through node D. The direct path has a cost of 2, while the indirect path via D has a combined cost of 3. Thus, 25 units (the maximum edge capacity) are moved directly from B to F. The remaining 15 units are routed via B-D-F to meet the demand.

At this point, 40 out of 50 units have been transferred from node B, leaving a remaining supply of 10 units that can be moved to node E. The available pathways for supplying node E are: A-E and B-E with a cost of 3, A-C-E with a cost of 4, and B-C-E with a cost of 5. Consequently, 25 units are transported via A-E (limited by the edge capacity) and 10 units via B-E (limited by the remaining supply at node B). To meet the demand of 40 units at node E, an additional 5 units are moved via A-C-E, resulting in no flow being allocated to the B-C pathway.
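The routing described above can be checked with a few lines of arithmetic. The sketch below lists each used path with its flow and per-unit path cost as stated in the text; the variable and function names are illustrative only.

```python
# Flows from the worked example: (path, units moved, cost per unit along the path)
path_flows = [
    ("B-F", 25, 2),
    ("B-D-F", 15, 3),
    ("A-E", 25, 3),
    ("B-E", 10, 3),
    ("A-C-E", 5, 4),
]

def delivered_to(sink):
    """Units arriving at a sink, summed over all paths that end there."""
    return sum(units for path, units, _ in path_flows if path.endswith(sink))

def shipped_from(source):
    """Units leaving a source, summed over all paths that start there."""
    return sum(units for path, units, _ in path_flows if path.startswith(source))

# Total cost is flow times per-unit path cost, summed over all used paths
total_cost = sum(units * cost for _, units, cost in path_flows)

print(delivered_to("E"), delivered_to("F"))  # 40 40
print(shipped_from("A"), shipped_from("B"))  # 30 50
print(total_cost)                            # 220
```

Both demands are met, node B ships its full supply of 50 units, node A ships 30 of its 50 units, and the total cost of this allocation is 220.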

Mathematical formulation

I introduce two mathematical formulations of minimum cost flow optimization:

1. LP (linear program) with continuous variables only

2. MILP (mixed integer linear program) with continuous and discrete variables

I am using the following definitions:
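The definitions themselves are not reproduced in this copy of the article. The notation below is therefore an assumption, chosen to agree with the symbols x(u, v) and s(v) that appear in the text:

```latex
\begin{aligned}
G = (V, E) &: \text{directed graph with node set } V \text{ and edge set } E \\
x(u, v) &: \text{flow on edge } (u, v) \in E \\
c(u, v) &: \text{variable cost per unit of flow on edge } (u, v) \\
\mathrm{cap}(u, v) &: \text{capacity of edge } (u, v) \\
s(v) &: \text{supply of node } v \text{ (negative for sinks, zero for transshipment nodes)} \\
f_E(u, v),\ f_V(v) &: \text{fixed cost of selecting edge } (u, v) \text{ or node } v \\
y(u, v),\ z(v) \in \{0, 1\} &: \text{selection variables of edge } (u, v) \text{ and node } v
\end{aligned}
```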

LP formulation

This formulation only contains decision variables that are continuous, meaning they can take any value as long as all constraints are fulfilled. The decision variables are in this case the flow variables x(u, v) of all edges.

The objective function describes how the costs that are supposed to be minimized are calculated. In this case it is defined as the flow multiplied by the variable cost, summed over all edges:
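The equation is not reproduced here. A reconstruction that matches the sentence above, writing c(u, v) for the variable cost of edge (u, v) and E for the set of edges (both symbols are assumed notation), is:

```latex
\min \sum_{(u, v) \in E} c(u, v) \, x(u, v)
```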

Constraints are conditions that must be satisfied for the solution to be valid, ensuring that the flow does not exceed capacity limitations.

First, all flows must be non-negative and must not exceed the edge capacities:
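A reconstruction of this constraint, with cap(u, v) as an assumed symbol for the capacity of edge (u, v):

```latex
0 \le x(u, v) \le \mathrm{cap}(u, v) \qquad \forall (u, v) \in E
```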

Flow conservation constraints ensure that the same amount of flow that goes into a node has to come out of the node. These constraints are applied to all nodes that are neither sources nor sinks:
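Written out under the assumption that transshipment nodes are exactly the nodes with s(v) = 0, the constraint reads:

```latex
\sum_{u : (u, v) \in E} x(u, v) \;=\; \sum_{w : (v, w) \in E} x(v, w)
\qquad \forall v \in V \text{ with } s(v) = 0
```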

For source and sink nodes the difference between outflow and inflow is smaller than or equal to the supply of the node:
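A reconstruction consistent with this sentence and with the flow balance constraint in the implementation further below (the index notation is assumed):

```latex
\sum_{w : (v, w) \in E} x(v, w) \;-\; \sum_{u : (u, v) \in E} x(u, v) \;\le\; s(v)
\qquad \forall v \in V \text{ with } s(v) \ne 0
```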

If v is a source, outflow minus inflow must not exceed the supply s(v). If v is a sink, s(v) is negative, so the same inequality requires that at least -s(v) more units flow into the node than out of it.

MILP

In addition to the continuous variables of the LP formulation, the MILP formulation also contains discrete variables that can only take specific values. Discrete variables make it possible to restrict the number of used nodes or edges to certain values. They can also be used to introduce fixed costs for using nodes or edges. In this article I show how to add fixed costs. It is important to note that adding discrete decision variables makes it much more difficult to find an optimal solution, hence this formulation should only be used if an LP formulation is not possible.

The objective function is defined as:
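The equation is not reproduced here. A reconstruction with one term per cost type, where c is the variable cost, f_E and f_V are the fixed costs of edges and nodes, and y and z are binary selection variables (all assumed notation), is:

```latex
\min \sum_{(u, v) \in E} c(u, v) \, x(u, v)
\;+\; \sum_{(u, v) \in E} f_E(u, v) \, y(u, v)
\;+\; \sum_{v \in V} f_V(v) \, z(v)
```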

It has three terms: the variable cost of all edges, the fixed cost of all edges, and the fixed cost of all nodes.

The maximum flow that can be allocated to an edge depends on the edge's capacity, the edge selection variable, and the origin node selection variable:
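One way to express this with linear constraints only, and an assumption about the exact form of the missing equation, is to bound the flow once by the edge selection variable y(u, v) and once by the selection variable z(u) of the origin node:

```latex
\begin{aligned}
x(u, v) &\le \mathrm{cap}(u, v) \, y(u, v) \\
x(u, v) &\le \mathrm{cap}(u, v) \, z(u)
\end{aligned}
\qquad \forall (u, v) \in E
```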

This equation ensures that flow can only be assigned to an edge if the edge selection variable and the origin node selection variable are both 1.

The flow conservation constraints are the same as in the LP problem.
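To illustrate why fixed costs call for binary variables, the sketch below brute-forces a tiny, hypothetical case: two parallel routes between one source and one sink, each with a capacity, a variable cost, and a fixed cost. It is not the PuLP model from this article. It simply enumerates every 0/1 selection, which also shows why the search space grows as 2^n with the number of selectable elements.

```python
from itertools import product

# Hypothetical parallel routes between one source and one sink
routes = [
    {"name": "route_1", "capacity": 40, "variable_cost": 1.0, "fixed_cost": 50.0},
    {"name": "route_2", "capacity": 40, "variable_cost": 3.0, "fixed_cost": 0.0},
]

def cheapest_plan(demand):
    """Enumerate all 0/1 route selections and return (cost, flows) of the cheapest
    selection that can carry the demand, or None if no selection can."""
    best = None
    for selection in product([0, 1], repeat=len(routes)):
        chosen = [r for r, y in zip(routes, selection) if y == 1]
        remaining = demand
        cost = sum(r["fixed_cost"] for r in chosen)  # fixed cost of every selected route
        flows = {}
        # For parallel routes, filling the cheapest selected route first is optimal
        for r in sorted(chosen, key=lambda r: r["variable_cost"]):
            flow = min(r["capacity"], remaining)  # flow <= capacity * y with y = 1
            flows[r["name"]] = flow
            cost += flow * r["variable_cost"]
            remaining -= flow
        if remaining > 0:
            continue  # the selected routes cannot carry the demand
        if best is None or cost < best[0]:
            best = (cost, flows)
    return best

print(cheapest_plan(30))  # (80.0, {'route_1': 30})
print(cheapest_plan(20))  # (60.0, {'route_2': 20})
```

For a demand of 30 units the fixed cost of route_1 pays off (50 + 30 = 80 versus 90), while for 20 units the route without fixed cost is cheaper (60 versus 70). A continuous variable alone cannot express this switch.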

Implementation

In this section I explain how to implement a MILP optimization in Python. You can find the code in this repo.

Libraries

To build the flow network, I used NetworkX, which is an excellent library (https://networkx.org/) for working with graphs. There are many interesting articles that demonstrate how powerful and easy to use NetworkX is for working with graphs, for example Customizing NetworkX Graphs, NetworkX: Code Demo for Manipulating Subgraphs, and Social Network Analysis with NetworkX: A Gentle Introduction.

One important aspect when building an optimization is to make sure that the input is correctly defined. Even one small error can make the problem infeasible or lead to an unexpected solution. To avoid this, I used Pydantic to validate the user input and raise any issues at the earliest possible stage. This article gives an easy to understand introduction to Pydantic.

To transform the defined network into a mathematical optimization problem I used PuLP, which allows all variables and constraints to be defined in an intuitive way. This library also has the advantage that it can use many different solvers in a simple plug-and-play fashion. This article provides a good introduction to this library.

Defining nodes and edges

The code below shows how nodes are defined:

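from pydantic import BaseModel, model_validator
from typing import Optional

# node and edge definitions
class Node(BaseModel, frozen=True):
    """
    class of network node with attributes:
    name: str - name of node
    demand: float - demand of node (if node is sink)
    supply: float - supply of node (if node is source)
    capacity: float - maximum flow out of node
    type: str - type of node
    x: float - x-coordinate of node
    y: float - y-coordinate of node
    fixed_cost: float - cost of selecting node
    """
    name: str
    demand: Optional[float] = 0.0
    supply: Optional[float] = 0.0
    capacity: Optional[float] = float('inf')
    type: Optional[str] = None
    x: Optional[float] = 0.0
    y: Optional[float] = 0.0
    fixed_cost: Optional[float] = 0.0

    @model_validator(mode="after")
    def validate(self):
        """
        validate if node definition is correct
        """
        # NOTE: the listing is cut off after the first check in the source; the checks
        # below follow the description in the text, the error messages are assumed
        # check that demand is non-negative and finite
        if self.demand < 0 or self.demand == float('inf'): raise ValueError('demand must be non-negative and finite')
        # check that supply is non-negative
        if self.supply < 0: raise ValueError('supply must be non-negative')
        # check that capacity is non-negative
        if self.capacity < 0: raise ValueError('capacity must be non-negative')
        # check that fixed_cost is non-negative
        if self.fixed_cost < 0: raise ValueError('fixed_cost must be non-negative')
        return self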

Nodes are defined by the Node class, which inherits from Pydantic's BaseModel. This enables automatic validation that ensures all properties are defined with the correct datatype whenever a new object is created. In this case only the name is a required input. All other properties are optional, and if they are not provided the specified default value is assigned to them. By setting the "frozen" parameter to True I made all properties immutable, meaning they cannot be changed after the object has been initialized.

The validate method is executed after the object has been initialized and applies additional checks to ensure the provided values are as expected. Specifically, it checks that demand, supply, capacity and fixed cost are not negative. Furthermore, it does not allow infinite demand as this would lead to an infeasible optimization problem.

These checks look trivial, but their main benefit is that they trigger an error at the earliest possible stage when an input is incorrect. Thus, they prevent the creation of an incorrect optimization model. Finding out why a model cannot be solved would be much more time consuming, as there are many factors that would need to be analyzed, and such a "trivial" input error may not be the first aspect to investigate.

Edges are implemented as follows:

class Edge(BaseModel, frozen=True):
    """
    class of edge between two nodes with attributes:
    origin: 'Node' - origin node of edge
    destination: 'Node' - destination node of edge
    capacity: float - maximum flow through edge
    variable_cost: float - cost per unit flow through edge
    fixed_cost: float - cost of selecting edge
    """
    origin: Node
    destination: Node
    capacity: Optional[float] = float('inf')
    variable_cost: Optional[float] = 0.0
    fixed_cost: Optional[float] = 0.0

    @model_validator(mode="after")
    def validate(self):
        """
        validate if edge definition is correct
        """
        # check that node names are different
        if self.origin.name == self.destination.name: raise ValueError('origin and destination names must be different')
        # NOTE: the comparisons and error messages of the next three checks are cut off
        # in the source and are filled in from the description in the text
        # check that capacity is non-negative
        if self.capacity < 0: raise ValueError('capacity must be non-negative')
        # check that variable_cost is non-negative
        if self.variable_cost < 0: raise ValueError('variable_cost must be non-negative')
        # check that fixed_cost is non-negative
        if self.fixed_cost < 0: raise ValueError('fixed_cost must be non-negative')
        return self

The required inputs are an origin node and a destination node object. Additionally, capacity, variable cost and fixed cost can be provided. The default value for capacity is infinity, which means that if no capacity value is provided the edge is assumed to have no capacity limitation. The validation ensures that the provided values are non-negative and that the origin node name and the destination node name are different.

Initialization of FlowGraph object

To define the flow graph and optimize the flow I created a new class called FlowGraph that inherits from NetworkX's DiGraph class. By doing this I can add my own methods that are specific to the flow optimization and at the same time use all methods DiGraph provides:

import time  # used by min_cost_flow to measure model creation and solve time

from networkx import DiGraph
from pulp import LpProblem, LpVariable, LpMinimize, LpStatus

class FlowGraph(DiGraph):
    """
    class to define and solve minimum cost flow problems
    """
    def __init__(self, nodes=None, edges=None):
        """
        initialize FlowGraph object
        :param nodes: list of nodes
        :param edges: list of edges
        """
        # initialize digraph
        super().__init__(None)

        # add nodes and edges (None defaults avoid shared mutable default arguments)
        for node in nodes or []: self.add_node(node)
        for edge in edges or []: self.add_edge(edge)


    def add_node(self, node):
        """
        add node to graph
        :param node: Node object
        """
        # check if node is a Node object
        if not isinstance(node, Node): raise ValueError('node must be a Node object')
        # add node to graph
        super().add_node(node.name, demand=node.demand, supply=node.supply, capacity=node.capacity, type=node.type,
                         fixed_cost=node.fixed_cost, x=node.x, y=node.y)


    def add_edge(self, edge):
        """
        add edge to graph
        :param edge: Edge object
        """
        # check if edge is an Edge object
        if not isinstance(edge, Edge): raise ValueError('edge must be an Edge object')
        # check if nodes exist
        if edge.origin.name not in super().nodes: self.add_node(edge.origin)
        if edge.destination.name not in super().nodes: self.add_node(edge.destination)

        # add edge to graph
        super().add_edge(edge.origin.name, edge.destination.name, capacity=edge.capacity,
                         variable_cost=edge.variable_cost, fixed_cost=edge.fixed_cost)

FlowGraph is initialized by providing a list of nodes and edges. The first step is to initialize the parent class as an empty graph. Next, nodes and edges are added via the methods add_node and add_edge. These methods first check whether the provided element is a Node or Edge object. If this is not the case an error is raised. This ensures that all elements added to the graph have passed the validation of the previous section. Next, the values of these objects are added to the DiGraph object. Note that the DiGraph class also uses add_node and add_edge methods to do so. By using the same method names I am overriding these methods to ensure that whenever a new element is added to the graph it must be added through the FlowGraph methods, which validate the object type. Thus, it is not possible to build a graph with any element that has not passed the validation tests.

Initializing the optimization problem

The method below converts the network into an optimization model, solves it, and retrieves the optimized values.

    def min_cost_flow(self, verbose=True):
        """
        run minimum cost flow optimization
        @param verbose: bool - print optimization status (default: True)
        @return: status of optimization
        """
        # note: this method requires "import time" at module level
        self.verbose = verbose

        # get maximum flow
        self.max_flow = sum(node['demand'] for _, node in super().nodes.data() if node['demand'] > 0)

        start_time = time.time()
        # create LP problem
        self.prob = LpProblem("FlowGraph.min_cost_flow", LpMinimize)
        # assign decision variables
        self._assign_decision_variables()
        # assign objective function
        self._assign_objective_function()
        # assign constraints
        self._assign_constraints()
        if self.verbose: print(f"Model creation time: {time.time() - start_time:.2f} s")

        start_time = time.time()
        # solve LP problem
        self.prob.solve()
        solve_time = time.time() - start_time

        # get status
        status = LpStatus[self.prob.status]

        if verbose:
            # print optimization status
            if status == 'Optimal':
                # get objective value
                objective = self.prob.objective.value()
                print(f"Optimal solution found: {objective:.2f} in {solve_time:.2f} s")
            else:
                print(f"Optimization status: {status} in {solve_time:.2f} s")

        # assign variable values
        self._assign_variable_values(status == 'Optimal')

        return status

PuLP's LpProblem is initialized, and the constant LpMinimize defines it as a minimization problem, meaning it is supposed to minimize the value of the objective function. In the following lines all decision variables are initialized, and the objective function as well as all constraints are defined. These methods are explained in the following sections.

Next, the problem is solved. In this step the optimal value of all decision variables is determined. After that, the status of the optimization is retrieved. When the status is "Optimal" an optimal solution could be found. Other statuses are "Infeasible" (it is not possible to fulfill all constraints), "Unbounded" (the objective function can take arbitrarily low values), and "Undefined", meaning the problem definition is not complete. In case no optimal solution was found the problem definition needs to be reviewed.

Finally, the optimized values of all variables are retrieved and assigned to the respective nodes and edges.

Defining decision variables

All decision variables are initialized in the method below:

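    def _assign_decision_variables(self):
        """
        assign decision variables to edges and nodes
        """
        # NOTE: the source repeats the listing of _assign_variable_values at this point;
        # this method is a reconstruction based on the description below
        continuous_count = 0
        binary_count = 0

        # assign edge variables
        for origin_name, destination_name, edge in super().edges.data():
            # initialize variables
            edge['flow_var'] = None
            edge['selection_var'] = None
            # continuous flow variable only if the edge capacity is greater than 0
            if edge['capacity'] > 0:
                edge['flow_var'] = LpVariable(f"flow_{origin_name}-{destination_name}", lowBound=0, cat='Continuous')
                continuous_count += 1
                # binary selection variable only if the edge has fixed costs
                if edge['fixed_cost'] > 0:
                    edge['selection_var'] = LpVariable(f"selection_{origin_name}-{destination_name}", cat='Binary')
                    binary_count += 1

        # assign node variables
        for node_name, node in super().nodes.data():
            node['selection_var'] = None
            # binary selection variable only if the node has fixed costs
            if node['fixed_cost'] > 0:
                node['selection_var'] = LpVariable(f"selection_{node_name}", cat='Binary')
                binary_count += 1

        if self.verbose:
            print(f"Variable types: {continuous_count} continuous, {binary_count} binary")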

First it iterates through all edges and assigns continuous decision variables if the edge capacity is greater than 0. Furthermore, if the fixed costs of the edge are greater than 0 a binary decision variable is defined as well. Next, it iterates through all nodes and assigns binary decision variables to nodes with fixed costs. The total number of continuous and binary decision variables is counted and printed at the end of the method.

Defining objective

After all decision variables have been initialized the objective function can be defined:

    def _assign_objective_function(self):
        """
        define objective function
        """
        objective = 0

        # add edge costs
        for _, _, edge in super().edges.data():
            if edge['selection_var'] is not None: objective += edge['selection_var'] * edge['fixed_cost']
            if edge['flow_var'] is not None: objective += edge['flow_var'] * edge['variable_cost']

        # add node costs
        for _, node in super().nodes.data():
            # add node selection costs
            if node['selection_var'] is not None: objective += node['selection_var'] * node['fixed_cost']

        self.prob += objective, 'Objective'

The objective is initialized as 0. Then for each edge fixed costs are added if the edge has a selection variable, and variable costs are added if the edge has a flow variable. For all nodes with selection variables fixed costs are added to the objective as well. At the end of the method the objective is added to the LP object.

Defining constraints

All constraints are defined in the method below:

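    def _assign_constraints(self):
        """
        define constraints
        """
        # NOTE: the middle of this listing is missing in the source; it is reconstructed from the description below
        constr_count = 0
        for origin_name, destination_name, edge in super().edges.data():
            if edge['flow_var'] is None: continue
            # get capacity (the maximum flow is used if the edge has no capacity limit)
            capacity = edge['capacity'] if edge['capacity'] < float('inf') else self.max_flow
            rhs = capacity * edge['selection_var'] if edge['selection_var'] is not None else capacity
            self.prob += edge['flow_var'] <= rhs, f"capacity_{origin_name}-{destination_name}"
            constr_count += 1
            # flow can only leave the origin node if that node is selected
            origin_var = self.nodes[origin_name]['selection_var']
            if origin_var is not None:
                self.prob += edge['flow_var'] <= capacity * origin_var, f"node_selection_{origin_name}-{destination_name}"
                constr_count += 1

        total_demand = total_supply = 0
        # add flow conservation constraints for nodes
        for node_name, node in super().nodes.data():
            in_flow = sum(e['flow_var'] for _, _, e in self.in_edges(node_name, data=True) if e['flow_var'] is not None)
            out_flow = sum(e['flow_var'] for _, _, e in self.out_edges(node_name, data=True) if e['flow_var'] is not None)
            # limit the outflow by the node capacity
            if node['capacity'] < float('inf'):
                self.prob += out_flow <= node['capacity'], f"node_capacity_{node_name}"
                constr_count += 1
            if node['demand'] == node['supply']:
                # transshipment node: in_flow = out_flow
                self.prob += in_flow == out_flow, f"flow_balance_{node_name}"
            else:
                # source or sink node: in_flow - out_flow >= demand - supply
                rhs = node['demand'] - node['supply']
                self.prob += in_flow - out_flow >= rhs, f"flow_balance_{node_name}"
            constr_count += 1
            # update total demand and supply
            total_demand += node['demand']
            total_supply += node['supply']

        if self.verbose:
            print(f"Constraints: {constr_count}")
            print(f"Total supply: {total_supply}, Total demand: {total_demand}")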

First, capacity constraints are defined for each edge. If the edge has a selection variable the capacity is multiplied by this variable. In case there is no capacity limitation (capacity is set to infinity) but there is a selection variable, the selection variable is multiplied by the maximum flow that has been calculated by aggregating the demand of all nodes. An additional constraint is added in case the edge's origin node has a selection variable. This constraint means that flow can only come out of this node if the selection variable is set to 1.

Next, the flow conservation constraints for all nodes are defined. To do so the total inflow and outflow of the node is calculated. Getting all incoming and outgoing edges can easily be done by using the in_edges and out_edges methods of the DiGraph class. If the node has a capacity limitation the maximum outflow is constrained by that value. For the flow conservation it is necessary to check whether the node is a source or sink node, or a transshipment node (demand equals supply). In the first case the difference between inflow and outflow must be greater than or equal to the difference between demand and supply, while in the latter case inflow and outflow must be equal.
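The flow balance rule can be checked by hand on the six-node example from the beginning of the article. The sketch below uses plain dictionaries instead of the FlowGraph class. The helper name check_balance is hypothetical, and the edge flows are the ones derived in the worked example.

```python
# Edge flows from the worked example: (origin, destination) -> units
flows = {
    ("A", "E"): 25, ("A", "C"): 5, ("C", "E"): 5,
    ("B", "E"): 10, ("B", "F"): 25, ("B", "D"): 15, ("D", "F"): 15,
    ("B", "C"): 0,
}
supply = {"A": 50, "B": 50}
demand = {"E": 40, "F": 40}

def check_balance(node):
    """Return True if the node satisfies its flow balance constraint."""
    in_flow = sum(f for (_, dest), f in flows.items() if dest == node)
    out_flow = sum(f for (orig, _), f in flows.items() if orig == node)
    d, s = demand.get(node, 0), supply.get(node, 0)
    if d == s:
        # transshipment node: inflow must equal outflow
        return in_flow == out_flow
    # source or sink node: inflow - outflow >= demand - supply
    return in_flow - out_flow >= d - s

results = {node: check_balance(node) for node in "ABCDEF"}
print(results)
```

All six nodes pass. Node A, for example, has an inflow of 0 and an outflow of 30, and -30 is greater than or equal to 0 - 50.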

The total number of constraints is counted and printed at the end of the method.

Retrieving optimized values

After running the optimization, the optimized variable values can be retrieved with the following method:

    def _assign_variable_values(self, opt_found):
        """
        assign decision variable values if optimal solution found, otherwise set to None
        @param opt_found: bool - if optimal solution was found
        """
        # assign edge values
        for _, _, edge in super().edges.data():
            # initialize values
            edge['flow'] = None
            edge['selected'] = None
            # check if optimal solution found
            if opt_found and edge['flow_var'] is not None:
                edge['flow'] = edge['flow_var'].varValue

                if edge['selection_var'] is not None:
                    edge['selected'] = edge['selection_var'].varValue

        # assign node values
        for _, node in super().nodes.data():
            # initialize values
            node['selected'] = None
            if opt_found:
                # check if node has selection variable
                if node['selection_var'] is not None:
                    node['selected'] = node['selection_var'].varValue

This method iterates through all edges and nodes, checks whether decision variables have been assigned, and adds the decision variable value via varValue to the respective edge or node.

Demo

To demonstrate how to apply the flow optimization I created a supply chain network consisting of 2 factories, 4 distribution centers (DC), and 15 markets. All goods produced by the factories have to flow through one distribution center before they can be delivered to the markets.

Node properties were defined as follows (values as used in the code below):

Factories: supply 700, fixed cost 100, x from 0 to 2, y from 0 to 1
DCs: capacity 500, fixed cost 25, x from 0 to 2, y from 0 to 1
Markets: demand from 1 to 100, x from 0 to 2, y from 0 to 1

Ranges mean that uniformly distributed random numbers were generated to assign these properties. Since Factories and DCs have fixed costs the optimization also needs to decide which of these entities should be selected.

Edges are generated between all Factories and DCs, as well as between all DCs and Markets. The variable cost of an edge is calculated as the Euclidean distance between origin and destination node. Capacities of edges from Factories to DCs are set to 350, while those from DCs to Markets are set to 100.
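As a small illustration of the distance calculation, the sketch below computes the Euclidean distance between two points with the explicit formula and compares it with math.dist from the standard library. The function name euclidean_distance and the sample coordinates are illustrative only.

```python
import math

def euclidean_distance(x1, y1, x2, y2):
    """Straight-line distance between (x1, y1) and (x2, y2)."""
    return ((x1 - x2) ** 2 + (y1 - y2) ** 2) ** 0.5

# A 3-4-5 right triangle makes the result easy to verify by hand
d = euclidean_distance(0.0, 0.0, 3.0, 4.0)
print(d)                                  # 5.0
print(math.dist((0.0, 0.0), (3.0, 4.0)))  # 5.0
```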

The code below shows how the network is defined and how the optimization is run:

import random

# Define nodes
factories = [Node(name=f'Factory {i}', supply=700, type="Factory", fixed_cost=100, x=random.uniform(0, 2),
                  y=random.uniform(0, 1)) for i in range(2)]
dcs = [Node(name=f'DC {i}', fixed_cost=25, capacity=500, type="DC", x=random.uniform(0, 2),
            y=random.uniform(0, 1)) for i in range(4)]
markets = [Node(name=f'Market {i}', demand=random.randint(1, 100), type="Market", x=random.uniform(0, 2),
                y=random.uniform(0, 1)) for i in range(15)]

# Define edges
edges = []
# Factories to DCs
for factory in factories:
    for dc in dcs:
        distance = ((factory.x - dc.x)**2 + (factory.y - dc.y)**2)**0.5
        edges.append(Edge(origin=factory, destination=dc, capacity=350, variable_cost=distance))

# DCs to Markets
for dc in dcs:
    for market in markets:
        distance = ((dc.x - market.x)**2 + (dc.y - market.y)**2)**0.5
        edges.append(Edge(origin=dc, destination=market, capacity=100, variable_cost=distance))

# Create FlowGraph
G = FlowGraph(edges=edges)

G.min_cost_flow()

The output of the flow optimization is as follows:

Variable types: 68 continuous, 6 binary
Constraints: 161
Total supply: 1400.0, Total demand: 909.0
Model creation time: 0.00 s
Optimal solution found: 1334.88 in 0.23 s

The problem consists of 68 continuous variables, which are the edges' flow variables, and 6 binary decision variables, which are the selection variables of the Factories and DCs. There are 161 constraints in total, which consist of edge and node capacity constraints, node selection constraints (edges can only have flow if the origin node is selected), and flow conservation constraints. The next line shows that the total supply is 1400, which is higher than the total demand of 909 (if the demand was higher than the supply the problem would be infeasible). Since this is a small optimization problem, the time to define the optimization model was less than 0.01 seconds. The last line shows that an optimal solution with an objective value of 1335 could be found in 0.23 seconds.
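The reported counts can be cross-checked with some arithmetic. The breakdown below is an assumption rather than something taken from the solver log: it supposes one flow variable and one capacity constraint per edge, one node selection constraint per edge whose origin node has fixed costs, one capacity constraint per DC, and one flow balance constraint per node.

```python
n_factories, n_dcs, n_markets = 2, 4, 15

# Edges: every Factory to every DC, every DC to every Market
n_edges = n_factories * n_dcs + n_dcs * n_markets

continuous_vars = n_edges            # one flow variable per edge
binary_vars = n_factories + n_dcs    # selection variables of nodes with fixed costs

edge_capacity = n_edges              # one capacity constraint per edge
node_selection = n_edges             # every edge starts at a Factory or a DC
node_capacity = n_dcs                # only DCs have a finite node capacity
flow_balance = n_factories + n_dcs + n_markets
constraints = edge_capacity + node_selection + node_capacity + flow_balance

print(continuous_vars, binary_vars, constraints)  # 68 6 161
```

Under these assumptions the totals agree with the printed output.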

In addition to the code I described in this post I also added two methods that visualize the optimized solution. The code of these methods can also be found in the repo.

All nodes are positioned by their respective x and y coordinates. The node and edge size is relative to the total volume that is flowing through. The edge color refers to its utilization (flow over capacity). Dashed lines show edges without flow allocation.

In the optimal solution both Factories were selected, which is inevitable as the maximum supply of one Factory is 700 and the total demand is 909. However, only 3 of the 4 DCs are used (DC 0 has not been selected).

In general the plot shows that the Factories are supplying the nearest DCs and the DCs the nearest Markets. However, there are a few exceptions to this observation: Factory 0 also supplies DC 3 although Factory 1 is nearer. This is due to the capacity constraints of the edges, which only allow at most 350 units to be moved per edge. However, the closest Markets to DC 3 have a slightly higher demand, hence Factory 0 is moving additional units to DC 3 to meet that demand. Although Market 9 is closest to DC 3 it is supplied by DC 2. This is because DC 3 would require an additional supply from Factory 0 to supply this market, and since the total distance from Factory 0 via DC 3 is longer than the distance from Factory 0 via DC 2, Market 9 is supplied via the latter route.

Another way to visualize the results is via a Sankey diagram, which focuses on visualizing the flows of the edges:

The colors represent the edges' utilizations, with the lowest utilizations in green changing to yellow and red for the highest utilizations. This diagram shows very well how much flow goes through each node and edge. It highlights the flow from Factory 0 to DC 3 and also that Market 13 is supplied by DC 2 and DC 1.

Summary

Minimum cost flow optimization can be a very helpful tool in many domains like logistics, transportation, telecommunication, the energy sector and many more. To apply this optimization it is important to translate a physical system into a mathematical graph consisting of nodes and edges. This should be done in a way that uses as few discrete (e.g. binary) decision variables as necessary, as these make it considerably more difficult to find an optimal solution. By combining Python's NetworkX, PuLP and Pydantic libraries I built a flow optimization class that is intuitive to initialize and at the same time follows a generalized formulation, which allows it to be applied in many different use cases. Graph and flow diagrams are very helpful for understanding the solution found by the optimizer.

If not otherwise stated all images were created by the author.
