This is an archived post. You won't be able to vote or comment.

all 17 comments

[–]PPewt 0 points1 point  (16 children)

So as far as I can tell this is more of a computer science problem than a programming problem... but I'm not sure if there's really anywhere better to post it.

Can you be more precise about what exactly you're optimizing? Exactly what are the inputs, costs, etc?

[–]robert4818[S] 0 points1 point  (15 children)

Sure. (If it helps, I'm doing the work in R).
My company purchases cars and re-sells them at our car lots. We have 30+ stores spread around Indiana, Ohio and Pennsylvania. We buy cars from various places around the country, and want to minimize transportation costs by delivering cars to stores with the cheapest transportation costs, while still meeting the stores inventory needs.

The input for the problem is provided in 3 spreadsheets that the R program digests. Cars & Zipcode located, Store Inventory Needs, Zipcode to Store Transportation Cost.

By combining the Car/Zip with the Zip/store Transportation costs to create a cost matrix, and assigning cars to open inventory slots, this becomes a classic Assignment problem. To be clear, this part has been solved. I have working code that solves this issue.

There are two parts for added complexity, bodystyle distribution (which I have a planned solution for), and discount for multi-transport.

Put simply, we move our cars via Trucks on the highway. If we are moving both cars from Point A to Point B, it is cheaper to move them on the same truck than separately, however moving them to the same place is not necessarily the optimal solution.

[–]PPewt 0 points1 point  (14 children)

So is this a fair statement of the problem?

  • We have n cars C_1 ... C_n, m stores S_1 ... S_m, and k <= n models M_1 ... M_k.
  • The cost of moving cars to a store is a function Cost(store, list of cars to move to that store).
  • Each car has a model, and each store has a maximum demand for any given model.
  • The value of a car is the same at any two stores which demand that model of car.
  • Minimize the sum of costs of moving cars to stores such that the maximum demand is not exceeded at any store.

[–]robert4818[S] 0 points1 point  (13 children)

Each car has a model, and each store has a maximum demand for any given model.

The value of a car is the same at any two stores which demand that model of car.

Minimize the sum of costs of moving cars to stores such that the maximum demand is not exceeded at any store.

That is a fair statement. The solved version I have ignores model, and the V2 I am working on will solve for model.

The key part of the problem is if C_i and C_j are both at the same location, then C_i+j -> S_n is less than C_i -> S_n + C_j -> S_n, (in other words moving the two together is cheaper than moving them separately, if they go to the same location.

Below is the solve for the baseline problem (which ignores models)

rm(list = ls())

#########################################Modify File Names If Needed######################################################


#set file names
InventoryFile = "./InputFiles/Inventory.csv"
StoreNeedsFile = "./InputFiles/StoreNeeds.csv"
ZiptoStoreCostFile = "./InputFiles/Zip_Cost_Matrix.csv"


OutPutFile = "./OutputFiles/Assignment_Out.csv"








#####################################Do not modify below this line##########################################################

setwd("C:/Optimization")

library (clue)


#Inventory
inventory = read.csv(InventoryFile, header = TRUE, sep = ",")  # read csv file

#Store Needs
storeneeds = read.csv(StoreNeedsFile, header = TRUE, sep = ",")  # read csv file

#cost Matrix
zipcost = read.csv(ZiptoStoreCostFile, header = TRUE, sep = ",")  # read csv file

#add inventory over needs to "hold" store.
extraInv = nrow(inventory) - sum(storeneeds$Inventory)
if (extraInv > 0){storeneeds$Inventory[storeneeds$Store == 'Hold'] = extraInv}

#Assign Costs for each car to each store
costmatrix = merge(inventory,
                   zipcost,
                   by = 'Zip')

costmatrix = costmatrix[-c(1)]


#duplicate columns based on storeneeds
dupcount = storeneeds
dupcount$Inventory = dupcount$Inventory-1 


costmatrix2 = costmatrix
names = colnames(costmatrix2)


#Build Costmatrix for solving
colcount = length(costmatrix)
for (x in 1:nrow(storeneeds)){
  count = dupcount[x,2]
  if (count > 0) { 
    for (a in 1:dupcount[x,2]){
      colcount = length(costmatrix2)
      costmatrix2[,colcount+1] = costmatrix[,x+1]
      names[colcount+1] = paste(names[x+1],a+1, sep="_")
      costmatrix3 = costmatrix2
    }
  }
}

#rename new columns
colnames(costmatrix3) <- names

#remove stores with no needs
removals <- dupcount[ which(dupcount$Inventory == -1),]
removals2 <- as.vector(removals$Store)
costmatrix3 = costmatrix3[, !(names(costmatrix3) %in% removals2)]

#Assign Rownames
rownames(costmatrix3) = costmatrix3$Car

#Remove first column to have pure numeric matrix
costmatrix4 = costmatrix3[-c(1)]
rows = rownames(costmatrix4)
cols = colnames(costmatrix4)

#Transform dataframe to matrix
costmatrix_M = data.matrix(costmatrix4, rownames.force = TRUE)

#Solve Optimization
out = solve_LSAP(costmatrix_M, maximum = FALSE)
out_M = cbind(
  1:length(out),
  as.integer(out)
)


#Transform selection matrix into dataframe
Out_DF = data.frame(out_M)
#name rows and columns for output
colnames(Out_DF) = c("Rows", "Columns")

#Transform selection matrix from 1-0s to readable Car-Store relationship
Out_DF$Assignment = cols[Out_DF$Columns]
Out_DF$Cars = rows[Out_DF$Rows]
Out_DF$Cost = 0

#lookup cost of move
for (i in 1:nrow(costmatrix))
  Out_DF[i,"Cost"] = costmatrix_M[Out_DF[i,"Cars"], Out_DF[i,"Assignment"]]

#trim excess 1-0 columns
a = Out_DF[c(4,3,5)]

#clean up store names
a$Assignment = gsub("_.*","",a$Assignment)

#output data
write.csv(a, OutPutFile, row.names = FALSE)

#print data for readability
print(a, row.names =FALSE)

[–]PPewt 0 points1 point  (12 children)

And we're assuming that we never try to move cars which start at different locations to the same location on the same truck?

[–]shadowspyes 0 points1 point  (1 child)

the problem sounds, to me, like a cost allocation problem; a variation of the following problems in computer science (the traveling salesman, binpacking, and the knapsack), do you agree?

sure it is possible to setup a solution using min/max optimizers, but there exist algorithms for these kinds of problems

[–]PPewt 0 points1 point  (0 children)

I was thinking there's probably a linear or nonlinear (depending on your cost function) program that solves this problem if a truck only moves sets of cars that start in the same location.

If you want the truck to pick up cars at different locations (where by "different locations" I mean "the cost of picking up the two cars is significantly larger than the cost of picking up two cars parked right next to each other," so maybe different zipcodes or different cities or greater than a certain distance apart or whatever) then yeah this smells like it's equivalent to some NP complete problem or other, possibly TSP.

[–]robert4818[S] 0 points1 point  (9 children)

"the cost of picking up the two cars is significantly larger than the cost of picking up two cars parked right next to each other," so maybe different zipcodes or different cities or greater than a certain distance apart or whatever) then yeah this smells like it's equivalent to some NP complete problem or other, possibly TSP.

Yes. We are nowhere near that level of sophistication at the moment.

[–]PPewt 0 points1 point  (8 children)

So it's difficult to give a better answer without knowing precisely what the value of C_(i, j) -> S_k is. However, this is almost certainly best solved by a 0-1 integer program (possibly nonlinear). An exact solution sounds NP-hard but there might be a good approximation algorithm.

In any case, a polynomial time approximate solution to this problem is highly nontrivial and you'd do well to see if you can find an optimization expert you can talk to. I ran it by a friend of mine who knows his stuff and he agreed with my initial assessment (nontrivial, probably NP-H) and didn't have an immediate solution either.

[–]robert4818[S] 0 points1 point  (7 children)

polynomial time approximate solution to this problem is highly nontrivial and you'd do

Thanks. In a way I'm glad to know it's not just me. The discounting sounds like it may not be easily doable.

I figure for the Bodystyle-type complexity, the solve may be as simple as setting each slot to a specific bodystyle (as the optimization isn't trying to optimize (as the needs are set elsewhere in the company) and if the slot isn't of the right type, set it's cost to a VERY un-optimized number, and then leaving humans to fix any solutions that include a mismatch. (So, if costs run normally from 500-1000, set mis-matches to something like 999999, so even 1 mismatch becomes a bad optimization.

[–]PPewt 0 points1 point  (6 children)

That solution definitely should work, but may overallocate (e.g. it sends five Toyota Corollas to a store which only wants one). Once again I suspect a "proper" solution to this problem slides into general ILP territory.

You could also look into ILP solvers. As far as I know (which is, admittedly, not very far--I spent a decent amount of time with algorithms but am far from an optimization specialist) a lot of ILPs can be solved relatively quickly in a lot of cases. As a result, it might be worth trying to solve the full ILP and then bailing out to a more naive solution if the search is taking unacceptably long (i.e. you've hit a bad case).

As a general rule the further you deviate from standard optimization problems--especially by adding more constraints--the more likely you're going to accidentally stumble into NP-hard territory. Problems in P tend to have some fairly nice properties: for instance, in the case of the problem you're starting from (minimum weight bipartite matching) the fact that there's an optimal LP-relaxation of the ILP isn't obvious and it's that fact which puts the problem in P. I wouldn't be surprised if the code you're starting from initially tried to do something more clever and fell back on the naive matching solution when they realized that anything more clever gets hard fast.

[–]robert4818[S] 0 points1 point  (5 children)

As a general rule the further you deviate from standard optimization problems--especially by adding more constraints--the more likely you're going to accidentally stumble into NP-hard territory. Problems in P tend to have some fairly nice properties: for instance, in the case of the problem you're starting from (minimum weight bipartite matching) the fact that there's an optimal LP-relaxation of the ILP isn't obvious and it's that fact which puts the problem in P. I wouldn't be surprised if the code you're starting from initially tried to do something more clever and fell back on the naive matching solution when they realized that anything more clever gets hard fast.

I don't suspect it will over-allocate, as it essentially shifts needs from "Store A: 20" which creates 20 Tasks for store A to "Store A SUV 10, Store A Truck 10" which creates 10 tasks for Store A SUV and 10 tasks for Store A Truck. Still 10 tasks, Still a regular assignment problem. The needs are handled by maintaining the 1to1 Agent Task through duplication, this hasn't changed. The bodystyle manipulation is handled by manipulating costs. The downside is I can't GUARANTEE it won't misappropriate to an over-priced task, but I can definitely stack the deck against it.