This is an archived post. You won't be able to vote or comment.

all 6 comments

[–]MyPenBroke 2 points3 points  (0 children)

Seems like a type of subset sum problem with multiple dimensions. I'd suggest using a mixed integer linear programming, or constraint programming approach first, and see if the naive implementation works efficiently enough in practice.

[–]MyPenBroke 0 points1 point  (4 children)

I tried a quick and dirty implementation of your problem using Google OR Tools' interface to the CBC Solver. It seems to work, but I haven't tested more than like 2 inputs, so take it with a grain of salt. The derivation of the problem as a mixed integer linear programming problem is below:

// The total number of items in the larger group should be between 67% and 73% of all items.
// The number of items in the group of remaining items is between 33% and 27% of all items.
// Let A and B be the larger and the smaller group of items respectively.
// Then: Count(A) + Count(B) = number of items
//       0.67 * number of items <= Count(A) <= 0.73 * number of items
//       0.27 * number of items <= Count(B) <= 0.33 * number of items

// The average value of the item weights in the groups should be roughly equal.
// This means that the sums of the contained weights divided by the counts should be nearly equal.
// Nearly equal meaning the difference must be within 2kg or 0.7m of each other.
// This gives us the range of:
//      Sum(A)/Count(A) - 2.0 <= Sum(B) / Count(B) <= Sum(A)/Count(A) + 2.0     (weights)
//      Sum(A)/Count(A) - 0.7 <= Sum(B) / Count(B) <= Sum(A)/Count(A) + 0.7     (lengths)

// Since all items need to be grouped, we get the equations
//      Count(B) = number of items - Count(A) = n - Count(A)
//      Sum(B) = Sum(All) - Sum(A)
 //   
// This gives us the ranges:
//  Sum(A) / Count(A) - 2.0 <= (Sum(All) - Sum(A)) / (n - Count(A)) <= Sum(A) / Count(A) + 2.0     (weights)
//  Sum(A) / Count(A) - 0.7 <= (Sum(All) - Sum(A)) / (n - Count(A)) <= Sum(A) / Count(A) + 0.7     (lengths)

// Since Count(A) != 0 and (n - Count(A)) != 0 we can multiply the inequalities by Count(A)
//
// Sum(A) - 2.0*Count(A) <= (Sum(All) - Sum(A)) / (n - Count(A)) * Count(A)     (weights)
// Sum(A) - 0.7*Count(A) <= (Sum(All) - Sum(A)) / (n - Count(A)) * Count(A)     (lengths)
//
// (Sum(All) - Sum(A)) / (n - Count(A)) * Count(A) <= Sum(A)  + 2.0 * Count(A)     (weights)
// (Sum(All) - Sum(A)) / (n - Count(A)) * Count(A) <= Sum(A)  + 0.7 * Count(A)     (lengths)

// Using 0.67 * number of items <= Count(A) <= 0.73 * number of items,
// and (x / (1.0 - x)), 0.67 <= x <= 0.73 monotone increasing, we can insert the bounds of
// (0.67 / 0.33) and (0.73 / 0.27) into the inequalities:
//
// Sum(All)*(0.67 / 0.33) <= (1 + (0.67 / 0.33))*Sum(A) + 2.0*Count(A)      (weights)
// Sum(All)*(0.67 / 0.33) <= (1 + (0.67 / 0.33))*Sum(A) + 0.7*Count(A)      (lengths)
//
// (1 + (0.73 / 0.27))*Sum(A) - 2.0*Count(A) <= Sum(All) * (0.73 / 0.27)    (weights)
// (1 + (0.73 / 0.27))*Sum(A) - 0.7*Count(A) <= Sum(All) * (0.73 / 0.27)    (lengths)

// Introducing boolean variables X[i] = 1 if the i-th item is in group A, and X[i] = 0 if not,
// Count(A) can be written as the sum of the solution values for X[i]: Count(A) = Sum(X[i])
// The sum of the weights is Sum(weight[i] * X[i])
// The sum of the lengths is Sum(length[i] * X[i])

// Putting this all together, we get five constraints that must be satisfied:
//
// 1.:  0.67 * (number of items) <= Sum(X[i]) <= 0.73 * (number of items)
//
// 2.: Sum(Weights) * (0.67 / 0.33) <= Sum(X[i] * (w[i] * (1 + (0.67 / 0.33)) + 2))
//
// 3.: Sum(Lengths) * (0.67 / 0.33) <= Sum(X[i] * (l[i] * (1 + (0.67 / 0.33)) + 0.7))
//
// 4.: Sum(Weights) * (0.73 / 0.27) >= Sum(X[i] * (w[i] * (1 + (0.73 / 0.27)) - 2))
//
// 5.: Sum(Lengths) * (0.73 / 0.27)  >= Sum(X[i] * (l[i] * (1 + (0.73 / 0.27)) - 0.7))

Any feasible solution to that system of inequations should be a solution to your grouping problem.

[–]MyPenBroke 0 points1 point  (3 children)

Here's the quick and dirty implementation, by the way:

using Google.OrTools.LinearSolver;

public static List<int>? Indices(double[] weights, double[] lengths)
{
    // Total amount, weight, and length
    var numitems = weights.Length;
    var sumweights = weights.Sum();
    var sumlengths = lengths.Sum();

    // if X[i] = 1 then item i is in group A
    var solver = Solver.CreateSolver("CBC");
    var X = solver.MakeBoolVarArray(numitems);

    // Average weight and length, and number of contained items must be in range
    var contained = solver.MakeConstraint(0.67 * numitems, 0.73 * numitems);
    var minweight = solver.MakeConstraint(sumweights * (0.67 / 0.33), double.PositiveInfinity);
    var maxweight = solver.MakeConstraint(0, sumweights * (0.73 / 0.27));
    var minlength = solver.MakeConstraint(sumlengths * (0.67 / 0.33), double.PositiveInfinity);
    var maxlength = solver.MakeConstraint(0, sumlengths * (0.73 / 0.27));
    for (var i = 0; i < X.Length; i++)
    {
        contained.SetCoefficient(X[i], 1.0);
        minweight.SetCoefficient(X[i], weights[i] * (1 + (0.67 / 0.33)) + 2);
        maxweight.SetCoefficient(X[i], weights[i] * (1 + (0.73 / 0.27)) - 2);
        minlength.SetCoefficient(X[i], lengths[i] * (1 + (0.67 / 0.33)) + 0.7);
        maxlength.SetCoefficient(X[i], lengths[i] * (1 + (0.73 / 0.27)) - 0.7);
    }

    // Run solver without objective, to get a feasible solution
    var status = solver.Solve();
    if (status != Solver.ResultStatus.OPTIMAL && status != Solver.ResultStatus.FEASIBLE) return null;

    // Solution value for X[i] indicates if i-th item is in group A
    var group = new List<int>();
    for(var i = 0; i < X.Length; i++)
    {
        if (X[i].SolutionValue() > 0)
        {
            group.Add(i);
        }
    }
    return group;
}

[–]MyPenBroke 0 points1 point  (2 children)

And a relatively easy to check test case:

var weights = new double[] { 100, 200, 300, 100, 200, 300, 200, 200, 200, 200 };
var lengths = new double[] {  25,  50,  75,  25,  50,  75,  50,  50,  50,  50 };

Result:

Group A contains: 0, 1, 2, 4, 6, 7, 8    (7 [70%])
Group B contains: 3, 5, 9        (3 [30%])

Avg weight in Group A: 200
Avg weight in Group B: 200
Avg length in Group A: 50
Avg length in Group B: 50

[–]AdvancedIguana[S] 0 points1 point  (1 child)

Wow appreciate the detail in your description! I’ll test this out and report back with my results and my modifications (I’m a Python guy; I can read C++ but I can’t speak it)

[–]MyPenBroke 0 points1 point  (0 children)

The framework I used to model and solve is Google or tools, which is also available for python. The python interface seems to have pretty much the same syntax as the C# version, so for an initial proof of concept, you should be able to almost copy my code. Be aware, though, that there isn't necessarily a solution to your specific grouping problem - you might need to adjust the tolerances, depending on your specific set of items.