Balanced Number Partitioning

Balanced number partitioning is a variant of multiway number partitioning in which there are constraints on the number of items allocated to each set. The input to the problem is a set of ''n'' items of different sizes and two integers ''m'' and ''k''. The output is a partition of the items into ''m'' subsets, such that the number of items in each subset is at most ''k''. Subject to this, it is required that the sums of sizes in the ''m'' subsets be as similar as possible. An example application is identical-machines scheduling, where each machine has a job-queue that can hold at most ''k'' jobs. The problem also has applications in the manufacturing of VLSI chips, and in assigning tools to machines in flexible manufacturing systems. In the standard three-field notation for optimal job scheduling problems, the problem of minimizing the largest sum is sometimes denoted by "P\mid \#\le k\mid C_\max". The middle field "# ≤ k" denotes that the number of jobs on each machine should be at most ''k''. This is in contrast to the unconstrained version, which is denoted by "P\parallel C_\max".
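
To make the definition concrete, the following exhaustive search finds a partition into ''m'' subsets of at most ''k'' items each that minimizes the largest sum. It is exponential in ''n'' (the problem is NP-hard), so it is for illustration only; all names here are ours, not from the literature.

```python
from itertools import product

def balanced_partition(items, m, k):
    """Try every assignment of items to m subsets and keep the feasible one
    (at most k items per subset) whose largest subset-sum is smallest.
    Exponential time -- illustrates the problem definition only."""
    best_value, best_assign = None, None
    for assign in product(range(m), repeat=len(items)):
        counts, sums = [0] * m, [0] * m
        feasible = True
        for item, j in zip(items, assign):
            counts[j] += 1
            if counts[j] > k:
                feasible = False
                break
            sums[j] += item
        if feasible and (best_value is None or max(sums) < best_value):
            best_value, best_assign = max(sums), assign
    return best_value, best_assign
```

For example, with items (8, 7, 6, 5, 4), ''m'' = 2 and ''k'' = 3, the optimum is the split {8, 7} / {6, 5, 4}, whose largest sum is 15.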


Two-way balanced partitioning

A common special case, called two-way balanced partitioning, is when there should be two subsets (''m'' = 2). The two subsets should contain floor(''n''/2) and ceiling(''n''/2) items. It is a variant of the partition problem. It is NP-hard to decide whether there exists a partition in which the sums in the two subsets are equal (see problem P12). There are many algorithms that aim to find a balanced partition in which the sums are as nearly equal as possible.
* Coffman, Frederickson and Lueker present a restricted version of the LPT algorithm (called RLPT), in which inputs are assigned in pairs. When the inputs are uniformly-distributed random variables, the expected largest sum of RLPT is \frac{n}{4}+\Theta(1/n). The expected work-difference (difference between the largest and smallest sums) is \Theta(1/n).
* Lueker presents a variant of the LDM algorithm, called the pairwise differencing method (PDM). Its expected work-difference is \Theta(1/n).
* Tsai presents an algorithm called Restricted Largest Difference (RLD). Its work-difference is O(\log n/n^2) almost surely.
* Yakir presents a balanced variant of the LDM algorithm for ''m'' = 2, called BLDM. Its expected work-difference is n^{-\Theta(\log n)}.
* Mertens presents a complete anytime algorithm for balanced two-way partitioning. It combines the BLDM algorithm with the complete Karmarkar–Karp algorithm.
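
The pairing idea behind RLPT can be sketched as follows. This is an illustrative reading of "inputs are assigned in pairs" (sort descending, and in each consecutive pair send the larger item to the currently lighter subset); the details of the published algorithm may differ.

```python
def rlpt_two_way(items):
    """Two-way balanced partitioning heuristic: sort descending, group the
    items into consecutive pairs, and within each pair give the larger item
    to the subset with the smaller current sum.  Both subsets end up with
    the same number of items (an odd input is padded with a zero item)."""
    xs = sorted(items, reverse=True)
    if len(xs) % 2 == 1:
        xs.append(0)  # padding item; note it may appear in the output
    a, b = [], []
    for hi, lo in zip(xs[::2], xs[1::2]):
        if sum(a) <= sum(b):
            a.append(hi); b.append(lo)
        else:
            b.append(hi); a.append(lo)
    return a, b
```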


Balanced triplet partitioning

Another special case, called 3-partitioning, is when the number of items in each subset should be at most 3 (''k'' = 3). Deciding whether there exists such a partition with equal sums is exactly the 3-partition problem, which is known to be strongly NP-hard. There are approximation algorithms that aim to find a partition in which the sums are as nearly equal as possible.
* Kellerer and Woeginger adapt the LPT algorithm to triplet partitioning (where there are at most 3''m'' items, and each subset should contain at most 3 items). Their algorithm is called ''modified-LPT'' or ''MLPT''. It orders the items from large to small, and puts each item in turn into the bin with the smallest sum among those bins that contain fewer than 3 items. They show that the MLPT algorithm attains at most \frac{4}{3}-\frac{1}{3m} of the ''minimum largest'' sum, which is the same approximation ratio that LPT attains for the unconstrained problem. The bound is tight for MLPT.
* Chen, He and Lin show that, for the same problem, MLPT attains at least \frac{3m-1}{4m-2} of the ''maximum smallest'' sum, which is again the same ratio that LPT attains for the unconstrained problem.
* Kellerer and Kotov present a different algorithm (for the case with exactly 3''m'' items) that attains at most 7/6 of the ''minimum largest'' sum.
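
The MLPT rule described above (largest item first, into the lightest bin that still has room) is straightforward to implement:

```python
def mlpt(items, m, k=3):
    """Modified LPT: sort the items from large to small and put each one
    into the bin with the smallest current sum among the bins that contain
    fewer than k items.  Assumes len(items) <= m * k, so an open bin
    always exists."""
    bins = [[] for _ in range(m)]
    for x in sorted(items, reverse=True):
        open_bins = [b for b in bins if len(b) < k]
        min(open_bins, key=sum).append(x)
    return bins
```

For example, `mlpt([5, 5, 4, 4, 3, 3], 2)` returns two triplets with sums 12 and 12.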


Balanced partitioning with larger cardinalities

A more general case, called ''k''-partitioning, is when the number of items in each subset should be at most ''k'', where ''k'' can be any positive integer.
* Babel, Kellerer and Kotov study a variant in which there are ''k''×''m'' items (for some integer ''k''), and each of the ''m'' sets must contain exactly ''k'' items. They present several heuristic algorithms for approximating the ''minimum largest'' sum:
** ''Folding algorithm'': optimal for ''m'' = 2, and in general has a tight approximation ratio that approaches 2 as ''m'' grows.
** ''Exchange algorithm'': also has a tight approximation ratio approaching 2. It is not known whether it runs in polynomial time.
** ''Primal-dual algorithm'' (a combination of LPT and MultiFit): approximation ratio at most 4/3. This bound is tight for ''k'' = 4 when ''m'' is sufficiently large.
** They also conjecture that modified-LPT has an approximation ratio of \frac{4}{3}-\frac{1}{3m}. Currently, this conjecture is known to be true only for ''k'' = 3. For ''k'' > 3, it is known that its approximation ratio is at most 2.
* Michiels, Korst, Aarts, van Leeuwen and Spieksma study a variant in which each of the ''m'' sets must contain either ceiling(''n''/''m'') or floor(''n''/''m'') items (so ''k'' = ceiling(''n''/''m'')). They extend the balanced-LDM algorithm (BLDM) from ''m'' = 2 to general ''m''. The generalized algorithm runs in time O(n\log n). They prove that its approximation ratio for the minimum largest sum is exactly 4/3 for ''k'' = 3, 19/12 for ''k'' = 4, 103/60 for ''k'' = 5, 643/360 for ''k'' = 6, and 4603/2520 for ''k'' = 7. These ratios were found by solving a mixed-integer linear program. In general (for any ''k''), the approximation ratio lies between a closed-form lower bound and an upper bound, both approaching 2 as ''k'' grows; the exact MILP results for ''k'' = 3, 4, 5, 6, 7 coincide with the lower bound. For ''k'' > 7, no exact results are known, but the difference between the lower and upper bounds is less than 0.3%. When the parameter is the number of subsets ''m'' rather than ''k'', the approximation ratio is known exactly as a function of ''m''.
* Zhang, Mouratidis and Pang show that BLDM might yield partitions with a high work-difference (difference between the highest and lowest sums), both when the inputs are distributed uniformly and when their distribution is skewed. They propose two alternative heuristics: LRM reduces the work-difference to a third of BLDM's when the distribution is uniform; Meld reduces the work-difference when the distribution is skewed. A hybrid algorithm combines BLDM, LRM and Meld and adapts dynamically to different data distributions.
* When ''k'' is fixed, a PTAS of Hochbaum and Shmoys can be used for balanced partitioning. When ''k'' is part of the input, no PTAS is currently known.
* Dell'Amico and Martello study the problem of minimizing the largest sum when the number of items in each set is at most ''k''. They show that the linear-programming relaxation of this variant has the same optimal value as the LP relaxation of the unconstrained variant. The expression \max\left(\frac{1}{m}\sum_i x_i,\; x_1\right), where the ''xi'' are the inputs ordered from large to small, is a lower bound on the optimal largest sum, and its worst-case ratio is 1/2 in both variants. The improved expression \max\left(\frac{1}{m}\sum_i x_i,\; x_1,\; x_k+x_{k+1}\right) has worst-case ratio 2/3 in the unconstrained variant but still 1/2 in the constrained variant. The approximation ratio of modified list scheduling is 1/2 for the unconstrained variant, but 0 for the constrained variant (it can be arbitrarily bad). The approximation ratio of the modified LPT algorithm is at most 2. They also show that a previously proposed lower bound has a tight worst-case performance ratio of 3/4, and that their PD algorithm has a tight performance ratio of 4/3 (when ''m'' is sufficiently large).
* He, Tan, Zhu and Yao consider the problem of maximizing the smallest sum. They show that the FOLDING algorithm has a tight approximation ratio, and present a new algorithm, HARMONIC1, with a better worst-case ratio. Both algorithms are ''ordinal'': they partition the items based only on the order between them, rather than on their exact values. They prove that ''any'' ordinal algorithm has ratio at most O(1/\ln k) for maximizing the smallest sum, which indicates that HARMONIC1 is asymptotically optimal. For any fixed ''k'', any ordinal algorithm has ratio at most the smallest root of a certain equation involving floor functions; when ''k'' tends to infinity, this upper bound approaches 0.
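
The folding algorithm mentioned above is commonly described as dealing the sorted items to the bins in a back-and-forth ("snake") order; the sketch below follows that common description and is not taken verbatim from any of the cited papers.

```python
def folding(items, m):
    """Folding heuristic for partitioning k*m items into m bins of exactly
    k items each: sort descending, then deal the items row by row,
    alternating the direction of each row.  Assumes len(items) is a
    multiple of m."""
    xs = sorted(items, reverse=True)
    bins = [[] for _ in range(m)]
    for r in range(len(xs) // m):
        row = xs[r * m:(r + 1) * m]
        order = range(m) if r % 2 == 0 else range(m - 1, -1, -1)
        for j, x in zip(order, row):
            bins[j].append(x)
    return bins
```

For example, `folding([6, 5, 4, 3, 2, 1], 2)` yields the bins {6, 3, 2} and {5, 4, 1}, with sums 11 and 10.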


Relations between balanced and unconstrained problems

There are some general relations between approximations to the balanced partition problem and the standard (unconstrained) partition problem.
* Babel, Kellerer and Kotov prove a tight upper bound, smaller than 2, on the ratio between the unconstrained optimum and the constrained optimum.
* Kellerer and Kotov prove that every heuristic for balanced partitioning with capacity ''k'' and approximation ratio ''r'' (for the minimum largest sum) can be employed to obtain a heuristic for unconstrained partitioning whose approximation ratio is the maximum of ''r'' and a second term that decreases with ''k''. In particular, their 7/6-approximation algorithm for triplet partitioning (''k'' = 3) can be used to obtain a heuristic for unconstrained partitioning.


Different cardinality constraints

The cardinality constraints can be generalized by allowing a different constraint on each subset. This variant was introduced in the "open problems" section of an earlier work, whose authors call it the ''ki''-partitioning problem. He, Tan, Zhu and Yao present an algorithm called HARMONIC2 for maximizing the smallest sum with different cardinality constraints, and prove a lower bound on its worst-case ratio.


Categorized cardinality constraints

Another generalization of cardinality constraints is as follows. The input items are partitioned into ''k'' categories. For each category ''h'', there is a capacity constraint ''kh''. Each of the ''m'' subsets may contain at most ''kh'' items from category ''h''. In other words, all ''m'' subsets should be independent sets of a particular partition matroid. Two special cases of this problem have been studied.


Kernel partitioning

In the ''kernel balanced-partitioning problem'', some ''m'' pre-specified items are ''kernels'', and each of the ''m'' subsets must contain a single kernel (and an unlimited number of non-kernels). Here, there are two categories: the kernel category with capacity 1, and the non-kernel category with unlimited capacity.
* Chen, He and Yao prove that the problem is NP-hard even for ''k'' = 3 (for ''k'' = 2 it can be solved efficiently by finding a maximum-weight matching). They then present an algorithm called ''Kernel-LPT'' (KLPT): it assigns a kernel to each subset, and then runs the modified LPT algorithm (which puts each item into the subset with the smallest sum among those that have fewer than ''k'' items). They prove an upper bound on its approximation ratio for the ''minimum largest'' sum with ''k'' = 3. However, Chen, He and Lin later determine its tight approximation ratios, both for the minimum largest sum and for the maximum smallest sum.
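
The KLPT algorithm can be sketched directly from the description above: seed each subset with its kernel, then run the modified LPT rule on the remaining items.

```python
def klpt(kernels, items, k):
    """Kernel-LPT: each of the m = len(kernels) subsets starts with one
    kernel; the remaining items are placed largest-first into the subset
    with the smallest sum among those holding fewer than k items.
    Assumes len(kernels) * k >= len(kernels) + len(items)."""
    bins = [[g] for g in kernels]
    for x in sorted(items, reverse=True):
        open_bins = [b for b in bins if len(b) < k]
        min(open_bins, key=sum).append(x)
    return bins
```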


One-per-category partitioning

In another variant of this problem, there are ''k'' categories of size ''m'', and each subset should contain exactly one item from each category. That is, ''kh'' = 1 for each category ''h''.
* Wu and Yao presented the ''layered LPT'' algorithm, a variant of the LPT algorithm. They prove bounds on its approximation ratio, both for minimizing the largest sum and for maximizing the smallest sum; in some special cases, the latter bound can be improved, for general ''k'' as well as for ''k'' = 3.
* Li and Li presented different algorithms for the same problem. For minimizing the largest sum, they present an EPTAS for constant ''k'' and an FPTAS for constant ''m''. For maximizing the smallest sum, they present a 1/(''k'' − 1)-approximation algorithm for the general case, and an EPTAS for constant ''k''. They also study a more general objective: minimizing the lp-norm of the vector of sums. They prove that the layered-LPT algorithm is a 2-approximation algorithm for all norms.
* Dell'Olmo, Hansen, Pallottino and Storchi study 32 different objectives for this problem. For each of the four operators max, min, sum and diff, one operator can be applied to the ''k'' items inside each subset, and then another operator can be applied to the ''m'' results for the different subsets. Each of these 16 objectives can be either maximized or minimized, for a total of 32. They show that 21 of these problems can be solved in linear time; 7 require more complex, but still polynomial-time, algorithms; and 3 are NP-hard: maximizing (min, sum), minimizing (max, sum), and minimizing (diff, sum). They left open the status of minimizing (diff, diff).
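
One natural reading of the layered-LPT idea (an illustrative sketch of ours, not necessarily the exact algorithm of Wu and Yao) processes the categories one "layer" at a time, giving the largest remaining item of the current layer to the lightest subset that has not yet received an item from that layer:

```python
def layered_lpt(categories):
    """categories: a list of k lists, each containing m item sizes.
    Returns m subsets, each with exactly one item per category.  Within a
    layer, items are assigned largest-first to the lightest subset that is
    still missing an item from this layer."""
    m = len(categories[0])
    bins = [[] for _ in range(m)]
    for layer in categories:
        taken = set()
        for x in sorted(layer, reverse=True):
            j = min((i for i in range(m) if i not in taken),
                    key=lambda i: sum(bins[i]))
            taken.add(j)
            bins[j].append(x)
    return bins
```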


See also

* Matroid-constrained number partitioning is a generalization in which a fixed matroid is given as a parameter, and each of the ''m'' subsets should be an independent set or a base of this matroid.
* Cardinality constraints are the special case of matroid constraints in which the matroid is a uniform matroid.
* Categorized cardinality constraints are the special case in which the matroid is a partition matroid.


References

{{reflist}}