The partition problem
The partition problem is intuitive: Given a collection of n objects, can we place part of those objects
in the left hand of a balance and the remaining objects in the right hand side of a balance so that
they weigh the same?
In other (more mathematical) words, if we have a multiset of integers s, can we partition that multiset into two separate
multisets q and r so that the sum of the elements in q is the same as the sum of the elements in r.
A more formal statement of the problem can be found on mathworld.wolfram
here.
This simple problem belongs to a very unique kind of problems called NP-complete problems that
are very hard to solve.
Given an instance of the partition problem with n positive integer values,
one way to find if there is a solution to that instance
is to examine each
and every one of the 2n subsets and to check
for every one of those subsets if
that subset and its complement are a solution to
this instance of the partition problem.
One adds up the elements in the subset,
adds up the elements in the complement
and if the two sums are the same,
this is a solution to this instance of the partition problem.
Consider the following example: {2,3,5}.
In this case n, the number of elements in the instance
of the partition problem, is 3.
If one calculates all the possible subsets, there are 2 3 = 8 possible distinct subsets.
The following table shows the eight subsets,
their complements and indicates which of those subsets are
solutions for this instance of the partition problem.
| Index |
Binary Encoding |
Elements |
Elements in the Complement |
Solution |
| 0 |
000 |
{} |
{2,3,5} |
No |
| 1 |
001 |
{2} |
{3,5} |
No |
| 2 |
010 |
{3} |
{2,5} |
No |
| 3 |
011 |
{2,3} |
{5} |
Yes |
| 4 |
100 |
{5} |
{2,3} |
Yes |
| 5 |
101 |
{2,5} |
{3} |
No |
| 6 |
110 |
{3,5} |
{2} |
No |
| 7 |
111 |
{2,3,5} |
{} |
No |
Notice that the first half of the table is symmetrical
to the second half of the table.
Hence, one only needs to go through the first half
of all the elements in the Power Set.
On this web site you will find several programs that solve this problem.
The input file has the following format:
- The first line contains the number of integers in the multiset, n
- The next n lines contain the n integers, one per line.
The sequential version of the program is available here:
sequentialPartition.c
This version works only for small instances of the problem, it will solve
instances of size at most 32. This version uses "brute force".
There is a Makefile available that will allow you to compile the code.
The Makefile is available here:
Makefile.
You can compile the code by typing:
make sequentialPartition
Several test files are available to test the program:
You can execute the program (after compiling it) by typing:
./sequentialPartition < test16.txt
If you want to find how long does the program take to execute, you can use the command time as follows:
time ./sequentialPartition < test16.txt
If you observe carefully the source code of that solution, and furthermore if you profile the execution
of the code, you will notice that the loop in the main function is responsible for the bulk
of the execution time of the code.
OpenMP
That loop can be parallelized using OpenMP, a standard that allows developers to parallelize
programs written in C/C++ or Fortran. OpenMP is available with gcc.
A parallel version of the sequential code based on OpenMP is available:
openmpPartition.c
Again you can compile this version of the program by typing:
make openmpPartition
The changes to the source code are minimal:
- Including the header file for the openmp library.
- A pragma statement that indicates to the compiler to parallelize the main loop.
As with the sequential program, you can compile this version of the program by typing:
make openmpPartition
You can execute the program and observe the speedup by typing:
time ./openmpPartition < test16.txt
Computers with microprocessors with several cores are ubiquitous these days.
OpenMP is available on the gcc family of compilers and on Visual Studio.
Lawrence Livermore National Laboratory has an excellent tutorial on OpenMP that is available
here.
MPI (Message Passing Interface)
For very, very large problems, a single computer with multiple cores might not be big enough.
In that case, using a cluster of workstations (computers) becomes an alternative.
A standard has been created to make it feasible to program clusters of workstations
as a large "virtual" computer. This standard is called MPI (Message Passing Interface).
Again,
Lawrence Livermore National Laboratory has an excellent tutorial on MPI that is available
here.
The sequential program to solve the partition problem has been parallelized using MPI
and it is available:
mpiPartition.c.
As before, you can compile this version of the program by typing:
make mpiPartition
To execute the program, and check how long does it take, type:
time mpiexec -np 4 mpiPartition < test16.txt
Combining OpenMP and MPI
It is possible to combine OpenMP and MPI.
A version of the program that combines both approaches is available:
mpiOpenMPPartition.c.
As in the previous cases, you can compile this version of the program by typing:
make mpiOpenMPPartition
To execute the program, and check how long does it take, type:
time mpiexec -np 4 mpiOpenMPPartition < test16.txt
Streams in Java
Streams in Java are a very expressive style to write code.
Streams can also be parallelized very easily on machines with
several processors and shared memory.
A tutorial on Java Streams is available
here.
A version of the algorithm that uses Streams in Java is available:
Partition.java.
CUDA
CUDA is a programming language that extends C.
It is used to program NVIDIA GPUs (Graphical Processing Units).
GPUs are cards with their own memory and a large number of processors.
In CUDA, one allocates memory on the GPU card using a variant of the malloc function.
There are also functions to copy the content of variables from the host to the GPU card
and from the GPU card back to the host.
The many processors on the GPU card are programmed using "kernels".
These are functions that are executed simultaneously on the processors on the GPU.
A version of the algorithm in CUDA is available:
cudaPartition.cu.
The source code can be compiled with the nvcc compiler from NVIDIA.
The program requires an NVIDIA GPU to be executed.