The partition problem

The partition problem is intuitive: given a collection of n objects, can we place some of the objects in the left pan of a balance and the remaining objects in the right pan so that the two sides weigh the same? In other (more mathematical) words: given a multiset of integers s, can we partition that multiset into two multisets q and r so that the sum of the elements in q equals the sum of the elements in r? A more formal statement of the problem can be found on MathWorld here.
This simple problem belongs to a special class of problems called NP-complete problems, which are believed to be very hard to solve.

Given an instance of the partition problem with n positive integer values, one way to decide whether that instance has a solution is to examine each and every one of the 2^n subsets and to check, for each of those subsets, whether that subset and its complement form a solution to this instance of the partition problem. One adds up the elements in the subset, adds up the elements in the complement, and if the two sums are equal, this is a solution to this instance of the partition problem.

Consider the following example: {2,3,5}. In this case n, the number of elements in the instance of the partition problem, is 3. If one calculates all the possible subsets, there are 2^3 = 8 possible distinct subsets. The following table shows the eight subsets, their complements, and indicates which of those subsets are solutions for this instance of the partition problem.

Index  Binary Encoding  Elements   Elements in the Complement  Solution
0      000              {}         {2,3,5}                     No
1      001              {2}        {3,5}                       No
2      010              {3}        {2,5}                       No
3      011              {2,3}      {5}                         Yes
4      100              {5}        {2,3}                       Yes
5      101              {2,5}      {3}                         No
6      110              {3,5}      {2}                         No
7      111              {2,3,5}    {}                          No

Notice that the second half of the table mirrors the first: each subset in the second half is the complement of a subset in the first half. Hence, one only needs to go through the first half of the power set, that is, the subsets with indices 0 through 2^(n-1) - 1.

On this web site you will find several programs that solve this problem. The input file has the following format:


The sequential version of the program is available here: sequentialPartition.c This version uses "brute force" and works only for small instances of the problem: it will solve instances of size at most 32.
There is a Makefile available that will allow you to compile the code. The Makefile is available here: Makefile. You can compile the code by typing:
make sequentialPartition
Several test files are available to test the program:
You can execute the program (after compiling it) by typing:
./sequentialPartition < test16.txt
If you want to find out how long the program takes to execute, you can use the command time as follows:
time ./sequentialPartition < test16.txt
If you examine the source code of that solution carefully, and furthermore profile the execution of the code, you will notice that the loop in the main function is responsible for the bulk of the execution time.

OpenMP

That loop can be parallelized using OpenMP, a standard that allows developers to parallelize programs written in C, C++, or Fortran. OpenMP is available with gcc. A parallel version of the sequential code based on OpenMP is available: openmpPartition.c Again, you can compile this version of the program by typing:
make openmpPartition
The changes to the source code are minimal.
You can execute the program and observe the speedup by typing:
time ./openmpPartition < test16.txt
Computers with multi-core microprocessors are ubiquitous these days. OpenMP is available in the gcc family of compilers and in Visual Studio. Lawrence Livermore National Laboratory has an excellent tutorial on OpenMP that is available here.

MPI (Message Passing Interface)

For very, very large problems, a single computer with multiple cores might not be powerful enough. In that case, using a cluster of workstations (computers) becomes an alternative. A standard called MPI (Message Passing Interface) has been created to make it feasible to program a cluster of workstations as one large "virtual" computer. Again, Lawrence Livermore National Laboratory has an excellent tutorial on MPI that is available here. The sequential program to solve the partition problem has been parallelized using MPI and is available: mpiPartition.c. As before, you can compile this version of the program by typing:
make mpiPartition
To execute the program, and check how long it takes, type:
time mpiexec -np 4 mpiPartition < test16.txt
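The message-passing approach can be sketched as follows. This is an illustration of the general decomposition, not the contents of mpiPartition.c: each of the P processes scans an interleaved share of the subset masks, and an allreduce with a logical-OR combines the per-process results. (For brevity the sketch hard-codes the instance {2,3,5} instead of reading it from stdin.)

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int s[] = {2, 3, 5};                 /* the real program reads this from stdin */
    int n = 3;
    long total = 0;
    for (int i = 0; i < n; i++) total += s[i];

    /* Rank r handles masks r, r+size, r+2*size, ... (cyclic distribution). */
    int local_found = 0;
    unsigned long half = 1UL << (n - 1);
    for (unsigned long mask = rank; mask < half; mask += size) {
        long sum = 0;
        for (int i = 0; i < n; i++)
            if (mask & (1UL << i)) sum += s[i];
        if (total % 2 == 0 && 2 * sum == total) local_found = 1;
    }

    /* Combine the per-process flags with a logical OR. */
    int found = 0;
    MPI_Allreduce(&local_found, &found, 1, MPI_INT, MPI_LOR, MPI_COMM_WORLD);
    if (rank == 0)
        printf("%s\n", found ? "Partition exists" : "No partition");

    MPI_Finalize();
    return 0;
}
```

Compile with mpicc and run under mpiexec as shown above; with -np 4, the eight-mask search space of this tiny example is split four ways.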

Combining OpenMP and MPI

It is possible to combine OpenMP and MPI. A version of the program that combines both approaches is available: mpiOpenMPPartition.c. As in the previous cases, you can compile this version of the program by typing:
make mpiOpenMPPartition
To execute the program, and check how long it takes, type:
time mpiexec -np 4 mpiOpenMPPartition < test16.txt

Streams in Java

Streams in Java are a very expressive style of writing code. Streams can also be parallelized very easily on machines with several processors and shared memory. A tutorial on Java Streams is available here. A version of the algorithm that uses Streams in Java is available: Partition.java.
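The Streams formulation of the search can be sketched as follows. This is an illustration, not the Partition.java from this site; the class name PartitionSketch is made up. The masks of the first half of the power set form a LongStream, and a single call to .parallel() distributes the checks across the available cores.

```java
import java.util.stream.LongStream;

public class PartitionSketch {
    // Return true if some subset of s sums to half the total.
    static boolean hasPartition(int[] s) {
        long total = 0;
        for (int v : s) total += v;
        if (total % 2 != 0) return false;          // odd total: no equal split
        final long target = total;
        // Stream over masks 0 .. 2^(n-1)-1; anyMatch stops early on success.
        return LongStream.range(0, 1L << (s.length - 1))
                .parallel()
                .anyMatch(mask -> {
                    long sum = 0;
                    for (int i = 0; i < s.length; i++)
                        if ((mask & (1L << i)) != 0) sum += s[i];
                    return 2 * sum == target;
                });
    }

    public static void main(String[] args) {
        System.out.println(hasPartition(new int[]{2, 3, 5}));   // true
        System.out.println(hasPartition(new int[]{1, 2, 4}));   // false
    }
}
```

Note how little changes between the sequential and the parallel versions: removing the .parallel() call yields the sequential stream, which is part of what makes this style attractive.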

CUDA

CUDA is an extension of the C/C++ programming language used to program NVIDIA GPUs (Graphics Processing Units). GPUs are cards with their own memory and a large number of processors. In CUDA, one allocates memory on the GPU card using cudaMalloc, a variant of the malloc function. There are also functions, such as cudaMemcpy, to copy the contents of variables from the host to the GPU card and from the GPU card back to the host. The many processors on the GPU card are programmed using "kernels": functions that are executed simultaneously by the processors on the GPU. A version of the algorithm in CUDA is available: cudaPartition.cu. The source code can be compiled with the nvcc compiler from NVIDIA. The program requires an NVIDIA GPU to be executed.
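The allocate/copy/kernel pattern just described can be sketched as follows. This is an illustration of the pattern, not the contents of cudaPartition.cu; the kernel name checkMasks is made up, and the instance {2,3,5} is hard-coded instead of being read from stdin. Each GPU thread tests one subset mask.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Kernel: thread with global index `mask` checks one subset. Any thread
// that finds an equal split sets the flag (all writers store the same 1,
// so the race is benign).
__global__ void checkMasks(const int *s, int n, long total, int *found) {
    unsigned long mask = blockIdx.x * blockDim.x + threadIdx.x;
    if (mask >= (1UL << (n - 1))) return;
    long sum = 0;
    for (int i = 0; i < n; i++)
        if (mask & (1UL << i)) sum += s[i];
    if (2 * sum == total) *found = 1;
}

int main() {
    int s[] = {2, 3, 5};
    int n = 3;
    long total = 2 + 3 + 5;

    int *d_s, *d_found, found = 0;
    cudaMalloc((void**)&d_s, n * sizeof(int));      // allocate on the GPU
    cudaMalloc((void**)&d_found, sizeof(int));
    cudaMemcpy(d_s, s, n * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemcpy(d_found, &found, sizeof(int), cudaMemcpyHostToDevice);

    // One thread per mask in the first half of the power set.
    unsigned long half = 1UL << (n - 1);
    int threads = 256;
    int blocks = (int)((half + threads - 1) / threads);
    checkMasks<<<blocks, threads>>>(d_s, n, total, d_found);

    cudaMemcpy(&found, d_found, sizeof(int), cudaMemcpyDeviceToHost);
    printf("%s\n", found ? "Partition exists" : "No partition");
    cudaFree(d_s);
    cudaFree(d_found);
    return 0;
}
```

Instead of a loop over the 2^(n-1) masks, the GPU version launches one thread per mask, which is why a large number of processors pays off for this problem.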