MULTIPROCESSOR COMPUTING
News & Updates
 
December 13, 2012
MPC 2.4.1 (stable) is available
September 24, 2012
MPC 2.4.0 (stable) is available
December 9, 2011
MPC 2.3.1 (stable) is available
November 28, 2011
MPC 2.3.0 (stable) is available
June 14, 2011
MPC 2.2.0 (stable) is available
May 1, 2011
MPC 2.1.0 (stable) is available
December 11, 2010
MPC 2.1_rc2 (dev) is available
December 7, 2010
MPC 2.1_rc1 (dev) is available
June 11, 2010
MPC 2.0 (stable) is available
February 18, 2010
MPC 2.0 rc2 (dev) is available
December 14, 2009
MPC 2.0 rc1 (dev) is available
November 30, 2009
MPC 1.1 (stable) is available
July 03, 2009
MPC 1.1 rc8 is available
June 26, 2009
MPC 1.1 rc7 is available
March 31, 2009
MPC 1.1 rc6 is available
December 2008
Project started

MPC

The MPC (MultiProcessor Computing) framework provides a unified parallel runtime designed to improve the scalability and performance of applications running on clusters of (very) large multiprocessor/multicore NUMA nodes.

MPC is available under the CeCILL-C license, which is a French transposition of the LGPL and is fully LGPL-compatible.

MPC conforms to the POSIX Threads, OpenMP 2.5 and MPI 1.3 standards. These standards can be mixed efficiently within one application thanks to process virtualization. MPC has been ported to x86 and x86_64 on Linux and OpenSolaris systems, and supports TCP and InfiniBand interconnects.
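
With process virtualization, MPI tasks run as threads inside one process, so a variable written as an ordinary global would be shared by every task unless each task gets its own copy. The sketch below illustrates that idea only. It assumes plain POSIX threads stand in for MPC tasks and the C11 `_Thread_local` keyword stands in for per-task privatization; `NUM_TASKS`, `INCREMENTS`, `task_main` and `run_tasks` are hypothetical names, and none of this is MPC's actual implementation.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NUM_TASKS  4      /* hypothetical number of tasks sharing one process */
#define INCREMENTS 1000   /* hypothetical amount of work per task */

/* Would be a plain global in the original code; _Thread_local gives
 * every thread (task) its own private copy. */
static _Thread_local long counter = 0;

static void *task_main(void *arg)
{
    long *result = arg;

    for (int i = 0; i < INCREMENTS; i++)
        counter++;        /* touches only this thread's copy, so no lock is needed */
    *result = counter;
    return NULL;
}

/* Runs NUM_TASKS tasks as threads of one process.
 * Returns 0 on success, -1 if a thread could not be created or joined. */
int run_tasks(long results[NUM_TASKS])
{
    pthread_t threads[NUM_TASKS];
    int started = 0;
    int status = 0;

    assert(results != NULL);
    for (; started < NUM_TASKS; started++) {
        if (pthread_create(&threads[started], NULL, task_main,
                           &results[started]) != 0) {
            status = -1;
            break;
        }
    }
    for (int i = 0; i < started; i++) {
        if (pthread_join(threads[i], NULL) != 0)
            status = -1;
    }
    return status;
}
```

Each task ends with its own counter equal to `INCREMENTS`, whereas an unprivatized global would be updated concurrently by all tasks. With GCC, the sketch is typically built with the `-pthread` option.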

Main MPC features:

  • Thread Library:
    MPC comes with its own MxN thread library and POSIX Thread implementation. MxN thread libraries provide lightweight user-level threads that are mapped to kernel threads. One key advantage of the MxN approach is that the user-level thread scheduler can be optimized to create and schedule a very large number of threads with reduced overhead. The MPC thread scheduler provides a polling method that avoids busy-waiting and keeps communications highly reactive, even when the number of tasks is much larger than the number of available CPU cores. Furthermore, collective communications are integrated into the thread scheduler to enable efficient barrier, reduction and broadcast operations.

  • Memory Allocator:
    The MPC thread library comes with a thread-aware and NUMA-aware memory allocator (malloc, calloc, realloc, free, memalign and posix_memalign). It implements a per-thread heap to avoid contention during allocation and to maintain data locality on NUMA nodes. Each new allocation is first attempted with a lock-free algorithm on the thread's private heap. If this private heap cannot provide a new memory block, the requesting thread requests a large page from the second-level global heap, using a synchronization scheme. A large page is a configurable number of system pages. Memory deallocation is performed locally in each private heap. When a large page is entirely free, it is returned to the second-level global heap with a lock-free method. Pages in the second-level global heap are virtual and are not necessarily backed by physical pages. On the same node, memory pages freed by one thread are made available to new allocations by other threads without any system call.

  • Thread debugging:
    Support for debugging user-level MPC threads is provided by an implementation of libthread_db and a patch to the GNU Debugger (GDB). This makes it possible to manage user-level threads in GDB and in all GDB-based GUIs. It is also compatible with Sun's DBX debugger.

  • OpenMP:
    MPC supports compilation and execution of C/C++/Fortran OpenMP applications thanks to its built-in OpenMP 2.5 runtime. Compilation relies on a patched version of GCC (4.3, 4.4) called MPC_GCC. The OpenMP runtime has been optimized to efficiently support hybrid MPI/OpenMP codes.

  • Thread safety:
    MPC_GCC converts standard C/C++/Fortran MPI codes to the MPC thread-based MPI implementation: thread safety issues are managed with automatic privatization of global variables.

  • Hierarchical Local Storage:
    HLS is a set of directives in C, C++ and Fortran that lets the application developer share global variables across MPI tasks running on the same node. The HLS extension can be used to reduce the memory footprint of MPI programs by avoiding duplication of data that is common to all MPI tasks. One typical use case for HLS variables is a large table of physical constants. With the HLS extension, only one table per node is allocated instead of one table per core (assuming one MPI task per core), which reduces the memory consumed by that table by a factor equal to the number of cores per node.

  • MPI:
    MPC's implementation of MPI fully conforms to the MPI 1.3 standard. It also provides efficient MPI_THREAD_MULTIPLE support (an MPI-2 feature). MPC's communications are implemented as follows. Intra-node communications involve two tasks within a single process (MPC's default mode uses one process per node); these tasks communicate using the optimized thread-scheduler polling method and the collectives integrated into the thread scheduler. For inter-node communications, MPC accesses the TCP or InfiniBand interconnect directly. MPC provides performance close to that of MPICH2 or Open MPI, but with much better support for hybrid programming models (e.g., MPI/PThreads, MPI/OpenMP, ...) and lower memory consumption.
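
The two-level scheme described under Memory Allocator can be sketched in C as follows. This is a simplified illustration under stated assumptions, not MPC's allocator: blocks have one fixed size, a "large page" is obtained from `malloc` rather than from virtual memory, a spinlock stands in for the global heap's synchronization scheme, blocks must be freed by the thread that allocated them, and fully free large pages are never returned to the global level. All names (`sketch_alloc`, `sketch_free`, `BLOCK_SIZE`, and so on) are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

#define BLOCK_SIZE            64   /* hypothetical fixed block size in bytes */
#define BLOCKS_PER_LARGE_PAGE 8    /* hypothetical large-page size, in blocks */

typedef struct block { struct block *next; } block_t;

/* Second level: global heap, protected by a spinlock (assumption). */
static atomic_flag global_lock = ATOMIC_FLAG_INIT;
static atomic_size_t large_pages_requested;

/* First level: thread-private free list, touched only by its owner. */
static _Thread_local block_t *private_free_list = NULL;

static void *global_heap_get_large_page(void)
{
    void *page;

    while (atomic_flag_test_and_set_explicit(&global_lock, memory_order_acquire))
        ;  /* spin until the global heap is available */
    page = malloc((size_t)BLOCK_SIZE * BLOCKS_PER_LARGE_PAGE);
    if (page != NULL)
        atomic_fetch_add(&large_pages_requested, 1);
    atomic_flag_clear_explicit(&global_lock, memory_order_release);
    return page;
}

/* Fast path: pop from the private list without any lock.
 * Slow path: refill the private list from the global heap. */
void *sketch_alloc(void)
{
    block_t *b;

    if (private_free_list == NULL) {
        char *page = global_heap_get_large_page();
        if (page == NULL)
            return NULL;
        for (size_t i = 0; i < BLOCKS_PER_LARGE_PAGE; i++) {
            b = (block_t *)(page + i * BLOCK_SIZE);
            b->next = private_free_list;
            private_free_list = b;
        }
    }
    b = private_free_list;
    private_free_list = b->next;
    return b;
}

/* Deallocation is local: the block goes back to the caller's private list. */
void sketch_free(void *p)
{
    block_t *b = p;

    assert(p != NULL);
    b->next = private_free_list;
    private_free_list = b;
}

size_t sketch_large_pages_requested(void)
{
    return atomic_load(&large_pages_requested);
}
```

In this sketch, a thread touches the synchronized global level only once per `BLOCKS_PER_LARGE_PAGE` allocations; every other allocation and every deallocation stays on its private list, which is the property the per-thread heap is meant to provide.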

Additional features:

  • Intel Threading Building Blocks:
    TBB has been ported to run on top of MPC. The patched version of TBB is provided with the MPC distribution.


Thanks to its design, MPC allows mixed-mode programming models and efficient interaction with the HPC software stack.


