| News & Updates |
| |
| December 13, 2012 |
| MPC 2.4.1 (stable) is available |
| September 24, 2012 |
| MPC 2.4.0 (stable) is available |
| December 9, 2011 |
| MPC 2.3.1 (stable) is available |
| November 28, 2011 |
| MPC 2.3.0 (stable) is available |
| June 14, 2011 |
| MPC 2.2.0 (stable) is available |
| May 1, 2011 |
| MPC 2.1.0 (stable) is available |
| December 11, 2010 |
| MPC 2.1_rc2 (dev) is available |
| December 7, 2010 |
| MPC 2.1_rc1 (dev) is available |
| June 11, 2010 |
| MPC 2.0 (stable) is available |
| February 18, 2010 |
| MPC 2.0 rc2 (dev) is available |
| December 14, 2009 |
| MPC 2.0 rc1 (dev) is available |
| November 30, 2009 |
| MPC 1.1 (stable) is available |
| July 03, 2009 |
| MPC 1.1 rc8 is available |
| June 26, 2009 |
| MPC 1.1 rc7 is available |
| March 31, 2009 |
| MPC 1.1 rc6 is available |
| December 2008 |
| Project started |
MPC
The MPC (MultiProcessor Computing) framework provides a
unified parallel runtime designed to improve the scalability and
performance of applications running on clusters of (very) large
multiprocessor/multicore NUMA nodes.
MPC is available under the CeCILL-C
license, which is a French transposition of the LGPL and is fully
LGPL-compatible.
MPC conforms to the POSIX Threads, OpenMP 2.5 and MPI 1.3
standards. All these standards can be mixed
together in an efficient way, thanks to process virtualization. MPC
has been ported to the x86 and x86_64 architectures under Linux and
OpenSolaris, and supports TCP and InfiniBand interconnects.
Main MPC features:
Thread Library: MPC
comes with its own MxN thread library and POSIX Thread
implementation. MxN thread libraries provide lightweight
user-level threads that are mapped to kernel threads. One key advantage of the MxN approach
is the ability to optimize the user-level thread scheduler to create and schedule a very large number of threads with
reduced overhead. The MPC thread scheduler provides a polling method
that avoids busy-waiting and keeps a high level of reactivity for
communications, even when the number of tasks is much larger than
the number of available CPU cores. Furthermore, collective
communications are integrated into the thread scheduler to enable
efficient barrier, reduction and broadcast
operations.
Memory Allocator: The
MPC thread library comes with a thread-aware and NUMA-aware
memory allocator (malloc, calloc, realloc, free, memalign and
posix_memalign). It implements a per-thread heap to avoid
contention during allocation and to maintain data locality on NUMA
nodes. Each new data allocation is first performed by a lock-free
algorithm on the thread's private heap. If this private
heap is unable to provide a new memory block, the
thread requests a large page from the second-level global heap
with a synchronization scheme. A large page is a parametrized number
of system pages. Memory deallocation is performed locally in each
private heap. When a large page is completely free, it is
returned to the second-level global heap with a lock-free method.
Pages in the second-level global heap are virtual and are not
necessarily backed by physical pages. Within a node, memory pages freed by one thread are made available to new allocations by other threads without any system call.
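The two-level structure described above can be pictured with a minimal, single-threaded sketch. It is not MPC's allocator: the type and function names, the page sizes and the bump-pointer strategy are all assumptions made for illustration, and the lock-free fast path, the synchronization on the global heap, deallocation and NUMA placement are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumed sizes, chosen only for this sketch. */
#define SYSTEM_PAGE_SIZE     4096u
#define PAGES_PER_LARGE_PAGE 16u   /* the "parametrized number" of system pages */
#define LARGE_PAGE_SIZE      (SYSTEM_PAGE_SIZE * PAGES_PER_LARGE_PAGE)

/* Second-level global heap (hypothetical): here it only counts the
   large pages it has handed out. */
typedef struct {
    size_t large_pages_served;
} global_heap_t;

/* First-level private heap (hypothetical): bump allocation inside the
   large page currently owned by one thread. */
typedef struct {
    char  *page;  /* current large page, NULL before the first request */
    size_t used;  /* bytes already handed out from that page */
} private_heap_t;

/* Slow path. In the design described above this step is synchronized;
   the sketch is single-threaded, so no synchronization is shown. */
char *global_heap_get_large_page(global_heap_t *g)
{
    g->large_pages_served++;
    return malloc(LARGE_PAGE_SIZE);
}

/* Try the private heap first; fall back to the global heap only when the
   current large page cannot satisfy the request. The previous page is
   simply abandoned here because the sketch has no deallocation. */
void *private_heap_alloc(private_heap_t *h, global_heap_t *g, size_t size)
{
    if (size == 0 || size > LARGE_PAGE_SIZE)
        return NULL;                        /* out of scope for the sketch */

    size = (size + 15u) & ~(size_t)15u;     /* round up to 16 bytes */

    if (h->page == NULL || h->used + size > LARGE_PAGE_SIZE) {
        h->page = global_heap_get_large_page(g);
        if (h->page == NULL)
            return NULL;
        h->used = 0;
    }

    void *block = h->page + h->used;
    h->used += size;
    return block;
}
```

In this sketch, consecutive small requests from one thread are served from the same large page, and the global heap is only contacted when that page is exhausted, which is the property the per-thread heap relies on to limit contention.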
Thread debugging: Support for debugging user-level MPC threads is
provided by an implementation of the libthread_db interface
and a patch to the GNU Debugger (GDB). This makes it possible to manage
user-level threads in GDB and in all GUIs based on GDB. It is also
compatible with Sun's DBX debugger.
OpenMP: MPC
supports compilation and execution of C/C++/Fortran OpenMP
applications thanks to its built-in OpenMP
2.5 runtime. Compilation
is handled by a patched version of GCC (4.3, 4.4) called MPC_GCC. The OpenMP
runtime has been optimized to efficiently support hybrid MPI/OpenMP
codes.
Thread safety:
MPC_GCC converts standard C/C++/Fortran MPI codes to the MPC thread-based MPI implementation: thread safety issues are managed with automatic privatization of global variables.
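To see why privatization matters, consider what happens to a global variable when MPI tasks become threads of one process. The sketch below only illustrates the effect: it runs the task bodies sequentially and indexes per-task copies by rank, which is not how MPC_GCC implements privatization. All names and the task count are hypothetical.

```c
#include <assert.h>

#define NUM_TASKS 4  /* hypothetical number of MPI tasks in one process */

/* Original code: a single global. With thread-based tasks, every task
   in the process would read and write this same variable. */
int shared_counter = 0;

/* Privatized form: one copy per task, selected here by task rank. */
int private_counter[NUM_TASKS];

/* Each task adds (rank + 1) to what it believes is its own counter,
   as code written for one process per task is entitled to do. */
void task_body_shared(int rank)
{
    shared_counter += rank + 1;
}

void task_body_private(int rank)
{
    assert(rank >= 0 && rank < NUM_TASKS);
    private_counter[rank] += rank + 1;
}

/* Run every task body once, sequentially, so the result is deterministic. */
void run_all_tasks(void)
{
    for (int rank = 0; rank < NUM_TASKS; ++rank) {
        task_body_shared(rank);
        task_body_private(rank);
    }
}
```

After `run_all_tasks()`, the shared variable holds the sum of every task's contribution, whereas each privatized copy holds only its own task's value, which is what an unmodified MPI code expects.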
Hierarchical Local Storage: HLS is a set of directives in C, C++ and Fortran that allows the application developer to share global variables across MPI tasks running on the same node. The HLS extension can be used to reduce the memory footprint of MPI programs by avoiding the duplication of data that is common to all MPI tasks. One typical use case of HLS variables is a large table of physical constants. With the HLS extension, only one table per node is allocated instead of one table per core (assuming one MPI task per core), thus reducing the memory consumed by that table by a factor equal to the number of cores per node.
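The saving in the constants-table example is simple arithmetic, sketched below. The function names are hypothetical, and the sketch does not use the HLS directives themselves; it only computes the per-node footprint of one table with and without sharing, assuming one MPI task per core.

```c
#include <assert.h>
#include <stddef.h>

/* Without HLS: every MPI task on the node holds its own copy. */
size_t footprint_without_hls(size_t table_bytes, size_t tasks_per_node)
{
    return table_bytes * tasks_per_node;
}

/* With HLS: a single copy is shared by all MPI tasks on the node. */
size_t footprint_with_hls(size_t table_bytes)
{
    return table_bytes;
}

/* Ratio of the two footprints; equals the number of tasks per node. */
size_t hls_reduction_factor(size_t table_bytes, size_t tasks_per_node)
{
    assert(table_bytes > 0);
    return footprint_without_hls(table_bytes, tasks_per_node)
         / footprint_with_hls(table_bytes);
}
```

For instance, a 1 MiB table on a node running 32 MPI tasks occupies 32 MiB without sharing and 1 MiB with it.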
MPI: MPC's
implementation of MPI fully conforms to the MPI 1.3 standard. It also
provides efficient MPI_THREAD_MULTIPLE support (an MPI-2 feature).
MPC's communications are implemented as follows:
intra-node communications involve two tasks in the same process (MPC's
default mode uses one process per node). These tasks use the optimized
thread-scheduler polling method and the collectives integrated into the
thread scheduler to communicate with each other. For inter-node
communications, MPC uses direct access to the TCP or
InfiniBand interconnect. MPC provides performance close to that of MPICH2
or Open MPI, but with much better support for hybrid programming
models (e.g., MPI/PThreads, MPI/OpenMP, ...) and lower memory consumption.
Additional features:
Thanks to its design, MPC allows mixed-mode programming models and efficient interaction with the HPC software stack.