

Madeleine: a medium-level interface for high performance multithreaded environments

The VIA interface was designed as a fair trade-off between portability and efficiency for general-purpose applications. In contrast, the Madeleine medium-level interface specifically targets the efficiency and portability requirements of RPC-based multithreaded environments such as PM2 or Nexus [6]. Unlike MPI or PVM, Madeleine is not designed to be used directly by regular user-level applications. It is intended to bridge the gap between the functionality provided by low-level network protocols such as VIA [4,5] and the requirements of high-level communication schemes such as remote procedure calls. Shah et al. [11] also show how stream sockets and RPC (DCOM) can be efficiently implemented over the VI Architecture. However, they do not tackle the complex integration of a thread package with communication software, which is crucial in the design of multithreaded environments.

One important feature of Madeleine is that it was designed to let upper software layers avoid extra copies of transmitted data. At the lowest level, this cooperation is realized using up-calls on the receiving side, which allow the upper layers to participate interactively in data emission and extraction. At the user interface level, it is accomplished through simple buffer management primitives. Another main characteristic of Madeleine is its "thread-awareness": Madeleine is able to properly schedule threads that access the network. For instance, it can group several polling operations so that multiple threads do not busy-wait simultaneously in the system.

Madeleine is structured in two software layers: a generic user programming interface and a low-level portability interface whose implementation is network-dependent.

The programming interface

The Madeleine programming interface provides a small set of primitives to build RPC-like communication schemes. These primitives look like classical message-passing primitives and differ only in the way the reception of data is handled. A message consists of several pieces of data, located anywhere in user space. They are assembled (resp. disassembled) incrementally using packing (resp. unpacking) primitives. This way, a message can be built (resp. examined) at multiple software levels, which enables, for instance, the use of piggy-backing techniques without losing efficiency. The prototypes of these primitives are very similar to the buffer management primitives provided by the PVM interface (e.g., pvm_pkint, pvm_pkfloat, etc.). For instance, the function used to append one or more integers to the current message has the following prototype:

void pack_int(int mode, int *elems, int nb);

Among the parameters, elems is the address of an array of integers and nb is the number of integers in this array. The mode parameter plays an important role in the Madeleine interface, because it determines the semantics of the operation. It can be assigned different values, also called flags. For instance, the possible flags defining the semantics of the sendbuf_pack operation are: SEND_SAFER, SEND_LATER and SEND_CHEAPER. Please refer to [3] for an extensive description of the programming interface.
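To make the incremental assembly concrete, here is a minimal sketch of how a message might be built by two software layers without copying data. Only the pack_int prototype and the three flag names come from the text. The flag values, the piece record, the current_msg array and the two layer functions are hypothetical, and the mode is merely recorded because the semantics of each flag are described in [3], not here.

```c
#include <assert.h>
#include <stddef.h>

/* Flag names are from the text; the numeric values are hypothetical. */
enum { SEND_SAFER, SEND_LATER, SEND_CHEAPER };

#define MAX_PIECES 16

/* Hypothetical record for one piece of a message: a contiguous
 * user-space area plus the mode it was packed with. */
struct piece {
    const void *addr;
    size_t      size;  /* in bytes */
    int         mode;
};

/* Hypothetical "current message" under construction. */
static struct piece current_msg[MAX_PIECES];
static size_t current_len = 0;

/* Same prototype as in the text. This body only records where the data
 * lives; it does not copy it and does not interpret the mode. */
void pack_int(int mode, int *elems, int nb)
{
    assert(current_len < MAX_PIECES && nb > 0);
    current_msg[current_len].addr = elems;
    current_msg[current_len].size = (size_t)nb * sizeof(int);
    current_msg[current_len].mode = mode;
    current_len++;
}

/* Total number of bytes referenced by the current message. */
size_t message_bytes(void)
{
    size_t total = 0;
    for (size_t i = 0; i < current_len; i++)
        total += current_msg[i].size;
    return total;
}

/* Two software levels contribute to the same message: a runtime layer
 * piggy-backs its own data, then the application appends its arguments. */
static int runtime_info[2] = { 7, 3 };  /* illustrative values only */

void runtime_layer_pack(void)
{
    pack_int(SEND_CHEAPER, runtime_info, 2);
}

void application_layer_pack(int *args, int nb)
{
    pack_int(SEND_CHEAPER, args, nb);
}
```

Calling runtime_layer_pack() and then application_layer_pack(args, 3) leaves two pieces in the message, each still pointing at the caller's own memory, which is the property that lets upper layers avoid intermediate copies.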

The portability layer

The portability layer of Madeleine is intended to provide a common interface to the various communication subsystems to which it may be ported. Its execution model is inspired by the Active Messages model [14]. Messages consist of a set of one or more vectors. Each vector is a contiguous area of memory and is defined by a pair (address, size) where the size is expressed in bytes. The first vector of a message has special semantics with respect to the receive operation: this vector (the "header") contains the data that must be extracted prior to the rest of the message (the "body"). Once extracted, these data are immediately made available to the upper layers so that some application-level actions may be executed before the rest of the message is extracted.
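The message layout described above can be pictured with a small data structure. This is a sketch under assumptions: the type and function names (vector, message, message_header, message_body_bytes) are hypothetical and are not taken from the Madeleine sources.

```c
#include <assert.h>
#include <stddef.h>

/* A vector is a contiguous memory area: an (address, size) pair,
 * with the size expressed in bytes. */
struct vector {
    void  *addr;
    size_t size;
};

/* A message is a set of one or more vectors. */
struct message {
    struct vector *vecs;
    size_t         count;  /* must be at least 1 */
};

/* The first vector is the header: it is extracted before anything else. */
const struct vector *message_header(const struct message *m)
{
    assert(m->count >= 1);
    return &m->vecs[0];
}

/* All remaining vectors form the body. */
size_t message_body_bytes(const struct message *m)
{
    size_t total = 0;
    for (size_t i = 1; i < m->count; i++)
        total += m->vecs[i].size;
    return total;
}
```

A message with only one vector is therefore all header and has an empty body.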

**Figure 3:** Conceptual view of the data-exchange protocol in the portability layer.

Figure 3 sketches a situation where process A is sending a message to process B. First, the message header (i.e., the first vector) is sent to process B. On receipt, a handler is executed with the header as a parameter. This handler, which is defined by the upper layer, can inspect the header and take appropriate actions to prepare for full message extraction. Typically, the handler may compute the locations where data should be placed (Step (3) in Figure 3). Process A is allowed to send the vectors of remaining data (i.e., the message body) only after the completion of the handler. These data can then be stored directly at the right memory locations on receipt.
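The ordering constraint in this exchange (header first, handler to completion, body afterwards) can be illustrated with a single-process simulation. This is only a sketch: exchange, header_handler, place_in_slot and the slots array are hypothetical names, and memcpy stands in for the network transfer that would deliver each body vector into its final location.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct vector {
    void  *addr;
    size_t size;  /* in bytes */
};

/* Handler supplied by the upper layer on the receiving side. It inspects
 * the header and fills dest[] with one destination per body vector. */
typedef void (*header_handler)(const struct vector *header,
                               struct vector *dest, size_t nbody);

/* Hypothetical in-process stand-in for the exchange between A and B. */
void exchange(const struct vector *msg, size_t count,
              header_handler handler, struct vector *dest)
{
    assert(count >= 1);

    /* The header arrives first and the handler runs to completion. */
    handler(&msg[0], dest, count - 1);

    /* Only now may the body be sent; each vector lands directly at the
     * location chosen by the handler. */
    for (size_t i = 1; i < count; i++) {
        assert(dest[i - 1].size >= msg[i].size);
        memcpy(dest[i - 1].addr, msg[i].addr, msg[i].size);
    }
}

/* Example handler for process B: the header carries a slot index that
 * selects where the single body vector should be stored. */
static int slots[4][3];

void place_in_slot(const struct vector *header,
                   struct vector *dest, size_t nbody)
{
    int slot;
    assert(header->size == sizeof slot);
    memcpy(&slot, header->addr, sizeof slot);
    assert(nbody == 1 && slot >= 0 && slot < 4);
    dest[0].addr = slots[slot];
    dest[0].size = sizeof slots[slot];
}
```

Because the destination is known before the body is transferred, the receiver needs no intermediate buffer for it, which is how the up-call mechanism supports the copy-avoidance goal stated earlier.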


Raymond Namyst
1999-11-26