Distributed shared memory(DSM) systems [5] provide a simple mean of
programming for the networks of workstations, which are gaining popularity due
to their cost-effective high computing power.
However, as the number of workstations participating in a DSM system increases,
the probability of failure also increases, which could render the system
useless for long-running applications.
For DSM systems to be of any practical use, it is important for the system to
be recoverable so that the processes do not have to restart from the beginning
when a failure occurs.
One solution to provide fault-tolerance for DSM systems is to use message
logging in addition to independent checkpointing [8], and many
message logging schemes have been suggested.
In the reader-based logging schemes [8,11], each process
logs the data items it has accessed into the stable storage in the access order,
so that the process can retrieve the logged messages at the same computational
points after the failure-recovery.
However, these schemes could incur high logging overheads during the
failure-free operations.
The writer-based logging schemes [6,7] utilize the volatile
log of the writer process for each data item, and the corresponding readers
only log the access information into the stable storage.
As a result, the logging overhead can be reduced, however, the overhead of
stable logging is still non-negligible.
One solution suggested in [2] uses only the volatile log of the
reader to store the access information.
This scheme removes the overhead of stable logging, however, concurrent failures
of multiple processes cannot be tolerated.
Causal logging is one logging approach which is gaining a lot of attention for
the message-passing based distributed computing systems [1].
In the causal logging technique, the writer-based logging of data items is
performed and the access information is logged at the volatile storage of the
dependent processes.
Since this scheme completely eliminates the needs for stable logging, logging
overhead can significantly be reduced.
Also, since the storage of the dependent processes are utilized, concurrent and
multiple failures can be handled.
However, in this scheme, the log of the access information has to be causally
spread over the dependent processes, which may cause the non-negligible message
overhead.
A causal logging scheme for the DSM system based on lazy release consistent(LRC)
memory model[3] has been suggested in [12].
In this scheme, to reduce the message overhead, the data structures and
operations supported by the LRC model, such as diff, write notices, and
vector clocks, are utilized.
Moreover, logging is performed only at the synchronization points instead of
every message passing point.
However, since this scheme utilizes the vector clock to trace the access
information, a vector clock for each synchronization interval must be logged.
As a result, the message overhead turns out to be non-negligible for the system
with a large number of processes.
Especially, for the barrier operations, the logging overhead can be significant,
since each process has to perform the logging for all the other processes.
In this paper, we propose an efficient implementation technique for the causal
logging.
In the proposed scheme, instead of logging the vector clock for each
synchronization operation, the sufficient and necessary information to recreate
the corresponding vector clock is inserted into the existing write notice
structures.
As a result, the additional information carried by each message becomes less
than two integers for each synchronization interval, and hence, fault-tolerance
can be achieved with a very little overhead.
Moreover, the size of the additional information is independent of the number
of processes in the system, and hence, the new scheme can be very efficient for
the large size systems.
The rest of this paper is organized as follows: In Section 2, the system model
and the definition of consistent logging for the correct recovery are presented
and in Section 3, the protocols for causal logging and recovery are presented
with their correctness.
The performance of the proposed protocol is discussed with the experimental
results in Section 4 and Section 5 concludes the paper.