Yin-Huei LOH
Takahiro HARA
Masahiko TSUKAMOTO
Shojiro NISHIO |
2-1 Yamadaoka, Suita, Osaka 565-0871, Japan
{yhloh,hara,tuka,nishio}@ise.eng.osaka-u.ac.jp
In mobile computing environments, necessary data is replicated on
mobile hosts during disconnections to improve availability.
In order to minimize problems of
inconsistency among data copies while maximizing the data update
ability
on disconnected sites, we propose an approach which controls database
updates based on the probabality that transactions occur and the
duration of disconnection time between the sites. We derive a formula
to choose the appropriate method between token and optimistic methods,
and show that the proposed method works well even when the
disconnection time deviates from the expected value.
Concurrency control, mobile computing environments, disconnected
operations, database replication.
In mobile computing environments, small portable computers with
wireless connection capability known as mobile hosts (MHs) are carried
by users who can move freely everywhere. Although this gives the users
great convenience in terms of mobility, the wireless network is
costly, has limited bandwidth and provides connection with lower
quality and more interference. Consequently, disconnection, whether
intentional or unintentional, occurs occasionally. During
disconnections, necessary data is replicated on the MHs to allow
access but updates to the data copies are disallowed to prevent
problems of data inconsistency. This limits the update ability of MHs.
Hence, there emerges a need to be able to effectively and concurrently
run updates to these data, even during disconnections. For example,
the schedule plan of a manager may need to be updated concurrently by
his secretary in the office and himself being overseas.
[2], [3] and [4] attempted to solve this
problem on file systems, while [1] and [6] on
database systems. However, rollback of the transactions frequently
happens as conflicts may occur among them, resulting in heavy
workload. In [5], a strategy is proposed to partition
the data value to different sites for
data items which are partitionable and in which transactions executed
on them are commutative, but it is only applicable to this limited
kind of data. In this paper, we propose an algorithm which adaptively
chooses the appropriate method, whether token or optimistic, to
effectively handle concurrent database updates during disconnections
in mobile computing environments.
We assume an environment which consists of mobile hosts and fixed
hosts without making a distinction between them. Each host carries a
copy of the database and connection may exist between any two hosts.
For example, ad hoc networks which consist of only MHs. We
assume that all disconnections between the hosts are intentional and
the length of the disconnection time is fixed and known. Later, we
also show that the algorithm can be applied on unintentional
disconnections.
Assume that the database is partitioned into clusters, where data
items which are often accessed together in the same transaction are
clustered together. The probability of the occurrence of transactions
for each cluster at each host is known beforehand by referring to
historical data or planned schedule. Our algorithm focuses on only one
data cluster to enable independent transactions to be handled
separately so that the number of transactions which can be executed at
the same time is higher. The whole database can be handled by applying
the algorithm on all the data clusters. All the descriptions below are
meant for only one data cluster.
In the network, disconnection is assumed to happen sequentially but
not simultaneously. Thus, upon disconnection, the network is
separated into two different networks, which may be further
separated into two different networks again and again,
recursively. The two different separated networks are considered as
two sites, each consists of one or more hosts. For the purpose
of simplicity, we consider the case in which there are no two
disconnections coexist at the same time. Thus, there are at most two
sites in the whole network.
Assume that in 1 very small unit of time, at most 1 transaction can
occur at 1 site. Thus, the number of transactions, n, which can
happen in 1 unit of time is such that 0 £ n £ 1 where
n = probability of the occurrence of 1 transaction per 1 unit of
time.
Let P(A)it denote the probability that i transactions occur at
site A in t units of time. In general,
ABSTRACT
Keywords
1. INTRODUCTION
2. SYSTEM ARCHITECTURE
| (1) |
Before the disconnection, our algorithm chooses the best method between token and optimistic method, which are defined as follows:
Here, the term `` conflict of transaction'' does not just mean conflicts caused by multiple ``write'' transactions, but also conflicting ``read'' and ``write'' transactions, as defined by the serializability of the transactions.
In order to quantify the two methods, we consider the features of each method. The number of transactions which succeed (i.e., committed on the spot or later upon reconnection) is always greater for optimistic method because it chooses the successful transactions after they happen whereas token method has to decide before the transactions occur. However, in optimistic method, transactions executed have to wait for a certain amount of time before they can be committed. Thus, we introduce a function f(x) to define the satisfaction level for each successful transaction which decreases with the increase of x, the duration between the time the transaction happens and the time it commits. Using parameter K ( < 1) as a constant, we define f(x) as:
| (2) |
We define the evaluation function as the summation of satisfaction level for all successful transactions, represented with the symbol F(T) for token method and F(O) for optimistic method.
In token method, since transactions always commit soon after they are executed, the waiting time is always 0 and the satisfaction level is always 1. Thus, F(T) equals the number of transactions that succeed in the corresponding disconnection time, i.e. the summation of multiplication of (i) number of transactions, and (ii) the probability that this number of transactions happen, at the site which holds the token. Thus, when there are two sites, A and B:
| (3) |
For optimistic method, the satisfaction level is always less than 1, since a successful transaction only commits when the sites reconnect. Since transactions that happen at different times have different satisfaction levels, we need to consider each of them separately. From equation (1), we know that there are tCi possible patterns that i number of transactions happen in the duration of disconnection time t. Thus, the probability that each way happens is P(A)it/ tCi.
The formula of P(A)it can be expanded into summation of tCi terms, in which each term equals (P(A)11)i(P(A)01)(t-i), and thus contains i number of P(A)11 and (t-i) number of P(A)01. Hence, when we consider the total of all tCi terms, there are i × tCi number of P(A)11. These i × tCi number of P(A)11 are also evenly distributed over the time 1,2,...,t. So, for each time 1,2,...,t, the total number of P(A)11 is (i × tCi)/t. As an example, consider the case of P(A)24 (t = 4 and i = 2) in Figure 1. To arrange 2 transactions in 4 units of time, there are 4C2 ( = 6) ways. The probability that each of these 6 ways happens is P(A)42/6. For each of these 6 ways, there exist 2P(A)11, which add up to the total of (2 ×6)P(A)11, i.e. 12P(A)11. These 12P(A)11 are evenly distributed over the time 1,2,3,4. So, for each time 1,2,3,4, there are (12/4)P(A)11, i.e. 3P(A)11.
Considering the actual number of transactions which happen at each time 1,2,...,t, it is equal to the number of transactions multiplies the probability that this number of transactions happen (as in token method), as follows:
|
| time 1, | x = t, |
| time 2, | x = t-1, |
| time t, | x = 1. |
Hence, by referring to equation (2), we can sum up the satisfaction level f(x) for transactions which happen at time 1,2,...t. Consequently, the total satisfaction function, S(A)it, which is equal to the summation of satisfaction level for the transactions that happen can be defined as follows:
| (4) |
For F(T), the evaluation function corresponding to each i ( = 1,2,...,t) at site A is iP(A)it. For F(O), the satisfaction level is integrated and the evaluation function for each i at site A is S(A)it. Thus, we conclude the following equation where the first and second terms refer to the case where transactions occur at only one single site; while the third and forth terms refer to the case where transactions occur at both sites, in which the transactions happen at the site with the higher number of transactions will succeed.
| (5) |
When F(T) > F(O), token method is used, and vice versa.
The graphs in Figure 2 and Figure 3 show how the values of the evaluation functions for each method are affected by the change of disconnection time. From the graphs we can see that with different values of K, the algorithm gives a different result of the choice of method as the value of the evaluation function of optimistic method is affected by the change of value K. As a guidance to choose the value of K, for applications which cannot tolerate waiting time and needs to be committed urgently (e.g. those involving money like systems for business trading, budget management and bank account management), it is appropriate to set the value of K bigger. On the other hand, a small value of K fits applications like schedule management systems as they are not time-critical and transactions in these systems are relatively simple and thus rollback can easily be performed.
Comparing our approach with the approach which uses only one method, either token or optimistic, certainly our approach is better as it balances off the merits and demerits of both methods and gives the users a choice to decide the appropriate method to be used.
We now investigate the effectiveness of our algorithm when the disconnection time deviates from the expected value. To achieve this purpose, we used a few probability distributions (Poisson, normal, uniform and exponential) to confirm the validity of the results. In these distributions, we calculate the ideal values (i.e., when the exact disconnection time is known beforehand) and the actual values (i.e., when we take the average disconnection time as the parameter) of the proposed algorithm. Figure 4 shows the result for Poisson distribution. From the graph, it can be observed that the difference between the actual values and the ideal values is very small. The difference increases as the disconnection time approaches the point where the values of evaluation functions for token and optimistic methods intersect. From calculation, we found that the percentage of difference for each disconnection time ((ideal- actual) / ideal) is less than 5% for Poisson, normal and uniform distribution, and less than 10% for exponential distribution. Hence, it is verified that even if the proposed algorithm's performance deteriorates due to the deviation of the disconnection time, as long as the deviation is not extreme and follows a common probability distribution, it still outperforms the approach which only uses a single method, either token or optimistic method.
In this paper, we have proposed an algorithm to choose the most effective method between two approaches, token method and optimistic method to handle database updates during periods of disconnection depending on the probability that transactions occur and the disconnection time. We also showed that the algorithm works well with little error when the disconnection time deviates from the average value. In mobile computing environments, using the available statistical information like the movement speed and direction, and the current location of the mobile host, we can estimate the expected reconnection time. As long as the actual reconnection time does not vary too drastically from the average value, the algorithm can be applied. Thus, we conclude that the proposed algorithm can work for both intentional and unintentional disconnections.
1 This research was supported in part by the Research for the Future Program of Japan Society for the Promotion of Science under the Project ``Advanced Multimedia Content Processing (Project No. JSPS-RFTF97P00501)'' and by Grant-in-Aid for Scientific Research on Priority Areas ``Advanced Databases'' of the Ministry of Education, Science, Sports and Culture of Japan (Grant No. 08244103).