YinHuei LOH  Takahiro HARA  Masahiko TSUKAMOTO  Shojiro NISHIO 
21 Yamadaoka, Suita, Osaka 5650871, Japan
{yhloh,hara,tuka,nishio}@ise.eng.osakau.ac.jp
In mobile computing environments, necessary data is replicated on mobile hosts during disconnections to improve availability. In order to minimize problems of inconsistency among data copies while maximizing the data update ability on disconnected sites, we propose an approach which controls database updates based on the probabality that transactions occur and the duration of disconnection time between the sites. We derive a formula to choose the appropriate method between token and optimistic methods, and show that the proposed method works well even when the disconnection time deviates from the expected value.
Concurrency control, mobile computing environments, disconnected operations, database replication.
In mobile computing environments, small portable computers with wireless connection capability known as mobile hosts (MHs) are carried by users who can move freely everywhere. Although this gives the users great convenience in terms of mobility, the wireless network is costly, has limited bandwidth and provides connection with lower quality and more interference. Consequently, disconnection, whether intentional or unintentional, occurs occasionally. During disconnections, necessary data is replicated on the MHs to allow access but updates to the data copies are disallowed to prevent problems of data inconsistency. This limits the update ability of MHs. Hence, there emerges a need to be able to effectively and concurrently run updates to these data, even during disconnections. For example, the schedule plan of a manager may need to be updated concurrently by his secretary in the office and himself being overseas.
[2], [3] and [4] attempted to solve this problem on file systems, while [1] and [6] on database systems. However, rollback of the transactions frequently happens as conflicts may occur among them, resulting in heavy workload. In [5], a strategy is proposed to partition the data value to different sites for data items which are partitionable and in which transactions executed on them are commutative, but it is only applicable to this limited kind of data. In this paper, we propose an algorithm which adaptively chooses the appropriate method, whether token or optimistic, to effectively handle concurrent database updates during disconnections in mobile computing environments.
We assume an environment which consists of mobile hosts and fixed hosts without making a distinction between them. Each host carries a copy of the database and connection may exist between any two hosts. For example, ad hoc networks which consist of only MHs. We assume that all disconnections between the hosts are intentional and the length of the disconnection time is fixed and known. Later, we also show that the algorithm can be applied on unintentional disconnections.
Assume that the database is partitioned into clusters, where data items which are often accessed together in the same transaction are clustered together. The probability of the occurrence of transactions for each cluster at each host is known beforehand by referring to historical data or planned schedule. Our algorithm focuses on only one data cluster to enable independent transactions to be handled separately so that the number of transactions which can be executed at the same time is higher. The whole database can be handled by applying the algorithm on all the data clusters. All the descriptions below are meant for only one data cluster.
In the network, disconnection is assumed to happen sequentially but not simultaneously. Thus, upon disconnection, the network is separated into two different networks, which may be further separated into two different networks again and again, recursively. The two different separated networks are considered as two sites, each consists of one or more hosts. For the purpose of simplicity, we consider the case in which there are no two disconnections coexist at the same time. Thus, there are at most two sites in the whole network.
Assume that in 1 very small unit of time, at most 1 transaction can
occur at 1 site. Thus, the number of transactions, n, which can
happen in 1 unit of time is such that 0 £ n £ 1 where
n = probability of the occurrence of 1 transaction per 1 unit of time.
Let P(A)_{i}^{t} denote the probability that i transactions occur at site A in t units of time. In general,
 (1) 
Before the disconnection, our algorithm chooses the best method between token and optimistic method, which are defined as follows:
Here, the term `` conflict of transaction'' does not just mean conflicts caused by multiple ``write'' transactions, but also conflicting ``read'' and ``write'' transactions, as defined by the serializability of the transactions.
In order to quantify the two methods, we consider the features of each method. The number of transactions which succeed (i.e., committed on the spot or later upon reconnection) is always greater for optimistic method because it chooses the successful transactions after they happen whereas token method has to decide before the transactions occur. However, in optimistic method, transactions executed have to wait for a certain amount of time before they can be committed. Thus, we introduce a function f(x) to define the satisfaction level for each successful transaction which decreases with the increase of x, the duration between the time the transaction happens and the time it commits. Using parameter K ( < 1) as a constant, we define f(x) as:
 (2) 
We define the evaluation function as the summation of satisfaction level for all successful transactions, represented with the symbol F(T) for token method and F(O) for optimistic method.
In token method, since transactions always commit soon after they are executed, the waiting time is always 0 and the satisfaction level is always 1. Thus, F(T) equals the number of transactions that succeed in the corresponding disconnection time, i.e. the summation of multiplication of (i) number of transactions, and (ii) the probability that this number of transactions happen, at the site which holds the token. Thus, when there are two sites, A and B:
 (3) 
For optimistic method, the satisfaction level is always less than 1, since a successful transaction only commits when the sites reconnect. Since transactions that happen at different times have different satisfaction levels, we need to consider each of them separately. From equation (1), we know that there are ^{t}C_{i} possible patterns that i number of transactions happen in the duration of disconnection time t. Thus, the probability that each way happens is P(A)_{i}^{t}/ ^{t}C_{i}.
The formula of P(A)_{i}^{t} can be expanded into summation of ^{t}C_{i} terms, in which each term equals (P(A)_{1}^{1})^{i}(P(A)_{0}^{1})^{(ti)}, and thus contains i number of P(A)_{1}^{1} and (ti) number of P(A)_{0}^{1}. Hence, when we consider the total of all ^{t}C_{i} terms, there are i × ^{t}C_{i} number of P(A)_{1}^{1}. These i × ^{t}C_{i} number of P(A)_{1}^{1} are also evenly distributed over the time 1,2,...,t. So, for each time 1,2,...,t, the total number of P(A)_{1}^{1} is (i × ^{t}C_{i})/t. As an example, consider the case of P(A)_{2}^{4} (t = 4 and i = 2) in Figure 1. To arrange 2 transactions in 4 units of time, there are ^{4}C_{2} ( = 6) ways. The probability that each of these 6 ways happens is P(A)^{4}_{2}/6. For each of these 6 ways, there exist 2P(A)_{1}^{1}, which add up to the total of (2 ×6)P(A)_{1}^{1}, i.e. 12P(A)_{1}^{1}. These 12P(A)_{1}^{1} are evenly distributed over the time 1,2,3,4. So, for each time 1,2,3,4, there are (12/4)P(A)_{1}^{1}, i.e. 3P(A)_{1}^{1}.
Considering the actual number of transactions which happen at each time 1,2,...,t, it is equal to the number of transactions multiplies the probability that this number of transactions happen (as in token method), as follows:

time 1,  x = t, 
time 2,  x = t1, 
time t,  x = 1. 
Hence, by referring to equation (2), we can sum up the satisfaction level f(x) for transactions which happen at time 1,2,...t. Consequently, the total satisfaction function, S(A)_{i}^{t}, which is equal to the summation of satisfaction level for the transactions that happen can be defined as follows:
 (4) 
For F(T), the evaluation function corresponding to each i ( = 1,2,...,t) at site A is iP(A)_{i}^{t}. For F(O), the satisfaction level is integrated and the evaluation function for each i at site A is S(A)_{i}^{t}. Thus, we conclude the following equation where the first and second terms refer to the case where transactions occur at only one single site; while the third and forth terms refer to the case where transactions occur at both sites, in which the transactions happen at the site with the higher number of transactions will succeed.
 (5) 
When F(T) > F(O), token method is used, and vice versa.
The graphs in Figure 2 and Figure 3 show how the values of the evaluation functions for each method are affected by the change of disconnection time. From the graphs we can see that with different values of K, the algorithm gives a different result of the choice of method as the value of the evaluation function of optimistic method is affected by the change of value K. As a guidance to choose the value of K, for applications which cannot tolerate waiting time and needs to be committed urgently (e.g. those involving money like systems for business trading, budget management and bank account management), it is appropriate to set the value of K bigger. On the other hand, a small value of K fits applications like schedule management systems as they are not timecritical and transactions in these systems are relatively simple and thus rollback can easily be performed.
Comparing our approach with the approach which uses only one method, either token or optimistic, certainly our approach is better as it balances off the merits and demerits of both methods and gives the users a choice to decide the appropriate method to be used.
We now investigate the effectiveness of our algorithm when the disconnection time deviates from the expected value. To achieve this purpose, we used a few probability distributions (Poisson, normal, uniform and exponential) to confirm the validity of the results. In these distributions, we calculate the ideal values (i.e., when the exact disconnection time is known beforehand) and the actual values (i.e., when we take the average disconnection time as the parameter) of the proposed algorithm. Figure 4 shows the result for Poisson distribution. From the graph, it can be observed that the difference between the actual values and the ideal values is very small. The difference increases as the disconnection time approaches the point where the values of evaluation functions for token and optimistic methods intersect. From calculation, we found that the percentage of difference for each disconnection time ((ideal actual) / ideal) is less than 5% for Poisson, normal and uniform distribution, and less than 10% for exponential distribution. Hence, it is verified that even if the proposed algorithm's performance deteriorates due to the deviation of the disconnection time, as long as the deviation is not extreme and follows a common probability distribution, it still outperforms the approach which only uses a single method, either token or optimistic method.
In this paper, we have proposed an algorithm to choose the most effective method between two approaches, token method and optimistic method to handle database updates during periods of disconnection depending on the probability that transactions occur and the disconnection time. We also showed that the algorithm works well with little error when the disconnection time deviates from the average value. In mobile computing environments, using the available statistical information like the movement speed and direction, and the current location of the mobile host, we can estimate the expected reconnection time. As long as the actual reconnection time does not vary too drastically from the average value, the algorithm can be applied. Thus, we conclude that the proposed algorithm can work for both intentional and unintentional disconnections.
^{1} This research was supported in part by the Research for the Future Program of Japan Society for the Promotion of Science under the Project ``Advanced Multimedia Content Processing (Project No. JSPSRFTF97P00501)'' and by GrantinAid for Scientific Research on Priority Areas ``Advanced Databases'' of the Ministry of Education, Science, Sports and Culture of Japan (Grant No. 08244103).