A Hybrid Method for Concurrent Updates on Disconnected Databases in Mobile Computing Environments ¹

Yin-Huei LOH
Takahiro HARA
Masahiko TSUKAMOTO
Shojiro NISHIO

Dept. of Information Systems Engineering, Graduate School of Engineering, Osaka University

2-1 Yamadaoka, Suita, Osaka 565-0871, Japan

{yhloh,hara,tuka,nishio}@ise.eng.osaka-u.ac.jp

ABSTRACT

In mobile computing environments, necessary data is replicated on mobile hosts during disconnections to improve availability. In order to minimize problems of inconsistency among data copies while maximizing the data update ability on disconnected sites, we propose an approach which controls database updates based on the probabality that transactions occur and the duration of disconnection time between the sites. We derive a formula to choose the appropriate method between token and optimistic methods, and show that the proposed method works well even when the disconnection time deviates from the expected value.

Keywords

Concurrency control, mobile computing environments, disconnected operations, database replication.

1. INTRODUCTION

In mobile computing environments, small portable computers with wireless connection capability known as mobile hosts (MHs) are carried by users who can move freely everywhere. Although this gives the users great convenience in terms of mobility, the wireless network is costly, has limited bandwidth and provides connection with lower quality and more interference. Consequently, disconnection, whether intentional or unintentional, occurs occasionally. During disconnections, necessary data is replicated on the MHs to allow access but updates to the data copies are disallowed to prevent problems of data inconsistency. This limits the update ability of MHs. Hence, there emerges a need to be able to effectively and concurrently run updates to these data, even during disconnections. For example, the schedule plan of a manager may need to be updated concurrently by his secretary in the office and himself being overseas.

[2], [3] and [4] attempted to solve this problem on file systems, while [1] and [6] on database systems. However, rollback of the transactions frequently happens as conflicts may occur among them, resulting in heavy workload. In [5], a strategy is proposed to partition the data value to different sites for data items which are partitionable and in which transactions executed on them are commutative, but it is only applicable to this limited kind of data. In this paper, we propose an algorithm which adaptively chooses the appropriate method, whether token or optimistic, to effectively handle concurrent database updates during disconnections in mobile computing environments.

2. SYSTEM ARCHITECTURE

We assume an environment which consists of mobile hosts and fixed hosts without making a distinction between them. Each host carries a copy of the database and connection may exist between any two hosts. For example, ad hoc networks which consist of only MHs. We assume that all disconnections between the hosts are intentional and the length of the disconnection time is fixed and known. Later, we also show that the algorithm can be applied on unintentional disconnections.

Assume that the database is partitioned into clusters, where data items which are often accessed together in the same transaction are clustered together. The probability of the occurrence of transactions for each cluster at each host is known beforehand by referring to historical data or planned schedule. Our algorithm focuses on only one data cluster to enable independent transactions to be handled separately so that the number of transactions which can be executed at the same time is higher. The whole database can be handled by applying the algorithm on all the data clusters. All the descriptions below are meant for only one data cluster.

In the network, disconnection is assumed to happen sequentially but not simultaneously. Thus, upon disconnection, the network is separated into two different networks, which may be further separated into two different networks again and again, recursively. The two different separated networks are considered as two sites, each consists of one or more hosts. For the purpose of simplicity, we consider the case in which there are no two disconnections coexist at the same time. Thus, there are at most two sites in the whole network.

Assume that in 1 very small unit of time, at most 1 transaction can occur at 1 site. Thus, the number of transactions, n, which can happen in 1 unit of time is such that 0 £ n £ 1 where

n = probability of the occurrence of 1 transaction per 1 unit of time.

Let P(A)_i^t denote the probability that i transactions occur at site A in t units of time. In general,

P(A)_i^t = ^tC_i(P(A)₁¹)ⁱ(P(A)₀¹)^(t-i).

(1)

Before the disconnection, our algorithm chooses the best method between token and optimistic method, which are defined as follows:

Token Method: A ``token'' represents the right to execute transactions. Upon disconnection, the token is given to the site with higher probability of the occurrence of transactions, and only this site is granted the permission to run transactions (followed by a commit action).
Optimistic Method: Both sites can run transactions on the data cluster, but it is not guaranteed that these transactions can be committed. During the disconnection, if transactions occur at only one of the sites, these transactions will succeed when the sites reconnect. However, if conflicts occur, i.e. transactions occur at both sites, rollback is forced to be performed at the site where the number of transactions occurred is smaller.

Here, the term `` conflict of transaction'' does not just mean conflicts caused by multiple ``write'' transactions, but also conflicting ``read'' and ``write'' transactions, as defined by the serializability of the transactions.

3. ALGORITHM DESCRIPTION

In order to quantify the two methods, we consider the features of each method. The number of transactions which succeed (i.e., committed on the spot or later upon reconnection) is always greater for optimistic method because it chooses the successful transactions after they happen whereas token method has to decide before the transactions occur. However, in optimistic method, transactions executed have to wait for a certain amount of time before they can be committed. Thus, we introduce a function f(x) to define the satisfaction level for each successful transaction which decreases with the increase of x, the duration between the time the transaction happens and the time it commits. Using parameter K ( < 1) as a constant, we define f(x) as:

f(x) =

ì
ï
ï
ï
í
ï
ï
ï
î

1 - Kx

x <

otherwise.

(2)

We define the evaluation function as the summation of satisfaction level for all successful transactions, represented with the symbol F(T) for token method and F(O) for optimistic method.

In token method, since transactions always commit soon after they are executed, the waiting time is always 0 and the satisfaction level is always 1. Thus, F(T) equals the number of transactions that succeed in the corresponding disconnection time, i.e. the summation of multiplication of (i) number of transactions, and (ii) the probability that this number of transactions happen, at the site which holds the token. Thus, when there are two sites, A and B:

F(T) =

ì
ï
ï
ï
í
ï
ï
ï
î

t
å
i = 1

iP(A)_i^t = tP(A)₁¹

P(A)₁¹ > P(B)₁¹

t
å
i = 1

iP(B)_i^t = tP(B)₁¹

otherwise.

(3)

For optimistic method, the satisfaction level is always less than 1, since a successful transaction only commits when the sites reconnect. Since transactions that happen at different times have different satisfaction levels, we need to consider each of them separately. From equation (1), we know that there are ^tC_i possible patterns that i number of transactions happen in the duration of disconnection time t. Thus, the probability that each way happens is P(A)_i^t/ ^tC_i.

The formula of P(A)_i^t can be expanded into summation of ^tC_i terms, in which each term equals (P(A)₁¹)ⁱ(P(A)₀¹)^(t-i), and thus contains i number of P(A)₁¹ and (t-i) number of P(A)₀¹. Hence, when we consider the total of all ^tC_i terms, there are i × ^tC_i number of P(A)₁¹. These i × ^tC_i number of P(A)₁¹ are also evenly distributed over the time 1,2,...,t. So, for each time 1,2,...,t, the total number of P(A)₁¹ is (i × ^tC_i)/t. As an example, consider the case of P(A)₂⁴ (t = 4 and i = 2) in Figure 1. To arrange 2 transactions in 4 units of time, there are ⁴C₂ ( = 6) ways. The probability that each of these 6 ways happens is P(A)⁴₂/6. For each of these 6 ways, there exist 2P(A)₁¹, which add up to the total of (2 ×6)P(A)₁¹, i.e. 12P(A)₁¹. These 12P(A)₁¹ are evenly distributed over the time 1,2,3,4. So, for each time 1,2,3,4, there are (12/4)P(A)₁¹, i.e. 3P(A)₁¹.

Figure 1: An example: expanding P(A)₂⁴.

Considering the actual number of transactions which happen at each time 1,2,...,t, it is equal to the number of transactions multiplies the probability that this number of transactions happen (as in token method), as follows:

i × ^tC_i

P(A)_i^t

^tC_i

= P(A)_i^t

æ
ç
è

ö
÷
ø

Now we consider x, the waiting time of the transactions that happen at each time 1,2,...,t. Consider the case where disconnection happens at time 1 and reconnection happens at time t+1 with the disconnection time t. When a transaction happens at:

time 1, x = t,
time 2, x = t-1,

time t, x = 1.

Hence, by referring to equation (2), we can sum up the satisfaction level f(x) for transactions which happen at time 1,2,...t. Consequently, the total satisfaction function, S(A)_i^t, which is equal to the summation of satisfaction level for the transactions that happen can be defined as follows:

S(A)_i^t = P(A)_i^t

æ
ç
è

ö
÷
ø

t
å
j = 1

f(j)

ì
ï
ï
ï
ï
ï
í
ï
ï
ï
ï
ï
î

P(A)_i^t

æ
ç
è

ö
÷
ø

(2-(t+1)K)

t <

P(A)_i^t

æ
ç
è

2Kt

ö
÷
ø

(1-K)

otherwise.

(4)

For F(T), the evaluation function corresponding to each i ( = 1,2,...,t) at site A is iP(A)_i^t. For F(O), the satisfaction level is integrated and the evaluation function for each i at site A is S(A)_i^t. Thus, we conclude the following equation where the first and second terms refer to the case where transactions occur at only one single site; while the third and forth terms refer to the case where transactions occur at both sites, in which the transactions happen at the site with the higher number of transactions will succeed.

F(O) =

P(B)₀^t

t
å
i = 1

S(A)_i^t + P(A)₀^t

t
å
i = 1

S(B)_i^t

t
å
i = 1

i
å
j = 1

S(A)_i^tP(B)_j^t +

t
å
i = 1

t
å
j = i

P(A)_i^tS(B)_j^t.

(5)

When F(T) > F(O), token method is used, and vice versa.

4. EVALUATION

The graphs in Figure 2 and Figure 3 show how the values of the evaluation functions for each method are affected by the change of disconnection time. From the graphs we can see that with different values of K, the algorithm gives a different result of the choice of method as the value of the evaluation function of optimistic method is affected by the change of value K. As a guidance to choose the value of K, for applications which cannot tolerate waiting time and needs to be committed urgently (e.g. those involving money like systems for business trading, budget management and bank account management), it is appropriate to set the value of K bigger. On the other hand, a small value of K fits applications like schedule management systems as they are not time-critical and transactions in these systems are relatively simple and thus rollback can easily be performed.

Figure 2: Evaluation function versus disconnection time with the following parameters: P(A)₁¹ = 0.3, P(B)₁¹ = 0.2, K = 0.02.

Figure 3: Evaluation function versus disconnection time with the following parameters: P(A)₁¹ = 0.3, P(B)₁¹ = 0.2, K = 0.04.

Comparing our approach with the approach which uses only one method, either token or optimistic, certainly our approach is better as it balances off the merits and demerits of both methods and gives the users a choice to decide the appropriate method to be used.

We now investigate the effectiveness of our algorithm when the disconnection time deviates from the expected value. To achieve this purpose, we used a few probability distributions (Poisson, normal, uniform and exponential) to confirm the validity of the results. In these distributions, we calculate the ideal values (i.e., when the exact disconnection time is known beforehand) and the actual values (i.e., when we take the average disconnection time as the parameter) of the proposed algorithm. Figure 4 shows the result for Poisson distribution. From the graph, it can be observed that the difference between the actual values and the ideal values is very small. The difference increases as the disconnection time approaches the point where the values of evaluation functions for token and optimistic methods intersect. From calculation, we found that the percentage of difference for each disconnection time ((ideal- actual) / ideal) is less than 5% for Poisson, normal and uniform distribution, and less than 10% for exponential distribution. Hence, it is verified that even if the proposed algorithm's performance deteriorates due to the deviation of the disconnection time, as long as the deviation is not extreme and follows a common probability distribution, it still outperforms the approach which only uses a single method, either token or optimistic method.

Figure 4: Evaluation function versus average disconnection time for Poisson distribution.

5. CONCLUSION

In this paper, we have proposed an algorithm to choose the most effective method between two approaches, token method and optimistic method to handle database updates during periods of disconnection depending on the probability that transactions occur and the disconnection time. We also showed that the algorithm works well with little error when the disconnection time deviates from the average value. In mobile computing environments, using the available statistical information like the movement speed and direction, and the current location of the mobile host, we can estimate the expected reconnection time. As long as the actual reconnection time does not vary too drastically from the average value, the algorithm can be applied. Thus, we conclude that the proposed algorithm can work for both intentional and unintentional disconnections.

References

[1]: Breitbar, Y. and Korth, H. F.: Replication and Consistency: Being Lazy Helps Sometimes, Proc. of 16th ACM SIGACT-SIGMOD-SIGART Symposium on PODS , pp. 173-184 (1997).
[2]: Demers, A., Petersen, K., Spreitzer, M., Terry, D., Theimer, M. and Welch, B.: The Bayou Architecture: Support for Data Sharing among Mobile Users, Proc. of the Workshop on Mobile Computing Systems and Applications (1994).
[3]: Heidemann, J. S., Page, T. W., Guy, R. G. and Popek, G. J.: Primarily Disconnected Operation: Experiences with Ficus, Proc. of the Second Workshop on Management of Replicated Data, IEEE (1992).
[4]: Kistler, J. J. and Satyanarayanan, M.: Disconnected Operation in the Coda File System, ACM Transactions on Computer Systems , Vol. 10, No. 1 (1992).
[5]: Krishnakumar, N. and Jain, R.: Escrow Techniques for Mobile Sales and Inventory Applications, Wireless Networks 3 , Vol. 3, No. 3, pp. 235-246 (1997).
[6]: Pitoura, E. and Bhargava, B.: Maintaining Consistency of Data in Mobile Distributed Environments, Proc. of 15th International Conference on Distributed Computer System , pp. 404-413 (1995).

Footnotes:

¹ This research was supported in part by the Research for the Future Program of Japan Society for the Promotion of Science under the Project ``Advanced Multimedia Content Processing (Project No. JSPS-RFTF97P00501)'' and by Grant-in-Aid for Scientific Research on Priority Areas ``Advanced Databases'' of the Ministry of Education, Science, Sports and Culture of Japan (Grant No. 08244103).

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and or fee.
SAC 2000 March 19-21 Como, Italy
(c) 2000 ACM 1-58113-239-5/00/003>...>$5.00

A Hybrid Method for Concurrent Updates on Disconnected Databases in Mobile Computing Environments 1

Yin-Huei LOH Takahiro HARA Masahiko TSUKAMOTO Shojiro NISHIO Dept. of Information Systems Engineering, Graduate School of Engineering, Osaka University 2-1 Yamadaoka, Suita, Osaka 565-0871, Japan {yhloh,hara,tuka,nishio}@ise.eng.osaka-u.ac.jp

Yin-Huei LOH

Takahiro HARA

Masahiko TSUKAMOTO

Shojiro NISHIO

ABSTRACT

Keywords

1. INTRODUCTION

2. SYSTEM ARCHITECTURE

3. ALGORITHM DESCRIPTION

4. EVALUATION

5. CONCLUSION

References

Footnotes:

Copyright 2000 ACM

A Hybrid Method for Concurrent Updates on Disconnected Databases in Mobile Computing Environments ¹

Yin-Huei LOH
Takahiro HARA
Masahiko TSUKAMOTO
Shojiro NISHIO

Dept. of Information Systems Engineering, Graduate School of Engineering, Osaka University

2-1 Yamadaoka, Suita, Osaka 565-0871, Japan

{yhloh,hara,tuka,nishio}@ise.eng.osaka-u.ac.jp