Parallel Database Systems Engineering
Date: Sunday, February 26, 1995
Speaker: Dr. Kam-Fai Wong
Department of Systems Engineering
The Chinese University of Hong Kong
Shatin, N.T.
Hong Kong.
Email: kfwong@se.cuhk.hk
Tel: +852 6098332
Fax: +852 6035505
Abstract:
=========
Today, very large databases may easily hold terabytes of data,
and this trend shows no sign of diminishing. Despite advances in
processor technology, handling such large volumes of information
is becoming increasingly difficult for conventional database
management systems, which run on sequential computers. To
overcome this predicament, a number of research projects are
investigating the use of parallel computers. The parallelism
inherent in a database's data model (e.g. relational) makes
databases well suited to parallel implementation. This tutorial
presents the concept of parallel database systems (PDS) based on
the extended dataflow computation model. In addition, a few
engineering issues regarding the implementation of the model
will be reviewed.
Target Audience:
================
* Database developers who are interested in parallel
  implementations
* Parallel software developers who are planning to develop a
  database system
* First-year postgraduate students in database or parallel
  computing
Course Structure (provisional):
===============================
Section I: Introduction
I.1 Why Parallel Database Systems (PDS)?
I.2 Overview of the existing parallel machine architectures
suitable for PDS implementation
Section II: PDS Computation Model
Aim(s): Introduce the extended dataflow paradigm, which is the
computation model of PDS, and identify the parallelism therein.
II.1 Extended Dataflow Paradigm
II.2 Forms of Parallelism
Section III: Engineering Model
Aim(s): Look at various implementation issues in extended
dataflow graphs.
III.1 Data Placement
III.2 Control Mechanism in the Execution of an Extended Dataflow
Graph (EDG)
III.3 Self Scheduling in EDG Execution
III.4 Localisation of EDG Operations
III.5 Forms of Pipelines between EDG Operators
III.6 Queuing and De-Queuing
III.7 How to Schedule Work
Section IV: System Architecture
Aim(s): Introduce a classical PDS system architecture and
review different implementation techniques.
IV.1 A PDS Environment
IV.2 Route for Query Compilation
IV.3 Dynamic versus Static Program Loading and Execution
IV.4 Parallel Query Optimization
Section V: Transaction Model
Aim(s): Because operations execute in parallel, a PDS must take
care to maintain the ACID properties. What are these properties
and how can they be implemented?
V.1 Definition of the Transaction Model
V.2 How to achieve Atomicity?
V.3 How to achieve Consistency?
V.4 How to achieve Isolation?
V.5 How to achieve Durability?
Section VI: Existing PDS Systems
Aim(s): If time allows, some prominent PDSs will be reviewed.
These will include: EDS (European Declarative System), Gamma,
Bubba and XPRS.
Biography
=========
Kam-Fai Wong obtained his PhD from the University of Edinburgh,
Scotland, in 1987, in the area of computer architectures. After
his PhD, he carried out research at Heriot-Watt University
(Edinburgh, Scotland), Unisys (Livingston, Scotland) and ECRC
(Munich, Germany). At present he is a Project Coordinator at the
Chinese University of Hong Kong, in charge of the IPOC
(Intelligent Processing Of Chinese) project.
His research interests are parallel database and information
systems. He has published over 20 technical papers in these
areas in various international journals, conferences and books.
During his seven years of postdoctoral research, he has given
many seminars. In 1993/94, he was one of the ACM lecturers
worldwide. He is a member of IEEE-CS, ACM and IEE (UK) and has
served as the AI/DB track chair of the 1994 ACM Symposium on
Applied Computing, the Asian Coordinator of the 1994 Parallel
and Distributed Information Systems conference, and a PC member
of TOOLS94, GWIC94, PARLE94, VLDB94, SPDP94, ICDCS95 and
DASFAA95.