Parallel Database Systems Engineering

     

      Date:    Sunday, February 26, 1995 

 

      Speaker:  Dr. Kam-Fai Wong

                Department of Systems Engineering

                Chinese University

                Shatin, N.T.

                Hong Kong.

                Email - kfwong@se.cuhk.hk

                Tel:    +852 6098332

                Fax:    +852 6035505

     

     

      Abstract:

      =========

     

      Today, very large databases may easily involve terabytes of

      data. This trend shows no sign of diminishing. Despite

      advances in processor technology, handling such large

      volumes of information is becoming increasingly difficult

      for conventional database management systems, which run on

      sequential computers. To overcome this predicament, a number

      of research projects are investigating the use of parallel

      computers. The parallelism inherent in the underlying data

      model (e.g. relational) makes databases well suited to

      parallel implementation. In this tutorial, the concept of

      parallel database systems (PDS), which is based on the

      extended dataflow computation model, will be presented. In

      addition, a few engineering issues regarding the

      implementation of the model will be reviewed.

     

      Target Audience:

      ================

     

      *    Database developers  who  are  interested  in  parallel

           implementations

     

      *    Parallel  software  developers  who  are  planning   to

           develop a database system

     

      *    First-year postgraduate students in databases or

           parallel computing

     


      Course Structure (provisional):

      ===============================

     

      Section I: Introduction

     

      I.1  Why Parallel Database Systems (PDS)?

     

      I.2  Overview of the existing parallel machine architectures

           suitable for PDS implementation

     

      Section II:  PDS Computation Model

     

      Aim(s): Introduce the extended dataflow paradigm, which is

      the computation model of PDS, and identify the parallelism

      therein.

     

      II.1 Extended Dataflow Paradigm

     

     

      II.2 Forms of Parallelism

     

      Section III:  Engineering Model

     

      Aim(s): Look at various implementation  issues  in  extended

      dataflow graphs.

     

      III.1 Data Placement

     

      III.2 Control Mechanism in the Execution of an Extended

           Dataflow Graph (EDG)

     

      III.3 Self Scheduling in EDG Execution

     

      III.4 Localisation of EDG Operations

     

      III.5 Forms of Pipelines between EDG Operators

     

      III.6 Queuing and De-Queuing

     

      III.7 How to Schedule Work

     

      Section IV:  System Architecture

     

      Aim(s): Introduce a classical PDS architecture and review

      different implementation techniques.

     

      IV.1 A PDS Environment

     

      IV.2 Route for Query Compilation

     

      IV.3 Dynamic versus Static Program Loading and Execution

     

      IV.4 Parallel Query Optimization

     


      Section V:  Transaction Model

     

      Aim(s): Even under parallel execution, a PDS must maintain

      the ACID properties. What are these properties, and how can

      they be implemented?

     

      V.1  Definition of the Transaction Model

     

      V.2  How to achieve Atomicity?

     

      V.3  How to achieve Consistency?

     

      V.4  How to achieve Isolation?

     

      V.5  How to achieve Durability?

     

      Section VI: Existing PDS Systems

     

      Aim(s): If time allows, some prominent PDSs will be

      reviewed, including EDS (European Declarative System),

      Gamma, Bubba and XPRS.

     

      Biography

      =========

     

      Kam-Fai Wong obtained his PhD from the University of

      Edinburgh, Scotland, in 1987, in the area of computer

      architectures. After his PhD, he conducted research at

      Heriot-Watt University (Edinburgh, Scotland), UniSys

      (Livingston, Scotland) and ECRC (Munich, Germany). At

      present he is a Project Coordinator at the Chinese

      University of Hong Kong, in charge of the IPOC (Intelligent

      Processing Of Chinese) project. His research interests are

      parallel database and information systems. He has published

      over 20 technical papers in these areas in various

      international journals, conferences and books.

     

      During his seven years of postdoctoral research, he has

      given many seminars. In 1993/94, he was one of the ACM

      lecturers worldwide. He is a member of IEEE-CS, ACM and

      IEE (UK), and has served as the AI/DB track chair of the

      1994 ACM Symposium on Applied Computing, the Asian

      Coordinator of the 1994 Parallel and Distributed

      Information Systems conference, and a PC member of TOOLS94,

      GWIC94, PARLE94, VLDB94, SPDP94, ICDCS95 and DASFAA95.