Parallel Database Systems Engineering


      Date:    Sunday, February 26, 1995 


      Speaker : Dr. Kam-Fai Wong

                Department of Systems Engineering

                Chinese University

                Shatin, N.T.

                Hong Kong.

                Email -

                Tel:    +852 6098332

                 Fax:  +852 6035505






      Today, very large databases can easily involve terabytes of data,
      and this trend shows no sign of slowing. Despite advances in
      processor technology, handling such large volumes of information
      is becoming increasingly difficult for conventional database
      management systems that run on sequential computers. To overcome
      this limitation, a number of research projects are investigating
      the use of parallel computers. The parallelism inherent in a
      database's data model (e.g. relational) makes databases well
      suited to parallel implementation. This tutorial presents the
      concept of parallel database systems (PDS) based on the extended
      dataflow computation model. In addition, a few engineering issues
      regarding the implementation of the model will be reviewed.


      Target Audience:



      *    Database developers  who  are  interested  in  parallel



      *    Parallel software developers who are planning to develop a
           database system


      *    First-year postgraduate students in database or parallel
           computing



















      Course Structure (provisional):



      Section I: Introduction


      I.1  Why Parallel Database Systems (PDS)?


      I.2  Overview of existing parallel machine architectures suitable
           for PDS implementation


      Section II:  PDS Computation Model


      Aim(s): Introduce the extended dataflow paradigm, which is the
      computation model of PDS, and identify the parallelism



      II.1 Extended Dataflow Paradigm



      II.2 Forms of Parallelism


      Section III:  Engineering Model


      Aim(s): Look at various implementation issues in extended dataflow
      graphs.


      III.1 Data Placement


      III.2 Control Mechanism in the Execution of an Extended Dataflow
           Graph (EDG)


      III.3 Self Scheduling in EDG Execution


      III.4 Localisation of EDG Operations


      III.5 Forms of Pipelines between EDG Operators


      III.6 Queuing and De-Queuing


      III.7 How to Schedule Work


      Section IV:  System Architecture


      Aim(s): Introduce a classical PDS architecture and review different
      implementation techniques.


      IV.1 A PDS Environment


      IV.2 Route for Query Compilation


      IV.3 Dynamic versus Static Program Loading and Execution


      IV.4 Parallel Query Optimization














      Section V:  Transaction Model


      Aim(s): Because operations execute in parallel, a PDS must take
      care to maintain the ACID properties. What are these properties,
      and how can they be implemented?


      V.1  Definition of the Transaction Model


      V.2  How to achieve Atomicity?


      V.3  How to achieve Consistency?


      V.4  How to achieve Isolation?


      V.5  How to achieve Durability?


      Section VI: Existing PDS Systems


      Aim(s): If time allows, some prominent PDS will be reviewed. These
      will include EDS (European Declarative System), Gamma, Bubba and
      XPRS.





      Kam-Fai Wong obtained his PhD from the University of Edinburgh,
      Scotland, in 1987, in the area of computer architectures. After
      his PhD, he conducted research at Heriot-Watt University
      (Edinburgh, Scotland), UniSys (Livingston, Scotland) and ECRC
      (Munich, Germany). At present he is a Project Coordinator at the
      Chinese University of Hong Kong, in charge of the IPOC (Intelligent
      Processing Of Chinese) project. His research interests are
      parallel database and information systems. He has published over
      20 technical papers in these areas in various international
      journals, conferences and books.


      During his seven years of postdoctoral research, he has given many
      seminars. In 1993/94, he was one of the ACM lecturers worldwide.
      He is a member of IEEE-CS, ACM and IEE (UK), and has served as the
      AI/DB track chair of the 1994 ACM Symposium on Applied Computing,
      the Asian Coordinator of the 1994 Parallel and Distributed
      Information Systems conference, and a PC member of TOOLS94, GWIC94,
      PARLE94, VLDB94, SPDP94, ICDCS95 and DASFAA95.