Properties of Parallel and Distributed Database
Summary
A parallel database system consists of multiple processors and multiple disks connected by a fast interconnection network. A coarse-grain parallel machine has a small number of powerful processors, whereas a massively parallel (fine-grain) machine uses thousands of smaller processors. The two main performance measures are throughput and response time. Speedup measures how much faster a fixed-size problem runs when it is moved from a small system to a system that is N times larger: Speedup = elapsed time on the small system / elapsed time on the large system, and speedup is linear if it equals N. Scaleup measures the ability to run an N-times-larger job on an N-times-larger system: Scaleup = elapsed time of the small problem on the small system / elapsed time of the large problem on the large system, and scaleup is linear if it equals 1.

Parallel database architectures are classified as shared memory, shared disk, shared nothing, and hierarchical. Shared memory offers extremely efficient communication between processors, since any processor can access data in the shared memory without moving it in software. Shared disk provides a degree of fault tolerance: if a processor fails, the other processors can take over its tasks, because the database resides on disks accessible from all processors. Shared-nothing multiprocessors can be scaled up to thousands of processors without interference; their main drawbacks are the cost of communication and of non-local disk access, since sending data involves software interaction at both ends. The hierarchical architecture combines characteristics of shared-memory, shared-disk, and shared-nothing architectures: the topmost level is shared nothing, with nodes connected by an interconnection network that share neither disks nor memory with one another.
Things to Remember
- A parallel database system consists of multiple processors and multiple disks connected by a fast interconnection network.
- A coarse-grain parallel machine has a small number of powerful processors; a massively parallel (fine-grain) machine uses thousands of smaller processors.
- The two main performance measures are throughput and response time.
- Speedup measures how much faster a fixed-size problem runs when it is moved from a small system to a system that is N times larger: Speedup = elapsed time on the small system / elapsed time on the large system. Speedup is linear if it equals N.
- Scaleup measures the ability to run an N-times-larger job on an N-times-larger system: Scaleup = elapsed time of the small problem on the small system / elapsed time of the large problem on the large system. Scaleup is linear if it equals 1.
- Parallel database architectures are classified as shared memory, shared disk, shared nothing, and hierarchical.
- Shared memory offers extremely efficient communication between processors: any processor can access data in the shared memory without moving it in software.
- Shared disk provides a degree of fault tolerance: if a processor fails, the other processors can take over its tasks, because the database resides on disks accessible from all processors.
- Shared-nothing multiprocessors can be scaled up to thousands of processors without interference. The main drawbacks are the cost of communication and of non-local disk access, since sending data involves software interaction at both ends.
- The hierarchical architecture combines characteristics of shared-memory, shared-disk, and shared-nothing architectures. Its topmost level is shared nothing: nodes connected by an interconnection network share neither disks nor memory.

Properties of Parallel and Distributed Database
Parallel database systems
A parallel database system consists of multiple processors and multiple disks connected by a fast interconnection network. A coarse-grain parallel machine has a small number of powerful processors, whereas a massively parallel (fine-grain) machine uses thousands of smaller processors. The two main performance measures are:
- Throughput: the number of tasks that can be completed in a given time interval.
- Response time: the amount of time it takes to complete a single task from the moment it is submitted (a small worked example of both measures follows this list).
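To make the two measures concrete, here is a minimal Python sketch; the task timings and the observation window are hypothetical values chosen only to illustrate the definitions above, not figures from the text:

```python
# Hypothetical per-task response times (seconds); assumed values for illustration.
task_response_times = [0.8, 1.1, 0.9, 1.3, 1.0]

observation_window = 5.0  # length of the measurement interval in seconds (assumed)

# Throughput: tasks completed per unit of time.
throughput = len(task_response_times) / observation_window

# Response time: time to complete a single task from the moment it is submitted;
# here we report the average over the observed tasks.
average_response_time = sum(task_response_times) / len(task_response_times)

print(f"Throughput: {throughput:.2f} tasks/second")
print(f"Average response time: {average_response_time:.2f} seconds")
```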
Speedup: Speedup measures how much faster a fixed-size problem runs when the work is moved from a small system to a system that is N times larger. It is measured as follows:
Speedup = elapsed time on the small system / elapsed time on the large system. Speedup is said to be linear if it equals N.
Scaleup: If we increase the size of both the problem and the system, the process is called scaleup: the N-times-larger system is used to perform an N-times-larger job. It is measured as follows:
Scaleup = elapsed time of the small problem on the small system / elapsed time of the large problem on the large system. Scaleup is linear if it equals 1.
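The following sketch applies the two formulas above to hypothetical elapsed times; the value of N and the timings are assumptions used only for illustration:

```python
def speedup(small_system_time, large_system_time):
    # Same fixed-size problem, run first on the small system and then on the
    # system that is N times larger.
    return small_system_time / large_system_time

def scaleup(small_problem_small_system_time, large_problem_large_system_time):
    # Small problem on the small system versus an N-times-larger problem on
    # the N-times-larger system.
    return small_problem_small_system_time / large_problem_large_system_time

N = 4  # assumed: the large system has 4 times the resources of the small one

s = speedup(small_system_time=120.0, large_system_time=30.0)
print(f"Speedup = {s:.1f} (linear, since it equals N = {N})")

c = scaleup(small_problem_small_system_time=120.0,
            large_problem_large_system_time=120.0)
print(f"Scaleup = {c:.1f} (linear, since it equals 1)")
```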
Parallel Database Architectures:
The main parallel database architectures are:
- Shared memory: the processors share a common memory.
- Shared disk: the processors share a common set of disks.
- Shared nothing: the processors share neither memory nor disks.
- Hierarchical: a hybrid of the above architectures.
Shared Memory:
The processors and the disks have access to a common memory, typically via a bus or an interconnection network. Shared memory offers extremely efficient communication between processors: data in shared memory can be accessed by any processor without being moved in software. The architecture does not scale beyond 32 or 64 processors, because the bus or the interconnection network becomes a bottleneck. Shared memory is therefore widely used for lower degrees of parallelism (4 to 8 processors).
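As a rough illustration of the shared-memory idea (not taken from the text), the sketch below lets several threads scan disjoint ranges of one in-memory table; because all threads live in the same address space, no data has to be moved between them. Python's global interpreter lock limits true CPU parallelism, so this only illustrates the shared-address-space access pattern:

```python
from concurrent.futures import ThreadPoolExecutor

# One in-memory "table" of (id, salary) rows, shared by all worker threads.
table = [(i, 1000 + (i % 50) * 10) for i in range(1_000_000)]

def partial_sum(lo, hi):
    # Each worker reads its assigned range of the shared table directly;
    # nothing is copied or sent between workers.
    return sum(table[i][1] for i in range(lo, hi))

workers = 4
chunk = len(table) // workers
ranges = [(w * chunk, len(table) if w == workers - 1 else (w + 1) * chunk)
          for w in range(workers)]

with ThreadPoolExecutor(max_workers=workers) as pool:
    total_salary = sum(pool.map(lambda r: partial_sum(*r), ranges))

print("Total salary computed from shared memory:", total_salary)
```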

Shared Disk:
All processors can directly access all disks through an interconnection network, while each processor has its own private memory, so the memory bus is not a bottleneck. Shared disk provides a degree of fault tolerance: if a processor fails, the other processors can take over its tasks, since the database resides on disks that are accessible from all processors. IBM Sysplex and DEC clusters (later part of Compaq) running Rdb (now Oracle Rdb) were early commercial users. The bottleneck moves to the interconnection to the disk subsystem. Shared-disk systems can scale to a somewhat larger number of processors, but communication between processors is slower.
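A rough way to picture shared disk (an illustration, not taken from the text; SQLite stands in for a disk-resident database) is several processes, each with private memory, opening the same database file. If one process fails, another can still answer queries from the same file:

```python
import sqlite3
from multiprocessing import Process

DB_PATH = "shared_disk_demo.db"  # hypothetical file on a disk visible to every processor

def setup():
    con = sqlite3.connect(DB_PATH)
    con.execute("CREATE TABLE IF NOT EXISTS accounts(id INTEGER PRIMARY KEY, balance REAL)")
    con.execute("INSERT OR REPLACE INTO accounts VALUES (1, 500.0), (2, 750.0)")
    con.commit()
    con.close()

def worker(name):
    # Each worker has its own private memory; only the disk-resident
    # database file is shared among them.
    con = sqlite3.connect(DB_PATH)
    total = con.execute("SELECT SUM(balance) FROM accounts").fetchone()[0]
    print(f"{name}: total balance read from the shared database file = {total}")
    con.close()

if __name__ == "__main__":
    setup()
    processes = [Process(target=worker, args=(f"node-{i}",)) for i in range(3)]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
```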
Shared Nothing:
Each node consists of a processor, memory, and one or more disks. A processor at one node communicates with processors at other nodes over an interconnection network, and each node functions as the server for the data on the disk or disks it owns. Examples are Teradata, Tandem, and Oracle on nCUBE. Data accessed from local disks and local memory accesses do not pass through the interconnection network, which minimizes the interference of resource sharing. Shared-nothing multiprocessors can be scaled up to thousands of processors without interference. The main drawbacks are the cost of communication and of non-local disk access, since sending data involves software interaction at both ends.
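The sketch below (illustrative only; the partitioning scheme and node model are assumptions) mimics the shared-nothing idea by hash-partitioning rows so that each node owns its own local data, with every lookup routed to exactly one owning node:

```python
NUM_NODES = 4

# Each node's local storage (its own memory/disk) is modeled as a private dict.
nodes = [dict() for _ in range(NUM_NODES)]

def owner(key):
    # Hash partitioning decides which node owns a given key.
    return hash(key) % NUM_NODES

def insert(key, row):
    # Only the owning node stores the row on its local "disk".
    nodes[owner(key)][key] = row

def lookup(key):
    # In a real system this request travels over the interconnection network
    # to the single node that owns the key; no other node is involved.
    return nodes[owner(key)].get(key)

for emp_id in range(10):
    insert(emp_id, (emp_id, f"employee-{emp_id}"))

print(lookup(7))                  # answered entirely from one node's local data
print([len(n) for n in nodes])    # rows are spread across the nodes
```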
Hierarchical:
It combines characteristics of shared-memory, shared-disk, and shared-nothing architectures. The topmost level is a shared-nothing architecture: the nodes are connected by an interconnection network and do not share disks or memory with one another. Each node of the system could be a shared-memory system with a few processors; alternatively, each node could be a shared-disk system, and each of the systems sharing a set of disks could itself be a shared-memory system. Distributed virtual-memory architectures reduce the complexity of programming such systems; this organization is also called non-uniform memory architecture (NUMA).
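As a final illustration (an assumed structure, not from the text), the hybrid can be pictured as shared nothing across nodes with shared memory inside each node: the data is partitioned between nodes, and each node scans its own partition with a pool of threads that share that node's memory.

```python
from concurrent.futures import ThreadPoolExecutor

NUM_NODES = 2
THREADS_PER_NODE = 4

# Top level (shared nothing): each node owns a private partition of the data.
partitions = [[v for v in range(100_000) if v % NUM_NODES == node]
              for node in range(NUM_NODES)]

def node_scan(partition):
    # Inside a node (shared memory): threads share the node's memory and each
    # sums one slice of the node's local partition.
    chunk = max(1, len(partition) // THREADS_PER_NODE)
    slices = [partition[i:i + chunk] for i in range(0, len(partition), chunk)]
    with ThreadPoolExecutor(max_workers=THREADS_PER_NODE) as pool:
        return sum(pool.map(sum, slices))

# A coordinator combines the per-node results across the top-level network.
print("Grand total:", sum(node_scan(p) for p in partitions))
```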