NoSQL Databases: HBase

HBase is an open-source, non-relational, distributed database modeled after Google's BigTable and written in Java. Its data model is a sparse, distributed, persistent, multidimensional sorted map, which can also be viewed as a key/value store.

Things to Remember

HBase is designed to efficiently support:

  • Random Access
  • Fast record lookup
  • Support for record-level insertion
  • Support for updates

 

Rows are maintained in lexicographic order by row key:

  • Efficient row scans
  • Row ranges are dynamically partitioned into regions (called tablets in BigTable)

Columns are grouped into column families:

  • Column key = family:qualifier
  • Column families act as locality hints: columns in the same family are stored together
  • Unbounded number of columns
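
Because row keys are kept in sorted order, a scan over a key range only needs to find the start position and read forward. The following is a minimal sketch of that idea in plain Python, not the HBase client API: the table contents, the `scan` helper, and the key naming scheme (`user#...`, `order#...`) are all made up for illustration. It assumes the start row is inclusive and the stop row is exclusive.

```python
from bisect import bisect_left

# Toy stand-in for a table: row keys (bytes) mapped to some value.
rows = {
    b"user#alice": "a",
    b"user#bob": "b",
    b"order#0001": "o1",
    b"order#0002": "o2",
    b"user#carol": "c",
}

# Bytes compare byte by byte, which gives lexicographic order.
sorted_keys = sorted(rows)


def scan(start_row, stop_row):
    """Return the row keys in [start_row, stop_row) using binary search."""
    lo = bisect_left(sorted_keys, start_row)
    hi = bisect_left(sorted_keys, stop_row)
    return sorted_keys[lo:hi]


# '$' is the byte immediately after '#', so this range covers every
# key that starts with b"user#".
user_keys = scan(b"user#", b"user$")
print(user_keys)
```

The scan touches only the contiguous slice of keys it needs, which is why the choice of row key prefix matters so much for read patterns.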

HBase

  • Created in 2007 at Powerset
  • Distributed, column-oriented data store built on top of HDFS
  • Available under the Apache License 2.0

Features

  • Non-relational and sparse
  • Distributed
  • Persistent
  • Open source
  • Horizontally scalable
  • Multidimensional rather than two-dimensional (relational)
  • Schema-free: column qualifiers need not be declared in advance
  • Modeled after Google's BigTable
  • Decentralized storage system
  • Sorted map or key/value store
  • Easy replication support
  • Written in Java
  • Simple API

fig:- Hadoop Ecosystem

Architecture

Region

  • A subset of a table’s rows
  • Horizontal range partitioning
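
To make horizontal range partitioning concrete, the sketch below shows one way a row key could be mapped to the region that holds it. It is a toy model in plain Python, not HBase's actual lookup code: the split points, the region names, and the `locate_region` helper are hypothetical. It assumes each region covers the keys from its start key (inclusive) up to the next region's start key (exclusive).

```python
from bisect import bisect_right

# Hypothetical split points. Each region covers [start_key, next start_key).
# The first region starts at the empty key, so it holds everything below b"g".
region_start_keys = [b"", b"g", b"n", b"t"]
region_names = ["region-1", "region-2", "region-3", "region-4"]


def locate_region(row_key):
    """Find the region whose key range contains row_key."""
    # bisect_right returns the number of start keys <= row_key;
    # the last of those start keys belongs to the owning region.
    idx = bisect_right(region_start_keys, row_key) - 1
    return region_names[idx]


for key in (b"apple", b"g", b"mango", b"zebra"):
    print(key, "->", locate_region(key))
```

Since the start keys are sorted, the lookup is a binary search, and splitting a region only means adding one more start key to the list.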


Region Server

  • Manages data regions
  • Serves data for reads and writes


Master

  • Coordinates the region servers
  • Assigns regions and detects failures
  • Provides administrative functionality [2]

Tables, Rows, Columns, and Cells

  • The basic unit is a column.
  • One or more columns form a row, which is addressed uniquely by a row key.
  • A number of rows, in turn, form a table, and there can be many tables.
  • Each column may have multiple versions, with each distinct value contained in a separate cell.

All rows are always sorted lexicographically by their row key.[1]
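
One practical consequence of lexicographic ordering is that numbers embedded in row keys sort as text, not as numbers. The short Python sketch below illustrates this with made-up keys of the form `row-N`; zero-padding to a fixed width is shown as one common workaround, not as the only option.

```python
numeric_ids = [1, 2, 10, 20, 100]

# Keys built from the raw numbers versus keys zero-padded to five digits.
plain_keys = [f"row-{n}".encode() for n in numeric_ids]
padded_keys = [f"row-{n:05d}".encode() for n in numeric_ids]

# Bytes are compared byte by byte, so b"row-10" sorts before b"row-2".
print(sorted(plain_keys))

# With fixed-width padding, byte order matches numeric order.
print(sorted(padded_keys))
```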

fig:- Hbase Rows

Data Model

  • Map indexed by a row key, a column key, and a timestamp
  • Supports lookups, inserts, and deletes (transactions are limited to a single row)

Rows are composed of columns, and those, in turn, are grouped into column families. All columns in a column family are stored together in the same low-level storage file, called an HFile. A column family can contain millions of columns. There is also no type or length limit on column values.
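
A rough way to picture this data model is as nested maps: row key, then column key (`family:qualifier`), then timestamp, then value. The sketch below is a toy in-memory model in Python, not the HBase client API. The class `ToyTable` and its `put`, `get`, and `delete` methods are hypothetical names chosen for illustration, and the delete is simplified: it just drops the stored versions, whereas a real store would have to record the deletion persistently.

```python
from collections import defaultdict


class ToyTable:
    """In-memory illustration of (row key, column key, timestamp) -> value."""

    def __init__(self):
        # row key -> column key -> {timestamp: value}
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row, column, timestamp, value):
        # Values are uninterpreted bytes: no type or length is enforced.
        self._rows[row][column][timestamp] = value

    def get(self, row, column, timestamp=None):
        """Return the newest value at or before timestamp, or None."""
        versions = self._rows.get(row, {}).get(column, {})
        candidates = [ts for ts in versions
                      if timestamp is None or ts <= timestamp]
        if not candidates:
            return None
        return versions[max(candidates)]

    def delete(self, row, column):
        # Simplification: remove every stored version of the cell.
        self._rows.get(row, {}).pop(column, None)


table = ToyTable()
table.put(b"row1", b"info:name", 1, b"Ann")
table.put(b"row1", b"info:name", 2, b"Anna")

print(table.get(b"row1", b"info:name"))               # newest version
print(table.get(b"row1", b"info:name", timestamp=1))  # older version
print(table.get(b"row1", b"info:email"))              # absent cell
```

The table is sparse in the sense that a cell which was never written takes no space at all: there is simply no entry for it in the inner map.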

Rows and columns in HBase

fig:- Hbase

fig:- BigTable

References:

  1. Lars George, "HBase: The Definitive Guide", p. 17
  2. Natasha Balac, "Introduction to Big Data Analytics", coursera.org
  3. "Apache HBase Reference Guide", hbase.apache.org
