JavaScript Databases II

@rvagg

Max Ogden - "JavaScript Databases" - LXJS 2012

"I want to see a time where I can write a persistence function that can run in Node, the browser and anywhere else JavaScript runs."

LevelDB?

  • Open-source, embedded key/value store by Google
  • Basic operations: Get(), Put(), Del()
  • Atomic Batch()
  • Entries sorted by keys
  • Bi-directional iterators

LevelDB: basic architecture

Log Structured Merge Tree (LSM)

  • Writes go straight into a log
  • Log is flushed to sorted string table (SST) files
  • SST files grow into a hierarchy of overlapping "levels"
  • Reads merge the log and the level / SST data
  • Cache speeds up common reads
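
The write and read paths listed above can be sketched in miniature. This is a toy, synchronous, in-memory model and not LevelDB's implementation: `ToyLSM`, `flushThreshold` and the single flat list of tables are all assumptions made for illustration, and compaction, deletes and the cache are left out.

```javascript
// Toy LSM store: NOT LevelDB's implementation, just the read/write path in miniature.
function ToyLSM (flushThreshold) {
  this.flushThreshold = flushThreshold // hypothetical tuning knob
  this.log = new Map()                 // recent writes (stands in for the log)
  this.tables = []                     // flushed, immutable sorted tables, newest first
}

ToyLSM.prototype.put = function (key, value) {
  this.log.set(key, value)             // writes go straight into the log
  if (this.log.size >= this.flushThreshold) this.flush()
}

ToyLSM.prototype.flush = function () {
  // sort the logged entries by key and freeze them as a new table
  var entries = Array.from(this.log.entries()).sort(function (a, b) {
    return a[0] < b[0] ? -1 : a[0] > b[0] ? 1 : 0
  })
  this.tables.unshift(entries)
  this.log = new Map()
}

ToyLSM.prototype.get = function (key) {
  // reads consult the log first, then each table from newest to oldest
  if (this.log.has(key)) return this.log.get(key)
  for (var i = 0; i < this.tables.length; i++) {
    var table = this.tables[i]
    for (var j = 0; j < table.length; j++) {
      if (table[j][0] === key) return table[j][1]
    }
  }
  return undefined
}

var lsm = new ToyLSM(2)
lsm.put('b', 1)
lsm.put('a', 2)   // second write triggers a flush: the table holds a, then b
lsm.put('b', 3)   // the newer value for 'b' lives in the log and wins on read
```

Because the log is checked before any table, `lsm.get('b')` sees the newest write even though an older value for `'b'` still sits in a flushed table.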

Database Primitives for JS

LevelUP:

  • Open / Close
  • Get
  • Put
  • Del
  • Atomic batch
  • ReadStream

For arbitrary data

Primitives: ReadStream

The simplest form of a query mechanism

Basic range query:

| a | b | e | f1 | f2 | g | h | i | o | p | q | r | v |
          ↑ 'e'         'h' ↑
          ╰-----------------╯

db.createReadStream({ start: 'e', end: 'h' })

// 'e', 'f1', 'f2', 'g', 'h'

Primitives: ReadStream

Stab in the dark:

| a | b | e | f1 | f2 | g | h | i | o | p | q | r | v |
              ↑     ↑
              ╰-----╯

Bytewise comparison to the rescue!

db.createReadStream({ start: 'f', end: 'f~' })

// 'f1', 'f2'
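
Why 'f~' works as an upper bound can be checked with plain string comparison. The sketch below uses an in-memory array in place of a database, and `range` is a hypothetical helper that mimics inclusive start/end semantics. It assumes ASCII keys, where JavaScript's string ordering matches bytewise ordering and '~' (0x7E) sorts above digits and letters.

```javascript
// In-memory stand-in for a sorted key space; not a LevelUP call.
var keys = ['a', 'b', 'e', 'f1', 'f2', 'g', 'h', 'i', 'o', 'p', 'q', 'r', 'v']

// Inclusive range scan over sorted keys, mimicking start/end semantics.
function range (sortedKeys, start, end) {
  return sortedKeys.filter(function (key) {
    return key >= start && key <= end
  })
}

var basic = range(keys, 'e', 'h')       // every key from 'e' through 'h'
var prefixed = range(keys, 'f', 'f~')   // every key that begins with 'f'
```

A key containing a character above '~' after the prefix would fall outside the range, so the trick depends on controlling which characters appear in keys.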

Primitives: Batch

Atomic operations for sophisticated behaviour

Example: Indexes

db.put('foo', { name: 'bar', x: 100 })
db.put('boom', { name: 'bang', x: 222 })

// ?? db.getBy('name', 'bar')

Primitives: Batch

db.put('foo', { name: 'bar', x: 100 }) // primary
db.put('index~name~bar~foo', 'foo') // index

var getBy = function (index, value, callback) {
  var keys = []
  db.createReadStream({
      start : 'index~' + index + '~' + value + '~'
    , end   : 'index~' + index + '~' + value + '~~'
  }).on('data', function (entry) {
    keys.push(entry.value)        // index value = primary key
  }).on('error', function (err) {
    callback(err)
  }).on('end', function () {
    callback(null, keys)
  })
}

Primitives: Batch

But what about consistency?

db.put('foo', { name: 'bar', x: 100 }) // primary
db.put('index~name~bar~foo', 'foo') // index

Primitives: Batch

var put = function (key, value, callback) {
  db.batch().put(key, value)                          // primary entry
    .put('index~name~' + value.name + '~' + key, key) // index entry
    .write(callback)                                  // atomic!
}

put('foo', { name: 'bar', x: 100 }, ...)

//  db.createReadStream({
//      start : 'index~' + index + '~' + value + '~'
//    , end   : 'index~' + index + '~' + value + '~~'
//  })

Automated with level-hooks
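
The index pattern can be tried without a database. The sketch below is a synchronous, in-memory toy and not LevelUP: `ToyStore`, `putIndexed` and the array form of `batch` are assumptions for illustration. It uses the `index~name~<value>~<key>` key layout, so two entries sharing a name get separate index keys.

```javascript
// Toy in-memory store (hypothetical, not LevelUP): batch() applies all ops or none.
function ToyStore () { this.data = new Map() }

ToyStore.prototype.batch = function (ops) {
  // validate everything first so a bad op leaves the store untouched
  ops.forEach(function (op) {
    if (typeof op.key !== 'string' || op.key === '') throw new Error('bad key')
  })
  var data = this.data
  ops.forEach(function (op) {
    if (op.type === 'del') data.delete(op.key)
    else data.set(op.key, op.value)
  })
}

// Inclusive range scan over the sorted key space.
ToyStore.prototype.range = function (start, end) {
  return Array.from(this.data.keys()).sort().filter(function (key) {
    return key >= start && key <= end
  })
}

// Write the primary entry and its index entry in one batch.
function putIndexed (store, key, value) {
  store.batch([
      { type: 'put', key: key, value: value }
    , { type: 'put', key: 'index~name~' + value.name + '~' + key, value: key }
  ])
}

// Look up primary keys by scanning the index prefix.
function getBy (store, index, value) {
  var prefix = 'index~' + index + '~' + value + '~'
  var data = store.data
  return store.range(prefix, prefix + '~').map(function (k) {
    return data.get(k)
  })
}

var store = new ToyStore()
putIndexed(store, 'foo', { name: 'bar', x: 100 })
putIndexed(store, 'boom', { name: 'bang', x: 222 })
putIndexed(store, 'foo2', { name: 'bar', x: 7 })
var bars = getBy(store, 'name', 'bar')   // primary keys of entries named 'bar'
```

Since both writes travel in one batch, a reader never finds an index entry that points at a missing primary entry, or the reverse.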

Primitives: Batch

Example: Async work that must be done for each entry

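var put = function (key, value, callback) {
  db.batch().put(key, value)                   // primary entry
    .put('pending~' + key + '~', key)          // marker
    .write(function (err) {                    // atomic!
      if (err) return callback(err)
      work({ key: key, value: value })         // start once the marker is stored
      callback()
    })
}
var work = function (entry) {
  // do some async work...
  db.del('pending~' + entry.key + '~')         // clear the marker when done
}
// on restart: each marker's value is the primary key still awaiting work
db.createReadStream({ start: 'pending~', end: 'pending~~' })
  .on('data', function (marker) {
    db.get(marker.value, function (err, value) {
      if (!err) work({ key: marker.value, value: value })
    })
  })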

Buckets

Or "namespaces"

Like tables, for organising data and separating types of data

db.put('~countries~Morocco', { capital: 'Rabat' })
db.put('~countries~Portugal', { capital: 'Lisbon' })
db.put('~countries~Spain', { capital: 'Madrid' })
db.put('~cities~Leiria', { population: 50264 })
db.put('~cities~Lisbon', { population: 547631 })
db.put('~cities~Lixa', { population: 5500 })

Buckets

Automated with level-sublevel

db = sublevel(db)
countriesDb = db.sublevel('countries')
citiesDb = db.sublevel('cities')

countriesDb.put('Morocco', { capital: 'Rabat' })
countriesDb.put('Portugal', { capital: 'Lisbon' })
countriesDb.put('Spain', { capital: 'Madrid' })
citiesDb.put('Leiria', { population: 50264 })
citiesDb.put('Lisbon', { population: 547631 })
citiesDb.put('Lixa', { population: 5500 })

countriesDb.createReadStream().on('data', console.log)
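
The idea behind a bucket can be sketched as a thin wrapper that adds and strips a key prefix. This is a hypothetical in-memory helper and not how level-sublevel is implemented: `bucket`, its `keys` method and the `Map` standing in for the store are assumptions for illustration.

```javascript
// Hypothetical helper, not level-sublevel's implementation: a bucket is a key prefix.
function bucket (data, name) {
  var prefix = '~' + name + '~'
  return {
      put: function (key, value) { data.set(prefix + key, value) }
    , get: function (key) { return data.get(prefix + key) }
    , keys: function () {
        // keep only this bucket's keys, sorted, with the prefix stripped
        return Array.from(data.keys()).sort()
          .filter(function (k) { return k.slice(0, prefix.length) === prefix })
          .map(function (k) { return k.slice(prefix.length) })
      }
  }
}

var data = new Map()   // stands in for the underlying sorted store
var countries = bucket(data, 'countries')
var cities = bucket(data, 'cities')
countries.put('Spain', { capital: 'Madrid' })
countries.put('Morocco', { capital: 'Rabat' })
cities.put('Lisbon', { population: 547631 })
```

Each bucket sees only its own keys, while the underlying store still holds everything in a single sorted key space.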

LevelDOWN: Storage flexibility

  • LevelDB (Google)
  • LevelDB (Basho)
  • HyperLevelDB (HyperDex)
  • LMDB
  • MemDOWN
  • mysqlDOWN
  • more under development...

...and level.js

The Level* ecosystem in the browser!

  • Tools: lev, levelweb
  • Packages: tacodb, couchup, LevelGraph, firedup, level-assoc, level-static, level-store, level-session, level-fs, LevelTTLCache
  • Extensions: level-live-stream, map-reduce, level-queryengine, Level-Multiply, multilevel, level-replicate, level-master, Level TTL
  • Extensibility: sublevel, level-hooks, level-mutex
  • Core: LevelUP
  • Storage: LevelDOWN, LevelDOWN (Hyper), LevelDOWN (Basho), MemDOWN, level.js, leveldown-gap, LMDB, mysqlDOWN

LevelUP Core Team

@chesles @raynos @dominictarr @maxogden
@ralphtheninja @kesla @juliangruber @hij1nx @no9
@mcollina @pgte @substack @rvagg