LevelDB and Node Sitting in a Tree

by Rod Vagg / tw:@rvagg / gh:rvagg / bl:http://r.va.gg

LevelDB in a Nutshell

  • Open-source, embedded key/value store by Google
  • Sorted by keys
  • Values are compressed with Snappy
  • Basic operations: Get(), Put(), Del()
  • Atomic Batch()
  • Bi-directional iterators

Basic architecture

LSM-tree

  • Writes go straight into a log
  • Log is flushed to 2MB sorted string table (SST) files
  • Reads merge the log and the table files
  • Cache speeds up common reads
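
To make the bullets above concrete, here is a toy sketch of the write and read paths. It is not how LevelDB is implemented: ToyLSM, its entry-count flush threshold and the in-memory "tables" are all hypothetical stand-ins, and deletes, caching and compaction are left out.

```javascript
// Toy LSM-tree: an in-memory log plus immutable sorted tables.
// ToyLSM and its tiny flush threshold are hypothetical, for illustration only.
function ToyLSM (maxLogEntries) {
  this.maxLogEntries = maxLogEntries
  this.log = new Map()   // recent writes, newest value per key
  this.tables = []       // flushed tables, newest first
}

ToyLSM.prototype.put = function (key, value) {
  this.log.set(key, value)
  if (this.log.size >= this.maxLogEntries) this.flush()
}

// Turn the log into a sorted, immutable table and start a fresh log
ToyLSM.prototype.flush = function () {
  var entries = Array.from(this.log.entries()).sort(function (a, b) {
    return a[0] < b[0] ? -1 : a[0] > b[0] ? 1 : 0
  })
  this.tables.unshift(entries)
  this.log = new Map()
}

// Reads check the log first, then each table from newest to oldest
ToyLSM.prototype.get = function (key) {
  if (this.log.has(key)) return this.log.get(key)
  for (var i = 0; i < this.tables.length; i++) {
    var hit = this.tables[i].find(function (entry) { return entry[0] === key })
    if (hit) return hit[1]
  }
  return undefined
}

var lsm = new ToyLSM(2)
lsm.put('b', '1')
lsm.put('a', '2')   // second entry triggers a flush
lsm.put('b', '3')   // newer value for 'b' lives in the log
console.log(lsm.get('a'), lsm.get('b'), lsm.tables.length)
```

The newer value for 'b' wins because the log is consulted before any flushed table, which is the "reads merge the log and the table files" idea in miniature.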

Table file hierarchy

The "Level" in LevelDB

Log: Max size of 4MB then flushed into a set of Level 0 SST files
Level 0: Max of 4 SST files then one file compacted into Level 1
Level 1: Max total size of 10MB then one file compacted into Level 2
Level 2: Max total size of 100MB then one file compacted into Level 3
Level 3+: Max total size of 10 x previous level then one file compacted into next level

0 ↠ 4 SST, 1 ↠ 10M, 2 ↠ 100M, 3 ↠ 1G, 4 ↠ 10G, 5 ↠ 100G, 6 ↠ 1T, 7 ↠ 10T+
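
The progression above is just a power of ten per level. A small sketch, assuming the decimal units used on the slide (1G = 1000M); levelLimit is a hypothetical helper, not part of any library.

```javascript
// Size limit per level, following the figures on the slide:
// level 0 is capped by file count, level N >= 1 by 10^N MB in total.
function levelLimit (level) {
  if (level === 0) return '4 SST files'
  var mb = Math.pow(10, level)
  if (mb >= 1e6) return (mb / 1e6) + 'T'
  if (mb >= 1e3) return (mb / 1e3) + 'G'
  return mb + 'M'
}

for (var level = 0; level <= 7; level++) {
  console.log(level + ' -> ' + levelLimit(level))
}
```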

Get to the Node bit!

LevelDOWN http://ghub.io/leveldown

Pure C++ interface between Node and LevelDB

LevelUP http://ghub.io/levelup

Wrap LevelDOWN to provide a Node-style interface

  • Sugar: optional args, deferred-till-open
  • Streams!
  • JSON & other encoding types

Basic operations

var db = levelup('/path/to/database')

db.put('key', 'value', function (err) { /* ... */ })
db.get('key', function (err, value) { /* ... */ })
db.del('key', function (err) { /* ... */ })

db.close(function (err) { /* closed */ })

// multiple atomic writes with batch()
var operations = [
    { type: 'put', key: 'Franciscus', value: 'Jorge Bergoglio' }
  , { type: 'del', key: 'Benedictus XVI' }
]

db.batch(operations, function (err) { /* ... */ })

A simple example

var levelup = require('levelup')
var db = levelup('/tmp/dprk.db')

db.put('name', 'Kim Jong-un', function (err) {
  db.batch([
      { type: 'put', key: 'spouse', value: 'Ri Sol-ju' }
    , { type: 'put', key: 'dob', value: '8 January 1983' }
    , { type: 'put', key: 'occupation', value: 'Clown' }
  ], function (err) {
    db.createReadStream()
      .on('data', console.log)
      .on('end', function () { db.close() })
  })
})

Streams!

var rs = db.createReadStream()
rs.on('error', function (err) { /* handle err */ })
rs.on('data' , function (data) { /* data.key & data.value */ })
rs.on('end', function () { /* stream finished */ })

// Options! Oh my!
db.createReadStream({
    start     : 'somewheretostart'
  , end       : 'endkey'
  , limit     : 100
  , reverse   : true
})
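
How the options interact is easier to see on a plain sorted array. The sketch below is not LevelUP code: readRange is a hypothetical helper, and it assumes that 'start' is where iteration begins and 'end' is where it stops, so that with reverse: true the start key is the higher one. Check the LevelUP documentation for the exact semantics of the version you use.

```javascript
// Hypothetical helper: select entries from an already sorted array the way a
// range read might. Assumes 'start' is where iteration begins and 'end' is
// where it stops, so with reverse: true 'start' is the higher key.
function readRange (sorted, opts) {
  opts = opts || {}
  var list = opts.reverse ? sorted.slice().reverse() : sorted.slice()
  var out = []
  for (var i = 0; i < list.length; i++) {
    var key = list[i].key
    var beforeStart = opts.start !== undefined &&
      (opts.reverse ? key > opts.start : key < opts.start)
    var pastEnd = opts.end !== undefined &&
      (opts.reverse ? key < opts.end : key > opts.end)
    if (beforeStart) continue
    if (pastEnd) break
    out.push(list[i])
    if (opts.limit !== undefined && out.length >= opts.limit) break
  }
  return out
}

var entries = [
    { key: 'a', value: '1' }
  , { key: 'b', value: '2' }
  , { key: 'c', value: '3' }
  , { key: 'd', value: '4' }
]

var keys = readRange(entries, { start: 'c', end: 'a', limit: 2, reverse: true })
  .map(function (e) { return e.key })
console.log(keys)
```

Here the read starts at 'c', walks backwards towards 'a', and stops early once the limit of 2 is reached.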

Streams!

function copy (srcdb, destdb, callback) {
  srcdb.createReadStream()
    .pipe(destdb.createWriteStream())
    .on('error', callback)
    // pipe() returns the write stream, which signals completion with 'close'
    .on('close', callback)
}

Encoding

var db = levelup('/path/to/db', { valueEncoding: 'json' })
var data = {
    name       : 'Kim Jong-un'
  , spouse     : 'Ri Sol-ju'
  , dob        : '8 January 1983'
  , occupation : 'Clown'
}

db.put('dprk', data, function (err) {
  db.get('dprk', function (err, value) {
    console.log('dprk:', value)
    db.close()
  })
})

UTF8 (default), JSON, Buffer encoding types
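
Conceptually, a JSON value encoding only wraps each write and read in a serialise/parse step. A minimal sketch of that idea, using a Map in place of a real store; jsonStore is hypothetical and is not how LevelUP implements encodings.

```javascript
// Hypothetical wrapper showing what a 'json' value encoding amounts to:
// serialise on the way in, parse on the way out. The store holds strings only.
function jsonStore () {
  var raw = new Map()
  return {
      put: function (key, value) { raw.set(key, JSON.stringify(value)) }
    , get: function (key) {
        var str = raw.get(key)
        return str === undefined ? undefined : JSON.parse(str)
      }
    , raw: raw
  }
}

var store = jsonStore()
store.put('dprk', { name: 'Kim Jong-un', spouse: 'Ri Sol-ju' })

console.log(typeof store.raw.get('dprk'))
console.log(store.get('dprk').spouse)
```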

Modularity FTW!

Swappable back-end:

  • MemDOWN: a pure in-memory store
  • level.js: backed by IndexedDB for use in the browser
  • leveldown-gap: in the browser and in PhoneGap apps

Modularity FTW!

Small, extensible core with a thriving ecosystem

Level-Multiply

db.put({ boom: 'bang', whoa: 'dude' }, function (err) { })
db.get([ 'boom', 'whoa' ], function (err, data) { })
db.del([ 'boom', 'whoa' ], function (err) { })

Level TTL

db.put('foo', 'bar', { ttl: 3600000 }, function (err) { })

level-live-stream

db.liveStream().on('data', console.log)

Modularity FTW!

level-delete-range

deleteRange(db, {
    start: "foo:"
  , end: "foo:~"
}, cb)
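
Why 'foo:~' as the end key? Keys are compared byte by byte, and '~' (0x7E) sorts after letters, digits and most ASCII punctuation, so it works as an upper bound for keys that share the 'foo:' prefix. The sketch below only illustrates that ordering on an array of strings; keysInRange is a hypothetical helper and the inclusive bounds are an assumption, not a statement about level-delete-range itself.

```javascript
// Which keys fall inside the inclusive range 'foo:' .. 'foo:~'?
// '~' (0x7E) sorts after letters, digits and most ASCII punctuation, so
// 'foo:~' acts as an upper bound for typical 'foo:'-prefixed keys.
function keysInRange (keys, start, end) {
  return keys.filter(function (key) {
    return key >= start && key <= end
  })
}

var allKeys = [ 'bar:1', 'foo:', 'foo:1', 'foo:zebra', 'foo;', 'fop' ]
console.log(keysInRange(allKeys, 'foo:', 'foo:~'))
```

Keys outside the prefix, such as 'foo;' and 'fop', sort after 'foo:~' and are left alone.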

level-store

fs.createReadStream('foo.dat')
  .pipe(store.createWriteStream('foo'))

store.createReadStream('whoa').pipe(response);

Modularity FTW!

Map Reduce

var mapDb = MapReduce(
    db
  , 'example'
  , function (key, value, emit) {
      var obj = JSON.parse(value)
      emit([ 'all', obj.group ], String(obj.lines.length))
    }
  , function (acc, value, key) {
      // values are stored as strings, so convert before adding
      return String(Number(acc) + Number(value))
    }
  , '0'
)

mapDb.createReadStream({ range: [ 'all', group ]})

Modularity FTW!

level-mapped-index

db.registerIndex('id', function (key, value, emit) {
    value = JSON.parse(value)
    if (value.id) emit(value.id)
})

db.put('foo1', JSON.stringify({ one: 'ONE', id : '1' }))
db.put('foo2', JSON.stringify({ two: 'TWO', id : '2' }))

db.getBy('id', '1', function (err, data) {
  // [{ key: 'foo1', value: '{"one":"ONE","id":"1"}' }]
})

Modularity

Getting ambitious

multilevel & level-rpc: expose the LevelUP API over the network to multiple end-points

level-master: master / slave replication—multi-slave or aggregated multi-master

Modularity

Turning a 2D storage space into 3D

level-sublevel: Uses namespacing to divide a LevelUP instance into multiple sub-instances.

var db = levelup('/tmp/store.db')
sublevel(db)

var foodb = db.sublevel('foo')
var bardb = db.sublevel('bar')

  • Same API on each sub-instance
  • Operate on independent partitions of the store
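
The namespacing idea can be sketched with nothing more than key prefixes over one shared store. This is an illustration only: subStore is hypothetical, a Map stands in for the database, and the '!name!' prefix format is an assumption rather than the format level-sublevel actually uses.

```javascript
// Hypothetical namespacing: each sub-instance prepends its own prefix to keys
// before touching the shared store. The '!name!' format is an assumption.
function subStore (shared, name) {
  var prefix = '!' + name + '!'
  return {
      put: function (key, value) { shared.set(prefix + key, value) }
    , get: function (key) { return shared.get(prefix + key) }
  }
}

var shared = new Map()
var foo = subStore(shared, 'foo')
var bar = subStore(shared, 'bar')

foo.put('id', 'from foo')
bar.put('id', 'from bar')

console.log(foo.get('id'), bar.get('id'))
console.log(Array.from(shared.keys()))
```

Both sub-instances write the key 'id' without colliding, because the shared store only ever sees the prefixed keys.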

"Node Databases"

  • Rumours
  • LevelGraph
  • PushDB
  • NeutrinoDB
  • PlumbDB
  • Syncstore

LevelDB tools

  • lev: A LevelDB REPL
  • levelweb: A LevelDB web interface, with visualisation tools

The End. Questions?

Rod Vagg / tw:@rvagg / gh:rvagg / bl:http://r.va.gg

Resources

LevelUP ↠ ghub.io/levelup
LevelDOWN ↠ ghub.io/leveldown
Extensions & tools ↠ ghub.io/levelup/wiki/Modules
Articles ↠ ghub.io/levelup/wiki/Resources

Help & community

IRC ↠ ##leveldb on Freenode
Google group ↠ node-levelup
Issue tracker ↠ ghub.io/levelup/issues