r.va.gg

LevelDOWN Alternatives

Since LevelUP v0.9, LevelDOWN has been an optional dependency, allowing you to switch in alternative back-ends.

We have MemDOWN, a pure in-memory data-store, allowing you to run LevelUP against transient, and very fast storage.

We also have level.js which works against IndexedDB, allowing you to run LevelUP in the browser!

Since LevelUP just needs some basic primitives and a sorted bi-directional iterator, we can swap out the back-end with numerous alternatives.
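To give a feel for how little that is, here's a rough sketch of a toy in-memory store with put/get/del and a sorted iterator that can run in either direction. Everything in it (`ToyStore`, `_indexOf`, the `reverse` option) is made up for illustration; it is not the LevelDOWN interface, which is asynchronous and callback-based.

```javascript
// Toy in-memory store: NOT the LevelDOWN API, just the shape of the idea.
function ToyStore () {
  this._keys = []                     // always kept in sorted order
  this._values = Object.create(null)  // key -> value
}

// Binary search for the index of the first key >= `key`.
ToyStore.prototype._indexOf = function (key) {
  var lo = 0
    , hi = this._keys.length
  while (lo < hi) {
    var mid = (lo + hi) >>> 1
    if (this._keys[mid] < key) lo = mid + 1
    else hi = mid
  }
  return lo
}

ToyStore.prototype.put = function (key, value) {
  var i = this._indexOf(key)
  // only insert the key if it isn't already present
  if (this._keys[i] !== key) this._keys.splice(i, 0, key)
  this._values[key] = value
}

ToyStore.prototype.get = function (key) {
  return this._values[key]
}

ToyStore.prototype.del = function (key) {
  var i = this._indexOf(key)
  if (this._keys[i] === key) {
    this._keys.splice(i, 1)
    delete this._values[key]
  }
}

// Returns an iterator that walks entries in key order, forwards or backwards.
ToyStore.prototype.iterator = function (options) {
  var self = this
    , reverse = !!(options && options.reverse)
    , i = reverse ? this._keys.length - 1 : 0
  return {
    next: function () {
      if (i < 0 || i >= self._keys.length) return null // exhausted
      var key = self._keys[i]
      i += reverse ? -1 : 1
      return { key: key, value: self._values[key] }
    }
  }
}
```

Anything that can offer roughly these operations, however it stores the data underneath, is a candidate back-end.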

The easy targets are the forks of LevelDB that purport to fix or improve LevelDB in some way. I have another post brewing on what I think about the claims made in this area and how we ought to approach them, but that can come later. For now I have some packages in npm for you to try for yourself!

Basho

First of all we have leveldown-basho which bundles the Basho LevelDB fork into LevelDOWN. See Matthew Von-Maszewski's slides from the recent Ricon East 2013 for more information on what they've tried to do with LevelDB.

In summary, Basho's aim is to optimise LevelDB "for the server", particularly for high write throughput. They use more than one compaction thread and relax the rules a little on overlapping keys for the lower levels, plus a few other things that I won't get into here.

$ npm install levelup leveldown-basho
var levelup = require('levelup')
  , leveldown = require('leveldown-basho')
  , db = levelup('/path/to/db', { db: leveldown })  
// go to work on `db`

Disclaimer: some of the LevelDOWN and LevelUP tests are failing on the current build for this release. I don't believe the failures should affect standard usage, but your mileage may vary...

HyperDex

Next, we have leveldown-hyper, which bundles a fork by the people behind HyperDex, a key-value store. Again their aim is to optimise LevelDB for a server environment. You can see some of their claims about performance here. I don't know as much about this fork and I'll investigate further when I have time, but they are also using multiple compaction threads to do the background work.

$ npm install levelup leveldown-hyper
var levelup = require('levelup')
  , leveldown = require('leveldown-hyper')
  , db = levelup('/path/to/db', { db: leveldown })  
// go to work on `db`

Lies! Benchmarks!

OK, benchmarks kind of suck, particularly microbenchmarks. It's really hard to test something that's meaningful for everyone's use-case. But you can make pretty pictures with them and they can tell something of a story, even if it's just the first page of a novel.

So here we go. I've put together a simplistic benchmark that tries to test the kind of situation that these two forks are aiming to optimise for. In particular, high-throughput writes. There's a common claim that LevelDB has problems with writes because the compaction thread can hold up levels 0 and 1 while it's working on higher levels; and you really want to be flushing the new data as soon as possible so you can get more in. (Again, I have more to say on this & the claims about "fixing" the problem in a later post).

I have a sorted-write benchmark in the LevelDOWN repo that tries to push in 10M pre-sorted entries as fast as possible, fully utilising Node's worker-threads for the job. So this isn't your typical browser scenario. An important point here is that Node is a unique environment when looking at LevelDB performance. It's not going to be a straightforward mapping of benchmark results obtained with other LevelDB bindings onto what we can achieve in Node.

Because there are so many entries, instead of recording the time for individual writes, I've recorded average time for batches of 1000 writes. Below you can see what the write-times look like when plotted over time. There are a bunch of outliers that are above the maximum Y of 0.6ms, but not enough to warrant distracting from the interesting behaviour below 0.6ms so I chopped it off there.
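As a rough illustration of that kind of bucketing, here's a small sketch that collapses a list of per-write timings into one average per batch. The function name `averagePerBatch` and its arguments are hypothetical; the actual benchmark code in the LevelDOWN repo may well measure things differently (for example by timing each whole batch and dividing).

```javascript
// Collapse per-write timings (in ms) into one average per batch.
// A final, shorter batch is averaged over however many writes it holds.
function averagePerBatch (timings, batchSize) {
  var averages = []
  for (var start = 0; start < timings.length; start += batchSize) {
    var batch = timings.slice(start, start + batchSize)
      , total = 0
    for (var i = 0; i < batch.length; i++) total += batch[i]
    averages.push(total / batch.length)
  }
  return averages
}
```

With 10M writes and a batch size of 1000 you end up with 10,000 points to plot rather than 10 million.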

It is important to note that I'm using the default options here and this is where I'll probably cop some flak. Basho in particular advocate a healthy amount of "tuning" to achieve appropriate performance. In particular the write-buffer defaults to only 4M and you can push data in faster (at the cost of compactions later on) by increasing this. I think the forks may even have additional tunables of their own that you can fiddle with. But, this whole tuning thing is a rabbit hole I don't dare go down right now!

I'm running this on an i7-2630QM CPU, plenty of RAM and an SSD.

You can see that we managed to push in the 10M entries in just over 95 seconds with the plain Google LevelDB (v1.10.0).


Next up we have the HyperDex fork. The main difference here is that we have it working slightly faster in total and the write-times have been trimmed down a bit to be more consistent. Not a bad effort with default settings, quite a nice picture.


Lastly we can see what Basho have done. They've been on this case for a lot longer than HyperDex have and their fork, internally at least, diverges quite a bit from Google's LevelDB.

We can see that the write-time has been considerably flattened; which is in line with what Basho claim and are aiming for, the consistency here is very impressive. Unfortunately we've ended up with a total time that is double what it took Google's LevelDB to get the 10M entries in!

This probably has something to do with the tunables, or perhaps I've messed something up; anything's possible!


So?

If you take anything away from this, here's what I think it should be: do your own benchmarks if performance really is an issue for you. You're going to need some kind of benchmark suite that is tailored to your particular application. This will not only let you choose the appropriate storage system but it will give you something to work with when you start to get into the mire that is "tunables".

It's likely I won't be able to leave this alone and will be posting more benchmarks with some tweaking and tuning. I'd love to have input from others on this too of course! The code for this is all in the LevelDOWN repo with both of these forks under appropriately named branches.

LevelUP v0.9 Released

As per my previous post, LevelUP v0.9 has been released!

I'm doing a quick post about this release because it's got more changes in it than we normally see, including some things worth explaining.

Relationship to LevelDOWN

The biggest change is the removal of LevelDOWN as a dependency. You should review what I've already said about this, as it will impact you if you're currently using LevelUP. In short, you'll either need to explicitly npm install leveldown or switch to using the new Level package which bundles them both.

Along with this change, we also get better Browserify support. See level.js for more information on this.

Chained batch

The other major change is the introduction of a new chained batch syntax, additional to the existing batch syntax. This method of creating and writing batch operations is much closer to the way LevelDB does batches and under certain circumstances you may find improved performance from using this method.

If you call db.batch() with no arguments, you'll get a Batch object back which has the following operations: put(), del(), clear() and write(). The first three are chainable so you can call them one after the other to build your batch. write() is the only method that takes a callback because it submits the batch. Until you call write(), the batch is transient and can be discarded.

Example from the README:

db.batch()
  .del('father')
  .put('name', 'Yuri Irsenovich Kim')
  .put('dob', '16 February 1941')
  .put('spouse', 'Kim Young-sook')
  .put('occupation', 'Clown')
  .write(function () { console.log('Done!') })
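If you're wondering how an object like that hangs together, here's a minimal sketch of the pattern. `ToyBatch` and the plain-object "store" it writes to are invented for this example; this is not LevelUP's implementation, which hands the operations to the back-end rather than applying them itself.

```javascript
// Toy chained batch: NOT LevelUP's implementation, only the pattern.
function ToyBatch (store) {
  this._store = store // hypothetical: a plain object standing in for the database
  this._ops = []      // queued operations, nothing is applied yet
}

ToyBatch.prototype.put = function (key, value) {
  this._ops.push({ type: 'put', key: key, value: value })
  return this // returning `this` is what makes the call chainable
}

ToyBatch.prototype.del = function (key) {
  this._ops.push({ type: 'del', key: key })
  return this
}

ToyBatch.prototype.clear = function () {
  this._ops = [] // throw away everything queued so far
  return this
}

// Nothing touches the store until write() is called.
ToyBatch.prototype.write = function (callback) {
  var store = this._store
  this._ops.forEach(function (op) {
    if (op.type === 'put') store[op.key] = op.value
    else delete store[op.key]
  })
  this._ops = []
  if (typeof callback == 'function') process.nextTick(callback)
}
```

The point is simply that put(), del() and clear() only touch the queue, so a batch you never write() costs nothing and changes nothing.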

Some love for WriteStream

WriteStream got some attention in this release. On the main createWriteStream() method and on individual write() calls, you can now pass some new options:

  • 'type' can switch from the default 'put' to 'del' so you can make a WriteStream that only deletes when you write({ key: 'foo' }), or you can make individual writes delete: write({ type: 'del', key: 'foo' }).
  • 'keyEncoding' and 'valueEncoding' will switch from default encodings for the current LevelUP instance. Again, you can specify them on the main createWriteStream() or on individual write() calls.
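One way to picture how the stream-level setting and a per-write setting interact for 'type' is the sketch below. The helper `toOperation` is hypothetical and is not part of LevelUP; it also assumes that an option given on an individual write takes precedence over the stream's default.

```javascript
// Hypothetical helper: combine a stream's default options with one write.
// Assumption: a per-write `type` takes precedence over the stream-level one.
function toOperation (streamOptions, entry) {
  var type = entry.type || (streamOptions && streamOptions.type) || 'put'
    , op = { type: type, key: entry.key }
  if (type === 'put') op.value = entry.value // deletes carry no value
  return op
}
```

So on a stream created with { type: 'del' }, toOperation() turns write({ key: 'foo' }) into a delete of 'foo', while on a default stream the same call would need type: 'del' on the write itself.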

Other changes

  • A race condition was fixed that allowed a put() to write to the store before an iterator was obtained when calling createReadStream().
  • ReadStream no longer emits a 'ready' event.
  • The db property on LevelUP instances can be used to get access to LevelDOWN or whatever LevelDOWN-substitute you are using (this was _db).
  • Some very LevelDB-specific methods have been deprecated on LevelUP and the documentation now recommends either directly using LevelDOWN or calling via the db property. Specifically:
    • db.db.approximateSize()
    • leveldown.repair()
    • leveldown.destroy()
  • LevelDOWN got a new LevelDB method: getProperty() that currently understands 3 properties:
    • db.db.getProperty('leveldb.num-files-at-levelN'): returns the number of files at level N, where N is an integer representing a valid level (e.g. "0").
    • db.db.getProperty('leveldb.stats'): returns a multi-line string describing statistics about LevelDB's internal operation.
    • db.db.getProperty('leveldb.sstables'): returns a multi-line string describing all of the sstables that make up contents of the current database.
  • Significantly improved ReadStream performance (up to 50% faster).
  • Some LevelDOWN memory leaks discovered and fixed.
  • LevelDOWN upgraded to LevelDB@1.10.0, details here.

Who you should thank

A lot of people put in work to this release. There's a team of people that can claim ownership of LevelUP, LevelDOWN and related projects and most of them have been involved in this release. You should follow these people on Twitter and GitHub!

And others, who you can find in this 0.9 WIP thread, plus additional users who reported & found issues.

LevelUP v0.9 - Some Major Changes

LevelUP is still quite young and bound to go through some major shifts. It's best to not be too tied to immature APIs early in a project's lifetime.

That said, we're very interested in stability so we try to keep breaking changes to a minimum. However, we're about to publish version 0.9 and there's one change that's not exactly a "breaking" change in the normal sense, but it is something that I need to explain because it will impact on almost everyone currently using LevelUP.

Severing the dependency on LevelDOWN

LevelUP depends on LevelDOWN to do its LevelDB thing. LevelDOWN was once part of LevelUP until we split it off to a discrete project that focuses entirely on acting as a direct C++ bridge between LevelDB and Node. We get to focus on making LevelUP an awesome LevelDB-ish interface without being tied directly to LevelDB implementation details (e.g. Iterators vs Streams).

In fact, a new project was spawned to define the LevelDOWN interface that LevelUP requires. AbstractLevelDOWN is a set of strict tests for the functionality that LevelUP uses and it also implements a basic abstract shell that can be extended to create additional back-ends for LevelUP.
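The "abstract shell" idea is easier to see in code. The sketch below is a toy version of the pattern only: `AbstractStore`, `ObjectStore` and the error messages are invented here and are not AbstractLevelDOWN's actual API. The callbacks also fire synchronously to keep it short, which a real back-end shouldn't do.

```javascript
// Toy abstract shell: the names are illustrative, not AbstractLevelDOWN's API.
function AbstractStore () {}

// Public methods do the shared argument checking, then delegate.
AbstractStore.prototype.put = function (key, value, callback) {
  if (typeof callback != 'function') throw new Error('put() requires a callback')
  if (key === null || key === undefined)
    return callback(new Error('key cannot be null or undefined'))
  this._put(key, value, callback)
}

AbstractStore.prototype.get = function (key, callback) {
  if (typeof callback != 'function') throw new Error('get() requires a callback')
  if (key === null || key === undefined)
    return callback(new Error('key cannot be null or undefined'))
  this._get(key, callback)
}

// A back-end only has to fill in the underscore methods.
function ObjectStore () {
  AbstractStore.call(this)
  this._data = Object.create(null)
}

ObjectStore.prototype = Object.create(AbstractStore.prototype)
ObjectStore.prototype.constructor = ObjectStore

ObjectStore.prototype._put = function (key, value, callback) {
  this._data[key] = value
  callback()
}

ObjectStore.prototype._get = function (key, callback) {
  if (!(key in this._data)) return callback(new Error('NotFound'))
  callback(null, this._data[key])
}
```

The shared checks live in one place, so each new back-end only has to worry about how it actually stores and retrieves data.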

So far, there are 3 projects worth mentioning that extend AbstractLevelDOWN:

  • level.js operates on top of IndexedDB (which is in turn implemented on top of LevelDB in Chrome!).

  • leveldown-gap is another browser implementation that uses localStorage and is designed to be able to work in PhoneGap applications.

  • MemDOWN is a pure in-memory implementation that doesn't touch the disk. It's obviously not good for persistent data but sometimes that's not what you need.

Plus some other efforts to adapt other embedded and non-embedded data stores to the LevelDOWN interface. Additionally, there are other versions of LevelDB that can be used, including the fork that Basho maintains for use in Riak. (I have a branch of LevelDOWN that uses this version of LevelDB that I'll release as soon as I can explain and demonstrate the performance differences to vanilla LevelDB for Node users).

In short, LevelUP doesn't need LevelDOWN in the way it once did and LevelUP is turning into a more generic interface to sorted key/value storage systems, albeit with a distinct LevelDB-flavour.

Since version 0.8 we've supported a 'db' option when you create a LevelUP instance. This option can be used to provide an alternative LevelDOWN-compatible back-end. Unfortunately, LevelDOWN being defined as a strict dependency of LevelUP means that each time you install it you have to compile LevelDOWN, even if you don't want it. So, we've removed it as a dependency but it's still wired up so that the only thing you need to do is actually install LevelDOWN alongside LevelUP and it'll take care of the rest.

$ npm install levelup leveldown

From version 0.9 onwards, you'll need to do this, or you'll see an (informative) error.
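The general shape of an optional back-end like this is roughly as follows. This is a hypothetical sketch, not LevelUP's actual source: `resolveBackend`, its `load` argument (which stands in for `require`) and the wording of the error message are all made up for the example.

```javascript
// Hypothetical sketch of optional back-end loading; not LevelUP's source.
// `load` stands in for `require` so the lookup can be swapped out.
function resolveBackend (options, load) {
  // An explicit `db` option always wins, LevelDOWN is never touched.
  if (options && options.db) return options.db
  try {
    return load('leveldown')
  } catch (err) {
    // Fail with something more helpful than a bare "Cannot find module".
    throw new Error(
      'Could not locate LevelDOWN, try `npm install leveldown`'
    )
  }
}
```

In Node you'd call it as resolveBackend(options, require); if you pass your own back-end via the 'db' option the lookup never happens, so LevelDOWN never needs to be installed or compiled.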

Introducing "Level"

To make life easier, we're publishing an additional package in npm that bundles both LevelUP and LevelDOWN as dependencies and exposes LevelUP directly. The Level package is a very simple wrapper that exists purely as a convenience. It'll track the same versioning as LevelUP so it's a straight substitution.

$ npm install level

You can simply change your "dependencies" from "levelup" to "level", plus you can use it just like LevelUP:

var levelup = require('level')
var db = levelup('./my.db')
db.put('yay!', 'it works!')

Switching things up

Now we have a properly pluggable back-end, expect to see a growing array of choice and innovation. The most exciting space at the moment is browser-land. Consider level.js:

var levelup = require('levelup')
  , leveljs = require('level-js')

window.db = levelup('foo', { db: leveljs })

db.put('name', 'LevelUP string', function (err) {
  db.get('name', function (err, value) {
    console.log('name=' + value)
  })
})

Yep, that's browser code. Simply npm install levelup level-js and run the module through Browserify and you get the full LevelUP API in your browser!


Stay tuned! This is just one step in the quest for a truly modular database system that lets you build a database that suits your applications and not the other way around.

Node.ninjas Presentation - LevelDB and Node Sitting in a Tree

I'm giving a presentation at Node.ninjas tonight in Sydney. I've put together a talk about LevelDB and Node that covers:

  1. What LevelDB is and the basics of how it works
  2. A quick introduction to the core LevelDB libraries in Node: LevelUP and LevelDOWN
  3. Some preaching about the awesomeness of modularity around a small, extensible core; including a whirlwind tour of the current, flourishing, LevelDB+Node ecosystem

It's this last point that excites me the most. There are some very smart people building some very clever pieces of the Node database puzzle. What's more, people are actually building functional databases in Node now. I've just collected a list from npm of what look like functional databases that use LevelDB:

  • Rumours
  • LevelGraph
  • PushDB
  • NeutrinoDB
  • PlumbDB
  • Syncstore

And a few more that look like a work in progress. Plus, I'm sure there's more people out there we've never even heard of who are cooking up some amazing things using the LevelDB+Node combination!

The slides to my talk are here.

LevelDB and Node: Getting Up and Running

This is the second article in a three-part series on LevelDB and how it can be used in Node.

Our first article covered the basics of LevelDB and its internals. If you haven't already read it you are encouraged to do so as we will be building upon this knowledge as we introduce the Node interface in this article.

There are two primary libraries for using LevelDB in Node, LevelDOWN and LevelUP.

LevelDOWN is a pure C++ interface between Node.js and LevelDB. Its API provides limited sugar and is mostly a straightforward mapping of LevelDB's operations into JavaScript. All I/O operations in LevelDOWN are asynchronous and take advantage of LevelDB's thread-safe nature to parallelise reads and writes.

LevelUP is the library that the majority of people will use to interface with LevelDB in Node. It wraps LevelDOWN to provide a more Node.js-style interface. Its API provides more sugar than LevelDOWN, with features such as optional arguments.

LevelUP exposes iterators as Node.js-style object streams. A LevelUP ReadStream can be used to read sequential entries, forward or reverse, to and from any key.

LevelUP handles JSON and other encoding types for you. For example, when operating on a LevelUP instance with JSON value-encoding, you simply pass in your objects for writes and they are serialised for you. Likewise, when you read them, they are deserialised and passed back in their original form.
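Conceptually, a value-encoding is just a pair of transforms wrapped around the raw store. Here's a toy sketch of the JSON case; `jsonStore` and the `Map` standing in for the underlying store are assumptions made for illustration, and LevelUP's real API is asynchronous rather than synchronous like this.

```javascript
// Toy encoding wrapper: illustrative only, not how LevelUP is implemented.
function jsonStore () {
  var raw = new Map() // stands in for an underlying store that only holds strings
  return {
    raw: raw,
    // serialise on the way in
    put: function (key, value) {
      raw.set(key, JSON.stringify(value))
    },
    // deserialise on the way out
    get: function (key) {
      return raw.has(key) ? JSON.parse(raw.get(key)) : undefined
    }
  }
}
```

The caller puts an object in and gets an equivalent object back, while the underlying store only ever sees strings.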

Continue reading this article on DailyJS.com