r.va.gg

The Truth About Rod Vagg

NOTE: This post is copied from https://github.com/nodejs/CTC/issues/165#issuecomment-324798494 and the primary intended audience was the Node.js CTC.


Dear reader from ${externalSource}: I neither like nor support personal abuse or attacks. If you are showing up here getting angry at any party involved, I would ask you to refrain from targeting them, privately or in public. Specifically to people who think they may be supporting me by engaging in abusive behaviour: I do not appreciate, want or need it, in any form and it is not helpful in any way.

Yep, this is a long post, but no apologies for the length this time. Buckle up.

I'm sad that we have reached this point, and that the CTC is being asked to make such a difficult decision. One of the reasons that we initially split the TSC into two groups was to insulate the technical doers on the CTC from the overhead of administrative and political tedium. I know many of you never imagined you'd have to deal with something like this when you agreed to join and that this is a very uncomfortable experience for you.

It's obvious that we never figured out a suitable structure that made the TSC a useful, functional, and healthy body that might be able to deal more effectively with these kinds of problems, more isolated from the CTC. I'm willing to accept a sizeable share of the blame for not improving our organisational structure during my tenure in leadership.

My response

Regarding the request for me to resign from the CTC: in lieu of clear justification that my removal is for the benefit of the Node.js project, or a case for my removal that is not built primarily on hearsay and innuendo, I respectfully decline.

There are two primary reasons for which I am standing my ground.

I cannot, in good conscience, give credence to the straw-man version of me being touted loudly on social media and on GitHub. This caricature of me and vague notions regarding my "toxicity", my propensity for "harassment", the "systematic" breaking of rules and other slanderous claims against my character has no basis in fact. I will not dignify these attacks by taking tacit responsibility through voluntary resignation.

Secondly, and arguably more importantly for the CTC: I absolutely will not take responsibility for the precedent that is currently being set. The dogged pursuit of a leader of this project, the strong-arm tactics being deployed with the goal of having me voluntarily resign, or my eventual removal from this organisation are not the behavior of a healthy, productive, or inclusive community.

My primary concern is that the consequences of these actions endanger the future health of the Node.js project. I do not believe that I am an irreplaceable snowflake (I’m entirely replaceable). There is reason to pause before making this an acceptable part of how we conduct our governance and our internal relationships.

However, while I am not happy to have the burden of this decision being foisted upon all of you, I am content with standing to be judged by this group. As the creative force behind Node.js and the legitimate owners of this project, my respect for you as individuals and as a group and your rightful position as final arbiters of the technical Node.js project makes entirely comfortable living with whatever decision you arrive at regarding my removal.

I will break the rest of this post into the following sections:

  • My critique of the process so far
  • My response to list of complaints made against me to the TSC
  • Addressing the claims often repeated across the internet regarding me as a hinderance to progress on inclusivity and diversity
  • The independence of the technical group, the new threats posed to that independence
  • The threats posed to future leadership of the project

The process so far

My personal experience so far has been approximately as follows:

  • Some time ago I received notification via email that there are complaints against me. No details were provided and I was informed that I would neither receive those details or be involved in the whatever process was to take place. Further, TSC members were not allowed to speak to me directly about these matters, including my work colleagues also on the TSC. I was never provided with an opportunity to understand the specific charges against me or be involved in any discussions on this topic from that point onward.
  • 3 days ago, I saw nodejs/TSC#310 at the same time as the public. This was the first time that I had seen the list of complaints. It was the first that I heard that there was a vote taking place regarding my position.
  • At no point have I been provided with an opportunity to answer to these complaints, correct the factual errors contained in them (see below), apologise and make amends where possible, or provide additional context that may further explain accusations against me.
  • At no point have I been approached by a member of the TSC or CTC regarding any of these items other than what the record that we have here on GitHub shows—primarily in the threads involved and in the moderation repository, the record is open for you to view regarding the due diligence undertaken either by my accusers or those executing the process. I have had interactions with only a single member of the TSC regarding one of these matters in private email and in person which has, on both occasions, involved me attempting to coax out the source of bad feelings that I had sensed and attempting to (relatively blindly) make amends.

I hope you can empathise that to me this process is rather unfair and regardless of whether this process is informed or dictated by our governance documents as has been claimed, it should be changed so that in the future accused parties have the chance to at least respond to accusations.

Response to the list of complaints

I am including the text that was redacted from nodejs/TSC#310 as it is already in the public domain, on social media, also on GitHub and now in the press. Please note that I did not ask for this text to be redacted.

1.

In [link to moderation repository discussion, not copied here out of respect for additional parties involved], Rod’s first action was to apologize to a contributor who had been repeatedly moderated. Rod did not discuss the issue with other members of the CTC/TSC first. The result undermined the moderation process as it was occurring. It also undercut the authority as moderators of other CTC/TSC members.

Rather than delving into the details of this complaint, I will simply say that I was unaware at the time that the actions I had taken were inappropriate and had caused hurt to some CTC/TSC members involved in this matter. Having had this belatedly explained to me (again, something I have had to coax out, not offered freely to me), I issued a private statement to the TSC and CTC via email at the beginning of this month offering my sincere apologies. (I did this without knowing whether it was part of the list of complaints against me.) The most relevant part of my private statement is this:

In relation to my behaviour in the specific: I should not have weighed in so heavily, or at all, in this instance as I lacked so much of the context of what was obviously a very sensitive matter that was being already dealt with by some of you (in a very taxing way, as I understand it). I missed those signals entirely and weighed in without tact, took sides against some of you—apologising to [unnecessary details withheld] on behalf of some of you was an absurd thing for me to do without having being properly involved prior to this. And for this I unreservedly apologise!

I don't know if this apology was acknowledged during the process of dealing with the complaints against me. This apology has neither been acknowledged in the publication of the complaints handling process, nor has it seemed to have any impact on the parties involved who continue to hold it against me. I can only assume that they either dismiss my sincerity or that apologies are not a sufficient means of rectifying these kinds of missteps.

In this matter I accept responsibility and have already attempted to make amends and prevent a similar issue from recurring. It disappoints me that it is still used as an active smear against me. Again, had I been given clear feedback regarding my misstep earlier, I would have attempted to resolve this situation sooner.

2.

In nodejs/board#58 and nodejs/moderation#82 Rod did not moderate himself when asked by another foundation director and told them he would take it to the board. He also ignored the explicit requests to not name member companies and later did not moderate the names out of his comments when requested. Another TSC member needed to follow up later to actually clean up the comments. Additionally he discussed private information from the moderation repo in the public thread, which is explicitly against the moderation policy.

My response to this complaint is as follows:

  1. This thread unfortunately involves a significant amount of background corporate politics, personal relationship difficulties and other matters which conspired to raise the temperature, for me at least. This is not an excuse, simply an explanation for what may have appeared to some to be a heated interjection on my part.
  2. I did edit my post very soon after—I was the first to edit my posts in there after the quick discussion that followed in the moderation repository and I realised I had made a poor judgement call with my choice of words. I both removed my reading of intent into the words of another poster and removed the disclosure of matters discussed in a private forum.
  3. I do not recall being asked to remove the names of the companies involved, I have only now seen that they have been edited out of my post. I cannot find any evidence that such a request was even made. This would have been a trivial matter on my part and I would have done it without argument if I had have seen such a request. To find this forming the basis of a complaint is rather troubling without additional evidence.
  4. A board member asking another board member (me) to edit their postings seemed to me to be a board matter, hence my suggestion to take it to the board. I was subsequently corrected on this—as it is a TSC-owned repository it was therefore referred to the TSC for adjudication.

I considered the remaining specifics of this issue to have been resolved and have not been informed otherwise since this event took place. Yet I now find that the matters are still active and I am the target of criticism rather than that criticism being aimed at the processes that apparently resolved the matter in the first place. Why was I never informed that my part in the resolution was unsatisfactory and why was I not provided a chance to rectify additional perceived misdeeds?

3.

Most recently Rod tweeted in support of an inflammatory anti-Code-of-Conduct article. As a perceived leader in the project, it can be difficult for outsiders to separate Rod’s opinions from that of the project. Knowing the space he is participating in and the values of our community, Rod should have predicted the kind of response this tweet received. https://twitter.com/rvagg/status/887652116524707841

His tweeting of screen captures of immature responses suggests pleasure at having upset members of the JavaScript community and others. As a perceived leader, such behavior reflects poorly on the project. https://twitter.com/rvagg/status/887790865766268928

Rod’s public comments on these sorts of issues is a reason for some to avoid project participation. https://twitter.com/captainsafia/status/887782785221615618

It is evidence to others that Node.js may not be serious about its commitment to community and inclusivity. https://twitter.com/nodebotanist/status/887724138516951049

  1. The post I linked to was absolutely not an anti-Code-of-Conduct article. It was an article written by an Associate Professor of Evolutionary Psychology at the University of New Mexico, discussing free speech in general and suggesting a case against speech codes in American university campuses. In sharing this, I hoped to encourage meaningful discussion regarding the possible shortcomings of some standard Code of Conduct language. My intent was not to suggest that the Node.js project should not have a Code of Conduct in place.
  2. "Rod should have predicted the kind of response this tweet received" is a deeply normative statement. I did not predict the storm generated, and assumed that open discussion on matters of speech policing was still possible, and that my personal views would not be misconstrued as the views of the broader Node.js leadership group or community. I obviously chose the wrong forum. If TSC/CTC members are going to be held responsible for attempting to share or discuss personal views on personal channels, then that level of accountability should be applied equally across technical and Foundation leadership.
  3. "His tweeting of screen captures of immature responses suggests pleasure" is an assumption of my feelings at the time. I find this ironic especially in the context of complaint number 2 (above); I was criticised for reading the intention of another individual into their words yet that’s precisely what is being done here. This claim is absolutely untrue, I do not take pleasure in upsetting people. I will refrain from justifying my actions further on this matter but this accusation is baseless and disingenuous.
  4. To re-state for further clarity, I have not made a case against Codes of Conduct in general, but rather, would like to see ongoing discussion about how such social guidelines could be improved upon, as they clearly have impact on open source project health.
  5. I have never made a case against the Node.js Code of Conduct.
  6. I have a clear voting record for adopting the Node.js project's Code of Conduct and for various changes made to it. Codes of Conduct have been adopted by a number of my own projects which have been moved from my own GitHub account to that of the Node.js Foundation.

I will refrain from further justifying a tweet. As with all of you, I bring my own set of opinions and values to our diverse mix and we work to find an acceptable common space for us all to operate within. I don’t ask that you agree with me, but within reason I hope that mutual respect is stronger than a single disagreement. I cannot accept that my opinions on these matters form a valid reason for my removal. I have submitted myself to our Code of Conduct as a participant in this project. I have been involved in the application of our Code of Conduct. But I do not accept it as a sacred text that is above critique or even discussion.

While not a matter for the TSC or CTC, a Board member on the Foundation who (by their own admission), has repeatedly discussed sensitive and private Board matters publicly on Twitter, causing ongoing consternation and legal concern for the Board. As far as I know, this individual has not been asked to resign. I consider this type of behaviour to be considerably more problematic for the Foundation than my tweeting of a link to an article completely unrelated to Node.js.

Taking action against me on the basis of this tweet, while ignoring the many tweets and other social media posts that stand in direct conflict to the goals of the Foundation by other members of our technical team, its leadership and other members of the Foundation and its various bodies, strikes me as a deeply unequal (and, it must be said, un-inclusive) application of the rules.

If it is the case that the TSC/CTC is setting limits on personal discussion held outside the context of the project repo, then these limits should be applied to all members of both groups without prejudice.

Board accusations

In addition to the above list, we now have new claims from the Node.js Foundation board. It appears to suggest that I have and/or do engage in “antagonistic, aggressive or derogatory behavior”, with no supporting evidence provided. Presumably the supporting evidence is the list in nodejs/TSC#310 to which I have responded with above.

I can’t respond to an unsupported claim such as this, it’s presented entirely without merit and I cannot consider it anything other than malicious, self-serving, and an obvious attempt to emotionally manipulate the TSC and CTC by charging the existing claims with a completely new level of seriousness by the sprinkling of an assortment of stigmatic evil person descriptors.

To say that I am disappointed that a majority of the Board would agree to conduct themselves in such an unprofessional and immature manner is an understatement. However this is neither the time nor place for me to attempt to address their attempts to smear, defame and unperson me. After requesting of me directly that I “fall on my sword” and not receiving the answer it wanted, the Board has chosen to make it clear to where it collectively thinks the high moral ground is in this matter. As I have already expressed to them, I believe they have made a poor assessment of the facts, and have not made the correct choice on their moral stance, and have now stood by and encouraged additional smears against me.

I will have more to say on the Board’s role and our relationship to it below, however.

That I am a barrier to inclusivity efforts

This is a refrain that is often repeated on social media about me and it's never been made clear, to me at least, how this is justified.

By most objective measures, the Node.js project has been healthier and more open to outsiders during my 2-year tenure in leadership than at any time in its history. One of the great pleasures I've had during this time has been in showing and celebrating this on the conference circuit. We have record numbers of contributors overall, per month overall and unique per month. Our issue tracker is so busy with activity that very few of us can stay subscribed to the firehose any more. We span the globe such that our core and working group meetings are very difficult to schedule and usually have to end up leaving people out. We regularly have to work to overcome language and cultural barriers as we continue to expand.

When I survey the contributor base, the collaborator list, the CTC membership, I see true diversity across many dimensions. Claims that I am a barrier to inclusivity and the building of a diverse contributor base are at odds with the prominent role I've had in the project during its explosive growth.

My assessment of the claim that I am a hindrance to inclusivity efforts is that it hinges on the singular matter of moderation and control of discourse that occurs amongst the technical team. From the beginning I have strongly maintained that the technical team should retain authority over its own space. That its independence also involves its ability to enforce the rules of social interaction and discussion as it sees fit. This has lead to disagreements with individuals that would rather insert external arbiters into the moderation process; arbiters who have not earned the right to stand in judgement of technical team members, and have not been held to the same standards by which technical team members are judged to earn their place in the project.

On this matter I remain staunchly opposed to the dilution of independence of the technical team and will continue to advocate for its ability to make such critical decisions for itself. This is not only a question of moral (earned) authority, but of the risk of subversion of our organisational structures by individuals who are attracted to the project by the possibility of pursuing a personal agenda, regardless of the impact this has on the project itself. I see current moves in this direction, as in this week’s moderation policy proposal at nodejs/TSC#276, as presenting such a risk. I don't expect everyone to agree with me on this, but I have just as much right as everyone else to make my case and not be vilified in my attempts to convince enough of the TSC to prevent such changes.

Further, regarding other smears against my character that now circulate regularly on social media and GitHub. I would ask that if you are using any of these as the basis of your judgement against me, please ask for supporting evidence of those making or repeating such smears. It's been an educational experience to watch a caricatured narrative about my character grow into the monster that it is today, and it saddens me when people I respect take this narrative at face value without bothering to scratch the surface to see if there is any basis in fact.

The use of language such as “systematic” and “pattern” to avoid having to outline specifics should be seen for what they are: baseless smears. I have a large body of text involving many hundreds of social interactions scattered through the Node.js project and its various repositories on GitHub. If any such “systematic” behavioural problems exist then it should not be difficult to provide clear documentation of them.

Threats to the independence of the technical group

We now face the unprecedented move by the Node.js Foundation Board to inject itself directly in our decision-making process. The message being: the TSC voted the wrong way, they should do it again until you get the “right” outcome.

This echoes the sentiment being expressed in the Community Committee and elsewhere, that since there were accusations, there must be guilt and the fault lies in the inability of the TSC to deal with that guilt. With no credence paid to the possibility that perhaps the TSC evaluated the facts and reached a consensus that no further action was necessary.

I have some sympathy for the position of the Node.js Foundation board. These are tough times in the Silicon Valley environment, particularly with the existing concerns surrounding diversity, inclusivity, and tolerance. I can understand how rumors of similarly unacceptable behavior can pose a threat, even absent any evidence of such behavior. That said, I do not believe that it is in the long-term interests of Node.js or its Foundation to pander to angry mobs, as they represent a small fraction of our stakeholders and their demands are rarely rational. In this case, I believe that a majority of outsiders will be viewing this situation with bemusement at best. It saddens me that there is no recognition of the fact that appeasing angry and unverified demands by activists only leads to greater demands and less logical discussion of these issues. If we accept this precedent then we place the future health of this project in jeopardy, as we will have demonstrated that we allow outsiders to adjust our course to suit personal or private agendas, as long as they can concoct a story to create outrage and dispense mob justice without reproach.

While difficult, I believe that it is important for the technical team to continue to assert its independence, to the board and to other outside influences. We are not children who need adult supervision; treating us as such undermines so much of what we have built over these last few years and erodes the feelings of ownership of the project that we have instilled in our team of collaborators.

The threat to future leadership of the project

Finally, I want to address a critical problem which has been overlooked, but now poses a big problem for our future: how to grow, enable and support leadership in such a difficult environment.

My tenure in leadership easily represents the most difficult years of my life. The challenges I have had to face have forced me to grow in ways I never expected. I'm thankful for the chance to meet these challenges, however, and even though it's taken a toll on my health, I'll be glad to have had the experience when I look back.

One of my tasks as a leader, particularly serving in the role of bridge between the Board and the technical team, has involved maintaining that separation and independence but also shielding the technical team from the intense corporate and personal politics that constantly exists and is being exercised within, and around the Foundation. This role forced me to take strong positions on many issues and to stand up to pressure applied from many different directions. In doing what I felt was best to support my technical team members I’m sure I’ve put people off-side—that's an unfortunate consequence of good intentions, but not an uncommon one. I wouldn't say I've made enemies so much as had to engage in very difficult conversations and involve myself in the surfacing of many disagreements that are difficult and sometimes impossible to resolve.

Having to involve yourself in a wide variety of decision-making processes inevitably requires that you make tough calls or connect yourself in some way to controversial discussions. I'm sure our current leadership can attest to the awkward positions they have found themselves in, and the difficult conversations they have had to navigate, including this one!

I'll never pretend I don't have limitations in the skills, both intellectually and emotionally, required to navigate through these tough waters. But when I consider the sheer number of dramas, controversies, and difficult conversations I've had to be involved in—and when I consider the thousands of pages of text I have left littered across GitHub and the other forums we use to get things done—I come to this conclusion: If the best reason you can find force my resignation is the above list of infractions, given the weight of content you could dredge through, then you're either not trying very hard or I should be pretty proud of myself for keeping a more level head than I had imagined.

That aside, my greatest concern for the role of leadership coming as a consequence of the actions currently being pursued, is that we've painted ourselves into a corner regarding the leaders we're going to have available. The message that the Board has chosen to send today can be rightly interpreted as this: if the mob comes calling, if the narrative of evil is strong enough, regardless of the objective facts, the Foundation does not have your back. As developers and leaders, the Foundation is signalling that they will not stand up for us when things get tough. Combine this with a difficult and thankless job, where the result of exercising your duties could be career-killing, the only path forward for leadership is that we will likely only have:

  • Individuals who are comfortable giving in to the whims of the outside activists, whatever the demands, slowly transforming this project into something entirely different and focused on matters not associated with making Node.js a success
  • Individuals who are capable but shrewd enough to avoid responsibility
  • Individuals who are capable and take on responsibility, exercise backbone when standing against pressure groups and mob tactics but get taken down because the support structures either abandon them or turn against them

This kind of pattern is being evidenced across the professionalised open source sphere, with Node.js about to set a new low bar. Do not be surprised as quality leaders become more difficult to find or become unconvinced that the exercise of leadership duties is at all in their personal interest.

This is a great challenge for modern open source and I'm so sad that I am being forced to be involved in the setting of our current trajectory. I hope we can find space in the future to have the necessary dialog to find a way out of the hole being dug.

In summary

Obviously I hope that you agree that (a) this action against me is unwarranted, is based on flawed and/or irrelevant claims of “misbehaviour” and is based in malicious intent, and that (b) allowing this course of action to be an acceptable part of our governance procedures will have detrimental consequences for the future health of the project.

I ask the CTC to reject this motion, for the TSC to reject the demand by the Board for my suspension, and that we as a technical team send a signal that our independence is critical to the success of the project, despite the accusations of an angry mob.

Thank you if you dignified my words by reading this far!

Why I don't use Node's core 'stream' module

This article was originally offered to nearForm for publishing and appeared for some time on their blog from early 2014 (at this URL: http://www.nearform.com/nodecrunch/dont-use-nodes-core-stream-module). It has since been deleted. I'd rather not speculate about the reasons for the deletion but I believe the article contains a very important core message so I'm now republishing it here.

TL;DR

The "readable-stream" package available in npm is a mirror of the Streams2 and Streams3 implementations in Node-core. You can guarantee a stable streams base, regardless of what version of Node you are using, if you only use "readable-stream".

The good 'ol days

Prior to Node 0.10, implementing a stream meant extending the core Stream object. This object was simply an EventEmitter that added a special pipe() method to do the streaming magic.

Implementing a stream usually started with something like this:

var Stream = require('stream').Stream
var util = require('util')

function MyStream () {
  Stream.call(this)
}

util.inherits(MyStream, Stream)

// stream logic, implemented however you want

If you ever had to write a non-trivial stream implementation for pre-Node 0.10 without using a helper library (such as through), you know what a nightmare the state-management it can be. The actual implementation of a custom stream is a lot more than just the above code.

Welcome to Node 0.10

Thankfully, Streams2 came along with a brand new set of base Stream implementations that do a whole lot more than pipe(). The biggest win for stream implementers comes from the fact that state-management is almost entirely taken care of for you. You simply need to provide concrete implementations of some abstract methods to make a fully functional stream, even for non-trivial workloads.

Implementing a stream now looks something like this:

var Readable = require('stream').Readable
// `Stream` is still provided for backward-compatibility
// Use `Writable`, `Duplex` and `Transform` where required
var util = require('util')

function MyStream () {
  Readable.call(this, { /* options, maybe `objectMode:true` */ })
}

util.inherits(MyStream, Readable)

// stream logic, implemented mainly by providing concrete method implementations:

MyStream.prototype._read = function (size) {
  // ... 
}

State-management is handled by the base-object and you interact with internal methods, such as this.push(chunk) in the case of a Readable stream.

While the internal streams implementations are an order-of-magnitude more complex than the previous core-streams implementation, most of it is there to make life an order-of-magnitude easier for those of us implementing custom streams. Yay!

Backward-compatibility

When every new major stable release of Node occurs, anyone releasing public packages in npm has to make a decision about which versions of Node they support. As a general rule, the authors of the most popular packages in npm will support the current stable version of Node and the previous stable release.

Streams2 was designed with backwards-compatibility in mind. Streams using require('stream').Stream as a base will still mostly work as you'd expect and they will also work when piped to streams that extend the other classes. Streams2 streams won't work like classic EventEmitter objects when you pipe them together, as old-style streams do. But when you pipe a Streams2 stream and an old-style EventEmitter-based stream together, Streams2 will fall-back to "compatibility-mode" and operate in a backward-compatible way.

So Streams2 are great and mostly backward-compatible (aside from some tricky edge cases). But what about when you want to implement Streams2 and run on Node 0.8? And what about open source packages in npm that want to still offer Node 0.8 compatibility while embracing the new Streams2-goodness?

"readable-stream" to the rescue

During the 0.9 development phase, prior to the 0.10 release, Isaac developed the new Streams2 implementation in a package that was released in npm and usable on older versions of Node. The readable-stream package is essentially a mirror of the streams implementation of Node-core but is available in npm. This is a pattern we will hopefully be seeing more of as we march towards Node 1.0. Already there is a core-util-is package that makes available the shiny new is type-checking functions in the 0.11 core 'util' package.

readable-stream gives us the ability to use Streams2 on versions of Node that don't even have Streams2 in core. So a common pattern for supporting older versions of Node while still being able to hop on the Streams2-bandwagon starts off something like this, assuming you have "readable-stream" as a dependency:

var Readable = require('stream').Readable || require('readable-stream').Readable

This works because there is no Readable object on the core 'stream' package in 0.8 and prior, so if you are running on an older version of Node it skips straight to the "readable-stream" package to get the required implementation.

Streams3: a new flavour

The readable-stream package is still being used to track the changes to streams coming in 0.12. The upcoming Streams3 implementation is more of a tweak than a major change. It contains an attempt to make "compatibility mode" more of a first-class citizen of the API and also some improvements to pause/resume behaviour.

Like Streams2, the aim with Streams3 is for backward (and forward) compatibility but there are limits to what can be achieved on this front.

While this new streams implementation will likely be an improvement over the current Streams2 implementation, it is part of the unstable development branch of Node and is so far not without its edge cases which can break code designed against the pure 0.10 versions of Streams2.

What is your base implementation?

Looking back at the code used to fetch the base Streams2 implementation for building custom streams, let's consider what we're actually getting with different versions of Node:

var Readable = require('stream').Readable || require('readable-stream').Readable
  • Node 0.8 and prior: we get whatever is provided by the readable-stream package in our dependencies.
  • Node 0.10: we get the particular version of Streams2 that comes with the version of Node we're using.
  • Node 0.11: we get the particular version of Streams3 that comes with the version of Node we're using.

This may not be interesting if you have full control over all deployments of your custom stream implementations and which version(s) of Node they will be used on. But it can cause some problems in the case of open source libraries distributed via npm with users still stuck on 0.8 (for some, the upgrade path is not an easy one for various reasons), 0.10 and even people trying out some of the new Node and V8 features available in 0.11.

What you end up with is a very unstable base upon which to build your streams implementation. This is particularly acute since the vast bulk of the code used to construct the stream logic is coming from either Node-core or the readable-stream package. Any bugs fixed in later Node 0.10 releases will obviously still be present for people still stuck on earlier 0.10 releases even if the readable-stream dependency has the fixed version.

Then, when your streams code is run on Node 0.11, suddenly it's a Streams3 stream which has slightly different behaviour to what most of your users are experiencing.

One of the ways these subtle differences are exposed is in bug reports. Users may report a bug that only occurs on their particular combination of core-streams and readable-stream and it may not be obvious that the problem is related to base-stream implementation edge-cases they are stumbling upon; wasting time for everyone.

And what about stability? The fragmentation introduced by all of the possible combinations means that your otherwise stable library is having instability foisted upon it from the outside. This is one of the costs of relying on a featureful standard-library (core) within a rapidly developing, pre-v1 platform. But we can do something about it by taking control of the exact version of the base streams objects we want to extend regardless of what is bundled in the version of Node being used. readable-stream to the rescue!

Taking control

To control exactly what code your streams implementation is building on, simply pin the version of readable-stream and use only it, avoiding require('stream') completely. Then you get to make the choice when to upgrade to Streams3, even if that's some time after Node 0.12.

readable-stream comes in two major versions, v1.0.x and v1.1.x. The former tracks the Streams2 implementation in Node 0.10, including bug-fixes and minor improvements as they are added. The latter tracks Streams3 as it develops in Node 0.11; we may see a v1.2.x branch for Node 0.12.

Any library worth using should be following the basics of semver minor and patch versions (the merits and finer points of major versioning are still something worth debating). readable-stream gives you proper patch-level versioning so if you pin to "~1.0.0" you'll get the latest Node 0.10 Streams2 implementation, including any fixes and minor non-breaking improvements. The patch-level version of 1.0.x and 1.1.x should mirror the patch-level versions of Node core releases as we proceed.

When you're ready to start using Streams3 you can pin to "~1.1.0", but you should hold off until much closer to Node 0.12, if not after its formal release.

Small core FTW!

Being able to control precisely the versions of dependencies your code uses reduces the scope for bugs introduced by version incompatibilities or new and unproven implementations.

When we rely on a bulky standard-library to build our libraries and applications, we're relying on a shifting sand that we have little control over. This is particularly a problem for open source libraries whose users have legitimate (and sometimes not-so-legitimate) reasons for using versions that you'd rather not have to support.

Streams2 is a powerful abstraction, but the implementation is far from simple. The Streams2 code is some of the most complex JavaScript you'll find in Node core. Unless you want to have a detailed understanding of how they work and be able to track the changes as they develop, you should pin your Streams2 dependency in the same way as you pin all your other dependencies. Opt for readable-stream over what Node-core offers:

{
  "name": "mystream",
  ...
  "dependencies": {
    "readable-stream": "~1.0.0"
  }
}
var Readable = require('readable-stream').Readable
var util = require('util')

function MyStream () {
  Readable.call(this)
}

util.inherits(MyStream, Readable)

MyStream.prototype._read = function (size) {
  // ... 
}

Addendum: "through2"

If the boilerplate of the Streams2 base objects ("classes") is too much for you or triggers some past-life Java PTSD, you can just opt for the "through2" package in npm to get the job done.

through2 is based on Dominic Tarr's through but is built for Streams2, whereas "through" is a pure Streams1 style. The API isn't quite the same but the flexibility and simplicity is.

through2 gives you a DuplexStream as a base to implement any kind of stream you like, be it as purely readable, purely writable or a fully duplex stream. In fact, you can even use through2 to implement a PassThrough stream by not providing an implementation!

From the examples:

var through2 = require('through2')

fs.createReadStream('ex.txt')
  .pipe(through2(function (chunk, enc, callback) {

    for (var i = 0; i < chunk.length; i++)
      if (chunk[i] == 97)
        chunk[i] = 122 // swap 'a' for 'z'

    this.push(chunk)

    callback()

   }))
  .pipe(fs.createWriteStream('out.txt'))

Or an object stream:

fs.createReadStream('data.csv')
  .pipe(csv2())
  .pipe(through2.obj(function (chunk, enc, callback) {

    var data = {
        name    : chunk[0]
      , address : chunk[3]
      , phone   : chunk[10]
    }

    this.push(data)

    callback()

  }))
  .on('data', function (data) {
    all.push(data)
  })
  .on('end', function () {
    doSomethingSpecial(all)
  })

NodeSchool comes to Australia

NodeSchool has its genesis ultimately at NodeConf 2013 where @substack introduced us to stream-adevnture. I took the concept home to CampJS and wrote learnyounode for the introduction to Node.js workshop I was to run. As part of the process I extracted a package called workshopper to do the work of making a terminal workshop experience. Most of the logic originally came from stream-adventure. A short time after CampJS, I created levelmeup for NodeConf.eu and suddenly we had a selection of what have come to be known as "workshoppers". At NodeConf.eu, @brianloveswords suggested the NodeSchool concept and registered the domain, @substack provided the artwork and the ball was rolling.

Today, workshopper is depended on by 22 packages in npm, most of which are workshoppers that you can install and use to learn aspects of Node.js or JavaScript in general. The curated list of usable workshoppers is maintained at nodeschool.io.

learnyounode itself is now being downloaded at a rate of roughly 200 per day. That's at least 200 more people each day wanting to learn how to do Node.js.

learnyounode downloads

I can't recall exactly how "NodeSchool IRL" events started but it was probably @maxogden who has been responsible for a large number of these events. There have now been over 30 of these events around the world and the momentum is only picking up steam. The beauty of this format is that it's low-cost and low-effort to make it happen. All you need is a venue where nerds can show up with their computers and some basic guidance. There have even been a few events that were without experienced Node.js mentors but that's no great barrier as the lessons are largely self-guided and work particularly well when pairs or groups of people work together on solutions.

NodeSchool comes to Australia

It's surprising that so far, given all of the NodeSchool activity around the world, that we haven't had a single NodeSchool event in Australia. CampJS had learnyounode last year and this year there were 3 brand new workshoppers introduced there, so it's the closest thing we've had.

Next weekend, on the 21st of June, we are attempting a coordinated Australian NodeSchool event. At the moment, that coordination amounts to having events hosted in Sydney, Melbourne and Hobart, unfortunately the timing has been difficult for Brisbane and we haven't managed to bring anyone else out of the woodwork. But, we will be attempting to do this regularly, plus we'd like to encourage meet-up hosts to use the format now and again with their groups.

NodeSchool in Sydney

I'll be at NodeSchool in Sydney next weekend. It will be proudly hosted by NICTA who have a space for up to 60 people. NICTA are currently doing some interesting work with WebRTC, you should catch up with @DamonOehlman if this is something you're interested in. Tabcorp will also be a major sponsor of the event. Tabcorp have been building a new digital team with the back-end almost entirely in Node.js. They are also doing a great job engaging and contributing to existing and new open source projects. They are also hiring so be sure to catch up with @romainprieto if you're doing PHP, Java or some other abomination and want to be doing Node!

Thanks to the sponsorship, we'll be able to provide some catering for the event. Currently we're looking at providing lunch but we may be expanding that to providing some breakfast treats. We'll also be providing refreshments for everyone attending throughout the day.

Start time is 9.30am, end is 4pm. The plan is to spend the first half of the day doing introductory Node.js which will mainly mean working through learnyounode. The second half of the day will be less structured and we'll encourage attendees to work on other workshoppers that they find interesting. Thankfully we have some amazing Node.js programmers in Sydney and they'll be available as mentors.

We are currently selling tickets for $5, the money will contribute towards the event, there is no profit involved here. We don't need to charge for the event but given the generally dismal turnout for tech meet-ups that are free we feel that providing a small commitment barrier will help us maximise the use of the space we have available. If the money is a barrier for you please contact us! We don't want anyone to miss out. Also, we have special "mentor" tickets available for experienced Node.js programmers who are able to assist. If you think you fit into this category please contact us also.

You can sign up for Sydney NodeSchool at https://ti.to/nodeschool/sydney-june-2014/. If you are tempted, don't sit on the fence because spots are limited and as of writing the tickets are almost 1/2 gone.

NodeSchool in Melbourne

NodeSchool in Melbourne is being supported by ThoughtWorks who have been doing Node in Australia for a while now. If you're interested in their services or want to chat about employment opportunities you should catch up with Liauw Fendy.

@sidorares is putting in a large amount of the legwork in to Melbourne's event. He was a major contributor to the original learnyounode and is a huge asset to the Melbourne Node community. Along with Andrey, Melbourne has large number of expert Node.js hackers, many of who will be available as mentors. This will be a treat for Melbournians so this is not something you should miss out on if you are in town! Potential mentors should contact Andrey.

You can sign up for Melbourne NodeSchool at https://ti.to/nodeschool/melbourne-june-2014/.

NodeSchool in Hobart

Hobart is lucky to have @joshgillies, a local tech-community organiser responsible for many Tasmanian web and JavaScript events. The event is being hosted at The Typewriter Factory, a business workspace that Josh helps run. Sponsorship is being provided by ACS who will be helping support the venue and also provide some catering.

You can sign up for Hobart NodeSchool at https://ti.to/nodeschool/hobart-june-2014/.

Testing code against many Node versions with Docker

I haven't found reason to play with Docker until now, but I've finally came up with an excellent use-case.

NAN is a project that helps build native Node.js add-ons while maintaining compatibility with Node and V8 from Node versions 0.8 onwards. V8 is currently undergoing major internal changes which is making add-on development very difficult; NAN's purpose is to abstract that pain. Instead of having to manage the difficulties of keeping your code compatible across Node/V8 versions, NAN does it for you. But this means that we have to be sure to keep NAN tested and compatible with all of the versions it claims to support.

Travis can help a little with this. It's possible to use nvm to test across different versions of Node, we've tried this with NAN (see the .travis.yml). Ideally you'd have better choice of Node versions, but Travis have had some difficulty keeping up. Also, npm bugs make it difficult, with a high failure rate from npm install problems, like this and this, so I don't even publish the badge on the NAN README.

The other problem with Travis is that it's a CI solution, not a proper testing solution. Even if it worked well, it's not really that helpful in the development process, you need rapid feedback that your code is working on your target platforms (this is one reason why I love back-end development more than front-end development!)

Enter Docker and DNT

DNT: Docker Node Tester

Docker is a tool that simplifies the use of Linux containers to create lightweight, isolated compute "instances". Solaris and its variants have had this functionality for years in the form of "zones" but it's a fairly new concept for Linux and Docker makes the whole process a lot more friendly.

DNT contains two tools that work with Docker and Node.js to set-up containers for testing and run your project's tests in those containers.

DNT includes a setup-dnt script that sets up the most basic Docker images required to run Node.js applications, nothing extra. It first creates an image called "dev_base" that uses the default Docker "ubuntu" image and adds the build tools required to compile and install Node.js

Next it creates a "node_dev" image that contains a complete copy of the Node.js source repository. Finally, it creates a series of images that are required; for each Node version, it creates an image with Node installed and ready to use.

Setting up a project is a matter of creating a .dntrc file in the root directory of the project. This configuration file involves setting a NODE_VERSIONS variable with a list of all of the versions of Node you want to test against, and this can include "master" to test the latest code from the Node repository. You also set a TEST_CMD variable with a series of commands required to set up, compile and execute your tests. The setup-dnt command can be run against a .dntrc file to make sure that the appropriate Docker images are ready. The dnt command can then be used to execute the tests against all of the Node versions you specified.

Since Docker containers are completely isolated, DNT can run tests in parallel as long as the machine has the resources. The default is to use the number of cores on the computer as the concurrency level but this can be configured if not appropriate.

Currently DNT is designed to parse TAP test output by reading the final line as either "ok" or "not ok" to report test status back on the command-line. It is configurable but you need to supply a command that will transform test output to either an "ok" or "not ok" (sed to the rescue?).

How I'm using it

My primary use-case is for testing NAN. The test suite needs a lot of work so being able to test against all the different V8 and Node APIs while coding is super helpful; particularly when tests run so quickly! My NAN .dntrc file tests against master, all of the 0.11 releases since 0.11.4 (0.11.0 to 0.11.3 are explicitly not supported by NAN) and the last 5 releases of the 0.10 and 0.8 series. At the moment that's 17 versions of Node in all and on my computer the test suite takes approximately 20 seconds to complete across all of these releases.

The NAN .dntrc

NODE_VERSIONS="\
  master   \
  v0.11.9  \
  v0.11.8  \
  v0.11.7  \
  v0.11.6  \
  v0.11.5  \
  v0.11.4  \
  v0.10.22 \
  v0.10.21 \
  v0.10.20 \
  v0.10.19 \
  v0.10.18 \
  v0.8.26  \
  v0.8.25  \
  v0.8.24  \
  v0.8.23  \
  v0.8.22  \
"
OUTPUT_PREFIX="nan-"
TEST_CMD="\
  cd /dnt/test/ &&                                               \
  npm install &&                                                 \
  node_modules/.bin/node-gyp --nodedir /usr/src/node/ rebuild && \
  node_modules/.bin/tap js/*-test.js;                            \
"

Next I configured LevelDOWN for DNT. The needs are much simpler, the tests need to do a compile and run a lot of node-tap tests.

The LevelDOWN .dntrc

NODE_VERSIONS="\
  master   \
  v0.11.9  \
  v0.11.8  \
  v0.10.22 \
  v0.10.21 \
  v0.8.26  \
"
OUTPUT_PREFIX="leveldown-"
TEST_CMD="\
  cd /dnt/ &&                                                    \
  npm install &&                                                 \
  node_modules/.bin/node-gyp --nodedir /usr/src/node/ rebuild && \
  node_modules/.bin/tap test/*-test.js;                          \
"

Another native Node add-on that I've set up with DNT is my libssh bindings. This one is a little more complicated because you need to have some non-standard libraries installed before compile. My .dntrc adds some extra apt-get sauce to fetch and install those packages. It means the tests take a little longer but it's not prohibitive. An alternative would be to configure the node_dev base-image to have these packages to all of my versioned images have them too.

The node-libssh .dntrc

NODE_VERSIONS="master v0.11.9 v0.10.22"
OUTPUT_PREFIX="libssh-"
TEST_CMD="\
  apt-get install -y libkrb5-dev libssl-dev &&                           \
  cd /dnt/ &&                                                            \
  npm install &&                                                         \
  node_modules/.bin/node-gyp --nodedir /usr/src/node/ rebuild --debug && \
  node_modules/.bin/tap test/*-test.js --stderr;                         \
"

LevelUP isn't a native add-on but it does use LevelDOWN which requires compiling. For the DNT config I'm removing node_modules/leveldown/ prior to npm install so it gets rebuilt each time for each new version of Node.

The LevelUP .dntrc

NODE_VERSIONS="\
  master   \
  v0.11.9  \
  v0.11.8  \
  v0.10.22 \
  v0.10.21 \
  v0.8.26  \
"
OUTPUT_PREFIX="levelup-"
TEST_CMD="\
  cd /dnt/ &&                                                    \
  rm -rf node_modules/leveldown/ &&                              \
  npm install --nodedir=/usr/src/node &&                         \
  node_modules/.bin/tap test/*-test.js --stderr;                 \
#"

What's next?

I have no idea but I'd love to have helpers flesh this out a little more. It's not hard to imagine this forming the basis of a local CI system as well as a general testing tool. The speed even makes it tempting to run the tests on every git commit, or perhaps on every save.

If you'd like to contribute to development then please submit a pull request, I'd be happy to discuss anything you might think would improve this project. I'm keen to share ownership with anyone making significant contributions; as I do with most of my open source projects.

See the DNT GitHub repo for installation and detailed usage instructions.

LevelDOWN v0.10 / managing GC in native V8 programming

LevelDB

Today we released version 0.10 of LevelDOWN. LevelDOWN is the package that directly binds LevelDB into Node-land. It's mainly C++ and is a fairly raw & direct interface to LevelDB. LevelUP is the package that we recommend most people use for LevelDB in Node as it takes LevelDOWN and makes it much more Node-friendly, including the addition of those lovely ReadStreams.

Normally I wouldn't write a post about a minor release like this but this one seems significant because of a number of small changes that culminate in a relatively major release.

In this post:

  • V8 Persistent references
  • Persistent in LevelDOWN; some removed, some added
  • Leaks!
  • Snappy 1.1.1
  • Some embarrassing bugs
  • Domains
  • Summary
  • A final note on Node 0.11.9

V8 Persistent references

The main story of this release are v8::Persistent references. For the uninitiated, V8 internally has two different ways to track "handles", which are references to JavaScript objects and values currently active in a running program. There are Local references and there are Persistent references. Local references are the most common, they are the references you get when you create an object or pass them around within a function and do the normal work that you do with an object. Persistent references are a special case that is all about Garbage Collection. An object that has at least one active Persistent reference to it is not a candidate for garbage collection. Persistent references must be explicitly destroyed before they release the object and make it available to the garbage collector.

Prior to V8 3.2x.xx (I don't know the exact version, does it matter? It roughly corresponds to Node v0.11.3.), these handles were both as easy as each other to create and interchange. You could swap one for the other whenever you needed. My guess is that the V8 team decided that this was a little too easy and that a major cause for memory leaks in C++ V8 code was the ease at which you could swap a Local for a Persistent and then forget to destroy the Persistent. So they tweaked the "ease" equation and it's become quite difficult.

Persistent and Local no longer share the same type hierarchy and the way you instantiate and assign a Persistent has become quite awkward. You now have to go through enough gymnastics to create a Persistent that it makes you ask the question: "Do I really need this to be a Persistent?" Which I guess is a good thing for memory leaks. NAN to the rescue though! We've somewhat papered over those difficulties with the capabilities introduced in NAN, it's still not as easy as it once was but it's not a total headache.

So, you understand v8::Persistent now? Great, so back to LevelDOWN.

Persistent in LevelDOWN; some removed, some added!

Some removed

Recently, Matteo noticed that when you're performing a Batch() operation in LevelDB, there is an explicit copy of the data that you're feeding in to that batch. When you construct a Batch operation in LevelDB you start off with a short string representing the batch and then build on that string as you build your batch with both Put() and Del() operations. You end up with a long string containing all of your write data; keys and values. Then when you call Write() on the Batch, that string gets fed directly into the main LevelDB store as a single write—which is where the atomicity of Batch comes from.

Both the chained-form and array-form batch() operations work this way internally in LevelDOWN.

However, with almost all operations in LevelDOWN, we perform the actual writes and reads against LevelDB in libuv worker threads. So we have to create the "descriptor" for work in the main V8 Node thread and then hand that off to libuv to perform the work in a separate thread. Once the work is completed we get the results back in the main V8 Node thread from where we can trigger a callback. This is where Persistent references come in.

Before we hand off the work to libuv, we need to make Persistent references to any V8 object that we want to survive across the asynchronous operation. Obviously the main candidate for this is callback functions. Consider this code:

db.get('foo', function (err, value) {
  console.log('foo = %s', value)
})

What we've actually done is create an anonymous closure for our callback. It has nothing referencing it, so as far as V8 is concerned it's a candidate for garbage collection once the current thread of execution is completed. In Node however, we're doing asynchronous work with it and need it to survive until we actually call it. This is where Persistent references come in. We receive the callback function as a Local in our C++ but then assign it to a Persistent so GC doesn't touch it. Once we're done our async work we can call the function and destroy the Persistent, effectively turning it back in to a Local and freeing it up for GC.

Without the Persistent then the behaviour is indeterminate. It depends on the version of V8, the GC settings, the workload currently in the program and the amount of time the async work takes to complete. If the GC is aggressive enough and has a chance to run before our async work is complete, the callback will disappear and we'll end up trying to call a function that no longer exists. This can obviously lead to runtime errors and will most likely crash our program.

In LevelDOWN, if you're passing in String objects for keys and values then to pull out the data and turn it in to a form that LevelDB can use we have to do an explicit copy. Once we've copied the data from the String then we don't need to care about the original object and GC can get its hands on it as soon as it wants. So we can leave String objects as Local references while we are building the descriptor for our async work.

Buffer objects are a different matter all together. Because we have access to the raw character array of a Buffer, we can feed that data straight in to LevelDB and this will save us one copy operation (which can be a significant performance boost if the data is significantly large or you're doing lots of operations—so prefer Buffers where convenient if you need higher perf). When building the descriptor for the async work, we are just passing a character array to the LevelDB data structures that we're setting up. Because the data is shared with the original Buffer we have to make sure that GC doesn't clean up that Buffer before we have a chance to use the data. So we make a Persistent reference for it which we clean up after the async work is complete. So you can do this without worrying about GC:

db.put(
    new Buffer('foo')
  , require('crypto').randomBytes(1024)
  , function (err) {
      console.log('foo is now some random data!')
    }
)

This has been the case in LevelDOWN for all operations since pretty much the beginning. But back to Matteo's observation. If LevelDB's data structures perform an explicit copy on the data we feed it then perhaps we don't need to keep the original data safe from GC? For a batch() call it turns out that we don't! When we're constructing the Batch descriptor, as we feed in data to it, both Put() and Del(), it's taking a copy of our data to create its internal representation. So even when we're using Buffer objects on the JavaScript side, we're done with them before the call down in to LevelDOWN is completed so there's no reason to save a Persistent reference! For other operations we're still doing some copying during the asynchronous cycle but the removal of the overhead of creating and deleting Persistent references for batch() calls is fantastic news for those doing bulk data loading (like Max Ogden's dat project which needs to bulk load a lot of data).

Some added

Another gem from Matteo was reports of crashes during certain batch() operations. Difficult to reproduce and only under very particular circumstances, it seems to be mostly reproducible by the kinds of workloads generated by LevelGraph. Thanks to some simple C++ debugging we traced it to a dropped reference, obviously by GC. The code in question boiled down to something like this:

function doStuff () {
  var batch = db.batch()
  batch.put('foo', 'bar')
  batch.write(function (err) {
    console.log('done', err)
  })
}

In this code, the batch object is actually a LevelDOWN Batch object created in C++-land. During the write() operation, which is asynchronous, we end up with no hard references to batch in our code because the JS thread has yieled and moved on and the batch is contained within the scope of the doStuff() function. Because most of the asynchronous operations we perform are relatively quick, this normally doesn't matter. But for writes to LevelDB, if you have enough data in your write and you have enough data already in your data store, you can trigger a compaction upstream which can delay the write which can give V8's GC time to clean up references that might be important and for which you have no Persistent handles.

In this case, we weren't actually creating internal Persistent references for some of our objects. Batch in this case but also Iterator. Normally this isn't a problem because to use these objects you generally keep references to them yourself in your own code.

We managed to debug Matteo's crash by adjusting his test code to look something like this and watching it succeed without a crash:

function doStuff () {
  var batch = db.batch()
  batch.put('foo', 'bar')
  batch.write(function (err) {
    console.log('done', err)
    batch.foo = 'bar'
  })
}

By reusing batch inside our callback function, we're creating some work that V8 can't optimise away and therefore has to assume isn't a noop. Because the batch variable is also now referenced by the callback function and we already have an internal Persistent for it, GC has to pass over batch until the Persistent is destroyed for the callback.

So the solution is simply to create a Persistent for the internal objects that need to survive across asynchronous operations and make no assumptions about how they'll be used in JavaScript-land. In our case we've gone for assigning a Persistent just prior to every asynchronous operation and destroying it after. The alternative would be to have a Persistent assigned upon the creation of objects we care about but sometimes we want GC to do its work:

function dontDoStuff () {
  var batch = db.batch()
  batch.put('foo', 'bar')
  // nothing else, wut?
}

I don't know why you would write that code but perhaps you have a use-case where you want the ability to start constructing a batch but then decide not to follow through with it. GC should be able to take care of your mess like it does with all of the other messes you create in your daily adventures with JavaScript.

So we are only assigning a Persistent when you do a write() with a chained-batch operation in LevelDOWN since it's the only asynchronous operation. So in dontDoStuff() GC will come along and rid us of batch, 'foo' and 'bar' when it has the next opportunity and our C++ code will have the appropriate destructors called that will clean up any other objects we have created along the way, like the internal LevelDB Batch with its copy of our data.

Leaks!

We've been having some trouble with leaks in LevelUP/LevelDOWN lately (LevelDOWN/#171, LevelGraph/#40). And it turns out that these leaks aren't related to Persistent references, which shouldn't be a surprise since it's so easy to leak with non-GC code, particularly if you spend most of your day programming in a language with GC.

With the help of Valgrind we tracked the leak down to the omission of a delete in the destructor of the asynchronous work descriptor for array-batch operations. The internal LevelDB representation of a Batch wasn't being cleaned up unless you were using the chained-form of LevelDOWN's batch(). This one has been dogging us for a few releases now and it's been a headache particularly for people doing bulk-loading of data so I hope we can finally put it behind us!

Snappy 1.1.1

Google released a new version of Snappy, version 1.1.1. I don't really understand how Google uses semver; we get very simple LevelDB releases with the minor version bumped and then we get versions of Snappy released with non-trivial changes with only the patch version bumped. I suspect that Google doesn't know how it uses semver either and there's no internal policy on it.

Anyway, Snappy 1.1.1 has some fixes, some minor speed and compression improvements but most importantly it breaks compilation on Windows. So we had to figure out how to fix that for this release. Ugh. I also took the opportunity to clean up some of the compilation options for Snappy and we may see some improvements in the way it works now... perhaps.

Some embarrassing bugs

Amine Mouafik is new to the LevelDOWN repository but has picked up some rather embarrassing bugs/omissions that are probably my fault. It's great to have more eyes on the C++ code, there's not enough JavaScript programmers with the confidence to dig in to messy C++-land.

Firstly, on our standard LevelDOWN releases, it turns out that we haven't actually been enabling the internal bloom filter. The bloom filter was introduced in LevelDB to speed up read operations to avoid having to scan through whole blocks to find the data a read is looking for. So that's now enabled for 0.10.

Then he discovered that we had been turning off compression by default! I believe this happened with the the switch to NAN. The signature for reading boolean options from V8 objects was changed from internal LD_BOOLEAN_OPTION_VALUE & LD_BOOLEAN_OPTION_VALUE_DEFTRUE macros for defaulting to true and false respectively when the options aren't supplied, to the NAN version which is a unified NanBooleanOptionValue which takes an optional defaultValue argument that can be used to make the default true. This happened at roughly Node version 0.11.4.

Well, this code:

bool compression =
    NanBooleanOptionValue(optionsObj, NanSymbol("compression"));

is now this:

bool compression =
    NanBooleanOptionValue(optionsObj, NanSymbol("compression"), true);

so if you don't supply a "compression" boolean option in your db setup operation then it'll now actually be turned on!

Domains

We've finally caught up with properly supporting Node's domains by switching all C++ callback calls from standard V8 callback->Call(...) to Node's own node::MakeCallback(callback, ...) which does the same thing but also does lots of additional things, including accounting for domains. This change was also included in NAN version 0.5.0.

Summary

Go and upgrade!

leveldown@0.10.0 is packaged with the new levelup@0.18.0 and level@0.18.0 which have their minor versions bumped purely for this LevelDOWN release.

Also released are the packages:

  • leveldown-hyper@0.10.0
  • leveldown-basho@0.10.0
  • rocksdb@0.10.0 (based on the same LevelDOWN code) (Linux only)
  • level-hyper@0.18.0 (levelup on leveldown-hyper)
  • level-basho@0.18.0 (levelup on leveldown-basho)
  • level-rocks@0.18.0 (levelup on rocksdb) (Linux only)

I'll write more about these packages in the future since they've gone largely under the radar for most people. If you're interested in catching up then please join ##leveldb on Freenode where there's a bunch of Node database people and also a few non-Node LevelDB people like Robert Escriva, author of HyperLevelDB and all-round LevelDB expert.

A final note on Node 0.11.9

There will be a LevelDOWN@0.10.1 very soon that will increment the NAN dependency to 0.6.0 when it's released. This new version of NAN will specifically deal with Node 0.11.9 compatibility where there are more breaking V8 changes that will cause compile errors for any addon not taking them in to account. So if you're living on the edge in Node then we should have a release soon enough for you!