r.va.gg

Announcing Bean v1.0.0

In my previous post about Bean I discussed in detail the work that has gone into a v1 release and how it will differ from the v0.4 branch.

Bean version 1.0.0 has now been released. You can download it from the GitHub repository or fetch it from npm for your Ender builds.

Here's a quick summary of the changes, but for a more in-depth look you should refer to my previous post.

on() argument ordering: the new signature is now .on(events[, selector], handlerFn), which will work on both Bean as a standalone library and when bundled in Ender. In Ender, the following aliases also pass through on() so the same arguments work: addListener(), bind(), listen() and one() (which of course will only trigger once). Plus all the specific shortcuts such as click(), keyup() etc. although these methods have the first argument hardwired.

add() is left intact with the same argument ordering for standalone Bean and delegate() has the same signature, the same as jQuery's equivalent.

off() is the new remove(): although remove() is still available in standalone Bean.

Bean attaches a single handler to the DOM for each event type on each element: as outlined above, Bean will iterate over all handlers for each triggered event and (mostly) reuse the same Event object for each call.

Event.stopImmediatePropagation(): is available across all supported browsers, it will stop the processing of all handlers for the current event at the current element (i.e. the event will still bubble).

The selector engine argument to add() is now completely removed: you used to have to pass a selector engine in as the last argument for delegated events. Now you must set it once at start-up with setSelectorEngine(). This is automatically taken care of for you in an Ender build.

A duplicate-handler check is no longer performed when you add: performance testing showed that this was a massive slow-down and is simply not something that Bean should be responsible for. If you want to add the same handler twice then that's your business and responsibility.

Namespace matching for event fire()ing now matches namespaces using an and instead of an or: so for example, firing namespaces 'a.b' will fire any event with both 'a' and 'b' rather than either 'a' or 'b'. This is compatible with jQuery and is arguably a much more sensible and helpful way to deal with namespaces. You can find some discussion on this on GitHub.

Lots of internal improvements for speed, code size, etc.

There was one remaining question to be resolved—whether Event.stop() would also trigger Event.stopImmediatePropagation(). I've decided to not include it and leave it to the user to decide whether they want to prevent triggering of other listeners on the same event/element.

And that's it! Please give it a spin and open an issue on GitHub if you have any bugs to report or questions to be answered.

How Ender bundles libraries for the browser

I was asked an interesting Ender question on IRC (#enderjs on Freenode) and as I was answering it, it occurred to me that the subject would be an ideal way to explain how Ender's multi-library bundling works. So here is that explanation!

The original question went something like this:

When a browser first visits my page, they only get served Bonzo (a DOM manipulation library) as a stand-alone library, but on returning visits they are also served Qwery (a selector engine), Bean (an event manager) and a few other modules in an Ender build. Can I integrate Bonzo into the Ender build on the browser for repeat visitors?

Wait, what's Ender?

Let's step back a bit and start with some basics. The way I generally explain Ender to people is that it's two different things:

  1. It's a build tool, for bundling JavaScript libraries together into a single file. The resulting file constitutes a new "framework" based around the jQuery-style DOM element collection pattern: $('selector').method(). The constituent libraries provide the functionality for the methods and may also provide the selector engine functionality.
  2. It's an ecosystem of JavaScript libraries. Ender promotes a small collection of libraries as a base, called The Jeesh, which together provide a large portion of the functionality normally required of a JavaScript framework, but there are many more libraries compatible with Ender that add extra functionality. Many of the libraries available for Ender are also usable outside of Ender as stand-alone libraries.

Continue reading this article on DailyJS.com

Towards Bean v1.0 (or: How event managers do their thing)

Bean is the event manager included in Ender's starter pack, The Jeesh. If you want to do jQuery-style bind(), on() etc. with Ender, then use Bean.

At the time of writing, we're on version 0.4.11. There's also been a 0.5-wip ("work in progress") branch for a while now that's included some improvements I've been holding off for a major release. I also put together a 0.5 milestone on GitHub with some ideas. The major item impacting on the external API is a switch to the on() argument order found in Prototype, jQuery and Zepto. Considering the significance of the changes in the new branch, I think that perhaps a 1.0 release would be warranted.

Delegated on() argument ordering

Until now, Bean's add() has followed the same argument ordering as jQuery's bind() for standard events, and delegate() for delegated events; so the signature looks something like this: .add([selector, ]events, handlerFn) (.on() exists in the Ender bridge and does the same thing). The proposal was to change this to match the other major libraries', arguably more sensible, .on(events[, selector], handlerFn). This is now in the 0.5-wip branch.
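To make the optional middle argument concrete, here's a minimal sketch of how an implementation might normalise the new ordering. The name normaliseOnArgs and the shape of the object it returns are invented for this example, and splitting events on whitespace is an assumption; this is not Bean's actual code.

```javascript
// Hypothetical helper: normalise the arguments of .on(events[, selector], handlerFn).
// The selector is optional, so the handler may arrive in second or third position.
function normaliseOnArgs(events, selector, handlerFn) {
  if (typeof selector === 'function') {
    // Called as on(events, handlerFn): no delegation requested
    handlerFn = selector
    selector = null
  }
  if (typeof handlerFn !== 'function') {
    throw new TypeError('handlerFn must be a function')
  }
  return {
    // Assumption: several event types can be given separated by whitespace
    types: String(events).split(/\s+/).filter(Boolean),
    selector: selector || null,
    handler: handlerFn
  }
}

function onClick() {}
var plain = normaliseOnArgs('click keyup', onClick)
var delegated = normaliseOnArgs('click', 'a.menu', onClick)
console.log(plain.types, plain.selector)         // [ 'click', 'keyup' ] null
console.log(delegated.types, delegated.selector) // [ 'click' ] 'a.menu'
```

With the old add() ordering the selector came first, so the same kind of check would have to look at the first argument instead.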

Performance

Speed was another issue that I wanted to address for a new major release. Benchmarks have shown that Bean is under-performing in some areas and I believed it could do better. The process of analysing and addressing Bean's performance has been quite instructive and I've narrowed it down to some key trade-offs that authors of event libraries have to deal with. One of the reasons I wanted to write this post was to outline some of these and solicit some feedback from the wider Bean-using community.

Performance trade-off #1: record keeping

When you call Element.attachEvent() (IE8 and below) or Element.addEventListener() (new browsers) you pass in a handler function that's called when the event in question is triggered. To stop that function being triggered you have to call Element.detachEvent() or Element.removeEventListener() and pass in that same function so the browser knows which handler you want to remove. Event managers like Bean and jQuery make that easier so you can do things like bean.remove(element, 'click') to remove all handlers; but Bean needs to know which handlers it needs to remove so it must keep records. The biggest change back in v0.4 of Bean was a switch to an internal registry that didn't molest DOM elements, external objects or external functions to attach identifiers so they could be later recalled. Previously, a uid property was set on each DOM element that you set a handler on and your handler function itself had a uid property set on it. jQuery does this too, it has a global jQuery.guid integer that it increments and attaches to pretty much everything. Don't be surprised when you find a guid property on your object/function/element once jQuery has got its fingers on it. This type of record keeping is fast and easy, but molesting other people's objects isn't very cool and there are alternatives.

My first major contribution to Bean was to switch it over to a registry similar to the one Diego Perini has implemented in NWEvents. Bean now iterates and compares rather than looking up directly. It adds some overhead but I managed to squeeze in enough performance gains in other areas to make v0.4 generally faster than v0.3 even with the registry switch.
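The iterate-and-compare idea can be sketched roughly as follows. This is an illustration only, not the registry from Bean or NWEvents; the Registry constructor and its put() and remove() methods are made-up names, and a plain object stands in for a DOM element.

```javascript
// Hypothetical registry: entries live in a private list instead of
// uid/guid properties being written onto elements or functions.
function Registry() {
  this.entries = [] // each entry: { element, type, handler }
}

Registry.prototype.put = function (element, type, handler) {
  this.entries.push({ element: element, type: type, handler: handler })
}

// Iterate and compare: a missing type or handler acts as a wildcard,
// which is what makes something like remove(element, 'click') possible.
Registry.prototype.remove = function (element, type, handler) {
  var removed = []
  this.entries = this.entries.filter(function (entry) {
    var match = entry.element === element &&
      (!type || entry.type === type) &&
      (!handler || entry.handler === handler)
    if (match) removed.push(entry)
    return !match
  })
  return removed // the caller would detach these from the DOM
}

var registry = new Registry()
var el = { id: 'el' } // stand-in for a DOM element
function a() {}
function b() {}
registry.put(el, 'click', a)
registry.put(el, 'click', b)
registry.put(el, 'keyup', a)

var removedClicks = registry.remove(el, 'click')
console.log(removedClicks.length, registry.entries.length) // 2 1
console.log(Object.keys(el), Object.keys(a)) // [ 'id' ] []  (nothing was attached)
```

The cost is visible in remove(): every call walks the list, where a uid stamped on the element would allow a direct lookup.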

Performance trade-off #2: synthesising the Event object

The DOM Level 3 Events specification outlines a base Event object interface, along with specific event types that extend this and add extra attributes and methods. This is the object that you get when your event handler is triggered by the DOM, it's the object that you read keyCode from for keyboard events and the object that you call preventDefault() and stopPropagation() on.

The problem we have is that nobody actually implements the full spec as-is and we also have to deal with older browsers which have all sorts of interesting attributes and methods on their Event objects. The stand-out difference is that in IE8 and below, instead of calling Event.preventDefault() to prevent the default browser behaviour (e.g. following a link click or accepting a keypress), you have to set Event.returnValue = false. And, instead of calling Event.stopPropagation() to stop the event from bubbling up the DOM to parent elements, you have to set Event.cancelBubble = true.

So, the standard practice is for event managers to either create an Event object for you and set up the properties and methods based on the underlying actual Event object (as in Bean, jQuery and most others), or fix the Event object (as in Prototype). The performance trade-off here is that this is not cheap to do, especially for every event you need to react to. But there are ways to speed it up.
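Here's a minimal sketch of the "create an Event object for you" approach, covering just the two methods discussed above. wrapEvent is a hypothetical name and the returned object is far smaller than what a real event manager builds; a plain object stands in for an IE8-style native event.

```javascript
// Hypothetical wrapper: give every event the standard methods, whatever
// the underlying (possibly old-IE style) native event supports.
function wrapEvent(nativeEvent) {
  return {
    originalEvent: nativeEvent,
    type: nativeEvent.type,
    preventDefault: function () {
      if (nativeEvent.preventDefault) nativeEvent.preventDefault()
      else nativeEvent.returnValue = false // IE8 and below
    },
    stopPropagation: function () {
      if (nativeEvent.stopPropagation) nativeEvent.stopPropagation()
      else nativeEvent.cancelBubble = true // IE8 and below
    }
  }
}

// Stand-in for an IE8-style event object: no methods, just properties
var oldIeEvent = { type: 'click', returnValue: true, cancelBubble: false }
var wrapped = wrapEvent(oldIeEvent)
wrapped.preventDefault()
wrapped.stopPropagation()
console.log(oldIeEvent.returnValue, oldIeEvent.cancelBubble) // false true
```

Handlers only ever see the wrapper, so they can call the standard methods without caring which browser produced the underlying event.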

In Bean v0.4 we introduced a property "whitelist" which provided significant performance gains. In v0.3 and prior, Bean would try and copy every property and method that it found on the original Event object over to a new object ({}). It turns out that accessing some of these properties on some browsers comes with a significant performance penalty, and often you just don't need them because they are specific quirks of individual browsers. Since v0.4, Bean has been only looking at a list of properties that it expects to find on particular types of event objects and ignoring the rest. In the 0.5-wip branch, I started caching event "fixers" for each event type as they were encountered, so it's a little faster to figure out exactly what needs to be done as events are triggered.
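A rough sketch of the whitelist plus cached "fixer" idea looks something like this. The property lists, the type-matching regular expressions and the names propsFor and synthesise are all assumptions made for illustration; Bean's real lists are longer and organised differently.

```javascript
// Hypothetical whitelists: only these properties are ever read from the native event
var commonProps = ['type', 'target', 'timeStamp']
var keyProps = commonProps.concat(['keyCode', 'ctrlKey', 'shiftKey'])
var mouseProps = commonProps.concat(['clientX', 'clientY', 'button'])

var fixerCache = {} // event type -> list of properties to copy
var fixerLookups = 0

function propsFor(type) {
  if (!fixerCache[type]) {
    fixerLookups++ // only reached the first time a type is seen
    fixerCache[type] = /^key/.test(type) ? keyProps
      : /^(click|mouse)/.test(type) ? mouseProps
      : commonProps
  }
  return fixerCache[type]
}

// Copy only whitelisted properties; anything else on the native event is never touched
function synthesise(nativeEvent) {
  var props = propsFor(nativeEvent.type)
  var event = { originalEvent: nativeEvent }
  for (var i = 0; i < props.length; i++) {
    event[props[i]] = nativeEvent[props[i]]
  }
  return event
}

var fakeKeyEvent = { type: 'keyup', keyCode: 13, ctrlKey: false, shiftKey: true, someVendorQuirk: 'slow' }
var first = synthesise(fakeKeyEvent)
var second = synthesise(fakeKeyEvent)
console.log(first.keyCode, 'someVendorQuirk' in first, fixerLookups) // 13 false 1
```

The second call reuses the cached list for 'keyup', which is the small saving that caching buys on every subsequent event of the same type.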

But, it's still costly, so that's where the next performance trade-off comes in.

Performance trade-off #3: hijacking event handler management

Given that synthesising the Event object is so costly and you end up doing it multiple times for a single event if you have more than one handler for that event, event managers have a trick up their sleeve to alleviate the pain. NWEvents, jQuery and others don't directly attach your event handler to the DOM; instead, they attach a single internal handler that is responsible for triggering any number of handlers you register for a given event on a particular element.

Consider the following code:

for (var i = 0; i < 100; i++) {
  $('#el').bind('click', function () { console.log(i) })
}

This code would work in Bean and jQuery, the difference is that Bean v0.4 and prior adds 100 handlers directly to the DOM element to listen to that event while jQuery adds just one and iterates over the others when the event is triggered. The new version of Bean does the same.

The reason this helps with performance is that we don't have to make a new Event object each time the event is triggered, we can reuse the same one across handlers.
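The single-DOM-handler model can be sketched without a browser by using a stand-in element that counts how many native listeners it receives. Everything here (FakeElement, on, synthesise, the roots list) is hypothetical and much simpler than what Bean or jQuery actually do.

```javascript
// Stand-in for a DOM element that counts native listener registrations
function FakeElement() {
  this.nativeListeners = {}
  this.nativeAddCount = 0
}
FakeElement.prototype.addEventListener = function (type, fn) {
  this.nativeAddCount++
  if (!this.nativeListeners[type]) this.nativeListeners[type] = []
  this.nativeListeners[type].push(fn)
}
// Simulate the browser firing an event at this element
FakeElement.prototype.dispatch = function (nativeEvent) {
  var fns = this.nativeListeners[nativeEvent.type] || []
  for (var i = 0; i < fns.length; i++) fns[i].call(this, nativeEvent)
}

var synthCount = 0
function synthesise(nativeEvent) {
  synthCount++ // the expensive step we want to run once per event
  return { type: nativeEvent.type, originalEvent: nativeEvent }
}

// Hypothetical manager: one root handler per element + type
var roots = [] // each item: { element, type, handlers }
function on(element, type, handler) {
  var root = null
  for (var i = 0; i < roots.length; i++) {
    if (roots[i].element === element && roots[i].type === type) root = roots[i]
  }
  if (!root) {
    root = { element: element, type: type, handlers: [] }
    roots.push(root)
    element.addEventListener(type, function (nativeEvent) {
      var event = synthesise(nativeEvent) // once, shared by all handlers
      var handlers = root.handlers.slice() // called in the order they were added
      for (var j = 0; j < handlers.length; j++) handlers[j].call(element, event)
    })
  }
  root.handlers.push(handler)
}

var el = new FakeElement()
var calls = []
for (var n = 0; n < 100; n++) {
  on(el, 'click', (function (index) {
    return function () { calls.push(index) }
  })(n))
}
el.dispatch({ type: 'click' })
console.log(el.nativeAddCount, synthCount, calls.length, calls[0], calls[99]) // 1 1 100 0 99
```

One hundred handlers were registered, but the element saw one native listener and the event was synthesised once. Note the lookup loop in on(): finding an existing root handler is extra work that a per-handler approach doesn't need.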

There's another major advantage to this approach, and perhaps a more important reason to implement an event manager this way: you get to hide some odd browser quirks. As Kit Cambridge noted recently, older versions of Internet Explorer generally fire their handlers in LIFO order, yet the W3C spec for addEventListener() specifies FIFO order! In fact, it's even worse because the Microsoft documentation says that they may actually be triggered in random order! But, if you only have a single real handler then you get complete control over order.

The benefits go further though, we get to implement some nice features that are completely missing from older browsers and even some current browsers. The most notable is Event.stopImmediatePropagation(). This is a method that was introduced with DOM Level 3, so it's missing from IE8 and below, but surprisingly it's also missing from the current version of Opera! Perhaps the pressure is off because jQuery implements it as part of their relatively complete DOM Level 3 Events implementation using this single-DOM-handler method.

stopImmediatePropagation()

Bean has included a custom Event.stop() method since v0.4, it's modelled off the same method in Prototype. It's also found in MooTools and some other libraries. This method combines both Event.stopPropagation() and Event.preventDefault() in a short and sweet little utility method. But, "stop" is slightly misleading, because you can stop the default behaviour of the browser and you can stop the event bubbling up the DOM, but you can't stop other event handlers for this event at this element from firing. That's where the new Event.stopImmediatePropagation() comes in: it halts the processing of the event handler list for the current event at the current element (i.e. it can be used at any point in the bubbling process and it'll stop processing just the handlers at the element it was called at).

If an event manager takes the single-DOM-handler approach, it has to care about stopImmediatePropagation() because it no longer has an effect in the browser since the browser only has a single handler to worry about. But, you also get the benefit that it now applies to any browser the event manager supports.
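In the single-DOM-handler model, stopping "immediately" comes down to a flag that the root handler checks between calls. The sketch below assumes a hypothetical makeRootHandler function and a pared-down event object; it isn't Bean's implementation.

```javascript
// Hypothetical root handler for one element + event type. Because the
// browser only knows about this one function, "immediate" stopping has
// to be tracked with a flag and checked between handlers.
function makeRootHandler(handlers) {
  return function rootHandler(nativeEvent) {
    var stopped = false
    var event = {
      type: nativeEvent.type,
      originalEvent: nativeEvent,
      stopImmediatePropagation: function () {
        stopped = true
      }
    }
    for (var i = 0; i < handlers.length && !stopped; i++) {
      handlers[i].call(this, event)
    }
  }
}

var log = []
var root = makeRootHandler([
  function () { log.push('first') },
  function (e) { log.push('second'); e.stopImmediatePropagation() },
  function () { log.push('third') } // never runs
])

// A native event object with no stopImmediatePropagation() of its own, as in IE8
root({ type: 'click' })
console.log(log) // [ 'first', 'second' ]
```

Nothing here depends on the native event, which is why the method becomes available in every browser the manager supports.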

At the time of writing this article I haven't decided whether I think that Event.stop() should also bundle Event.stopImmediatePropagation(). I'm leaning towards including it because "stop" should mean stop and the combination of all three methods would certainly do this.

List of changes for Bean 1.0

on() argument ordering: the new signature is now .on(events[, selector], handlerFn), which will work on both Bean as a standalone library and when bundled in Ender. In Ender, the following aliases also pass through on() so the same arguments work: addListener(), bind(), listen() and one() (which of course will only trigger once). Plus all the specific shortcuts such as click(), keyup() etc. although these methods have the first argument hardwired.

add() is left intact with the same argument ordering for standalone Bean and delegate() has the same signature, the same as jQuery's equivalent.

off() is the new remove(): although remove() is still available in standalone Bean.

Bean attaches a single handler to the DOM for each event type on each element: as outlined above, Bean will iterate over all handlers for each triggered event and (mostly) reuse the same Event object for each call.

Event.stopImmediatePropagation(): is available across all supported browsers, it will stop the processing of all handlers for the current event at the current element (i.e. the event will still bubble).

The selector engine argument to add() is now completely removed: you used to have to pass a selector engine in as the last argument for delegated events. Now you must set it once at start-up with setSelectorEngine(). This is automatically taken care of for you in an Ender build.

A duplicate-handler check is no longer performed when you add: performance testing showed that this was a massive slow-down and is simply not something that Bean should be responsible for. If you want to add the same handler twice then that's your business and responsibility.

Namespace matching for event fire()ing now matches namespaces using an and instead of an or: so for example, firing namespaces 'a.b' will fire any event with both 'a' and 'b' rather than either 'a' or 'b'. This is compatible with jQuery and is arguably a much more sensible and helpful way to deal with namespaces. You can find some discussion on this on GitHub.

Lots of internal improvements for speed, code size, etc.
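To make the namespace item in the list above concrete, here's a small sketch of "and" versus "or" matching. The function names matchesAll and matchesAny are invented for the example and the namespaces are passed as plain arrays; Bean's internals will differ.

```javascript
// Hypothetical matchers: does a handler registered with `handlerNamespaces`
// respond when `firedNamespaces` are fired?
function matchesAll(firedNamespaces, handlerNamespaces) {
  // "and": every fired namespace must be present on the handler
  return firedNamespaces.every(function (ns) {
    return handlerNamespaces.indexOf(ns) !== -1
  })
}

function matchesAny(firedNamespaces, handlerNamespaces) {
  // "or": one shared namespace is enough (the old behaviour)
  return firedNamespaces.some(function (ns) {
    return handlerNamespaces.indexOf(ns) !== -1
  })
}

var fired = 'a.b'.split('.')
console.log(matchesAll(fired, ['a', 'b', 'c'])) // true: has both 'a' and 'b'
console.log(matchesAll(fired, ['a']))           // false: 'b' is missing
console.log(matchesAny(fired, ['a']))           // true under the old "or" rule
```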

Deconstructing performance (benchmarks)

We've had a benchmark suite since v0.4 to help measure the impact of changes, so I've extended it to help compare some versions of Bean. The benchmarks use benchmark.js.

There are 3 versions of Bean included here:

  • Bean 0.4: The current release of Bean, specifically version 0.4.11-1, source here.
  • Bean 0.5a: An unreleased version of Bean in the 0.5-wip branch. Specifically, most of the changes listed above are included here except for the single-DOM-handler change. This is here to assess the impact of this change and to help decide whether it's a worthwhile "improvement". Source here.
  • Bean 1.0a: The main difference between this and 0.5a is the single-DOM-handler change. Source here.

I'll have some notes about my own analysis of these numbers below but first I should mention that these benchmarks are not particularly helpful in showing how the libraries perform with real use patterns. I consider them to mainly be proxies for identifying the performance of particular behaviours within the libraries. You'll note that there are a lot of tests for add() / on(), that's simply because that's the easiest to test reliably and also because I haven't been bothered coming up with useful tests for other things. It's very difficult to test the actual event triggering which would be the most interesting bit, although the fire() tests give us a little bit of insight. The tests at the bottom try to capture a full add/fire/remove cycle, but even this isn't particularly helpful. These benchmarks can be found in the Bean repo so if you want to tinker then feel free, I'd love to have additional input.

So, more so than most benchmarks, take these with a very large grain of salt or two!

(The numbers are ops/sec, so higher is better in all cases)

Chrome

| Benchmark | Bean 0.4 | Bean 0.5a | Bean 1.0a | NWEvents | jQuery |
| --- | --- | --- | --- | --- | --- |
| add(element, event, fn) | 25,760 | 66,580 | 185,147 | 18,133 | 142,161 |
| add(unique element, event, fn) | 33,024 | 99,208 | 36,481 | 18,634 | 50,554 |
| add(element, custom, fn) | 28,728 | 56,607 | 165,189 | 11,248 | 119,593 |
| add(unique element, custom, fn) | 36,150 | 78,260 | 34,308 | 24,409 | 44,761 |
| add(element, event.namespace, fn) | 30,082 | 64,435 | 189,468 | | 136,486 |
| add(unique element, event.namespace, fn) | 33,702 | 101,915 | 34,678 | | 33,637 |
| add(element, selector, event, fn) | 25,180 | 42,274 | 119,339 | 2,909 | 76,171 |
| add(unique element, selector, event, fn) | 27,328 | 91,156 | 30,308 | 1,069 | 35,696 |
| add({}) | 15,594 | 27,312 | 59,434 | | |
| fire(event) | 576 | 492 | 6,860 | 9,797 | 21,821 |
| fire(custom) | 165,222 | 164,418 | 161,243 | 240,961 | 86,291 |
| fire(namespace) | 29,742 | 28,721 | 27,666 | | |
| element add / click / remove | 18,579 | 17,425 | 14,760 | 1,748 | 2,775 |
| element add / fire / remove | 31,230 | 28,344 | 15,802 | 1,127 | 2,763 |
| object add / fire / remove | 58,927 | 53,139 | 49,549 | 107,700 | 18,619 |

Firefox

| Benchmark | Bean 0.4 | Bean 0.5a | Bean 1.0a | NWEvents | jQuery |
| --- | --- | --- | --- | --- | --- |
| add(element, event, fn) | 20,404 | 45,030 | 100,546 | 13,826 | 63,159 |
| add(unique element, event, fn) | 16,708 | 67,417 | 19,625 | 16,810 | 29,130 |
| add(element, custom, fn) | 16,691 | 42,601 | 134,535 | 13,368 | 59,774 |
| add(unique element, custom, fn) | 24,159 | 55,312 | 21,235 | 13,475 | 27,877 |
| add(element, event.namespace, fn) | 17,414 | 53,639 | 101,427 | | 55,321 |
| add(unique element, event.namespace, fn) | 23,735 | 59,751 | 22,034 | | 27,576 |
| add(element, selector, event, fn) | 18,766 | 54,571 | 92,602 | 2,317 | 36,753 |
| add(unique element, selector, event, fn) | 22,094 | 56,026 | 16,705 | 964 | 22,102 |
| add({}) | 9,126 | 17,104 | 32,093 | | |
| fire(event) | 260 | 266 | 3,391 | 3,120 | 11,154 |
| fire(custom) | 61,845 | 59,950 | 61,742 | 93,033 | 45,978 |
| fire(namespace) | 28,910 | 27,379 | 23,127 | | |
| element add / click / remove | 7,644 | 6,220 | 6,005 | 1,284 | 4,845 |
| element add / fire / remove | 11,288 | 10,954 | 7,458 | 788 | 9,115 |
| object add / fire / remove | 45,165 | 37,934 | 37,306 | 38,097 | 12,490 |

IE9

| Benchmark | Bean 0.4 | Bean 0.5a | Bean 1.0a | NWEvents | jQuery |
| --- | --- | --- | --- | --- | --- |
| add(element, event, fn) | 925 | 944 | 209,714 | 4,321 | 117,343 |
| add(unique element, event, fn) | 13,559 | 113,944 | 10,568 | 3,012 | 58,929 |
| add(element, custom, fn) | 946 | 1,004 | 219,631 | 4,329 | 128,570 |
| add(unique element, custom, fn) | 7,557 | 123,288 | 12,620 | 3,191 | 32,610 |
| add(element, event.namespace, fn) | 880 | 826 | 87,932 | | 53,737 |
| add(unique element, event.namespace, fn) | 11,823 | 103,977 | 12,001 | | 28,053 |
| add(element, selector, event, fn) | 655 | 802 | 57,619 | 382 | 21,159 |
| add(unique element, selector, event, fn) | 11,649 | 96,597 | 11,404 | 139 | 24,756 |
| add({}) | 534917,735 | | | | |
| fire(event) | 290,543 | 286,385 | 293,547 | 71,396 | 22,794 |
| fire(custom) | 229,241 | 223,189 | 216,943 | 78,395 | 23,081 |
| fire(namespace) | 17,507 | 11,848 | 16,018 | | |
| element add / click / remove | 10,228 | 9,697 | 9,260 | 478 | 8,345 |
| element add / fire / remove | 13,062 | 10,587 | 18,577 | 155 | 6,094 |
| object add / fire / remove | 30,924 | 29,096 | 28,904 | 39,761 | 7,634 |

First, let me say that the IE results don't make a whole lot of sense so I'm going to suggest that the Chrome and Firefox benchmarks are the best indicators of general performance characteristics across browsers. The IE results have similar patterns to the others but there's way too much strangeness in there for me to take them seriously! IE8 has difficulty running all the benchmarks without locking up and I don't care enough to persevere there so I'm ignoring that too. Safari crashes and Opera has very similar results to Firefox and Chrome.

(Just to clarify, it's only the benchmarks that have trouble running in older versions of IE, the Bean test suite still runs on IE6 and above and has been beefed up even more in the 0.5-wip branch.)

Some observations

  • The gains for add() from Bean v0.4 to v0.5a are largely from removing the duplicate handler check.
  • The reason for the duplicate tests for "element" vs "unique element" in the add() benchmarks is to demonstrate the costs and benefits involved in the single-DOM-handler model. You can see that the numbers switch between the non-unique / unique tests for Bean v0.5a and v1.0a. Also jQuery suffers significantly when you feed it unique elements because it has to add DOM handlers each time.
  • The poor performance for Bean v0.4 and v0.5a in fire() benchmarks is mostly attributed to Event object synthesising, rather than the speed of the browser-native handler list management. This is important because firing native-style events (e.g. fire('click'), which is what we're testing here) is not a common activity but we're having to synthesize the event object each time a handler is triggered. So, this is where Bean finds the most win in switching to a single-DOM-handler model.
  • Bean loses performance between v0.5a and v1.0a in the unique element add() benchmarks, this can mostly be explained by the overhead of managing the root handler that it needs to attach to the DOM. The handler is stored in the internal registry and each time you add() it needs to work out if you already have a root handler attached to the DOM or not for the given event / element. jQuery gets to take some shortcuts by polluting the DOM and handler functions with guid properties. However, the numbers suggest to me that there is some additional performance that could be squeezed out of Bean in this area.
  • Bean is fairly liberal with its whitelist of properties to copy from the original Event object, jQuery is a bit more restrictive with its similar system, this may slow Bean down very slightly.
  • Delegated events are not represented well here, but the results would be very interesting because of the additional work required.

File size

A lot of users of Bean are file-size-sensitive, so it's important to highlight that there are costs to these performance improvements. Minified, gzipped, the sizes for each of these versions of Bean are:

Bean 0.4: 3870 bytes
Bean 0.5a: 3959 bytes
Bean 1.0a: 4176 bytes

I've tried really hard to keep the size under 4kb but the additional overhead in managing the single-DOM-handler is too much to achieve that, even though I've managed to shave many precious bytes off in other areas of the code in the process (which unfortunately can't be seen in these numbers!).

We're still well under the minified, gzipped size of the jQuery events module by itself, even though we implement very similar functionality and jQuery gets to leverage lots of internal sugar not contained within the events module.

Request for feedback

After all that, what I really want is feedback! At this point I'm happy to release a proper version 1.0, I think it's major enough to warrant a jump past 0.5. I'd really like to hear feedback from people that have doubts that the changes are worth it, particularly the single-DOM-handler change.

Using the 1.0 pre-release

I've started using it in production and am very happy with the results so far, I'd love to have feedback from anyone else who wants to give it a spin.

The new version of Bean is in npm with the tag dev so you can include it in your Ender builds by referring to bean@dev as the package name.

For stand-alone, you can grab it from the 0.5-wip branch on GitHub.

Thanks for getting this far!

mod_geoip2_xff update

Thanks to a contribution from Kevin Gaudin, I have a new release of my mod_geoip2 fork. (The history starts here.)

You can find the source here: https://github.com/rvagg/mod_geoip2_xff

Kevin's addition provides a fall-back to the standard remote IP address of the client if no public IP address is found in the X-Forwarded-For header. Previously, my implementation just fell back to the default mod_geoip2 behaviour of just taking the first IP address in the X-Forwarded-For header, or the last if you set GeoIPUseLastXForwardedForIP in your config.

I also took the opportunity to clean things up a little and introduce a config option to turn on the special X-Forwarded-For handling. You now have to set GeoIPUseLeftPublicXForwardedForIP to On to activate it.
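The selection logic described above can be sketched in a few lines of JavaScript (the module itself is written in C). This is only my reading of the behaviour: pickClientIp and isPublicIPv4 are hypothetical names, taking the leftmost public address is inferred from the option name, and the list of non-public ranges is an assumption that may not match what the module checks.

```javascript
// Assumed non-public IPv4 ranges for this sketch (RFC 1918, loopback, link-local);
// the module's own list may differ.
function isPublicIPv4(ip) {
  var parts = ip.split('.').map(Number)
  if (parts.length !== 4 || parts.some(function (p) { return !(p >= 0 && p <= 255) })) return false
  var a = parts[0], b = parts[1]
  if (a === 10 || a === 127) return false
  if (a === 172 && b >= 16 && b <= 31) return false
  if (a === 192 && b === 168) return false
  if (a === 169 && b === 254) return false
  return true
}

// Hypothetical helper: leftmost public address in X-Forwarded-For,
// falling back to the connection's remote address if none is found.
function pickClientIp(xForwardedFor, remoteAddr) {
  var candidates = (xForwardedFor || '').split(',')
  for (var i = 0; i < candidates.length; i++) {
    var ip = candidates[i].trim()
    if (isPublicIPv4(ip)) return ip
  }
  return remoteAddr
}

console.log(pickClientIp('10.0.0.5, 203.0.113.7, 198.51.100.2', '192.0.2.1')) // 203.0.113.7
console.log(pickClientIp('10.0.0.5, 192.168.1.9', '192.0.2.1'))               // 192.0.2.1
```

The second call shows the fall-back Kevin added: nothing public in the header, so the remote address is used.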

Thanks to Kevin, and additional contributions are welcome!

Update July 7th 2012: Since I was in C-mode, I went ahead and implemented something I've tried to get working in the past: hostname lookups on the X-Forwarded-For host! I got intimate with APR and worked out how to use Apache to do the resolution so there isn't the lengthy timeout of raw syscalls. If you set GeoIPEnableHostnameLookups to On, you'll get a GEOIP_HOST environment variable to use.

I've also decided to start making tarballs available off GitHub for your convenience: https://github.com/rvagg/mod_geoip2_xff/downloads

Data URI + SVG

Data URIs are great when you want to serve small resources that there's no point serving up in a combined sprite. Consider microjs.com which serves up an HTML file plus a single JavaScript file containing the latest data used to build the site. The build logic is in an embedded script, the CSS is also embedded, so it's pretty lean considering what you see and the amount of data displayed. But, notice the 3 icons for each project, 2 GitHub icons and a Twitter icon. They are PNG images, combined as a sprite but to avoid an additional HTTP request to fetch them they are simply embedded in the CSS which is embedded on the page:

.title .stat span {
  background-image: url("data:image/png;base64,iVBORw0KGgoAAAANSUhE...
}

Easy and quick and fairly well supported across browsers.

But Data URIs can do so much more, including embed SVG!

url("data:image/svg+xml,<svg viewBox='0 0 40 40' height='25' width='25'
xmlns='http://www.w3.org/2000/svg'><path fill='rgb(91, 183, 91)' d='M2.379,
14.729L5.208,11.899L12.958,19.648L25.877,6.733L28.707,9.561L12.958,25.308Z'
/></svg>")

The above will produce a 25px square image but the SVG is drawn in a 40x40 coordinate box, because I'm using Raphaël Icon paths (you can try it yourself by replacing the d='' content with the path data you get when you click on any of the icons on the Raphaël Icons page).

SVG of course gives you perfectly scalable graphics, embedding in a Data URI in your CSS lets you use them in the same way that you use other CSS images, minus the need to fetch them via an additional HTTP request.

What's the catch?

It's the web, of course there's a catch, and of course it involves Internet Explorer!

For a start you don't get SVG support in IE8 and below, which is a bit of a problem right now because IE8 is still very much with us due to the fact that IE9 isn't available for Windows XP users. But there's more than that. IE adheres to the spec more strongly than other browsers in that there are 2 types of encoding for Data URIs, base64 and non-base64. If you leave the ;base64 off your string then most browsers let you get away with anything that doesn't conflict with standard CSS, so basically don't use ", or if you do, escape them with simple \". What the Data URI spec says is:

...the data (as a sequence of octets) is represented using ASCII encoding for octets inside the range of safe URL characters and using the standard %xx hex encoding of URLs for octets outside that range.

And IE doesn't let you have it any other way. So you either encode your SVG into Base64 or escape it with %xx's, which kind of loses some of the elegance of SVG in CSS. But at least you'll get IE9+ support.
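If you'd rather generate the two IE-friendly forms than write them by hand, something like the following sketch would do it. It assumes Node.js (Buffer) for the Base64 step; in a browser, btoa() should give the same result for ASCII-only SVG. The variable names are just for illustration.

```javascript
var svg = "<svg viewBox='0 0 40 40' height='25' width='25' xmlns='http://www.w3.org/2000/svg'>" +
  "<path fill='rgb(91, 183, 91)' d='M2.379,14.729L5.208,11.899L12.958,19.648L25.877,6.733L28.707,9.561L12.958,25.308Z'/></svg>"

// Option 1: %xx escaping. encodeURIComponent() escapes <, >, spaces, double quotes
// and more, but leaves single quotes and parentheses alone, so wrap the result in
// double quotes inside url("...").
var escaped = 'data:image/svg+xml,' + encodeURIComponent(svg)

// Option 2: Base64 (Buffer is Node.js specific)
var base64 = 'data:image/svg+xml;base64,' + Buffer.from(svg, 'utf8').toString('base64')

console.log('url("' + escaped + '")')
console.log('url("' + base64 + '")')
```

Either string can be pasted into a background-image declaration. The escaped form stays vaguely readable, while the Base64 form is opaque.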

So here are some examples to fiddle with. Click through to the CSS tab to see the gory details. The first icon is Base64 encoded, the second icon is URL escaped (%xx), the rest are just plain SVG, so you'll get different results viewing in IE9 vs the rest.

SVG in Data URIs is an elegant solution (and a bit of fun) but only really useful at the moment if you don't need to support IE8 and below.

Update 17th Sept 2012

Below in the comments, Ben reports on his (much more rigorous) research into browser support; refer to that if you're serious about using SVG in Data URIs. An interesting result of his work comes from the issue he filed with Chromium (I don't know if this is a generic WebKit thing or not but you could easily test if you're interested). It turns out that Chromium/WebKit requires Base64 Data URIs to be multiples of 4 characters, so you just need to pad the end with = characters (for example ==).
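Padding is easy enough to automate. Here's a minimal sketch with a hypothetical padBase64 helper, using Node.js Buffer only to check the result. Encoders such as Buffer's toString('base64') already emit the padding, so this only matters for payloads that have had it stripped.

```javascript
// Hypothetical helper: pad a Base64 payload with "=" up to a multiple of 4 characters.
// (Valid unpadded Base64 is only ever 0, 1 or 2 characters short.)
function padBase64(payload) {
  var missing = (4 - (payload.length % 4)) % 4
  return payload + '='.repeat(missing)
}

console.log(padBase64('PHN2Zz4')) // PHN2Zz4=
console.log(padBase64('PHN2'))    // PHN2 (already a multiple of 4)
console.log(Buffer.from(padBase64('PHN2Zz4'), 'base64').toString()) // <svg>
```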