Waveform GUI: neuroscience in Chrome

During my PhD I used bundles of electrodes to record the voltage fluctuations generated by neurons in the brain. I’m not going to describe the details of the science here; instead I want to focus on the graphical user interface I ended up building.

As of this afternoon, I have a video demo to show off the application to researchers in my field. The demo may also be of vague interest to the lay web developer and/or lay scientist, although I’d be surprised if either understood much of what is being demonstrated.

Motivation and design principles/features

I created this application because I was so intensely irritated by the old piece of software, which I would otherwise have had to use.  Every time I clicked the tiny buttons in that other program and waited for it to respond to my actions I felt like I was in some kind of UX purgatory where the devil was laughing over my shoulder, hoping that I would snap and lash out at something.  (That is roughly the image I had in mind for a large part of my PhD.)

Anyway.  This application is a lot better: rather than having to press 10 buttons (or maybe it was 7, I did count them once) in order to get a single plot, here you can get a whole page of plots using just ctrl+a followed by a single drag and drop.   Compare eating your morning cereal with a toothpick, versus wolfing it down with a soup ladle in each hand.

The initial loading is not the only thing that’s easier: all the various files are automatically organised in a graphical table that lets you jump between any two data sets with a single click.   And although these data sets can each be 10MB+, I worked hard to ensure that they typically load and display in less than about 400ms.

You can also change various parameters with a single click: no need to force a redraw, as everything is automatically re-rendered.  This minimal-clicking rule applies to the cluster-assignment editing too: merging is done with one drag-drop; splitting with two clicks; and swapping can be done entirely from the keyboard when the mouse is in position.  The greatest triumph of this anti-hand-eye-coordination crusade was the “invention” of the spacebar+click paradigm: when you hold down space (which is by far the easiest key to hit), various bits of the UI become clickable, with clicking usually causing the thing to be copied, closed, or deleted.  Normally these actions would require clicking through a tiny drop-down menu or toolbar button, or some sort of mouse action followed by a multi-key shortcut (not that I have anything against keyboard shortcuts, but this is even easier).

The nice thing about building your own application is that as you discover usage patterns, you can add tools to make your life even easier.  For example, I added a tool for copying plots to the clipboard so that I could make notes with absolutely minimal effort: the plots come with a caption, so there’s very little left to say. Another example: when running experiments I often needed to know the time that the previous experiment had finished: you could work this out manually from the start time and duration info in the header, but by creating a little info widget to display this information automatically, suddenly experiments were that little bit less tedious.  Also, in cases where I had to choose a default from two or more clashing files, I was able to get the program to automatically use the most recently modified file (i.e. rather than one higher up the alphabet, or one matching some preset filename pattern); the user could override the default with a single click, if needed, but in most cases the most recently modified file is the one of interest.
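The most-recently-modified default can be expressed in a few lines. This is only a sketch with made-up names (the program's actual data structures are not shown here); each candidate is assumed to carry a `lastModified` timestamp in milliseconds, as browser `File` objects do.

```javascript
// Sketch: given clashing candidate files, default to the most
// recently modified one rather than, say, the first alphabetically.
// Entries are assumed to look like {name: "...", lastModified: ms}
// (illustrative names, not the program's real structures).
function pickDefaultFile(files){
    var best = files[0];
    for(var i=1; i<files.length; i++)
        if(files[i].lastModified > best.lastModified)
            best = files[i];
    return best;
}
```

The user-facing override would simply replace this choice with whichever entry they clicked.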

In summary, the main aims were to optimize away redundant clicking and redundant accuracy in pointing.  Jumping between different datasets had to be easy to request and fast to execute.  And it had to be possible to put tons of plots on the screen all at once.

How does that help?

Now that data exploration requires a lot less time and effort, the experimenter can spend more resources on other aspects of research. They can also increase the volume of data they collect, and be generally more thorough in their data exploration: the more you show on screen in one go, the easier it is to spot interesting features in the data, whether you are a novice or an experienced user.

A bit about implementation

The application is built in Chrome, using JS and HTML, with most of the file IO and plots being done in WebWorkers off the main thread.   The wave data itself, which can often be over 10MB (as shown in the demo), is sent down to the GPU for rendering with WebGL: I wrote a couple of fairly simple kernels to do this.

The application evolved in fits and starts over the course of a couple of years.  At various stages I spent quite a bit of effort staring at the profiler in Chrome to try and squeeze as much as possible out of the hardware.  Unlike with a commercial webapp, my life was relatively simple in that I was able to focus exclusively on Chrome and assume that users have a reasonably powerful computer.  The profiler really helped when, for example, I needed to prepare the waveform data for the GPU.  The operation I had to perform was (in terms of memory layout) a bit like a matrix transpose: using Chrome’s profiler I discovered that this operation was taking several hundred milliseconds, and that if I rewrote the loop I could reduce the execution time by a factor of about 4.  The trick was to read from memory contiguously, even though this meant writing out non-contiguously.  I was also able to play with a loop converting from Int8 to Uint8:

var Int8ToUint8 = function(A){
    // Takes an Int8Array, A, adds 128 to each element and
    // views the result as a Uint8Array, in place.  Adding 128
    // to an int8 is the same as XORing its byte with 0x80, so
    // we do it 4 bytes at a time through a Uint32 view.
    // (Assumes A's byte length is a multiple of 4.)
    var B = new Uint32Array(A.buffer);
    for(var i=0;i<B.length;i++)
        B[i] ^= 0x80808080;
    return new Uint8Array(B.buffer);
};

This takes about 10ms for the typical 10MB of data that the program deals with.
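The transpose-like rearrangement mentioned above can be sketched as follows. The function name and layout assumptions are mine, not the original code's: waves are assumed to be stored one after another (wave-major), while the GPU buffer wants all values for sample t together. Making the inner loop read the source contiguously, scattering the writes, is the direction that was roughly 4x faster for me.

```javascript
// Hypothetical sketch of the transpose-like rearrangement.
// src holds nWaves waves of nSamples samples each, wave-major;
// dst groups all waves' values for each sample t together.
// The inner loop walks src sequentially (contiguous reads),
// which costs non-contiguous writes into dst.
function transposeWaves(src, nWaves, nSamples){
    var dst = new Int16Array(src.length);
    for(var w=0; w<nWaves; w++)
        for(var t=0; t<nSamples; t++)
            dst[t*nWaves + w] = src[w*nSamples + t]; // contiguous read
    return dst;
}
```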

There were a lot of ways to actually organize the drawing on the GPU. I ended up doing the following:

[Figure: render waves]

In words: we draw all the lines from (t, y(t)) to (t+1, y(t+1)) for all waves at fixed t, before moving on to the next t. This requires us to rearrange and (almost) duplicate the y data, before moving it to the GPU (which is what I mentioned above), but once on the GPU it is incredibly easy to use. The cluster assignment data is also duplicated, but there’s only two copies of it per wave, rather than one for every single point on the wave; this makes it fast to update if we modify clusters.
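The rearrangement-with-duplication can be sketched like this (illustrative names and layout, not the actual code): for each segment from t to t+1 we emit both endpoints for every wave, so each interior sample appears twice in the buffer. The vertices alternate between the t and t+1 ends of a segment, which is what the "0 1 0 1 ..." is_t_plus_one attribute in the shader below keys off.

```javascript
// Hypothetical sketch of building the per-vertex voltage buffer.
// y holds nWaves waves of nSamples samples, wave-major.  For
// each segment t -> t+1, and for each wave, we emit the pair
// (y[t], y[t+1]), so interior samples are (almost) duplicated
// and vertices alternate t, t+1, t, t+1, ...
function buildSegmentBuffer(y, nWaves, nSamples){
    var out = new Float32Array(nWaves * (nSamples-1) * 2);
    var k = 0;
    for(var t=0; t<nSamples-1; t++)
        for(var w=0; w<nWaves; w++){
            out[k++] = y[w*nSamples + t];     // vertex at t
            out[k++] = y[w*nSamples + t + 1]; // vertex at t+1
        }
    return out;
}
```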

The first stage of the kernel counts the number of lines crossing each pixel; we use GL_FUNC_ADD as the blend equation mode:

attribute float voltage; // y-data for line at t and t+1
attribute vec2 wave_xy_offset; // position of group on canvas
uniform mediump float t_x_offset; // x offset for t
attribute float is_t_plus_one; // 0 1 0 1 0 1 0 1 ...
uniform mediump float y_factor; // voltage-to-canvas scaling
const mediump float delta_t_x_offset = 2./512.;
uniform highp vec4 count_mode_color; // small number
varying lowp vec4 v_col;
void main(void) {
    v_col = count_mode_color; // in palette-mode this is trickier
    gl_Position.x = wave_xy_offset.x + (t_x_offset +
                    is_t_plus_one*delta_t_x_offset);
    gl_Position.y = wave_xy_offset.y + voltage*y_factor;
    gl_Position.z = 0.;
    gl_Position[3] = 1.;
}

See, it’s not that complicated, right? I tried doing it a few different ways before this, but the simplicity of this method seems to help with performance.
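The additive blending amounts to the following counting, shown here as a plain-JS caricature rather than actual GPU code. The per-pixel coverage of each segment really comes from the GPU's line rasterizer; this sketch just takes the covered pixels as given and shows the accumulation step that GL_FUNC_ADD performs.

```javascript
// Plain-JS caricature of counting with GL_FUNC_ADD blending:
// every segment drawn over a pixel bumps that pixel's count.
// "segments" is a list of arrays of covered pixel indices
// (the real coverage is decided by the GPU's rasterizer).
function countCrossings(segments, nPixels){
    var counts = new Uint32Array(nPixels);
    for(var s=0; s<segments.length; s++)
        for(var i=0; i<segments[s].length; i++)
            counts[segments[s][i]] += 1; // additive blend
    return counts;
}
```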

Reflections on browser-based science

The browser is an amazing tool for building complex user interfaces quickly, especially if you’re already familiar with a good framework (such as Polymer, which I started using towards the end of this project).
In terms of parallelism support: you have basic multithreading with WebWorkers (although I recommend using a helper library such as my BridgedWorker); you have basic GPU rendering and, to some extent, compute (I managed to write a distance-matrix computation in WebGL at one point, but it was unpleasant and I never fully understood all the rules about precision); and you have basic compiler optimizations through V8. But all those things are decidedly “basic”, meaning that you have to work hard to get good performance if you need it, and ultimately there are serious limitations on quite how well you can really utilize the hardware. For example, there’s currently no SIMD support or true shared memory/atomics on the web, though that is due to change soon. Perhaps when it does we will get some decent numerical libraries, like those that we take for granted in Matlab/Numpy/C++ etc.

In summary, I’m fairly confident that the browser is going to become a better and better place to be manipulating serious data sets in a user-friendly way. (This is not a novel observation, but I wanted to say it in light of my own experience.)

