Back from our first Vispy code camp at ESRF

We had our first official Vispy Code Camp this week. The other core developers of Vispy and I were kindly invited by the European Synchrotron Radiation Facility. We presented our young library to software engineers from the ESRF and other European synchrotron facilities. It was also the occasion for us to give a gentle introduction to modern OpenGL, as many attendees didn’t have experience in real-time GPU rendering. We discovered various scientific use cases in need of high-bandwidth, low-latency real-time visualization of big data.

This was also the very first time we all gathered to work entirely on the project. We made productive use of our time together, discussing code architecture and design during most of the days and evenings. In particular, we had a close look at Luke Campagnola’s amazing work from the weeks before the meeting. Luke managed to digest all our prior discussions about the core layers of Vispy (visuals, transforms, scene graph, shader composition). He designed a very solid and promising system that does not sacrifice speed for flexibility. We also discussed many other aspects of the library and the project. Here is a summary.

Abstraction levels for interactive visualization

The core idea of Vispy is to offer different abstraction layers for high-performance interactive visualization. There is a huge gap between what a scientist wants to display (in terms of data and plots) and the OpenGL API. This is no different from the gap between a high-level language (such as Python or Haskell) and assembly code. Computer science is fundamentally based on this idea. In high-performance interactive visualization, we think that this gap has yet to be filled.

Interactive visualization deals with visualization on the one hand, and interactivity on the other. What are convenient abstraction levels for these two ideas? This is probably an open question in general. With Vispy, we’ll be offering one among many possible solutions. Importantly, we will also design modular building blocks for experimenting with different types of abstractions.

As far as visualization is concerned, we plan to design:

  • vispy.gloo: an object-oriented interface to the core features of modern OpenGL
  • vispy.visuals: an object-oriented reactive interface for various 2D and 3D visuals
  • vispy.shaders: an architecture for modular GLSL shaders
  • vispy.transform: a flexible system for handling various linear and non-linear coordinate systems (Cartesian, polar, log, geographical map projections…), with support for GPU acceleration when needed
  • vispy.scenegraph: a flexible and efficient scene graph that is designed with big data in mind
  • high-level plotting interfaces

There is obviously a tremendous amount of work down the line, but we are starting to have a good idea of how we’ll organize these modules. Hopefully, we’ll be able to reach the critical mass of code and contributors required for the realization of this project.

Renaud Blanch from UJF suggested that we start thinking about something similar for interactivity. Typically, user interactivity is implemented at the lowest possible level: mouse movements, keystrokes, etc. Higher-level abstraction systems may allow end-users to design interactive visualizations in a more intuitive way. There is a whole body of research on this topic (human-machine interfaces).

Object-oriented OpenGL

The main module implemented at this time is vispy.gloo; it supports our whole visualization stack. We discussed some relatively minor changes suggested by Nicolas Rougier to make the interface even simpler and cleaner. This object-oriented interface is already extremely convenient for us and other OpenGL developers. In effect, it allows us to focus on the what instead of the how. We define vertex buffers, textures, and variables, we write the shaders in GLSL, and we render the OpenGL programs, all with a Pythonic interface.
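To give a flavor of the interface, here is a small sketch of gloo-style code (the exact names and signatures may differ from the actual API):

import numpy as np
from vispy import gloo

vertex = """
attribute vec2 a_position;
void main() {
    gl_Position = vec4(a_position, 0.0, 1.0);
}
"""

fragment = """
void main() {
    gl_FragColor = vec4(1.0, 1.0, 0.0, 1.0);  // yellow
}
"""

# We say *what* to draw; gloo issues the actual OpenGL calls under the hood.
program = gloo.Program(vertex, fragment)
program['a_position'] = np.random.uniform(-1, 1, (1000, 2)).astype(np.float32)
program.draw('line_strip')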

Visuals

The visuals, transforms, modular shaders, and scene graph, are very much work in progress right now. We discussed these layers extensively during the code camp.

The visuals layer is one abstraction level above vispy.gloo. A visual is an object appearing in the scene. At this level, we start to get closer to the user’s mind. Vispy will eventually come with a library of common visuals: polylines, geometric 2D shapes, 3D meshes, Bézier curves, surfaces, etc. Those visuals will be extensible. Importantly, users will be able to write their own visuals for complex use cases. They will have to learn the basics of modern OpenGL, notably GLSL. We plan to provide very solid documentation on this subject. That being said, the core ideas are relatively simple.

Visuals will come with a reactive object-oriented interface: the properties of a visual may be updated by changing instance attributes in Python. The corresponding OpenGL commands are automatically issued under the hood.
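As a purely hypothetical sketch of what this could look like (not the actual Vispy interface), a Python property setter can forward attribute changes to the GPU:

class LineVisual(object):
    """Hypothetical reactive visual."""
    def __init__(self, program, positions):
        self._program = program
        self.positions = positions          # goes through the setter below

    @property
    def positions(self):
        return self._positions

    @positions.setter
    def positions(self, value):
        self._positions = value
        # Upload the new data to the GPU buffer bound to the attribute.
        self._program['a_position'] = value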

Using a small subset of the SVG specification for common shapes may also be an interesting idea.

Linear and non-linear transformations

With transformations, we allow visual objects to be organized in different coordinate systems. We tried to base this module on the mathematical notion of a bijective function. After all, transformations are little more than compositions of direct and inverse functions. Linear transformations can be expressed as matrix multiplications, but we don’t enforce this, in order to support non-linear transformations out of the box.
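The following sketch illustrates the idea with invented names (the actual design may differ): a transform is a pair of direct and inverse functions, and transforms are combined by mathematical composition.

import numpy as np

class Transform(object):
    def map(self, coords):                  # direct function
        raise NotImplementedError
    def imap(self, coords):                 # inverse function
        raise NotImplementedError
    def __mul__(self, other):
        return ChainTransform(self, other)

class ChainTransform(Transform):
    """Composition: (f * g).map(x) == f.map(g.map(x))."""
    def __init__(self, f, g):
        self.f, self.g = f, g
    def map(self, coords):
        return self.f.map(self.g.map(coords))
    def imap(self, coords):
        # The inverse of a composition applies the inverses in reverse order.
        return self.g.imap(self.f.imap(coords))

class LogTransform(Transform):
    """A non-linear transform: no matrix involved."""
    def map(self, coords):
        return np.log10(coords)
    def imap(self, coords):
        return 10. ** coords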

Modular shaders

Offering the possibility to write and organize modular shaders is one of the main challenges of the project. GLSL is a pretty low-level language, describing how vertices and fragments (i.e. pixels) are processed on the massively parallel GPU architecture. Shaders can become quite complex in real-world use cases. Yet, there are many recurring patterns in shaders. By allowing users to design shaders from compact building blocks, we drastically simplify the task of creating complex extendable visuals (DRY principle). We also need these features internally for the transforms.

Luke came up with a pretty amazing modular system that seems to encompass all our use cases. The amount of programming wizardry involved is quite stunning. Nicolas pointed out that we were basically creating a new language on top of GLSL along with a dynamic compiler to GLSL. Although many details remain to be worked out, I think we have here a brilliant system that will prove vastly useful for the whole project.

Scene graph

With the visuals, the transforms, and the modular shaders, we have everything we need to build a flexible and efficient GPU-aware scene graph. The idea of the scene graph is classic: there is a hierarchy of visual objects that are linked by specific transforms. Imagine, for example, a scientific interactive figure with multiple subplots (grid layout): there are multiple coordinate systems involved. For maximum performance, transforms may happen on the CPU or the GPU depending on the use case. For instance, static transforms may be computed and cached once on the CPU, whereas it may be more efficient to perform dynamic transforms on the GPU.
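Reusing the transform sketch above (again with invented names), the core mechanism can be sketched in a few lines: each node carries a transform to its parent’s coordinate system, and mapping a visual to the root is just composition along the path.

class Node(object):
    def __init__(self, transform, parent=None):
        self.transform = transform          # maps local -> parent coordinates
        self.parent = parent

    def transform_to_root(self):
        """Compose the transforms along the path from this node to the root."""
        result, node = self.transform, self.parent
        while node is not None:
            # Parent transforms are applied on top of this node's transform.
            result = node.transform * result
            node = node.parent
        return result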

High-level interfaces

The layers described above constitute the internals of Vispy, and most users won’t be aware of them. Eventually, we’ll need to implement high-level interfaces for scientific plotting. Although we could design a brand-new interface, it will be safer to implement existing high-level APIs.

We talked a bit about the different possibilities, starting with the MATLAB/matplotlib.pyplot interface. Although this interface is admittedly clunky, many scientists are used to it. We could either reimplement the most important functions, or find a way to leverage the existing implementation in matplotlib. One interesting direction is Jake Vanderplas’ current work on an exporter for matplotlib figures. The idea is to export a plot in a language-independent representation, so that it can be easily displayed with another backend (such as Vispy).

Of course, there are alternative interfaces and plotting libraries that we could take inspiration from: seaborn, vincent/vega, bokeh, plot.ly, etc. Even though we are starting to think about these issues now, we’re currently focusing on the core layers, keeping plotting use cases in mind.

Vispy in the browser

A long-awaited feature is the ability to run Vispy in the browser. The main use case is the IPython notebook. I’ve thought a lot about the different ways to achieve this. We discussed many of them during the code camp, and I think we made some progress.

First, it should be relatively easy to implement an online backend. A Python server would stream OpenGL commands straight to the browser through WebSockets. This would enable interactive visualizations embedded in live IPython notebooks.

In parallel, an offline backend would be even more interesting, but highly challenging. The idea is to compile a visualization written in Python to a standalone HTML/Javascript interactive document.

After exploring multiple ideas, I’m now thinking that the cleanest way of bringing Vispy to the browser would be to:

  • Allow serialization of visuals, entities, transforms, scene graphs, interactivity.
  • Implement an interpreter in Javascript for displaying serialized visualizations.

This represents a significant amount of work, notably the first part. But we can do it progressively. The interpreter would be much less complex than Vispy itself, mainly because Python would still be responsible for the most complex part, that is, initialization of the scene with user-provided code.
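As a rough sketch of what a serialized scene could look like (the structure and field names here are invented, not a specification):

import json

scene = {
    "visuals": [{
        "type": "polyline",
        "color": [1., 1., 0., 1.],
        "data": {"position": "buffer_0"},   # reference to a data buffer
        "transform": {"type": "log"},
    }],
    "buffers": {"buffer_0": {"dtype": "float32", "shape": [10000, 2]}},
}

# This message would be sent to the Javascript interpreter, which would
# rebuild the corresponding WebGL objects in the browser.
message = json.dumps(scene)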

Desktop and ES OpenGL

An issue we discussed a lot relates to the different flavors of OpenGL. We currently limit the set of features to OpenGL ES 2.0. This lightweight version of OpenGL works on desktop and mobile devices, as well as in the browser. Having a single implementation makes it easier to share code between different devices and platforms. However, OpenGL ES 2.0 lacks a few interesting features that do exist on many desktop systems. We have yet to find a convenient system for explicitly enabling non-ES features.

OpenGL wrapper, ANGLE

Almar Klein has been busy on the train implementing his own OpenGL ES wrapper in Python with ctypes, thereby bypassing PyOpenGL. He also succeeded in using ANGLE on Windows with this wrapper, bringing modern OpenGL to most Windows users. ANGLE automatically translates OpenGL API calls to DirectX. This is quite a useful feature for those Windows users who only have the default OpenGL 1.1 implementation in Windows. This will considerably simplify the distribution of OpenGL-based applications to Windows users.

Image registration for continuous integration

Eric Larson has set up a continuous integration system for Vispy with Travis CI. He also did some great work improving our testing suite. However, we have yet to check the bitmap output of rendering tests. A difficulty lies in the fact that different OpenGL implementations do not produce pixel-identical output. We started some preliminary work to look at the discrepancies between images generated by various implementations.
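The basic idea is tolerance-based comparison rather than bit-for-bit identity. A minimal sketch with NumPy (the thresholds are arbitrary):

import numpy as np

def images_close(img1, img2, max_mean_error=2., max_pixel_error=20.):
    """Compare two (height, width, 3) uint8 renderings with a tolerance."""
    diff = np.abs(img1.astype(float) - img2.astype(float))
    return diff.mean() <= max_mean_error and diff.max() <= max_pixel_error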

Installation and library dependencies

Even if Vispy is still a pure Python library for now (depending only on NumPy and PyOpenGL), this might change in the future. In specific instances, we may need to implement complex algorithms in C or Cython. This will complicate the installation, unless we find a way to achieve graceful degradation in the absence of a C compiler or external dependencies. In particular, SciPy seems to be quite a heavy dependency, and we should avoid relying on it if possible.

OpenCL/OpenGL interoperability

Jérôme Kieffer and Armando Solé from ESRF were interested in combining OpenGL and OpenCL with Vispy. The idea is to allocate a single buffer on the GPU for both visualization and computing. For example, one can create an OpenGL texture or buffer, and perform general-purpose computations on it from an OpenCL kernel. This is quite efficient since there is no copy whatsoever between the CPU and GPU.
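Here is a minimal sketch of the mechanism with PyOpenCL, assuming an OpenGL context is current and vbo_id is the identifier of an existing OpenGL buffer containing n_floats values (context creation details vary across platforms and drivers):

import pyopencl as cl
from pyopencl.tools import get_gl_sharing_context_properties

platform = cl.get_platforms()[0]
ctx = cl.Context(properties=[(cl.context_properties.PLATFORM, platform)]
                 + get_gl_sharing_context_properties())
queue = cl.CommandQueue(ctx)

# Wrap the existing OpenGL buffer as an OpenCL memory object: no CPU copy.
clbuf = cl.GLBuffer(ctx, cl.mem_flags.READ_WRITE, int(vbo_id))

program = cl.Program(ctx, """
__kernel void scale(__global float* buf) {
    int i = get_global_id(0);
    buf[i] *= 2.0f;   // compute directly on the shared GPU buffer
}
""").build()

cl.enqueue_acquire_gl_objects(queue, [clbuf])    # OpenGL hands the buffer over
program.scale(queue, (n_floats,), None, clbuf)
cl.enqueue_release_gl_objects(queue, [clbuf])    # hand it back for rendering
queue.finish()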

After fighting against driver and OS-specific bugs of various kinds with OpenCL, we finally managed to enable OpenGL-OpenCL interoperability with Vispy. We have yet to do detailed performance benchmarks with various backends and OpenGL wrappers. We’re also working on encapsulating boilerplate code in a clean Pythonic API.

Out-of-memory visualization with HDF5

I presented a few demos implementing out-of-memory visualization of HDF5 files with Galry. We have yet to port those to Vispy, but there shouldn’t be any particular difficulty in the process. Armando shared with us his long expertise in optimizing HDF5 data access. There are many tricks and techniques to get the most performance out of HDF5 in Python.

A molecular viewer with true impostors

The code camp was also the occasion for some of the participants to implement demos in modern OpenGL, using our object-oriented interface vispy.gloo.

Gaël Goret implemented an interactive 3D viewer of molecules with vispy.gloo. This viewer is extremely efficient: Nicolas suggested using true impostors. This smart technique consists of using a tiny number of vertices (or even a single one) per sphere, instead of rendering spheres with complex meshes. Realistic 3D rendering is achieved with a ray-casting algorithm implemented in the fragment shader. Be sure to check out the demo here.
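For the curious, here is a sketch of what a sphere impostor fragment shader can look like (a simplified version for illustration, not the actual code of the demo):

FS_IMPOSTOR = """
varying vec4 sphere_color;
void main() {
    // gl_PointCoord spans [0, 1]^2 across the point sprite; remap to [-1, 1].
    vec2 p = 2. * gl_PointCoord - 1.;
    float r2 = dot(p, p);
    if (r2 > 1.)
        discard;                      // this pixel is outside the sphere
    // Reconstruct the sphere normal at this pixel (the ray-casting step).
    vec3 normal = vec3(p, sqrt(1. - r2));
    // Simple diffuse shading with a fixed light direction.
    float diffuse = max(dot(normal, normalize(vec3(1., 1., 2.))), 0.);
    gl_FragColor = vec4(sphere_color.rgb * diffuse, sphere_color.a);
}
"""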

Wrap-up

This was a highly productive meeting, and we’re all quite excited about what’s coming. We’re starting to overcome most conceptual challenges. Code is being written, discussed, tested. More and more people with various areas of expertise are willing to contribute.

We’re also producing more and more documentation material (when will we see a Vispy book?). This is a fundamental aspect of the project. Indeed, our goal is not only to build a library, but also a knowledge base. Scientists are generally not exposed to modern OpenGL, although it is a decade-old subject (generally the domain of expertise of game developers). The high complexity of OpenGL is probably an important reason why it is still not widespread in scientific visualization. Vispy hides most of this complexity, offering simple and clean interfaces that specifically target scientific visualization. We really want to bring OpenGL to scientists.

So, that’s a wrap. We’re deeply grateful to the ESRF staff for their support, and particularly Jérôme and Armando who decided to invite all of us. This was a fantastic opportunity for the project, and we hope we’ll be able to organize more events like this in the future. In the meantime, the development continues!

PS: the ESRF Data Analysis Unit is recruiting an OpenGL/OpenCL/Python expert for high-performance interactive visualization of big scientific data. Be sure to check out the announcement, and pass the word around if you know potentially interested people!

Hardware-accelerated interactive data visualization in Python

There have been several interesting discussions recently about the future of visualization in Python. Jake Vanderplas wrote a detailed post about the current state of the art of visualization software in Python. Michael Droettboom, one of Matplotlib’s core developers, subsequently wrote about the future challenges Matplotlib will need to tackle. Matplotlib was designed more than ten years ago, and now needs to embrace modern trends including web frontends, high-level R-style plotting interfaces, hardware-accelerated visualization, etc.

One trend that I find particularly interesting concerns hardware acceleration. Standard graphics cards are now extremely powerful and are routinely used in 2D or 3D video games, web browsers, and user interfaces. Mobile devices such as smartphones, tablets, and even the Raspberry Pi include a decent graphics processing unit (GPU). There is no reason why hardware acceleration should be absent from scientific visualization, where there is a really pressing need to display huge data sets. Graphics cards are now the most powerful computing units in a computer, and they are specifically adapted to fast visualization.

Hardware-accelerated visualization is not always relevant. It is slightly less portable than a pure software implementation. It requires the graphics card drivers to be up-to-date, which is not always the case on non-technical users’ computers. And it is not always necessary, as plots with a reasonable amount of data can still be smoothly displayed without using the graphics card at all. In fact, a clear distinction needs to be made between plotting and interactive visualization. In the former, the emphasis is on the generation of high-quality, publication-ready static plots that represent a short and precise summary of the data. In the latter, the objective is to let the user look at the raw data, figure out which statistics are interesting, and spot unexpected patterns. This step comes before the generation of publication-ready plots, and it benefits much more from hardware acceleration than common plotting does. Existing plotting libraries typically do not make the distinction between these two use cases.

Software

Hardware-accelerated visualization software is still a work in progress. As far as Python is concerned, libraries include Galry, which I started a few months ago; Glumpy; Visvis; and PyQtGraph, which supports hardware-accelerated 3D visualization. All these libraries use OpenGL for hardware acceleration. Let’s also mention Mayavi, which is specifically designed for 3D rendering.

I started Galry because I needed to plot very large data sets containing more than ten million points, and no existing library could handle data that large. One may wonder why someone would need to plot more than ten million points when screens have only a few million pixels. The answer is simple: to visualize large datasets interactively. For instance, displaying an outline of the whole dataset before zooming in on the regions of interest. Huge datasets are not rare today: a one-hour signal sampled at 20 kHz already represents 72 million points (a typical example being an intracellular recording). This gets worse with multi-channel data, as in multi-electrode extracellular recordings. There can be tens or even hundreds of signals in parallel (for example with recordings in the retina). These are typical cases where the graphics card can really be helpful for visualization.

Example: multi-channel recordings viewer

In fact, I recently wrote a small prototype with Galry for quick visualization of multi-channel recordings. The data is assumed to be stored in an HDF5 file, which can weigh several gigabytes. Such a large file cannot fit entirely in system memory, let alone graphics memory, so it needs to be fetched dynamically as one navigates within the plot. Reading data from the disk happens in a background thread so that the interface does not freeze. When zooming out, the signals are automatically and dynamically undersampled. These techniques enable fast and smooth interactive visualization of huge datasets.
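A minimal sketch of the fetching technique with h5py (the file and dataset names are made up for the example; in the actual prototype this runs in a background thread):

import h5py

def fetch_view(filename, dataset, t0, t1, max_points=10000):
    """Return an undersampled slice [t0:t1] of a huge 1D signal."""
    with h5py.File(filename, 'r') as f:
        data = f[dataset]                     # no data is read yet
        step = max(1, (t1 - t0) // max_points)
        # h5py reads only the requested strided slice from disk.
        return data[t0:t1:step]

# chunk = fetch_view('recording.h5', '/channel_00', 0, 72000000)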

Web-based interactive visualization

Another interesting direction concerns web browser frontends. We spend more and more time in the browser in general. The success of the IPython notebook shows that interactive computing can seamlessly happen in the browser. There are excellent web-based visualization libraries out there, like D3 or three.js. The WebGL standard, although young, brings native, plugin-free hardware-accelerated graphics to the browser via OpenGL ES 2.0. It is supported in most recent browsers. Even Internet Explorer might support it in the future. So the technology exists for bringing high-performance interactive data visualization to the browser. There is definitely some work to do here.

In Python, web-based interactive visualization could be done in two ways. In the first, VNC-like approach, user actions are captured by Javascript and transferred to Python, which generates a static visualization and returns it to the browser as a compressed image. In the second approach, the actual visualization happens entirely in the browser. The first approach is probably more generic and simpler to implement, whereas the second could be useful for static notebooks or webpages. From a data-sharing perspective, having a pure HTML/Javascript interactive visualization that does not require a Python backend engine would be quite relevant. Both approaches are complementary.

The future

In summary, the technology for hardware-accelerated interactive visualization in Python and in the browser is there. Different Python libraries have been developed independently; they have different code bases, offer different programming interfaces, and are not mutually compatible. For these reasons, the developers of these libraries are currently working on a new project whose goal is to allow more code sharing between these libraries, and to provide interfaces to OpenGL at different levels of abstraction and generality. The long-term goal of this project is to offer a single library for high-performance visualization, be it 2D or 3D, for scientific purposes or not, etc. But for now, we’re focusing on building high-quality modules operating at different levels of abstraction, from low-level OpenGL to high-level plotting and interaction commands. We’ll start with the low-level blocks, so that we can rewrite our different libraries on top of them.

That’s an ambitious and difficult project that will take time, but the core technology and code is already there. The hard part is to create a coherent, flexible and high-quality architecture for supporting different levels of abstraction in a seamless and efficient way. I’m confident that we’ll come up with a solid piece of software at some point. And of course, if you’re interested, you’re welcome to contribute.

Galry’s Story, or the Quest of Multi-Million Plots

About a month ago, I announced here the availability of a new experimental high-performance visualization package in Python that I’m developing as part of my current research project. It has significantly evolved since then, but it is still experimental. Moreover, the interface is still not ready for a 0.1 release. I also need to do many more tests on various systems and graphics cards. In this post I’ll talk about how the idea of writing a new visualization package came up in the first place. I’ll also describe the new features that are coming to the library.

After my announcement, I was pleased to see that a lot of people were interested in this project. There have been more than 500 unique visits since then, which is not that impressive but still much more than I’d have thought! That’s probably because I wasn’t the only one to notice that it was simply not possible to plot huge datasets in Python. Matplotlib, probably the most popular plotting library in Python, crashes before displaying a multi-million-point dataset (at least that’s what I experienced on my machine), or, when it works, navigation is severely limited by an extremely low framerate.

All other plotting libraries I could find had the same issue. The Matlab plotting library appears to be a bit more efficient than matplotlib when it comes to multi-million-point datasets, and this may be one of the reasons why many people still prefer Matlab over Python.

I think many people are doing just fine with matplotlib because they simply don’t work with very large datasets. But that may be about to change, with “big data” becoming a more and more popular buzzword. In bioinformatics, the mass of data becoming available is simply crazy. There’s the whole field of bioimaging of course, but even apparently harmless time-dependent signals can become quite large. Imagine, for example, a neurophysiological recording with an extracellular multi-electrode array with 250 channels, each channel sampling a signal at 16 bits and 20 kHz (this is close to a real example). That’s 10 MB of data per second (5 million points), and more than 30 GB per hour (18 billion points)! A modern hard drive can store that, but processing such a big file is simply not straightforward: it doesn’t even fit in system memory (at least on most of today’s computers), and even less in graphics memory. Yet, is it too much to ask to just plot these data?

The typical way of processing this is to take chunks of data, either in space or in time. But when it comes to visualization, it’s hardly possible to plot even a single second across all channels, since that’s already 5 million points!

One could argue that a modern screen does not contain much more than 2 million pixels, and only about 2,000 horizontally. But the whole point of interactive navigation (zooming and panning) is to be able to plot the whole signal at first, and then zoom in on regions of interest in real time.

I could not find any Python library that would allow me to do that. Outside Python, I am not aware of such software either. That’s precisely why I decided to try a new approach, which is to use the graphics card for the whole rendering process in the most efficient way possible. I realized that the only way to achieve the highest performance possible on a given piece of hardware was to go as low-level as I could with Python. Using a great and lightweight Python wrapper around OpenGL (unsurprisingly called PyOpenGL) seemed like a natural choice. Initial proof-of-concept experiments with PyOpenGL suggested that this was a viable approach.

That’s how Galry was born earlier this year.

Here come shaders

The library has evolved a lot since then. I had to go through multiple improvements and refactoring sessions as I was using Galry for my research project. In addition, I had to learn OpenGL in parallel. That was not an excellent idea, since I realized several times that I was doing it wrong. In particular, at first I was using a totally obsolete way of rendering points: the fixed-function pipeline. When I discovered that the modern way of using OpenGL was to use customizable shaders, I had to go through a substantial rewrite of the whole rendering engine. I could have spared myself this rewrite had I been aware of that point beforehand.

But it was, in the end, a very good decision, since programmable shaders are just infinitely more powerful than the fixed-function pipeline, and they make a whole range of new things possible with the package. Not only was I able to considerably improve the rendering part of my research project, but I realized that the same code could be used to do much more than just plotting. Here are a few examples of what I was able to do with the new “shader-aware” interface: GPU-based image filtering, a GPU-based particle system, a GPU-based fractal viewer, 3D rendering, dynamic graph rendering (CPU-based for now), etc. These are all actual working examples in the Galry repository. I suppose this package could also be used to write small video games!

The following video shows a demo of the graph example. This example incorporates many of the rendering techniques currently implemented in Galry: point sprites (a single texture attached to many points), lines, buffer references (the nodes and edges are rendered using the exact same memory buffer on the GPU, which contains the node positions), indexed rendering (the edges are rendered using indices targeting the corresponding nodes, always stored in the same buffer), and real-time buffer updating (the positions are updated on the CPU and transferred to the GPU at every frame). GPU-based position updates may also be possible, but that is not straightforward, since each shader execution would need access to the other vertices’ data and would have to modify positions dynamically. I might investigate this some time. Another solution is to use OpenCL, but it requires an OpenCL installation (it can work even if an OpenCL-capable GPU is not available, in which case the OpenCL kernels are executed on the CPU). A sketch of the indexed-rendering idea follows the video.

[youtube]http://www.youtube.com/watch?v=MLW3i-_yQ-k[/youtube]
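Here is the promised sketch of buffer references and indexed rendering with PyOpenGL (client-side arrays are used for brevity; the actual code uses vertex buffer objects):

import numpy as np
import OpenGL.GL as gl

positions = np.random.uniform(-1, 1, (100, 2)).astype(np.float32)  # nodes
edges = np.random.randint(0, 100, (200, 2)).astype(np.uint32)      # node indices

gl.glEnableVertexAttribArray(0)
gl.glVertexAttribPointer(0, 2, gl.GL_FLOAT, gl.GL_FALSE, 0, positions)

# Draw the edges: each index pair references two positions in the buffer.
gl.glDrawElements(gl.GL_LINES, edges.size, gl.GL_UNSIGNED_INT, edges)
# Draw the nodes from the exact same buffer: no data is duplicated.
gl.glDrawArrays(gl.GL_POINTS, 0, len(positions))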

About OpenGL

Another thing I discovered a bit late was that OpenGL is a fragmented library: there are multiple versions, a lot of different extensions (some specific to a hardware vendor), and a lot of deprecated features. There’s also a specific version of OpenGL for mobile platforms (such as the iPhone and the iPad), called OpenGL ES, which is based on OpenGL but still different. In particular, a lot of deprecated features in OpenGL are simply unavailable in OpenGL ES. Also, the shading language (GLSL) is not the same between OpenGL and OpenGL ES. There’s a loose correspondence between the two, but the version numbers do not match at all. And, by the way, the GLSL version does not match the corresponding OpenGL version either… except for later versions! Really confusing.

The OpenGL ES story is important for Galry, because OpenGL ES is apparently sometimes used in VirtualBox for hardware acceleration, and it might also be useful in the future for a potential mobile version of Galry. In addition, OpenGL ES forms the basis of WebGL, enabling access to OpenGL in the browser. I’ll talk about that below, but the point is that, in order to support multiple versions of OpenGL, I had to redesign an important part of the rendering process once again (using a small template system for dynamic shader code generation depending on the GLSL version).

Also, whereas the shading language is quite nice and easy to use, I find the host OpenGL API unintuitive and sometimes obscure. The Galry programming interface is right there to hide those details from the developer.

In brief, I find certain aspects of OpenGL a bit messy, but the advantages and the power of the library are definitely worth it.

About writing multi-platform software

Python is great for multi-platform software. Choosing Python for a new project means that one has the best chance of having a single code base for all operating systems out there. In theory, the same goes for OpenGL, since it’s a widely used open standard. In practice, it’s much more difficult due to the fragmentation of OpenGL versions and drivers across different systems and graphics card manufacturers. Writing a multi-platform system means that all supported systems need to be tested, and that’s not particularly easy to do in practice: there is a large number of combinations of systems (Windows, different Linux distributions, Mac OS X, in both 32-bit and 64-bit flavors), graphics card drivers, versions of Python/PyOpenGL/Qt, etc.

High-level interface

In the current experimental version of Galry, the low-level API is the only interface I’ve been working on, since it’s really what I need for my project. However, I do plan to write a basic matplotlib-like high-level interface in the near future. At some point, I even considered integrating Galry’s code into a matplotlib GL backend, which is apparently something that several people have been trying to do for quite some time. However, as far as I understand, this is very much non-trivial due to the internal architecture of matplotlib. The backend handles the rendering process and is asked to redraw everything at each frame during interactive navigation. However, high performance is achieved in Galry by loading data at initialization time only, and updating the transformation at every frame so that the GPU can apply it to the data. The backend does not appear to have access to that transformation, so I can’t see how an efficient GL renderer could be written within the current architecture. But I’m pretty sure that somebody will manage to make that happen eventually.

In the meantime, I will probably write a high-level interface from scratch, without using matplotlib at all. The goal is to replace import matplotlib.pyplot as plt by something like import galry.plot as plt at the top of a script to use Galry instead of matplotlib. At first, I’ll probably only implement the most common functions such as plot, imshow, etc. That would already be very useful.

Galry in the browser

Fernando Perez, the author of IPython, suggested integrating Galry into the IPython notebook. The notebook is a relatively new feature that allows one to write (I)Python code in cells within an HTML page and output the result below. That’s quite similar to what Mathematica or Maple offer. The whole interactive session can then be saved as a JSON file. It brings reproducibility and coherent structure to interactive computing. Headers, text, and even static images with matplotlib can be integrated into a notebook. Blog posts, courses, and even technical books are being written with it.

I personally heard about the notebook some time ago, but I’d never tried it because I was a bit reluctant to use Python in a browser instead of a console. After Fernando’s suggestion, I tried the notebook and quickly understood why so many people are so enthusiastic about it. It changes the very way we do exploratory research with numerical experiments. In a classical workflow, one would write some computational process in a Python script, and use the interactive console to execute it, explore the model in the parameter space, etc. It works, but it can be terrible for reproducibility: there’s no way one can recover the exact set of parameters and code that corresponds to figure test34_old_newnewbis.png. Many people are dealing with this problem, myself included. I’m quite ashamed by the file structure of most of my past research projects’ code, and I’ll try to use the notebook in the future to be more organized than before.

The idea of integrating Galry into the notebook comes from work done during a Python conference earlier this month, when a 3D molecular visualization toolkit was integrated into the notebook using WebGL. WebGL is a standard specification derived from OpenGL that aims to bring OpenGL rendering to the browser, through the HTML5 <canvas> element. It is an ongoing project that may take months or years to complete. Currently, it is only supported by the latest versions of modern browsers such as Chrome or Firefox (no IE, of course). But it’s an exciting technology with huge potential.

So I thought it would be quite a good idea, and I gave it a try: I managed to implement a proof of concept in only one day, by looking at the code written during the conference.

[youtube]http://www.youtube.com/watch?v=taN4TobRS-E[/youtube]

Then, I was able to see what would need to be done in the code’s architecture to make that integration smooth and flexible. The solution I chose was to completely separate the scene definition (what to plot exactly, with all parameters, data, colors, shading code, etc.) from the actual GL rendering code. The scene can be serialized in JSON and transmitted to a Javascript/WebGL renderer. I had to rewrite a portion of the Python renderer in Javascript, which turned out to be less painful than I thought.

Finally, the WebGL renderer can in fact be used as a completely standalone Javascript library, without any reference to Python. This may allow interesting scenarios, such as converting a Python plotting script written with a matplotlib-like high-level interface into standalone HTML/Javascript code that enables interactive visualization of that plot in the browser.

About performance

The major objective of Galry is, by far, performance. I found that PyOpenGL can be very fast, on the crucial condition that it is used correctly. In particular, data transfer from system memory to graphics memory should happen only when necessary. Also, the number of calls to OpenGL commands should be minimal in the rendering phase.

The first point means that data should be uploaded to the GPU at initialization time, and should stay on the GPU as long as possible. When zooming in, the GPU should handle the whole transformation on the same memory buffer. This ensures that the GPU is used optimally. In Matplotlib, as far as I know, everything is rendered again at each frame, which explains why the performance is not very good. And the CPU does the rendering in this case, not the GPU.
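A sketch of what this looks like in a vertex shader: the data buffer never changes, and interactive navigation only updates two small uniform variables at every frame.

VS_NAVIGATION = """
attribute vec2 position;    // uploaded once, stays in GPU memory
uniform vec2 scale;         // updated at every frame: a few bytes only
uniform vec2 translation;
void main() {
    gl_Position = vec4(scale * (position + translation), 0., 1.);
}
"""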

The second point is also crucial. When plotting a large number of individual points, or a single curve, it is possible to call a single OpenGL rendering command, so that the Python overhead is negligible compared to the actual GPU rendering phase. But when it comes to a plot containing a large number of individual curves, using a Python loop is highly inefficient, especially when every call renders a small curve. The best solution I could find is to use glMultiDrawArrays or glMultiDrawElements, which render several primitives from a single memory buffer with a single command. Even if this function is implemented internally as a loop by the driver, it will still be much faster than a Python loop, since there is no interpretation overhead.
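A sketch of the multi-draw technique, assuming the curves are stored contiguously in a single bound vertex buffer:

import numpy as np
import OpenGL.GL as gl

n_curves, n_points = 100, 10000
first = (np.arange(n_curves) * n_points).astype(np.int32)   # curve offsets
count = np.full(n_curves, n_points, dtype=np.int32)         # curve lengths

# One call replaces a Python loop of n_curves draw calls.
gl.glMultiDrawArrays(gl.GL_LINE_STRIP, first, count, n_curves)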

With this technique, I am able to plot 100 curves with 1 million points each at ~15 FPS on a recent graphics card. That’s 1.5 billion points per second! Such performance is directly related to the incredible power of modern GPUs, which is literally mind-blowing.

Yet, I think there’s still some room for improvement by using dynamic undersampling techniques. But that is for the future…

The Power of Shaders in Real-Time Graphics Programming

I’ve been programming in OpenGL for a few months. Like a lot of programmers, I learnt the language by myself, thanks to various tutorials, books, and e-books on the subject. One couldn’t say there’s a lack of resources on this 20-year-old language, since it’s so widely used throughout the world. Yet, I was surprised to discover a few weeks ago that the vast majority of what I had learnt has been obsolete for almost a decade. The reason is that too many textbooks and Internet tutorials about OpenGL teach a deprecated way of programming, the one based on the fixed-function pipeline. The modern way of programming in OpenGL is to use the programmable pipeline, through shaders. The free e-book by Jason McKesson is a very good resource for learning modern OpenGL programming using the programmable pipeline.

The graphics pipeline

I’ll try to explain what this is all about in simple terms, assuming no prior knowledge of graphics programming. Ever since graphics cards have existed, rendering in a computer can be seen as a pipeline. At the top, there’s the data that enters the graphics card memory. At the bottom, there are the pixels on the screen. The role of the graphics card is, and has always been, to transform the data into pixels. The reason this is complicated is that, most of the time, the graphics card needs to render 3D data on a 2D screen. Indeed, graphics cards were created primarily to allow real-time rendering in 3D video games. So, first, the 3D data describing the game world and characters enters the graphics card memory. Then, the graphics card transforms the data into a 2D scene that corresponds to the projection of the world onto a virtual camera. In a first-person shooter, for instance, this camera corresponds to the eyes of the main character.

Projecting a 3D world onto a 2D camera is not mathematically complicated, and only involves basic linear algebra. However, it can be expensive in terms of computing power, since the number of mathematical operations to perform at every frame increases with the scene complexity. More detail in the game implies a higher number of points, and therefore a higher number of operations. These operations constitute a highly parallel problem, since, in general, the same mathematical operation is performed on all points (for instance, when the camera moves, the same linear transformation is applied to all points). The power of graphics cards comes precisely from their highly parallel architecture, which lets them compute this kind of linear transformation in a very efficient way.
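As a concrete illustration, the whole chain can be written as a product of matrices applied to every vertex: \( p_{\mathrm{screen}} = P \cdot V \cdot M \cdot p_{\mathrm{world}} \), where \(M\) places the object in the world, \(V\) maps the world to the camera, and \(P\) performs the projection. When the camera moves, only \(V\) changes, and the same product is recomputed for all points in parallel.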

So, coming back to the rendering pipeline, the transformation of the data involves linear transformations to get the 2D positions of the points on the camera, then rasterization to transform primitives (made up of vertices) into colored pixels. In addition, an important aspect of 3D rendering has to do with lighting, which plays an essential role in the realism of the scene. Real-time realistic lighting is a complicated subject. There are also the issues of textures, reflections, etc. So, over the years, graphics card programming has become increasingly complicated in order to account for more and more complex and realistic rendering algorithms.

The advent of programmable shaders

Then an alternative approach emerged: bringing flexibility into this pipeline by giving the programmer the possibility to customize it. Instead of having fixed, hard-coded rendering algorithms, the programmer can write small programs in a low-level language called a shading language. A shader program is executed independently and in parallel over all vertices or pixels, and transforms them in order to implement custom rendering algorithms. This technique has allowed real-time special effects that would not have been possible before.

Several types of shaders exist; the most common ones are the vertex shader and the fragment shader. The vertex shader is executed once per vertex and can transform its position. Applications include any special effect that requires independent transformations of vertices that would not be possible globally (e.g. cloth simulation, hair movement, morphing, particle system rendering, special lighting effects, etc.).

The fragment shader is executed once per pixel and can transform its final color. Applications include everything related to textures, lighting, reflections, etc.

Shaders are extremely powerful, for they give the programmer full control of the graphics card, making it do what it does best: real-time rendering. The deep reason they are so powerful is that shaders execute on the GPU in a fully parallel way, so they exploit the parallel architecture of the graphics card in the most efficient way possible. The hundreds or thousands of cores all simultaneously execute the same little programs that transform the megabytes or gigabytes of data stored in GPU memory into millions of pixels on the screen. And the precise algorithms are up to the programmer rather than the graphics card manufacturer.

In the early days of programmable pipeline rendering, shaders were written in an assembly language, making them highly difficult to write, design, read, and debug. Then more readable languages were designed, such as GLSL (OpenGL), HLSL (DirectX), and Cg (Nvidia). These languages look very much like C, even though they target very different architectures from those of typical central processing units. The simple and widespread syntax has made shader programming accessible to most graphics programmers.

Today, shaders are widely used in virtually all 3D video games. Yet, very few OpenGL resources address them. Instead, tutorials and lessons explain how to use the fixed-function pipeline to perform transform and lighting on the GPU, without mentioning that this way of doing things has been obsolete for nearly 10 years! Now, even old graphics cards fully support programmable shaders, so I can’t see any good reason not to use them. They are just so powerful and easy to program, and they allow for a thorough understanding of how modern graphics cards work and how to use their extreme computational power to its full extent.

Concrete example of shaders in OpenGL

I will now give an example of using shaders in PyOpenGL, extending my previous tutorial on PyOpenGL. In OpenGL, shaders are written in GLSL. Several versions of GLSL exist, and the version supported by the GPU depends on the version of OpenGL implemented in the graphics card drivers. I will assume that OpenGL 3.3 (with GLSL 3.30) is supported.

First, when using shaders, it may be a good idea to get rid of all code related to the fixed-function pipeline. It is now completely deprecated, yet almost no tutorial mentions that. For example, here is a non-exhaustive list of deprecated OpenGL functions:

glColorPointer, glVertexPointer, glEnableClientState, glLoadIdentity,
glLoadMatrix, glMultMatrix, glOrtho*, glPopMatrix, glRotate*, glScale*,
glTranslate*, glMaterial*, glLight*...

The details can be found in the official specifications. It means that all functions related to transformations, lighting, texturing, etc. are deprecated and should instead be implemented in vertex and fragment shaders. Concerning matrix transformations, it means that matrices need to be passed to the vertex shader through uniform variables, and then be explicitly multiplied by the position vectors (which are attribute variables). The same goes for lighting and texturing.
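As a small sketch of the modern approach (the matrix itself would be computed on the CPU, for example with NumPy, and uploaded as a uniform):

VS_TRANSFORM = """
#version 330
layout(location = 0) in vec2 position;
uniform mat4 mvp;    // model-view-projection matrix, passed as a uniform
void main()
{
    // The explicit multiplication replaces glTranslate/glRotate/glOrtho & co.
    gl_Position = mvp * vec4(position, 0., 1.);
}
"""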

Example description

In this simple example, a null sampled function (\(x \in [-1, 1], y = 0\)) is loaded on the GPU as an attribute variable named position. An attribute variable is an array of scalars or vectors (of dimension 2, 3 or 4) that is loaded on the GPU as a vertex buffer object. It has a name, a type, and a location. The location is an integer that should be unique within a shader program. Here, position is an attribute of type vec2 with location 0. It contains the coordinates of the vertices. There are \(N\) vertices, so \(N\) vectors with vertex coordinates, and \(N\) executions of the vertex shader. Each thread takes one vec2 position as input and returns the final position in the special variable gl_Position. If linear transformations need to be applied, one multiplies the matrices by position and assigns the result to gl_Position.

Vertex shader

Here is the source code of the vertex shader in this example.

# Vertex shader.
VS = """
#version 330
// Attribute variable that contains coordinates of the vertices.
layout(location = 0) in vec2 position;
 
// Main function, which needs to set `gl_Position`.
void main()
{
    // The final position is transformed from a null signal to a sinewave here.
    // We pass the position to gl_Position, converting it into a 4D vector.
    // The third (z) coordinate is 0 for a 2D figure, and the fourth,
    // homogeneous coordinate must be 1.
    gl_Position = vec4(position.x, .2 * sin(20 * position.x), 0., 1.);
}
"""

The code should be self-explanatory. The x coordinate of the position is used to calculate the y coordinate (through a sine function). We don’t use the y coordinate at all, so we could also have used an array of floats for the position. The special variable gl_Position has four components: the third is the z dimension (not used here since we render a 2D scene), and the last is the fourth, homogeneous coordinate, which is not relevant in 2D rendering and should be fixed to 1.

Fragment shader

The fragment shader is executed once per rasterized pixel. It takes possible vertex shader outputs as inputs (none here) and returns the pixel color as an output. The output color needs to be explicitly declared. Usage of the special gl_FragColor keyword is now deprecated, as are a lot of other gl_* variables in GLSL.

# Fragment shader
FS = """
#version 330
// Output variable of the fragment shader, which is a 4D vector containing the
// RGBA components of the pixel color.
out vec4 out_color;
 
// Main fragment shader function.
void main()
{
    // We simply set the pixel color to yellow.
    out_color = vec4(1., 1., 0., 1.);
}
"""

Once again, the code should be clear enough. The out_color variable contains the red, green, blue and alpha components of the pixel final color. The components are between 0 and 1. The alpha component is the transparency: 0 for completely transparent, 1 for completely opaque.

Compiling a shader

Once shader codes have been defined, shaders need to be compiled. Here is a small Python function for compiling a vertex shader.

import OpenGL.GL as gl
def compile_vertex_shader(source):
    """Compile a vertex shader from source."""
    vertex_shader = gl.glCreateShader(gl.GL_VERTEX_SHADER)
    gl.glShaderSource(vertex_shader, source)
    gl.glCompileShader(vertex_shader)
    # check compilation error
    result = gl.glGetShaderiv(vertex_shader, gl.GL_COMPILE_STATUS)
    if not(result):
        raise RuntimeError(gl.glGetShaderInfoLog(vertex_shader))
    return vertex_shader

Compilation involves creating a shader object, loading the source code, and finally compiling it. We then check that no error happened during compilation; otherwise, we raise an exception with the compilation log. This last step is critical when debugging a PyOpenGL program using shaders, because otherwise there is no way to know why the compilation failed.

The function for compiling a fragment shader is pretty much the same (see the full script at the end for the details).

Attaching shaders to a program

Once the vertex and fragment shaders have been compiled, they need to be attached to a program, which is then linked.

def link_shader_program(vertex_shader, fragment_shader):
    """Create a shader program with from compiled shaders."""
    program = gl.glCreateProgram()
    gl.glAttachShader(program, vertex_shader)
    gl.glAttachShader(program, fragment_shader)
    gl.glLinkProgram(program)
    # check linking error
    result = gl.glGetProgramiv(program, gl.GL_LINK_STATUS)
    if not(result):
        raise RuntimeError(gl.glGetProgramInfoLog(program))
    return program

We first create a program, then we attach the compiled shaders, and finally we link the program. We also check that the linking was successful.

Using shaders

Finally, here is how to use shaders during the rendering process.

def paintGL(self):
    # the buffer bound at attribute location 0 contains 2 single-precision
    # floating-point coordinates per vertex
    gl.glEnableVertexAttribArray(0)
    gl.glVertexAttribPointer(0, 2, gl.GL_FLOAT, gl.GL_FALSE, 0, None)
    # we use the compiled program
    gl.glUseProgram(program)
    # draw "count" points from the VBO
    gl.glDrawArrays(gl.GL_LINE_STRIP, 0, len(self.data))

We activate the buffers that need to pass data to the vertex shader, then we use the program before calling the OpenGL rendering commands.

Full script

Finally, here is the full Python script that displays a sinewave function using shaders. PyQt4 or PySide, and PyOpenGL, are necessary. If you use PySide, you should simply replace PyQt4 with PySide in the imports. The create_window function has already been explained in a previous post.

# PyQt4 imports
from PyQt4 import QtGui, QtCore, QtOpenGL
from PyQt4.QtOpenGL import QGLWidget
# PyOpenGL imports
import OpenGL.GL as gl
import OpenGL.arrays.vbo as glvbo
 
# Window creation function.
def create_window(window_class):
    """Create a QT window in Python, or interactively in IPython with QT GUI
    event loop integration:
        # in ~/.ipython/ipython_config.py
        c.TerminalIPythonApp.gui = 'qt'
        c.TerminalIPythonApp.pylab = 'qt'
    See also:
        http://ipython.org/ipython-doc/dev/interactive/qtconsole.html#qt-and-the-qtconsole
    """
    app_created = False
    app = QtCore.QCoreApplication.instance()
    if app is None:
        app = QtGui.QApplication(sys.argv)
        app_created = True
    app.references = set()
    window = window_class()
    app.references.add(window)
    window.show()
    if app_created:
        app.exec_()
    return window
 
def compile_vertex_shader(source):
    """Compile a vertex shader from source."""
    vertex_shader = gl.glCreateShader(gl.GL_VERTEX_SHADER)
    gl.glShaderSource(vertex_shader, source)
    gl.glCompileShader(vertex_shader)
    # check compilation error
    result = gl.glGetShaderiv(vertex_shader, gl.GL_COMPILE_STATUS)
    if not(result):
        raise RuntimeError(gl.glGetShaderInfoLog(vertex_shader))
    return vertex_shader
 
def compile_fragment_shader(source):
    """Compile a fragment shader from source."""
    fragment_shader = gl.glCreateShader(gl.GL_FRAGMENT_SHADER)
    gl.glShaderSource(fragment_shader, source)
    gl.glCompileShader(fragment_shader)
    # check compilation error
    result = gl.glGetShaderiv(fragment_shader, gl.GL_COMPILE_STATUS)
    if not(result):
        raise RuntimeError(gl.glGetShaderInfoLog(fragment_shader))
    return fragment_shader
 
def link_shader_program(vertex_shader, fragment_shader):
    """Create a shader program with from compiled shaders."""
    program = gl.glCreateProgram()
    gl.glAttachShader(program, vertex_shader)
    gl.glAttachShader(program, fragment_shader)
    gl.glLinkProgram(program)
    # check linking error
    result = gl.glGetProgramiv(program, gl.GL_LINK_STATUS)
    if not(result):
        raise RuntimeError(gl.glGetProgramInfoLog(program))
    return program
 
# Vertex shader
VS = """
#version 330
// Attribute variable that contains coordinates of the vertices.
layout(location = 0) in vec2 position;
 
// Main function, which needs to set `gl_Position`.
void main()
{
    // The final position is transformed from a null signal to a sinewave here.
    // We pass the position to gl_Position, converting it into a 4D vector.
    // The third (z) coordinate is 0 for a 2D figure, and the fourth,
    // homogeneous coordinate must be 1.
    gl_Position = vec4(position.x, .2 * sin(20 * position.x), 0., 1.);
}
"""
 
# Fragment shader
FS = """
#version 330
// Output variable of the fragment shader, which is a 4D vector containing the
// RGBA components of the pixel color.
out vec4 out_color;
 
// Main fragment shader function.
void main()
{
    // We simply set the pixel color to yellow.
    out_color = vec4(1., 1., 0., 1.);
}
"""
 
class GLPlotWidget(QGLWidget):
    # default window size
    width, height = 600, 600
 
    def initializeGL(self):
        """Initialize OpenGL, VBOs, upload data on the GPU, etc."""
        # background color
        gl.glClearColor(0, 0, 0, 0)
        # create a Vertex Buffer Object with the specified data
        self.vbo = glvbo.VBO(self.data)
        # compile the vertex shader
        vs = compile_vertex_shader(VS)
        # compile the fragment shader
        fs = compile_fragment_shader(FS)
        # link the shader program
        self.shaders_program = link_shader_program(vs, fs)
 
    def paintGL(self):
        """Paint the scene."""
        # clear the buffer
        gl.glClear(gl.GL_COLOR_BUFFER_BIT)
        # bind the VBO 
        self.vbo.bind()
        # tell OpenGL that the VBO contains an array of vertices
        gl.glEnableVertexAttribArray(0)
        # these vertices contain 2 single precision coordinates per vertex
        gl.glVertexAttribPointer(0, 2, gl.GL_FLOAT, gl.GL_FALSE, 0, None)
        # use the compiled shader program
        gl.glUseProgram(self.shaders_program)
        # draw "count" points from the VBO
        gl.glDrawArrays(gl.GL_LINE_STRIP, 0, len(self.data))
 
    def resizeGL(self, width, height):
        """Called upon window resizing: reinitialize the viewport."""
        # update the window size
        self.width, self.height = width, height
        # paint within the whole window
        gl.glViewport(0, 0, width, height)
 
if __name__ == '__main__':
    # import numpy for generating random data points
    import sys
    import numpy as np
 
    # null signal
    data = np.zeros((10000, 2), dtype=np.float32)
    data[:,0] = np.linspace(-1., 1., len(data))
 
    # define a QT window with an OpenGL widget inside it
    class TestWindow(QtGui.QMainWindow):
        def __init__(self):
            super(TestWindow, self).__init__()
            # initialize the GL widget
            self.widget = GLPlotWidget()
            self.widget.data = data
            # put the window at the screen position (100, 100)
            self.setGeometry(100, 100, self.widget.width, self.widget.height)
            self.setCentralWidget(self.widget)
            self.show()
 
    # show the window
    win = create_window(TestWindow)

There is of course much more to say about shaders than what this deceptively simple example has shown: how to implement linear transformations, lighting, textures, etc. This free e-book is an excellent resource for modern OpenGL programming, since it directly addresses the programmable pipeline instead of the deprecated fixed-function pipeline. The latter remains supported only for the sake of backwards compatibility, and should not be used at all.

Introducing Galry, a high-performance interactive 2D visualization Python package

Today I’m releasing the code of a first experimental version of Galry, a high-performance interactive 2D visualization Python package that I’m creating as part of my current research project.

The rationale of this package is to provide a highly flexible and optimized way of visualizing large 2D datasets in Python, by using the full power of the graphics card. Most visualization packages in Python are either meant to generate high-quality publication-ready figures (like matplotlib), or to offer fast interactive 3D visualization (like Mayavi). Existing 2D plotting packages do not generally offer an efficient way to interactively visualize large datasets (1, 10, even 100 million points). That’s what Galry is aiming for, by using OpenGL directly through a thin Python wrapper called PyOpenGL. Galry should work on most platforms (Windows/MacOS/Linux).

To give an idea of Galry’s performance: on a 2012 desktop computer with a high-end AMD graphics card, I can navigate smoothly in a plot with 50 million points (~35 FPS in the current version), and almost smoothly with 100 million points (~15 FPS).

Galry integrates smoothly with IPython and Qt, through either PyQt or PySide.

Demo

[youtube]http://www.youtube.com/watch?v=jYNJJ4O3pXo[/youtube]

Getting started

The code is available on GitHub.

Important note: Galry is still an experimental project with an unstable programming interface that is likely to change at any time. Do not use it in production yet.

Installation of Galry may not be straightforward depending on your specific configuration. In particular, you need a recent enough version of OpenGL (tested version: 3.3). Please don’t hesitate to contact me if you have any trouble installing or using the library.