Cyrille Rossant

Moving away from HDF5

2016-01-06

Update [2016-01-30]: I wrote a follow-up here

In the research lab where I work, we've been developing a data processing pipeline for several years. This includes not only a program but also a new file format based on HDF5 for a specific type of data. While HDF5 looked compelling on paper, we ran into many issues with it. Recently, despite the high costs, we decided to abandon this format in our software.

In this post, I'll describe what HDF5 is and which issues made us move away from it.


New year

2016-01-01

I haven't written much lately: only 2 posts in 2015! I'll try to do better this year: more and shorter posts about programming, technology, and science. There's no shortage of topics to discuss.

Happy New Year.


A compiler infrastructure for data visualization

2015-07-24

There are many data visualization tools out there. Yet, I believe we're still lacking a robust, scalable, and cross-platform visualization toolkit that can handle today's massive datasets.

Most existing tools target simple plots with a few hundred or a few thousand points: bar plots, scatter plots, histograms and the like. Typically, these figures represent aggregated statistical quantities. Maps are also particularly popular, and there are now really great open source tools for them.

Perhaps contrary to common belief, this is not the end of the story. There are far more complex visualization needs in academia and industry, and I've always been dissatisfied with the tools at our disposal.


NumPy in the browser: proof of concept with Numba, LLVM, and emscripten

2015-02-18

I've wanted to bring some of NumPy to the browser for a while. I've already discussed the motivations for this in a previous post last year. As far as I'm concerned, the main use case would be to enable interactive data visualization in offline notebooks (including nbviewer), which often requires client-based array operations for interactivity. In this post, I'll describe a proof of concept for compiling NumPy-aware Python functions to JavaScript using Numba, LLVM, and emscripten.


Big Data visualization with WebGL, part 2: VisPy

2014-12-11

In this post series, I'm describing the big data visualization platform I'm currently developing with WebGL. In this second post, I'll detail the VisPy library, which is the basis of the project.