Creating interactive visualizations of large datasets using JavaScript

Crossfilter is a JavaScript library initially designed by Square to explore large multivariate datasets in a web browser. It lets you build sorted indexes and feed them to a charting library like D3, so users can filter by clicking and dragging, even when sifting through 200,000 rows in a 5MB file:

[Interactive charts: Time of Day, Arrival Delay (min.), Distance (mi.), Date]

(Go ahead, you can play with it. Click and drag on any of the charts to filter the dataset live.)

The process to create these charts is relatively easy:
1. Load the data.
2. Define the dimensions with crossfilter.
3. Create the charts with D3.
4. Write the event code to respond to clicks and filters.
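
The idea behind steps 2 and 4 can be sketched without any library. The snippet below is plain JavaScript and is not crossfilter's API; the records, the field names (`hour`, `delay`) and the helpers `setFilter` and `countsFor` are made up for illustration. It shows the behavior the charts rely on: each chart counts records using the filters set on every other dimension.

```javascript
// Plain-JavaScript sketch of cross-filtering; this is NOT crossfilter's API.
// The records and field names are invented for illustration.
const flights = [
  { hour: 6, delay: -5 },
  { hour: 6, delay: 12 },
  { hour: 14, delay: 30 },
  { hour: 14, delay: 3 },
  { hour: 21, delay: 45 },
];

// One active range filter per dimension; null means "no filter".
const filters = { hour: null, delay: null };

// A chart's click-and-drag handler would call this with [min, max) or null.
function setFilter(dimension, range) {
  filters[dimension] = range;
}

// Count records per value of `dimension`, applying the filters of every
// OTHER dimension, so a chart is never narrowed by its own selection.
function countsFor(dimension) {
  const counts = new Map();
  for (const record of flights) {
    const passes = Object.keys(filters).every((name) => {
      const range = filters[name];
      if (name === dimension || range === null) return true;
      return record[name] >= range[0] && record[name] < range[1];
    });
    if (passes) {
      const key = record[dimension];
      counts.set(key, (counts.get(key) || 0) + 1);
    }
  }
  return counts;
}

// Simulate dragging over delays from 10 to 60 minutes on the delay chart.
setFilter("delay", [10, 60]);
const hourCounts = countsFor("hour");   // what the "Time of Day" chart would draw
const delayCounts = countsFor("delay"); // the delay chart still sees every record
```

This naive version rescans every record on each call, which is fine for five rows. The sorted indexes mentioned above are what make the same operation practical on hundreds of thousands of rows.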

The main challenge of using crossfilter is that you have to spend quite some time fighting (sorry, engaging) D3 to make the charts respond to filter changes and user clicks. The charts have to be wired together, so that when you click on one, the others respond. The JavaScript code for the charts above is about 300 lines and was written by clever guys who work at Square.

There’s also the question of reusability. How much code would we have to write (and maintain) if we wanted to adapt any of these charts to another visualization?

A lot!

This is the main problem with using D3.js as a charting library and doing everything by hand. The development and testing of a single non-trivial dashboard could take weeks, if not months.

You know what would be ideal? To have a charting library that allowed us to use large datasets with crossfilter and, at the same time, enabled us to create reusable charts that we could easily plug in wherever we want.

That’s exactly what dc.js does. Here I’m loading the same 5MB, 200,000-record file and handling it in only 121 lines of unoptimized JavaScript.

[Interactive charts: Time of Day, Arrival Delay (min.), Distance (mi.), Date]

(click and drag on any of the charts to filter the data)

This is the same set of charts, but done in a much simpler, faster and more expressive way.

Take a look at the samples on their site, like this interactive Twitter dashboard or this stock picker.

dc.js performs the dirty work of linking the charts together, so when we click on a chart, the other charts in the group respond automatically, without writing any additional code. Magic!
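
To make the "linking" part concrete, here is a minimal sketch of the pattern in plain JavaScript. `ChartGroup` and `Chart` are hypothetical names invented for this example, not dc.js classes, and the real library does far more. The point is only that charts register with a shared group, and a filter on any one of them triggers a redraw of all.

```javascript
// Sketch of the "chart group" idea in plain JavaScript.
// ChartGroup and Chart are hypothetical names, not dc.js classes.
class ChartGroup {
  constructor() {
    this.charts = [];
  }
  register(chart) {
    chart.group = this;
    this.charts.push(chart);
    return chart;
  }
  // Called whenever any member changes its filter.
  redrawAll() {
    for (const chart of this.charts) chart.redraw();
  }
}

class Chart {
  constructor(name) {
    this.name = name;
    this.group = null;
    this.range = null;
    this.redrawCount = 0;
  }
  // A click-and-drag selection would call this with the selected range.
  filter(range) {
    this.range = range;
    if (this.group) this.group.redrawAll();
  }
  // A real chart would re-read the filtered data and update its bars here.
  redraw() {
    this.redrawCount += 1;
  }
}

const group = new ChartGroup();
const delayChart = group.register(new Chart("delay"));
const distanceChart = group.register(new Chart("distance"));

// Filtering one chart redraws every chart in the group.
delayChart.filter([10, 60]);
```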






A reliable list of country codes

Who knows how much time I’ve wasted over the years trying to find a reliable data source for country names, ISO codes, phone prefixes and currencies.

I have my own dysfunctional set of tables, built manually from bits and pieces found online. But the other day I ran into this beauty: an updated list of ISO 3166 and ISO 4217 country and currency codes, with some FIFA and ITU codes thrown into the mix. The list is provided in CSV and JSON formats and is now my one-stop solution for country codes.
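
Once the JSON version is loaded, a lookup table takes a few lines. This is a rough sketch: the field names (`alpha2`, `name`, `currency`, `dial`) and the `lookupCountry` helper are hypothetical, and the real list uses its own column names, so adjust accordingly. The three sample records are typed in by hand rather than taken from the file.

```javascript
// Hypothetical record shape; the real list's column names will differ.
const countries = [
  { alpha2: "ES", name: "Spain", currency: "EUR", dial: "34" },
  { alpha2: "JP", name: "Japan", currency: "JPY", dial: "81" },
  { alpha2: "BR", name: "Brazil", currency: "BRL", dial: "55" },
];

// Index once by ISO 3166-1 alpha-2 code for constant-time lookups.
const byAlpha2 = new Map(countries.map((country) => [country.alpha2, country]));

// Case-insensitive lookup; returns null for unknown codes.
function lookupCountry(code) {
  return byAlpha2.get(String(code).toUpperCase()) || null;
}

const japan = lookupCountry("jp");
```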

 

Data + Design

A team of over 50 people has collaborated on Data+Design, a book about preparing and visualizing information.

It’s a thorough but simple introduction to data collection and visualization, carefully written and chock-full of great advice.

[Images: bars; axis labels; labels in mobile charts]

It’s open-source too, published in its present form using O’Reilly’s Atlas e-publishing platform, which produces very clean, readable books.

Go and read it here.

 

Two Dropbox alternatives

Like everyone else, I’ve run out of space in my Dropbox and my Google Drive.

Besides, I’m low on space on my hard drive, so I’ve been looking for a solution that lets me do manual backups and keep some MP3 and video files in an off-site, relatively safe place.

So, I’ve been browsing and testing different alternatives, with four specific conditions:

  1. Has to be free (initially).
  2. Has to offer 5GB or more.
  3. Has to allow automatic sync of a folder and at the same time it has to let you manually copy files.
  4. It mustn’t have file-size limits. Most of the popular cloud storage services, like Box or MEGA, impose a 200MB (or so) limit per file.

Turns out it’s not so easy to find a service with these characteristics. But I managed to find two:

 

Bitcasa offers 5GB to start with and 1GB for each friend that you invite, up to 20GB.

 

But I think Copy is the best one. They offer 20GB to start and 5GB per invite. Yes, 5GB. Besides, the friend you invite also receives an extra 5GB for the invitation. So everybody wins. Another cool thing about Copy is that if you share a 3GB folder with 3 friends, each one consumes only 1GB of their quota. Which is genius and completely logical.

Both services work the same way as Dropbox: you install an app that creates a special folder on your hard drive, which syncs automatically with the cloud. They also let you manually upload files to free up some space on your hard drive.

Bitcasa

Copy

 

How to create an embeddable timeline chart


The other day, a web developer friend asked me how to create and insert a timeline into WordPress. He actually wanted to put an interactive timeline with links, images and video on the homepage of a news site.

I checked out several options and settled on TimelineJS. You can create timelines directly on their website. Each story in the timeline is loaded as a row in a Google spreadsheet. You can enrich each of the “steps” or events with links, videos and pictures, and everything’s stupid-easy to use.

At the end you get an iframe embed code that you can copy and paste into your site. They even have a WordPress plugin for that, and other developers have created alternative tools based on TimelineJS that let you play with more settings.

So if you ever need a quick and dirty solution to build a timeline and tell a story, look no further.

 

Now, if you’re a developer and you want a proper, scalable and maintainable solution, that’s another story.

In my friend’s case, depending on an external site for such a crucial part of a website is not a good idea. What if these guys decide to abandon the product (hope not!)? What if their site goes offline? My friend may lose his content temporarily or permanently. It’s not maintained by him, it’s not hosted by him, it’s not backed up by him. His content is not his.

So, the correct solution is to find a way to self-host a similar tool.

Fortunately and amazingly, TimelineJS is an open source tool and it’s on GitHub. If you’re a developer like my friend, you can download the source code, copy it to your server and that’s it. You’ll have a fully fledged timeline builder in minutes.
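
If you self-host, you can also keep the event data yourself instead of in someone else's spreadsheet. Here is a minimal sketch, assuming the JSON layout used by the newer TimelineJS3 releases (an `events` array whose entries carry `start_date`, `text` and an optional `media` object). The version you download may expect a different shape, so check its documentation. The row fields (`year`, `headline`, `mediaUrl` and so on), the sample rows and the `rowToEvent` helper are hypothetical.

```javascript
// Hypothetical spreadsheet-style rows; column names and contents are made up.
const rows = [
  { year: 2013, month: 5, day: 16, headline: "First event", text: "What happened.", mediaUrl: "" },
  { year: 2013, month: 6, day: 1, headline: "Second event", text: "What happened next.", mediaUrl: "https://example.com/photo.jpg" },
];

// Convert one row to an event object.
// Assumed shape (TimelineJS3-style JSON); verify against the version you host.
function rowToEvent(row) {
  const event = {
    start_date: { year: row.year, month: row.month, day: row.day },
    text: { headline: row.headline, text: row.text },
  };
  // Only attach media when the row actually has a URL.
  if (row.mediaUrl) event.media = { url: row.mediaUrl };
  return event;
}

const timelineData = { events: rows.map(rowToEvent) };

// Serialize so the file can live on your own server, under your own backups.
const timelineJson = JSON.stringify(timelineData, null, 2);
```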