Moneyballing criminal justice

One of the problems of justice systems everywhere is that they depend on subjective judgment and have near-zero data-mining expertise. Because of that, tons of money is wasted keeping low-risk offenders in jail.

As the attorney general of New Jersey, Anne Milgram changed the landscape of her state’s criminal justice system. By applying statistics to create projections, she devised a dashboard to single out the worst offenders and make sure that they were prosecuted. By applying Moneyball concepts, her methods minimized subjective decisions, lowered costs and optimized the justice system.

A primer on Visual Encoding


Michael Dubakov has this great introduction to visual encoding, along with samples and easy-to-remember rules, so you don’t mess up the next time you’re trying to visualize data.

Zoomdata: BI without ETL


This weekend I took a few minutes to test Zoomdata.

And when I say minutes, I literally mean minutes. These guys have done an amazing job of providing a quick mobile-ready demo that you can download as a VM or, as I did, install from an RPM package on a fresh Linux system.

After the install, I added a Twitter data source and created the dashboard that you see above, linking it with the built-in demo sales data. With a little more work, you could build a sentiment analysis vs. sales dashboard in under an hour.

I know, I know. I used a pie chart. Don’t use pie charts!

Besides the built-in chart types, Zoomdata includes a “visualization studio”, in which you can import and use another JavaScript library (say, D3.js) to create sexier charts. After a few tweaks, I managed to feed my Twitter data source into this bubble chart:


The product uses MongoDB in the back-end and, if I understood correctly, the Zoomdata server periodically stores snapshots of your data sources, painlessly building a sort-of-data-warehouse on the fly. This makes it ideal for building dashboards for trends and historical analyses. All the charts have sliders at the bottom which allow you to go back in time and, in some cases, aggregate monthly or yearly data. You can see a sample of this behavior at the demo site.

Zoomdata comes with connectors to Cloudera Impala, JSON, CSV, Google Docs, Twitter and good-ol’ SQL databases. All with the promise of “no ETL”, which I don’t believe is completely true: I can think of one or two cases at work in which I would really, really need to merge two data sources with ETL. But given that I was able to create a dashboard from different data sources without any kind of pain, I bet that in many cases you could do the same without a single transformation.

If you work in BI, I encourage you to download one of their pre-packaged VMs and give it a whirl.



How I fell in love with MongoDB


I just finished both of MongoDB’s online training courses: the developer track and the DBA track.

The courses are excellent, providing all you need to know to start working as a developer or DBA with MongoDB.

And, uh… they’re free!

I have taken several online courses and seen a lot of tutorials. Something always bugs me: the sound quality, the lack of proper guided explanations, or the luminaries who expect you to know it all (but lack basic communication skills). But these courses are really well done. Also, I’ve finally realized that taking a course with actual homework and grading is way better than watching tutorials. As the courses progressed, I became super-motivated to finish them with a high score.

Of course, it helps that MongoDB is such a sexy product. Having schemaless design in your toolkit is reason enough to learn it. Before these courses, I disregarded NoSQL as a bunch of key-store nonsense with one cool application (Hadoop). A fad, probably. Then I started the developer’s course, created my first MongoDB-backed REST service, and then my first replicated shard.

I was so wrong.

MongoDB is perfect for data warehousing but also for the early stage of any software development project. It’s the perfect tool for creating a quick prototype backed by a very powerful database engine that can grow with the final product. If you’re a developer, you will save hours by adopting MongoDB now.
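To make the “quick prototype” point a bit more concrete, here is a minimal sketch of what schemaless means in practice. It is plain Python, not MongoDB: the in-memory `collection` list and the `insert` and `find` helpers are hypothetical stand-ins I made up for illustration, so it runs without a server. With a real database you would reach for a driver such as pymongo and its `insert_one()` and `find()` methods instead.

```python
# Plain-Python stand-in for a schemaless collection (no MongoDB server needed).
# The names below are illustrative assumptions, not MongoDB's API.
import json

collection = []  # hypothetical in-memory "collection"


def insert(doc):
    """Store a document as-is; no table definition or migration required."""
    collection.append(dict(doc))


def find(query):
    """Return documents whose fields equal every key/value in the query."""
    return [d for d in collection
            if all(d.get(k) == v for k, v in query.items())]


# Version 1 of the prototype: a user is just a name.
insert({"name": "ana"})
# Version 2: new fields appear, and older documents stay untouched.
insert({"name": "bob", "email": "bob@example.com", "tags": ["beta"]})

print(json.dumps(find({"name": "bob"}), indent=2))
```

The point of the sketch is that the second document grew two fields without anyone altering a schema, which is roughly why the early stage of a project moves so fast.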

Migrating from SQL to MapReduce with MongoDB –Rick Osborne


I’m sure that if I sat down with an Oracle DBA, the guy would probably destroy my assessments with things like “Oracle DB has had that since the 80s” or “yeah, but the MongoDB way of doing it is insecure”. But there are workarounds for improving the security and, in practical terms, there are few databases that can offer zero-to-replicated-sharding in under two minutes, as MongoDB does.

I encourage you to take one of these courses and try MongoDB in your next software development or business intelligence project. It’s an opportunity to be up to date with the latest (most hyped and coolest) database technology.

You can find the courses here:

A great book on MongoDB: MongoDB: The Definitive Guide


RDBMS vs. NoSQL: How do you pick?

Infographic: Migrating from SQL to MapReduce with MongoDB
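
If the SQL-to-MapReduce idea is new to you, here is a rough sketch of the mental shift, assuming a made-up `orders` list and the SQL query `SELECT customer, SUM(amount) FROM orders GROUP BY customer`. It is plain Python rather than MongoDB (where the map and reduce functions are written in JavaScript), and the helper names are my own, so take it as an illustration of the pattern and not as the infographic’s exact recipe.

```python
# A GROUP BY / SUM expressed as map and reduce steps, in plain Python.
# The data and function names are hypothetical, for illustration only.
from collections import defaultdict

orders = [
    {"customer": "ana", "amount": 30},
    {"customer": "bob", "amount": 15},
    {"customer": "ana", "amount": 12},
]


def map_order(doc):
    """Map step: emit one (key, value) pair per document (the GROUP BY key)."""
    yield doc["customer"], doc["amount"]


def reduce_amounts(key, values):
    """Reduce step: collapse all values for one key (the SUM)."""
    return sum(values)


def map_reduce(docs, map_fn, reduce_fn):
    """Group the emitted values by key, then reduce each group."""
    groups = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            groups[key].append(value)
    return {key: reduce_fn(key, values) for key, values in groups.items()}


totals = map_reduce(orders, map_order, reduce_amounts)
print(totals)  # {'ana': 42, 'bob': 15}
```

The map function plays the role of the `GROUP BY` column plus the selected value, and the reduce function plays the role of the aggregate.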



175 years of maps

You’ve gotta hand it to the Swiss. Follow this link and you will land on an interactive high-resolution historical map of Switzerland, with an impressive level of detail that allows you to time-travel through 175 years of map-making, witnessing city walls being replaced by roads and towns engulfed by cities. Mind you: this is for all of Switzerland, not just a city.

Also: beautiful-yet-functional typography. Look at this g:


Also: 2013 marks the 150th anniversary of Thomas Cook’s first tour of Switzerland. A fascinating story that the author Diccon Bewes retells in his book, Slow Train to Switzerland.

(oh, yeah, and it’s also the 150th anniversary of the Red Cross)

Also: When we were living in Barcelona, I found a similar resource made available by the CCCB: historical maps of the city, overlaid.