Evan Savage

Financial Tracking: Mint Bubbles

In this post I present Mint Bubbles, a force-directed bubble chart
visualization of exported Mint data. I explain how to use
force-directed layouts to produce awesome interactive visualizations
with d3, and also provide details on some of the specific tricks used
to create Mint Bubbles.

Getting Your Data #

Exporting your data from Mint is easy. Log into Mint and go to the
Transactions tab.

Scroll to the bottom pagination section. In barely-legible super-tiny
type at bottom right, there's a link to export all your transactions.

Clicking that link will download a file called transactions.csv.

Mint Bubbles #

If you're viewing this on an RSS reader, check out the example
on my blog. You will need a browser that supports the
HTML5 File API.

You can see the code for this demo here.

To see a visualization of your data, drag the transactions.csv file from
Mint onto the "drop your data here" area below. You can also use
my data from the last three months or so.

drop your data here

Behind The Bubbles #

Inspiration #

This visualization was inspired by the
NYT 2013 Budget Proposal Graphic,
which uses d3.js to bring
Obama's 2013 budget proposal
to life as an interactive bubble chart.

I'd just started using Mint for financial tracking, and this
seemed like an awesome way to visualize my personal spending patterns.
To help figure out the mechanics of the NYT visualization, I consulted
this article
by Jim Vallandingham. He explains in detail how to create similar
visualizations using d3's force-directed layouts, which model your
data as a set of particles moving about in space.

Importing Data #

Unlike my previous visualizations, I wanted this one to let you play
with your own data. Enter the HTML5 File API, which lets JavaScript read
files that you select or drop onto the page. First, I set up the
drag-and-drop listeners on div#drop_zone:

/*
* Octopress bundles ender.js, which provides $() for DOM access; mootools
* tries to play nice, so it won't install its $() over that. I'm using
* document.id() instead.
*/

var dropZone = document.id('drop_zone');
function trapEvent(evt) {
  evt.stopPropagation();
  evt.preventDefault();
}
dropZone.addEventListener('dragenter', trapEvent, false);
dropZone.addEventListener('dragexit', trapEvent, false);
dropZone.addEventListener('dragover', function(evt) {
  trapEvent(evt);
  // This makes a copy icon appear during the drag operation.
  evt.dataTransfer.dropEffect = 'copy';
}, false);
dropZone.addEventListener('drop', handleFileSelect, false);

dragenter, dragexit, and dragover are analogous to mouseenter,
mouseleave, and mouseover. (dragexit has since been dropped from the HTML
standard; current browsers fire dragleave instead.) For those events, it
suffices to call trapEvent(), which prevents the browser's default action.
For instance, Chrome on Mac OS will just download the transactions.csv file
if you drag it into a browser tab, which is not what I want here.

drop is the interesting event:

function handleFileSelect(evt) {
  trapEvent(evt);
  var f = evt.dataTransfer.files[0];
  // NOTE: you might want to filter out large or invalid files here.
  var reader = new FileReader();
  reader.onloadstart = function(e) {
    if (e.lengthComputable) {
      document.id('progress').removeClass('hidden');
      document.id('progress_bar').set('value', 0);
      document.id('progress_bar').set('max', e.total);
    }
  };
  reader.onprogress = function(e) {
    if (e.lengthComputable) {
      document.id('progress_bar').value = e.loaded;
    }
  };
  reader.onload = function(e) {
    document.id('caption').removeClass('hidden').addClass('chart-active');
    document.id('progress').addClass('hidden');
    document.id('drop_zone').addClass('hidden');
    document.id('chart').addClass('chart-active');
    buildChart(d3.csv.parse(e.target.result));
  };
  reader.readAsText(f);
}

This uses FileReader.readAsText() to read in the transactions.csv file, and
d3.csv.parse() to turn the CSV text into an array of JavaScript objects, one
per transaction, keyed by the column names in the header row. The parsing
happens in onload, which fires once the file has been read successfully.
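
To make the "CSV text in, array of objects out" step concrete, here is a minimal sketch of that kind of parsing in plain JavaScript. parseCsvSketch() is a hypothetical stand-in written for this example, not d3's implementation, and the sample rows are invented; a real transactions.csv has more columns.

```javascript
// Hypothetical stand-in for d3.csv.parse(): the first row supplies the keys,
// and every later row becomes one object. Handles quoted fields and "" escapes.
function parseCsvSketch(text) {
  var rows = [], row = [], field = '', inQuotes = false;
  for (var i = 0; i < text.length; i++) {
    var ch = text[i];
    if (inQuotes) {
      if (ch === '"' && text[i + 1] === '"') { field += '"'; i++; }
      else if (ch === '"') { inQuotes = false; }
      else { field += ch; }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === ',') {
      row.push(field); field = '';
    } else if (ch === '\n' || ch === '\r') {
      if (ch === '\r' && text[i + 1] === '\n') { i++; }
      row.push(field); field = '';
      rows.push(row); row = [];
    } else {
      field += ch;
    }
  }
  // Flush a final row that has no trailing newline.
  if (field !== '' || row.length > 0) { row.push(field); rows.push(row); }

  var header = rows.shift() || [];
  return rows.map(function(cells) {
    var obj = {};
    header.forEach(function(name, j) { obj[name] = cells[j]; });
    return obj;
  });
}

// Invented sample data, for illustration only.
var sampleCsv =
  '"Description","Amount","Category"\n' +
  '"Coffee, large","3.50","Coffee Shops"\n' +
  '"Groceries","42.10","Groceries"\n';
var sampleRows = parseCsvSketch(sampleCsv);
console.log(sampleRows[0].Category, sampleRows[0].Amount);
```

Every value comes back as a string, which is why the grouping code has to convert amounts to numbers itself.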

onloadstart and onprogress are used to monitor file I/O progress via the
HTML5 progress element document.id('progress_bar'). Since
transactions.csv files are typically small, and since the "uploading" is
actually a client-local copy into browser memory, you'll probably never see
that progress bar.

Grouping Transactions #

I group the transactions by category:

var cs = {};
data.each(function(tx) {
  var c = tx['Category'];
  if (!(c in cs)) {
    cs[c] = {
      amount: 0,
      txs: []
    };
  }
  cs[c].amount += +(tx['Amount']);
  cs[c].txs.push(tx);
});

amount stores the total amount; note the use of +(tx['Amount']) to convert
CSV string values into numbers. txs is used for the transaction list.
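
Here is the same grouping step as a standalone sketch you can run outside the page. It assumes plain Array.prototype.forEach in place of the MooTools each() used above; groupByCategory() is a name made up for this example, and the sample transactions are invented.

```javascript
// Standalone version of the grouping step, using Array.prototype.forEach
// instead of the MooTools each() used in the post.
function groupByCategory(rows) {
  var groups = {};
  rows.forEach(function(tx) {
    var c = tx['Category'];
    if (!(c in groups)) {
      groups[c] = { amount: 0, txs: [] };
    }
    groups[c].amount += +(tx['Amount']);  // unary + parses the CSV string
    groups[c].txs.push(tx);
  });
  return groups;
}

// Invented sample transactions.
var grouped = groupByCategory([
  { Category: 'Coffee Shops', Amount: '3.50' },
  { Category: 'Groceries',    Amount: '42.10' },
  { Category: 'Coffee Shops', Amount: '4.25' }
]);
console.log(grouped['Coffee Shops'].amount, grouped['Coffee Shops'].txs.length);
```

Without the unary +, the first += would concatenate strings (0 + '3.50' is '03.50') instead of adding numbers.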

I then convert these into nodes to be used by
d3.layout.force():

var nodes = [];
for (var c in cs) {
  nodes.push({
    R: Math.max(2, Math.sqrt(cs[c].amount)),
    category: c,
    amount: cs[c].amount,
    txs: cs[c].txs
  });
}
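
One consequence of the square root in R, sketched below: a bubble's area, rather than its radius, ends up proportional to the category total. radiusFor() and areaOf() are helper names made up for this example.

```javascript
// Same radius rule as above: square root of the total, with a floor of 2.
function radiusFor(amount) {
  return Math.max(2, Math.sqrt(amount));
}
function areaOf(r) {
  return Math.PI * r * r;
}

var smallR = radiusFor(100);
var largeR = radiusFor(400);
// Four times the spending gives twice the radius and four times the area.
console.log(largeR / smallR, areaOf(largeR) / areaOf(smallR));
// Tiny categories are clamped so they stay visible.
console.log(radiusFor(1));
```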

Defining The Layout #

Before building the visualization itself, I define a color gradient based on
bubble radius, picking the colors using the excellent
Color Scheme Designer:

var Rs = nodes.map(function(d) { return d.R; });
var minR = d3.min(Rs),
    maxR = d3.max(Rs);
var fill = d3.scale.linear()
  .domain([minR, maxR])
  .range(['#7EFF77', '#067500']);

Now on to the visualization. First, I need to create the SVG element:

var w = 960, h = 480;
var vis = d3.select('#chart').append('svg:svg')
  .attr('width', w)
  .attr('height', h);

Next, I define the behavior and styling of the bubbles:

var node = vis.selectAll('circle.node')
  .data(nodes)
  .enter().append('svg:circle')
    .attr('class', 'node')
    .attr('cx', function(d) { return d.x; })
    .attr('cy', function(d) { return d.y; })
    .attr('r', function(d) { return d.R; })
    .style('fill', function(d) { return fill(d.R); })
    .style('stroke', function(d) { return d3.rgb(fill(d.R)).darker(1); })
    .style('stroke-width', 1.5);

fill(d.R) uses the color gradient fill to make smaller bubbles lighter and
larger bubbles darker.
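
For a feel of what a linear scale with a color range computes, here is a rough sketch that blends each RGB channel by hand. makeFillSketch() and its helpers are hypothetical names for this example, the radii 2 and 50 are made up, and channel-wise RGB blending is my assumption about the interpolation, not a copy of d3's code.

```javascript
// Hypothetical stand-in for a d3.scale.linear() with a color range:
// map r from [minR, maxR] to t in [0, 1], then blend each RGB channel.
function hexToRgb(hex) {
  var n = parseInt(hex.slice(1), 16);
  return [(n >> 16) & 255, (n >> 8) & 255, n & 255];
}
function rgbToHex(rgb) {
  return '#' + rgb.map(function(v) {
    var s = Math.round(v).toString(16);
    return s.length < 2 ? '0' + s : s;
  }).join('');
}
function makeFillSketch(minR, maxR, lightHex, darkHex) {
  var a = hexToRgb(lightHex), b = hexToRgb(darkHex);
  return function(r) {
    // Guard against minR === maxR (a single category) to avoid dividing by 0.
    var t = maxR > minR ? (r - minR) / (maxR - minR) : 0;
    return rgbToHex(a.map(function(v, i) { return v + (b[i] - v) * t; }));
  };
}

// Made-up radius range; the colors are the ones from the post.
var fillSketch = makeFillSketch(2, 50, '#7EFF77', '#067500');
console.log(fillSketch(2), fillSketch(26), fillSketch(50));
```

The smallest radius maps to the light green, the largest to the dark green, and everything in between gets a proportional blend.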

As for the force-directed layout, I start with some basic properties:

var force = d3.layout.force()
  .nodes(nodes)
  .links([])        // no edges between bubbles!
  .size([w, h])
  .gravity(0.05)    // controls speed at which bubbles seek the center
  .friction(0.95);  // slows down motion

Tick Handler #

`force.tick()`: Runs the force layout simulation one step.

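Force-directed layouts model your data as a set of particles in space. Those
particles are subject to various forces. In d3.layout.force(), these include
charge (repulsion or attraction between nodes), gravity (a pull toward the
center of the layout), friction (which damps velocity), and the spring-like
pull of links.

A layout can describe some or all of these forces. Resolving the forces is a
simple iterative process: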

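// Pseudocode: forcesActingOn() and applyForceTo() are placeholders.
while (true) {
  particles.forEach(function(P) {
    var F = [0, 0];  // net force on P
    forcesActingOn(P).forEach(function(f) {
      F[0] += f[0]; F[1] += f[1];
    });
    applyForceTo(P, F);
  });
}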
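
The loop above runs forever and leaves the forces abstract. Below is a bounded toy version you can actually run. It is a sketch under my own assumptions, not how d3.layout.force() integrates motion: gravityToward() and simulate() are invented for illustration, each particle carries an explicit velocity, and the only force is a pull toward the center. The 0.05 and 0.95 values reuse the gravity and friction settings from the layout definition.

```javascript
// Toy force, invented for illustration; d3.layout.force() computes its own.
function gravityToward(center, strength) {
  return function(p) {
    return [(center[0] - p.x) * strength, (center[1] - p.y) * strength];
  };
}

function simulate(particles, forces, friction, steps) {
  for (var step = 0; step < steps; step++) {
    particles.forEach(function(p) {
      // Sum every force acting on this particle.
      var F = [0, 0];
      forces.forEach(function(force) {
        var f = force(p);
        F[0] += f[0]; F[1] += f[1];
      });
      // Apply the net force to the velocity, damp it, then move.
      p.vx = (p.vx + F[0]) * friction;
      p.vy = (p.vy + F[1]) * friction;
      p.x += p.vx;
      p.y += p.vy;
    });
  }
  return particles;
}

var toyParticles = simulate(
  [{ x: 0, y: 0, vx: 0, vy: 0 }, { x: 960, y: 480, vx: 0, vy: 0 }],
  [gravityToward([480, 240], 0.05)],
  0.95,
  1000
);
console.log(toyParticles[0].x, toyParticles[0].y);  // close to the center
```

With only gravity and friction, both particles spiral in and settle at the center; it takes a repulsive or collision force to keep them apart.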

In addition to the above forces, visualizations using d3.layout.force() can
define their own forces in a tick handler, registered with
force.on('tick', ...). I use this to apply two effects, vertical size sorting
and collision detection:

var floatPoint = d3.scale.linear()
  .domain([minR, maxR])
  .range([h * 0.65, h * 0.35]);

force.on('tick', function(e) {
  // vertical size sorting
  nodes.each(function(d) {
    var dy = floatPoint(d.R) - d.y;
    d.y += 0.25 * dy * e.alpha;
  });

  // collision detection
  var q = d3.geom.quadtree(nodes);
  nodes.each(function(d1) {
    q.visit(function(quad, x1, y1, x2, y2) {
      var d2 = quad.point;
      if (d2 && (d2 !== d1)) {
        var x = d1.x - d2.x,
            y = d1.y - d2.y,
            L = Math.sqrt(x * x + y * y),
            R = d1.R + d2.R;
        if (L < R) {
          // overlapping: push each bubble half the overlap away from the other
          L = (L - R) / L * 0.5;
          var Lx = L * x,
              Ly = L * y;
          d1.x -= Lx; d1.y -= Ly;
          d2.x += Lx; d2.y += Ly;
        }
      }
      // This short-circuits visit() for quadtree nodes that can't collide with
      // d1, resulting in O(n log n) collision detection. The expression must
      // start on the same line as return: a bare return followed by a newline
      // returns undefined.
      return x1 > (d1.x + d1.R) ||
             x2 < (d1.x - d1.R) ||
             y1 > (d1.y + d1.R) ||
             y2 < (d1.y - d1.R);
    });
  });

  // move the circles to their new positions
  node
    .attr('cx', function(d) { return d.x; })
    .attr('cy', function(d) { return d.y; });
});

Alpha and Size Sorting #

What's e.alpha? This is described cryptically in the
d3.js documentation:

Internally, the layout uses a cooling parameter alpha which controls the layout temperature: as the physical simulation converges on a stable layout, the temperature drops, causing nodes to move more slowly.

A look at the code for d3.layout.force()
provides some insight into what's happening here:

force.tick = function() {
  // simulated annealing, basically
  if ((alpha *= .99) < .005) {
    event.end({type: "end", alpha: alpha = 0});
    return true;
  }
  // ...
}

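In other words, alpha shrinks by 1% on every tick, and the simulation ends once
it falls below 0.005. Let's look at the size sorting code again: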

nodes.each(function(d) {
  var dy = floatPoint(d.R) - d.y;
  d.y += 0.25 * dy * e.alpha;
});

floatPoint(d.R) computes a "desired height" for the node d. The d.y
adjustment moves d towards that height, using e.alpha to slow down the
sorting adjustment as the layout "cools" into its final state.
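
To see the cooling in isolation, the sketch below replays just the alpha decay and the size-sorting nudge for a single node, with no gravity or collisions. linearScale() and coolAndSort() are hypothetical helpers written for this example. The 0.99 and 0.005 constants come from the snippet quoted above; the starting alpha of 0.1, the radius range of 2 to 50, and the starting height of 400 are my assumptions.

```javascript
// Linear map from [d0, d1] to [r0, r1]; a hypothetical stand-in for
// d3.scale.linear().
function linearScale(d0, d1, r0, r1) {
  return function(v) { return r0 + (v - d0) / (d1 - d0) * (r1 - r0); };
}

// Replays only the cooling loop and the size-sorting nudge; no other forces.
// alpha0 is an assumption about the layout's starting temperature.
function coolAndSort(y, targetY, alpha0) {
  var alpha = alpha0, ticks = 0;
  while ((alpha *= 0.99) >= 0.005) {
    y += 0.25 * (targetY - y) * alpha;
    ticks++;
  }
  return { y: y, ticks: ticks };
}

var hSketch = 480;
var floatPointSketch = linearScale(2, 50, hSketch * 0.65, hSketch * 0.35);
// The largest bubble (R = 50) should float toward 35% of the height.
var targetSketch = floatPointSketch(50);
var sorted = coolAndSort(400, targetSketch, 0.1);
console.log(targetSketch, sorted.ticks, sorted.y);
```

In this isolated replay the node closes most of the gap to its target, but not all of it, before the loop stops. Each nudge is scaled by alpha, so the later ticks barely move it. In the real layout, gravity and collisions also move the node on every tick.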

Collision Detection #

The collision detection code is cribbed from
this page,
which is part of a talk
given by Mike Bostock on d3.
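
The arithmetic at the heart of the collision step is easier to follow without the quadtree around it. The sketch below pulls it into a standalone function and applies it to every pair by brute force. separate() and separateAll() are names invented for this example, and the guard for coincident centers is my addition.

```javascript
// Push two overlapping circles apart along the line between their centers.
// Mirrors the arithmetic in the tick handler, minus the quadtree.
function separate(d1, d2) {
  var x = d1.x - d2.x,
      y = d1.y - d2.y,
      L = Math.sqrt(x * x + y * y),
      R = d1.R + d2.R;
  if (L === 0 || L >= R) { return false; }  // coincident centers or no overlap
  var k = (L - R) / L * 0.5;  // negative: each circle takes half the overlap
  d1.x -= k * x; d1.y -= k * y;
  d2.x += k * x; d2.y += k * y;
  return true;
}

// Brute-force O(n^2) pass over every pair; the quadtree exists to avoid this.
function separateAll(nodes) {
  for (var i = 0; i < nodes.length; i++) {
    for (var j = i + 1; j < nodes.length; j++) {
      separate(nodes[i], nodes[j]);
    }
  }
}

var bubbleA = { x: 0, y: 0, R: 10 }, bubbleB = { x: 12, y: 0, R: 10 };
separate(bubbleA, bubbleB);
console.log(bubbleA.x, bubbleB.x);  // centers end up R = 20 apart
```

After one call the two circles just touch: the distance between centers is scaled from L up to exactly R. With many bubbles, fixing one pair can create a new overlap elsewhere, which is why the layout repeats this on every tick.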

Up Next #

I'm currently working on a post for the main Quantified Self blog,
in which I'm planning to feature another cool visualization for personal data.
Aside from that, I'm hoping to use an upcoming post to dissect my Mint data in
more detail. Stay tuned!