From the course: DC.js for Data Science Essential Training

Making a bar chart

- [Narrator] We are going to create this bar chart from our dummy data. There are three steps to doing this. First, we create a div placeholder within the HTML. Second, we organize the data for the chart, using Crossfilter. And third, we declare the chart in DC, linking together the HTML div and the data. So, let's go back to our template, and save it as bar chart. And we can update the title, as well, to say bar chart, because that's going to change the tab name in our browser. Now within the body tags, where we have our div, above the div, add a heading, which can be H1 or H2, depending on what size and formatting of heading you like. And I'm going to put bar charts, payments by type, and that's because we're going to show the number of payments split by the payment type, which is tab, Visa or cash. Then we need our div element, and we need to give it a unique ID, and I'm going to call it chart. It's not hard to give it a unique ID, because we haven't got any other elements yet. We leave the div empty, because DC is going to draw the chart into it for us. So that's step one done.
Now for step two, our data. Here's our data. We've already run the crossfilter command. The purpose of the crossfilter command is to index the data, and that's always the first command that you run when you're using Crossfilter, or DC, which is built on Crossfilter. The next thing we have to do is declare a dimension on our crossfiltered data. A dimension is an aspect of our data. So we could look at a quantity dimension, or a total dimension, or a type dimension. But you shouldn't think of a dimension as having a one-to-one relationship with these fields, or properties, here, because you can actually create a dimension from several fields by combining them together, or you could multiply them, create a percentage, or a fraction, or whatever. So a dimension is just one way of looking at our data.
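As a rough sketch of the point that a dimension does not have to map one to one onto a field, each function below computes a different dimension value from the same row. The sample row and the names byType, byTipFraction, and byTypeAndQuantity are made up for illustration, and the Crossfilter calls only run on a page where the library has been loaded.

```javascript
// One hypothetical row: the field names follow the narration, the values are invented.
const sampleRow = { date: "2011-11-14", quantity: 2, total: 200, tip: 50, type: "tab" };

// A dimension value taken from a single field.
const byType = (d) => d.type;

// A dimension value computed from two fields: the tip as a fraction of the total.
const byTipFraction = (d) => d.tip / d.total;

// A dimension value that combines two fields into one label.
const byTypeAndQuantity = (d) => d.type + " x" + d.quantity;

console.log(byType(sampleRow), byTipFraction(sampleRow), byTypeAndQuantity(sampleRow));

// With Crossfilter loaded in the browser, each function could define a dimension.
if (typeof crossfilter !== "undefined") {
  const facts = crossfilter([sampleRow]); // index the data first
  const typeDimension = facts.dimension(byType);
  const tipFractionDimension = facts.dimension(byTipFraction);
  console.log(typeDimension.top(1), tipFractionDimension.top(1));
}
```

Each extra dimension costs memory and time, so in practice you would only declare the ones a chart actually needs.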
Now, we want to create a bar chart showing the number of payments by type, and the type will go along the bottom. Another useful way of thinking of a dimension is that it tends to specify the X axis for us, and the X axis is where we want to show the different types. So we go facts.dimension, and then inside the dimension brackets we have to tell DC which bit of data to go and fetch, and what to do with it. And this is a fairly simple one. We're just saying go and fetch and return d.type, and you don't need to do anything with it, other than return it. Now, you could call this variable, by the way, anything you like.
So, let's just look through this terminology here. We've got facts, because that's the variable name of our indexed data. And then on facts, we run dimension, and dimension is a Crossfilter term, which sets how Crossfilter, or DC, should go and access our data. What should they get, and what should they do with it? Within dimension, we pass a small function, and this is called an accessor function, because it specifies what to access or how to access it. These functions always follow the same structure: function, d, return, something or other, then the semicolon. The d means data point, or row, or fact. And because we're running this on our variable called facts, and facts has 12 objects inside it, this accessor function is going to run 12 times, and every time it runs, d will be a different row, a different fact. Because those rows are objects, we access their properties with d.type, like so, or d.tip or d.quantity.
It's always a good idea when you're using functions like this to have a good look at what's inside them. And so, I'm going to say console.log d, and we can comment out those, for a second, and just see what that gives us in our new bar chart page.
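To make the "runs 12 times" point concrete, here is a minimal sketch that calls an accessor once per row using plain JavaScript, with the real Crossfilter call shown underneath for a page where the library is loaded. The twelve rows are stand-ins that carry only a type field, and the exact strings "tab", "visa", and "cash" are an assumption about how the data spells them.

```javascript
// Twelve hypothetical rows holding only the type field: 8 tab, 2 visa, 2 cash.
const rows = ["tab", "tab", "visa", "tab", "tab", "tab",
              "cash", "tab", "tab", "tab", "cash", "visa"]
  .map((type) => ({ type: type }));

let calls = 0;

// Accessor in the shape described above: take a row d, return a value.
function typeAccessor(d) {
  calls += 1; // count how often the accessor is invoked
  return d.type;
}

// Stand-in for declaring a dimension: the accessor is called once per row.
const typeValues = rows.map(typeAccessor);
console.log(calls);      // one call per row
console.log(typeValues); // the type string of each row, in order

// The Crossfilter version, if the library is loaded on the page.
if (typeof crossfilter !== "undefined") {
  const facts = crossfilter(rows);
  const typeDimension = facts.dimension(function (d) { return d.type; });
  console.log(typeDimension.top(Infinity).length);
}
```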
We can see we've got our heading there already, and if we take a look inside the console, this is the d for data point that's been printed out 12 times, and each time it's passing in a different row. Now if we go back and change d to d.type, we can see exactly what it is we're planning to pass in to the variable called typeDimension, and there we go. It's the strings, or text values, of tab, Visa, and cash. We could write something much more complex in this function if we liked. We could say d.type plus d.date, if we wanted to, and concatenate the strings, making a whole new data point. In general, you should make as few dimensions as possible. The more you create, the slower your code will run, and in any case, Crossfilter has a limit of 32 dimensions at any one time on a single crossfilter.
So we have our dimension, and that's going to set our X axis, and the next thing we need to do is aggregate our data. It's not much of a bar chart if we have a bar for every single row. We want one bar per type, and the bars themselves should have a height that represents a count, or a sum, of something for us. To aggregate our data, we use group, which is another Crossfilter command. So we can say var typeGroup is typeDimension.group. This is the most basic kind of group, and it will count rows for us. Now let's look inside our new group using print filter. Remember that print filter has a bit of a quirk, where we pass the variable name inside a string. Nothing bad really happens if you don't use a string. The output's just a bit less useful. So where it says typeGroup there, if we hadn't passed in the string, it would say object object. So this crossfiltered group has output key value pairs. There's the key, and there's the value. Tab has a value of eight, cash and Visa have two each. And if we look at the raw data, that looks about right. We can see a lot of these rows represent payments that have been settled by tab. There's eight there. So that's steps one and two complete.
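The counting that a basic group does can be sketched in plain JavaScript as below. The helper countByKey is a hypothetical stand-in written for this illustration, not part of Crossfilter, and the rows are the same made-up twelve payments (8 tab, 2 visa, 2 cash). The real typeDimension.group() call is included for a page where Crossfilter is loaded.

```javascript
// Twelve hypothetical payments carrying only a type field.
const payments = ["tab", "tab", "visa", "tab", "tab", "tab",
                  "cash", "tab", "tab", "tab", "cash", "visa"]
  .map((type) => ({ type: type }));

// Plain-JavaScript stand-in for a basic group: count the rows for each key.
function countByKey(rows, accessor) {
  const counts = new Map();
  for (const d of rows) {
    const key = accessor(d);
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  // Return key/value pairs sorted by key, the same shape a group reports.
  return Array.from(counts, ([key, value]) => ({ key: key, value: value }))
    .sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
}

const typeCounts = countByKey(payments, (d) => d.type);
console.log(typeCounts);

// The Crossfilter version, if the library is loaded on the page.
if (typeof crossfilter !== "undefined") {
  const facts = crossfilter(payments);
  const typeDimension = facts.dimension((d) => d.type);
  const typeGroup = typeDimension.group(); // counts rows by default
  console.log(typeGroup.all());            // array of { key, value } pairs
}
```

The keys that come out of the group are the values the X axis will need to show, one bar per key.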
Now, the final step is to create the chart, and link together the crossfiltered data with the HTML location. And we do that with dc.barChart brackets, and inside the brackets, we say, #chart. Now, this is actually CSS selector terminology. This hash or pound sign means ID, and if we put a dot there instead that would mean class. The significance of ID or class is that we have said ID equals chart up here. So when we say #chart, it means go and select whatever element currently has an ID of chart. That's how we tell DC which bit of HTML to look at. And then we have to specify the data, which we do with .dimension and .group, and the dimension is typeDimension, and the group is typeGroup, like so.
Now, with a bar chart, there's a couple of other things that we have to specify. In fact, normally, there's only one other thing, and that's .x, like so. This X parameter tells DC how to lay out the X axis for us, and the thing we have to pass in there is a scale. For this, we're going to use our first bit of D3. DC hasn't defined its own set of scales, but D3 has a whole set of scales defined, and one of them is d3.scale.ordinal, brackets, and then DC requires that we specify a domain. And I will explain these terms in a second. The domain will contain cash, tab, and Visa, like so.
Now, a scale maps one set of values to another. D3 contains definitions for lots of scales, including the two most common types, which are ordinal and linear. An ordinal scale usually maps numbers to text, where the text might be days of the week, or names of staff, or in our case, payment methods. To declare a scale in D3 version three, we say d3.scale, then we give the type. Then we help DC make sense of the data with .domain. Domain is a D3 term, and it just means the jurisdiction of the data, that is, the set of input values the scale has to cover. With a linear scale, we would enter minimum and maximum values as the domain.
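To show what "a scale maps one set of values to another" means for the two scale types, here is a hand-rolled sketch. The functions ordinalScale and linearScale are simplified stand-ins written for this illustration and are not how D3 implements its scales; the pixel sizes are made up. The D3 version 3 calls at the end only run on a page where D3 v3 is loaded (later D3 versions renamed these, for example to d3.scaleOrdinal and d3.scaleLinear).

```javascript
// Ordinal: every value listed in the domain gets its own evenly spaced slot.
function ordinalScale(domain, width) {
  const step = width / domain.length;
  return (value) => {
    const i = domain.indexOf(value);
    return i === -1 ? undefined : i * step; // values outside the domain have no slot
  };
}

// Linear: a number between min and max is placed proportionally between two ends.
function linearScale([min, max], [start, end]) {
  return (value) => start + ((value - min) / (max - min)) * (end - start);
}

// Ordinal domain: list every category. Linear domain: minimum and maximum only.
const xPosition = ordinalScale(["cash", "tab", "visa"], 300);
const yPosition = linearScale([0, 8], [0, 200]);

console.log(xPosition("tab")); // 100
console.log(yPosition(4));     // 100

// The D3 version 3 declarations, if D3 v3 is loaded on the page.
if (typeof d3 !== "undefined" && d3.scale) {
  const x = d3.scale.ordinal().domain(["cash", "tab", "visa"]);
  const y = d3.scale.linear().domain([0, 8]).range([0, 200]);
  console.log(x.domain(), y(4));
}
```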
With an ordinal scale, on the other hand, we list all the types that are available, and these are the same as the keys in our key value pairs. Now, two more things before we can see our chart. Whenever we're using an ordinal scale, we have to give one more parameter, which is xUnits, and this is where we tell DC to expect an ordinal scale. So, we say dc.units.ordinal, and we save that. And then the final command, in any DC webpage, is dc.renderAll, which means go ahead and draw the chart. And there we have our first bar chart.
You can see we're showing two for cash, two for Visa, and eight for tab, as we would expect. Notice that we have an X and a Y scale, and the Y axis has labels on it, even though we didn't specify any. And actually we didn't give a scale for Y, either. DC has defaulted to a linear scale, and a tick mark against each integer. We have a default blue color for our bars, and you're going to come across this blue color a great deal, but we can change it. And we have a default two pixel gap between the bars. The height of the chart has defaulted to 200.
Now let's just go back and comment out .xUnits and see what happens when we do that. Personally, I feel that DC should be able to work out that this is ordinal without being told, because it's specified in the scale. But as you can see, if we don't specify xUnits here, DC misinterprets our domain. It's put cash at the far left, and tab at the far right. So let's put that back, and then we'll be ready to play around with some of these defaults to make this chart look more like our original chart.
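Putting the three steps together, the chart declaration might look roughly like the sketch below. It assumes a DC version that works with D3 version 3 (so that d3.scale.ordinal exists), a page containing an empty div with an ID of chart, and the lowercase strings "cash", "tab", and "visa" as the group keys. The wrapper function drawPaymentsChart and the sample rows are hypothetical; the narration writes the same calls directly in the page script.

```javascript
// Hypothetical wrapper around the calls described above. The libraries are
// passed in as arguments so the function has no hidden dependencies.
function drawPaymentsChart(dc, d3, typeDimension, typeGroup) {
  const chart = dc.barChart("#chart"); // CSS selector: the element with id="chart"
  chart
    .dimension(typeDimension)
    .group(typeGroup)
    .x(d3.scale.ordinal().domain(["cash", "tab", "visa"])) // list every category
    .xUnits(dc.units.ordinal); // tell DC the X axis is ordinal
  dc.renderAll(); // draw every chart declared on the page
  return chart;
}

// Only runs in a browser where Crossfilter, D3 v3, and DC are all loaded,
// and the page contains <div id="chart"></div>.
if (typeof crossfilter !== "undefined" &&
    typeof d3 !== "undefined" &&
    typeof dc !== "undefined") {
  // Twelve made-up payments carrying only a type field.
  const data = ["tab", "tab", "visa", "tab", "tab", "tab",
                "cash", "tab", "tab", "tab", "cash", "visa"]
    .map((type) => ({ type: type }));
  const facts = crossfilter(data);                       // index the data
  const typeDimension = facts.dimension((d) => d.type);  // X axis values
  const typeGroup = typeDimension.group();               // row count per type
  drawPaymentsChart(dc, d3, typeDimension, typeGroup);
}
```

Commenting out the .xUnits line in this sketch is the quickest way to reproduce the misplaced bars described above.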
