The Grammar Of Graphics
Hey Katie. Hi Ben, how are you doing? What are we talking about today? We're talking about the grammar of graphics. The grammar of graphics? Yeah. This is a visual episode in audio form, so let's see how this goes. This could be okay. You're listening to Linear Digressions.
Okay, so I know what the term grammar means as it applies to language. It's kind of the rules about how you would construct sentences, and I'm sure there are many people who could define it better than me, but that's kind of how I think about it. Yeah: when we are using language to communicate, there's an order in which we place subjects and verbs and objects. There's a recursion to language, in the sense that you can have phrases that have substructure. There are also orders in which things tend to appear: I would always say "the big black car"; I would never say "the black big car." Yes, grammar is this thing that's a little bit hard to define, but once you start to think about it, it's pretty common to think of it in terms of the rules of language. I actually was reading something really interesting about this. I just found a tweet by Matthew Anderson: "Things native English speakers know, but don't know we know." And the quote is: "Adjectives in English absolutely have to be in the following order: opinion, size, age, shape, color, origin, material, purpose, noun. So you can have a lovely little old rectangular green French silver whittling knife. But if you mess with that word order in the slightest, you'll sound like a maniac. It's an odd thing that every English speaker uses that list, but almost none of us could write it out." Yeah, I think I've heard something similar too, so I think that was what I was drawing on a little bit. "Green great dragons"? No, "great green dragons." Yeah, exactly. So we're not talking about language here, though. The grammar of graphics: what does that mean?
So that's what we're going to spend the next fifteen minutes talking about, but the rough idea here is this: just like there's an expectation that you have about word order or the construction of phrases when you're listening to someone speak or when you're reading a sentence, there's a similar idea, perhaps, for drawing visualizations of data or consuming visualizations of data. There are things that you expect to see, whether or not you even really think about it, and when you're composing a visualization there are things that you're planning for or taking into account that, again, maybe you aren't thinking about. But this comes up in a really deep way if you are, say, dealing with data visualization software at a pretty fundamental level. So for those of you who are into the R universe, and particularly the tidyverse, Hadley Wickham's corner of the R universe, you're probably familiar with a package called ggplot2, which is a visualization library in R that famously makes very beautiful graphics; its defaults especially make for really nice graphics. The "gg" in ggplot2 refers to the grammar of graphics. And actually, most of the research that I did for this episode was reading a twenty-five-page paper that Hadley Wickham wrote about how he thinks about, and how the field in general thinks about, the grammar of graphics. So data visualization is what we're going to talk about. Very cool. I don't even know where to start in thinking about this. This is gonna be neat. Yeah, this was a pretty challenging topic for me to try to understand, because it gets into theory pretty quickly: what is a facet, and what is a scale?
What's the difference between a mapping to an aesthetic and a coordinate system? I think there's certainly a lot to unpack if you're really excited about this idea. But rather than getting into some of these kind of esoteric concepts, especially concepts that are esoteric without having examples to look at, I wanted to illustrate the main pieces of the grammar of graphics, as Hadley Wickham for example talks about it, using an example of a visualization that probably a lot of people are really familiar with, and how that illustrates a few of the big, important concepts that, again, we all probably take for granted in our day-to-day visualizations. Okay, so what's the example graphic then? All right, let's talk about a stacked histogram. A stacked histogram. Can you describe it for me? Yes, so let me give you an example of a stacked histogram that I used to make all the time when I was a physicist. When I was a physicist, we used to make lots and lots of plots where what you were trying to do was look at distributions of particles that you were getting in your detector. In general there were lots of different kinds of particles that were classified as what we would call background, so these were types of particles that were, you know, interesting, but not what we were really searching for. And then in certain situations you'd be looking for signal particles as well, so this might be like a Higgs boson if you're doing a Higgs search. And so when you were creating visualizations of your data, what you were looking for is: okay, do we have a distribution of data that's more consistent with there
only being background present, or does it look more consistent with background plus signal? In the second case it's like, oh, maybe we discovered some new physics or something. So we would think a lot about how to visualize background, and when you're doing that analysis you tend to have different kinds of particles that are coming in from different places in your detector. If you just look at one of those systems at a time, you're going to get an incomplete picture of all of the particles. Instead, what you want is to layer them all on top of each other, so that you have a picture of the overall distribution of the particles that you see, but you also have them stratified by the different types of physics processes that they correspond to. So you're kind of stacking each of those strata on top of each other, and you have a visualization that shows each of them separately but also all of them adding together. That's roughly what a stacked histogram is. Got it. I think I've seen these before. I'm sure I've seen them in many places, but I'm thinking about when you do a software release and you look at all of the different computers that are running the software and what version they're on, and you can see how people have upgraded. Each version of the software will be represented by a different color, and over time you'll see each one kind of grow and peak, and then as newer software is released, the previous version will kind of trail off. I guess the representation that you're talking about is showing all of that in a single graph, with time, let's say, being the x axis. In my example it's always one hundred percent high, because every user is on some version, but you can see the distribution of those versions at any given point. Yeah, or suppose you decided to represent it not as a percentage of the whole: if your y
axis was allowed to float and instead it was the total number of users using that system, then you could imagine the overall height could actually go up and down as users join or leave your system, or are using your software or whatever. So I have an image in my head now. Okay, great, and hopefully most of the folks who are listening to this do too. But if you don't, or if you're really struggling to think about what a stacked histogram might look like, it might be worth taking like five seconds to Google this on your phone to get a mental snapshot, because I don't imagine that the rest of this will make tons of sense if you have no idea what we're talking about.
So, okay, stacked histogram: how do we think about this in terms of the grammar of graphics? Let me layer in a few of the fundamental ideas of the grammar of graphics, so that they come in a very explicit order. The first layer, the most foundational layer of what you need to make a data visualization, is: what is the data set that you're going to be visualizing, and how does that map from the variables in the data set to a set of aesthetics? So what's the data set? Let's talk about that first. Let's use your example, actually; I think that's probably a little bit more familiar to our listeners than a particle physics data set. So instead we have some notion of a data set that has all of the users of our software through time, and the type of, what did you say it was, the version of the software that they're using. Yeah, and actually, can I make this a little bit meta and tweak this? We'll say this could be Linear Digressions episode downloads. We can go into our hosting provider and we can see how many people download on a given day. And so of course the day after we release an episode we see a lot of downloads, and then maybe two months go by and now that episode is a small sliver.
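For listeners following along in text, here is a minimal sketch of that first layer: a data set plus a mapping from its variables to aesthetics, followed by the stacking step described earlier. Everything in it is an assumption made for illustration: the download numbers are invented, and the names `records`, `aesthetics`, and `stack` are hypothetical helpers, not part of ggplot2 or any other library.

```python
from collections import defaultdict

# Hypothetical data set: one row per (day, episode) with a download count.
# These numbers are invented for illustration only.
COLUMNS = ("day", "episode", "downloads")
records = [
    (1, "ep1", 100), (2, "ep1", 40), (3, "ep1", 10),
    (2, "ep2", 120), (3, "ep2", 50),
    (3, "ep3", 90),
]

# The aesthetic mapping: which variable drives which visual property.
aesthetics = {"x": "day", "y": "downloads", "fill": "episode"}


def stack(records, aesthetics):
    """Turn rows into stacked bar segments, one per (x, fill) pair."""
    rows = [dict(zip(COLUMNS, r)) for r in records]
    x_var, y_var, fill_var = (aesthetics[k] for k in ("x", "y", "fill"))

    # Sum y for every (x, fill) combination.
    totals = defaultdict(int)
    for row in rows:
        totals[(row[x_var], row[fill_var])] += row[y_var]

    xs = sorted({row[x_var] for row in rows})
    fills = sorted({row[fill_var] for row in rows})

    # Within each x position, pile the fill groups on top of each other.
    bars = []
    for x in xs:
        bottom = 0
        for fill in fills:
            height = totals.get((x, fill), 0)
            if height:
                bars.append({"x": x, "fill": fill,
                             "bottom": bottom, "top": bottom + height})
            bottom += height
    return bars


if __name__ == "__main__":
    for bar in stack(records, aesthetics):
        print(bar)
```

Each returned segment says where one colored slice of a bar starts and ends, so the top of the highest slice on a given day is that day's total downloads. In ggplot2 itself, the same mapping would be written roughly as `aes(x = day, y = downloads, fill = episode)`, with the stacking handled by the library.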