[First of all, please take a look at the Databases page, which I will be updating as I find more data and tools for data analysis.]

One of the issues I commonly face while teaching introduction to sociology is that undergraduate students know very little about public policy in general, and about how powerfully it can shape outcomes for large numbers of people in affected populations, sometimes for better, and sometimes for worse.

Yesterday, the Guardian published a good series of maps on this (based on the Global Maps published by Children’s Chances, out of the World Policy Center).

Let’s get started:

Mapping children's chances - Paid Leave

First note the overall North / South divide. But also note that the red countries are not the poorest ones. Some poorer countries offer greater benefits than the US. But that is for mothers, what about fathers?

Mapping children's chances - Paid Leave 2

Now, what could explain the discrepancies? This might be a good opportunity to discuss gender roles and how they are enforced / supported / challenged through public policy. One can see right away that there is much more red on this map, that is, far fewer countries offer paternal leaves. However, there is a bit of overlap: blue areas in the first map match blue areas in the second one.

There is also a good opportunity to discuss how one should read a map. As much as one’s reading is guided by the colors, one should pay attention to the legend as well. Otherwise, a careless reading might lead one to think that Australia offers longer paternal leaves than maternal leaves… well, no. The legend is different between the maps: blue in the first one means 26 weeks or more, whereas it means 14 weeks or more in the second one. So, comparisons should not be based on just a skim of the colors: only red means the same in both maps.
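To make the legend point concrete for students, here is a minimal sketch of how the same leave durations land in different color bins on each map. Only the two blue thresholds (26 and 14 weeks) come from the maps discussed above. The function name, the "intermediate" bin, the assumption that red means no paid leave at all, and the example country are all made up for illustration.

```python
# Only the "blue" cut-offs (26 and 14 weeks) come from the maps discussed above;
# everything else here is a placeholder for illustration.
BLUE_THRESHOLD_WEEKS = {"maternal": 26, "paternal": 14}


def color_for(map_name, weeks):
    """Return the legend color for a leave duration on the given map."""
    if weeks <= 0:
        # Assumption: red means no paid leave, the one bin shared by both maps
        return "red"
    if weeks >= BLUE_THRESHOLD_WEEKS[map_name]:
        return "blue"
    return "intermediate"  # stand-in for the in-between legend categories


# A made-up country offering 18 weeks to mothers and 14 weeks to fathers
print(color_for("maternal", 18))
print(color_for("paternal", 14))
```

In this made-up case, the paternal map shows blue and the maternal map does not, even though mothers get the longer leave. That is exactly the trap of comparing colors without comparing legends.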

Moving on,

Mapping children's chances - work

This one is interesting because, good luck finding a pattern. And Maude knows we sociologists love patterns because nothing says social structure like a good pattern of behavior or policies and outcomes. For instance, look at the Americas and you find countries for each type of regulation (or lack thereof). So what do we do in a case like this? After all, the first map was relatively easy to analyze. Well, first of all, it is a good opportunity to recognize that Western countries do not have a monopoly on children- and family-friendly policies, and that the US tends to do pretty badly on those compared to countries at the same economic level, although it is not the only one (see Australia). On the other hand, some less wealthy countries have strict standards (check out the grey countries). But overall, what this tells us is “dig deeper”.

It is all well and good to have paid leave, but how well paid does it have to be to be effective, that is, to allow parents not to work during the leave period? The Guardian does not have that information but you can find it at the original site.

So, for mothers:

Mapping children's chances - How Paid  Mothers

Note that the values are maximum values, not necessarily what people are actually receiving. Still, it is amazing to see most of the world in the dark green category, although one wonders how many women in the poorest countries, especially those who live in rural areas, actually receive any benefit at all, considering we are talking about wage replacement.

What about fathers? We already know they are less likely to get a leave in the first place:

Mapping children's chances - How Paid  Fathers

No surprises here.

What is missing here, though, is the impact these policies might have (or not) on other social indicators. If we assume that these policies should generate positive outcomes for children, we would need to correlate them with other variables, such as infant mortality, educational achievement, life expectancy, etc. One would also need to know how effectively these policies are implemented and how the benefits are distributed across the population, especially in countries in the Global South.
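For what it is worth, the correlation step itself is simple once the data are lined up by country. The sketch below computes a plain Pearson correlation by hand. The helper name is mine, and the numbers are placeholders, NOT real figures for any country. And of course, a correlation would only be a starting point, not proof that the policy causes the outcome.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable never varies")
    return cov / sqrt(var_x * var_y)


# Placeholder values, NOT real data: one entry per hypothetical country
paid_leave_weeks = [0, 12, 14, 26, 52]
infant_mortality = [30.0, 25.0, 20.0, 10.0, 5.0]  # deaths per 1,000 live births

r = pearson_r(paid_leave_weeks, infant_mortality)
print(f"r = {r:.2f}")  # negative for these made-up numbers
```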

So, as much as I like visualizations, these are a bit short on content. They are a starting point and raise a lot of interesting issues and questions but provide few answers.

As much as I can, I integrate a data analysis component into my introduction to sociology classes. I am not trying to do anything really complicated but I want my students to get a very basic taste of what it means to think with data. For a long time, I had the perfect tool at hand in the form of MicroCase Workbooks. There were several of them (a couple for introduction, one for marriages and families, one for social research). MicroCase is a bare-bones version of the more common statistical software in the social sciences. It is a small program (does not take too much space on your hard drive) that runs on Windows only. However, it uses the GSS, the American Community Survey, and the World Values Survey, and gives students the opportunity to select their own variables and construct their own tables / maps / pie charts / scatterplots / time lines. Students have always found it easy to use and actually fun. Well, that is over as the publisher decided to no longer update the software or the databases. Since then, I have been looking for alternatives. And, of course, publishers’ reps have been more than eager to try to sell me on their latest tools… which are all inadequate for my own purpose. And sending intro students into SPSS is out of the question… heck, I don’t want to go into SPSS.

So, what is a SocProf supposed to do? Well, there are now tons of data and databases that are publicly available. Why not create my own exercises? It will be cheaper for my students and my exercises can be exactly the way I want them. There are also now a lot of visualization tools, often provided directly by the same organizations that make the data available (like the UN development report or Gapminder). I no longer depend on the goodwill of a corporate publisher to keep on updating a product that is going to be costly to students. Win-win. On the losing side, it is going to be time-consuming to build up these exercises. I just spent an hour cleaning up data from the CDC on suicide in the US. And if the visualization tools are not available, I can always use Tableau.

So, for instance, indeed, I started simple with some data on suicide in the US. The CDC was the organization with the most data on that. Starting with this:

Suicide Map 1

The first problem with this map is that it is not interactive, and the level of detail (by county) makes it a bit busy even if you can clearly see regional patterns. These regional patterns actually make for an interesting puzzle for my students to solve. That can be a starting point but it is hard to create rankings, for instance.

A second option is to use the CDC interactive tool through WISQARS. So, basically, it looks like this:

Suicide CDC Interactive 1

As you can see above, you have a series of menus, drop-downs, and radio buttons. You can filter things out. I kept the entire US but I selected “suicide” for intent of injury. And I kept the largest spread (2000 – 2006). I kept all the demographic subsets at default. And I got this as a result:

Suicide CDC Interactive 2

Several problems with this: (1) on the right hand side, it says “Hover over a state with your mouse to see its name and rate”… that does not work. I tried different browsers including *gasp* Explorer, and no dice. (2) The export data function creates a csv file that takes a lot of cleaning up if you want to do even the simplest statistical operations and visualizations. Which is what I ultimately ended up doing in Tableau Public (sorry, the embed still does not work).
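For those who would rather script the cleanup than do it by hand, here is a rough sketch of the kind of thing involved, ending with a ranking by state. To be clear, the sample below is made up: the column names, the footnote row, the "--" markers for suppressed values, and the numbers are all assumptions, not the actual layout of the CDC export. You would have to adapt it to whatever the file really looks like.

```python
import csv
import io

# Made-up sample mimicking a messy export; the column names, the footnote row,
# and the numbers are all assumptions, not the actual CDC file layout.
RAW_EXPORT = """State,Deaths,Population,Crude Rate
Alpha,"1,200","8,000,000",15.00
Beta,300,"1,000,000",30.00*
Gamma,--,"500,000",--

Note: rates marked * are based on small counts
"""


def clean_rows(text):
    """Yield (state, rate) pairs, skipping blank, footnote, and suppressed rows."""
    for row in csv.DictReader(io.StringIO(text)):
        state = (row.get("State") or "").strip()
        rate = (row.get("Crude Rate") or "").strip().rstrip("*").replace(",", "")
        try:
            yield state, float(rate)
        except ValueError:
            continue  # footnotes and suppressed values ("--") are dropped


# Rank the states from highest to lowest rate
ranking = sorted(clean_rows(RAW_EXPORT), key=lambda pair: pair[1], reverse=True)
for rank, (state, rate) in enumerate(ranking, start=1):
    print(rank, state, rate)
```

Dropping every row whose rate cannot be read as a number is a blunt choice. It is fine for a classroom ranking, but it also silently hides the states with suppressed data, which is worth mentioning to students.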

The map, though, shows the same pattern as the county one above.

Third option, if you really want an interactive map, and still from the CDC, there is another interactive tool that is a bit trickier to manipulate but does the job: Health Data Interactive:

Suicide CDC Interactive 3

Again, you get to set your options and get an interactive map (with some missing data and only 44 states reporting, which is kinda annoying).

Beyond maps, though, the CDC has some good data visualizations but again, the raw data are harder to track down. For instance, you can get a broad overview over time:

Suicide Overall

Again, you can set up some interesting questions regarding the shifts in age categories with the highest suicide rates, when the shift happened and why. But you can drill down even further and consider race and ethnicity:

Suicide Race Ethnicity

Why whites and American Indian / Alaskan Native / Pacific Islanders (from my little Tableau thing, we already know that Alaska has a high rate)?

Ok, let’s add sex into the mix:

Suicide Race Ethnicity Sex

Across the board, men are way more likely to commit suicide than women. Adding sex does not alter the racial / ethnic patterns. So, should we pity white men after all?

Finally, let’s add age. Let’s start with the 10-24 age category:

Suicide Age 10-24

One can only ask, what is going on with young American Indian / Alaskan Native / Pacific Islanders? Whites are no longer strikingly higher than other racial and ethnic categories, for that age category.

But once you move up the age ladder, into the 25 – 64 category:

Suicide Age 25-64

Then, whites pop up again in the higher rates.

Ok, how about 65 and older:

Suicide Age 65 over

See what happens with American Indian / Alaskan Native / Pacific Islanders? And Whites?

Ok, how about some trends?

Suicide gender trend

Note the uptick with the recession. Otherwise, a familiar gender pattern.

Let’s separate men and women and compare by age categories, first, for men:

Suicide trend males age

The interesting trend here is the progressive joining of the 25-64 (up) and the 65+ (down).

Now, women:

Suicide trend females age

Now, we already know that women are much less likely to commit suicide than men. And this visualization has an extra age category but one can see that the relative increase is greater for women than men. This is especially the case in the 45-54 category.
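Students often mix up absolute and relative change, so a tiny worked example can help here. The rates below are placeholders chosen for easy arithmetic, NOT the CDC figures from the charts, and the function names are mine.

```python
def absolute_change(start, end):
    """Change in the rate itself (same unit as the rate)."""
    return end - start


def relative_change(start, end):
    """Change as a percentage of the starting rate."""
    return (end - start) / start * 100


# Placeholder rates per 100,000, NOT CDC figures
men_start, men_end = 20.0, 22.0
women_start, women_end = 4.0, 5.0

print("men:  ", absolute_change(men_start, men_end), relative_change(men_start, men_end))
print("women:", absolute_change(women_start, women_end), relative_change(women_start, women_end))
```

With these made-up numbers, the men's rate rises by more in absolute terms (2 versus 1 per 100,000) while the women's rate rises by more in relative terms (25 percent versus 10 percent). A group with a much lower starting rate can show the larger relative increase in just this way.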

And now, for the fun of a different visualization, let’s add yet another variable: the means of suicide:

Suicide mechanisms

I am normally not a big fan of stacked bars, but in this case, I think it works. You can clearly see that men are more likely to use a firearm in all age categories whereas suffocation and poisoning are more used by women. One could explore access and cultural factors in the decision to use one mechanism or another to kill oneself.

This gender aspect is more visible if one filters out other variables:

Suicide mechanisms gender

So, as you can see, there is a lot to explore and a lot of sociological puzzles to be solved, just by using some very basic data, with limited variables, and just by using publicly available data visualizations.

I’ll continue to share these things as I build them.

For those of us interested in sociology, globalization, global stratification, and data analysis, the annual Human Development Report is a must-read and a highly anticipated source of data. This year’s edition is no exception. You can check out the highlights in the short video below:

There are some extra goodies, though, for the data analysts of all stripes. The report’s website has a great number of visualizations and tools for people to explore the data on their own, based on their own interests. There is something for everyone and you can drill down to your heart’s content, using a variety of data visualizations or tables. That is what I did and the results are below.

Human Development Index 2013 from SocProf on Vimeo.

This is where the real good stuff is:

HDR visualizations

Click on the image to be taken to the actual page and you can start from there. It is a great exploration / teaching / learning tool.