Mapping data from Indonesia’s disaster information portal

Maps are great for decision-making (e.g. where’s the nearest restaurant, how to get from point A to B)… they’re even better when you know how to use them to help analyze data and information (thank you, geography degree). A lot of automated data visualization software now exists that can produce charts, graphs and even maps to help see trends and patterns. But when it comes to really understanding and analyzing information, there’s still a lot to be said for bringing a human touch and perspective to data and information visualization.

One of the projects I’ve been working on is to capture and analyze disaster-induced displacement information for the Internal Displacement Monitoring Centre (IDMC) and its Global Report on Internal Displacement and Global Internal Displacement Database. One of the things IDMC wants to know when a disaster strikes, like a flood, hurricane or earthquake, is how many people are displaced. It’s a simple research question that usually doesn’t lead to a straightforward answer. Challenges can include a lack of government monitoring for this kind of information, data collection and standardization issues, the accessibility of the data, or even the political nature of publishing and sharing this information.


Fortunately some governments actually do a great job in collecting, processing, and publishing this kind of information. Indonesia is one of them. The government provides a disaster data portal which it maintains on a regular basis that tracks where a disaster takes place, when it happens, what kind of hazard triggered the disaster event, and the people killed, missing, injured and displaced/evacuated. For one of the most disaster-prone countries in the world, having this kind of information online, updated and easily accessible is an asset for research organizations like IDMC to be able to develop policies and recommendations that can have an impact on saving lives.

While the website has automatic visualization features, it requires a lot of assumptions and understanding by the user to know what to search for. At the same time, it is a bit challenging to use since it’s an online portal that has limited visualization and analysis capabilities. As part of my research, I decided to put my geography background to work to make sense of this data.

The mapping feature from BNPB disaster data portal - http://dibi.bnpb.go.id/data-bencana

I downloaded the raw data in Excel format. In most situations, a quick manipulation in Excel can reveal some trends. However, the spreadsheet included too many data points with differing variables like event date, hazard type, and location. I wanted to find a better way to make sense of the data, so I decided to plot it using QGIS, a free, open-source Geographic Information System (GIS).

Here’s a quick summary of what I did:

  1. The Excel file included raw district-level disaster information going back as far as 1815. I only needed 2016 data, so I filtered the data set and extracted all 2016 records with “Mengungsi” (evacuation) values greater than zero.
  2. In order to plot the data on a map, I needed to add spatial information to the data set. As the Indonesian data was broken down by district, a quick search led me to district-level boundary data published by the World Food Programme – unfortunately I couldn’t find district-level spatial data on the government website.
  3. Once I joined the Excel sheet with the district boundaries, I still needed to clean the data and verify that every district in the government disaster data set matched a district in the WFP boundary data set. This is key; otherwise QGIS can’t map the data.
  4. Since no GPS locations were included to pinpoint exactly where each disaster occurred, I defined a centroid (i.e. a point at the centre of each district boundary). This allowed me to plot each event as a specific point on the map to help in analyzing and aggregating information since multiple events can take place in one district.
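For anyone who would rather script these steps than click through Excel and QGIS, here is a minimal sketch in plain Python. It is not the workflow I used: the field names (`year`, `district`, `mengungsi`), the sample rows and the toy boundary polygons are all made up for illustration. It filters one year of events with evacuations, lists district names that have no match in the boundary data, and computes a polygon centroid with the standard shoelace formula.

```python
def normalize(name):
    """Lower-case a district name and collapse repeated whitespace."""
    return " ".join(name.lower().split())


def filter_events(rows, year=2016):
    """Keep events from the given year with at least one person evacuated."""
    return [r for r in rows if r["year"] == year and r["mengungsi"] > 0]


def unmatched_districts(events, boundary_names):
    """Return event district names that have no match in the boundary data."""
    known = {normalize(n) for n in boundary_names}
    return sorted({e["district"] for e in events
                   if normalize(e["district"]) not in known})


def polygon_centroid(ring):
    """Area-weighted centroid of a simple polygon given as [(x, y), ...]."""
    area2 = 0.0  # twice the signed area
    cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return cx / (3 * area2), cy / (3 * area2)


# Hypothetical sample data, not taken from the BNPB portal or the WFP boundaries.
rows = [
    {"year": 2016, "district": "Kota  Bandung", "hazard": "flood", "mengungsi": 120},
    {"year": 2016, "district": "Garut", "hazard": "landslide", "mengungsi": 0},
    {"year": 2015, "district": "Garut", "hazard": "flood", "mengungsi": 40},
    {"year": 2016, "district": "Kab. Garut", "hazard": "flood", "mengungsi": 300},
]
boundaries = {
    "Kota Bandung": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "Garut": [(2, 0), (6, 0), (6, 2), (2, 2)],
}

events = filter_events(rows)
print("Names to fix before joining:", unmatched_districts(events, boundaries))
print("Centroids:", {name: polygon_centroid(ring) for name, ring in boundaries.items()})
```

Treating latitude and longitude as flat x/y coordinates is only an approximation, but it is usually good enough for placing one marker per district. A GIS package would normally handle the projection for you.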

It may not have been pretty, but it did make it easier to interpret the data based on hazard type, event date, and geographic location. And it made it more effective to work with when I wanted to conduct further analysis, run queries to address different research questions, and produce maps like the ones below.

[Maps: Evacuations; Events by Date; Events by Hazard Type; Total Events by District]

Data visualization automation software and websites can be useful, but it’s also great to have a skill like old-school mapping and cartography to turn to when I need it… times and projects like these make me realize how useful a geography degree can be.

Disaster graphics get bronze prize for international information design award

One of the buzzwords these days is “infographics”. While these can range from simple pie graphs to complex flowcharts, the best aren’t necessarily the most “designed”. The most effective information graphics are ones that can communicate an idea or story and that can help the audience turn information into knowledge. This also means going through a design thinking process and understanding the subject matter so the “design” matches the objective of what the graphic is trying to communicate.

This was the philosophy I took when coming up with infographics for the United Nations Office for Disaster Risk Reduction (UNISDR). On a whim and to share my love for communicating information and data, I decided to enter a couple of infographics for the 2014 IIIDAward.


While I had an honorable mention in the 2011 IIIDAward, whaddya know, the graphics I entered this time around won third prize in two categories:

Didactics: “A timeline revisited”
This category was for projects that focus on educational or instructional information design – Download my entry here.


Editorial: “Making sense of disaster data”
This category was for projects related to media, journalism and writing – Download my entry here


The selection criteria were:

  1. Quality of the employed problem solving procedure:
    – identifying the information needs of users
    – making needed information available, accessible, understandable/usable
    – assessing the effectiveness of the provided information, if at all possible
  2. Attractiveness and elegance of the designed information

The IIIDAward is part of the International Institute for Information Design (IIID), a global network of individuals and organizations who are interested in optimizing information and information systems for knowledge transfer in everyday life, business, education and science. Its aims are to stimulate internationally the development, recognition and good practice of information design in its broadest sense.

All winners will have their work exhibited on a global tour. The first stop for the exhibition is the IIID Vision Plus 2015 conference in Birmingham: http://www.visionplus2015.info/

Charts don’t explain themselves

It’s a scary fact of life these days that people take figures and numbers at face value, and consider them authoritative when they come from the “source” or the media, yet a lot of the time they are taken out of context to serve a purpose like selling an idea or a service. Maybe it’s the way we’ve learned things in school, but when numbers and figures are put into charts and graphs they seem more “scientific” or trustworthy. Charts are also an easy way to visualize information, and that ease can be misleading. I’m a big fan of data visualization and information design, and have made a living out of it. At the same time, I also understand their limitations and biases because of how numbers can be manipulated or emphasized to serve a purpose. Communication isn’t just about exposing numbers – it’s about being able to explain them in a meaningful way.


The video below from Veritasium, which demystifies 13 misconceptions about global warming (i.e. climate change), is a great example of why charts and graphs don’t explain themselves. The presenter flashes lots of scientific graphs and charts that have also been used by climate deniers to support their arguments. The key thing is that he supports the numbers with clear explanations backed by research rather than just focusing on the numbers or trends shown by the charts.

Data is good. Data is bad. Where’s the middle ground?

It’s amazing to see the use of technology to track, monitor, and collect information and data from things that we do, like sports, to help improve and enhance what we do. I think it’s called life-hacking and if you’re thinking of resolutions for the New Year, there’s a whole website about doing just that. Yet as we get more and more comfortable tracking and hacking ourselves (FitBit anyone?), I can’t help but think – are we getting too reliant on the data?

I’ve written a couple of posts about how data has been used to better predict and inform baseball and basketball. Yet a prime example of over-reliance on data, without balancing it with common sense, also comes from the sports world. This article shows that despite all the hype around big data for the 2014 World Cup, Nate Silver, a popular data geek who made predictions during the 2012 US presidential election, still got it all wrong. Some said that the competition was no place for big data, which can’t understand the intrinsic issues and subtleties that real soccer fans see. Others claimed that Silver ignored some basic data issues.


Data and information that we collect, whether through technology or by ourselves, inherently have biases, like how someone set up these exit “sortie” signs for a reason. We build the technology and develop algorithms that are supposedly objective, yet in developing them we make inherent compromises and assumptions. The same goes for collecting and compiling data ourselves – from monitoring our diet, building a contact/email list, or just keeping track of our to-do lists and calendar – we are biased towards certain things (e.g. what we think is more important, what we can remember, etc.) when collecting this information (Excel sheets, anyone?). Also, are we managing our information consistently enough that it can reveal some truth that can help our decision-making?


I work for an organization that prides itself on its “information management”, and I spend a lot of time with internal and external clients not only to improve this management (e.g. simplifying, organizing and cleaning the data), but also to understand how it can be used strategically to communicate their work and key messages (e.g. by making a good infographic). Within the international development community, OCHA is light-years ahead of the game when it comes to this. They’ve also evolved and branched out to apply information in ways that are useful for the humanitarian community, like the recently launched INFORM initiative to improve risk analysis and the Humanitarian ID project to make contact management simpler and better in an emergency or crisis. Perhaps following OCHA’s lead, plenty of UN organizations are starting to visualize this information and to realize that data is more than just 1’s and 0’s, that it isn’t only for “geeks”, and that it can be used in different ways to communicate and provide “evidence” to improve programming and decision-making. The success of innovative ideas like these will depend on how accessible both the data and the tools are to the people who will use them. It can also be summed up by these two quotes from the Nate Silver article:

Predictions are no better than the quality of data and model that you employ.

Big data and predictive techniques are supposed to inform smart decision making, not automate it.

At the opposite end of the spectrum from the Silver article are these little visual vignettes by the New York Times of what went right for the Dutch, and so wrong for Brazil, during the World Cup. They are both data-driven and informative.

Data simplicity might be the best way to help us understand and improve the way we do things.

Technology isn’t going to solve all our problems, just ask Johnny Depp

A couple of weeks ago I watched Johnny Depp’s new movie Transcendence – spoiler alert: it’s a love story (kind of) – and found the issues the movie touched on quite intriguing. A lot of it focused on people’s reliance on technology and how it was supposed to give us all the answers to life. The premise in the movie was that there were two opposing perspectives on technology: one was to give ourselves completely to technology and artificial intelligence (AI), while the other was to balance human and machine decision-making, with humans being the ones in control. Well, all hell breaks loose when Depp’s consciousness gets uploaded into an AI… just watch the trailer below to see what happens.

The topic of humans vs. machines has always been around (remember 1984’s Terminator?). It’s just that now it’s actually happening, with more and more technology infiltrating our lives. I don’t have to go very far to think about how often I jump to attention when getting a Facebook, Gmail, or Twitter notification on my phone – isn’t the beep from our phones kind of like a master calling his dog?

Nice to see that life can still be entertaining (and distract us from our phones), like this drummer asking people to play with him.

While these scenarios are scary (otherwise how would Hollywood make more movies??), more insightful thinkers like Clive Thompson make the argument that it’s still possible, even necessary, to blend humans and machines together to actually evolve and help society. This doesn’t mean turning everyone into cyborgs, but really looking at how we can use the best of both worlds. In the early chapters of Thompson’s book ‘Smarter Than You Think’, he provides examples of how chess grandmasters worked with computers to improve not only the speed but also the creativity of play. I’ve just started the book, so definitely more insights to come.

My favorite quote so far:

“At their best, today’s digital tools help us see more, retain more, communicate more. At their worst, they leave us prey to the manipulation of the toolmakers. But on balance, I’d argue, what is happening is deeply positive. This book is about the transformation.”

Update: Thx Mr. Thompson!

Data isn’t everything – let’s balance it with common sense

I love visualizations, data, and information, and finding creative ways to turn them into something interesting and useful. It’s a great way to take advantage of the analytical and creative sides of the brain. At the same time, I’m quite aware that even if the world is becoming more visual and addicted to stats and numbers, we have to be even more wary of how that information is being used and interpreted. It shouldn’t be about seeing the superficial side of a statistic and using it in the hopes of sensationalizing a topic (it’s tempting for journalists and others to do this), but about being true to what the statistics represent, building a story around them, and respecting how this may influence the audience.

That’s why it’s refreshing to see that WIRED, a magazine focused on technology and all the numbers coming from it, published Felix Salmon’s article “Numbed by Numbers: Why Quants don’t know everything”, which helps to put a bit of perspective on the numbers game.

Let's not get bent out of shape over numbers.

According to the Merriam-Webster dictionary, a quant is an expert at analyzing and managing quantitative data, and the word’s first known use was in 1979. In Salmon’s article, he uses the example of the movie Moneyball, which documented how statistics were used in baseball to help the underfunded Oakland A’s to a division-winning 2002 season. He writes that quants are almost always right since they use algorithms and set up systems that track every aspect of society with 1’s and 0’s. Yet the more a field is run by a system, the more the system creates incentives for everyone to change their behavior. In the end people start to cheat the system, and the statistics it generates may no longer hold value or tell the “truth”.

It’s increasingly clear that for smart organizations, living by numbers alone simply won’t work…

There needs to be a bit more of a balance to the numbers that can help make our lives better and the use of good ol’ human insight, decision-making and common sense. Believing in statistics as they stand is one thing, but we also have to use our judgement and experience to bolster our understanding so that this information can improve the society we live in. For example, the National Weather Service employs meteorologists who, understanding the dynamics of weather systems, can improve forecasts by as much as 25% compared with computers alone.

Let’s celebrate the value of disruption by data – but let’s not forget that data isn’t everything.

Read “Numbed by Numbers: Why Quants don’t know everything”

Infographics: the gourmet or fast food version?


Great cuisine is an art form with a purpose – to make us feel good and full. I had this amazing dessert at Les Oliviers in the South of France over the Christmas holidays. It’s made of pineapple, passion fruit, cream, sugar and probably plenty of other ingredients – each of them is simple enough to find in a grocery store, yet putting it all together into a tasty post-dinner experience takes skill and a bit of creativity. It’s the same way we can think of infographics and the ability to take content, data, and other information and turn it into some sort of shape and visual experience that is easy and enjoyable to consume.

The Courrier international, a French newspaper that compiles stories from around the world, featured a special edition on infographics. While I found most of the infographics in the newspaper, which highlighted issues like employment, health, politics, and technology, a little too complicated to understand at first glance, the articles on the how, what, and why of infographics were very insightful.


The collection of articles (all translated into French) is a great primer on data visualization and what it means in an increasingly visual and data-driven world. The topics included analyzing the link between seeing and thinking, visualizations as tools for the mind, the use of Tweets to forecast getting the flu, and the “children of Big Data”.

Some big names in the field of data visualization and infographic design mentioned in the articles include:

  • Paolo Ciuccarelli – Associate Professor at Politecnico di Milano, he teaches in the Communication Design master’s degree at the Faculty of Design and is part of DensityDesign, a Research Lab in the Design Department of the Politecnico di Milano.
  • Alberto Cairo – teaches Information Graphics and Visualization at the School of Communication at the University of Miami and is the author of The Functional Art: An introduction to information graphics and visualization.
  • David McCandless – a London-based author, data-journalist and information designer, working across print, advertising, TV and web, and author of “Information is Beautiful”.

Like food, there’s always going to be both the fast and gourmet versions. It’s a toss-up as to which would be better. Gourmet food is always nice to the palate but takes a lot of time and energy to prepare. Fast food might not be the healthiest option but it’s easily accessible. Infographics are the same – do we want information that just scratches the surface or more in-depth analysis? It all really depends on what they are trying to communicate and to whom. Graphics and design are nice, but if they’re not achieving their aim (e.g. education, awareness, behavior change), maybe there are other, better mediums?


Getting down with tech to solve crime and reduce vomit

Social media isn’t just for “fun”. All the sharing that people do via Facebook, Instagram, Twitter, etc. can actually help fight crime, improve hygiene, and provide insight into how to improve services (like having fewer traffic jams) and keep us safe. While the humanitarian community has been looking into how tech can be used to save lives in disaster or conflict-prone countries, you don’t have to go too far to find that the integration of tech and basic public services like policing and public health can make our lives easier, safer, and maybe, just maybe, filled with less vomit.

In a study earlier this year, researchers at the University of Rochester used a Twitter search tool called nEmesis to identify cases of food poisoning from tweets that had GPS coordinates. In just four months, the system collected 3.8 million tweets from more than 94,000 unique users in New York City, traced 23,000 restaurant visitors, and found 480 reports of likely food poisoning. The public health sector has typically been ahead of the curve when it comes to prevention and early warning to reduce the risk of, say, an epidemic of a virus.

Another example of how tech and social data can actually help is in crime-fighting. It might not be far off from the tech that Batman uses, like the Batcomputer. Just ask IBM: they’ve been working with the police to set up systems that will use pattern recognition and anomaly detection technology on existing records like 911 calls, crime records, and building permit activity. Patterns revealed can help decision makers anticipate rather than just react to problems.

“We’re entering a new era of police work,” says the Fort Lauderdale Police Chief.

One of the funniest comedy shows this year has been Brooklyn Nine-Nine, and in one of the episodes this season they even touch on the fact that “real crime-fighting” these days is about using data and technology to solve crimes! In the episode “Old School”, despite coming to work with a huge hangover from a drinking binge the night before, Jake, the main character, pulls himself together and figures out how to find the IP address of the guy who’s been stealing credit card numbers.

The Open Data Revolution


There’s a bit of a revolution going on under most people’s noses. Most people probably won’t think about it, even though we need it whether we realize it or not, and even though many of us contribute to it without knowing. It’s a good thing – it helps us understand our world better, helps to build better tools and products that hopefully make our lives easier, and eventually gives us the power to make better decisions and helps governments to serve us better.

Coined in the 90s, the “information superhighway” is all about the flow of information and communication through digital channels. On this highway there are plenty of lanes to drive in, from normal to fast to car-pool lanes. Open data or knowledge is the car-pool kind. And this growing interest in, and need for, openly accessible information and data is the revolution.


The Open Knowledge Foundation has a standard definition of what Open Data is:

Open data is data that can be freely used, reused and redistributed by anyone subject only, at most, to the requirement to attribute and sharealike.

With more and more information around, it becomes an asset that we can trade, market, and sell – some like to call this the “information economy”. Just think about how your profile on Google or Facebook can help advertisers and companies to better understand your habits. In “Who Wins in the Battle for Power on the Internet?”, the author takes a dark look at the power of cyberspace and the internet and makes a convincing argument that it is a situation of “you’re against us or you’re with us”.

We’re at the beginning of some critical debates about the future of the Internet: the proper role of law enforcement, the character of ubiquitous surveillance, the collection and retention of our entire life’s history, how automatic algorithms should judge us, government control over the Internet, cyberwar rules of engagement, national sovereignty on the Internet, limitations on the power of corporations over our data, the ramifications of information consumerism, and so on.

While most people won’t like this kind of intrusion, the fact that our digital footprint can be measured, monitored and tracked isn’t such a bad thing. Don’t forget, it’s not just about people: this information also relates to how companies, governments, and organizations do business, which can help you or me better understand the choices and decisions of ours that actually have an impact on them. I’d like to think that there are some positive aspects to embracing the challenges of data and information (overload?) and to how we can use it to make the world better. Ultimately, it’s people who will push the boundaries of how data can be used for good or evil… and the Open Knowledge Foundation is a community that’s been pushing for this “good” revolution.


It’s going to be an interesting future as more and more data and information can and will help us make better decisions. In September I attended my first Open Knowledge Conference and found the discussions fascinating, the challenges complex, and the potential inspiring. It was also a great way to connect, meet people with similar interests, and just get excited by all the ideas and creativity.

My focus for the meeting was seeing where the discussion led in terms of understanding our risk to disasters, and I wrote an article for work about it called “Open data makes disaster risks visible”. While risk information is definitely growing, there are plenty of other fields that have already matured in how data can be used to understand patterns, inform the way we work and do business, and hopefully provide us with insight into how we can evolve, plan, and make societies and systems better. For example, there was an exhibition by Schema Designs that showcased traffic patterns over the course of a day in Geneva, Zurich and San Francisco, highlighting the frequency and intensity of public transit use throughout the day…

Behind the scenes on the 2013 IDDR infographics

Tomorrow is October 13th. To most people, it’s just another day, but to those who believe in and work to reduce disaster risks, the 13th of October is the day to celebrate the International Day for Disaster Reduction (IDDR). This year the focus of IDDR is on some one billion people around the world who live with some form of disability. Representing one-fifth of the world’s population, persons living with disabilities (PWDs) have unique contributions, often overlooked, to help reduce the risk of disasters and build resilient societies and communities. Here’s a look at what went into the infographics to “step up” the issue.


One of the major outcomes of this year’s IDDR is a survey that was conducted to gain insight into the views of people living with disabilities and how they cope with disaster situations. The online global survey collected responses in English, French, Chinese, Russian, Spanish, Arabic, Japanese, Italian, and Bahasa Indonesia; at the same time, some disability organizations ran the survey offline in Bangladesh, Thailand, and Viet Nam. The 22-question survey produced a lot of stats and figures, ranging from respondents’ disabilities to whether they have a plan for disaster situations.

Over 5000 people from 126 countries made the effort, sometimes with the help of a relative or friend, to fill in the survey. With the amount of data crunching that went into the survey, it would have been a shame just to write about the findings. To give it more visibility, I created 3 infographics highlighting some of the major findings. The key to the visualizations was both to highlight the challenges PWDs have in dealing with disasters and to bring their voices to the forefront in terms of what they want for the future. With all the stats and the comments submitted, I decided to create different variations – I hate it when too much information and too many stats are packed into graphics or text, because it just defeats the purpose of communicating what’s most important.

There are some interesting aspects of the overall summary that presented themselves once I finished putting the visual touches to the data and info. The first is the regional spread, which was not covered in the initial announcement of the preliminary findings, and how it shows the distribution of survey responses. Most responses came from Asia, particularly from the offline survey responses collected in Bangladesh, which numbered over 1500. I also went through the comments and found ones that came from the region to give the data a little more context. The second interesting note for this graphic is the responses to the “Are you aware of a disaster risk reduction plan?” question. We didn’t see the trend in the raw data, but once I graphed it, it became clear that most respondents indicated that they are NOT aware of either a national or local level disaster risk reduction plan. I can understand if people didn’t know of a national plan, but not even a local one? This is a bit of a scary thought, especially for both national and local governments… I wonder what the trend would look like if the question were asked of “able-bodied” people?

Word clouds are great – they’re both visually engaging and actually make you think about the issues/context. For the survey, some of the most interesting responses weren’t just the quantitative stuff, but also all the qualitative ones where people actually took the time to write down their concerns and ideas. We focused on looking at two questions in the survey. The first was Question 8, on having a personal preparedness plan. Surprisingly, 71% of respondents said they don’t. Of the 29% who said they did, the most common words that showed up in their responses were “water, plan, food, family, and supplies”, which gives a bit of a sense as to what’s most important to people living with disabilities… And also what disaster responders should plan and prepare for should a disaster happen – would “able-bodied” people care about the same things?
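Under the hood, a word cloud is just a word count with the font size scaled by frequency. Here is a minimal sketch of that counting step in Python, assuming the free-text answers are available as a list of strings. The stop-word list and the sample answers are made up for illustration and are not taken from the survey.

```python
import re
from collections import Counter

# A tiny, hypothetical stop-word list; a real one would be much longer.
STOPWORDS = {"a", "an", "and", "the", "to", "of", "for", "i", "my", "have", "in", "with"}


def word_frequencies(responses, top_n=5):
    """Count words across free-text responses, ignoring case and stop words."""
    words = []
    for text in responses:
        for word in re.findall(r"[a-z']+", text.lower()):
            if word not in STOPWORDS:
                words.append(word)
    return Counter(words).most_common(top_n)


# Invented example answers, for illustration only.
answers = [
    "Water and food for my family",
    "A plan for water",
    "Water supplies",
]
print(word_frequencies(answers))
```

The words with the highest counts are the ones a word cloud would draw largest.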

Finally, the word cloud for Question 19, on priorities to include in a new disaster risk reduction framework, holds some interesting insight. In 2015, the world will come together to adopt a new framework to succeed the Hyogo Framework for Action (2005-2015). For people living with disabilities, “information and communication” are the most common priorities they want, as well as ones that address their “needs” and make things “accessible”. In an age where information and communication are literally at our fingertips, it’s a revealing sign that people living with disabilities would like these to be a priority – does that mean information and communication technologies (ICTs) aren’t currently meeting their needs? Would “able-bodied” people be asking for the same priorities?

If you like the infographics, you can download them here.