by Robert Kirkpatrick
It's one of the most frequent questions
we get here at Global Pulse: "where do you stand with this whole 'crowdsourcing'
thing?"
We have always believed that crowdsourced information
could be usefully combined with statistical data from various UN crisis
early warning systems and the new kinds of "information exhaust" being
generated passively by people as they go about their daily lives. But the
devil's in the details.
My CrisisMappers colleagues have argued
from the beginning that real-time reporting on the impacts of crises by
the affected population themselves must be a foundational methodology for
how Global Pulse functions at the country level. If you want to learn to
listen to the voices of the vulnerable, they said, you must give them a
voice. It's hard to argue with that logic.
At the same time, our colleagues in
the official statistics community rightly point out that governments cannot
make policy decisions based on information from untrusted sources. While
acknowledging the validation delays inherent in statistics and the need
for real-time data, they point out that crowdsourcing from the masses tends
to yield messy, unverified reports that may amount to little more than
hearsay - or worse. The general consensus is that it would be irresponsible
to use this kind of information as a substitute for high-quality, validated
statistical data - especially when it comes to deciding where to adjust
social safety nets during crises. It's also hard to argue with that logic.
Information crowdsourced from the masses ought not to be relied upon as the sole
source on which to base policy responses to the kinds of slow-onset development
crises that Global Pulse is looking at.
So we had a bit of a challenge. Our
starting point was the 39+ existing UN sector-specific early warning systems
and a growing number of mobile phone-based data collection efforts by UN
agencies. All of these were potentially useful information sources, but
they weren't enough to get us there. Our mandate was to provide decision-makers
with cross-sectoral information on the plight of the vulnerable that was
actionable, accurate and real-time. Some of the other potential kinds of
data on the list were:
1. Real-time information exhaust of unknown utility,
2. Real-time crowdsourced information of unknown accuracy, and
3. High-quality survey data that is years out of date.
Yikes. It's a bit like that old business
adage, "Fast, cheap and good: you can have any two." Where to go from
here?
The breakthrough came when we realized
that there was already a highly successful model to draw on in the field
of public health: outbreak investigation. Public health institutions everywhere
monitor real-time data for anomalies that might represent disease outbreaks.
Outbreak investigation teams from Ministries of Health, for example, collect
and analyze reports on sightings of dead birds, sales of certain medications,
and case reports on hospital patients presenting with certain symptoms
to detect outbreaks of potentially dangerous strains of avian influenza.
When a pattern begins to emerge in weekly reports coming from a remote
village, for example, a team is sent in to investigate. Their first step
is to confirm that an outbreak of some kind is indeed underway. Once they
have established this fact, they move to the verification step, sending
samples to a diagnostic laboratory for analysis. When the lab results come
in and government experts know what they are dealing with, they initiate
an appropriate response to contain the outbreak and treat those affected.
So outbreak investigation follows a process of detection, investigation,
verification and response in which real-time "circumstantial" evidence
is monitored for anomalies that might trigger subsequent investigation
and targeted collection of scientific evidence.
Why couldn't Global Pulse adopt a similar
model? When families affected by external shocks begin coping by cutting
back on meals, pulling their kids out of school to work in the market,
or foregoing medical care, could we detect enough of a signal early on
to realize that something bad could be happening, investigate through crowdsourcing
of citizen reports, and then verify the nature and severity of the impacts
through rapid impact assessments? If a similar process could work for Global
Pulse, it could help leaders gather the hard evidence they need to intervene
with agile, targeted policy responses to protect these families from downstream
impacts such as malnutrition, lack of education, or health problems.
The more we thought about it, the more
sense it made: in lieu of an unsustainable and impractical approach to
real-time monitoring based purely on household surveys, we could adopt
an agile, reactive, targeted, and phased approach that could still allow
government ministries to monitor a large fraction of the population. Figuring
out exactly how to do this will be the work of our Pulse Lab teams, beginning
in Uganda in the next few months, but we know that our monitoring approach
will incorporate three phases:
1. DETECT: government monitors real-time
information exhaust for anomalies.
All over the developing world, people
are increasingly using mobile phones to access services for banking, sending
money to family, cashing in food vouchers, sharing prices of agricultural
products, buying and selling goods, furthering their education, and seeking
information on every topic imaginable. We believe governments can analyze
information exhaust generated as a by-product of the use of these services
to detect the early impacts of crises on vulnerable populations. Google's
"Flu Trends" works this way, by looking for signature patterns in the
information exhaust created by online searches, such as increases in searches
for terms like "fever."
In order to understand what even constitutes
an anomaly, we'd need to establish a baseline and perhaps incorporate
contextual data (e.g. social practices, market data, remote sensing, etc.)
from a variety of sources. We could then conduct a retrospective analysis
of historical data sets to learn to characterize the "signatures" of
known previous impacts on different population groups. When a particular
impact began to be felt at the household level back in 2009, for example,
what changes in collective behavior could be observed in patterns of buying,
selling, transferring funds, or seeking information? Once we know how these
early impacts manifest themselves in usage patterns of programs and services,
we will be able to train software to detect this signature the next time
it shows up and alert government officials of potential problems.
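As a rough illustration of the baseline idea, here is a minimal Python sketch that flags weeks in which a usage metric departs sharply from its trailing baseline. Everything in it is an assumption made for illustration: the function name `detect_anomalies`, the eight-week window, the three-standard-deviation threshold, and the sample numbers are all hypothetical, and a simple z-score stands in for whatever signature-detection method the Pulse Labs eventually settle on.

```python
from statistics import mean, stdev


def detect_anomalies(series, baseline_weeks=8, threshold=3.0):
    """Return the indices of weeks that deviate sharply from the trailing baseline.

    series: weekly values of some service-usage metric (hypothetical data).
    baseline_weeks: number of preceding weeks used as the baseline (assumed).
    threshold: z-score magnitude treated as anomalous (assumed).
    """
    anomalies = []
    for i in range(baseline_weeks, len(series)):
        window = series[i - baseline_weeks:i]
        mu = mean(window)
        sigma = stdev(window)
        if sigma == 0:
            # Perfectly flat baseline: any change at all counts as a deviation.
            if series[i] != mu:
                anomalies.append(i)
            continue
        z = (series[i] - mu) / sigma
        if abs(z) >= threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical weekly counts of mobile money transfers in one community.
weekly_transfers = [100, 102, 98, 101, 99, 103, 100, 97, 101, 60, 58]
print(detect_anomalies(weekly_transfers))
```

One limitation of this sketch is worth noting: once an anomalous week enters the trailing window, it inflates the baseline's spread and can mask the weeks that follow, so a real system would probably need to exclude flagged weeks from the baseline or use a seasonal one.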
While it's certainly true that many
of the most vulnerable populations lack access to cell phones, there are
many UN programs that specifically target these populations, and analyzing
information exhaust based on how these programs are used will, we believe,
also yield useful results. The World Food Programme, for example, runs
school feeding programs that collect school attendance metrics. Yet this
information is only used to monitor program performance and is not published.
We believe such program-generated information exhaust could be quite useful
for vulnerability analysis. In other words, there are ways to reach populations
with no mobile phone access. Finally, it's worth noting that it will take
our country-level Pulse Labs a few years to develop and refine the analytical
methodologies required here before the model is ready for broader adoption,
and a few years from now, many who do not have mobile phones today will
have them. Mobile penetration is accelerating in developing countries.
There is plenty for us to learn in the meantime.
2. INVESTIGATE: government elicits initial
reports directly from vulnerable communities.
Once a concerning pattern has been detected
in the information exhaust being generated by a particular community, government
officials could blast out text messages to a network of citizen reporters
to initiate an investigation. The idea here is to confirm either that the
community in question is, in fact, being impacted, or that there are widespread
public perceptions that this is the case. This might be initiated via a
mass broadcast to a randomly selected population sample, but it would likely
be more useful if it were targeted at a trusted network of preselected citizen
reporters. Imagine if government ministries had a directory with the mobile
phone numbers of all of the community health workers, teachers, radio station
hosts, and youth volunteers and could ask, "are there serious food shortages
in your community, or has food become unaffordable? How are you coping?
What do you believe is causing these problems?" They could even request
that members of this network actively elicit more information from their
patients, students, audience and parents.
Note that this scenario doesn't really
qualify as crowdsourcing according to some definitions, because it involves
reaching out to a trusted network rather than the masses. Ushahidi's Patrick
Meier has referred to this technique as "bounded crowdsourcing." We believe
a bounded approach could have significant value as an intermediary confirmation
step to help justify deployment of an onsite rapid assessment team to gather
solid statistical evidence. There may also be cases where unbounded crowdsourcing
(i.e. from the masses) would be useful, though the Pulse Labs will
need to experiment with these techniques in different contexts to determine
whether they are appropriate. And it's not hard to imagine that over time
certain citizens contacted via unbounded crowdsourcing would prove their
credibility in the eyes of government through consistently accurate reporting,
to the point that their first-hand reports might come to carry more weight
than those from unknown sources.
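To make the idea of earned credibility concrete, here is a small Python sketch in which each reporter's weight grows as more of their reports are later confirmed. It is only one possible scheme, and all of it is assumed for illustration: the `Reporter` class, the `weighted_share` helper, the smoothing rule that starts an unknown reporter at 0.5, and the sample identifiers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Reporter:
    """A citizen reporter and their track record (hypothetical structure)."""
    contact: str
    confirmed: int = 0  # reports later confirmed by an onsite assessment
    total: int = 0      # all reports that have been checked

    def record(self, was_confirmed: bool) -> None:
        """Log the outcome of one checked report."""
        self.total += 1
        if was_confirmed:
            self.confirmed += 1

    def credibility(self) -> float:
        """Smoothed accuracy: 0.5 with no history, approaching the true rate."""
        return (self.confirmed + 1) / (self.total + 2)


def weighted_share(reports):
    """Credibility-weighted share of reporters who say there is a problem.

    reports: list of (Reporter, says_problem) pairs.
    """
    total_weight = sum(r.credibility() for r, _ in reports)
    if total_weight == 0:
        return 0.0
    yes_weight = sum(r.credibility() for r, says in reports if says)
    return yes_weight / total_weight


veteran = Reporter("reporter-a")
for _ in range(8):
    veteran.record(True)      # eight reports, all later confirmed
newcomer = Reporter("reporter-b")  # no track record yet

print(veteran.credibility(), newcomer.credibility())
print(weighted_share([(veteran, True), (newcomer, False)]))
```

With weights like these, a disagreement between a reporter with a long confirmed record and an unknown one tilts toward the former, which is the effect described above.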
3. VERIFY: government sends in a rapid
impact assessment team.
Should citizen reports confirm that
the population under investigation does at least perceive that they are
being significantly impacted, the next step would be to send in a team
to gather the hard statistical evidence on the nature and severity of the
impacts and establish their underlying causes. Armed with hard evidence
that they would otherwise never have known they needed to collect, government
would now be in a position to implement an appropriate policy response
- hopefully within a matter of weeks after the initial signal was detected.
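Taken together, the three phases amount to an escalation gate: each step is triggered only when the previous one finds something. The Python sketch below shows that logic in its simplest form. The `Phase` enum, the `next_phase` function and its boolean inputs are hypothetical names chosen for illustration, and the decisions they stand for would in practice be made by people, not by a flag.

```python
from enum import Enum


class Phase(Enum):
    DETECT = "monitor information exhaust for anomalies"
    INVESTIGATE = "elicit citizen reports from the community"
    VERIFY = "send in a rapid impact assessment team"
    RESPOND = "implement a policy response"


def next_phase(phase, anomaly=False, reports_confirm=False, evidence_confirms=False):
    """Return the phase that follows, escalating only when the current one finds something."""
    if phase is Phase.DETECT:
        return Phase.INVESTIGATE if anomaly else Phase.DETECT
    if phase is Phase.INVESTIGATE:
        return Phase.VERIFY if reports_confirm else Phase.DETECT
    if phase is Phase.VERIFY:
        return Phase.RESPOND if evidence_confirms else Phase.DETECT
    # After a response, go back to routine monitoring.
    return Phase.DETECT


# Walk one hypothetical case through every step.
phase = Phase.DETECT
phase = next_phase(phase, anomaly=True)
phase = next_phase(phase, reports_confirm=True)
phase = next_phase(phase, evidence_confirms=True)
print(phase.name)
```

The point of the gating is cost: the expensive step, an onsite assessment, is reached only after the two cheaper steps have both pointed to a problem.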
To sum up, we believe that we can work with governments to develop an approach where they are able to fuse real-time information exhaust and statistical contextual data to detect anomalies, use selective elicitation of citizen reports from trusted sources to confirm that an onsite assessment is required, and then send in a team to get the hard evidence. So the answer is a provisional "yes" -- we do believe there is a potentially useful augmentative role for crowdsourcing (in one form or another) in Global Pulse, provided that our Member State partners are game to explore the utility of this technique through the Pulse Labs, which will be trying out a variety of emerging technologies and experimenting with many different approaches to real-time monitoring. Platforms like Swift River and Riff, which were initially developed for sudden-onset emergencies, may also prove useful for the development crises Global Pulse is focused on. Now it's time to find out what actually works!
We believe the phased investigative approach I have described here has the potential to help accelerate our transition to a world in which development decisions, like those in the private sector, are based on real-time evidence. Yet there is much to learn, and there will be many, many iterations on the ground as we go from concept to implementation. Along the way, the technical, methodological and political challenges will be significant, not to mention the clear need to address concerns around individual privacy, data security, data sovereignty, and intellectual property fully and transparently. We'd love your feedback and your ideas on approach as we move forward.