New World, New Data
Apr 10, 2020
10 minutes read
Photo by New York National Guard

Picture of military members adding lamps to the Patient Care Units for Phase II of the COVID-19 Response at the Jacob K. Javits Convention Center in New York City, April 2, 2020.

I just completed the fourth week of lockdown here in Switzerland, and the COVID-19 pandemic is already defining itself as a life-altering event. It is still unclear, though, how the economy will resume and how things will differ from our former way of living. The social distancing achieved under containment was the initial step and will likely be followed by testing and tracing. An epidemic such as COVID-19 can be analyzed using compartmental models, in particular a SEIR model, which defines four compartments: Susceptible, Exposed, Infectious and Recovered. Tom Fiddaman is CTO of Ventana Systems, maker of a popular system dynamics modeling tool. In the video below, he explores the SEIR epidemic model for his hometown in Montana as it confronts the coronavirus.
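For readers who prefer code to video, here is a minimal sketch of a SEIR model in plain Python. It is not Tom Fiddaman's model: the function name, the forward Euler integration and every parameter value (contact rate `beta`, incubation rate `sigma`, recovery rate `gamma`, population size) are assumptions chosen purely for illustration.

```python
def simulate_seir(population, beta, sigma, gamma,
                  initial_infectious=1, days=365, dt=0.1):
    """Integrate the four SEIR compartments with a forward Euler step.

    beta:  infectious contacts per person per day (assumed value)
    sigma: 1 / incubation period in days (assumed value)
    gamma: 1 / infectious period in days (assumed value)
    Returns one (S, E, I, R) tuple per day, starting with day 0.
    """
    s = float(population - initial_infectious)
    e, i, r = 0.0, float(initial_infectious), 0.0
    steps_per_day = round(1 / dt)
    history = [(s, e, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_exposed = beta * s * i / population * dt  # S -> E
            new_infectious = sigma * e * dt               # E -> I
            new_recovered = gamma * i * dt                # I -> R
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_recovered
            r += new_recovered
        history.append((s, e, i, r))
    return history


if __name__ == "__main__":
    # Hypothetical town of 100,000 people, with and without distancing.
    for label, beta in [("no distancing", 0.5), ("distancing", 0.25)]:
        run = simulate_seir(100_000, beta=beta, sigma=1 / 5, gamma=1 / 7)
        peak = max(row[2] for row in run)
        print(f"{label}: peak infectious = {peak:.0f}")
```

Lowering `beta` is a crude stand-in for social distancing: fewer infectious contacts per day means a lower and later peak in the Infectious compartment.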

Immediate Data Needs

Lockdown is only one part of the isolate-test-trace effort required to get us out of this awful situation. Data-driven organizations as well as civic tech groups understand the associated data needs and are bringing their capabilities to bear.

The Baseline COVID-19 Pilot Program from Alphabet’s Verily is an example of how Google is leveraging its partnership with Ascension. The joint pilot program consists of drive-thru testing sites, an online screening test and measures to protect the data harvested.

Another example is Antoine Flahault’s, a system that monitors the activity of influenza-like illnesses over the internet with the aid of volunteers.

Even simpler is Sean Bonner’s covid19map, which lets anyone contribute data through a hosted Ushahidi instance. Sean is co-founder of SafeCast, a citizen sensing organization that played a key role during the Fukushima Daiichi nuclear disaster in Japan back in 2011. SafeCast and Ushahidi know a thing or two about crowdsourcing information, in particular data that can be critical for relief efforts. Unsurprisingly, users can flag “refused testing” or “testing unavailable” incidents, the kind of incidents that also occur during dodgy elections.

Machine learning experts who know how to leverage proven audio processing techniques can also contribute with audio classifiers. Think of the classifier used by Google for its Home Assistant to recognize the trigger sequence “Hey Google” that invokes its services. Coughvid collects coughing sounds of infected volunteers precisely to train such a classifier. Without speculating on its effectiveness, I imagine telephone services and home assistants could perhaps open additional possibilities for testing at scale…

Singapore had learned from SARS back in 2003 and was prompt to deploy a structured response that went beyond isolation. A group of contact tracers was activated by the Communicable Diseases Division of the Ministry of Health. Their procedure includes interviewing patients to map their whereabouts, reaching out by phone to everyone potentially involved in an interaction, identifying risk and notifying people based on the outcome. This time, the government also released the TraceTogether App, which people can install to trace contacts voluntarily. The home page has a nice explanatory video. Designing apps for the entire population brings up some nice UX challenges. A pet peeve of mine is the lack of consideration for the elderly, which seems pervasive in e-banking… This one seems to be no exception, as this feedback shows.

Good app, but the texts need to auto shrink on some phones. I tried to install on my parents phone (Samsung s10), they had extra large fonts on their phone. The part where we need to key in the phone number cannot be seen, I can’t scroll down not shrink it. I have to to go the phone settings to reduce the size. If people didn’t know the work around, they would assume the app can’t be used.

As COVID-19 spread across the globe to hit most nations, wider interest in tracing real-life contacts through Bluetooth quickly emerged. Projects include MIT’s SafePaths, Enigma’s SafeTrace and Covid Watch, to name a few.

What digital breadcrumb will apps use to trace proximity contacts? Designs rely on a low-power wireless technology known as Bluetooth Low Energy (BLE), which was introduced with Bluetooth 4.0. You may know this tech from Apple iBeacons, which were intended, for example, for marketing purposes. A store could place iBeacons on the shelves and a smartphone would read the data they advertise, typically some URL to a special offer. What’s key is that the smartphone can also advertise and not just be an observer.

For tracing apps to work, smartphones must continuously broadcast BLE signals and interoperate regardless of their make. Most smartphones run either Apple iOS or Google Android. During normal times Androids and iPhones, being conduits to content, don’t play well together. Read Fred Vogelstein’s excellent Dogfight for more on this. For instance, each mobile OS interprets the use of BLE services differently. But these are not normal times. Apple and Google announced a joint effort in contact tracing and published a preliminary specification. Unsurprisingly, it includes a section on privacy.

Since 2019 and The Great Hack, Cambridge Analytica has become a household name for the way social networks can be abused to impact the fabric of our societies. With tracing, we are taking a step towards proximity networks and data from our immediate friends and neighbors. So privacy is kind of a big deal.

Forewarned is forearmed!

With New Data Comes New Responsibility

Marcel Salathé, who heads the Salathé Lab of Digital Epidemiology here in Switzerland, and Carmela Troncoso, head of the SPRING Lab focused on Security and Privacy Engineering at EPFL, were quick to recognize the importance of privacy in relation to a robust approach to tracing. Given the security and data requirement challenges shared by all impacted nations, a Pan-European initiative called PEPP-PT was set up to provide guidance in line with GDPR. PEPP-PT is a non-profit based in Switzerland with more than 130 members across eight European countries, including scientists, technologists, and experts from well-known international research institutions and companies. It has spawned initiatives such as Decentralized Privacy-Preserving Proximity Tracing, or DP-3T, which is one possible protocol for supporting decentralized proximity tracing. In the US, PACT is a comparable protocol which will be available through the SafePaths app and seems to have benefitted from earlier work on Private Kit.

DP-3T aims at minimizing data collection and differs in that sense from the data grab models behind initiatives such as Verily or mentioned above. Efforts to anonymize data are notoriously difficult and known to be vulnerable to e.g. re-identification attacks. Instead, the protocol proposes to minimize the amount of data needed and comes in two flavors - trusted and trustless - when it comes to the shared backend infrastructure. Its four phases are shown below.


DP-3T CC BY 4.0

  • The installation includes set up of a digital breadcrumb (EphIDs) dispenser.
  • In step two, normal operations, the app does two things: it broadcasts EphIDs to other phones running the app and checks for news with the backend server (see last step).
  • The third step addresses the handling of new known patients. They receive a token from a health authority and voluntarily use it to push their EphIDs to the backend server. The data is retained for a limited period of time.
  • The fourth step describes the actual tracing. Apps retrieve published infection data and check encounters against locally stored graphs.
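The four steps above can be sketched in a few lines of Python. This is a simplified illustration and not the DP-3T specification: the key sizes, the hash and HMAC constructions, the number of EphIDs per day and all function names are my assumptions. In particular, the sketch assumes a patient shares a daily secret from which their EphIDs can be re-derived, as one possible way of pushing EphIDs to the backend.

```python
import hashlib
import hmac
import os

EPHID_LENGTH = 16    # bytes per ephemeral ID (assumed)
EPHIDS_PER_DAY = 96  # one every 15 minutes (assumed)


def next_day_key(day_key: bytes) -> bytes:
    """Step 1: rotate the secret daily by hashing the previous day's key."""
    return hashlib.sha256(day_key).digest()


def ephids_for_day(day_key: bytes, count: int = EPHIDS_PER_DAY) -> list[bytes]:
    """Step 2: derive the day's broadcast EphIDs from the secret day key."""
    return [
        hmac.new(day_key, b"ephid" + n.to_bytes(4, "big"),
                 hashlib.sha256).digest()[:EPHID_LENGTH]
        for n in range(count)
    ]


def check_exposure(observed: set[bytes], published_keys: list[bytes]) -> bool:
    """Step 4: re-derive EphIDs from published keys and match them locally."""
    return any(
        ephid in observed
        for key in published_keys
        for ephid in ephids_for_day(key)
    )


if __name__ == "__main__":
    alice_key = os.urandom(32)               # stays on Alice's phone
    alice_ids = ephids_for_day(alice_key)
    bob_observed = {alice_ids[10]}           # Bob's phone heard one broadcast
    carol_observed = {os.urandom(EPHID_LENGTH)}  # Carol never met Alice

    published = [alice_key]                  # Step 3: Alice tests positive
    print("Bob exposed:", check_exposure(bob_observed, published))
    print("Carol exposed:", check_exposure(carol_observed, published))
```

The point of the design is that matching happens on each phone: the backend only ever sees what patients volunteer, never who met whom.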

As our phones collect and share breadcrumb data from our proximity networks, privacy concerns arise: privacy from snoopers, privacy from contacts and privacy from backend operators. There are several ways to define privacy itself. One example is differential privacy, an information-theoretical criterion used in statistical and machine learning analysis (i.e. where large amounts of data are captured). It enables the collection, analysis, and sharing of statistical estimates such as averages or synthetic data, while preventing individual re-identification or record linkage.
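To make the idea of a differentially private average concrete, here is a minimal sketch of the Laplace mechanism, a standard textbook technique. Nothing here comes from the tracing protocols discussed in this post: the function names, the example data and the choice of `epsilon` are assumptions, and the sketch further assumes the number of records is public.

```python
import random


def laplace_noise(scale, rng=random):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def private_mean(values, lower, upper, epsilon, rng=random):
    """Release the mean of `values` with epsilon-differential privacy.

    Each value is clipped to [lower, upper], so changing one person's
    record moves the mean by at most (upper - lower) / n. Laplace noise
    scaled to that sensitivity hides any single contribution.
    """
    clipped = [min(max(v, lower), upper) for v in values]
    true_mean = sum(clipped) / len(clipped)
    sensitivity = (upper - lower) / len(clipped)
    return true_mean + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    # Hypothetical data: number of daily contacts reported by ten users.
    daily_contacts = [3, 0, 12, 5, 7, 1, 4, 9, 2, 6]
    estimate = private_mean(daily_contacts, lower=0, upper=20, epsilon=0.5)
    print(f"noisy average contacts: {estimate:.2f}")
```

A smaller `epsilon` means more noise and stronger privacy; with many records the noise shrinks relative to the average, which is why the approach suits large data sets.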

These protocols are still work in progress, and the apps are assumed to rely on voluntary use. Only the future will tell if people adopt them and what unforeseen side effects, positive or negative, they will have. They represent a new frontier in social network data, though, and great vigilance is required.

There is the utterly fascinating possibility that liberal democracies where people have high trust in their governments will see a strong uptake of such an app, and may end up managing the crisis well. Entirely speculative at this point, but fascinating. 7/12 — Marcel Salathé (deleted tweet) April 4, 2020

What’s great with DP-3T is that it is open and generates interesting questions from the community such as this cryptography paper by Serge Vaudenay or this GitHub entry entitled “The long trail of contact tracing.” One of the authors of the entry is Helen Pritschard, a veteran in citizen sensing and data-driven activism. As with any initiative involving citizens, the civic tech community can contribute valuable learnings and complement efforts by research and academia. The key is to remain vigilant with private organizations such as Apple, Google or Palantir who are eager to operate data collection and/or backend on behalf of a government without much scrutiny.

Forewarned is forearmed!

Data for a More Sustainable World

The effects of isolation on our mobility are dramatic. Air traffic has shrunk significantly according to these flightradar24 reports, and cities typically plagued by saturated commuter traffic are seeing significant drops in air pollutants, as shown by this Aclima post for the San Francisco Bay Area in California. This ENDS Report is another example of how the lockdown is impacting nitrogen dioxide levels at London’s most polluted sites. Remote sensing data from satellites is also being used to signal air quality improvements, as this widget shows for Madrid.

The new data reminds me of an earlier post about Greg Niemeyer’s Pufftron. Greg’s students were using one of its sensors to plot CO2 emissions. One of the devices, located near a bridge, recorded a significant drop caused by an unforeseen closure of the bridge that weekend. This was many years before the senseBox and other such devices.

While COVID-19 is likely to put us on a new trajectory, problems that existed before will still have to be addressed. Global warming is certainly one of them. As we grapple with tracing new kinds of social networks and track Pufftron-like improvements in air quality (and perhaps also biodiversity) in our cities and communities, we may well end up rethinking some of our ways from before the pandemic. I am reminded of Bruno Latour and his book Down to Earth, in which he describes a shift away from an “Out-of-this-World” regime towards a “Terrestrial” regime in which humans accept the consequences of their activities on the environment.

One example of how we may apprehend such consequences came with flatten the curve. If you watched Tom Fiddaman’s video above, you will appreciate the ability of a systems thinker to relate the high rate of infection to the number of ICU beds in one’s country or community healthcare system. This post explicitly connects flatten the curve with carrying capacity and the various forms of capital at stake. As the author suggests, we are testing our ability to limit ourselves and depart from our otherwise “Out-of-this-World” way of living.

That way of living belonged to a pre-COVID-19 world. But with a new world comes new data and perhaps, just perhaps, we will use it to course-correct towards a better place.

Stay safe at home!

EDIT April 19, 2020: Since the post was published, several organizations have distanced themselves from PEPP-PT due to a lack of transparency, including DP-3T, which has since published updates to its protocol as well as working prototypes for iOS and Android.

Creative Commons License This material is licensed under CC BY 3.0
