COVID19 - What data scientists should and shouldn’t do right now
We think data scientists should be very intentional about what they do and don’t do right now. Here’s why.
2020-03-26 | Frie and Johannes
With the COVID crisis changing our lives dramatically, we see an overwhelming wave of civil society involvement and commitment – a really amazing thing! Some of the people who are not working in essential jobs have a bit more time on their hands now and are thinking about how they can contribute to society in these challenging times. Among them: we data scientists.
What we shouldn’t do
The COVID-19 crisis seems to be a numbers game: with case counts, networks of infected people, infection curves, and many other metrics around, it seems almost natural to do something with them: making sense of them, building predictive models, producing ever more visualizations. But we should all ask ourselves: Why are we doing it? Are we making a genuine, useful contribution or are we just creating more noise?
Per the most common definition, a data scientist is defined by a skillset incorporating
- programming and computer science,
- statistics and applied math,
- domain knowledge.
Most of us are pretty good at the first two, but most of us lack domain knowledge. Usually, we can either acquire it ourselves or we have experts to collaborate with (colleagues, clients, …). But unless you personally are an epidemiologist or you know someone who can contribute this crucial element to your data science project, all your modelling efforts might do more harm than good. Right now is not the time to play around, to build some “AI” or a Shiny App just because.
We get it: Most of us feel helpless and want to do something, anything, to help. And if the only tool you have is a hammer, everything looks like a nail. The thing is: The methods and tools of data science are powerful and will most likely play an important role in overcoming this crisis, but only if they are used in context and with the expertise to back them up. For example, there is great value in data journalism that explains abstract concepts to the public, like this simulation from the Washington Post. Especially in times like these, it is essential that all data analyses are firmly grounded in solid domain expertise.
What we should do
This doesn’t mean that we can’t do anything right now. Here are a few suggestions for where we can put our energy:
- We can offer our expertise to organizations and people who are critical to overcoming this crisis. Ask your local authorities and experts if they need help communicating critical information using data visualization. Ask whether they need help building up or changing their data infrastructure. Maybe they even need an (exploratory) analysis. But don’t be disappointed if they’re already super busy with their core responsibilities and can’t think about data right now.
- Contribute to data projects initiated by experts: there are quite a few visualization and modelling projects initiated by people who do have the domain expertise or who have a network of domain experts. Check out the following pages to see whether they need your help:
  - GitHub - Neherlab dashboard: repository for https://neherlab.org/covid19/. Developed by a research group focusing on “evolution, ecology, and population genetics with a focus on rapidly evolving pathogens such as HIV, influenza virus, or pathogenic bacteria” (from the lab's website). You can contribute data for your country/region or maybe help out with simple bugs in data processing.
  - Our French friends from jogl.io (Just One Giant Lab) have started the OpenCovid19 initiative. Perfect if you have skills in bioinformatics, chemistry, or medicine.
  - Covid19 Cognitive City: the Bill & Melinda Gates Foundation has created a data-centric social network with the goal of stopping the spread of COVID-19 and accelerating the development of a vaccine.
  - GitHub - quarantine-hero: repository of Quarantänehelden (German for “quarantine heroes”), a platform that connects volunteers with people who need help with grocery shopping etc.
- Contribute to the Corona Virus Tech Handbook which “provides a library for technologists, civic organisations, public and private institutions, researchers, educators and specialists of all kinds to collaborate on an agile and sophisticated response to the coronavirus outbreak and sequential impacts”.
- We can help non-profit organizations which might need help, not only when it comes to data science, but also with project management, remote work, and remote collaboration. Most of us have experience working in online contexts: we know the tools for remote work (Slack, Zoom, Google Docs, …) by heart. But for people who usually work offline, this is all new. To tackle this problem, CorrelAid, D3 - so geht digital, OpenTransfer and GoVolunteer are partnering up to connect IT folks with non-profits that need help getting set up for remote work. Starting this Friday, we will offer “Plötzlich digital: Die Sprechstunde” (German for roughly “suddenly digital: the open consultation hour”), where digital experts (really anyone with experience in remote work) can share their expertise in remote work technologies with non-profits. Please sign up here if you can participate in a call as an expert for a tool. Depending on the needs of the non-profits, we might extend this to a kind of “mentoring” model later on.
- We can get involved in areas outside of our special data science expertise. Help out in your building, local community and city. If you are in good health and not part of an at-risk group, go grocery shopping for your elderly neighbour or your immunocompromised friend. Donate blood. Call your grandparents or friends who live alone.
And finally, but most importantly: wash your hands, stay at home and practice social distancing (or rather: physical distancing).
Stay well everyone! ❤️