Scientists love their data. But tracking and sorting the ever-increasing amounts of data can be a time-consuming effort. Enter the data wrangler.
Data wrangling is a thing. Like cattle wrangling, it involves gathering, sorting and settling the data where it can be retrieved for future use. Spurs not needed.
Katya Kovalenko is an aquatic ecologist and data scientist who came to NRRI in 2013 with a doctorate from Mississippi State University. She picked up her data wrangling skills while working on a wide variety of projects at NRRI.
It’s a critical role: for scientists, the more data the better, but managing that data is time-consuming. As a data scientist, Kovalenko accelerates NRRI’s data-intensive research and makes it more accessible.
“I know it sounds dry, but I really enjoy this work,” said Kovalenko. “My goal is to increase the utility of existing high-value datasets by applying new analytical approaches.”
Kovalenko provides input for experimental design and statistical analyses, optimizes workflows and makes on-screen graphics work.
One interesting project has her analyzing 16 years of Google search data to understand changes in public interest in invasive species and harmful algal blooms. She’s also part of a global effort – funded by the U.S. Geological Survey – to analyze the impact of sea-level rise on coastal marshes.
“The data for that project is being collected all over the world and it’s so important because millions of people live near coasts,” Kovalenko explained. “Marshes are important protectors of the land.”
Away from the computer, Kovalenko dons waders for research on invasive species, aquatic food webs and ecosystem ecology. She studies many species in the Great Lakes and Minnesota’s inland lakes to understand the impacts of invasive species and other human-caused stressors on macroinvertebrates, fish and other aquatic communities.
Within NRRI, Kovalenko’s skills – both as a data scientist and an ecologist – are well used by researchers, especially the water, forest and bird scientists. She is also part of the Informatics Institute and Minnesota Supercomputing Institute on the U’s Twin Cities campus, sharing her specific expertise as needed.
“We are likely to rely more and more on these institutes as the amount of data is increasing exponentially in nearly all fields represented at NRRI,” she said.
And because of her coastal marsh research and her service on scientific editorial boards, Kovalenko collaborates with colleagues on every continent.
In 2019, she identified a need for all of NRRI’s data users to get on the same page, so she pulled together a small internal team to document data management best practices, such as backing up, structuring, analyzing and documenting data. NRRI Quality Manager Lisa Estepp then organized and designed the information into an easy-to-follow guide.
Passing Pandemic Time
An avid hiker and mountain climber (she has 30 ascents of 14,000-foot peaks under her belt), Kovalenko hoped the pandemic lockdown would free up more time to hike locally. It hasn’t. She also misses the extensive traveling she’s done in the past.
“On the other hand, online meetings save a lot of time, increase efficiency and allow interactions which would otherwise be logistically impossible,” she added.