Data Engineering vs. Data Science

Parham Parvizi

Data workers are living like rock stars. I don’t mean trashing hotel rooms and jumping off second-story homes into pools after dropping LSD. I mean, they’re living jet-set lifestyles—and the future looks even brighter.

In 2017, The Economist declared that the world’s most valuable resource is no longer oil, but data. It’s no surprise: the world’s biggest companies all work in data and make a lot of money from it. That’s why the two most common data-handling jobs are in high demand.

The 2020 Dice Tech Jobs Report found that data engineers and data scientists were two of the three fastest-growing tech occupations. Data scientists and data engineers collaborate on projects, but their roles are very different. The infographic below, created by data science advisor Monica Rogati, shows how their roles make up a data science hierarchy of needs.

DATA ENGINEERS

Data engineers operate in the lower portion of the pyramid. They create the structure that makes data reliable and manageable. That involves collecting data, storing and moving it, cleaning and preparing it, then labeling the data to make it easy for data scientists to understand what they’re looking at.

Collect: Data engineers set up instrumentation, sensors, mechanisms, and user interfaces to access raw data from apps, websites, or other sources.

Move/Store: Data engineers build reliable pipelines and other infrastructure to move data to and from the cloud or warehouse databases.

Explore/Transform: This is a critical part of the job. Data engineers inspect and clean the collected data so that it is reliable, consistent, and unified, and can be analyzed efficiently.

Aggregate/Label: This is where the raw data starts to come together in coherent forms. The data goes through analytics to establish metrics, after which it can be labeled and prepared for the data scientists’ prediction algorithms.
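
The four stages above can be sketched in a few lines of Python. This is only a toy illustration, not a real pipeline: the event records, the field names, the in-memory "warehouse" list, and the 30-second "engaged" threshold are all made up for the example.

```python
from collections import defaultdict

# Hypothetical raw events, as a tracking script on a website might emit them.
RAW_EVENTS = [
    {"user": "ana", "page": "/pricing", "seconds": "12"},
    {"user": "ben", "page": "/pricing", "seconds": "30"},
    {"user": "ana", "page": "/docs", "seconds": None},   # missing value
    {"user": "ben", "page": "/docs ", "seconds": "45"},  # stray whitespace
]


def collect():
    """Collect: read raw records from a source (here, an in-memory list)."""
    return list(RAW_EVENTS)


def store(records, warehouse):
    """Move/Store: append records to a 'warehouse' (here, a plain list)."""
    warehouse.extend(records)
    return warehouse


def transform(records):
    """Explore/Transform: drop incomplete rows, normalize text and types."""
    clean = []
    for r in records:
        if r["seconds"] is None:
            continue  # skip rows with no measurement
        clean.append({
            "user": r["user"],
            "page": r["page"].strip(),
            "seconds": int(r["seconds"]),
        })
    return clean


def aggregate(records):
    """Aggregate/Label: compute a per-page metric and attach a label."""
    by_page = defaultdict(list)
    for r in records:
        by_page[r["page"]].append(r["seconds"])
    summary = {}
    for page, values in by_page.items():
        avg = sum(values) / len(values)
        # Assumed threshold: 30 seconds or more counts as "engaged".
        summary[page] = {
            "avg_seconds": avg,
            "label": "engaged" if avg >= 30 else "brief",
        }
    return summary


warehouse = store(collect(), [])
metrics = aggregate(transform(warehouse))
print(metrics)
```

In practice each function would be replaced by real infrastructure (event trackers, a pipeline tool, a warehouse database), but the order of the steps stays the same.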

DATA SCIENTISTS

Data scientists operate in the upper portion of the pyramid. They make the data relevant for key decision-makers within a company or organization.

Learn/Optimize: Data scientists take the data engineers’ reliable, coherent data and try to learn what they can from it to optimize their systems. They do this by running the data through algorithms. Artificial intelligence systems can then tell us what that data means.
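
As a rough idea of what "running the data through algorithms" can look like, here is a minimal sketch that fits a straight line to a prepared metric and uses it to make a prediction. The numbers are invented, and ordinary least squares is just one simple stand-in for the many algorithms a data scientist might choose.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical clean data handed over by the data engineers:
# weekly site visits and the sign-ups that followed.
visits = [100, 200, 300, 400]
signups = [12, 21, 33, 41]

slope, intercept = fit_line(visits, signups)

# Predict sign-ups for a week with 500 visits.
predicted = slope * 500 + intercept
print(f"Predicted sign-ups at 500 visits: {predicted:.1f}")
```

The prediction is only as good as the data behind it, which is why the engineering layers lower in the pyramid come first.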

Data engineers and data scientists may not be rock stars—but they can work in harmony to create data predictions that’ll have executives humming along to the bottom line.