There’s an old joke in rock climbing:

“Sport climbing is neither.”

Old-school climbers argue it’s not a sport because there’s no clock, no scoreboard, and no clear winner. They also say it’s not “real” climbing since you clip bolts the whole way and lower off anchors instead of reaching the top of something wild and remote.

We usually say this half-seriously and half to poke fun at ourselves.

Recently, I’ve been thinking that data science is also neither.

It’s not quite statistics. It’s not quite engineering. It’s not quite science. It’s an in-between field focused mostly on building models that predict things well enough to be useful.

Now, AI is making this identity crisis hard to overlook.


Science Is About Understanding

When I first entered this field, I was drawn to the “science” part. My background is in economics, where I trained to build regression models and reason about causal relationships. The work was about understanding phenomena, not just predicting them.

Science, in a slightly romantic sense, is about getting to the heart of things. It’s not just about arriving at the right answer, but about understanding why it’s right. Throughout the history of science, that quest for understanding, not mere prediction, has been the draw. It’s about making sense of the world around us.

When Isaac Newton explained gravity, he wasn’t just fitting curves to planetary motion. When Albert Einstein redefined space and time, he wasn’t focused on metrics.

They aimed to explain how the world works!

That has always felt like the guiding principle of serious intellectual effort: compress reality into something we can reason about, and revise our views when the evidence contradicts the theory.


AI Doesn’t Care About Why

The way we create AI systems doesn’t seem to prioritize understanding.

They excel at mapping inputs to outputs. Provide enough data, enough parameters, and enough compute, and they will approximate almost any input–output function closely. If you want to read more on this, see “On the Dangers of Stochastic Parrots” (Bender et al., 2021).

Perhaps the model does, in some internal sense, grasp the problem it’s solving. But that understanding stays locked inside the weights, and the deeper, nuanced interpretive work gets skipped. The models gain something like understanding while we, the humans around them, lag behind.


Where That Leaves Data Science

For a time, data science felt like a tightrope walk between science and engineering.

We built predictive models. But we also:

  • Analyzed coefficients
  • Designed features
  • Ran A/B tests
  • Debated assumptions
  • Explained results to people unfamiliar with models

There was still a scientific drive behind it. We wanted to comprehend which variables mattered and why.

Now, more often than not, the best-performing model is a large, opaque system you deploy with little or no fine-tuning. It often beats your carefully built logistic regression, lets you iterate faster, and is generally more robust.

If the metrics improve, everyone celebrates.

So, what is the role of a data scientist now?

Are we here to understand the system?
Or just to improve the numbers?


Prediction Is Often Enough

In most businesses, prediction is what matters.

If churn falls, fraud detection improves, or revenue rises, the way we got there can feel unimportant. Understanding is nice, but performance is essential.

This shift is subtle yet significant.

We move from asking:

What is happening?

to asking:

Does it work?

That’s not necessarily a bad change. It’s simply different from what many of us envisioned when we heard the word “science.”


But Understanding Still Shows Up

Here’s the point I keep returning to. Systems based purely on prediction can become fragile.

When the data shifts.
When users change their behavior.
When incentives change.
When the model starts affecting the behavior it forecasts.

That’s when you start to care about the underlying mechanics.

You need someone who can say, “This looks like a shortcut the model found,” or “We might be overfitting to a misleading signal,” or “This might break if X changes.”

Understanding might not be necessary for short-term wins. But it becomes vital for long-term stability!
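The fragility described above can be sketched in a few lines. This is a purely illustrative toy (the feature names, probabilities, and “models” are all invented for the sketch): a spurious shortcut feature tracks the label almost perfectly at training time, so a model that learned the shortcut looks great, until the distribution shifts and it collapses to coin-flip accuracy while the model using the real signal holds steady.

```python
import random

random.seed(0)

def make_data(n, shortcut_corr):
    """Generate (signal, shortcut, label) rows.

    `signal` is the real driver: it matches the label 70% of the time.
    `shortcut` is a spurious feature that matches the label with
    probability `shortcut_corr`.
    """
    rows = []
    for _ in range(n):
        label = random.randint(0, 1)
        signal = label if random.random() < 0.70 else 1 - label
        shortcut = label if random.random() < shortcut_corr else 1 - label
        rows.append((signal, shortcut, label))
    return rows

def accuracy(rows, predict):
    return sum(predict(sig, sc) == y for sig, sc, y in rows) / len(rows)

# Two toy "models": one learned the shortcut, one learned the signal.
shortcut_model = lambda sig, sc: sc
signal_model = lambda sig, sc: sig

train = make_data(10_000, shortcut_corr=0.95)    # shortcut looks brilliant
shifted = make_data(10_000, shortcut_corr=0.50)  # then the world changes

print(f"shortcut model on training-like data: {accuracy(train, shortcut_model):.2f}")
print(f"shortcut model after the shift:       {accuracy(shifted, shortcut_model):.2f}")
print(f"signal model after the shift:         {accuracy(shifted, signal_model):.2f}")
```

Nothing in the metrics on the training-like data distinguishes the two models; only someone who asks why the shortcut model is winning catches the problem before the shift does.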


Maybe Data Science Is Still “Neither”

Data science has always been a hybrid field. The name has always irritated me, too: what science isn’t, in some sense, a data science? Calling some of us “data scientists” implies everyone else is not one.

It’s neither pure science nor pure engineering.

It’s more like applied epistemology with deadlines.

AI doesn’t eliminate the need for data scientists, but it does raise an important question.

If machines can spot patterns better than we can, where do we contribute value?

I suspect it’s not in out-predicting the model, but in framing the problem, deciding what to measure, designing experiments, understanding failure points, and translating between complex systems and human decisions.

AI may not care about why.

But we should still prioritize understanding.


TL;DR: Causal understanding is still valuable. “Data science” is still a poor name. We should keep striving to understand our world!