Why is causation so difficult to understand?

Sergio Focardi
Feb 17
6 min read

Causation is difficult to understand because it means too many things. We find causal statements in many different contexts and with many different meanings. This is misleading and makes it difficult to really understand causation. Let’s begin with purely conceptual issues.

In the western philosophical tradition that started in classical Greece, causation has at least two different meanings.

The first is the proto-scientific notion that things do not happen by chance, but that any change has a cause. For example, an arrow does not fly toward a target spontaneously; it flies because an archer shot it.

However, there is a different notion of causation as existential explanation. Why do things exist? Why are things what they are? Why do we humans suffer or experience pleasure? These are existential questions that philosophers have tried to answer.

Let’s discuss the above notions. The idea that any change is provoked by a cause corresponds to the intuition of daily life. In daily life, most of our actions are intended to provoke a result. If we generalize from daily life and consider causation a kind of fundamental principle, we face a few issues.

First, the ontological question: what is the truth maker of a causal statement? Before the advent of modern science, causation was a basic principle. Causation was considered an ontological reality. Second, causation is a sequence of discrete events or discrete changes. If any change is caused by another change, causation implies an infinite sequence of causal events. Either we assume that causal chains extend into the infinite past, or we have to admit that there are some events which cause but are not caused. The idea of a first cause not caused by anything else is due to Aristotle.

The second meaning of causation is not relative to change or evolution but to the quest for existential explanations. Why do things exist? Why do we exist? Why are things what they are and not something different? It is doubtful whether these questions are meaningful. What could we expect as an answer? Philosophers have tried to answer existential questions by assuming some self-explaining principle. Many philosophers identify this principle with God.

Science has taken a totally different approach. Starting with Newton’s Principia, physics adopted a mathematical approach based on differential equations. The entire body of classical physics and the Theory of Relativity are based on differential equations. Quantum mechanics uses the mathematics of operators acting on Hilbert spaces, but the dynamics of quantum systems is described by Schrödinger’s equation or, in more general settings, by Feynman’s path integrals.

Physics is not causal, but it is descriptive. Initial and boundary conditions are not causes but descriptions of physical systems. The notion of a sequence of causal events does not belong to modern physics.

The concept of scientific explanation is not causal but axiomatic. Following Carl Hempel and Paul Oppenheim, scientific explanation consists in logically inferring the behavior of any physical system from basic principles plus initial and boundary descriptive conditions. Hempel coined the term Deductive-Nomological (DN) to characterize scientific explanation.

Pierre Duhem and, in more general terms, Willard Van Orman Quine introduced what is called the Duhem-Quine thesis, which claims that physics can be tested only globally because individual statements can always be made true. Thomas Kuhn analysed scientific revolutions, which imply a change of paradigm so that new and old theories might be incommensurable. Philip Anderson introduced the idea that physics is hierarchical: complex systems might exhibit laws that cannot be formulated in terms of more fundamental laws.

In modern physics basic laws are empirical hypotheses. The question of why reality behaves according to the basic laws does not belong to scientific explanation. We can logically infer the behavior of a system from basic laws that are accepted as hypotheses.

How does causation fit into this conceptual scheme? In modern terms, causation applies to systems. There are many different modern notions of causation.

Following Hans Reichenbach, causation has been defined in terms of probability. Given two events A and B, A causes B if the conditional probability of B given A is higher than the probability of B. This notion of causation is symmetrical: whenever A raises the probability of B, B also raises the probability of A, so if A causes B then B causes A. Reichenbach’s Common Cause Principle (RCCP) makes any correlation the product of causal effects. RCCP suggests that any functional relationship is causal.
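The symmetry of the probability-raising definition can be checked on a small example. The sketch below uses a made-up joint distribution over two binary events (the numbers are illustrative assumptions, not data) and exact fractions, so the comparison is not blurred by rounding.

```python
from fractions import Fraction

# Hypothetical joint distribution over two binary events A and B.
# The probabilities are illustrative assumptions chosen to sum to 1.
joint = {
    (True, True): Fraction(3, 10),
    (True, False): Fraction(1, 10),
    (False, True): Fraction(2, 10),
    (False, False): Fraction(4, 10),
}


def prob(event):
    """Probability that event(a, b) is true under the joint distribution."""
    return sum(p for (a, b), p in joint.items() if event(a, b))


p_a = prob(lambda a, b: a)         # P(A)
p_b = prob(lambda a, b: b)         # P(B)
p_ab = prob(lambda a, b: a and b)  # P(A and B)

p_b_given_a = p_ab / p_a           # P(B | A)
p_a_given_b = p_ab / p_b           # P(A | B)

# Probability raising in one direction implies it in the other,
# because both inequalities are equivalent to P(A and B) > P(A) * P(B).
a_raises_b = p_b_given_a > p_b
b_raises_a = p_a_given_b > p_a

print(p_b_given_a, p_b, a_raises_b)
print(p_a_given_b, p_a, b_raises_a)
```

Both inequalities reduce to P(A and B) > P(A) P(B), which is why this definition alone cannot tell cause from effect.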

Following the philosopher James Woodward, causation is defined as manipulability. A variable X has a causal effect on a variable Y if a change of X is always followed by a change of Y. This definition assumes that there is a causal mechanism that links the two variables. The causal relationship between two variables is a functional relationship, so that an arbitrary intervention on one variable has a well-defined effect on the other variables.

These different notions of causation have different properties, which makes it difficult to really grasp causation. Causation is not a law of nature, but it is a property of systems, called causal systems, where a set of input variables control output variables.

The mechanism of causation depends on the structure of the system and can be inferred from basic laws and the description of the system. For physical systems that obey physical laws, causation is a property that can be derived from the fundamental physical laws together with a description of the system. Note that only some variables of a system are causal; they coexist with other variables that are not.

However, the current interest in causation is not driven by the wish to understand how the causal behavior of physical systems can be inferred from basic laws. Current causal models, in particular Structural Causal Models (SCMs), are used to describe the causal behavior of complex systems. In many cases, the behavior of complex systems cannot be inferred from basic laws, but is learned from data. Examples include the study of social systems, economics, firms, and the study of complex biological systems.

In these cases, causal laws have to be inferred from observational data. This is a difficult task that leads very easily to misunderstanding. In the last three decades much effort has been devoted to creating algorithms that discover causal relationships from observational data.

Suppose we have selected a number of variables of a system and suppose we have collected an i.i.d. sample of observations of these variables. Consider the correlation matrix of these variables. The mantra of causal models is “Correlation is not causation.”

Correlations computed from observational data might be spurious. This is because correlations can be due to exogenous factors that affect all variables. Given that in general we cannot perform experiments, there is no way to exclude the possibility of exogenous factors affecting all variables.
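A small simulation illustrates how an exogenous factor can produce a correlation between two variables that do not influence each other. This is only a sketch under stated assumptions: the variable names, the Gaussian noise, and the unit coefficients are all hypothetical choices, and the exogenous factor is observed here, which is exactly what we usually cannot count on in practice.

```python
import random


def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is reproducible
n = 5000

# Hypothetical exogenous factor Z that drives both X and Y.
z = [rng.gauss(0, 1) for _ in range(n)]
# X and Y each depend on Z plus their own noise; neither depends on the other.
x = [zi + rng.gauss(0, 1) for zi in z]
y = [zi + rng.gauss(0, 1) for zi in z]

# X and Y are clearly correlated although there is no X -> Y or Y -> X link.
corr_xy = pearson(x, y)

# Removing the contribution of Z (coefficient 1 by construction) leaves
# only the independent noise terms, and the correlation disappears.
x_resid = [xi - zi for xi, zi in zip(x, z)]
y_resid = [yi - zi for yi, zi in zip(y, z)]
corr_xy_without_z = pearson(x_resid, y_resid)

print(round(corr_xy, 3), round(corr_xy_without_z, 3))
```

In this construction the population correlation between X and Y is 0.5, entirely due to Z. If Z were not recorded, nothing in the sample of X and Y alone would reveal that the correlation is spurious.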

Causation assumes that a causal relationship between two variables is due to a causal mechanism that involves only the two variables and that does not depend on any other variable. In a physical system subject to physical laws this fact could be ascertained by looking at the engineering of the system. However, when we apply models such as SCMs formed by a set of structural equations, we have no additional knowledge of the system. Causal laws become first principles.

In order to discover causal relationships, we have to make assumptions about our data. The assumption behind SCMs is that relationships between variables are indeed causal relationships. The definition of SCMs assumes that any variable is a function of its parents plus noise, where the parents of a variable Xi are all variables that have a direct causal relationship with Xi. Then we make additional assumptions relative to the probability distribution of the variables.
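To make the definition concrete, here is a minimal sketch of a three-variable SCM in which every variable is a function of its parents plus noise. The graph (Z -> X, Z -> Y, X -> Y), the linear equations, and the coefficients are hypothetical choices made for illustration; the `do_x` argument imitates an intervention by replacing the structural equation of X with a constant.

```python
import random


def sample_scm(n, seed=0, do_x=None):
    """Draw n samples (z, x, y) from a toy linear SCM.

    Assumed structure: Z has no parents, X has parent Z,
    Y has parents X and Z. If do_x is given, X is set to that
    value regardless of Z (an intervention on X).
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.gauss(0, 1)                      # Z := noise
        if do_x is None:
            x = 0.8 * z + rng.gauss(0, 1)        # X := f(Z) + noise
        else:
            x = do_x                             # equation for X replaced
        y = 1.5 * x + 0.5 * z + rng.gauss(0, 1)  # Y := f(X, Z) + noise
        rows.append((z, x, y))
    return rows


def mean_y(rows):
    return sum(y for _, _, y in rows) / len(rows)


def slope_y_on_x(rows):
    """Least-squares slope of Y regressed on X alone."""
    n = len(rows)
    mean_x = sum(x for _, x, _ in rows) / n
    m_y = mean_y(rows)
    cov = sum((x - mean_x) * (y - m_y) for _, x, y in rows)
    var = sum((x - mean_x) ** 2 for _, x, _ in rows)
    return cov / var


n = 20000
# Effect of moving X from 0 to 1 by intervention: recovers the coefficient 1.5.
interventional_effect = mean_y(sample_scm(n, do_x=1.0)) - mean_y(sample_scm(n, do_x=0.0))
# Slope estimated from observational data: inflated because Z drives both X and Y.
observational_slope = slope_y_on_x(sample_scm(n))

print(round(interventional_effect, 3), round(observational_slope, 3))
```

In this toy model the observational slope is about 1.74 rather than 1.5, because Z contributes to both X and Y. The two numbers coincide only when the assumptions about the graph and the noise allow the confounding to be removed.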

All these assumptions are very strong and might not hold. A correlation might not be spurious and still not be causal: variables might have non-causal functional relationships. In physical systems this is common. We might have variables linked by functional relationships on which we cannot intervene and which therefore are not causal, given the definition of causation as manipulability. Moreover, variables might be subject to more complex relationships, including feedback loops.

If we want to properly understand the causal features of a system, we have to define carefully the concept of causality we adopt and then evaluate whether the assumption of causality is tenable. This is not an easy task because concepts might not be clearly defined. For example, if we define causation as manipulability, we need to evaluate whether variables can effectively be manipulated even though we cannot run experiments.
