diff --git a/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/index.html b/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/index.html
index 58aee567f..be8c6052c 100644
--- a/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/index.html
+++ b/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/index.html
@@ -1,11 +1,11 @@
 <!doctype html><html lang=en dir=auto><head><meta charset=utf-8><meta http-equiv=X-UA-Compatible content="IE=edge"><meta name=viewport content="width=device-width,initial-scale=1,shrink-to-fit=no"><meta name=robots content="index, follow"><title>Diving deeper into causality: Pearl, Kleinberg, Hill, and untested assumptions | Yanir Seroussi | Data & AI for Startup Impact</title>
-<meta name=keywords content="causal inference,data science,insights,predictive modelling"><meta name=description content="Discussing the need for untested assumptions and temporality in causal inference. Mostly based on Samantha Kleinberg&rsquo;s Causality, Probability, and Time."><meta name=author content="Yanir Seroussi"><link rel=canonical href=https://yanirseroussi.com/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/><meta name=google-site-verification content="aWlue7NGcj4dQpjOKJF7YKiAvw3JuHnq6aFqX6VwWAU"><link crossorigin=anonymous href=/assets/css/stylesheet.6f5c97224af1f1714566202529b7d458386b85c4df858c71df30dd5c1c769363.css integrity="sha256-b1yXIkrx8XFFZiAlKbfUWDhrhcTfhYxx3zDdXBx2k2M=" rel="preload stylesheet" as=style><link rel=icon href=https://yanirseroussi.com/favicon.ico><link rel=icon type=image/png sizes=16x16 href=https://yanirseroussi.com/favicon-16x16.png><link rel=icon type=image/png sizes=32x32 href=https://yanirseroussi.com/favicon-32x32.png><link rel=apple-touch-icon href=https://yanirseroussi.com/apple-touch-icon.png><link rel=mask-icon href=https://yanirseroussi.com/safari-pinned-tab.svg><meta name=theme-color content="#2e2e33"><meta name=msapplication-TileColor content="#2e2e33"><link rel=alternate hreflang=en href=https://yanirseroussi.com/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/><noscript><style>#theme-toggle,.top-link{display:none}</style><style>@media(prefers-color-scheme:dark){:root{--theme:rgb(29, 30, 32);--entry:rgb(46, 46, 51);--primary:rgb(218, 218, 219);--secondary:rgb(155, 156, 157);--tertiary:rgb(65, 66, 68);--content:rgb(196, 196, 197);--code-block-bg:rgb(46, 46, 51);--code-bg:rgb(55, 56, 62);--border:rgb(51, 51, 51)}.list{background:var(--theme)}.list:not(.dark)::-webkit-scrollbar-track{background:0 0}.list:not(.dark)::-webkit-scrollbar-thumb{border-color:var(--theme)}}</style></noscript><meta property="og:title" content="Diving deeper into causality: Pearl, Kleinberg, Hill, and untested assumptions"><meta property="og:description" content="Discussing the need for untested assumptions and temporality in causal inference. Mostly based on Samantha Kleinberg&rsquo;s Causality, Probability, and Time."><meta property="og:type" content="article"><meta property="og:url" content="https://yanirseroussi.com/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/"><meta property="og:image" content="https://yanirseroussi.com/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/freediving.jpg"><meta property="article:section" content="posts"><meta property="article:published_time" content="2016-05-14T19:57:03+00:00"><meta property="article:modified_time" content="2024-01-16T09:56:03+10:00"><meta name=twitter:card content="summary_large_image"><meta name=twitter:image content="https://yanirseroussi.com/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/freediving.jpg"><meta name=twitter:title content="Diving deeper into causality: Pearl, Kleinberg, Hill, and untested assumptions"><meta name=twitter:description content="Discussing the need for untested assumptions and temporality in causal inference. Mostly based on Samantha Kleinberg&rsquo;s Causality, Probability, and Time."><script type=application/ld+json>{"@context":"https://schema.org","@type":"BreadcrumbList","itemListElement":[{"@type":"ListItem","position":1,"name":"Browse Posts","item":"https://yanirseroussi.com/posts/"},{"@type":"ListItem","position":2,"name":"Diving deeper into causality: Pearl, Kleinberg, Hill, and untested assumptions","item":"https://yanirseroussi.com/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/"}]}</script><script type=application/ld+json>{"@context":"https://schema.org","@type":"BlogPosting","headline":"Diving deeper into causality: Pearl, Kleinberg, Hill, and untested assumptions","name":"Diving deeper into causality: Pearl, Kleinberg, Hill, and untested assumptions","description":"Discussing the need for untested assumptions and temporality in causal inference. Mostly based on Samantha Kleinberg\u0026rsquo;s Causality, Probability, and Time.","keywords":["causal inference","data science","insights","predictive modelling"],"articleBody":"Background: I have previously written about the need for real insights that address the why behind events, not only the what and how. This was followed by a fairly popular post on causality, which was heavily influenced by Samantha Kleinberg's book Why: A Guide to Finding and Using Causes. This post continues my exploration of the field, and is primarily based on Kleinberg's previous book: Causality, Probability, and Time.\nThe study of causality and causal inference is central to science in general and data science in particular. Being able to distinguish between correlation and causation is key to designing effective interventions in business, public policy, medicine, and many other fields. There are quite a few approaches to inferring causal relationships from data. In this post, I discuss some aspects of Judea Pearl’s graphical modelling approach, and how its limitations are addressed in recent work by Samantha Kleinberg. I then finish with a brief survey of the Bradford Hill criteria and their applicability to a key limitation of all causal inference methods: The need for untested assumptions.\nJudea Pearl Overcoming my Pearl bias First, I must disclose that I have a personal bias in favour of Pearl’s work. While I’ve never met him, Pearl is my academic grandfather – he was the PhD advisor of my main PhD supervisor (Ingrid Zukerman). My first serious exposure to his work was through a Sydney reading group, where we discussed parts of Pearl’s approach to causal inference. Recently, I refreshed my knowledge of Pearl causality by reading Causal inference in statistics: An overview. I am by no means an expert in Pearl’s huge body of work, but I think I understand enough of it to write something of use.\nPearl’s theory of causality employs Bayesian networks to represent causal structures. These are directed acyclic graphs, where each vertex represents a variable, and an edge from X to Y implies that X causes Y. Pearl also introduces the do(X) operator, which simulates interventions by removing all the causes of X, setting it to a constant. There is much more to this theory, but two of its main contributions are the formalisation of causal concepts that are often given only a verbal treatment, and the explicit encoding of causal assumptions. These assumptions must be made by the modeller based on background knowledge, and are encoded in the graph’s structure – a missing edge between two vertices indicates that there is no direct causal relationship between the two variables.\nMy main issue with Pearl’s treatment of causality is that he doesn’t explicitly handle time. While time can be encoded into Pearl’s models (e.g., via dynamic Bayesian networks), there is nothing that prevents creation of models where the future causes changes in the past. A closely-related issue is that Pearl’s causal models must be directed acyclic graphs, making it hard to model feedback loops. For example, Pearl says that “mud does not cause rain”, but this isn’t true – water from mud evaporates, causing rain (which causes mud). What’s true is that “mud now doesn’t cause rain now” or something along these lines, which is something that must be accounted for by adding temporal information to the models.\nNonetheless, Pearl’s theory is an important step forward in the study of causality. In his words, “in the bulk of the statistical literature before 2000, causal claims rarely appear in the mathematics. They surface only in the verbal interpretation that investigators occasionally attach to certain associations, and in the verbal description with which investigators justify assumptions.” The importance of formal causal analysis cannot be overstated, as it underlies many decisions that affect our lives. However, it seems to me like there’s still plenty of work to be done before causal analysis becomes as established as other statistical tools.\nSamantha Kleinberg Kleinberg: Addressing gaps in Pearl’s work I recently finished reading Samantha Kleinberg’s Causality, Probability, and Time. Kleinberg dedicates a good portion of the book to presenting the history of causality and discussing its many definitions. As hinted by the book’s title, Kleinberg believes that one cannot discuss causality without considering time. In her words: “One of the most critical pieces of information about causality, though – the time it takes for the cause to produce its effect – has been largely ignored by both philosophical theories and computational methods. If we do not know when the effect will occur, we have little hope of being able to act successfully using the causal relationship.” Following this assertion, Kleinberg presents a new approach to causal inference that is based on probabilistic computation tree logic (PCTL). With PCTL, one can concisely express probabilistic temporal statements. For example, if we observe a potential cause c occurring at time t, and a possible effect e occurring at time t’, we can use PCTL to state the hypothesis that in general, after c becomes true, it takes between one and |t’ – t| time units for e to become true with probability at least p, i.e., c leads to e:\nIt is obvious why PCTL may be a better fit than Bayesian networks for expressing causal statements. For example, with a Bayesian network, we can easily express the statement that smoking causes lung cancer with probability 0.3, but this isn’t that useful, as it doesn’t tell us how long it’ll take for cancer to develop. With PCTL, we can state that smoking causes lung cancer in 5-30 years with probability at least 0.3. This matches our knowledge that cancer doesn’t develop immediately – one cigarette won’t kill you.\nOne of the key concepts introduced by Kleinberg is that of causal significance. Calculating the causal significance of a cause c to an effect e relies on first identifying the set X of potential (or prima facie) causes of e. The set X contains all discrete variables x such that E[e|x]≠E[e] and x occurs earlier than e. Given the set X, the causal significance of c to e is the mean of E[e|c∧x] – E[e|¬c∧x] for all x≠c. The intuition is that if a cause c is significant, its causal significance value will be high when other potential causes are held fixed. For example, if c is heavy smoking and e is severity of lung cancer (with e=0 meaning no cancer), the expected value of e given c is likely to be higher than the expected value of e given ¬c, when conditioned on any other potential cause. Once causal significance has been measured, we can separate significant causes from insignificant causes by setting a threshold on causal significance values (this threshold can be inferred from the data). Significant causes are considered to be genuine if the data is stationary and the common causes of all pairs of variables have been included, which is a very strong condition that may be hard to fulfil in realistic scenarios. However, causal significance is an evolving concept – last year, Huang and Kleinberg introduced a new definition of causal significance that can be inferred faster and yield more accurate results. My general feeling is that this line of research will continue to yield many interesting and useful results in coming years.\nKleinberg’s work is not without its limitations. In addition to the assumptions that causal relationships are stationary and the requirement to identify all potential causes, the recently-introduced definition of causal significance also requires the relationships to be linear and additive (though this limitation may be relaxed in future work). Another issue is that most of the evaluation in the studies I’ve read was done on synthetic datasets. While there are some results on real-life health and finance data, I find it hard to judge the practicality of utilising Kleinberg’s methods without applying them to problems that I’m more familiar with. Finally, as with other work in the field of causal inference, we need to have some degree of belief in untested assumptions to reach useful conclusions. In Kleinberg’s words:\nThus, a just so cause is genuine in the case where all of the outlined assumptions hold (namely that all common causes are included, the structure is representative of the system and, when data is used, a formula satisfied by the data will be satisfied by the structure). Our belief in whether a cause is genuine, in the case where it is not certain that the assumptions hold, should be proportional to how much we believe that the assumptions are true.\nAustin Bradford Hill Hill: Testing untested assumptions To the best of my knowledge, all causal inference methods rely on untested assumptions. Specifically, we can never include all the variables in the universe in our models. Therefore, any conclusions drawn are reliant on deciding what, when, and how to measure potential causes and effects. Another issue is that no matter how good and believable our modelling is, we cannot use causal inference to convince unreasonable people. For example, some people may cite divine intervention as an unmeasurable cause of anything and everything. In addition, people with certain commercial interests often try to raise doubt about well-established causal mechanisms by making unreasonable claims for evidence of various hidden factors. For example, tobacco companies used to claim that both smoking and lung cancer were caused by a common hidden factor, making the link between smoking and lung cancer a mere association.\nAssuming that we are dealing with reasonable people, there’s still the question of where we should get our untested assumptions from. This question is fairly old, and has been partly answered in 1965 by Austin Bradford Hill, with nine criteria that he recommended should be considered before calling an association causal:\nStrength: How strong is the association? For example, lung cancer deaths of heavy smokers are 20-30 times greater than those of non-smokers. Consistency: Has the association been repeatedly observed in various circumstances? For example, many different populations have exhibited an association between smoking rates and cancer. Specificity: Can we pin down specific instances of the effect to specific instances of the cause? Hill sees this as a nice-to-have condition rather than a must-have – cases with multiple possible causes may not fulfil the specificity requirement. Temporality: Do we know that c leads to e or are we observing them together? This is a condition that isn’t always easy to fulfil, especially when dealing with feedback loops and slow processes. Biological gradient: Hill’s focus was on medicine, and this condition refers to the association exhibiting some dose-response curve. This can be generalised to other fields, as we can expect some regularity in the effect if it is a function of the cause (though it doesn’t have to be a linear function). Plausibility: Do we know of a mechanism that can explain how the cause brings about the effect? Coherence: Does the association conflict with our current knowledge? Even if it does, it isn’t enough to rule out causality, as our current knowledge may be incomplete or wrong. Experiment: If possible, running controlled experiments may yield very powerful evidence in favour of causation. Analogy: Do we know of any similar cause-and-effect relationships? Hill summarises the list of criteria (or viewpoints) with the following statements.\nHere then are nine different viewpoints from all of which we should study association before we cry causation. What I do not believe – and this has been suggested – is that we can usefully lay down some hard-and-fast rules of evidence that must be obeyed before we accept cause and effect. None of my nine viewpoints can bring indisputable evidence for or against the cause-and-effect hypothesis and none can be required as a sine qua non. What they can do, with greater or less strength, is to help us to make up our minds on the fundamental question – is there any other way of explaining the set of facts before us, is there any other answer equally, or more, likely than cause and effect?\nNo formal tests of significance can answer those questions. Such tests can, and should, remind us of the effects that the play of chance can create, and they will instruct us in the likely magnitude of those effects. Beyond that they contribute nothing to the ‘proof’ of our hypothesis.\nHill then goes on to criticise the increased focus on statistical significance as a condition for accepting scientific papers for publication. Remembering that this was over 50 years ago, it is a bit worrying that it has taken so long for the statistical community to formally acknowledge the fact that statistical significance does not imply scientific importance, or constitutes enough evidence to support a causal hypothesis.\nClosing thoughts This post has only scratched the surface of the vast field of study of causality. At this point, I feel like I’ve read quite a bit, and it is time to apply what I learned to real problems. I encounter questions of causality in my everyday work, but haven’t fully applied formal causal inference to any problem yet. My view is that everyone needs to at least be aware of the need to consider causality, and of what it’d take to truly prove causal impact. A large proportion of what many people need in practice may be addressed by Hill’s criteria, rather than by formal methods for causal analysis. Nonetheless, I will report back when I get a chance to apply formal causal inference to real datasets. Stay tuned!\n","wordCount":"2223","inLanguage":"en","image":"https://yanirseroussi.com/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/freediving.jpg","datePublished":"2016-05-14T19:57:03Z","dateModified":"2024-01-16T09:56:03+10:00","author":{"@type":"Person","name":"Yanir Seroussi"},"mainEntityOfPage":{"@type":"WebPage","@id":"https://yanirseroussi.com/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/"},"publisher":{"@type":"Organization","name":"Yanir Seroussi | Data \u0026 AI for Startup Impact","logo":{"@type":"ImageObject","url":"https://yanirseroussi.com/favicon.ico"}}}</script></head><body id=top><script>localStorage.getItem("pref-theme")==="dark"?document.body.classList.add("dark"):localStorage.getItem("pref-theme")==="light"?document.body.classList.remove("dark"):window.matchMedia("(prefers-color-scheme: dark)").matches&&document.body.classList.add("dark")</script><header class=header><nav class=nav><div class=logo><a href=https://yanirseroussi.com/ accesskey=h title="Yanir Seroussi | Data & AI for Startup Impact (Alt + H)">Yanir Seroussi | Data & AI for Startup Impact</a><div class=logo-switches><button id=theme-toggle accesskey=t title="(Alt + T)"><svg id="moon" width="24" height="18" viewBox="0 0 24 24" fill="none" stroke="currentcolor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><path d="M21 12.79A9 9 0 1111.21 3 7 7 0 0021 12.79z"/></svg><svg id="sun" width="24" height="18" viewBox="0 0 24 24" fill="none" stroke="currentcolor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><circle cx="12" cy="12" r="5"/><line x1="12" y1="1" x2="12" y2="3"/><line x1="12" y1="21" x2="12" y2="23"/><line x1="4.22" y1="4.22" x2="5.64" y2="5.64"/><line x1="18.36" y1="18.36" x2="19.78" y2="19.78"/><line x1="1" y1="12" x2="3" y2="12"/><line x1="21" y1="12" x2="23" y2="12"/><line x1="4.22" y1="19.78" x2="5.64" y2="18.36"/><line x1="18.36" y1="5.64" x2="19.78" y2="4.22"/></svg></button></div></div><button id=menu-trigger aria-haspopup=menu aria-label="Menu Button"><svg width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentcolor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-menu"><line x1="3" y1="12" x2="21" y2="12"/><line x1="3" y1="6" x2="21" y2="6"/><line x1="3" y1="18" x2="21" y2="18"/></svg></button><ul class="menu hidden"><li><a href=https://yanirseroussi.com/about/ title=About><span>About</span></a></li><li><a href=https://yanirseroussi.com/posts/ title=Writing><span>Writing</span></a></li><li><a href=https://yanirseroussi.com/talks/ title=Speaking><span>Speaking</span></a></li><li><a href=https://yanirseroussi.com/consult/ title=Consulting><span>Consulting</span></a></li></ul></nav></header><main class=main><article class=post-single><header class=post-header><h1 class="post-title entry-hint-parent">Diving deeper into causality: Pearl, Kleinberg, Hill, and untested assumptions</h1><div class=post-meta><span title='2016-05-14 19:57:03 +0000 UTC'>May 14, 2016</span></div></header><figure class=entry-cover><img loading=eager srcset="https://yanirseroussi.com/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/freediving_hu8406825224592319487.jpg 360w ,https://yanirseroussi.com/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/freediving_hu14840648707129179482.jpg 480w ,https://yanirseroussi.com/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/freediving_hu8563758355214543345.jpg 720w ,https://yanirseroussi.com/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/freediving_hu15603219818946967169.jpg 1080w ,https://yanirseroussi.com/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/freediving_hu4491227251759857421.jpg 1500w ,https://yanirseroussi.com/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/freediving.jpg 1920w" sizes="(min-width: 768px) 720px, 100vw" src=https://yanirseroussi.com/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/freediving.jpg alt width=1920 height=672></figure><div class=post-content><p class=intro-note>Background: I have previously written about <a href=https://yanirseroussi.com/2015/12/08/this-holiday-season-give-me-real-insights/>the need for real insights that address the why behind events, not only the what and how</a>. This was followed by a <a href=https://yanirseroussi.com/2016/02/14/why-you-should-stop-worrying-about-deep-learning-and-deepen-your-understanding-of-causality-instead/>fairly popular post on causality</a>, which was heavily influenced by Samantha Kleinberg's book <a href=http://www.skleinberg.org/why/ target=_blank rel=noopener>Why: A Guide to Finding and Using Causes</a>. This post continues my exploration of the field, and is primarily based on Kleinberg's previous book: <a href=http://www.skleinberg.org/causality_book/index.html target=_blank rel=noopener>Causality, Probability, and Time</a>.</p><p>The study of causality and causal inference is central to science in general and data science in particular. Being able to distinguish between correlation and causation is key to designing effective interventions in business, public policy, medicine, and many other fields. There are quite a few approaches to inferring causal relationships from data. In this post, I discuss some aspects of <a href=https://en.wikipedia.org/wiki/Judea_Pearl target=_blank rel=noopener>Judea Pearl&rsquo;s</a> graphical modelling approach, and how its limitations are addressed in recent work by <a href=http://www.skleinberg.org/ target=_blank rel=noopener>Samantha Kleinberg</a>. I then finish with a brief survey of the <a href=https://en.wikipedia.org/wiki/Bradford_Hill_criteria target=_blank rel=noopener>Bradford Hill criteria</a> and their applicability to a key limitation of all causal inference methods: The need for untested assumptions.</p><h2 id=hahahugoshortcode41s0hbhb-overcoming-my-pearl-bias><figure class=float-right><a href=judea-pearl.jpg target=_blank rel=noopener><img sizes="(min-width: 768px) 435px,
+<meta name=keywords content="causal inference,data science,insights,predictive modelling"><meta name=description content="Discussing the need for untested assumptions and temporality in causal inference. Mostly based on Samantha Kleinberg&rsquo;s Causality, Probability, and Time."><meta name=author content="Yanir Seroussi"><link rel=canonical href=https://yanirseroussi.com/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/><meta name=google-site-verification content="aWlue7NGcj4dQpjOKJF7YKiAvw3JuHnq6aFqX6VwWAU"><link crossorigin=anonymous href=/assets/css/stylesheet.6f5c97224af1f1714566202529b7d458386b85c4df858c71df30dd5c1c769363.css integrity="sha256-b1yXIkrx8XFFZiAlKbfUWDhrhcTfhYxx3zDdXBx2k2M=" rel="preload stylesheet" as=style><link rel=icon href=https://yanirseroussi.com/favicon.ico><link rel=icon type=image/png sizes=16x16 href=https://yanirseroussi.com/favicon-16x16.png><link rel=icon type=image/png sizes=32x32 href=https://yanirseroussi.com/favicon-32x32.png><link rel=apple-touch-icon href=https://yanirseroussi.com/apple-touch-icon.png><link rel=mask-icon href=https://yanirseroussi.com/safari-pinned-tab.svg><meta name=theme-color content="#2e2e33"><meta name=msapplication-TileColor content="#2e2e33"><link rel=alternate hreflang=en href=https://yanirseroussi.com/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/><noscript><style>#theme-toggle,.top-link{display:none}</style><style>@media(prefers-color-scheme:dark){:root{--theme:rgb(29, 30, 32);--entry:rgb(46, 46, 51);--primary:rgb(218, 218, 219);--secondary:rgb(155, 156, 157);--tertiary:rgb(65, 66, 68);--content:rgb(196, 196, 197);--code-block-bg:rgb(46, 46, 51);--code-bg:rgb(55, 56, 62);--border:rgb(51, 51, 51)}.list{background:var(--theme)}.list:not(.dark)::-webkit-scrollbar-track{background:0 0}.list:not(.dark)::-webkit-scrollbar-thumb{border-color:var(--theme)}}</style></noscript><meta property="og:title" content="Diving deeper into causality: Pearl, Kleinberg, Hill, and untested assumptions"><meta property="og:description" content="Discussing the need for untested assumptions and temporality in causal inference. Mostly based on Samantha Kleinberg&rsquo;s Causality, Probability, and Time."><meta property="og:type" content="article"><meta property="og:url" content="https://yanirseroussi.com/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/"><meta property="og:image" content="https://yanirseroussi.com/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/freediving.jpg"><meta property="article:section" content="posts"><meta property="article:published_time" content="2016-05-14T19:57:03+00:00"><meta property="article:modified_time" content="2024-01-16T09:56:03+10:00"><meta name=twitter:card content="summary_large_image"><meta name=twitter:image content="https://yanirseroussi.com/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/freediving.jpg"><meta name=twitter:title content="Diving deeper into causality: Pearl, Kleinberg, Hill, and untested assumptions"><meta name=twitter:description content="Discussing the need for untested assumptions and temporality in causal inference. Mostly based on Samantha Kleinberg&rsquo;s Causality, Probability, and Time."><script type=application/ld+json>{"@context":"https://schema.org","@type":"BreadcrumbList","itemListElement":[{"@type":"ListItem","position":1,"name":"Browse Posts","item":"https://yanirseroussi.com/posts/"},{"@type":"ListItem","position":2,"name":"Diving deeper into causality: Pearl, Kleinberg, Hill, and untested assumptions","item":"https://yanirseroussi.com/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/"}]}</script><script type=application/ld+json>{"@context":"https://schema.org","@type":"BlogPosting","headline":"Diving deeper into causality: Pearl, Kleinberg, Hill, and untested assumptions","name":"Diving deeper into causality: Pearl, Kleinberg, Hill, and untested assumptions","description":"Discussing the need for untested assumptions and temporality in causal inference. Mostly based on Samantha Kleinberg\u0026rsquo;s Causality, Probability, and Time.","keywords":["causal inference","data science","insights","predictive modelling"],"articleBody":"Background: I have previously written about the need for real insights that address the why behind events, not only the what and how. This was followed by a fairly popular post on causality, which was heavily influenced by Samantha Kleinberg's book Why: A Guide to Finding and Using Causes. This post continues my exploration of the field, and is primarily based on Kleinberg's previous book: Causality, Probability, and Time.\nThe study of causality and causal inference is central to science in general and data science in particular. Being able to distinguish between correlation and causation is key to designing effective interventions in business, public policy, medicine, and many other fields. There are quite a few approaches to inferring causal relationships from data. In this post, I discuss some aspects of Judea Pearl’s graphical modelling approach, and how its limitations are addressed in recent work by Samantha Kleinberg. I then finish with a brief survey of the Bradford Hill criteria and their applicability to a key limitation of all causal inference methods: The need for untested assumptions.\nJudea Pearl Overcoming my Pearl bias First, I must disclose that I have a personal bias in favour of Pearl’s work. While I’ve never met him, Pearl is my academic grandfather – he was the PhD advisor of my main PhD supervisor (Ingrid Zukerman). My first serious exposure to his work was through a Sydney reading group, where we discussed parts of Pearl’s approach to causal inference. Recently, I refreshed my knowledge of Pearl causality by reading Causal inference in statistics: An overview. I am by no means an expert in Pearl’s huge body of work, but I think I understand enough of it to write something of use.\nPearl’s theory of causality employs Bayesian networks to represent causal structures. These are directed acyclic graphs, where each vertex represents a variable, and an edge from X to Y implies that X causes Y. Pearl also introduces the do(X) operator, which simulates interventions by removing all the causes of X, setting it to a constant. There is much more to this theory, but two of its main contributions are the formalisation of causal concepts that are often given only a verbal treatment, and the explicit encoding of causal assumptions. These assumptions must be made by the modeller based on background knowledge, and are encoded in the graph’s structure – a missing edge between two vertices indicates that there is no direct causal relationship between the two variables.\nMy main issue with Pearl’s treatment of causality is that he doesn’t explicitly handle time. While time can be encoded into Pearl’s models (e.g., via dynamic Bayesian networks), there is nothing that prevents creation of models where the future causes changes in the past. A closely-related issue is that Pearl’s causal models must be directed acyclic graphs, making it hard to model feedback loops. For example, Pearl says that “mud does not cause rain”, but this isn’t true – water from mud evaporates, causing rain (which causes mud). What’s true is that “mud now doesn’t cause rain now” or something along these lines, which is something that must be accounted for by adding temporal information to the models.\nNonetheless, Pearl’s theory is an important step forward in the study of causality. In his words, “in the bulk of the statistical literature before 2000, causal claims rarely appear in the mathematics. They surface only in the verbal interpretation that investigators occasionally attach to certain associations, and in the verbal description with which investigators justify assumptions.” The importance of formal causal analysis cannot be overstated, as it underlies many decisions that affect our lives. However, it seems to me like there’s still plenty of work to be done before causal analysis becomes as established as other statistical tools.\nSamantha Kleinberg Kleinberg: Addressing gaps in Pearl’s work I recently finished reading Samantha Kleinberg’s Causality, Probability, and Time. Kleinberg dedicates a good portion of the book to presenting the history of causality and discussing its many definitions. As hinted by the book’s title, Kleinberg believes that one cannot discuss causality without considering time. In her words: “One of the most critical pieces of information about causality, though – the time it takes for the cause to produce its effect – has been largely ignored by both philosophical theories and computational methods. If we do not know when the effect will occur, we have little hope of being able to act successfully using the causal relationship.” Following this assertion, Kleinberg presents a new approach to causal inference that is based on probabilistic computation tree logic (PCTL). With PCTL, one can concisely express probabilistic temporal statements. For example, if we observe a potential cause c occurring at time t, and a possible effect e occurring at time t’, we can use PCTL to state the hypothesis that in general, after c becomes true, it takes between one and |t’ – t| time units for e to become true with probability at least p, i.e., c leads to e:\nIt is obvious why PCTL may be a better fit than Bayesian networks for expressing causal statements. For example, with a Bayesian network, we can easily express the statement that smoking causes lung cancer with probability 0.3, but this isn’t that useful, as it doesn’t tell us how long it’ll take for cancer to develop. With PCTL, we can state that smoking causes lung cancer in 5-30 years with probability at least 0.3. This matches our knowledge that cancer doesn’t develop immediately – one cigarette won’t kill you.\nOne of the key concepts introduced by Kleinberg is that of causal significance. Calculating the causal significance of a cause c to an effect e relies on first identifying the set X of potential (or prima facie) causes of e. The set X contains all discrete variables x such that E[e|x]≠E[e] and x occurs earlier than e. Given the set X, the causal significance of c to e is the mean of E[e|c∧x] – E[e|¬c∧x] for all x≠c. The intuition is that if a cause c is significant, its causal significance value will be high when other potential causes are held fixed. For example, if c is heavy smoking and e is severity of lung cancer (with e=0 meaning no cancer), the expected value of e given c is likely to be higher than the expected value of e given ¬c, when conditioned on any other potential cause. Once causal significance has been measured, we can separate significant causes from insignificant causes by setting a threshold on causal significance values (this threshold can be inferred from the data). Significant causes are considered to be genuine if the data is stationary and the common causes of all pairs of variables have been included, which is a very strong condition that may be hard to fulfil in realistic scenarios. However, causal significance is an evolving concept – last year, Huang and Kleinberg introduced a new definition of causal significance that can be inferred faster and yield more accurate results. My general feeling is that this line of research will continue to yield many interesting and useful results in coming years.\nKleinberg’s work is not without its limitations. In addition to the assumptions that causal relationships are stationary and the requirement to identify all potential causes, the recently-introduced definition of causal significance also requires the relationships to be linear and additive (though this limitation may be relaxed in future work). Another issue is that most of the evaluation in the studies I’ve read was done on synthetic datasets. While there are some results on real-life health and finance data, I find it hard to judge the practicality of utilising Kleinberg’s methods without applying them to problems that I’m more familiar with. Finally, as with other work in the field of causal inference, we need to have some degree of belief in untested assumptions to reach useful conclusions. In Kleinberg’s words:\nThus, a just so cause is genuine in the case where all of the outlined assumptions hold (namely that all common causes are included, the structure is representative of the system and, when data is used, a formula satisfied by the data will be satisfied by the structure). Our belief in whether a cause is genuine, in the case where it is not certain that the assumptions hold, should be proportional to how much we believe that the assumptions are true.\nAustin Bradford Hill Hill: Testing untested assumptions To the best of my knowledge, all causal inference methods rely on untested assumptions. Specifically, we can never include all the variables in the universe in our models. Therefore, any conclusions drawn are reliant on deciding what, when, and how to measure potential causes and effects. Another issue is that no matter how good and believable our modelling is, we cannot use causal inference to convince unreasonable people. For example, some people may cite divine intervention as an unmeasurable cause of anything and everything. In addition, people with certain commercial interests often try to raise doubt about well-established causal mechanisms by making unreasonable claims for evidence of various hidden factors. For example, tobacco companies used to claim that both smoking and lung cancer were caused by a common hidden factor, making the link between smoking and lung cancer a mere association.\nAssuming that we are dealing with reasonable people, there’s still the question of where we should get our untested assumptions from. This question is fairly old, and has been partly answered in 1965 by Austin Bradford Hill, with nine criteria that he recommended should be considered before calling an association causal:\nStrength: How strong is the association? For example, lung cancer deaths of heavy smokers are 20-30 times greater than those of non-smokers. Consistency: Has the association been repeatedly observed in various circumstances? For example, many different populations have exhibited an association between smoking rates and cancer. Specificity: Can we pin down specific instances of the effect to specific instances of the cause? Hill sees this as a nice-to-have condition rather than a must-have – cases with multiple possible causes may not fulfil the specificity requirement. Temporality: Do we know that c leads to e or are we observing them together? This is a condition that isn’t always easy to fulfil, especially when dealing with feedback loops and slow processes. Biological gradient: Hill’s focus was on medicine, and this condition refers to the association exhibiting some dose-response curve. This can be generalised to other fields, as we can expect some regularity in the effect if it is a function of the cause (though it doesn’t have to be a linear function). Plausibility: Do we know of a mechanism that can explain how the cause brings about the effect? Coherence: Does the association conflict with our current knowledge? Even if it does, it isn’t enough to rule out causality, as our current knowledge may be incomplete or wrong. Experiment: If possible, running controlled experiments may yield very powerful evidence in favour of causation. Analogy: Do we know of any similar cause-and-effect relationships? Hill summarises the list of criteria (or viewpoints) with the following statements.\nHere then are nine different viewpoints from all of which we should study association before we cry causation. What I do not believe – and this has been suggested – is that we can usefully lay down some hard-and-fast rules of evidence that must be obeyed before we accept cause and effect. None of my nine viewpoints can bring indisputable evidence for or against the cause-and-effect hypothesis and none can be required as a sine qua non. What they can do, with greater or less strength, is to help us to make up our minds on the fundamental question – is there any other way of explaining the set of facts before us, is there any other answer equally, or more, likely than cause and effect?\nNo formal tests of significance can answer those questions. Such tests can, and should, remind us of the effects that the play of chance can create, and they will instruct us in the likely magnitude of those effects. Beyond that they contribute nothing to the ‘proof’ of our hypothesis.\nHill then goes on to criticise the increased focus on statistical significance as a condition for accepting scientific papers for publication. Remembering that this was over 50 years ago, it is a bit worrying that it has taken so long for the statistical community to formally acknowledge the fact that statistical significance does not imply scientific importance, or constitutes enough evidence to support a causal hypothesis.\nClosing thoughts This post has only scratched the surface of the vast field of study of causality. At this point, I feel like I’ve read quite a bit, and it is time to apply what I learned to real problems. I encounter questions of causality in my everyday work, but haven’t fully applied formal causal inference to any problem yet. My view is that everyone needs to at least be aware of the need to consider causality, and of what it’d take to truly prove causal impact. A large proportion of what many people need in practice may be addressed by Hill’s criteria, rather than by formal methods for causal analysis. Nonetheless, I will report back when I get a chance to apply formal causal inference to real datasets. Stay tuned!\n","wordCount":"2223","inLanguage":"en","image":"https://yanirseroussi.com/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/freediving.jpg","datePublished":"2016-05-14T19:57:03Z","dateModified":"2024-01-16T09:56:03+10:00","author":{"@type":"Person","name":"Yanir Seroussi"},"mainEntityOfPage":{"@type":"WebPage","@id":"https://yanirseroussi.com/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/"},"publisher":{"@type":"Organization","name":"Yanir Seroussi | Data \u0026 AI for Startup Impact","logo":{"@type":"ImageObject","url":"https://yanirseroussi.com/favicon.ico"}}}</script></head><body id=top><script>localStorage.getItem("pref-theme")==="dark"?document.body.classList.add("dark"):localStorage.getItem("pref-theme")==="light"?document.body.classList.remove("dark"):window.matchMedia("(prefers-color-scheme: dark)").matches&&document.body.classList.add("dark")</script><header class=header><nav class=nav><div class=logo><a href=https://yanirseroussi.com/ accesskey=h title="Yanir Seroussi | Data & AI for Startup Impact (Alt + H)">Yanir Seroussi | Data & AI for Startup Impact</a><div class=logo-switches><button id=theme-toggle accesskey=t title="(Alt + T)"><svg id="moon" width="24" height="18" viewBox="0 0 24 24" fill="none" stroke="currentcolor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><path d="M21 12.79A9 9 0 1111.21 3 7 7 0 0021 12.79z"/></svg><svg id="sun" width="24" height="18" viewBox="0 0 24 24" fill="none" stroke="currentcolor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><circle cx="12" cy="12" r="5"/><line x1="12" y1="1" x2="12" y2="3"/><line x1="12" y1="21" x2="12" y2="23"/><line x1="4.22" y1="4.22" x2="5.64" y2="5.64"/><line x1="18.36" y1="18.36" x2="19.78" y2="19.78"/><line x1="1" y1="12" x2="3" y2="12"/><line x1="21" y1="12" x2="23" y2="12"/><line x1="4.22" y1="19.78" x2="5.64" y2="18.36"/><line x1="18.36" y1="5.64" x2="19.78" y2="4.22"/></svg></button></div></div><button id=menu-trigger aria-haspopup=menu aria-label="Menu Button"><svg width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentcolor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-menu"><line x1="3" y1="12" x2="21" y2="12"/><line x1="3" y1="6" x2="21" y2="6"/><line x1="3" y1="18" x2="21" y2="18"/></svg></button><ul class="menu hidden"><li><a href=https://yanirseroussi.com/about/ title=About><span>About</span></a></li><li><a href=https://yanirseroussi.com/posts/ title=Writing><span>Writing</span></a></li><li><a href=https://yanirseroussi.com/talks/ title=Speaking><span>Speaking</span></a></li><li><a href=https://yanirseroussi.com/consult/ title=Consulting><span>Consulting</span></a></li></ul></nav></header><main class=main><article class=post-single><header class=post-header><h1 class="post-title entry-hint-parent">Diving deeper into causality: Pearl, Kleinberg, Hill, and untested assumptions</h1><div class=post-meta><span title='2016-05-14 19:57:03 +0000 UTC'>May 14, 2016</span></div></header><figure class=entry-cover><img loading=eager srcset="https://yanirseroussi.com/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/freediving_hu8406825224592319487.jpg 360w ,https://yanirseroussi.com/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/freediving_hu14840648707129179482.jpg 480w ,https://yanirseroussi.com/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/freediving_hu8563758355214543345.jpg 720w ,https://yanirseroussi.com/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/freediving_hu15603219818946967169.jpg 1080w ,https://yanirseroussi.com/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/freediving_hu4491227251759857421.jpg 1500w ,https://yanirseroussi.com/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/freediving.jpg 1920w" sizes="(min-width: 768px) 720px, 100vw" src=https://yanirseroussi.com/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/freediving.jpg alt width=1920 height=672></figure><div class=post-content><p class=intro-note>Background: I have previously written about <a href=https://yanirseroussi.com/2015/12/08/this-holiday-season-give-me-real-insights/>the need for real insights that address the why behind events, not only the what and how</a>. This was followed by a <a href=https://yanirseroussi.com/2016/02/14/why-you-should-stop-worrying-about-deep-learning-and-deepen-your-understanding-of-causality-instead/>fairly popular post on causality</a>, which was heavily influenced by Samantha Kleinberg's book <a href=http://www.skleinberg.org/why/ target=_blank rel=noopener>Why: A Guide to Finding and Using Causes</a>. This post continues my exploration of the field, and is primarily based on Kleinberg's previous book: <a href=http://www.skleinberg.org/causality_book/index.html target=_blank rel=noopener>Causality, Probability, and Time</a>.</p><p>The study of causality and causal inference is central to science in general and data science in particular. Being able to distinguish between correlation and causation is key to designing effective interventions in business, public policy, medicine, and many other fields. There are quite a few approaches to inferring causal relationships from data. In this post, I discuss some aspects of <a href=https://en.wikipedia.org/wiki/Judea_Pearl target=_blank rel=noopener>Judea Pearl&rsquo;s</a> graphical modelling approach, and how its limitations are addressed in recent work by <a href=http://www.skleinberg.org/ target=_blank rel=noopener>Samantha Kleinberg</a>. I then finish with a brief survey of the <a href=https://en.wikipedia.org/wiki/Bradford_Hill_criteria target=_blank rel=noopener>Bradford Hill criteria</a> and their applicability to a key limitation of all causal inference methods: The need for untested assumptions.</p><h2 id=hahahugoshortcode42s0hbhb-overcoming-my-pearl-bias><figure class=float-right><a href=judea-pearl.jpg target=_blank rel=noopener><img sizes="(min-width: 768px) 435px,
 100vw" srcset="https://yanirseroussi.com/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/judea-pearl_hu1791182969769687454.jpg 360w,
-https://yanirseroussi.com/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/judea-pearl.jpg 435w," src=https://yanirseroussi.com/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/judea-pearl.jpg alt="Judea Pearl" width=150 loading=lazy></a><figcaption><p>Judea Pearl</p></figcaption></figure>Overcoming my Pearl bias</h2><p>First, I must disclose that I have a personal bias in favour of Pearl&rsquo;s work. While I&rsquo;ve never met him, Pearl is my academic grandfather – he was the PhD advisor of my main PhD supervisor (Ingrid Zukerman). My first serious exposure to his work was through a Sydney reading group, where we discussed parts of Pearl&rsquo;s approach to causal inference. Recently, I refreshed my knowledge of Pearl causality by reading <a href=http://ftp.cs.ucla.edu/pub/stat_ser/r350.pdf target=_blank rel=noopener>Causal inference in statistics: An overview</a>. I am by no means an expert in Pearl&rsquo;s huge body of work, but I think I understand enough of it to write something of use.</p><p>Pearl&rsquo;s theory of causality employs Bayesian networks to represent causal structures. These are directed acyclic graphs, where each vertex represents a variable, and an edge from X to Y implies that X causes Y. Pearl also introduces the <code>do(X)</code> operator, which simulates interventions by removing all the causes of X, setting it to a constant. There is much more to this theory, but two of its main contributions are the formalisation of causal concepts that are often given only a verbal treatment, and the explicit encoding of causal assumptions. These assumptions must be made by the modeller based on background knowledge, and are encoded in the graph&rsquo;s structure – a missing edge between two vertices indicates that there is no direct causal relationship between the two variables.</p><p>My main issue with Pearl&rsquo;s treatment of causality is that he doesn&rsquo;t explicitly handle time. While time can be encoded into Pearl&rsquo;s models (e.g., via dynamic Bayesian networks), there is nothing that prevents creation of models where the future causes changes in the past. A closely-related issue is that Pearl&rsquo;s causal models must be directed <em>acyclic</em> graphs, making it hard to model feedback loops. For example, Pearl says that &ldquo;mud does not cause rain&rdquo;, but this isn&rsquo;t true – water from mud evaporates, causing rain (which causes mud). What&rsquo;s true is that &ldquo;mud now doesn&rsquo;t cause rain now&rdquo; or something along these lines, which is something that must be accounted for by adding temporal information to the models.</p><p>Nonetheless, Pearl&rsquo;s theory is an important step forward in the study of causality. In his words, &ldquo;<em>in the bulk of the statistical literature before 2000, causal claims rarely appear in the mathematics. They surface only in the verbal interpretation that investigators occasionally attach to certain associations, and in the verbal description with which investigators justify assumptions.</em>&rdquo; The importance of formal causal analysis cannot be overstated, as it underlies many decisions that affect our lives. However, it seems to me like there&rsquo;s still plenty of work to be done before causal analysis becomes as established as other statistical tools.</p><h2 id=hahahugoshortcode41s1hbhb-kleinberg-addressing-gaps-in-pearls-work><figure class=float-right><a href=samantha-kleinberg.jpg target=_blank rel=noopener><img sizes="(min-width: 768px) 586px,
+https://yanirseroussi.com/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/judea-pearl.jpg 435w," src=https://yanirseroussi.com/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/judea-pearl.jpg alt="Judea Pearl" width=150 loading=lazy></a><figcaption><p>Judea Pearl</p></figcaption></figure>Overcoming my Pearl bias</h2><p>First, I must disclose that I have a personal bias in favour of Pearl&rsquo;s work. While I&rsquo;ve never met him, Pearl is my academic grandfather – he was the PhD advisor of my main PhD supervisor (Ingrid Zukerman). My first serious exposure to his work was through a Sydney reading group, where we discussed parts of Pearl&rsquo;s approach to causal inference. Recently, I refreshed my knowledge of Pearl causality by reading <a href=http://ftp.cs.ucla.edu/pub/stat_ser/r350.pdf target=_blank rel=noopener>Causal inference in statistics: An overview</a>. I am by no means an expert in Pearl&rsquo;s huge body of work, but I think I understand enough of it to write something of use.</p><p>Pearl&rsquo;s theory of causality employs Bayesian networks to represent causal structures. These are directed acyclic graphs, where each vertex represents a variable, and an edge from X to Y implies that X causes Y. Pearl also introduces the <code>do(X)</code> operator, which simulates interventions by removing all the causes of X, setting it to a constant. There is much more to this theory, but two of its main contributions are the formalisation of causal concepts that are often given only a verbal treatment, and the explicit encoding of causal assumptions. These assumptions must be made by the modeller based on background knowledge, and are encoded in the graph&rsquo;s structure – a missing edge between two vertices indicates that there is no direct causal relationship between the two variables.</p><p>My main issue with Pearl&rsquo;s treatment of causality is that he doesn&rsquo;t explicitly handle time. While time can be encoded into Pearl&rsquo;s models (e.g., via dynamic Bayesian networks), there is nothing that prevents creation of models where the future causes changes in the past. A closely-related issue is that Pearl&rsquo;s causal models must be directed <em>acyclic</em> graphs, making it hard to model feedback loops. For example, Pearl says that &ldquo;mud does not cause rain&rdquo;, but this isn&rsquo;t true – water from mud evaporates, causing rain (which causes mud). What&rsquo;s true is that &ldquo;mud now doesn&rsquo;t cause rain now&rdquo; or something along these lines, which is something that must be accounted for by adding temporal information to the models.</p><p>Nonetheless, Pearl&rsquo;s theory is an important step forward in the study of causality. In his words, &ldquo;<em>in the bulk of the statistical literature before 2000, causal claims rarely appear in the mathematics. They surface only in the verbal interpretation that investigators occasionally attach to certain associations, and in the verbal description with which investigators justify assumptions.</em>&rdquo; The importance of formal causal analysis cannot be overstated, as it underlies many decisions that affect our lives. However, it seems to me like there&rsquo;s still plenty of work to be done before causal analysis becomes as established as other statistical tools.</p><h2 id=hahahugoshortcode42s1hbhb-kleinberg-addressing-gaps-in-pearls-work><figure class=float-right><a href=samantha-kleinberg.jpg target=_blank rel=noopener><img sizes="(min-width: 768px) 586px,
 100vw" srcset="https://yanirseroussi.com/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/samantha-kleinberg_hu12536952542846574072.jpg 360w,
 https://yanirseroussi.com/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/samantha-kleinberg_hu16107200547359659034.jpg 480w,
 https://yanirseroussi.com/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/samantha-kleinberg.jpg 586w," src=https://yanirseroussi.com/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/samantha-kleinberg.jpg alt="Samantha Kleinberg" width=150 loading=lazy></a><figcaption><p>Samantha Kleinberg</p></figcaption></figure>Kleinberg: Addressing gaps in Pearl&rsquo;s work</h2><p>I recently finished reading Samantha Kleinberg&rsquo;s <a href=http://www.skleinberg.org/causality_book/index.html target=_blank rel=noopener>Causality, Probability, and Time</a>. Kleinberg dedicates a good portion of the book to presenting the history of causality and discussing its many definitions. As hinted by the book&rsquo;s title, Kleinberg believes that one cannot discuss causality without considering time. In her words: &ldquo;<em>One of the most critical pieces of information about causality, though – the time it takes for the cause to produce its effect – has been largely ignored by both philosophical theories and computational methods. If we do not know when the effect will occur, we have little hope of being able to act successfully using the causal relationship.</em>&rdquo; Following this assertion, Kleinberg presents a new approach to causal inference that is based on <a href=https://en.wikipedia.org/wiki/Probabilistic_CTL target=_blank rel=noopener>probabilistic computation tree logic (PCTL)</a>. With PCTL, one can concisely express probabilistic temporal statements. For example, if we observe a potential cause <em>c</em> occurring at time <em>t</em>, and a possible effect <em>e</em> occurring at time <em>t&rsquo;</em>, we can use PCTL to state the hypothesis that in general, after <em>c</em> becomes true, it takes between one and <em>|t&rsquo; – t|</em> time units for <em>e</em> to become true with probability at least <em>p</em>, i.e., <em>c</em> leads to <em>e</em>:</p><figure><a href=pctl-cause-leads-to-effect.png target=_blank rel=noopener><img sizes="(min-width: 768px) 175px,
-100vw" srcset="https://yanirseroussi.com/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/pctl-cause-leads-to-effect.png 175w," src=https://yanirseroussi.com/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/pctl-cause-leads-to-effect.png alt="PCTL cause leads to effect" loading=lazy></a></figure><p>It is obvious why PCTL may be a better fit than Bayesian networks for expressing causal statements. For example, with a Bayesian network, we can easily express the statement that smoking causes lung cancer with probability 0.3, but this isn&rsquo;t that useful, as it doesn&rsquo;t tell us how long it&rsquo;ll take for cancer to develop. With PCTL, we can state that smoking causes lung cancer in 5-30 years with probability at least 0.3. This matches our knowledge that cancer doesn&rsquo;t develop immediately – one cigarette won&rsquo;t kill you.</p><p>One of the key concepts introduced by Kleinberg is that of <strong>causal significance</strong>. Calculating the causal significance of a cause <em>c</em> to an effect <em>e</em> relies on first identifying the set <em>X</em> of <em>potential</em> (or <em>prima facie</em>) causes of <em>e</em>. The set <em>X</em> contains all discrete variables <em>x</em> such that <em>E[e|x]≠E[e]</em> and <em>x</em> occurs earlier than <em>e</em>. Given the set <em>X</em>, the causal significance of <em>c</em> to <em>e</em> is the mean of <em>E[e|c∧x] – E[e|¬c∧x]</em> for all <em>x≠c</em>. The intuition is that if a cause <em>c</em> is significant, its causal significance value will be high when other potential causes are held fixed. For example, if <em>c</em> is heavy smoking and <em>e</em> is severity of lung cancer (with <em>e=0</em> meaning no cancer), the expected value of <em>e</em> given <em>c</em> is likely to be higher than the expected value of <em>e</em> given <em>¬c</em>, when conditioned on any other potential cause. Once causal significance has been measured, we can separate significant causes from insignificant causes by setting a threshold on causal significance values (this threshold can be inferred from the data). Significant causes are considered to be genuine if the data is stationary and the common causes of all pairs of variables have been included, which is a very strong condition that may be hard to fulfil in realistic scenarios. However, causal significance is an evolving concept – last year, Huang and Kleinberg <a href=http://www.skleinberg.org/papers/huang_flairs15.pdf target=_blank rel=noopener>introduced a new definition of causal significance that can be inferred faster and yield more accurate results</a>. My general feeling is that this line of research will continue to yield many interesting and useful results in coming years.</p><p>Kleinberg&rsquo;s work is not without its limitations. In addition to the assumptions that causal relationships are stationary and the requirement to identify all potential causes, the recently-introduced definition of causal significance also requires the relationships to be linear and additive (though this limitation may be relaxed in future work). Another issue is that most of the evaluation in the studies I&rsquo;ve read was done on synthetic datasets. While there are some results on real-life health and finance data, I find it hard to judge the practicality of utilising Kleinberg&rsquo;s methods without applying them to problems that I&rsquo;m more familiar with. Finally, as with other work in the field of causal inference, we need to have some degree of belief in untested assumptions to reach useful conclusions. In Kleinberg&rsquo;s words:</p><blockquote><p>Thus, a just so cause <em>is</em> genuine in the case where all of the outlined assumptions hold (namely that all common causes are included, the structure is representative of the system and, when data is used, a formula satisfied by the data will be satisfied by the structure). Our <em>belief</em> in whether a cause is genuine, in the case where it is not certain that the assumptions hold, should be proportional to how much we believe that the assumptions are true.</p></blockquote><h2 id=hahahugoshortcode41s3hbhb-hill-testing-untested-assumptions><figure class=float-right><a href=austin-bradford-hill.jpg target=_blank rel=noopener><img sizes="(min-width: 768px) 720px,
+100vw" srcset="https://yanirseroussi.com/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/pctl-cause-leads-to-effect.png 175w," src=https://yanirseroussi.com/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/pctl-cause-leads-to-effect.png alt="PCTL cause leads to effect" loading=lazy></a></figure><p>It is obvious why PCTL may be a better fit than Bayesian networks for expressing causal statements. For example, with a Bayesian network, we can easily express the statement that smoking causes lung cancer with probability 0.3, but this isn&rsquo;t that useful, as it doesn&rsquo;t tell us how long it&rsquo;ll take for cancer to develop. With PCTL, we can state that smoking causes lung cancer in 5-30 years with probability at least 0.3. This matches our knowledge that cancer doesn&rsquo;t develop immediately – one cigarette won&rsquo;t kill you.</p><p>One of the key concepts introduced by Kleinberg is that of <strong>causal significance</strong>. Calculating the causal significance of a cause <em>c</em> to an effect <em>e</em> relies on first identifying the set <em>X</em> of <em>potential</em> (or <em>prima facie</em>) causes of <em>e</em>. The set <em>X</em> contains all discrete variables <em>x</em> such that <em>E[e|x]≠E[e]</em> and <em>x</em> occurs earlier than <em>e</em>. Given the set <em>X</em>, the causal significance of <em>c</em> to <em>e</em> is the mean of <em>E[e|c∧x] – E[e|¬c∧x]</em> for all <em>x≠c</em>. The intuition is that if a cause <em>c</em> is significant, its causal significance value will be high when other potential causes are held fixed. For example, if <em>c</em> is heavy smoking and <em>e</em> is severity of lung cancer (with <em>e=0</em> meaning no cancer), the expected value of <em>e</em> given <em>c</em> is likely to be higher than the expected value of <em>e</em> given <em>¬c</em>, when conditioned on any other potential cause. Once causal significance has been measured, we can separate significant causes from insignificant causes by setting a threshold on causal significance values (this threshold can be inferred from the data). Significant causes are considered to be genuine if the data is stationary and the common causes of all pairs of variables have been included, which is a very strong condition that may be hard to fulfil in realistic scenarios. However, causal significance is an evolving concept – last year, Huang and Kleinberg <a href=http://www.skleinberg.org/papers/huang_flairs15.pdf target=_blank rel=noopener>introduced a new definition of causal significance that can be inferred faster and yield more accurate results</a>. My general feeling is that this line of research will continue to yield many interesting and useful results in coming years.</p><p>Kleinberg&rsquo;s work is not without its limitations. In addition to the assumptions that causal relationships are stationary and the requirement to identify all potential causes, the recently-introduced definition of causal significance also requires the relationships to be linear and additive (though this limitation may be relaxed in future work). Another issue is that most of the evaluation in the studies I&rsquo;ve read was done on synthetic datasets. While there are some results on real-life health and finance data, I find it hard to judge the practicality of utilising Kleinberg&rsquo;s methods without applying them to problems that I&rsquo;m more familiar with. Finally, as with other work in the field of causal inference, we need to have some degree of belief in untested assumptions to reach useful conclusions. In Kleinberg&rsquo;s words:</p><blockquote><p>Thus, a just so cause <em>is</em> genuine in the case where all of the outlined assumptions hold (namely that all common causes are included, the structure is representative of the system and, when data is used, a formula satisfied by the data will be satisfied by the structure). Our <em>belief</em> in whether a cause is genuine, in the case where it is not certain that the assumptions hold, should be proportional to how much we believe that the assumptions are true.</p></blockquote><h2 id=hahahugoshortcode42s3hbhb-hill-testing-untested-assumptions><figure class=float-right><a href=austin-bradford-hill.jpg target=_blank rel=noopener><img sizes="(min-width: 768px) 720px,
 100vw" srcset="https://yanirseroussi.com/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/austin-bradford-hill_hu1650655389902571844.jpg 360w,
 https://yanirseroussi.com/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/austin-bradford-hill_hu15238840695674314937.jpg 480w,
 https://yanirseroussi.com/2016/05/15/diving-deeper-into-causality-pearl-kleinberg-hill-and-untested-assumptions/austin-bradford-hill_hu12469998022001698919.jpg 720w,
diff --git a/talks/first-data-hire/index.html b/talks/first-data-hire/index.html
index 98b65630f..8a7f614c8 100644
--- a/talks/first-data-hire/index.html
+++ b/talks/first-data-hire/index.html
@@ -355,7 +355,7 @@ <h2>Relevant, informative questions</h2>
             <li>Past work: Deep on what they've done and why; current motivations especially relevant around why they want the role</li>
             <li>Clarity of non-tech communication is especially important for first data hire (interfaces with rest of company)</li>
             <li>Simple questions can be tested by non-techies (Google recruiter screen as example)</li>
-            <li>Take-home task should be tightly time-boxed to reduce chance of cheating (CBA example)</li>
+            <li>Take-home task: tightly time-boxed to reduce chance of cheating (CBA example) OR completely live (let candidate choose)</li>
             <li>Digging live into take-home work ensures they understand what they did (especially important with Chatty)</li>
             <li>Paid trials & contract-to-hire: Depends on candidates; possible with a large qualified pool</li>
           </ul>
@@ -407,7 +407,7 @@ <h2>Recap: Key takeaways</h2>
       <h2>Story time?</h2>
       <ul>
         <li class="fragment fade-in-then-semi-out">
-          Car Next Door (now Uber Carshare):
+          Car Next Door (Uber Carshare):
           <ul>
             <li>"Head of Data Science"</li>
             <li>Plenty of engineering and analytics</li>