DoWhy: An Introductory Tutorial on Causal Analysis

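Introduction

DoWhy is an open-source Python library for causal inference, originally released by Microsoft. It is used to estimate the causal effect of an intervention (a treatment) on an outcome.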

Quick Start

Installation

pip install dowhy

Basic Usage

import dowhy
from dowhy import CausalModel
import pandas as pd
import numpy as np

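# Generate example data
np.random.seed(42)
n = 1000

# x is a confounder: it drives both the treatment and the outcome
x = np.random.normal(0, 1, n)
# Note: treatment is a deterministic function of x, so treated and control
# units never share an x value (no overlap). Linear regression still recovers
# the effect here only because the outcome really is linear in x.
treatment = (x > 0).astype(int)
# The true causal effect of treatment on y is 0.5
y = 2 * x + 0.5 * treatment + np.random.normal(0, 0.1, n)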

data = pd.DataFrame({
    'x': x,
    'treatment': treatment,
    'y': y,
})

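# Create the causal model
# The graph is passed as a DOT string; a bare edge list without the
# "digraph { ... }" wrapper is not valid DOT. Depending on the DoWhy version,
# parsing DOT strings may require pygraphviz or pydot to be installed.
model = CausalModel(
    data=data,
    treatment='treatment',
    outcome='y',
    graph='digraph { treatment -> y; x -> treatment; x -> y; }'
)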

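# Identify the causal effect (derive the estimand from the graph)
identified_estimand = model.identify_effect()

# Estimate the causal effect using backdoor adjustment with linear regression
estimate = model.estimate_effect(
    identified_estimand,
    method_name='backdoor.linear_regression'
)

# Should be close to the true effect of 0.5 used to generate the data
print(f"Estimated causal effect: {estimate.value}")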

The Four-Step Workflow

DoWhy follows a standardized four-step workflow for causal analysis:

1. Model

Define the causal graph and the assumptions it encodes: which variables cause which, and which do not.
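To make the modeling step concrete, here is a minimal pure-Python sketch of the graph from the basic usage example. It does not use DoWhy; the edge list and the helper names (`parents`, `ancestors`, `is_acyclic`) are hypothetical and exist only for illustration. It checks that the graph is a DAG and lists the nodes that are ancestors of both the treatment and the outcome, which in this small graph is the confounder x.

```python
# Hypothetical edge list mirroring the tutorial's graph:
# x -> treatment, x -> y, treatment -> y
edges = [("x", "treatment"), ("x", "y"), ("treatment", "y")]


def parents(node, edges):
    """Direct causes of `node`."""
    return {src for src, dst in edges if dst == node}


def ancestors(node, edges):
    """All nodes that have a directed path into `node`."""
    seen = set()
    stack = list(parents(node, edges))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents(current, edges))
    return seen


def is_acyclic(edges):
    """A graph is acyclic if no node is its own ancestor."""
    nodes = {n for edge in edges for n in edge}
    return all(node not in ancestors(node, edges) for node in nodes)


# Nodes that are ancestors of both the treatment and the outcome
common_causes = ancestors("treatment", edges) & ancestors("y", edges)

print("Is a DAG:", is_acyclic(edges))
print("Common causes of treatment and y:", common_causes)
```

Writing the graph down this way makes the assumptions explicit: every missing edge is a claim that one variable has no direct effect on another.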

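2. Identify

Given the graph, determine whether the target causal effect can be expressed in terms of observed data, and derive the expression to be estimated (the estimand).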
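For the graph in the basic usage example, the only backdoor path from treatment to y runs through x, so conditioning on x identifies the effect. The block below is a sketch of the resulting backdoor adjustment estimand, assuming x is the only confounder and writing it as discrete for notational simplicity (for a continuous x the sum becomes an integral). The symbols T, Y, and X stand for treatment, y, and x.

```latex
% Backdoor adjustment: the interventional distribution in terms of observed quantities
P(Y = y \mid do(T = t)) = \sum_{x} P(Y = y \mid T = t, X = x)\, P(X = x)

% Average treatment effect for a binary treatment
\mathrm{ATE} = E[Y \mid do(T = 1)] - E[Y \mid do(T = 0)]
             = E_{X}\big[\, E[Y \mid T = 1, X] - E[Y \mid T = 0, X] \,\big]
```

The right-hand sides contain no do-operator, which is what makes the effect estimable from observational data under these assumptions.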

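3. Estimate

Compute a numerical value for the identified estimand using a statistical method, such as linear regression.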
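The following sketch illustrates why adjustment matters at the estimation step. It is a hand-rolled NumPy analog of regression-based backdoor adjustment, not DoWhy's implementation; the data-generating process copies the basic usage example (true effect 0.5), and the variable names `naive` and `adjusted` are chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1000

# Same data-generating process as the tutorial: x confounds treatment and y
x = rng.normal(0, 1, n)
treatment = (x > 0).astype(float)
y = 2 * x + 0.5 * treatment + rng.normal(0, 0.1, n)

# Naive estimate: difference in mean outcome, ignoring the confounder x
naive = y[treatment == 1].mean() - y[treatment == 0].mean()

# Adjusted estimate: ordinary least squares of y on [1, treatment, x];
# the coefficient on treatment is the effect after controlling for x
design = np.column_stack([np.ones(n), treatment, x])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
adjusted = coef[1]

print(f"Naive difference in means: {naive:.3f}")
print(f"Regression-adjusted effect: {adjusted:.3f}")
```

The naive difference is far larger than 0.5 because treated units also have higher x, while the adjusted coefficient lands close to the true effect.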

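4. Refute

Test the robustness of the estimate by checking whether it holds up when the underlying assumptions are challenged.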
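One common robustness check is a placebo treatment test: replace the real treatment with a randomly permuted copy and re-estimate. If the estimation procedure is sound, the placebo effect should be close to zero. The sketch below is a simplified NumPy version of that idea under the same assumed data-generating process as the basic usage example; the helper `adjusted_effect` is hypothetical and is not part of DoWhy.

```python
import numpy as np


def adjusted_effect(t, x, y):
    """Coefficient on t from an OLS fit of y on [1, t, x]."""
    design = np.column_stack([np.ones(len(y)), t, x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]


rng = np.random.default_rng(42)
n = 1000

# Same data-generating process as the tutorial (true effect is 0.5)
x = rng.normal(0, 1, n)
treatment = (x > 0).astype(float)
y = 2 * x + 0.5 * treatment + rng.normal(0, 0.1, n)

real_effect = adjusted_effect(treatment, x, y)

# Placebo: shuffle the treatment so it is unrelated to both x and y
placebo_treatment = rng.permutation(treatment)
placebo_effect = adjusted_effect(placebo_treatment, x, y)

print(f"Effect with real treatment:    {real_effect:.3f}")
print(f"Effect with placebo treatment: {placebo_effect:.3f}")
```

In DoWhy itself, the corresponding check is run with `model.refute_estimate(identified_estimand, estimate, method_name="placebo_treatment_refuter")`, which reports the original estimate alongside the estimate obtained under the placebo.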

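Use Cases

Causal inference in observational studies
Evaluating the effect of A/B tests
Policy impact analysis
Measuring the effect of marketing campaigns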

Next Steps

Explore DoWhy's other estimation methods and advanced features.