causalml package¶

Submodules¶

causalml.inference.tree module¶

class causalml.inference.tree.CausalMSE¶

Bases: sklearn.tree._criterion.RegressionCriterion

Causal Tree mean squared error impurity criterion.

CausalTreeMSE = right_effect + left_effect

where,

effect = alpha * tau^2 - (1 - alpha) * (1 + train_to_est_ratio) * (VAR_tr / p + VAR_cont / (1 - p))
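
As a rough illustration of the formula above, the per-node term can be evaluated in plain Python. This is a sketch, not the library's criterion code: the function name `causal_effect_term`, the argument names, and the default values for `alpha` and `train_to_est_ratio` are assumptions made for this example, and `p` is assumed to be the treated share in the node.

```python
def causal_effect_term(tau, var_tr, var_cont, p, alpha=0.5, train_to_est_ratio=1.0):
    """Evaluate alpha * tau^2 - (1 - alpha) * (1 + ratio) * (VAR_tr / p + VAR_cont / (1 - p))."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    variance_penalty = var_tr / p + var_cont / (1.0 - p)
    return alpha * tau ** 2 - (1.0 - alpha) * (1.0 + train_to_est_ratio) * variance_penalty


# CausalTreeMSE for a candidate split is the sum of the two child terms.
left = causal_effect_term(tau=2.0, var_tr=1.0, var_cont=1.0, p=0.5)
right = causal_effect_term(tau=0.0, var_tr=1.0, var_cont=1.0, p=0.5)
causal_tree_mse = right + left
```

A larger squared effect raises the term, while noisy or unbalanced treatment and control groups lower it through the variance penalty.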

class causalml.inference.tree.CausalTreeRegressor(ate_alpha=0.05, control_name=0, max_depth=None, min_samples_leaf=100, random_state=None)¶

Bases: object

A Causal Tree regressor class.

The Causal Tree is a decision tree regressor with a split criterion based on treatment effects instead of outputs.

Details are available at Athey and Imbens (2015) (https://arxiv.org/abs/1504.01132)

bootstrap(X, treatment, y, size=10000)¶

Runs a single bootstrap.

Fits on bootstrapped sample, then predicts on whole population.

Parameters

X (np.matrix) – a feature matrix
treatment (np.array) – a treatment vector
y (np.array) – an outcome vector
size (int, optional) – bootstrap sample size

Returns

bootstrap predictions

Return type

(np.array)
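
The "fit on a bootstrapped sample, then predict on the whole population" step can be sketched as below. This is a minimal stand-alone illustration, not the method's actual code: `single_bootstrap`, `fit_fn`, and the toy `diff_in_means` estimator are hypothetical names introduced here, and plain lists stand in for NumPy arrays.

```python
import random


def single_bootstrap(fit_fn, X, treatment, y, size, seed=None):
    """Fit on a resample drawn with replacement, then predict for every row of X."""
    rng = random.Random(seed)
    idx = [rng.randrange(len(X)) for _ in range(size)]
    predict = fit_fn([X[i] for i in idx],
                     [treatment[i] for i in idx],
                     [y[i] for i in idx])
    return [predict(row) for row in X]


def diff_in_means(X_b, t_b, y_b):
    """Toy estimator: a constant effect equal to the difference in group means."""
    treated = [v for v, t in zip(y_b, t_b) if t == 1]
    control = [v for v, t in zip(y_b, t_b) if t == 0]
    effect = sum(treated) / len(treated) - sum(control) / len(control)
    return lambda row: effect


X = [[float(i)] for i in range(8)]
treatment = [0, 1] * 4
y = [2.0 * t for t in treatment]          # every treated unit gains exactly 2.0
preds = single_bootstrap(diff_in_means, X, treatment, y, size=50, seed=0)
```

Repeating this call many times and taking percentiles of the per-row predictions is one common way to form bootstrap confidence intervals.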

estimate_ate(X, treatment, y)¶

Estimate the Average Treatment Effect (ATE).

Parameters

X (np.matrix) – a feature matrix
treatment (np.array) – a treatment vector
y (np.array) – an outcome vector

Returns

The mean and confidence interval (LB, UB) of the ATE estimate.

fit(X, treatment, y)¶

Fit the Causal Tree model

Parameters

X (np.matrix) – a feature matrix
treatment (np.array) – a treatment vector
y (np.array) – an outcome vector

Returns

self (CausalTree object)

fit_predict(X, treatment, y, return_ci=False, n_bootstraps=1000, bootstrap_size=10000, verbose=False)¶

Fit the Causal Tree model and predict treatment effects.

Parameters

X (np.matrix) – a feature matrix
treatment (np.array) – a treatment vector
y (np.array) – an outcome vector
return_ci (bool) – whether to return confidence intervals
n_bootstraps (int) – number of bootstrap iterations
bootstrap_size (int) – number of samples per bootstrap
verbose (bool) – whether to output progress logs

Returns

te (numpy.ndarray): Predictions of treatment effects.
te_lower (numpy.ndarray, optional): lower bounds of treatment effects
te_upper (numpy.ndarray, optional): upper bounds of treatment effects

Return type

(tuple)

predict(X)¶

Predict treatment effects.

Parameters: X (np.matrix) – a feature matrix
Returns: Predictions of treatment effects.
Return type: (numpy.ndarray)

class causalml.inference.tree.DecisionTree(classes_, col=- 1, value=None, trueBranch=None, falseBranch=None, results=None, summary=None, maxDiffTreatment=None, maxDiffSign=1.0, nodeSummary=None, backupResults=None, bestTreatment=None, upliftScore=None, matchScore=None)¶

Bases: object

Tree Node Class

Tree node class to contain all the statistics of the tree node.

Parameters

classes_ (list of str) – A list of the control and treatment group names.
col (int, optional (default = -1)) – The column index for splitting the tree node to children nodes.
value (float, optional (default = None)) – The value of the feature column to split the tree node to children nodes.
trueBranch (object of DecisionTree) – The true branch tree node (feature > value).
falseBranch (object of DecisionTree) – The false branch tree node (the complement of the true branch condition).
results (list of float) – The classification probability P(Y=1|T) for each of the control and treatment groups in the tree node.
summary (list of list) – Summary statistics of the tree nodes, including impurity, sample size, uplift score, etc.
maxDiffTreatment (int) – The treatment index generating the maximum difference between the treatment and control groups.
maxDiffSign (float) – The sign of the maximum difference (1. or -1.).
nodeSummary (list of list) – Summary statistics of the tree node, [P(Y=1|T), N(T)], where P(Y=1|T) is the target metric mean and N(T) is the sample size of each group.
backupResults (list of float) – The positive probabilities in each of the control and treatment groups in the parent node. The parent node information serves as a backup for the child node: when no valid statistics can be calculated from the child node, the parent node information is used instead in certain cases.
bestTreatment (int) – The treatment index providing the best uplift (treatment effect).
upliftScore (list) – The uplift score of this node: [max_Diff, p_value], where max_Diff stands for the maximum treatment effect, and p_value stands for the p_value of the treatment effect.
matchScore (float) – The uplift score by filling a trained tree with validation dataset or testing dataset.

class causalml.inference.tree.UpliftRandomForestClassifier(control_name, n_estimators=10, max_features=10, random_state=None, max_depth=5, min_samples_leaf=100, min_samples_treatment=10, n_reg=10, evaluationFunction='KL', normalization=True, n_jobs=- 1)¶

Bases: object

Uplift Random Forest for Classification Task.

Parameters

n_estimators (integer, optional (default=10)) – The number of trees in the uplift random forest.
evaluationFunction (string) – Choose from one of the models: ‘KL’, ‘ED’, ‘Chi’, ‘CTS’, ‘DDP’.
max_features (int, optional (default=10)) – The number of features to consider when looking for the best split.
random_state (int, RandomState instance or None (default=None)) – A random seed or np.random.RandomState to control randomness in building the trees and forest.
max_depth (int, optional (default=5)) – The maximum depth of the tree.
min_samples_leaf (int, optional (default=100)) – The minimum number of samples required to be split at a leaf node.
min_samples_treatment (int, optional (default=10)) – The minimum number of samples required of the experiment group to be split at a leaf node.
n_reg (int, optional (default=10)) – The regularization parameter defined in Rzepakowski et al. 2012, the weight (in terms of sample size) of the parent node influence on the child node, only effective for ‘KL’, ‘ED’, ‘Chi’, ‘CTS’ methods.
control_name (string) – The name of the control group (other experiment groups will be regarded as treatment groups)
normalization (boolean, optional (default=True)) – The normalization factor defined in Rzepakowski et al. 2012, correcting for tests with large number of splits and imbalanced treatment and control splits
n_jobs (int, optional (default=-1)) – The parallelization parameter to define how many parallel jobs need to be created. This is passed on to joblib library for parallelizing uplift-tree creation.

Outputs

df_res (pandas dataframe) – A user-level results dataframe containing the estimated individual treatment effect.

static bootstrap(X, treatment, y, tree)¶

fit(X, treatment, y)¶

Fit the UpliftRandomForestClassifier.

Parameters

X (ndarray, shape = [num_samples, num_features]) – An ndarray of the covariates used to train the uplift model.
treatment (array-like, shape = [num_samples]) – An array containing the treatment group for each unit.
y (array-like, shape = [num_samples]) – An array containing the outcome of interest for each unit.

predict(X, full_output=False)¶

Returns the recommended treatment group and predicted optimal probability conditional on using the recommended treatment group.

Parameters

X (ndarray, shape = [num_samples, num_features]) – An ndarray of the covariates used to train the uplift model.
full_output (bool, optional (default=False)) – Whether the UpliftTree algorithm returns upliftScores, pred_nodes alongside the recommended treatment group and p_hat in the treatment group.

Returns

y_pred_list (ndarray, shape = [num_samples, num_treatments]) – An ndarray containing the predicted treatment effect of each treatment group for each sample.
df_res (DataFrame, shape = [num_samples, (num_treatments * 2 + 3)]) – If full_output is True, a DataFrame containing the predicted outcome of each treatment and control group, the treatment effect of each treatment group, the treatment group with the highest treatment effect, and the maximum treatment effect for each sample.

class causalml.inference.tree.UpliftTreeClassifier(control_name, max_features=None, max_depth=3, min_samples_leaf=100, min_samples_treatment=10, n_reg=100, evaluationFunction='KL', normalization=True, random_state=None)¶

Bases: object

Uplift Tree Classifier for Classification Task.

An uplift tree classifier estimates the individual treatment effect by modifying the loss function in the classification trees.

The uplift tree classifier is used in uplift random forest to construct the trees in the forest.

Parameters

evaluationFunction (string) – Choose from one of the models: ‘KL’, ‘ED’, ‘Chi’, ‘CTS’, ‘DDP’.
max_features (int, optional (default=None)) – The number of features to consider when looking for the best split.
max_depth (int, optional (default=3)) – The maximum depth of the tree.
min_samples_leaf (int, optional (default=100)) – The minimum number of samples required to be split at a leaf node.
min_samples_treatment (int, optional (default=10)) – The minimum number of samples required of the experiment group to be split at a leaf node.
n_reg (int, optional (default=100)) – The regularization parameter defined in Rzepakowski et al. 2012, the weight (in terms of sample size) of the parent node influence on the child node, only effective for ‘KL’, ‘ED’, ‘Chi’, ‘CTS’ methods.
control_name (string) – The name of the control group (other experiment groups will be regarded as treatment groups).
normalization (boolean, optional (default=True)) – The normalization factor defined in Rzepakowski et al. 2012, correcting for tests with large number of splits and imbalanced treatment and control splits.
random_state (int, RandomState instance or None (default=None)) – A random seed or np.random.RandomState to control randomness in building a tree.

static classify(observations, tree, dataMissing=False)¶

Classifies (predicts) the observations according to the tree.

Parameters

observations (list of list) – The internal data format for the training data (combining X, Y, treatment).
dataMissing (boolean, optional (default = False)) – An indicator for if data are missing or not.

Returns

The results in the leaf node.

Return type

tree.results, tree.upliftScore

static divideSet(X, treatment_idx, y, column, value)¶

Tree node split.

Parameters

X (ndarray, shape = [num_samples, num_features]) – An ndarray of the covariates used to train the uplift model.
treatment_idx (array-like, shape = [num_samples]) – An array containing the treatment group index for each unit.
y (array-like, shape = [num_samples]) – An array containing the outcome of interest for each unit.
column (int) – The column used to split the data.
value (float or int) – The value in the column for splitting the data.

Returns

(X_l, X_r, treatment_l, treatment_r, y_l, y_r) – The covariates, treatments and outcomes of left node and the right node.

Return type

list of ndarray
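
A node split of this kind can be sketched with plain lists. The comparison rules used here (`>=` for numeric values, equality otherwise) are an assumption made for the example and are not stated in this reference; `divide_set` is a hypothetical stand-in, not the library's `divideSet`.

```python
def divide_set(X, treatment_idx, y, column, value):
    """Split rows into a left part (condition true) and a right part (condition false)."""
    if isinstance(value, (int, float)):
        goes_left = [row[column] >= value for row in X]   # assumed numeric rule
    else:
        goes_left = [row[column] == value for row in X]   # assumed categorical rule

    def pick(seq, keep):
        return [item for item, flag in zip(seq, goes_left) if flag == keep]

    return (pick(X, True), pick(X, False),
            pick(treatment_idx, True), pick(treatment_idx, False),
            pick(y, True), pick(y, False))


X_l, X_r, t_l, t_r, y_l, y_r = divide_set(
    [[1.0], [3.0], [5.0]], [0, 1, 1], [0, 1, 0], column=0, value=3.0)
```

The six return values follow the order documented above: covariates, treatments, and outcomes, each as a left/right pair.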

static evaluate_CTS(nodeSummary)¶

Calculate CTS (conditional treatment selection) as split evaluation criterion for a given node.

Parameters: nodeSummary (list of list) – The tree node summary statistics, [P(Y=1|T), N(T)], produced by tree_node_summary() method.
Returns: d_res
Return type: CTS score

static evaluate_Chi(nodeSummary)¶

Calculate Chi-Square statistic as split evaluation criterion for a given node.

Parameters: nodeSummary (dictionary) – The tree node summary statistics, produced by tree_node_summary() method.
Returns: d_res
Return type: Chi-Square

static evaluate_DDP(nodeSummary)¶

Calculate Delta P as split evaluation criterion for a given node.

Parameters: nodeSummary (list of list) – The tree node summary statistics, [P(Y=1|T), N(T)], produced by tree_node_summary() method.
Returns: d_res
Return type: Delta P

static evaluate_ED(nodeSummary)¶

Calculate Euclidean Distance as split evaluation criterion for a given node.

Parameters: nodeSummary (dictionary) – The tree node summary statistics, produced by tree_node_summary() method.
Returns: d_res
Return type: Euclidean Distance

static evaluate_KL(nodeSummary)¶

Calculate KL Divergence as split evaluation criterion for a given node.

Parameters: nodeSummary (list of list) – The tree node summary statistics, [P(Y=1|T), N(T)], produced by tree_node_summary() method.
Returns: d_res
Return type: KL Divergence
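
For a binary outcome, the KL, Euclidean, and Chi-Square criteria reduce to simple expressions in the treatment and control response rates. The sketch below uses the textbook definitions for two Bernoulli distributions; it is not the library's code, the function names are hypothetical, and it assumes the control rate `p_c` lies strictly between 0 and 1.

```python
import math


def kl_binary(p_t, p_c):
    """KL divergence between Bernoulli(p_t) and Bernoulli(p_c)."""
    total = 0.0
    for a, b in ((p_t, p_c), (1.0 - p_t, 1.0 - p_c)):
        if a > 0.0:                      # the term 0 * log(0 / b) is taken as 0
            total += a * math.log(a / b)
    return total


def euclidean_binary(p_t, p_c):
    """Squared Euclidean distance between the two Bernoulli distributions."""
    return (p_t - p_c) ** 2 + ((1.0 - p_t) - (1.0 - p_c)) ** 2


def chi_binary(p_t, p_c):
    """Chi-squared divergence of the treatment distribution from control."""
    return (p_t - p_c) ** 2 / p_c + (p_t - p_c) ** 2 / (1.0 - p_c)
```

All three are zero when the treatment and control rates coincide and grow as the rates move apart, which is what makes them usable as split criteria.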

fill(X, treatment, y)¶

Fill the data into an existing tree. This is a higher-level function to transform the original data inputs into lower level data inputs (list of list and tree).

Parameters

X (ndarray, shape = [num_samples, num_features]) – An ndarray of the covariates used to train the uplift model.
treatment (array-like, shape = [num_samples]) – An array containing the treatment group for each unit.
y (array-like, shape = [num_samples]) – An array containing the outcome of interest for each unit.

Returns

self

Return type

object

fillTree(X, treatment_idx, y, tree)¶

Fill the data into an existing tree. This is a lower-level function to execute on the tree filling task.

Parameters

X (ndarray, shape = [num_samples, num_features]) – An ndarray of the covariates used to train the uplift model.
treatment_idx (array-like, shape = [num_samples]) – An array containing the treatment group index for each unit.
y (array-like, shape = [num_samples]) – An array containing the outcome of interest for each unit.
tree (object) – object of DecisionTree class

Returns

self

Return type

object

fit(X, treatment, y)¶

Fit the uplift model.

Parameters

X (ndarray, shape = [num_samples, num_features]) – An ndarray of the covariates used to train the uplift model.
treatment (array-like, shape = [num_samples]) – An array containing the treatment group for each unit.
y (array-like, shape = [num_samples]) – An array containing the outcome of interest for each unit.

Returns

self

Return type

object

group_uniqueCounts(treatment_idx, y)¶

Count sample size by experiment group.

Parameters

treatment_idx (array-like, shape = [num_samples]) – An array containing the treatment group index for each unit.
y (array-like, shape = [num_samples]) – An array containing the outcome of interest for each unit.

Returns

results – The negative and positive outcome sample sizes for each of the control and treatment groups.

Return type

list of list
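
The counting step can be pictured as follows. This is a sketch under assumptions: group index 0 is taken to be the control group, `n_groups` is a hypothetical extra argument, and `group_unique_counts` is not the library method itself.

```python
def group_unique_counts(treatment_idx, y, n_groups):
    """Return [[n_negative, n_positive], ...] with one entry per group index."""
    counts = [[0, 0] for _ in range(n_groups)]
    for group, outcome in zip(treatment_idx, y):
        counts[group][1 if outcome == 1 else 0] += 1
    return counts


counts = group_unique_counts([0, 0, 1, 1, 1], [0, 1, 1, 1, 0], n_groups=2)
```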

growDecisionTreeFrom(X, treatment_idx, y, max_depth=10, min_samples_leaf=100, depth=1, min_samples_treatment=10, n_reg=100, parentNodeSummary=None)¶

Train the uplift decision tree.

Parameters

X (ndarray, shape = [num_samples, num_features]) – An ndarray of the covariates used to train the uplift model.
treatment_idx (array-like, shape = [num_samples]) – An array containing the treatment group index for each unit.
y (array-like, shape = [num_samples]) – An array containing the outcome of interest for each unit.
max_depth (int, optional (default=10)) – The maximum depth of the tree.
min_samples_leaf (int, optional (default=100)) – The minimum number of samples required to be split at a leaf node.
depth (int, optional (default = 1)) – The current depth.
min_samples_treatment (int, optional (default=10)) – The minimum number of samples required of the experiment group to be split at a leaf node.
n_reg (int, optional (default=100)) – The regularization parameter defined in Rzepakowski et al. 2012, the weight (in terms of sample size) of the parent node influence on the child node, only effective for ‘KL’, ‘ED’, ‘Chi’, ‘CTS’ methods.
parentNodeSummary (dictionary, optional (default = None)) – Node summary statistics of the parent tree node.

Returns

Return type

object of DecisionTree class

normI(n_c: int, n_c_left: int, n_t: list, n_t_left: list, alpha: float = 0.9) → float¶

Normalization factor.

Parameters

currentNodeSummary (list of list) – The summary statistics of the current tree node, [P(Y=1|T), N(T)].
leftNodeSummary (list of list) – The summary statistics of the left tree node, [P(Y=1|T), N(T)].
alpha (float) – The weight used to balance different normalization parts.

Returns

norm_res – Normalization factor.

Return type

float

predict(X)¶

Returns the recommended treatment group and predicted optimal probability conditional on using the recommended treatment group.

Parameters: X (ndarray, shape = [num_samples, num_features]) – An ndarray of the covariates used to train the uplift model.
Returns: pred – An ndarray of predicted treatment effects across treatments.
Return type: ndarray, shape = [num_samples, num_treatments]

prune(X, treatment, y, minGain=0.0001, rule='maxAbsDiff')¶

Prune the uplift model.

Parameters

X (ndarray, shape = [num_samples, num_features]) – An ndarray of the covariates used to train the uplift model.
treatment (array-like, shape = [num_samples]) – An array containing the treatment group for each unit.
y (array-like, shape = [num_samples]) – An array containing the outcome of interest for each unit.
minGain (float, optional (default = 0.0001)) – The minimum gain required to make a tree node split. The children tree branches are trimmed if the actual split gain is less than the minimum gain.
rule (string, optional (default = 'maxAbsDiff')) – The prune rules. Supported values are ‘maxAbsDiff’ for optimizing the maximum absolute difference, and ‘bestUplift’ for optimizing the node-size weighted treatment effect.

Returns

self

Return type

object

pruneTree(X, treatment_idx, y, tree, rule='maxAbsDiff', minGain=0.0, n_reg=0, parentNodeSummary=None)¶

Prune one single tree node in the uplift model.

Parameters

X (ndarray, shape = [num_samples, num_features]) – An ndarray of the covariates used to train the uplift model.
treatment_idx (array-like, shape = [num_samples]) – An array containing the treatment group index for each unit.
y (array-like, shape = [num_samples]) – An array containing the outcome of interest for each unit.
rule (string, optional (default = 'maxAbsDiff')) – The prune rules. Supported values are ‘maxAbsDiff’ for optimizing the maximum absolute difference, and ‘bestUplift’ for optimizing the node-size weighted treatment effect.
minGain (float, optional (default = 0.)) – The minimum gain required to make a tree node split. The children tree branches are trimmed if the actual split gain is less than the minimum gain.
n_reg (int, optional (default=0)) – The regularization parameter defined in Rzepakowski et al. 2012, the weight (in terms of sample size) of the parent node influence on the child node, only effective for ‘KL’, ‘ED’, ‘Chi’, ‘CTS’ methods.
parentNodeSummary (list of list, optional (default = None)) – Node summary statistics, [P(Y=1|T), N(T)] of the parent tree node.

Returns

self

Return type

object

tree_node_summary(treatment_idx, y, min_samples_treatment=10, n_reg=100, parentNodeSummary=None)¶

Tree node summary statistics.

Parameters

treatment_idx (array-like, shape = [num_samples]) – An array containing the treatment group index for each unit.
y (array-like, shape = [num_samples]) – An array containing the outcome of interest for each unit.
min_samples_treatment (int, optional (default=10)) – The minimum number of samples required of the experiment group to be split at a leaf node.
n_reg (int, optional (default=100)) – The regularization parameter defined in Rzepakowski et al. 2012, the weight (in terms of sample size) of the parent node influence on the child node, only effective for ‘KL’, ‘ED’, ‘Chi’, ‘CTS’ methods.
parentNodeSummary (list of list) – The positive probabilities and sample sizes of each of the control and treatment groups in the parent node.

Returns

nodeSummary – The positive probabilities and sample sizes of each of the control and treatment groups in the current node.

Return type

list of list
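
One way to read "the weight (in terms of sample size) of the parent node influence" is as shrinkage of each group's positive rate toward the parent's rate, as if the parent contributed `n_reg` extra samples. The sketch below illustrates that reading only; the exact formula, the fallback for small groups, and the name `node_summary` are assumptions and may differ from the library's implementation.

```python
def node_summary(counts, parent_summary=None, n_reg=100, min_samples_treatment=10):
    """Build [[P(Y=1|T), N(T)], ...] from per-group [n_negative, n_positive] counts."""
    summary = []
    for group, (n_neg, n_pos) in enumerate(counts):
        n = n_neg + n_pos
        if parent_summary is None:
            p = n_pos / n if n > 0 else 0.0
        elif n > min_samples_treatment:
            parent_p = parent_summary[group][0]
            # Shrink toward the parent rate as if it contributed n_reg samples.
            p = (n_pos + parent_p * n_reg) / (n + n_reg)
        else:
            p = parent_summary[group][0]   # too few samples: reuse the parent rate
        summary.append([p, n])
    return summary


root = node_summary([[50, 50], [20, 80]])
child = node_summary([[50, 50], [20, 80]], parent_summary=[[0.5, 1000], [0.5, 1000]])
tiny = node_summary([[2, 3]], parent_summary=[[0.4, 100]])
```

With `n_reg` equal to the group size, the child's rate lands halfway between its raw rate and the parent's rate.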

uplift_classification_results(treatment_idx, y)¶

Classification probability for each treatment in the tree node.

Parameters

treatment_idx (array-like, shape = [num_samples]) – An array containing the treatment group index for each unit.
y (array-like, shape = [num_samples]) – An array containing the outcome of interest for each unit.

Returns

res – The positive probabilities P(Y = 1) of each of the control and treatment groups

Return type

list of list

causalml.inference.tree.cat_continuous(x, granularity='Medium')[source]¶

Categorize (bin) continuous variable based on percentile.

Parameters

x (list) – Feature values.
granularity (string, optional, (default = 'Medium')) – Control the granularity of the bins, optional values are: ‘High’, ‘Medium’, ‘Low’.

Returns

res – List of percentile bins for the feature value.

Return type

list

causalml.inference.tree.cat_group(dfx, kpix, n_group=10)[source]¶

Category Reduction for Categorical Variables

Parameters

dfx (dataframe) – The input dataframe.
kpix (string) – The column of the feature.
n_group (int, optional (default = 10)) – The number of top category values to retain; all other category values are grouped into “Other”.

Returns

Return type

The transformed categorical feature value list.
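
The category reduction described here amounts to keeping the most frequent values and relabeling the rest. A minimal sketch on a plain list, rather than a dataframe column, might look like this; `reduce_categories` and `other_label` are hypothetical names for the example.

```python
from collections import Counter


def reduce_categories(values, n_group=10, other_label="Other"):
    """Keep the n_group most frequent categories and relabel the rest."""
    top = {cat for cat, _ in Counter(values).most_common(n_group)}
    return [v if v in top else other_label for v in values]


reduced = reduce_categories(["a", "a", "a", "b", "b", "c"], n_group=2)
```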

causalml.inference.tree.cat_transform(dfx, kpix, kpi1)[source]¶

Encoding string features.

Parameters

dfx (dataframe) – The input dataframe.
kpix (string) – The column of the feature.
kpi1 (list) – The list of feature names.

Returns

dfx (DataFrame) – The updated dataframe containing the encoded data.
kpi1 (list) – The updated feature names containing the new dummy feature names.

causalml.inference.tree.cv_fold_index(n, i, k, random_seed=2018)[source]¶

causalml.inference.tree.kpi_transform(dfx, kpi_combo, kpi_combo_new)[source]¶

Feature transformation from continuous feature to binned features for a list of features

Parameters

dfx (DataFrame) – DataFrame containing the features.
kpi_combo (list of string) – List of feature names to be transformed
kpi_combo_new (list of string) – List of new feature names to be assigned to the transformed features.

Returns

dfx – Updated DataFrame containing the new features.

Return type

DataFrame

causalml.inference.tree.uplift_tree_plot(decisionTree, x_names)[source]¶

Convert the tree to dot graph for plots.

Parameters

decisionTree (object) – object of DecisionTree class
x_names (list) – List of feature names

Returns

Return type

Dot class representing the tree graph.

causalml.inference.tree.uplift_tree_string(decisionTree, x_names)[source]¶

Convert the tree to string for print.

Parameters

decisionTree (object) – object of DecisionTree class
x_names (list) – List of feature names

Returns

Return type

A string representation of the tree.

causalml.inference.meta module¶

class causalml.inference.meta.BaseDRLearner(learner=None, control_outcome_learner=None, treatment_outcome_learner=None, treatment_effect_learner=None, ate_alpha=0.05, control_name=0)[source]¶

Bases: causalml.inference.meta.base.BaseLearner

A parent class for DR-learner regressor classes.

A DR-learner estimates treatment effects with machine learning models.

Details of DR-learner are available at Kennedy (2020) (https://arxiv.org/abs/2004.14497).
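
The core quantity in a doubly robust learner is the pseudo-outcome that combines outcome-model predictions with an inverse-propensity correction. The sketch below shows the standard AIPW form for a single unit with a binary treatment; it is an illustration of the idea and does not reproduce this class's cross-fitting procedure. The function and argument names are assumptions.

```python
def dr_pseudo_outcome(y, w, mu0, mu1, e):
    """Doubly robust (AIPW) pseudo-outcome for one unit.

    y: observed outcome; w: 1 if treated, else 0;
    mu0 / mu1: predicted outcomes under control / treatment;
    e: propensity score P(W=1 | X), strictly between 0 and 1.
    """
    return (mu1 - mu0
            + w * (y - mu1) / e
            - (1 - w) * (y - mu0) / (1.0 - e))


treated_unit = dr_pseudo_outcome(y=3.0, w=1, mu0=1.0, mu1=2.0, e=0.5)
control_unit = dr_pseudo_outcome(y=0.0, w=0, mu0=1.0, mu1=2.0, e=0.5)
exact_model = dr_pseudo_outcome(y=2.0, w=1, mu0=1.0, mu1=2.0, e=0.5)
```

Regressing these pseudo-outcomes on the features gives a treatment effect model; when the outcome model is exact, the correction terms vanish and the pseudo-outcome is simply `mu1 - mu0`.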

estimate_ate(X, treatment, y, p=None, bootstrap_ci=False, n_bootstraps=1000, bootstrap_size=10000, seed=None)[source]¶

Estimate the Average Treatment Effect (ATE).

Parameters

X (np.matrix or np.array or pd.DataFrame) – a feature matrix
treatment (np.array or pd.Series) – a treatment vector
y (np.array or pd.Series) – an outcome vector
p (np.ndarray or pd.Series or dict, optional) – an array of propensity scores of float (0,1) in the single-treatment case; or, a dictionary of treatment groups that map to propensity vectors of float (0,1); if None will run ElasticNetPropensityModel() to generate the propensity scores.
bootstrap_ci (bool) – whether to run bootstrap for confidence intervals
n_bootstraps (int) – number of bootstrap iterations
bootstrap_size (int) – number of samples per bootstrap
seed (int) – random seed for cross-fitting

Returns

The mean and confidence interval (LB, UB) of the ATE estimate.

fit(X, treatment, y, p=None, seed=None)[source]¶

Fit the inference model.

Parameters

X (np.matrix or np.array or pd.DataFrame) – a feature matrix
treatment (np.array or pd.Series) – a treatment vector
y (np.array or pd.Series) – an outcome vector
p (np.ndarray or pd.Series or dict, optional) – an array of propensity scores of float (0,1) in the single-treatment case; or, a dictionary of treatment groups that map to propensity vectors of float (0,1); if None will run ElasticNetPropensityModel() to generate the propensity scores.
seed (int) – random seed for cross-fitting

fit_predict(X, treatment, y, p=None, return_ci=False, n_bootstraps=1000, bootstrap_size=10000, return_components=False, verbose=True, seed=None)[source]¶

Fit the treatment effect and outcome models of the DR learner and predict treatment effects.

Parameters

X (np.matrix or np.array or pd.DataFrame) – a feature matrix
treatment (np.array or pd.Series) – a treatment vector
y (np.array or pd.Series) – an outcome vector
p (np.ndarray or pd.Series or dict, optional) – an array of propensity scores of float (0,1) in the single-treatment case; or, a dictionary of treatment groups that map to propensity vectors of float (0,1); if None will run ElasticNetPropensityModel() to generate the propensity scores.
return_ci (bool) – whether to return confidence intervals
n_bootstraps (int) – number of bootstrap iterations
bootstrap_size (int) – number of samples per bootstrap
return_components (bool, optional) – whether to return outcome for treatment and control separately
verbose (bool) – whether to output progress logs
seed (int) – random seed for cross-fitting

Returns

Predictions of treatment effects. Output dim: [n_samples, n_treatment]. If return_ci is True, returns CATE [n_samples, n_treatment], LB [n_samples, n_treatment], UB [n_samples, n_treatment].

Return type

(numpy.ndarray)

predict(X, treatment=None, y=None, p=None, return_components=False, verbose=True)[source]¶

Predict treatment effects.

Parameters

X (np.matrix or np.array or pd.DataFrame) – a feature matrix
treatment (np.array or pd.Series, optional) – a treatment vector
y (np.array or pd.Series, optional) – an outcome vector
verbose (bool, optional) – whether to output progress logs

Returns

Predictions of treatment effects.

Return type

(numpy.ndarray)

class causalml.inference.meta.BaseDRRegressor(learner=None, control_outcome_learner=None, treatment_outcome_learner=None, treatment_effect_learner=None, ate_alpha=0.05, control_name=0)[source]¶

Bases: causalml.inference.meta.drlearner.BaseDRLearner

A parent class for DR-learner regressor classes.

class causalml.inference.meta.BaseRClassifier(outcome_learner=None, effect_learner=None, propensity_learner=LogisticRegressionCV(Cs=array([1.00230524, 2.15608891, 4.63802765, 9.97700064]), cv=StratifiedKFold(n_splits=4, random_state=42, shuffle=True), l1_ratios=array([0.001, 0.33366667, 0.66633333, 0.999]), penalty='elasticnet', random_state=42, solver='saga'), ate_alpha=0.05, control_name=0, n_fold=5, random_state=None)[source]¶

Bases: causalml.inference.meta.rlearner.BaseRLearner

A parent class for R-learner classifier classes.

fit(X, treatment, y, p=None, sample_weight=None, verbose=True)[source]¶

Fit the treatment effect and outcome models of the R learner.

Parameters

X (np.matrix or np.array or pd.DataFrame) – a feature matrix
treatment (np.array or pd.Series) – a treatment vector
y (np.array or pd.Series) – an outcome vector
p (np.ndarray or pd.Series or dict, optional) – an array of propensity scores of float (0,1) in the single-treatment case; or, a dictionary of treatment groups that map to propensity vectors of float (0,1); if None will run ElasticNetPropensityModel() to generate the propensity scores.
sample_weight (np.array or pd.Series, optional) – an array of sample weights indicating the weight of each observation for effect_learner. If None, it assumes equal weight.
verbose (bool, optional) – whether to output progress logs

predict(X, p=None)[source]¶

Predict treatment effects.

Parameters: X (np.matrix or np.array or pd.DataFrame) – a feature matrix
Returns: Predictions of treatment effects.
Return type: (numpy.ndarray)

class causalml.inference.meta.BaseRLearner(learner=None, outcome_learner=None, effect_learner=None, propensity_learner=LogisticRegressionCV(Cs=array([1.00230524, 2.15608891, 4.63802765, 9.97700064]), cv=StratifiedKFold(n_splits=4, random_state=42, shuffle=True), l1_ratios=array([0.001, 0.33366667, 0.66633333, 0.999]), penalty='elasticnet', random_state=42, solver='saga'), ate_alpha=0.05, control_name=0, n_fold=5, random_state=None)[source]¶

Bases: causalml.inference.meta.base.BaseLearner

A parent class for R-learner classes.

An R-learner estimates treatment effects with two machine learning models and the propensity score.

Details of R-learner are available at Nie and Wager (2019) (https://arxiv.org/abs/1712.04912).
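
The R-learner objective regresses outcome residuals on treatment residuals. For the special case of a constant treatment effect, the minimizer of the squared residual loss has a closed form, sketched below. This is an illustration of the principle only, with hypothetical names; it assumes the outcome model `m_hat` and propensity scores `e_hat` have already been estimated and does not show the cross-fitting or the effect learner used by this class.

```python
def r_learner_constant_effect(y, w, m_hat, e_hat):
    """Closed-form minimizer of sum(((y - m) - (w - e) * tau) ** 2) over a constant tau."""
    num = sum((wi - ei) * (yi - mi) for yi, wi, mi, ei in zip(y, w, m_hat, e_hat))
    den = sum((wi - ei) ** 2 for wi, ei in zip(w, e_hat))
    return num / den


# Outcome is 1 + 2 * w, so the true effect is 2 and E[Y | X] = 2 when e = 0.5.
tau = r_learner_constant_effect(
    y=[1.0, 3.0, 1.0, 3.0], w=[0, 1, 0, 1], m_hat=[2.0] * 4, e_hat=[0.5] * 4)
```

In the general case the constant `tau` is replaced by a model of the features, fitted with weights `(w - e) ** 2` on the ratio of the two residuals.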

estimate_ate(X, treatment, y, p=None, sample_weight=None, bootstrap_ci=False, n_bootstraps=1000, bootstrap_size=10000)[source]¶

Estimate the Average Treatment Effect (ATE).

Parameters

X (np.matrix or np.array or pd.DataFrame) – a feature matrix
treatment (np.array or pd.Series) – a treatment vector
y (np.array or pd.Series) – an outcome vector
p (np.ndarray or pd.Series or dict, optional) – an array of propensity scores of float (0,1) in the single-treatment case; or, a dictionary of treatment groups that map to propensity vectors of float (0,1); if None will run ElasticNetPropensityModel() to generate the propensity scores.
sample_weight (np.array or pd.Series, optional) – an array of sample weights indicating the weight of each observation for effect_learner. If None, it assumes equal weight.
bootstrap_ci (bool) – whether to run bootstrap for confidence intervals
n_bootstraps (int) – number of bootstrap iterations
bootstrap_size (int) – number of samples per bootstrap

Returns

The mean and confidence interval (LB, UB) of the ATE estimate.

fit(X, treatment, y, p=None, sample_weight=None, verbose=True)[source]¶

Fit the treatment effect and outcome models of the R learner.

Parameters

X (np.matrix or np.array or pd.DataFrame) – a feature matrix
treatment (np.array or pd.Series) – a treatment vector
y (np.array or pd.Series) – an outcome vector
p (np.ndarray or pd.Series or dict, optional) – an array of propensity scores of float (0,1) in the single-treatment case; or, a dictionary of treatment groups that map to propensity vectors of float (0,1); if None will run ElasticNetPropensityModel() to generate the propensity scores.
sample_weight (np.array or pd.Series, optional) – an array of sample weights indicating the weight of each observation for effect_learner. If None, it assumes equal weight.
verbose (bool, optional) – whether to output progress logs

fit_predict(X, treatment, y, p=None, sample_weight=None, return_ci=False, n_bootstraps=1000, bootstrap_size=10000, verbose=True)[source]¶

Fit the treatment effect and outcome models of the R learner and predict treatment effects.

Parameters

X (np.matrix or np.array or pd.DataFrame) – a feature matrix
treatment (np.array or pd.Series) – a treatment vector
y (np.array or pd.Series) – an outcome vector
p (np.ndarray or pd.Series or dict, optional) – an array of propensity scores of float (0,1) in the single-treatment case; or, a dictionary of treatment groups that map to propensity vectors of float (0,1); if None will run ElasticNetPropensityModel() to generate the propensity scores.
sample_weight (np.array or pd.Series, optional) – an array of sample weights indicating the weight of each observation for effect_learner. If None, it assumes equal weight.
return_ci (bool) – whether to return confidence intervals
n_bootstraps (int) – number of bootstrap iterations
bootstrap_size (int) – number of samples per bootstrap
verbose (bool) – whether to output progress logs

Returns

Predictions of treatment effects. Output dim: [n_samples, n_treatment]. If return_ci is True, returns CATE [n_samples, n_treatment], LB [n_samples, n_treatment], UB [n_samples, n_treatment].

Return type

(numpy.ndarray)

predict(X, p=None)[source]¶

Predict treatment effects.

Parameters: X (np.matrix or np.array or pd.Dataframe) – a feature matrix
Returns: Predictions of treatment effects.
Return type: (numpy.ndarray)

class causalml.inference.meta.BaseRRegressor(learner=None, outcome_learner=None, effect_learner=None, propensity_learner=LogisticRegressionCV(Cs=array([1.00230524, 2.15608891, 4.63802765, 9.97700064]), cv=StratifiedKFold(n_splits=4, random_state=42, shuffle=True), l1_ratios=array([0.001, 0.33366667, 0.66633333, 0.999]), penalty='elasticnet', random_state=42, solver='saga'), ate_alpha=0.05, control_name=0, n_fold=5, random_state=None)[source]¶

Bases: causalml.inference.meta.rlearner.BaseRLearner

A parent class for R-learner regressor classes.

class causalml.inference.meta.BaseSClassifier(learner=None, ate_alpha=0.05, control_name=0)[source]¶

Bases: causalml.inference.meta.slearner.BaseSLearner

A parent class for S-learner classifier classes.

predict(X, treatment=None, y=None, p=None, return_components=False, verbose=True)[source]¶

Predict treatment effects.

Parameters

X (np.matrix or np.array or pd.DataFrame) – a feature matrix
treatment (np.array or pd.Series, optional) – a treatment vector
y (np.array or pd.Series, optional) – an outcome vector
return_components (bool, optional) – whether to return outcome for treatment and control separately
verbose (bool, optional) – whether to output progress logs

Returns: Predictions of treatment effects.
Return type: (numpy.ndarray)

class causalml.inference.meta.BaseSLearner(learner=None, ate_alpha=0.05, control_name=0)[source]¶

Bases: causalml.inference.meta.base.BaseLearner

A parent class for S-learner classes. An S-learner estimates treatment effects with one machine learning model. Details of S-learner are available at Kunzel et al. (2018) (https://arxiv.org/abs/1706.03461).
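
The one-model idea can be sketched as follows: a single regressor is fit on the features plus the treatment indicator, and the effect is the difference between its predictions with the indicator set to 1 and to 0. MeanByCell and s_learner_cate are hypothetical names used only for this illustration; the toy regressor simply averages outcomes per (features, treatment) cell and is not the library's implementation.

```python
from collections import defaultdict


class MeanByCell:
    """Toy regressor: predicts the mean training outcome of rows that share
    the same (features..., treatment) tuple. Every cell queried at predict
    time must have been seen during fit."""

    def fit(self, rows, y):
        sums = defaultdict(float)
        counts = defaultdict(int)
        for row, yi in zip(rows, y):
            sums[row] += yi
            counts[row] += 1
        self.means_ = {key: sums[key] / counts[key] for key in sums}
        return self

    def predict(self, rows):
        return [self.means_[row] for row in rows]


def s_learner_cate(model, X, treatment, y):
    # One model: the treatment indicator is appended as an extra feature.
    model.fit([tuple(x) + (w,) for x, w in zip(X, treatment)], y)
    y_treated = model.predict([tuple(x) + (1,) for x in X])
    y_control = model.predict([tuple(x) + (0,) for x in X])
    return [a - b for a, b in zip(y_treated, y_control)]


X = [(0,), (0,), (1,), (1,)]
treatment = [0, 1, 0, 1]
y = [1.0, 2.0, 1.0, 4.0]
cate = s_learner_cate(MeanByCell(), X, treatment, y)
print(cate)
```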

estimate_ate(X, treatment, y, p=None, return_ci=False, bootstrap_ci=False, n_bootstraps=1000, bootstrap_size=10000)[source]¶

Estimate the Average Treatment Effect (ATE).

Parameters

X (np.matrix, np.array, or pd.Dataframe) – a feature matrix
treatment (np.array or pd.Series) – a treatment vector
y (np.array or pd.Series) – an outcome vector
return_ci (bool, optional) – whether to return confidence intervals
bootstrap_ci (bool) – whether to run bootstrap for confidence intervals
n_bootstraps (int) – number of bootstrap iterations
bootstrap_size (int) – number of samples per bootstrap

Returns

The mean and confidence interval (LB, UB) of the ATE estimate.

fit(X, treatment, y, p=None)[source]¶

Fit the inference model.

Parameters

X (np.matrix, np.array, or pd.DataFrame) – a feature matrix
treatment (np.array or pd.Series) – a treatment vector
y (np.array or pd.Series) – an outcome vector

fit_predict(X, treatment, y, p=None, return_ci=False, n_bootstraps=1000, bootstrap_size=10000, return_components=False, verbose=True)[source]¶

Fit the inference model of the S learner and predict treatment effects.

Parameters

X (np.matrix, np.array, or pd.DataFrame) – a feature matrix
treatment (np.array or pd.Series) – a treatment vector
y (np.array or pd.Series) – an outcome vector
return_ci (bool, optional) – whether to return confidence intervals
n_bootstraps (int, optional) – number of bootstrap iterations
bootstrap_size (int, optional) – number of samples per bootstrap
return_components (bool, optional) – whether to return outcome for treatment and control separately
verbose (bool, optional) – whether to output progress logs

Returns

Predictions of treatment effects. Output dim: [n_samples, n_treatment]. If return_ci, returns CATE [n_samples, n_treatment], LB [n_samples, n_treatment], UB [n_samples, n_treatment].

Return type

(numpy.ndarray)

predict(X, treatment=None, y=None, p=None, return_components=False, verbose=True)[source]¶

Predict treatment effects.

Parameters

X (np.matrix or np.array or pd.DataFrame) – a feature matrix
treatment (np.array or pd.Series, optional) – a treatment vector
y (np.array or pd.Series, optional) – an outcome vector
return_components (bool, optional) – whether to return outcome for treatment and control separately
verbose (bool, optional) – whether to output progress logs

Returns: Predictions of treatment effects.
Return type: (numpy.ndarray)

class causalml.inference.meta.BaseSRegressor(learner=None, ate_alpha=0.05, control_name=0)[source]¶

Bases: causalml.inference.meta.slearner.BaseSLearner

A parent class for S-learner regressor classes.

class causalml.inference.meta.BaseTClassifier(learner=None, control_learner=None, treatment_learner=None, ate_alpha=0.05, control_name=0)[source]¶

Bases: causalml.inference.meta.tlearner.BaseTLearner

A parent class for T-learner classifier classes.

predict(X, treatment=None, y=None, p=None, return_components=False, verbose=True)[source]¶

Predict treatment effects.

Parameters

X (np.matrix or np.array or pd.Dataframe) – a feature matrix
treatment (np.array or pd.Series, optional) – a treatment vector
y (np.array or pd.Series, optional) – an outcome vector
verbose (bool, optional) – whether to output progress logs

Returns

Predictions of treatment effects.

Return type

(numpy.ndarray)

class causalml.inference.meta.BaseTLearner(learner=None, control_learner=None, treatment_learner=None, ate_alpha=0.05, control_name=0)[source]¶

Bases: causalml.inference.meta.base.BaseLearner

A parent class for T-learner classes.

A T-learner estimates treatment effects with two machine learning models.

Details of T-learner are available at Kunzel et al. (2018) (https://arxiv.org/abs/1706.03461).
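
A minimal sketch of the two-model idea, assuming a single numeric feature and ordinary least squares as the base learner: one model is fit on the control group, another on the treatment group, and the effect is the difference of their predictions. The helpers fit_line and t_learner_cate are hypothetical and only illustrate the technique.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x with a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def t_learner_cate(X, treatment, y, control_name=0):
    # Split the sample by treatment group.
    control = [(x, yi) for x, w, yi in zip(X, treatment, y) if w == control_name]
    treated = [(x, yi) for x, w, yi in zip(X, treatment, y) if w != control_name]
    # Two models: one per group.
    mu0 = fit_line([x for x, _ in control], [yi for _, yi in control])
    mu1 = fit_line([x for x, _ in treated], [yi for _, yi in treated])
    # The effect estimate is the difference of the two predictions.
    return [mu1(x) - mu0(x) for x in X]


X = [0.0, 1.0, 2.0, 0.0, 1.0, 2.0]
treatment = [0, 0, 0, 1, 1, 1]
y = [0.0, 1.0, 2.0, 1.0, 3.0, 5.0]  # control: y = x; treated: y = 1 + 2x
cate = t_learner_cate(X, treatment, y)
print(cate)
```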

estimate_ate(X, treatment, y, p=None, bootstrap_ci=False, n_bootstraps=1000, bootstrap_size=10000)[source]¶

Estimate the Average Treatment Effect (ATE).

Parameters

X (np.matrix or np.array or pd.Dataframe) – a feature matrix
treatment (np.array or pd.Series) – a treatment vector
y (np.array or pd.Series) – an outcome vector
bootstrap_ci (bool) – whether to return confidence intervals
n_bootstraps (int) – number of bootstrap iterations
bootstrap_size (int) – number of samples per bootstrap

Returns

The mean and confidence interval (LB, UB) of the ATE estimate.

fit(X, treatment, y, p=None)[source]¶

Fit the inference model

Parameters

X (np.matrix or np.array or pd.Dataframe) – a feature matrix
treatment (np.array or pd.Series) – a treatment vector
y (np.array or pd.Series) – an outcome vector

fit_predict(X, treatment, y, p=None, return_ci=False, n_bootstraps=1000, bootstrap_size=10000, return_components=False, verbose=True)[source]¶

Fit the inference model of the T learner and predict treatment effects.

Parameters

X (np.matrix or np.array or pd.Dataframe) – a feature matrix
treatment (np.array or pd.Series) – a treatment vector
y (np.array or pd.Series) – an outcome vector
return_ci (bool) – whether to return confidence intervals
n_bootstraps (int) – number of bootstrap iterations
bootstrap_size (int) – number of samples per bootstrap
return_components (bool, optional) – whether to return outcome for treatment and control separately
verbose (bool) – whether to output progress logs

Returns

Predictions of treatment effects. Output dim: [n_samples, n_treatment]. If return_ci, returns CATE [n_samples, n_treatment], LB [n_samples, n_treatment], UB [n_samples, n_treatment].

Return type

(numpy.ndarray)

predict(X, treatment=None, y=None, p=None, return_components=False, verbose=True)[source]¶

Predict treatment effects.

Parameters

X (np.matrix or np.array or pd.Dataframe) – a feature matrix
treatment (np.array or pd.Series, optional) – a treatment vector
y (np.array or pd.Series, optional) – an outcome vector
return_components (bool, optional) – whether to return outcome for treatment and control separately
verbose (bool, optional) – whether to output progress logs

Returns

Predictions of treatment effects.

Return type

(numpy.ndarray)

class causalml.inference.meta.BaseTRegressor(learner=None, control_learner=None, treatment_learner=None, ate_alpha=0.05, control_name=0)[source]¶

Bases: causalml.inference.meta.tlearner.BaseTLearner

A parent class for T-learner regressor classes.

class causalml.inference.meta.BaseXClassifier(outcome_learner=None, effect_learner=None, control_outcome_learner=None, treatment_outcome_learner=None, control_effect_learner=None, treatment_effect_learner=None, ate_alpha=0.05, control_name=0)[source]¶

Bases: causalml.inference.meta.xlearner.BaseXLearner

A parent class for X-learner classifier classes.

fit(X, treatment, y, p=None)[source]¶

Fit the inference model.

Parameters

X (np.matrix or np.array or pd.Dataframe) – a feature matrix
treatment (np.array or pd.Series) – a treatment vector
y (np.array or pd.Series) – an outcome vector
p (np.ndarray or pd.Series or dict, optional) – an array of propensity scores of float (0,1) in the single-treatment case; or, a dictionary of treatment groups that map to propensity vectors of float (0,1); if None will run ElasticNetPropensityModel() to generate the propensity scores.

predict(X, treatment=None, y=None, p=None, return_components=False, verbose=True)[source]¶

Predict treatment effects.

Parameters

X (np.matrix or np.array or pd.Dataframe) – a feature matrix
treatment (np.array or pd.Series, optional) – a treatment vector
y (np.array or pd.Series, optional) – an outcome vector
p (np.ndarray or pd.Series or dict, optional) – an array of propensity scores of float (0,1) in the single-treatment case; or, a dictionary of treatment groups that map to propensity vectors of float (0,1); if None will run ElasticNetPropensityModel() to generate the propensity scores.
return_components (bool, optional) – whether to return outcome for treatment and control separately
return_p_score (bool, optional) – whether to return propensity score
verbose (bool, optional) – whether to output progress logs

Returns

Predictions of treatment effects.

Return type

(numpy.ndarray)

class causalml.inference.meta.BaseXLearner(learner=None, control_outcome_learner=None, treatment_outcome_learner=None, control_effect_learner=None, treatment_effect_learner=None, ate_alpha=0.05, control_name=0)[source]¶

Bases: causalml.inference.meta.base.BaseLearner

A parent class for X-learner classes.

An X-learner estimates treatment effects with four machine learning models.

Details of X-learner are available at Kunzel et al. (2018) (https://arxiv.org/abs/1706.03461).
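
The four models can be sketched as two outcome models followed by two effect models fit on imputed treatment effects, with the final estimate weighted by the propensity score. This is a simplified illustration with one numeric feature and least-squares fits; fit_line and x_learner_cate are hypothetical helpers, and the weighting shown follows the form suggested in the cited paper rather than any confirmed library internals.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x with a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return lambda x: (mean_y - slope * mean_x) + slope * x


def x_learner_cate(X, treatment, y, p):
    x0 = [x for x, w in zip(X, treatment) if w == 0]
    y0 = [yi for yi, w in zip(y, treatment) if w == 0]
    x1 = [x for x, w in zip(X, treatment) if w == 1]
    y1 = [yi for yi, w in zip(y, treatment) if w == 1]
    # Stage 1: outcome models for control and treatment.
    mu0 = fit_line(x0, y0)
    mu1 = fit_line(x1, y1)
    # Stage 2: impute individual effects and fit one effect model per group.
    d1 = [yi - mu0(x) for x, yi in zip(x1, y1)]  # treated: observed - predicted control
    d0 = [mu1(x) - yi for x, yi in zip(x0, y0)]  # control: predicted treated - observed
    tau1 = fit_line(x1, d1)
    tau0 = fit_line(x0, d0)
    # Stage 3: combine the two effect models using the propensity score.
    return [pi * tau0(x) + (1 - pi) * tau1(x) for x, pi in zip(X, p)]


X = [0.0, 1.0, 2.0, 0.0, 1.0, 2.0]
treatment = [0, 0, 0, 1, 1, 1]
y = [0.0, 1.0, 2.0, 1.0, 3.0, 5.0]  # control: y = x; treated: y = 1 + 2x
p = [0.5] * 6
cate = x_learner_cate(X, treatment, y, p)
print(cate)
```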

estimate_ate(X, treatment, y, p=None, bootstrap_ci=False, n_bootstraps=1000, bootstrap_size=10000)[source]¶

Estimate the Average Treatment Effect (ATE).

Parameters

X (np.matrix or np.array or pd.Dataframe) – a feature matrix
treatment (np.array or pd.Series) – a treatment vector
y (np.array or pd.Series) – an outcome vector
p (np.ndarray or pd.Series or dict, optional) – an array of propensity scores of float (0,1) in the single-treatment case; or, a dictionary of treatment groups that map to propensity vectors of float (0,1); if None will run ElasticNetPropensityModel() to generate the propensity scores.
bootstrap_ci (bool) – whether to run bootstrap for confidence intervals
n_bootstraps (int) – number of bootstrap iterations
bootstrap_size (int) – number of samples per bootstrap

Returns

The mean and confidence interval (LB, UB) of the ATE estimate.
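
To make the roles of n_bootstraps, bootstrap_size, and ate_alpha concrete, here is a generic percentile-bootstrap sketch. It uses a plain difference in means as the ATE estimator, which is an assumption made to keep the example short; the learner itself would refit its models on each resample. The names diff_in_means and bootstrap_ate_ci are hypothetical.

```python
import random


def diff_in_means(treatment, y):
    treated = [yi for w, yi in zip(treatment, y) if w == 1]
    control = [yi for w, yi in zip(treatment, y) if w == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def bootstrap_ate_ci(treatment, y, n_bootstraps=200, bootstrap_size=None,
                     ate_alpha=0.05, seed=42):
    rng = random.Random(seed)
    n = len(y)
    size = bootstrap_size or n
    estimates = []
    while len(estimates) < n_bootstraps:
        # Resample row indices with replacement.
        idx = [rng.randrange(n) for _ in range(size)]
        w_b = [treatment[i] for i in idx]
        if 0 < sum(w_b) < size:  # keep only resamples containing both groups
            estimates.append(diff_in_means(w_b, [y[i] for i in idx]))
    estimates.sort()
    # Percentile interval at level 1 - ate_alpha.
    lb = estimates[int((ate_alpha / 2) * (n_bootstraps - 1))]
    ub = estimates[int((1 - ate_alpha / 2) * (n_bootstraps - 1))]
    return diff_in_means(treatment, y), lb, ub


# Toy data: alternating control/treated units, true effect 2.0.
treatment = [i % 2 for i in range(40)]
y = [1.0 + 2.0 * (i % 2) + (0.5 if (i // 2) % 2 else -0.5) for i in range(40)]
ate, lb, ub = bootstrap_ate_ci(treatment, y)
print(ate, lb, ub)
```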

fit(X, treatment, y, p=None)[source]¶

Fit the inference model.

Parameters

X (np.matrix or np.array or pd.Dataframe) – a feature matrix
treatment (np.array or pd.Series) – a treatment vector
y (np.array or pd.Series) – an outcome vector
p (np.ndarray or pd.Series or dict, optional) – an array of propensity scores of float (0,1) in the single-treatment case; or, a dictionary of treatment groups that map to propensity vectors of float (0,1); if None will run ElasticNetPropensityModel() to generate the propensity scores.

fit_predict(X, treatment, y, p=None, return_ci=False, n_bootstraps=1000, bootstrap_size=10000, return_components=False, verbose=True)[source]¶

Fit the treatment effect and outcome models of the X learner and predict treatment effects.

Parameters

X (np.matrix or np.array or pd.Dataframe) – a feature matrix
treatment (np.array or pd.Series) – a treatment vector
y (np.array or pd.Series) – an outcome vector
p (np.ndarray or pd.Series or dict, optional) – an array of propensity scores of float (0,1) in the single-treatment case; or, a dictionary of treatment groups that map to propensity vectors of float (0,1); if None will run ElasticNetPropensityModel() to generate the propensity scores.
return_ci (bool) – whether to return confidence intervals
n_bootstraps (int) – number of bootstrap iterations
bootstrap_size (int) – number of samples per bootstrap
return_components (bool, optional) – whether to return outcome for treatment and control separately
verbose (bool) – whether to output progress logs

Returns

Predictions of treatment effects. Output dim: [n_samples, n_treatment]. If return_ci, returns CATE [n_samples, n_treatment], LB [n_samples, n_treatment], UB [n_samples, n_treatment].

Return type

(numpy.ndarray)

predict(X, treatment=None, y=None, p=None, return_components=False, verbose=True)[source]¶

Predict treatment effects.

Parameters

X (np.matrix or np.array or pd.Dataframe) – a feature matrix
treatment (np.array or pd.Series, optional) – a treatment vector
y (np.array or pd.Series, optional) – an outcome vector
p (np.ndarray or pd.Series or dict, optional) – an array of propensity scores of float (0,1) in the single-treatment case; or, a dictionary of treatment groups that map to propensity vectors of float (0,1); if None will run ElasticNetPropensityModel() to generate the propensity scores.
return_components (bool, optional) – whether to return outcome for treatment and control separately
verbose (bool, optional) – whether to output progress logs

Returns

Predictions of treatment effects.

Return type

(numpy.ndarray)

class causalml.inference.meta.BaseXRegressor(learner=None, control_outcome_learner=None, treatment_outcome_learner=None, control_effect_learner=None, treatment_effect_learner=None, ate_alpha=0.05, control_name=0)[source]¶

Bases: causalml.inference.meta.xlearner.BaseXLearner

A parent class for X-learner regressor classes.

class causalml.inference.meta.LRSRegressor(ate_alpha=0.05, control_name=0)[source]¶

Bases: causalml.inference.meta.slearner.BaseSRegressor

estimate_ate(X, treatment, y, p=None)[source]¶

Estimate the Average Treatment Effect (ATE).

Parameters

X (np.matrix, np.array, or pd.DataFrame) – a feature matrix
treatment (np.array or pd.Series) – a treatment vector
y (np.array or pd.Series) – an outcome vector

Returns: The mean and confidence interval (LB, UB) of the ATE estimate.

class causalml.inference.meta.MLPTRegressor(ate_alpha=0.05, control_name=0, *args, **kwargs)[source]¶

Bases: causalml.inference.meta.tlearner.BaseTRegressor

class causalml.inference.meta.TMLELearner(learner, ate_alpha=0.05, control_name=0, cv=None, calibrate_propensity=True)[source]¶

Bases: object

Targeted maximum likelihood estimation.

Ref: Gruber, S., & Van Der Laan, M. J. (2009). Targeted maximum likelihood estimation: A gentle introduction.

estimate_ate(X, treatment, y, p, segment=None, return_ci=False)[source]¶

Estimate the Average Treatment Effect (ATE).

Parameters

X (np.matrix or np.array or pd.Dataframe) – a feature matrix
treatment (np.array or pd.Series) – a treatment vector
y (np.array or pd.Series) – an outcome vector
p (np.ndarray or pd.Series or dict) – an array of propensity scores of float (0,1) in the single-treatment case; or, a dictionary of treatment groups that map to propensity vectors of float (0,1)
segment (np.array, optional) – An optional segment vector of int. If given, the ATE and its CI will be estimated for each segment.
return_ci (bool, optional) – Whether to return confidence intervals

Returns

The ATE and its confidence interval (LB, UB) for each treatment, t and segment, s

Return type

(tuple)

class causalml.inference.meta.XGBDRRegressor(ate_alpha=0.05, control_name=0, *args, **kwargs)[source]¶

Bases: causalml.inference.meta.drlearner.BaseDRRegressor

class causalml.inference.meta.XGBRRegressor(early_stopping=True, test_size=0.3, early_stopping_rounds=30, effect_learner_objective='rank:pairwise', effect_learner_n_estimators=500, random_state=42, *args, **kwargs)[source]¶

Bases: causalml.inference.meta.rlearner.BaseRRegressor

fit(X, treatment, y, p=None, sample_weight=None, verbose=True)[source]¶

Fit the treatment effect and outcome models of the R learner.

Parameters

X (np.matrix or np.array or pd.DataFrame) – a feature matrix
treatment (np.array or pd.Series) – a treatment vector
y (np.array or pd.Series) – an outcome vector
p (np.ndarray or pd.Series or dict, optional) – an array of propensity scores of float (0,1) in the single-treatment case; or, a dictionary of treatment groups that map to propensity vectors of float (0,1); if None will run ElasticNetPropensityModel() to generate the propensity scores.
sample_weight (np.array or pd.Series, optional) – an array of sample weights indicating the weight of each observation for effect_learner. If None, it assumes equal weight.
verbose (bool, optional) – whether to output progress logs

class causalml.inference.meta.XGBTRegressor(ate_alpha=0.05, control_name=0, *args, **kwargs)[source]¶

Bases: causalml.inference.meta.tlearner.BaseTRegressor

causalml.optimize module¶

class causalml.optimize.CounterfactualUnitSelector(learner, nevertaker_payoff, alwaystaker_payoff, complier_payoff, defier_payoff, organic_conversion=None)[source]¶

Bases: object

A highly experimental implementation of the counterfactual unit selection model proposed by Li and Pearl (2019).

Parameters

learner (object) – The base learner used to estimate the segment probabilities.
nevertaker_payoff (float) – The payoff from targeting a never-taker
alwaystaker_payoff (float) – The payoff from targeting an always-taker
complier_payoff (float) – The payoff from targeting a complier
defier_payoff (float) – The payoff from targeting a defier
organic_conversion (float, optional (default=None)) – The organic conversion rate in the population without an intervention. If None, the organic conversion rate is obtained from the control group. NB: The organic conversion in the control group is not always the same as the organic conversion rate without treatment.
data (DataFrame) – A pandas DataFrame containing the features, treatment assignment indicator and the outcome of interest.
treatment (string) – A string corresponding to the name of the treatment column. The assumed coding in the column is 1 for treatment and 0 for control.
outcome (string) – A string corresponding to the name of the outcome column. The assumed coding in the column is 1 for conversion and 0 for no conversion.

References

Li, Ang, and Judea Pearl. 2019. “Unit Selection Based on Counterfactual Logic.” https://ftp.cs.ucla.edu/pub/stat_ser/r488.pdf.

fit(data, treatment, outcome)[source]¶

Fits the class.

predict(data, treatment, outcome)[source]¶

Predicts an individual-level payoff. If gain equality is satisfied, uses the exact function; if not, uses the midpoint between bounds.

class causalml.optimize.CounterfactualValueEstimator(treatment, control_name, treatment_names, y_proba, cate, value, conversion_cost, impression_cost, *args, **kwargs)[source]¶

Bases: object

Parameters

treatment (array, shape = (num_samples, )) – An array of treatment group indicator values.
control_name (string) – The name of the control condition as a string. Must be contained in the treatment array.
treatment_names (list, length = cate.shape[1]) – A list of treatment group names. NB: The order of the items in the list must correspond to the order in which the conditional average treatment effect estimates are in cate_array.
y_proba (array, shape = (num_samples, )) – The predicted probability of conversion using the Y ~ X model across the total sample.
cate (array, shape = (num_samples, len(set(treatment)))) – Conditional average treatment effect estimations from any model.
value (array, shape = (num_samples, )) – Value of converting each unit.
conversion_cost (shape = (num_samples, len(set(treatment)))) – The cost of a treatment that is triggered if a unit converts after having been in the treatment, such as a promotion code.
impression_cost (shape = (num_samples, len(set(treatment)))) – The cost of a treatment that is the same for each unit whether or not they convert, such as a cost associated with a promotion channel.

Notes

Because we get the conditional average treatment effects from cate-learners relative to the control condition, we subtract the cate for the unit in their actual treatment group from y_proba for that unit, in order to recover the control outcome. We then add the cates to the control outcome to obtain y_proba under each condition. These outcomes are counterfactual because just one of them is actually observed.
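
The bookkeeping described in this note can be sketched in a few lines. The function counterfactual_probas is hypothetical, and the sketch assumes that cate holds one column per entry of treatment_names (control excluded), in the same order; it covers only the recovery of counterfactual outcomes, not the value and cost calculation.

```python
def counterfactual_probas(treatment, control_name, treatment_names, y_proba, cate):
    """Return, per unit, the conversion probability under control and under
    each treatment in treatment_names."""
    results = []
    for w, y_hat, cate_row in zip(treatment, y_proba, cate):
        if w == control_name:
            # Units observed under control already carry the control outcome.
            control_outcome = y_hat
        else:
            # Subtract the CATE of the unit's actual treatment group.
            control_outcome = y_hat - cate_row[treatment_names.index(w)]
        row = {control_name: control_outcome}
        for name, effect in zip(treatment_names, cate_row):
            # Add each CATE back to obtain the outcome under that condition.
            row[name] = control_outcome + effect
        results.append(row)
    return results


treatment = ["control", "promo"]
y_proba = [0.25, 0.5]
cate = [[0.25], [0.25]]
probas = counterfactual_probas(treatment, "control", ["promo"], y_proba, cate)
print(probas)
```

Only one entry per unit corresponds to an outcome that was actually observed; the others are counterfactual.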

predict_best()[source]¶

Predict the best treatment group based on the highest counterfactual value for a treatment.

predict_counterfactuals()[source]¶

Predict the counterfactual values for each treatment group.

class causalml.optimize.PolicyLearner(outcome_learner=GradientBoostingRegressor(), treatment_learner=GradientBoostingClassifier(), policy_learner=DecisionTreeClassifier(), clip_bounds=(0.001, 0.999), n_fold=5, random_state=None, calibration=False)[source]¶

Bases: object

A learner that learns a treatment assignment policy from observational data, using a doubly robust estimator of the causal effect for a binary treatment.

Details of the policy learner are available at Athey and Wager (2018) (https://arxiv.org/abs/1702.02896).
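
For orientation, a doubly robust score in its standard augmented inverse-propensity form can be sketched as below, with propensities clipped in the spirit of clip_bounds. The function doubly_robust_scores is hypothetical, and the outcome predictions mu0 and mu1 are assumed to be given. Turning the sign of each score into a classification label for a policy model is one common reading of the cited paper; the library's exact computation may differ.

```python
def doubly_robust_scores(treatment, y, mu0, mu1, p, clip_bounds=(0.001, 0.999)):
    """Standard AIPW score per unit: model-based effect plus an
    inverse-propensity-weighted correction from the observed outcome."""
    scores = []
    for w, yi, m0, m1, pi in zip(treatment, y, mu0, mu1, p):
        pi = min(max(pi, clip_bounds[0]), clip_bounds[1])  # avoid extreme weights
        score = (m1 - m0) + w * (yi - m1) / pi - (1 - w) * (yi - m0) / (1 - pi)
        scores.append(score)
    return scores


treatment = [1, 0, 0]
y = [3.0, 1.0, 3.0]
mu0 = [1.0, 1.0, 1.0]  # predicted outcome under control
mu1 = [2.0, 2.0, 2.0]  # predicted outcome under treatment
p = [0.5, 0.5, 0.5]
scores = doubly_robust_scores(treatment, y, mu0, mu1, p)

# Possible inputs for a policy classifier: label = sign, weight = magnitude.
labels = [1 if s > 0 else 0 for s in scores]
weights = [abs(s) for s in scores]
print(scores, labels, weights)
```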

fit(X, treatment, y, p=None, dhat=None)[source]¶

Fit the treatment assignment policy learner.

Parameters

X (np.matrix) – a feature matrix
treatment (np.array) – a treatment vector (1 if treated, otherwise 0)
y (np.array) – an outcome vector
p (np.array, optional) – user provided propensity score vector between 0 and 1
dhat (np.array, optional) – user provided predicted treatment effect vector

Returns

returns an instance of self.

Return type

self

predict(X)[source]¶

Predict treatment assignment that optimizes the outcome.

Parameters: X (np.matrix) – a feature matrix
Returns: predictions of treatment assignment.
Return type: (numpy.ndarray)

predict_proba(X)[source]¶

Predict treatment assignment score that optimizes the outcome.

Parameters: X (np.matrix) – a feature matrix
Returns: predictions of treatment assignment score.
Return type: (numpy.ndarray)

causalml.optimize.get_actual_value(treatment, observed_outcome, conversion_value, conditions, conversion_cost, impression_cost)[source]¶

Set the conversion and impression costs based on a dict of parameters.

Calculate the actual value of targeting a user with the actual treatment group using the above parameters.

Parameters

treatment (array, shape = (num_samples, )) – Treatment array.
observed_outcome (array, shape = (num_samples, )) – Observed outcome array, aka y.
conversion_value (array, shape = (num_samples, )) – The value of converting a given user.
conditions (list, len = len(set(treatment))) – List of treatment conditions.
conversion_cost (array, shape = (num_samples, num_treatment)) – Array of conversion costs for each unit in each treatment.
impression_cost (array, shape = (num_samples, num_treatment)) – Array of impression costs for each unit in each treatment.

Returns

actual_value (array, shape = (num_samples, )) – Array of actual values of having a user in their actual treatment group.
conversion_value (array, shape = (num_samples, )) – Array of payoffs from converting a user.

causalml.optimize.get_treatment_costs(treatment, control_name, cc_dict, ic_dict)[source]¶

Set the conversion and impression costs based on a dict of parameters.

Calculate the actual cost of targeting a user with the actual treatment group using the above parameters.

Parameters

treatment (array, shape = (num_samples, )) – Treatment array.
control_name (str) – Control group name as string.
cc_dict (dict) – Dict containing the conversion cost for each treatment.
ic_dict (dict) – Dict containing the impression cost for each treatment.

Returns

conversion_cost (ndarray, shape = (num_samples, num_treatments)) – An array of conversion costs for each treatment.
impression_cost (ndarray, shape = (num_samples, num_treatments)) – An array of impression costs for each treatment.
conditions (list, len = len(set(treatment))) – A list of experimental conditions.

causalml.optimize.get_uplift_best(cate, conditions)[source]¶

Takes the CATE prediction from a learner, adds the control outcome array and finds the name of the argmax condition.

Parameters

cate (array, shape = (num_samples, )) – The conditional average treatment effect prediction.
conditions (list, len = len(set(treatment))) – List of treatment conditions.

Returns: uplift_recomm_name – The experimental group recommended by the learner.
Return type: array, shape = (num_samples, )
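
The selection step can be sketched as follows: the control condition is given an uplift of zero, and each unit is assigned the condition with the largest value. The function uplift_best is a hypothetical stand-in, and it assumes that conditions[0] is the control and that cate has one column per remaining condition, in order.

```python
def uplift_best(cate, conditions):
    """Return the name of the best condition per unit.
    Assumes conditions[0] is the control group."""
    best = []
    for row in cate:
        values = [0.0] + list(row)  # control has zero uplift relative to itself
        best_idx = max(range(len(values)), key=values.__getitem__)
        best.append(conditions[best_idx])
    return best


cate = [[0.1, 0.3], [-0.2, -0.1], [0.4, 0.0]]
conditions = ["control", "treatment1", "treatment2"]
recommended = uplift_best(cate, conditions)
print(recommended)
```

When every treatment has a negative estimated effect, as in the second row, the control condition is recommended.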

causalml.dataset module¶

causalml.dataset.bar_plot_summary(synthetic_summary, k, drop_learners=[], drop_cols=[], sort_cols=['MSE', 'Abs % Error of ATE'])[source]¶

Generates a bar plot comparing learner performance.

Parameters

synthetic_summary (pd.DataFrame) – summary generated by get_synthetic_summary()
k (int) – number of simulations (used only for plot title text)
drop_learners (list, optional) – list of learners (str) to omit when plotting
drop_cols (list, optional) – list of metrics (str) to omit when plotting
sort_cols (list, optional) – list of metrics (str) to sort on when plotting

causalml.dataset.bar_plot_summary_holdout(train_summary, validation_summary, k, drop_learners=[], drop_cols=[])[source]¶

Generates a bar plot comparing learner performance by training and validation

Parameters

train_summary (pd.DataFrame) – summary for training synthetic data generated by get_synthetic_summary_holdout()
validation_summary (pd.DataFrame) – summary for validation synthetic data generated by get_synthetic_summary_holdout()
k (int) – number of simulations (used only for plot title text)
drop_learners (list, optional) – list of learners (str) to omit when plotting
drop_cols (list, optional) – list of metrics (str) to omit when plotting

causalml.dataset.distr_plot_single_sim(synthetic_preds, kind='kde', drop_learners=[], bins=50, histtype='step', alpha=1, linewidth=1, bw_method=1)[source]¶

Plots the distribution of each learner’s predictions (for a single simulation). Kernel Density Estimation (kde) and actual histogram plots supported.

Parameters

synthetic_preds (dict) – dictionary of predictions generated by get_synthetic_preds()
kind (str, optional) – ‘kde’ or ‘hist’
drop_learners (list, optional) – list of learners (str) to omit when plotting
bins (int, optional) – number of bins to plot if kind set to ‘hist’
histtype (str, optional) – histogram type if kind set to ‘hist’
alpha (float, optional) – alpha (transparency) for plotting
linewidth (int, optional) – line width for plotting
bw_method (float, optional) – parameter for kde

causalml.dataset.get_synthetic_auuc(synthetic_preds, drop_learners=[], outcome_col='y', treatment_col='w', treatment_effect_col='tau', plot=True)[source]¶

Get auuc values for cumulative gains of model estimates in quantiles.

For details, reference get_cumgain() and plot_gain().

Parameters

synthetic_preds (dict) – dictionary of predictions generated by get_synthetic_preds() or get_synthetic_preds_holdout()
outcome_col (str, optional) – the column name for the actual outcome
treatment_col (str, optional) – the column name for the treatment indicator (0 or 1)
treatment_effect_col (str, optional) – the column name for the true treatment effect
plot (boolean, optional) – plot the cumulative gain chart or not

Returns: auuc values by learner for cumulative gains of model estimates
Return type: (pandas.DataFrame)

causalml.dataset.get_synthetic_preds(synthetic_data_func, n=1000, estimators={})[source]¶

Generate predictions for synthetic data using specified function (single simulation)

Parameters

synthetic_data_func (function) – synthetic data generation function
n (int, optional) – number of samples
estimators (dict of object) – dict of names and objects of treatment effect estimators

Returns

dict of the actual and estimates of treatment effects

Return type

(dict)

causalml.dataset.get_synthetic_preds_holdout(synthetic_data_func, n=1000, valid_size=0.2, estimators={})[source]¶

Generate predictions for synthetic data using specified function (single simulation) for train and holdout

Parameters

synthetic_data_func (function) – synthetic data generation function
n (int, optional) – number of samples
valid_size (float, optional) – validation/hold out data size
estimators (dict of object) – dict of names and objects of treatment effect estimators

Returns

synthetic training and validation data dictionaries:

Return type

(tuple)

causalml.dataset.get_synthetic_summary(synthetic_data_func, n=1000, k=1, estimators={})[source]¶

Generate a summary for predictions on synthetic data using specified function

Parameters

synthetic_data_func (function) – synthetic data generation function
n (int, optional) – number of samples per simulation
k (int, optional) – number of simulations

causalml.dataset.get_synthetic_summary_holdout(synthetic_data_func, n=1000, valid_size=0.2, k=1)[source]¶

Generate a summary for predictions on synthetic data for train and holdout using specified function

Parameters

synthetic_data_func (function) – synthetic data generation function
n (int, optional) – number of samples per simulation
valid_size (float, optional) – validation/hold out data size
k (int, optional) – number of simulations

Returns

summary evaluation metrics of predictions for train and validation:

Return type

(tuple)

causalml.dataset.make_uplift_classification(n_samples=1000, treatment_name=['control', 'treatment1', 'treatment2', 'treatment3'], y_name='conversion', n_classification_features=10, n_classification_informative=5, n_classification_redundant=0, n_classification_repeated=0, n_uplift_increase_dict={'treatment1': 2, 'treatment2': 2, 'treatment3': 2}, n_uplift_decrease_dict={'treatment1': 0, 'treatment2': 0, 'treatment3': 0}, delta_uplift_increase_dict={'treatment1': 0.02, 'treatment2': 0.05, 'treatment3': 0.1}, delta_uplift_decrease_dict={'treatment1': 0.0, 'treatment2': 0.0, 'treatment3': 0.0}, n_uplift_increase_mix_informative_dict={'treatment1': 1, 'treatment2': 1, 'treatment3': 1}, n_uplift_decrease_mix_informative_dict={'treatment1': 0, 'treatment2': 0, 'treatment3': 0}, positive_class_proportion=0.5, random_seed=20190101)[source]¶

Generate a synthetic dataset for classification uplift modeling problem.

Parameters

n_samples (int, optional (default=1000)) – The number of samples to be generated for each treatment group.
treatment_name (list, optional (default = ['control','treatment1','treatment2','treatment3'])) – The list of treatment names.
y_name (string, optional (default = 'conversion')) – The name of the outcome variable to be used as a column in the output dataframe.
n_classification_features (int, optional (default = 10)) – Total number of features for base classification
n_classification_informative (int, optional (default = 5)) – Total number of informative features for base classification
n_classification_redundant (int, optional (default = 0)) – Total number of redundant features for base classification
n_classification_repeated (int, optional (default = 0)) – Total number of repeated features for base classification
n_uplift_increase_dict (dictionary, optional (default: {'treatment1': 2, 'treatment2': 2, 'treatment3': 2})) – Number of features for generating positive treatment effects for corresponding treatment group. Dictionary of {treatment_key: number_of_features_for_increase_uplift}.
n_uplift_decrease_dict (dictionary, optional (default: {'treatment1': 0, 'treatment2': 0, 'treatment3': 0})) – Number of features for generating negative treatment effects for corresponding treatment group. Dictionary of {treatment_key: number_of_features_for_decrease_uplift}.
delta_uplift_increase_dict (dictionary, optional (default: {'treatment1': .02, 'treatment2': .05, 'treatment3': .1})) – Positive treatment effect created by the positive uplift features on the base classification label. Dictionary of {treatment_key: increase_delta}.
delta_uplift_decrease_dict (dictionary, optional (default: {'treatment1': 0., 'treatment2': 0., 'treatment3': 0.})) – Negative treatment effect created by the negative uplift features on the base classification label. Dictionary of {treatment_key: decrease_delta}.
n_uplift_increase_mix_informative_dict (dictionary, optional (default: {'treatment1': 1, 'treatment2': 1, 'treatment3': 1})) – Number of positive mix features for each treatment. The positive mix feature is defined as a linear combination of a randomly selected informative classification feature and a randomly selected positive uplift feature. The linear combination is made by two coefficients sampled from a uniform distribution between -1 and 1.
n_uplift_decrease_mix_informative_dict (dictionary, optional (default: {'treatment1': 0, 'treatment2': 0, 'treatment3': 0})) – Number of negative mix features for each treatment. The negative mix feature is defined as a linear combination of a randomly selected informative classification feature and a randomly selected negative uplift feature. The linear combination is made by two coefficients sampled from a uniform distribution between -1 and 1.
positive_class_proportion (float, optional (default = 0.5)) – The proportion of positive label (1) in the control group.
random_seed (int, optional (default = 20190101)) – The random seed to be used in the data generation process.
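
As a rough illustration of the mix features described above, the following dependency-free sketch builds one mix feature as a linear combination of an informative feature and an uplift feature, with two coefficients drawn uniformly from [-1, 1]. The function name `make_mix_feature` and the toy columns are hypothetical; this is not the library's implementation.

```python
import random


def make_mix_feature(informative, uplift, rng):
    """Combine two feature columns with coefficients drawn from U(-1, 1)."""
    a = rng.uniform(-1.0, 1.0)  # weight on the informative feature
    b = rng.uniform(-1.0, 1.0)  # weight on the uplift feature
    mixed = [a * x + b * u for x, u in zip(informative, uplift)]
    return mixed, (a, b)


rng = random.Random(20190101)
informative_col = [rng.gauss(0.0, 1.0) for _ in range(5)]  # toy informative feature
uplift_col = [rng.gauss(0.0, 1.0) for _ in range(5)]       # toy uplift feature
mix_col, (coef_a, coef_b) = make_mix_feature(informative_col, uplift_col, rng)
```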

Returns

df_res (DataFrame) – A data frame containing the treatment label, features, and outcome variable.
x_name (list) – The list of feature names generated.

Notes

The algorithm for generating the base classification dataset is adapted from the make_classification method in the sklearn package, which uses the algorithm in Guyon [1] designed to generate the “Madelon” dataset.

References

1: I. Guyon, “Design of experiments for the NIPS 2003 variable selection benchmark”, 2003.

causalml.dataset.scatter_plot_single_sim(synthetic_preds)[source]¶

Creates a grid of scatter plots comparing each learner’s predictions with the truth (for a single simulation).

Parameters: synthetic_preds (dict) – dictionary of predictions generated by get_synthetic_preds() or get_synthetic_preds_holdout()

causalml.dataset.scatter_plot_summary(synthetic_summary, k, drop_learners=[], drop_cols=[])[source]¶

Generates a scatter plot comparing learner performance. Each learner’s performance is plotted as a point in the (Abs % Error of ATE, MSE) space.

Parameters

synthetic_summary (pd.DataFrame) – summary generated by get_synthetic_summary()
k (int) – number of simulations (used only for plot title text)
drop_learners (list, optional) – list of learners (str) to omit when plotting
drop_cols (list, optional) – list of metrics (str) to omit when plotting

causalml.dataset.scatter_plot_summary_holdout(train_summary, validation_summary, k, label=['Train', 'Validation'], drop_learners=[], drop_cols=[])[source]¶

Generates a scatter plot comparing learner performance by training and validation.

Parameters

train_summary (pd.DataFrame) – summary for training synthetic data generated by get_synthetic_summary_holdout()
validation_summary (pd.DataFrame) – summary for validation synthetic data generated by get_synthetic_summary_holdout()
k (int) – number of simulations (used only for plot title text)
label (list, optional) – legend labels for the plot
drop_learners (list, optional) – list of learners (str) to omit when plotting
drop_cols (list, optional) – list of metrics (str) to omit when plotting

causalml.dataset.simulate_easy_propensity_difficult_baseline(n=1000, p=5, sigma=1.0, adj=0.0)[source]¶

Synthetic data with easy propensity and a difficult baseline: From Setup C in Nie X. and Wager S. (2018) ‘Quasi-Oracle Estimation of Heterogeneous Treatment Effects’

Parameters

n (int, optional) – number of observations
p (int, optional) – number of covariates (>=3)
sigma (float) – standard deviation of the error term
adj (float) – no effect; added for consistency with the other simulation functions

Returns

Synthetically generated samples with the following outputs:

Return type

(tuple)

causalml.dataset.simulate_hidden_confounder(n=10000, p=5, sigma=1.0, adj=0.0)[source]¶

Synthetic dataset with a hidden confounder biasing treatment. From Louizos et al. (2018) “Causal Effect Inference with Deep Latent-Variable Models”

Parameters

n (int, optional) – number of observations
p (int, optional) – number of covariates (>=3)
sigma (float) – standard deviation of the error term
adj (float) – no effect; added for consistency with the other simulation functions

Returns

Synthetically generated samples with the following outputs:

Return type

(tuple)

causalml.dataset.simulate_nuisance_and_easy_treatment(n=1000, p=5, sigma=1.0, adj=0.0)[source]¶

Synthetic data with difficult nuisance components and an easy treatment effect: From Setup A in Nie X. and Wager S. (2018) ‘Quasi-Oracle Estimation of Heterogeneous Treatment Effects’

Parameters

n (int, optional) – number of observations
p (int, optional) – number of covariates (>=5)
sigma (float) – standard deviation of the error term
adj (float) – adjustment term for the distribution of the propensity score, e. Higher values shift the distribution toward 0.

Returns

Synthetically generated samples with the following outputs:

Return type

(tuple)

causalml.dataset.simulate_randomized_trial(n=1000, p=5, sigma=1.0, adj=0.0)[source]¶

Synthetic data of a randomized trial: From Setup B in Nie X. and Wager S. (2018) ‘Quasi-Oracle Estimation of Heterogeneous Treatment Effects’

Parameters

n (int, optional) – number of observations
p (int, optional) – number of covariates (>=5)
sigma (float) – standard deviation of the error term
adj (float) – no effect; added for consistency with the other simulation functions

Returns

Synthetically generated samples with the following outputs:

Return type

(tuple)
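
To make the idea of a randomized-trial simulation concrete, here is a minimal, dependency-free sketch in which every unit is treated with probability 0.5 regardless of its covariates. The baseline and treatment-effect formulas, the function name `toy_randomized_trial`, and the layout of each returned row are illustrative assumptions, not a statement of what the library computes or returns.

```python
import math
import random


def toy_randomized_trial(n=1000, p=5, sigma=1.0, seed=0):
    """Toy randomized trial: assignment ignores the covariates (needs p >= 5)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(p)]
        e = 0.5                                    # constant propensity
        w = 1 if rng.random() < e else 0           # coin-flip assignment
        b = max(0.0, x[0] + x[1], x[2]) + max(0.0, x[3] + x[4])  # assumed baseline
        tau = x[0] + math.log1p(math.exp(x[1]))    # assumed treatment effect
        y = b + (w - 0.5) * tau + sigma * rng.gauss(0.0, 1.0)
        rows.append((y, x, w, tau, b, e))          # hypothetical row layout
    return rows


sample = toy_randomized_trial(n=2000, seed=42)
treated_share = sum(row[2] for row in sample) / len(sample)
```

Because assignment is a coin flip, `treated_share` should be close to 0.5 and the propensity is the same for every row.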

causalml.dataset.simulate_unrelated_treatment_control(n=1000, p=5, sigma=1.0, adj=0.0)[source]¶

Synthetic data with unrelated treatment and control groups. From Setup D in Nie X. and Wager S. (2018) ‘Quasi-Oracle Estimation of Heterogeneous Treatment Effects’

Parameters

n (int, optional) – number of observations
p (int, optional) – number of covariates (>=3)
sigma (float) – standard deviation of the error term
adj (float) – adjustment term for the distribution of the propensity score, e. Higher values shift the distribution toward 0.

Returns

Synthetically generated samples with the following outputs:

Return type

(tuple)

causalml.dataset.synthetic_data(mode=1, n=1000, p=5, sigma=1.0, adj=0.0)[source]¶

Synthetic data in Nie X. and Wager S. (2018) ‘Quasi-Oracle Estimation of Heterogeneous Treatment Effects’

Parameters

mode (int, optional) – mode of the simulation: 1 for difficult nuisance components and an easy treatment effect. 2 for a randomized trial. 3 for an easy propensity and a difficult baseline. 4 for unrelated treatment and control groups. 5 for a hidden confounder biasing treatment.
n (int, optional) – number of observations
p (int, optional) – number of covariates (>=5)
sigma (float) – standard deviation of the error term
adj (float) – adjustment term for the distribution of the propensity score, e. Higher values shift the distribution toward 0. It does not apply to mode == 2 or 3.

Returns

Synthetically generated samples with the following outputs:

Return type

(tuple)

causalml.match module¶

class causalml.match.MatchOptimizer(treatment_col='is_treatment', ps_col='pihat', user_col=None, matching_covariates=['pihat'], max_smd=0.1, max_deviation=0.1, caliper_range=(0.01, 0.5), max_pihat_range=(0.95, 0.999), max_iter_per_param=5, min_users_per_group=1000, smd_cols=['pihat'], dev_cols_transformations={'pihat': <function mean>}, dev_factor=1.0, verbose=True)[source]¶

Bases: object

check_table_one(tableone, matched, score_cols, pihat_threshold, caliper)[source]¶

match_and_check(score_cols, pihat_threshold, caliper)[source]¶

search_best_match(df)[source]¶

single_match(score_cols, pihat_threshold, caliper)[source]¶

class causalml.match.NearestNeighborMatch(caliper=0.2, replace=False, ratio=1, shuffle=True, random_state=None, n_jobs=-1)[source]¶

Bases: object

Propensity score matching based on the nearest neighbor algorithm.

caliper¶

threshold to be considered as a match.

Type: float

replace¶

whether to match with replacement or not

Type: bool

ratio¶

ratio of control / treatment to be matched. Used only if replace=True.

Type: int

shuffle¶

whether to shuffle the treatment group data before matching

Type: bool

random_state¶

RandomState or an int seed

Type: numpy.random.RandomState or int

n_jobs¶

The number of parallel jobs to run for neighbors search. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors

Type: int

match(data, treatment_col, score_cols)[source]¶

Find matches from the control group by matching on specified columns (propensity preferred).

Parameters

data (pandas.DataFrame) – total input data
treatment_col (str) – the column name for the treatment
score_cols (list) – list of column names for matching (propensity column should be included)

Returns

The subset of data consisting of matched treatment and control group data.

Return type

(pandas.DataFrame)
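
The general idea of nearest-neighbor matching with a caliper and without replacement can be sketched as follows. This is a simplified greedy version on a single score, written only for illustration: the function name `greedy_match` is hypothetical, and treating the caliper as an absolute distance on the score is an assumption that may differ from how the class scales it.

```python
def greedy_match(treated_scores, control_scores, caliper=0.2):
    """Match each treated unit to its nearest unused control within the caliper."""
    available = set(range(len(control_scores)))
    pairs = []
    for t_idx, t_score in enumerate(treated_scores):
        if not available:
            break
        # nearest remaining control by absolute score distance
        c_idx = min(available, key=lambda j: abs(control_scores[j] - t_score))
        if abs(control_scores[c_idx] - t_score) <= caliper:
            pairs.append((t_idx, c_idx))
            available.remove(c_idx)  # without replacement: use each control once
    return pairs


treated = [0.80, 0.55, 0.30]
control = [0.78, 0.50, 0.10, 0.52]
pairs = greedy_match(treated, control, caliper=0.05)
```

Here the third treated unit has no control within the caliper, so it is left unmatched.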

match_by_group(data, treatment_col, score_cols, groupby_col)[source]¶

Find matches from the control group stratified by groupby_col, by matching on specified columns (propensity preferred).

Parameters

data (pandas.DataFrame) – total sample data
treatment_col (str) – the column name for the treatment
score_cols (list) – list of column names for matching (propensity column should be included)
groupby_col (str) – the column name to be used for stratification

Returns

The subset of data consisting of matched treatment and control group data.

Return type

(pandas.DataFrame)

causalml.match.create_table_one(data, treatment_col, features)[source]¶

Report balance in input features between the treatment and control groups.

References

R’s tableone at CRAN: https://github.com/kaz-yos/tableone
Python’s tableone at PyPI: https://github.com/tompollard/tableone

Parameters

data (pandas.DataFrame) – total or matched sample data
treatment_col (str) – the column name for the treatment
features (list of str) – the column names of features

Returns

A table with the means and standard deviations in the treatment and control groups, and the SMD between the two groups for the features.

Return type

(pandas.DataFrame)

causalml.match.smd(feature, treatment)[source]¶

Calculate the standardized mean difference (SMD) of a feature between the treatment and control groups.

The definition is available at https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3144483/#s11title

Parameters

feature (pandas.Series) – a column of a feature to calculate SMD for
treatment (pandas.Series) – a column that indicates whether a row is in the treatment group or not

Returns

The SMD of the feature

Return type

(float)
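
For reference, a common definition of the SMD for a continuous feature divides the difference in group means by the square root of the average of the two group variances. The sketch below follows that definition using only the standard library; the function name and sample data are illustrative, and the library's own computation may differ in detail.

```python
from statistics import mean, variance


def standardized_mean_difference(feature, treatment):
    """SMD = (mean_t - mean_c) / sqrt((var_t + var_c) / 2), using sample variances."""
    treated = [x for x, w in zip(feature, treatment) if w == 1]
    control = [x for x, w in zip(feature, treatment) if w == 0]
    pooled_sd = ((variance(treated) + variance(control)) / 2.0) ** 0.5
    return (mean(treated) - mean(control)) / pooled_sd


feature = [2.0, 4.0, 6.0, 1.0, 3.0, 5.0]
treatment = [1, 1, 1, 0, 0, 0]
smd_value = standardized_mean_difference(feature, treatment)  # 0.5 for this toy data
```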

causalml.propensity module¶

class causalml.propensity.ElasticNetPropensityModel(clip_bounds=(0.001, 0.999), **model_kwargs)[source]¶

Bases: causalml.propensity.LogisticRegressionPropensityModel

class causalml.propensity.GradientBoostedPropensityModel(early_stop=False, clip_bounds=(0.001, 0.999), **model_kwargs)[source]¶

Bases: causalml.propensity.PropensityModel

Gradient boosted propensity score model with optional early stopping.

Notes

Please see the xgboost documentation for more information on gradient boosting tuning parameters: https://xgboost.readthedocs.io/en/latest/python/python_api.html

fit(X, y, early_stopping_rounds=10, stop_val_size=0.2)[source]¶

Fit a propensity model.

Parameters

X (numpy.ndarray) – a feature matrix
y (numpy.ndarray) – a binary target vector

predict(X)[source]¶

Predict propensity scores.

Parameters: X (numpy.ndarray) – a feature matrix
Returns: Propensity scores between 0 and 1.
Return type: (numpy.ndarray)

class causalml.propensity.LogisticRegressionPropensityModel(clip_bounds=(0.001, 0.999), **model_kwargs)[source]¶

Bases: causalml.propensity.PropensityModel

Propensity regression model based on the LogisticRegression algorithm.

class causalml.propensity.PropensityModel(clip_bounds=(0.001, 0.999), **model_kwargs)[source]¶

Bases: object

fit(X, y)[source]¶

Fit a propensity model.

Parameters

X (numpy.ndarray) – a feature matrix
y (numpy.ndarray) – a binary target vector

fit_predict(X, y)[source]¶

Fit a propensity model and predict propensity scores.

Parameters

X (numpy.ndarray) – a feature matrix
y (numpy.ndarray) – a binary target vector

Returns

Propensity scores between 0 and 1.

Return type

(numpy.ndarray)

predict(X)[source]¶

Predict propensity scores.

Parameters: X (numpy.ndarray) – a feature matrix
Returns: Propensity scores between 0 and 1.
Return type: (numpy.ndarray)
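
The clip_bounds argument in the constructors suggests that predicted scores are kept away from exactly 0 and 1. Assuming that interpretation, the sketch below shows the effect of such clipping and why it matters for inverse propensity weights. The helper `clip_scores` is hypothetical and not part of the library.

```python
def clip_scores(scores, clip_bounds=(0.001, 0.999)):
    """Clip each score into the closed interval given by clip_bounds."""
    low, high = clip_bounds
    return [min(max(s, low), high) for s in scores]


raw = [0.0, 0.0004, 0.25, 0.9995, 1.0]
clipped = clip_scores(raw)

# Inverse propensity weights stay finite once scores are bounded away from 0.
weights = [1.0 / s for s in clipped]
```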

causalml.propensity.calibrate(ps, treatment)[source]¶

Calibrate propensity scores with logistic GAM.

Ref: https://pygam.readthedocs.io/en/latest/api/logisticgam.html

Parameters

ps (numpy.array) – a propensity score vector
treatment (numpy.array) – a binary treatment vector (0: control, 1: treated)

Returns

a calibrated propensity score vector

Return type

(numpy.array)

causalml.propensity.compute_propensity_score(X, treatment, p_model=None, X_pred=None, treatment_pred=None, calibrate_p=True)[source]¶

Generate propensity scores if the user did not provide them.

Parameters

X (np.matrix) – features for training
treatment (np.array or pd.Series) – a treatment vector for training
p_model (propensity model object, optional) – ElasticNetPropensityModel (default) / GradientBoostedPropensityModel
X_pred (np.matrix, optional) – features for prediction
treatment_pred (np.array or pd.Series, optional) – a treatment vector for prediction
calibrate_p (bool, optional) – whether to calibrate the propensity score

Returns

(tuple)

p (numpy.ndarray): propensity score
p_model (PropensityModel): a trained PropensityModel object

causalml.metrics module¶

class causalml.metrics.Sensitivity(df, inference_features, p_col, treatment_col, outcome_col, learner, *args, **kwargs)[source]¶

Bases: object

A Sensitivity Check class to support Placebo Treatment, Irrelevant Additional Confounder and Subset validation refutation methods to verify causal inference.

Reference: https://github.com/microsoft/dowhy/blob/master/dowhy/causal_refuters/

get_ate_ci(X, p, treatment, y)[source]¶

Return the confidence intervals for treatment effects prediction.

Parameters

X (np.matrix) – a feature matrix
p (np.array) – a propensity score vector between 0 and 1
treatment (np.array) – a treatment vector (1 if treated, otherwise 0)
y (np.array) – an outcome vector

Returns

Mean and confidence interval (LB, UB) of the ATE estimate.

Return type

(numpy.ndarray)

static get_class_object(method_name, *args, **kwargs)[source]¶

Return a class object based on the input method.

Parameters: method_name (list of str) – a list of sensitivity analysis methods

Returns: Sensitivity class
Return type: (class)

get_prediction(X, p, treatment, y)[source]¶

Return the treatment effects prediction.

Parameters

X (np.matrix) – a feature matrix
p (np.array) – a propensity score vector between 0 and 1
treatment (np.array) – a treatment vector (1 if treated, otherwise 0)
y (np.array) – an outcome vector

Returns

Predictions of treatment effects

Return type

(numpy.ndarray)

sensitivity_analysis(methods, sample_size=None, confound='one_sided', alpha_range=None)[source]¶

Return the sensitivity data for the given methods.

Parameters

methods (list of str) – a list of sensitivity analysis methods
sample_size (float, optional) – ratio used to subset the original data
confound (string, optional) – the name of the confounding function
alpha_range (np.array, optional) – a parameter to pass to the confounding function

Returns

X (np.matrix) – a feature matrix
p (np.array) – a propensity score vector between 0 and 1
treatment (np.array) – a treatment vector (1 if treated, otherwise 0)
y (np.array) – an outcome vector

sensitivity_estimate()[source]¶

summary(method)[source]¶

Summary report.

Parameters: method (str) – sensitivity analysis method

Returns: a summary dataframe
Return type: (pd.DataFrame)

class causalml.metrics.SensitivityPlaceboTreatment(*args, **kwargs)[source]¶

Bases: causalml.metrics.sensitivity.Sensitivity

Replaces the treatment variable with a new variable randomly generated.

sensitivity_estimate()[source]¶

Summary report.

Parameters: return_ci (str) – sensitivity analysis method

Returns: a summary dataframe
Return type: (pd.DataFrame)
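
The intuition behind a placebo-treatment check can be shown on toy data: when the real treatment is replaced by a randomly generated one, a sound estimator should report an effect near zero. The sketch below uses a plain difference in means as a stand-in estimator, which is an assumption made for brevity; the class itself works with the learner it is given.

```python
import random


def diff_in_means(y, w):
    """Difference between the mean outcome of treated and control units."""
    treated = [yi for yi, wi in zip(y, w) if wi == 1]
    control = [yi for yi, wi in zip(y, w) if wi == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


rng = random.Random(7)
n = 4000
w_real = [rng.randint(0, 1) for _ in range(n)]
y = [2.0 * wi + rng.gauss(0.0, 1.0) for wi in w_real]  # true effect is 2.0

w_placebo = [rng.randint(0, 1) for _ in range(n)]      # unrelated to the outcome
ate_real = diff_in_means(y, w_real)
ate_placebo = diff_in_means(y, w_placebo)
```

With the real treatment the estimate lands near 2.0, while the placebo estimate lands near 0.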

class causalml.metrics.SensitivityRandomCause(*args, **kwargs)[source]¶

Bases: causalml.metrics.sensitivity.Sensitivity

Adds an irrelevant random covariate to the dataframe.

sensitivity_estimate()[source]¶

class causalml.metrics.SensitivityRandomReplace(*args, **kwargs)[source]¶

Bases: causalml.metrics.sensitivity.Sensitivity

Replaces a random covariate with an irrelevant variable.

sensitivity_estimate()[source]¶

Replaces a random covariate with an irrelevant variable.

class causalml.metrics.SensitivitySelectionBias(*args, confound='one_sided', alpha_range=None, sensitivity_features=None, **kwargs)[source]¶

Bases: causalml.metrics.sensitivity.Sensitivity

Reference:

[1] Blackwell, Matthew. “A selection bias approach to sensitivity analysis for causal effects.” Political Analysis 22.2 (2014): 169-182. https://www.mattblackwell.org/files/papers/causalsens.pdf

[2] Confounding parameter alpha_range using the same range as in: https://github.com/mattblackwell/causalsens/blob/master/R/causalsens.R

causalsens()[source]¶

static partial_rsqs_confounding(sens_df, feature_name, partial_rsqs_value, range=0.01)[source]¶

Check the confounding amount of the ATE that corresponds to a feature's partial R-squared value.

Parameters

sens_df (pandas.DataFrame) – a data frame output from causalsens
feature_name (str) – feature name to check
partial_rsqs_value (float) – partial R-squared value of the feature
range (float) – range to search from sens_df

Returns: min and max value of confounding amount

static plot(sens_df, partial_rsqs_df=None, type='raw', ci=False, partial_rsqs=False)[source]¶

Plot the results of a sensitivity analysis against unmeasured confounding.

Parameters

sens_df (pandas.DataFrame) – a data frame output from causalsens
partial_rsqs_df (pandas.DataFrame) – a data frame output from causalsens including partial R-squared values
type (str, optional) – the type of plot to draw; ‘raw’ and ‘r.squared’ are supported
ci (bool, optional) – whether to plot confidence intervals
partial_rsqs (bool, optional) – whether to plot partial R-squared results

summary(method='Selection Bias')[source]¶

Summary report for the Selection Bias method.

Parameters: method (str) – sensitivity analysis method

Returns: a summary dataframe
Return type: (pd.DataFrame)

class causalml.metrics.SensitivitySubsetData(*args, **kwargs)[source]¶

Bases: causalml.metrics.sensitivity.Sensitivity

Takes a random subset of size sample_size of the data.

sensitivity_estimate()[source]¶

causalml.metrics.ape(y, p)[source]¶

Absolute Percentage Error (APE).

Parameters

y (float) – target
p (float) – prediction

Returns: APE
Return type: e (float)
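
A minimal sketch of the usual APE definition, |1 - p / y|, is shown below. The function name and the explicit zero-target check are illustrative choices rather than the library's code.

```python
def absolute_percentage_error(y, p):
    """Return |1 - p / y|; undefined when the target y is zero."""
    if y == 0:
        raise ValueError("APE is undefined when the target is zero")
    return abs(1.0 - p / y)


ape_value = absolute_percentage_error(y=0.5, p=0.45)  # about 0.1, i.e. a 10% error
```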

causalml.metrics.auuc_score(df, outcome_col='y', treatment_col='w', treatment_effect_col='tau', normalize=True, tmle=False, *args, **kwarg)[source]¶

Calculate the AUUC (Area Under the Uplift Curve) score.

Returns: the AUUC score
Return type: (float)

causalml.metrics.classification_metrics(y, p, w=None, metrics={'AUC': <function roc_auc_score>, 'Log Loss': <function logloss>})[source]¶

Log metrics for classifiers.

Parameters

y (numpy.array) – target
p (numpy.array) – prediction
w (numpy.array, optional) – a treatment vector (1 or True: treatment, 0 or False: control). If given, log metrics for the treatment and control group separately
metrics (dict, optional) – a dictionary of the metric names and functions

causalml.metrics.get_cumgain(df, outcome_col='y', treatment_col='w', treatment_effect_col='tau', normalize=False, random_seed=42)[source]¶

Get cumulative gains of model estimates in population.

If the true treatment effect is provided (e.g. in synthetic data), it’s calculated as the cumulative gain of the true treatment effect in each population. Otherwise, it’s calculated as the cumulative difference between the mean outcomes of the treatment and control groups in each population.

For details, see Section 4.1 of Gutierrez and Gérardy (2016), Causal Inference and Uplift Modeling: A review of the literature.

For the former, treatment_effect_col should be provided. For the latter, both outcome_col and treatment_col should be provided.

Parameters

df (pandas.DataFrame) – a data frame with model estimates and actual data as columns
outcome_col (str, optional) – the column name for the actual outcome
treatment_col (str, optional) – the column name for the treatment indicator (0 or 1)
treatment_effect_col (str, optional) – the column name for the true treatment effect
normalize (bool, optional) – whether to normalize the y-axis to 1 or not
random_seed (int, optional) – random seed for numpy.random.rand()

Returns

cumulative gains of model estimates in population

Return type

(pandas.DataFrame)
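
For the case where the true treatment effect is known, the cumulative gain can be sketched as a running sum of the true effect after ranking units by a model's score from highest to lowest. The function and variable names below are hypothetical, and the sketch works on plain lists rather than a data frame.

```python
from itertools import accumulate


def cumulative_gain_true_tau(scores, true_tau):
    """Running sum of the true effect, with units ranked by model score (descending)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return list(accumulate(true_tau[i] for i in order))


model_scores = [0.9, 0.1, 0.5, 0.7]
true_tau = [1.0, 0.0, 0.5, 1.5]
gain = cumulative_gain_true_tau(model_scores, true_tau)
```

A model that ranks high-effect units first produces a curve that rises quickly and then flattens.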

causalml.metrics.get_cumlift(df, outcome_col='y', treatment_col='w', treatment_effect_col='tau', random_seed=42)[source]¶

Get average uplifts of model estimates in cumulative population.

If the true treatment effect is provided (e.g. in synthetic data), it’s calculated as the mean of the true treatment effect in each cumulative population. Otherwise, it’s calculated as the difference between the mean outcomes of the treatment and control groups in each cumulative population.

For details, see Section 4.1 of Gutierrez and Gérardy (2016), Causal Inference and Uplift Modeling: A review of the literature.

For the former, treatment_effect_col should be provided. For the latter, both outcome_col and treatment_col should be provided.

Parameters

df (pandas.DataFrame) – a data frame with model estimates and actual data as columns
outcome_col (str, optional) – the column name for the actual outcome
treatment_col (str, optional) – the column name for the treatment indicator (0 or 1)
treatment_effect_col (str, optional) – the column name for the true treatment effect
random_seed (int, optional) – random seed for numpy.random.rand()

Returns

average uplifts of model estimates in cumulative population

Return type

(pandas.DataFrame)
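
When only observed outcomes are available, the cumulative lift described above is the difference between the mean outcomes of treated and control units among the top-ranked units. The following sketch, on plain lists with hypothetical names, returns None until both groups are represented.

```python
def cumulative_lift(scores, outcome, treatment):
    """Difference in mean outcomes (treated minus control) in each top-k population."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    sum_t = sum_c = 0.0
    n_t = n_c = 0
    lifts = []
    for i in order:
        if treatment[i] == 1:
            sum_t += outcome[i]
            n_t += 1
        else:
            sum_c += outcome[i]
            n_c += 1
        # lift is undefined until both groups have at least one unit
        lifts.append(sum_t / n_t - sum_c / n_c if n_t and n_c else None)
    return lifts


scores = [0.9, 0.8, 0.6, 0.4]
outcome = [1.0, 0.0, 0.0, 1.0]
treatment = [1, 0, 1, 0]
lift = cumulative_lift(scores, outcome, treatment)
```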

causalml.metrics.get_qini(df, outcome_col='y', treatment_col='w', treatment_effect_col='tau', normalize=False, random_seed=42)[source]¶

Get Qini of model estimates in population.

If the true treatment effect is provided (e.g. in synthetic data), it’s calculated as the cumulative gain of the true treatment effect in each population. Otherwise, it’s calculated as the cumulative difference between the mean outcomes of the treatment and control groups in each population.

For details, see Radcliffe (2007), Using Control Groups to Target on Predicted Lift: Building and Assessing Uplift Models.

For the former, treatment_effect_col should be provided. For the latter, both outcome_col and treatment_col should be provided.

Parameters

df (pandas.DataFrame) – a data frame with model estimates and actual data as columns
outcome_col (str, optional) – the column name for the actual outcome
treatment_col (str, optional) – the column name for the treatment indicator (0 or 1)
treatment_effect_col (str, optional) – the column name for the true treatment effect
normalize (bool, optional) – whether to normalize the y-axis to 1 or not
random_seed (int, optional) – random seed for numpy.random.rand()

Returns

cumulative gains of model estimates in population

Return type

(pandas.DataFrame)
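
One common formulation of the Qini curve on observed data takes the cumulative outcome of treated units and subtracts the cumulative outcome of control units rescaled to the treated group size. The sketch below implements that formulation on plain lists; the names are hypothetical and the library's normalization options are not covered.

```python
def qini_curve(scores, outcome, treatment):
    """Qini value Y_t - Y_c * N_t / N_c in each top-k population."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    y_t = y_c = 0.0
    n_t = n_c = 0
    curve = []
    for i in order:
        if treatment[i] == 1:
            y_t += outcome[i]
            n_t += 1
        else:
            y_c += outcome[i]
            n_c += 1
        # rescale control outcomes to the treated group size; undefined if no controls yet
        curve.append(y_t - y_c * n_t / n_c if n_c else None)
    return curve


qini_scores = [0.9, 0.8, 0.6, 0.4]
qini_outcome = [1.0, 0.0, 0.0, 1.0]
qini_treatment = [1, 0, 1, 0]
qini = qini_curve(qini_scores, qini_outcome, qini_treatment)
```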

causalml.metrics.get_tmlegain(df, inference_col, learner=LGBMRegressor(learning_rate=0.05, n_estimators=300, num_leaves=64), outcome_col='y', treatment_col='w', p_col='p', n_segment=5, cv=None, calibrate_propensity=True, ci=False)[source]¶

Get TMLE based average uplifts of model estimates of segments.

Parameters

df (pandas.DataFrame) – a data frame with model estimates and actual data as columns
inference_col (list of str) – a list of columns used by the learner for inference
learner (optional) – a model used by TMLE to estimate the outcome
outcome_col (str, optional) – the column name for the actual outcome
treatment_col (str, optional) – the column name for the treatment indicator (0 or 1)
p_col (str, optional) – the column name for the propensity score
n_segment (int, optional) – number of segments for which TMLE estimates are computed
cv (sklearn.model_selection._BaseKFold, optional) – sklearn CV object
calibrate_propensity (bool, optional) – whether to calibrate the propensity score
ci (bool, optional) – whether to return confidence intervals for the ATE

Returns

cumulative gains of model estimates based on TMLE

Return type

(pandas.DataFrame)

causalml.metrics.get_tmleqini(df, inference_col, learner=LGBMRegressor(learning_rate=0.05, n_estimators=300, num_leaves=64), outcome_col='y', treatment_col='w', p_col='p', n_segment=5, cv=None, calibrate_propensity=True, ci=False, normalize=False)[source]¶

Get TMLE based Qini of model estimates by segments.

Parameters

df (pandas.DataFrame) – a data frame with model estimates and actual data as columns
inference_col (list of str) – a list of columns used by the learner for inference
learner (optional) – a model used by TMLE to estimate the outcome
outcome_col (str, optional) – the column name for the actual outcome
treatment_col (str, optional) – the column name for the treatment indicator (0 or 1)
p_col (str, optional) – the column name for the propensity score
n_segment (int, optional) – number of segments for which TMLE estimates are computed
cv (sklearn.model_selection._BaseKFold, optional) – sklearn CV object
calibrate_propensity (bool, optional) – whether to calibrate the propensity score
ci (bool, optional) – whether to return confidence intervals for the ATE

Returns

cumulative gains of model estimates based on TMLE

Return type

(pandas.DataFrame)

causalml.metrics.gini(y, p)[source]¶

Normalized Gini Coefficient.

Parameters

y (numpy.array) – target
p (numpy.array) – prediction

Returns

normalized Gini coefficient

Return type

e (numpy.float64)
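
One common way to build a normalized Gini coefficient is to compare the Lorenz-style curve obtained by ranking targets by the prediction with the curve obtained by ranking them by the target itself. The sketch below follows that construction using only the standard library; it is offered as an illustration under that assumption and may not match the library's implementation exactly.

```python
from itertools import accumulate


def lorenz_gap(values_in_rank_order):
    """Sum of gaps between the diagonal and the cumulative share curve."""
    n = len(values_in_rank_order)
    total = sum(values_in_rank_order)
    cum_share = [c / total for c in accumulate(values_in_rank_order)]
    return sum((k + 1) / n - share for k, share in enumerate(cum_share))


def normalized_gini(y, p):
    """Gini of the prediction ranking divided by the Gini of the ideal ranking."""
    order = sorted(range(len(y)), key=lambda i: p[i], reverse=True)
    by_pred = [y[i] for i in order]
    by_true = sorted(y, reverse=True)
    return lorenz_gap(by_pred) / lorenz_gap(by_true)


targets = [1, 0, 1, 0]
g_perfect = normalized_gini(targets, [0.9, 0.2, 0.8, 0.1])   # ranking matches targets
g_reversed = normalized_gini(targets, [0.1, 0.9, 0.2, 0.8])  # ranking is inverted
```

A perfect ranking gives 1.0 and a fully inverted ranking gives -1.0 on this toy data.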

causalml.metrics.logloss(y, p)[source]¶

Bounded log loss error.

Parameters

y (numpy.array) – target
p (numpy.array) – prediction

Returns: bounded log loss error
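
"Bounded" here is taken to mean that predictions are clipped away from 0 and 1 before the logarithm is applied, so the loss stays finite even for confidently wrong predictions. The sketch below illustrates that idea; the function name and the value of `eps` are assumptions, not the library's settings.

```python
import math


def bounded_logloss(y, p, eps=1e-15):
    """Mean binary log loss with predictions clipped into [eps, 1 - eps]."""
    total = 0.0
    for yi, pi in zip(y, p):
        pi = min(max(pi, eps), 1.0 - eps)  # keep log() finite
        total += yi * math.log(pi) + (1 - yi) * math.log(1.0 - pi)
    return -total / len(y)


loss = bounded_logloss([1, 0, 1], [1.0, 0.0, 0.5])
worst_case = bounded_logloss([1], [0.0])  # large but finite thanks to clipping
```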

causalml.metrics.mae(y_true, y_pred, *, sample_weight=None, multioutput='uniform_average')¶

Mean absolute error regression loss.
