When analyzing product usage data, one of the most common questions is deceptively simple: what actually increases the odds that a user will purchase? Individual features rarely tell the full story. Instead, conversion often emerges from specific combinations of actions, workflows, or feature usage patterns.
Market Basket Analysis (MBA) provides a natural framework for uncovering these combinations. By comparing feature sets observed among purchasers versus non-purchasers, MBA helps identify which groups of behaviors are associated with meaningful increases in purchase odds, without requiring a perfectly predictive model.
Market Basket Analysis for Product Usage
In applying Market Basket Analysis to product usage, each user (or session, or account) is treated as a “basket” containing the set of features they used, for example during a trial period. Features might include actions taken, workflows completed, integrations enabled, or thresholds reached prior to a purchase decision.
The key twist is segmentation: baskets are divided into purchasers and non-purchasers. Rather than simply asking which features co-occur, MBA is used to discover which feature sets appear disproportionately often among purchasers compared to non-purchasers, signaling a potential increase in the odds of conversion.
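To make the setup concrete, here is a minimal sketch of the basket representation and the purchaser/non-purchaser split. The feature names and users are hypothetical placeholders:

```python
# Each user is a "basket": the set of product features they used before the
# purchase decision. Feature names below are illustrative placeholders.
baskets = {
    "user_1": {"search", "view_pricing", "trial_start"},
    "user_2": {"search", "invite_team"},
    "user_3": {"view_pricing", "trial_start", "integrate_slack"},
}
purchased = {"user_1": 1, "user_2": 0, "user_3": 1}

# Segment baskets by outcome before mining patterns
purchaser_baskets = [b for u, b in baskets.items() if purchased[u] == 1]
nonpurchaser_baskets = [b for u, b in baskets.items() if purchased[u] == 0]

print(len(purchaser_baskets), len(nonpurchaser_baskets))  # 2 1
```

Pattern mining then looks for itemsets that are over-represented in the first list relative to the second.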
Why FP-Growth Is Better Than Apriori
Apriori is one of the classic algorithms for Market Basket Analysis, which looks at what items tend to be bought or used together. Its key idea is simple: if a group of items shows up often, then every smaller combination within that group must also show up often. This insight made it much easier to find patterns in large sets of purchase or usage data, and for many years Apriori was the go-to method for spotting products or features that commonly appear together.
However, purchase analysis often involves high-dimensional data: dozens or hundreds of possible features, many of which appear infrequently. In these settings, Apriori struggles because it repeatedly scans the dataset and generates large numbers of candidate feature combinations, leading to slow performance and poor scalability.
FP-Growth avoids this issue by compressing observed feature usage into an FP-tree and extracting frequent feature sets efficiently. This makes it possible to explore multi-feature combinations that would be impractical to evaluate using Apriori, exactly the combinations most likely to reveal meaningful purchase-driving behavior.
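The candidate-explosion problem can be illustrated with the standard library alone: an Apriori-style enumeration must, in the worst case, consider "n choose k" size-k candidates for n tracked features. A toy sketch with illustrative feature counts:

```python
from itertools import combinations
from math import comb

# Worst-case number of size-k candidate itemsets Apriori may generate
# before pruning, for n tracked features:
for n in (20, 100, 300):
    print(n, comb(n, 2), comb(n, 3))

# With 300 tracked features there are ~4.5M size-3 candidates alone.
# FP-Growth sidesteps this by mining a compressed FP-tree rather than
# enumerating and re-scanning candidates.
sample_candidates = list(combinations(["search", "view_pricing", "trial_start"], 2))
print(sample_candidates)  # 3 size-2 candidates from just 3 features
```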
Why This Beats Using Random Forests Alone
Models like Random Forests are optimized to separate purchasers from non-purchasers as accurately as possible. However, real-world purchase behavior is noisy, influenced by external factors such as pricing, timing, sales touchpoints, or organizational constraints that no feature set can fully capture.
Market Basket Analysis takes a more pragmatic approach. Instead of chasing perfect classification, it asks: which combinations of features tend to increase the odds of purchase when they appear? This makes MBA particularly valuable when outcomes are inherently probabilistic and no feature set is sufficient on its own.
Evaluating Rules Using Odds and Significance
Once candidate feature sets are identified, their value lies in how much they change purchase likelihood. Odds ratios provide a clear, interpretable measure: how much more likely is purchase when a given set of features is present versus absent?
Formally, the odds ratio for a given feature set A with respect to purchase is defined as:
Odds Ratio(A) =
[ P(purchase | A) / (1 − P(purchase | A)) ] ÷
[ P(purchase | not A) / (1 − P(purchase | not A)) ]
Using a 2×2 contingency table, the odds ratio can be computed as:
Odds Ratio(A) = (a × d) / (b × c)
- a: purchasers with feature set A
- b: non-purchasers with feature set A
- c: purchasers without feature set A
- d: non-purchasers without feature set A
An odds ratio greater than 1 indicates that the feature set increases the odds of purchase, while a value below 1 indicates a decrease.
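As a quick worked example with hypothetical counts:

```python
# Hypothetical counts for a feature set A:
a, b = 40, 60   # purchasers / non-purchasers WITH A
c, d = 10, 90   # purchasers / non-purchasers WITHOUT A

odds_with_A = a / b        # 40/60 ~ 0.667
odds_without_A = c / d     # 10/90 ~ 0.111
odds_ratio = (a * d) / (b * c)
print(odds_ratio)  # 6.0: purchase odds are 6x higher when A is present
```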
Statistical tests and p-values help ensure that observed odds increases are not driven by random fluctuation or small sample effects. A standard approach is to build a 2×2 contingency table for each rule (feature set A present vs absent, purchase vs no purchase) and run Fisher’s exact test (recommended when counts are small) or a chi-squared test (fine when counts are large). Strong rules are those that combine meaningful odds lift with sufficient support and statistical confidence.
# Compute Odds Ratio + p-value for a Market Basket rule using a 2x2 contingency table
# -------------------------------
import numpy as np
import pandas as pd
from scipy.stats import fisher_exact, chi2_contingency

def rule_stats(X: pd.DataFrame, y_purchase: pd.Series, antecedent_cols, test="fisher"):
    """
    X: binary feature matrix (rows = users/sessions, cols = features)
    y_purchase: 1 if purchased else 0
    antecedent_cols: list of feature names that define the rule antecedent A
    test: "fisher" (default) or "chi2"
    """
    has_A = X[antecedent_cols].all(axis=1)

    # 2x2 counts
    a = int(((y_purchase == 1) & has_A).sum())     # purchasers with A
    b = int(((y_purchase == 0) & has_A).sum())     # non-purchasers with A
    c = int(((y_purchase == 1) & (~has_A)).sum())  # purchasers without A
    d = int(((y_purchase == 0) & (~has_A)).sum())  # non-purchasers without A

    table = np.array([[a, b],
                      [c, d]], dtype=float)

    # Odds ratio (with Haldane–Anscombe correction to avoid zeros)
    table_or = table + 0.5
    odds_ratio = (table_or[0, 0] * table_or[1, 1]) / (table_or[0, 1] * table_or[1, 0])

    # p-value
    if test == "fisher":
        # Fisher's exact test returns an odds ratio too, but we keep our corrected OR above
        _, p_value = fisher_exact([[a, b], [c, d]], alternative="two-sided")
    elif test == "chi2":
        chi2, p_value, _, _ = chi2_contingency([[a, b], [c, d]], correction=True)
    else:
        raise ValueError("test must be 'fisher' or 'chi2'")

    return {
        "a_purch_with_A": a,
        "b_nonpurch_with_A": b,
        "c_purch_without_A": c,
        "d_nonpurch_without_A": d,
        "odds_ratio": odds_ratio,
        "p_value": p_value,
        "purchase_rate_given_A": float(y_purchase[has_A].mean()) if has_A.any() else np.nan,
        "purchase_rate_overall": float(y_purchase.mean()),
        "support_A": float(has_A.mean()),
    }

# Example usage:
# stats = rule_stats(X, purchased, ["view_pricing", "trial_start"], test="fisher")
# print(stats)
Validating Feature Sets and Measuring Impact
Discovery is only the first step. Promising feature sets should be validated against more recent data or held-out cohorts to confirm that their association with purchase remains stable over time.
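One way to sketch such a stability check is a temporal split: mine rules on an older cohort, then recompute the odds ratio on a more recent one. The data, column names, and cutoff date below are hypothetical:

```python
import pandas as pd

# Hypothetical data: binary feature flags, purchase label, and a signup date
df = pd.DataFrame({
    "view_pricing": [1, 1, 0, 1, 0, 1, 1, 0],
    "trial_start":  [1, 0, 0, 1, 0, 1, 1, 0],
    "purchased":    [1, 0, 0, 1, 0, 1, 0, 0],
    "signup_date": pd.to_datetime([
        "2024-01-05", "2024-01-20", "2024-02-02", "2024-02-15",
        "2024-03-01", "2024-03-10", "2024-04-01", "2024-04-12",
    ]),
})

def odds_ratio(frame, cols):
    # 2x2 odds ratio with Haldane-Anscombe correction, as in the main text
    has_A = frame[cols].all(axis=1)
    y = frame["purchased"]
    a = ((y == 1) & has_A).sum() + 0.5
    b = ((y == 0) & has_A).sum() + 0.5
    c = ((y == 1) & ~has_A).sum() + 0.5
    d = ((y == 0) & ~has_A).sum() + 0.5
    return (a * d) / (b * c)

# Temporal split: discover on the older cohort, validate on the recent one
cutoff = pd.Timestamp("2024-03-01")
discovery = df[df["signup_date"] < cutoff]
holdout = df[df["signup_date"] >= cutoff]

rule = ["view_pricing", "trial_start"]
print("discovery OR:", odds_ratio(discovery, rule))  # 25.0
print("holdout   OR:", odds_ratio(holdout, rule))    # 5.0
```

A rule whose odds ratio stays well above 1 in the holdout cohort (here both are, despite shrinking) is a much safer candidate for action than one that collapses toward 1.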
To quantify impact more formally, each identified feature set can be encoded as a binary indicator variable in a linear or logistic regression model. Diagnostics such as R-squared (or pseudo-R-squared for logistic models) then indicate how much of the variability in purchase outcomes is associated with the presence of the feature set.
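There is a direct link between this regression view and the odds ratios above. When the only regressor is a single binary indicator for the feature set, the maximum-likelihood logistic regression coefficient has a closed form: it is the log of the 2x2 odds ratio. A numpy-only sketch with hypothetical counts:

```python
import numpy as np

# Hypothetical 2x2 counts: feature set A present/absent vs purchased yes/no
a, b = 40, 60   # with A: purchasers, non-purchasers
c, d = 10, 90   # without A: purchasers, non-purchasers

# Logistic model: log-odds(purchase) = beta0 + beta1 * has_A
# With a single binary regressor, the MLE is available in closed form:
beta0 = np.log(c / d)              # log-odds of purchase when A is absent
beta1 = np.log((a / b) / (c / d))  # coefficient on the A indicator

# exp(beta1) recovers the contingency-table odds ratio (a*d)/(b*c) = 6.0,
# up to floating-point rounding
print(np.exp(beta1))
```

This means the odds ratios surfaced by the MBA step can be read as (univariate) logistic regression effects; fitting a full model with several indicators then adjusts them for one another.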
Conclusion
Market Basket Analysis (MBA) provides a powerful framework for understanding purchase behavior when conversions are driven by combinations of feature usage or user actions. By comparing the feature sets of purchasers and non-purchasers, MBA surfaces behavior patterns that meaningfully increase the odds of conversion, revealing actionable signals that other machine learning approaches may not see.
When paired with FP-Growth, odds-based evaluation, and rigorous validation, MBA complements traditional predictive models by offering a more interpretable view of how real users progress toward purchase through product usage.
Code Sample
# FP-Growth + Odds Ratio for "purchase" uplift rules
import pandas as pd
import numpy as np
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import fpgrowth, association_rules

# -----------------------------
# Example input (replace with yours)
# -----------------------------
# Each row is a user/session "basket" of used features BEFORE outcome
transactions = [
    ["search", "view_pricing", "trial_start"],
    ["search", "invite_team"],
    ["view_pricing", "trial_start", "integrate_slack"],
    ["search", "view_pricing"],
    ["invite_team", "integrate_slack", "trial_start"],
]

# Outcome label aligned with transactions: 1 if purchased, 0 if not
purchased = pd.Series([1, 0, 1, 0, 1], name="purchased")

# -----------------------------
# 1) One-hot encode baskets
# -----------------------------
te = TransactionEncoder()
X = te.fit(transactions).transform(transactions)
X = pd.DataFrame(X, columns=te.columns_)

# Add outcome label
df = X.copy()
df["purchased"] = purchased.values

# -----------------------------
# 2) FP-Growth: frequent itemsets among ALL users
#    (you can also run separately for purchasers)
# -----------------------------
freq_itemsets = fpgrowth(X, min_support=0.2, use_colnames=True)
freq_itemsets = freq_itemsets.sort_values("support", ascending=False)

# -----------------------------
# 3) Add "purchase" as a consequent using association rules
#    We'll create a dataset that treats 'purchased' as an item too.
# -----------------------------
df_items = df.copy()
# Convert purchased label into an "item" column for rule mining:
df_items["OUTCOME_PURCHASED"] = df_items["purchased"].astype(bool)
df_items = df_items.drop(columns=["purchased"])

freq_itemsets_with_outcome = fpgrowth(df_items, min_support=0.2, use_colnames=True)
rules = association_rules(freq_itemsets_with_outcome, metric="confidence", min_threshold=0.1)

# Keep only rules where consequent is {OUTCOME_PURCHASED}
rules = rules[rules["consequents"].apply(lambda s: s == frozenset(["OUTCOME_PURCHASED"]))].copy()

# -----------------------------
# 4) Compute Odds Ratio (OR) for each rule antecedent vs purchase
#    OR = (a/b) / (c/d) with:
#      a = purchasers with antecedent present
#      b = non-purchasers with antecedent present
#      c = purchasers without antecedent
#      d = non-purchasers without antecedent
# -----------------------------
def odds_ratio_for_antecedent(df_binary_features: pd.DataFrame, purchased: pd.Series, antecedent: frozenset):
    antecedent = list(antecedent)
    has_A = df_binary_features[antecedent].all(axis=1)
    a = int(((purchased == 1) & has_A).sum())
    b = int(((purchased == 0) & has_A).sum())
    c = int(((purchased == 1) & (~has_A)).sum())
    d = int(((purchased == 0) & (~has_A)).sum())
    # Haldane–Anscombe correction to avoid divide-by-zero
    a2, b2, c2, d2 = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    or_val = (a2 * d2) / (b2 * c2)
    # Also return counts for transparency
    return or_val, (a, b, c, d)

# Use the original X (feature-only) and purchased label
ors = []
counts = []
for ant in rules["antecedents"]:
    or_val, (a, b, c, d) = odds_ratio_for_antecedent(X, purchased, ant)
    ors.append(or_val)
    counts.append((a, b, c, d))

rules["odds_ratio_purchase"] = ors
rules[["a_purch_with_A", "b_nonpurch_with_A", "c_purch_without_A", "d_nonpurch_without_A"]] = pd.DataFrame(
    counts, index=rules.index
)

# Useful derived metric: baseline purchase rate vs purchase rate when antecedent present
hasA = []
p_purchase_given_A = []
for ant in rules["antecedents"]:
    ant_cols = list(ant)
    mask = X[ant_cols].all(axis=1)
    hasA.append(int(mask.sum()))
    p_purchase_given_A.append(float(purchased[mask].mean()) if mask.any() else np.nan)

rules["users_with_antecedent"] = hasA
rules["p(purchase|antecedent)"] = p_purchase_given_A
rules["p(purchase_overall)"] = float(purchased.mean())

# -----------------------------
# 5) View top rules by odds ratio (and filter for minimum support)
# -----------------------------
# Note: support/confidence are computed in the "items+outcome" space.
# Add your own thresholds as needed.
top = (
    rules.sort_values(["odds_ratio_purchase", "support"], ascending=[False, False])
    .loc[:, ["antecedents", "consequents", "support", "confidence", "lift",
             "odds_ratio_purchase", "users_with_antecedent",
             "p(purchase|antecedent)", "p(purchase_overall)",
             "a_purch_with_A", "b_nonpurch_with_A", "c_purch_without_A", "d_nonpurch_without_A"]]
)
print(top.head(15).to_string(index=False))