Low-dimensional Contextualized Bayesian Networks

For more details, please see the NOTMAD preprint.

Factor Graphs

To improve scalability, we can model the networks through factor graphs, which capture low-dimensional axes of network variation. The num_factors parameter sets the number of factors. Its default value of 0 turns off factor graphs and computes the network in full dimensionality.

import numpy as np
from contextualized.dags.graph_utils import simulate_linear_sem
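n = 1000
# One scalar context per sample, evenly spaced in [1, 2].
C = np.linspace(1, 2, n).reshape((n, 1))
# Context-dependent edge weights for a 4-node network.
W = np.zeros((4, 4, n, 1))
W[0, 1] = C - 2
W[2, 1] = C**2
W[3, 1] = C**3
W[3, 2] = C
# Rearrange into one (4, 4) weight matrix per sample: shape (n, 4, 4).
W = np.squeeze(W)
W = np.transpose(W, (2, 0, 1))
# Draw a single observation from each sample-specific linear SEM.
X = np.zeros((n, 4))
for i, w in enumerate(W):
    x = simulate_linear_sem(w, 1, "uniform", noise_scale=0.1)[0]
    X[i] = x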
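To make the data-generating step above more concrete, here is a minimal NumPy-only sketch of sampling from a linear structural equation model. It is not the library's simulate_linear_sem. The function name simulate_linear_sem_sketch, the convention that w[i, j] is the weight of edge i -> j, and the uniform noise range are all assumptions made for illustration.

```python
import numpy as np


def simulate_linear_sem_sketch(w, n_samples, noise_scale=0.1, seed=0):
    """Sample from x = x @ w + z (hypothetical helper, not the library function).

    Assumes w[i, j] is the weight of edge i -> j and that w is acyclic.
    """
    rng = np.random.default_rng(seed)
    d = w.shape[0]
    # Assumed noise model: independent uniform noise on [-noise_scale, noise_scale].
    z = rng.uniform(-noise_scale, noise_scale, size=(n_samples, d))
    # For an acyclic w, (I - w) is invertible, so x = z @ inv(I - w).
    x = z @ np.linalg.inv(np.eye(d) - w)
    return x, z


# Build the weight matrix used above for a single context value c.
c = 1.5
w = np.zeros((4, 4))
w[0, 1] = c - 2
w[2, 1] = c**2
w[3, 1] = c**3
w[3, 2] = c

x, z = simulate_linear_sem_sketch(w, n_samples=5)
print(x.shape)
# Each sample satisfies the structural equations x = x @ w + z.
print(np.allclose(x, x @ w + z))
```

Under these assumptions, node 1 is driven by nodes 0, 2, and 3, and node 2 by node 3, with weights that change smoothly with the context value.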
%%capture
from contextualized.easy import ContextualizedBayesianNetworks

cbn = ContextualizedBayesianNetworks(
    encoder_type='mlp', num_archetypes=2, num_factors=2,
    n_bootstraps=1, archetype_dag_loss_type="DAGMA", archetype_alpha=0.,
    sample_specific_dag_loss_type="DAGMA", sample_specific_alpha=1e-1,
    learning_rate=1e-3)
cbn.fit(C, X, max_epochs=10)
GPU available: True (mps), used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs

  | Name           | Type      | Params
---------------------------------------------
0 | encoder        | MLP       | 1.4 K 
1 | explainer      | Explainer | 8     
2 | factor_softmax | Softmax   | 0     
---------------------------------------------
1.4 K     Trainable params
0         Non-trainable params
1.4 K     Total params
0.006     Total estimated model params size (MB)
`Trainer.fit` stopped: `max_epochs=10` reached.
cbn.models[-1].latent_dim
2

predict_networks returns full-dimensional graphs by default. Passing the keyword argument factors=True returns the factor graphs instead:

predicted_networks = cbn.predict_networks(C)
print(predicted_networks.shape)

predicted_factor_networks = cbn.predict_networks(C, factors=True)
predicted_factor_networks.shape
(1000, 4, 4)
(1000, 2, 2)
%%capture
from contextualized.easy import ContextualizedBayesianNetworks

mses = []
for n_factors in range(1, 5):
    cbn = ContextualizedBayesianNetworks(
        encoder_type='mlp', num_archetypes=2, num_factors=n_factors,
        n_bootstraps=1, archetype_dag_loss_type="DAGMA", archetype_alpha=0.,
        sample_specific_dag_loss_type="DAGMA", sample_specific_alpha=1e-1,
        learning_rate=1e-3)
    cbn.fit(C, X, max_epochs=10)
    mses.append(np.mean(cbn.measure_mses(C, X)))
GPU available: True (mps), used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs

  | Name           | Type      | Params
---------------------------------------------
0 | encoder        | MLP       | 1.4 K 
1 | explainer      | Explainer | 2     
2 | factor_softmax | Softmax   | 0     
---------------------------------------------
1.4 K     Trainable params
0         Non-trainable params
1.4 K     Total params
0.006     Total estimated model params size (MB)
GPU available: True (mps), used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs

  | Name           | Type      | Params
---------------------------------------------
0 | encoder        | MLP       | 1.4 K 
1 | explainer      | Explainer | 8     
2 | factor_softmax | Softmax   | 0     
---------------------------------------------
1.4 K     Trainable params
0         Non-trainable params
1.4 K     Total params
0.006     Total estimated model params size (MB)
`Trainer.fit` stopped: `max_epochs=10` reached.
GPU available: True (mps), used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs

  | Name           | Type      | Params
---------------------------------------------
0 | encoder        | MLP       | 1.4 K 
1 | explainer      | Explainer | 18    
2 | factor_softmax | Softmax   | 0     
---------------------------------------------
1.4 K     Trainable params
0         Non-trainable params
1.4 K     Total params
0.006     Total estimated model params size (MB)
`Trainer.fit` stopped: `max_epochs=10` reached.
GPU available: True (mps), used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs

  | Name      | Type      | Params
----------------------------------------
0 | encoder   | MLP       | 1.4 K 
1 | explainer | Explainer | 32    
----------------------------------------
1.4 K     Trainable params
0         Non-trainable params
1.4 K     Total params
0.006     Total estimated model params size (MB)
import matplotlib.pyplot as plt
%matplotlib inline

plt.plot(range(1, 5), mses)
plt.ylabel("Error")
plt.xlabel("Number of Factors")
plt.show()
[Figure: line plot of Error versus Number of Factors]
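The Explainer parameter counts reported in the training logs above (2, 8, 18, and 32 for 1 to 4 factors) are consistent with each of the 2 archetypes storing a num_factors x num_factors matrix. The sketch below checks that arithmetic. The formula is inferred from the logs on this page, not taken from the library's documentation, and explainer_param_count is a hypothetical helper.

```python
def explainer_param_count(num_archetypes, num_factors):
    """Assumed relationship: one (num_factors x num_factors) matrix per archetype."""
    return num_archetypes * num_factors**2


# Explainer parameter counts copied from the training logs above (num_archetypes=2).
observed = {1: 2, 2: 8, 3: 18, 4: 32}

for k, logged in observed.items():
    predicted = explainer_param_count(2, k)
    print(f"num_factors={k}: predicted={predicted}, logged={logged}")
```

If this relationship holds, the explainer grows quadratically in the number of factors rather than in the number of variables, which is where the scalability benefit would come from.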