PyCTBN.PyCTBN.estimators package

Submodules

PyCTBN.PyCTBN.estimators.fam_score_calculator module

class PyCTBN.PyCTBN.estimators.fam_score_calculator.FamScoreCalculator

Bases: object

Computes the FamScore of a node using a Bayesian score function.

get_fam_score(cims: numpy.array, tau_xu: float = 0.1, alpha_xu: float = 1)

Calculate the FamScore value of the node

Parameters
  • cims (np.array) – np.array with all the node’s cims

  • tau_xu (float, optional) – hyperparameter over the CTBN’s q parameters, defaults to 0.1

  • alpha_xu (float, optional) – hyperparameter over the CTBN’s q parameters, defaults to 1

Returns

the FamScore value of the node

Return type

float

marginal_likelihood_q(cims: numpy.array, tau_xu: float = 0.1, alpha_xu: float = 1)

Calculate the value of the marginal likelihood over q of the node

Parameters
  • cims (np.array) – np.array with all the node’s cims

  • tau_xu (float, optional) – hyperparameter over the CTBN’s q parameters, defaults to 0.1

  • alpha_xu (float, optional) – hyperparameter over the CTBN’s q parameters, defaults to 1

Returns

the value of the marginal likelihood over q

Return type

float

marginal_likelihood_theta(cims: PyCTBN.PyCTBN.structure_graph.conditional_intensity_matrix.ConditionalIntensityMatrix, alpha_xu: float, alpha_xxu: float)

Calculate the value of the marginal likelihood over theta of the node

Parameters
  • cims (np.array) – np.array with all the node’s cims

  • alpha_xu (float) – hyperparameter over the CTBN’s q parameters

  • alpha_xxu (float) – distributed hyperparameter over the CTBN’s theta parameters

Returns

the value of the marginal likelihood over theta

Return type

float

single_cim_xu_marginal_likelihood_q(M_xu_suff_stats: float, T_xu_suff_stats: float, tau_xu: float = 0.1, alpha_xu: float = 1)

Calculate the marginal likelihood over q of the node when it assumes a specific value under a specific assignment of its parents

Parameters
  • M_xu_suff_stats (float) – value of the sufficient statistic M[x|u]

  • T_xu_suff_stats (float) – value of the sufficient statistic T[x|u]

  • cim (ConditionalIntensityMatrix) – a ConditionalIntensityMatrix object with the sufficient statistics

  • tau_xu (float, optional) – hyperparameter over the CTBN’s q parameters, defaults to 0.1

  • alpha_xu (float, optional) – hyperparameter over the CTBN’s q parameters, defaults to 1

Returns

the value of the marginal likelihood of the node when it assumes a specific value

Return type

float
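
As a rough illustration of the quantity described above, the following sketch evaluates the marginal likelihood over q for a single pair (x, u). It assumes the standard closed form for a Gamma prior with hyperparameters alpha_xu and tau_xu, and it works in log space for numerical stability. The function name log_marginal_likelihood_q and the log-space convention are assumptions made for this example, not the library's code.

```python
from math import lgamma, log


def log_marginal_likelihood_q(M_xu, T_xu, tau_xu=0.1, alpha_xu=1.0):
    """Log marginal likelihood over q for one (x, u) pair (illustrative only).

    Assumed closed form (Gamma prior with hyperparameters alpha_xu, tau_xu):
        Gamma(alpha + M + 1) * tau^(alpha + 1)
        / (Gamma(alpha + 1) * (tau + T)^(alpha + M + 1))
    """
    return (
        lgamma(alpha_xu + M_xu + 1)
        + (alpha_xu + 1) * log(tau_xu)
        - lgamma(alpha_xu + 1)
        - (alpha_xu + M_xu + 1) * log(tau_xu + T_xu)
    )


# With no observed transitions and no residence time, the terms cancel out.
print(log_marginal_likelihood_q(M_xu=0, T_xu=0.0))
```

Under this assumption, the score for a whole node would be obtained by summing such terms over every state x and every parent assignment u.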

single_cim_xu_marginal_likelihood_theta(index: int, cim: PyCTBN.PyCTBN.structure_graph.conditional_intensity_matrix.ConditionalIntensityMatrix, alpha_xu: float, alpha_xxu: float)

Calculate the marginal likelihood over theta of the node when it assumes a specific value under a specific assignment of its parents

Parameters
  • cim (ConditionalIntensityMatrix) – a ConditionalIntensityMatrix object with the sufficient statistics

  • alpha_xu (float) – hyperparameter over the CTBN’s q parameters

  • alpha_xxu (float) – distributed hyperparameter over the CTBN’s theta parameters

Returns

the value of the marginal likelihood over theta when the node assumes a specific value

Return type

float

single_internal_cim_xxu_marginal_likelihood_theta(M_xxu_suff_stats: float, alpha_xxu: float = 1)

Calculate the second part of the marginal likelihood over theta formula

Parameters
  • M_xxu_suff_stats (float) – value of the sufficient statistic M[xx’|u]

  • alpha_xxu (float, optional) – distributed hyperparameter over the CTBN’s theta parameters, defaults to 1

Returns

the value of the marginal likelihood over theta when the node assumes a specific value

Return type

float
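
The two theta-related methods above can be pictured with the sketch below. It assumes the usual Dirichlet-multinomial closed form, again in log space: a first part that depends on the total count M[x|u], and a "second part" computed once per destination state x'. The function names and the representation of a CIM row as a plain list of counts are hypothetical and chosen only for this example.

```python
from math import lgamma


def log_ml_theta_second_part(M_xxu, alpha_xxu=1.0):
    """Per-transition term: log Gamma(alpha_xxu + M[xx'|u]) - log Gamma(alpha_xxu)."""
    return lgamma(alpha_xxu + M_xxu) - lgamma(alpha_xxu)


def log_ml_theta_row(M_row, alpha_xu, alpha_xxu):
    """Log marginal likelihood over theta for one state x and one parent assignment u.

    M_row holds the counts M[xx'|u] for every x' != x (off-diagonal entries only).
    """
    M_xu = sum(M_row)  # total number of transitions leaving x under u
    first_part = lgamma(alpha_xu) - lgamma(alpha_xu + M_xu)
    return first_part + sum(log_ml_theta_second_part(m, alpha_xxu) for m in M_row)


# One transition towards each of the two other states.
print(log_ml_theta_row([1, 1], alpha_xu=2.0, alpha_xxu=1.0))
```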

variable_cim_xu_marginal_likelihood_q(cim: PyCTBN.PyCTBN.structure_graph.conditional_intensity_matrix.ConditionalIntensityMatrix, tau_xu: float = 0.1, alpha_xu: float = 1)

Calculate the value of the marginal likelihood over q given a cim

Parameters
  • cim (ConditionalIntensityMatrix) – a ConditionalIntensityMatrix object with the sufficient statistics

  • tau_xu (float, optional) – hyperparameter over the CTBN’s q parameters, defaults to 0.1

  • alpha_xu (float, optional) – hyperparameter over the CTBN’s q parameters, defaults to 1

Returns

the value of the marginal likelihood over q

Return type

float

variable_cim_xu_marginal_likelihood_theta(cim: PyCTBN.PyCTBN.structure_graph.conditional_intensity_matrix.ConditionalIntensityMatrix, alpha_xu: float, alpha_xxu: float)

Calculate the value of the marginal likelihood over theta given a cim

Parameters
  • cim (ConditionalIntensityMatrix) – a ConditionalIntensityMatrix object with the sufficient statistics

  • alpha_xu (float) – hyperparameter over the CTBN’s q parameters

  • alpha_xxu (float) – distributed hyperparameter over the CTBN’s theta parameters

Returns

the value of the marginal likelihood over theta

Return type

float

PyCTBN.PyCTBN.estimators.parameters_estimator module

class PyCTBN.PyCTBN.estimators.parameters_estimator.ParametersEstimator(trajectories: PyCTBN.PyCTBN.structure_graph.trajectory.Trajectory, net_graph: PyCTBN.PyCTBN.structure_graph.network_graph.NetworkGraph)

Bases: object

Computes the CIMs of a particular node given the trajectories and the network structure in the graph _net_graph.

Parameters
_single_set_of_cims

the set of cims object that will hold the cims of the node

compute_parameters_for_node(node_id: str) → PyCTBN.PyCTBN.structure_graph.set_of_cims.SetOfCims

Compute the CIMS of the node identified by the label node_id.

Parameters

node_id (string) – the node label

Returns

A SetOfCims object filled with the computed CIMS

Return type

SetOfCims
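
To make the relationship between sufficient statistics and a CIM concrete, the sketch below builds a single conditional intensity matrix from transition counts and residence times using the usual maximum-likelihood estimate. The function name, the plain-list representation, and the handling of states with zero residence time are assumptions of this example and do not describe the library's internals.

```python
def cim_from_sufficient_statistics(M, T):
    """Build one CIM from transition counts M[x][x'] and residence times T[x].

    Off-diagonal entries are M[x][x'] / T[x]; each diagonal entry is minus the
    sum of the off-diagonal entries of its row, so every row sums to zero.
    States with zero residence time keep an all-zero row (a choice made here).
    """
    n = len(T)
    cim = [[0.0] * n for _ in range(n)]
    for x in range(n):
        if T[x] == 0:
            continue
        for x_next in range(n):
            if x_next != x:
                cim[x][x_next] = M[x][x_next] / T[x]
        cim[x][x] = -sum(cim[x])
    return cim


# Two states: 2 transitions 0 -> 1 in 4.0 time units, 1 transition 1 -> 0 in 2.0.
print(cim_from_sufficient_statistics([[0, 2], [1, 0]], [4.0, 2.0]))
```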

static compute_state_res_time_for_node(times: numpy.ndarray, trajectory: numpy.ndarray, cols_filter: numpy.ndarray, scalar_indexes_struct: numpy.ndarray, T: numpy.ndarray) → None

Compute the state residence times for a node and fill the matrix T with the results

Parameters
  • node_indx (int) – the index of the node

  • times (numpy.ndarray) – the time deltas vector

  • trajectory (numpy.ndarray) – the trajectory

  • cols_filter (numpy.ndarray) – the columns filtering structure

  • scalar_indexes_struct (numpy.ndarray) – the indexing structure

  • T (numpy.ndarray) – the state residence times vectors
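
The idea behind the residence-time computation can be sketched in plain Python as follows. This is a simplified illustration, not the vectorised routine of the library: it assumes that each trajectory row has already been reduced to the child's state and to a single integer index for the parents' joint assignment (parent_configs is a hypothetical, precomputed list), and that time_deltas[i] is the time spent in the state of row i.

```python
def residence_times(time_deltas, child_states, parent_configs,
                    n_child_states, n_parent_configs):
    """Accumulate T[u][x]: total time the child spends in state x under parent assignment u."""
    T = [[0.0] * n_child_states for _ in range(n_parent_configs)]
    for dt, x, u in zip(time_deltas, child_states, parent_configs):
        T[u][x] += dt
    return T


# Three rows: the child is in state 0, then 1, then 0 again; the parents switch
# to assignment 1 in the last row.
print(residence_times([1.0, 2.0, 0.5], [0, 1, 0], [0, 0, 1], 2, 2))
```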

static compute_state_transitions_for_a_node(node_indx: int, trajectory: numpy.ndarray, cols_filter: numpy.ndarray, scalar_indexing: numpy.ndarray, M: numpy.ndarray) → None

Compute the state transitions for a node and fill the matrices M with the results.

Parameters
  • node_indx (int) – the index of the node

  • trajectory (numpy.ndarray) – the trajectory

  • cols_filter (numpy.ndarray) – the columns filtering structure

  • scalar_indexing (numpy.ndarray) – the indexing structure

  • M (numpy.ndarray) – the state transitions matrices
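
Counting state transitions follows the same pattern. The sketch below is a simplified, pure-Python illustration under stated assumptions: consecutive rows of the trajectory are compared, a transition is counted whenever the child's state changes, and it is attributed to the parents' assignment in force before the change. The names and the flattened inputs (child_states, parent_configs) are hypothetical.

```python
def transition_counts(child_states, parent_configs, n_child_states, n_parent_configs):
    """Accumulate M[u][x][x']: number of child transitions x -> x' under parent assignment u."""
    M = [
        [[0] * n_child_states for _ in range(n_child_states)]
        for _ in range(n_parent_configs)
    ]
    for t in range(len(child_states) - 1):
        x, x_next = child_states[t], child_states[t + 1]
        if x != x_next:  # rows where only a parent changed are not child transitions
            M[parent_configs[t]][x][x_next] += 1
    return M


# The child goes 0 -> 1 under assignment 0, stays in 1, then goes 1 -> 0 under assignment 1.
print(transition_counts([0, 1, 1, 0], [0, 0, 1, 1], 2, 2))
```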

fast_init(node_id: str) → None

Initializes all the necessary structures for the parameters estimation for the node node_id.

Parameters

node_id (string) – the node label

PyCTBN.PyCTBN.estimators.structure_constraint_based_estimator module

class PyCTBN.PyCTBN.estimators.structure_constraint_based_estimator.StructureConstraintBasedEstimator(sample_path: PyCTBN.PyCTBN.structure_graph.sample_path.SamplePath, exp_test_alfa: float, chi_test_alfa: float, known_edges: List = [], thumb_threshold: int = 25)

Bases: PyCTBN.PyCTBN.estimators.structure_estimator.StructureEstimator

Estimates the network structure given the trajectories in sample_path by using a constraint-based approach.

Parameters
  • sample_path (SamplePath) – the _sample_path object containing the trajectories and the real structure

  • exp_test_alfa (float) – the significance level for the exponential hypothesis test

  • chi_test_alfa (float) – the significance level for the chi-square hypothesis test

  • known_edges (List) – the prior known edges in the net structure if present

  • thumb_threshold (int) – the threshold value to consider a valid independence test

_nodes

the nodes labels

_nodes_vals

the nodes cardinalities

_nodes_indxs

the nodes indexes

_complete_graph

the complete directed graph built using the nodes labels in _nodes

_cache

the Cache object

complete_test(test_parent: str, test_child: str, parent_set: List, child_states_numb: int, tot_vars_count: int, parent_indx, child_indx) → bool

Performs a complete independence test on the directed graphs G1 = {test_child U parent_set} and G2 = {G1 U test_parent} (test_parent is added as an additional parent of test_child). Generates all the structures and data needed to perform the tests.

Parameters
  • test_parent (string) – the node label of the test parent

  • test_child (string) – the node label of the child

  • parent_set (List) – the common parent set

  • child_states_numb (int) – the cardinality of the test_child

  • tot_vars_count (int) – the total number of variables in the net

Returns

True iff test_child and test_parent are independent given the sep_set parent_set. False otherwise

Return type

bool

compute_thumb_value(parent_val, child_val, parent_set_vals)

Compute the value to test against the thumb_threshold.

Parameters
  • parent_val (int) – test parent’s variable cardinality

  • child_val (int) – test child’s variable cardinality

  • parent_set_vals (List) – the cardinalities of the nodes in the current sep-set

Returns

the thumb value for the current independence test

Return type

int

ctpc_algorithm(disable_multiprocessing: bool = False)

Compute the CTPC algorithm over the entire net.

estimate_structure(disable_multiprocessing: bool = False)

Estimates the structure; implements the abstract method of StructureEstimator.

Returns

List of estimated edges

Return type

typing.List

independence_test(child_states_numb: int, cim1: PyCTBN.PyCTBN.structure_graph.conditional_intensity_matrix.ConditionalIntensityMatrix, cim2: PyCTBN.PyCTBN.structure_graph.conditional_intensity_matrix.ConditionalIntensityMatrix, thumb_value: float, parent_indx, child_indx) → bool

Compute the actual independence test using two cims. The exponential test is performed first; if the null hypothesis is not rejected, the chi test is performed as well.

Returns

True iff both tests do NOT reject the null hypothesis of independence. False otherwise.

Return type

bool

one_iteration_of_CTPC_algorithm(var_id: str, tot_vars_count: int) → List

Performs an iteration of the CTPC algorithm using the node var_id as test_child.

Parameters

var_id (string) – the node label of the test child

PyCTBN.PyCTBN.estimators.structure_estimator module

class PyCTBN.PyCTBN.estimators.structure_estimator.StructureEstimator(sample_path: PyCTBN.PyCTBN.structure_graph.sample_path.SamplePath, known_edges: List = None)

Bases: object

Estimates the network structure given the trajectories in sample_path.

Parameters
  • sample_path (SamplePath) – the _sample_path object containing the trajectories and the real structure

  • known_edges (List) – the prior known edges in the net structure if present

_nodes

the nodes labels

_nodes_vals

the nodes cardinalities

_nodes_indxs

the nodes indexes

_complete_graph

the complete directed graph built using the nodes labels in _nodes

adjacency_matrix() → numpy.ndarray

Converts the estimated structure _complete_graph to a boolean adjacency matrix representation.

Returns

The adjacency matrix of the graph _complete_graph

Return type

numpy.ndarray

static build_complete_graph(node_ids: List) → networkx.classes.digraph.DiGraph

Builds a complete directed graph (no self loops) given the node labels in the list node_ids.

Parameters

node_ids (List) – the list of nodes labels

Returns

a complete DiGraph object

Return type

networkx.DiGraph
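
The edge set of such a graph is easy to enumerate with the standard library. The sketch below only lists the edges and does not build the networkx.DiGraph that the method returns; the function name is hypothetical and the labels are assumed to be unique.

```python
from itertools import permutations


def complete_directed_edges(node_ids):
    """Return every ordered pair (a, b) of distinct nodes: a complete digraph without self loops."""
    return list(permutations(node_ids, 2))


edges = complete_directed_edges(['X', 'Y', 'Z'])
print(len(edges))  # n * (n - 1) edges for n nodes
```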

build_removable_edges_matrix(known_edges: List)

Builds a boolean matrix that indicates whether each edge can be removed, based on the given prior knowledge.

Parameters

known_edges (List) – the list of known edges

Returns

a boolean matrix

Return type

np.ndarray
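
A minimal sketch of this kind of matrix is shown below. It assumes that an entry is True when the corresponding edge may be removed and False when the edge is known a priori, and that edges are given as (parent, child) pairs of labels. These conventions, the function name, and the nested-list output (instead of an np.ndarray) are assumptions of the example.

```python
def removable_edges_matrix(node_ids, known_edges):
    """Return a matrix whose entry [i][j] is False when edge node_ids[i] -> node_ids[j] is known."""
    index = {label: i for i, label in enumerate(node_ids)}
    n = len(node_ids)
    removable = [[True] * n for _ in range(n)]
    for parent, child in known_edges:
        removable[index[parent]][index[child]] = False
    return removable


print(removable_edges_matrix(['X', 'Y', 'Z'], [('X', 'Y')]))
```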

abstract estimate_structure() → List

Abstract method to estimate the structure

Returns

List of estimated edges

Return type

typing.List

static generate_possible_sub_sets_of_size(u: List, size: int, parent_label: str)

Creates a list containing all possible subsets of the list u of size size that do not contain the node identified by parent_label.

Parameters
  • u (List) – the list of nodes

  • size (int) – the size of the subsets

  • parent_label (string) – the node to exclude in the subsets generation

Returns

an Iterator Object containing a list of lists

Return type

Iterator
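
The behaviour described above can be sketched with itertools.combinations. The function name is hypothetical, and the sketch filters out parent_label before generating the subsets, which is one straightforward way to satisfy the description.

```python
from itertools import combinations


def subsets_of_size(u, size, parent_label):
    """Lazily yield every subset of u with the given size that excludes parent_label."""
    candidates = [node for node in u if node != parent_label]
    return map(list, combinations(candidates, size))


for subset in subsets_of_size(['A', 'B', 'C'], 1, 'B'):
    print(subset)
```

Because the result is an iterator, it can be consumed only once; wrap it in list(...) if it must be traversed several times.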

save_plot_estimated_structure_graph(file_path: str) → None

Plot the estimated structure in a graphical-model style and save it to file_path; use the .png extension.

Parameters

file_path (string) – path to save the file to

save_results(file_path: str) → None

Save the estimated Structure to a .json file in file_path.

Parameters

file_path (string) – the path including the file name with .json extension

spurious_edges() → List

Return the spurious edges present in the estimated structure, if a prior net structure is present in _sample_path.structure.

Returns

A list containing the spurious edges

Return type

List

PyCTBN.PyCTBN.estimators.structure_score_based_estimator module

class PyCTBN.PyCTBN.estimators.structure_score_based_estimator.StructureScoreBasedEstimator(sample_path: PyCTBN.PyCTBN.structure_graph.sample_path.SamplePath, tau_xu: int = 0.1, alpha_xu: int = 1, known_edges: List = [])

Bases: PyCTBN.PyCTBN.estimators.structure_estimator.StructureEstimator

Estimates the network structure given the trajectories in sample_path by using a score-based approach and different kinds of optimization algorithms.

Parameters
  • sample_path (SamplePath) – the _sample_path object containing the trajectories and the real structure

  • tau_xu (float, optional) – hyperparameter over the CTBN’s q parameters, defaults to 0.1

  • alpha_xu (float, optional) – hyperparameter over the CTBN’s q parameters, defaults to 1

  • known_edges (List, optional) – List of known edges, defaults to []

estimate_parents(node_id: str, max_parents: int = None, iterations_number: int = 40, patience: int = 10, tabu_length: int = None, tabu_rules_duration: int = 5, optimizer: str = 'hill')

Use the FamScore of a node in order to find the best parent nodes

Parameters
  • node_id (string) – current node’s id

  • max_parents (int, optional) – maximum number of parents for each variable. If None, disabled, defaults to None

  • iterations_number (int, optional) – maximum number of iterations of the optimization algorithm, defaults to 40

  • patience (int, optional) – number of iterations without any improvement before the search stops. If None, disabled, defaults to 10

  • tabu_length (int, optional) – maximum length of the data structures used in the optimization process, defaults to None

  • tabu_rules_duration (int, optional) – number of iterations in which each rule keeps its value, defaults to 5

  • optimizer (string, optional) – name of the optimizer algorithm. Possible values: ‘hill’ (hill climbing), ‘tabu’ (tabu search), defaults to ‘hill’

Returns

A list of the best edges for the current node

Return type

List
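
To give an intuition for a score-driven parent search, the sketch below implements a plain greedy hill climbing over parent sets. It is not the library's optimizer: it does not implement patience or tabu rules, and score is a hypothetical callable that stands in for the FamScore of the node given a candidate parent set.

```python
def greedy_parent_search(candidates, score, max_parents=None, iterations_number=40):
    """Greedy hill climbing over parent sets (illustrative, not the library's optimizer).

    score maps a frozenset of parent labels to a number to maximise.
    """
    current = frozenset()
    current_score = score(current)
    for _ in range(iterations_number):
        # Neighbours differ from the current set by adding or removing one parent.
        neighbours = [current ^ frozenset([c]) for c in candidates]
        if max_parents is not None:
            neighbours = [n for n in neighbours if len(n) <= max_parents]
        if not neighbours:
            break
        best = max(neighbours, key=score)
        best_score = score(best)
        if best_score <= current_score:
            break  # local optimum: no single change improves the score
        current, current_score = best, best_score
    return sorted(current)


# Toy score that is highest when the parent set is exactly {'A', 'B'}.
target = frozenset(['A', 'B'])
print(greedy_parent_search(['A', 'B', 'C'], lambda s: -len(s ^ target)))
```

A greedy search of this kind stops at the first local optimum, which is the usual motivation for alternatives such as tabu search.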

estimate_structure(max_parents: int = None, iterations_number: int = 40, patience: int = None, tabu_length: int = None, tabu_rules_duration: int = None, optimizer: str = 'tabu', disable_multiprocessing: bool = False)

Compute the score-based algorithm to find the optimal structure

Parameters
  • max_parents (int, optional) – maximum number of parents for each variable. If None, disabled, defaults to None

  • iterations_number (int, optional) – maximum number of iterations of the optimization algorithm, defaults to 40

  • patience (int, optional) – number of iterations without any improvement before the search stops. If None, disabled, defaults to None

  • tabu_length (int, optional) – maximum length of the data structures used in the optimization process, defaults to None

  • tabu_rules_duration (int, optional) – number of iterations in which each rule keeps its value, defaults to None

  • optimizer (string, optional) – name of the optimizer algorithm. Possible values: ‘hill’ (hill climbing), ‘tabu’ (tabu search), defaults to ‘tabu’

  • disable_multiprocessing (Boolean, optional) – set to True to disable multiprocessing, defaults to False
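
A possible way to call this estimator is sketched below, using only the constructor and methods documented in this section. The helper name estimate_and_save is hypothetical, the import path is taken from the module name shown above and may differ in an installed package, and sample_path is assumed to be an already populated SamplePath object.

```python
def estimate_and_save(sample_path, output_path):
    """Run the score-based estimator on a prepared SamplePath and save the result.

    sample_path is assumed to be an already populated SamplePath object;
    how it is built is outside the scope of this section.
    """
    # Imported lazily so that defining this helper does not require PyCTBN.
    from PyCTBN.PyCTBN.estimators.structure_score_based_estimator import (
        StructureScoreBasedEstimator,
    )

    estimator = StructureScoreBasedEstimator(sample_path, tau_xu=0.1, alpha_xu=1)
    estimator.estimate_structure(
        max_parents=None,
        iterations_number=40,
        optimizer='tabu',
        disable_multiprocessing=True,
    )
    estimator.save_results(output_path)  # the path should end in .json
    return estimator
```

Disabling multiprocessing, as done here, can make a first run easier to debug; the default keeps it enabled.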

get_score_from_graph(graph: PyCTBN.PyCTBN.structure_graph.network_graph.NetworkGraph, node_id: str)

Get the FamScore of a node

Parameters
  • node_id (string) – current node’s id

  • graph (class:'NetworkGraph') – current graph to be computed

Returns

The FamScore for this graph structure

Return type

float

Module contents