PyCTBN.PyCTBN.estimators package¶
Submodules¶
PyCTBN.PyCTBN.estimators.fam_score_calculator module¶
- class PyCTBN.PyCTBN.estimators.fam_score_calculator.FamScoreCalculator¶
Bases:
object
Has the task of calculating the FamScore of a node by using a Bayesian score function
- get_fam_score(cims: numpy.array, tau_xu: float = 0.1, alpha_xu: float = 1)¶
Calculate the FamScore value of the node
- Parameters
cims (np.array) – np.array with all the node’s cims
tau_xu (float, optional) – hyperparameter over the CTBN’s q parameters, defaults to 0.1
alpha_xu (float, optional) – hyperparameter over the CTBN’s q parameters, defaults to 1
- Returns
the FamScore value of the node
- Return type
float
- marginal_likelihood_q(cims: numpy.array, tau_xu: float = 0.1, alpha_xu: float = 1)¶
Calculate the value of the marginal likelihood over q of the node
- Parameters
cims (np.array) – np.array with all the node’s cims
tau_xu (float) – hyperparameter over the CTBN’s q parameters
alpha_xu (float) – hyperparameter over the CTBN’s q parameters
- Returns
the value of the marginal likelihood over q
- Return type
float
- marginal_likelihood_theta(cims: PyCTBN.PyCTBN.structure_graph.conditional_intensity_matrix.ConditionalIntensityMatrix, alpha_xu: float, alpha_xxu: float)¶
Calculate the value of the marginal likelihood over theta of the node
- Parameters
cims (np.array) – np.array with all the node’s cims
alpha_xu (float) – hyperparameter over the CTBN’s q parameters
alpha_xxu (float) – distributed hyperparameter over the CTBN’s theta parameters
- Returns
the value of the marginal likelihood over theta
- Return type
float
- single_cim_xu_marginal_likelihood_q(M_xu_suff_stats: float, T_xu_suff_stats: float, tau_xu: float = 0.1, alpha_xu: float = 1)¶
Calculate the marginal likelihood on q of the node when it assumes a specific value and a specific parents’ assignment
- Parameters
M_xu_suff_stats (float) – value of the sufficient statistic M[x|u]
T_xu_suff_stats (float) – value of the sufficient statistic T[x|u]
cim (class:'ConditionalIntensityMatrix') – A conditional_intensity_matrix object with the sufficient statistics
tau_xu (float) – hyperparameter over the CTBN’s q parameters
alpha_xu (float) – hyperparameter over the CTBN’s q parameters
- Returns
the value of the marginal likelihood of the node when it assumes a specific value
- Return type
float
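The per-state, per-parent-assignment term described above can be sketched in a few lines. This is a minimal sketch, not PyCTBN's code: it assumes the standard Bayesian score for CTBNs with a Gamma prior on q, and it assumes the value is kept in log space. Neither detail is stated on this page, and the name `log_marginal_likelihood_q` is hypothetical.

```python
from math import lgamma, log


def log_marginal_likelihood_q(M_xu, T_xu, tau_xu=0.1, alpha_xu=1.0):
    """Log marginal likelihood over q for one state x and one parent assignment u.

    Assumed closed form (Gamma prior with hyperparameters alpha_xu, tau_xu):

        Gamma(alpha + M + 1) * tau^(alpha + 1)
        --------------------------------------------
        Gamma(alpha + 1) * (tau + T)^(alpha + M + 1)
    """
    return (
        lgamma(alpha_xu + M_xu + 1)
        + (alpha_xu + 1) * log(tau_xu)
        - lgamma(alpha_xu + 1)
        - (alpha_xu + M_xu + 1) * log(tau_xu + T_xu)
    )


# With no transitions and no residence time the data carry no information,
# so the likelihood is 1 and its logarithm is 0.
no_data = log_marginal_likelihood_q(M_xu=0, T_xu=0.0)

# Three transitions out of x observed over 2.0 units of residence time.
some_data = log_marginal_likelihood_q(M_xu=3, T_xu=2.0)
```

Working in log space avoids overflow in the Gamma function when the sufficient statistics grow large.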
- single_cim_xu_marginal_likelihood_theta(index: int, cim: PyCTBN.PyCTBN.structure_graph.conditional_intensity_matrix.ConditionalIntensityMatrix, alpha_xu: float, alpha_xxu: float)¶
Calculate the marginal likelihood on theta of the node when it assumes a specific value and a specific parents’ assignment
- Parameters
cim (class:'ConditionalIntensityMatrix') – A conditional_intensity_matrix object with the sufficient statistics
alpha_xu (float) – hyperparameter over the CTBN’s q parameters
alpha_xxu (float) – distributed hyperparameter over the CTBN’s theta parameters
- Returns
the value of the marginal likelihood over theta when the node assumes a specific value
- Return type
float
- single_internal_cim_xxu_marginal_likelihood_theta(M_xxu_suff_stats: float, alpha_xxu: float = 1)¶
Calculate the second part of the marginal likelihood over theta formula
- Parameters
M_xxu_suff_stats (float) – value of the sufficient statistic M[xx’|u]
alpha_xxu (float) – distributed hyperparameter over the CTBN’s theta parameters
- Returns
the value of the marginal likelihood over theta when the node assumes a specific value
- Return type
float
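The two parts of the theta term can be illustrated as follows. This is a sketch under stated assumptions: it uses the standard Dirichlet closed form in log space, and it takes "distributed" to mean that alpha_xu is split evenly over the other states. The function names are hypothetical and are not PyCTBN's API.

```python
from math import lgamma, log


def log_theta_term(M_xxu, alpha_xxu=1.0):
    """Second part, for one target state x':
    log( Gamma(alpha_xx'u + M[xx'|u]) / Gamma(alpha_xx'u) )."""
    return lgamma(alpha_xxu + M_xxu) - lgamma(alpha_xxu)


def log_marginal_likelihood_theta_row(M_row, alpha_xu, alpha_xxu):
    """Log marginal likelihood over theta for one source state x under one u.

    M_row holds the counts M[xx'|u] for every x' != x. Assumed closed form:
        Gamma(alpha_xu) / Gamma(alpha_xu + M[x|u])
        * prod over x' of Gamma(alpha_xx'u + M[xx'|u]) / Gamma(alpha_xx'u)
    """
    M_xu = sum(M_row)  # total number of transitions out of x
    first_part = lgamma(alpha_xu) - lgamma(alpha_xu + M_xu)
    second_part = sum(log_theta_term(m, alpha_xxu) for m in M_row)
    return first_part + second_part


# Three-state node: from x the process jumped twice to one state, once to the other.
# alpha_xu = 2.0 split evenly over the two target states gives alpha_xxu = 1.0.
row_score = log_marginal_likelihood_theta_row([2, 1], alpha_xu=2.0, alpha_xxu=1.0)
```

For these counts the likelihood is Gamma(2)/Gamma(5) * Gamma(3) * Gamma(2) = 1/12, so `row_score` equals log(1/12).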
- variable_cim_xu_marginal_likelihood_q(cim: PyCTBN.PyCTBN.structure_graph.conditional_intensity_matrix.ConditionalIntensityMatrix, tau_xu: float = 0.1, alpha_xu: float = 1)¶
Calculate the value of the marginal likelihood over q given a cim
- Parameters
cim (class:'ConditionalIntensityMatrix') – A conditional_intensity_matrix object with the sufficient statistics
tau_xu (float) – hyperparameter over the CTBN’s q parameters
alpha_xu (float) – hyperparameter over the CTBN’s q parameters
- Returns
the value of the marginal likelihood over q
- Return type
float
- variable_cim_xu_marginal_likelihood_theta(cim: PyCTBN.PyCTBN.structure_graph.conditional_intensity_matrix.ConditionalIntensityMatrix, alpha_xu: float, alpha_xxu: float)¶
Calculate the value of the marginal likelihood over theta given a cim
- Parameters
cim (class:'ConditionalIntensityMatrix') – A conditional_intensity_matrix object with the sufficient statistics
alpha_xu (float) – hyperparameter over the CTBN’s q parameters
alpha_xxu (float) – distributed hyperparameter over the CTBN’s theta parameters
- Returns
the value of the marginal likelihood over theta
- Return type
float
PyCTBN.PyCTBN.estimators.parameters_estimator module¶
- class PyCTBN.PyCTBN.estimators.parameters_estimator.ParametersEstimator(trajectories: PyCTBN.PyCTBN.structure_graph.trajectory.Trajectory, net_graph: PyCTBN.PyCTBN.structure_graph.network_graph.NetworkGraph)¶
Bases:
object
Has the task of computing the CIMs of a particular node given the trajectories and the net structure in the graph _net_graph.
- Parameters
trajectories (Trajectory) – the trajectories
net_graph (NetworkGraph) – the net structure
- _single_set_of_cims
the set of cims object that will hold the cims of the node
- compute_parameters_for_node(node_id: str) → PyCTBN.PyCTBN.structure_graph.set_of_cims.SetOfCims¶
Compute the CIMs of the node identified by the label node_id.
- Parameters
node_id (string) – the node label
- Returns
A SetOfCims object filled with the computed CIMs
- Return type
SetOfCims
- static compute_state_res_time_for_node(times: numpy.ndarray, trajectory: numpy.ndarray, cols_filter: numpy.ndarray, scalar_indexes_struct: numpy.ndarray, T: numpy.ndarray) → None¶
Compute the state residence times for a node and fill the matrix T with the results.
- Parameters
node_indx (int) – the index of the node
times (numpy.array) – the times deltas vector
trajectory (numpy.ndarray) – the trajectory
cols_filter (numpy.array) – the columns filtering structure
scalar_indexes_struct (numpy.array) – the indexing structure
T (numpy.ndarray) – the state residence times vectors
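The idea behind the residence-time statistic T[x|u] can be sketched without NumPy. The library method fills a preallocated matrix using `cols_filter` and `scalar_indexes_struct`; the dictionary-based helper below is a hypothetical simplification that only shows what is being accumulated.

```python
from collections import defaultdict


def residence_times(time_deltas, child_states, parent_states):
    """Accumulate T[x|u]: total time spent in state x while the parents are in u."""
    T = defaultdict(float)
    for dt, x, u in zip(time_deltas, child_states, parent_states):
        T[(u, x)] += dt
    return dict(T)


# Toy trajectory: one entry per row (time spent in the row, child state, parent assignment).
time_deltas = [0.5, 1.0, 0.25, 0.75]
child_states = [0, 1, 0, 0]
parent_states = [(0,), (0,), (1,), (0,)]

T = residence_times(time_deltas, child_states, parent_states)
```

Here the child spends 0.5 + 0.75 time units in state 0 under parent assignment (0,), which is the entry `T[((0,), 0)]`.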
- static compute_state_transitions_for_a_node(node_indx: int, trajectory: numpy.ndarray, cols_filter: numpy.ndarray, scalar_indexing: numpy.ndarray, M: numpy.ndarray) → None¶
Compute the state transitions for a node and fill the matrices M with the results.
- Parameters
node_indx (int) – the index of the node
trajectory (numpy.ndarray) – the trajectory
cols_filter (numpy.array) – the columns filtering structure
scalar_indexing (numpy.array) – the indexing structure
M (numpy.ndarray) – the state transitions matrices
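The transition statistic M[xx'|u] can be illustrated in the same spirit. The helper below is a hypothetical, pure-Python sketch. It assumes the parent assignment in effect is the one on the row where the transition starts, which this page does not specify.

```python
from collections import Counter


def transition_counts(child_states, parent_states):
    """Count M[xx'|u]: transitions of the child from x to x' under parent assignment u."""
    M = Counter()
    for i in range(len(child_states) - 1):
        x, x_next = child_states[i], child_states[i + 1]
        if x != x_next:  # only rows where the child itself changes state
            M[(parent_states[i], x, x_next)] += 1
    return M


child_states = [0, 1, 1, 0, 1]
parent_states = [(0,), (0,), (1,), (1,), (1,)]

M = transition_counts(child_states, parent_states)
```

Rows where only a parent changes (the second to third row here) contribute no count for the child.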
- fast_init(node_id: str) → None¶
Initializes all the necessary structures for the parameters estimation for the node node_id.
- Parameters
node_id (string) – the node label
PyCTBN.PyCTBN.estimators.structure_constraint_based_estimator module¶
- class PyCTBN.PyCTBN.estimators.structure_constraint_based_estimator.StructureConstraintBasedEstimator(sample_path: PyCTBN.PyCTBN.structure_graph.sample_path.SamplePath, exp_test_alfa: float, chi_test_alfa: float, known_edges: List = [], thumb_threshold: int = 25)¶
Bases:
PyCTBN.PyCTBN.estimators.structure_estimator.StructureEstimator
Has the task of estimating the network structure given the trajectories in samplepath by using a constraint-based approach.
- Parameters
sample_path (SamplePath) – the _sample_path object containing the trajectories and the real structure
exp_test_alfa (float) – the significance level for the exponential hypothesis test
chi_test_alfa (float) – the significance level for the chi-square hypothesis test
known_edges (List) – the prior known edges in the net structure if present
thumb_threshold (int) – the threshold value to consider a valid independence test
- _nodes
the nodes labels
- _nodes_vals
the nodes cardinalities
- _nodes_indxs
the nodes indexes
- _complete_graph
the complete directed graph built using the nodes labels in _nodes
- _cache
the Cache object
- complete_test(test_parent: str, test_child: str, parent_set: List, child_states_numb: int, tot_vars_count: int, parent_indx, child_indx) → bool¶
Performs a complete independence test on the directed graphs G1 = {test_child U parent_set} and G2 = {G1 U test_parent} (test_parent added as an additional parent of test_child). Generates all the structures and data needed to perform the tests.
- Parameters
test_parent (string) – the node label of the test parent
test_child (string) – the node label of the child
parent_set (List) – the common parent set
child_states_numb (int) – the cardinality of the test_child
tot_vars_count (int) – the total number of variables in the net
- Returns
True iff test_child and test_parent are independent given the sep_set parent_set. False otherwise
- Return type
bool
- compute_thumb_value(parent_val, child_val, parent_set_vals)¶
Compute the value to test against the thumb_threshold.
- Parameters
parent_val (int) – test parent’s variable cardinality
child_val (int) – test child’s variable cardinality
parent_set_vals (List) – the cardinalities of the nodes in the current sep-set
- Returns
the thumb value for the current independence test
- Return type
int
- ctpc_algorithm(disable_multiprocessing: bool = False)¶
Compute the CTPC algorithm over the entire net.
- estimate_structure(disable_multiprocessing: bool = False)¶
Estimate the structure; concrete implementation of the abstract method declared in StructureEstimator
- Returns
List of estimated edges
- Return type
typing.List
- independence_test(child_states_numb: int, cim1: PyCTBN.PyCTBN.structure_graph.conditional_intensity_matrix.ConditionalIntensityMatrix, cim2: PyCTBN.PyCTBN.structure_graph.conditional_intensity_matrix.ConditionalIntensityMatrix, thumb_value: float, parent_indx, child_indx) → bool¶
Compute the actual independence test using two cims. The exponential test is performed first; if the null hypothesis is not rejected, the chi test is performed as well.
- Parameters
child_states_numb (int) – the cardinality of the test child
cim1 (ConditionalIntensityMatrix) – a cim belonging to the graph without test parent
cim2 (ConditionalIntensityMatrix) – a cim belonging to the graph with test parent
- Returns
True iff both tests do NOT reject the null hypothesis of independence. False otherwise.
- Return type
bool
- one_iteration_of_CTPC_algorithm(var_id: str, tot_vars_count: int) → List¶
Performs an iteration of the CTPC algorithm using the node var_id as test_child.
- Parameters
var_id (string) – the node label of the test child
PyCTBN.PyCTBN.estimators.structure_estimator module¶
- class PyCTBN.PyCTBN.estimators.structure_estimator.StructureEstimator(sample_path: PyCTBN.PyCTBN.structure_graph.sample_path.SamplePath, known_edges: Optional[List] = None)¶
Bases:
object
Has the task of estimating the network structure given the trajectories in samplepath.
- Parameters
sample_path (SamplePath) – the _sample_path object containing the trajectories and the real structure
known_edges (List) – the prior known edges in the net structure if present
- _nodes
the nodes labels
- _nodes_vals
the nodes cardinalities
- _nodes_indxs
the nodes indexes
- _complete_graph
the complete directed graph built using the nodes labels in _nodes
- adjacency_matrix() → numpy.ndarray¶
Converts the estimated structure _complete_graph to a boolean adjacency matrix representation.
- Returns
The adjacency matrix of the graph _complete_graph
- Return type
numpy.ndarray
- static build_complete_graph(node_ids: List) → networkx.classes.digraph.DiGraph¶
Builds a complete directed graph (no self loops) given the nodes labels in the list node_ids.
- Parameters
node_ids (List) – the list of nodes labels
- Returns
a complete DiGraph object
- Return type
networkx.DiGraph
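The edge set of such a graph is every ordered pair of distinct labels. The helper below is a hypothetical sketch that builds only that edge list with the standard library, assuming the labels are unique. The resulting pairs are in the form accepted by `networkx.DiGraph.add_edges_from`.

```python
from itertools import permutations


def complete_digraph_edges(node_ids):
    """Every ordered pair of distinct nodes, so both directions and no self loops."""
    return list(permutations(node_ids, 2))


edges = complete_digraph_edges(["X", "Y", "Z"])
```

A graph over n nodes built this way has n * (n - 1) edges, six in this example.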
- build_removable_edges_matrix(known_edges: List)¶
Builds a boolean matrix that shows whether each edge can be removed, based on the given prior knowledge.
- Parameters
known_edges (List) – the list of prior known edges
- Returns
a boolean matrix
- Return type
np.ndarray
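One way to picture this matrix: every entry starts as removable, and entries for known edges are switched off. The sketch below uses nested lists instead of an np.ndarray and takes the node labels as an explicit argument. Both choices, and the treatment of the diagonal, are assumptions made for illustration only.

```python
def removable_edges_matrix(node_ids, known_edges):
    """Boolean matrix: entry [parent][child] is False when the edge is known a priori."""
    index = {label: i for i, label in enumerate(node_ids)}
    n = len(node_ids)
    removable = [[True] * n for _ in range(n)]
    for parent, child in known_edges:
        removable[index[parent]][index[child]] = False
    return removable


matrix = removable_edges_matrix(["X", "Y", "Z"], known_edges=[("X", "Y")])
```

Only the direction given in `known_edges` is protected; the reverse edge Y → X stays removable.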
- abstract estimate_structure() → List¶
Abstract method to estimate the structure
- Returns
List of estimated edges
- Return type
typing.List
- static generate_possible_sub_sets_of_size(u: List, size: int, parent_label: str)¶
Creates a list containing all possible subsets of the list u of size size that do not contain the node identified by parent_label.
- Parameters
u (List) – the list of nodes
size (int) – the size of the subsets
parent_label (string) – the node to exclude in the subsets generation
- Returns
an Iterator Object containing a list of lists
- Return type
Iterator
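A minimal sketch of this subset generation with `itertools.combinations` follows. The function name is hypothetical, and whether the library filters or removes the excluded label is an assumption.

```python
from itertools import combinations


def subsets_without(u, size, parent_label):
    """Lazily yield every subset of u with the given size that excludes parent_label."""
    candidates = [label for label in u if label != parent_label]
    return map(list, combinations(candidates, size))


subsets = list(subsets_without(["A", "B", "C", "D"], size=2, parent_label="B"))
# subsets == [['A', 'C'], ['A', 'D'], ['C', 'D']]
```

Returning a `map` object keeps the generation lazy, which matters when the candidate separating sets are numerous.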
- save_plot_estimated_structure_graph(file_path: str) → None¶
Plot the estimated structure in a graphical model style, use .png extension. Spurious edges are colored in red if a prior structure is present.
- Parameters
file_path (string) – path to save the file to
- save_results(file_path: str) → None¶
Save the estimated Structure to a .json file in file_path.
- Parameters
file_path (string) – the path including the file name with .json extension
- spurious_edges() → List¶
Return the spurious edges present in the estimated structure, if a prior net structure is present in _sample_path.structure.
- Returns
A list containing the spurious edges
- Return type
List
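Conceptually, a spurious edge is one that appears in the estimated structure but not in the prior one. The standalone helper below is a hypothetical sketch that takes both edge lists as arguments, whereas the method reads them from the estimator's own state.

```python
def find_spurious_edges(estimated_edges, prior_edges):
    """Edges present in the estimate but absent from the prior structure."""
    prior = set(prior_edges)
    return [edge for edge in estimated_edges if edge not in prior]


estimated = [("X", "Y"), ("Y", "Z"), ("Z", "X")]
prior = [("X", "Y"), ("Y", "Z")]

spurious = find_spurious_edges(estimated, prior)
```

Edges are directed, so an estimated edge that reverses a prior edge would also count as spurious in this sketch.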
PyCTBN.PyCTBN.estimators.structure_score_based_estimator module¶
- class PyCTBN.PyCTBN.estimators.structure_score_based_estimator.StructureScoreBasedEstimator(sample_path: PyCTBN.PyCTBN.structure_graph.sample_path.SamplePath, tau_xu: int = 0.1, alpha_xu: int = 1, known_edges: List = [])¶
Bases:
PyCTBN.PyCTBN.estimators.structure_estimator.StructureEstimator
Has the task of estimating the network structure given the trajectories in samplepath by using a score-based approach and different kinds of optimization algorithms.
- Parameters
sample_path (SamplePath) – the _sample_path object containing the trajectories and the real structure
tau_xu (float, optional) – hyperparameter over the CTBN’s q parameters, defaults to 0.1
alpha_xu (float, optional) – hyperparameter over the CTBN’s q parameters, defaults to 1
known_edges (List, optional) – List of known edges, defaults to []
- estimate_parents(node_id: str, max_parents: Optional[int] = None, iterations_number: int = 40, patience: int = 10, tabu_length: Optional[int] = None, tabu_rules_duration: int = 5, optimizer: str = 'hill')¶
Use the FamScore of a node in order to find the best parent nodes
- Parameters
node_id (string) – current node’s id
max_parents (int, optional) – maximum number of parents for each variable. If None, no limit is applied; defaults to None
iterations_number (int, optional) – maximum number of iterations of the optimization algorithm, defaults to 40
patience (int, optional) – number of iterations without any improvement before stopping the search. If None, disabled; defaults to 10
tabu_length (int, optional) – maximum length of the data structures used in the optimization process, defaults to None
tabu_rules_duration (int, optional) – number of iterations in which each rule keeps its value, defaults to 5
optimizer (string, optional) – name of the optimizer algorithm. Possible values: ‘hill’ (hill climbing), ‘tabu’ (tabu search), defaults to ‘hill’
- Returns
A list of the best edges for the current node
- Return type
List
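To make the ‘hill’ option concrete, here is a minimal sketch of greedy hill climbing over parent sets. It is not PyCTBN's implementation: the `score` callable stands in for the FamScore, the function and argument names are hypothetical, and patience and tabu handling are left out.

```python
def hill_climb_parents(node_id, candidates, score, max_parents=None, iterations_number=40):
    """Greedy search over parent sets of node_id, maximising score(node_id, parents)."""
    current = frozenset()
    current_score = score(node_id, current)
    for _ in range(iterations_number):
        # Neighbours differ from the current set by adding or removing one parent.
        neighbours = [current ^ {c} for c in candidates if c != node_id]
        if max_parents is not None:
            neighbours = [n for n in neighbours if len(n) <= max_parents]
        if not neighbours:
            break
        best = max(neighbours, key=lambda parents: score(node_id, parents))
        best_score = score(node_id, best)
        if best_score <= current_score:
            break  # local optimum: no single change improves the score
        current, current_score = best, best_score
    return sorted((parent, node_id) for parent in current)


# Hypothetical score standing in for the FamScore: it peaks at parents {A, B}.
def toy_score(node_id, parents):
    return -len(set(parents) ^ {"A", "B"})


edges = hill_climb_parents("X", ["A", "B", "C", "X"], toy_score)
capped = hill_climb_parents("X", ["A", "B", "C", "X"], toy_score, max_parents=1)
```

With the toy score the search adds A, then B, and stops because every further change lowers the score. With `max_parents=1` it stops after the first parent.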
- estimate_structure(max_parents: Optional[int] = None, iterations_number: int = 40, patience: Optional[int] = None, tabu_length: Optional[int] = None, tabu_rules_duration: Optional[int] = None, optimizer: str = 'tabu', disable_multiprocessing: bool = False)¶
Run the score-based algorithm to find the optimal structure
- Parameters
max_parents (int, optional) – maximum number of parents for each variable. If None, no limit is applied; defaults to None
iterations_number (int, optional) – maximum number of iterations of the optimization algorithm, defaults to 40
patience (int, optional) – number of iterations without any improvement before stopping the search. If None, disabled; defaults to None
tabu_length (int, optional) – maximum length of the data structures used in the optimization process, defaults to None
tabu_rules_duration (int, optional) – number of iterations in which each rule keeps its value, defaults to None
optimizer (string, optional) – name of the optimizer algorithm. Possible values: ‘hill’ (hill climbing), ‘tabu’ (tabu search), defaults to ‘tabu’
disable_multiprocessing (Boolean, optional) – True to disable the multiprocessing operations, defaults to False
- get_score_from_graph(graph: PyCTBN.PyCTBN.structure_graph.network_graph.NetworkGraph, node_id: str)¶
Get the FamScore of a node
- Parameters
node_id (string) – current node’s id
graph (class:'NetworkGraph') – current graph to be computed
- Returns
The FamScore for this graph structure
- Return type
float