PyCTBN.PyCTBN.estimators package¶
Submodules¶
PyCTBN.PyCTBN.estimators.fam_score_calculator module¶
-
class PyCTBN.PyCTBN.estimators.fam_score_calculator.FamScoreCalculator¶
Bases: object
Has the task of calculating the FamScore of a node by using a Bayesian score function
-
get_fam_score
(cims: numpy.array, tau_xu: float = 0.1, alpha_xu: float = 1)¶ Calculate the FamScore value of the node
- Parameters
cims (np.array) – np.array with all the node’s cims
tau_xu (float, optional) – hyperparameter over the CTBN’s q parameters, defaults to 0.1
alpha_xu (float, optional) – hyperparameter over the CTBN’s q parameters, defaults to 1
- Returns
the FamScore value of the node
- Return type
float
-
marginal_likelihood_q
(cims: numpy.array, tau_xu: float = 0.1, alpha_xu: float = 1)¶ Calculate the value of the marginal likelihood over q of the node whose cims are given in cims
- Parameters
cims (np.array) – np.array with all the node’s cims
tau_xu (float) – hyperparameter over the CTBN’s q parameters
alpha_xu (float) – hyperparameter over the CTBN’s q parameters
- Returns
the value of the marginal likelihood over q
- Return type
float
-
marginal_likelihood_theta
(cims: PyCTBN.PyCTBN.structure_graph.conditional_intensity_matrix.ConditionalIntensityMatrix, alpha_xu: float, alpha_xxu: float)¶ Calculate the value of the marginal likelihood over theta of the node whose cims are given in cims
- Parameters
cims (np.array) – np.array with all the node’s cims
alpha_xu (float) – hyperparameter over the CTBN’s q parameters
alpha_xxu (float) – distributed hyperparameter over the CTBN’s theta parameters
- Returns
the value of the marginal likelihood over theta
- Return type
float
-
single_cim_xu_marginal_likelihood_q
(M_xu_suff_stats: float, T_xu_suff_stats: float, tau_xu: float = 0.1, alpha_xu: float = 1)¶ Calculate the marginal likelihood over q of the node when it assumes a specific value under a specific assignment of its parents
- Parameters
M_xu_suff_stats (float) – value of the sufficient statistic M[x|u]
T_xu_suff_stats (float) – value of the sufficient statistic T[x|u]
cim (class:'ConditionalIntensityMatrix') – A conditional_intensity_matrix object with the sufficient statistics
tau_xu (float) – hyperparameter over the CTBN’s q parameters
alpha_xu (float) – hyperparameter over the CTBN’s q parameters
- Returns
the value of the marginal likelihood of the node when it assumes a specific value
- Return type
float
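To illustrate the kind of quantity returned here, the sketch below evaluates a marginal likelihood over q in log space. It assumes the closed form commonly used in the CTBN literature for a Gamma prior with hyperparameters alpha_xu and tau_xu; the function name log_marginal_likelihood_q is hypothetical, and the library's actual implementation may differ.

```python
from math import lgamma, log


def log_marginal_likelihood_q(m_xu: float, t_xu: float,
                              tau_xu: float = 0.1, alpha_xu: float = 1.0) -> float:
    """Log marginal likelihood over q for one state x and parent assignment u.

    Assumed formula (not confirmed against the library source):
        Gamma(alpha + M + 1) * tau^(alpha + 1)
        ---------------------------------------------
        Gamma(alpha + 1) * (tau + T)^(alpha + M + 1)
    """
    return (lgamma(alpha_xu + m_xu + 1) + (alpha_xu + 1) * log(tau_xu)
            - lgamma(alpha_xu + 1) - (alpha_xu + m_xu + 1) * log(tau_xu + t_xu))


# With no transitions and no residence time the likelihood is 1, so its log is 0.
no_data = log_marginal_likelihood_q(0.0, 0.0)
# For a fixed transition count, a longer residence time lowers the value.
short_stay = log_marginal_likelihood_q(3.0, 1.0)
long_stay = log_marginal_likelihood_q(3.0, 50.0)
```

Working in log space avoids overflow in the Gamma function when the transition counts are large.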
-
single_cim_xu_marginal_likelihood_theta
(index: int, cim: PyCTBN.PyCTBN.structure_graph.conditional_intensity_matrix.ConditionalIntensityMatrix, alpha_xu: float, alpha_xxu: float)¶ Calculate the marginal likelihood over theta of the node when it assumes a specific value under a specific assignment of its parents
- Parameters
cim (class:'ConditionalIntensityMatrix') – A conditional_intensity_matrix object with the sufficient statistics
alpha_xu (float) – hyperparameter over the CTBN’s q parameters
alpha_xxu (float) – distributed hyperparameter over the CTBN’s theta parameters
- Returns
the value of the marginal likelihood over theta when the node assumes a specific value
- Return type
float
-
single_internal_cim_xxu_marginal_likelihood_theta
(M_xxu_suff_stats: float, alpha_xxu: float = 1)¶ Calculate the second part of the marginal likelihood over theta formula
- Parameters
M_xxu_suff_stats (float) – value of the sufficient statistic M[xx’|u]
alpha_xxu (float) – distributed hyperparameter over the CTBN’s theta parameters
- Returns
the value of the marginal likelihood over theta when the node assumes a specific value
- Return type
float
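The two parts of the marginal likelihood over theta can be sketched as follows. This assumes the usual Dirichlet-prior closed form from the CTBN literature; the names log_theta_term and log_marginal_likelihood_theta are hypothetical and only loosely mirror the methods documented above.

```python
from math import lgamma, log
from typing import Sequence


def log_theta_term(m_xxu: float, alpha_xxu: float = 1.0) -> float:
    """Second part: log Gamma(alpha_xx'|u + M[xx'|u]) - log Gamma(alpha_xx'|u)."""
    return lgamma(alpha_xxu + m_xxu) - lgamma(alpha_xxu)


def log_marginal_likelihood_theta(m_row: Sequence[float], alpha_xu: float,
                                  alpha_xxu: float) -> float:
    """Log marginal likelihood over theta for one state x and parent assignment u.

    m_row holds M[xx'|u] for every destination state x' different from x.
    """
    m_xu = sum(m_row)  # total number of transitions out of x
    first_part = lgamma(alpha_xu) - lgamma(alpha_xu + m_xu)
    second_part = sum(log_theta_term(m, alpha_xxu) for m in m_row)
    return first_part + second_part


# Three-state node: two possible destinations, alpha_xu = 1 spread evenly (0.5 each).
# A single observed transition to the first destination has probability 0.5.
one_jump = log_marginal_likelihood_theta([1.0, 0.0], alpha_xu=1.0, alpha_xxu=0.5)
expected = log(0.5)
```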
-
variable_cim_xu_marginal_likelihood_q
(cim: PyCTBN.PyCTBN.structure_graph.conditional_intensity_matrix.ConditionalIntensityMatrix, tau_xu: float = 0.1, alpha_xu: float = 1)¶ Calculate the value of the marginal likelihood over q given a cim
- Parameters
cim (class:'ConditionalIntensityMatrix') – A conditional_intensity_matrix object with the sufficient statistics
tau_xu (float) – hyperparameter over the CTBN’s q parameters
alpha_xu (float) – hyperparameter over the CTBN’s q parameters
- Returns
the value of the marginal likelihood over q
- Return type
float
-
variable_cim_xu_marginal_likelihood_theta
(cim: PyCTBN.PyCTBN.structure_graph.conditional_intensity_matrix.ConditionalIntensityMatrix, alpha_xu: float, alpha_xxu: float)¶ Calculate the value of the marginal likelihood over theta given a cim
- Parameters
cim (class:'ConditionalIntensityMatrix') – A conditional_intensity_matrix object with the sufficient statistics
alpha_xu (float) – hyperparameter over the CTBN’s q parameters
alpha_xxu (float) – distributed hyperparameter over the CTBN’s theta parameters
- Returns
the value of the marginal likelihood over theta
- Return type
float
-
PyCTBN.PyCTBN.estimators.parameters_estimator module¶
-
class PyCTBN.PyCTBN.estimators.parameters_estimator.ParametersEstimator(trajectories: PyCTBN.PyCTBN.structure_graph.trajectory.Trajectory, net_graph: PyCTBN.PyCTBN.structure_graph.network_graph.NetworkGraph)¶
Bases: object
Has the task of computing the cims of a particular node given the trajectories and the net structure in the graph _net_graph.
- Parameters
trajectories (Trajectory) – the trajectories
net_graph (NetworkGraph) – the net structure
- _single_set_of_cims
the set of cims object that will hold the cims of the node
-
compute_parameters_for_node
(node_id: str) → PyCTBN.PyCTBN.structure_graph.set_of_cims.SetOfCims¶ Compute the CIMS of the node identified by the label node_id.
- Parameters
node_id (string) – the node label
- Returns
A SetOfCims object filled with the computed CIMS
- Return type
SetOfCims
-
static
compute_state_res_time_for_node
(times: numpy.ndarray, trajectory: numpy.ndarray, cols_filter: numpy.ndarray, scalar_indexes_struct: numpy.ndarray, T: numpy.ndarray) → None¶ Compute the state residence times for a node and fill the matrix T with the results.
- Parameters
node_indx (int) – the index of the node
times (numpy.array) – the time deltas vector
trajectory (numpy.ndarray) – the trajectory
cols_filter (numpy.array) – the columns filtering structure
scalar_indexes_struct (numpy.array) – the indexing structure
T (numpy.ndarray) – the state residence times vectors
-
static
compute_state_transitions_for_a_node
(node_indx: int, trajectory: numpy.ndarray, cols_filter: numpy.ndarray, scalar_indexing: numpy.ndarray, M: numpy.ndarray) → None¶ Compute the state transitions for a node and fill the matrices M with the results.
- Parameters
node_indx (int) – the index of the node
trajectory (numpy.ndarray) – the trajectory
cols_filter (numpy.array) – the columns filtering structure
scalar_indexing (numpy.array) – the indexing structure
M (numpy.ndarray) – the state transitions matrices
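The two sufficient statistics filled in by the static methods above (residence times T and transition counts M) can be illustrated with a simplified sketch for a single variable with no parents. The function sufficient_statistics and its list-based inputs are hypothetical; the library works on NumPy arrays and additionally conditions on the parents' states through the filtering and indexing structures.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def sufficient_statistics(times: List[float], states: List[int]
                          ) -> Tuple[Dict[int, float], Dict[Tuple[int, int], int]]:
    """Return (T, M) for one parentless variable.

    T[x]       total time spent in state x
    M[(x, y)]  number of observed transitions from x to y (x != y)
    """
    T: Dict[int, float] = defaultdict(float)
    M: Dict[Tuple[int, int], int] = defaultdict(int)
    for i in range(len(states) - 1):
        # The state observed at times[i] holds until the next observation.
        T[states[i]] += times[i + 1] - times[i]
        if states[i + 1] != states[i]:
            M[(states[i], states[i + 1])] += 1
    return dict(T), dict(M)


times = [0.0, 1.5, 2.0, 4.0, 4.5]
states = [0, 1, 1, 0, 2]
T, M = sufficient_statistics(times, states)
```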
-
fast_init
(node_id: str) → None¶ Initializes all the necessary structures for the parameters estimation for the node node_id.
- Parameters
node_id (string) – the node label
PyCTBN.PyCTBN.estimators.structure_constraint_based_estimator module¶
-
class PyCTBN.PyCTBN.estimators.structure_constraint_based_estimator.StructureConstraintBasedEstimator(sample_path: PyCTBN.PyCTBN.structure_graph.sample_path.SamplePath, exp_test_alfa: float, chi_test_alfa: float, known_edges: List = [], thumb_threshold: int = 25)¶
Bases: PyCTBN.PyCTBN.estimators.structure_estimator.StructureEstimator
Has the task of estimating the network structure given the trajectories in samplepath by using a constraint-based approach.
- Parameters
sample_path (SamplePath) – the _sample_path object containing the trajectories and the real structure
exp_test_alfa (float) – the significance level for the exponential hypothesis test
chi_test_alfa (float) – the significance level for the chi hypothesis test
- _nodes
the nodes labels
- _nodes_vals
the nodes cardinalities
- _nodes_indxs
the nodes indexes
- _complete_graph
the complete directed graph built using the nodes labels in _nodes
- _cache
the Cache object
-
complete_test
(test_parent: str, test_child: str, parent_set: List, child_states_numb: int, tot_vars_count: int, parent_indx, child_indx) → bool¶ Performs a complete independence test on the directed graphs G1 = {test_child U parent_set} and G2 = {G1 U test_parent} (test_parent added as an additional parent of test_child). Generates all the structures and data needed to perform the tests.
- Parameters
test_parent (string) – the node label of the test parent
test_child (string) – the node label of the child
parent_set (List) – the common parent set
child_states_numb (int) – the cardinality of the test_child
tot_vars_count (int) – the total number of variables in the net
- Returns
True iff test_child and test_parent are independent given the sep_set parent_set. False otherwise
- Return type
bool
-
compute_thumb_value
(parent_val, child_val, parent_set_vals)¶ Compute the value to test against the thumb_threshold.
- Parameters
parent_val (int) – test parent’s variable cardinality
child_val (int) – test child’s variable cardinality
parent_set_vals (List) – the cardinalities of the nodes in the current sep-set
- Returns
the thumb value for the current independence test
- Return type
int
-
ctpc_algorithm
(disable_multiprocessing: bool = False)¶ Compute the CTPC algorithm over the entire net.
-
estimate_structure
(disable_multiprocessing: bool = False)¶ Estimate the structure (implements the abstract method declared in StructureEstimator)
- Returns
List of estimated edges
- Return type
typing.List
-
independence_test
(child_states_numb: int, cim1: PyCTBN.PyCTBN.structure_graph.conditional_intensity_matrix.ConditionalIntensityMatrix, cim2: PyCTBN.PyCTBN.structure_graph.conditional_intensity_matrix.ConditionalIntensityMatrix, thumb_value: float, parent_indx, child_indx) → bool¶ Compute the actual independence test using two cims. The exponential test is performed first; if the null hypothesis is not rejected, the chi test is performed as well.
- Parameters
child_states_numb (int) – the cardinality of the test child
cim1 (ConditionalIntensityMatrix) – a cim belonging to the graph without test parent
cim2 (ConditionalIntensityMatrix) – a cim belonging to the graph with test parent
- Returns
True iff both tests do NOT reject the null hypothesis of independence. False otherwise.
- Return type
bool
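The control flow described for independence_test (exponential test first, chi test only when the first one does not reject) can be sketched as below. The function two_stage_independence and the two callables are hypothetical stand-ins; the statistical tests themselves are not reproduced here.

```python
from typing import Callable, List


def two_stage_independence(exp_test: Callable[[], bool],
                           chi_test: Callable[[], bool]) -> bool:
    """Return True iff neither test rejects the null hypothesis of independence.

    Each callable returns True when its test does NOT reject the null hypothesis.
    The chi test runs only if the exponential test did not reject.
    """
    if not exp_test():
        return False
    return chi_test()


calls: List[str] = []


def exp_rejects() -> bool:
    calls.append("exp")
    return False  # the exponential test rejects independence


def chi_passes() -> bool:
    calls.append("chi")
    return True


result = two_stage_independence(exp_rejects, chi_passes)
```

Short-circuiting this way skips the second test whenever the first one already rules out independence.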
-
one_iteration_of_CTPC_algorithm
(var_id: str, tot_vars_count: int) → List¶ Performs an iteration of the CTPC algorithm using the node var_id as test_child.
- Parameters
var_id (string) – the node label of the test child
PyCTBN.PyCTBN.estimators.structure_estimator module¶
-
class PyCTBN.PyCTBN.estimators.structure_estimator.StructureEstimator(sample_path: PyCTBN.PyCTBN.structure_graph.sample_path.SamplePath, known_edges: List = None)¶
Bases: object
Has the task of estimating the network structure given the trajectories in samplepath.
- Parameters
sample_path (SamplePath) – the _sample_path object containing the trajectories and the real structure
- _nodes
the nodes labels
- _nodes_vals
the nodes cardinalities
- _nodes_indxs
the nodes indexes
- _complete_graph
the complete directed graph built using the nodes labels in _nodes
-
adjacency_matrix
() → numpy.ndarray¶ Converts the estimated structure _complete_graph to a boolean adjacency matrix representation.
- Returns
The adjacency matrix of the graph _complete_graph
- Return type
numpy.ndarray
-
static
build_complete_graph
(node_ids: List) → networkx.classes.digraph.DiGraph¶ Builds a complete directed graph (no self loops) given the nodes labels in the list node_ids.
- Parameters
node_ids (List) – the list of nodes labels
- Returns
a complete DiGraph object
- Return type
networkx.DiGraph
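As a small illustration of what "complete directed graph (no self loops)" means, the sketch below lists the edge set using only the standard library. The helper complete_digraph_edges is hypothetical; the documented method returns a networkx.DiGraph rather than a list of tuples.

```python
from itertools import permutations
from typing import List, Tuple


def complete_digraph_edges(node_ids: List[str]) -> List[Tuple[str, str]]:
    """All ordered pairs of distinct nodes: the edge set of a complete digraph."""
    return list(permutations(node_ids, 2))


edges = complete_digraph_edges(["X", "Y", "Z"])
# n nodes give n * (n - 1) directed edges, here 3 * 2 = 6.
```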
-
build_removable_edges_matrix
(known_edges: List)¶ Builds a boolean matrix that shows whether an edge can be removed, based on the given prior knowledge.
- Parameters
known_edges (List) – the list of known edges
- Returns
a boolean matrix
- Return type
np.ndarray
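A minimal sketch of such a matrix is shown below, under two assumptions that are not stated in this page: True marks an edge that may be removed, and rows index the parent while columns index the child. The function removable_edges_matrix and its nested-list output are hypothetical; the documented method returns an np.ndarray.

```python
from typing import List, Tuple


def removable_edges_matrix(node_ids: List[str],
                           known_edges: List[Tuple[str, str]]) -> List[List[bool]]:
    """matrix[i][j] is True when the edge node_ids[i] -> node_ids[j] may be removed."""
    index = {label: i for i, label in enumerate(node_ids)}
    size = len(node_ids)
    # Start by assuming every edge is removable.
    matrix = [[True] * size for _ in range(size)]
    # Edges supplied as prior knowledge are protected from removal.
    for parent, child in known_edges:
        matrix[index[parent]][index[child]] = False
    return matrix


matrix = removable_edges_matrix(["X", "Y", "Z"], [("X", "Y")])
```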
-
abstract
estimate_structure
() → List¶ Abstract method to estimate the structure
- Returns
List of estimated edges
- Return type
typing.List
-
static
generate_possible_sub_sets_of_size
(u: List, size: int, parent_label: str)¶ Creates a list containing all possible subsets of the list u of size size that do not contain the node identified by parent_label.
- Parameters
u (List) – the list of nodes
size (int) – the size of the subsets
parent_label (string) – the node to exclude in the subsets generation
- Returns
an Iterator Object containing a list of lists
- Return type
Iterator
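The subset generation described above can be sketched with itertools.combinations. The helper subsets_excluding is a hypothetical name, and the library's own implementation may differ in detail.

```python
from itertools import combinations
from typing import Iterator, List


def subsets_excluding(u: List[str], size: int, parent_label: str) -> Iterator[List[str]]:
    """Yield every subset of u with the given size that omits parent_label."""
    candidates = [node for node in u if node != parent_label]
    return map(list, combinations(candidates, size))


subsets = list(subsets_excluding(["A", "B", "C", "D"], 2, "B"))
```

Returning a lazy iterator avoids materialising all subsets at once, which matters because their number grows combinatorially with the size of u.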
-
save_plot_estimated_structure_graph
(file_path: str) → None¶ Plot the estimated structure in a graphical model style and save it to file_path (use a .png extension). Spurious edges are colored in red if a prior structure is present.
- Parameters
file_path (string) – path to save the file to
-
save_results
(file_path: str) → None¶ Save the estimated Structure to a .json file in file_path.
- Parameters
file_path (string) – the path including the file name with .json extension
-
spurious_edges
() → List¶ Return the spurious edges present in the estimated structure, if a prior net structure is present in _sample_path.structure.
- Returns
A list containing the spurious edges
- Return type
List
PyCTBN.PyCTBN.estimators.structure_score_based_estimator module¶
-
class PyCTBN.PyCTBN.estimators.structure_score_based_estimator.StructureScoreBasedEstimator(sample_path: PyCTBN.PyCTBN.structure_graph.sample_path.SamplePath, tau_xu: int = 0.1, alpha_xu: int = 1, known_edges: List = [])¶
Bases: PyCTBN.PyCTBN.estimators.structure_estimator.StructureEstimator
Has the task of estimating the network structure given the trajectories in samplepath by using a score based approach and different kinds of optimization algorithms.
- Parameters
sample_path (SamplePath) – the _sample_path object containing the trajectories and the real structure
tau_xu (float, optional) – hyperparameter over the CTBN’s q parameters, defaults to 0.1
alpha_xu (float, optional) – hyperparameter over the CTBN’s q parameters, defaults to 1
known_edges (List, optional) – List of known edges, defaults to []
-
estimate_parents
(node_id: str, max_parents: int = None, iterations_number: int = 40, patience: int = 10, tabu_length: int = None, tabu_rules_duration: int = 5, optimizer: str = 'hill')¶ Use the FamScore of a node in order to find the best parent nodes
- Parameters
node_id (string) – current node’s id
max_parents (int, optional) – maximum number of parents for each variable. If None, no limit is applied, defaults to None
iterations_number (int, optional) – maximum number of iterations of the optimization algorithm, defaults to 40
patience (int, optional) – number of iterations without any improvement before the search stops. If None, disabled, defaults to 10
tabu_length (int, optional) – maximum length of the data structures used in the optimization process, defaults to None
tabu_rules_duration (int, optional) – number of iterations in which each rule keeps its value, defaults to 5
optimizer (string, optional) – name of the optimizer algorithm. Possible values: ‘hill’ (hill climbing), ‘tabu’ (tabu search), defaults to ‘hill’
- Returns
A list of the best edges for the current node
- Return type
List
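How iterations_number, patience and max_parents might interact in a parent-set search can be sketched as follows. This is a deterministic best-neighbour local search in the spirit of hill climbing, not the library's optimizer: the function hill_climb_parents, the score callable (standing in for the FamScore) and the (parent, child) edge format are all assumptions made for illustration.

```python
from typing import Callable, FrozenSet, List, Optional, Tuple


def hill_climb_parents(node_id: str, candidates: List[str],
                       score: Callable[[FrozenSet[str]], float],
                       max_parents: Optional[int] = None,
                       iterations_number: int = 40,
                       patience: Optional[int] = 10) -> List[Tuple[str, str]]:
    """Search for the parent set of node_id that maximises `score`."""
    current: FrozenSet[str] = frozenset()
    best, best_score = current, score(current)
    stale = 0  # consecutive iterations without improvement
    for _ in range(iterations_number):
        # Neighbours differ from the current set by adding or removing one parent.
        neighbours = [current ^ {c} for c in candidates if c != node_id]
        if max_parents is not None:
            neighbours = [s for s in neighbours if len(s) <= max_parents]
        if not neighbours:
            break
        current = max(neighbours, key=score)
        current_score = score(current)
        if current_score > best_score:
            best, best_score, stale = current, current_score, 0
        else:
            stale += 1
            if patience is not None and stale >= patience:
                break
    return [(parent, node_id) for parent in sorted(best)]


# Toy score: the fewer differences from the "true" parent set, the better.
truth = frozenset({"A", "C"})
best_edges = hill_climb_parents("X", ["A", "B", "C"],
                                score=lambda s: -float(len(s ^ truth)))
```

In this toy run the search reaches the set {A, C} after two moves and then stops once the patience budget is exhausted.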
-
estimate_structure
(max_parents: int = None, iterations_number: int = 40, patience: int = None, tabu_length: int = None, tabu_rules_duration: int = None, optimizer: str = 'tabu', disable_multiprocessing: bool = False)¶ Compute the score-based algorithm to find the optimal structure
- Parameters
max_parents (int, optional) – maximum number of parents for each variable. If None, no limit is applied, defaults to None
iterations_number (int, optional) – maximum number of iterations of the optimization algorithm, defaults to 40
patience (int, optional) – number of iterations without any improvement before the search stops. If None, disabled, defaults to None
tabu_length (int, optional) – maximum length of the data structures used in the optimization process, defaults to None
tabu_rules_duration (int, optional) – number of iterations in which each rule keeps its value, defaults to None
optimizer (string, optional) – name of the optimizer algorithm. Possible values: ‘hill’ (hill climbing), ‘tabu’ (tabu search), defaults to ‘tabu’
disable_multiprocessing (Boolean, optional) – True to disable the multiprocessing operations, defaults to False
-
get_score_from_graph
(graph: PyCTBN.PyCTBN.structure_graph.network_graph.NetworkGraph, node_id: str)¶ Get the FamScore of a node
- Parameters
node_id (string) – current node’s id
graph (class:'NetworkGraph') – current graph to be computed
- Returns
The FamScore for this graph structure
- Return type
float