PyCTBN.PyCTBN.utility package¶

Submodules¶

PyCTBN.PyCTBN.utility.abstract_importer module¶

class PyCTBN.PyCTBN.utility.abstract_importer.AbstractImporter(file_path: str = None, trajectory_list: Union[pandas.core.frame.DataFrame, numpy.ndarray] = None, variables: pandas.core.frame.DataFrame = None, prior_net_structure: pandas.core.frame.DataFrame = None)¶

Bases: abc.ABC

Abstract class that exposes all the necessary methods to process the trajectories and the net structure.

Parameters

file_path (str) – the file path, or dataset name if you import already processed data
trajectory_list (typing.Union[pandas.DataFrame, numpy.ndarray]) – Dataframe or numpy array containing the concatenation of all the processed trajectories
variables (pandas.DataFrame) – Dataframe containing the nodes labels and cardinalities
prior_net_structure (pandas.DataFrame) – Dataframe containing the structure of the network (edges)

_sorter

A list containing the variable labels in the SAME order as the columns in concatenated_samples

Warning

The parameters variables and prior_net_structure HAVE to be properly constructed as Pandas Dataframes with the following structure:

Header of _df_structure = [From_Node | To_Node]
Header of _df_variables = [Variable_Label | Variable_Cardinality]

See the tutorial on how to construct a correct concatenated_samples Dataframe/ndarray.
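As a minimal sketch of the layout described in this warning, the two frames could be built as shown below. The column names are copied literally from the headers above, and the node labels X, Y, Z and their cardinalities are made up for illustration; the exact names expected by a concrete importer may differ, so treat this as an assumption rather than the library's required input.

```python
import pandas as pd

# Hypothetical three-node network; labels and cardinalities are illustrative only.
variables = pd.DataFrame(
    {
        "Variable_Label": ["X", "Y", "Z"],
        "Variable_Cardinality": [2, 3, 2],
    }
)

# One row per directed edge (From_Node -> To_Node).
prior_net_structure = pd.DataFrame(
    {
        "From_Node": ["X", "Y"],
        "To_Node": ["Y", "Z"],
    }
)

# Sanity check: every node named in an edge also appears in the variables frame.
edge_nodes = set(prior_net_structure["From_Node"]) | set(prior_net_structure["To_Node"])
missing_nodes = edge_nodes - set(variables["Variable_Label"])
```

An empty missing_nodes set means the edge list only refers to declared variables.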

Note

See JsonImporter for an example implementation.

build_list_of_samples_array(concatenated_sample: pandas.core.frame.DataFrame) → List¶

Builds a List containing the delta times numpy array and the complete transitions matrix.

Parameters: concatenated_sample (pandas.DataFrame) – the dataframe/array from which the time and transitions matrix have to be extracted and converted
Returns: the resulting list of numpy arrays
Return type: List
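The split into a time array and a transitions matrix can be sketched as follows. This is not the library's code: it assumes the first column of the concatenated frame holds the delta times and that every remaining column holds integer state values, and the function name is hypothetical.

```python
import pandas as pd

def build_list_of_samples_array_sketch(concatenated_sample: pd.DataFrame) -> list:
    """Split a frame into [delta_times, transitions] numpy arrays.

    Assumption: column 0 is the time column, all other columns are states.
    """
    values = concatenated_sample.to_numpy()
    delta_times = values[:, 0].astype(float)
    transitions = values[:, 1:].astype(int)
    return [delta_times, transitions]

# Illustrative frame with one variable X and its shifted copy XS.
frame = pd.DataFrame({"Time": [0.5, 0.75], "X": [0, 1], "XS": [1, 0]})
arrays = build_list_of_samples_array_sketch(frame)
```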

abstract build_sorter(trajecory_header: object) → List¶

Initializes the _sorter class member from a trajectory dataframe, extracting the header of the frame and keeping ONLY the symbolic labels of the variables, cutting out the time label in the header.

Parameters: trajecory_header (object) – an object that will be used to define the header
Returns: A list containing the processed header.
Return type: List
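A concrete subclass has to decide how the header is obtained. One possible shape of that logic, assuming the header is a plain list of column names and the time column is called "Time" (both assumptions, as is the function name), is:

```python
from typing import List

def build_sorter_sketch(header: List[str], time_label: str = "Time") -> List[str]:
    """Return the variable labels in header order, dropping the time label."""
    return [label for label in header if label != time_label]

# Hypothetical trajectory header: one time column followed by three variables.
sorter = build_sorter_sketch(["Time", "X", "Y", "Z"])
```

The order of the returned labels matches the order of the columns, which is what the _sorter member is documented to preserve.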

clear_concatenated_frame() → None¶: Removes all values in the dataframe concatenated_samples.

compute_row_delta_in_all_samples_frames(df_samples_list: List) → None¶

Calls the method compute_row_delta_sigle_samples_frame on every dataframe present in the list df_samples_list. Concatenates the results in the dataframe concatenated_samples.

Parameters: df_samples_list (List) – the list of dataframes to be processed and concatenated

Warning

The Dataframe sample_frame has to follow the column structure of this header:

Header of sample_frame = [Time | Variable values]

The class member self._sorter HAS to be properly INITIALIZED (see the class members definition doc).

Note

After the call of this method the class member concatenated_samples will contain all processed and merged trajectories.

compute_row_delta_sigle_samples_frame(sample_frame: pandas.core.frame.DataFrame, columns_header: List, shifted_cols_header: List) → pandas.core.frame.DataFrame¶

Computes the difference between consecutive values in the time column. Copies all the values present in the remaining columns and shifts them up by one position.

Parameters

sample_frame (pandas.DataFrame) – the trajectory to be processed
columns_header (List) – the original header of sample_frame
shifted_cols_header (List) – a copy of columns_header with changed names of the contents

Returns

The processed dataframe

Return type

pandas.DataFrame

Warning

The Dataframe sample_frame has to follow the column structure of this header:

Header of sample_frame = [Time | Variable values]
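The delta-and-shift step can be illustrated with plain pandas calls. The sketch below is an approximation under stated assumptions, not the library's implementation: it takes the first column as the time column, stores in each row the time elapsed until the next row, fills the empty slot created by the shift with 0, and drops the final row because it has no successor. The function name and the "S" suffix for shifted columns are hypothetical.

```python
import pandas as pd

def row_delta_sketch(sample_frame, columns_header, shifted_cols_header):
    """Return a copy with time deltas and one-step-ahead copies of the variables."""
    out = sample_frame.copy()
    time_col = out.columns[0]  # assumption: the time column comes first
    # Time elapsed between each row and the following one.
    out[time_col] = out[time_col].diff().shift(-1)
    # Next-row value of every variable, stored under the shifted names.
    shifted = sample_frame[columns_header].shift(-1).fillna(0).astype("int64")
    shifted.columns = shifted_cols_header
    out = pd.concat([out, shifted], axis=1)
    # The last row has no successor, so it is dropped (assumption).
    return out.iloc[:-1]

trajectory = pd.DataFrame({"Time": [0.0, 0.5, 1.25], "X": [0, 1, 0], "Y": [1, 1, 2]})
processed = row_delta_sketch(trajectory, ["X", "Y"], ["XS", "YS"])
```

Each row of processed then describes one transition: how long the system stayed in the state given by X and Y, and the state it moved to in XS and YS.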

property concatenated_samples¶

abstract dataset_id() → object¶: If the original dataset contains multiple datasets, this method returns a unique id to identify the current dataset.

property file_path¶

property sorter¶

property structure¶

property variables¶

PyCTBN.PyCTBN.utility.cache module¶

class PyCTBN.PyCTBN.utility.cache.Cache¶

Bases: object

This class acts as a cache of SetOfCims objects for a node.

__list_of_sets_of_parents: a list of Set objects holding the parents to which the SetOfCims stored at the SAME index in the cache is related
__actual_cache: a list of SetOfCims objects

clear()¶: Clears the contents of both __actual_cache and __list_of_sets_of_parents.

find(parents_comb: Set)¶

Looks up in the cache the SetOfCims related to the symbolic parents combination parents_comb.

Parameters: parents_comb (Set) – the parents related to that SetOfCims
Returns: A SetOfCims object if the parents_comb index is found in __list_of_sets_of_parents. None otherwise.
Return type: SetOfCims

put(parents_comb: Set, socim: PyCTBN.PyCTBN.structure_graph.set_of_cims.SetOfCims)¶

Places the SetOfCims object in the cache and stores the related symbolic index parents_comb in __list_of_sets_of_parents.

Parameters

parents_comb (Set) – the symbolic set index
socim (SetOfCims) – the related SetOfCims object
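The two parallel lists described for this class suggest a lookup by position. The following stand-alone sketch mirrors that idea with hypothetical names and a plain string standing in for a SetOfCims object; it is an illustration of the documented behaviour, not the library's source.

```python
from typing import Any, List, Optional, Set

class CacheSketch:
    """Parallel-list cache: the parents set at index i maps to the value at index i."""

    def __init__(self) -> None:
        self._list_of_sets_of_parents: List[Set[str]] = []
        self._actual_cache: List[Any] = []

    def find(self, parents_comb: Set[str]) -> Optional[Any]:
        # list.index compares sets by equality, so element order does not matter.
        try:
            index = self._list_of_sets_of_parents.index(parents_comb)
        except ValueError:
            return None
        return self._actual_cache[index]

    def put(self, parents_comb: Set[str], socim: Any) -> None:
        self._list_of_sets_of_parents.append(parents_comb)
        self._actual_cache.append(socim)

    def clear(self) -> None:
        self._list_of_sets_of_parents.clear()
        self._actual_cache.clear()

cache = CacheSketch()
cache.put({"X", "Y"}, "cims-for-X-Y")   # placeholder for a SetOfCims object
hit = cache.find({"Y", "X"})            # same set, different literal order
miss = cache.find({"Z"})
```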

PyCTBN.PyCTBN.utility.json_importer module¶

class PyCTBN.PyCTBN.utility.json_importer.JsonImporter(file_path: str, samples_label: str, structure_label: str, variables_label: str, time_key: str, variables_key: str)¶

Bases: PyCTBN.PyCTBN.utility.abstract_importer.AbstractImporter

Implements the abstract methods of AbstractImporter and adds all the necessary methods to process and prepare data stored in JSON format.

Parameters

file_path (string) – the path of the file that contains the data to be imported
samples_label (string) – the reference key for the samples in the trajectories
structure_label (string) – the reference key for the structure of the network data
variables_label (string) – the reference key for the cardinalities of the nodes data
time_key (string) – the key used to identify the timestamps in each trajectory
variables_key (string) – the key used to identify the names of the variables in the net

_array_indx

the index of the outer JsonArray to extract the data from

_df_samples_list

a Dataframe list in which every dataframe contains a trajectory

_raw_data

The raw contents of the json file to import

build_sorter(sample_frame: pandas.core.frame.DataFrame) → List¶: Implements the abstract method build_sorter of the AbstractImporter for this dataset.

clear_data_frame_list() → None¶: Removes all values present in the dataframes in the list _df_samples_list.

dataset_id() → object¶: If the original dataset contains multiple datasets, this method returns a unique id to identify the current dataset.

import_data(indx: int = 0) → None¶

Implements the abstract method of AbstractImporter.

Parameters: indx (int) – the index of the outer JsonArray to extract the data from, defaults to 0

import_sampled_cims(raw_data: List, indx: int, cims_key: str) → Dict¶

Imports the synthetic CIMS of the dataset into a dictionary, using the variable labels as keys for the set of CIMS of a particular node.

Parameters

raw_data (List) – List of Dicts
indx (int) – The index of the array from which the data have to be extracted
cims_key (string) – the key where the json object cims are placed

Returns

a dictionary containing the sampled CIMS for all the variables in the net

Return type

Dictionary

import_structure(raw_data: List) → pandas.core.frame.DataFrame¶

Imports in a dataframe the data in the list raw_data at the key _structure_label

Parameters: raw_data (List) – List of Dicts
Returns: Dataframe containing the starting node and the ending node of every arc of the network
Return type: pandas.DataFrame

import_trajectories(raw_data: List) → List¶

Imports the trajectories from the list of dicts raw_data.

Parameters: raw_data (List) – List of Dicts
Returns: List of dataframes containing all the trajectories
Return type: List

import_variables(raw_data: List) → pandas.core.frame.DataFrame¶

Imports the data in raw_data at the key _variables_label.

Parameters: raw_data (List) – List of Dicts
Returns: Dataframe containing the symbolic labels of the variables and their cardinalities
Return type: pandas.DataFrame

normalize_trajectories(raw_data: List, indx: int, trajectories_key: str) → List¶

Extracts the trajectories in raw_data at the index indx at the key trajectories_key.

Parameters

raw_data (List) – List of Dicts
indx (int) – The index of the array from which the data have to be extracted
trajectories_key (string) – the key of the trajectories objects

Returns

A list of dataframes containing the trajectories

Return type

List

one_level_normalizing(raw_data: List, indx: int, key: str) → pandas.core.frame.DataFrame¶

Extracts the one-level nested data in the list raw_data at the index indx at the key key.

Parameters

raw_data (List) – List of Dicts
indx (int) – The index of the array from which the data have to be extracted
key (string) – the key of the Dicts from which to extract the data

Returns

A normalized dataframe

Return type

pandas.DataFrame
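For data nested one level deep, the extraction amounts to indexing the outer list, then the key, and handing the resulting records to pandas. The sketch below assumes a list of dicts whose "variables" entry is a list of flat records with "Name" and "Value" fields; those key names, the sample values and the function name are all illustrative assumptions.

```python
import pandas as pd

# Hypothetical raw contents: an outer JSON array with a single dataset.
raw_data = [
    {
        "variables": [
            {"Name": "X", "Value": 2},
            {"Name": "Y", "Value": 3},
        ]
    }
]

def one_level_normalizing_sketch(raw_data: list, indx: int, key: str) -> pd.DataFrame:
    """Build a frame from the list of flat records stored at raw_data[indx][key]."""
    return pd.DataFrame(raw_data[indx][key])

variables_frame = one_level_normalizing_sketch(raw_data, 0, "variables")
```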

read_json_file() → List¶

Reads the JSON file in the path self.filePath.

Returns: The contents of the json file
Return type: List
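Reading such a file needs nothing beyond the standard library. The sketch below writes a small, made-up payload to a temporary file and loads it back; the function name, the file name and the payload keys are assumptions chosen for illustration.

```python
import json
import os
import tempfile

def read_json_file_sketch(file_path: str) -> list:
    """Load and return the contents of the JSON file at file_path."""
    with open(file_path, encoding="utf-8") as handle:
        return json.load(handle)

# Illustrative payload: an outer array holding one (empty) dataset.
payload = [{"samples": [], "variables": []}]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "dataset.json")
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(payload, handle)
    loaded = read_json_file_sketch(path)
```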

PyCTBN.PyCTBN.utility.sample_importer module¶

class PyCTBN.PyCTBN.utility.sample_importer.SampleImporter(trajectory_list: Union[pandas.core.frame.DataFrame, numpy.ndarray, List] = None, variables: Union[pandas.core.frame.DataFrame, numpy.ndarray, List] = None, prior_net_structure: Union[pandas.core.frame.DataFrame, numpy.ndarray, List] = None)¶

Bases: PyCTBN.PyCTBN.utility.abstract_importer.AbstractImporter

Implements the abstract methods of AbstractImporter and adds all the necessary methods to process and prepare data loaded directly from DataFrames.

Parameters

trajectory_list (typing.Union[pd.DataFrame, np.ndarray, typing.List]) – the data that describes the trajectories
variables (typing.Union[pd.DataFrame, np.ndarray, typing.List]) – the data that describes the variables with name and cardinality
prior_net_structure (typing.Union[pd.DataFrame, np.ndarray, typing.List]) – the data of the real structure, if it exists

_df_samples_list

a Dataframe list in which every dataframe contains a trajectory

_raw_data

The raw contents of the json file to import

build_sorter(sample_frame: pandas.core.frame.DataFrame) → List¶: Implements the abstract method build_sorter of the AbstractImporter in order to get the ordered variables list.

dataset_id() → str¶: If the original dataset contains multiple datasets, this method returns a unique id to identify the current dataset.

import_data(header_column=None)¶

Module contents¶