PyCTBN.PyCTBN.utility package

Submodules

PyCTBN.PyCTBN.utility.abstract_importer module

class PyCTBN.PyCTBN.utility.abstract_importer.AbstractImporter(file_path: str = None, trajectory_list: Union[pandas.core.frame.DataFrame, numpy.ndarray] = None, variables: pandas.core.frame.DataFrame = None, prior_net_structure: pandas.core.frame.DataFrame = None)

Bases: abc.ABC

Abstract class that exposes all the necessary methods to process the trajectories and the net structure.

Parameters
  • file_path (str) – the file path, or dataset name if you import already processed data

  • trajectory_list (typing.Union[pandas.DataFrame, numpy.ndarray]) – Dataframe or numpy array containing the concatenation of all the processed trajectories

  • variables (pandas.DataFrame) – Dataframe containing the nodes labels and cardinalities

  • prior_net_structure (pandas.DataFrame) – Dataframe containing the structure of the network (edges)

_sorter

A list containing the variable labels in the SAME order as the columns in concatenated_samples

Warning

The parameters variables and prior_net_structure HAVE to be properly constructed as Pandas Dataframes with the following structure:

Header of _df_structure = [From_Node | To_Node]

Header of _df_variables = [Variable_Label | Variable_Cardinality]

See the tutorial on how to construct a correct concatenated_samples Dataframe/ndarray.
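As a rough illustration of the two headers described in this warning, the frames could be built as follows. This is a minimal sketch: the node labels X, Y, Z and their cardinalities are made up, and the column names are simply copied from the headers above, so the names the library actually expects may differ.

```python
import pandas as pd

# Hypothetical three-node network; labels and cardinalities are illustrative only.
variables = pd.DataFrame(
    {"Variable_Label": ["X", "Y", "Z"], "Variable_Cardinality": [3, 3, 2]}
)

# One row per directed edge of the prior structure (From_Node -> To_Node).
prior_net_structure = pd.DataFrame(
    {"From_Node": ["X", "Y"], "To_Node": ["Y", "Z"]}
)
```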

Note

See JsonImporter for an example implementation.

build_list_of_samples_array(concatenated_sample: pandas.core.frame.DataFrame) → List

Builds a List containing the delta times numpy array and the complete transitions matrix.

Parameters

concatenated_sample (pandas.DataFrame) – the dataframe/array from which the time and transitions matrix have to be extracted and converted

Returns

the resulting list of numpy arrays

Return type

List
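The split described above might look like the sketch below. It assumes the first column holds the time deltas and every other column holds integer state values. The helper name split_time_and_transitions and the sample columns are hypothetical and not part of PyCTBN.

```python
import pandas as pd

def split_time_and_transitions(concatenated_sample: pd.DataFrame) -> list:
    """Hypothetical helper: return [delta_times, transitions] as numpy arrays."""
    array = concatenated_sample.to_numpy()
    delta_times = array[:, 0].astype(float)   # assumed: first column is the time delta
    transitions = array[:, 1:].astype(int)    # assumed: remaining columns are states
    return [delta_times, transitions]

frame = pd.DataFrame({"Time": [0.5, 1.2], "X": [0, 1], "Y": [2, 2]})
times, transitions = split_time_and_transitions(frame)
```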

abstract build_sorter(trajecory_header: object) → List

Initializes the _sorter class member from a trajectory dataframe, extracting the header of the frame and keeping ONLY the symbolic labels of the variables, cutting out the time label in the header.

Parameters

trajecory_header (object) – an object that will be used to define the header

Returns

A list containing the processed header.

Return type

List
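A concrete subclass could implement this roughly as sketched below, assuming the header object is a dataframe and the time column is called "Time". The function name sorter_from_frame and the time_label parameter are hypothetical.

```python
import pandas as pd

def sorter_from_frame(sample_frame: pd.DataFrame, time_label: str = "Time") -> list:
    """Hypothetical helper: keep every column label except the time label."""
    return [label for label in sample_frame.columns if label != time_label]

frame = pd.DataFrame({"Time": [0.0, 0.4], "X": [0, 1], "Y": [1, 1]})
sorter = sorter_from_frame(frame)
```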

clear_concatenated_frame() → None

Removes all values in the dataframe concatenated_samples.

compute_row_delta_in_all_samples_frames(df_samples_list: List) → None

Calls the method compute_row_delta_sigle_samples_frame on every dataframe present in the list df_samples_list and concatenates the results in the dataframe concatenated_samples.

Parameters

df_samples_list (List) – the list of dataframes to be processed and concatenated

Warning

The Dataframe sample_frame has to follow the column structure of this header:

Header of sample_frame = [Time | Variable values]

The class member self._sorter HAS to be properly INITIALIZED (see the class members definition doc).

Note

After the call of this method, the class member concatenated_samples will contain all the processed and merged trajectories.

compute_row_delta_sigle_samples_frame(sample_frame: pandas.core.frame.DataFrame, columns_header: List, shifted_cols_header: List) → pandas.core.frame.DataFrame

Computes the difference between consecutive values in the time column. Copies all the values in the remaining columns and shifts them up by one position.

Parameters
  • sample_frame (pandas.DataFrame) – the trajectory to be processed

  • columns_header (List) – the original header of sample_frame

  • shifted_cols_header (List) – a copy of columns_header with changed names of the contents

Returns

The processed dataframe

Return type

pandas.DataFrame

Warning

The Dataframe sample_frame has to follow the column structure of this header:

Header of sample_frame = [Time | Variable values]
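The delta-and-shift step can be sketched with plain pandas as below. This is an illustration under assumptions, not the library's code: the function name row_delta is hypothetical, the time column is assumed to come first, and the last row is dropped because it has no successor.

```python
import pandas as pd

def row_delta(sample_frame, columns_header, shifted_cols_header):
    """Hypothetical sketch of the delta/shift step on a single trajectory."""
    frame = sample_frame.copy()
    time_col = frame.columns[0]  # assumed: the time column is the first one
    # Time spent in each row's state: next timestamp minus the current one.
    frame[time_col] = frame[time_col].diff().shift(-1)
    # Next-state values: variable columns shifted up by one row; the last row
    # has no successor, so it is discarded before converting back to integers.
    shifted = frame[columns_header].shift(-1).iloc[:-1].astype(int)
    shifted.columns = shifted_cols_header
    return pd.concat([frame.iloc[:-1], shifted], axis=1)

trajectory = pd.DataFrame({"Time": [0.0, 1.5, 4.0], "X": [0, 1, 1], "Y": [2, 2, 0]})
processed = row_delta(trajectory, ["X", "Y"], ["X_next", "Y_next"])
```

Each remaining row then pairs a state, the time spent in it, and the state that follows it.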

property concatenated_samples
abstract dataset_id() → object

If the original dataset contains multiple datasets, this method returns a unique id to identify the current dataset.

property file_path
property sorter
property structure
property variables

PyCTBN.PyCTBN.utility.cache module

class PyCTBN.PyCTBN.utility.cache.Cache

Bases: object

This class acts as a cache of SetOfCims objects for a node.

__list_of_sets_of_parents

a list of Set objects holding the parents to which the SetOfCims at the SAME index in the cache is related

__actual_cache

a list of SetOfCims objects

clear()

Clears the contents of both __actual_cache and __list_of_sets_of_parents.

find(parents_comb: Set)

Looks up in the cache the SetOfCims related to the symbolic parents combination parents_comb.

Parameters

parents_comb (Set) – the parents related to that SetOfCims

Returns

A SetOfCims object if the parents_comb index is found in __list_of_sets_of_parents. None otherwise.

Return type

SetOfCims

put(parents_comb: Set, socim: PyCTBN.PyCTBN.structure_graph.set_of_cims.SetOfCims)

Place in cache the SetOfCims object, and the related symbolic index parents_comb in __list_of_sets_of_parents.

Parameters
  • parents_comb (Set) – the symbolic set index

  • socim (SetOfCims) – the related SetOfCims object
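The index-aligned layout described for this class can be pictured with the stand-in below. It is a sketch under assumptions rather than the PyCTBN implementation: the class name ParallelListCache is hypothetical, and plain strings take the place of SetOfCims objects.

```python
from typing import Any, List, Optional, Set

class ParallelListCache:
    """Hypothetical stand-in for Cache: two lists kept aligned by index."""

    def __init__(self) -> None:
        self._keys: List[Set[str]] = []    # plays the role of __list_of_sets_of_parents
        self._values: List[Any] = []       # plays the role of __actual_cache

    def find(self, parents_comb: Set[str]) -> Optional[Any]:
        try:
            return self._values[self._keys.index(parents_comb)]
        except ValueError:                 # parents_comb was never stored
            return None

    def put(self, parents_comb: Set[str], socim: Any) -> None:
        self._keys.append(parents_comb)
        self._values.append(socim)

    def clear(self) -> None:
        self._keys.clear()
        self._values.clear()

cache = ParallelListCache()
cache.put({"X", "Y"}, "cims for parents X, Y")
hit = cache.find({"Y", "X"})   # set equality ignores element order
miss = cache.find({"Z"})
```

Because the lookup is a linear scan over the stored sets, this layout suits the small number of parent combinations typically tried for a single node.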

PyCTBN.PyCTBN.utility.json_importer module

class PyCTBN.PyCTBN.utility.json_importer.JsonImporter(file_path: str, samples_label: str, structure_label: str, variables_label: str, time_key: str, variables_key: str)

Bases: PyCTBN.PyCTBN.utility.abstract_importer.AbstractImporter

Implements the abstract methods of AbstractImporter and adds all the necessary methods to process and prepare data stored in JSON files.

Parameters
  • file_path (string) – the path of the file that contains the data to be imported

  • samples_label (string) – the reference key for the samples in the trajectories

  • structure_label (string) – the reference key for the structure of the network data

  • variables_label (string) – the reference key for the cardinalities of the nodes data

  • time_key (string) – the key used to identify the timestamps in each trajectory

  • variables_key (string) – the key used to identify the names of the variables in the net

_array_indx

the index of the outer JsonArray to extract the data from

_df_samples_list

a Dataframe list in which every dataframe contains a trajectory

_raw_data

The raw contents of the json file to import

build_sorter(sample_frame: pandas.core.frame.DataFrame) → List

Implements the abstract method build_sorter of the AbstractImporter for this dataset.

clear_data_frame_list() → None

Removes all values present in the dataframes in the list _df_samples_list.

dataset_id() → object

If the original dataset contains multiple datasets, this method returns a unique id to identify the current dataset.

import_data(indx: int = 0) → None

Implements the abstract method of AbstractImporter.

Parameters

indx (int) – the index of the outer JsonArray to extract the data from; defaults to 0

import_sampled_cims(raw_data: List, indx: int, cims_key: str) → Dict

Imports the synthetic CIMs in the dataset into a dictionary, using the variable labels as keys for the set of CIMs of a particular node.

Parameters
  • raw_data (List) – List of Dicts

  • indx (int) – The index of the array from which the data have to be extracted

  • cims_key (string) – the key where the json object cims are placed

Returns

a dictionary containing the sampled CIMS for all the variables in the net

Return type

Dictionary

import_structure(raw_data: List) → pandas.core.frame.DataFrame

Imports in a dataframe the data in the list raw_data at the key _structure_label

Parameters

raw_data (List) – List of Dicts

Returns

Dataframe containing the starting node and the ending node of every arc of the network

Return type

pandas.DataFrame

import_trajectories(raw_data: List) → List

Imports the trajectories from the list of dicts raw_data.

Parameters

raw_data (List) – List of Dicts

Returns

List of dataframes containing all the trajectories

Return type

List

import_variables(raw_data: List) → pandas.core.frame.DataFrame

Imports the data in raw_data at the key _variables_label.

Parameters

raw_data (List) – List of Dicts

Returns

Dataframe containing the symbolic labels of the variables and their cardinalities

Return type

pandas.DataFrame

normalize_trajectories(raw_data: List, indx: int, trajectories_key: str) → List

Extracts the trajectories in raw_data at the index indx at the key trajectories_key.

Parameters
  • raw_data (List) – List of Dicts

  • indx (int) – The index of the array from which the data have to be extracted

  • trajectories_key (string) – the key of the trajectories objects

Returns

A list of dataframes containing the trajectories

Return type

List
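One way to picture this extraction is the sketch below. It assumes each trajectory is stored as a list of records under the given key. The key "samples", the record fields, and the helper name trajectories_to_frames are all hypothetical.

```python
import pandas as pd

# Hypothetical raw data: one outer element holding two short trajectories.
raw_data = [
    {
        "samples": [
            [{"Time": 0.0, "X": 0, "Y": 1}, {"Time": 0.7, "X": 1, "Y": 1}],
            [{"Time": 0.0, "X": 2, "Y": 0}, {"Time": 1.1, "X": 2, "Y": 1}],
        ]
    }
]

def trajectories_to_frames(raw_data, indx, trajectories_key):
    """Hypothetical helper: one dataframe per trajectory found at raw_data[indx][key]."""
    return [pd.DataFrame(records) for records in raw_data[indx][trajectories_key]]

frames = trajectories_to_frames(raw_data, 0, "samples")
```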

one_level_normalizing(raw_data: List, indx: int, key: str) → pandas.core.frame.DataFrame

Extracts the one-level nested data in the list raw_data at the index indx at the key key.

Parameters
  • raw_data (List) – List of Dicts

  • indx (int) – The index of the array from which the data have to be extracted

  • key (string) – the key of the Dicts from which to extract the data

Returns

A normalized dataframe

Return type

pandas.DataFrame
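For data nested one level deep, the extraction might reduce to building a dataframe directly from the list of records, as sketched here. The key "structure", the field names "From" and "To", and the helper name one_level are assumptions made for this example only.

```python
import pandas as pd

# Hypothetical raw data: the edges of a small network stored as flat records.
raw_data = [{"structure": [{"From": "X", "To": "Y"}, {"From": "Y", "To": "Z"}]}]

def one_level(raw_data, indx, key):
    """Hypothetical helper: turn the flat records at raw_data[indx][key] into a frame."""
    return pd.DataFrame(raw_data[indx][key])

structure = one_level(raw_data, 0, "structure")
```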

read_json_file() → List

Reads the JSON file at the path file_path.

Returns

The contents of the json file

Return type

List
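Reading such a file needs only the standard library. The sketch below writes a tiny hypothetical dataset to a temporary directory and loads it back as a list of dicts. The file name, the keys, and the helper name read_json are illustrative.

```python
import json
import tempfile
from pathlib import Path

def read_json(path):
    """Hypothetical helper: load the whole JSON document at `path`."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "networks.json"
    # Hypothetical content: an outer array with a single dataset.
    content = [{"variables": [{"Name": "X", "Value": 3}]}]
    path.write_text(json.dumps(content), encoding="utf-8")
    raw_data = read_json(path)
```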

PyCTBN.PyCTBN.utility.sample_importer module

class PyCTBN.PyCTBN.utility.sample_importer.SampleImporter(trajectory_list: Union[pandas.core.frame.DataFrame, numpy.ndarray, List] = None, variables: Union[pandas.core.frame.DataFrame, numpy.ndarray, List] = None, prior_net_structure: Union[pandas.core.frame.DataFrame, numpy.ndarray, List] = None)

Bases: PyCTBN.PyCTBN.utility.abstract_importer.AbstractImporter

Implements the abstract methods of AbstractImporter and adds all the necessary methods to process and prepare data loaded directly from DataFrames.

Parameters
  • trajectory_list (typing.Union[pd.DataFrame, np.ndarray, typing.List]) – the data that describes the trajectories

  • variables (typing.Union[pd.DataFrame, np.ndarray, typing.List]) – the data that describes the variables with name and cardinality

  • prior_net_structure (typing.Union[pd.DataFrame, np.ndarray, typing.List]) – the data of the real structure, if it exists

_df_samples_list

a Dataframe list in which every dataframe contains a trajectory

_raw_data

The raw contents of the json file to import

build_sorter(sample_frame: pandas.core.frame.DataFrame) → List

Implements the abstract method build_sorter of the AbstractImporter in order to get the ordered variables list.

dataset_id() → str

If the original dataset contains multiple datasets, this method returns a unique id to identify the current dataset.

import_data(header_column=None)

Module contents