PyCTBN.PyCTBN.utility package¶
Submodules¶
PyCTBN.PyCTBN.utility.abstract_importer module¶
-
class
PyCTBN.PyCTBN.utility.abstract_importer.
AbstractImporter
(file_path: str = None, trajectory_list: Union[pandas.core.frame.DataFrame, numpy.ndarray] = None, variables: pandas.core.frame.DataFrame = None, prior_net_structure: pandas.core.frame.DataFrame = None)¶ Bases:
abc.ABC
Abstract class that exposes all the necessary methods to process the trajectories and the net structure.
- Parameters
file_path (str) – the file path, or dataset name if you import already processed data
trajectory_list (typing.Union[pandas.DataFrame, numpy.ndarray]) – Dataframe or numpy array containing the concatenation of all the processed trajectories
variables (pandas.DataFrame) – Dataframe containing the nodes labels and cardinalities
prior_net_structure (pandas.DataFrame) – Dataframe containing the structure of the network (edges)
- _sorter
A list containing the variables labels in the SAME order as the columns in
concatenated_samples
Warning
The parameters variables and prior_net_structure HAVE to be properly constructed as pandas DataFrames with the following structure:
Header of _df_structure = [From_Node | To_Node]
Header of _df_variables = [Variable_Label | Variable_Cardinality]
See the tutorial on how to construct a correct concatenated_samples DataFrame/ndarray.
Note
See JsonImporter for an example implementation.
-
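As an illustration of the structure named in the warning above, the two frames could be built with pandas as follows. This is a minimal sketch that assumes the header entries are literal column names; the node labels X, Y, Z and their cardinalities are made-up values.

```python
import pandas as pd

# Hypothetical three-node network; labels and cardinalities are illustrative only.
variables = pd.DataFrame(
    {"Variable_Label": ["X", "Y", "Z"], "Variable_Cardinality": [3, 3, 2]}
)

# One row per edge, using the [From_Node | To_Node] header (assumed literal names).
prior_net_structure = pd.DataFrame(
    {"From_Node": ["X", "Y"], "To_Node": ["Y", "Z"]}
)
```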
build_list_of_samples_array
(concatenated_sample: pandas.core.frame.DataFrame) → List¶ Builds a List containing the delta times numpy array and the complete transitions matrix.
- Parameters
concatenated_sample (pandas.DataFrame) – the dataframe/array from which the time column and the transitions matrix have to be extracted and converted
- Returns
the resulting list of numpy arrays
- Return type
List
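The split described above could look roughly like the sketch below. It is not the library's code: the helper name split_time_and_transitions is hypothetical, and it assumes that the first column holds the time deltas and every other column holds integer state values.

```python
import numpy as np
import pandas as pd

def split_time_and_transitions(concatenated_sample: pd.DataFrame) -> list:
    """Hypothetical helper: return [delta_times, transitions] as numpy arrays."""
    values = concatenated_sample.to_numpy()
    delta_times = values[:, 0].astype(float)  # assumed: first column is time
    transitions = values[:, 1:].astype(int)   # assumed: the rest are state values
    return [delta_times, transitions]

frame = pd.DataFrame({"Time": [0.5, 1.2], "X": [0, 1], "Y": [1, 1]})
delta_times, transitions = split_time_and_transitions(frame)
```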
-
abstract
build_sorter
(trajecory_header: object) → List¶ Initializes the
_sorter
class member from a trajectory dataframe, extracting the header of the frame and keeping ONLY the variables' symbolic labels, cutting out the time label in the header.
- Parameters
trajecory_header (object) – an object that will be used to define the header
- Returns
A list containing the processed header.
- Return type
List
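Since build_sorter is abstract, each importer supplies its own version. One possible shape, assuming the header object is a dataframe and the time column is labelled "Time" (both assumptions, as is the helper name), is:

```python
import pandas as pd

def labels_without_time(sample_frame: pd.DataFrame, time_label: str = "Time") -> list:
    """Hypothetical helper: column labels of the frame, minus the time label."""
    return [label for label in sample_frame.columns if label != time_label]

frame = pd.DataFrame({"Time": [0.0, 0.4], "X": [0, 1], "Y": [2, 2]})
sorter = labels_without_time(frame)
```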
-
clear_concatenated_frame
() → None¶ Removes all values in the dataframe concatenated_samples.
-
compute_row_delta_in_all_samples_frames
(df_samples_list: List) → None¶ Calls the method compute_row_delta_sigle_samples_frame on every dataframe present in the list df_samples_list, then concatenates the results in the dataframe concatenated_samples.
- Parameters
df_samples_list (List) – the list of dataframes to be processed and concatenated
Warning
The Dataframe sample_frame has to follow the column structure of this header:
Header of sample_frame = [Time | Variable values]
The class member self._sorter HAS to be properly INITIALIZED (see the class members definition doc).
Note
After the call of this method the class member concatenated_samples will contain all processed and merged trajectories.
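The apply-then-concatenate pattern described for this method can be sketched as below. The helper names are hypothetical, the per-trajectory step is a simplified stand-in, and resetting the index with ignore_index=True is a choice made for the sketch rather than documented behaviour.

```python
import pandas as pd

def process_and_concatenate(frames, process_one):
    """Hypothetical helper: apply process_one to each frame, then stack the results."""
    processed = [process_one(frame) for frame in frames]
    return pd.concat(processed, ignore_index=True)

# Stand-in for the per-trajectory step: replace times by forward differences
# and drop the last row, which has no successor.
def time_deltas(frame):
    out = frame.copy()
    out["Time"] = out["Time"].diff().shift(-1)
    return out.iloc[:-1]

trajectories = [
    pd.DataFrame({"Time": [0.0, 0.5, 1.5], "X": [0, 1, 0]}),
    pd.DataFrame({"Time": [0.0, 2.0], "X": [1, 1]}),
]
concatenated = process_and_concatenate(trajectories, time_deltas)
```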
-
compute_row_delta_sigle_samples_frame
(sample_frame: pandas.core.frame.DataFrame, columns_header: List, shifted_cols_header: List) → pandas.core.frame.DataFrame¶ Computes the difference between each value present in the time column. Copies and shifts up by one position all the values present in the remaining columns.
- Parameters
sample_frame (pandas.DataFrame) – the trajectory to be processed
columns_header (List) – the original header of sample_frame
shifted_cols_header (List) – a copy of columns_header with changed names of the contents
- Returns
The processed dataframe
- Return type
pandas.Dataframe
Warning
the Dataframe
sample_frame
has to follow the column structure of this header: Header of sample_frame = [Time | Variable values]
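A rough re-creation of this step with plain pandas is given below. It is a sketch under assumptions, not the library's implementation: the function name row_delta, the "Time" label, the "_next" suffix, the use of forward differences, and dropping the final row are all choices made for the example.

```python
import pandas as pd

def row_delta(sample_frame, columns_header, shifted_cols_header):
    """Hypothetical re-implementation of the step described above."""
    out = sample_frame.copy()
    # Time column: difference to the next row (forward difference, assumed).
    out["Time"] = out["Time"].diff().shift(-1)
    # State columns: copy shifted one row up, under the new names.
    shifted = sample_frame[columns_header].shift(-1)
    shifted.columns = shifted_cols_header
    out = pd.concat([out, shifted], axis=1)
    # The last row has no successor, so drop it before casting back to int.
    return out.iloc[:-1].astype({name: int for name in shifted_cols_header})

frame = pd.DataFrame({"Time": [0.0, 0.5, 1.5], "X": [0, 1, 0], "Y": [1, 1, 2]})
result = row_delta(frame, ["X", "Y"], ["X_next", "Y_next"])
```

Each row of result then pairs a state, the time spent in it, and the state that follows.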
-
property
concatenated_samples
¶
-
abstract
dataset_id
() → object¶ If the original dataset contains multiple datasets, this method returns a unique id to identify the current dataset.
-
property
file_path
¶
-
property
sorter
¶
-
property
structure
¶
-
property
variables
¶
PyCTBN.PyCTBN.utility.cache module¶
-
class
PyCTBN.PyCTBN.utility.cache.
Cache
¶ Bases:
object
This class acts as a cache of SetOfCims objects for a node.
- __list_of_sets_of_parents
a list of Set objects holding the parents to which the CIMs in cache at the SAME index are related
- __actual_cache
a list of SetOfCims objects
-
clear
()¶ Clears the contents of both __actual_cache and __list_of_sets_of_parents.
-
find
(parents_comb: Set)¶ Tries to find in the cache the SetOfCims related to the symbolic parents combination parents_comb.
- Parameters
parents_comb (Set) – the parents related to that SetOfCims
- Returns
A SetOfCims object if parents_comb is found in __list_of_sets_of_parents, None otherwise.
- Return type
-
put
(parents_comb: Set, socim: PyCTBN.PyCTBN.structure_graph.set_of_cims.SetOfCims)¶ Places in the cache the SetOfCims object, and the related symbolic index parents_comb in __list_of_sets_of_parents.
- Parameters
parents_comb (Set) – the symbolic set index
socim (SetOfCims) – the related SetOfCims object
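The two-parallel-lists design described for Cache can be illustrated with a small stand-alone class. ParallelListCache and its attribute names are hypothetical, and plain strings stand in for SetOfCims objects; the sketch only mirrors the find/put/clear behaviour described above.

```python
from typing import Any, List, Optional, Set

class ParallelListCache:
    """Hypothetical stand-in for a cache keyed by sets of parent labels."""

    def __init__(self) -> None:
        self._keys: List[Set[str]] = []  # parent sets
        self._values: List[Any] = []     # cached object at the same index

    def find(self, parents_comb: Set[str]) -> Optional[Any]:
        # Linear scan; sets compare equal regardless of element order.
        for key, value in zip(self._keys, self._values):
            if key == parents_comb:
                return value
        return None

    def put(self, parents_comb: Set[str], value: Any) -> None:
        self._keys.append(parents_comb)
        self._values.append(value)

    def clear(self) -> None:
        self._keys.clear()
        self._values.clear()

cache = ParallelListCache()
cache.put({"X", "Y"}, "cims for parents X, Y")
hit = cache.find({"Y", "X"})
miss = cache.find({"Z"})
```

Lists are used because Python sets are unhashable and cannot serve as dictionary keys directly; a frozenset-keyed dict would be an alternative design.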
PyCTBN.PyCTBN.utility.json_importer module¶
-
class
PyCTBN.PyCTBN.utility.json_importer.
JsonImporter
(file_path: str, samples_label: str, structure_label: str, variables_label: str, time_key: str, variables_key: str)¶ Bases:
PyCTBN.PyCTBN.utility.abstract_importer.AbstractImporter
Implements the abstract methods of AbstractImporter and adds all the necessary methods to process and prepare data in JSON format.
- Parameters
file_path (string) – the path of the file that contains the data to be imported
samples_label (string) – the reference key for the samples in the trajectories
structure_label (string) – the reference key for the structure of the network data
variables_label (string) – the reference key for the cardinalities of the nodes data
time_key (string) – the key used to identify the timestamps in each trajectory
variables_key (string) – the key used to identify the names of the variables in the net
- _array_indx
the index of the outer JsonArray to extract the data from
- _df_samples_list
a Dataframe list in which every dataframe contains a trajectory
- _raw_data
The raw contents of the json file to import
-
build_sorter
(sample_frame: pandas.core.frame.DataFrame) → List¶ Implements the abstract method build_sorter of the
AbstractImporter
for this dataset.
-
clear_data_frame_list
() → None¶ Removes all values present in the dataframes in the list
_df_samples_list
.
-
dataset_id
() → object¶ If the original dataset contains multiple datasets, this method returns a unique id to identify the current dataset.
-
import_data
(indx: int = 0) → None¶ Implements the abstract method of AbstractImporter.
- Parameters
indx (int) – the index of the outer JsonArray to extract the data from, defaults to 0
-
import_sampled_cims
(raw_data: List, indx: int, cims_key: str) → Dict¶ Imports the synthetic CIMs in the dataset into a dictionary, using the variable labels as keys for the set of CIMs of a particular node.
- Parameters
raw_data (List) – List of Dicts
indx (int) – The index of the array from which the data have to be extracted
cims_key (string) – the key where the json object cims are placed
- Returns
a dictionary containing the sampled CIMS for all the variables in the net
- Return type
Dictionary
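The shape of the returned dictionary might be pictured as in the sketch below. The JSON layout used here (variable label, then parent configuration, then a matrix as nested lists) is an assumption made for illustration, as are the helper name and the sample values.

```python
import numpy as np

def cims_by_variable(raw_data, indx, cims_key):
    """Hypothetical helper: map each variable label to a list of numpy matrices."""
    cims = {}
    for label, per_parent_config in raw_data[indx][cims_key].items():
        cims[label] = [np.array(matrix, dtype=float)
                       for matrix in per_parent_config.values()]
    return cims

# Made-up data: one variable "X" with a single 2x2 intensity matrix.
raw_data = [{"cims": {"X": {"no_parents": [[-1.0, 1.0], [2.0, -2.0]]}}}]
sampled = cims_by_variable(raw_data, 0, "cims")
```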
-
import_structure
(raw_data: List) → pandas.core.frame.DataFrame¶ Imports in a dataframe the data in the list raw_data at the key
_structure_label
- Parameters
raw_data (List) – List of Dicts
- Returns
Dataframe containing the starting node and ending node of every arc of the network
- Return type
pandas.Dataframe
-
import_trajectories
(raw_data: List) → List¶ Imports the trajectories from the list of dicts raw_data.
- Parameters
raw_data (List) – List of Dicts
- Returns
List of dataframes containing all the trajectories
- Return type
List
-
import_variables
(raw_data: List) → pandas.core.frame.DataFrame¶ Imports the data in raw_data at the key _variables_label.
- Parameters
raw_data (List) – List of Dicts
- Returns
Dataframe containing the variables' symbolic labels and their cardinalities
- Return type
pandas.Dataframe
-
normalize_trajectories
(raw_data: List, indx: int, trajectories_key: str) → List¶ Extracts the trajectories in raw_data at the index indx at the key trajectories_key.
- Parameters
raw_data (List) – List of Dicts
indx (int) – The index of the array from which the data have to be extracted
trajectories_key (string) – the key of the trajectories objects
- Returns
A list of dataframes containing the trajectories
- Return type
List
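A minimal sketch of this extraction, assuming each trajectory is stored as a list of records under the given key (the helper name, the key "samples" and the data are made up):

```python
import pandas as pd

def trajectories_to_frames(raw_data, indx, trajectories_key):
    """Hypothetical helper: one DataFrame per trajectory stored under the key."""
    return [pd.DataFrame(sample) for sample in raw_data[indx][trajectories_key]]

# Made-up data: two trajectories, each a list of records.
raw_data = [{
    "samples": [
        [{"Time": 0.0, "X": 0}, {"Time": 0.7, "X": 1}],
        [{"Time": 0.0, "X": 1}],
    ]
}]
frames = trajectories_to_frames(raw_data, 0, "samples")
```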
-
one_level_normalizing
(raw_data: List, indx: int, key: str) → pandas.core.frame.DataFrame¶ Extracts the one-level nested data in the list raw_data at the index indx at the key key.
- Parameters
raw_data (List) – List of Dicts
indx (int) – The index of the array from which the data have to be extracted
key (string) – the key for the Dicts from which to extract data
- Returns
A normalized dataframe
- Return type
pandas.DataFrame
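For one-level nested data, the conversion could be as simple as handing the list of records to the DataFrame constructor. The sketch below assumes that layout; the helper name, the key "variables" and the field names "Name" and "Value" are illustrative.

```python
import pandas as pd

def one_level_frame(raw_data, indx, key):
    """Hypothetical helper: build a frame from the records at raw_data[indx][key]."""
    return pd.DataFrame(raw_data[indx][key])

# Made-up data: two variables with their cardinalities.
raw_data = [{"variables": [{"Name": "X", "Value": 3}, {"Name": "Y", "Value": 2}]}]
variables = one_level_frame(raw_data, 0, "variables")
```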
-
read_json_file
() → List¶ Reads the JSON file in the path self.filePath.
- Returns
The contents of the json file
- Return type
List
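Reading the file itself needs only the standard json module. The sketch below uses a hypothetical read_json helper and writes a throwaway file first so that it runs on its own; the file name and contents are made up.

```python
import json
import os
import tempfile

def read_json(path):
    """Hypothetical helper: parse a JSON file and return its contents."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)

# Round trip through a temporary file so the sketch is self-contained.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "network.json")
    with open(path, "w", encoding="utf-8") as handle:
        json.dump([{"variables": []}], handle)
    contents = read_json(path)
```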
PyCTBN.PyCTBN.utility.sample_importer module¶
-
class
PyCTBN.PyCTBN.utility.sample_importer.
SampleImporter
(trajectory_list: Union[pandas.core.frame.DataFrame, numpy.ndarray, List] = None, variables: Union[pandas.core.frame.DataFrame, numpy.ndarray, List] = None, prior_net_structure: Union[pandas.core.frame.DataFrame, numpy.ndarray, List] = None)¶ Bases:
PyCTBN.PyCTBN.utility.abstract_importer.AbstractImporter
Implements the abstract methods of AbstractImporter and adds all the necessary methods to process and prepare data loaded directly from a DataFrame.
- Parameters
trajectory_list (typing.Union[pd.DataFrame, np.ndarray, typing.List]) – the data that describes the trajectories
variables (typing.Union[pd.DataFrame, np.ndarray, typing.List]) – the data that describes the variables with name and cardinality
prior_net_structure (typing.Union[pd.DataFrame, np.ndarray, typing.List]) – the data of the real structure, if it exists
- _df_samples_list
a Dataframe list in which every dataframe contains a trajectory
- _raw_data
The raw contents of the json file to import
-
build_sorter
(sample_frame: pandas.core.frame.DataFrame) → List¶ Implements the abstract method build_sorter of the
AbstractImporter
in order to get the ordered variables list.
-
dataset_id
() → str¶ If the original dataset contains multiple datasets, this method returns a unique id to identify the current dataset.
-
import_data
(header_column=None)¶