emodel_generalisation.information

Module to compute information-theoretic measures on MCMC samples of emodels.

Functions

`compute_higher_order`(df[, order, column_1, ...])	Compute higher order IT.
`compute_higher_orders`(df[, min_order, ...])	Compute higher order IT measures.
`create_reduced_tuple_set`(df, data_folder, order)	Create a reduced tuple set.
`create_reduced_tuple_set_features`(df, ...[, ...])	Select reduced set of tuples using lower percentile of previous order.
`create_reduced_tuple_set_parameters`(df, ...)	Select reduced set of tuples using top previous tuples.
`get_jidt_calc`([tpe, algo_type])	Get the jidt information theory calculator of given type.
`log`(x[, unit])	Log function.
`mi_gaussian`(x)	MI with gaussian approximation.
`oinfo_gaussian`(x)	Compute Oinfo with gaussian approximation.
`plot_pair_correlations`(df[, split, ...])	Scatter plots of pairs with pearson larger than min_corr, and pearson correlation matrix.
`plot_tuple_distributions`([data_folder, ...])	Plot tuple distributions.
`reduce_features`(df[, threshold])	Reduce the number of features to non-correlated features.
`reduce_matrix_percentile`(df, percentile[, data])	Reduce matrix percentile.
`rsi_gaussian`(x)	RSI calculation with gaussians (assuming first element is y).
`setup_jidt`([jarlocation])	Setup the java env for jidt code.

emodel_generalisation.information.compute_higher_order(df, order=3, column_1='features', column_2='normalized_parameters', correlation_type='MI', n_workers=50, batch_size=100, param_tuples=None): Compute higher order IT.

emodel_generalisation.information.compute_higher_orders(df, min_order=3, max_order=5, column_1='features', column_2='normalized_parameters', correlation_type='MI', n_workers=50, batch_size=100, top=100, min_order_select=3, output_folder='IT_data', with_largests=True)

Compute higher order IT measures.

Parameters:

df (dataframe) – MCMC output dataframe
split (float) – max cost to filter dataframe
max_order (int) – max order to compute
column_1 (str) – name of column one (features for example)
column_2 (str) – name of column two (usually the parameter column)
correlation_type (str) – MI/Oinfo
n_workers (int) – number of parallel workers to use (too many leads to memory errors)
batch_size (int) – number of IT evaluations per worker
min_order_select (int) – order from which only the top/bottom best tuples are used
output_folder (str) – folder to save .csv for each order
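
The number of candidate tuples grows combinatorially with the order, which is presumably why tuples are restricted to the best ones from `min_order_select` onwards and why evaluations are split into batches per worker. The sketch below illustrates this with the standard library only. It is not the module's implementation: the feature names and the `make_batches` helper are hypothetical.

```python
from itertools import combinations, islice
from math import comb

# Hypothetical feature names; real ones would come from the MCMC dataframe.
features = [f"feature_{i}" for i in range(20)]


def make_batches(items, order, batch_size):
    """Yield lists of at most batch_size tuples of the given order."""
    tuples = combinations(items, order)
    while True:
        batch = list(islice(tuples, batch_size))
        if not batch:
            return
        yield batch


# Number of candidate tuples for each order between 3 and 5.
counts = {order: comb(len(features), order) for order in range(3, 6)}

# Each batch would be handed to one worker for IT evaluation.
batches = list(make_batches(features, order=3, batch_size=100))
```

With 20 features there are already 1140 triplets and 15504 tuples of order 5, so an exhaustive scan quickly becomes expensive as `max_order` increases.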

emodel_generalisation.information.create_reduced_tuple_set(df, data_folder, order, column_1='features', column_2='normalized_parameters', corr_type='Oinfo', top=100, with_largests=True): Create a reduced tuple set.

emodel_generalisation.information.create_reduced_tuple_set_features(df, data_folder, order, column_1='features', column_2='normalized_parameters', corr_type='Oinfo', top=100, with_largests=True): Select reduced set of tuples using lower percentile of previous order.

emodel_generalisation.information.create_reduced_tuple_set_parameters(df, data_folder, order, column_1='features', column_2='normalized_parameters', corr_type='Oinfo', top=100, with_largests=True): Select reduced set of tuples using top previous tuples.

emodel_generalisation.information.get_jidt_calc(tpe='MI', algo_type='gaussian'): Get the jidt information theory calculator of given type.

emodel_generalisation.information.log(x, unit='nats'): Log function.
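
The accepted values of `unit` are not listed here. As a rough sketch, a unit-aware logarithm could look like the following, where the `'bits'` option and the `log_with_unit` name are assumptions made for illustration only.

```python
import math


def log_with_unit(x, unit="nats"):
    """Natural log for 'nats', base-2 log for 'bits' (hypothetical helper)."""
    if unit == "nats":
        return math.log(x)
    if unit == "bits":
        return math.log2(x)
    raise ValueError(f"unknown unit: {unit}")
```

Information measures expressed in nats convert to bits by dividing by `math.log(2)`.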

emodel_generalisation.information.mi_gaussian(x): MI with gaussian approximation.
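
The layout expected for `x` is not documented here, so the following is only a sketch of the standard Gaussian estimate, not of this function. For jointly Gaussian blocks A and B, the mutual information is 0.5 * (ln det Σ_A + ln det Σ_B - ln det Σ_AB). The `gaussian_mi` helper and its `n_first` argument are hypothetical.

```python
import numpy as np


def gaussian_mi(cov, n_first):
    """MI in nats between the first n_first variables and the rest,
    assuming a joint Gaussian with covariance matrix cov."""
    cov = np.asarray(cov, dtype=float)
    _, logdet_a = np.linalg.slogdet(cov[:n_first, :n_first])
    _, logdet_b = np.linalg.slogdet(cov[n_first:, n_first:])
    _, logdet_ab = np.linalg.slogdet(cov)
    return 0.5 * (logdet_a + logdet_b - logdet_ab)


# Two unit-variance variables with correlation rho.
rho = 0.8
mi = gaussian_mi([[1.0, rho], [rho, 1.0]], n_first=1)

# Independent variables carry no mutual information.
mi_independent = gaussian_mi(np.eye(2), n_first=1)
```

For two variables this reduces to -0.5 * ln(1 - rho^2). With sampled data, the covariance could be estimated first with `np.cov(samples, rowvar=False)`.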

emodel_generalisation.information.oinfo_gaussian(x): Compute Oinfo with gaussian approximation.
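
As a hedged illustration of what a Gaussian O-information estimate involves, the sketch below uses the common definition Ω(X) = (n - 2) H(X) + Σ_j [H(X_j) - H(X_-j)] with Gaussian differential entropies. It works from a covariance matrix rather than from the input format of `oinfo_gaussian`, which is not described here. The helper names are hypothetical.

```python
import numpy as np


def gaussian_entropy(cov):
    """Differential entropy (nats) of a Gaussian with covariance cov."""
    cov = np.atleast_2d(cov)
    k = cov.shape[0]
    _, logdet = np.linalg.slogdet(cov)
    return 0.5 * (k * np.log(2 * np.pi * np.e) + logdet)


def gaussian_oinfo(cov):
    """O-information (nats) of a Gaussian with covariance cov."""
    cov = np.asarray(cov, dtype=float)
    n = cov.shape[0]
    total = (n - 2) * gaussian_entropy(cov)
    for j in range(n):
        rest = [i for i in range(n) if i != j]
        # Entropy of variable j alone minus entropy of all the others.
        total += gaussian_entropy(cov[j, j])
        total -= gaussian_entropy(cov[np.ix_(rest, rest)])
    return total


# Three noisy copies of one shared source: redundancy-dominated.
redundant = gaussian_oinfo([[2, 1, 1], [1, 2, 1], [1, 1, 2]])

# x3 = x1 + x2 + noise with independent x1, x2: synergy-dominated.
synergistic = gaussian_oinfo([[1, 0, 1], [0, 1, 1], [1, 1, 3]])

# Independent variables.
independent = gaussian_oinfo(np.eye(3))
```

Under this convention a positive value indicates a redundancy-dominated tuple, a negative value a synergy-dominated one, and independent variables give zero.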

emodel_generalisation.information.plot_pair_correlations(df, split=None, min_corr=0.3, column_1='normalized_parameters', column_2=None, filename='parameter_pairs.pdf', clip=0.4, correlation_type='pearson', with_plots=False, plot_top_only_perc=None)

Scatter plots of pairs with pearson larger than min_corr, and pearson correlation matrix.

If column_2 is provided, the correlation matrix will be non-square and no clustering will be applied.

Parameters:

min_corr (float) – minimum correlation for plotting scatter plot
clip (float) – value to clip correlation matrix

emodel_generalisation.information.plot_tuple_distributions(data_folder='data', figure_name='IT_corr.pdf', correlation_type='Oinfo', min_order=3, max_order=20, column_1='features', with_min=True, with_max=True, n_top_tuples=100, tuple_freq_thresh=0.01): Plot tuple distributions.

emodel_generalisation.information.reduce_features(df, threshold=0.9): Reduce the number of features to non-correlated features.
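
The selection rule used by `reduce_features` is not spelled out here. One plausible reading of a correlation threshold is a greedy filter that keeps a feature only if its absolute Pearson correlation with every feature already kept stays at or below the threshold. The sketch below shows that idea on synthetic data. The `drop_correlated` helper and the feature names are hypothetical, and the module's own procedure may differ.

```python
import numpy as np


def drop_correlated(data, names, threshold=0.9):
    """Greedily keep columns whose |Pearson r| with every kept column
    is at most threshold."""
    corr = np.abs(np.corrcoef(data, rowvar=False))
    kept = []
    for j in range(len(names)):
        if all(corr[j, k] <= threshold for k in kept):
            kept.append(j)
    return [names[j] for j in kept]


# Synthetic data: the second column is a near copy of the first.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = rng.normal(size=500)
near_copy = 2.0 * a + 0.01 * rng.normal(size=500)
data = np.column_stack([a, near_copy, b])

selected = drop_correlated(data, ["f_a", "f_a_copy", "f_b"])
```

Note that a greedy filter depends on column order: the first feature of each correlated group is the one that survives.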

emodel_generalisation.information.reduce_matrix_percentile(df, percentile, data=None): Reduce matrix percentile.

emodel_generalisation.information.rsi_gaussian(x): RSI calculation with gaussians (assuming first element is y).

emodel_generalisation.information.setup_jidt(jarlocation='/gpfs/bbp.cscs.ch/home/arnaudon/code/jidt/infodynamics.jar'): Setup the java env for jidt code.