emodel_generalisation.information

Module to compute information-theoretic measures on MCMC samplings of emodels.

Functions

compute_higher_order(df[, order, column_1, ...])

Compute higher order IT.

compute_higher_orders(df[, min_order, ...])

Compute higher order IT measures.

create_reduced_tuple_set(df, data_folder, order)

Create a reduced tuple set.

create_reduced_tuple_set_features(df, ...[, ...])

Select reduced set of tuples using lower percentile of previous order.

create_reduced_tuple_set_parameters(df, ...)

Select reduced set of tuples using top previous tuples.

get_jidt_calc([tpe, algo_type])

Get the JIDT information theory calculator of the given type.

log(x[, unit])

Log function.

mi_gaussian(x)

Mutual information (MI) with Gaussian approximation.

oinfo_gaussian(x)

Compute Oinfo with Gaussian approximation.

plot_pair_correlations(df[, split, ...])

Scatter plots of pairs with Pearson correlation larger than min_corr, and the Pearson correlation matrix.

plot_tuple_distributions([data_folder, ...])

Plot tuple distributions.

reduce_features(df[, threshold])

Reduce the number of features to a set of non-correlated features.

reduce_matrix_percentile(df, percentile[, data])

Reduce matrix percentile.

rsi_gaussian(x)

RSI calculation with Gaussian approximation (assuming the first element is y).

setup_jidt([jarlocation])

Set up the Java environment for the JIDT code.

emodel_generalisation.information.compute_higher_order(df, order=3, column_1='features', column_2='normalized_parameters', correlation_type='MI', n_workers=50, batch_size=100, param_tuples=None)

Compute higher order IT.

emodel_generalisation.information.compute_higher_orders(df, min_order=3, max_order=5, column_1='features', column_2='normalized_parameters', correlation_type='MI', n_workers=50, batch_size=100, top=100, min_order_select=3, output_folder='IT_data', with_largests=True)

Compute higher order IT measures.

Parameters:
  • df (dataframe) – MCMC output dataframe

  • split (float) – max cost to filter dataframe

  • max_order (int) – max order to compute

  • column_1 (str) – name of column one (features for example)

  • column_2 (str) – name of column two (usually the parameter column)

  • correlation_type (str) – MI/Oinfo

  • n_workers (int) – number of parallel workers to use (too many leads to a memory error)

  • batch_size (int) – number of IT evaluations for each worker

  • min_order_select (int) – order after which only the top/bottom best tuples are used

  • output_folder (str) – folder to save .csv for each order
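The need for min_order_select and top can be motivated by counting tuples: the number of unordered tuples grows combinatorially with the order. The sketch below is only illustrative. It assumes tuples are drawn from a single set of columns, and the feature count of 30 and the helper name n_tuples are hypothetical, not taken from the package.

```python
from math import comb


def n_tuples(n_features, order):
    """Number of distinct unordered tuples of `order` features."""
    return comb(n_features, order)


# Hypothetical feature count, chosen only for illustration.
n_features = 30

# Orders 3 to 5 match the documented defaults min_order=3, max_order=5.
counts = {order: n_tuples(n_features, order) for order in range(3, 6)}
for order, count in counts.items():
    print(order, count)
```

Under these assumptions an exhaustive evaluation quickly becomes expensive at higher orders, which is presumably why only a reduced tuple set is kept beyond min_order_select.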

emodel_generalisation.information.create_reduced_tuple_set(df, data_folder, order, column_1='features', column_2='normalized_parameters', corr_type='Oinfo', top=100, with_largests=True)

Create a reduced tuple set.

emodel_generalisation.information.create_reduced_tuple_set_features(df, data_folder, order, column_1='features', column_2='normalized_parameters', corr_type='Oinfo', top=100, with_largests=True)

Select reduced set of tuples using lower percentile of previous order.

emodel_generalisation.information.create_reduced_tuple_set_parameters(df, data_folder, order, column_1='features', column_2='normalized_parameters', corr_type='Oinfo', top=100, with_largests=True)

Select reduced set of tuples using top previous tuples.

emodel_generalisation.information.get_jidt_calc(tpe='MI', algo_type='gaussian')

Get the JIDT information theory calculator of the given type.

emodel_generalisation.information.log(x, unit='nats')

Log function.
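A logarithm with a unit argument typically just selects the base. The sketch below is an assumption about that behaviour, not the package's code: only the default unit='nats' is documented here, so the 'bits' option and the name log_with_unit are hypothetical.

```python
import numpy as np


def log_with_unit(x, unit="nats"):
    """Logarithm whose base matches the requested information unit.

    'nats' uses the natural logarithm; 'bits' (an assumed option)
    uses base 2.
    """
    if unit == "nats":
        return np.log(x)
    if unit == "bits":
        return np.log2(x)
    raise ValueError(f"unknown unit: {unit}")
```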

emodel_generalisation.information.mi_gaussian(x)

Mutual information (MI) with Gaussian approximation.
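For jointly Gaussian variables, mutual information depends only on covariance determinants. The following is a minimal standalone sketch of that idea, not the package's implementation: the function name gaussian_mi, its two-argument form, and the sample data are assumptions made for illustration.

```python
import numpy as np


def _logdet_cov(data):
    """Log-determinant of the sample covariance of the columns of data."""
    cov = np.atleast_2d(np.cov(data, rowvar=False))
    return np.linalg.slogdet(cov)[1]


def gaussian_mi(x, y):
    """MI (nats) between blocks x and y under a joint Gaussian model.

    I(X; Y) = 0.5 * (log det S_x + log det S_y - log det S_xy)
    """
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    y = np.asarray(y, dtype=float).reshape(len(y), -1)
    joint = np.hstack([x, y])
    return 0.5 * (_logdet_cov(x) + _logdet_cov(y) - _logdet_cov(joint))


# Illustrative data: two correlated scalar variables.
rng = np.random.default_rng(0)
a = rng.normal(size=5000)
b = 0.8 * a + 0.6 * rng.normal(size=5000)
rho = np.corrcoef(a, b)[0, 1]

# For two scalars the formula reduces to -0.5 * log(1 - rho**2).
print(gaussian_mi(a, b), -0.5 * np.log(1 - rho**2))
```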

emodel_generalisation.information.oinfo_gaussian(x)

Compute Oinfo with Gaussian approximation.
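The O-information of n variables is defined as Ω = (n − 2) H(X) + Σ_j [H(X_j) − H(X_{−j})], which is positive for redundancy-dominated systems and negative for synergy-dominated ones. Under a Gaussian model the entropies reduce to covariance log-determinants. The sketch below illustrates this; it is not the package's code, and the name gaussian_oinfo, the input layout (samples in rows, variables in columns) and the test data are assumptions.

```python
import numpy as np


def gaussian_oinfo(x):
    """O-information (nats) of the columns of x under a Gaussian model.

    The (2*pi*e) entropy constants cancel in the O-information sum,
    so only log-determinants of covariance blocks are needed.
    """
    x = np.asarray(x, dtype=float)
    n = x.shape[1]
    cov = np.cov(x, rowvar=False)

    def logdet(m):
        return np.linalg.slogdet(np.atleast_2d(m))[1]

    omega = (n - 2) * logdet(cov)
    for j in range(n):
        rest = np.delete(np.arange(n), j)  # all variables except j
        omega += np.log(cov[j, j]) - logdet(cov[np.ix_(rest, rest)])
    return 0.5 * omega


rng = np.random.default_rng(0)

# Redundant system: three noisy copies of one source.
s = rng.normal(size=(5000, 1))
redundant = s + 0.1 * rng.normal(size=(5000, 3))

# Synergistic system: the third variable is a noisy sum of the other two.
xy = rng.normal(size=(5000, 2))
z = xy.sum(axis=1, keepdims=True) + 0.1 * rng.normal(size=(5000, 1))
synergistic = np.hstack([xy, z])

print(gaussian_oinfo(redundant) > 0, gaussian_oinfo(synergistic) < 0)
```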

emodel_generalisation.information.plot_pair_correlations(df, split=None, min_corr=0.3, column_1='normalized_parameters', column_2=None, filename='parameter_pairs.pdf', clip=0.4, correlation_type='pearson', with_plots=False, plot_top_only_perc=None)

Scatter plots of pairs with Pearson correlation larger than min_corr, and the Pearson correlation matrix.

If column_2 is provided, the correlation matrix will be non-square and no clustering will be applied.

Parameters:
  • min_corr (float) – minimum correlation for plotting a scatter plot

  • clip (float) – value at which to clip the correlation matrix

emodel_generalisation.information.plot_tuple_distributions(data_folder='data', figure_name='IT_corr.pdf', correlation_type='Oinfo', min_order=3, max_order=20, column_1='features', with_min=True, with_max=True, n_top_tuples=100, tuple_freq_thresh=0.01)

Plot tuple distributions.

emodel_generalisation.information.reduce_features(df, threshold=0.9)

Reduce the number of features to a set of non-correlated features.
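One common way to perform such a reduction is a greedy pass over the correlation matrix, keeping a column only if it is not too correlated with any column already kept. The sketch below shows that approach under stated assumptions: the package may select features differently, and the name drop_correlated, the flat column layout and the example data are hypothetical.

```python
import numpy as np
import pandas as pd


def drop_correlated(df, threshold=0.9):
    """Greedily keep columns whose |Pearson| with kept columns <= threshold."""
    corr = df.corr().abs()
    kept = []
    for col in df.columns:
        if all(corr.loc[col, k] <= threshold for k in kept):
            kept.append(col)
    return df[kept]


# Illustrative data: "b" is nearly a copy of "a", "c" is independent.
rng = np.random.default_rng(0)
a = rng.normal(size=1000)
example = pd.DataFrame(
    {
        "a": a,
        "b": a + 0.01 * rng.normal(size=1000),
        "c": rng.normal(size=1000),
    }
)
reduced = drop_correlated(example, threshold=0.9)
print(list(reduced.columns))
```

With this greedy rule the result depends on column order: the first column of each correlated group is the one that survives.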

emodel_generalisation.information.reduce_matrix_percentile(df, percentile, data=None)

Reduce matrix percentile.

emodel_generalisation.information.rsi_gaussian(x)

RSI calculation with Gaussian approximation (assuming the first element is y).
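The redundancy-synergy index is commonly defined as the joint mutual information between the target and all sources minus the sum of the single-source mutual informations, so that positive values indicate synergy and negative values redundancy. The sketch below assumes that definition and sign convention, with the target y in the first column. It is not the package's code, and the names gaussian_rsi and _mi_from_cov are hypothetical.

```python
import numpy as np


def _mi_from_cov(cov, idx_a, idx_b):
    """Gaussian MI (nats) between two index blocks of a covariance matrix."""

    def logdet(idx):
        return np.linalg.slogdet(cov[np.ix_(idx, idx)])[1]

    return 0.5 * (logdet(idx_a) + logdet(idx_b) - logdet(idx_a + idx_b))


def gaussian_rsi(x):
    """RSI = I(y; x_1..x_n) - sum_i I(y; x_i), with y in column 0."""
    x = np.asarray(x, dtype=float)
    cov = np.cov(x, rowvar=False)
    sources = list(range(1, x.shape[1]))
    joint = _mi_from_cov(cov, [0], sources)
    singles = sum(_mi_from_cov(cov, [0], [i]) for i in sources)
    return joint - singles


rng = np.random.default_rng(0)

# Synergy: y is a noisy sum of two independent sources.
src = rng.normal(size=(5000, 2))
y_sum = src.sum(axis=1, keepdims=True) + 0.1 * rng.normal(size=(5000, 1))
synergy_data = np.hstack([y_sum, src])

# Redundancy: both sources are noisy copies of y.
y = rng.normal(size=(5000, 1))
copies = y + 0.1 * rng.normal(size=(5000, 2))
redundancy_data = np.hstack([y, copies])

print(gaussian_rsi(synergy_data) > 0, gaussian_rsi(redundancy_data) < 0)
```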

emodel_generalisation.information.setup_jidt(jarlocation='/gpfs/bbp.cscs.ch/home/arnaudon/code/jidt/infodynamics.jar')

Set up the Java environment for the JIDT code.