fusetools.stat_tools.Test

class fusetools.stat_tools.Test[source]

Bases: object

Functions for implementing statistical tests.

Methods

chi_squared_result

Compares two proportions using a Chi-Squared test.

correlation

Performs a Pearson test of correlation between two data samples.

cramers_corrected_stat

Calculates correlation between 2 categorical variables using Cramer’s method.

pe

Calculates the Price Elasticity of Demand.

poisson

Performs a Poisson test for a statistically significant difference between two groups' event counts over a period of time.

sample_size1

Calculates the sample size needed to detect a desired effect size.

survival_result

Performs a survival test for a statistically significant difference in time until an outcome between two samples.

ttest

Performs a t-test between two groups split by a flag.

ttest_result

Performs a T-Test between two groups of data.

classmethod chi_squared_result(sample1_successes, sample1_trials, sample2_successes, sample2_trials)[source]

Compares two proportions using a Chi-Squared test.

Parameters
  • sample1_successes – Sample 1’s successes.

  • sample1_trials – Sample 1’s trials.

  • sample2_successes – Sample 2’s successes.

  • sample2_trials – Sample 2’s trials.

Returns

Chi-Squared p-value.
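
To illustrate what a Chi-Squared p-value for two proportions represents, here is a minimal standard-library sketch. It is not the fusetools implementation: the function name `chi_squared_p_value` is hypothetical, and the sketch assumes a 2x2 table with no continuity correction, so its output may differ from what `chi_squared_result` returns.

```python
from math import erfc, sqrt

def chi_squared_p_value(s1, n1, s2, n2):
    """Pearson chi-squared test on a 2x2 table (hypothetical helper).

    Assumption: no continuity (Yates) correction is applied.
    """
    observed = [[s1, n1 - s1], [s2, n2 - s2]]
    row_totals = [n1, n2]
    col_totals = [s1 + s2, (n1 - s1) + (n2 - s2)]
    total = n1 + n2
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            if expected == 0:
                raise ValueError("expected count of zero; test is undefined")
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom, the chi-squared survival function
    # equals erfc(sqrt(chi2 / 2)).
    return erfc(sqrt(chi2 / 2))

p_same = chi_squared_p_value(10, 100, 10, 100)   # identical proportions
p_diff = chi_squared_p_value(10, 100, 40, 100)   # 10% vs 40%
```

Identical proportions give a statistic of zero and therefore a p-value of 1, while a large gap between the two proportions drives the p-value toward 0.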

classmethod correlation(sample1_dat, sample2_dat)[source]

Performs a Pearson test of correlation between two data samples.

Parameters
  • sample1_dat – Sample 1 data array/list.

  • sample2_dat – Sample 2 data array/list.

Returns

Pearson correlation result.
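
As a rough illustration of the quantity behind a Pearson test, the sketch below computes the correlation coefficient r with the standard library only. The name `pearson_r` is hypothetical, and the sketch returns only r; whether `correlation` also returns a p-value is not stated in this reference.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length samples
    (hypothetical helper, not the fusetools implementation)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("samples must have equal length of at least 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances, each without the 1/(n-1) factor,
    # which cancels in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)

r_up = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])    # perfectly increasing
r_down = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])  # perfectly decreasing
```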

classmethod cramers_corrected_stat(cat_col1, cat_col2)[source]

Calculates correlation between 2 categorical variables using Cramer’s method.

Parameters
  • cat_col1 – Categorical column 1.

  • cat_col2 – Categorical column 2.

Returns

Correlation between 2 categorical variables using Cramer’s method.
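
The method name suggests a bias-corrected version of Cramér's V. The sketch below shows one common form of that correction using only the standard library. This is an assumption about what the method computes, not a copy of its code: `cramers_v_corrected` is a hypothetical name, and the chi-squared statistic here uses no continuity correction.

```python
from collections import Counter
from math import sqrt

def cramers_v_corrected(col1, col2):
    """Bias-corrected Cramér's V for two categorical sequences
    (hypothetical helper; assumed, not confirmed, to match fusetools)."""
    n = len(col1)
    pair_counts = Counter(zip(col1, col2))
    row_counts = Counter(col1)
    col_counts = Counter(col2)
    r, k = len(row_counts), len(col_counts)
    if n < 2 or r < 2 or k < 2:
        return 0.0  # no association is measurable with a single category
    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for a, row_total in row_counts.items():
        for b, col_total in col_counts.items():
            expected = row_total * col_total / n
            observed = pair_counts.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected
    phi2 = chi2 / n
    # Bias correction of phi^2 and of the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return sqrt(phi2_corr / denom) if denom > 0 else 0.0

labels = ["a"] * 50 + ["b"] * 50
v_perfect = cramers_v_corrected(labels, labels)
v_none = cramers_v_corrected(["a", "a", "b", "b"], ["c", "d", "c", "d"])
```

A column compared with itself yields a value of 1 (up to floating-point rounding), and two independent columns yield 0.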

classmethod pe(type, original_quantity=False, new_quantity=False, original_price=False, new_price=False, pe_prices=False, pe_quantities=False)[source]

Calculates the Price Elasticity of Demand.

Parameters
  • type – Classification of whether data is in array/list data format or a scalar format (sample or other).

  • original_quantity – Starting quantity demanded if data is scalar values.

  • new_quantity – Ending quantity demanded if data is scalar values.

  • original_price – Starting price if data is scalar values.

  • new_price – Ending price if data is scalar values.

  • pe_prices – Array/list of prices paid for quantities demanded.

  • pe_quantities – Array/list of quantities demanded.

Returns

Price elasticity of demand (float).
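
For the scalar case, price elasticity of demand is commonly defined as the percentage change in quantity divided by the percentage change in price. The sketch below uses that simple definition; `price_elasticity` is a hypothetical name, and the library may use a different formula (for example a midpoint formula, or a regression for the array/list case).

```python
def price_elasticity(original_quantity, new_quantity, original_price, new_price):
    """Simple (non-midpoint) price elasticity of demand.

    Hypothetical helper: percentage change in quantity divided by
    percentage change in price.
    """
    if original_quantity == 0 or original_price == 0:
        raise ValueError("original quantity and price must be non-zero")
    pct_quantity = (new_quantity - original_quantity) / original_quantity
    pct_price = (new_price - original_price) / original_price
    if pct_price == 0:
        raise ValueError("price did not change; elasticity is undefined")
    return pct_quantity / pct_price

# Price rises 20% (10 to 12) and quantity falls 20% (100 to 80).
elasticity = price_elasticity(100, 80, 10, 12)
```

In this example the result is approximately -1, i.e. unit-elastic demand: quantity falls by the same percentage that price rises.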

classmethod poisson(sample1_events, sample1_days, sample2_events, sample2_days)[source]

Performs a Poisson test for a statistically significant difference between two groups' event counts over a period of time.

Parameters
  • sample1_events – Count of sample 1 events.

  • sample1_days – Count of sample 1 days.

  • sample2_events – Count of sample 2 events.

  • sample2_days – Count of sample 2 days.

Returns

P-value for a Poisson statistical test.
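
One standard way to compare two Poisson rates is the exact conditional test: given the total number of events, the count in sample 1 follows a binomial distribution whose success probability is sample 1's share of the total exposure. The sketch below implements that test with the standard library. It is only one possible approach, `poisson_rate_p_value` is a hypothetical name, and this reference does not say which variant `poisson` uses.

```python
from math import comb

def poisson_rate_p_value(events1, days1, events2, days2):
    """Two-sided exact conditional test for equal Poisson rates
    (hypothetical helper; suitable for moderate event counts)."""
    total_events = events1 + events2
    if total_events == 0:
        return 1.0  # no events observed, so no evidence of a difference
    # Under the null hypothesis, events split in proportion to exposure.
    share1 = days1 / (days1 + days2)

    def pmf(k):
        return comb(total_events, k) * share1 ** k * (1 - share1) ** (total_events - k)

    observed = pmf(events1)
    # Sum the probability of every split at least as unlikely as the
    # observed one (small tolerance guards against rounding ties).
    p_value = sum(
        pmf(k) for k in range(total_events + 1)
        if pmf(k) <= observed * (1 + 1e-9)
    )
    return min(1.0, p_value)

p_equal = poisson_rate_p_value(10, 10, 10, 10)   # same rate in both groups
p_unequal = poisson_rate_p_value(50, 10, 10, 10) # 5 per day vs 1 per day
```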

classmethod sample_size1(baseline_input, effect_size_input, significance_level_input, statistical_power_input)[source]

Calculates the sample size needed to detect a desired effect size.

Parameters
  • baseline_input – Baseline rate to measure the effect against.

  • effect_size_input – Desired effect size to measure.

  • significance_level_input – Desired level of statistical significance.

  • statistical_power_input – Desired level of statistical power.

Returns

Calculated sample size.
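
The four inputs map onto the textbook sample-size formula for comparing two proportions. The sketch below applies that formula under two assumptions that this reference does not confirm: the effect size is an absolute change in the baseline rate, and the test is two-sided. `sample_size_per_group` is a hypothetical name.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_group(baseline, effect_size, significance_level, power):
    """Per-group sample size for a two-sided, two-proportion test.

    Hypothetical helper. Assumes effect_size is an absolute change,
    e.g. baseline 0.10 and effect_size 0.02 compares 10% with 12%.
    """
    p1 = baseline
    p2 = baseline + effect_size
    if not (0 < p1 < 1 and 0 < p2 < 1) or effect_size == 0:
        raise ValueError("rates must lie in (0, 1) and effect_size must be non-zero")
    z_alpha = NormalDist().inv_cdf(1 - significance_level / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)

n_small_effect = sample_size_per_group(0.10, 0.02, 0.05, 0.80)
n_large_effect = sample_size_per_group(0.10, 0.05, 0.05, 0.80)
```

Smaller effects require more observations: detecting a 2-point lift from a 10% baseline needs several thousand observations per group, far more than detecting a 5-point lift.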

classmethod survival_result(data_type, sample1_dat_survival=False, sample2_dat_survival=False, survival_confidence_level=False, sample1_dat_survival_mean=False, sample1_dat_survival_size=False, sample2_dat_survival_mean=False, sample2_dat_survival_size=False)[source]

Performs a survival test for a statistically significant difference in time until an outcome between two samples.

Parameters
  • data_type – Classification of whether data is in array/list data format or a scalar format (sample or other).

  • sample1_dat_survival – Sample 1 data if array/list.

  • sample2_dat_survival – Sample 2 data if array/list.

  • survival_confidence_level – Confidence level used to assess the test.

  • sample1_dat_survival_mean – Sample 1 mean if scalar value.

  • sample1_dat_survival_size – Sample 1 size if scalar value.

  • sample2_dat_survival_mean – Sample 2 mean if scalar value.

  • sample2_dat_survival_size – Sample 2 size if scalar value.

Returns

P-value for the statistical significance of the difference in time until an outcome between two samples.
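
The scalar inputs (a mean and a size per sample) are enough for a test if survival times are assumed to be exponentially distributed with no censoring. The sketch below uses a large-sample normal approximation on the log ratio of the two means. Both the model and the approximation are assumptions made for illustration, `survival_p_value_from_means` is a hypothetical name, and the test `survival_result` actually performs is not specified in this reference.

```python
from math import log, sqrt
from statistics import NormalDist

def survival_p_value_from_means(mean1, size1, mean2, size2):
    """Two-sided p-value for a difference in mean survival time.

    Hypothetical helper. Assumes exponential, uncensored survival times,
    for which log(sample mean) has variance of roughly 1/n.
    """
    if mean1 <= 0 or mean2 <= 0 or size1 <= 0 or size2 <= 0:
        raise ValueError("means and sizes must be positive")
    z = log(mean1 / mean2) / sqrt(1 / size1 + 1 / size2)
    return 2 * (1 - NormalDist().cdf(abs(z)))

p_no_difference = survival_p_value_from_means(10.0, 200, 10.0, 200)
p_difference = survival_p_value_from_means(10.0, 200, 20.0, 200)
```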

classmethod ttest(df, grp_col, grp_1_flag, grp_2_flag, target_kpi)[source]

Performs a t-test between two groups split by a flag.

Parameters
  • df – Pandas DataFrame containing data.

  • grp_col – Column used to group the data.

  • grp_1_flag – Value used to distinguish group 1.

  • grp_2_flag – Value used to distinguish group 2.

  • target_kpi – Column containing the target metric to compare across groups.

Returns

T-Test p-value.

classmethod ttest_result(sample1_dat_ttest, sample2_dat_ttest)[source]

Performs a T-Test between two groups of data.

Parameters
  • sample1_dat_ttest – Sample 1 dataset.

  • sample2_dat_ttest – Sample 2 dataset.

Returns

T-Test p-value.
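
To show what goes into a two-sample t-test, the sketch below computes the Welch (unequal-variance) t statistic and its degrees of freedom with the standard library. Whether `ttest_result` uses the Welch or the pooled-variance form is not stated here, so treat the choice as an assumption; `welch_t` is a hypothetical name. The p-value itself comes from the t distribution's tail probability, which the standard library does not provide, so the sketch stops at the statistic.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(sample1, sample2):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom
    (hypothetical helper; does not compute the p-value)."""
    n1, n2 = len(sample1), len(sample2)
    if n1 < 2 or n2 < 2:
        raise ValueError("each sample needs at least 2 observations")
    # Squared standard error contributed by each sample.
    se1 = variance(sample1) / n1
    se2 = variance(sample2) / n2
    if se1 + se2 == 0:
        raise ValueError("both samples are constant; t is undefined")
    t_stat = (mean(sample1) - mean(sample2)) / sqrt(se1 + se2)
    dof = (se1 + se2) ** 2 / (se1 ** 2 / (n1 - 1) + se2 ** 2 / (n2 - 1))
    return t_stat, dof

t_stat, dof = welch_t([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
```

For these two samples the means differ by 1 and the standard error is 1, giving a t statistic of about -1 with about 8 degrees of freedom.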