fusetools.stat_tools.Test

class fusetools.stat_tools.Test

Bases: object

Functions for implementing statistical tests.

Methods

`chi_squared_result`	Compares two proportions using a Chi-Squared test.
`correlation`	Performs a Pearson test of correlation between two data samples.
`cramers_corrected_stat`	Calculates correlation between 2 categorical variables using Cramer’s method.
`pe`	Calculates the Price Elasticity of Demand.
`poisson`	Performs a Poisson test for a statistically significant difference between two groups' event counts over a period of time.
`sample_size1`	Calculates the sample size needed to measure a desired effect size.
`survival_result`	Performs a survival test for a statistically significant difference in time until an outcome between two samples.
`ttest`	Performs a t-test between two groups split by a flag.
`ttest_result`	Performs a T-Test between two groups of data.

classmethod chi_squared_result(sample1_successes, sample1_trials, sample2_successes, sample2_trials)

Compares two proportions using a Chi-Squared test.

Parameters

sample1_successes – Sample 1’s successes.
sample1_trials – Sample 1’s trials.
sample2_successes – Sample 2’s successes.
sample2_trials – Sample 2’s trials.

Returns

Chi-Squared p-value.
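
A minimal sketch of the statistic behind a test like this, using only the standard library. The function name `chi_squared_two_proportions` is hypothetical, and the choice of a Pearson Chi-Squared test without continuity correction is an assumption; the library's own implementation may differ.

```python
import math


def chi_squared_two_proportions(s1, n1, s2, n2):
    """Pearson chi-squared test on the 2x2 table of successes and failures.

    Assumption: no continuity correction is applied.
    Returns (chi2, p_value).
    """
    observed = [[s1, n1 - s1], [s2, n2 - s2]]
    row_totals = [n1, n2]
    col_totals = [s1 + s2, (n1 - s1) + (n2 - s2)]
    total = n1 + n2

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count under the hypothesis of equal proportions.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # With 1 degree of freedom, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


chi2, p = chi_squared_two_proportions(30, 100, 50, 100)
print(round(chi2, 3), p)
```

The sketch divides by the expected counts, so it raises `ZeroDivisionError` when both samples have no successes or no failures.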

classmethod correlation(sample1_dat, sample2_dat)

Performs a Pearson test of correlation between two data samples.

Parameters

sample1_dat – Sample 1 data array/list.
sample2_dat – Sample 2 data array/list.

Returns

Pearson correlation result.
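
For reference, the Pearson coefficient itself can be sketched with the standard library. The helper name `pearson_r` is hypothetical, and this sketch computes only the coefficient r; whether the method above also reports a p-value is not stated here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("samples must have the same length (at least 2)")

    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)

    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)

    return cov / math.sqrt(var_x * var_y)


print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))
```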

classmethod cramers_corrected_stat(cat_col1, cat_col2)

Calculates correlation between 2 categorical variables using Cramer’s method.

Parameters

cat_col1 – Categorical column 1.
cat_col2 – Categorical column 2.

Returns

Correlation between 2 categorical variables using Cramer’s method.
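
One common reading of a "corrected" Cramér statistic is Cramér's V with the Bergsma bias correction. The sketch below assumes that reading; the function name `cramers_v_corrected` is hypothetical and the library's implementation may differ.

```python
import math
from collections import Counter


def cramers_v_corrected(col1, col2):
    """Bias-corrected Cramér's V for two equal-length categorical sequences.

    Assumption: the Bergsma correction is the intended one.
    """
    n = len(col1)
    pair_counts = Counter(zip(col1, col2))
    row_counts = Counter(col1)
    col_counts = Counter(col2)

    # Pearson chi-squared statistic of the contingency table.
    chi2 = 0.0
    for a, row_total in row_counts.items():
        for b, col_total in col_counts.items():
            expected = row_total * col_total / n
            chi2 += (pair_counts[(a, b)] - expected) ** 2 / expected

    phi2 = chi2 / n
    r, k = len(row_counts), len(col_counts)

    # Bias-corrected phi-squared and table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)

    denom = min(k_corr - 1, r_corr - 1)
    if denom <= 0:
        return 0.0  # degenerate table, e.g. a column with a single category
    return math.sqrt(phi2_corr / denom)


print(cramers_v_corrected(["a", "b"] * 10, ["x", "y"] * 10))
```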

classmethod pe(type, original_quantity=False, new_quantity=False, original_price=False, new_price=False, pe_prices=False, pe_quantities=False)

Calculates the Price Elasticity of Demand.

Parameters

type – Whether the data is supplied as arrays/lists or as scalar values (sample or other).
original_quantity – Starting quantity demanded if data is scalar values.
new_quantity – Ending quantity demanded if data is scalar values.
original_price – Starting price if data is scalar values.
new_price – Ending price if data is scalar values.
pe_prices – Array/list of prices paid for quantities demanded.
pe_quantities – Array/list of quantities demanded.

Returns

Price elasticity of demand (float).
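
For the scalar case, price elasticity of demand is the percentage change in quantity divided by the percentage change in price. The sketch below shows two common conventions; which one the method above uses is not stated, and both function names are hypothetical.

```python
def price_elasticity_simple(original_quantity, new_quantity, original_price, new_price):
    """Elasticity using percentage changes relative to the starting values."""
    pct_quantity = (new_quantity - original_quantity) / original_quantity
    pct_price = (new_price - original_price) / original_price
    return pct_quantity / pct_price


def price_elasticity_midpoint(original_quantity, new_quantity, original_price, new_price):
    """Arc (midpoint) elasticity: changes are relative to the averages."""
    avg_quantity = (original_quantity + new_quantity) / 2
    avg_price = (original_price + new_price) / 2
    pct_quantity = (new_quantity - original_quantity) / avg_quantity
    pct_price = (new_price - original_price) / avg_price
    return pct_quantity / pct_price


# Quantity falls from 100 to 80 as the price rises from 10 to 12.
print(price_elasticity_simple(100, 80, 10, 12))
print(price_elasticity_midpoint(100, 80, 10, 12))
```

The midpoint form gives the same magnitude whichever point is treated as the start, which the simple form does not.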

classmethod poisson(sample1_events, sample1_days, sample2_events, sample2_days)

Performs a Poisson test for a statistically significant difference between two groups' event counts over a period of time.

Parameters

sample1_events – Count of sample 1 events.
sample1_days – Count of sample 1 days.
sample2_events – Count of sample 2 events.
sample2_days – Count of sample 2 days.

Returns

P-value for a Poisson statistical test.
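
One standard way to compare two Poisson rates is the exact conditional test: given the total number of events, the count in sample 1 follows a binomial distribution under the hypothesis of equal rates. The sketch below assumes that approach; the function name `poisson_rate_test` is hypothetical and the library may use a different procedure.

```python
import math


def poisson_rate_test(events1, days1, events2, days2):
    """Two-sided exact conditional test for equality of two Poisson rates.

    Under equal rates, events1 ~ Binomial(events1 + events2,
    days1 / (days1 + days2)).
    """
    n = events1 + events2
    p0 = days1 / (days1 + days2)

    def pmf(k):
        return math.comb(n, k) * p0 ** k * (1 - p0) ** (n - k)

    observed = pmf(events1)
    # Sum the probability of every outcome at least as unlikely as the
    # observed one; the small tolerance guards against floating-point ties.
    p_value = sum(pmf(k) for k in range(n + 1) if pmf(k) <= observed * (1 + 1e-9))
    return min(1.0, p_value)


print(poisson_rate_test(5, 30, 25, 30))
```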

classmethod sample_size1(baseline_input, effect_size_input, significance_level_input, statistical_power_input)

Calculates the sample size needed to measure a desired effect size.

Parameters

baseline_input – Baseline rate to measure the effect against.
effect_size_input – Desired effect size to measure.
significance_level_input – Desired level of statistical significance.
statistical_power_input – Desired level of statistical power.

Returns

Calculated sample size.
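
As an illustration of how these four inputs combine, the sketch below uses the usual normal-approximation formula for a two-sided comparison of two proportions. It assumes the effect size is an absolute change in the baseline rate and that the result is a per-group size; neither is confirmed for the method above, and the function name `sample_size_per_group` is hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_per_group(baseline, effect_size, significance_level, power):
    """Per-group sample size for detecting a change in a proportion.

    Assumption: effect_size is an absolute difference added to baseline.
    """
    p1 = baseline
    p2 = baseline + effect_size
    p_bar = (p1 + p2) / 2

    # Standard normal quantiles for the two-sided test and the power.
    z_alpha = NormalDist().inv_cdf(1 - significance_level / 2)
    z_beta = NormalDist().inv_cdf(power)

    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)


# Baseline rate of 10%, detecting an absolute lift of 2 points.
print(sample_size_per_group(0.10, 0.02, 0.05, 0.80))
```

Smaller effects or higher power increase the required size, since the squared difference sits in the denominator.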

classmethod survival_result(data_type, sample1_dat_survival=False, sample2_dat_survival=False, survival_confidence_level=False, sample1_dat_survival_mean=False, sample1_dat_survival_size=False, sample2_dat_survival_mean=False, sample2_dat_survival_size=False)

Performs a survival test for a statistically significant difference in time until an outcome between two samples.

Parameters

data_type – Whether the data is supplied as arrays/lists or as scalar values (sample or other).
sample1_dat_survival – Sample 1 data if array/list.
sample2_dat_survival – Sample 2 data if array/list.
survival_confidence_level – Confidence level used to assess the test.
sample1_dat_survival_mean – Sample 1 mean if scalar value.
sample1_dat_survival_size – Sample 1 size if scalar value.
sample2_dat_survival_mean – Sample 2 mean if scalar value.
sample2_dat_survival_size – Sample 2 size if scalar value.

Returns

P-value for the statistical significance of the difference in time until an outcome between two samples.

classmethod ttest(df, grp_col, grp_1_flag, grp_2_flag, target_kpi)

Performs a t-test between two groups split by a flag.

Parameters

df – Pandas DataFrame containing data.
grp_col – Column used to group the data.
grp_1_flag – Value used to distinguish group 1.
grp_2_flag – Value used to distinguish group 2.
target_kpi – Column containing the target metric to compare across groups.

Returns

T-Test p-value.

classmethod ttest_result(sample1_dat_ttest, sample2_dat_ttest)

Performs a T-Test between two groups of data.

Parameters

sample1_dat_ttest – Sample 1 dataset.
sample2_dat_ttest – Sample 2 dataset.

Returns

T-Test p-value.
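
The statistic behind a two-sample t-test can be sketched with the standard library. This example assumes Welch's unequal-variance form, which may not be the variant the method above uses, and the function name `welch_t_statistic` is hypothetical. It stops at the t statistic and degrees of freedom, because converting them to a p-value needs the Student t distribution, which the standard library does not provide.

```python
import math
from statistics import mean, variance


def welch_t_statistic(sample1, sample2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n1, n2 = len(sample1), len(sample2)
    # Squared standard error contributed by each sample (sample variance / n).
    se1 = variance(sample1) / n1
    se2 = variance(sample2) / n2

    t_stat = (mean(sample1) - mean(sample2)) / math.sqrt(se1 + se2)
    dof = (se1 + se2) ** 2 / (se1 ** 2 / (n1 - 1) + se2 ** 2 / (n2 - 1))
    return t_stat, dof


t_stat, dof = welch_t_statistic([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
print(t_stat, dof)
```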