fusetools.stat_tools.Test
-
class
fusetools.stat_tools.Test
Bases: object
Functions for implementing statistical tests.
Methods
Calculates correlation between 2 proportions using a Chi-Squared test.
Performs a Pearson test of correlation between two data samples.
Calculates correlation between 2 categorical variables using Cramer’s method.
Calculates the Price Elasticity of Demand.
Performs a Poisson test which tests statistical difference between groups comparing counts over a period of time.
Calculates the sample size needed to detect a desired effect size.
Performs a survival test that checks for a statistically significant difference in time until an outcome between two samples.
Performs a t-test between two groups split by a flag.
Performs a T-Test between two groups of data.
-
classmethod
chi_squared_result(sample1_successes, sample1_trials, sample2_successes, sample2_trials)
Calculates correlation between 2 proportions using a Chi-Squared test.
- Parameters
sample1_successes – Sample 1’s successes.
sample1_trials – Sample 1’s trials.
sample2_successes – Sample 2’s successes.
sample2_trials – Sample 2’s trials.
- Returns
Chi-Squared p-value.
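Example. A minimal pure-Python sketch of a chi-squared test on two proportions, to illustrate the kind of calculation described above. It is not the fusetools implementation: the function name `chi_squared_two_proportions` is hypothetical, and the sketch assumes a 2x2 table with no continuity correction.

```python
import math


def chi_squared_two_proportions(s1, n1, s2, n2):
    """Pearson chi-squared test on a 2x2 success/failure table.

    Assumption: no continuity correction, and every expected count is > 0.
    Returns (chi2 statistic, p-value).
    """
    observed = [[s1, n1 - s1], [s2, n2 - s2]]
    row_totals = [n1, n2]
    col_totals = [s1 + s2, (n1 - s1) + (n2 - s2)]
    total = n1 + n2

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # With 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


chi2, p = chi_squared_two_proportions(30, 100, 50, 100)
print(f"chi2={chi2:.4f}, p={p:.4f}")
```

A small p-value suggests the two success rates differ by more than chance alone would explain.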
-
classmethod
correlation(sample1_dat, sample2_dat)
Performs a Pearson test of correlation between two data samples.
- Parameters
sample1_dat – Sample 1 data array/list.
sample2_dat – Sample 2 data array/list.
- Returns
Pearson correlation result.
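Example. A short sketch of the Pearson correlation coefficient computed by hand, shown only to make the statistic concrete. The helper name `pearson_r` is hypothetical, and the library may return additional values (such as a p-value) that this sketch does not compute.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")

    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)

    # Sum of cross-products and sums of squares around the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)

    return sxy / math.sqrt(sxx * syy)


print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))
```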
-
classmethod
cramers_corrected_stat(cat_col1, cat_col2)
Calculates correlation between 2 categorical variables using Cramer’s method.
- Parameters
cat_col1 – Categorical column 1.
cat_col2 – Categorical column 2.
- Returns
Correlation between 2 categorical variables using Cramer’s method.
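Example. A pure-Python sketch of Cramér's V with the commonly used bias correction. That the method applies this particular correction is an assumption based on its name; the function name `cramers_v_corrected` is hypothetical.

```python
import math
from collections import Counter


def cramers_v_corrected(col1, col2):
    """Bias-corrected Cramér's V for two equal-length categorical sequences."""
    n = len(col1)
    pair_counts = Counter(zip(col1, col2))
    row_counts = Counter(col1)
    col_counts = Counter(col2)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for a, row_total in row_counts.items():
        for b, col_total in col_counts.items():
            expected = row_total * col_total / n
            chi2 += (pair_counts.get((a, b), 0) - expected) ** 2 / expected

    phi2 = chi2 / n
    r, k = len(row_counts), len(col_counts)

    # Bias correction: shrink phi^2 and the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)

    denom = min(k_corr - 1, r_corr - 1)
    return math.sqrt(phi2_corr / denom) if denom > 0 else 0.0


colours = ["red"] * 50 + ["blue"] * 50
sizes = ["small"] * 50 + ["large"] * 50
print(cramers_v_corrected(colours, sizes))
```

Values near 0 indicate little association between the two columns; values near 1 indicate a strong one.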
-
classmethod
pe(type, original_quantity=False, new_quantity=False, original_price=False, new_price=False, pe_prices=False, pe_quantities=False)
Calculates the Price Elasticity of Demand.
- Parameters
type – Whether the data is in array/list format or scalar format (sample or other).
original_quantity – Starting quantity demanded if data is scalar values.
new_quantity – Ending quantity demanded if data is scalar values.
original_price – Starting price if data is scalar values.
new_price – Ending price if data is scalar values.
pe_prices – Array/list of prices paid for quantities demanded.
pe_quantities – Array/list of quantities demanded.
- Returns
Price elasticity of demand (float).
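Example. Two common ways to estimate price elasticity of demand, one for scalar inputs and one for arrays. Which formula the method actually uses is not stated here, so both are assumptions; the function names `point_elasticity` and `log_log_elasticity` are hypothetical.

```python
import math


def point_elasticity(original_quantity, new_quantity, original_price, new_price):
    """Percentage change in quantity divided by percentage change in price."""
    pct_quantity = (new_quantity - original_quantity) / original_quantity
    pct_price = (new_price - original_price) / original_price
    return pct_quantity / pct_price


def log_log_elasticity(prices, quantities):
    """Least-squares slope of ln(quantity) on ln(price).

    Assumption: a constant-elasticity demand curve, q = c * p ** e.
    """
    log_p = [math.log(p) for p in prices]
    log_q = [math.log(q) for q in quantities]
    mean_p = sum(log_p) / len(log_p)
    mean_q = sum(log_q) / len(log_q)
    sxy = sum((a - mean_p) * (b - mean_q) for a, b in zip(log_p, log_q))
    sxx = sum((a - mean_p) ** 2 for a in log_p)
    return sxy / sxx


# Price rises from 10 to 12 while quantity falls from 100 to 80.
print(point_elasticity(100, 80, 10, 12))
```

An elasticity below -1 is usually read as elastic demand, and a value between -1 and 0 as inelastic demand.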
-
classmethod
poisson(sample1_events, sample1_days, sample2_events, sample2_days)
Performs a Poisson test which tests statistical difference between groups comparing counts over a period of time.
- Parameters
sample1_events – Count of sample 1 events.
sample1_days – Count of sample 1 days.
sample2_events – Count of sample 2 events.
sample2_days – Count of sample 2 days.
- Returns
P-value for a Poisson statistical test.
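Example. One standard way to compare two event rates is the exact conditional test: given the total number of events, the count in sample 1 follows a binomial distribution under the null hypothesis of equal rates. The sketch below assumes that approach, which may differ from what the method does internally; the function name `poisson_rate_test` is hypothetical.

```python
import math


def poisson_rate_test(events1, days1, events2, days2):
    """Two-sided exact conditional test for equality of two Poisson rates."""
    total_events = events1 + events2
    # Under H0, each event falls in sample 1 with probability days1 / total days.
    p0 = days1 / (days1 + days2)

    def pmf(k):
        return math.comb(total_events, k) * p0 ** k * (1 - p0) ** (total_events - k)

    observed = pmf(events1)
    # Sum the probabilities of all outcomes at least as unlikely as the observed one.
    p_value = sum(
        pmf(k) for k in range(total_events + 1) if pmf(k) <= observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


print(poisson_rate_test(5, 30, 30, 30))
```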
-
classmethod
sample_size1(baseline_input, effect_size_input, significance_level_input, statistical_power_input)
Calculates the sample size needed to detect a desired effect size.
- Parameters
baseline_input – Baseline rate to measure the effect against.
effect_size_input – Desired effect size to measure.
significance_level_input – Desired level of statistical significance.
statistical_power_input – Desired level of statistical power.
- Returns
Calculated sample size.
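Example. A sketch of the textbook sample-size formula for comparing two proportions with a two-sided test. It assumes the effect size is an absolute difference from the baseline rate and that the result is a per-group size; neither is confirmed by the description above. The function name `sample_size_per_group` is hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_per_group(baseline, effect_size, significance_level, power):
    """Per-group sample size for detecting baseline -> baseline + effect_size."""
    p1 = baseline
    p2 = baseline + effect_size  # assumption: absolute, not relative, effect
    p_bar = (p1 + p2) / 2

    z_alpha = NormalDist().inv_cdf(1 - significance_level / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)

    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / effect_size ** 2)


print(sample_size_per_group(0.10, 0.02, 0.05, 0.80))
```

Smaller effects, stricter significance levels, and higher power all push the required sample size up.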
-
classmethod
survival_result(data_type, sample1_dat_survival=False, sample2_dat_survival=False, survival_confidence_level=False, sample1_dat_survival_mean=False, sample1_dat_survival_size=False, sample2_dat_survival_mean=False, sample2_dat_survival_size=False)
Performs a survival test that checks for a statistically significant difference in time until an outcome between two samples.
- Parameters
data_type – Whether the data is in array/list format or scalar format (sample or other).
sample1_dat_survival – Sample 1 data if array/list.
sample2_dat_survival – Sample 2 data if array/list.
survival_confidence_level – Confidence level used to assess the test.
sample1_dat_survival_mean – Sample 1 mean if scalar value.
sample1_dat_survival_size – Sample 1 size if scalar value.
sample2_dat_survival_mean – Sample 2 mean if scalar value.
sample2_dat_survival_size – Sample 2 size if scalar value.
- Returns
P-value for the statistical significance of the difference in time until an outcome between two samples.
-
classmethod
ttest(df, grp_col, grp_1_flag, grp_2_flag, target_kpi)
Performs a t-test between two groups split by a flag.
- Parameters
df – Pandas DataFrame containing data.
grp_col – Column used to group the data.
grp_1_flag – Value used to distinguish group 1.
grp_2_flag – Value used to distinguish group 2.
target_kpi – Column containing the target metric to compare across groups.
- Returns
T-Test p-value.
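Example. A sketch of how rows can be split by a flag column and compared, using a list of dicts as a stand-in for a DataFrame. It computes Welch's t statistic and degrees of freedom only; whether the method uses Welch's or Student's variant is not stated here, and turning the statistic into a p-value needs a t-distribution CDF that the standard library does not provide. The function name `welch_t` and the column names in the usage lines are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(rows, grp_col, grp_1_flag, grp_2_flag, target_kpi):
    """Welch's t statistic and degrees of freedom for two flagged groups."""
    # Split the target metric by the value found in the grouping column.
    group1 = [row[target_kpi] for row in rows if row[grp_col] == grp_1_flag]
    group2 = [row[target_kpi] for row in rows if row[grp_col] == grp_2_flag]

    # Squared standard error of each group mean (sample variance / n).
    se1 = variance(group1) / len(group1)
    se2 = variance(group2) / len(group2)

    t_stat = (mean(group1) - mean(group2)) / math.sqrt(se1 + se2)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    dof = (se1 + se2) ** 2 / (
        se1 ** 2 / (len(group1) - 1) + se2 ** 2 / (len(group2) - 1)
    )
    return t_stat, dof


rows = [{"variant": "control", "revenue": v} for v in [1, 2, 3, 4, 5]]
rows += [{"variant": "test", "revenue": v} for v in [2, 3, 4, 5, 6]]
print(welch_t(rows, "variant", "control", "test", "revenue"))
```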
-
classmethod