Dataset statistics
| Number of variables | 14 |
|---|---|
| Number of observations | 25000 |
| Missing cells | 45602 |
| Missing cells (%) | 13.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 2.7 MiB |
| Average record size in memory | 112.0 B |
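The percentage and size rows above follow from the counts in the same table. A minimal sketch that re-derives them, assuming every variable contributes one cell per observation and that 1 MiB is 1024² bytes (the variable names are illustrative, not part of the report):

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 14
n_observations = 25_000
missing_cells = 45_602
avg_record_bytes = 112.0

# Assumption: one cell per variable per observation.
total_cells = n_variables * n_observations

# Share of all cells that are missing, as a percentage.
missing_pct = 100 * missing_cells / total_cells

# Total memory implied by the average record size, in MiB.
total_mib = avg_record_bytes * n_observations / 1024**2

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"total size in memory: {total_mib:.1f} MiB")
```

Both derived values round to the figures shown in the table (13.0% and 2.7 MiB).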
Variable types
| NUM | 10 |
|---|---|
| CAT | 2 |
| BOOL | 2 |
Warnings
| first_order_day has a high cardinality: 412 distinct values | High cardinality |
|---|---|
| cnt_orders_60d_fwd is highly correlated with cnt_orders_30d_fwd and 1 other fields | High correlation |
| cnt_orders_30d_fwd is highly correlated with cnt_orders_60d_fwd | High correlation |
| cnt_orders_90d_fwd is highly correlated with cnt_orders_60d_fwd and 1 other fields | High correlation |
| cnt_orders_6m_fwd is highly correlated with cnt_orders_90d_fwd | High correlation |
| voucher_amount has 22801 (91.2%) missing values | Missing |
| member_get_member_viral has 22801 (91.2%) missing values | Missing |
| user_id has unique values | Unique |
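The two "Missing" warnings can be cross-checked against the dataset-level missing-cell count. The sketch below uses only numbers quoted in the report; the dictionary and variable names are illustrative.

```python
n_observations = 25_000
total_missing_cells = 45_602  # from "Dataset statistics"

# Missing counts quoted in the two "Missing" warnings.
missing_by_column = {
    "voucher_amount": 22_801,
    "member_get_member_viral": 22_801,
}

# Per-column missing share, as a percentage of all observations.
for column, n_missing in missing_by_column.items():
    pct = 100 * n_missing / n_observations
    print(f"{column}: {pct:.1f}% missing")

# Missing cells that are not accounted for by the flagged columns.
unexplained = total_missing_cells - sum(missing_by_column.values())
print(f"missing cells outside flagged columns: {unexplained}")
```

The remainder is zero, so by these figures the two flagged columns account for every missing cell in the dataset.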
Reproduction
| Analysis started | 2020-10-06 14:14:03.172326 |
|---|---|
| Analysis finished | 2020-10-06 14:14:53.666148 |
| Duration | 50.49 seconds |
| Software version | pandas-profiling v2.9.0 |
| Download configuration | config.yaml |
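The Duration row is the difference between the two timestamps. A small sketch with the standard library, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

# Timestamp layout used in the "Reproduction" table.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2020-10-06 14:14:03.172326", FMT)
finished = datetime.strptime("2020-10-06 14:14:53.666148", FMT)

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # prints "50.49 seconds"
```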
user_id
| Distinct | 25000 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2135903.17 |
| Minimum | 1988 |
| Maximum | 5220624 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 195.3 KiB |
Quantile statistics
| Minimum | 1988 |
|---|---|
| 5-th percentile | 335317.3 |
| Q1 | 828961.5 |
| median | 1865581 |
| Q3 | 3280115.5 |
| 95-th percentile | 4753594.7 |
| Maximum | 5220624 |
| Range | 5218636 |
| Interquartile range (IQR) | 2451154 |
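Range and IQR in the table above are plain differences of other rows. A sketch that recomputes them from the reported values (names are illustrative):

```python
# Values copied from the "Quantile statistics" table.
minimum, maximum = 1988, 5_220_624
q1, q3 = 828_961.5, 3_280_115.5

# Range: spread between the smallest and largest value.
value_range = maximum - minimum

# Interquartile range: spread of the middle 50% of values.
iqr = q3 - q1

print(f"range: {value_range}")
print(f"IQR: {iqr:.0f}")
```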
Descriptive statistics
| Standard deviation | 1443230.416 |
|---|---|
| Coefficient of variation (CV) | 0.6757003019 |
| Kurtosis | -0.9984336108 |
| Mean | 2135903.17 |
| Median Absolute Deviation (MAD) | 1159394 |
| Skewness | 0.4726336348 |
| Sum | 5.339757924e+10 |
| Variance | 2.082914035e+12 |
| Monotonicity | Not monotonic |
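Several of the descriptive statistics are linked by simple identities: CV is standard deviation over mean, variance is the squared standard deviation, and the sum is mean times the number of observations. The sketch below checks the reported values against these identities. Because the inputs are rounded in the report, agreement is tested only to a relative tolerance, which is an assumption of this example.

```python
import math

# Values copied from the report.
n = 25_000
mean = 2_135_903.17
std = 1_443_230.416

cv = std / mean        # coefficient of variation
variance = std ** 2    # variance is the square of the standard deviation
total = mean * n       # sum of all values

# Compare with the reported figures, allowing for rounding of the inputs.
checks = {
    "CV": math.isclose(cv, 0.6757003019, rel_tol=1e-6),
    "Variance": math.isclose(variance, 2.082914035e12, rel_tol=1e-6),
    "Sum": math.isclose(total, 5.339757924e10, rel_tol=1e-6),
}
for name, ok in checks.items():
    print(f"{name}: {'consistent' if ok else 'inconsistent'}")
```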
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 551950 | 1 | < 0.1% |
| 554654 | 1 | < 0.1% |
| 2091688 | 1 | < 0.1% |
| 735878 | 1 | < 0.1% |
| 1047210 | 1 | < 0.1% |
| 697946 | 1 | < 0.1% |
| 1299266 | 1 | < 0.1% |
| 2808494 | 1 | < 0.1% |
| 4809684 | 1 | < 0.1% |
| 371376 | 1 | < 0.1% |
| Other values (24990) | 24990 | > 99.9% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1988 | 1 | < 0.1% |
| 8172 | 1 | < 0.1% |
| 9746 | 1 | < 0.1% |
| 35580 | 1 | < 0.1% |
| 37514 | 1 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 5220624 | 1 | < 0.1% |
| 5220142 | 1 | < 0.1% |
| 5219824 | 1 | < 0.1% |
| 5219452 | 1 | < 0.1% |
| 5219352 | 1 | < 0.1% |
first_order_day
| Distinct | 412 |
|---|---|
| Distinct (%) | 1.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 195.3 KiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 05/03/2016 | 162 | 0.6% |
| 27/02/2016 | 150 | 0.6% |
| 28/02/2016 | 143 | 0.6% |
| 14/02/2016 | 142 | 0.6% |
| 06/03/2016 | 139 | 0.6% |
| 20/02/2016 | 138 | 0.6% |
| 22/05/2016 | 130 | 0.5% |
| 19/03/2016 | 129 | 0.5% |
| 18/03/2016 | 128 | 0.5% |
| 06/02/2016 | 127 | 0.5% |
| Other values (402) | 23612 | 94.4% |
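The values in this table are day-first date strings (27/02/2016 cannot be read month-first), which is why the column is profiled as categorical rather than as a date. A minimal sketch, using only the ten listed values, that parses them with the standard library and re-derives the "Other values" row; the variable names are illustrative.

```python
from datetime import datetime

n_observations = 25_000

# Ten most frequent values and their counts, copied from the table.
top_values = {
    "05/03/2016": 162, "27/02/2016": 150, "28/02/2016": 143,
    "14/02/2016": 142, "06/03/2016": 139, "20/02/2016": 138,
    "22/05/2016": 130, "19/03/2016": 129, "18/03/2016": 128,
    "06/02/2016": 127,
}

# Parse the day-first strings into real dates.
parsed = {
    datetime.strptime(text, "%d/%m/%Y").date(): count
    for text, count in top_values.items()
}

# Observations not covered by the ten listed values.
other_count = n_observations - sum(top_values.values())
other_pct = 100 * other_count / n_observations

print(f"listed dates span {min(parsed)} to {max(parsed)}")
print(f"other values: {other_count} ({other_pct:.1f}%)")
```

Parsing the column into dates before profiling would let the report treat it as a date variable instead of raising a high-cardinality warning.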
Frequencies of value counts
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |