tstoolbox.tstoolbox.gof
- tstoolbox.tstoolbox.gof(obs_col=1, sim_col=2, stats='default', replace_nan=None, replace_inf=None, remove_neg=False, remove_zero=False, start_date=None, end_date=None, round_index=None, clean=False, index_type='datetime', source_units=None, target_units=None, kge_sr=1.0, kge09_salpha=1.0, kge12_sgamma=1.0, kge_sbeta=1.0)
Calculate goodness-of-fit statistics between two time series.
The first time series must be the observed series and the second the simulated series. Only two time series can be given.
- Parameters:
obs_col – If an integer, represents the column number of standard input. Can also be a csv, wdm, hdf or xlsx file following the format specified in ‘tstoolbox read …’.
sim_col – If an integer, represents the column number of standard input. Can also be a csv, wdm, hdf or xlsx file following the format specified in ‘tstoolbox read …’.
stats (str) –
[optional, Python: list, Command line: comma separated string, default is ‘default’]
Comma separated list of statistical measures.
You can select two groups of statistical measures.
stats    Description
default  A subset of common statistic measures
all      All available statistic measures
The ‘default’ set of statistics is:
stats            Description
me               Mean error or bias; -inf < ME < inf, close to 0 is better
pc_bias          Percent Bias; -inf < PC_BIAS < inf, close to 0 is better
apc_bias         Absolute Percent Bias; 0 <= APC_BIAS < inf, close to 0 is better
rmsd             Root Mean Square Deviation/Error; 0 <= RMSD < inf, smaller is better
crmsd            Centered Root Mean Square Deviation/Error
corrcoef         Pearson Correlation coefficient (r); -1 <= r <= 1; 1: perfect positive correlation, 0: complete randomness, -1: perfect negative correlation
coefdet          Coefficient of determination (r^2); 0 <= r^2 <= 1; 1: perfect correlation, 0: complete randomness
murphyss         Murphy Skill Score
nse              Nash-Sutcliffe Efficiency; -inf < NSE < 1, larger is better
kge09            Kling-Gupta Efficiency, 2009; -inf < KGE09 < 1, larger is better
kge12            Kling-Gupta Efficiency, 2012; -inf < KGE12 < 1, larger is better
index_agreement  Index of agreement (d); 0 <= d < 1, larger is better
brierss          Brier Skill Score
mae              Mean Absolute Error; 0 <= MAE < inf, smaller is better
mean             observed mean, simulated mean
stdev            observed stdev, simulated stdev
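As a rough illustration of how a few of the ‘default’ measures are commonly defined, the sketch below computes them in plain Python. The function name `basic_gof`, the sample data, and the sign convention (simulated minus observed) are assumptions made for illustration; they are not taken from the tstoolbox source.

```python
import math


def basic_gof(obs, sim):
    """Textbook forms of a few statistics (hypothetical helper, not tstoolbox code)."""
    n = len(obs)
    mean_obs = sum(obs) / n
    # Assumed sign convention: error = simulated - observed.
    errors = [s - o for o, s in zip(obs, sim)]
    sum_sq_err = sum(e * e for e in errors)
    return {
        "me": sum(errors) / n,                      # mean error (bias)
        "pc_bias": 100.0 * sum(errors) / sum(obs),  # percent bias
        "mae": sum(abs(e) for e in errors) / n,     # mean absolute error
        "rmsd": math.sqrt(sum_sq_err / n),          # root mean square deviation
        # Nash-Sutcliffe efficiency: 1 minus error variance over observed variance.
        "nse": 1.0 - sum_sq_err / sum((o - mean_obs) ** 2 for o in obs),
    }


obs = [1.0, 2.0, 3.0, 4.0]
sim = [1.5, 2.0, 2.5, 5.0]
result = basic_gof(obs, sim)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

With this sample data the mean absolute error is 0.5 and the Nash-Sutcliffe efficiency is 0.7; a perfect simulation gives an error of 0 and an efficiency of 1, which is why smaller is better for the former and larger is better for the latter.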
Additional statistics:
stats        Description
acc          Anomaly correlation coefficient (ACC); -1 <= r <= 1; 1: positive correlation of variation in anomalies, 0: complete randomness of variation in anomalies, -1: negative correlation of variation in anomalies
d1           Index of agreement (d1); 0 <= d1 < 1, larger is better
d1_p         Legate-McCabe Index of Agreement; 0 <= d1_p < 1, larger is better
d            Index of agreement (d); 0 <= d < 1, larger is better
dmod         Modified index of agreement (dmod); 0 <= dmod < 1, larger is better
drel         Relative index of agreement (drel); 0 <= drel < 1, larger is better
dr           Refined index of agreement (dr); -1 <= dr < 1, larger is better
ed           Euclidean distance in vector space; 0 <= ed < inf, smaller is better
g_mean_diff  Geometric mean difference
h1_mahe      H1 mean absolute error
h1_mhe       H1 mean error
h1_rmshe     H1 root mean square error
h2_mahe      H2 mean absolute error
h2_mhe       H2 mean error
h2_rmshe     H2 root mean square error
h3_mahe      H3 mean absolute error
h3_mhe       H3 mean error
h3_rmshe     H3 root mean square error
h4_mahe      H4 mean absolute error
h4_mhe       H4 mean error
h4_rmshe     H4 root mean square error
h5_mahe      H5 mean absolute error
h5_mhe       H5 mean error
h5_rmshe     H5 root mean square error
h6_mahe      H6 mean absolute error
h6_mhe       H6 mean error
h6_rmshe     H6 root mean square error
h7_mahe      H7 mean absolute error
h7_mhe       H7 mean error
h7_rmshe     H7 root mean square error
h8_mahe      H8 mean absolute error
h8_mhe       H8 mean error
h8_rmshe     H8 root mean square error
h10_mahe     H10 mean absolute error
h10_mhe      H10 mean error
h10_rmshe    H10 root mean square error
irmse        Inertial root mean square error (IRMSE); 0 <= irmse < inf, smaller is better
lm_index     Legate-McCabe Efficiency Index; 0 <= lm_index < 1, larger is better
maape        Mean Arctangent Absolute Percentage Error (MAAPE); 0 <= maape < pi/2, smaller is better
male         Mean absolute log error; 0 <= male < inf, smaller is better
mapd         Mean absolute percentage deviation (MAPD)
mape         Mean absolute percentage error (MAPE); 0 <= mape < inf, 0 indicates a perfect fit
mase         Mean absolute scaled error
mb_r         Mielke-Berry R value (MB R); 0 <= mb_r < 1, larger is better
mdae         Median absolute error (MdAE); 0 <= mdae < inf, smaller is better
mde          Median error (MdE); -inf < mde < inf, closer to zero is better
mdse         Median squared error (MdSE); 0 <= mdse < inf, closer to zero is better
mean_var     Mean variance
me           Mean error; -inf < me < inf, closer to zero is better
mle          Mean log error; -inf < mle < inf, closer to zero is better
mse          Mean squared error; 0 <= mse < inf, smaller is better
msle         Mean squared log error; 0 <= msle < inf, smaller is better
ned          Normalized Euclidean distance in vector space; 0 <= ned < inf, smaller is better
nrmse_iqr    IQR normalized root mean square error; 0 <= nrmse_iqr < inf, smaller is better
nrmse_mean   Mean normalized root mean square error; 0 <= nrmse_mean < inf, smaller is better
nrmse_range  Range normalized root mean square error; 0 <= nrmse_range < inf, smaller is better
nse_mod      Modified Nash-Sutcliffe efficiency (NSE mod); -inf < nse_mod < 1, larger is better
nse_rel      Relative Nash-Sutcliffe efficiency (NSE rel); -inf < nse_rel < 1, larger is better
rmse         Root mean square error; 0 <= rmse < inf, smaller is better
rmsle        Root mean square log error; 0 <= rmsle < inf, smaller is better
sa           Spectral Angle (SA); -pi/2 <= sa < pi/2, closer to 0 is better
sc           Spectral Correlation (SC); -pi/2 <= sc < pi/2, closer to 0 is better
sga          Spectral Gradient Angle (SGA); -pi/2 <= sga < pi/2, closer to 0 is better
sid          Spectral Information Divergence (SID); -pi/2 <= sid < pi/2, closer to 0 is better
smape1       Symmetric Mean Absolute Percentage Error (1) (SMAPE1); 0 <= smape1 < 100, smaller is better
smape2       Symmetric Mean Absolute Percentage Error (2) (SMAPE2); 0 <= smape2 < 100, smaller is better
spearman_r   Spearman rank correlation coefficient; -1 <= spearman_r <= 1; 1: perfect positive correlation, 0: complete randomness, -1: perfect negative correlation
ve           Volumetric Efficiency (VE); 0 <= ve < 1, larger is better
watt_m       Watterson’s M (M); -1 <= watt_m < 1, larger is better
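To make the difference between an unbounded and a bounded percentage measure concrete, the sketch below compares the commonly cited definitions of MAPE and MAAPE in plain Python. The function names, the choice to express MAPE in percent, and the sample data are assumptions for illustration and may differ from what tstoolbox computes internally. Both definitions require non-zero observed values.

```python
import math


def mape(obs, sim):
    """Mean absolute percentage error, in percent (assumed scaling)."""
    return 100.0 * sum(abs((o - s) / o) for o, s in zip(obs, sim)) / len(obs)


def maape(obs, sim):
    """Mean arctangent absolute percentage error, in radians."""
    # arctan maps any non-negative ratio into [0, pi/2), so one huge
    # relative error cannot dominate the average.
    return sum(math.atan(abs((o - s) / o)) for o, s in zip(obs, sim)) / len(obs)


obs = [1.0, 2.0, 4.0]
sim = [1000.0, 3.0, 2.0]  # the first simulated value is a gross outlier

print("mape :", mape(obs, sim))
print("maape:", maape(obs, sim))
print("upper bound for maape:", math.pi / 2)
```

The single outlier pushes MAPE far above 100, while MAAPE stays below pi/2, which matches the ranges listed for the two measures.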
replace_nan (float) – If given, indicates which value to replace NaN values with in the two arrays. If None, when a NaN value is found at the i-th position in the observed OR simulated array, the i-th value of the observed and simulated array are removed before the computation.
replace_inf (float) – If given, indicates which value to replace Inf values with in the two arrays. If None, when an inf value is found at the i-th position in the observed OR simulated array, the i-th value of the observed and simulated array are removed before the computation.
remove_neg (boolean) – If True, when a negative value is found at the i-th position in the observed OR simulated array, the i-th value of the observed AND simulated array are removed before the computation.
remove_zero (boolean) – If True, when a zero value is found at the i-th position in the observed OR simulated array, the i-th value of the observed AND simulated array are removed before the computation.
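The pairwise behaviour of these four options can be sketched as follows. This is a minimal plain-Python illustration; the helper name `filter_pairs` and the order in which the steps are applied (replace first, then drop) are assumptions, not the tstoolbox implementation.

```python
import math


def filter_pairs(obs, sim, replace_nan=None, replace_inf=None,
                 remove_neg=False, remove_zero=False):
    """Hypothetical helper: keep position i only if both values pass every filter."""
    kept_obs, kept_sim = [], []
    for o, s in zip(obs, sim):
        pair = [o, s]
        # Optional replacement of NaN and Inf values (assumed to happen first).
        if replace_nan is not None:
            pair = [replace_nan if math.isnan(v) else v for v in pair]
        if replace_inf is not None:
            pair = [replace_inf if math.isinf(v) else v for v in pair]
        # A remaining NaN or Inf in EITHER series drops the pair from BOTH.
        if any(math.isnan(v) or math.isinf(v) for v in pair):
            continue
        if remove_neg and any(v < 0 for v in pair):
            continue
        if remove_zero and any(v == 0 for v in pair):
            continue
        kept_obs.append(pair[0])
        kept_sim.append(pair[1])
    return kept_obs, kept_sim


nan, inf = float("nan"), float("inf")
obs = [1.0, nan, 3.0, -1.0, 0.0, 6.0]
sim = [1.1, 2.0, inf, 4.0, 5.0, 6.2]

print(filter_pairs(obs, sim))
print(filter_pairs(obs, sim, remove_neg=True, remove_zero=True))
```

With the defaults, only the positions holding a NaN or an Inf are dropped. Enabling `remove_neg` and `remove_zero` additionally drops the positions where either series is negative or zero, leaving two pairs.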
start_date (str) –
[optional, defaults to first date in time-series, input filter]
The start_date of the series in ISOdatetime format, or ‘None’ for beginning.
end_date (str) –
[optional, defaults to last date in time-series, input filter]
The end_date of the series in ISOdatetime format, or ‘None’ for end.
round_index –
[optional, default is None which will do nothing to the index, output format]
Round the index to the nearest time point. This can significantly improve performance because it cuts down on memory and processing requirements; however, be cautious about rounding to a very coarse interval from a small one, which could lead to duplicate values in the index.
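The duplicate-index hazard is easy to reproduce with the standard library alone. In the sketch below, a 15-minute index is rounded to the nearest hour; the helper `round_to_hour` and its tie rule (half-hours round up) are assumptions for illustration and need not match how tstoolbox rounds.

```python
from collections import Counter
from datetime import datetime, timedelta


def round_to_hour(ts):
    """Round a timestamp to the nearest hour; exact half-hours round up (assumed rule)."""
    floored = ts.replace(minute=0, second=0, microsecond=0)
    if ts - floored >= timedelta(minutes=30):
        floored += timedelta(hours=1)
    return floored


# Eight timestamps at a 15-minute interval, starting at midnight.
index = [datetime(2020, 1, 1) + timedelta(minutes=15 * i) for i in range(8)]
rounded = [round_to_hour(ts) for ts in index]

counts = Counter(rounded)
for ts, n in sorted(counts.items()):
    print(ts.isoformat(), "appears", n, "time(s)")
```

Eight distinct timestamps collapse onto three hourly values, so the rounded index is no longer unique and would need to be aggregated or cleaned before further processing.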
clean –
[optional, default is False, input filter]
The ‘clean’ command will repair an input index, removing duplicate index values and sorting.
index_type (str) –
[optional, default is ‘datetime’, output format]
Can be either ‘number’ or ‘datetime’. Use ‘number’ with index values that are Julian dates, or other epoch reference.
source_units (str) –
[optional, default is None, transformation]
If unit is specified for the column as the second field of a ‘:’ delimited column name, then the specified units and the ‘source_units’ must match exactly.
Any unit string compatible with the ‘pint’ library can be used.
target_units (str) –
[optional, default is None, transformation]
The purpose of this option is to specify target units for unit conversion.
The units of the input time-series or values are specified either as the second field of a ‘:’ delimited name in the header line of the input or with the ‘source_units’ keyword.
Any unit string compatible with the ‘pint’ library can be used.
This option will also add the ‘target_units’ string to the column names.
tablefmt (str) –
[optional, default is ‘csv’, output format]
The table format. Can be one of ‘csv’, ‘tsv’, ‘plain’, ‘simple’, ‘grid’, ‘pipe’, ‘orgtbl’, ‘rst’, ‘mediawiki’, ‘latex’, ‘latex_raw’ and ‘latex_booktabs’.
float_format –
[optional, output format]
Format for float numbers.
kge_sr (float) –
[optional, defaults to 1.0]
Scaling factor for kge09 and kge12 correlation.
kge09_salpha (float) –
[optional, defaults to 1.0]
Scaling factor for kge09 alpha.
kge12_sgamma (float) –
[optional, defaults to 1.0]
Scaling factor for kge12 gamma.
kge_sbeta (float) –
[optional, defaults to 1.0]
Scaling factor for kge09 and kge12 beta.
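One way to read the four scaling factors is through the scaled form of the Kling-Gupta efficiency, in which each component's deviation from 1 is multiplied by its factor before the Euclidean distance is taken. The sketch below follows that published form in plain Python; the function `kge` and its argument names are hypothetical, and whether tstoolbox applies the factors in exactly this way is an assumption.

```python
import math


def _mean(x):
    return sum(x) / len(x)


def _std(x):
    m = _mean(x)
    return math.sqrt(sum((v - m) ** 2 for v in x) / len(x))


def _pearson(x, y):
    mx, my = _mean(x), _mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / (_std(x) * _std(y))


def kge(obs, sim, variant="2009", sr=1.0, svar=1.0, sbeta=1.0):
    """Scaled KGE sketch. svar plays the role of kge09_salpha or kge12_sgamma."""
    r = _pearson(obs, sim)             # correlation component, scaled by sr
    beta = _mean(sim) / _mean(obs)     # bias ratio, scaled by sbeta
    if variant == "2009":
        var_ratio = _std(sim) / _std(obs)  # alpha: ratio of standard deviations
    else:
        # gamma: ratio of coefficients of variation (2012 variant)
        var_ratio = (_std(sim) / _mean(sim)) / (_std(obs) / _mean(obs))
    return 1.0 - math.sqrt(
        (sr * (r - 1.0)) ** 2
        + (svar * (var_ratio - 1.0)) ** 2
        + (sbeta * (beta - 1.0)) ** 2
    )


obs = [1.0, 2.0, 3.0, 4.0]
sim = [2.0, 4.0, 6.0, 8.0]  # simulated series is exactly twice the observed

print("kge09:", kge(obs, sim, variant="2009"))
print("kge12:", kge(obs, sim, variant="2012"))
print("kge12 with sbeta=0:", kge(obs, sim, variant="2012", sbeta=0.0))
```

Doubling the series leaves the correlation at 1 but doubles both the mean and the standard deviation, so the 2009 variant is penalized twice (about -0.41) while the 2012 variant is penalized only for the bias (0). Setting a scaling factor to 0 removes that component from the score entirely.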