mettoolbox.indices.spei
- mettoolbox.indices.spei(rainfall, pet, source_units, nsmallest=None, nlargest=None, groupby='M', fit_type='lmom', dist_type='gam', scale=1, start_date=None, end_date=None, dropna='no', clean=False, round_index=None, skiprows=None, index_type='datetime')
Standard Precipitation/Evaporation Index.
Calculates a windowed cumulative sum of daily precipitation minus evaporation.
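As a rough illustration of what a windowed cumulative sum of precipitation minus evaporation looks like, the sketch below uses plain pandas. It is not mettoolbox's implementation; the series values and the 3-day window are assumptions made up for the example.

```python
import pandas as pd

# Hypothetical daily series in mm; the values are made up for illustration.
dates = pd.date_range("2020-01-01", periods=6, freq="D")
rainfall = pd.Series([5.0, 0.0, 2.0, 0.0, 8.0, 1.0], index=dates)
pet = pd.Series([1.0, 2.0, 2.0, 3.0, 2.0, 1.0], index=dates)

# Daily water balance: precipitation minus evaporation.
balance = rainfall - pet

# Windowed cumulative sum over 3 days (an assumed window length).
# The first two entries are NaN because a full window is not yet available.
window_sum = balance.rolling(window=3).sum()
print(window_sum)
```

Both inputs must share the same units and the same date index for the subtraction to line up day by day.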
- Parameters:
rainfall (Union[int, str, DataFrame]) – A csv, wdm, hdf5, or xlsx file, or a pandas DataFrame or Series, or an integer column or string name of standard input.
Represents a daily time-series of precipitation in units specified in source_units.
pet (Union[int, str, DataFrame]) – A csv, wdm, hdf5, or xlsx file, or a pandas DataFrame or Series, or an integer column or string name of standard input.
Represents a daily time-series of evaporation in units specified in source_units.
source_units (str) –
[optional, default is None, transformation]
If unit is specified for the column as the second field of a ‘:’ delimited column name, then the specified units and the ‘source_units’ must match exactly.
Any unit string compatible with the ‘pint’ library can be used.
nsmallest (int) –
[optional, default is None]
Return the “n” days with the smallest precipitation minus evaporation index value within the groupby pandas offset period.
Cannot assign both nsmallest and nlargest keywords.
nlargest (int) –
[optional, default is None]
Return the “n” days with the largest precipitation minus evaporation index value within the groupby pandas offset period.
Cannot assign both nsmallest and nlargest keywords.
groupby (str) – Pandas offset period string representing the time over which the nsmallest or nlargest values would be evaluated.
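The interaction of nsmallest (or nlargest) with groupby can be pictured with plain pandas. This is a sketch of the idea only, not the library's code path; the index values are invented, and grouping by calendar month via `to_period("M")` is an assumption standing in for the 'M' offset period.

```python
import pandas as pd

# Hypothetical index values on six days spread over two months.
dates = pd.to_datetime(
    ["2020-01-01", "2020-01-02", "2020-01-03",
     "2020-02-01", "2020-02-02", "2020-02-03"]
)
index_values = pd.Series([3.0, 1.0, 2.0, -1.0, 5.0, 0.0], index=dates)

# Group by calendar month, then keep the n=2 extreme values in each month.
by_month = index_values.groupby(index_values.index.to_period("M"))
smallest = by_month.nsmallest(2)
largest = by_month.nlargest(2)
print(smallest)
```

The result is indexed by (period, date), so each selected day stays attached to the period it was selected from.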
fit_type (str ("lmom" or "mle")) – Specify the type of fit to use for fitting the distribution to the precipitation data: either L-moments (lmom) or Maximum Likelihood Estimation (mle). Note: use L-moments when comparing against NCAR’s NCL code and R’s packages that calculate SPI and SPEI.
dist_type (str) –
The distribution type to fit using either L-moments (fit_type="lmom") or MLE (fit_type="mle").
dist_type  Distribution               fit_type lmom  fit_type mle
gam        Gamma                      X              X
exp        Exponential                X              X
gev        Generalized Extreme Value  X              X
gpa        Generalized Pareto         X              X
gum        Gumbel                     X              X
nor        Normal                     X              X
pe3        Pearson III                X              X
wei        Weibull                    X              X
glo        Generalized Logistic       X
gno        Generalized Normal         X
kap        Kappa                      X
wak        Wakeby                     X
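To show what fitting a distribution contributes to a standardized index, the sketch below fits a normal distribution by maximum likelihood using only the Python standard library, then maps each value through the fitted CDF to a standard normal quantile. It is a conceptual illustration, not mettoolbox's fitting code; the sample values and the helper name `standardized_index` are hypothetical.

```python
from statistics import NormalDist, fmean, pstdev

# Hypothetical accumulated water-balance values (mm); made up for illustration.
sample = [12.0, -3.5, 7.2, 0.4, -8.1, 15.3, 4.9, -1.2]

# Maximum likelihood estimates for a normal distribution are the sample
# mean and the population standard deviation.
mu = fmean(sample)
sigma = pstdev(sample)
fitted = NormalDist(mu=mu, sigma=sigma)


def standardized_index(value: float) -> float:
    """Map a value to the standard normal quantile of its fitted CDF."""
    probability = fitted.cdf(value)
    return NormalDist().inv_cdf(probability)


index_values = [standardized_index(v) for v in sample]
print([round(z, 3) for z in index_values])
```

For the normal case the result reduces to the familiar z-score; with a skewed distribution such as the gamma, the same CDF-to-quantile step is what produces the standardization.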
scale (int (default=1)) – Integer to specify the number of time periods over which the standardized precipitation index is to be calculated. If freq="M" then this is the number of months.
input_ts (str) –
[optional though required if using within Python, default is ‘-’ (stdin)]
Whether from a file or standard input, data requires a single line header of column names. The default header is the first line of the input, but this can be changed for CSV files using the ‘skiprows’ option.
Most common date formats can be used, but the closer to ISO 8601 date/time standard the better.
Comma-separated values (CSV) files or tab-separated values (TSV):
File separators will be automatically detected. Columns can be selected by name or index, where the index for data columns starts at 1.
Command line examples:
Keyword Example                   Description
--input_ts=fn.csv                 read all columns from ‘fn.csv’
--input_ts=fn.csv,2,1             read data columns 2 and 1 from ‘fn.csv’
--input_ts=fn.csv,2,skiprows=2    read data column 2 from ‘fn.csv’, skipping first 2 rows so header is read from third row
--input_ts=fn.xlsx,2,Sheet21      read all data from the 2nd sheet, then all data from “Sheet21” of ‘fn.xlsx’
--input_ts=fn.hdf5,Table12,T2     read all data from table “Table12”, then all data from table “T2” of ‘fn.hdf5’
--input_ts=fn.wdm,210,110         read DSNs 210, then 110 from ‘fn.wdm’
--input_ts='-'                    read all columns from standard input (stdin)
--input_ts='-' --columns=4,1      read columns 4 and 1 from standard input (stdin)
If working with CSV or TSV files you can use redirection rather than using --input_ts=fname.csv. The following are identical:
From a file:
command subcmd --input_ts=fname.csv
From standard input (since ‘--input_ts=-’ is the default):
command subcmd < fname.csv
Can also combine commands by piping:
command subcmd < filein.csv | command subcmd1 > fileout.csv
Python library examples:
You must use the `input_ts=...` option where `input_ts` can be one of a [pandas DataFrame, pandas Series, dict, tuple, list, StringIO, or file name].
start_date (str) –
[optional, defaults to first date in time-series, input filter]
The start_date of the series in ISOdatetime format, or ‘None’ for beginning.
end_date (str) –
[optional, defaults to last date in time-series, input filter]
The end_date of the series in ISOdatetime format, or ‘None’ for end.
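The effect of the start_date and end_date filters resembles label-based slicing on a pandas datetime index, where both endpoints are inclusive and None leaves that side open. The sketch below is an analogy in plain pandas, not the library's own filter; the series is made up.

```python
import pandas as pd

# Hypothetical ten-day daily series.
dates = pd.date_range("2020-01-01", periods=10, freq="D")
series = pd.Series(range(10), index=dates)

start_date = "2020-01-03"
end_date = "2020-01-05"

# Label-based slicing keeps both endpoints.
subset = series.loc[start_date:end_date]

# None on one side means "from the beginning" or "to the end".
from_beginning = series.loc[None:end_date]
print(subset)
```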
dropna (str) –
[optional, default is ‘no’, input filter]
Set dropna to ‘any’ to have records dropped that have NA value in any column, or ‘all’ to have records dropped that have NA in all columns. Set to ‘no’ to not drop any records. The default is ‘no’.
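The ‘any’ and ‘all’ settings correspond in spirit to the `how` argument of pandas `DataFrame.dropna`. The sketch below shows the difference on a made-up two-column table; it assumes that correspondence and does not call mettoolbox.

```python
import pandas as pd

# Hypothetical table with missing values in different combinations.
df = pd.DataFrame(
    {"rainfall": [1.0, None, None, 4.0], "pet": [0.5, 0.7, None, None]},
    index=pd.date_range("2020-01-01", periods=4, freq="D"),
)

# 'any': drop a record if at least one column is NA.
dropped_any = df.dropna(how="any")

# 'all': drop a record only if every column is NA.
dropped_all = df.dropna(how="all")

print(len(df), len(dropped_any), len(dropped_all))
```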
clean –
[optional, default is False, input filter]
The ‘clean’ command will repair an input index, removing duplicate index values and sorting.
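A minimal pandas sketch of that kind of repair follows. Which duplicate is kept is not stated above, so keeping the first occurrence is an assumption here, and the data are invented.

```python
import pandas as pd

# Hypothetical series with an unsorted index and one duplicated timestamp.
index = pd.to_datetime(["2020-01-03", "2020-01-01", "2020-01-01", "2020-01-02"])
series = pd.Series([30.0, 10.0, 11.0, 20.0], index=index)

# Keep the first record for each duplicated timestamp (an assumption),
# then sort the index chronologically.
cleaned = series[~series.index.duplicated(keep="first")].sort_index()
print(cleaned)
```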
round_index –
[optional, default is None which will do nothing to the index, output format]
Round the index to the nearest time point. This can significantly improve performance by cutting memory and processing requirements; however, be cautious about rounding from a small interval to a very coarse one, since this could lead to duplicate values in the index.
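The duplicate-index hazard can be reproduced with plain pandas. In this sketch the timestamps and the daily rounding interval are made up for illustration, and `DatetimeIndex.round` stands in for whatever rounding the library applies.

```python
import pandas as pd

# Hypothetical sub-daily timestamps.
index = pd.to_datetime(
    ["2020-01-01 06:00", "2020-01-01 09:00", "2020-01-01 18:00"]
)

# Rounding to whole days collapses the first two stamps onto the same day.
rounded = index.round("D")

print(rounded)
print(rounded.has_duplicates)
```

Checking `has_duplicates` after rounding is a cheap way to notice that the chosen interval is too coarse for the data.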
skiprows (list-like or integer or callable) –
[optional, default is None which will infer header from first line, input filter]
Line numbers to skip (0-indexed) if a list or number of lines to skip at the start of the file if an integer.
If used in Python can be a callable, the callable function will be evaluated against the row indices, returning True if the row should be skipped and False otherwise. An example of a valid callable argument would be
lambda x: x in [0, 2]
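The callable form matches the `skiprows` argument of `pandas.read_csv`, so its behavior can be sketched with pandas directly. The CSV text below is invented, and passing it through `io.StringIO` is only a convenience for the example.

```python
import io

import pandas as pd

# Hypothetical file contents: row 0 is a banner and row 2 is a units row.
raw = (
    "exported by a hypothetical logger\n"
    "Datetime,rainfall\n"
    "units row to ignore\n"
    "2020-01-01,1.5\n"
    "2020-01-02,0.0\n"
)

# The callable receives each row index and returns True for rows to skip,
# so row 1 becomes the header.
df = pd.read_csv(
    io.StringIO(raw),
    skiprows=lambda x: x in [0, 2],
    index_col=0,
    parse_dates=True,
)
print(df)
```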
index_type (str) –
[optional, default is ‘datetime’, output format]
Can be either ‘number’ or ‘datetime’. Use ‘number’ with index values that are Julian dates, or other epoch reference.
names (str) –
[optional, default is None, transformation]
If None, the column names are taken from the first row after ‘skiprows’ from the input dataset.
MUST include a name for all columns in the input dataset, including the index column.
print_input –
[optional, default is False, output format]
If set to ‘True’ will include the input columns in the output table.
tablefmt (str) –
[optional, default is ‘csv’, output format]
The table format. Can be one of ‘csv’, ‘tsv’, ‘plain’, ‘simple’, ‘grid’, ‘pipe’, ‘orgtbl’, ‘rst’, ‘mediawiki’, ‘latex’, ‘latex_raw’ and ‘latex_booktabs’.