# tstoolbox.tstoolbox.ewm_window

`tstoolbox.tstoolbox.ewm_window(input_ts='-', columns=None, start_date=None, end_date=None, dropna='no', skiprows=None, index_type='datetime', names=None, clean=False, statistic='', alpha_com=None, alpha_span=None, alpha_halflife=None, alpha=None, min_periods=0, adjust=True, ignore_na=False, source_units=None, target_units=None, print_input=False)`

Calculate exponential weighted functions.

Exactly one of center of mass (`alpha_com`), span (`alpha_span`), half-life (`alpha_halflife`), and smoothing factor (`alpha`) must be provided. Allowed values and the relationships between the parameters are specified in the parameter descriptions below; see the link at the end of this section for a detailed explanation.

When adjust is True (default), weighted averages are calculated using weights (1-alpha)**(n-1), (1-alpha)**(n-2), … , 1-alpha, 1.

When adjust is False, weighted averages are calculated recursively as:

weighted_average[0] = arg[0]; weighted_average[i] = (1-alpha)*weighted_average[i-1] + alpha*arg[i].
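This recursion can be sketched in plain Python (a minimal illustration of the adjust=False case, not the tstoolbox implementation, which delegates to pandas):

```python
def ewma_adjust_false(values, alpha):
    """Exponentially weighted mean with adjust=False:
    y[0] = x[0]; y[i] = (1 - alpha) * y[i-1] + alpha * x[i]."""
    result = []
    prev = None
    for x in values:
        # The first observation seeds the recursion; later ones blend in.
        prev = x if prev is None else (1 - alpha) * prev + alpha * x
        result.append(prev)
    return result

ewma_adjust_false([1.0, 2.0, 3.0], 0.5)  # → [1.0, 1.5, 2.25]
```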

When ignore_na is False (default), weights are based on absolute positions. For example, the weights of x and y used in calculating the final weighted average of [x, None, y] are (1-alpha)**2 and 1 (if adjust is True), and (1-alpha)**2 and alpha (if adjust is False).

When ignore_na is True (reproducing pre-0.15.0 behavior), weights are based on relative positions. For example, the weights of x and y used in calculating the final weighted average of [x, None, y] are 1-alpha and 1 (if adjust is True), and 1-alpha and alpha (if adjust is False).
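The two ignore_na behaviors can be checked directly against pandas, which performs the underlying computation (a sketch assuming pandas is installed; alpha=0.5 is an arbitrary choice):

```python
import pandas as pd

s = pd.Series([1.0, None, 3.0])
a = 0.5

# ignore_na=False: weights based on absolute positions,
# so x and y get weights (1-a)**2 and 1 (adjust=True).
abs_pos = s.ewm(alpha=a, adjust=True, ignore_na=False).mean()
# final value: (1*(1-a)**2 + 3*1) / ((1-a)**2 + 1) = 3.25 / 1.25 = 2.6

# ignore_na=True: weights based on relative positions,
# so x and y get weights (1-a) and 1 (adjust=True).
rel_pos = s.ewm(alpha=a, adjust=True, ignore_na=True).mean()
# final value: (1*(1-a) + 3*1) / ((1-a) + 1) = 3.5 / 1.5 ≈ 2.333
```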

More details can be found at http://pandas.pydata.org/pandas-docs/stable/computation.html#exponentially-weighted-windows
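All four parameterizations reduce to a single smoothing factor alpha; the conversions given in the parameter descriptions below can be written directly (a sketch for illustration):

```python
import math

def alpha_from_com(com):
    """Center of mass: alpha = 1/(1+com), for com >= 0."""
    return 1.0 / (1.0 + com)

def alpha_from_span(span):
    """Span: alpha = 2/(span+1), for span >= 1."""
    return 2.0 / (span + 1.0)

def alpha_from_halflife(halflife):
    """Half-life: alpha = 1 - exp(log(0.5)/halflife), for halflife > 0."""
    return 1.0 - math.exp(math.log(0.5) / halflife)

# com=1, span=3, and halflife=1 all correspond to alpha = 0.5
```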

Parameters
• statistic (str) –

[optional, defaults to ‘’]

Statistic applied to each window.

corr : correlation
cov : covariance
mean : mean
std : standard deviation
var : variance

• alpha_com (float) –

[optional, defaults to None]

Specify decay in terms of center of mass, alpha=1/(1+com), for com>=0

• alpha_span (float) –

[optional, defaults to None]

Specify decay in terms of span, alpha=2/(span+1), for span>=1

• alpha_halflife (float) –

[optional, defaults to None]

Specify decay in terms of half-life, alpha=1-exp(log(0.5)/halflife), for halflife>0

• alpha (float) –

[optional, defaults to None]

Specify smoothing factor alpha directly, 0<alpha<=1

• min_periods (int) –

[optional, default is 0]

Minimum number of observations in window required to have a value (otherwise result is NA).

• adjust (boolean) –

[optional, default is True]

Divide by a decaying adjustment factor in beginning periods to account for the imbalance in relative weightings (viewing EWMA as a moving average).

• ignore_na (boolean) – [optional, default is False] Ignore missing values when calculating weights.

• input_ts (str) –

[optional, required if using Python API, default is ‘-‘ (stdin)]

Whether from a file or standard input, data requires a header of column names. The default header is the first line of the input, but this can be changed using the ‘skiprows’ option.

Most separators will be automatically detected. Most common date formats can be used, but the closer to ISO 8601 date/time standard the better.

Command line:

```
+-------------------------+------------------------+
| --input_ts=filename.csv | to read 'filename.csv' |
+-------------------------+------------------------+
| --input_ts='-'          | to read from standard  |
|                         | input (stdin).         |
+-------------------------+------------------------+
```

In many cases it is better to use redirection rather than
`--input_ts=filename.csv`. The following are identical.

From a file:

```
command subcmd --input_ts=filename.csv
```

From standard input:

```
command subcmd --input_ts=- < filename.csv
```

Because `--input_ts=-` is the default, the simplest form omits it entirely:

```
command subcmd < filename.csv
```

Commands can also be combined by piping:

```
command subcmd < filename.csv | command subcmd1 > fileout.csv
```

As Python Library:

You MUST use the `input_ts=...` option, where `input_ts` can be one of a pandas DataFrame, pandas Series, dict, tuple, list, StringIO, or file name.

If the result is a time series, a pandas DataFrame is returned.

• columns

[optional, defaults to all columns, input filter]

Columns to select out of input. Can use column names from the first line header or column numbers. If using numbers, column number 1 is the first data column. To pick multiple columns, separate them with commas and no spaces. Used as in the tstoolbox 'pick' command.

This lets you rearrange columns as the data is read in, so the input data set does not have to be created in a particular column order.

• start_date (str) –

[optional, defaults to first date in time-series, input filter]

The start_date of the series in ISOdatetime format, or ‘None’ for beginning.

• end_date (str) –

[optional, defaults to last date in time-series, input filter]

The end_date of the series in ISOdatetime format, or ‘None’ for end.

• dropna (str) –

[optional, defaults to ‘no’, input filter]

Set dropna to ‘any’ to drop records that have an NA value in any column, or to ‘all’ to drop records that have NA in all columns. Set to ‘no’ (the default) to keep all records.

• skiprows (list-like or integer or callable) –

[optional, default is None which will infer header from first line, input filter]

Line numbers to skip (0-indexed) or number of lines to skip (int) at the start of the file.

If callable, the callable function will be evaluated against the row indices, returning True if the row should be skipped and False otherwise. An example of a valid callable argument would be

`lambda x: x in [0, 2]`.

• index_type (str) –

[optional, default is ‘datetime’, output format]

Can be either ‘number’ or ‘datetime’. Use ‘number’ with index values that are Julian dates, or other epoch reference.

• names

[optional, default is None, input filter]

If None, the column names are taken from the first row after ‘skiprows’ from the input dataset.

• clean

[optional, default is False, input filter]

The ‘clean’ command will repair an index, removing duplicate index values and sorting.

• source_units

[optional, default is None, transformation]

If unit is specified for the column as the second field of a ‘:’ delimited column name, then the specified units and the ‘source_units’ must match exactly.

Any unit string compatible with the ‘pint’ library can be used.

• target_units

[optional, default is None, transformation]

The main purpose of this option is to convert units from those specified in the header line of the input into ‘target_units’.

The units of the input time-series or values are specified as the second field of a ‘:’ delimited name in the header line of the input or in the ‘source_units’ keyword.

Any unit string compatible with the ‘pint’ library can be used.

This option will also add the ‘target_units’ string to the column names.

• print_input

[optional, default is False, output format]

If set to ‘True’ will include the input columns in the output table.

• tablefmt (str) –

[optional, default is ‘csv’, output format]

The table format. Can be one of ‘csv’, ‘tsv’, ‘plain’, ‘simple’, ‘grid’, ‘pipe’, ‘orgtbl’, ‘rst’, ‘mediawiki’, ‘latex’, ‘latex_raw’ and ‘latex_booktabs’.
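As a usage sketch, `ewm_window` wraps the pandas `ewm` machinery; the following shows the equivalent pandas call for `statistic='mean'`. The tstoolbox invocations are shown only in comments and assume tstoolbox is installed; the column name 'flow' and the file `flow.csv` are hypothetical:

```python
import pandas as pd

# Hypothetical daily time series with a single 'flow' column.
df = pd.DataFrame(
    {"flow": [1.0, 2.0, 4.0, 8.0]},
    index=pd.date_range("2000-01-01", periods=4, freq="D"),
)

# Roughly equivalent to (hypothetical invocations):
#   from tstoolbox import tstoolbox
#   tstoolbox.ewm_window(input_ts=df, statistic="mean", alpha=0.5)
# or on the command line:
#   tstoolbox ewm_window --statistic=mean --alpha=0.5 < flow.csv
result = df.ewm(alpha=0.5, min_periods=0, adjust=True, ignore_na=False).mean()
```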