Usage - Command Line
Just run ‘wdmtoolbox’ to get a list of subcommands:
usage: wdmtoolbox [-h] [-v]
{copydsnlabel,copydsn,cleancopywdm,renumberdsn,deletedsn,wdmtoswmm5rdii,extract,wdmtostd,describedsn,listdsns,createnewwdm,createnewdsn,hydhrseqtowdm,stdtowdm,csvtowdm,setattrib}
...
positional arguments:
{copydsnlabel,copydsn,cleancopywdm,renumberdsn,deletedsn,wdmtoswmm5rdii,extract,wdmtostd,describedsn,listdsns,createnewwdm,createnewdsn,hydhrseqtowdm,stdtowdm,csvtowdm,setattrib}
copydsnlabel Make a copy of a DSN label (no data).
copydsn Make a copy of a DSN.
cleancopywdm Make a clean copy of a WDM file.
renumberdsn Renumber olddsn to newdsn.
deletedsn Delete DSN.
wdmtoswmm5rdii Print out DSN data to the screen in SWMM5 RDII format.
extract Print out DSN data to the screen with ISO-8601 dates.
wdmtostd DEPRECATED: New scripts use 'extract'. Will be removed
in the future.
describedsn Print out attributes of a single DSN
listdsns Print out a table describing all DSNs in the WDM.
createnewwdm Create a new WDM file, optional to overwrite.
createnewdsn Create a new DSN.
hydhrseqtowdm Write HYDHR sequential file to a DSN.
stdtowdm DEPRECATED: Use 'csvtowdm'.
csvtowdm Write data from a CSV file to a DSN.
setattrib Set an attribute value for the DSN. See WDM
documentation for full list
options:
-h, --help show this help message and exit
-v, --version show program's version number and exit
The default for all of the subcommands that accept time-series data is to read from stdin (typically a pipe or redirection). If a subcommand accepts an input file as an argument, you can use “… --input_ts=input_file_name.csv …” or redirection “… < input_file_name.csv”.
A WDM file stores time-series associated with a Data Set Number (DSN). A DSN is a number between 1 and 32000, though HSPF can only use DSNs below 1000 for input and output. DSNs of 1000 and above should be used for calculation and observed time-series. The DSN must exist before being used.
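The DSN range described above can be checked before a subcommand is called. The following is a minimal sketch; `check_dsn` is a hypothetical helper written for illustration and is not part of wdmtoolbox. It enforces only the 1 to 32000 range stated here.

```python
def check_dsn(dsn):
    """Return dsn as an int if it is a valid WDM Data Set Number.

    Hypothetical helper (not part of wdmtoolbox); it only enforces the
    1 to 32000 range stated in this documentation.
    """
    dsn = int(dsn)
    if not 1 <= dsn <= 32000:
        raise ValueError(f"DSN must be between 1 and 32000, got {dsn}")
    return dsn


print(check_dsn("101"))  # prints 101
```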
Typical usage:
wdmtoolbox createnewwdm met.wdm
wdmtoolbox createnewdsn met.wdm 101 --tcode=3 --constituent=HPCP --tstype=HPCP --location=12345678 --description='NWS STATION 1' --scenario=INPUT
wdmtoolbox csvtowdm met.wdm 101 < nws_station_1.csv
To look at the DSN table:
wdmtoolbox listdsns met.wdm
You can also use “tsgettoolbox” to populate the DSN with data from various on-line sources. See the “tsgettoolbox” documentation for particulars on installation, but it may be as easy as “pip install tsgettoolbox”.
“tsgettoolbox” examples:
# Make a new wdm.
wdmtoolbox createnewwdm obs.wdm
# Create new DSN.
wdmtoolbox createnewdsn obs.wdm 10 --scenario SIMULATE --location 02232000 --constituent FLOW
# Download flow data for USGS station 02232000 and pipe into DSN.
# The --startDT option is required; otherwise only the latest value is
# returned.
tsgettoolbox nwis --sites=02232000 --parameterCd=00060 --startDT 2000-01-01 | wdmtoolbox csvtowdm obs.wdm 10
# List DSNs.
wdmtoolbox listdsns obs.wdm
# Plot the flow data in DSN 10.
wdmtoolbox extract obs.wdm 10 | tstoolbox plot
Sub-command Detail
cleancopywdm
usage: wdmtoolbox cleancopywdm [-h] [--overwrite] inwdmpath outwdmpath
Make a clean copy of a WDM file.
positional arguments:
inwdmpath Path and WDM filename of the input
WDM file.
outwdmpath Path and WDM filename of the output
WDM file.
options:
-h | --help
show this help message and exit
--overwrite
Whether to overwrite the target DSN if it exists.
copydsn
usage: wdmtoolbox copydsn [-h] [--overwrite] inwdmpath indsn outwdmpath outdsn
Make a copy of a DSN.
positional arguments:
inwdmpath Path and WDM filename of the input
WDM file.
indsn Source
DSN.
outwdmpath Path and WDM filename of the output
WDM file.
outdsn Target
DSN.
options:
-h | --help
show this help message and exit
--overwrite
Whether to overwrite the target DSN if it exists.
createnewdsn
usage: wdmtoolbox createnewdsn [-h] [--tstype TSTYPE] [--base_year BASE_YEAR]
[--tcode TCODE] [--tsstep TSSTEP] [--statid STATID] [--scenario SCENARIO]
[--location LOCATION] [--description DESCRIPTION] [--constituent
CONSTITUENT] [--tsfill TSFILL] wdmpath dsn
Create a new DSN.
positional arguments:
wdmpath Path and WDM
filename.
dsn The Data Set Number (DSN) for the time series in the WDM file.
This number must be greater than or equal to 1 and less than or equal to 32000.
HSPF can only use DSNs of 1 to 9999, inclusive, for input or output.
options:
-h | --help
show this help message and exit
--tstype TSTYPE
[optional, default to first 4 characters of 'constituent']
Time series type. Can be any 4 character string, but if not specified
defaults to first 4 characters of 'constituent'. Must match what is
used in HSPF UCI file.
Limited to 4 characters.
--base_year BASE_YEAR
[optional, defaults to 1900]
Base year of time series. The DSN will not accept any time-series before
this year and, with the default setting of TGROUP=6 (i.e. yearly),
allows time-series up to 2199.
--tcode TCODE
[optional, defaults to 4=daily time series]
Time series code, (1=second, 2=minute, 3=hour, 4=day, 5=month, 6=year)
--tsstep TSSTEP
[optional, defaults to 1]
Time series step; defaults to (and almost always is) 1.
--statid STATID
[optional, defaults to '']
The station name, limited to 16 characters.
--scenario SCENARIO
[optional, defaults to '']
The name of the scenario. Can be anything, but typically, 'OBSERVED' for
calibration and input time-series and 'SIMULATE' for model results.
Limited to 8 characters.
--location LOCATION
[optional, defaults to '']
The location name.
Limited to 8 characters.
--description DESCRIPTION
[optional, defaults to '']
Descriptive text.
Limited to 48 characters.
--constituent CONSTITUENT
[optional, defaults to '']
The constituent that the time series represents.
Limited to 8 characters.
--tsfill TSFILL
[optional, defaults to -999]
A time-series in a WDM file must have a value for every time interval. The
"tsfill" number is used as a placeholder for missing values.
Change to a number that is guaranteed to not be a valid number in your
time-series.
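The role of "tsfill" can be illustrated with a small sketch that fills missing days in a daily series with the placeholder value. `fill_daily_gaps` is a hypothetical helper written for this example, using only the standard library; it is not how wdmtoolbox itself is implemented.

```python
from datetime import date, timedelta

TSFILL = -999  # default placeholder value documented for --tsfill


def fill_daily_gaps(observations, tsfill=TSFILL):
    """Return a list of (date, value) pairs with every day present.

    `observations` maps datetime.date to a value. Days missing between the
    first and last date receive `tsfill`. Hypothetical helper for
    illustration only; it is not wdmtoolbox's implementation.
    """
    first, last = min(observations), max(observations)
    filled = []
    day = first
    while day <= last:
        filled.append((day, observations.get(day, tsfill)))
        day += timedelta(days=1)
    return filled


obs = {date(2000, 1, 1): 0.2, date(2000, 1, 4): 1.5}
for day, value in fill_daily_gaps(obs):
    print(day.isoformat(), value)
```

In this example, 2000-01-02 and 2000-01-03 receive -999. This is why the fill value should be a number that can never occur as real data in the series.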
createnewwdm
usage: wdmtoolbox createnewwdm [-h] [--overwrite] wdmpath
Create a new WDM file, optional to overwrite.
positional arguments:
wdmpath Path and WDM
filename.
options:
-h | --help
show this help message and exit
--overwrite
Whether to overwrite the target DSN if it exists.
csvtowdm
usage: wdmtoolbox csvtowdm [-h] [--start_date START_DATE]
[--end_date END_DATE] [--columns COLUMNS] [--force_freq FORCE_FREQ] [--groupby
GROUPBY] [--round_index ROUND_INDEX] [--clean] [--target_units TARGET_UNITS]
[--source_units SOURCE_UNITS] [--input_ts INPUT_TS] wdmpath dsn
File can have comma separated 'year', 'month', 'day', 'hour', 'minute',
'second', 'value' OR 'date/time string', 'value'
positional arguments:
wdmpath Path and WDM
filename.
dsn The Data Set Number (DSN) for the time series in the WDM file.
This number must be greater than or equal to 1 and less than or equal to 32000.
HSPF can only use DSNs of 1 to 9999, inclusive, for input or output.
options:
-h | --help
show this help message and exit
--start_date START_DATE
[optional, defaults to first date in time-series, input filter]
The start_date of the series in ISOdatetime format, or 'None' for
beginning.
--end_date END_DATE
[optional, defaults to last date in time-series, input filter]
The end_date of the series in ISOdatetime format, or 'None' for end.
--columns COLUMNS
[optional, defaults to all columns, input filter]
Columns to select out of input. Can use column names from the first line
header or column numbers. If using numbers, column number 1 is the
first data column. To pick multiple columns, separate them by commas with
no spaces. As used in toolbox_utils pick command.
This solves a big problem so that you don't have to create a data set with
a certain column order, you can rearrange columns when data is read
in.
--force_freq FORCE_FREQ
[optional, output format]
Force this frequency for the output. Typically you will only want to
enforce a smaller interval where toolbox_utils will insert missing
values as needed. WARNING: you may lose data if not careful with
this option. In general, letting the algorithm determine the
frequency should always work, but this option will override. Use
PANDAS offset codes.
--groupby GROUPBY
[optional, default is None, transformation]
The pandas offset code to group the time-series data into. A special code
is also available to group 'months_across_years' that will group
into twelve monthly categories across the entire time-series.
--round_index ROUND_INDEX
[optional, default is None which will do nothing to the index, output
format]
Round the index to the nearest time point. This can significantly improve
performance since it can cut down on memory and processing
requirements; however, be cautious about rounding to a very coarse
interval from a small one. This could lead to duplicate values in
the index.
--clean
[optional, default is False, input filter]
The 'clean' command will repair an input index, removing duplicate index
values and sorting.
--target_units TARGET_UNITS
[optional, default is None, transformation]
The purpose of this option is to specify target units for unit conversion.
The source units are specified in the header line of the input or
using the 'source_units' keyword.
The units of the input time-series or values are specified as the second
field of a ':' delimited name in the header line of the input or in
the 'source_units' keyword.
Any unit string compatible with the 'pint' library can be used.
This option will also add the 'target_units' string to the column names.
--source_units SOURCE_UNITS
[optional, default is None, transformation]
If unit is specified for the column as the second field of a ':' delimited
column name, then the specified units and the 'source_units' must
match exactly.
Any unit string compatible with the 'pint' library can be used.
--input_ts INPUT_TS
[optional though required if using within Python, default is '-' (stdin)]
Whether from a file or standard input, data requires a single line header
of column names. The default header is the first line of the input,
but this can be changed for CSV files using the 'skiprows' option.
Most common date formats can be used, but the closer to ISO 8601 date/time
standard the better.
Comma-separated values (CSV) files or tab-separated values (TSV):
File separators will be automatically detected.
Columns can be selected by name or index, where the index for
data columns starts at 1.
Command line examples:
┌─────────────────────────────────┬───────────────────────────┐
│ Keyword Example │ Description │
╞═════════════════════════════════╪═══════════════════════════╡
│ --input_ts=fn.csv │ read all columns from │
│ │ 'fn.csv' │
├─────────────────────────────────┼───────────────────────────┤
│ --input_ts=fn.csv,2,1 │ read data columns 2 and 1 │
│ │ from 'fn.csv' │
├─────────────────────────────────┼───────────────────────────┤
│ --input_ts=fn.csv,2,skiprows=2 │ read data column 2 from │
│ │ 'fn.csv', skipping first │
│ │ 2 rows so header is read │
│ │ from third row │
├─────────────────────────────────┼───────────────────────────┤
│ --input_ts=fn.xlsx,2,Sheet21 │ read all data from 2nd │
│ │ sheet, then all data from │
│ │ "Sheet21" of 'fn.xlsx' │
├─────────────────────────────────┼───────────────────────────┤
│ --input_ts=fn.hdf5,Table12,T2 │ read all data from table │
│ │ "Table12" then all data │
│ │ from table "T2" of │
│ │ 'fn.hdf5' │
├─────────────────────────────────┼───────────────────────────┤
│ --input_ts=fn.wdm,210,110 │ read DSNs 210, then 110 │
│ │ from 'fn.wdm' │
├─────────────────────────────────┼───────────────────────────┤
│ --input_ts='-' │ read all columns from │
│ │ standard input (stdin) │
├─────────────────────────────────┼───────────────────────────┤
│ --input_ts='-' --columns=4,1 │ read column 4 and 1 from │
│ │ standard input (stdin) │
╘═════════════════════════════════╧═══════════════════════════╛
If working with CSV or TSV files you can use redirection rather than use
--input_ts=fname.csv. The following are identical:
From a file:
command subcmd --input_ts=fname.csv
From standard input (since '--input_ts=-' is the default):
command subcmd < fname.csv
Can also combine commands by piping:
command subcmd < filein.csv | command subcmd1 > fileout.csv
Python library examples:
You must use the `input_ts=...` option where `input_ts` can be
one of a [pandas DataFrame, pandas Series, dict, tuple, list,
StringIO, or file name].
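As an illustration of the 'date/time string', 'value' layout with a single-line header, the following sketch builds such CSV text with the Python standard library. The function name `hourly_csv` and the column names `Datetime` and `HPCP` are assumptions chosen for this example; they are not required by wdmtoolbox.

```python
import csv
import io
from datetime import datetime, timedelta


def hourly_csv(start, values, value_name="HPCP"):
    """Return CSV text with a header line and ISO 8601 timestamps.

    Hypothetical helper for illustration; one row per hour from `start`.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Datetime", value_name])
    for step, value in enumerate(values):
        stamp = start + timedelta(hours=step)
        writer.writerow([stamp.isoformat(), value])
    return buffer.getvalue()


text = hourly_csv(datetime(2000, 1, 1), [0.0, 0.1, 0.0])
print(text, end="")
```

Text produced this way can be saved to a file and supplied through '--input_ts', or sent to the subcommand on standard input.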
deletedsn
usage: wdmtoolbox deletedsn [-h] wdmpath dsn
Delete DSN.
positional arguments:
wdmpath Path and WDM
filename.
dsn DSN to
delete.
options:
-h | --help
show this help message and exit
describedsn
usage: wdmtoolbox describedsn [-h] [--attrs ATTRS] [--tablefmt TABLEFMT]
wdmpath dsn
Print out attributes of a single DSN
positional arguments:
wdmpath Path and WDM
filename.
dsn The Data Set Number (DSN) for the time series in the WDM file.
This number must be greater than or equal to 1 and less than or equal to 32000.
HSPF can only use DSNs of 1 to 9999, inclusive, for input or output.
options:
-h | --help
show this help message and exit
--attrs ATTRS
[optional, default to "default"]
Attributes to retrieve from the DSN.
┌────────────────────┬─────────────────────────────────────────────┐
│ attrs │ Attributes Retrieved │
╞════════════════════╪═════════════════════════════════════════════╡
│ default │ DSN, TSSTEP, TCODE, TSFILL, IDLOCN, IDSCEN, │
│ │ IDCONS, TSBYR, STANAM, TSTYPE │
├────────────────────┼─────────────────────────────────────────────┤
│ all │ All attributes that are set, of the 450 total │
├────────────────────┼─────────────────────────────────────────────┤
│ comma separated │ Specific attributes named in the list │
│ list of attribute │ │
│ names │ │
╘════════════════════╧═════════════════════════════════════════════╛
--tablefmt TABLEFMT
[optional, default is 'csv', output format]
The table format. Can be one of 'csv', 'tsv', 'plain', 'simple', 'grid',
'pipe', 'orgtbl', 'rst', 'mediawiki', 'latex', 'latex_raw' and
'latex_booktabs'.
hydhrseqtowdm
usage: wdmtoolbox hydhrseqtowdm [-h] [--input_ts INPUT_TS]
[--start_century START_CENTURY] wdmpath dsn
Write HYDHR sequential file to a DSN.
positional arguments:
wdmpath Path and WDM
filename.
dsn The Data Set Number (DSN) for the time series in the WDM file.
This number must be greater than or equal to 1 and less than or equal to 32000.
HSPF can only use DSNs of 1 to 9999, inclusive, for input or output.
options:
-h | --help
show this help message and exit
--input_ts INPUT_TS
[optional though required if using within Python, default is '-' (stdin)]
Whether from a file or standard input, data requires a single line header
of column names. The default header is the first line of the input,
but this can be changed for CSV files using the 'skiprows' option.
Most common date formats can be used, but the closer to ISO 8601 date/time
standard the better.
Comma-separated values (CSV) files or tab-separated values (TSV):
File separators will be automatically detected.
Columns can be selected by name or index, where the index for
data columns starts at 1.
Command line examples:
┌─────────────────────────────────┬───────────────────────────┐
│ Keyword Example │ Description │
╞═════════════════════════════════╪═══════════════════════════╡
│ --input_ts=fn.csv │ read all columns from │
│ │ 'fn.csv' │
├─────────────────────────────────┼───────────────────────────┤
│ --input_ts=fn.csv,2,1 │ read data columns 2 and 1 │
│ │ from 'fn.csv' │
├─────────────────────────────────┼───────────────────────────┤
│ --input_ts=fn.csv,2,skiprows=2 │ read data column 2 from │
│ │ 'fn.csv', skipping first │
│ │ 2 rows so header is read │
│ │ from third row │
├─────────────────────────────────┼───────────────────────────┤
│ --input_ts=fn.xlsx,2,Sheet21 │ read all data from 2nd │
│ │ sheet, then all data from │
│ │ "Sheet21" of 'fn.xlsx' │
├─────────────────────────────────┼───────────────────────────┤
│ --input_ts=fn.hdf5,Table12,T2 │ read all data from table │
│ │ "Table12" then all data │
│ │ from table "T2" of │
│ │ 'fn.hdf5' │
├─────────────────────────────────┼───────────────────────────┤
│ --input_ts=fn.wdm,210,110 │ read DSNs 210, then 110 │
│ │ from 'fn.wdm' │
├─────────────────────────────────┼───────────────────────────┤
│ --input_ts='-' │ read all columns from │
│ │ standard input (stdin) │
├─────────────────────────────────┼───────────────────────────┤
│ --input_ts='-' --columns=4,1 │ read column 4 and 1 from │
│ │ standard input (stdin) │
╘═════════════════════════════════╧═══════════════════════════╛
If working with CSV or TSV files you can use redirection rather than use
--input_ts=fname.csv. The following are identical:
From a file:
command subcmd --input_ts=fname.csv
From standard input (since '--input_ts=-' is the default):
command subcmd < fname.csv
Can also combine commands by piping:
command subcmd < filein.csv | command subcmd1 > fileout.csv
Python library examples:
You must use the `input_ts=...` option where `input_ts` can be
one of a [pandas DataFrame, pandas Series, dict, tuple, list,
StringIO, or file name].
--start_century START_CENTURY
Because 2-digit years are used, the century is needed; defaults to 1900.
listdsns
usage: wdmtoolbox listdsns [-h] wdmpath
Print out a table describing all DSNs in the WDM.
positional arguments:
wdmpath Path and WDM
filename.
options:
-h | --help
show this help message and exit
renumberdsn
usage: wdmtoolbox renumberdsn [-h] wdmpath olddsn newdsn
Renumber olddsn to newdsn.
positional arguments:
wdmpath Path and WDM
filename.
olddsn Old DSN to
renumber.
newdsn New DSN to change old DSN
to.
options:
-h | --help
show this help message and exit
extract
usage: wdmtoolbox extract [-h] [--start_date START_DATE] [--end_date END_DATE]
[wdmpath ...]
Print out DSN data to the screen with ISO-8601 dates.
positional arguments:
wdmpath Path and WDM
filename, followed by a space-separated list of DSNs. For example:
'file.wdm 234 345 456'
OR
`wdmpath` can be space separated sets of 'wdmpath,dsn'.
'file.wdm,101 file2.wdm,104 file.wdm,227'
options:
-h | --help
show this help message and exit
--start_date START_DATE
[optional, defaults to first date in time-series, input filter]
The start_date of the series in ISOdatetime format, or 'None' for
beginning.
--end_date END_DATE
[optional, defaults to last date in time-series, input filter]
The end_date of the series in ISOdatetime format, or 'None' for end.
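The two argument layouts that 'extract' accepts for `wdmpath` can be sketched as a small parser that turns either form into (path, DSN) pairs. `parse_extract_args` is a hypothetical helper written for this example and is not wdmtoolbox's own argument handling.

```python
def parse_extract_args(args):
    """Return a list of (wdmpath, dsn) pairs from extract-style arguments.

    Hypothetical helper, not wdmtoolbox's own parser. It accepts either
    ['file.wdm', '234', '345'] or ['file.wdm,101', 'file2.wdm,104'].
    """
    if not args:
        raise ValueError("need a WDM path and at least one DSN")
    # Second form: every argument is a 'wdmpath,dsn' set.
    if all("," in str(arg) for arg in args):
        pairs = []
        for arg in args:
            path, dsn = str(arg).rsplit(",", 1)
            pairs.append((path, int(dsn)))
        return pairs
    # First form: one path followed by a list of DSNs.
    path, *dsns = args
    return [(str(path), int(dsn)) for dsn in dsns]


print(parse_extract_args(["file.wdm", "234", "345", "456"]))
print(parse_extract_args(["file.wdm,101", "file2.wdm,104", "file.wdm,227"]))
```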
wdmtostd
usage: wdmtoolbox wdmtostd [-h] wdmpath [dsns ...] kwds
DEPRECATED: New scripts use 'extract'. Will be removed in the future.
positional arguments:
wdmpath dsns kwds
options:
-h | --help
show this help message and exit
wdmtoswmm5rdii
usage: wdmtoolbox wdmtoswmm5rdii [-h] wdmpath [dsns ...] kwds
Print out DSN data to the screen in SWMM5 RDII format.
positional arguments:
wdmpath Path and WDM
filename.
dsns kwds
options:
-h | --help
show this help message and exit
Usage - API
You can use all of the command line subcommands as functions. The function signature is identical to the command line subcommands.
Returns:
wdmtoolbox.extract returns a PANDAS DataFrame.
wdmtoolbox.listdsns returns a Python dictionary.
Almost all of the remaining functions do not return anything.
Input can be a comma- or tab-separated file or a PANDAS DataFrame, and is supplied to the function via the ‘input_ts’ keyword.
Simply import wdmtoolbox:
from wdmtoolbox import wdmtoolbox
# Then you could call the functions
ntsd = wdmtoolbox.extract('test.wdm', 4)
# Once you have a PANDAS DataFrame you can use that as input.
# For example, use 'tstoolbox' to aggregate...
from tstoolbox import tstoolbox
ntsd = tstoolbox.aggregate(statistic='mean', agg_interval='daily', input_ts=ntsd)