Welcome to nemweb’s documentation!¶
This is a Python package for directly downloading and processing AEMO files from http://www.nemweb.com.au/. The main module within the package downloads the nemweb files and inserts their tables into a local sqlite database.
The key modules are found below:
nemweb¶
nemfile_reader¶
reading nemfiles and zipped nemfiles into pandas dataframes
-
class
nemweb.nemfile_reader.
ZipFileStreamer
(filename)¶ Bases:
zipfile.ZipFile
ZipFile subclass with a method to extract a member of the archive to memory as a byte stream
-
extract_stream
(member)¶ Extract a member from the archive as a byte stream or string stream, using its full name. ‘member’ may be a filename or a ZipInfo object.
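A minimal sketch of how such a subclass might look, assuming extract_stream simply wraps the standard ZipFile.read call in a BytesIO object. The class name InMemoryZipFile and the sample archive are hypothetical, and the package's own implementation may differ.

```python
import io
import zipfile


class InMemoryZipFile(zipfile.ZipFile):
    """Hypothetical ZipFile subclass that extracts members to memory."""

    def extract_stream(self, member):
        # ZipFile.read accepts a member name or a ZipInfo object
        # and returns the uncompressed bytes of that member.
        return io.BytesIO(self.read(member))


# Build a small archive in memory to demonstrate (illustrative data only).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("example.csv", "C,example\n")

with InMemoryZipFile(buffer) as archive:
    stream = archive.extract_stream("example.csv")
    content = stream.read()
```

Nothing is written to disk here: the member is decompressed straight into a stream that can be handed to a reader.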
-
-
nemweb.nemfile_reader.
nemfile_reader
(nemfile_object)¶ Returns a dict containing a pandas dataframe for each table in a nemfile. The file object needs to be an unzipped csv (nemfile), and can be either a file or an in-memory stream file object.
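The idea of splitting one nemfile into several tables can be sketched with the standard library alone. This sketch rests on assumptions not stated on this page: that the first column holds a record type ("C" for comment, "I" for a table header, "D" for data), that the table name is the 2nd and 3rd columns joined with an underscore, and that field names start at the 5th column. It returns lists of dicts rather than pandas dataframes, and the function name read_nemfile_tables and the sample rows are hypothetical.

```python
import csv
import io
from collections import defaultdict


def read_nemfile_tables(nemfile_object):
    """Group the data rows of a nemfile by table name (sketch only)."""
    headers = {}
    tables = defaultdict(list)
    for row in csv.reader(nemfile_object):
        if len(row) < 4:
            continue  # too short to carry a table name and a version field
        record_type, table_name = row[0], f"{row[1]}_{row[2]}"
        if record_type == "I":
            # Header row: remember the field names for this table.
            headers[table_name] = row[4:]
        elif record_type == "D" and table_name in headers:
            # Data row: pair each value with its field name.
            tables[table_name].append(dict(zip(headers[table_name], row[4:])))
    return dict(tables)


# Illustrative sample, not real AEMO data.
sample = io.StringIO(
    "C,NEMP.WORLD,DISPATCHSCADA,AEMO,PUBLIC\n"
    "I,DISPATCH,UNIT_SCADA,1,SETTLEMENTDATE,DUID,SCADAVALUE\n"
    'D,DISPATCH,UNIT_SCADA,1,"2018/06/20 11:35:00",UNIT_A,12.5\n'
    'D,DISPATCH,UNIT_SCADA,1,"2018/06/20 11:35:00",UNIT_B,0\n'
    "C,END OF REPORT,4\n"
)
tables = read_nemfile_tables(sample)
```

Each list of row dicts could be passed to pandas.DataFrame to obtain one dataframe per table.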
-
nemweb.nemfile_reader.
nemzip_reader
(nemzip_object)¶ Returns a dict containing a pandas dataframe for each table in a zipped nemfile. The file object needs to be a zipped csv (nemzip), and can be either a file or an in-memory stream file object. The function checks that there is only one file to unzip, unzips it to a nemfile (csv) in memory, and passes the resulting nemfile_object to nemfile_reader.
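The single-file check and in-memory unzip described above might look like the following sketch. The function name unzip_single_member, the choice of ValueError for the failure case, and the sample archive are assumptions made for illustration.

```python
import io
import zipfile


def unzip_single_member(nemzip_object):
    """Return the only member of a zip archive as a BytesIO stream."""
    with zipfile.ZipFile(nemzip_object) as archive:
        names = archive.namelist()
        if len(names) != 1:
            # Assumed behaviour: refuse archives with zero or several files.
            raise ValueError(f"expected exactly one file, found {len(names)}")
        return io.BytesIO(archive.read(names[0]))


# Illustrative archive built in memory (not a real nemweb file).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("PUBLIC_EXAMPLE.CSV", "C,example\n")

nemfile_stream = unzip_single_member(buffer)
nemfile_bytes = nemfile_stream.read()
```

The stream holds bytes, so a csv parser working on text would need it wrapped first, for example with io.TextIOWrapper.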
-
nemweb.nemfile_reader.
zip_streams
(fileobject)¶ Generator that yields each member of a zipfile as a BytesIO stream. Can take a filename or file-like object (BytesIO object) as an argument.
nemweb_current¶
Module for downloading data from different ‘CURRENT’ nemweb datasets (selected datasets from http://www.nemweb.com.au/Reports/CURRENT)
The module includes one main superclass for handling generic nemweb current files. A series of namedtuples (stored in the global constant DATASETS) contains the relevant data for specific datasets. Datasets included from the ‘CURRENT’ index page:
- TradingIS_Reports
- DispatchIS_Reports
- Dispatch_SCADA
- Next_Day_Dispatch (DISPATCH_UNIT_SOLUTION)
- Next_Day_Actual_Gen (METER_DATA_GEN_DUID)
- ROOFTOP_PV/ACTUAL
-
nemweb.nemweb_current.
CurrentDataset
¶ alias of
nemweb.nemweb_current.NemwebCurrentFile
-
class
nemweb.nemweb_current.
CurrentFileHandler
¶ Bases:
object
Class for handling ‘CURRENT’ nemweb files from http://www.nemweb.com.au. Requires a ‘CurrentDataset’ namedtuple with the following fields:
- nemweb_name: the name of the dataset to be downloaded (e.g. Dispatch_SCADA)
- filename_pattern: a regular expression that matches the filename on nemweb and captures its datetime. For example, for files in the Dispatch_SCADA dataset (e.g. “PUBLIC_DISPATCHSCADA_201806201135_0000000296175732.zip”) the regex filename_pattern is PUBLIC_DISPATCHSCADA_([0-9]{12})_[0-9]{16}.zip
- the format string used to parse the datetime. In the above example, the match returns ‘201806201135’, so the format string is “%Y%m%d%H%M”.
- the list of tables to insert from each dataset. Table names are derived from the 2nd and 3rd columns in the nemweb dataset. For example, in Dispatch_SCADA the 2nd column is “DISPATCH”, the 3rd is “SCADA_VALUE”, and the table name is “DISPATCH_UNIT_SCADA”.
Several datasets contain multiple tables. Examples can be found in the DATASETS dict (nemweb_reader.DATASETS)
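The filename matching and datetime parsing described in the fields above can be tried out directly. In this sketch the namedtuple ExampleDataset is hypothetical, and its field names datetime_format and tables are placeholders, since this page does not name those two fields. The pattern, filename and format string are the ones quoted above.

```python
import re
from collections import namedtuple
from datetime import datetime

# Hypothetical stand-in for the package's namedtuple; the last two
# field names are placeholders chosen for this example.
ExampleDataset = namedtuple(
    "ExampleDataset",
    ["nemweb_name", "filename_pattern", "datetime_format", "tables"],
)

dispatch_scada = ExampleDataset(
    nemweb_name="Dispatch_SCADA",
    filename_pattern=r"PUBLIC_DISPATCHSCADA_([0-9]{12})_[0-9]{16}.zip",
    datetime_format="%Y%m%d%H%M",
    tables=["DISPATCH_UNIT_SCADA"],
)

filename = "PUBLIC_DISPATCHSCADA_201806201135_0000000296175732.zip"

# The first capture group holds the 12-digit timestamp.
match = re.match(dispatch_scada.filename_pattern, filename)
file_datetime = datetime.strptime(match.group(1), dispatch_scada.datetime_format)
```

Here file_datetime is 20 June 2018, 11:35, taken from the captured group ‘201806201135’.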
-
download
(link)¶ Downloads a nemweb zipfile from link into memory as a BytesIO object. A nemfile object is returned from the BytesIO object.
-
update_data
(dataset, print_progress=False, start_date=None, end_date='30001225', db_name='nemweb_live.db')¶ Main method to process nemweb dataset:
- downloads the index page for the dataset
- determines date to start downloading from
- matches the start date against files in the index
- inserts new files into database
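The step of matching a start date against the files in the index can be sketched without any network access. The function files_to_download is hypothetical, the index text is an illustrative stand-in for a downloaded index page, and treating both ends of the date range as inclusive is an assumption.

```python
import re
from datetime import datetime


def files_to_download(index_text, filename_pattern, datetime_format, start, end):
    """Return sorted (timestamp, filename) pairs that fall within [start, end]."""
    selected = {}
    for match in re.finditer(filename_pattern, index_text):
        timestamp = datetime.strptime(match.group(1), datetime_format)
        if start <= timestamp <= end:
            # A dict drops duplicates when a filename appears twice on a page.
            selected[match.group(0)] = timestamp
    return sorted((timestamp, name) for name, timestamp in selected.items())


# Illustrative stand-in for an index page (filenames are made up).
index_text = """
PUBLIC_DISPATCHSCADA_201806201130_0000000296175001.zip
PUBLIC_DISPATCHSCADA_201806201135_0000000296175732.zip
PUBLIC_DISPATCHSCADA_201806201140_0000000296176003.zip
"""

new_files = files_to_download(
    index_text,
    r"PUBLIC_DISPATCHSCADA_([0-9]{12})_[0-9]{16}.zip",
    "%Y%m%d%H%M",
    start=datetime(2018, 6, 20, 11, 35),
    end=datetime.strptime("30001225", "%Y%m%d"),  # the default end_date above
)
new_filenames = [name for _, name in new_files]
```

Each selected filename would then be downloaded and its tables inserted into the database.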
-
nemweb.nemweb_current.
update_datasets
(datasets, print_progress=False)¶ Function that updates a subset of the datasets contained in DATASETS, passed as a list.
nemweb_sqlite¶
interfaces with sqlite3 database
-
nemweb.nemweb_sqlite.
insert
(dataframe, table_name, db_name='nemweb_live.db')¶ Inserts a dataframe into a table (table_name) in an sqlite3 database (db_name). The database directory needs to be specified in the config.ini file.
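As a rough illustration of appending rows to a table in sqlite3, the following sketch uses only the standard library and an in-memory database. The helper insert_rows and the sample rows are hypothetical, and plain tuples stand in for the dataframe. With pandas, DataFrame.to_sql with if_exists="append" is a common way to achieve the same; this page does not state which approach the package uses.

```python
import sqlite3


def insert_rows(connection, table_name, columns, rows):
    """Append rows to a table, creating it first if it does not exist."""
    column_list = ", ".join(f'"{name}"' for name in columns)
    placeholders = ", ".join("?" for _ in columns)
    with connection:  # commits on success, rolls back on error
        connection.execute(
            f'CREATE TABLE IF NOT EXISTS "{table_name}" ({column_list})'
        )
        connection.executemany(
            f'INSERT INTO "{table_name}" ({column_list}) VALUES ({placeholders})',
            rows,
        )


# In-memory database and made-up rows, for illustration only.
connection = sqlite3.connect(":memory:")
insert_rows(
    connection,
    "DISPATCH_UNIT_SCADA",
    ["SETTLEMENTDATE", "DUID", "SCADAVALUE"],
    [
        ("2018/06/20 11:35:00", "UNIT_A", 12.5),
        ("2018/06/20 11:35:00", "UNIT_B", 0.0),
    ],
)
row_count = connection.execute(
    'SELECT COUNT(*) FROM "DISPATCH_UNIT_SCADA"'
).fetchone()[0]
```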
-
nemweb.nemweb_sqlite.
start_from
(table_name, db_name='nemweb_live.db', timestamp_col='SETTLEMENTDATE', start_date=None)¶ Returns a date to start downloading data from. Tries to determine the latest date from the table in the database. On failure, prompts the user to input a date.
-
nemweb.nemweb_sqlite.
table_latest_record
(table_name, db_name='nemweb_live.db', timestamp_col='SETTLEMENTDATE')¶ Returns the latest timestamp from a table in an sqlite3 database as a datetime object.
Timestamp fields in nemweb files are usually named “SETTLEMENTDATE”. Sometimes INTERVAL_DATETIME is used.
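A query for the latest timestamp might be sketched as follows. This assumes timestamps are stored as text in the form "YYYY/MM/DD HH:MM:SS", so that the largest string is also the latest time. The function latest_timestamp, the storage format and the sample rows are assumptions for illustration.

```python
import sqlite3
from datetime import datetime


def latest_timestamp(connection, table_name, timestamp_col="SETTLEMENTDATE"):
    """Return the most recent timestamp in a table as a datetime object."""
    query = f'SELECT MAX("{timestamp_col}") FROM "{table_name}"'
    (value,) = connection.execute(query).fetchone()
    # Assumed storage format; text in this form sorts chronologically.
    return datetime.strptime(value, "%Y/%m/%d %H:%M:%S")


# In-memory database with made-up rows.
connection = sqlite3.connect(":memory:")
connection.execute('CREATE TABLE "DISPATCH_UNIT_SCADA" ("SETTLEMENTDATE", "DUID")')
connection.executemany(
    'INSERT INTO "DISPATCH_UNIT_SCADA" VALUES (?, ?)',
    [
        ("2018/06/20 11:30:00", "UNIT_A"),
        ("2018/06/20 11:40:00", "UNIT_A"),
        ("2018/06/20 11:35:00", "UNIT_A"),
    ],
)
latest = latest_timestamp(connection, "DISPATCH_UNIT_SCADA")
```

For a table keyed on INTERVAL_DATETIME, the column name would be passed as timestamp_col instead.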