This work is made available under the Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) License.
This means you are free to:
Under the following terms:
Notices:
For the avoidance of doubt and explanation of terms please refer to the full license notice and legal code.
If you wish to use any of the material from this report please cite as:
This work is (c) 2019 the University of Southampton.
You may not be reading the most recent version of this report. Please check:
the overall package documentation;
This work was supported by:
The NZ GREEN Grid household electricity demand study recruited a sample of approximately 25 households in each of two regions of New Zealand (Stephenson et al. 2017). The first sample was recruited in early 2014 and the second in early 2015. Research data includes:
NB: Version 1 of the data package does not include the time-use diaries.
This report provides summary data quality statistics for the original GREEN Grid GridSpy household power demand monitoring data. This data was used to create a derived ‘safe’ dataset using the code in the GREENGridData repository.
The original data files are stored on the University of Otago’s High-Capacity Central File Storage (HCS).
Data collection is ongoing and this section reports on the availability of data files collected up to the time at which the most recent safe file was created (2018-08-02 18:03:19). To date we have 25,148 files from 44 unique GridSpy IDs.
However, a large number of files (14,929 or 59%) have one of two file sizes (43 or 2751 bytes). We have determined that these files contain no data, either because the monitoring devices have been removed (households have moved or withdrawn from the study) or because data transfer has failed. We therefore flag these files as ‘to be ignored’.
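The flagging rule described above can be illustrated with a minimal Python sketch. This is not the code used in the GREENGridData repository. The names `FileRecord`, `flag_files` and `share_ignored`, and the example IDs and file names, are hypothetical. Only the two 'no data' file sizes (43 and 2751 bytes) and the file counts are taken from the text.

```python
from dataclasses import dataclass

# File sizes (in bytes) reported in the text as containing no data.
NO_DATA_SIZES = {43, 2751}


@dataclass
class FileRecord:
    """Hypothetical record describing one downloaded GridSpy file."""
    gridspy_id: str   # e.g. "rf_01" (illustrative ID only)
    name: str
    size_bytes: int


def flag_files(records):
    """Pair each record with True if it should be flagged 'to be ignored'."""
    return [(r, r.size_bytes in NO_DATA_SIZES) for r in records]


def share_ignored(flagged):
    """Percentage of files flagged 'to be ignored' (0.0 for an empty list)."""
    if not flagged:
        return 0.0
    n_ignored = sum(1 for _, ignore in flagged if ignore)
    return 100.0 * n_ignored / len(flagged)


# Illustrative input: two 'no data' files and one file assumed to hold data.
example = [
    FileRecord("rf_01", "a.csv", 43),
    FileRecord("rf_01", "b.csv", 2751),
    FileRecord("rf_02", "c.csv", 150_000),
]
flags = [ignore for _, ignore in flag_files(example)]

# Check of the percentage quoted in the text: 14,929 of 25,148 files.
reported_share = round(100 * 14_929 / 25_148)
```

Matching on the two exact sizes follows the wording of the text. A single size threshold would be an alternative way to express the same rule, but its value is not given here.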
In addition, two of the GridSpy units were re-used in new households following the withdrawal of the original participants. The GridSpy IDs (rf_XX) remained unchanged despite being allocated to different households. The original input data therefore does not distinguish between these households. We discuss how this is resolved in the clean safe data in Section 4.1 below.
Figure 3.1 shows the distribution of the file sizes of all files over time by GridSpy ID. Note that white indicates the presence of small files which may not contain observations.
Figure 3.1: Mean file sizes (all files)
As we can see, relatively large files were downloaded (manually) in June and October 2016 before an automated download process was implemented from January 2017. A final manual download appears to have taken place in early December 2017.
Figure 3.2 plots the same results but excludes files that do not meet the file size threshold and that we therefore assume do not contain data.