This report uses the safe version of the Grid Spy 1 minute data, which has been processed using the code in https://github.com/CfSOtago/GREENGridData/tree/master/dataProcessing/gridSpy. It also assumes you have already run the example circuit extraction script with circuit = Heat Pump.
Report purpose:
The data used to generate this report is:
First we load the household data. readr will give some feedback on the columns.
Next we load the Grid Spy extract for Heat Pump. This uses a GREENGridData package function intended to load the cleaned individual household data, which warns that two of the column names are not found. These columns were dropped during the extraction process, so we can safely ignore these warnings.
hhID | linkID | r_dateTime | circuit | powerW |
rf_08 | rf_08 | 2015-04-01 00:00:00 | Heat Pump$2092 | 0 |
rf_08 | rf_08 | 2015-04-01 00:01:00 | Heat Pump$2092 | 0 |
rf_08 | rf_08 | 2015-04-01 00:02:00 | Heat Pump$2092 | 0 |
rf_08 | rf_08 | 2015-04-01 00:03:00 | Heat Pump$2092 | 0 |
rf_08 | rf_08 | 2015-04-01 00:04:00 | Heat Pump$2092 | 0 |
rf_08 | rf_08 | 2015-04-01 00:05:00 | Heat Pump$2092 | 0 |
Table 3.1 shows the first few rows of the Grid Spy 1 minute power data.
hhID | linkID | r_dateTime | circuit | powerW |
Length:14250284 | Length:14250284 | Min. :2015-04-01 00:00:00 | Length:14250284 | Min. : -655.00 |
Class :character | Class :character | 1st Qu.:2015-06-22 12:39:00 | Class :character | 1st Qu.: 0.00 |
Mode :character | Mode :character | Median :2015-09-16 13:12:00 | Mode :character | Median : 0.00 |
NA | NA | Mean :2015-09-21 08:00:39 | NA | Mean : 147.92 |
NA | NA | 3rd Qu.:2015-12-17 17:52:00 | NA | 3rd Qu.: 61.29 |
NA | NA | Max. :2016-03-31 23:59:00 | NA | Max. :27759.00 |
Table 3.2 shows a summary of the Grid Spy 1 minute power data.
Note that we have some negative power values (negawatts). Which households have them?
hhID | linkID | r_dateTime | circuit | powerW | obsTime | |
Length:13851941 | Length:13851941 | Min. :2015-04-01 00:00:00 | Length:13851941 | Min. : 0.00 | Length:13851941 | Length:13851941 |
Class :character | Class :character | 1st Qu.:2015-06-21 15:43:00 | Class :character | 1st Qu.: 0.00 | Class1:hms | Class :character |
Mode :character | Mode :character | Median :2015-09-16 00:11:00 | Mode :character | Median : 0.00 | Class2:difftime | Mode :character |
NA | NA | Mean :2015-09-20 18:47:53 | NA | Mean : 154.28 | Mode :numeric | NA |
NA | NA | 3rd Qu.:2015-12-17 10:44:00 | NA | 3rd Qu.: 70.19 | NA | NA |
NA | NA | Max. :2016-03-31 23:59:00 | NA | Max. :27759.00 | NA | NA |
Table 3.3 shows a summary of the Grid Spy 1 minute power data after the removal of any negaWatts.
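The check and removal step described above can be sketched roughly as follows. This is a minimal illustration on toy data, not the code used for this report: it assumes removal is done row by row (rather than by dropping whole households), and the values in the toy table are invented.

```r
library(data.table)

# Toy stand-in for the Grid Spy extract (illustrative values only, not real data)
gsDT <- data.table(
  linkID = c("rf_01", "rf_01", "rf_02", "rf_02", "rf_03"),
  powerW = c(0, -655, 120, 61.29, -3)
)

# Which households have negative power values, how many, and how large?
negDT <- gsDT[powerW < 0, .(nNeg = .N, minW = min(powerW)), keyby = linkID]
print(negDT)

# Assumed removal rule: keep only non-negative observations
# (note that this comparison also drops any rows where powerW is NA)
gsCleanDT <- gsDT[powerW >= 0]
summary(gsCleanDT$powerW)
```

Counting negative observations per household first makes it easier to judge whether the problem is confined to a few circuits before any rows are discarded.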
Note that:
First we create a Southern Hemisphere season variable. Luckily we have a function to do this in the GREENGridData package. We print a check table to ensure we are all happy with the coding of season.
gsDT <- GREENGridData::addNZSeason(gsDT)
table(lubridate::month(gsDT$r_dateTime, label = TRUE), gsDT$season, useNA = "always")
For simplicity we will focus only on Summer and Winter.
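For readers without the package to hand, the idea behind the season coding and the Summer/Winter restriction can be sketched as below. This is not the GREENGridData::addNZSeason implementation: the month-to-season lookup (December to February as Summer, and so on) is an assumption made for illustration, and the toy dates are invented.

```r
library(data.table)

# Assumed month-to-season lookup for the Southern Hemisphere (index 1 = January).
# This is an illustrative coding, not necessarily the one used by addNZSeason().
seasonLookup <- c("Summer", "Summer", "Autumn", "Autumn", "Autumn", "Winter",
                  "Winter", "Winter", "Spring", "Spring", "Spring", "Summer")

# Toy timestamps (not real observations)
toyDT <- data.table(
  r_dateTime = as.POSIXct(c("2015-04-01 00:00:00", "2015-07-15 12:00:00",
                            "2015-10-01 08:30:00", "2016-01-20 18:00:00"),
                          tz = "Pacific/Auckland")
)

# Code season from the calendar month of each observation
toyDT[, season := seasonLookup[data.table::month(r_dateTime)]]

# Check table: months against seasons, as in the report
table(data.table::month(toyDT$r_dateTime), toyDT$season, useNA = "always")

# Keep only the two seasons of interest
seasonDT <- toyDT[season %in% c("Summer", "Winter")]
```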
This section plots overall mean power per 15 minutes by season.
gsDT <- gsDT[, r_dateTimeQHour := lubridate::floor_date(r_dateTime, unit = "15 mins")]
# create mean power across 15 minute periods to use as base dataset (comparable to SAVE)
qHourDT <- gsDT[, .(meanW = mean(powerW)), keyby = .(r_dateTimeQHour, linkID, season)]
qHourDT <- qHourDT[, obsQHour := hms::as.hms(r_dateTimeQHour)]
plotDT <- qHourDT[, .(meanW = mean(meanW)), keyby = .(season, obsQHour)]
# set attributes for plot
vLineAlpha <- 0.4
vLineCol <- "#0072B2" # http://www.cookbook-r.com/Graphs/Colors_(ggplot2)/#a-colorblind-friendly-palette
timeBreaks <- c(hms::as.hms("04:00:00"),
# create default caption
myCaption <- paste0("GREENGrid Grid Spy household electricity demand data (https://dx.doi.org/10.5255/UKDA-SN-853334)",
"\n", min(lubridate::date(gsDT$r_dateTime)),
" to ", max(lubridate::date(gsDT$r_dateTime)),
"\nTime = Pacific/Auckland",
"\n (c) ", lubridate::year(lubridate::now()), " University of Otago") # namespace now() so lubridate need not be attached
myPlot <- ggplot2::ggplot(plotDT[!is.na(season)], # make sure no un-set seasons/non-parsed dates
aes(x = obsQHour, y = meanW/1000)) +
geom_line() +
facet_grid(season ~ .) +
scale_colour_manual(values=ggParams$cbPalette) + # use colour-blind friendly palette
theme(strip.text.y = element_text(angle = 0, vjust = 0.5, hjust = 0.5)) +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 0.5)) +
guides(colour = guide_legend(title = "Season: ")) +
theme(legend.position = "bottom") +
labs(title = paste0(params$circuit, ": seasonal mean power demand profiles"),
y = "Mean kW per 15 minutes",
x = "Time of day",
caption = myCaption)
myPlot +
scale_x_time(breaks = timeBreaks) +
geom_vline(xintercept = timeBreaks, alpha = vLineAlpha, colour = vLineCol)
Figure 4.1: Demand profile plot
#ggplot2::ggsave(paste0(ggParams$repoLoc,"/examples/outputs/", params$circuit, "_meankWperminBySeason.png"))
Figure 4.1 shows the overall mean kW per 15 minute period in each season for this circuit (Heat Pump).
Table 4.1 shows the number of households who have different numbers of people (children and adults). This table includes households where we do not know the number of people (NA) but we do have electricity demand data.
Q57 | Freq |
NA | 2 |
1 | 4 |
2 | 11 |
3 | 10 |
4 | 10 |
5 | 5 |
6 | 2 |
Clearly this is too fine grained (too many categories). We therefore collapse the categories to form the coding shown in Table 4.2.
nPeople | Freq |
NA | 2 |
1 | 4 |
2 | 11 |
3 | 10 |
4+ | 17 |
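The collapse from Table 4.1 to Table 4.2 can be sketched as follows. The toy table simply reproduces the Q57 frequencies reported above. Treating Q57 as a numeric column and using ifelse() for the recode are assumptions for illustration, not necessarily how the report's own code does it.

```r
library(data.table)

# Toy household table reproducing the Q57 frequencies in Table 4.1
# (assumes Q57 is stored as a number; NA = number of people unknown)
hhDT <- data.table(
  Q57 = c(rep(NA, 2), rep(1, 4), rep(2, 11), rep(3, 10),
          rep(4, 10), rep(5, 5), rep(6, 2))
)

# Collapse 4, 5 and 6 people into a single "4+" category; NA stays NA
hhDT[, nPeople := ifelse(Q57 >= 4, "4+", as.character(Q57))]

# Check the new coding against the original
table(hhDT$Q57, hhDT$nPeople, useNA = "always")
```

Cross-tabulating the old and new variables, as in the last line, is a quick way to confirm that no category has been miscoded.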
Now we link (join) the Grid Spy and household data.tables and aggregate (summarise) by season and number of people. You can do this using data.table’s on-the-fly join, but we have found pre-joining only the columns you want to be much faster. We’re not sure why, as it shouldn’t be. You can probably also do this in dplyr etc. but we haven’t tried.
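A minimal sketch of the pre-join and aggregation just described is given below, using toy data. The column otherVar and all values are hypothetical, and the join is written with an explicit on = "linkID" so that it does not depend on keys having been set beforehand.

```r
library(data.table)

# Toy quarter-hour power data (illustrative values only)
qHourDT <- data.table(
  linkID = rep(c("rf_01", "rf_02", "rf_03"), each = 2),
  season = rep(c("Summer", "Winter"), times = 3),
  meanW = c(100, 300, 200, 400, 50, 150)
)

# Toy household data; otherVar is a hypothetical column we do not need
hhDT <- data.table(
  linkID = c("rf_01", "rf_02", "rf_03"),
  nPeople = c("2", "2", "4+"),
  otherVar = c("a", "b", "c")
)

# Pre-join: attach only the household columns we want to each power row
keepCols <- c("linkID", "nPeople")
mergedDT <- hhDT[, ..keepCols][qHourDT, on = "linkID"]

# Aggregate by season and number of people, keeping track of group sizes
plotDT <- mergedDT[, .(meanW = mean(meanW), nHHs = uniqueN(linkID)),
                   keyby = .(season, nPeople)]
print(plotDT)
```

Carrying nHHs through the aggregation keeps the number of households behind each plotted line visible alongside the means.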
Figure 4.2 shows the mean kW per season by number of people in the household for this circuit (Heat Pump). Can you see anything interesting or unusual, and might this be due to the numbers of households in each group?
Figure 4.2: Demand profile plot - n people
nPeople | nHHs |
1 | 2 |
2 | 4 |
3 | 8 |
4+ | 12 |
This section plots overall mean power per 15 minutes by season and number of children aged 0-12 as an illustration of how to link the Grid Spy and household data. We will go through the steps with commentary, showing the code as we go.
Table 4.4 shows the number of households who have different numbers of children aged 0-12 so we know how many households make up each line on the plot. This table includes households where we do not know the number of children (NA) but we do have electricity demand data.
nChildren0_12 | Freq |
NA | 2 |
0 | 17 |
1 | 11 |
2 | 10 |
3 | 4 |
presenceChildren | Freq |
0 children | 19 |
1+ child | 25 |
Now use the aggregated data.table to make the plot. Note that the code filters out households where nChildren0_12 is NA (see Table 4.4), so no line is drawn for them.
keepCols <- c("linkID", "nChildren0_12")
mergedDT <- qHourDT[hhDT[, ..keepCols], on = "linkID"] # join explicitly on linkID rather than relying on keys
plotDT <- mergedDT[!is.na(nChildren0_12), .(meanW = mean(meanW),
sdW = sd(meanW),
nObs = .N), keyby = .(season, obsQHour, nChildren0_12)]
plotDT <- plotDT[, ci_upper := meanW + qnorm(0.975)*(sdW/sqrt(nObs))]
plotDT <- plotDT[, ci_lower := meanW - qnorm(0.975)*(sdW/sqrt(nObs))]
basePlot <- ggplot2::ggplot(plotDT[!is.na(season)], # make sure no un-set seasons/non-parsed dates
aes(x = obsQHour, y = meanW/1000,
colour = as.factor(nChildren0_12))) +
geom_line() +
scale_colour_manual(values=ggParams$cbPalette) + # use colour-blind friendly palette
facet_grid(season ~ .) +
theme(strip.text.y = element_text(angle = 0, vjust = 0.5, hjust = 0.5)) +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 0.5)) +
guides(colour = guide_legend(title = "Number of children aged 0 - 12: ")) +
theme(legend.position = "bottom") +
labs(title = paste0(params$circuit, ": seasonal mean power demand profiles by n children aged 0-12"),
y = "Mean kW per 15 minutes",
x = "Time of day",
caption = myCaption)
basePlot <- basePlot +
scale_x_time(breaks = timeBreaks) +
geom_vline(xintercept = timeBreaks, alpha = vLineAlpha, colour = vLineCol)
Figure 4.3: Demand profile plot - n kids
# add 95% CI
ciPlot <- basePlot + geom_errorbar(aes(ymin = ci_lower/1000, ymax = ci_upper/1000))
# use reduced
keepCols <- c("linkID", "presenceChildren")
mergedDT <- qHourDT[hhDT[, ..keepCols], on = "linkID"] # join explicitly on linkID rather than relying on keys
plotDT <- mergedDT[!is.na(presenceChildren), .(meanW = mean(meanW),
sdW = sd(meanW),
nObs = .N), keyby = .(season, obsQHour, presenceChildren)]
plotDT <- plotDT[, ci_upper := meanW + qnorm(0.975)*(sdW/sqrt(nObs))]
plotDT <- plotDT[, ci_lower := meanW - qnorm(0.975)*(sdW/sqrt(nObs))]
basePlot <- ggplot2::ggplot(plotDT[!is.na(season)], # make sure no un-set seasons/non-parsed dates
aes(x = obsQHour, y = meanW/1000,
colour = as.factor(presenceChildren))) +
geom_line() +
scale_colour_manual(values=ggParams$cbPalette) + # use colour-blind friendly palette
facet_grid(season ~ .) +
theme(strip.text.y = element_text(angle = 0, vjust = 0.5, hjust = 0.5)) +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 0.5)) +
guides(colour = guide_legend(title = "Number of children aged 0 - 12: ")) +
theme(legend.position = "bottom") +
labs(title = paste0(params$circuit, ": seasonal mean power demand profiles by n children aged 0-12"),
y = "Mean kW per 15 minutes",
x = "Time of day",
caption = myCaption)
basePlot <- basePlot +
scale_x_time(breaks = timeBreaks) +
geom_vline(xintercept = timeBreaks, alpha = vLineAlpha, colour = vLineCol)
Figure 4.3: Demand profile plot - n kids
# add 95% CI
ciPlot <- basePlot + geom_errorbar(aes(ymin = ci_lower/1000, ymax = ci_upper/1000))
# actual n household used
t <- mergedDT[!is.na(presenceChildren) & !is.na(season) , .(nHHs = uniqueN(linkID)), keyby = .(presenceChildren)]
knitr::kable(t, caption = "Actual n households used in plot")
presenceChildren | nHHs |
0 children | 8 |
1+ child | 20 |
Figure 4.3 shows the mean kW per 15 minutes per season by presence of young children for this circuit (Heat Pump). Can you see anything interesting or unusual, and might this be due to the numbers of households in each group?
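The 95% confidence intervals used for the error bars follow the usual normal approximation, mean ± qnorm(0.975) × sd / sqrt(n). The sketch below applies the same formula to toy data, where the column names group and avgW are illustrative. The formula assumes independent observations, which repeated measurements from the same household may not satisfy, so intervals calculated this way should be read with some caution.

```r
library(data.table)

# Toy data: two groups with four observations each (illustrative values only)
toyDT <- data.table(
  group = rep(c("a", "b"), each = 4),
  meanW = c(100, 200, 300, 400, 10, 10, 10, 10)
)

# Mean, standard deviation and number of observations per group
ciDT <- toyDT[, .(avgW = mean(meanW), sdW = sd(meanW), nObs = .N), keyby = group]

# Normal-approximation 95% confidence interval around each group mean
z <- qnorm(0.975)
ciDT[, ci_lower := avgW - z * (sdW / sqrt(nObs))]
ciDT[, ci_upper := avgW + z * (sdW / sqrt(nObs))]
print(ciDT)
```

Group b has no variation, so its interval collapses to the mean. Groups containing few households will produce wide intervals, which is one reason to check the household counts in each group before interpreting differences between lines.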
Analysis completed in 62.5 seconds ( 1.04 minutes) using knitr in RStudio with R version 3.5.1 (2018-07-02) running on x86_64-apple-darwin15.6.0.
