@dataknut
)Get Carsten’s re-weighted lighting profile data - derived from (Anderson et al. 2018)
ggLightDT <- data.table::fread("~/Dropbox/Work/Otago_CfS_Ben/Students/CarstenDortans_MSc/Writing bursary/Lighting.csv")
skimr::skim(ggLightDT)
## Skim summary statistics
## n obs: 10376945
## n variables: 3
##
## ── Variable type:character ──────────────────────────────────────────────────────────────────────────────────────────
## variable missing complete n min max empty n_unique
## r_dateTime 0 10376945 10376945 19 19 0 526920
##
## ── Variable type:integer ────────────────────────────────────────────────────────────────────────────────────────────
## variable missing complete n mean sd p0 p25 p50
## V1 0 10376945 10376945 5188473 3e+06 1 2594237 5188473
## p75 p100 hist
## 7782709 1e+07 ▇▇▇▇▇▇▇▇
##
## ── Variable type:numeric ────────────────────────────────────────────────────────────────────────────────────────────
## variable missing complete n mean sd p0 p25 p50 p75
## powerW 0 10376945 10376945 9118290.6 2e+07 0 0 0 8346812.79
## p100 hist
## 3.3e+08 ▇▁▁▁▁▁▁▁
head(ggLightDT)
## V1 r_dateTime powerW
## 1: 1 2015-04-01 00:00:00 5835851
## 2: 2 2015-04-01 00:01:00 5874919
## 3: 3 2015-04-01 00:02:00 5835851
## 4: 4 2015-04-01 00:03:00 5835851
## 5: 5 2015-04-01 00:04:00 5835851
## 6: 6 2015-04-01 00:05:00 5835851
Check that the datetimes are NZ time otherwise we won’t be comparing the same half-hours.
ggLightDT <- ggLightDT[, r_dateTime := lubridate::as_datetime(r_dateTime)]
ggLightDT$V1 <- NULL # how did that get there?
ggLightDT <- ggLightDT[, hms := hms::as.hms(r_dateTime)]
ggLightDT <- ggLightDT[, halfHour := hms::trunc_hms(hms, 60*30)]
ggLightDT <- GREENGridData::addNZSeason(ggLightDT)
plotDT <- ggLightDT[, .(meanGW = mean(powerW/10000000)), keyby = .(halfHour, season)]
p <- ggplot2::ggplot(plotDT, aes(x = halfHour, y = meanGW, colour = season)) +
geom_line() +
labs(x = "Time of Day",
y = "Mean power demand (Lighting, GW)",
caption = "GREEN Grid lighting data re-weighted to NZ population")
p
Looks OK. This is because hms::as.hms()
cleverly converts r_dateTime
which it assumes to be UTC (it is) into local time. Without telling us…
Now convert that to half-hourly GWh for comparison with EA data.
# do this before aggregating
ggLightDT <- ggLightDT[, powerWh := powerW/60] # cos it's per minute
ggLightDT <- ggLightDT[, consumptionGWh := powerWh/1000000000]
# need to aggregate from 1 min to (all) half hours
ggLightDT <- ggLightDT[, halfHourDate := lubridate::floor_date(r_dateTime, "30 mins")]
dt <- ggLightDT[, .(consumptionGWh = sum(consumptionGWh)), keyby = .(halfHourDate, halfHour, season)]
# now get the mean
plotGG_DT <- dt[, .(MeanGWh = mean(consumptionGWh)), keyby = .(halfHour, season)]
# check
message("Sum of GWh before agg = ", sum(ggLightDT$consumptionGWh))
## Sum of GWh before agg = 1577
message("Sum of GWh after agg = ", sum(dt$consumptionGWh))
## Sum of GWh after agg = 1577
p <- ggplot2::ggplot(plotGG_DT, aes(x = halfHour, y = MeanGWh, colour = season)) +
geom_line() +
labs(x = "Time of Day",
y = "Mean energy consumption (Lighting, GWh)",
caption = "GREEN Grid lighting data re-weighted to NZ population")
p
Get the 2015 Generation data from the EA (igores distributed gen - solar as not really relevant to evening peaks). Data pre-downloaded from https://www.emi.ea.govt.nz/Wholesale/Datasets/Generation/Generation_MD & cleaned.
eaPath <- "~/Data/NZ_GREENGrid/ea/"
fList <- data.table::as.data.table(list.files(eaPath))
f2015 <- fList[V1 %like% "2015" & V1 %like% "long"]
f2015 <- f2015[, fullName := paste0(eaPath, V1)]
ea2015GenDT <- data.table::data.table() # data bucket
# load them all in one go very fast
ea2015GenDT = rbindlist(lapply(f2015$fullName, fread))
skimr::skim(ea2015GenDT)
## Skim summary statistics
## n obs: 1273500
## n variables: 14
##
## ── Variable type:character ──────────────────────────────────────────────────────────────────────────────────────────
## variable missing complete n min max empty n_unique
## Fuel_Code 0 1273500 1273500 3 6 0 7
## Gen_Code 0 1273500 1273500 3 15 0 65
## Nwk_Code 0 1273500 1273500 4 4 0 24
## POC_Code 0 1273500 1273500 7 7 0 63
## rDate 0 1273500 1273500 10 10 0 365
## rDateTime 0 1273500 1273500 0 20 50940 17521
## Site_Code 0 1273500 1273500 3 3 0 63
## Tech_Code 0 1273500 1273500 3 5 0 5
## Time_Period 0 1273500 1273500 3 4 0 50
## Trading_date 0 1273500 1273500 10 10 0 365
##
## ── Variable type:integer ────────────────────────────────────────────────────────────────────────────────────────────
## variable missing complete n mean sd p0 p25 p50
## month 0 1273500 1273500 6.53 3.45 1 4 7
## rTime 50940 1222560 1273500 43200 24936.13 900 22050 43200
## year 0 1273500 1273500 2015 0 2015 2015 2015
## p75 p100 hist
## 10 12 ▇▅▇▃▅▇▅▇
## 64350 85500 ▇▇▇▇▇▇▇▇
## 2015 2015 ▁▁▁▇▁▁▁▁
##
## ── Variable type:numeric ────────────────────────────────────────────────────────────────────────────────────────────
## variable missing complete n mean sd p0 p25 p50
## kWh 50940 1222560 1273500 33373.31 49994.08 0 5178.98 17288.55
## p75 p100 hist
## 42794 4e+05 ▇▂▁▁▁▁▁▁
head(ea2015GenDT)
## Site_Code POC_Code Nwk_Code Gen_Code Fuel_Code Tech_Code Trading_date
## 1: ANI MAT1101 BOPD aniwhenua Hydro Hydro 2015-01-01
## 2: ANI MAT1101 BOPD aniwhenua Hydro Hydro 2015-01-02
## 3: ANI MAT1101 BOPD aniwhenua Hydro Hydro 2015-01-03
## 4: ANI MAT1101 BOPD aniwhenua Hydro Hydro 2015-01-04
## 5: ANI MAT1101 BOPD aniwhenua Hydro Hydro 2015-01-05
## 6: ANI MAT1101 BOPD aniwhenua Hydro Hydro 2015-01-06
## Time_Period kWh rTime rDate rDateTime month year
## 1: TP1 3682.54 900 2015-01-01 2015-01-01T00:15:00Z 1 2015
## 2: TP1 4179.41 900 2015-01-02 2015-01-02T00:15:00Z 1 2015
## 3: TP1 3820.13 900 2015-01-03 2015-01-03T00:15:00Z 1 2015
## 4: TP1 3544.18 900 2015-01-04 2015-01-04T00:15:00Z 1 2015
## 5: TP1 3550.79 900 2015-01-05 2015-01-05T00:15:00Z 1 2015
## 6: TP1 3612.30 900 2015-01-06 2015-01-06T00:15:00Z 1 2015
Now do the same process of checking and aggregating.
ea2015GenDT <- ea2015GenDT[, r_dateTime := lubridate::as_datetime(rDateTime)]
ea2015GenDT <- ea2015GenDT[, r_dateTime := lubridate::force_tz(r_dateTime, tzone = "Pacific/Auckland")] # got to force it
ea2015GenDT <- ea2015GenDT[, hms := hms::as.hms(r_dateTime)]
ea2015GenDT <- ea2015GenDT[, halfHour := hms::trunc_hms(hms, 60*30)]
ea2015GenDT <- GREENGridData::addNZSeason(ea2015GenDT)
# make GWh
ea2015GenDT <- ea2015GenDT[, consumptionGWh := kWh/1000000]
# add up to total GWh for eachhalf hour (currently it's per gen site/fuel)
aggea2015GenDT <- ea2015GenDT[, .(consumptionGWh = sum(consumptionGWh, na.rm = TRUE)), # avoid NAs in dates (DST breaks) & kWh
keyby = .(season, r_dateTime, halfHour)]
# check
message("Sum of GWh = ", sum(aggea2015GenDT$consumptionGWh))
## Sum of GWh = 40800.874669309
# now re-create plotDT for the mean
plotEA_DT <- aggea2015GenDT[!is.na(r_dateTime), # avoid the single broken datetime as it messes with the plot
.(MeanGWh = mean(consumptionGWh)), keyby = .(halfHour, season)]
p <- ggplot2::ggplot(plotEA_DT, aes(x = halfHour, y = MeanGWh, colour = season)) +
geom_line() +
labs(x = "Time of Day",
y = "Mean energy consumption per half-hour (All, GWh)",
caption = "EA Wholesale Generation data (excl. distributed solar)")
p
So now we need to plot the contribution of lighting to this.
setkey(plotEA_DT, season, halfHour)
setkey(plotGG_DT, season, halfHour)
plotEA_DT <- plotEA_DT[,eaMeanGWh := MeanGWh]
plotGG_DT <- plotGG_DT[, lightingMeanGWh := MeanGWh]
plotDT <- plotEA_DT[!is.na(MeanGWh)][plotGG_DT] #get rid of the bloody DST breaks
plotDT <- plotDT[, pc_lighting := 100*(lightingMeanGWh/eaMeanGWh)]
p <- ggplot2::ggplot(plotDT, aes(x = halfHour, y = pc_lighting, colour = season)) +
geom_line() +
labs(x = "Time of Day",
y = "Lighting as a % of total generation (GWh)",
caption = "EA Wholesale generation data & GREEN Grid population weighted lighting data")
p
t <- summary(plotDT[, .(eaMeanGWh,lightingMeanGWh,pc_lighting )])
kableExtra::kable(t, caption = "Summary of data results", digits = 3) %>%
kable_styling()
eaMeanGWh | lightingMeanGWh | pc_lighting | |
---|---|---|---|
Min. :1.657 | Min. :0.01570 | Min. : 0.9095 | |
1st Qu.:2.010 | 1st Qu.:0.03648 | 1st Qu.: 1.6015 | |
Median :2.439 | Median :0.05662 | Median : 2.2824 | |
Mean :2.328 | Mean :0.08965 | Mean : 3.6086 | |
3rd Qu.:2.557 | 3rd Qu.:0.11217 | 3rd Qu.: 4.8794 | |
Max. :3.115 | Max. :0.38186 | Max. :12.3451 |
Finally we calculate the mean GWh and % contribution of lighting in the morning and evening peak periods for use in the paper.
amPeakStart <- hms::as.hms("07:00:00")
amPeakEnd <- hms::as.hms("09:00:00")
pmPeakStart <- hms::as.hms("17:00:00") # see https://www.electrickiwi.co.nz/hour-of-power
pmPeakEnd <- hms::as.hms("21:00:00") # see https://www.electrickiwi.co.nz/hour-of-power
plotDT$peakCode <- "Off peak (night)"
plotDT <- plotDT[, peakCode := ifelse(halfHour >= amPeakStart &
halfHour < amPeakEnd,
"Morning peak",
peakCode)]
plotDT <- plotDT[, peakCode := ifelse(halfHour > amPeakEnd &
halfHour < pmPeakStart,
"Off peak (day)",
peakCode)]
plotDT <- plotDT[, peakCode := ifelse(halfHour >= pmPeakStart &
halfHour < pmPeakEnd,
"Evening peak",
peakCode)]
t <- plotDT[, .('Mean GWh generation' = mean(eaMeanGWh),
'Mean lighting GWh consumption' = mean(lightingMeanGWh),
'Mean lighting %' = mean(pc_lighting),
'Max lighting %' = max(pc_lighting)), keyby = .(season,peakCode)]
kableExtra::kable(t,
caption = "Summary of off/peak results (half-hourly data)",
digits = 2) %>%
kable_styling()
season | peakCode | Mean GWh generation | Mean lighting GWh consumption | Mean lighting % | Max lighting % |
---|---|---|---|---|---|
Spring | Evening peak | 2.62 | 0.19 | 7.18 | 9.01 |
Spring | Morning peak | 2.66 | 0.10 | 3.86 | 5.60 |
Spring | Off peak (day) | 2.51 | 0.05 | 2.12 | 2.92 |
Spring | Off peak (night) | 2.03 | 0.07 | 3.01 | 8.72 |
Summer | Evening peak | 2.42 | 0.08 | 3.19 | 5.71 |
Summer | Morning peak | 2.41 | 0.06 | 2.70 | 3.79 |
Summer | Off peak (day) | 2.47 | 0.03 | 1.41 | 1.61 |
Summer | Off peak (night) | 1.97 | 0.05 | 2.27 | 6.56 |
Autumn | Evening peak | 2.59 | 0.24 | 9.24 | 10.62 |
Autumn | Morning peak | 2.50 | 0.13 | 5.11 | 7.64 |
Autumn | Off peak (day) | 2.44 | 0.05 | 2.05 | 3.41 |
Autumn | Off peak (night) | 1.93 | 0.07 | 3.18 | 9.26 |
Winter | Evening peak | 2.99 | 0.33 | 11.07 | 12.35 |
Winter | Morning peak | 2.83 | 0.18 | 6.48 | 9.95 |
Winter | Off peak (day) | 2.65 | 0.07 | 2.58 | 5.46 |
Winter | Off peak (night) | 2.14 | 0.08 | 3.56 | 9.26 |
Anderson, Ben, and David Eyers. 2018. GREENGridData: Processing Nz Green Grid Project Data to Create a ’Safe’ Version for Data Archiving and Re-Use. https://github.com/CfSOtago/GREENGridData.
Anderson, Ben, David Eyers, Rebecca Ford, Diana Giraldo Ocampo, Rana Peniamina, Janet Stephenson, Kiti Suomalainen, Lara Wilcocks, and Michael Jack. 2018. “New Zealand GREEN Grid Household Electricity Demand Study 2014-2018,” September. doi:10.5255/UKDA-SN-853334.
Arino de la Rubia, Eduardo, Hao Zhu, Shannon Ellis, Elin Waring, and Michael Quinn. 2017. Skimr: Skimr. https://github.com/ropenscilabs/skimr.
Dowle, M, A Srinivasan, T Short, S Lianoglou with contributions from R Saporta, and E Antonyan. 2015. Data.table: Extension of Data.frame. https://CRAN.R-project.org/package=data.table.
Grolemund, Garrett, and Hadley Wickham. 2011. “Dates and Times Made Easy with lubridate.” Journal of Statistical Software 40 (3): 1–25. http://www.jstatsoft.org/v40/i03/.
Müller, Kirill. 2018. Hms: Pretty Time of Day. https://CRAN.R-project.org/package=hms.
Wickham, Hadley. 2009. Ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. http://ggplot2.org.
Zhu, Hao. 2018. KableExtra: Construct Complex Table with ’Kable’ and Pipe Syntax. https://CRAN.R-project.org/package=kableExtra.