Regression analysis

Use the regression analysis tool in Envizi to do baseline regression analysis of consumption against heating and cooling loads or against other performance metrics that are predictive of consumption. The tool is available as part of the Interval Metering Analytics module and the Utility Bill Analytics module.

Before you do regression analysis, ensure that you meet the prerequisites and create a regression model by configuring a proposed model and then saving it as an active model. The following sections describe both tasks.

Regression analysis tool parameters

The following list outlines the parameters that are associated with the regression analysis tool, some of which you configure in the Regression Analysis page:
Heating degree days (HDD)
The number of degrees that a day's average temperature is below the HDD base temperature of the building.
Cooling degree days (CDD)
The number of degrees that a day's average temperature is above the CDD base temperature of the building.
HDD base temperature
This is the temperature below which buildings need to be heated. HDD base temperatures are set differently for each building, but in some cases default values such as 65 degrees Fahrenheit or 18 degrees Celsius can be used.
CDD base temperature
This is the temperature above which buildings need to be cooled. CDD base temperatures are set differently for each building, but in some cases default values such as 65 degrees Fahrenheit or 18 degrees Celsius can be used. If the temperature goes above the base temperature, the building starts using HVAC systems to cool down.

If you want the regression analysis tool to suggest an optimum base temperature, in the Proposed Model tool, select the Auto Fit option.

Note: If you save a new base temperature, it becomes the new base temperature for the building. Any existing models for other accounts or meters that used a different base temperature might become invalid. If you save a model that uses base temperatures that are different from previously saved models, the tool asks you to confirm the change. If you confirm the change, the historical HDD and CDD values must be updated to reflect the saved base temperature. See the Managing base temperatures section.
Baseline period
This is the period used to carry out the regression modeling. Any predictions of future energy use are based on the correlation between the predictive variables (HDD, CDD, or other KPIs) and consumption during the baseline period. Therefore, if a building’s operating characteristics change significantly from the baseline period, the predicted consumption will not be accurate. It is important to have enough baseline data. In most cases, 12 to 18 months of baseline data is considered a minimum.

In the Regression Analysis page, use the time selector control to configure the baseline period.
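
The HDD and CDD definitions in the preceding list can be illustrated with a short calculation. The following Python sketch is not Envizi code. The function name degree_days, the sample temperatures, and the 18 degrees Celsius base values are assumptions chosen for illustration only.

```python
def degree_days(avg_temp, hdd_base, cdd_base):
    """Return (HDD, CDD) for one day's average temperature.

    HDD counts degrees below the HDD base temperature,
    CDD counts degrees above the CDD base temperature.
    """
    hdd = max(0.0, hdd_base - avg_temp)
    cdd = max(0.0, avg_temp - cdd_base)
    return hdd, cdd


# Hypothetical daily average temperatures in degrees Celsius.
daily_avg_temps = [10.0, 15.5, 18.0, 21.0, 25.5]

# Assumed base temperatures; real values are set per building.
results = [degree_days(t, hdd_base=18.0, cdd_base=18.0) for t in daily_avg_temps]

total_hdd = sum(hdd for hdd, _ in results)
total_cdd = sum(cdd for _, cdd in results)
print(total_hdd, total_cdd)
```

Changing either base temperature in this sketch changes the resulting totals, which is why stored HDD and CDD values must match the base temperatures that a model was built with.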

Prerequisites

Before you do regression analysis on account data or meter data, ensure that the following prerequisites are met:
  • Create a regression model, as outlined in this topic.
  • Define the temperature unit of measure at the organization level.
  • Ensure that historical data exists for the account or meter for which you want to run a regression model. Generally, at least 12 months of data is required for the baseline period.
  • Ensure that the location is linked to an appropriate weather station.
  • Set the HDD and CDD base temperatures for the building. If a base temperature has not been set, the regression tool defaults to country or client level default values. This step is not mandatory, as the regression tool will allow you to set the base temperatures.
  • Configure the nonworking days setting at the location level to the appropriate setting. For example, if the facility runs 7 days a week, the nonworking days settings should be changed to Not Applicable. The default setting is for Saturday and Sunday to be nonworking days.

Creating a regression model

You can access the regression analysis tool from either an Account Summary page or a Meter Summary page. Click Analyze > Regression Analysis, and the Regression Analysis page is displayed.

To create a regression model that is used to calculate expected consumption based on weather normalization, in the Regression Analysis page, use the Proposed Model tool to define an acceptable model and then save the model. Configure the parameters to include in the model, such as a baseline period and base temperatures, and then run the model. The regression tool indicates the model's fitness visually, based on R². Additional statistical fitness indicators are available in a grid. You can use the Model Fit chart to review how well the proposed model predicts actual data. The regression analysis tool provides detailed feedback if the proposed model is invalid or does not pass certain fitness thresholds.

If you are satisfied with the proposed model, you can save the model and it becomes your active model. The active model is used to calculate expected and normalized consumption. The correlations derived from the model are then used for normalization in IBM® ESG Suite wherever normalized reporting is required.

To update a model, you can use the Proposed Model tool to create another model and save it.

Note: R² is the coefficient of determination, which is a term from statistics. It is a measure that indicates how closely data fits a model. The value has a maximum of 1, and normally any value greater than 0.75 is considered a good fit. A value less than 0.75 indicates that the correlation between the data input and the data output is too weak for the model to be considered a good fit.
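
To make the coefficient of determination concrete, the following Python sketch fits a straight line to consumption against a single predictor by ordinary least squares and then computes the fit statistic. It is an illustration under stated assumptions, not a description of how Envizi computes its models. The function names fit_line and r_squared and the sample data are hypothetical, and a real model can include several predictors (HDD, CDD, and KPIs).

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(x, y, slope, intercept):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical baseline data: monthly HDD totals and consumption in kWh.
monthly_hdd = [0, 50, 100, 150, 200, 250]
consumption = [1000, 1270, 1480, 1760, 2010, 2240]

slope, intercept = fit_line(monthly_hdd, consumption)
r2 = r_squared(monthly_hdd, consumption, slope, intercept)

# Apply the 0.75 threshold described in the note.
print(round(slope, 2), round(intercept, 1), round(r2, 4), r2 > 0.75)
```

In this sketch the slope plays the role of the slope for HDD and the intercept plays the role of the base load.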

Active model

In the Regression Analysis page, the Active Model section shows you the current regression model that is being used to calculate expected and normalized consumption for the account or meter. The section is empty if you have not previously created a model.

Modeling nonworking days

When modeling account data, nonworking days are accounted for in the degree day calculation: degree days that fall on nonworking days are excluded. When modeling interval meter data, you can choose nonworking days as a variable in the model. In that case, the model determines a separate constant for base load consumption on nonworking days.
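
The exclusion of degree days on nonworking days can be sketched as follows. This Python example assumes the default setting of Saturday and Sunday as nonworking days. The function name working_day_hdd, the sample temperatures, and the base temperature are hypothetical, and the sketch does not claim to reproduce the exact calculation that Envizi performs.

```python
from datetime import date, timedelta

# Assumed default: Saturday and Sunday (date.weekday() counts Monday as 0).
NONWORKING_WEEKDAYS = {5, 6}


def working_day_hdd(start, daily_avg_temps, hdd_base):
    """Sum HDD over consecutive days from `start`, skipping nonworking days."""
    total = 0.0
    for offset, temp in enumerate(daily_avg_temps):
        day = start + timedelta(days=offset)
        if day.weekday() in NONWORKING_WEEKDAYS:
            continue  # degree days on nonworking days are excluded
        total += max(0.0, hdd_base - temp)
    return total


# Hypothetical week of daily average temperatures (degrees Celsius),
# Monday to Sunday. 1 January 2024 is a Monday.
week = [10.0, 12.0, 14.0, 16.0, 17.0, 8.0, 9.0]
hdd_total = working_day_hdd(date(2024, 1, 1), week, hdd_base=18.0)
print(hdd_total)
```

For a facility that runs 7 days a week, the equivalent of setting nonworking days to Not Applicable in this sketch is an empty NONWORKING_WEEKDAYS set.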

Outputs from modeling used for normalization

The following values from the regression model are used in calculating expected and normalized consumption:
  • Slope for HDD
  • Slope for CDD
  • Slope for KPI
  • Base load for working day
  • Base load for nonworking day
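
One plausible way that these outputs combine into an expected consumption figure is a linear sum, sketched in Python below. The exact equation that Envizi uses is not stated in this topic, so treat the formula, the function name expected_consumption, the dictionary keys, and all numeric values as assumptions for illustration.

```python
def expected_consumption(hdd, cdd, kpi, working_days, nonworking_days, model):
    """Assumed linear form: base loads per day plus weather and KPI terms."""
    base = (model["base_load_working"] * working_days
            + model["base_load_nonworking"] * nonworking_days)
    weather = model["slope_hdd"] * hdd + model["slope_cdd"] * cdd
    kpi_term = model["slope_kpi"] * kpi
    return base + weather + kpi_term


# Hypothetical model outputs (units: kWh per degree day, per KPI unit, per day).
model = {
    "slope_hdd": 5.0,
    "slope_cdd": 8.0,
    "slope_kpi": 0.5,
    "base_load_working": 100.0,
    "base_load_nonworking": 40.0,
}

# Hypothetical winter month: 200 HDD, no CDD, KPI of 1000 units,
# 22 working days and 8 nonworking days.
expected = expected_consumption(
    hdd=200, cdd=0, kpi=1000, working_days=22, nonworking_days=8, model=model
)
print(expected)
```

Comparing a figure like this with actual consumption for the same period indicates whether the building used more or less energy than the baseline relationship predicts.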

Model statistics

In addition to the R² value displayed visually, other statistical variables from the proposed model are available through the Export Proposed Model button.

Note: Regression statistics are not populated for models that have been captured manually. One way to identify a manually captured model is that its base period date is reported as unknown.

Regression analysis charts

The following charts are displayed on the Regression Analysis page:
Model fit chart
The model fit chart shows the actual data against the proposed model and, if one exists, the current active model. Use this chart to visually review the fitness of the proposed and active models.
Heating degree days (HDD) chart
The heating degree days chart is a scatter graph that shows HDD against consumption on a monthly basis for account data or daily basis for meter data.
Cooling degree days (CDD) chart
The cooling degree days chart is a scatter graph that shows CDD against consumption on a monthly basis for account data or daily basis for meter data.

Managing base temperatures

The base temperature is used to calculate HDD and CDD values. Both the base temperatures and HDD/CDD values are stored at a location level. As a result, accounts and meters at a given location share a common set of base temperatures, HDD, and CDD values. Changing and saving new base temperatures will impact previously saved models of not only the account or meter being analyzed, but any account or meter that has a saved model at that location. Before a change in base temperature is saved, Envizi will provide a warning that changes could impact other accounts or meters.

If the base temperature change is saved, the historical HDD and CDD values stored at the location need to be updated. This process happens automatically after the proposed model is saved. However, if a manual recalculation of HDD and CDD values for the location is required, complete the following steps:
  1. Click Manage > Locations.
  2. In the Locations grid, use the search features to navigate to the location that needs its historical HDD and CDD values updated.
  3. Right-click the location and select Compile HDD/CDD. Confirming the action will result in the historical HDD and CDD values being recalculated using the new base temperatures.