The incoming master and reading data must be prepared and placed in the
/home/<utility>/staging directory. The files must be in ZIP format and their
names must match the patterns master_data_*.zip and reading_data_*.zip
respectively. For example:
master_data_2017-12-19.zip
and
reading_data_2017-12-19.zip
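Before starting the flows, it can be useful to verify that both archives are present in the staging directory. A minimal sketch, assuming the staging path is passed in as an argument; the check_staging helper below is illustrative and not part of the delivered scripts:

```shell
# Check that a staging directory holds both expected archives.
check_staging() {
    local dir="$1"
    local ok=0
    # The globs fail if no file matches the required naming pattern.
    ls "$dir"/master_data_*.zip  >/dev/null 2>&1 || { echo "missing master_data_*.zip";  ok=1; }
    ls "$dir"/reading_data_*.zip >/dev/null 2>&1 || { echo "missing reading_data_*.zip"; ok=1; }
    [ "$ok" -eq 0 ] && echo "staging OK"
    return "$ok"
}
```

For example: check_staging /home/<utility>/staging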
Procedure
Log in to the notebook node as a tenant user with access rights to HDFS and HBase, and start the
master data flow.
Open the /home/<utility>/automation directory.
Run the script: ./master_data_flow.sh
Example output for the ZIP file master_data_2017-12-19.zip:
Figure 1. The example output
Where master_data_2017-12-19.log is the log directory,
master_data_2017-12-19.report contains the quality report, and
master_data_2017-12-19.success indicates that the flow was completed without
errors.
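Every flow reports its outcome through the same three artifacts (.log, .report, .success), so a run can be checked uniformly. A small sketch, assuming the marker files sit in the current directory; the flow_result helper is hypothetical:

```shell
# Report the outcome of a flow run from its marker files.
# $1 is the output prefix, e.g. master_data_2017-12-19.
flow_result() {
    local prefix="$1"
    if [ -e "$prefix.success" ]; then
        echo "$prefix: completed without errors"
    else
        echo "$prefix: no success marker, check the logs in $prefix.log"
    fi
}
```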
From the notebook node, start the reading data flow.
Open the /home/<utility>/automation directory.
Run the script: ./cm_automation/reading_data_flow.sh
Note: If the latest time in the reading data is not yesterday, export the
ANALYSIS_LOAD_UNTIL_TIME and ANALYSIS_VOLTAGE_UNTIL_TIME environment
variables with a suitable time for the reading data before running the script.
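For example, if the reading data ends on 2017-12-18 (an illustrative date), the variables can be exported as follows before the flow is started:

```shell
# Set the analysis cut-off to the last day covered by the reading data.
export ANALYSIS_LOAD_UNTIL_TIME=2017-12-18      # illustrative date
export ANALYSIS_VOLTAGE_UNTIL_TIME=2017-12-18   # illustrative date
# ./cm_automation/reading_data_flow.sh          # then start the flow
```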
Example output for the ZIP file reading_data_2017-12-19.zip:
Figure 2. The reading data output
Where reading_data_2017-12-19.log is the log directory,
reading_data_2017-12-19.report contains the quality report, and
reading_data_2017-12-19.success indicates that the flow was completed without
errors.
From the notebook node, start the analysis flow.
Open the /home/<utility>/automation directory.
Run the script:
./cm_automation/analysis_flow.sh
Note: Export the ANALYSIS_LOAD_UNTIL_TIME and
ANALYSIS_VOLTAGE_UNTIL_TIME environment variables with a suitable time if
necessary.
Example output for the ZIP file analysis_flow_2018-01-08.zip:
Figure 3. The analysis flow output
Where analysis_flow_2018-01-08.log is the log directory,
analysis_flow_2018-01-08.report contains the quality report, and
analysis_flow_2018-01-08.success indicates that the flow was completed without
errors.
To enable or disable the voltage or load analysis:
Open the file analysis_flow.sh in a text editor.
Edit the tasks part of this file.
To enable the load or voltage analysis, remove the comment symbol (#).
To disable the load or voltage analysis, add the comment symbol (#).
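As an illustration of the commenting mechanism, the tasks part might look like the sketch below; the real task names in analysis_flow.sh may differ, and the function bodies here are placeholders:

```shell
# Sketch of a "tasks" section (task names and bodies are assumptions).
load_analysis()    { echo "load analysis started"; }     # placeholder body
voltage_analysis() { echo "voltage analysis started"; }  # placeholder body

load_analysis        # enabled: runs on every execution
#voltage_analysis    # disabled: the leading '#' comments the task out
```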
The three flows can be run separately or scheduled with crontab to run at specified times.
Log in as a tenant user.
Run the command:
crontab -e
Add the schedule entries in the text editor that opens.
Note: For the format of the crontab file, see the Linux crontab documentation.
An example crontab file:
#specify time zone to be used. Right now it is UTC timezone
CRON_TZ=UTC
PATH=/usr/local/bin:/usr/bin:/bin:/usr/local/sbin:/usr/sbin:/sbin
# specify the necessary environment variables
#PYTHON_LIB=
#SPARK_HOME=
#uncomment the following env variables to adjust the time used in the flow if needed
#ANALYSIS_LOAD_UNTIL_TIME=2016-09-01
#ANALYSIS_VOLTAGE_UNTIL_TIME=2016-08-31
#LOAD_UNTIL_TIME=2016-09-01
#VOLTAGE_UNTIL_TIME=2016-08-31
#master_data_flow & reading_data_flow
#master data flow scheduled at 14:00 every day, UTC timezone
00 14 * * * $HOME/automation/master_data_flow.sh &
#reading data flow scheduled at 15:00 every day, UTC timezone
00 15 * * * $HOME/automation/reading_data_flow.sh &
#analysis_flow scheduled at 18:00 every Saturday, UTC timezone
00 18 * * sat $HOME/automation/analysis_flow.sh &
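When the schedule is kept in a file (for example under version control), it can also be installed non-interactively. A sketch, noting that `crontab FILE` replaces the user's whole existing crontab and that cm_schedule.cron is an arbitrary file name:

```shell
# Write the schedule to a file; the quoted heredoc keeps $HOME literal,
# so cron expands it at run time.
cat > "$HOME/cm_schedule.cron" <<'EOF'
CRON_TZ=UTC
00 14 * * * $HOME/automation/master_data_flow.sh &
00 15 * * * $HOME/automation/reading_data_flow.sh &
00 18 * * sat $HOME/automation/analysis_flow.sh &
EOF
# crontab "$HOME/cm_schedule.cron"   # install -- replaces any existing crontab
# crontab -l                         # verify the installed schedule
```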