Resource Estimation window
Use the Resource Estimation window to estimate and predict the system resource utilization of parallel job runs.
The appearance of the Resource Estimation window differs depending on the task that you perform:
Create resource models
The first time that you open the Resource Estimation window for a job, a static model is generated by default. Click the Model toolbar button to display the Create Resource Model options. The input data sources for your job are on the left, with the Auto data sampling check box selected. Use the following buttons and fields to generate a model:
- Model Name
- Specifies a name for the model that you want to generate.
- Model Type
- Specifies the type of model to generate:
- Static
- Estimates the system resources that are needed for a job run, excluding CPU. Static models require the Auto data sampling option; you cannot specify a data range.
- Dynamic
- Predicts the resources that are needed for a job run. The model is based on a sampling of actual input data. You can use the Auto data sampling option, or you can specify a data range.
- Auto
- Generates a data sample automatically for static and dynamic models. Clear this check box to specify a data range for a dynamic model, and type values in the From and To fields for each data source. The beginning value of the range must be 0 or higher, and the ending value must not exceed the maximum number of records in the data source.
- Copy Previous
- Copies the data sampling specifications from models that you previously generated. If no previous models exist, this button is unavailable.
- Generate
- Generates a model after you specify the model name, type, and data sampling option.
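The model type and data sampling rules above can be summarized in a short sketch. This is an illustration only, not part of the product: the `SamplingSpec` class and the `validate_sampling` function are hypothetical names, and the check that the From value does not exceed the To value is an assumption that the rules above do not state.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class SamplingSpec:
    """Hypothetical stand-in for the sampling settings of one data source."""
    auto: bool = True                             # Auto check box selected
    data_range: Optional[Tuple[int, int]] = None  # (From, To) record numbers


def validate_sampling(model_type: str, spec: SamplingSpec, max_records: int) -> List[str]:
    """Return a list of rule violations; an empty list means the settings are valid."""
    errors: List[str] = []
    if model_type not in ("static", "dynamic"):
        return ["Model type must be static or dynamic."]
    # Static models must use automatic data sampling.
    if model_type == "static" and not spec.auto:
        errors.append("Static models require automatic data sampling.")
    # A data range applies only when the Auto check box is cleared.
    if not spec.auto:
        if spec.data_range is None:
            errors.append("Specify From and To values when Auto is cleared.")
        else:
            start, end = spec.data_range
            if start < 0:
                errors.append("From must be 0 or higher.")
            if end > max_records:
                errors.append("To must not exceed the number of records in the data source.")
            if start > end:
                # Assumption: not stated in the rules above.
                errors.append("From must not be greater than To.")
    return errors
```

For example, a dynamic model with the range 0 to 500 on a data source of 1000 records passes all checks, whereas the same range on a static model is rejected because static models accept only automatic sampling.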
Make resource projections
Click the Projection toolbar button to display the Make Resource Projection options. Use the following fields and buttons to create a projection:
- Projection Name
- Specifies a name for the projection that you want to create.
- Input Units
- Specifies the unit of measurement for the projection: Size in Megabytes or Number of Records.
- Input size
- Specifies the size of the input for the projection. Type a value in either the Megabytes (MB) field or the Records field.
- Copy Previous
- Copies the specifications from projections that you previously generated. If no previous projections exist, this button is unavailable.
- Generate
- Generates a projection after you specify the projection name, input units, and input size.
View resource data
After you generate a model for a job, you can use the Resource Estimation window to view model information, make projections, analyze job partitions and stages, display charts, and generate reports. Use the following controls:
- Toolbar
- Contains buttons to run the current job, create resource models, create projections, set options, and generate reports.
- Selection pane
- Provides controls for viewing models, projections, statistics, and charts:
- Models
- Displays a list of the resource models that you generated. Select a model to view the model type, the number of data segments in the model, the total size of the input data, and the data sampling descriptions for each input. Right-click a model to delete it.
- Input Projections
- Displays a list of projections about the size of the data sources in a job. The default projection estimates the size by using the model that you generated. Actual projections reflect the actual size of each data source in a completed job run. Right-click a projection to delete it.
- Job Tree
- Displays the partitions in a job and the stages that ran on each partition. Select a partition to view statistics about the size of the input data, the CPU utilization, the size of the disk space and scratch space, and the partition utilization by stage. Select a stage to view its partition utilization and data set throughput. Click the tabs in the right pane to view statistics for different models.
- Stages
- Displays the stages in a job. Select a stage to view partition utilization and data set throughput. Click the tabs in the right pane to view statistics for different models.
- Charts
- Displays a list of charts that are available to help you visualize resource utilization:
- Partition Utilization
- Charts in this category depict the resource utilization requirements of each stage on all partitions, including CPU requirements, scratch requirements, and disk requirements.
- Dataset Throughput
- Charts in this category depict the data set throughput of each stage on all partitions, including record totals and data set sizes. Click the tabs above each chart to navigate to different links.
- Operator Utilization
- Charts in this category compare the resource utilization of all stages on each partition, including CPU requirements, scratch requirements, and disk requirements. Click the tabs above each chart to navigate to different partitions.
Each chart displays estimates for different models in different colors. Use the Jump To field in the right pane to navigate to different stages.
- Display pane
- Displays the selected model, input projection, job partition, stage, or chart.