treeas properties

The Tree-AS node is similar to the CHAID node; however, the Tree-AS node is designed to process big data to create a single tree and displays the resulting model in the output viewer. The node generates a decision tree by using chi-square statistics (CHAID) to identify optimal splits. This use of CHAID can generate nonbinary trees, meaning that some splits have more than two branches. Target and input fields can be numeric range (continuous) or categorical. Exhaustive CHAID is a modification of CHAID that does a more thorough job of examining all possible splits but takes longer to compute.
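To make the role of the chi-square statistic concrete, the following minimal sketch computes the textbook Pearson and likelihood-ratio chi-square statistics for one candidate split. It is an illustration only and is not the Tree-AS implementation: the function names and the counts are hypothetical, and the sketch assumes every row and column of the table has at least one record.

```python
import math


def pearson_chi_square(table):
    """Pearson chi-square statistic for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = float(sum(row_totals))
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of input and target.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


def likelihood_ratio_chi_square(table):
    """Likelihood-ratio chi-square statistic for the same kind of table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = float(sum(row_totals))
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            if observed > 0:  # a zero cell contributes nothing to the sum
                expected = row_totals[i] * col_totals[j] / total
                stat += 2.0 * observed * math.log(observed / expected)
    return stat


def degrees_of_freedom(table):
    """(rows - 1) * (columns - 1) for a rectangular table."""
    return (len(table) - 1) * (len(table[0]) - 1)


# Hypothetical counts: rows are three categories of one input field,
# columns are the two classes of the target.
counts = [[30, 10], [20, 20], [10, 30]]
chi2 = pearson_chi_square(counts)
g2 = likelihood_ratio_chi_square(counts)
df = degrees_of_freedom(counts)
print(chi2, g2, df)
```

A larger statistic for the same degrees of freedom indicates a stronger association between the input categories and the target, which is what makes a field a better candidate for splitting. A three-row table like this one is also the kind of case that leads to a split with more than two branches.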

Table 1. treeas properties
| treeas Properties | Values | Property description |
| --- | --- | --- |
| target | field | In the Tree-AS node, CHAID models require a single target and one or more input fields. A frequency field can also be specified. See Common modeling node properties for more information. |
| method | chaid, exhaustive_chaid | |
| max_depth | integer | Maximum tree depth, from 0 to 20. The default value is 5. |
| num_bins | integer | Only used if the data is made up of continuous inputs. Set the number of equal frequency bins to be used for the inputs; options are: 2, 4, 5, 10, 20, 25, 50, or 100. |
| record_threshold | integer | The number of records at which the model will switch from using p-values to Effect sizes while building the tree. The default is 1,000,000; increase or decrease this in increments of 10,000. |
| split_alpha | number | Significance level for splitting. The value must be between 0.01 and 0.99. |
| merge_alpha | number | Significance level for merging. The value must be between 0.01 and 0.99. |
| bonferroni_adjustment | flag | Adjust significance values using Bonferroni method. |
| effect_size_threshold_cont | number | Set the Effect size threshold when splitting nodes and merging categories when using a continuous target. The value must be between 0.01 and 0.99. |
| effect_size_threshold_cat | number | Set the Effect size threshold when splitting nodes and merging categories when using a categorical target. The value must be between 0.01 and 0.99. |
| split_merged_categories | flag | Allow resplitting of merged categories. |
| grouping_sig_level | number | Used to determine how groups of nodes are formed or how unusual nodes are identified. |
| chi_square | pearson, likelihood_ratio | Method used to calculate the chi-square statistic: Pearson or Likelihood Ratio. |
| minimum_record_use | use_percentage, use_absolute | |
| min_parent_records_pc | number | Default value is 2. Minimum 1, maximum 100, in increments of 1. Parent branch value must be higher than child branch. |
| min_child_records_pc | number | Default value is 1. Minimum 1, maximum 100, in increments of 1. |
| min_parent_records_abs | number | Default value is 100. Minimum 1, maximum 100, in increments of 1. Parent branch value must be higher than child branch. |
| min_child_records_abs | number | Default value is 50. Minimum 1, maximum 100, in increments of 1. |
| epsilon | number | Minimum change in expected cell frequencies. |
| max_iterations | number | Maximum iterations for convergence. |
| use_costs | flag | |
| costs | structured | Structured property. The format is a list of 3 values: the actual value, the predicted value, and the cost if that prediction is wrong. For example: tree.setPropertyValue("costs", [["drugA", "drugB", 3.0], ["drugX", "drugY", 4.0]]) |
| default_cost_increase | none, linear, square, custom | Only enabled for ordinal targets. Set default values in the costs matrix. |
| calculate_conf | flag | |
| display_rule_id | flag | Adds a field in the scoring output that indicates the ID for the terminal node to which each record is assigned. |
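
Some of the constraints in Table 1 can be checked in a script before the values are set on a node. The following is a minimal sketch under stated assumptions, not part of the Modeler scripting API: `validate_treeas_settings`, `apply_settings`, and `FakeNode` are hypothetical names, and the only node method relied on is `setPropertyValue`, as used in the costs example. Only a subset of the properties is covered. Inside Modeler, the node would presumably be a real treeas node created in the stream; a stand-in class is used here so that the sketch runs outside Modeler.

```python
# Allowed values and ranges transcribed from Table 1 (subset only).
ALLOWED_VALUES = {
    "method": {"chaid", "exhaustive_chaid"},
    "chi_square": {"pearson", "likelihood_ratio"},
    "minimum_record_use": {"use_percentage", "use_absolute"},
    "default_cost_increase": {"none", "linear", "square", "custom"},
    "num_bins": {2, 4, 5, 10, 20, 25, 50, 100},
}
RANGES = {
    "max_depth": (0, 20),
    "split_alpha": (0.01, 0.99),
    "merge_alpha": (0.01, 0.99),
    "effect_size_threshold_cont": (0.01, 0.99),
    "effect_size_threshold_cat": (0.01, 0.99),
    "min_parent_records_pc": (1, 100),
    "min_child_records_pc": (1, 100),
}


def validate_treeas_settings(settings):
    """Return a list of problems found; an empty list means no violations."""
    problems = []
    for name, value in settings.items():
        if name in ALLOWED_VALUES and value not in ALLOWED_VALUES[name]:
            problems.append("%s: %r is not an allowed value" % (name, value))
        elif name in RANGES:
            low, high = RANGES[name]
            if not low <= value <= high:
                problems.append("%s: %r is outside %s..%s" % (name, value, low, high))
    # Table 1 requires the parent minimum to be higher than the child minimum.
    parent = settings.get("min_parent_records_pc")
    child = settings.get("min_child_records_pc")
    if parent is not None and child is not None and parent <= child:
        problems.append("min_parent_records_pc must be higher than min_child_records_pc")
    return problems


def apply_settings(node, settings):
    """Set each property on the node, but only if validation passes."""
    problems = validate_treeas_settings(settings)
    if problems:
        raise ValueError("; ".join(problems))
    for name, value in settings.items():
        node.setPropertyValue(name, value)


class FakeNode(object):
    """Hypothetical stand-in for a Modeler node; records what was set."""

    def __init__(self):
        self.properties = {}

    def setPropertyValue(self, name, value):
        self.properties[name] = value


settings = {
    "method": "exhaustive_chaid",
    "max_depth": 5,
    "num_bins": 10,
    "split_alpha": 0.05,
    "merge_alpha": 0.05,
    "minimum_record_use": "use_percentage",
    "min_parent_records_pc": 2,
    "min_child_records_pc": 1,
}
node = FakeNode()
apply_settings(node, settings)
```

Validating first means a script fails with one message that lists every problem, rather than stopping at the first rejected property.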