object
isolation_forest
Extended Isolation Forest (EIF) algorithm for anomaly detection. Implements the improved version described by Hariri et al. (2019) that uses random hyperplane cuts instead of axis-aligned cuts, eliminating score bias artifacts. Builds an ensemble of isolation trees from a dataset object implementing the dataset_protocol protocol. Missing attribute values are represented using anonymous variables.
logtalk_load(isolation_forest(loader))
static, context_switching_calls
Algorithm: The Extended Isolation Forest builds an ensemble of isolation trees (iTrees) by recursively partitioning the data using random hyperplanes. Anomalous points, being few and different, require fewer partitions (shorter path lengths) to be isolated.
Extended vs Original: The original Isolation Forest uses axis-aligned splits (random attribute + random value), which introduces bias in anomaly scores along coordinate axes. The extended version uses random hyperplane cuts with arbitrary slopes, producing more consistent and reliable anomaly scores.
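To make the difference between the two kinds of cuts concrete, here is a minimal plain Prolog sketch of the branching test described by Hariri et al.: a point x goes to one side of a node when (x - p) . n =< 0, where n is a random normal vector and p a random intercept point. The predicate names `branch/4` and `offset_dot/5`, and the choice of which side is called `left`, are assumptions made for illustration; they are not part of this object's interface and the library's internal representation may differ.

```logtalk
% branch(+Point, +Normal, +Intercept, -Side)
% Point, Normal, and Intercept are lists of numbers of equal length
branch(Point, Normal, Intercept, Side) :-
    offset_dot(Point, Intercept, Normal, 0.0, Dot),
    (   Dot =< 0 ->
        Side = left
    ;   Side = right
    ).

% accumulate the dot product of (Point - Intercept) with Normal
offset_dot([], [], [], Dot, Dot).
offset_dot([X| Xs], [P| Ps], [N| Ns], Dot0, Dot) :-
    Dot1 is Dot0 + (X - P) * N,
    offset_dot(Xs, Ps, Ns, Dot1, Dot).
```

With a normal vector such as `[1.0, 0.0]`, only the first coordinate affects the test, which corresponds to an axis-aligned cut. A normal vector with several non-zero components gives a sloped hyperplane.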
Extension level: The extension level controls the dimensionality of the random hyperplane cuts. Level 0 corresponds to the original axis-aligned Isolation Forest. The default level is d - 1 (fully extended) where d is the number of dimensions.
Prediction: The predict/3 predicate returns anomaly if the anomaly score is above the threshold (default: 0.5) and normal otherwise. The score_all/3 predicate returns a sorted list of all instances with their corresponding scores and class labels. By default, predictions use the options stored in the learned model; the threshold can be overridden using the anomaly_threshold/1 option.
Anomaly score: The anomaly score s(x) is computed as s(x) = 2^(-E(h(x))/c(psi)) where E(h(x)) is the average path length across all trees, c(psi) is the average path length of unsuccessful searches in a BST, and psi is the subsample size. Scores close to 1 indicate anomalies; scores below 0.5 indicate normal points.
Discrete attributes: Discrete (categorical) attributes are mapped to numeric indices based on their position in the attribute value list declared by the dataset. This allows the algorithm to handle datasets with mixed attribute types.
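As a worked illustration of the anomaly score formula, the following plain Prolog sketch computes c(psi) and s(x). It assumes the usual approximation from the original Isolation Forest paper, c(n) = 2H(n-1) - 2(n-1)/n with H(i) approximated by ln(i) + 0.5772156649 (Euler's constant), and c(2) = 1. The predicate names `average_path_length/2` and `anomaly_score/3` are hypothetical and not part of this object's interface; the library may compute these values differently.

```logtalk
% c(Psi): average path length of an unsuccessful BST search
% over Psi points (approximation; assumes Psi >= 2)
average_path_length(2, 1.0) :-
    !.
average_path_length(Psi, C) :-
    Psi > 2,
    Harmonic is log(Psi - 1) + 0.5772156649,
    C is 2.0 * Harmonic - 2.0 * (Psi - 1) / Psi.

% s(x) = 2^(-E(h(x))/c(Psi)), where AveragePath stands for E(h(x))
anomaly_score(AveragePath, Psi, Score) :-
    average_path_length(Psi, C),
    Score is 2.0 ** (-AveragePath / C).
```

When the average path length equals c(psi), the exponent is -1 and the score is exactly 0.5, which is consistent with the default threshold. Shorter average paths push the score towards 1.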
Missing values: Missing attribute values are represented using anonymous variables. During tree construction, missing values are replaced with random values drawn from the observed range of the corresponding attribute. During scoring, instances with missing values are sent down both branches of the tree and the path length is computed as the weighted average of the two branches.
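For illustration, an instance with a missing value might be scored as follows. The dataset object `sensor_readings` and the attribute names `temperature`, `pressure`, and `mode` are hypothetical; only `learn/3` and `score/3` come from this object's interface.

```logtalk
% sensor_readings is a hypothetical object implementing dataset_protocol;
% the pressure value is unknown and therefore left as an anonymous variable
| ?- isolation_forest::learn(sensor_readings, Model, []),
     isolation_forest::score(Model, [temperature-21.5, pressure-_, mode-auto], Score).
```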
Classifier representation: The learned model is represented as an if_model(Trees, SubsampleSize, AttributeNames, Attributes, Ranges, Options) compound term.
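A typical session might look like the sketch below, which uses only the public predicates documented on this page. The dataset object `sensor_readings`, its attribute names, and the option values are assumptions chosen for illustration.

```logtalk
| ?- logtalk_load(isolation_forest(loader)).

% learn a model with explicit options and inspect the model term
| ?- isolation_forest::learn(sensor_readings, Model, [number_of_trees(50), subsample_size(128)]),
     Model = if_model(_Trees, SubsampleSize, AttributeNames, _Attributes, _Ranges, _Options).

% classify one instance, overriding the threshold stored in the model
| ?- isolation_forest::learn(sensor_readings, Model, []),
     isolation_forest::predict(Model, [temperature-95.0, pressure-0.2, mode-manual], Prediction, [anomaly_threshold(0.6)]).

% rank all dataset instances by descending anomaly score
| ?- isolation_forest::learn(sensor_readings, Model, []),
     isolation_forest::score_all(sensor_readings, Model, Scores).
```

Because the model is an ordinary compound term, it can be passed between goals or stored like any other term.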
Public predicates
learn/3
Learns an isolation forest model from the given dataset object using the specified options. Valid options are number_of_trees/1 (default: 100), subsample_size/1 (default: 256 or the number of instances if smaller), extension_level/1 (default: d - 1 where d is the number of dimensions), and anomaly_threshold/1 (default: 0.5).
static
learn(Dataset, Model, Options)
learn(+object_identifier, -compound, +list(compound)) - one
predict/4
Predicts whether an instance is an anomaly or normal using the learned model and the given options, which can override the anomaly threshold stored in the model. The instance is a list of Attribute-Value pairs where missing values are represented using anonymous variables. Returns anomaly if the anomaly score is above the threshold, normal otherwise.
static
predict(Model, Instance, Prediction, Options)
predict(+compound, +list, -atom, +list(compound)) - one
score/3
Computes the anomaly score for a given instance using the learned model. The instance is a list of Attribute-Value pairs where missing values are represented using anonymous variables. The score is in the range [0.0, 1.0]. Scores close to 1.0 indicate anomalies. Scores close to 0.5 or below indicate normal instances.
static
score(Model, Instance, Score)
score(+compound, +list, -float) - one
score_all/3
Computes the anomaly scores for all instances in the dataset. Returns a list of Id-Class-Score triples sorted by descending anomaly score.
static
score_all(Dataset, Model, Scores)
score_all(+object_identifier, +compound, -list) - one
Protected predicates
(no local declarations; see entity ancestors if any)
Private predicates
(no local declarations; see entity ancestors if any)
Operators
(none)
See also