.. index:: single: c45
.. _c45/0:

.. rst-class:: right

**object**

``c45``
=======

C4.5 decision tree learning algorithm. Builds a decision tree from a dataset object implementing the ``dataset_protocol`` protocol and provides predicates for exporting the learned tree as a list of predicate clauses or to a file. Supports both discrete and continuous attributes, handles missing values, and supports tree pruning.

| **Availability:**
|    ``logtalk_load(c45(loader))``

| **Author:** Paulo Moura
| **Version:** 1:0:0
| **Date:** 2026-02-20

| **Compilation flags:**
|    ``static, context_switching_calls``

| **Implements:**
|    ``public`` :ref:`classifier_protocol `
| **Uses:**
|    :ref:`format `
|    :ref:`list `
|    :ref:`numberlist `
|    :ref:`pairs `

| **Remarks:**

   - Algorithm: C4.5 is an extension of the ID3 algorithm that uses information gain ratio instead of information gain for attribute selection, which avoids bias towards attributes with many values.
   - Discrete attributes: The learned decision tree is represented as ``leaf(Class)`` for leaf nodes and ``tree(Attribute, Subtrees)`` for internal nodes with discrete attributes, where ``Subtrees`` is a list of ``Value-Subtree`` pairs.
   - Continuous attributes: For continuous (numeric) attributes, the tree uses binary threshold splits represented as ``tree(Attribute, threshold(Threshold), LeftSubtree, RightSubtree)`` where ``LeftSubtree`` corresponds to values ``=< Threshold`` and ``RightSubtree`` to values ``> Threshold``.
   - Missing values: Missing attribute values are represented using anonymous variables. During tree construction, examples with missing values for the split attribute are distributed to all branches. Entropy and gain calculations use only examples with known values for the attribute being evaluated.
   - Tree pruning: The ``prune/3`` and ``prune/5`` predicates implement pessimistic error pruning (PEP), which estimates error rates using the upper confidence bound of the binomial distribution (Wilson score interval) with a configurable confidence factor (default 0.25, range ``(0.0, 1.0)``) and minimum instances per leaf (default 2). Subtrees are replaced with leaf nodes when doing so would not increase the estimated error.

| **Inherited public predicates:**
|    :ref:`classifier_protocol/0::classifier_to_clauses/4`  :ref:`classifier_protocol/0::classifier_to_file/4`  :ref:`classifier_protocol/0::learn/2`  :ref:`classifier_protocol/0::predict/3`  :ref:`classifier_protocol/0::print_classifier/1`

.. contents::
   :local:
   :backlinks: top

Public predicates
-----------------

.. index:: prune/5
.. _c45/0::prune/5:

``prune/5``
^^^^^^^^^^^

Prunes a decision tree using pessimistic error pruning (PEP). This post-pruning method estimates error rates using the upper confidence bound of the binomial distribution with the given confidence factor and replaces subtrees with leaf nodes when doing so would not increase the estimated error. Pruning helps reduce overfitting and can improve generalization to unseen data.

| **Compilation flags:**
|    ``static``

| **Template:**
|    ``prune(Dataset,Tree,ConfidenceFactor,MinInstances,PrunedTree)``
| **Mode and number of proofs:**
|    ``prune(+object_identifier,+tree,+float,+positive_integer,-tree)`` - ``one``

| **Remarks:**

   - Confidence factor: The confidence factor controls the aggressiveness of pruning. It must be in the range ``(0.0, 1.0)``. Lower values result in more aggressive pruning (smaller, simpler trees), while higher values result in less pruning (larger, more complex trees). The default value is ``0.25``.
   - Minimum instances per leaf: The minimum number of instances required at a leaf node. When a node has fewer instances than this value, the node may be pruned. It must be a positive integer. The default value is ``2``.
   - Statistical basis: The pruning uses the upper confidence bound of the binomial distribution to estimate the true error rate.

------------

.. index:: prune/3
.. _c45/0::prune/3:

``prune/3``
^^^^^^^^^^^

Prunes a decision tree using pessimistic error pruning (PEP) with default parameter values. Calls ``prune/5`` with ``ConfidenceFactor = 0.25`` and ``MinInstances = 2``.

| **Compilation flags:**
|    ``static``

| **Template:**
|    ``prune(Dataset,Tree,PrunedTree)``
| **Mode and number of proofs:**
|    ``prune(+object_identifier,+tree,-tree)`` - ``one``

| **Remarks:**

   - Default parameters: Uses the standard C4.5 default values: confidence factor of ``0.25`` (the confidence level for computing the upper bound of the error estimate) and minimum instances per leaf of ``2``.

------------

Protected predicates
--------------------

(no local declarations; see entity ancestors if any)

Private predicates
------------------

(no local declarations; see entity ancestors if any)

Operators
---------

(none)

.. seealso::

   :ref:`dataset_protocol `, :ref:`isolation_forest `, :ref:`knn `, :ref:`naive_bayes `, :ref:`nearest_centroid `, :ref:`random_forest `, :ref:`ada_boost `
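
The learning and pruning predicates documented on this page can be combined as in the following minimal sketch. It assumes a hypothetical dataset object named ``my_dataset`` that implements ``dataset_protocol``, that ``learn/2`` takes the dataset as its first argument, and that the term it returns is accepted as the tree argument of ``prune/3`` and ``prune/5``. None of these assumptions is confirmed by this page, so treat the queries as an illustration rather than a reference.

.. code-block:: logtalk

   % Load the library as listed under "Availability".
   | ?- logtalk_load(c45(loader)).

   % my_dataset is a hypothetical object implementing dataset_protocol.
   | ?- c45::learn(my_dataset, Tree),
        % Prune with the defaults: confidence factor 0.25, minimum 2 instances per leaf.
        c45::prune(my_dataset, Tree, DefaultPruned),
        % Prune more aggressively: a lower confidence factor and larger leaves.
        c45::prune(my_dataset, Tree, 0.1, 5, SmallerTree),
        % Print the more aggressively pruned tree.
        c45::print_classifier(SmallerTree).

Both pruning calls take the same unpruned ``Tree``, so the effect of the confidence factor and the minimum instance count can be compared on identical input.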