Bibliography

More than a hundred articles about Khiops are available on this page.

For further reading, the scientific papers below are organized as a reading path that clarifies the AutoML pipeline. We recommend reading them in the suggested order, after exploring the documentation on this website. Gray entries are complementary material that can be read later without hindering your overall understanding of the pipeline.

Optimal Encoding

  1. Discretization models: MODL: a Bayes optimal discretization method for continuous attributes
  2. Grouping models: A Bayes optimal approach for partitioning the values of categorical attributes
  3. The regression case: A New Probabilistic Approach In Rank Regression with Optimal Bayesian Partitioning

Auto Feature Engineering

  1. Multi-table data: A scalable robust and automatic propositionalization approach for Bayesian classification of large mixed numerical and categorical data
  2. Decision trees: A Bayes Evaluation Criterion for Decision Trees
  3. Pair discretization: Optimum simultaneous discretization with data grid models in supervised classification: a Bayesian model selection approach

Parsimonious Training

  1. Fractional Naive Bayes (FNB): Non-convex optimization for a parsimonious weighted selective naive Bayes classifier
  2. Previous versions (Khiops <v10): Compression-Based Averaging of Selective Naive Bayes Classifiers