To go further, here is a selection of scientific papers arranged as a reading path that makes the Auto-ML pipeline easier to understand. We highly recommend reading these papers in the suggested order, after the documentation presented on this website. The gray lines indicate additional material that can be read at a later stage; skipping it will not prevent you from gaining an overall understanding of the pipeline.


More than a hundred articles about Khiops are available on this page.

Optimal Encoding

  1. Discretization models: MODL: a Bayes optimal discretization method for continuous attributes - download
  2. Grouping models: A Bayes optimal approach for partitioning the values of categorical attributes - download
  3. The regression case: A New Probabilistic Approach In Rank Regression with Optimal Bayesian Partitioning - download

Auto Feature Engineering

  1. Multi-table data: A scalable robust and automatic propositionalization approach for Bayesian classification of large mixed numerical and categorical data - download
  2. Decision trees: A Bayes Evaluation Criterion for Decision Trees - download
  3. Pair discretization: Optimum simultaneous discretization with data grid models in supervised classification: a Bayesian model selection approach - download

Parsimonious Training

  1. Previous versions: Compression-Based Averaging of Selective Naive Bayes Classifiers - download
  2. Since Khiops V10: an article on the parsimonious Bayesian classifier, as presented in the documentation, is being written.