Bibliography
More than a hundred articles about Khiops are available on this page.
For further reading, here is a selection of scientific papers arranged along a reading path that makes the Auto-ML pipeline easier to understand. We strongly recommend reading these papers in the suggested order, after reading the documentation on this website. The gray lines indicate additional material that can be read at a later stage and is not required for an overall understanding of the pipeline.
Optimal Encoding
- Discretization models: MODL: a Bayes optimal discretization method for continuous attributes - download
- Grouping models: A Bayes optimal approach for partitioning the values of categorical attributes - download
- The regression case: A New Probabilistic Approach in Rank Regression with Optimal Bayesian Partitioning - download
Auto Feature Engineering
- Multi-table data: A scalable robust and automatic propositionalization approach for Bayesian classification of large mixed numerical and categorical data - download
- Decision trees: A Bayes Evaluation Criterion for Decision Trees - download
- Pair discretization: Optimum simultaneous discretization with data grid models in supervised classification: a Bayesian model selection approach - download