core.dictionary¶
Submodule of khiops.core
Classes to manipulate Khiops Dictionary files
Note
For a complete illustration of how to access the information of every class in this module, look at their write methods, which write them in the Khiops Dictionary file format (.kdic).
Functions¶
- read_dictionary_file: Reads a Khiops dictionary file
- upper_scope: Applies the upper-scope operator
Classes¶
- Dictionary: A Khiops Dictionary
- DictionaryDomain: Main class containing the information of a Khiops dictionary file
- MetaData: A metadata container for a dictionary, a variable or variable block
- Rule: A rule of a variable or variable block in a Khiops dictionary
- Variable: A variable of a Khiops dictionary
- VariableBlock: A variable block of a Khiops dictionary
- class khiops.core.dictionary.Dictionary(json_data=None)¶
Bases: object
A Khiops Dictionary
A Khiops Dictionary is a description of a table transformation. Common uses in the Khiops framework are:
- Describing the schema of an input table: in this case it is the identity transformation of the table(s).
- Describing a predictor (classifier or regressor): in this case it is the transformation between the original table(s) and the prediction values or probabilities.
- Parameters:
- json_data : dict, optional
Python dictionary representing an element of the list at the dictionaries field of a Khiops Dictionary JSON file. If not specified, an empty instance is returned.
- Attributes:
- name : str
Dictionary name.
- root : bool
True if the dictionary is the root of a dictionary hierarchy.
- key : list of str
Names of the key variables.
- variables : list of Variable
The dictionary variables.
- variable_blocks : list of VariableBlock
The dictionary variable blocks.
- label : str
Dictionary label.
- comments : list of str
List of dictionary comments.
- internal_comments : list of str
List of internal dictionary comments.
- meta_data : MetaData
MetaData object of the dictionary.
- add_variable(variable)¶
Adds a variable to this dictionary
- Parameters:
- variable : Variable
The variable to be added.
- Raises:
TypeError: If variable is not of type Variable.
ValueError: If the name is empty or if there is already a variable with that name.
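The checks listed under Raises can be pictured with a small stand-in model. The sketch below does not use the khiops package: ToyVariable and ToyDictionary are hypothetical names invented for illustration, and the real Dictionary.add_variable may perform further validation.

```python
class ToyVariable:
    """Hypothetical stand-in for a variable: only the name matters here."""

    def __init__(self, name=""):
        self.name = name


class ToyDictionary:
    """Hypothetical stand-in mimicking the documented add_variable checks."""

    def __init__(self):
        self.variables = []  # variables in insertion order
        self._by_name = {}   # name -> variable lookup

    def add_variable(self, variable):
        # Wrong type: TypeError, as documented
        if not isinstance(variable, ToyVariable):
            raise TypeError("variable must be a ToyVariable")
        # Empty or duplicate name: ValueError, as documented
        if not variable.name:
            raise ValueError("variable name is empty")
        if variable.name in self._by_name:
            raise ValueError(f"variable '{variable.name}' already exists")
        self.variables.append(variable)
        self._by_name[variable.name] = variable

    def get_variable(self, variable_name):
        # None is returned when the name is unknown
        return self._by_name.get(variable_name)


toy = ToyDictionary()
toy.add_variable(ToyVariable("PetalLength"))
```

Adding a second ToyVariable named "PetalLength" to toy raises ValueError in this model, which mirrors the duplicate-name rule above.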
- add_variable_block(variable_block)¶
Adds a variable block to this dictionary
- Parameters:
- variable_block : VariableBlock
The variable block to be added.
- Raises:
TypeError: If variable_block is not of type VariableBlock.
ValueError: If the name is empty or if there is already a variable block with that name.
- add_variable_from_spec(name, type, label='', used=True, object_type=None, structure_type=None, rule=None, meta_data=None)¶
Adds a variable to this dictionary using a complete specification
- Parameters:
- name : str
Variable name.
- type : str
Variable type. See Variable.
- label : str, default ""
Label of the variable.
- used : bool, default True
Usage status of the variable.
- object_type : str, optional
Object type. Ignored if the variable type is not in ["Entity", "Table"].
- structure_type : str, optional
Structure type. Ignored if the variable type is not "Structure".
- rule : str, optional
String representation of a variable rule.
- meta_data : dict, optional
A Python dictionary which holds the metadata specification. The dictionary keys are str. The values can be str, bool, float or int.
- Raises:
ValueError:
- If the variable name is empty or does not comply with the formatting constraints.
- If there is already a variable with the same name.
- If the given variable type is unknown.
- If 'object_type' or 'structure_type' is given for a native type.
- If 'meta_data' is not a dictionary.
- copy()¶
Returns a copy of this instance
- Returns:
Dictionary: A copy of this instance.
- get_value(key)¶
Returns the metadata value associated to the specified key
- Returns:
MetaData: Metadata value associated with the specified key. None is returned if the metadata key is not found.
- get_variable(variable_name)¶
Returns the specified variable
- Parameters:
- variable_namestr
A name of a variable.
- Returns:
Variable: The specified variable. None is returned if the variable name is not found.
- get_variable_block(variable_block_name)¶
Returns the specified variable block
- Parameters:
- variable_block_namestr
A name of a variable block.
- Returns:
VariableBlock: The specified variable block. None is returned if the variable block name is not found.
- is_key_variable(variable)¶
Returns True if a variable belongs to this dictionary’s key
- Parameters:
- variable : Variable
The variable for the query.
- Returns:
- bool
True if the variable belongs to the key.
- remove_variable(variable_name)¶
Removes the specified variable from this dictionary
- remove_variable_block(variable_block_name, keep_native_block_variables=True)¶
Removes the specified variable block from this dictionary
Note
Non-native block variables (those created from block rules) are never kept in the dictionary.
- Parameters:
- variable_block_name : str
Name of the variable block to be removed.
- keep_native_block_variables : bool, default True
If True and the block is native, only the block structure is removed from the dictionary and its variables are kept in it; afterwards the variables no longer point to the block and the removed block no longer points to the variables. If False, the variables are removed from the dictionary and the removed block keeps the references to its variables.
- Returns:
VariableBlock: The removed variable block.
- Raises:
KeyError: If no variable block with the specified name exists.
- use_all_variables(is_used)¶
Sets the used flag of all dictionary variables to the specified value
- Parameters:
- is_used : bool
The used field of every Variable object in this dictionary is set to is_used.
- write(writer)¶
Writes the dictionary to a file writer in .kdic format
- Parameters:
- writer : KhiopsOutputWriter
Output dictionary file.
- class khiops.core.dictionary.DictionaryDomain(json_data=None)¶
Bases: KhiopsJSONObject
Main class containing the information of a Khiops dictionary file
A DictionaryDomain is a collection of Dictionary objects. These dictionaries usually represent either a database schema or a predictor model.
- Parameters:
- json_data : dict, optional
Python dictionary representing the data of a Khiops Dictionary JSON file. If not specified, an empty instance is returned.
Note
Prefer the read_dictionary_file function from the core API to obtain an instance of this class from a Khiops Dictionary file (kdic or kdicj).
- Attributes:
- tool : str
Name of the Khiops tool that generated the dictionary file.
- version : str
Version of the Khiops tool that generated the dictionary file.
- dictionaries : list of Dictionary
The domain’s dictionaries.
- add_dictionary(dictionary)¶
Adds a dictionary to this domain
- Parameters:
- dictionary : Dictionary
The dictionary to be added.
- Raises:
TypeError: If dictionary is not of type Dictionary.
- copy()¶
Copies this domain instance
- Returns:
DictionaryDomain: A copy of this instance.
- export_khiops_dictionary_file(kdic_file_path)¶
Exports the domain in .kdic format
- Parameters:
- kdic_file_path : str
Path of the output dictionary file (.kdic).
- extract_data_paths(source_dictionary_name)¶
Extracts the data paths for a dictionary in a multi-table schema
See Multi-Table Learning Primer for more details about data paths.
- Parameters:
- source_dictionary_name : str
Name of a dictionary.
- Returns:
- list of str
The additional data paths for the secondary tables of the specified dictionary.
- get_dictionary(dictionary_name)¶
Returns the specified dictionary
- Parameters:
- dictionary_name : str
Name of the dictionary.
- Returns:
Dictionary: The specified dictionary. None is returned if the dictionary name is not found.
- get_dictionary_at_data_path(data_path)¶
Returns the dictionary for the specified data path
- Parameters:
- data_path : str
A data path for the specified table. Usually the output of extract_data_paths.
- Returns:
Dictionary: The dictionary object pointed to by this data path.
- Raises:
ValueError: If the path is not found.
- remove_dictionary(dictionary_name)¶
Removes a dictionary from the domain
- Returns:
Dictionary: The removed dictionary.
- Raises:
KeyError: If no dictionary with the specified name exists.
- write(stream_or_writer)¶
Writes the domain to a file writer in .kdic format
- Parameters:
- stream_or_writer : io.IOBase or KhiopsOutputWriter
Output stream or writer.
- class khiops.core.dictionary.MetaData(json_data=None)¶
Bases: object
A metadata container for a dictionary, a variable or variable block
The metadata for both dictionaries and variables is a list of key-value pairs. The values can be set either to a string, to a number, or to the boolean value True. The latter represents flag metadata: they are either present (True) or absent.
- Parameters:
- json_data : dict, optional
Python dictionary representing the object at a metaData field of a dictionary domain, dictionary or variable in a Khiops Dictionary JSON file. If None, an empty instance is returned.
- Attributes:
- keys : list of str
The metadata keys.
- values : list
Metadata values for each key in keys (synchronized lists). They can be either str, int or float.
- add_value(key, value)¶
Adds a value at the specified key
- Parameters:
- key : str
Key to be added. A valid key is a sequence of non-accented alphanumeric characters which starts with a non-numeric character.
- value : bool, int, float or str
Value to be added.
- Raises:
TypeError:
- If the key is not a valid string.
- If the value is neither a valid string nor a bool, int or float.
ValueError: If the key is already stored.
- get_value(key)¶
Returns the value at the specified key
- Returns:
- int, str or float
The value at the specified key. None is returned if the key is not found.
- Raises:
TypeError: If key is not str.
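To make the "synchronized lists" idea concrete, here is a minimal stand-in that stores keys and values in two parallel lists and follows the add_value and get_value rules described above. It does not use the khiops package: ToyMetaData is a hypothetical name, and the key pattern (an ASCII letter followed by ASCII letters or digits) is only one possible reading of the key constraint, not the library's actual check.

```python
import re

# Assumed reading of the key rule: ASCII letter first, then letters/digits
_KEY_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9]*")


class ToyMetaData:
    """Hypothetical stand-in: keys[i] always pairs with values[i]."""

    def __init__(self):
        self.keys = []
        self.values = []

    def add_value(self, key, value):
        if not isinstance(key, str) or not _KEY_PATTERN.fullmatch(key):
            raise TypeError("key is not a valid string")
        if not isinstance(value, (bool, int, float, str)):
            raise TypeError("value must be bool, int, float or str")
        if key in self.keys:
            raise ValueError(f"key '{key}' is already stored")
        # Append to both lists together so that they stay synchronized
        self.keys.append(key)
        self.values.append(value)

    def get_value(self, key):
        if not isinstance(key, str):
            raise TypeError("key must be str")
        if key not in self.keys:
            return None
        return self.values[self.keys.index(key)]

    def is_empty(self):
        return not self.keys


meta = ToyMetaData()
meta.add_value("Level", 0.25)
meta.add_value("TargetVariable", True)  # flag metadata: present means True
```

Looking up a key that was never added returns None rather than raising, which is the behaviour described for get_value.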
- is_empty()¶
Returns True if the meta-data is empty
- Returns:
- bool
True if the meta-data is empty.
- remove_key(key)¶
Removes the value at the specified key
- write(writer)¶
Writes the metadata to a file writer in .kdic format
- Parameters:
- writer : KhiopsOutputWriter
Output writer.
- class khiops.core.dictionary.Rule(*name_and_operands, verbatim=None, is_reference=False)¶
Bases: object
A rule of a variable or variable block in a Khiops dictionary
This object is a convenience feature which eases rule creation and serialization, especially in complex cases (rule operands which are variables or rules themselves, sometimes upper-scoped). A Rule instance must be converted to str before setting it in a Variable or VariableBlock instance.
Rule instances can be created either from full operand specifications, or from verbatim rules. The latter is useful when the rule is retrieved from an existing variable or variable block and is used as an operand in another rule.
- Parameters:
- name_and_operands : tuple
Each tuple member can have one of the following types:
The first element of the name_and_operands tuple is the name of the rule and must be str or bytes and non-empty for a standard rule, i.e. if is_reference is not set.
- verbatim : str or bytes, optional
Verbatim representation of an entire rule. If set, then name_and_operands must be empty.
- is_reference : bool, default False
If set to True, then the rule is serialized as a reference rule: Rule(Operand1, Operand2, ...) is serialized as [Operand1, Operand2, ...].
- Attributes:
- name : str or bytes or None
Name of the rule. It is None for reference rules.
- operands : tuple of operands
Each operand has one of the following types:
- is_reference : bool
The reference status of the rule.
Note
This attribute cannot be changed on a Rule instance.
Examples
- basic rule, with variables as operands:
- verbatim:
    Product(PetalLength, PetalWidth)
- object construction:
    from khiops import core as kh

    # Build the two operand variables
    petal_length_var = kh.Variable()
    petal_length_var.name = "PetalLength"
    petal_length_var.type = "Numerical"
    petal_width_var = kh.Variable()
    petal_width_var.name = "PetalWidth"
    petal_width_var.type = "Numerical"

    rule = kh.Rule("Product", petal_length_var, petal_width_var)
- multi-table rule:
- verbatim:
    TableCount(
        TableSelection(
            Vehicles,
            EQ(PassengerNumber, 1)
        )
    )
- object construction:
    # accidents_dictionary and vehicles_dictionary are existing Dictionary objects
    vehicles_var = accidents_dictionary.get_variable("Vehicles")
    passenger_number_var = vehicles_dictionary.get_variable("PassengerNumber")
    rule = kh.Rule(
        "TableCount",
        kh.Rule(
            "TableSelection",
            vehicles_var,
            kh.Rule("EQ", passenger_number_var, 1),
        ),
    )
- multi-table rule with upper-scoped operands (advanced usage):
- verbatim:
    TableSelection(
        Vehicles,
        EQ(
            PassengerNumber,
            .TableMax(Vehicles, PassengerNumber)
        )
    )
- object construction:
    vehicles_var = accidents_dictionary.get_variable("Vehicles")
    passenger_number_var = vehicles_dictionary.get_variable("PassengerNumber")
    rule = kh.Rule(
        "TableSelection",
        vehicles_var,
        kh.Rule(
            "EQ",
            passenger_number_var,
            # The inner rule is evaluated in the upper scope
            kh.upper_scope(
                kh.Rule("TableMax", vehicles_var, passenger_number_var)
            ),
        ),
    )
- write(writer)¶
Writes the rule to a file writer in the .kdic format
This method ensures proper Rule serialization, automatically handling:
- back-quote recoding in variable names
- double-quote recoding in categorical constants
- missing data (inf, -inf, NaN) serialization as #Missing
- upper-scope operator serialization as .
- Parameters:
- writer : KhiopsOutputWriter
Output writer.
Note
self.name is not included in the serialization of reference rules.
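Two of the serialization behaviours described for Rule (the bracketed form of reference rules and the #Missing encoding of non-finite numbers) can be sketched with a toy serializer. This is an illustration only: serialize_operand and serialize_rule are hypothetical helpers, not part of khiops, and the sketch deliberately ignores back-quote recoding, double-quote recoding and upper-scoped operands.

```python
import math


def serialize_operand(operand):
    """Render one operand; non-finite floats become #Missing."""
    if isinstance(operand, float) and (math.isinf(operand) or math.isnan(operand)):
        return "#Missing"
    return str(operand)


def serialize_rule(name, operands, is_reference=False):
    """Render Name(op1, op2, ...) or, for reference rules, [op1, op2, ...]."""
    body = ", ".join(serialize_operand(operand) for operand in operands)
    if is_reference:
        # The name is not part of a reference rule's serialization
        return f"[{body}]"
    return f"{name}({body})"


standard = serialize_rule("Product", ["PetalLength", "PetalWidth"])
reference = serialize_rule(None, ["TableName"], is_reference=True)
with_missing = serialize_rule("Sum", ["Var1", float("nan")])
```

In this toy model, operands are plain strings standing in for variable names; the real class also accepts Variable and Rule objects as operands, as the examples in this section show.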
- class khiops.core.dictionary.Variable(json_data=None)¶
Bases: object
A variable of a Khiops dictionary
- Parameters:
- json_data : dict, optional
Python dictionary representing an element of the list at the variables field of dictionaries found in a Khiops Dictionary JSON file. If not specified, an empty instance is returned.
- Attributes:
- name : str
Variable name.
- used : bool
True if the variable is used.
- type : str
Variable type. It can be either native (Categorical, Numerical, Time, Date, Timestamp, TimestampTZ, Text), internal (TextList, Structure) or relational (Entity: 0-1 relationship, Table: 0-n relationship).
- object_type : str
Type complement for the Table and Entity types.
- structure_type : str
Type complement for the Structure type. Set to "" for other types.
- rule : str
Derivation rule or external table reference. Set to "" if there is no rule associated to this variable. Examples:
standard rule: "Sum(Var1, Var2)"
reference rule: "[TableName]"
- variable_block : VariableBlock
Block to which the variable belongs. Not set if the variable does not belong to a block.
- label : str
Variable label.
- comments : list of str
List of variable comments.
- meta_data : MetaData
Variable metadata.
Examples
- See the following function of the samples.py documentation script:
- full_type()¶
Returns the variable’s full type
- Returns:
- str
The full type is the variable type plus its complement if the type is not basic.
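As an illustration of "type plus its complement", the sketch below builds a full type string from the three type-related attributes. The function toy_full_type is hypothetical, and the parenthesized rendering such as Table(Vehicle) is an assumption based on how types are written in the .kdic language, not a statement about the library's code.

```python
def toy_full_type(var_type, object_type="", structure_type=""):
    """Return the type followed by its complement when it has one."""
    if var_type in ("Entity", "Table"):
        # Relational types are completed by the object type
        return f"{var_type}({object_type})"
    if var_type == "Structure":
        # Structure is completed by the structure type
        return f"{var_type}({structure_type})"
    # Basic types have no complement
    return var_type


numerical = toy_full_type("Numerical")
table = toy_full_type("Table", object_type="Vehicle")
structure = toy_full_type("Structure", structure_type="DataGrid")
```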
- get_value(key)¶
Returns the metadata value associated to the specified key
- Returns:
MetaData: Metadata value associated with the specified key. None is returned if the metadata key is not found.
- is_native()¶
Returns True if the variable comes directly from a data column
Variables are not native if they come from a derivation rule, an external entity, a sub-table or structures.
- Returns:
- bool
True if the variable comes directly from a data column.
- is_reference_rule()¶
Returns True if the special reference rule is used
The reference rule is used to make reference to an external entity.
- Returns:
- bool
True if the special reference rule is used.
- is_relational()¶
Returns True if the variable is of relational type
Relational variables reference other tables or external entities.
- Returns:
- bool
True if the variable is of relational type.
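The two predicates above can be pictured as simple checks on the type and rule attributes. The helpers below are hypothetical stand-ins, not khiops code: the relational test relies on Entity and Table being the relational types listed for the type attribute, and the reference-rule test assumes that a reference rule is recognizable by its bracketed form, as in "[TableName]".

```python
RELATIONAL_TYPES = ("Entity", "Table")


def toy_is_relational(var_type):
    """True for the relational types listed for the type attribute."""
    return var_type in RELATIONAL_TYPES


def toy_is_reference_rule(rule):
    """Assumption: a reference rule string starts with an opening bracket."""
    return rule.startswith("[")


relational = toy_is_relational("Table")
reference = toy_is_reference_rule("[TableName]")
```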
- write(writer)¶
Writes the variable to a file writer in .kdic format
- Parameters:
- writer : KhiopsOutputWriter
Output writer.
- class khiops.core.dictionary.VariableBlock(json_data=None)¶
Bases: object
A variable block of a Khiops dictionary
- Parameters:
- json_data : dict, optional
Python dictionary representing an element of the list at the variables field of a dictionary object in a Khiops Dictionary JSON file. The element must have a blockName field. If not specified, an empty instance is returned.
- Attributes:
- namestr
Block name.
- rule
Block derivation rule.
- variables
List of the Variable objects of the block.
- labelstr
Block label.
- commentslist of str
List of block comments.
- internal_commentslist of str
List of internal block comments.
- meta_data
Metadata object of the block.
- add_variable(variable)¶
Adds a variable to this block
- get_value(key)¶
Returns the metadata value associated to the specified key
- Returns:
MetaData: Metadata value associated with the specified key. None is returned if the metadata key is not found.
- remove_variable(variable)¶
Removes a variable from this block
- write(writer)¶
Writes the variable block to a file writer in .kdic format
- Parameters:
- writer : KhiopsOutputWriter
Output writer.
- khiops.core.dictionary.read_dictionary_file(dictionary_file_path)¶
Reads a Khiops dictionary file
- Parameters:
- dictionary_file_path : str
Path of the file to be imported. The file can be either Khiops Dictionary (extension .kdic) or Khiops JSON Dictionary (extension .json or .kdicj).
- Returns:
DictionaryDomain: A dictionary domain representing the information in the dictionary file.
- Raises:
ValueError: When the file has an extension other than .kdic, .kdicj or .json.
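The extension rule can be sketched as a small pre-check. The helper below is hypothetical and does not call khiops; it only illustrates the documented ValueError condition, and it assumes the comparison is case-sensitive, which the description above does not specify.

```python
import os

# Extensions accepted according to the description above
ALLOWED_EXTENSIONS = {".kdic", ".kdicj", ".json"}


def check_dictionary_file_extension(dictionary_file_path):
    """Return the file extension, or raise ValueError if it is not accepted."""
    _, extension = os.path.splitext(dictionary_file_path)
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError(
            f"Unsupported extension '{extension}': "
            "expected .kdic, .kdicj or .json"
        )
    return extension


kdic_extension = check_dictionary_file_extension("Iris.kdic")
json_extension = check_dictionary_file_extension("models/Iris.kdicj")
```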
Examples
- See the following functions of the samples.py documentation script:
- khiops.core.dictionary.upper_scope(operand)¶
Applies the upper-scope operator . to an operand
- Parameters:
- Returns:
- upper-scoped operand
The upper-scoped operand, as if the upper-scope operator . were applied to an operand in a rule in the .kdic dictionary language.
- Raises: