Variable block derivation rules
Basic variable block rules
The following variable block rules copy a block of variables within a dictionary (like Copy and CopyC), or obtain a copy of a block of variables defined in an Entity variable (like GetValue and GetValueC).
CopyBlock
Copy of a block of numerical variables. Copy of a block of categorical variables.
Example
In this example, the block of variables named Items is copied to a new block named Words. Each variable has a unique name within the whole dictionary, and a unique VarKey within its block. The variables of the input block are copied to those of the output block having the same VarKey.
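The VarKey matching described above can be modelled outside Khiops with a short Python sketch. This is only an illustration of the semantics, not Khiops code: a sparse block is represented as a dict from VarKey to value, and the function name `copy_block`, the variable names and the values are all hypothetical.

```python
# Illustrative model only: a sparse block is a dict mapping VarKey -> value.
def copy_block(input_values, output_var_keys):
    """Return {output variable name: value} for VarKeys present in the input block."""
    return {
        name: input_values[var_key]
        for name, var_key in output_var_keys.items()
        if var_key in input_values
    }

# Hypothetical record: values of the Items block, stored by VarKey.
items_values = {1: 3.0, 4: 1.0}
# Hypothetical Words block declaration: variable name -> VarKey.
words_var_keys = {"WordArchive": 1, "WordName": 2, "WordDate": 4}

words_values = copy_block(items_values, words_var_keys)
print(words_values)  # {'WordArchive': 3.0, 'WordDate': 1.0}
```

In this sketch, an output variable whose VarKey is absent from the input block simply receives no value; how Khiops represents such an absent value is not modelled here.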
GetBlock
Access to a block of numerical variables of an entity. Access to a block of categorical variables of an entity.
Example
In the following example, the block of variables named Items in the Entity(Document) variable CurriculumVitae is accessed in a new block named CVWords in the main entity.
Sparse partition of a secondary Table
Table variables are used in the multi-table format of Khiops to specify a 0-n relationship between two entities, for example between a customer and its usages in a secondary table.
The following rules partition a secondary table into a set of parts, using a Partition rule that specifies how to partition the secondary table and a TablePartition rule that produces a block of Table variables from a secondary table and the partition specification.
Partition
Builds a partition structure, defined as the cross-product of univariate partitions. Each parameter is a univariate partition, chosen among IntervalBounds, ValueGroups, ValueSetC or ValueSet.
Example
The following bivariate partition uses a ValueSetC rule to partition categorical values into three groups and an IntervalBounds rule to partition numerical values into two intervals.
Structure(Partition) partitionServiceDuration = Partition(ValueSetC("Mobile", "Tel", " * "), IntervalBounds(5.5));
The resulting bivariate partition consists of 6 parts, indexed from 1 to 6.
| | Mobile | Tel | * |
|---|---|---|---|
| ]-inf;5.5] | 1 | 2 | 3 |
| ]5.5;+inf[ | 4 | 5 | 6 |
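The numbering in the table can be reproduced with a small Python sketch. It is a model of the indexing shown above, not the Khiops implementation; the function name `part_index` and its parameters are hypothetical, and it assumes that the first partition dimension varies fastest and that intervals are closed on the right, as the table suggests.

```python
from bisect import bisect_left

def part_index(service, duration, groups=("Mobile", "Tel"), bounds=(5.5,)):
    """Return the 1-based part index of a (service, duration) pair."""
    # Values outside the listed groups fall into the catch-all " * " column.
    column = groups.index(service) if service in groups else len(groups)
    # Right-closed intervals: ]-inf;5.5] is row 0, ]5.5;+inf[ is row 1.
    row = bisect_left(bounds, duration)
    # The categorical dimension varies fastest, one extra column for " * ".
    return row * (len(groups) + 1) + column + 1

print(part_index("Mobile", 3.0))  # 1
print(part_index("Tel", 5.5))     # 2
print(part_index("Fax", 7.0))     # 6
```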
Example
In the following dictionary, the usagesByServiceDuration block of variables is computed from a TablePartition rule that divides the secondary table Usages into a set of sub-parts according to the partition specified in the variable partitionServiceDuration. Among the 6 potential parts, 4 are described in the block of variables and related to their part index using their VarKey. The other 2 parts are simply ignored in the dictionary.
Root Dictionary Customer(id_customer)
{
    Categorical id_customer;
    Categorical Name;
    Table(Usage) Usages;
    Structure(Partition) partitionServiceDuration = Partition(ValueSetC("Mobile", "Tel", " * "), IntervalBounds(5.5));
    {
        Table(Usage) MobileSmallDuration; <VarKey=1>
        Table(Usage) TelSmallDuration; <VarKey=2>
        Table(Usage) MobileLargeDuration; <VarKey=4>
        Table(Usage) TelLargeDuration; <VarKey=5>
    } usagesByServiceDuration = TablePartition(Usages, partitionServiceDuration, Service, Duration);
};
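To make the effect of this dictionary concrete, the following Python sketch mimics how the records of Usages could be dispatched to the four declared parts. It only models the behaviour described above: the record layout, the helper name `table_partition` and the sample data are assumptions, and records falling into the two undeclared parts (VarKey 3 and 6) are dropped.

```python
from bisect import bisect_left
from collections import defaultdict

GROUPS = ("Mobile", "Tel")   # any other Service value goes to the " * " group
BOUNDS = (5.5,)              # intervals ]-inf;5.5] and ]5.5;+inf[
# Parts declared in the block, keyed by VarKey.
DECLARED = {1: "MobileSmallDuration", 2: "TelSmallDuration",
            4: "MobileLargeDuration", 5: "TelLargeDuration"}

def table_partition(usages):
    """Group usage records by declared part; undeclared parts are ignored."""
    parts = defaultdict(list)
    for usage in usages:
        service = usage["Service"]
        column = GROUPS.index(service) if service in GROUPS else len(GROUPS)
        row = bisect_left(BOUNDS, usage["Duration"])
        var_key = row * (len(GROUPS) + 1) + column + 1
        if var_key in DECLARED:
            parts[DECLARED[var_key]].append(usage)
    return dict(parts)

# Hypothetical secondary records of one customer.
usages = [
    {"Service": "Mobile", "Duration": 2.0},
    {"Service": "Tel", "Duration": 9.0},
    {"Service": "Fax", "Duration": 1.0},   # part 3, not declared: ignored
    {"Service": "Mobile", "Duration": 5.5},
]
parts = table_partition(usages)
print({name: len(records) for name, records in parts.items()})
# {'MobileSmallDuration': 2, 'TelLargeDuration': 1}
```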
Computing statistics from blocks of Table variables
Blocks of Table parts can be used to produce blocks of values by computing a statistic of a given secondary variable on each part defined in the block.
Example
The TablePartitionMean rule computes the mean value of the secondary variable Price for each Table part in the block of Table variables usagesByServiceDuration.
Root Dictionary Customer(id_customer)
{
    Categorical id_customer;
    Categorical Name;
    Table(Usage) Usages;
    Structure(Partition) partitionServiceDuration = Partition(ValueSetC("Mobile", "Tel", " * "), IntervalBounds(5.5));
    {
        Table(Usage) MobileSmallDuration; <VarKey=1>
        Table(Usage) TelSmallDuration; <VarKey=2>
        Table(Usage) MobileLargeDuration; <VarKey=4>
        Table(Usage) TelLargeDuration; <VarKey=5>
    } usagesByServiceDuration = TablePartition(Usages, partitionServiceDuration, Service, Duration);
    {
        Numerical MobileSmallDurationMeanPrice; <VarKey=1>
        Numerical TelSmallDurationMeanPrice; <VarKey=2>
        Numerical MobileLargeDurationMeanPrice; <VarKey=4>
        Numerical TelLargeDurationMeanPrice; <VarKey=5>
    } usagesMeanPriceByServiceDuration = TablePartitionMean(usagesByServiceDuration, Price);
};
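As a rough model of what this second block contains, the Python sketch below computes one mean per Table part, keeping the same VarKey for the part and for its statistic. The function name `table_partition_mean`, the record layout and the prices are hypothetical, and skipping empty parts is a simplification of this sketch rather than documented Khiops behaviour.

```python
from statistics import mean

def table_partition_mean(parts, variable):
    """Map each VarKey to the mean of `variable` over the records of its part."""
    return {
        var_key: mean([record[variable] for record in records])
        for var_key, records in parts.items()
        if records  # assumption: an empty part yields no value in this sketch
    }

# Hypothetical block of Table parts, keyed by VarKey.
parts_by_key = {
    1: [{"Price": 2.0}, {"Price": 4.0}],
    4: [],
    5: [{"Price": 10.0}],
}
mean_prices = table_partition_mean(parts_by_key, "Price")
print(mean_prices)  # {1: 3.0, 5: 10.0}
```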
The following sparse rules compute various statistical indicators from a block of Table parts.
TablePartitionCount
Number of records per part.
Example
usagesCountsByServiceDuration = TablePartitionCount(usagesByServiceDuration);
Computing statistics from blocks of values in secondary tables
A block of values in a secondary table corresponds to a list of secondary variables managed in the same block. Table rules such as TableMean, TableMode and TableStdDev are extended to the case of a secondary block of variables, so that they compute a block of values covering all variables in the secondary block.
Example
The TableBlockSum rule computes the sum of the values of each variable in the block of variables defined in the secondary table.
Root Dictionary Applicant(id_applicant)
{
    Categorical id_applicant ;
    Categorical Name ;
    Entity(Document) CurriculumVitae ;
    Entity(Document) MotivationLetter ;
    Table(Document) RecommandationLetters ;
    {
        Numerical archive ; <VarKey=1>
        Numerical name ; <VarKey=2>
        …
    } RecommandationWords = TableBlockSum(RecommandationLetters, items) ;
};
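The per-variable summation can be pictured with the following Python sketch, in which each secondary record carries a sparse block stored as a dict from VarKey to value. This is a model of the described behaviour only: the function name `table_block_sum`, the record layout and the counts are hypothetical.

```python
from collections import defaultdict

def table_block_sum(records, block_name):
    """Sum, VarKey by VarKey, the values of a sparse block over all records."""
    totals = defaultdict(float)
    for record in records:
        for var_key, value in record[block_name].items():
            totals[var_key] += value
    return dict(totals)

# Hypothetical recommendation letters, each with a sparse block of word counts.
letters = [
    {"items": {1: 2.0, 2: 1.0}},
    {"items": {2: 3.0}},
    {"items": {}},
]
word_sums = table_block_sum(letters, "items")
print(word_sums)  # {1: 2.0, 2: 4.0}
```

A VarKey that appears in no record produces no entry in the result, which keeps the output block as sparse as its inputs.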