Table Partition Rules
Sparse partition of a secondary Table
Table variables are used in the multi-table format of Khiops to specify a 0-n relationship between two entities, for example between a customer and its usages in a secondary table.
The Partition rule specifies how to partition a secondary table.
The TablePartition rule produces a block of Table variables from a secondary table and the partition specification.
Computing statistics from blocks of Table variables
Blocks of Table parts can then be used to produce blocks of values by computing the statistics of a given secondary variable on each part defined in the block. The sparse rules below compute various statistical indicators from a block of Table parts.
Example
The TablePartitionMean rule computes the mean value of the secondary variable Price for each Table part in the block of Table variables usagesByServiceDuration.
Root Dictionary Customer(id_customer)
{
Categorical id_customer;
Categorical Name;
Table(Usage) Usages;
Structure(Partition) partitionServiceDuration = Partition(ValueSetC("Mobile", "Tel", " * "), IntervalBounds(5.5));
{
Table(Usage) MobileSmallDuration; <VarKey=1>
Table(Usage) TelSmallDuration; <VarKey=2>
Table(Usage) MobileLargeDuration; <VarKey=4>
Table(Usage) TelLargeDuration; <VarKey=5>
} usagesByServiceDuration = TablePartition(Usages, partitionServiceDuration, Service, Duration);
{
Numerical MobileSmallDurationMeanPrice; <VarKey=1>
Numerical TelSmallDurationMeanPrice; <VarKey=2>
Numerical MobileLargeDurationMeanPrice; <VarKey=4>
Numerical TelLargeDurationMeanPrice; <VarKey=5>
} usagesMeanPriceByServiceDuration = TablePartitionMean(usagesByServiceDuration, Price);
};
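The per-part mean computed in this example can be pictured with a small Python sketch. It is only an illustration, not Khiops code: the function name `table_partition_mean`, the dictionary-of-lists representation of a block, and the sample prices are assumptions made for the example. The source does not say how empty parts are treated, so the sketch simply skips them.

```python
from statistics import mean


def table_partition_mean(parts, field):
    """Return {VarKey: mean of `field`} for each non-empty part.

    `parts` maps a part index (VarKey) to the list of records in that part.
    Skipping empty parts is an assumption of this sketch.
    """
    return {
        key: mean(rec[field] for rec in recs)
        for key, recs in parts.items()
        if recs
    }


# Hypothetical block of Table parts, keyed by VarKey.
usages_by_service_duration = {
    1: [{"Price": 1.0}, {"Price": 2.0}],  # MobileSmallDuration
    2: [],                                # TelSmallDuration (empty)
    5: [{"Price": 3.0}],                  # TelLargeDuration
}

mean_price = table_partition_mean(usages_by_service_duration, "Price")
```

The result keeps one value per non-empty part, keyed by the same VarKey as the input block, which mirrors how the Numerical variables in the dictionary reuse the VarKey of the corresponding Table parts.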
Partition
Builds a partition structure, which is a cross-product of univariate partitions. The parameters are univariate partitions, chosen among IntervalBounds, ValueGroups, ValueSetC or ValueSet.
Example
The following bivariate partition uses a ValueSetC rule to partition categorical values into three groups and an IntervalBounds rule to partition numerical values into two intervals:
Structure(Partition) partitionServiceDuration = Partition(ValueSetC("Mobile", "Tel", " * "), IntervalBounds(5.5));
The resulting bivariate partition consists of 6 parts, indexed from 1 to 6.
| | Mobile | Tel | * |
|---|---|---|---|
| ]-inf;5.5] | 1 | 2 | 3 |
| ]5.5;+inf[ | 4 | 5 | 6 |
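The numbering in this table can be reproduced with a short Python sketch. It is an illustration derived from the table only and makes no claim about how Khiops computes indexes internally; the names `part_index`, `SERVICE_VALUES` and `DURATION_BOUNDS` are hypothetical.

```python
from bisect import bisect_left

# Hypothetical constants mirroring the example partition.
SERVICE_VALUES = ["Mobile", "Tel"]  # any other value falls into the " * " group
DURATION_BOUNDS = [5.5]             # intervals ]-inf;5.5] and ]5.5;+inf[


def part_index(service, duration):
    """Return the 1-based part index of a (service, duration) pair."""
    # Column: position of the explicit value, or the last column for " * ".
    if service in SERVICE_VALUES:
        col = SERVICE_VALUES.index(service)
    else:
        col = len(SERVICE_VALUES)
    # Row: bisect_left keeps a value equal to a bound in the lower interval,
    # which matches the closed upper end of ]-inf;5.5].
    row = bisect_left(DURATION_BOUNDS, duration)
    n_cols = len(SERVICE_VALUES) + 1
    return row * n_cols + col + 1
```

For instance, `part_index("Tel", 5.5)` falls in the first row and second column, and `part_index("Fax", 7.0)` falls in the second row under the `*` column.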
TablePartition
Builds a block of Table parts from a secondary Table and the specification of a partition. Note that the block of variables is potentially sparse, as only the non-empty parts are managed.
Example
In the following dictionary, the usagesByServiceDuration block of variables is computed from a TablePartition rule that divides the secondary table Usages into a set of sub-parts according to the partition specified in the variable partitionServiceDuration. Among the 6 potential parts, 4 are described in the block of variables and related to their part index using their VarKey. The other 2 parts are simply ignored in the dictionary.
Root Dictionary Customer(id_customer)
{
Categorical id_customer;
Categorical Name;
Table(Usage) Usages;
Structure(Partition) partitionServiceDuration = Partition(ValueSetC("Mobile", "Tel", " * "), IntervalBounds(5.5));
{
Table(Usage) MobileSmallDuration; <VarKey=1>
Table(Usage) TelSmallDuration; <VarKey=2>
Table(Usage) MobileLargeDuration; <VarKey=4>
Table(Usage) TelLargeDuration; <VarKey=5>
} usagesByServiceDuration = TablePartition(Usages, partitionServiceDuration, Service, Duration);
};
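The sparse behaviour described above can be sketched in Python. This is a rough illustration under stated assumptions, not the Khiops implementation: the function `table_partition`, the record layout, and the sample usages are all hypothetical, and the index formula is the one read off the 2 x 3 example partition.

```python
from bisect import bisect_left
from collections import defaultdict


def table_partition(records, declared_keys):
    """Group records by part index, keeping only declared, non-empty parts."""
    parts = defaultdict(list)
    for rec in records:
        # Column: 0 for "Mobile", 1 for "Tel", 2 for the " * " group.
        col = {"Mobile": 0, "Tel": 1}.get(rec["Service"], 2)
        # Row: 0 for ]-inf;5.5], 1 for ]5.5;+inf[.
        row = bisect_left([5.5], rec["Duration"])
        key = row * 3 + col + 1
        # Parts not declared in the block (here 3 and 6) are ignored.
        if key in declared_keys:
            parts[key].append(rec)
    return dict(parts)


# Hypothetical secondary records for one customer.
usages = [
    {"Service": "Mobile", "Duration": 2.0},
    {"Service": "Tel", "Duration": 7.0},
    {"Service": "Fax", "Duration": 1.0},   # part 3, not declared
    {"Service": "Mobile", "Duration": 3.5},
]

block = table_partition(usages, declared_keys={1, 2, 4, 5})
```

Only keys 1 and 5 appear in `block`: parts 2 and 4 are declared but empty for this customer, so nothing is stored for them, and the record falling in part 3 is dropped because that part is not declared.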
TablePartitionCount
Number of records per part.
Example
usagesCountsByServiceDuration = TablePartitionCount(usagesByServiceDuration);