Query specification

The second step is to specify the queries that define the types of rules to generate. You can write your own queries in the RQL language or use the query builder.


An RQL query has the following format:

SCOPE t1 IN (dataset1), t2 IN (dataset2),
WHERE condition(t1,t2,…)
HAVING label1: predicate1(t1,t2,…) OVER attributes1
AND label2: predicate2(t1,t2,…) OVER attributes2

Datasets can be file names, database table names or standard SQL queries.
The condition corresponds to a standard SQL WHERE clause.
Predicates define the properties to be evaluated on each attribute occurring in the corresponding "OVER" clause.
Labels are mandatory only when several predicates are defined for the same attribute.
Attributes must be separated by commas. You can also use "OVER ALL" or "OVER ALL MINUS attributes" to avoid listing a large number of attributes. See the examples for more details.
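
As an illustration of the format above, the following Python sketch assembles the text of an RQL query from its clauses. The helper names (over_clause, build_query) and the sample dataset, attribute and label names are hypothetical and are not part of the tool. The sketch assumes that the attributes after "ALL MINUS" are comma-separated like any other attribute list, and that the WHERE clause may be omitted, as in the predefined queries.

```python
def over_clause(attributes=None, exclude=None):
    """Return the text of an OVER clause.

    attributes: explicit list of attribute names; None or empty means ALL.
    exclude: attribute names to leave out with ALL MINUS (assumed comma-separated).
    """
    if attributes:
        if exclude:
            raise ValueError("use either an explicit list or ALL MINUS, not both")
        return "OVER " + ", ".join(attributes)
    if exclude:
        return "OVER ALL MINUS " + ", ".join(exclude)
    return "OVER ALL"


def build_query(scope, having, where=None):
    """Assemble an RQL query string.

    scope:  list of (tuple_variable, dataset_text) pairs; the dataset text is
            used as given, so a nested SQL query must already be in parentheses.
    having: list of (label_or_None, predicate_text, over_text) triples.
    where:  optional condition text.
    """
    lines = ["SCOPE " + ", ".join(f"{var} IN {dataset}" for var, dataset in scope)]
    if where:
        lines.append("WHERE " + where)
    parts = []
    for label, predicate, over in having:
        prefix = f"{label}: " if label else ""
        parts.append(f"{prefix}{predicate} {over}")
    lines.append("HAVING " + "\nAND ".join(parts))
    return "\n".join(lines)


if __name__ == "__main__":
    # Two labelled predicates on the same attribute (hypothetical names).
    print(build_query(
        scope=[("t1", "(SELECT age FROM patients)")],
        where="t1.age > 0",
        having=[
            ("low", "t1.ATT<=#th1", over_clause(["age"])),
            ("high", "t1.ATT>=#th2", over_clause(["age"])),
        ],
    ))
```

The sketch only produces text; whether the result passes the parsing step still depends on the tool itself.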

    Query Builder

You can use the query builder to ensure that your query is correct and will pass the parsing step. All clauses of the language can be specified. You can give a different threshold to each attribute, or apply the same threshold to all of them by clicking the arrow button.

    Predefined queries

The query builder assistant proposes two predefined queries:

Q1:
SCOPE t1 IN data
HAVING t1.ATT>=#th1 AND t1.ATT<=#th2 OVER quantitative_attributes
AND t1.ATT=#th OVER categorical_attributes

Q2:
SCOPE t1 IN data, t2 IN data
HAVING ABS(t2.ATT-t1.ATT)>=#th1 AND ABS(t2.ATT-t1.ATT)<=#th2 OVER quantitative_attributes
AND t1.ATT=t2.ATT OVER categorical_attributes

Query Q2 tests all pairs of tuples.
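
To make the pairwise test concrete, here is a minimal Python sketch of the quantitative predicate of the second predefined query, ABS(t2.ATT-t1.ATT) between #th1 and #th2, evaluated on every pair of tuples. It is not the tool's implementation: the function names and the sample rows are hypothetical, and it assumes that "all pairs" means the full cross product of the dataset with itself, including a tuple paired with itself. It does not show how rules are then derived from the predicate results.

```python
from itertools import product


def quantitative_predicate_holds(t1, t2, attribute, th1, th2):
    """True when th1 <= |t2[attribute] - t1[attribute]| <= th2."""
    diff = abs(t2[attribute] - t1[attribute])
    return th1 <= diff <= th2


def pairs_satisfying(rows, attribute, th1, th2):
    """Return the (i, j) index pairs whose tuples satisfy the predicate.

    Every ordered pair is tested (cross-product assumption), so n rows
    lead to n * n evaluations.
    """
    return [
        (i, j)
        for (i, t1), (j, t2) in product(enumerate(rows), repeat=2)
        if quantitative_predicate_holds(t1, t2, attribute, th1, th2)
    ]


if __name__ == "__main__":
    rows = [{"age": 30}, {"age": 33}, {"age": 50}]  # hypothetical data
    print(pairs_satisfying(rows, "age", th1=1, th2=5))
```

Under this assumption the number of tests grows with the square of the dataset size, which is worth keeping in mind when choosing the datasets of a two-variable query.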

For these predefined queries, you can change the dataset names (table or file name only), the thresholds and the lists of attributes. The thresholds #th, #th1 and #th2 can be different for each attribute. Computation time is optimized for these two predefined queries.

    Some remarks

You must write all RQL and SQL keywords in uppercase: FINDRULES, SCOPE, IN, HAVING, ATT, AND, OVER, ALL, MINUS, WHERE…
Nested SQL queries must be wrapped in parentheses.
Aggregation functions (COUNT, AVG...) must be followed by an alias (AS + name).
The selection operator * must be prefixed by an alias, as in D.*, and every dataset must have an alias if you use at least one *.
An alphanumeric value in a condition must be wrapped in single quotes ('), like this: t1.ATT != 'TRUE'.
Dataset names are case-sensitive. Don't forget to rename your file if needed before uploading it.
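
The uppercase rule can be checked before submitting a query. The following Python sketch is a hypothetical pre-check, not a feature of the tool: it reports words that match one of the keywords listed above but are not written in uppercase, ignoring text inside single quotes. It only knows the keywords named in this section, and it would also flag a dataset or attribute whose name happens to be a lowercase keyword.

```python
import re

# Keywords taken from the remark above; the list in the source ends with "…",
# so this set is knowingly incomplete.
KEYWORDS = {"FINDRULES", "SCOPE", "IN", "HAVING", "ATT",
            "AND", "OVER", "ALL", "MINUS", "WHERE"}


def non_uppercase_keywords(query):
    """Return the keyword-like words of the query that are not in uppercase."""
    # Blank out single-quoted literals so that 'and' inside a value is ignored.
    without_literals = re.sub(r"'[^']*'", "''", query)
    words = re.findall(r"\b[A-Za-z_][A-Za-z_0-9]*\b", without_literals)
    return [w for w in words if w.upper() in KEYWORDS and w != w.upper()]


if __name__ == "__main__":
    print(non_uppercase_keywords("scope t1 IN data HAVING t1.ATT='and' OVER all"))
```

An empty result does not guarantee that the query is valid; the query builder remains the reliable way to pass the parsing step.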
