Defining and executing queries
Cascalog includes a number of options for the creation and execution of its queries.
<- is the query creation operator. As described in How Cascalog executes a query, <- accepts a vector of output variables and a series of predicates. The various predicates allowed are discussed in Guide to custom operations.
For example, given this dataset:
(def people [["ben" 35]
             ["jerry" 41]])
The following query returns every person in the dataset who is younger than 40:
(<- [?name ?age]
    (people ?name ?age)
    (< ?age 40))
This query does nothing on its own. It can act as a generator for other queries, as described here, or we can bind it to an output tap with the query execution operator.
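As a rough sketch of the generator case, the query can be bound to a var and then used as a predicate inside a second query. The names under-40 and under-40-names are hypothetical, and the snippet assumes Cascalog is on the classpath with its operators available from cascalog.api:

```clojure
(use 'cascalog.api)

(def people [["ben" 35]
             ["jerry" 41]])

;; Binding the query to a var does not execute anything.
(def under-40
  (<- [?name ?age]
      (people ?name ?age)
      (< ?age 40)))

;; The stored query is used as a generator in a second query,
;; which keeps only the ?name field of each tuple.
(def under-40-names
  (<- [?name]
      (under-40 ?name ?age)))
```

Neither definition runs a job; both queries stay inert until one of the execution operators is applied to them.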
?- is the query execution operator. It takes a sequence of <output tap, query> pairs and executes all supplied queries in parallel. For example:
(?- (stdout)
    (<- [?name ?age]
        (people ?name ?age)
        (< ?age 40)))
This prints the following:
RESULTS
-----------------------
ben 35
-----------------------
(stdout) is a Cascading tap that prints the string representation of output tuples to the output stream.
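The example above passes a single pair. A minimal sketch of the multi-pair form might look like the following, assuming Cascalog is on the classpath; the var names under-40 and under-50 are hypothetical:

```clojure
(use 'cascalog.api)

(def people [["ben" 35]
             ["jerry" 41]])

(def under-40
  (<- [?name ?age]
      (people ?name ?age)
      (< ?age 40)))

(def under-50
  (<- [?name ?age]
      (people ?name ?age)
      (< ?age 50)))

;; Taps and queries alternate: each query is written to the tap
;; that immediately precedes it.
(?- (stdout) under-40
    (stdout) under-50)
```

Both queries write to (stdout) here only to keep the sketch short; in practice each query would usually get its own tap.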
?<- allows for combined query creation and execution. It accepts a single output tap, a result vector, and a series of predicates, like so:
(?<- (stdout)
     [?name ?age]
     (people ?name ?age)
     (< ?age 40))
This query is functionally equivalent to the previous example, which used ?-.
??- accepts any number of queries (defined by <-), executes them in parallel, and returns a sequence of sequences of the results of each query's execution.
The nested return value can be confusing, so let's look at two examples. Here's the result of executing a single query with ??-:
(def results
  (??- (<- [?name ?age]
           (people ?name ?age)
           (< ?age 40))))
user=> results
((["ben" 35]))
Notice that we have an outer sequence with a single entry, corresponding to the single subquery we executed. This entry is a sequence containing that query's output tuples. Had we executed two queries, we would see a sequence with two inner sequences:
(def multi-results
  (??- (<- [?name ?age]
           (people ?name ?age)
           (< ?age 40))
       (<- [?name ?age]
           (people ?name ?age)
           (< ?age 50))))
user=> multi-results
((["ben" 35]) (["ben" 35] ["jerry" 41]))
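Because the result is an ordinary nested sequence, it can be taken apart with plain Clojure. The sketch below uses a literal copy of the multi-results value shown above so that it runs without Cascalog; under-40-count and under-50-names are hypothetical names:

```clojure
;; Literal copy of the value printed above, so no query is executed here.
(def multi-results
  '((["ben" 35]) (["ben" 35] ["jerry" 41])))

;; The first inner sequence holds the tuples of the first query.
(def under-40-count
  (count (first multi-results)))

;; The second inner sequence holds the tuples of the second query;
;; `first` extracts the name field from each tuple.
(def under-50-names
  (map first (second multi-results)))
```

Destructuring works equally well, for example (let [[under-40 under-50] multi-results] ...), with one binding per query.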
??<- allows for combined query creation and execution into a Clojure sequence. Because ??<- only allows execution of a single query, it returns a single sequence of tuples:
(def results-??<-
  (??<- [?name ?age]
        (people ?name ?age)
        (< ?age 50)))
user=> results-??<-
(["ben" 35] ["jerry" 41])
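Since the return value is a flat sequence of tuples, it can be fed directly into ordinary Clojure functions. As a small sketch, using a literal copy of the result above (the names tuples and age-by-name are hypothetical):

```clojure
;; Literal copy of the sequence of tuples shown above.
(def tuples
  '(["ben" 35] ["jerry" 41]))

;; Each tuple is a two-element vector, so `into` can treat it as a
;; key/value pair when building a map from name to age.
(def age-by-name
  (into {} tuples))
```

This only works as written for two-field tuples; wider tuples would need an explicit mapping step first.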