Development
- Do One Thing and Do It Well to keep the program simple, maintainable, and robust.
- Play nicely with other tools. In particular, produce output that other tools can consume, which supports the Do One Thing and Do It Well principle.
To run unit tests:
cd clairvoyance
python3 -m unittest tests/*_test.py
- First, bump the version in pyproject.toml.
- Then trigger the CD process:
git tag v2.0.1 main
git push origin v2.0.1
Since we're trying to obtain a valid schema, it's good to know which essential components every valid schema has. Pick a schema of your choice and look at its top-level keys.
cat schema.json | jq '.data.__schema | keys'
[
"directives",
"mutationType",
"queryType",
"subscriptionType",
"types"
]
We can skip directives for now, as well as mutationType, queryType and subscriptionType, because they are simple {"name": "Root"} dictionaries where Root can be obtained via the __typename field.
The interesting part is the types key. Let's take a closer look at it. It's an array, and each element has the following keys.
cat schema.json | jq '.data.__schema.types[0] | keys'
[
"description",
"enumValues",
"fields",
"inputFields",
"interfaces",
"kind",
"name",
"possibleTypes"
]
The most important keys are name, kind, fields and inputFields. enumValues and possibleTypes are important too, but they aren't supported by clairvoyance yet. description and interfaces aren't as important and are probably hard to obtain.
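The same inspection can be done from Python. The sketch below is only an illustration: the sample document is a made-up, heavily trimmed stand-in for a real schema.json, and summarize_types is a hypothetical helper, not part of clairvoyance.

```python
import json

# Made-up, trimmed stand-in for an introspection result (schema.json).
sample = json.loads("""
{
  "data": {
    "__schema": {
      "queryType": {"name": "Root"},
      "mutationType": null,
      "subscriptionType": null,
      "directives": [],
      "types": [
        {"name": "Root", "kind": "OBJECT",
         "fields": [{"name": "film"}, {"name": "starship"}],
         "inputFields": null},
        {"name": "String", "kind": "SCALAR",
         "fields": null, "inputFields": null}
      ]
    }
  }
}
""")


def summarize_types(schema: dict) -> dict:
    """Map each type name to its kind and the names of its fields."""
    summary = {}
    for type_ in schema["data"]["__schema"]["types"]:
        fields = type_.get("fields") or []  # "fields" is null for scalars
        summary[type_["name"]] = (type_["kind"], [f["name"] for f in fields])
    return summary


if __name__ == "__main__":
    print(summarize_types(sample))
```

Only name, kind and fields are read here, which mirrors the keys called out as most important above.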
...
Apollo Server / graphql-js "features"
Let's assume that our target is Apollo Server and introspection is disabled. Which "features" of Apollo Server can we use to obtain the schema?
Please note that the examples are shown on fields, but the same techniques apply to arguments as well.
If we supply an invalid field that is similar to a valid one, the underlying graphql-js library will kindly give us a list of suggestions containing valid fields.
{
"query": "{ star }"
}
{
"errors": [
{
"message": "Cannot query field \"star\" on type \"Root\". Did you mean \"starship\"?",
"locations": [
{
"line": 1,
"column": 3
}
]
}
]
}
More information about this behaviour can be found in the "How apollo-server suggestions works?" issue.
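Extracting the suggested names from such a message can be done offline with a regular expression. This is a minimal sketch, assuming the message wording shown in the example above (other servers or graphql-js versions may phrase it differently); get_suggestions is a hypothetical name, not the project's function.

```python
import re

# A quoted GraphQL name: starts with a letter or underscore.
_QUOTED_NAME = re.compile(r'"([_A-Za-z][_0-9A-Za-z]*)"')


def get_suggestions(message: str) -> list:
    """Return the names listed after 'Did you mean' in an error message."""
    _, marker, tail = message.partition("Did you mean")
    if not marker:
        return []  # no suggestion part in this message
    return _QUOTED_NAME.findall(tail)


if __name__ == "__main__":
    msg = 'Cannot query field "star" on type "Root". Did you mean "starship"?'
    print(get_suggestions(msg))
```

Restricting matches to valid GraphQL names means a quoted hint that is not a plain name (for example one containing braces) is ignored rather than reported as a field.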
{
"query": "{ vehicle }"
}
{
"errors": [
{
"message": "Field \"vehicle\" of type \"Vehicle\" must have a selection of subfields. Did you mean \"vehicle { ... }\"?",
"locations": [
{
"line": 1,
"column": 3
}
]
}
]
}
So we can tell whether a field is valid even without suggestions: an error asking for a selection of subfields confirms that the field exists and reveals its type.
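A sketch of how this message could be parsed is shown below. The pattern is modelled on the example message above and is an assumption about its wording; get_field_and_type is a hypothetical helper name.

```python
import re

# Modelled on the example message; wording may vary between servers.
_SUBFIELDS = re.compile(
    r'Field "(?P<field>[_A-Za-z][_0-9A-Za-z]*)" of type "(?P<type>[^"]+)" '
    r"must have a selection of subfields"
)


def get_field_and_type(message: str):
    """Return (field, type) if the message confirms a valid field, else None."""
    match = _SUBFIELDS.search(message)
    if match is None:
        return None
    return match.group("field"), match.group("type")


if __name__ == "__main__":
    msg = ('Field "vehicle" of type "Vehicle" must have a selection of '
           'subfields. Did you mean "vehicle { ... }"?')
    print(get_field_and_type(msg))
```

The type part is matched loosely so that anything between the quotes is returned unchanged.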
The film field requires id or filmID to be provided.
{
"query": "{ film {id} }"
}
{
"errors": [
{
"message": "must provide id or filmID",
"locations": [
{
"line": 1,
"column": 3
}
],
"path": [
"film"
]
}
],
"data": {
"film": null
}
}
However, this doesn't matter when obtaining field names and types, because Apollo Server / graphql-js kindly generates error messages for film's fields even without valid arguments.
{
"query": "{ film { titl } }"
}
{
"errors": [
{
"message": "Cannot query field \"titl\" on type \"Film\". Did you mean \"title\"?",
"locations": [
{
"line": 1,
"column": 10
}
]
}
]
}
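Because no valid arguments are needed, probing a nested type only requires wrapping the guessed name in the path of parent fields. The helper below is a hypothetical illustration of that idea (wrap_in_path is not a clairvoyance function).

```python
import json


def wrap_in_path(path: list, selection: str) -> str:
    """Nest a selection under the given parent fields, outermost first."""
    query = "{ " + selection + " }"
    for field in reversed(path):
        query = "{ " + field + " " + query + " }"
    return query


if __name__ == "__main__":
    # Request body for probing the guess "titl" on the type behind "film".
    print(json.dumps({"query": wrap_in_path(["film"], "titl")}))
```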
If we supply multiple fields, we'll get an error for each field.
{
"query": "{ star spice }"
}
{
"errors": [
{
"message": "Cannot query field \"star\" on type \"Root\". Did you mean \"starship\"?",
"locations": [
{
"line": 1,
"column": 3
}
]
},
{
"message": "Cannot query field \"spice\" on type \"Root\". Did you mean \"species\"?",
"locations": [
{
"line": 1,
"column": 8
}
]
}
]
}
This allows us to speed up the process of probing for fields and arguments (up to several thousand words per second with a single thread).
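One way to exploit this is to split a wordlist into fixed-size buckets and send one query per bucket. The sketch below assumes hypothetical helper names (buckets, build_query) and an arbitrary bucket size; it only builds the query strings and sends nothing.

```python
from typing import Iterator, List


def buckets(words: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` words."""
    for start in range(0, len(words), size):
        yield words[start:start + size]


def build_query(words: List[str]) -> str:
    """Put every candidate word into a single query document."""
    return "{ " + " ".join(words) + " }"


if __name__ == "__main__":
    wordlist = ["star", "spice", "vehicle", "film"]
    for bucket in buckets(wordlist, 2):
        print(build_query(bucket))
```

Each bucket costs one HTTP round trip, so a larger bucket means fewer requests for the same wordlist, within whatever query size the server accepts.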
This module contains the code needed for running clairvoyance from the command line (e.g. python3 -m clairvoyance).
As of 24 Oct 2020 it also contains code for running clairvoyance() on each of the types. This part should be moved to oracle.py or another module.
This module contains the code for creating the schema. There are two main types of functions:
- Those starting with probe_ perform an HTTP request, do basic response analysis and call functions starting with get_.
- Functions starting with get_, in turn, work offline and try to extract valid fields / args / ... from error messages.
The clairvoyance() function is used to manage the process of step-by-step schema construction (the probe_* and get_* functions).
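The probe_ / get_ split can be sketched as follows. This is not the project's code: probe_fields, get_fields and fake_send are hypothetical names, and the transport is a canned function standing in for an HTTP request so that the example runs offline.

```python
import re
from typing import Callable, List, Set

_QUOTED_NAME = re.compile(r'"([_A-Za-z][_0-9A-Za-z]*)"')


def get_fields(error_message: str) -> List[str]:
    """Offline step: pull suggested field names out of one error message."""
    _, marker, tail = error_message.partition("Did you mean")
    return _QUOTED_NAME.findall(tail) if marker else []


def probe_fields(words: List[str], send: Callable[[str], dict]) -> Set[str]:
    """Online step: send one query, then delegate parsing to get_fields."""
    response = send("{ " + " ".join(words) + " }")
    found: Set[str] = set()
    for error in response.get("errors", []):
        found.update(get_fields(error["message"]))
    return found


def fake_send(query: str) -> dict:
    """Canned response standing in for a real HTTP request."""
    return {"errors": [
        {"message": 'Cannot query field "star" on type "Root". '
                    'Did you mean "starship"?'},
        {"message": 'Cannot query field "spice" on type "Root". '
                    'Did you mean "species"?'},
    ]}


if __name__ == "__main__":
    print(sorted(probe_fields(["star", "spice"], fake_send)))
```

Keeping the parsing step free of network access means it can be unit tested against recorded error messages alone.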
This module contains classes for various GraphQL concepts such as Schema, Type, Field, InputValue and TypeRef. Each of them has methods for converting to / from JSON.
There is also a Config class, which holds the GraphQL endpoint configuration (e.g. URL, headers, bucket size).
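To illustrate what "converting to / from JSON" can look like, here is a deliberately simplified sketch. SimpleTypeRef is a hypothetical class, not the TypeRef in this module, and it assumes a flat type reference without list or non-null wrappers.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class SimpleTypeRef:
    """Hypothetical, flattened type reference (no list / non-null wrappers)."""
    name: str
    kind: str

    def to_json(self) -> Dict[str, Any]:
        # Shape follows the introspection format: kind, name, ofType.
        return {"kind": self.kind, "name": self.name, "ofType": None}

    @classmethod
    def from_json(cls, data: Dict[str, Any]) -> "SimpleTypeRef":
        return cls(name=data["name"], kind=data["kind"])


if __name__ == "__main__":
    ref = SimpleTypeRef(name="Film", kind="OBJECT")
    print(SimpleTypeRef.from_json(ref.to_json()) == ref)
```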
Concerning async, there is still room for improvement; I'll push some changes as I review the integrity of clairvoyance features.
So, a program is either I/O bound or CPU bound.
If it's CPU bound, you want to start multiple processes/workers to make use of each CPU (in Python the GIL makes this harder, etc.).
Clairvoyance is an I/O bound program.
The aiohttp module implements a few recycling mechanisms, but its main advantage over a synchronous program is that you can proceed with other tasks while waiting for the server. Basically, the requests module blocks the thread until the server has replied completely, and in an HTTP request lifecycle this is the longest part.
The idea is to send a batch of requests and wait for the whole batch once. Complexity-wise, the waiting time for a batch of n requests goes from O(n) to roughly O(1).
by @c3b5aw
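The batching idea can be demonstrated with the standard library alone. In the sketch below, fake_request is a stand-in that sleeps instead of talking to a server, so the latency value and function names are assumptions for illustration; a real client would await an aiohttp call at that point.

```python
import asyncio
import time
from typing import List, Tuple


async def fake_request(word: str, latency: float = 0.05) -> str:
    """Stand-in for one HTTP request: wait, then return a result."""
    await asyncio.sleep(latency)  # the event loop runs other tasks meanwhile
    return word.upper()


async def send_batch(words: List[str]) -> List[str]:
    """Start all requests at once and wait for the whole batch once."""
    return list(await asyncio.gather(*(fake_request(w) for w in words)))


def timed_batch(words: List[str]) -> Tuple[List[str], float]:
    """Run one batch and report how long it took in seconds."""
    start = time.perf_counter()
    results = asyncio.run(send_batch(words))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = timed_batch(["star", "spice", "film", "vehicle"])
    print(results, f"{elapsed:.2f}s")
```

With ten simulated requests of 0.05 s each, a sequential loop would wait about 0.5 s in total, while the batch should finish in roughly the time of a single request.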