This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Preparing light client structure [v2] #150

Merged
merged 12 commits into from
May 28, 2018

Conversation

svyatonik
Contributor

Following #142 and a conversation with @rphmeier.
I've tried to make a specialized PolkadotApi implementation (just a stub for now), which should work on a light client by requesting execution proofs from full nodes.
I've also tried to reuse as much service/client/network code as possible, as suggested by @arkpar in #142. Though having a light implementation of Client is not a priority right now (if I understand you correctly, @rphmeier), there's still a light version of Client (a stub, actually).

Brief overview:

  • extracted a CallExecutor trait with a call(at, method, call_data) method. There are two implementations: LocalCallExecutor, which uses locally available state data, and RemoteCallExecutor, which fetches an execution proof from remote nodes and checks it locally;
  • since there are no tries right now, merkle proofs are not available => the execution proof is the whole state atm. This will change later, once tries land. So the proof check is: (1) check that the root is the same as in the at block; (2) check that the local execution result is the same. So it isn't actually a light client right now, since it fetches a lot of data from the remote node;
  • the client backend is now shared between the Client and CallExecutor implementations => it is now stored as an Arc in both Client and CallExecutor;
  • introduced a RemotePolkadotApi implementation of PolkadotApi, which takes advantage of remote execution. Still not sure whether a second implementation is required, or whether we can make a single PolkadotApi for Client working solely with CallExecutor - so this is subject to change;
  • the 'light backend' should fetch required data from remote nodes using the provided Fetcher. Since there's a separate layer for executing calls (CallExecutor), it should only fetch data for RPC calls (when properly used), but it should still be a fully functional backend;
  • OnDemand is an implementation of Fetcher which is currently only able to fetch execution proofs. Later, block bodies, storage and tx inclusion proofs should be added;
  • the Client::authorities_at implementation, which used direct state access, is replaced with a runtime method call => added this method to the test-runtime implementation => test-runtime genesis file change;
  • slightly changed the sync protocol for light nodes - the body is not queried when a new block is announced. Client::check_justification works because the remote authorities_at call is available on light clients;
  • as a result of ^^^, header synchronization works right now (only the header + justification is queried when a new block is announced, and check_justification works as intended [given that there are no merkle proofs, though]).
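The CallExecutor split described above can be sketched roughly as follows. This is an illustrative stub, not the PR's actual code: the names mirror the description (CallExecutor, LocalCallExecutor, RemoteCallExecutor), but the signatures, BlockId alias and error type are simplified assumptions.

```rust
// Hypothetical sketch of the CallExecutor split; names and signatures are
// illustrative, not the actual PR code.
pub type BlockId = u64;

pub trait CallExecutor {
    /// Execute `method` with `call_data` in the context of block `at`.
    fn call(&self, at: BlockId, method: &str, call_data: &[u8]) -> Result<Vec<u8>, String>;
}

/// Executes calls against locally available state (full nodes).
pub struct LocalCallExecutor;

impl CallExecutor for LocalCallExecutor {
    fn call(&self, _at: BlockId, method: &str, _call_data: &[u8]) -> Result<Vec<u8>, String> {
        // A full node runs the call against its own state here.
        Ok(format!("local:{}", method).into_bytes())
    }
}

/// Fetches an execution proof from a remote node and checks it locally
/// (light clients).
pub struct RemoteCallExecutor;

impl CallExecutor for RemoteCallExecutor {
    fn call(&self, _at: BlockId, method: &str, _call_data: &[u8]) -> Result<Vec<u8>, String> {
        // Until trie-based proofs land, the "proof" is the whole state:
        // check the state root against the header, re-execute, compare results.
        Ok(format!("remote:{}", method).into_bytes())
    }
}
```

Both executors expose the same call surface, so Client code can be generic over CallExecutor and not care whether it runs on a full or a light node.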

Some TODOs (excluding described above):

  • a separate tx pool implementation - if I understand correctly, we should store only local transactions on light clients. Plus, the current implementation never prunes txs from the pool when they're included in a block, because this relies on accessing the block body;
  • only connecting to full nodes from light client nodes. Or at least reserving some slots for full nodes.

@svyatonik svyatonik added the A0-please_review Pull request needs code review. label May 8, 2018
@svyatonik
Contributor Author

P.S.:

  • the light client backend uses the in-memory blockchain backend as storage now (i.e. it will resync from scratch after restarting);
  • there are big 'technical' changes around moving the prepare_genesis closure content into the GenesisBuilder trait method build - the logic is not changed here;
  • not sure if collator + validator nodes should also be considered FULL nodes (i.e. whether they store state data locally) - would appreciate any advice (there's a TODO for this in the code);

@@ -124,6 +115,10 @@ pub fn run<I, T>(args: I) -> error::Result<()> where
info!("Starting validator.");
role = service::Role::VALIDATOR;
}
else if matches.is_present("light") {
Contributor

else if should be on preceding line

@@ -211,21 +211,23 @@ pub struct Service {

impl Service {
/// Create and start a new instance.
pub fn new<C>(
pub fn new<A, C>(
Contributor

i'm not convinced that this service needs to support light client in any form since it's only for validators (who are definitely full nodes)

Contributor
@rphmeier rphmeier May 11, 2018

there should be a hierarchy client backend traits where we can bound things by Backend: LocalBackend to filter out light nodes or Backend: RemoteBackend to accept light nodes only.
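The suggested trait hierarchy could look like the sketch below. All names here are illustrative (the eventual Substrate traits may differ): the point is that bounding a service on `LocalBackend` statically filters out light nodes, while `RemoteBackend` accepts only them.

```rust
// Hypothetical sketch of the backend trait hierarchy suggested above.
pub trait Backend {
    fn is_light(&self) -> bool;
}

/// Implemented only by backends with full local state.
pub trait LocalBackend: Backend {}

/// Implemented only by light-client backends.
pub trait RemoteBackend: Backend {}

pub struct FullBackend;
impl Backend for FullBackend {
    fn is_light(&self) -> bool { false }
}
impl LocalBackend for FullBackend {}

pub struct LightBackend;
impl Backend for LightBackend {
    fn is_light(&self) -> bool { true }
}
impl RemoteBackend for LightBackend {}

/// A validator service bounded on `LocalBackend` cannot even be
/// instantiated with `LightBackend` - the filter is enforced at compile time.
pub fn validator_service<B: LocalBackend>(backend: B) -> bool {
    !backend.is_light()
}
```

Calling `validator_service(LightBackend)` would simply fail to compile, which is exactly the filtering behaviour being proposed.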

Contributor Author

Added here. Had to add another pair of traits (LocalPolkadotApi + RemotePolkadotApi), since the consensus service depends on the PolkadotApi trait (and I didn't want to make PolkadotApi generic).


/// Creates light client and register protocol with the network service
pub fn new_light(config: Configuration) -> Result<Service<client::light::Backend, client::RemoteCallExecutor<client::light::Backend, CodeExecutor>>, error::Error> { // TODO: light
Service::new(move |_, executor, genesis_builder: GenesisBuilder| {
Contributor
@rphmeier rphmeier May 11, 2018

for an actual service, the logic needs to be fairly different for a light client.

for one, we need to introduce a feedback loop between block import and the sync service, where we can fetch things like validator set changes and state proofs before continuing sync.

and we should cut out all the validator stuff from light clients, it's not necessary to support

Contributor Author

Isn't the result of import_block enough? I mean - we can do anything there (like checking for validator set changes in import_block) and just return an error to sync if something goes wrong. So this will affect only a small portion inside the Client implementation and leave the Service as is.


let (remote_result, remote_proof) = self.fetcher.execution_proof(block_hash, method, call_data)?;

// code below will be replaced with a proper proof check once trie-based proofs are possible
Contributor
@rphmeier rphmeier May 18, 2018

these are possible now, but I'm fine with deferring to another PR if you like

Contributor Author

yeah - better to start adding trie support after this PR (the overall structure) is approved

self.executor.clone()
/// Get the set of authorities at a given block.
pub fn authorities_at(&self, id: &BlockId) -> error::Result<Vec<AuthorityId>> {
self.executor.call(id, "authorities", &[])
Contributor

what is your plan for having light clients memoize this and code changes? How will that fit into the abstraction?

Under the current consensus algorithm, we can expect certain runtimes to make a digest entry every time the authorities or code change for the next block. Under the planned consensus algorithm this change will be applied at some point beyond finality of the signal.

Contributor Author

Now: detect the change in the import_block method (the digest is available there for light clients) => reset the cache (not sure yet where the cache must live - in the OnDemand service, or in the LightBackend/LightCallExecutor).

Then: not sure, but I guess the same strategy will work, except the cache reset must be scheduled instead of executed right away.
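The memoization scheme being discussed could be sketched like this. Everything here is hypothetical (type names, where the cache lives, the shape of the digest signal); it only illustrates "cache the authority set, reset it from import_block when a digest signals a change".

```rust
// Illustrative sketch of an authorities cache reset from import_block.
// All names are hypothetical, not the PR's actual code.
type AuthorityId = u32;

pub struct AuthoritiesCache {
    cached: Option<Vec<AuthorityId>>,
}

impl AuthoritiesCache {
    pub fn new() -> Self {
        AuthoritiesCache { cached: None }
    }

    /// Called from `import_block`: an authorities-change digest entry
    /// invalidates the cache.
    pub fn on_import(&mut self, authorities_changed: bool) {
        if authorities_changed {
            self.cached = None;
        }
    }

    /// Return the cached authorities, or fetch them (e.g. via a remote
    /// `authorities` runtime call) and memoize the result.
    pub fn get_or_fetch<F: FnOnce() -> Vec<AuthorityId>>(&mut self, fetch: F) -> Vec<AuthorityId> {
        if self.cached.is_none() {
            self.cached = Some(fetch());
        }
        self.cached.clone().expect("just populated above; qed")
    }
}
```

Under the planned consensus algorithm the reset would be scheduled for a later block rather than applied immediately, but the cache shape stays the same.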

Contributor

we will probably want to keep all authority sets and scheduled transitions in the DB until they are no longer necessary, and then have some way to look up by block ID

Contributor Author

OK - will add some meta-db + make cache backed by this db then.

Contributor Author

btw - a question for future PRs :) do we need the same data on full nodes, or are we only considering archive nodes right now (=> all states are available => cache is not required)?

Contributor
@rphmeier rphmeier May 22, 2018

right now let's consider only archive nodes. but the scheduling logic should be mostly the same for light and full nodes.

@@ -26,7 +26,8 @@ use primitives::block::{self, Id as BlockId, HeaderHash};
use blockchain::{self, BlockStatus};
use state_machine::backend::{Backend as StateBackend, InMemory};

fn header_hash(header: &block::Header) -> block::HeaderHash {
/// Compute block header hash.
pub fn header_hash(header: &block::Header) -> block::HeaderHash {
Contributor

this method is redundant with blake2_256(), isn't it?

Contributor Author

thanks, replaced with direct call


fn storage(&self, _key: &[u8]) -> Result<Option<Vec<u8>>, Self::Error> {
// TODO [light]: fetch from remote node
Err(error::ErrorKind::NotAvailableOnLightClient.into())
Contributor

IMO this is really not the right approach. Making round trips for every storage entry we want to read <<<< making a single round trip for an entire call.

Contributor

There should be a StateBackend which is generated from an execution proof; then all the methods can be implemented.

Contributor Author

There's an impl StateBackend for OnDemandState - a stub right now. storage is not intended to be used during calls - as mentioned in the description:

the 'light backend' should fetch required data from remote nodes using the provided Fetcher. Since there's a separate layer for executing calls (CallExecutor), it should only fetch data for RPC calls (when properly used), but it should still be a fully functional backend;

match core.remove(peer, response.id) {
Some(request) => {
// we do not bother if receiver has been dropped already
let _ = request.sender.send(response);
Contributor
@rphmeier rphmeier May 18, 2018

it should check the validity of the response here and reassign if the response is not correct. otherwise we are just relying that our peers are correct and are vulnerable to sybil attack.

Contributor

we could have each request come with an associated attempts counter, and users can specify a default maximum when creating the OnDemand service. When dispatching a request they can use that default or provide their own maximum.
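The per-request attempts counter proposed here is small enough to sketch directly. The struct and field names are hypothetical; the behaviour is just "count failed attempts, allow re-dispatch until a caller-supplied maximum is reached".

```rust
// Hypothetical sketch of the attempts counter proposed for OnDemand requests.
pub struct RemoteRequest {
    attempts: usize,
    max_attempts: usize,
}

impl RemoteRequest {
    /// `max_attempts` comes either from the OnDemand service default or
    /// from the caller when dispatching the request.
    pub fn new(max_attempts: usize) -> Self {
        RemoteRequest { attempts: 0, max_attempts }
    }

    /// Record a failed/invalid response; returns true if the request may be
    /// re-dispatched to another peer, false if it should fail permanently.
    pub fn retry(&mut self) -> bool {
        self.attempts += 1;
        self.attempts < self.max_attempts
    }
}
```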

Contributor Author

Maybe it's better to add this logic to the LightCallExecutor / LightBackend - the remote execution proof check = local execution (am I right?). That is performed in LightCallExecutor right now. So: fetch => execute locally => disconnect the peer on failure.

Contributor
@rphmeier rphmeier May 18, 2018

sure, as long as there's a convenient way to disconnect the peer from the LightCallExecutor -- that will require passing peer info along with the response.

Contributor
@rphmeier rphmeier May 18, 2018

This is a little different than the OnDemand service in parity-ethereum, which handles data validation and rescheduling internally. But then it has the nice property that only good data comes out of the service. This architecture is also useful for implementing a network protocol that implements heterogeneous requests in a single packet, where inputs of later requests are outputs of earlier ones (see PIP definition).

note that the eventual goal for substrate is to use a network protocol like that -- if there is another way to do it without adding more round trips I am fine with that as well.

Contributor Author

no - I think this would change the structure significantly, so let it be in this PR - I'll update the code here

Contributor Author

I've added a FetchChecker trait to the light client module (where Fetcher is also defined). It is responsible for checking the remote response and converting it from 'remote' to 'local' form (like dropping merkle proofs, because they're not required once the check is done).
This way the implementation of Fetcher (i.e. OnDemand) can stay in the network crate and not refer directly to blockchain/client - it only checks + converts the remote response using this trait (implemented in the light module).
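A minimal sketch of that FetchChecker idea, with illustrative names and a deliberately trivial check (the real check would re-execute the call and compare results, as discussed elsewhere in this PR):

```rust
// Hypothetical sketch of FetchChecker: verify a remote response, then
// convert it from 'remote' to 'local' form by dropping the proof data.
pub struct RemoteCallResponse {
    pub result: Vec<u8>,
    pub proof: Vec<Vec<u8>>,
}

pub trait FetchChecker {
    /// Check the proof; on success return only the 'local' part (the call
    /// result), since the proof is no longer needed after verification.
    fn check_execution_proof(&self, response: RemoteCallResponse) -> Result<Vec<u8>, String>;
}

/// Stand-in checker: only verifies that a proof is present at all.
pub struct TrivialChecker;

impl FetchChecker for TrivialChecker {
    fn check_execution_proof(&self, response: RemoteCallResponse) -> Result<Vec<u8>, String> {
        if response.proof.is_empty() {
            return Err("missing proof".into());
        }
        Ok(response.result)
    }
}
```

Because the trait lives in the client's light module, the network-crate OnDemand service can depend on it without pulling in blockchain/client types directly.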

Contributor Author

I'll add the attempts counter in the next PRs, if you don't mind - I still want to achieve some overall structural stability in this PR.

Contributor

I don't see FetchChecker anywhere in the diff.

None => return,
};

let mut request = self.pending_requests.pop_front().expect("checked in loop condition; qed");
Contributor
@rphmeier rphmeier May 18, 2018

can't we just loop for mut x in self.pending_requests.drain()?

Contributor Author

there might be no peers available, or the number of peers may be less than the number of requests (currently 1 request <-> 1 peer at a time)
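That constraint is why a plain drain() doesn't fit: the loop has to stop as soon as it runs out of idle peers, leaving the remaining requests queued. A simplified sketch (types and the request/peer representation are illustrative):

```rust
use std::collections::VecDeque;

// Illustrative dispatch loop: assign each pending request to one idle peer;
// stop when no idle peer is left, keeping the rest of the queue for later.
pub fn dispatch(pending: &mut VecDeque<u32>, idle_peers: &mut Vec<u32>) -> Vec<(u32, u32)> {
    let mut assigned = Vec::new();
    while !pending.is_empty() {
        let peer = match idle_peers.pop() {
            Some(peer) => peer,
            None => break, // no free peer: leave the rest queued
        };
        let request = pending.pop_front().expect("checked in loop condition; qed");
        assigned.push((request, peer));
    }
    assigned
}
```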

@@ -34,8 +34,10 @@ extern crate error_chain;
#[cfg(test)]
extern crate substrate_keyring as keyring;

use client::backend::Backend;
use client::Client;
pub mod light;
Contributor

nit: since light clients are meant to be first-class we should also put the full client implementation in a submodule

Contributor Author

thanks, will do

Contributor Author

Tried to do this, but actually the full client could be backed by the db backend (which is already in a separate crate, named db, not full) or the in-memory backend (which I guess could be renamed to TestBackend). So there's actually nothing to move to a full submodule right now, except for the db crate.

Contributor

impl<B: LocalBackend> PolkadotApi for Client<B, LocalCallExecutor<B, NativeExecutor<LocalDispatch>>> looks like it could go in a full module.

Contributor Author

right - I missed that you're talking about the Api, not about the client

@svyatonik svyatonik added A4-gotissues and removed A0-please_review Pull request needs code review. labels May 18, 2018
@svyatonik svyatonik added A0-please_review Pull request needs code review. and removed A4-gotissues labels May 21, 2018
message: message::RemoteCallRequest,
}

impl Response {
Contributor
@rphmeier rphmeier May 22, 2018

this should just be a Future. wait comes for free then, along with a bunch of other things which are useful for composing async calls.

/// Light client data fetcher.
pub trait Fetcher: Send + Sync {
/// Fetch method execution proof.
fn execution_proof(&self, block: HeaderHash, method: &str, call_data: &[u8]) -> error::Result<(Vec<u8>, Vec<Vec<u8>>)>;
Contributor
@rphmeier rphmeier May 22, 2018

should return a Future. It will still be easy to use it in a synchronous way.

Contributor

maybe CallExecutor should return a future too. This will make it compose nicely with asynchronous RPC servers

Contributor Author

Changed Fetcher to return a Future. Left CallExecutor as is for now.
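The future-returning Fetcher can be sketched as below. Everything here is illustrative: the names mirror the trait shown in the diff, but the PR itself used the futures crate of the time, whereas this sketch uses std::future and includes a minimal synchronous wait helper to show that synchronous use stays easy.

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// Hypothetical sketch: Fetcher::execution_proof returns a Future, which
// composes with async callers but can still be waited on synchronously.
type HeaderHash = u64;
type FetchResult = Result<(Vec<u8>, Vec<Vec<u8>>), String>;

pub trait Fetcher {
    fn execution_proof(&self, block: HeaderHash, method: &str, call_data: &[u8])
        -> Pin<Box<dyn Future<Output = FetchResult>>>;
}

/// Stub fetcher that resolves immediately; a real one would dispatch an
/// OnDemand network request and resolve when the response arrives.
pub struct ImmediateFetcher;

impl Fetcher for ImmediateFetcher {
    fn execution_proof(&self, _block: HeaderHash, method: &str, _call_data: &[u8])
        -> Pin<Box<dyn Future<Output = FetchResult>>>
    {
        let result = Ok((method.as_bytes().to_vec(), vec![vec![0u8]]));
        Box::pin(std::future::ready(result))
    }
}

// Minimal synchronous wait: poll the future with a no-op waker.
fn rw_clone(_: *const ()) -> RawWaker { RawWaker::new(std::ptr::null(), &VTABLE) }
fn rw_noop(_: *const ()) {}
static VTABLE: RawWakerVTable = RawWakerVTable::new(rw_clone, rw_noop, rw_noop, rw_noop);

pub fn wait<T>(mut fut: Pin<Box<dyn Future<Output = T>>>) -> T {
    let waker = unsafe { Waker::from_raw(rw_clone(std::ptr::null())) };
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}
```

Synchronous callers just wrap the call in wait(...); async callers compose the returned future directly, which is what makes this play well with asynchronous RPC servers.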

@rphmeier rphmeier added A5-grumble and removed A0-please_review Pull request needs code review. labels May 22, 2018
@rphmeier
Contributor

rphmeier commented May 22, 2018

we should store only local transactions on light clients

light clients should also not propagate non-local transactions, or any block data. They should take up "corners" of the gossip topology where they only leech data.

@svyatonik svyatonik added A0-please_review Pull request needs code review. and removed A5-grumble labels May 22, 2018
@gavofyork gavofyork added A8-looksgood and removed A0-please_review Pull request needs code review. labels May 24, 2018
@gavofyork
Member

need another merge...

@svyatonik svyatonik merged commit 6ba747c into master May 28, 2018
@svyatonik svyatonik deleted the light_init branch May 28, 2018 07:28
JoshOrndorff pushed a commit to moonbeam-foundation/substrate that referenced this pull request Apr 21, 2021
…ytech#150)

* created npm package and added it to tests

* install types bundle

* also export definitions

Co-authored-by: Joshy Orndorff <JoshOrndorff@users.noreply.github.com>

* remove exitreasons>
liuchengxu pushed a commit to chainx-org/substrate that referenced this pull request Aug 23, 2021
liuchengxu added a commit to chainx-org/substrate that referenced this pull request Aug 23, 2021
* Refactor xsupport pallet

* Optimize using try_for_each

* Rename to as_string and as_string_hex

* .

* Rename to as_addr()

* Remove useless Str
liuchengxu added a commit to autonomys/substrate that referenced this pull request Jun 3, 2022
Simplify the properties generation in ChainSpec
Wolmin added a commit to Quantum-Blockchains/substrate that referenced this pull request Nov 15, 2022
* Add toolchain file + gh action file

* Add build action

* Adjust test workflow for client only

* Delete testing action

* Change step name
helin6 pushed a commit to boolnetwork/substrate that referenced this pull request Jul 25, 2023
Co-authored-by: Demi M. Obenour <demiobenour@gmail.com>