-
Notifications
You must be signed in to change notification settings - Fork 454
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
strongest possible reputation system #31
Comments
Loggers were disabled due to unittests, added disable_existing_loggers =
This will be moved earlier in the roadmap. Multi-chain will evolve with numerous small steps into this envisioned perfect system.
If we go beyond one million users it is possible the database becomes a bottleneck. We will first analyse, modify and tune the algorithms. If this has proven to be insufficient we evaluate libraries such as: overkill approach for billions of edges or Python graph database (inactive) |
In my opinion, we should avoid branching at all costs. The problem we are solving with branching is two-fold: First the blocking issues on timeouts, second the infinite growth of the datastructure. The second problem can be easily resolved by having some sort of secondary archive multichain. In this multichain, you are allowed to archive a transaction older than X period of time. This transaction will be your new genesis block and others are not allowed to ask you for earlier transactions (so you can delete them). The first problem can be resolved by scaling the moment of signature request by the bandwith available. Incoming signature requests will never cause a problem for you, only for the other party, if you currently have an unanswered signature request outstanding. Hence, a client should only ask for a signature from someone else, if the probability of receiving another request before receiving a response is sufficiently low. (This can be estimated using simple probability/queuing theory). |
On archiving: On request timing: |
Alternative concept: half-signed only, or optimistic requesting Another way of going around the blocking problem is to base the chain on only half signed blocks.
The peer can refuse to create the corresponding block on his chain, but this is no different from him refusing to sign your block. You can add fake blocks to your own chain, but without evidence they are not valid. |
ToDo: merge major/minor seq idea, half-signed+revert. |
In a session with @synctext, we discussed the half-signed only and branching options. He was especially concerned about having the vertical compression (updating transactions) and the difficulty with the reuse of sequence numbers. This could be solved by introducing a minor sequence number. (Alternatively you could accept a gap in the sequence numbering, as long as it is not decreasing. It's the hashes that provide the real linkage anyway. Same result, you never use a sequence number twice.) This saves you from having to forgo all the credit from interaction B-C. Note that A will only update the block if it is still the head of the chain of A for obvious reasons. If A is not able to cooperate then you have to accept that one of the branches cannot continue. Probably the one with the smallest credit. I'm thinking that nodes will have to be fair and open about all this parallel stuff and allow that "unmergable" transactions and any replaced/overriden blocks be harvested by a crawler. Because what if C secretly decides to use 3.1 or 4.0? It would look bad if B denied any knowledge of this block. Also this would still combine with PimV.'s half-signed as default proposal. |
design dream 1: if you do not follow the protocol, you never get any of the benefits. branching: signing a certificate with the same sequence number (major+minor) constitutes a double spending attack, instant banning. An elegant and easy mechanism. This is the key reason for sequence numbers, they become a coin that can only be spend once or represent 1 truth. Append-only ensures any record freezes past behavior. A single record represents an exact position in the chain, impossible without sequence numbers. |
The whole problem with having any branching option is that the "design dream 2" mentioned above is lost. The fact that a client has multiple branches means that they cannot be forced to rebase or merge it back into their main branch. This means that there exists a "sidebranch hiding attack", in which a client cannot be forced to divulge a side-branch existing. To make up for this, there would need to be some mechanism that forces user to merge or rebase branches. However, this means that proofs of misbehavior get much more convoluted. My proposal for the compression is to be fairly lax with sending signing requests, scaling blocks with both the number of interactions with that person and the bandwith of the originator. If this during the experimental phase turns out to yield too much growth still, or too many unsigned/dropped transactions, because of the block size, I would be in favour of exploring a secondary datastructure where we consolidate some of the information. But I would like to see if the compression is an actual problem before moving towards those kinds of solutions. |
For the alpha version of multichain:
|
Another mayor flaw that needs fixing before alpha:
|
future approach:
|
As of today we have a design freeze of multi-chain. It is far from feature complete, merely a dry run of a concrete design. Accurate accounting of the totals is still out-of-scope. We will create a stable implementation of this design and include it in Tribler. |
I was having a good brainstorming session about the "parity for contributions" problem where the number of relays is not fixed. Here is what I came up with: Assume a seeder S, who has a proxy infront of him and a rendevous proxy (Ps and Ps'). The downloader (D) only has a rendevous proxy (Pd') to hide behind. In the naive approach of payout on this setup, the seeder asks for his true up and down. Each proxy then tacks on its total and thus the total download cost gets billed to the downloader. There are five downsides to this scheme:
Some initial thoughts on remedies might go as follows:
To me it is obvious that propagating the true cost of downloading along the chain is not feasable nor elegant. An alternative solution outline that I have thought about is this: The S and D can sign directly to each other through the circuit. Next the relays can sign the relay traffic among themselves (hops Ps-Ps' and Ps'-Pd'). However this shorts Ps and Pd' since they don't get S-Ps and Pd'-D respectively. However they are the inverse of each other, so perhaps a scheme can be devised that "closes the loop" such that Ps and Pd' sign eachothers outstanding up and down. This also disconnects the proxies from the seeder&downloader in multichain. S and D can include themselves in the proxy loop, since they can claim to be just a proxy. If S and D do not include themselves in the proxy loop they indirectly expose themselves to Ps and Pd' respectively as seeder and downloader, atleast there is no public record of this. A downside of this scheme is that the proxies are neutrally balanced, and since there is no public recognition for relaying in this system, there is not much incentive to relay traffic. To fix this, multichain could be extended with new counters named "relay" and corresponding "relay_total" (or have a secondary multichain instance). The point of this counter would be to give recognition to the relays. However this does not combine perfectly with the proxy loop signing scheme since S and D then also get relay credits. So exposing S and D might be needed on the first and last proxy in the chain, such that they can eliminate them from the proxy signing loop. This last bit is a rather fundamental problem, what ever scheme is devised S and D can always claim to be or present themselves as just a proxy, that is the point of annonimity. However to prevent them from getting credited (or paid or counted) as a relay, their true role will have to be exposed. The conflict between these two facts needs to be resolved. @synctext proposes to do this by having fixed length proxy chains. |
This week the alpha version of multichain got merged. Solid progress, this is 2 months after the idea of an alpha took root. We assume no redesign is needed.. The features for the next Beta 1.0 release are:
Beta 2.0:
|
ToDo on the foundations of trust:
|
Reputation systems are linked to both game theory and smart contracts. Our upcoming 'smarter smart contracts' work requires enforcement mechanisms. This is brought together in self-enforcing protocols :
|
#67 from 2nd April 2013 seems to be the earliest public record of TrustChain / MultiChain / BarterCast3. |
Given my current belief system, I would frame this as follows:
So, I would say you do not trade reputation directly: instead, you trade currency in such a way that other honest participants increase the reputation score they hold of you. It all depends on the conversion mechanism from heterogeneous units of accounting to homogeneous units of work[1]. Whether it is fair to say this corresponds to meeting the buyer's price, I don't know -- it is probably subjective: paying your own Sybil is most likely not that impressive to others. In conclusion, in a roundabout way:
Yes. Lastly:
Given my current 6 page limit, this will be limited to the latter: determining reputation of others. [1] For example, you may define 10 Mb download equals -10 work, 10 Mb upload equals +10 work, 1 Bitcoin equals +1000 work. This will probably also fluctuate continuously due to complicating temporal factors, like current market prices or the local legal ramifications of uploading. |
So, the hard work now begins of proving some strong property that results from this neat model. For instance, if the number of fraudsters/Sybil is limited to 🔣, it is guaranteed that cooperation will emerge. Or. Even a tiny minority which is honest we prove that the ecosystem eventually converges, they find each other, collaborate, and can sustainably withstand continuous Sybil or fraud attacks of any kind. |
Small update: I'm now referring to the |
An "accounting mechanism" differs from an "accountability mechanism". Accountability, in my opinion, is the extent to which someone can be held responsible/accountable for their actions and behaviour after-the-fact. Accounting, however, is the recording of actions in a community as they are performed. Given these (informal) definitions, the term "accounting mechanism" seems to align better with reputation/trust. |
I mostly agree, but I believe accounting has the subtle undertone of being a log of things holding value that are countable and transferable, which is not necessarily true for all events in the system. This is why I'm trying to find another term. The proper terminology would probably be an |
next step could be a list of deployed reputation systems: |
Key related work: A Taxonomy to Express Open Challenges in Trust and Reputation Systems
|
Update: the #31 (comment) paper is now published. https://doi.org/10.1145/3428662.3428790 |
New axiomatic principles:
inspiration: "Pragmatically: the past is shared; the present is independent", from Jepsen intro to distributed systems |
Related work (non-axiomatic):
|
Database support for our axiomatic principles:
For 2023 I hope our lab is experienced enough to build a new (self-organising) distributed database paradigm @kozlovsky. Something on top of Pony, integral reputation/trust tracking, something like Trustchain for each content insert, everybody runs a "full nodes", and uncompromising permissionless. Like the perfect blend of Bitcoin with ORM. The reputation function of content is used to determine if information is actively suppressed or actively spread |
To avoid making the same mistakes twice, two short post-mortems of our previous work:
|
True! application-specific gossip forwarding functions. Generic approaches never get used or are inefficient. |
3Es of Sybil resistance are: 1) Entry cost 2) Existence cost 3) Exit cost. |
Idea: strongest possible reputation system by combining all existing science.
Problem: our scientists Rahim, Adele, Nitin and Dimitra did not combine their work.
Reason: optimal publication strategy is one clean and simple idea
There are no joint publications, no re-usage of algorithms and little collaboration.
We evaluated in isolation multiple enhancements to make accounting of contributions stronger and incentive compatible:
Reputation system proposals often assume either:
Option number 1 is beyond the state-of-the-art, without assuming an honest central server with infinite capacity. Option number 2 is difficult to achieve. For scalability seasons, the storage layer will always be imperfect and will never offer both instant, accurate and complete recall of information. Thus we need to be able to tolerate limited accuracy or coverage in addition to the usual misreporting, lying and sybil creation.
A reputations system needs to determine (assuming background gossip of trust vectors):
1 Who to talk to
2 how to reach them
3 What to say
4 How much to trust
Existing work is focused on no.4 mostly, while the others have proven to be
much harder to solve. Nobody identified them as a problems and no
solutions exist.
Trust calculation is just 50 lines of code, while no. 1-3 is beyond
the state-of-the-art in epidemic protocols.
Missing topics:
3. "what to say", Efficient synchronisation of BarterCast (subgraph) trust vectors
Existing material:
1. Who to talk to
Biased random walk
2. how to reach them
Churn resiliance, state preservation, live overlay, NAT traversal, etc
4. How much to trust
Meulpolder 2008: Max-flow
Rahim: various sybil-limiters
Adele: rewards also effort
Dimitra: betweenness centrality
Nitin: sybil-limiters
The text was updated successfully, but these errors were encountered: