Remove intermediate InsertNodeAtPath in ipfs add #1964

Closed
wants to merge 78 commits into from

Conversation

rht
Contributor

@rht rht commented Nov 13, 2015

This is an experiment.
The time it takes for an ipfs add is halved when the intermediate root(s) are not created.
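
To illustrate why skipping intermediate roots can save work, here is a minimal sketch. It is not the go-ipfs code: `hashDir`, `addEager`, and `addDeferred` are hypothetical names, and a flat name-to-hash map stands in for a real merkledag directory. The eager variant recomputes the root after every inserted file, the deferred variant hashes once at the end, and both arrive at the same final root.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// hashDir returns a deterministic hash of a flat directory
// (entry name -> child hash). Hypothetical stand-in for a merkledag node.
func hashDir(entries map[string]string) string {
	names := make([]string, 0, len(entries))
	for name := range entries {
		names = append(names, name)
	}
	sort.Strings(names) // fixed order so the hash is reproducible
	h := sha256.New()
	for _, name := range names {
		fmt.Fprintf(h, "%s=%s\n", name, entries[name])
	}
	return hex.EncodeToString(h.Sum(nil))
}

// addEager recomputes the root after each insert, so every insert
// produces an intermediate root that is thrown away by the next one.
func addEager(files map[string]string) (root string, hashes int) {
	dir := map[string]string{}
	for name, child := range files {
		dir[name] = child
		root = hashDir(dir)
		hashes++
	}
	return root, hashes
}

// addDeferred inserts everything first and hashes the root only once.
func addDeferred(files map[string]string) (root string, hashes int) {
	dir := map[string]string{}
	for name, child := range files {
		dir[name] = child
	}
	return hashDir(dir), 1
}

func main() {
	files := map[string]string{"a.txt": "h1", "b.txt": "h2", "c.txt": "h3"}
	eagerRoot, eagerHashes := addEager(files)
	deferredRoot, deferredHashes := addDeferred(files)
	fmt.Println("same root:", eagerRoot == deferredRoot)
	fmt.Println("root hashes computed:", eagerHashes, "vs", deferredHashes)
}
```

In a real nested tree each intermediate root also rewrites every directory on the path to the inserted file, so the gap would be larger than this flat sketch suggests.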

daviddias and others added 30 commits November 11, 2015 10:04
License: MIT
Signed-off-by: David Dias <daviddias.p@gmail.com>
This used to lead to large refcount numbers, causing Flush to create a
lot of IPFS objects, and merkledag to consume tens of gigabytes of
RAM.

License: MIT
Signed-off-by: Jeromy <jeromyj@gmail.com>
OS X sed is documented as "-i SUFFIX", GNU sed as "-iSUFFIX". The one
consistent case seems to be "-iSUFFIX", where the suffix cannot be empty (or
OS X will parse the next argument as the suffix).

This used to leave around files named `refsout=` on Linux, and was
just confusing.

License: MIT
Signed-off-by: Jeromy <jeromyj@gmail.com>
License: MIT
Signed-off-by: Jeromy <jeromyj@gmail.com>
License: MIT
Signed-off-by: Jeromy <jeromyj@gmail.com>
These secondary copies were never actually queried, and didn't contain
the indirect refcounts so they couldn't become the authoritative
source anyway as is. New goal is to move pinning into IPFS objects.

A migration will be needed to remove the old data from the datastore.
This can happen at any time after this commit.

License: MIT
Signed-off-by: Jeromy <jeromyj@gmail.com>
License: MIT
Signed-off-by: Jeromy <jeromyj@gmail.com>
Pinner had method GetManual that returned a ManualPinner, so every
Pinner had to implement ManualPinner anyway.

License: MIT
Signed-off-by: Jeromy <jeromyj@gmail.com>
License: MIT
Signed-off-by: Jeromy <jeromyj@gmail.com>
License: MIT
Signed-off-by: Jeromy <jeromyj@gmail.com>
Platform-dependent behavior is not nice, and negative refcounts are
not very useful.

License: MIT
Signed-off-by: Jeromy <jeromyj@gmail.com>
License: MIT
Signed-off-by: Jeromy <jeromyj@gmail.com>
License: MIT
Signed-off-by: Jeromy <jeromyj@gmail.com>

sharness: Don't assume we know all things that can create garbage

License: MIT
Signed-off-by: Jeromy <jeromyj@gmail.com>
WARNING: No migration performed! That needs to come in a separate
commit, perhaps amended into this one.

This is the minimal rewrite, only changing the storage from
JSON(+extra keys) in Datastore to IPFS objects. All of the pinning
state is still loaded in memory, and written from scratch on Flush. To
do more would require API changes, e.g. adding error returns.

Set/Multiset is not cleanly separated into a library yet, as its API
is expected to change radically.

License: MIT
Signed-off-by: Jeromy <jeromyj@gmail.com>
License: MIT
Signed-off-by: Jeromy <jeromyj@gmail.com>
License: MIT
Signed-off-by: Jeromy <jeromyj@gmail.com>
License: MIT
Signed-off-by: Jeromy <jeromyj@gmail.com>
License: MIT
Signed-off-by: Jeromy <jeromyj@gmail.com>
…lure

License: MIT
Signed-off-by: Jeromy <jeromyj@gmail.com>
There was double wrapping with an unneeded msgio. Given that we
use a stream muxer now, msgio is only needed by secureConn, to
signal the boundaries of an encrypted / mac-ed ciphertext.

Side note: I think including the varint length in the clear is
actually a bad idea that can be exploited by an attacker. It should
be encrypted, too. (TODO)

License: MIT
Signed-off-by: Jeromy <jeromyj@gmail.com>
License: MIT
Signed-off-by: Jeromy <jeromyj@gmail.com>
* ID service stream
* make the relay service use msmux
* fix nc tests

Note from jbenet: Maybe we should remove the old protocol/muxer
and see what breaks. It shouldn't be used by anything now.

License: MIT
Signed-off-by: Jeromy <jeromyj@gmail.com>
Signed-off-by: Juan Batiz-Benet <juan@benet.ai>
The addition of a locking interface to the blockstore allows us to
perform atomic operations on the underlying datastore without having to
worry about different operations happening in the background, such as
garbage collection.

License: MIT
Signed-off-by: Jeromy <jeromyj@gmail.com>
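
A minimal sketch of how such a lock could work, assuming a reader/writer lock where adds take the shared side and GC takes the exclusive side. The names `store`, `PinLock`, `GCLock`, `AddPinned`, and `GC` are illustrative and the in-memory maps are stand-ins; the real blockstore interface may differ.

```go
package main

import (
	"fmt"
	"sync"
)

// store is a hypothetical in-memory blockstore with a GC lock.
type store struct {
	gcLock sync.RWMutex // shared: adds/pins, exclusive: garbage collection
	mu     sync.Mutex   // protects the maps below
	blocks map[string][]byte
	pins   map[string]bool
}

func newStore() *store {
	return &store{blocks: map[string][]byte{}, pins: map[string]bool{}}
}

// PinLock takes the shared lock; many adds may hold it at once.
func (s *store) PinLock() (unlock func()) {
	s.gcLock.RLock()
	return s.gcLock.RUnlock
}

// GCLock takes the exclusive lock; it waits for in-flight adds to finish.
func (s *store) GCLock() (unlock func()) {
	s.gcLock.Lock()
	return s.gcLock.Unlock
}

// Put stores a block without pinning it.
func (s *store) Put(key string, data []byte) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.blocks[key] = data
}

// AddPinned writes a block and pins it as one step with respect to GC.
func (s *store) AddPinned(key string, data []byte) {
	unlock := s.PinLock()
	defer unlock()
	s.Put(key, data)
	// Without the shared lock, GC could run here and delete the
	// block before it is pinned.
	s.mu.Lock()
	s.pins[key] = true
	s.mu.Unlock()
}

// GC deletes every unpinned block and reports how many were removed.
func (s *store) GC() int {
	unlock := s.GCLock()
	defer unlock()
	s.mu.Lock()
	defer s.mu.Unlock()
	removed := 0
	for key := range s.blocks {
		if !s.pins[key] {
			delete(s.blocks, key)
			removed++
		}
	}
	return removed
}

func main() {
	s := newStore()
	s.Put("garbage", []byte("unreferenced"))
	s.AddPinned("wanted", []byte("keep me"))
	fmt.Println("removed by GC:", s.GC())
}
```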
This commit improves (fixes) the FetchGraph call for recursively
fetching every descendant node of a given merkledag node. This operation
should be the simplest way of ensuring that you have replicated a dag
locally.

This commit also implements a method in the merkledag package called
EnumerateChildren, which collects the keys of every descendant node of
the given node. All keys found are noted in the passed-in KeySet, which
may in the future be implemented on disk to avoid excessive memory
consumption.

License: MIT
Signed-off-by: Jeromy <jeromyj@gmail.com>
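
The traversal described above can be sketched roughly as follows. This is not the merkledag package's code: `Node`, `DAG`, and `KeySet` are simplified hypothetical types, with string keys and an in-memory map in place of a DAG service, and the function signature is an assumption.

```go
package main

import "fmt"

// Node is a simplified DAG node that links to children by key.
type Node struct {
	Key   string
	Links []string
}

// DAG is a hypothetical in-memory lookup from key to node.
type DAG map[string]*Node

// KeySet records which keys have been seen. It is a plain map here;
// the commit message notes it could later be backed by disk.
type KeySet map[string]struct{}

// EnumerateChildren adds the key of every descendant of root to set.
// Keys already in the set are skipped, so shared subtrees are walked once.
func EnumerateChildren(dag DAG, root *Node, set KeySet) error {
	for _, key := range root.Links {
		if _, seen := set[key]; seen {
			continue
		}
		set[key] = struct{}{}
		child, ok := dag[key]
		if !ok {
			return fmt.Errorf("node %q not found", key)
		}
		if err := EnumerateChildren(dag, child, set); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	dag := DAG{
		"root": {Key: "root", Links: []string{"a", "b"}},
		"a":    {Key: "a", Links: []string{"c"}},
		"b":    {Key: "b", Links: []string{"c"}}, // "c" is shared by "a" and "b"
		"c":    {Key: "c"},
	}
	set := KeySet{}
	if err := EnumerateChildren(dag, dag["root"], set); err != nil {
		panic(err)
	}
	fmt.Println("descendants found:", len(set))
}
```

Passing the set in, rather than returning a new one, lets a caller accumulate keys across several roots, which is what a pinner or GC pass would want.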
License: MIT
Signed-off-by: Jeromy <jeromyj@gmail.com>
License: MIT
Signed-off-by: Jeromy <jeromyj@gmail.com>
License: MIT
Signed-off-by: Jeromy <jeromyj@gmail.com>
License: MIT
Signed-off-by: Juan Batiz-Benet <juan@benet.ai>
License: MIT
Signed-off-by: Jeromy <jeromyj@gmail.com>

dont GC blocks used by pinner

License: MIT
Signed-off-by: Jeromy <jeromyj@gmail.com>

comment GC algo

License: MIT
Signed-off-by: Jeromy <jeromyj@gmail.com>

add lock to blockstore to prevent GC from eating wanted blocks

License: MIT
Signed-off-by: Jeromy <jeromyj@gmail.com>

improve FetchGraph

License: MIT
Signed-off-by: Jeromy <jeromyj@gmail.com>

separate interfaces for blockstore and GCBlockstore

License: MIT
Signed-off-by: Jeromy <jeromyj@gmail.com>

reintroduce indirect pinning, add enumerateChildren dag method

License: MIT
Signed-off-by: Jeromy <jeromyj@gmail.com>
License: MIT
Signed-off-by: Jeromy <jeromyj@gmail.com>
@rht
Contributor Author

rht commented Nov 19, 2015

Found it: in flatfs.New, the sync argument doesn't get passed into the Datastore object creation.
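
The class of bug described here can be sketched as follows. The types and constructors below are hypothetical simplifications, not the actual flatfs source: the point is only that a constructor which accepts an argument but never stores it leaves the field at its zero value, whatever the caller asked for.

```go
package main

import "fmt"

// Datastore is a simplified stand-in for a flatfs-style datastore.
type Datastore struct {
	path      string
	prefixLen int
	sync      bool // whether writes should be flushed to disk
}

// newBuggy accepts sync but never stores it, so the field is always false.
func newBuggy(path string, prefixLen int, sync bool) *Datastore {
	return &Datastore{path: path, prefixLen: prefixLen}
}

// newFixed passes the argument through to the struct.
func newFixed(path string, prefixLen int, sync bool) *Datastore {
	return &Datastore{path: path, prefixLen: prefixLen, sync: sync}
}

func main() {
	fmt.Println("buggy:", newBuggy("/tmp/blocks", 2, true).sync)
	fmt.Println("fixed:", newFixed("/tmp/blocks", 2, true).sync)
}
```

The Go compiler does not flag unused function parameters, which is one reason this kind of omission is easy to miss in review.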

@whyrusleeping
Member

@rht wow, good catch. thanks!

whyrusleeping and others added 6 commits November 19, 2015 11:15
License: MIT
Signed-off-by: Jeromy <jeromyj@gmail.com>
License: MIT
Signed-off-by: Jeromy <jeromyj@gmail.com>
if bucket doesnt have enough peers, grab more elsewhere
add closenotify and large timeout to gateway
License: MIT
Signed-off-by: rht <rhtbot@gmail.com>
Add config option for flatfs no-sync
@rht rht added the topic/perf Performance label Nov 22, 2015
License: MIT
Signed-off-by: rht <rhtbot@gmail.com>
@rht rht mentioned this pull request Nov 24, 2015
rht and others added 4 commits November 24, 2015 14:48
License: MIT
Signed-off-by: rht <rhtbot@gmail.com>
License: MIT
Signed-off-by: rht <rhtbot@gmail.com>
License: MIT
Signed-off-by: Jeromy <jeromyj@gmail.com>
ipfs files ls without -l is faster
@rht rht mentioned this pull request Dec 1, 2015
42 tasks
@jbenet
Member

jbenet commented Dec 1, 2015

LGTM. ok so do we still want to merge this @whyrusleeping ?

(what if the "edited dag" is too big to hold in memory? not sure what happens now? fails? but in future should store on disk and gc memory re-reading as necessary? (should be possible to edit massive dags with very little memory, but enough disk))

License: MIT
Signed-off-by: rht <rhtbot@gmail.com>

Small cleanup coreunix.Add

License: MIT
Signed-off-by: rht <rhtbot@gmail.com>
@rht
Contributor Author

rht commented Dec 1, 2015

(rebased)

The last (and only) InsertNodeAtPath in this PR doesn't consume much memory, since it just patches the final hash into the "global" root of an IpfsNode (that is, the root dir one would see at /ipfs if the repo is fuse-mounted).

@whyrusleeping
Member

closing, we use the mfs code for adds now (if changes from here are still relevant to that process, please file a new PR on top of master with them)

Labels
status/in-progress In progress topic/perf Performance