This repository has been archived by the owner on Feb 12, 2024. It is now read-only.

Implement Garbage Collection #2022

Merged
merged 56 commits into from
Aug 26, 2019
56 commits
eb4ab1f
feat: implement ipfs refs
dirkmc Apr 22, 2019
761e305
feat: refs support in http api
dirkmc Apr 24, 2019
2f632ec
feat: use ipld instead of unix-fs-exporter for refs
dirkmc Apr 28, 2019
7b498fb
feat: refs local
dirkmc Apr 30, 2019
c8e964a
feat: add refs.localPullStream && refs.localReadableStream
dirkmc May 1, 2019
0137caf
feat: make object.links work with CBOR
dirkmc May 3, 2019
f6d7a2a
feat: handle multiple refs. Better param handling
dirkmc May 3, 2019
793a355
feat: GC
dirkmc May 8, 2019
719a9f9
chore: add comment to explain cli param parsing
dirkmc May 8, 2019
a5db723
refactor: move links retrieval from object to refs
dirkmc May 9, 2019
ae27eb5
feat: expose GC to http api
dirkmc May 10, 2019
1e3aedc
test: unskip repo gc test
dirkmc May 10, 2019
df86ce4
fix: refactor and fix some bugs with GC
dirkmc May 10, 2019
0d5085d
feat: GC locking
dirkmc May 20, 2019
43b1720
test: add gc locking tests
dirkmc May 21, 2019
d970a32
refactor: rebase
dirkmc May 21, 2019
3ec57d9
fix: gc use uppercase dag.Links
dirkmc May 21, 2019
255dee3
chore: update package.json deps
dirkmc May 21, 2019
c8d1f08
chore: rebase
dirkmc May 21, 2019
568a1d9
chore: add joi to package.json
dirkmc May 22, 2019
c0007af
refactor: pin/gc common code
dirkmc May 22, 2019
8de0c2b
fix: browser gc tests
dirkmc May 22, 2019
28b615d
fix: gc parsing of block cid in browser
dirkmc May 22, 2019
05ae894
test: add gc-lock tests
dirkmc May 23, 2019
8b52444
fix: gc lock error handling
dirkmc May 23, 2019
0c7cfdf
fix: gc - take pin lock after resolve
dirkmc May 23, 2019
81e3dd0
fix: make sure each GCLock instance uses distinct mutex
dirkmc May 23, 2019
7898ca2
fix: choose non-overlapping port for GC test
dirkmc May 24, 2019
bdcbddb
fix: better gc test port config
dirkmc May 24, 2019
a07ed7f
test: increase timeout for repo gc test
dirkmc May 24, 2019
ef86efc
fix: webworkers + mortice
dirkmc May 28, 2019
75159a0
chore: refactor mortice options
dirkmc May 28, 2019
c2b5ef6
fix: gc rm test on Windows
dirkmc May 28, 2019
7cfc53b
fix: ensure gc filters all internal pins
dirkmc Jun 7, 2019
026158f
test: enable gc tests over ipfs-http-client
dirkmc Jun 7, 2019
bf4e731
chore: better gc logging
dirkmc Jun 10, 2019
356e263
fix: pin walking
dirkmc Jun 14, 2019
58f34d6
refactor: pin set walking
dirkmc Jun 21, 2019
168046a
refactor: import pull modules directly
dirkmc Jul 9, 2019
216e53a
chore: update mortice package
dirkmc Jul 9, 2019
4712178
refactor: use assert.fail() instead of throwing for programmer err
dirkmc Jul 9, 2019
681b577
chore: lint fixes
dirkmc Jul 9, 2019
c81a920
chore: update ipfs-http-client
dirkmc Jul 10, 2019
11a02dd
fix: path to gc-lock
dirkmc Jul 10, 2019
1e4c97a
fix: apply review comments
dirkmc Jul 15, 2019
a8c362c
chore: address review comments
dirkmc Jul 17, 2019
43a2644
refactor: simplify gc-lock
dirkmc Jul 18, 2019
c992c51
refactor: move EventEmitter from GCLock to test
dirkmc Jul 18, 2019
c364b19
refactor: better default args handing in Mutex
dirkmc Jul 18, 2019
358b957
fix: lint fixes
dirkmc Jul 19, 2019
bcc8a69
fix: use repo path as mortice id
dirkmc Jul 19, 2019
24c072e
test: add repo gc cli tests
dirkmc Jul 22, 2019
97f8054
fix: remove hacky cocde
dirkmc Jul 22, 2019
36f45f4
test: sharness test for GC
dirkmc Jul 23, 2019
d085a30
fix: review feedback
alanshaw Aug 21, 2019
f869455
fix: Do not load all of a DAG into memory when pinning (#2387)
achingbrain Aug 23, 2019
7 changes: 5 additions & 2 deletions package.json
@@ -65,7 +65,7 @@
     "array-shuffle": "^1.0.1",
     "async": "^2.6.1",
     "async-iterator-all": "^1.0.0",
-    "async-iterator-to-pull-stream": "^1.1.0",
+    "async-iterator-to-pull-stream": "^1.3.0",
     "async-iterator-to-stream": "^1.1.0",
     "base32.js": "~0.1.0",
     "bignumber.js": "^9.0.0",
@@ -83,6 +83,7 @@
     "debug": "^4.1.0",
     "dlv": "^1.1.3",
     "err-code": "^2.0.0",
+    "explain-error": "^1.0.4",
     "file-type": "^12.0.1",
     "fnv1a": "^1.0.1",
     "fsm-event": "^2.1.0",
@@ -95,7 +96,7 @@
     "ipfs-bitswap": "~0.25.1",
     "ipfs-block": "~0.8.1",
     "ipfs-block-service": "~0.15.2",
-    "ipfs-http-client": "^33.1.0",
+    "ipfs-http-client": "^33.1.1",
     "ipfs-http-response": "~0.3.1",
     "ipfs-mfs": "~0.12.0",
     "ipfs-multipart": "~0.1.1",
@@ -139,6 +140,7 @@
     "merge-options": "^1.0.1",
     "mime-types": "^2.1.21",
     "mkdirp": "~0.5.1",
+    "mortice": "^1.2.2",
     "multiaddr": "^6.1.0",
     "multiaddr-to-uri": "^5.0.0",
     "multibase": "~0.6.0",
@@ -193,6 +195,7 @@
     "ipfsd-ctl": "^0.43.0",
     "libp2p-websocket-star": "~0.10.2",
     "ncp": "^2.0.0",
+    "p-event": "^4.1.0",
     "qs": "^6.5.2",
     "rimraf": "^3.0.0",
     "sinon": "^7.4.0",

29 changes: 24 additions & 5 deletions src/cli/commands/repo/gc.js
@@ -5,12 +5,31 @@ module.exports = {

   describe: 'Perform a garbage collection sweep on the repo.',

-  builder: {},
+  builder: {
+    quiet: {
+      alias: 'q',
+      desc: 'Write minimal output',
+      type: 'boolean',
+      default: false
+    },
+    'stream-errors': {
+      desc: 'Output individual errors thrown when deleting blocks.',
+      type: 'boolean',
+      default: true
+    }
+  },

-  handler (argv) {
-    argv.resolve((async () => {
-      const ipfs = await argv.getIpfs()
-      await ipfs.repo.gc()
+  handler ({ getIpfs, print, quiet, streamErrors, resolve }) {
+    resolve((async () => {
+      const ipfs = await getIpfs()
+      const res = await ipfs.repo.gc()
+      for (const r of res) {
+        if (r.err) {
+          streamErrors && print(r.err.message, true, true)
+        } else {
+          print((quiet ? '' : 'removed ') + r.cid)
+        }
+      }
     })())
   }
 }

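
The output rules in the new handler can be tried in isolation. The following is a minimal sketch, not code from this PR: formatGcOutput is a hypothetical helper, the sample results are made up, and the result shape ({ cid } on success, { err } on failure) is inferred only from the loop in the diff above.

```javascript
'use strict'

// Hypothetical helper mirroring the loop in the new handler: it turns the
// array returned by `ipfs.repo.gc()` into the lines the CLI would print.
function formatGcOutput (results, { quiet = false, streamErrors = true } = {}) {
  const lines = []
  for (const r of results) {
    if (r.err) {
      // Errors are only reported when --stream-errors is enabled (the default)
      if (streamErrors) lines.push(r.err.message)
    } else {
      // --quiet drops the "removed " prefix and prints the bare CID
      lines.push((quiet ? '' : 'removed ') + r.cid)
    }
  }
  return lines
}

// Made-up sample data: the CIDs are placeholder strings, not real hashes.
const sample = [
  { cid: 'QmExampleCid1' },
  { err: new Error('could not delete block') },
  { cid: 'QmExampleCid2' }
]

console.log(formatGcOutput(sample))
console.log(formatGcOutput(sample, { quiet: true, streamErrors: false }))
```

With the defaults, every removed block and every error produces a line; with quiet set and streamErrors disabled, only the bare CIDs remain.
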
25 changes: 15 additions & 10 deletions src/core/components/block.js
@@ -81,17 +81,19 @@ module.exports = function block (self) {
             cb(null, new Block(block, cid))
           })
         },
-        (block, cb) => self._blockService.put(block, (err) => {
-          if (err) {
-            return cb(err)
-          }
+        (block, cb) => self._gcLock.readLock((_cb) => {
+          self._blockService.put(block, (err) => {
+            if (err) {
+              return _cb(err)
+            }

-          if (options.preload !== false) {
-            self._preload(block.cid)
-          }
+            if (options.preload !== false) {
+              self._preload(block.cid)
+            }

-          cb(null, block)
-        })
+            _cb(null, block)
+          })
+        }, cb)
       ], callback)
     }),
     rm: promisify((cid, callback) => {
@@ -100,7 +102,10 @@ module.exports = function block (self) {
       } catch (err) {
         return setImmediate(() => callback(errCode(err, 'ERR_INVALID_CID')))
       }
-      self._blockService.delete(cid, callback)
+
+      // We need to take a write lock here to ensure that adding and removing
+      // blocks are exclusive operations
+      self._gcLock.writeLock((cb) => self._blockService.delete(cid, cb), callback)
     }),
     stat: promisify((cid, options, callback) => {
       if (typeof options === 'function') {

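
The read/write split used above (block.put under a read lock, block.rm under a write lock) can be illustrated with a small stand-alone lock. This is a sketch under stated assumptions: SimpleRWLock is a hypothetical class written for this example, it serves requests in arrival order, and it only imitates the readLock(lockedFn, cb) / writeLock(lockedFn, cb) call shape seen in the diff. It is not the PR's GCLock and not the mortice API.

```javascript
'use strict'

// Hypothetical stand-in for the `self._gcLock` object used in the diff.
// Readers may overlap; a writer runs alone. Requests are served in FIFO order.
class SimpleRWLock {
  constructor () {
    this.queue = [] // pending { type, lockedFn, cb } requests
    this.activeReaders = 0
    this.writerActive = false
  }

  readLock (lockedFn, cb) {
    this._enqueue('read', lockedFn, cb)
  }

  writeLock (lockedFn, cb) {
    this._enqueue('write', lockedFn, cb)
  }

  _enqueue (type, lockedFn, cb) {
    this.queue.push({ type, lockedFn, cb })
    this._drain()
  }

  // Start as many queued requests as the current lock state allows
  _drain () {
    while (this.queue.length > 0 && !this.writerActive) {
      const next = this.queue[0]
      if (next.type === 'write') {
        if (this.activeReaders > 0) return // writer waits for readers to finish
        this.queue.shift()
        this.writerActive = true
        this._run(next, () => { this.writerActive = false })
      } else {
        this.queue.shift()
        this.activeReaders++
        this._run(next, () => { this.activeReaders-- })
      }
    }
  }

  // Run the locked function, then release the lock and report its result
  _run ({ lockedFn, cb }, release) {
    lockedFn((err, res) => {
      release()
      cb(err, res)
      this._drain()
    })
  }
}

// Demo: two "put" operations overlap under read locks; the "rm" operation
// asks for the write lock and only starts once both reads have finished.
const lock = new SimpleRWLock()
const events = []
const pendingReads = []

for (const name of ['put A', 'put B']) {
  lock.readLock(
    (done) => { events.push(name + ' start'); pendingReads.push(done) },
    () => events.push(name + ' end')
  )
}
lock.writeLock(
  (done) => { events.push('rm start'); done() },
  () => events.push('rm end')
)
events.push('rm queued')
pendingReads.forEach((done) => done())

console.log(events)
```

In the logged order, both puts start before either ends, and "rm start" appears only after "put B end", which is the exclusivity the comment in the diff asks for.
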
8 changes: 5 additions & 3 deletions src/core/components/files-regular/add-pull-stream.js
@@ -116,7 +116,9 @@ function pinFile (file, self, opts, cb) {
   const isRootDir = !file.path.includes('/')
   const shouldPin = pin && isRootDir && !opts.onlyHash && !opts.hashAlg
   if (shouldPin) {
-    return self.pin.add(file.hash, { preload: false }, err => cb(err, file))
+    // Note: addPullStream() has already taken a GC lock, so tell
+    // pin.add() not to take a (second) GC lock
+    return self.pin.add(file.hash, { preload: false, lock: false }, err => cb(err, file))
   } else {
     cb(null, file)
   }
@@ -156,7 +158,7 @@ module.exports = function (self) {
     }

     opts.progress = progress
-    return pull(
+    return self._gcLock.pullReadLock(() => pull(
       pullMap(content => normalizeContent(content, opts)),
       pullFlatten(),
       pullMap(file => ({
@@ -167,6 +169,6 @@ module.exports = function (self) {
       pullAsyncMap((file, cb) => prepareFile(file, self, opts, cb)),
       pullMap(file => preloadFile(file, self, opts)),
       pullAsyncMap((file, cb) => pinFile(file, self, opts, cb))
-    )
+    ))
   }
 }

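
The lock: false option in the diff above follows a common pattern: an outer operation holds the lock for its whole duration and tells nested steps not to request it again. A minimal sketch of that pattern follows. All names here (gcLock, pinAdd, addAndPin) are hypothetical look-alikes invented for this example, and the counting lock runs its function immediately, so it demonstrates only how often the lock is requested, not real locking.

```javascript
'use strict'

// Hypothetical counting lock: runs the function immediately and only records
// how many times a read lock was requested.
const gcLock = {
  acquisitions: 0,
  readLock (lockedFn, cb) {
    this.acquisitions++
    lockedFn(cb)
  }
}

// Hypothetical pin.add() look-alike: takes the lock unless told otherwise.
function pinAdd (hash, options, cb) {
  const doPin = (done) => done(null, hash)
  if (options.lock === false) {
    return doPin(cb) // caller already holds the lock
  }
  gcLock.readLock(doPin, cb)
}

// Hypothetical add() look-alike: holds the lock for the whole operation and
// passes `lock: false` so the nested pin step does not request it again.
function addAndPin (hash, cb) {
  gcLock.readLock((done) => pinAdd(hash, { lock: false }, done), cb)
}

let pinned
addAndPin('QmExampleCid', (err, res) => {
  if (err) throw err
  pinned = res
})

console.log(gcLock.acquisitions) // 1: only the outer operation took the lock
```

Calling pinAdd directly with an empty options object would request the lock itself, which is the behaviour a stand-alone pin operation needs.
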
28 changes: 15 additions & 13 deletions src/core/components/object.js
@@ -242,19 +242,21 @@ module.exports = function object (self) {
       }

       function next () {
-        self._ipld.put(node, multicodec.DAG_PB, {
-          cidVersion: 0,
-          hashAlg: multicodec.SHA2_256
-        }).then(
-          (cid) => {
-            if (options.preload !== false) {
-              self._preload(cid)
-            }
-
-            callback(null, cid)
-          },
-          (error) => callback(error)
-        )
+        self._gcLock.readLock((cb) => {
+          self._ipld.put(node, multicodec.DAG_PB, {
+            cidVersion: 0,
+            hashAlg: multicodec.SHA2_256
+          }).then(
+            (cid) => {
+              if (options.preload !== false) {
+                self._preload(cid)
+              }
+
+              cb(null, cid)
+            },
+            cb
+          )
+        }, callback)
       }
     }),

Review thread on the added self._gcLock.readLock line:

Member:
I might have missed the explanation. Can someone link me to it, or explain here why GC needs another lock on top of the Repo lock?

Contributor Author:
The GC lock is used to prevent blocks being added/removed from the blockstore while GC is running. My understanding is that the Repo lock is used to prevent multiple instances of js-ipfs-repo from opening the same path, is that correct?

Member (daviddias, Jul 22, 2019):
Thanks for clarifying.

Now wondering: what happens to all in-flight blocks coming in through Bitswap? Is there a test to see if js-ipfs OOMs when performing a GC and someone sends a ton of blocks through Bitswap that get queued in memory?

Contributor Author:
That's a good question - in general that's the problem with stop-the-world GC, it blocks everything up while GC is being performed.

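
One way to see why the review thread wants blocks kept out of the blockstore while GC runs is a toy mark-and-sweep pass. The sketch below is an illustration under assumptions, not the PR's implementation: the in-memory Map, the block names, and the mark and sweep helpers are all hypothetical, and the collector is modelled as "keep everything reachable from a pin, delete the rest".

```javascript
'use strict'

// Hypothetical in-memory model: block id -> ids of the blocks it links to.
const blocks = new Map([
  ['root', ['a', 'b']],
  ['a', ['c']],
  ['b', []],
  ['c', []],
  ['orphan', ['c']]
])
const pins = ['root']

// Mark: collect every block reachable from a pin.
function mark (blocks, pins) {
  const marked = new Set()
  const stack = [...pins]
  while (stack.length > 0) {
    const id = stack.pop()
    if (marked.has(id) || !blocks.has(id)) continue
    marked.add(id)
    stack.push(...blocks.get(id))
  }
  return marked
}

// Sweep: delete everything that was not marked and report what was removed.
function sweep (blocks, marked) {
  const removed = []
  for (const id of [...blocks.keys()]) {
    if (!marked.has(id)) {
      blocks.delete(id)
      removed.push(id)
    }
  }
  return removed
}

const marked = mark(blocks, pins)

// A block stored after the mark phase but before the sweep is not in
// `marked`, so the sweep deletes it even though it was just added.
blocks.set('late', [])

const removed = sweep(blocks, marked)
console.log(removed) // [ 'orphan', 'late' ]
```

Running the add under a read lock and the whole mark-and-sweep under a write lock would make the 'late' insert wait until the sweep has finished, at the cost of the stop-the-world pause discussed in the thread.
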