Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix broken tests on elements and various cleanups #3568

Merged
merged 10 commits into from
Mar 23, 2020

Conversation

cdecker
Copy link
Member

@cdecker cdecker commented Mar 4, 2020

This was supposed to be a regular cleanup PR, but given that a couple of tests broke along the way for liquid-regtest, it took me a bit longer.

  • Create test directory: some tests, especially those testing tools, will not create the working directory themselves, so we create it when the fixture is created.
  • Strengthened a couple of tx creation tests, mainly minimizing the array allocation sizes and ensuring everything is set up correctly (subsumes chain specific modifications).
  • MIgrate away from deprecate pylightning dependency towards pyln-client
  • There is a lingering test failure in which we fail to remove the working directory. I added a printing of the file listing to see what files are causing that issue. It then rethrows the error so we don't just swallow it.

There is one more thing I need to find before un-drafting:

  • If running with DEVELOPER=1, TEST_DB_PROVIDER=postgres and VALGRIND=1 there is a memory leak when retrieving UTXOs for txprepare.

Changelog-None

@cdecker cdecker force-pushed the cleanups branch 3 times, most recently from 7bd62a4 to e1e2801 Compare March 9, 2020 14:34
@cdecker
Copy link
Member Author

cdecker commented Mar 16, 2020

I'll have to revisit the memleak when running with postgres later, don't want to hold up the other cleanups since they are rather nice imho.

@cdecker cdecker marked this pull request as ready for review March 16, 2020 14:09
@cdecker cdecker requested a review from niftynei March 16, 2020 14:09
@cdecker cdecker added this to the 0.8.2 milestone Mar 16, 2020
@cdecker cdecker force-pushed the cleanups branch 3 times, most recently from 8b5a055 to d63cf6c Compare March 17, 2020 17:44
@cdecker
Copy link
Member Author

cdecker commented Mar 17, 2020

Managed to fix two more regressions:

  • The memleak that was present in postgres due to unordered UTXOs was also a leak with sqlite3 but manifested only rarely. Now fixed.
  • A new test that broke because it expected mainnet fees on liquid. Parameterized using chainparams

This should now be complete and ready to be reviewed. Paging release captain @niftynei 😉

@cdecker
Copy link
Member Author

cdecker commented Mar 17, 2020

Here are the results from running all the various configurations:

[{"['TEST_NETWORK=regtest', 'VALGRIND=0', 'PYTEST_PAR=10', 'TEST_DB_PROVIDER=postgres', 'DEVELOPER=1']": '0'},
 {"['TEST_NETWORK=regtest', 'VALGRIND=0', 'PYTEST_PAR=10', 'TEST_DB_PROVIDER=sqlite3', 'DEVELOPER=1']": '0'},
 {"['TEST_NETWORK=regtest', 'VALGRIND=1', 'PYTEST_PAR=10', 'TEST_DB_PROVIDER=postgres', 'DEVELOPER=1']": '0'},
 {"['TEST_NETWORK=regtest', 'VALGRIND=1', 'PYTEST_PAR=10', 'TEST_DB_PROVIDER=sqlite3', 'DEVELOPER=1']": '0'},
 {"['TEST_NETWORK=liquid-regtest', 'VALGRIND=0', 'PYTEST_PAR=10', 'TEST_DB_PROVIDER=postgres', 'DEVELOPER=1']": '0'},
 {"['TEST_NETWORK=liquid-regtest', 'VALGRIND=0', 'PYTEST_PAR=10', 'TEST_DB_PROVIDER=sqlite3', 'DEVELOPER=1']": '0'},
 {"['TEST_NETWORK=liquid-regtest', 'VALGRIND=1', 'PYTEST_PAR=10', 'TEST_DB_PROVIDER=postgres', 'DEVELOPER=1']": '0'},
 {"['TEST_NETWORK=liquid-regtest', 'VALGRIND=1', 'PYTEST_PAR=10', 'TEST_DB_PROVIDER=sqlite3', 'DEVELOPER=1']": '0'}]

Some tests may not spawn a node at all, so make sure that our assumption that
the directory exists in the fixture cleanup is correct by creating the
directory.
@rustyrussell
Copy link
Contributor

Simple rebase: test conflicts.

Copy link
Collaborator

@darosior darosior left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't ACK the element stuff, but otherwise looks good to me.

Leaved a few questions and restarted Travis

@@ -272,7 +272,7 @@ ifeq ($(PYTEST),)
exit 1
else
# Explicitly hand DEVELOPER and VALGRIND so you can override on make cmd line.
PYTHONPATH=`pwd`/contrib/pyln-client:`pwd`/contrib/pyln-testing:`pwd`/contrib/pylightning:`pwd`/contrib/pyln-proto/:$(PYTHONPATH) TEST_DEBUG=1 DEVELOPER=$(DEVELOPER) VALGRIND=$(VALGRIND) $(PYTEST) tests/ $(PYTEST_OPTS)
PYTHONPATH=`pwd`/contrib/pyln-client:`pwd`/contrib/pyln-testing:`pwd`/contrib/pyln-proto/:$(PYTHONPATH) TEST_DEBUG=1 DEVELOPER=$(DEVELOPER) VALGRIND=$(VALGRIND) $(PYTEST) tests/ $(PYTEST_OPTS)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉

bitcoin/tx.c Outdated
}

if (pos == -1) {
if (pos >= tx->wtx->num_outputs) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It cannot be >, can it ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. For some reason I'm in the habit of being overly cautious in these cases (spurious increments somewhere), but I'll fixup in this case :-)

bitcoin/tx.c Outdated

/* Make sure we have a place to stash the witness script in. */
if (tal_count(tx->output_witscripts) < pos + 1) {
fprintf(stderr, "XXX resizing array to %d to fit item at %d\n", pos + 1, pos);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trailing debug line ?

@@ -465,7 +465,7 @@ def test_closing_fee(node_factory, bitcoind, chainparams):
'funder_feerate_per_kw': 29006,
'fundee_feerate_per_kw': 27625,
'close_initiated_by': 'funder',
'expected_close_fee': 20333
'expected_close_fee': 33533 if chainparams['elements'] else 20333
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is this calculated ? Asking for a friend #3592 where I wondered how much to bump it in functional tests (e.g. 616a432#diff-d03575dee065f2dc319cf0e8d39ebe95R548)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I usually just binary search until I find params that work on both sides (keeping as many parameters as possible fixed).

I have not found a good way to derive from first principles, other than reimplementing the logic in python, which'd just double the maintenance effort without a major upside.

@@ -403,8 +403,10 @@ static const struct utxo **wallet_select(const tal_t *ctx, struct wallet *w,
* confirmation height and that it is below the required
* maxheight (current_height - minconf) */
if (maxheight != 0 &&
(!u->blockheight || *u->blockheight > maxheight))
(!u->blockheight || *u->blockheight > maxheight)) {
tal_free(u);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doing so will unreserve the utxo, right ? If so, I think that's what I missed to fix #3591 in #3593 !

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not yet: we are in wallet_select so the coins have not been marked as selected yet (that happens 6 lines down). It's the destructor added to the tallocated array on line 361 that'd free them, but they were not being added yet (just skipped) before this fix.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a roundabout way of saying that to unmark selected UTXOs you need to tal_free the utxos array :-)

We roll the `elements_add_fee_output` function and the cropping of
overallocated arrays into the `bitcoin_tx_finalize` function. This is supposed
to be the final cleanup and compaction step before a tx can be sent to bitcoin
or passed off to other daemons.

This is the cleanup promised in ElementsProject#3491
`pylightning` is not much more than an alias for `pyln-client`, so this
removes the need to install that as well just to run the tests.
For some reason we fail to remove the test directory in some cases. My
hypothesis is that it is a daemon that is not completely shut down yet, and
still writes to the directory. This commit intercepts the error, prints any
files in the directory and re-raises the error. This should allow us to debug
the reappears.
pytest captures the output by monkey patching out `sys.stdout`. This may
conflict with our use of `sys.stdout` when configuring logging, resulting in
the "Write to closed file" issue that is spamming the logs. By making the
logging configuration a fixture hopefully we always use the correct
stdout (after pytest has monkey-patched it).
Looking for specific feerates, but not adjusting the amounts involved doesn't
work.
Postgres does not guarantee that the insertion order is the returned order,
which leads us to skip outputs that have already been stolen onto the selected
utxos set, but not added to it because it isn't confirmed. This may also
happen with sqlite3 though it's a lot rarer in that case.
@rustyrussell
Copy link
Contributor

Ack c85988f

@rustyrussell rustyrussell merged commit e99ab03 into ElementsProject:master Mar 23, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants