Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[hailctl] Fix dependency installation for hailctl dataproc clusters #12510

Merged
merged 22 commits into from
Dec 13, 2022

Conversation

daniel-goldstein
Copy link
Contributor

@daniel-goldstein daniel-goldstein commented Nov 28, 2022

I broke this when I separated out the dependencies for hail into two layers:

  1. hailtop dependencies
  2. hail dependencies, which builds on top of the hailtop dependencies.

This fix does two things:

  • Use the full dependencies in 1 & 2
  • Use fully pinned dependencies when installing on clusters which seems better than using our wide-range dependencies. I left the install-deps and install-dev-deps as the normal requirements files as those are meant for development (I think?) but am happy to take opinions on whether we should use fully pinned deps there as well. I have so far been going by the rule of thumb of fully-pinned for CI and production environments, more lax rules for dev environments.

See here for additional context.

cc: @tpoterba, any idea why the test dataproc test succeeded?

@tpoterba
Copy link
Contributor

test_dataproc only runs on deploys AFAIK

@daniel-goldstein
Copy link
Contributor Author

Like a release? I think the terminology is a little overloaded. In services code we consider "deploy" as the CI pipeline that runs whenever main is updated.

@daniel-goldstein
Copy link
Contributor Author

daniel-goldstein commented Nov 28, 2022

Ah I see, all these steps cut out early for non-release deploys. So we wouldn't have released a broken package which is good, but we'd still end up with a broken deploy, which is annoying.

@tpoterba
Copy link
Contributor

Sorry about the incorrect terminology! Your conclusion is my understanding, yes.

hail/Makefile Outdated
@@ -268,13 +268,13 @@ $(eval $(call ENV_VAR,cloud_base))
$(eval $(call ENV_VAR,wheel_cloud_path))

python/hailtop/hailctl/deploy.yaml: env/cloud_base env/wheel_cloud_path
python/hailtop/hailctl/deploy.yaml: $(resources) python/requirements.txt
python/hailtop/hailctl/deploy.yaml: $(resources) python/pinned-requirements.txt
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this target does not exist.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it need to be? It's just a file in the repo

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a foot-gun to me. Suppose I'm trying to experiment with a modified version of Hail. I've modified python/requirements.txt but I've not run generate-pip-lockfile. I run make install-hailctl thinking that this will incorporate my changes into a new wheel file which I can use to start cluster. In reality, my changes are ignored and the old pinned requirements are used.

I think we need a rule like:

python/pinned-requirements.txt: python/requirements.txt
	docker run --rm -it \
		-v $(HAIL_HAIL_DIR):/hail \
		python:3.7-slim \
		/bin/bash -c "pip install pip-tools && cd /hail && \
	pip-compile --upgrade python/requirements.txt \
		--output-file=python/pinned-requirements.txt"

Or a rule like:

python/pinned-requirements.txt: generate-pip-lockfile

This latter rule will trigger generate-pip-lockfile to always be run if we depend on the pinned-requirements.txt which isn't great. I think the best option is to have a rule for each requirements file and to also have generate-pip-lockfile as a special fast path for developers who want to update all the lock files.

Copy link
Contributor

@danking danking Nov 30, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or, if we want to avoid invoking docker thrice, we could do this:

python/pinned-requirements.txt: update-pip-lockfiles
python/dev/pinned-requirements.txt: update-pip-lockfiles
python/hailtop/pinned-requirements.txt: update-pip-lockfiles
.PHONY: update-pip-lockfiles
update-pip-lockfiles: python/requirements.txt python/dev/requirements.txt python/hailtop/requirements.txt
    ...

And then to forcibly regenerate you gotta use -B/--always-make

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the clarification, this makes a lot of sense! I will incorporate these changes.

hail/Makefile Outdated
@@ -324,7 +324,7 @@ install: $(WHEEL)

.PHONY: install-on-cluster
install-on-cluster: $(WHEEL)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this target can run without generated pinned requirements files.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I understand entirely. Do you mean that install-on-cluster should use unpinned requirements or is this a concern about the target / dependencies of the target?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is similar to my concerns above, there's an implicit dependency between python/requirements.txt and python/pinned-requirements.txt. At the very least we should error if the pinned-requirements.txt is older than the requirements.txt

@vladsavelyev
Copy link
Contributor

vladsavelyev commented Nov 28, 2022

Thank you! (this issue is blocking us now)

hail/Makefile Outdated
@@ -268,13 +268,13 @@ $(eval $(call ENV_VAR,cloud_base))
$(eval $(call ENV_VAR,wheel_cloud_path))

python/hailtop/hailctl/deploy.yaml: env/cloud_base env/wheel_cloud_path
python/hailtop/hailctl/deploy.yaml: $(resources) python/requirements.txt
python/hailtop/hailctl/deploy.yaml: $(resources) python/pinned-requirements.txt
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a foot-gun to me. Suppose I'm trying to experiment with a modified version of Hail. I've modified python/requirements.txt but I've not run generate-pip-lockfile. I run make install-hailctl thinking that this will incorporate my changes into a new wheel file which I can use to start cluster. In reality, my changes are ignored and the old pinned requirements are used.

I think we need a rule like:

python/pinned-requirements.txt: python/requirements.txt
	docker run --rm -it \
		-v $(HAIL_HAIL_DIR):/hail \
		python:3.7-slim \
		/bin/bash -c "pip install pip-tools && cd /hail && \
	pip-compile --upgrade python/requirements.txt \
		--output-file=python/pinned-requirements.txt"

Or a rule like:

python/pinned-requirements.txt: generate-pip-lockfile

This latter rule will trigger generate-pip-lockfile to always be run if we depend on the pinned-requirements.txt which isn't great. I think the best option is to have a rule for each requirements file and to also have generate-pip-lockfile as a special fast path for developers who want to update all the lock files.

hail/Makefile Outdated
@@ -324,7 +324,7 @@ install: $(WHEEL)

.PHONY: install-on-cluster
install-on-cluster: $(WHEEL)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is similar to my concerns above, there's an implicit dependency between python/requirements.txt and python/pinned-requirements.txt. At the very least we should error if the pinned-requirements.txt is older than the requirements.txt

@daniel-goldstein
Copy link
Contributor Author

daniel-goldstein commented Dec 6, 2022

Because of the unpredictable way that git clone might realize the requirements files, I removed the pinned-requirements file as a dependency of the changed targets. In both cases, regenerating that file (either in CI as part of the deploy.yaml target or on a cluster for install-on-cluster) could cause a dataproc cluster running with untested dependency versions even if the requirements.txt files are unchanged. I do, however, require that the pinned-requirements files be compatible using the same check we do in CI

I performed the following manual testing:

  1. Creating a dataproc cluster through hailctl dataproc start
  2. ssh'ing into said cluster, cloning this branch and running make -C hail install-on-cluster to completion
  3. Updating the requirements.txt file to something incompatible and successfully installing on cluster again with updated pinned requirements

However, I'm not sure I'm actually doing this right. I checked that in step 2 I was not regenerating any pinned-requirments files, but in step 3 make updated the pinned requirements without me telling it to, I'm guessing because of the wheel's dependence on PY_FILES and I changed the source under hail/python. So I don't entirely understand why I have this desirable result.

danking
danking previously requested changes Dec 6, 2022
hail/Makefile Outdated
HAIL_HAIL_DIR=$(HAIL_HAIL_DIR) \
../generate_pip_lockfile.sh python/hailtop/requirements.txt python/hailtop/pinned-requirements.txt

python/pinned-requirements.txt: python/hailtop/requirements.txt python/requirements.txt
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we have dependencies on hailtop here and on hailtop and hail in the dev target?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

python/requirements.txt includes those requirements from python/hailtop/requirements.txt, and the dev requirements are constrained by python/requirements.txt, so if any of these requirements files changed, it could affect the pinned-requirements.

hail/Makefile Outdated
rm -f $@
echo "dataproc:" >> $@
for FILE in $(notdir $(resources)); do \
echo " $$FILE: $(cloud_base)/$$FILE" >> $@ || exit 1; done
echo " wheel: $(wheel_cloud_path)" >> $@
echo " pip_dependencies: $(shell cat python/requirements.txt | sed '/^[[:blank:]]*#/d;s/#.*//' | grep -v pyspark | tr "\n" "|||")" >> $@
printf " pip_dependencies: " >> $@
$(PIP) install pip-tools && ../check_pip_requirements.sh python/requirements.txt python/pinned-requirements.txt
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do I understand correctly that this will solve dependencies using both files as constraints, then compare the solution to the pinned (second argument)? The idea being that if I add a new dependency, a new transitive dependency, or a conflicting version to the first file, it will trigger a diff failure when the solution is compared to the existing pinned file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes

@daniel-goldstein daniel-goldstein dismissed danking’s stale review December 9, 2022 18:25

Removed the targets due to the reproducibility issues we discussed, but ensured that the targets that use the pinned requirements files check that they are still compatible with the requirements.txt files. I tested this by changing the requirements file (one by making an incompatible version change, one by adding a new dependency) and saw that the deploy.yaml make target failed and emitted a message to recreate the lockfiles

@daniel-goldstein
Copy link
Contributor Author

@danking Sorry for the delay on this. Trying to move this along faster now to unblock the release

Copy link
Contributor

@danking danking left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please follow up with change to target naming of check-pip-lockfiles

@danking danking merged commit 01c8099 into hail-is:main Dec 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants