v0.10.4 with new build strategy #43
Conversation
Hi! This is the friendly automated conda-forge-linting service. I just wanted to let you know that I linted all conda-recipes in your PR (…).
@conda-forge-admin, please rerender.
…nda-forge-pinning 2022.08.27.13.34.34
@mkitti and @MilesCranmer this works as expected. But have a look inside the logs...
also @ocefpaf
This is obviously just a draft. I will want to rework this completely so that we do it more properly. Some issues to address: …
@mkitti please see how it is structured: if you download the artifacts from here https://artprodeus21.artifacts.visualstudio.com/A910fa339-c7c2-46e8-a579-7ea247548706/84710dde-1620-425b-80d0-4cf5baca359d/_apis/artifact/cGlwZWxpbmVhcnRpZmFjdDovL2NvbmRhLWZvcmdlL3Byb2plY3RJZC84NDcxMGRkZS0xNjIwLTQyNWItODBkMC00Y2Y1YmFjYTM1OWQvYnVpbGRJZC81NTk1ODkvYXJ0aWZhY3ROYW1lL2NvbmRhX2FydGlmYWN0c18yMDIyMDgyOC42LjFfbGludXhfNjRf0/content?format=zip you will find this package under "broken"; you can then unzip it and see how it got structured. The test still fails with PyCall not being installed correctly...
The depot we package should only contain two directories.
That's why I created a temporary directory for the depot and then only copied those two directories.
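A minimal sketch of that temporary-depot step, assuming (as in the later suggestion) that the two directories to keep are packages and artifacts and that the build's depot lives under ${PREFIX}/share/julia; the exact paths in the recipe may differ:
# Sketch only: keep just packages/ and artifacts/, drop everything else.
tmp_depot="$(mktemp -d)"
cp -a "${PREFIX}/share/julia/packages"  "${tmp_depot}/"
cp -a "${PREFIX}/share/julia/artifacts" "${tmp_depot}/"
rm -rf "${PREFIX}/share/julia"
mkdir -p "${PREFIX}/share/julia"
mv "${tmp_depot}/packages" "${tmp_depot}/artifacts" "${PREFIX}/share/julia/"
rmdir "${tmp_depot}"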
Hi! This is the friendly automated conda-forge-linting service. I wanted to let you know that I linted all conda-recipes in your PR (…). Here's what I've got... For recipe: …
Hi! This is the friendly automated conda-forge-linting service. I just wanted to let you know that I linted all conda-recipes in your PR (…).
…nda-forge-pinning 2022.08.29.15.07.45
We need to add ${PREFIX}/share/SymbolicRegression.jl/depot to the JULIA_DEPOT_PATH environment variable, with : separators.
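For example, an activation script could append the packaged depot; a minimal sketch, essentially what the activate.sh added later in this PR does:
# JULIA_DEPOT_PATH is a ':'-separated stack of depots; appending keeps the
# user's primary (writable) depot first and adds the bundled one after it.
export JULIA_DEPOT_PATH="${JULIA_DEPOT_PATH}:${PREFIX}/share/SymbolicRegression.jl/depot"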
recipe/build.sh
Outdated
@@ -8,6 +8,6 @@ ${PYTHON} -c 'import pysr; pysr.install()'
ls ${PREFIX}/share/julia
-rm -rf ${PREFIX}/share/julia/!(packages|artifacts)
+rm -rf '${PREFIX}/share/julia/!(packages|artifacts)'
This line does not seem to do what you want.
+ rm -rf '${PREFIX}/share/julia/!(packages|artifacts)'
+ ls /home/conda/feedstock_root/build_artifacts/pysr_1661794169440/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placeh/share/julia
registries
scratchspaces
stdlib
test
artifacts
base
base.cache
clones
compiled
conda
environments
julia-config.jl
logs
packages
prefs
registries
scratchspaces
stdlib
test
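For reference, single-quoting the whole path stops both the ${PREFIX} expansion and the !(...) glob, so the quoted rm removes nothing, which is why the listing above is unchanged. A sketch of a variant that would actually prune the directory, assuming bash is the build shell (the suggestion below takes a different route and moves the two directories instead):
# extglob must be enabled for the !(...) pattern to work in a script.
shopt -s extglob
# Leave the glob part unquoted so bash expands !(packages|artifacts).
rm -rf "${PREFIX}"/share/julia/!(packages|artifacts)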
Suggested change:
-rm -rf '${PREFIX}/share/julia/!(packages|artifacts)'
+mkdir -p "${PREFIX}/share/SymbolicRegression.jl/depot"
+mv "${PREFIX}/share/julia/packages" "${PREFIX}/share/SymbolicRegression.jl/depot/packages"
+mv "${PREFIX}/share/julia/artifacts" "${PREFIX}/share/SymbolicRegression.jl/depot/artifacts"
+rm -rf "${PREFIX}/share/julia"
This probably should be moved into a SymbolicRegression.jl-feedstock, and we should probably try to have an independent PyCall.jl-feedstock.
This probably should be moved into a SymbolicRegression.jl-feedstock, and we should probably try to have an independent PyCall.jl-feedstock.
Yeah... but that defeats the point right?
Okay, I understand what you're getting at, but I want to make it work without having to create an alternative depot. We already have a predefined depot. The larger problem here is that Pkg was designed for "projects" while our whole setup is for "envs"; both are interchangeable to an extent, but they're quite different in how they're fundamentally set up.
My question is, why is it not working now? What are we missing?
Note that Julia depots stack just like Julia environments stack. The mechanism is used for system-installed packages, which is kind of what we are doing.
We probably need the environments folder, share/julia/environments/pysr-0.10.1, and/or need to activate that environment.
Basically we need to satisfy
https://github.com/MilesCranmer/PySR/blob/c3dc203265816db4df8d38e7ea29419eafdb2778/pysr/julia_helpers.py#L87
The last expression here should return true.
from julia.core import JuliaInfo, UnsupportedPythonError
info = JuliaInfo.load(julia="julia")
info.is_pycall_built()
is_pycall_built is from here:
https://github.com/JuliaPy/pyjulia/blob/87c669e2729f9743fe2ab39320ec9b91c9300a96/src/julia/juliainfo.py#L134-L135
Looks like there is a share/julia/conda which is important.
$ ls -R share/julia/conda
share/julia/conda:
3 deps.jl
share/julia/conda/3:
$ cat share/julia/conda/deps.jl
const ROOTENV = "/home/mkitti/mambaforge/envs/pysr/share/julia/conda/3"
const MINICONDA_VERSION = "3"
const USE_MINIFORGE = true
const CONDA_EXE = "/home/mkitti/mambaforge/envs/pysr/share/julia/conda/3/bin/conda"
Without that folder, PyCall.jl cannot compile because Conda.jl cannot compile.
julia> using PyCall
[ Info: Precompiling PyCall [438e738f-606a-5dbb-bf0a-cddfbfd45ab0]
ERROR: LoadError: ArgumentError: Path to conda environment is not valid: /home/mkitti/mambaforge/envs/pysr/share/julia/conda/3
Stacktrace:
[1] prefix(path::String)
@ Conda ~/mambaforge/envs/pysr/share/pysr_depot/packages/Conda/x2UxR/src/Conda.jl:42
[2] top-level scope
@ ~/mambaforge/envs/pysr/share/pysr_depot/packages/Conda/x2UxR/src/Conda.jl:47
[3] include
@ ./Base.jl:418 [inlined]
[4] include_package_for_output(pkg::Base.PkgId, input::String, depot_path::Vector{String}, dl_load_path::Vector{String}, load_path::Vector{String}, concrete_deps::Vector{Pair{Base.PkgId, UInt64}}, source::String)
@ Base ./loading.jl:1318
[5] top-level scope
@ none:1
[6] eval
@ ./boot.jl:373 [inlined]
[7] eval(x::Expr)
@ Base.MainInclude ./client.jl:453
[8] top-level scope
@ none:1
in expression starting at /home/mkitti/mambaforge/envs/pysr/share/pysr_depot/packages/Conda/x2UxR/src/Conda.jl:1
ERROR: LoadError: Failed to precompile Conda [8f4d0f93-b110-5947-807f-2305c1781a2d] to /home/mkitti/mambaforge/envs/pysr/share/julia/compiled/v1.7/Conda/jl_gZtz9C.
Stacktrace:
[1] error(s::String)
@ Base ./error.jl:33
[2] compilecache(pkg::Base.PkgId, path::String, internal_stderr::IO, internal_stdout::IO, ignore_loaded_modules::Bool)
@ Base ./loading.jl:1466
[3] compilecache(pkg::Base.PkgId, path::String)
@ Base ./loading.jl:1410
[4] _require(pkg::Base.PkgId)
@ Base ./loading.jl:1120
[5] require(uuidkey::Base.PkgId)
@ Base ./loading.jl:1013
[6] require(into::Module, mod::Symbol)
@ Base ./loading.jl:997
[7] include
@ ./Base.jl:418 [inlined]
[8] include_package_for_output(pkg::Base.PkgId, input::String, depot_path::Vector{String}, dl_load_path::Vector{String}, load_path::Vector{String}, concrete_deps::Vector{Pair{Base.PkgId, UInt64}}, source::Nothing)
@ Base ./loading.jl:1318
[9] top-level scope
@ none:1
[10] eval
@ ./boot.jl:373 [inlined]
[11] eval(x::Expr)
@ Base.MainInclude ./client.jl:453
[12] top-level scope
@ none:1
in expression starting at /home/mkitti/mambaforge/envs/pysr/share/pysr_depot/packages/PyCall/ygXW2/src/PyCall.jl:1
ERROR: Failed to precompile PyCall [438e738f-606a-5dbb-bf0a-cddfbfd45ab0] to /home/mkitti/mambaforge/envs/pysr/share/julia/compiled/v1.7/PyCall/jl_QiBAvQ.
Stacktrace:
[1] error(s::String)
@ Base ./error.jl:33
[2] compilecache(pkg::Base.PkgId, path::String, internal_stderr::IO, internal_stdout::IO, ignore_loaded_modules::Bool)
@ Base ./loading.jl:1466
[3] compilecache(pkg::Base.PkgId, path::String)
@ Base ./loading.jl:1410
[4] _require(pkg::Base.PkgId)
@ Base ./loading.jl:1120
[5] require(uuidkey::Base.PkgId)
@ Base ./loading.jl:1013
[6] require(into::Module, mod::Symbol)
@ Base ./loading.jl:997
That deps.jl is without our recent activate.sh modifications. With the modifications, we should end up with the following, which looks better.
const ROOTENV = "/home/mkitti/mambaforge/envs/pysr"
const MINICONDA_VERSION = "3"
const USE_MINIFORGE = true
const CONDA_EXE = "/home/mkitti/mambaforge/bin/conda"
recipe/build.sh
Outdated
rm -rf '${PREFIX}/share/julia/!(packages|artifacts)'
ls ${PREFIX}/share/julia
Suggested change:
-ls ${PREFIX}/share/julia
+ls ${PREFIX}/share/SymbolicRegression.jl/depot
recipe/scripts/activate.sh
Outdated
@@ -0,0 +1 @@
export JULIA_DEPOT_PATH="${JULIA_DEPOT_PATH}:${PREFIX}/share/SymbolicRegression.jl/depot" |
Appending the depot is not going to do anything if we do not put anything there.
(pysr) $ ls share/julia
base base.cache environments julia-config.jl logs stdlib test
(pysr) $ ls share/SymbolicRegression.jl/depot
artifacts conda environments packages
The one thing we may need to do is add PyCall.jl to the environment. That is
julia -e "using Pkg; Pkg.add(\"PyCall\")"
This should not download anything. It should just add PyCall to Project.toml.
Maybe we should add PyCall.jl to the @pysr-0.10.1 environment instead of the default one named after the conda environment.
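A sketch of what that could look like at build time, assuming a shared environment named pysr-0.10.1 (the name and version here are placeholders):
# Activate the shared environment @pysr-0.10.1 and record PyCall in its Project.toml.
julia -e 'using Pkg; Pkg.activate("pysr-0.10.1"; shared=true); Pkg.add("PyCall")'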
Yes, I got confused. I was looking for share/SymbolicRegression.jl/depot but I couldn't find it, so I figured I might as well add it for now.
But this still doesn't answer my question... or maybe adding PyCall is the answer? I thought it was already added? I will add it now and test. I will also add you to my fork.
How do you want to populate the new depot? I don't see code to do that at the moment.
For now, I am simply populating the main Julia depot, i.e. under … My worry is that in the packaging part, a lot of patching and editing is happening and corrupting these packaged Julia entities, e.g. …
I don't see the difference between creating a new depot and keeping the default depot. Maybe I am not following your logic.
PyCall is definitely under $PREFIX, but the check doesn't seem to see it...
You want to run the Julia package manager every time someone activates an environment? Perhaps you mean a post-link script? In that case, I'm taking the advice from here: …
If this doesn't apply, we could certainly take that approach, as we started to in #41.
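Purely for illustration, a hypothetical post-link.sh along those lines might look like the sketch below; this is not what the PR does, and the environment name is a placeholder:
# Hypothetical post-link.sh sketch: run the Julia package manager once at
# install time rather than on every activation.
"${PREFIX}/bin/julia" -e 'using Pkg; Pkg.activate("pysr-0.10.1"; shared=true); Pkg.instantiate()'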
If the package manager has not been run yet in that environment.
What guarantees that Pkg will use bundled versions of the packages? If one of the dependencies gets updated in the future, then Julia will end up installing it from the internet instead of from the stacked depot, right? And if you're OK with that, then what was the point of bundling the dependencies in the first place?
Also, how do you deal with packages that may have incompatible mutual dependencies? E.g. if you release packages A and B, which both depend on package C, how do you ensure that C is compatible with the versions of A and B that are installed? The logical answer seems to be that you need C to be a Conda package too. It follows that you need to release all Julia packages you depend on as individual Conda packages if you want to ensure compatible versions. This seems like a huge undertaking, similar in scope to conda-forge itself.
Part of the point here is to deliver the PySR software ready-to-use, in a solution that has been fully tested by conda-forge. By invoking the Julia package manager at conda package build time we can control the versions included, pinning the dependencies as needed. The status quo before this pull request is that the user has to take additional deliberate steps to make this package function. As such, Julia package management lies completely outside of conda-forge's control, and the specific configuration that the user ends up with may never have been tested together. I concur that packaging individual Julia packages would be the ideal. To support PySR, this would require adding ~100 individual packages, which is a tremendous barrier to starting. What I'm trying to sort out here is a way to get started while delivering a practical solution.
We package a complete environment with a Project.toml and Manifest.toml, so that specifies the exact package versions to use. That complete environment is the default used by PySR, and it is "installed" after conda finishes. As time goes on, this package will be rebuilt with new Julia dependencies. The point of bundling the dependencies in the first place is that conda and conda-forge are now managing those files. The bundled solution has been tested as part of the CI here; there is some appeal in that. In my understanding, this is what conda-forge is about and what separates it from just using conda and any other channel: conda-forge aims to deliver a completely tested solution built from source.
I discussed this above. We are currently addressing a … The next targets for packaging are those at the intersection of these two packages. Let's consider another question for a second. Why should conda-forge package pure Python or R packages? Why not leave that to pip or poetry? I think this is exactly analogous to the Julia package situation. If I can just use the Julia package manager in an activate script, why not use …
Should we rebase and start a new PR, @isuruf and @cjdoris? I just want to add one point to the discussion. In my mind, a super important part of conda-forge is relocatability. Let me illustrate this point. Both @MilesCranmer and I are actually affiliated with the same academic institution, where our HPC compute nodes have no internet access (the login nodes do). Say I want to use this great and powerful package (pysr). I get it from conda-forge on my login node and then I get an allocation to run some interesting code on the compute nodes (our HPC uses Slurm). But it will give me a silent error about not having the necessary deps, therefore wasting my time and causing confusion (now, think of an average user who thinks it will just work, not us maintainers and developers). For me personally, this is my main motivation here. Every time … If I understand correctly, the user can simply do another round of …
Edit: this relocatability argument extends to making containers as well.
An activate script only moves the problem around by a tiny bit and doesn't seem to be recommended in this case; moreover, see my long-running obsession with login vs compute nodes: #43 (comment)
Exactly. We have really good tools and we should use them to great effect. Otherwise, just use pip or clone the repo, etc. --- we are not just another pip because there is no need; pip is great at its thing.
To clear up any confusion, I am already pro this proposal. Having one package manager be able to resolve both Python and Julia (and other) dependencies for us sounds great. I'm just trying to figure out how it works in practice and if it is scalable to many packages.
Is this literally the Manifest created when you build PySR? What happens when we use Conda to install another Julia package with its own Manifest? It's not possible to instantiate two Manifests, so do you merge them somehow? A few smaller questions: …
Thinking out loud, just as a point of comparison, there is also my JuliaPkg Python package for managing Julia dependencies. This is the default way that JuliaCall manages its dependencies. It basically provides a way for a Python package to declare any Julia dependencies it needs, and then JuliaPkg will ensure those dependencies are met. Some pros of JuliaPkg compared to the conda-forge proposal:
Some cons:
So if we instead just used JuliaPkg everywhere, with the addition of …
@conda-forge-admin, please rerender
…nda-forge-pinning 2022.09.13.20.27.40
Yes, this is the manifest we used to build. We put it in a dedicated environment called … The current plan is to reconcile the two environments by pinning packages while building the conda-forge packages, such that the Julia packages are at the same version. At the moment, this only requires pinning two packages.
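For illustration, pinning a Julia dependency at build time could look roughly like this; the package name and version below are placeholders, not the actual pins used here:
# Add a dependency at a fixed version so the bundled Manifest matches what
# the other conda-forge package ships (values are placeholders).
julia -e 'using Pkg; Pkg.add(name="PyCall", version="1.94.1")'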
Ultimately, I think automation is the key. conda-forge seems to have a tradition of automating as much as possible with GitHub bots, so there is precedent. I'm just not sure if conda-forge is ready for me to stage 100 packages at once. That's why I'm taking a more evolutionary approach, where we start with a small number of packages with larger bundles and then focus on the intersection of the current packages. PyCall.jl and PythonCall.jl would definitely be among the first Julia packages, I think.
The existing depot, which we moved from … Earlier I had proposed moving towards a common …
This is a nice approach. Conda-forge could use JuliaPkg at build time to deliver the initial package state. Thus, JuliaPkg is complementary to the approach taken here.
Thanks, this is all really helpful.
OK, so you install one environment per Conda package, giving you a consistent set of Julia packages. Is there a mechanism to merge these into a single environment, so that it's possible to access a bunch of unrelated Julia packages in a single session? I think maybe that's what you meant in the quote above but I'm not sure. How does it work?
One could use environment stacking:
insert!(LOAD_PATH, 2, "@pysr-0.11.0")
insert!(LOAD_PATH, 2, "@xbitinfo-0.4.3")
See https://docs.julialang.org/en/v1/manual/code-loading/#Environments
That still requires us to ensure consistency of the dependencies. In the short term, this means pinning packages in the depots. In the long term, we split out the mutual dependencies into their own packages.
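If that stacking should happen automatically on activation, the same thing can be expressed through the load path; a sketch, not part of this PR, with the environment names following the convention above:
# ':'-separated; '@name' entries are shared named environments, and the empty
# trailing entry expands to Julia's default load path.
export JULIA_LOAD_PATH="@pysr-0.11.0:@xbitinfo-0.4.3:"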
Is this PR good to go? I tried to unpin Julia in #50 but I'm having some issues (which I assume are solved in this PR), so it would be great if I could merge this. Cheers,
My current plan is to attend the conda-forge core meeting next week to discuss this. That said, I don't think the world will end if you merge this. The worst-case scenario is that we'll have to back out the changes later. Let me take a look at #50.
This is your feedstock @MilesCranmer, feel free to merge.
Thanks, sounds good to me! I am eager to hear the results of the discussion. Since it seems like the current package (even before recent versions, possibly) is failing on macOS, and this one seems to fix those issues, I will go ahead and merge now. I think it will also allow us to study the stability of the build strategy, which you could discuss at the meeting, too. Happy to back out the changes if a different strategy is decided on!
Just tested it locally - works perfectly! Super cool to not have to do any extra install steps. Since this fixes the Manifest during build, it would be interesting in the future to think about packaging sysimages created by …
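For context, building such a system image (presumably with PackageCompiler.jl, which the truncated sentence appears to refer to) could look roughly like the sketch below; the package list and output path are placeholders:
# Sketch only: bake the bundled packages into a system image.
# Requires PackageCompiler to be installed in the active environment.
julia -e 'using PackageCompiler; create_sysimage(["SymbolicRegression", "PyCall"]; sysimage_path="pysr_sysimage.so")'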
Shipping a system image is interesting, although a fully optimized approach would still involve building the system image on the user's machine, since custom system images can often take advantage of advanced CPU features like AVX-512 when built there.
We should take a look at replacing pyjulia with juliacall upstream. It would be nice if we were less dependent on environment variables for communication. For example, it would be nice to set the initial Julia environment via the …
The next feedstock we may visit will be xbitinfo / BitInformation.jl. I would like to have the conversation with conda-forge core first, though, before proceeding further.
Thanks for doing this. This seems to be the quickest way to get things moving. I would attend with you but I am trying to remain anonymous lol, and I trust you are a better messenger anyway :)
Thinking out loud: while @mkitti talks to core, @cjdoris, we could possibly look into taking care of your packages with a similar setup. I need at least a week to begin having sustained free time and a clear mind, though. Eventually, we may want to create some sort of automated tool to do the tricks on …
fixes #38
Checklist
- Reset the build number to 0 (if the version changed)
- Re-rendered with conda-smithy (Use the phrase @conda-forge-admin, please rerender in a comment in this PR for automated rerendering)