DYAMOND summer initial conditions #64

Merged
merged 1 commit into from
Nov 14, 2024

sriharshakandala
Member

@sriharshakandala sriharshakandala commented Nov 7, 2024

Upload DYAMOND summer initial conditions.

Checklist:

  • I created a new folder $artifact_name
    • I added a README.md in that folder that
      • describes the data and processing done to it
      • lists the sources of the raw data
      • lists the required citation, licenses
    • If applicable (e.g., for Creative Commons), I added a LICENSE file
    • I added the scripts that retrieve, process, and produce the artifact
    • I added the environment used for such scripts (typically, Project.toml
      and Manifest.toml)
    • I added the OutputArtifacts.toml file containing the information
      needed for package developers to add $artifact_name to their package
  • I uploaded the artifact folder to the Caltech cluster (in
    /groups/esm/ClimaArtifacts/artifacts/$artifact_name)
  • I added the relevant code to the Overrides.toml on the Caltech Cluster
    (in /groups/esm/ClimaArtifacts/artifacts/Overrides.toml)
  • I added a link to the main README.md to point to the new artifact

@sriharshakandala sriharshakandala marked this pull request as draft November 7, 2024 21:53
@sriharshakandala sriharshakandala force-pushed the sk/add_dyamond_summer branch 2 times, most recently from 3ff71f0 to 74cca40 Compare November 8, 2024 19:43
@sriharshakandala sriharshakandala marked this pull request as ready for review November 8, 2024 19:44
Member

@Sbozzolo Sbozzolo left a comment

Please, read the documentation and see the other examples on how to add artifacts.

In this, keep in mind that the key point of ClimaArtifacts is to ensure that everything is reproducible, so we provide automated scripts to generate artifacts (including the download).

@Sbozzolo Sbozzolo self-requested a review November 8, 2024 19:58
@sriharshakandala
Member Author

Please, read the documentation and see the other examples on how to add artifacts.

In this, keep in mind that the key point of ClimaArtifacts is to ensure that everything is reproducible, so we provide automated scripts to generate artifacts (including the download).

I tried using the create_artifact_guided_one_file option specified in the documentation. Somewhere during the execution of the script, with this ~17GB file, the storage used by the script bloated to 777GB and crashed the simulation!
Is there a concern with manually generating the OutputArtifacts.toml file as I did here? Maybe we can do something like a checksum verification to validate file integrity instead? cc: @charleskawczynski @Sbozzolo
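
For reference, a minimal sketch of what such a checksum verification could look like, using Julia's `SHA` standard library. The helper names `file_sha256` and `verify_sha256` are hypothetical and are not part of ClimaArtifactsHelper.

```julia
using SHA  # Julia standard library

# Return the SHA-256 digest of a file as a lowercase hex string.
# `sha256` accepts an IO stream, so the file is hashed incrementally
# instead of being read into memory in one piece.
function file_sha256(path::AbstractString)
    return open(path, "r") do io
        bytes2hex(sha256(io))
    end
end

# Hypothetical helper: compare a file against an expected hex digest.
verify_sha256(path::AbstractString, expected::AbstractString) =
    file_sha256(path) == lowercase(expected)
```

A check like this only confirms that a file matches a known digest; it does not replace the reproducibility steps that the guided script performs.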

@Sbozzolo
Member

Sbozzolo commented Nov 9, 2024

I tried using the create_artifact_guided_one_file option specified in the documentation. Somewhere during the execution of the script, with this ~17GB file, the storage used by the script bloated to 777GB and crashed the simulation!

This sounds like a bug. I'll try recreating locally. The one thing I can think of that might cause such a behavior is creating the tarball, but that should not even happen for such a large file. At what stage did it crash?

Is there a concern with manually generating the OutputArtifacts.toml file as I did here? Maybe we can do something like a checksum verification to validate file integrity instead? cc: @charleskawczynski @Sbozzolo

Yes, the script does other things too. We don't want to upload an artifact like the one that you added for a few reasons:

  1. As it is, instantiating ClimaAtmos (or any other package that uses this artifact) would download 17 GB of data.
  2. This large artifact would be mirrored on Julia's artifact servers (and it is not good etiquette to use such a large amount of storage).
  3. Computing SHAs of large artifacts can take a long time, especially when there are lots of files, so the script implements shortcuts.

To solve problems 1-3, the script defines a notion of "downloadable" and "undownloadable" artifacts and treats them differently. The script also prompts you to take certain actions (or not). For example, in this case, we would not upload the file to Box.
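
As a rough illustration of the "downloadable" versus "undownloadable" distinction, the sketch below classifies an artifact directory by its total size. The function names and the `limit_bytes` threshold are assumptions made for this example; the actual criterion used by ClimaArtifactsHelper is not stated in this thread.

```julia
# Total size in bytes of all files under `dir`, including subdirectories.
function directory_size(dir::AbstractString)
    total = 0
    for (root, _, files) in walkdir(dir)
        for file in files
            total += filesize(joinpath(root, file))
        end
    end
    return total
end

# `limit_bytes` is a hypothetical threshold chosen for illustration only.
is_downloadable(dir::AbstractString; limit_bytes::Integer = 500 * 1024^2) =
    directory_size(dir) <= limit_bytes
```

With any threshold of this order, the 18323896344-byte DYAMOND file would fall on the "undownloadable" side and be handled manually.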

In addition to that,

  4. We want everything to be reproducible, ideally with a one-click script. If you add the file like this, you put the burden on the next person who wants to recreate or modify it to figure out how to do it.
  5. The SHA also looks very long compared to all the other SHAs. SHA-256 should be 32 bytes (64 hexadecimal characters), so the SHA is probably not correct.
  6. The link to the artifact is not a direct link, so this would fail.

When possible, the script also verifies that OutputArtifacts.toml is correct.
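
A quick sanity check along the lines of point 5 could be sketched as follows. It only checks the length and alphabet of a digest string: a SHA-1 tree hash is 20 bytes (40 hex characters) and a SHA-256 digest is 32 bytes (64 hex characters). The names `HEX_LENGTH` and `looks_like_digest` are hypothetical.

```julia
# Expected digest lengths, in hexadecimal characters.
const HEX_LENGTH = Dict("git-tree-sha1" => 40, "sha256" => 64)

# True if `value` has the right length and only hex digits for `kind`.
function looks_like_digest(value::AbstractString, kind::AbstractString)
    n = HEX_LENGTH[kind]
    return length(value) == n &&
           all(c -> c in "0123456789abcdef", lowercase(value))
end
```

This catches a malformed entry early, but it cannot tell whether a well-formed digest actually matches the data.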

@Sbozzolo
Member

Sbozzolo commented Nov 9, 2024

I ran this code and it worked without problems on my machine:

using ClimaArtifactsHelper

const FILE_URL = "https://swift.dkrz.de/v1/dkrz_ab6243f85fe24767bb1508712d1eb504/SAPPHIRE/DYAMOND/ifs_oper_T1279_2016080100.nc"
const FILE_PATH = "ifs_oper_T1279_2016080100.nc"

create_artifact_guided_one_file(FILE_PATH; artifact_name = basename(@__DIR__), file_url = FILE_URL)

Output:

julia> include("create_artifact.jl")
[ Info: ifs_oper_T1279_2016080100.nc not found, downloading it (might take a while)
The artifact directory is large (18323896344 bytes), so the artifact has to be handled manually
The id of your artifact is 3786a72f51576785549ec4ca42ff3222a9f6cff2
Create the folder `/groups/esm/ClimaArtifacts/artifacts/atmos_dyamond_summer` on the cluster
Then, upload the content of atmos_dyamond_summer to that folder
Add the following entry to the Overrides.toml file you find in `/groups/esm/ClimaArtifacts/artifacts/`

3786a72f51576785549ec4ca42ff3222a9f6cff2 = "/groups/esm/ClimaArtifacts/artifacts/atmos_dyamond_summer"

Here is your artifact string. Copy and paste it to your Artifacts.toml

[atmos_dyamond_summer]
git-tree-sha1 = "3786a72f51576785549ec4ca42ff3222a9f6cff2"

Artifact string saved to OutputArtifacts.toml
Feel free to add other metadata/properties (e.g., laziness)
Enjoy the rest of your day!
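
The `git-tree-sha1` in the output above identifies the contents of the artifact directory. As a hedged sketch, a hash of this kind can be computed with `Pkg.GitTools.tree_hash`, the routine Pkg uses when creating artifacts; whether ClimaArtifactsHelper calls it in exactly this way is an assumption, and `tree_hash_hex` is a hypothetical helper name.

```julia
using Pkg

# Hash a directory the way Git hashes a tree and return the digest as a
# 40-character hex string. Note that `Pkg.GitTools` is not part of Pkg's
# documented public API, so this may change between Julia versions.
tree_hash_hex(dir::AbstractString) = bytes2hex(Pkg.GitTools.tree_hash(dir))
```

Because the hash depends only on the directory contents, two people who run the same script on the same input file should obtain the same artifact id.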

@sriharshakandala
Member Author

sriharshakandala commented Nov 9, 2024

Can the artifact_name be different than basename(@__DIR__)? I am using the following script:

using ClimaArtifactsHelper

const FILE_URL = "https://swift.dkrz.de/v1/dkrz_ab6243f85fe24767bb1508712d1eb504/SAPPHIRE/DYAMOND/ifs_oper_T1279_2016080100.nc"
const FILE_PATH = "ifs_oper_T1279_2016080100.nc"
artifact_name = "DYAMOND_summer_initial_conditions"
create_artifact_guided_one_file(FILE_PATH; artifact_name = artifact_name, file_url = FILE_URL)

In this case, it downloaded the .nc successfully and created the DYAMOND_summer_initial_conditions folder, but the .nc file placed in this folder was about 777GB!

@sriharshakandala
Member Author

I reran both versions of the code, with artifact_name = basename(@__DIR__) and artifact_name = "DYAMOND_summer_initial_conditions". In both cases, the script downloads the artifact file to the project home folder, then creates a folder named artifact_name and places a copy of the file, under the same file name, in that folder. However, the copy it places in the artifact_name folder is unusually large (777GB or more)!

@sriharshakandala
Copy link
Member Author

sriharshakandala commented Nov 9, 2024

julia --project create_artifact.jl 
┌ Warning: atmos_dyamond_summer already exists. Content will end up in the artifact and may be overwritten.
└ @ ClimaArtifactsHelper ~/work/ClimaArtifacts/ClimaArtifactsHelper.jl/src/ClimaArtifactsHelper.jl:195
┌ Warning: Abort this calculation, unless you know what you are doing.
└ @ ClimaArtifactsHelper ~/work/ClimaArtifacts/ClimaArtifactsHelper.jl/src/ClimaArtifactsHelper.jl:196
[ Info: ifs_oper_T1279_2016080100.nc not found, downloading it (might take a while)
ERROR: LoadError: IOError: sendfile: no space left on device (ENOSPC)
Stacktrace:
 [1] uv_error
   @ ./libuv.jl:100 [inlined]
 [2] sendfile(dst::Base.Filesystem.File, src::Base.Filesystem.File, src_offset::Int64, bytes::Int64)
   @ Base.Filesystem ./filesystem.jl:153
 [3] sendfile(src::String, dst::String)
   @ Base.Filesystem ./file.jl:1004
 [4] cp(src::String, dst::String; force::Bool, follow_symlinks::Bool)
   @ Base.Filesystem ./file.jl:386
 [5] cp
   @ ./file.jl:378 [inlined]
 [6] create_artifact_guided_one_file(file_path::String; artifact_name::String, file_url::String, append::Bool)
   @ ClimaArtifactsHelper ~/work/ClimaArtifacts/ClimaArtifactsHelper.jl/src/ClimaArtifactsHelper.jl:208
 [7] top-level scope
   @ ~/work/ClimaArtifacts/atmos_dyamond_summer/create_artifact.jl:9
ls -all
total 35815640
drwxr-xr-x  10 sriharshakandala  staff          320 Nov  9 09:57 .
drwxr-xr-x  32 sriharshakandala  staff         1024 Nov  7 09:32 ..
-rw-------   1 sriharshakandala  staff        12288 Nov  8 11:00 .README.md.swp
-rw-------   1 sriharshakandala  staff        12288 Nov  9 09:35 .create_artifact.jl.swp
-rw-r--r--   1 sriharshakandala  staff         7976 Nov  8 14:50 Manifest.toml
-rw-r--r--   1 sriharshakandala  staff           69 Nov  8 14:50 Project.toml
-rw-r--r--   1 sriharshakandala  staff         1276 Nov  8 11:00 README.md
drwxr-xr-x   3 sriharshakandala  staff           96 Nov  9 09:57 atmos_dyamond_summer
-rw-r--r--   1 sriharshakandala  staff          456 Nov  9 09:34 create_artifact.jl
-rw-r--r--   1 sriharshakandala  staff  18323896344 Nov  9 09:57 ifs_oper_T1279_2016080100.nc
cd atmos_dyamond_summer/ 
sriharshakandala@sriharshas-CLIMA-MacBook-Pro atmos_dyamond_summer % du -sh *
777G	ifs_oper_T1279_2016080100.nc

@Sbozzolo
Member

Sbozzolo commented Nov 9, 2024 via email

I am not sure what's going on here. The code up to that point is extremely simple (only three lines of code: download, mv, and cp). Can you try running the cp command from your REPL?

Base.cp(file_path, joinpath(output_dir, basename(file_path)))

(And check that the paths in the function are correct.) It should be very easy to debug what goes wrong (you don't have to re-download the file every time for these tests).

@sriharshakandala
Member Author

cp ifs_oper_T1279_2016080100.nc atmos_dyamond_summer/ifs_oper_T1279_2016080100.nc

works well. However,

Base.cp("ifs_oper_T1279_2016080100.nc", "atmos_dyamond_summer/ifs_oper_T1279_2016080100.nc")

tries to create a huge file that eventually uses up all the space on the hard drive.
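
One way to sidestep `Base.cp` while staying in pure Julia would be a chunked copy followed by a size check. This is a swapped-in technique for illustration, not what was done in this PR; `copy_chunked` and its `chunk_bytes` parameter are hypothetical.

```julia
# Copy `src` to `dst` in fixed-size chunks, then check that the sizes match.
function copy_chunked(src::AbstractString, dst::AbstractString;
                      chunk_bytes::Int = 64 * 1024 * 1024)
    buf = Vector{UInt8}(undef, chunk_bytes)
    open(src, "r") do input
        open(dst, "w") do output
            while !eof(input)
                # `readbytes!` returns how many bytes were actually read.
                n = readbytes!(input, buf, chunk_bytes)
                write(output, view(buf, 1:n))
            end
        end
    end
    filesize(src) == filesize(dst) ||
        error("size mismatch: source is $(filesize(src)) bytes, copy is $(filesize(dst)) bytes")
    return dst
end
```

The final size comparison would have flagged the 777GB copy of an 18GB file immediately, whichever copy routine is used.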

@Sbozzolo
Copy link
Member

This sounds like an issue with Julia Base. Can you open an issue on their GitHub page?

@sriharshakandala
Copy link
Member Author

Using run(...) fixes the issue.

 julia --project create_artifact.jl 
[ Info: ifs_oper_T1279_2016080100.nc not found, downloading it (might take a while)
The artifact directory is large (18323896344 bytes), so the artifact has to be handled manually
The id of your artifact is 3786a72f51576785549ec4ca42ff3222a9f6cff2
Create the folder `/groups/esm/ClimaArtifacts/artifacts/atmos_dyamond_summer` on the cluster
Then, upload the content of atmos_dyamond_summer to that folder
Add the following entry to the Overrides.toml file you find in `/groups/esm/ClimaArtifacts/artifacts/`

3786a72f51576785549ec4ca42ff3222a9f6cff2 = "/groups/esm/ClimaArtifacts/artifacts/atmos_dyamond_summer"

Here is your artifact string. Copy and paste it to your Artifacts.toml

[atmos_dyamond_summer]
git-tree-sha1 = "3786a72f51576785549ec4ca42ff3222a9f6cff2"

Artifact string saved to OutputArtifacts.toml
Feel free to add other metadata/properties (e.g., laziness)
Enjoy the rest of your day!

I made a PR #69 with the change.
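
For readers without access to that PR, the `run(...)` workaround might look roughly like the sketch below, which delegates the copy to the system `cp` command. The exact change made in the PR is not shown in this thread, so the function name and the added size check are assumptions; the sketch also assumes a Unix-like system where `cp` is on the PATH.

```julia
# Copy a file by shelling out to the system `cp` instead of `Base.cp`.
# Command interpolation in backticks passes each path as one argument,
# so paths that contain spaces need no extra quoting.
function copy_with_system_cp(src::AbstractString, dst::AbstractString)
    run(`cp $src $dst`)
    filesize(src) == filesize(dst) ||
        error("size mismatch: source is $(filesize(src)) bytes, copy is $(filesize(dst)) bytes")
    return dst
end
```

The trade-off is portability: this relies on an external program, whereas `Base.cp` works on every platform Julia supports.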

Member

@Sbozzolo Sbozzolo left a comment

Can you remove the change in ClimaArtifactsHelper.jl (we can discuss it in the other PR), squash the commits, and add a link to the main readme?

@sriharshakandala
Member Author

Can you remove the change in ClimaArtifactsHelper.jl (we can discuss it in the other PR), squash the commits, and add a link to the main readme?

Yes. That's the plan. I wanted to first check if this works correctly!

sriharshakandala added a commit that referenced this pull request Nov 13, 2024
`Base.cp` seems to have a bug when copying large files. Using `run` fixes this behaviour.
This fixes the bug mentioned in #64
@sriharshakandala sriharshakandala force-pushed the sk/add_dyamond_summer branch 2 times, most recently from b35e86f to 28d6fe6 Compare November 13, 2024 23:13
@sriharshakandala
Member Author

Can you remove the change in ClimaArtifactsHelper.jl (we can discuss it in the other PR), squash the commits, and add a link to the main readme?

Done.

@sriharshakandala sriharshakandala merged commit ccfc9ae into main Nov 14, 2024