Update requests for Rucio Datasets #705

Open
mhkirby opened this issue Aug 12, 2024 · 3 comments
Labels: data_collections, enhancement (New feature or request), monitoring (General DUNE Computing monitoring services)

@mhkirby
Member

mhkirby commented Aug 12, 2024

I'm not certain these plots are useful. With more than 200k datasets in Rucio, I'm concerned that we're overpopulating Rucio with datasets, so maybe this is less an update to monitoring than a change in how we're using Rucio. I think we need something other than datasets to monitor, because 200k isn't something we can reasonably track in our heads or make decisions about. We need something less granular, tied to categories such as PDHD raw data, PDHD TP files, or PDHD noise runs.

mhkirby added the enhancement and monitoring labels on Aug 12, 2024
@StevenCTimm
Collaborator

The overall plan, which Aaron and I are both working on, is to make a plot that shows the location of the public datasets, i.e. at a higher level than the original run-level datasets.
I agree that the run-level datasets, of which there are many, don't give us much information beyond statistics.
The best proxy we have right now is the plot by scope, for which Aaron and Wenlong each have an independent version.
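
For illustration only, the by-scope roll-up could be sketched in Python as below. This is not the script behind either existing plot. It assumes only that each Rucio DID is written as `scope:name`; the function name `count_by_scope` and the example scopes and dataset names are hypothetical placeholders, not real DUNE identifiers.

```python
from collections import Counter


def count_by_scope(dids):
    """Count DIDs per scope, given strings of the form 'scope:name'."""
    counts = Counter()
    for did in dids:
        scope, sep, _name = did.partition(":")
        if not sep or not scope:
            # Not a 'scope:name' string; fail loudly rather than miscount.
            raise ValueError(f"not a scope:name DID: {did!r}")
        counts[scope] += 1
    return counts


# Hypothetical DIDs, made up for this example.
example_dids = [
    "scope-a:run_0001_raw",
    "scope-a:run_0002_raw",
    "scope-b:noise_run_0001",
]

by_scope = count_by_scope(example_dids)
for scope, n in sorted(by_scope.items()):
    print(f"{scope}: {n} datasets")
```

The same counting could be keyed on a coarser category than scope if a mapping from dataset to category were available.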

@ahiguera-mx
Contributor

ahiguera-mx commented Aug 12, 2024

For the official physics datasets, we can do what @mhkirby suggests, thanks to the 1:1 correspondence between Rucio containers and MetaCat datasets. At the moment, I have a Python script that does that. Turning it into dashboard monitoring is something I would have to coordinate with @wyuan-uoe.

@ahiguera-mx
Contributor

As discussed in an email thread, the current monitoring and reporting tools are available here and here. These are what have been used for CRAB reports. Let me know if you have any questions.


4 participants