Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SiteLocalConfigService::storageDescriptionPath inconsistency #46043

Open
Dr15Jones opened this issue Sep 18, 2024 · 9 comments
Open

SiteLocalConfigService::storageDescriptionPath inconsistency #46043

Dr15Jones opened this issue Sep 18, 2024 · 9 comments

Comments

@Dr15Jones
Copy link
Contributor

All other code in SiteLocalConfigService find where the proper site information is stored via the member data m_url, which normally comes from the environment variable SITECONFIG_PATH. However, SiteLocalConfigService::storageDescriptionPath calls getenv("SITECONFIG_PATH") itself (without checking it is valid).

Should storageDescriptionPath be changed to use m_url?

@Dr15Jones
Copy link
Contributor Author

assign core

@cmsbuild
Copy link
Contributor

New categories assigned: core

@Dr15Jones,@makortel,@smuzaffar you have been requested to review this Pull request/Issue and eventually sign? Thanks

@cmsbuild
Copy link
Contributor

cms-bot internal usage

@cmsbuild
Copy link
Contributor

A new Issue was created by @Dr15Jones.

@Dr15Jones, @antoniovilela, @makortel, @mandrenguyen, @rappoccio, @sextonkennedy, @smuzaffar can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

@makortel
Copy link
Contributor

Should storageDescriptionPath be changed to use m_url?

For history, the SiteLocalConfigService::storageDescriptionPath() function was introduced in #37278. I don't see any particular discussion there on this particular question. Although we pretty much require the SITECONFIG_PATH to be defined. And the job should fail earlier if the SITECONFIG_PATH is not defined (demonstrated e.g. in https://cms-talk.web.cern.ch/t/crab-test-cmssw-12-6-x-invalid-site-local-config/15423/15), unless the node happens to have a file /JobConfig/site-local-config.xml.

All present uses (I guess, didn't really verify) of m_url are about access to site-local-config.xml (whose default location is $SITECONFIG_PATH/JobConfig/site-local-config.xml. The SiteLocalConfigService::storageDescriptionPath(), however, operates on the directory structure under $SITECONFIG_PATH.

Theoretically we could take the directory part of m_url and go one level up. My only question in that approach would be how to deal with the case when the site-local-config.xml path is overridden in the configuration

: m_url(pset.getUntrackedParameter<std::string>("siteLocalConfigFileUrl", defaultURL())),

Should the storagePath() use the directory structure specified by the overridden location of site-local-config.xml, or $SITECONFIG_PATH?

In any case I think

  • the getenv() handling should be improved (could be e.g. checking the return value, or throwing an exception early if SITECONFIG_PATH is not set)
  • we'd (at least eventually) need an ability to override the path used in storageDescriptionPath() via configuration (could be e.g. tied to the present siteLocalConfigFileUrl, or a new configuration parameter)

Adding @stlammel @nhduongvn if they would have any thoughts.

@stlammel
Copy link

In case of a subsite override, m_url would be of the form $SITECONFIG_PATH//JobConfig/site-local-config.xml and one would have to go up two directory levels.
Is the override the env variable SITECONFIG_PATH provides insuffcient?
In the age of CVMFS, one could default SITECONFIG_PATH, i.e. if not set, to /cvmfs/cms.cern.ch/SITECONF/ in both places.
My two cents.

  • Stephan

@makortel
Copy link
Contributor

In case of a subsite override, m_url would be of the form $SITECONFIG_PATH//JobConfig/site-local-config.xml and one would have to go up two directory levels.

Well, in case of CMSSW-configuration level override, I'd assume the m_url could be nearly anything, and wouldn't necessarily follow CMS' production infrastructure conventions. (other question is then how much such flexibility should we allow).

Is the override the env variable SITECONFIG_PATH provides insuffcient?

I can imagine weird (user) use cases where adjusting environment variables might not be easy, but adjusting CMSSW configuration would be.

@nhduongvn
Copy link
Contributor

The m_url points to site-local-config.xml
storageDescriptionPath is used to get the path to storage.json (which equivalent to storage.xml in the old days). Since it is not trivial to obtain this path (I mean using couple lines of code), this function is needed. In the old days, the location of storage.xml is clearly defined in site-local-config.xml (see https://cmssdt.cern.ch/lxr/source/FWCore/Services/src/SiteLocalConfigService.cc#0353 <catalog url="trivialcatalog_file:/x/y/z.xml"/>) so a dedicated function might not needed.
If we want to totally liberate from rigid setting dictated by $SITECONFIG_PATH, we need to go back to the old format of site-local-config.xml where location of storage.json is explicitly defined similar to storage.xml case. However, from what I understand this conflicts with the ideas of improving the whole storage descriptions in the first place (I mean for some reasons we prefer to maintain a consistent location of storage.json which is where JobConfig folder locates)
I think storageDescriptionPath should not be replaced by m_url based on the current settings. Moreover, they are doing two different jobs: m_url is for site-local-config.xml while storageDescriptionPath is for storeage.json

@nhduongvn
Copy link
Contributor

if we want CMSSW-configuration level override for storage.json, probably a new config setting like m_storage_json_path is needed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants