CLI importers are broken #1238
Comments
@dunn I think you'll have this problem if you upgrade to Hyrax. |
Thanks for the heads-up! |
Not sure I quite follow this. The
def file_attributes
  files_directory.present? && files.present? ? { files: file_paths } : {}
end
...
def transform_attributes
  StringLiteralProcessor.process(attributes.slice(*permitted_attributes)).merge(file_attributes)
end
to end up used in
Hyrax::DefaultMiddlewareStack.build_stack.build(Actors::Terminator.new) which has to be one of the least intelligible pieces of indirection in our stack, layering on 17 other middleware actors. @jcoyne, your contention is that none of these actually does what we need anymore? Not even @mjgiarlo What is the right direction to fix this:
|
@atz I'd be most inclined in favor of |
@mjgiarlo That would be fine for me (assuming it would work). We would need to decide how to handle the "identity" of files that start as just local filesystem files. Like, do we look for a matching Note: if the point of the CLI is "slurp these local files into fedora", but the CarrierWave config is backed remotely (like S3), then we are introducing cost. I don't see a way around that if ingest is going to trigger async jobs anyway though. I think anybody who cares about that would have to use shared mounts and configure CarrierWave to use them. |
@mjgiarlo, @jcoyne: trying to use
This is almost useless as a governing instruction to end users. If the sum total of the instructions is "Give us some CSV - with a header row, maybe include What is the relationship between the data and the images in the There is an example CSV
So from that, we might be able to deduce some answers (field names, repeatability!), but also raise other questions: how is Anyway, when I attempt even a test run on the fixture CSV, I get:
My interpretation is that even our fixture CSV is not "valid enough" to actually be processed. And we expect admins/users to do better than that? |
If I add a
None of this even gets into the Hyrax changes. |
OK, cherry-picked 6e4ccf3 to get some of Mike's fixes. Now I get:
This is after reverting back to the unaltered fixture CSV.... OR from using @mjgiarlo's |
OK, I isolated the exception to be:
> ActiveFedora::Base.exists? "wg827ks1643"
true
> Collection.exists? "wg827ks1643"
true
> GenericWork.exists? "wg827ks1643"
*** ActiveFedora::ActiveFedoraError Exception: Model mismatch. Expected GenericWork. Got: Collection
To me, this doesn't fit the method sig/behavior of
If you can't safely ask whether a thing exists, the rest of your program logic is going to get ornate or fall apart. |
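For illustration, here is a minimal sketch of a wrapper that treats a model mismatch as "no object of this model with that id" instead of raising. It does not call ActiveFedora: `ModelMismatchError`, `REPOSITORY`, `strict_exists?`, and `safe_exists?` are all hypothetical stand-ins that only mimic the console session above.

```ruby
# Hypothetical stand-in for the error seen above; not ActiveFedora's class.
class ModelMismatchError < StandardError; end

# Hypothetical in-memory repository mapping an id to its stored model name.
REPOSITORY = { "wg827ks1643" => "Collection" }.freeze

# Mimics the reported behavior: raises when the id exists under another model.
def strict_exists?(model, id)
  stored = REPOSITORY[id]
  return false if stored.nil?
  return true if model == "Base" || stored == model

  raise ModelMismatchError, "Model mismatch. Expected #{model}. Got: #{stored}"
end

# Assumed workaround: a mismatch means the id is not an object of this model.
def safe_exists?(model, id)
  strict_exists?(model, id)
rescue ModelMismatchError
  false
end
```

With a wrapper like this, importer code could ask `safe_exists?("GenericWork", id)` and branch on a plain boolean, at the cost of hiding the fact that the id is already taken by another model.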
Filed a PR on AF: samvera/active_fedora#1277 |
Reassigned to another project, so this will fall to somebody else to pick up. |
I haven't really been following this and #1164 so I may not have the full picture, but I've been working on some things that may be relevant. Looks like the Hyku CLI was based partly on work DCE did for UCSB; we've rewritten much of it and now have a single As for the problems around CSV format that @atz mentioned, I've been talking with @mcritchlow about working toward some sort of standardized schema that would allow us to do validations (and would also make code more shareable). The beginnings of that is here: https://github.com/ucsblibrary/metadata-ci |
Related issue logged in Hyrax (for 2.0.0.beta): samvera/hyrax#1654 I've been using the ObjectFactory for my own eprints importer in Hyku, bypassing I'm having problems passing |
This commit removes the actor that handled the :files attribute: samvera/hyrax@3f1b581
The ObjectFactory and its descendants expect the :files attribute to be handled by the work actor. https://github.com/samvera-labs/hyku/blob/master/lib/importer/factory/object_factory.rb#L133
This probably affects the CSVImporter and MODSImporter but not the PURLImporter since the latter uses remote files: https://github.com/samvera-labs/hyku/blob/master/app/jobs/import_work_from_purl_job.rb#L47
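To make the failure mode concrete, here is a rough sketch of a chained actor stack in plain Ruby. None of this is Hyrax code: `Terminator`, `FilesActor`, and `build_stack` are hypothetical stand-ins, and the return hashes exist only to show what happens to `:files` when no actor in the chain consumes it.

```ruby
# Hypothetical stand-ins; these are not Hyrax's actual actor classes.
class Terminator
  # End of the chain: any attribute that arrives here was handled by nobody.
  def create(attributes)
    { unhandled: attributes.keys }
  end
end

class FilesActor
  def initialize(next_actor)
    @next_actor = next_actor
  end

  # Consume :files, then pass the remaining attributes down the chain.
  def create(attributes)
    files = attributes.fetch(:files, [])
    remaining = attributes.reject { |key, _value| key == :files }
    @next_actor.create(remaining).merge(attached: files)
  end
end

# Wrap the terminator in each actor class, so the first class listed runs first.
def build_stack(actor_classes)
  actor_classes.reverse.inject(Terminator.new) { |inner, klass| klass.new(inner) }
end

attributes = { title: ["Example"], files: ["/tmp/a.tif"] }
with_actor = build_stack([FilesActor]).create(attributes)
without_actor = build_stack([]).create(attributes)
```

In this sketch, `with_actor` reports the file as attached, while `without_actor` lets `:files` fall through to the terminator untouched. That is the situation the importers appear to be in once the actor handling `:files` is gone.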