[Docs] Migrating from an older Spotlight version #1768

Closed
danschmidt5189 opened this issue Mar 8, 2017 · 14 comments

danschmidt5189 commented Mar 8, 2017

I inherited an old version of Spotlight (e5f5b6b) and am trying to upgrade to the latest version. My overall procedure boiled down to:

  • Deploying our existing version to a Vagrant dev environment
  • Re-running the rake rails:template command pointing at the latest template.rb
  • Handling conflicts (not trivial :)

The schema has quite a few backwards-incompatible changes, so to load production data I:

  • Down-migrated to our production DB version (rake db:migrate VERSION=20170213200533)
  • Imported production data into dev (using yaml_db)
  • Ran remaining migrations
  • Copied production's public/uploads into the dev instance

At this point, I'd expect that running rake spotlight:reindex would restore the production state. That's worked with previous, albeit much smaller, upgrades. However in this case, I get the following error:

$ bundle exec rake spotlight:reindex
== Reindexing << redacted: the first exhibit title >> ==
rake aborted!
ActiveRecord::RecordNotFound: Couldn't find Spotlight::FeaturedImage without an ID

That looks like the result of a schema change that I'm just not sure how to fix — any advice?

More generally, what's the recommended procedure for upgrading an existing, production Spotlight instance?


Edit 1: Also, it's worth noting that the <exhibit>/resources/reindex_all endpoint 404s after this upgrade. After some inspection, it turns out that this is the result of the same "RecordNotFound" error you'd get executing the same command via rake — it just surfaces as a 404 in the UI.

Edit 2: Ok, looks like you also need to:

  • Mount the Riiif engine, e.g.: config/routes.rb: mount Riiif::Engine => '/images', as: 'riiif'
  • Migrate into Riiif: rake spotlight:migrate_to_iiif[http://localhost/images]

After running this, the items/images are visible in Curation>Items, but the image URLs are wrong.

BTW, the migration task throws this warning repeatedly:

DEPRECATION WARNING: Blacklight::Document#empty? is deprecated; use obj.to_h.empty? instead. (called from method_missing at /srv/spotlight/vendor/bundle/ruby/2.2.0/gems/activemodel-4.2.3/lib/active_model/attribute_methods.rb:430)

danschmidt5189 changed the title from "Migrating from older versions" to "[Docs] Migrating from an older Spotlight version" on Mar 8, 2017
@jkeck
Contributor

jkeck commented Mar 9, 2017

@danschmidt5189 looking at your edits, I think you've gotten most of the way there. Not sure if you were able to find the upgrade notes in our release notes or not but definitely check those out.

I think the wrong URL issue that you are seeing is because the host name that you passed in the IIIF migration task was http://localhost/images when it should probably be http://localhost (and a port if you're running under one).

If that's not the case, please let me know which erroneous image URLs you're seeing.
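
Since the task expects only the scheme, host, and optional port, one way to avoid passing a path by mistake is to derive the value from a full URL. This is a minimal sketch using Ruby's standard `URI` library; `iiif_migration_host` is a hypothetical helper name and is not part of Spotlight:

```ruby
require 'uri'

# Hypothetical helper (not part of Spotlight): drops any path such as
# "/images" and keeps only the scheme, host, and a non-default port.
def iiif_migration_host(url)
  uri = URI.parse(url)
  host = "#{uri.scheme}://#{uri.host}"
  host += ":#{uri.port}" unless uri.port == uri.default_port
  host
end

puts iiif_migration_host('http://localhost/images')
puts iiif_migration_host('http://localhost:3000/images')
```

The result could then be used as the argument to the migration task in place of a hand-typed address.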

@danschmidt5189
Author

danschmidt5189 commented Mar 9, 2017

Ah, thanks @jkeck . I missed the release notes!

After re-running migrate_to_iiif with the correct address, we're now getting 500s instead of 404s. Progress!

From the logs:

I, [2017-03-09T02:01:35.763471 #1236]  INFO -- : Started GET "/images/48/0,972,2861,286/1800,180/0/default.jpg" for 127.0.0.1 at 2017-03-09 02:01:35 +0000
I, [2017-03-09T02:01:35.764633 #1236]  INFO -- : Processing by Riiif::ImagesController#show as JPEG
I, [2017-03-09T02:01:35.764722 #1236]  INFO -- :   Parameters: {"rotation"=>"0", "region"=>"0,972,2861,286", "quality"=>"default", "model"=>"riiif/image", "id"=>"48", "size"=>"1800,180"}
I, [2017-03-09T02:01:35.765772 #1236]  INFO -- : Completed 500 Internal Server Error in 1ms (ActiveRecord: 0.0ms)
F, [2017-03-09T02:01:35.766859 #1236] FATAL -- :
ArgumentError (You must provide a format):
  vendor/bundle/ruby/2.2.0/gems/riiif-1.1.0/app/models/riiif/image.rb:66:in `decode_options!'
  vendor/bundle/ruby/2.2.0/gems/riiif-1.1.0/app/models/riiif/image.rb:38:in `render'
  vendor/bundle/ruby/2.2.0/gems/riiif-1.1.0/app/controllers/riiif/images_controller.rb:23:in `show'
  vendor/bundle/ruby/2.2.0/gems/actionpack-4.2.3/lib/action_controller/metal/implicit_render.rb:4:in `send_action'
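
For reading the log above: the request path follows the IIIF Image API layout, `{prefix}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}`, so the final segment carries both quality and format. The sketch below splits such a path in plain Ruby, only to show that the request itself does include a format. `parse_iiif_image_path` is a hypothetical helper and not RIIIF's implementation; the `/images` prefix is the mount point used in this thread.

```ruby
# Hypothetical helper (not RIIIF code): splits an IIIF Image API path into
# its named parts. The last segment is "{quality}.{format}".
def parse_iiif_image_path(path, prefix: '/images')
  segments = path.sub(/\A#{Regexp.escape(prefix)}/, '').split('/').reject(&:empty?)
  id, region, size, rotation, file = segments
  quality, fmt = file.split('.', 2)
  { 'id' => id, 'region' => region, 'size' => size,
    'rotation' => rotation, 'quality' => quality, 'format' => fmt }
end

parts = parse_iiif_image_path('/images/48/0,972,2861,286/1800,180/0/default.jpg')
puts parts.inspect
```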

I will look more into this tomorrow. And my apologies for my vagueness — it's a non-trivial stack and my experience with it is quite limited. Thanks for your help!

@jcoyne
Member

jcoyne commented Mar 9, 2017

I notice you are using Rails 4.2.3. Does this error persist if you upgrade to 4.2.8?

@danschmidt5189
Author

@jcoyne Nope, both the deprecation warning and RIIIF error persist in testing with Rails 4.2.8.

@danschmidt5189
Author

danschmidt5189 commented Mar 10, 2017

Looking into this a bit further, the bug is related to riiif's use of hashes versus Rails' ActiveSupport::HashWithIndifferentAccess object. Namely, images_controller supposedly returns an indifferent-access hash (according to the comment):

##
# @return [ActiveSupport::HashWithIndifferentAccess]
def image_request_params
  params.permit(:region, :size, :rotation, :quality, :format).to_h
end

But in the version of Rails I'm testing (4.2.8 now), it's actually returning a regular Ruby hash. Thus, attempts to access params[:format] fail later down the call stack — they need to be params["format"].
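
The key mismatch can be reproduced without Rails. In this minimal sketch the `params` hash is a stand-in for what `image_request_params` returns according to the description above, and `fetch_indifferent` is a hypothetical helper, not something riiif provides:

```ruby
# Stand-in for the plain Hash with string keys described above.
params = { 'format' => 'jpg', 'size' => '1800,180' }

p params[:format]   # symbol lookup misses on a plain Hash
p params['format']  # string lookup finds the value

# Hypothetical helper: look a key up as a string first, then as a symbol.
def fetch_indifferent(hash, key)
  hash.fetch(key.to_s) { hash[key.to_sym] }
end

p fetch_indifferent(params, :format)
```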

Upgrading to Rails 5.0.2 fixes the riiif problem (so images render correctly) but causes a new error in rendering Spotlight's views:

I, [2017-03-10T20:45:29.375684 #17759]  INFO -- :   Rendered vendor/bundle/ruby/2.2.0/bundler/gems/spotlight-a0d047f089ad/app/views/_user_util_links.html.erb (69.7ms)
I, [2017-03-10T20:45:29.375926 #17759]  INFO -- :   Rendered vendor/bundle/ruby/2.2.0/bundler/gems/spotlight-a0d047f089ad/app/views/shared/_header_navbar.html.erb (71.2ms)
I, [2017-03-10T20:45:29.377031 #17759]  INFO -- : Completed 500 Internal Server Error in 160ms (ActiveRecord: 6.1ms)
F, [2017-03-10T20:45:29.385014 #17759] FATAL -- :
F, [2017-03-10T20:45:29.385168 #17759] FATAL -- : ActionView::Template::Error (Attribute was supposed to be a Hash, but was a ActionController::Parameters. -- <ActionController::Parameters {"spotlight_upload_description_tesim"=><ActionController::Parameters {"label"=>"Description", "weight"=>"0", "list"=>true, "gallery"=>true, "masonry"=>true, "slideshow"=>true, "show"=>true, "enabled"=>true} permitted: false>, "spotlight_upload_attribution_tesim"=><ActionController::Parameters {/* ... snip ... */):
F, [2017-03-10T20:45:29.385445 #17759] FATAL -- :     1: <div class="navbar-right">
    2:
    3:   <ul class="nav navbar-nav">
    4:     <%= render_nav_actions do |config, action|%>
    5:       <li><%= action %></li>
    6:     <% end %>
    7:   </ul>
F, [2017-03-10T20:45:29.385495 #17759] FATAL -- :
F, [2017-03-10T20:45:29.385975 #17759] FATAL -- : vendor/bundle/ruby/2.2.0/gems/activerecord-5.0.2/lib/active_record/coders/yaml_column.rb:34:in `assert_valid_value'

@danschmidt5189
Author

Ok, tracked down the latest problem to a change in how Rails 5 handles ActionController::Parameters — previously a hash, now an Object. This leads to the type mismatch error pasted above (in the blacklight_configuration model).

Still working on a fix for that one.

@jcoyne
Member

jcoyne commented Mar 13, 2017

@danschmidt5189 Yes, but we're using Rails 5 and haven't been able to duplicate your problem. Even with ActionController::Parameters you can access values (#[]) using strings or symbols.

@danschmidt5189
Author

@jcoyne Were you able to upgrade exhibits that were created in the pre-Rails 5 version of Spotlight? Newly created exhibits are fine; it's older exhibits that trigger the error.

@danschmidt5189
Author

danschmidt5189 commented Mar 13, 2017

So I re-tested this end-to-end from scratch with production data. For background, the basic process is:

  1. Install spotlight from our Git repo
  2. Import production data, then run remaining migrations
  3. Start the app
  4. Run spotlight:migrate_to_iiif[http://$DEV_SERVER_ADDR]
  5. Run spotlight:reindex

The key step as it pertains to this Issue is (2). If I load our production data as-is (containing references to !ruby/hash:ActionController::Parameters) then it fails with the error pasted above. If I sed-replace that to !ruby/hash:Hash (or, presumably, any sub-class of Hash) it works.

So, takeaways:

  • Regex-replacing the deserialization line in the database provides an ugly workaround.
  • I'd expect all users with significant existing production data to run into this problem trying to migrate from Rails 4 to 5.
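
The sed replacement described above can also be written as a small Ruby filter over the dump text. This is only a sketch: `rewrite_parameters_tags` is a hypothetical name, and the sample YAML shape is an assumption loosely based on the field names in the earlier error message, not an actual extract from the dump.

```ruby
# Hypothetical helper: swaps the serialized ActionController::Parameters tag
# for a plain Hash tag everywhere in the given YAML text.
def rewrite_parameters_tags(yaml_text,
                            from: '!ruby/hash:ActionController::Parameters',
                            to: '!ruby/hash:Hash')
  yaml_text.gsub(from, to)
end

# Assumed sample; the real dump layout may differ.
sample = <<~YAML
  spotlight_upload_description_tesim: !ruby/hash:ActionController::Parameters
    label: Description
    weight: '0'
YAML

puts rewrite_parameters_tags(sample)
```

Like the sed version, this is a plain text substitution, so it would also rewrite the tag anywhere else it happens to appear in the dump.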

@jcoyne
Member

jcoyne commented Mar 15, 2017

I think this may fix your problem with RIIIF sul-dlss/riiif#49

@jcoyne
Member

jcoyne commented Mar 15, 2017

That patch was released as 1.1.2 https://github.com/curationexperts/riiif/releases/tag/v1.1.2

@danschmidt5189
Author

danschmidt5189 commented Jun 28, 2017

Following up—do you guys know if the IIIF migration task misses Spotlight::HomePage objects?

After running the migration, we ran into a bug where editing a home page containing a solr_documents_grid would result in all the images 404ing. In the database, we can see that post-edit the iiif-related fields are all empty, e.g.:

{
  "item_1": {
    "id": "12-662",
    "title": "燕山俠隱",
    "thumbnail_image_url": "",
    "full_image_url": "/uploads/spotlight/resources/upload/url/662/yan_shan_xia_yin_2.jpg",
    "iiif_tilesource": "undefined",
    "iiif_manifest_url": "undefined",
    "iiif_canvas_id": "undefined",
    "iiif_image_id": "undefined",
    "weight": "1",
    "display": "true"
  }
}

If you manually delete and re-add the entry, it's updated to something like this in the DB:

{
  "item_0": {
    "id": "12-549",
    "title": "红色娘子军",
    "thumbnail_image_url": "",
    "full_image_url": "",
    "iiif_tilesource": "http://exhibits.vagrant.lib.berkeley.edu:3000/images/702/info.json",
    "iiif_manifest_url": "/fonoroff-collection/catalog/12-549/manifest",
    "iiif_canvas_id": "http://exhibits.vagrant.lib.berkeley.edu:3000/fonoroff-collection/catalog/12-549/manifest/canvas/12-549",
    "iiif_image_id": "",
    "weight": "0",
    "display": "true"
  }
}

Any ideas? We kludged our way around this by writing our own rake task, which loops through our pages and updates content manually. But that's pretty brittle; I was hoping you guys could shed some light on the underlying issue.

And by the way, thanks for all your work on this! We're still learning how the Blacklight suite of apps is architected; down the road we could hopefully contribute to development. Some of this stuff, though, especially the SirTrevorRails serializations, reads like black magic at the moment.


Note: The JSON snippets above are just extracts from running a SQL query, essentially:

SELECT content FROM spotlight_pages WHERE id = ? AND type = 'Spotlight::HomePage';

I then pulled out the offending documents from data.type=solr_documents_grid. The underlying record has a format like:

{
  "data": [
    {
      "type": "solr_documents_grid",
      "data": {
        "item": {
          // Samples extracted from here...
        }
      }
    },
    // ...
  ]
}

Hope that provides enough context to jog the memory of someone familiar with the codebase.
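
For illustration, a cleanup over that structure might look like the sketch below. It is not our actual rake task and not Spotlight code: `blank_undefined_iiif_fields` is a hypothetical name, it works on the JSON string only (no ActiveRecord), and it assumes that replacing the literal string "undefined" with an empty string is the desired fix, which is unconfirmed.

```ruby
require 'json'

# Hypothetical cleanup: in every solr_documents_grid block, replace the
# literal string "undefined" in the IIIF fields with an empty string.
def blank_undefined_iiif_fields(content_json)
  iiif_keys = %w[iiif_tilesource iiif_manifest_url iiif_canvas_id iiif_image_id]
  content = JSON.parse(content_json)
  content.fetch('data', []).each do |block|
    next unless block['type'] == 'solr_documents_grid'
    items = block.dig('data', 'item') || {}
    items.each_value do |item|
      iiif_keys.each { |key| item[key] = '' if item[key] == 'undefined' }
    end
  end
  JSON.generate(content)
end

sample = '{"data":[{"type":"solr_documents_grid","data":{"item":{"item_1":' \
         '{"id":"12-662","iiif_tilesource":"undefined","weight":"1"}}}}]}'
puts blank_undefined_iiif_fields(sample)
```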

@jkeck
Contributor

jkeck commented Jun 28, 2017

@danschmidt5189 do you mean the thumbnail of the home page or images within the content of widgets? We did the former, not the latter (as deserializing/reserializing the sir-trevor widget output was deemed too much of a pain).

What I would expect to see in this case is that in the absence of IIIF data, the widget would fall back to the full_image_url. It's possible that some part of the UI switched the IIIF field values to undefined (assuming that's coming from the database and not locally formatted), which we're not properly accounting for? Are you seeing this immediately post-migration, or has somebody come along to this page and saved it (which then subsequently caused the undefined values to show up)?
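
The fallback described above can be sketched in plain Ruby as follows. This is an illustration of the expected behaviour only, not Spotlight's implementation; `image_source_for` is a hypothetical name, and treating the literal string "undefined" as missing is an assumption.

```ruby
# Hypothetical selection logic: prefer the IIIF tile source, treating blank
# values and the literal string "undefined" as missing IIIF data.
def image_source_for(item)
  tilesource = item['iiif_tilesource'].to_s
  return tilesource unless tilesource.empty? || tilesource == 'undefined'

  item['full_image_url']
end

broken = { 'iiif_tilesource' => 'undefined',
           'full_image_url' => '/uploads/spotlight/resources/upload/url/662/yan_shan_xia_yin_2.jpg' }
puts image_source_for(broken)
```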

@danschmidt5189
Author

danschmidt5189 commented Jun 28, 2017

The latter on both counts—1) I'm referring to the thumbnails embedded within a homepage, not the homepage's thumbnail; and 2) it's fine initially (right after the iiif migration) but breaks after an edit (even if you don't actually change anything).

Is it possible to programmatically rebuild the widgets in code? (We ended up mucking with the serialized JSON directly; it works but we're unsure if that could create other problems down the road.)
