Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fallback to screenshots or opengraph if possible in media resolution #452

Merged
merged 12 commits into from
Jan 15, 2021

Conversation

johnshaughnessy
Copy link
Contributor

@johnshaughnessy johnshaughnessy commented Jan 12, 2021

Fix #435

A lot of screenshots in a hubs room

@johnshaughnessy johnshaughnessy changed the title Feature/media resolve fallback to screenshots Fallback to screenshots or opengraph if possible in media resolution Jan 12, 2021
# twice in a row. These resolve functions can be simplified so that we can
# more easily track individual failures and only fallback when necessary.
# Also make sure they have a uniform response shape for indicating an
# error.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a tech debt concern that I do not want to cover in this PR, (and I am not sure I'm right), but it might be worth changing in the future:

I think what we might want to do is:

  • Refactor resolve/2, resolve_with_ytdl, and resolve_non_video so that they return {:ok, %ResolvedMedia{}} | {:error, reason} instead of a return value for the Cachex fallback function.
  • In the outer function (resolve/1), match on the {:ok, %ResolvedMedia{}} | {:error, reason} tuples (from the functions above) using case to decide
    • whether to try to fallback to screenshots/opengraph,
    • whether to increment Statix metrics or other telemetry,
    • what we actually want to store in the cache in the event of an error (e.g. details about what went wrong so that we can return that info to the client)
  • From the outer function (resolve/1), return a tuple for the Cachex fallback function with a status of :commit or :ignore:

A fallback function should return a Tuple consisting of a :commit or :ignore tag and a value. If the Tuple is tagged :commit the value will be placed into the cache and then returned. If tagged :ignore the value will be returned without being written to the cache. If you return a value which does not fit this structure, it will be assumed that you are committing the value.

From https://hexdocs.pm/cachex/Cachex.html#fetch/4

This may not be necessary, but it seems to me like it might be clearer to explicitly return one of:

  • {:commit, resolved_media}
  • {:commit, {:error, reason}}
  • {:commit, nil} (Don't use nil to indicate 404 -- instead use something like {:commit, {:error, 404}}
  • {:ignore, some_uncached_value}

We should do our best differentiate between the return values of Cachex.fetch(:media_urls, query) responses. Specifically,

case Cachex.fetch(:media_urls, query) do
  {:error, _} -> # This is a Cachex error!
  {_status, {:error, _reason} } -> # This is a cached error we put in ourselves
end

See this commit for details ffab45c#diff-41751f7931f9a08bf162d96c927b4c6872c21b2d215916a5b0f8041848d08152R119

This might be easier to distinguish if we create a ResolvedMediaError struct to put in the cache so that we don't mistake the difference between the two.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds correct to me. Mixing the fallback stuff in with resolving is confusing. We should have resolving return an ok/error result like you suggest, and then use that in cachex fallback. Might be nice for the cachex part not even to live in this file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Captured this in #454 for addressing at a later date

defp maybe_do_telemetry({:error, _reason}), do: Statix.increment("ret.media_resolver.unknown_error")
defp maybe_do_telemetry({:commit, :error}), do: Statix.increment("ret.media_resolver.500")
defp maybe_do_telemetry({:commit, {:error, _reason}}), do: Statix.increment("ret.media_resolver.500")
defp maybe_do_telemetry(_), do: nil
Copy link
Contributor Author

@johnshaughnessy johnshaughnessy Jan 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only do telemetry on the {:commit, _} tuples because that indicates when they are first entered into the cache. This way, we don't log 100 404's if there's 1 resource that 100 people try to reference because the media resolution only really failed once.

@@ -24,6 +24,8 @@ defmodule Ret.HttpUtils do
end

def retry_until_success(verb, url, body \\ "", headers \\ [], cap_ms \\ 5_000, expiry_ms \\ 10_000) do
headers = headers ++ [{"User-Agent", "Mozilla/5.0 (X11; Linux x86_64; rv:84.0) Gecko/20100101 Firefox/84.0"}]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is correct, but it seems like we need to spoof a User-Agent for certain sites to respond to the requests we make for screenshot / opengraph resolution (the ones with the Range headers below https://github.com/mozilla/reticulum/pull/452/files#diff-466e850938cfeaae016678e1ab2b328f907c2b5c7620a19ded2470dfad504b54R365 )

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems sane to spoof a normal browser, what specific string we use doesn't really matter much I don't think. Might want to use a Windows one in case the linux one confuses some random website.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replaced the linux one with a Windows + Firefox one:
Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:84.0) Gecko/20100101 Firefox/84.0

Copy link
Contributor

@netpro2k netpro2k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@@ -24,6 +24,8 @@ defmodule Ret.HttpUtils do
end

def retry_until_success(verb, url, body \\ "", headers \\ [], cap_ms \\ 5_000, expiry_ms \\ 10_000) do
headers = headers ++ [{"User-Agent", "Mozilla/5.0 (X11; Linux x86_64; rv:84.0) Gecko/20100101 Firefox/84.0"}]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems sane to spoof a normal browser, what specific string we use doesn't really matter much I don't think. Might want to use a Windows one in case the linux one confuses some random website.

@@ -74,6 +74,14 @@ defmodule RetWeb.Api.V1.MediaController do
end

defp resolve_and_render(conn, url, version, quality \\ nil) do
query = query_for(conn, url, version, quality)
value = Cachex.fetch(:media_urls, query)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could pipeline value through here (would need to change render_resolved_media_or_error)... Not sure if it would actually be clearer since value is not being transformed anyway...

This definitely reads more clearly than what we had before.

Also, we didn't have it before but might be worth adding a note that Cachex.fetch is what actually kicks off the resolution if its not in the cache.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RE: adding a note about Cachex.fetch.

I could be persuaded into adding a note (and I did not know this behavior before now either), but since it's used in several places in the client I am also OK with having people find out about this by consulting the docs. I don't feel strongly about this though so I'll add it if you think it's helpful

./lib/ret/slack_client.ex:78:      case Cachex.fetch(
./lib/ret/slack_client.ex:93:      case Cachex.fetch(
./lib/ret/slack_client.ex:105:      case Cachex.fetch(
./lib/ret/app_config.ex:49:        Cachex.fetch(:app_config, "")
./lib/ret/app_config.ex:117:    case Cachex.fetch(:app_config_value, key) do
./lib/ret/app_config.ex:126:    case Cachex.fetch(:app_config_owned_file_uri, key) do
./lib/ret/discord_client.ex:66:      case Cachex.fetch(:discord_api, "/guilds/#{community_id}/members/#{provider_account_id}") do
./lib/ret/discord_client.ex:71:      case Cachex.fetch(:discord_api, "/users/#{provider_account_id}") do
./lib/ret/discord_client.ex:80:    case Cachex.fetch(:discord_api, "/users/#{provider_account_id}") do
./lib/ret/discord_client.ex:138:      case Cachex.fetch(:discord_api, "/guilds/#{community_id}") do
./lib/ret/discord_client.ex:146:        case Cachex.fetch(:discord_api, "/guilds/#{community_id}/roles") do
./lib/ret/discord_client.ex:172:        case Cachex.fetch(:discord_api, "/channels/#{channel_id}") do
./lib/ret/discord_client.ex:212:      case Cachex.fetch(:discord_api, "/guilds/#{community_id}/members/#{discord_user_id}") do
./lib/ret/coturn.ex:21:    {_, coturn_secret} = Cachex.fetch(:coturn_secret, :coturn_secret)
./lib/ret_web/controllers/api/v1/media_search_controller.ex:86:    case Cachex.fetch(cache_for_query(query), query) do
./lib/ret_web/controllers/api/v1/media_controller.ex:78:    value = Cachex.fetch(:media_urls, query)

# twice in a row. These resolve functions can be simplified so that we can
# more easily track individual failures and only fallback when necessary.
# Also make sure they have a uniform response shape for indicating an
# error.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds correct to me. Mixing the fallback stuff in with resolving is confusing. We should have resolving return an ok/error result like you suggest, and then use that in cachex fallback. Might be nice for the cachex part not even to live in this file.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Media resolver suggestion: screenshot fallback for failed media resolution
2 participants