Deprecated TLS negotiation in URLLIB3 causing issues #819

Open
flooie opened this issue Dec 14, 2023 · 9 comments
@flooie
Contributor

flooie commented Dec 14, 2023

While reviewing a number of failing scrapers today, I noticed a pattern in when many of the errors started showing up: around December 5th.

After investigating further, it became clear that three court websites, covering at least four courts, still use TLSv1.2 (see below).

- https://www.la-fcca.org
- * SSL connection using TLSv1.2 / AES128-SHA

- https://www.jud.ct.gov
- * SSL connection using TLSv1.2 / AES256-SHA256
- Warning: Ignores instruction to use SSLv3

- https://oag.ca.gov
- * SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
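The curl output above can also be checked from Python with only the standard library. This is a minimal sketch, not part of the scrapers: `negotiated_tls` is a hypothetical helper name, and it uses Python's default SSL context, so a handshake failure against one of these hosts is itself a useful signal.

```python
import socket
import ssl


def negotiated_tls(host, port=443, timeout=10.0):
    """Return (protocol_version, cipher_name) negotiated with host.

    Hypothetical helper for illustration. Raises OSError (including
    ssl.SSLError) if the connection or the handshake fails.
    """
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            # cipher() returns (name, protocol, secret_bits)
            return tls.version(), tls.cipher()[0]


if __name__ == "__main__":
    for host in ("www.la-fcca.org", "www.jud.ct.gov", "oag.ca.gov"):
        try:
            print(host, negotiated_tls(host))
        except OSError as exc:
            print(host, "failed:", exc)
```

Running it needs network access, and the result depends on the local OpenSSL build as well as on the server.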

urllib3, which underlies requests, deprecated negotiation of older TLS standards when it moved to the 2.x series.

For the Louisiana Court of Appeal and the Supreme and Appellate courts of Connecticut, we can resolve this by keeping urllib3 pinned at 1.26.18.
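If the pin is adopted, a small guard can confirm which urllib3 series is actually installed at runtime. This is a sketch under assumptions: `is_pre_v2` and `installed_urllib3_version` are hypothetical names, and the check only looks at the major version number.

```python
from importlib import metadata


def is_pre_v2(version):
    """True if a urllib3 version string belongs to the 1.x series."""
    major = int(version.split(".")[0])
    return major < 2


def installed_urllib3_version():
    """Return the installed urllib3 version string, or None if absent."""
    try:
        return metadata.version("urllib3")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_urllib3_version()
    if found is None:
        print("urllib3 is not installed")
    else:
        print(found, "is 1.x" if is_pre_v2(found) else "is 2.x or later")
```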

For whatever reason, I haven't found a simple solution for the California AG website. But writing a custom HTTP adapter with "legacy server connect" enabled solves California AG. See below.

import ssl

import requests
import urllib3


class CustomHttpAdapter(requests.adapters.HTTPAdapter):
    # "Transport adapter" that allows us to use a custom ssl_context.

    def __init__(self, ssl_context=None, **kwargs):
        # Must be set before super().__init__(), which calls init_poolmanager().
        self.ssl_context = ssl_context
        super().__init__(**kwargs)

    def init_poolmanager(self, connections, maxsize, block=False, **pool_kwargs):
        # Build the pool with our SSL context instead of the default one.
        self.poolmanager = urllib3.poolmanager.PoolManager(
            num_pools=connections, maxsize=maxsize,
            block=block, ssl_context=self.ssl_context, **pool_kwargs)


def get_legacy_session():
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)
    ctx.options |= 0x4  # OP_LEGACY_SERVER_CONNECT
    session = requests.Session()
    session.mount('https://', CustomHttpAdapter(ctx))
    return session
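To sanity-check the context used by the adapter without making a request, the flag can be inspected directly. A minimal standard-library sketch, assuming the raw value 0x4 as a fallback (Python 3.12 and later expose it as `ssl.OP_LEGACY_SERVER_CONNECT`); `make_legacy_context` is a hypothetical name:

```python
import ssl

# Python 3.12+ exposes the flag by name; older versions need the raw value.
OP_LEGACY_SERVER_CONNECT = getattr(ssl, "OP_LEGACY_SERVER_CONNECT", 0x4)


def make_legacy_context():
    """Default client context with legacy server connect enabled."""
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)
    ctx.options |= OP_LEGACY_SERVER_CONNECT
    return ctx


ctx = make_legacy_context()
# The legacy flag is set, while certificate and hostname checks stay on.
print(bool(ctx.options & OP_LEGACY_SERVER_CONNECT))
print(ctx.verify_mode == ssl.CERT_REQUIRED, ctx.check_hostname)
```

Unlike turning verification off, this approach leaves certificate and hostname validation in place.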

Regardless, calag, conn, connctapp and lactapp_1 all have a path to resolution.

I have a feeling that this change also triggered errors in the Missouri appellate courts, and that the errors built up and caused the courts to block us. That will need a call to reach out.

@mlissner
Member

Some details would help? What's the issue, which court, stacktrace?

@flooie flooie changed the title OpenSSL issues since December 5th Deprecated TLS negotiation in URLLIB3 causing issues Dec 14, 2023
@flooie
Contributor Author

flooie commented Dec 14, 2023

Read more about these changes here

Here

@flooie flooie moved this to Todo in @grossir's backlog Dec 27, 2023
@flooie flooie moved this from Todo to State Supreme/Appellate/OA in @grossir's backlog Dec 28, 2023

sentry-io bot commented Jan 22, 2024

Sentry issue: COURTLISTENER-65H


sentry-io bot commented Jan 22, 2024

Sentry issue: COURTLISTENER-64R


sentry-io bot commented Jan 22, 2024

Sentry issue: COURTLISTENER-64T


sentry-io bot commented Jan 22, 2024

Sentry issue: COURTLISTENER-6C1

@grossir
Contributor

grossir commented Aug 13, 2024

connctapp has started failing again. I guess conn may fail too once the fixed scraper is merged.

Sentry Issue: COURTLISTENER-7HW

@grossir grossir reopened this Aug 13, 2024
@flooie
Contributor Author

flooie commented Aug 13, 2024

Sadly this is not surprising.

@grossir
Contributor

grossir commented Sep 9, 2024

Currently we are using a custom adapter, which fixed the loading of the HTML results page. However, downloading the actual opinion is failing on the server. It does not fail when running it locally, which adds to the mystery.

self.cipher = "AES256-SHA256"
self.set_custom_adapter(self.cipher)
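For context, a cipher-pinned SSL context can be sketched with the standard library as follows. The implementation of `set_custom_adapter` is not shown in this thread, so this is an assumption about the general technique rather than the project's code; `context_for_cipher` is a hypothetical name. Cipher availability depends on the OpenSSL build that Python is linked against, which can differ between machines.

```python
import ssl


def context_for_cipher(cipher):
    """Default client context restricted to the given TLS 1.2 cipher.

    set_ciphers() raises ssl.SSLError if the local OpenSSL build
    cannot select any cipher from the string.
    """
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)
    ctx.set_ciphers(cipher)
    return ctx


ctx = context_for_cipher("AES256-SHA256")
# TLS 1.3 suites are configured separately, so they may still be listed.
names = {c["name"] for c in ctx.get_ciphers()}
print(sorted(names))
```

Comparing this output locally and on the server is one way to see whether the two environments offer the same ciphers.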

I propose replacing the cipher approach with self.request['verify'] = False and seeing if that works.
