Add NY Misc. court #827

Closed
mlissner opened this issue Dec 21, 2023 · 9 comments

Comments

@mlissner
Member

One of our clients wants to get alerts for cases in the NY Misc. court. Now that @grossir is here, we should make this client happy.

@grossir
Contributor

grossir commented Dec 22, 2023

Hi @mlissner the issue you linked gives me a 404. Is it broken or is it just me?

@mlissner
Member Author

That's to our CRM, which we keep pretty private, but I wanted to link it so that we remember we did this for the client. We have three closed repos: crm, fundraising, and kubernetes.

@flooie
Contributor

flooie commented Dec 22, 2023

@mlissner - do you mean this reporter?

NY MISC Reporter

@mlissner
Member Author

Maybe? I'll forward you a message with details.

@flooie
Contributor

flooie commented Dec 22, 2023

Adding this scraper requires a significant extension to Juriscraper to accommodate the scraping of various "other courts" in New York, beyond the scope of specific individual courts. This development will necessitate two key changes:

Modification to Juriscraper Paradigm: The current structure of Juriscraper is predominantly focused on scraping specific courts. The proposed scraper will need to be more flexible, allowing it to handle a range of different courts under the "other courts" category in New York.

Updates to CourtListener (cl_scrape_opinions file): To integrate this new scraper effectively, we need to modify the cl_scrape_opinions file in CourtListener. This is crucial because the court_id typically associated with specific courts will now need to accommodate a broader category.

Additionally, we will need to update courts_db to map the text of the court names to their respective court_ids. This is essential to ensure proper identification of the court in CourtListener.

Identified Courts for Inclusion

So far, we have identified the following courts for inclusion in this scraper:

  • Criminal Court of the City of New York, Bronx County
  • Surrogate's Court, Queens County
  • Supreme Court, Kings County
  • Supreme Court, New York County
  • Supreme Court, Bronx County
  • Supreme Court, Kings County
  • Criminal Court Of The City Of New York, New York County
  • Supreme Court, Queens County
  • Supreme Court, Erie County
  • Civil Court of the City of New York, Bronx County
  • County Court, Saratoga County
  • Supreme Court, Nassau County
  • Supreme Court, Richmond County
  • Supreme Court, Westchester County
  • Supreme Court, Onondaga County
  • Court Of Claims
  • Civil Court Of The City Of New York, Queens County
  • Supreme Court, Suffolk County
  • City Court Of Little Falls, Herkimer County
  • Civil Court Of The City Of New York, Bronx County
  • Civil Court Of The City Of New York, Queens County
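from juriscraper.opinions.united_states.state import nyappterm_1st


class Site(nyappterm_1st.Site):
    """Scraper for the "Other Courts" category, reusing nyappterm_1st."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Replace the court name set by the parent scraper.
        self.court = "Other Courts"
        # Send the new court name along with the parent's parameters.
        self.parameters.update({"court": self.court})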

This code is more or less all that is needed to get the scraper working, once we make a slight modification to nyappterm_1st.

@flooie flooie moved this to Todo in @grossir's backlog Dec 26, 2023
@flooie flooie moved this from Todo to Unhappy people in @grossir's backlog Dec 28, 2023
@grossir grossir moved this from Unhappy people to In Progress in @grossir's backlog Dec 28, 2023
@grossir
Contributor

grossir commented Jan 2, 2024

This is the complete list of court names found in NYMisc from 2018 to today, across 35 048 cases (27 798 pdf, 7 250 htm). Some cases had no court data, or had it outside the usual places. I got court data for 34 886 of them.

There are some errors, like:

  • 'upreme Court, Kings County', missing an S
  • 'Supureme Court, Kings County' or 'Surpeme Court, New York County', with extra or transposed letters
  • 'Suprme Court' and 'Supeme Court', missing a character
{
 'Chatham Town Court, Columbia County',
 'City Court Of Albany, Albany County',
 'City Court Of Buffalo, Erie County',
 'City Court Of Cohoes',
 'City Court Of Cohoes, Albany County',
 'City Court Of Glens Falls, Warren County',
 'City Court Of Gloversville, Fulton County',
 'City Court Of Hudson, Columbia County',
 'City Court Of Ithaca, Tompkins County',
 'City Court Of Little  Falls, Herkimer County',
 'City Court Of Little Falls',
 'City Court Of Little Falls, Herkimer County',
 'City Court Of Long Beach, Nassau County',
 'City Court Of Middletown, Orange County',
 'City Court Of Mount Vernon',
 'City Court Of Mount Vernon, New York County',
 'City Court Of Mount Vernon, Westchester County',
 'City Court Of New Rochelle, Westchester County',
 'City Court Of Norwich, Chenango County',
 'City Court Of Oswego, Oswego County',
 'City Court Of Poughkeepsie',
 'City Court Of Poughkeepsie, Dutchess County',
 'City Court Of Rochester, Monroe County',
 'City Court Of Rome',
 'City Court Of Rye, Westchester County',
 'City Court Of Syracuse, Onondaga County',
 'City Court Of Troy, Rensselaer County',
 'City Court Of Utica',
 'City Court Of Watertown',
 'City Court Of Watertown, Jefferson County',
 'City Court Of Yonkers',
 'City Court Of Yonkers, Westchester County',
 'City Court Pf Mount Vernon, Westchester County',
 'City Court of Albany',
 'City Court of Albany, Albany County',
 'City Court of Auburn',
 'City Court of Buffalo',
 'City Court of Cohoes',
 'City Court of Glens Falls',
 'City Court of Glens Falls,',
 'City Court of Gloversville',
 'City Court of Hudson',
 'City Court of Ithaca',
 'City Court of Ithaca, Tompkins County',
 'City Court of Jamestown',
 'City Court of Little Falls',
 'City Court of Long Beach',
 'City Court of Middletown',
 'City Court of Mount Vernon',
 'City Court of New Rochelle, Westchester County',
 'City Court of Norwich',
 'City Court of Peekskill',
 'City Court of Peekskill, Westchester County',
 'City Court of Poughkeepsie',
 'City Court of Rensselaer',
 'City Court of Rochester',
 'City Court of Rome',
 'City Court of Rome, Oneida County',
 'City Court of Rye',
 'City Court of Rye, Weschester County',
 'City Court of Rye, Westchester County',
 'City Court of Saratoga',
 'City Court of Schenectady',
 'City Court of Syracuse',
 'City Court of Troy',
 'City Court of Utica',
 'City Court of Watertown',
 'City Court of Watervliet, Albany County',
 'City Court of White Plains',
 'City Court of Yonkers',
 'City Court of the City of Rye, Westchester County',
 'City Court, Westchester County',
 'Civil Court Of The City Of New York Queens County',
 'Civil Court Of The City Of New York,  New York County',
 'Civil Court Of The City Of New York, Bronx County',
 'Civil Court Of The City Of New York, Broxn County',
 'Civil Court Of The City Of New York, Kings County',
 'Civil Court Of The City Of New York, New York County',
 'Civil Court Of The City Of New York, Queens County',
 'Civil Court Of The City Of New York, Richmond County',
 'Civil Court Of the City Of New York, Bronx County',
 'Civil Court of City of New York, Bronx County',
 'Civil Court of The City Of New York, Queens County',
 'Civil Court of the City Of New York, New York County',
 'Civil Court of the City of New York',
 'Civil Court of the City of New York Kings County',
 'Civil Court of the City of New York Queens County',
 'Civil Court of the City of New York, Bronx County',
 'Civil Court of the City of New York, Kings County',
 'Civil Court of the City of New York, New York County',
 'Civil Court of the City of New York, Queens County',
 'Civil Court of the City of New York, Richmond County',
 'Civil Court of the City of NewYork, Bronx County',
 'Civil Court, Bronx County',
 'Civil Court, Kings County',
 'Civil Court, New York County',
 'Civil Court, Of The City Of New York, Bronx County',
 'Civil Court, Queens County',
 'County Court,  Essex County',
 'County Court, Albany County',
 'County Court, Allegany County',
 'County Court, Broome County',
 'County Court, Cayuga County',
 'County Court, Clinton County',
 'County Court, Columbia County',
 'County Court, Cortland County',
 'County Court, Dutchess County',
 'County Court, Erie County',
 'County Court, Essex County',
 'County Court, Franklin County',
 'County Court, Genesee County',
 'County Court, Jefferson County',
 'County Court, Livingston County',
 'County Court, Monroe County',
 'County Court, Montgomery County',
 'County Court, Nassau County',
 'County Court, Niagara County',
 'County Court, Oneida County',
 'County Court, Onondaga County',
 'County Court, Ontario County',
 'County Court, Orange County',
 'County Court, Putnam County',
 'County Court, Rockland County',
 'County Court, Saratoga County',
 'County Court, Schenectady County',
 'County Court, Schuyler County',
 'County Court, St. Lawrence County',
 'County Court, Steuben County',
 'County Court, Suffolk County',
 'County Court, Sullivan County',
 'County Court, Tompkins County',
 'County Court, Ulster County',
 'County Court, Warren County',
 'County Court, Wayne County',
 'County Court, Weschester County',
 'County Court, Westchester County',
 'County Court, Westechester County',
 'County Court, Wyoming County',
 'County Court, Yates County',
 'Court Of Claims',
 'Court of Claims',
 'Criminal Court Of The City Of New York, Bronx County',
 'Criminal Court Of The City Of New York, Kings County',
 'Criminal Court Of The City Of New York, New York County',
 'Criminal Court Of The City Of New York, Queens County',
 'Criminal Court Of The City Of New York, Richmond County',
 'Criminal Court of the City of New York, \r\nQueens County',
 'Criminal Court of the City of New York, Bronx County',
 'Criminal Court of the City of New York, Kings County',
 'Criminal Court of the City of New York, New York County',
 'Criminal Court of the City of New York, Queens County',
 'Criminal Court of the City of New York, Richmond County',
 'Criminal Court, Bronx County',
 'Criminal Court, Kings County',
 'Criminal Court, Queens County',
 'Criminal Court, Richmond County',
 'District Court Of Nassau County',
 'District Court Of Nassau County, First District',
 'District Court Of Nassau County, First Disttrict',
 'District Court Of Nassau County, Second District',
 'District Court Of Nassau County, Third District',
 'District Court Of Suffolk County',
 'District Court Of Suffolk County, First District',
 'District Court Of Suffolk County, Fourth District',
 'District Court Of Suffolk County, Sixth District',
 'District Court Of Suffolk County, Third District',
 'District Court of Nassau County, First District',
 'District Court of Nassau County, Fourth District',
 'District Court of Suffolk County',
 'District Court of Suffolk County, First District',
 'District Court of Suffolk County, Second District',
 'District Court of Suffolk County, Sixth District',
 'District Court of Suffolk County, Third District',
 'District Court, Nassau County',
 'District Court, Suffolk County',
 'Family Court, Albany County',
 'Family Court, Bronx County',
 'Family Court, Chemung County',
 'Family Court, Clinton County',
 'Family Court, Dutchess County',
 'Family Court, Erie County',
 'Family Court, Franklin County',
 'Family Court, Kings County',
 'Family Court, Madison County',
 'Family Court, Monroe County',
 'Family Court, Montgomery County',
 'Family Court, Nassau County',
 'Family Court, New York County',
 'Family Court, Niagara County',
 'Family Court, Oneida County',
 'Family Court, Onondaga County',
 'Family Court, Orange County',
 'Family Court, Oswego County',
 'Family Court, Otsego County',
 'Family Court, Queens County',
 'Family Court, Rockland County',
 'Family Court, Saratoga County',
 'Family Court, Schuyler County',
 'Family Court, Steuben County',
 'Family Court, Suffolk County',
 'Family Court, Sullivan County',
 'Family Court, Tompkins County',
 'Family Court, Ulster County',
 'Family Court, Warren County',
 'Family Court, Washington County',
 'Family Court, Wayne County',
 'Family Court, Westchester County',
 'Family Court, Yates County',
 'Glens Falls City Court',
 'Justice Court Of The Town Of Bedford, Westchester County',
 'Justice Court Of The Town Of Cambria, Niagara County',
 'Justice Court Of The Town Of East Fishkill, Dutchess County',
 'Justice Court Of The Town Of Greece, Monroe County',
 'Justice Court Of The Town Of Greenburgh, Westchester County',
 'Justice Court Of The Town Of Hamptonburgh, Orange County',
 'Justice Court Of The Town Of Henrietta, Monroe County',
 'Justice Court Of The Town Of Kinderhook, Columbia County',
 'Justice Court Of The Town Of New Scotland, Albany County',
 'Justice Court Of The Town Of Newburgh, Orange County',
 'Justice Court Of The Town Of Ogden, Monroe County',
 'Justice Court Of The Town Of Ossining, Westchester County',
 'Justice Court Of The Town Of Parma, Monroe County',
 'Justice Court Of The Town Of Pleasant Valley, Dutchess County',
 'Justice Court Of The Town Of Ramapo, Rockland County',
 'Justice Court Of The Town Of Red Hook, Dutchess County',
 'Justice Court Of The Town Of Stuyvesant, Columbia County',
 'Justice Court Of The Town Of Webster, Monroe County',
 'Justice Court Of The Village Of Dobbs Ferry,  Westchester County',
 'Justice Court Of The Village Of Hastings On Hudson, Westchester\r\nCounty',
 'Justice Court Of The Village Of Kings Point, Nassau County',
 'Justice Court Of The Village Of Massapequa Park',
 'Justice Court Of The Village Of Massapequa Park, Nassau County',
 'Justice Court Of The Village Of Oyster Bay Cove, Nassau County',
 'Justice Court Of The Village Of Piermont, Rockland County',
 'Justice Court Of The Village Of Red Hook',
 'Justice Court Of The Village Of Red Hook, Dutchess County',
 'Justice Court Of The Village Of Sleepy Hollow, Westchester\r\nCounty',
 'Justice Court Of The Village Of Tuckahoe, Westchester County',
 'Justice Court Of Town Of Webster, Monroe County',
 'Justice Court of The Village of Lattingtown',
 'Justice Court of the Town of Clinton, Dutchess County',
 'Justice Court of the Town of Deerpark, Orange County',
 'Justice Court of the Town of East Fishkill, Dutchess County',
 'Justice Court of the Town of Greenburgh, Westchester County',
 'Justice Court of the Town of Henrietta, Monroe County',
 'Justice Court of the Town of Lockport, Niagara County',
 'Justice Court of the Town of New Scotland, Albany County',
 'Justice Court of the Town of Newburgh, Orange County',
 'Justice Court of the Town of Niskayuna, Schenectady County',
 'Justice Court of the Town of Ogden, Monroe County',
 'Justice Court of the Town of Ossining, Westchester County',
 'Justice Court of the Town of Parma, Monroe County',
 'Justice Court of the Town of Penfield, Monroe County',
 'Justice Court of the Town of Pleasant Valley, Dutchess County',
 'Justice Court of the Town of Pound Ridge, Westchester County',
 'Justice Court of the Town of Somers, Westchester County',
 'Justice Court of the Town of Webster, Monroe County',
 'Justice Court of the Village of Bellport, Suffolk County',
 'Justice Court of the Village of Dobbs Ferry, Westchester \r\nCounty',
 'Justice Court of the Village of Farmingdale, Nassau County',
 'Justice Court of the Village of Northport, Suffolk County',
 'Justice Court of the Village of Piermont, Rockland County',
 'Justice Court of the Village of Red Hook, Dutchess County',
 'Norwich City Court',
 'Norwich City Court, Chenango County',
 'Peekskill City Court',
 'Peekskill City Court, Westchester County',
 'Poughkeepsie City Court',
 'Rochester City Court',
 'Supeme Court, Albany County',
 'Supeme Court, Bronx County',
 'Supeme Court, Erie County',
 'Supeme Court, New York County',
 'Supeme Court, Queens County',
 'Supeme Court, Westchester County',
 'Superme Court, Bronx County',
 'Superme Court, New York County',
 'Superme Court, Queens County',
 'Supree Court, New York County',
 'Suprem Court, New York County',
 'Supreme  Court, Kings County',
 'Supreme Court Kings County',
 'Supreme Court Nassau County',
 'Supreme Court New York County',
 'Supreme Court Orange County',
 'Supreme Court, Albany County',
 'Supreme Court, Allegany County',
 'Supreme Court, Bromx County',
 'Supreme Court, Bronx Count',
 'Supreme Court, Bronx County',
 'Supreme Court, Bronx Court',
 'Supreme Court, Broome County',
 'Supreme Court, Cattaraugus County',
 'Supreme Court, Cayuga County',
 'Supreme Court, Chautauqua County',
 'Supreme Court, Chemung County',
 'Supreme Court, Chenango County',
 'Supreme Court, Clinton County',
 'Supreme Court, Columbia County',
 'Supreme Court, Cortland County',
 'Supreme Court, County of Nassau',
 'Supreme Court, Delaware County',
 'Supreme Court, Dutchess County',
 'Supreme Court, Erie County',
 'Supreme Court, Essex County',
 'Supreme Court, Franklin County',
 'Supreme Court, Fulton County',
 'Supreme Court, Genesee County',
 'Supreme Court, Greene County',
 'Supreme Court, Hamilton County',
 'Supreme Court, Herkimer County',
 'Supreme Court, Jefferson County',
 'Supreme Court, King County',
 'Supreme Court, Kings County',
 'Supreme Court, Kings Court',
 'Supreme Court, Kings Coutny',
 'Supreme Court, Kings county',
 'Supreme Court, Livingston County',
 'Supreme Court, Madison County',
 'Supreme Court, Monroe County',
 'Supreme Court, Montgomery County',
 'Supreme Court, NY County',
 'Supreme Court, Nassau County',
 'Supreme Court, New Yiork County',
 'Supreme Court, New York',
 'Supreme Court, New York City',
 'Supreme Court, New York County',
 'Supreme Court, New York County.',
 'Supreme Court, New York Couny',
 'Supreme Court, New York Court',
 'Supreme Court, New York Couty',
 'Supreme Court, New Yorrk County',
 'Supreme Court, Niagara County',
 'Supreme Court, Niagra County',
 'Supreme Court, Oneida County',
 'Supreme Court, Onondaga County',
 'Supreme Court, Ontario County',
 'Supreme Court, Orange County',
 'Supreme Court, Oswego County',
 'Supreme Court, Otsego County',
 'Supreme Court, Putman County',
 'Supreme Court, Putnam County',
 'Supreme Court, Queens County',
 'Supreme Court, Rensselaer County',
 'Supreme Court, Richmond County',
 'Supreme Court, Rockland County',
 'Supreme Court, Saratoga County',
 'Supreme Court, Schenectady County',
 'Supreme Court, Schnectady County',
 'Supreme Court, Schoharie County',
 'Supreme Court, Schuyler County',
 'Supreme Court, Seneca County',
 'Supreme Court, St. Lawrence County',
 'Supreme Court, Steuben County',
 'Supreme Court, Suffolk County',
 'Supreme Court, Suffolk County.',
 'Supreme Court, Suffolk Cuonty',
 'Supreme Court, Sullivan County',
 'Supreme Court, Tioga County',
 'Supreme Court, Tompkins County',
 'Supreme Court, Ulster County',
 'Supreme Court, Warren County',
 'Supreme Court, Washington County',
 'Supreme Court, Wayne County',
 'Supreme Court, Weschester County',
 'Supreme Court, Westchester',
 'Supreme Court, Westchester County',
 'Supreme Court, Wyoming County',
 'Supreme Court, Yates County',
 'Supreme Court. New York County',
 'Supreme Court. Onondaga County',
 'Suprme Court, New York County',
 'Supureme Court, Kings County',
 'Surpeme Court, New York County',
 'Surpeme Court, Orange County',
 "Surrogate's Court, Albany County",
 "Surrogate's Court, Bronx County",
 "Surrogate's Court, Broome County",
 "Surrogate's Court, Delaware County",
 "Surrogate's Court, Dutchess County",
 "Surrogate's Court, Erie County",
 "Surrogate's Court, Essex County",
 "Surrogate's Court, Genesee County",
 "Surrogate's Court, Kings County",
 "Surrogate's Court, Madison County",
 "Surrogate's Court, Monroe County",
 "Surrogate's Court, Nassau County",
 "Surrogate's Court, New York County",
 "Surrogate's Court, Oneida County",
 "Surrogate's Court, Orange County",
 "Surrogate's Court, Putnam County",
 "Surrogate's Court, Queens County",
 "Surrogate's Court, Richmond County",
 "Surrogate's Court, Rockland County",
 "Surrogate's Court, Schenectady County",
 "Surrogate's Court, Suffolk County",
 "Surrogate's Court, Sullivan County",
 "Surrogate's Court, Tompkins County",
 "Surrogate's Court, Ulster County",
 "Surrogate's Court, Westchester County",
 'Tuxedo Town Court, Orange County',
 'Utica City Court',
 'supreme Court, New York County',
 'upreme Court, Kings County'}
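
The spacing, casing, and spelling variants in this set could be collapsed before any lookup. A minimal sketch, assuming a hypothetical `CANONICAL` list and a hypothetical helper `normalize_court_name` (neither comes from Juriscraper or courts-db): it squeezes whitespace, including the embedded `\r\n`, and then falls back to fuzzy matching with the standard library's `difflib`.

```python
import re
from difflib import get_close_matches

# Hypothetical canonical names; a real list would come from courts-db.
CANONICAL = [
    "Supreme Court, Kings County",
    "Supreme Court, New York County",
    "Criminal Court of the City of New York, Queens County",
    "City Court of Mount Vernon, Westchester County",
]

_BY_LOWER = {name.lower(): name for name in CANONICAL}


def normalize_court_name(raw, cutoff=0.9):
    """Return the canonical court name for a scraped string, or None."""
    # Collapse runs of whitespace (spaces, \r\n) and trim stray punctuation.
    cleaned = re.sub(r"\s+", " ", raw).strip(" .,")
    key = cleaned.lower()
    if key in _BY_LOWER:
        return _BY_LOWER[key]
    # Fuzzy fallback for typos such as 'Supureme' or 'upreme'.
    matches = get_close_matches(key, list(_BY_LOWER), n=1, cutoff=cutoff)
    return _BY_LOWER[matches[0]] if matches else None
```

The `cutoff` value is a judgment call. County names that differ by only a few letters can score close to each other, so anything resolved through the fuzzy branch probably deserves a manual review before it is trusted.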

@flooie
Contributor

flooie commented Jan 3, 2024

courts-db has been updated for all of the non-typo examples found. @grossir - I still need to release the version and update our court list in CourtListener.

@flooie
Contributor

flooie commented Jan 3, 2024

@mlissner @grossir

Background

Not sure if you feel strongly about this, Mike, but I need to lay out a few things.

NYMisc is actually a bit easier than first anticipated, and also more difficult.

Traditionally, we would have scraped from this search page to collect Other Courts. We already scrape a number of other courts that way.

The difficult part was that there are hundreds of other courts and no way to identify them without parsing the html/pdf files. This meant that we would have to break the paradigm for miscellaneous courts.

But in unraveling a "bug" that wasn't a bug yesterday, we realized that there is a second listing of opinions provided by the court that we sometimes use. NY also provides the list here

That listing curiously includes MISC opinions, but if you look closely you start to see them out of date order. So I called the New York State Law Reporting Bureau this morning and spoke to a nice New Yorker, who explained that they often add opinions years later if an opinion becomes relevant later. For example, a higher court could be referencing the case, so they decide to publish it then.

If you look at the numbering, you can see that they number opinions sequentially by when they are posted here, not by the date filed.

Additionally, this means we can easily grab the parent_court or child court when we scrape, and can avoid any post-download extraction from text, because they have a handy citation lookup tool that provides that information. We should grab it via a deferring list.

Here is an example, which can easily be parsed and extracted.
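
The deferred lookup described above could work roughly like this. This is a minimal sketch of the idea, not Juriscraper's actual deferring list: `LazyCourtNames`, `fake_citation_lookup`, and the placeholder citations are hypothetical stand-ins, and a real scraper would send a request to the citation lookup tool instead of reading a dict.

```python
from collections.abc import Sequence


class LazyCourtNames(Sequence):
    """List-like object that fetches each value only when it is first read."""

    def __init__(self, seeds, fetcher):
        self._seeds = list(seeds)  # e.g. one citation per opinion
        self._fetcher = fetcher    # callable: seed -> court name
        self._cache = {}

    def __len__(self):
        return len(self._seeds)

    def __getitem__(self, index):
        if index not in self._cache:
            # self._seeds[index] raises IndexError past the end, which is
            # what ends iteration over a Sequence.
            self._cache[index] = self._fetcher(self._seeds[index])
        return self._cache[index]


calls = []  # records which lookups were actually performed


def fake_citation_lookup(citation):
    """Hypothetical stand-in for a request to the citation lookup tool."""
    calls.append(citation)
    table = {
        "cite-a": "Supreme Court, Broome County",
        "cite-b": "Court of Claims",
    }
    return table[citation]


courts = LazyCourtNames(["cite-a", "cite-b"], fake_citation_lookup)
```

Nothing is fetched when `courts` is built. A lookup happens the first time an index is read, and the cached value is reused afterwards, so the extra request is only paid for opinions that are actually processed.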

Proposal

I suggest that we take our parent_courts and build one scraper for each. I think that would be these parent scrapers:

  • juriscraper.opinions.united_states.state.nyfam
  • juriscraper.opinions.united_states.state.nycity
  • juriscraper.opinions.united_states.state.nycounty
  • juriscraper.opinions.united_states.state.nysupreme
  • juriscraper.opinions.united_states.state.nycciv
  • juriscraper.opinions.united_states.state.nyccrim
  • juriscraper.opinions.united_states.state.nysurrogate
  • juriscraper.opinions.united_states.state.nydistrict
  • juriscraper.opinions.united_states.state.nyjustice
  • juriscraper.opinions.united_states.state.nyctclaims

Additionally, I think we should add a field for child_court or specific_court, so we can return the exact court and keep the most granular level of data possible.

So if the nysupreme scraper returns Supreme Court, Broome County, we can say it's nysupctbroom and save it directly to the most granular court field available. This creates a minor change in CourtListener, but lets us continue with our Juriscraper paradigm more or less.
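
The fallback this implies can be sketched as follows. It is an illustration under assumptions, not CourtListener's code: `CHILD_COURT_IDS` and `resolve_court_id` are hypothetical names, using `nysupreme` as the parent id is an assumption based on the module name in the proposal, and the real mapping would presumably live in courts-db.

```python
# Hypothetical lookup table; in practice courts-db would supply it.
# The id "nysupctbroom" is the one mentioned in the proposal.
CHILD_COURT_IDS = {
    "supreme court, broome county": "nysupctbroom",
}


def resolve_court_id(parent_court_id, child_court=None):
    """Prefer the most granular court; fall back to the parent scraper's id."""
    if child_court:
        # Lowercase and squeeze whitespace so minor variants still match.
        key = " ".join(child_court.lower().split())
        return CHILD_COURT_IDS.get(key, parent_court_id)
    return parent_court_id
```

For example, `resolve_court_id("nysupreme", "Supreme Court, Broome County")` returns `"nysupctbroom"`, while an unrecognized or missing child court returns `"nysupreme"`, so sources that never set a child court behave as before.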

@mlissner
Member Author

mlissner commented Jan 4, 2024

spoke to a nice New Yorker

If you got her info, please put it in the CRM for future readers.

Proposal

This all sounds solid to me!

grossir added a commit to grossir/courtlistener that referenced this issue Jan 4, 2024
NY Misc Reporter freelawproject/juriscraper#827 has opinions for a few hundred small courts, which we have grouped into 10 families, each with its own scraper. To avoid both losing data granularity and creating a scraper for each court, juriscraper will return a child_court field for each opinion, which will be transformed into the proper court object. This breaks the usual way, where the court object is obtained from the scraper module name.
grossir added a commit to grossir/courtlistener that referenced this issue Jan 4, 2024
NY Misc Reporter freelawproject/juriscraper#827 has opinions for a few hundred small courts, which we have grouped into 10 families, each with its own scraper. To avoid both losing data granularity and creating a scraper for each court, juriscraper will return a child_court field for each opinion, which will be transformed into the proper court object. This changes the usual way, where the court object is obtained from the scraper module name.

Duplicate checking would not be affected, since it does not use the court object. DupChecker:
- first uses the court url to check if the site is the same
- then checks the downloaded content to check if it changed
grossir added a commit to grossir/courtlistener that referenced this issue Jan 5, 2024
NY Misc Reporter freelawproject/juriscraper#827
has opinions for a few hundred small courts, which we have grouped into
10 families, each with its own scraper. To avoid both
losing data granularity and avoid creating a scraper for each court,
juriscraper will return a child_court field for each opinion,
which will be transformed into the proper court object.
This changes the usual way, where the court object is obtained
from the scraper module name.

Duplicate checking would not be affected, since it does not use
the court object. DupChecker:
- first uses the court url to check if the site is the same
- then checks the downloaded content to check if it changed
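
The two-step check described in this commit message could look roughly like the following. It is a simplified sketch, not CourtListener's DupChecker: `SimpleDupChecker`, its method names, and the in-memory stores are hypothetical, and SHA-1 is an assumed choice of hash. The point it illustrates is that neither step needs the court object.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """Hex digest used as a fingerprint for a page or a downloaded file."""
    return hashlib.sha1(data).hexdigest()


class SimpleDupChecker:
    def __init__(self):
        self._site_hashes = {}       # court url -> hash of last seen listing
        self._seen_content = set()   # hashes of opinions already stored

    def site_changed(self, court_url, page_bytes):
        """Step 1: has the listing at this url changed since the last run?"""
        new_hash = sha1_hex(page_bytes)
        changed = self._site_hashes.get(court_url) != new_hash
        self._site_hashes[court_url] = new_hash
        return changed

    def is_new_opinion(self, content_bytes):
        """Step 2: is this downloaded content something we have not stored?"""
        digest = sha1_hex(content_bytes)
        if digest in self._seen_content:
            return False
        self._seen_content.add(digest)
        return True
```

Because both checks are keyed on the url and on the downloaded bytes, assigning an opinion to a child court instead of the parent court would not change what either check sees.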
grossir added a commit to grossir/courtlistener that referenced this issue Jan 9, 2024
These changes are needed to support freelawproject/juriscraper#827

- If possible, get court object from child_court field passed by nytrial families of scrapers. Otherwise, default to parent court. This does not alter behavior for other sources. Solves freelawproject/juriscraper#827
- Pass opinion.html to site.extract_from_text, if opinion.plain_text does not exist. Solves freelawproject#3549
- Add support to update Opinion object from extract_from_text metadata dict.
grossir added a commit to grossir/courtlistener that referenced this issue Jan 9, 2024
These changes are needed to support freelawproject/juriscraper#827

- If possible, get court object from child_court field passed by nytrial families of scrapers. Otherwise, default to parent court. This does not alter behavior for other sources. Solves freelawproject/juriscraper#827
- Pass opinion.html to site.extract_from_text, if opinion.plain_text does not exist. Solves freelawproject#3549
- Add support to update Opinion object from extract_from_text metadata dict.
- Update juriscraper to 2.5.78
- Update courts-db to 0.10.22
@grossir grossir closed this as completed Jan 10, 2024
@github-project-automation github-project-automation bot moved this from In Progress to Done in @grossir's backlog Jan 10, 2024