Banyule Victoria AU not working #302
I authored that source. I'm not sure if the council has changed the interface for the new bins; it's a bit redundant at the moment anyway, since it seems OpenCities (Banyule and a number of other councils have outsourced their websites to them) has recently implemented anti-scraping features on that API. The anti-scraping protection is a little nasty: in my testing, the response I get from the API without some magic cookies redirects to a heavily obfuscated JavaScript file. PR #250 was reverted in #256 for this reason. In my original PR (#160) we discussed sharing some common code for OpenCities-sourced APIs, since they seem identical, but now it means there are probably a range of sources that are broken by one feature change on their end.
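For anyone wanting to check whether a given council is currently blocking, a minimal probe along these lines should distinguish a normal JSON response from the redirect described above. This is a sketch only; `probe` is an illustrative helper, not part of the source:

```python
import requests

URL = "https://www.banyule.vic.gov.au/ocapi/Public/myarea/wasteservices"

def probe(url: str) -> None:
    """Illustrative helper: report whether the API answers with JSON
    or with the redirect to the anti-scraping JavaScript challenge."""
    r = requests.get(url, params={"ocsvclang": "en-AU"}, allow_redirects=False)
    if r.is_redirect:
        print("redirected to:", r.headers.get("location"))
        return
    try:
        r.json()
        print("got JSON back; scraping should work")
    except ValueError:  # covers requests' JSONDecodeError
        print("non-JSON response; likely the anti-scraping challenge")

probe(URL)
```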
It looks like this can be made to work if you just make the final API call directly with the geolocationid and browser-like headers:

```python
import json
from datetime import datetime

import requests
from bs4 import BeautifulSoup

URL = "https://www.banyule.vic.gov.au/ocapi/Public/myarea/wasteservices"
HEADERS = {
    "user-agent": "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/112.0",
    "referer": "https://www.banyule.vic.gov.au/Waste-environment/Bin-collection",
}
GEOLOC = "4f7ebfca-1526-4363-8b87-df3103a10a87"  # borrowed from banyule_vic_gov_au.py
PARAMS = {
    "geolocationid": GEOLOC,
    "ocsvclang": "en-AU",
}

s = requests.Session()
r = s.get(URL, headers=HEADERS, params=PARAMS)

# The API wraps an HTML fragment in a JSON envelope; parse both layers.
schedule = json.loads(r.text)
soup = BeautifulSoup(schedule["responseContent"], "html.parser")
x = soup.findAll("div", {"class": "note"})
y = soup.findAll("div", {"class": "next-service"})
a = [item.text.strip() for item in x]
b = [datetime.strptime(item.text.strip(), "%a %d/%m/%Y").date() for item in y]
z = list(zip(a, b))
print(z)
```

Output:

```
[('Food organics and garden organics', datetime.date(2023, 4, 17)), ('Recycling', datetime.date(2023, 4, 17)), ('Rubbish', datetime.date(2023, 4, 24))]
```

Implementing this change probably means anyone who was previously using it will have to change their config, and spend a few minutes extracting the geolocationid the website is using for their address. I'd assume that's acceptable if it's currently not working?
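For extracting the geolocationid, the simplest route is probably to watch the browser's network tab while searching for your address on the council site. In code, a lookup sketch might look like the following, assuming the address-search endpoint and its response fields (`Items`, `Id`, `AddressSingleLine`) follow the pattern other OpenCities councils use; both the URL and the field names are assumptions here, so verify them against the live site:

```python
import requests

# Assumed OpenCities address-search endpoint; verify the exact URL in your
# browser's network tab before relying on it.
SEARCH_URL = "https://www.banyule.vic.gov.au/api/v1/myarea/search"

# "1 Example St, IVANHOE" is a placeholder address.
r = requests.get(SEARCH_URL, params={"keywords": "1 Example St, IVANHOE"})
for item in r.json().get("Items", []):
    # Each hit pairs a formatted address with its geolocationid ("Id").
    print(item.get("AddressSingleLine"), "->", item.get("Id"))
```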
Ok, this sounds very interesting. I think changing the config is not a big issue, because no one uses this source.
From memory the issue is somewhat transient, coming and going at the whim of the back-end. I used the source for a few weeks before getting redirects almost 100% of the time. When it broke I assumed an update on their end; maybe it's been backed off since, or I (and others) got unlucky?
Same script now generates errors, so maybe I got lucky when looking at this last week. |
I may be using the integration incorrectly, but I think there may be an issue: the local council has recently introduced another bin, so the calendar may not be pulling the data the same way?
My waste collection calendar is as follows:
```yaml
waste_collection_schedule:
  sources:
    - name: banyule_vic_gov_au
      args:
        street_address: an address in, IVANHOE
      customize:
        - type: recycling
          alias: Fogo
          show: true
          icon: mdi:recycle
          picture: false
          calendar_title: Recycling
```
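If the new bin is what the script output above labels "Food organics and garden organics", the customize block may need an entry keyed on the exact type string the source emits. A sketch of what that could look like; the type string is assumed from the script output earlier in this thread, and the icon is just a suggestion:

```yaml
waste_collection_schedule:
  sources:
    - name: banyule_vic_gov_au
      args:
        street_address: an address in, IVANHOE
      customize:
        # type must match the label the source emits verbatim (assumed here
        # from the script output earlier in this thread)
        - type: Food organics and garden organics
          alias: Fogo
          icon: mdi:leaf
```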