Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Ukraine production parser #5770

Merged
merged 9 commits into from
Aug 24, 2023
Merged

Conversation

Darkvyl
Copy link
Contributor

@Darkvyl Darkvyl commented Aug 22, 2023

Issue

Issue #3981. Not sure if it should be closed, still have to look into ENTSO-E parser for consumption data.

Description

Fixes the production parser for UA. Have to mock User-Agent for parser and switch to use http.client instead of requests, because requests adds something to header that makes Ukrainian server recognize parser as a bot.
Also, not sure why this parser was not able to parse past dates, seems pretty straightforward to me, so fixed that too.

Double check

  • I have tested my parser changes locally with poetry run test_parser "zone_key"
  • I have tested my parser changes locally with poetry run test_parser "zone_key" --target_datetime "date"
  • I have run pnpx prettier --write . (with no result since no changes to front-end were done) and poetry run format to format my changes.

@VIKTORVAV99 VIKTORVAV99 self-requested a review August 22, 2023 22:19
Copy link
Member

@VIKTORVAV99 VIKTORVAV99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really nice!
Thanks for working on it!

There are a few things that needs to be adressed and a few suggestions. Once you have taken a look at those I'll do another review but this looks very promising!

parsers/UA.py Outdated Show resolved Hide resolved
parsers/UA.py Outdated Show resolved Hide resolved
parsers/UA.py Outdated
Comment on lines 52 to 60
conn = http.client.HTTPSConnection("ua.energy")
payload = f"action=get_data_oes&report_date={target_date}&type=day"
headers = {
"Content-Type": "application/x-www-form-urlencoded",
"User-Agent": "PostmanRuntime/7.32.3",
}
conn.request("POST", "/wp-admin/admin-ajax.php", payload, headers)
res = conn.getresponse()
response = json.loads(res.read().decode("utf-8"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Been trying to wrap my head around why this works and request don't but without success so this is fine but maybe add a comment or two as to why we are using it instead of requests?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like this should work with requests, does it really not?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I replicated the code and changed the headers to what is used here in requests but I'm just getting 403 http codes.

And honestly I can't explain why right now.

If I have the time someday I might dig into it again because as far as I know it should work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. Interesting thought, requests work in small PoC script!
image

from requests import Session

url = "https://ua.energy/wp-admin/admin-ajax.php"
postdata = {"action": "get_data_oes", "report_date": "20.07.2023", "type": "day"}

response = Session().post(
    url, postdata, headers={"User-Agent": "PostmanRuntime/7.32.3"}
)

print(response.json())

But when I copy exact same lines to parser, poetry run fails because of 403.
I have no idea how and why it works like this, so the maximum that I can do is add a comment saying: #TODO: someone, please look at this and explain why it works like this.

Copy link
Member

@VIKTORVAV99 VIKTORVAV99 Aug 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are something going on that's for sure.
But I'd suggest putting a comment like this:

# We are using HTTP.client because Request returns 403 http codes.
# TODO: Look into why requests are returning 403 http codes while HTTP.client works.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I got an HTTP 403 Forbidden when running that proof-of-concept script in my computer for the first time (and also when I retry it). I'm located in Canada with a normal Canadian IP address, Python 3.6.9 and requests==2.27.1.

The error message says that it's from Cloudflare, so maybe they have paranoia settings tuned high in Cloudflare, I wouldn't be surprised.

Error message includes:

Why have I been blocked?
This website is using a security service to protect itself from online attacks. The action you just performed triggered the security solution. There are several actions that could trigger this block including submitting a certain word or phrase, a SQL command or malformed data.

Which could be anything.

Going to https://ua.energy/ in a browser also gives me that error message.

I would guess that ultimately it's not a question of using requests or http.client, and that you got it working using http.client or the PoC script is a fluke.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that maybe this service just blocks all connections not from Europe AND connection from all over the world that uses most popular library in python to make requests, requests. Strange thing to do, why would you create an API and block developers from accessing it automatically, but ok, we'll work with what they give us. I'm located in Poland and while working on that was using Ukrainian VPN. There is no difference in behavior, same problems with requests and same successes with http.client.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I can't explain it either, but ua.energy is also available to me from Sweden.

Seems very strange.

parsers/UA.py Outdated
@@ -75,10 +79,21 @@ def fetch_production(
row["production"][v] = 0.0
Copy link
Member

@VIKTORVAV99 VIKTORVAV99 Aug 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While not related to your changes we have a standardized way to do this now with the new EventList classes.

You would need to import and use the following to change it:
from electricitymap.contrib.lib.models.event_lists import (
ProductionBreakdownList,
)
from electricitymap.contrib.lib.models.events import (
ProductionMix,
StorageMix,
)

This is more of a optional feature, but I'd like to change it in here to avoid another PR.
If you don't want to do it would you have any objections to me changing it in your PR personally?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you can, please, go ahead and make changes in my PR. I tried to compare it to other parses to get a grasp of how it should be used, but I'm not sure, so, I think it will be better if you do it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do so tomorrow.

parsers/UA.py Outdated Show resolved Hide resolved
@VIKTORVAV99
Copy link
Member

VIKTORVAV99 commented Aug 24, 2023

I pushed my changes, let me know if there is any questions about them.

After you have taken a look I'll approve and merge it.

parsers/UA.py Outdated Show resolved Hide resolved
parsers/UA.py Outdated Show resolved Hide resolved
Copy link
Member

@VIKTORVAV99 VIKTORVAV99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for figuring out a fix for this parser and working with me on it!

@VIKTORVAV99 VIKTORVAV99 merged commit 2369889 into electricitymaps:master Aug 24, 2023
17 checks passed
@Darkvyl Darkvyl deleted the fix-ua branch August 24, 2023 13:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants