Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Medium post importer (from Medium data export) #3264

Merged
merged 1 commit into from
Jan 26, 2024

Conversation

boxydog
Copy link
Contributor

@boxydog boxydog commented Dec 2, 2023

Pull Request Checklist

Resolves: #3261

  • Ensured tests pass and (if applicable) updated functional test output
  • Conformed to code style guidelines by running appropriate linting tools
  • Added tests for changed code
  • Updated documentation for changed code

@boxydog boxydog force-pushed the medium_importer branch 3 times, most recently from d0673b9 to c1f2f92 Compare December 2, 2023 17:17
@boxydog
Copy link
Contributor Author

boxydog commented Dec 2, 2023

I have not quite figured out how to reference a file in the test tree for a unit test.

@boxydog boxydog force-pushed the medium_importer branch 2 times, most recently from c8fcfc7 to 1d30816 Compare December 2, 2023 19:04
@justinmayer justinmayer changed the title Medium post importer (from medium export) Medium post importer (from Medium data export) Jan 15, 2024
@justinmayer justinmayer requested a review from avaris January 15, 2024 09:49
Comment on lines 600 to 604
assert soup, f"{filepath} could not be parsed by beautifulsoup"
kind = "article"

content = soup.find("section", class_="e-content")
assert content, f"{filepath}: Post has no content"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't use assert for flow control.

if condition:
    raise RuntimeError(...)  # or if possible a more relevant exception

Also, you don't handle this error in mediumposts2fields. So if a single file triggers these, whole process is halted. Is that the desired behavior or is it better to warn about problematic ones and continue with what it can process?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be better to continue, but that's not implemented here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed two asserts to raise exceptions.

In general, I use asserts when I expect something to never happen, but I don't remember here exactly because it's been too long.

raw_date = soup.find("time", class_="dt-published")
date = None
if raw_date:
# This datetime can include timezone, e.g., "2017-04-21T17:11:55.799Z"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this an issue? pelican should be able to parse these just fine, if they are written as is.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't remember the issue, but there was one.

It's been so long since I submitted this that I've forgotten the issue, but there was one.

@justinmayer
Copy link
Member

@boxydog: Sincere apologies for the delay in reviewing your contribution. At the same time, maybe code that can't be fully understood six weeks after it's written also can't be maintained well on a long-term basis and perhaps needs to be adjusted or at least better documented/commented. What do you think?

@boxydog
Copy link
Contributor Author

boxydog commented Jan 17, 2024

@boxydog: Sincere apologies for the delay in reviewing your contribution. At the same time, maybe code that can't be fully understood six weeks after it's written also can't be maintained well on a long-term basis and perhaps needs to be adjusted or at least better documented/commented. What do you think?

I see and understand what the code is doing. I think the code is pretty well engineered, can be maintained, and would be useful to people.

What I don't remember is the exact circumstances that caused me to put in the asserts. Was it while I was developing code? Was it some particular data I was working with? There is no way to reconstruct that.

I wanted to import my medium posts, so I thought I'd contribute some code to the project to help others do that. But frankly, I've just lost heart contributing to this project. You should manage it the way you want, and thanks for your project. However, as a contributor, it moves very slowly, all the feedback I get is negative, it's very hard to get anything accepted, and it takes a very long time to go back and forth. If you want me to change something about this PR, feel free to suggest something concrete and doable. If you don't want the PR, just close it.

Best wishes either way.

@boxydog
Copy link
Contributor Author

boxydog commented Jan 17, 2024

all the feedback I get is negative

Sorry, it is not fair to say "all the feedback" is negative. I got thanked for contributions. It's just I know when I submit something, it's going to be a slog, possibly ending in nothing, so I'm describing things more negatively than reality.

Copy link
Member

@avaris avaris left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you.

@avaris
Copy link
Member

avaris commented Jan 24, 2024

However, I can't leave this without saying something.

I suppose my code reviews were the negative feedback, apologies I guess? If you don't mind a bit more feedback, I really hope you would rethink your outlook. I/We do appreciate any contribution, but I have to go through it in detail and point out any issues, clarify any ambiguities. Because once this is merged, based on past experience, more often than not maintainers will be the ones handling any potential future issues with this. I have to make sure I understand what the code is doing and the why (on top of other usual review things).

Regarding the slog, I really wish this was my job, but unfortunately it's not. So things will have to wait for free time, and that's pretty much the case for most of open source projects. As for PRs with 100s of LoC changes (like this one), frequency of right amount of free time with right state of mind gets smaller. Because, you know, life...

@justinmayer
Copy link
Member

Hi @boxydog. Thank you for your feedback.

But frankly, I've just lost heart contributing to this project. You should manage it the way you want, and thanks for your project. However, as a contributor, it moves very slowly, all the feedback I get is negative, it's very hard to get anything accepted, and it takes a very long time to go back and forth.

We do our best to review contributions as quickly as we can. As Avaris said, we are often juggling many other priorities and only have so much time to dedicate to maintaining open-source projects. We also do our best to ensure that pull request submissions are up to the standards to which we hold ourselves, as well as maintainable if and when the original contributor is no longer around to assist with the code they contributed.

More personally, I have enjoyed and valued your participation and contributions. Please let me know how we can do things better to encourage your continued participation in the project.

Copy link
Member

@justinmayer justinmayer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many thanks to @boxydog for the contribution and to @avaris for reviewing.

@justinmayer justinmayer merged commit ff35d26 into getpelican:master Jan 26, 2024
15 checks passed
@boxydog
Copy link
Contributor Author

boxydog commented Jan 26, 2024 via email

@boxydog boxydog deleted the medium_importer branch May 31, 2024 13:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Import posts from medium download
3 participants