-
-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Medium post importer (from Medium data export) #3264
Conversation
d0673b9
to
c1f2f92
Compare
I have not quite figured out how to reference a file in the test tree for a unit test. |
c8fcfc7
to
1d30816
Compare
pelican/tools/pelican_import.py
Outdated
assert soup, f"{filepath} could not be parsed by beautifulsoup" | ||
kind = "article" | ||
|
||
content = soup.find("section", class_="e-content") | ||
assert content, f"{filepath}: Post has no content" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't use assert
for flow control.
if condition:
raise RuntimeError(...) # or if possible a more relevant exception
Also, you don't handle this error in mediumposts2fields
. So if a single file triggers these, whole process is halted. Is that the desired behavior or is it better to warn about problematic ones and continue with what it can process?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be better to continue, but that's not implemented here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed two asserts to raise exceptions.
In general, I use asserts when I expect something to never happen, but I don't remember here exactly because it's been too long.
raw_date = soup.find("time", class_="dt-published") | ||
date = None | ||
if raw_date: | ||
# This datetime can include timezone, e.g., "2017-04-21T17:11:55.799Z" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this an issue? pelican
should be able to parse these just fine, if they are written as is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't remember the issue, but there was one.
It's been so long since I submitted this that I've forgotten the issue, but there was one.
1d30816
to
d6a33f1
Compare
@boxydog: Sincere apologies for the delay in reviewing your contribution. At the same time, maybe code that can't be fully understood six weeks after it's written also can't be maintained well on a long-term basis and perhaps needs to be adjusted or at least better documented/commented. What do you think? |
I see and understand what the code is doing. I think the code is pretty well engineered, can be maintained, and would be useful to people. What I don't remember is the exact circumstances that caused me to put in the asserts. Was it while I was developing code? Was it some particular data I was working with? There is no way to reconstruct that. I wanted to import my medium posts, so I thought I'd contribute some code to the project to help others do that. But frankly, I've just lost heart contributing to this project. You should manage it the way you want, and thanks for your project. However, as a contributor, it moves very slowly, all the feedback I get is negative, it's very hard to get anything accepted, and it takes a very long time to go back and forth. If you want me to change something about this PR, feel free to suggest something concrete and doable. If you don't want the PR, just close it. Best wishes either way. |
Sorry, it is not fair to say "all the feedback" is negative. I got thanked for contributions. It's just I know when I submit something, it's going to be a slog, possibly ending in nothing, so I'm describing things more negatively than reality. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you.
However, I can't leave this without saying something. I suppose my code reviews were the negative feedback, apologies I guess? If you don't mind a bit more feedback, I really hope you would rethink your outlook. I/We do appreciate any contribution, but I have to go through it in detail and point out any issues, clarify any ambiguities. Because once this is merged, based on past experience, more often than not maintainers will be the ones handling any potential future issues with this. I have to make sure I understand what the code is doing and the why (on top of other usual review things). Regarding the slog, I really wish this was my job, but unfortunately it's not. So things will have to wait for free time, and that's pretty much the case for most of open source projects. As for PRs with 100s of LoC changes (like this one), frequency of right amount of free time with right state of mind gets smaller. Because, you know, life... |
Hi @boxydog. Thank you for your feedback.
We do our best to review contributions as quickly as we can. As Avaris said, we are often juggling many other priorities and only have so much time to dedicate to maintaining open-source projects. We also do our best to ensure that pull request submissions are up to the standards to which we hold ourselves, as well as maintainable if and when the original contributor is no longer around to assist with the code they contributed. More personally, I have enjoyed and valued your participation and contributions. Please let me know how we can do things better to encourage your continued participation in the project. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your maintenance of this project. I am now running two web
sites on it. And I agree, the burden of maintenance falls to the
maintainers. You should run the project any way you like.
And, I appreciate your feedback.
I am just giving you my honest feedback as a contributor. I think picky and
fast would be fine. Slow and less picky would also be okay. It's slow and
picky that is discouraging.
Since I think you want to stay picky, I might suggest picky and fast. So,
take weeks or months off if you have to, but once you start a review,
respond to that PR within (say) 24 hours over and over until it's merged or
closed. If we went back and forth for a week, but it was every day, I think
that would probably be okay? It's just the feeling that it's going to be
back and forth at irregular intervals for months that is not the greatest.
Or, you could pick one PR to make a priority to get in and focus on it
until it's in, then repeat, at whatever interval you choose.
Another possibility might be to take over PRs to finish the last 5% the way
you want them, to get them merged faster.
Anything you can do to increase velocity, and also perceived velocity. I
welcome your ideas on how to do that.
You're not getting paid, I am also not getting paid. You're going to run
the project how you see fit, and I support that.
But, I'm more likely to work on something if it feels like it will move in
a predictable way.
P.S. this seems like a strange place to talk about this, but I don't know
where else, so here we are.
|
Pull Request Checklist
Resolves: #3261