Add a documentation on how to write an article #507

Merged
merged 12 commits into from
Sep 20, 2022

simoninithomas
Member

Hey there 👋 ,

Last time, I replied to Carlos to help him create his first article, and I think it's a good idea to have a small tutorial on how to create one.

I added it to the documentation as 8 steps to follow to create your article.

This explanation is a first draft, so let me know if anything could be improved to make it clearer for everyone.

@simoninithomas simoninithomas added the documentation Improvements or additions to documentation label Sep 5, 2022
<span class="fullname">Thomas Simonini</span>
</div>
</a>
</div>
Member

to make this more fool-proof, probably at some point we can implement a simple syntax shortcut to generate those server side

(cc @Pierrci @mishig25 for instance)

No Svelte/mdsvex needed IMO, just a quick regex would probably work?

Contributor

imo, we can put the additional fields inside the frontmatter:
https://github.dev/huggingface/blog/blob/5bf4750b18b2653b703c4e7df66b54481626a116/README.md#L26-L29

hopefully, yaml will be more fool-proof than html?

Member

well the thing is that those html divs are injected in the middle of the content (after the h1)

Maybe something like:

{blog-metadata}

{authors edbeeching ThomasSimonini someone=guest}

or something would do the trick

But again, not worth doing anything in the short term IMO
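
A minimal sketch of the "quick regex" idea discussed above, written in Python for illustration. It assumes the `{authors ...}` shortcut exactly as proposed in this thread; the function names and the generated markup are hypothetical placeholders, not the blog's actual server-side code or author-card HTML.

```python
import re
from html import escape

# Hypothetical shortcut from the thread: {authors user1 user2 user3=guest}
AUTHORS_RE = re.compile(r"\{authors((?:\s+[\w-]+(?:=guest)?)+)\s*\}")


def parse_authors(spec: str) -> list[tuple[str, bool]]:
    """Split 'edbeeching someone=guest' into (username, is_guest) pairs."""
    authors = []
    for token in spec.split():
        name, _, flag = token.partition("=")
        authors.append((name, flag == "guest"))
    return authors


def render_author(username: str, is_guest: bool) -> str:
    """Render one author card. The markup is a simplified placeholder."""
    user = escape(username)
    guest = ' <span class="guest">guest</span>' if is_guest else ""
    return (
        f'<div class="author-card"><a href="/{user}">'
        f"<code>{user}</code>{guest}</a></div>"
    )


def expand_authors(markdown: str) -> str:
    """Replace every {authors ...} shortcut with generated author cards."""

    def _sub(match: re.Match) -> str:
        pairs = parse_authors(match.group(1))
        return "\n".join(render_author(name, guest) for name, guest in pairs)

    return AUTHORS_RE.sub(_sub, markdown)
```

Calling `expand_authors` on a post body leaves everything except the shortcut untouched, so the cards still land right after the h1 as long as the author writes the shortcut there.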

Member Author

In that case, should I merge now? Or would you prefer that we wait for the new version without HTML? 🤔

Member

no, no need to wait for this, it can be done later

Member

@julien-c julien-c left a comment

good idea to add this!

simoninithomas and others added 4 commits September 5, 2022 16:07
Co-authored-by: Julien Chaumond <julien@huggingface.co>
Co-authored-by: Julien Chaumond <julien@huggingface.co>
Co-authored-by: Julien Chaumond <julien@huggingface.co>
@simoninithomas
Member Author

simoninithomas commented Sep 5, 2022

Thanks for the review, @julien-c. I updated the elements you mentioned and did some cleanup. Before merging, I'm waiting for Mishig's and Pierric's views on replacing the HTML part with markdown.

Member

@osanseviero osanseviero left a comment

Great idea! I would also mention using smaller/compressed images to avoid an expensive/slow user experience.

simoninithomas and others added 4 commits September 6, 2022 11:13
Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>
Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>
Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>
Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>
@simoninithomas
Member Author

Thanks for the review, @osanseviero. I'm adding the use of smaller/compressed images to the advice.

@osanseviero
Member

Feel free to merge this and we can iterate afterwards :)

2️⃣ Create a md (markdown) file, **use a short file name**.
For instance, if your title is "Introduction to Deep Reinforcement Learning", the md file name could be `intro-rl.md`. This is important because the **file name will be the blogpost's URL**.

3️⃣ Create a new folder in `assets`. Check the last folder number and name your folder `number_md-name`, for instance `101_intro-rl`; this folder will contain **all your illustrations and thumbnail**. The folder number is mostly for (rough) ordering purposes, so it's no big deal if two concurrent articles use the same number.
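
As a rough illustration of step 3️⃣, the sketch below computes a folder name following the `number_md-name` convention. It assumes the numbered folders sit directly inside the assets directory; the helper name `next_assets_folder` is hypothetical and not part of the repository.

```python
import re
from pathlib import Path


def next_assets_folder(assets_dir: str, md_name: str) -> str:
    """Return 'N_md-name', where N is one more than the highest numeric prefix found."""
    prefix_re = re.compile(r"^(\d+)_")
    numbers = [
        int(m.group(1))
        for p in Path(assets_dir).iterdir()
        # Only directories named like '101_something' count toward the sequence.
        if p.is_dir() and (m := prefix_re.match(p.name))
    ]
    return f"{max(numbers, default=0) + 1}_{md_name}"
```

For example, if the highest existing folder starts with `100_`, calling `next_assets_folder("assets", "intro-rl")` gives `101_intro-rl`. The helper only sees folders that already exist locally, so two authors working in parallel can still get the same number.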
Contributor

@stas00 stas00 Sep 19, 2022

Check the last folder number and name your folder number_md-name

This creates a huge overhead: multiple in-progress blog posts constantly fight over the next number in the sequence. A post that isn't published quickly keeps getting bumped by faster ones, and its author has to rename and move all the files, often more than once.

This has already been discussed multiple times among the authors, and the agreement was to drop the number altogether. The only reason it hasn't happened yet is inertia. I have already published the first post without the number.

This prefix number has no functional use. The article's unique URL is sufficient as a unique asset directory name.

Ideally, we should do a single PR that removes this prefix id from all posts, to stop the pattern from perpetuating. If you'd like, I can do the honours.

Member

From the commented line:

The folder number is mostly for (rough) ordering purposes, so it's no big deal if two concurrent articles use the same number.

😉

Contributor

If I may share where I feel this convention is counter-productive:

  1. Those of us with computer naming OCD will want a unique number.
  2. Most of the time we do monkey-see-monkey-do, so we, the authors, will try to get a unique number.
  3. If prefix numbers are shared, one can easily add files to the wrong assets folder, and file completion isn't as quick.

May I ask what ordering purpose the assets folders serve? Perhaps I'm missing some practical use case here.

Member Author

@simoninithomas simoninithomas Sep 20, 2022

My two cents:

  1. People just need to search the last article published on _blog.yml they'll find the number.

2-3. I'm not sure I follow your point: the assets folder is named number_title, like 101_deep-q-learning, so I don't see how people could add files to the wrong assets folder. And if it happens, they will quickly notice, since their figures/images won't display.

Our blog is still small in terms of articles published (2 to 3 per week). For now, I'm going to merge this documentation because it helps me get new writers publishing quickly without repeating myself every time 😄. We can then improve it iteratively, updating how we publish articles based on your feedback.

Contributor

@stas00 stas00 Sep 20, 2022

People just need to search the last article published on _blog.yml they'll find the number.

You're completely missing the failing scenario. Please consider the following:

  1. Each new blog writer finds the same unused next number - indeed so. Let's take 3 new blog posts happening in parallel.
  2. Then each creates a PR with the same number - a conflict ensues. Author C has no idea that Author A and B have already chosen this number in their own PRs as that number is not claimed until the blog post is merged.

Basically you have a race condition here.

Does that help?

And this is not a hypothetical situation. On a blog post that took a long time to develop, like the BLOOM one, I had to change the number and move files at least 3 times, until I gave up. And the same happened to other articles.
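
The race can be reproduced in a few lines of Python. This is only an illustration of the scenario above: the merged numbers and the three slugs are made up, and `next_prefix` is a hypothetical stand-in for "look up the last number on main".

```python
def next_prefix(merged_prefixes: list[int]) -> int:
    """Pick the next folder number from what is already merged on main."""
    return max(merged_prefixes, default=0) + 1


# Numbers visible on main. Numbers used in open PRs are invisible to other authors.
merged = [99, 100]

# Hypothetical slugs for three posts drafted in parallel.
drafts = {"A": "bloom-training", "B": "intro-rl", "C": "deep-q-learning"}

# Every author reads the same snapshot of main, so every author picks the same number.
numbered = {author: f"{next_prefix(merged)}_{slug}" for author, slug in drafts.items()}
prefixes = {name.split("_", 1)[0] for name in numbered.values()}

# Slug-only folder names, as proposed in this thread, need no shared counter.
slug_only = set(drafts.values())

print(numbered)
print(prefixes)
```

All three drafts end up with the same prefix, because the number is not claimed until a post is merged. The slug-only names stay distinct without any coordination.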

Member Author

I understand your scenario. In the RL team, when we write a joint article, we do everything on a Notion page for collaboration and transfer it once everything is done.

But I agree: even with this Notion "solution" we have a lot of back and forth (especially changing the folder name and then the path of every image), and that's frustrating and time-consuming.

For the rest, I think we need to open an issue to see how we can modify the process. My goal with this first doc is to demystify, for newcomers, how to write an article, because it's not intuitive.

But for me, the process will change when we use a blog engine or a WYSIWYG editor like Medium, Hashnode ❤️, Ghost, or something else. I don't know whether that's planned.

Contributor

Good, I'm glad to hear we are on the same page, @simoninithomas

I still think dropping the id altogether is the simplest solution as it's not serving anything in the design or use of the blog, but as you said it should be discussed in an Issue.

Contributor

Issue opened: #530

@simoninithomas simoninithomas merged commit af289d5 into main Sep 20, 2022
@simoninithomas simoninithomas deleted the ThomasSimonini/NewDoc branch September 20, 2022 14:49
@stas00 stas00 mentioned this pull request Sep 20, 2022
5 participants