Add a documentation on how to write an article #507

Merged
merged 12 commits into from
Sep 20, 2022
61 changes: 61 additions & 0 deletions README.md
@@ -1,6 +1,67 @@
# The Hugging Face Blog Repository 🤗
This is the official repository of the [Hugging Face Blog](https://hf.co/blog).

## How to write an article? 📝
1️⃣ Create a branch `YourName/Title`

2️⃣ Create an md (Markdown) file and **use a short file name**.
For instance, if your title is "Introduction to Deep Reinforcement Learning", the md file name could be `intro-rl.md`. This is important because the **file name will be the blogpost's URL**.

3️⃣ Create a new folder in `assets`. Check the last folder number and name your folder `number_md-name`, for instance `101_intro-rl`; this folder will contain **all your illustrations and thumbnail**. The folder number is mostly for (rough) ordering purposes, so it's no big deal if two concurrent articles use the same number.
Contributor

@stas00 stas00 Sep 19, 2022

> Check the last folder number and name your folder `number_md-name`

This creates a lot of overhead: authors of in-progress posts keep fighting over the next number in the sequence. A post that isn't published quickly keeps getting bumped by faster ones, and its author has to rename the folder and move all the files again, often several times.

This has already been discussed multiple times among the authors, and the agreement was to drop the number altogether. The only reason it hasn't happened yet is inertia. I have already published the first post without the number.

The prefix number has no functional use. The article's unique URL is sufficient as a unique asset directory name.

Ideally we would do one PR that removes the prefix from all posts, to stop this pattern from perpetuating. If you'd like, I can do the honours.

Member

From the commented line:

> The folder number is mostly for (rough) ordering purposes, so it's no big deal if two concurrent articles use the same number.

😉

Contributor

If I may share where I feel this convention is counter-productive:

  1. Those of us with computer naming OCD will want a unique number.
  2. Most of the time we do monkey-see-monkey-do, so we, the authors, will try to get a unique number.
  3. If prefix numbers are shared, it is easy to add files to the wrong assets folder, and file completion isn't as quick.

May I ask what ordering purpose the assets folders serve? Perhaps I'm missing some practical use case here.

Member Author

@simoninithomas simoninithomas Sep 20, 2022

My two cents:

  1. People just need to look up the last article published in `_blog.yml` to find the number.

2-3. I'm not sure I follow, since the assets folder is named `number_title`, like `101_deep-q-learning`, so I don't see how people can add files to the wrong assets folder. And if it happens, they will notice quickly, because their figures/images won't display.

The blog is still small in terms of published articles (2 to 3 per week). For now I'm going to merge this documentation, because it helps me get new writers publishing quickly without repeating myself every time 😄. We can then improve it iteratively, updating how we publish articles based on your feedback.

Contributor

@stas00 stas00 Sep 20, 2022

> People just need to search the last article published on _blog.yml they'll find the number.

You're completely missing the failing scenario. Please consider the following:

  1. Each new blog writer finds the same unused next number. Let's say 3 new blog posts are being written in parallel.
  2. Each author then creates a PR with that same number, and a conflict ensues. Author C has no idea that Authors A and B have already chosen this number in their own PRs, because a number is not claimed until the blog post is merged.

Basically you have a race condition here.

Does that help?

And this is not a hypothetical situation. On a post with a long development time, like the BLOOM blog post, I had to change the number and move files at least 3 times, until I gave up. The same happened with other articles.
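
The race can be reproduced in a few lines. This is only an illustrative sketch: `next_asset_folder`, the folder names, and the three authors are made up for the example and are not code from the repository.

```python
import re


def next_asset_folder(existing_folders, slug):
    """Return 'N_slug', where N is one more than the highest numeric prefix seen."""
    numbers = [
        int(m.group(1))
        for name in existing_folders
        if (m := re.match(r"(\d+)_", name))
    ]
    return f"{max(numbers, default=0) + 1}_{slug}"


# Hypothetical snapshot of `assets/` on main that all three authors branch from.
main_snapshot = ["99_foo", "100_bar"]

# Each author computes the "next" number independently, before anyone has merged.
claims = {
    author: next_asset_folder(main_snapshot, slug)
    for author, slug in [("A", "post-a"), ("B", "post-b"), ("C", "post-c")]
}

# All three claims share one prefix, so only the first PR to merge keeps it.
prefixes = {folder.split("_", 1)[0] for folder in claims.values()}
print(claims, prefixes)
```

Nothing in a branch can see the numbers picked in other open PRs, so the collision only shows up at merge time.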

Member Author

I understand your scenario. What we do in the RL team when we write a joint article is to do everything on a Notion page for collaboration, and transfer it once everything is done.

But I agree that even with this Notion "solution" we have a lot of back and forth (especially changing the folder name and then changing the path for every image). And that's frustrating and time-consuming.

For the rest, I think we need to open an issue to see how we can modify the process. My goal with this first doc is to demystify for newcomers how to write an article, because it's not intuitive.

But for me the process will change when we use a blog engine or a WYSIWYG editor like Medium, (Hashnode ❤️ ), Ghost or something else. Idk if that's planned or not.

Contributor

Good, I'm glad to hear we are on the same page, @simoninithomas

I still think dropping the id altogether is the simplest solution as it's not serving anything in the design or use of the blog, but as you said it should be discussed in an Issue.

Contributor

Issue opened: #530


🖼️: For images, **try to keep files small** to avoid slow page loads:
- Use JPEG instead of PNG.
- Compress your images, for example with this website: https://www.iloveimg.com/fr/compresser-image
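
A quick way to apply the two tips above before opening a PR is to scan your assets folder for PNG files and oversized images. The sketch below uses only the Python standard library. `find_heavy_images` and the 200 KB threshold are assumptions made for illustration, not a rule of this repository.

```python
from pathlib import Path


def find_heavy_images(folder, max_kb=200):
    """Return (file name, size in KB, hint) for images worth converting or compressing."""
    findings = []
    for path in sorted(Path(folder).rglob("*")):
        if not path.is_file():
            continue
        size_kb = path.stat().st_size / 1024
        suffix = path.suffix.lower()
        if suffix == ".png":
            # PNG files are flagged regardless of size, per the first tip.
            findings.append((path.name, round(size_kb), "png: consider jpeg"))
        elif suffix in {".jpg", ".jpeg", ".gif"} and size_kb > max_kb:
            # Other images are flagged only above the (assumed) size threshold.
            findings.append((path.name, round(size_kb), "large: consider compressing"))
    return findings


# Example call, assuming your folder is named as in step 3:
# for name, kb, hint in find_heavy_images("assets/101_intro-rl"):
#     print(f"{name}: {kb} KB ({hint})")
```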

4️⃣ Copy and paste this into your md file and change these elements:
- title
- thumbnail
- Published (change the date)
- The author card:
  - `href="/your huggingface username"`
  - `src`: your Hugging Face picture; to get it, right-click your Hugging Face picture and copy the link
  - `<span class="fullname">`: your name

```
---
title: "PUT YOUR TITLE HERE"
thumbnail: /blog/assets/101_decision-transformers-train/thumbnail.gif
---

# Train your first Decision Transformer

<div class="blog-metadata">
<small>Published September 02, 2022.</small>
<a target="_blank" class="btn no-underline text-sm mb-5 font-sans" href="https://github.com/huggingface/blog/blob/main/decision-transformers-train.md">
Update on GitHub
</a>
</div>

<div class="author-card">
<a href="/edbeeching">
<img class="avatar avatar-user" src="https://aeiljuispo.cloudimg.io/v7/https://s3.amazonaws.com/moonup/production/uploads/1644220542819-noauth.jpeg?w=200&h=200&f=face" title="Gravatar">
<div class="bfc">
<code>edbeeching</code>
<span class="fullname">Edward Beeching</span>
</div>
</a>
<a href="/ThomasSimonini">
<img class="avatar avatar-user" src="https://aeiljuispo.cloudimg.io/v7/https://s3.amazonaws.com/moonup/production/uploads/1632748593235-60cae820b1c79a3e4b436664.jpeg?w=200&h=200&f=face" title="Gravatar">
<div class="bfc">
<code>ThomasSimonini</code>
<span class="fullname">Thomas Simonini</span>
</div>
</a>
</div>
Member

to make this more fool-proof, probably at some point we can implement a simple syntax shortcut to generate those server side

(cc @Pierrci @mishig25 for instance)

No Svelte/mdsvex needed IMO, just a quick regex would probably work?

Contributor

imo, we can put the additional fields inside the frontmatter:
https://github.dev/huggingface/blog/blob/5bf4750b18b2653b703c4e7df66b54481626a116/README.md#L26-L29

hopefully, yaml will be more fool-proof than html?

Member

well the thing is that those html divs are injected in the middle of the content (after the h1)

Maybe something like:

{blog-metadata}

{authors edbeeching ThomasSimonini someone=guest}

or something would do the trick

But again, not worth doing anything in the short term IMO

Member Author

In that case, do I merge now? Or would you prefer that we wait for the new version without HTML? 🤔

Member

no, no need to wait for this, it can be done later

```
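
The review discussion on the template floats a `{authors ...}` shortcut that a regex would expand into the author-card HTML. A minimal Python sketch of that idea follows. The shortcut is only a proposal in this thread, and `render_author_card`, `expand_authors`, and the `profiles` lookup are hypothetical names, not part of the blog's build.

```python
import re
from html import escape


def render_author_card(username, fullname, avatar_url):
    """Return one <a>...</a> author entry shaped like the template's author card."""
    # escape() turns "&" in the avatar URL into "&amp;", which is valid in an attribute.
    return (
        f'<a href="/{escape(username)}">\n'
        f'<img class="avatar avatar-user" src="{escape(avatar_url)}" title="Gravatar">\n'
        '<div class="bfc">\n'
        f"<code>{escape(username)}</code>\n"
        f'<span class="fullname">{escape(fullname)}</span>\n'
        "</div>\n"
        "</a>"
    )


# Matches a whole line such as "{authors edbeeching ThomasSimonini}".
# Tokens like "someone=guest" do not match, so such lines are left untouched.
AUTHORS_SHORTCUT = re.compile(r"^\{authors((?: +[\w-]+)+)\}$", re.MULTILINE)


def expand_authors(markdown, profiles):
    """Replace each `{authors user1 user2}` line with a full author-card <div>.

    `profiles` maps username -> (fullname, avatar_url); unknown users raise KeyError.
    """
    def replace(match):
        cards = [render_author_card(u, *profiles[u]) for u in match.group(1).split()]
        return '<div class="author-card">\n' + "\n".join(cards) + "\n</div>"

    return AUTHORS_SHORTCUT.sub(replace, markdown)
```

The full name and avatar URL cannot be derived from the username alone, which is why the sketch takes an explicit `profiles` mapping. A real implementation would presumably look these up server-side.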

5️⃣ Then you can add your content. The blog uses Markdown, so if you wrote your text in Notion, just press Control+Shift+V to copy/paste it as Markdown.

6️⃣ Modify `_blog.yml` to add your blogpost.

7️⃣ When your article is ready, **open a pull request**.

8️⃣ The article will be **published automatically when you merge your pull request**.

## How to get a responsive thumbnail?
1️⃣ Create a `1300x650` image