Ko perf train gpu one #25250
Conversation
docs/source/ko/perf_train_gpu_one.md
Outdated
> This guide focuses on training large models efficiently on a single GPU. These approaches are still valid if you have access to a machine with multiple GPUs but you will also have access to additional methods outlined in the [multi-GPU section](perf_train_gpu_many).
>
> In this section we have a look at a few tricks to reduce the memory footprint and speed up training for large models and how they are integrated in the [`Trainer`] and [🤗 Accelerate](https://huggingface.co/docs/accelerate/). Each method can improve speed or memory usage which is summarized in the table below:
chatGPT rendered this literally as "메모리 풋프린트" (a transliteration of "memory footprint"). Is it better to keep that transliteration, or would it be acceptable to translate it as "메모리 사용량" (memory usage)?
I think translating it as "메모리 사용량" (memory usage) would be fine! If you write both together, as "메모리 사용량(memory footprint)", the meaning will be even easier to understand.
docs/source/ko/perf_train_gpu_one.md
Outdated
> ### 8-bit Adam
>
> Instead of aggregating optimizer states like Adafactor, 8-bit Adam keeps the full state and quantizes it. Quantization means that it stores the state with lower precision and dequantizes it only for the optimization. This is similar to the idea behind FP16 training where using variables with lower precision saves memory.
How should "quantize" be translated here? Below is what I asked chatGPT.

Question: Translating "quantizes" here as "양자화한다" doesn't convey the meaning well to me. How should I understand "quantizes"?

Answer: In this context, "quantizes" means converting or compressing data into a form that uses less memory. Here it can be read as converting the optimizer's state so that it uses less memory: by representing the data with fewer bits, you save memory and improve compute efficiency. This approach is similar to the idea behind FP16 training, which also uses lower precision (16-bit floating point) to reduce memory usage and increase speed. So "quantizes" can reasonably be understood as "converts to lower precision" or "represents at lower precision".
"quantize" is commonly rendered as "양자화하다" these days, so translating it literally is fine! If you add a short note explaining that it means "converting to lower precision", the text will be even easier to understand. Also, to check the accepted terminology, the Korean TTA ICT terminology dictionary is a useful resource 😄
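As a rough illustration of the quantize/dequantize round-trip described in the quoted passage (a simplified per-tensor scheme, not the block-wise quantization that 8-bit Adam actually implements in bitsandbytes), the idea might be sketched as:

```python
# Minimal sketch of the quantize/dequantize idea behind 8-bit Adam:
# the optimizer state is stored as small integers and expanded back
# to floats only when the update step needs it. Hypothetical helper
# names; real implementations use block-wise scaling and dynamic
# quantization maps.

def quantize(values, n_bits=8):
    """Map floats to signed integers with a single per-tensor scale."""
    qmax = 2 ** (n_bits - 1) - 1            # 127 for 8 bits
    scale = max(abs(v) for v in values) / qmax or 1.0
    return [round(v / scale) for v in values], scale

def dequantize(quantized, scale):
    """Recover approximate float values for the optimization step."""
    return [q * scale for q in quantized]

state = [0.02, -1.5, 0.77, 3.0]             # toy optimizer state
q, scale = quantize(state)
restored = dequantize(q, scale)
# `restored` is close to `state`, but each value is stored in
# 1 byte instead of 4, which is where the memory saving comes from.
```

Each round-trip loses at most half a quantization step per value, which is the precision/memory trade-off the passage refers to.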
docs/source/ko/perf_train_gpu_one.md
Outdated
> Since it has been discovered that more parameters lead to better performance, this technique allows to increase the number of parameters by an order of magnitude without increasing training costs.
>
> In this approach every other FFN layer is replaced with a MoE Layer which consists of many experts, with a gated function that trains each expert in a balanced way depending on the input token's position in a sequence.
How about translating "experts" as "특수 모듈" (specialized modules)?
In MoE (Mixture of Experts), an "expert" refers to a small submodel specialized for a particular part or function of the model, so the "특수 모듈" you suggested would also work. It is often kept as "expert" untranslated, and is also sometimes translated as "전문 모델" or "전문가", for your reference!
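To make the routing idea in the quoted passage concrete, here is a toy sketch (hypothetical expert functions, with a fixed position-based gate standing in for the learned gating network) showing how each token activates only one expert, so parameter count can grow without increasing per-token compute:

```python
# Toy illustration of an MoE layer: each "expert" is a small
# stand-in for an FFN, and a gate routes each token to exactly
# one expert. A real gate is a learned, input-dependent softmax
# over many experts; this position-based gate is only to show
# the routing mechanism.

def expert_a(x):
    return x * 2.0          # hypothetical expert 0

def expert_b(x):
    return x + 1.0          # hypothetical expert 1

EXPERTS = [expert_a, expert_b]

def gate(token_position):
    # Route by sequence position purely for demonstration.
    return token_position % len(EXPERTS)

def moe_layer(tokens):
    # Each token runs through only one expert, so adding more
    # experts adds parameters without adding per-token compute.
    return [EXPERTS[gate(i)](t) for i, t in enumerate(tokens)]

outputs = moe_layer([1.0, 1.0, 2.0])
```

The balancing the passage mentions (training each expert evenly) comes from an auxiliary load-balancing loss on the gate in real MoE implementations, which this sketch omits.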
Thanks for tackling this big doc! We've recently refactored this page in #23963 to be more concise and actionable.
Would you mind updating the translation? 🤗
The source document has been updated! If your translation is not yet finished, please update it against the new version 😄
> 따라서 GPU 메모리를 절약하거나 작업을 더 빠르게 할 수 있는 몇 가지 포인트들이 있을 수 있습니다. 첫 번째 간단한 최적화인 적절한 배치 크기 선택부터 시작해 보겠습니다.
>
> ## 배치 크기 [[batch-sizes]]
All the content above this section has been moved and translated in #25755, so it can be removed now :)
> 다음은 도커 이미지를 다운로드하고 배포하는 지침을 따르면 됩니다.
>
> ## 희소성 [[sparsity]]
I think this can be removed, there's no section on sparsity in the new docs.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
What does this PR do?
Translated the `<your_file>.md` file of the documentation to Korean.
Thank you in advance for your review.
Part of #20179
Before reviewing
- [[lowercased-header]]
Who can review? (Initial)
Before submitting
- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a GitHub issue? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Who can review? (Final)