update document for latency configuration for multi numa nodes on one socket #26944

Open
wants to merge 3 commits into
base: master
from


wangleis
Contributor

@wangleis wangleis commented Oct 8, 2024

Details:

Tickets:

@wangleis wangleis requested a review from a team as a code owner October 8, 2024 06:40
@wangleis wangleis requested review from tsavina and removed request for a team October 8, 2024 06:40
@github-actions github-actions bot added the category: docs OpenVINO documentation label Oct 8, 2024
@dmitry-gorokhov dmitry-gorokhov added this to the 2024.5 milestone Oct 14, 2024
| ``ov::hint::enable_hyper_threading`` | No | No |
+--------------------------------------+-----------------------------------------------------------------------+-----------------------------------------------------------------------+
| ``ov::hint::enable_cpu_pinning`` | No / Not Supported | Yes except using P-cores and E-cores together |
+--------------------------------------+-----------------------------------------------------------------------+-----------------------------------------------------------------------+

Contributor


I think it would be beneficial to have a dedicated subsection here describing that, starting from the 5th Xeon generation, NUMA nodes are exposed explicitly (with SNC=ON), ideally with a reference to an Intel resource with details, and that OV uses only a single NUMA node due to performance considerations. So using only part of the socket's cores is expected behavior.
We also need to mention that for some models (with high compute demand) the default behavior might not be optimal, so the recommendation is to try full-socket or even multi-socket execution for latency, and to provide a recommendation on how to do that.
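
For illustration only, here is a rough sketch of what such a recommendation could look like from the user's side. It assumes the tuning is done through the CPU plugin's string properties `PERFORMANCE_HINT`, `NUM_STREAMS`, and `INFERENCE_NUM_THREADS`; the helper name `latency_config` and the core counts (48 cores per socket, 2 sockets) are hypothetical and not taken from this PR.

```python
def latency_config(num_threads: int, num_streams: int = 1) -> dict:
    """Build a CPU config dict for latency-oriented inference.

    Hypothetical helper: it only assembles property strings. Whether a
    given thread count helps depends on the model and the platform.
    """
    if num_threads < 1 or num_streams < 1:
        raise ValueError("num_threads and num_streams must be positive")
    return {
        "PERFORMANCE_HINT": "LATENCY",
        # A single stream keeps all threads working on one inference request.
        "NUM_STREAMS": str(num_streams),
        # Ask for more cores than the single NUMA node used by default.
        "INFERENCE_NUM_THREADS": str(num_threads),
    }


# Assumed topology for the example: 2 sockets with 48 physical cores each.
CORES_PER_SOCKET = 48
SOCKETS = 2

full_socket = latency_config(num_threads=CORES_PER_SOCKET)
multi_socket = latency_config(num_threads=CORES_PER_SOCKET * SOCKETS)

# Passing the config to OpenVINO (not executed here, requires openvino
# and a model file; "model.xml" is a placeholder path):
#   import openvino as ov
#   compiled = ov.Core().compile_model("model.xml", "CPU", full_socket)
```

Both variants are worth benchmarking against the default, since crossing NUMA or socket boundaries can also hurt latency for smaller models.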

Contributor Author


@dmitry-gorokhov updated. Please take a look.


2 participants