[R-package] Quick question about num of thread #4192

Closed
issactoast opened this issue Apr 17, 2021 · 6 comments

@issactoast
Contributor

Hi, thank you for making LightGBM available in R!

I am using LightGBM in R and have a quick question about num_thread.

According to the manual, num_thread should be set to the number of physical CPU cores. However, in the R code I have usually seen, the number of workers is set to one less than the number of cores, such as

cores <- parallel::detectCores() - 1
cores

So if we have 4 cores, we use 3 for the parallel workers and 1 for the controller. Does this apply to LightGBM too? If I have 4 physical CPU cores, should I set the number of threads to 3?
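For context, the pattern I am referring to looks roughly like this (just a sketch of the usual cluster setup with the parallel package, not LightGBM code):

library(parallel)

# leave one core free for the R session that coordinates the workers
cores <- parallel::detectCores() - 1
cl <- parallel::makeCluster(cores)

# the controller hands the work out to the `cores` worker processes
results <- parallel::parLapply(cl, 1:100, function(i) sqrt(i))

parallel::stopCluster(cl)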

@jameslamb
Collaborator

Thanks for using LightGBM!

Assuming that there are no other processes making heavy use of the available CPUs, you will get the best performance by setting num_thread equal to the total number of real CPU cores. There is not really a concept of a "controller" in parallel training with LightGBM.

You could use something like this to test the relative speedup from different settings of num_threads:

library(lightgbm)
library(microbenchmark)
library(nycflights13)

data(flights, package = "nycflights13")
flights <- as.data.frame(flights)

dtrain <- lgb.Dataset(
    as.matrix(
        flights[, c("year", "sched_dep_time", "distance", "hour", "minute")]
    )
    , label = flights[, "dep_delay"]
    , free_raw_data = FALSE
    , max_bin = 350
)

num_cores <- parallel::detectCores()

for (num_thread in c(num_cores - 1, num_cores)) {
    print(paste0("num_thread: ", num_thread))
    print(
        microbenchmark::microbenchmark({
            lgb.train(
                params = list(
                    num_thread = num_thread
                    , objective = "regression_l2"
                    , num_leaves = 31L
                    , max_depth = 8L
                    , learning_rate = 0.01
                    , min_data_in_leaf = 1
                )
                , data = dtrain
                , nrounds = 1000L
                , verbose = -1L
            )
        }, times = 5, unit = "s")
    )
}

I installed {lightgbm} 3.2.1 on my Mac tonight with install.packages("lightgbm", type = "source"), and with that version I got the following results from the code above. I have two 2-core CPUs on this machine.

# num_threads = 3
     min       lq     mean   median       uq    max neval
 5.23764 5.273885 5.497161 5.331118 5.625063 6.0181     5

# num_threads = 4
      min       lq     mean   median       uq      max neval
 4.582549 4.762857 4.907298 4.837003 4.925407 5.428671     5

Your results will vary based on your specific dataset and the other learning parameters you set.

@jameslamb jameslamb changed the title Quick question about num of thread [R-package] Quick question about num of thread Apr 18, 2021
@issactoast
Contributor Author

Thank you for the clarification! Just to make it clear: we need to use num_cores <- parallel::detectCores(logical = FALSE) so that num_cores equals the number of physical cores. I ran the same code you shared, and it confirms that setting num_thread to the number of physical cores is faster. Thank you!
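For reference, a quick way to see the difference between the two counts (the logical count includes hyper-threaded processors, so it can be up to twice the physical count):

parallel::detectCores(logical = FALSE)  # physical cores only; what num_thread should match
parallel::detectCores(logical = TRUE)   # logical processors, counting hyper-threads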

library(lightgbm)
#> Loading required package: R6
library(microbenchmark)
library(nycflights13)

data(flights, package = "nycflights13")
flights <- as.data.frame(flights)

dtrain <- lgb.Dataset(
    as.matrix(
        flights[, c("year", "sched_dep_time", "distance", "hour", "minute")]
    )
    , label = flights[, "dep_delay"]
    , free_raw_data = FALSE
    , max_bin = 350
)

num_cores <- parallel::detectCores(logical = FALSE)

for (num_thread in c(num_cores, num_cores * 2)) {
    print(paste0("num_thread: ", num_thread))
    print(
        microbenchmark::microbenchmark({
            lgb.train(
                params = list(
                    num_thread = num_thread
                    , objective = "regression_l2"
                    , num_leaves = 31L
                    , max_depth = 8L
                    , learning_rate = 0.01
                    , min_data_in_leaf = 1
                )
                , data = dtrain
                , nrounds = 1000L
                , verbose = -1L
            )
        }, times = 5, unit = "s")
    )
}
#> [1] "num_thread: 10"
#> Unit: seconds
#>                                                                                                                                                                                                                                            expr
#>  {     lgb.train(params = list(num_thread = num_thread, objective = "regression_l2",          num_leaves = 31L, max_depth = 8L, learning_rate = 0.01,          min_data_in_leaf = 1), data = dtrain, nrounds = 1000L,          verbose = -1L) }
#>       min       lq     mean   median       uq      max neval
#>  2.522868 2.546463 2.593207 2.563418 2.646491 2.686795     5
#> [1] "num_thread: 20"
#> Unit: seconds
#>                                                                                                                                                                                                                                            expr
#>  {     lgb.train(params = list(num_thread = num_thread, objective = "regression_l2",          num_leaves = 31L, max_depth = 8L, learning_rate = 0.01,          min_data_in_leaf = 1), data = dtrain, nrounds = 1000L,          verbose = -1L) }
#>       min       lq    mean   median       uq      max neval
#>  4.536631 4.673258 4.67483 4.682869 4.687078 4.794311     5

Created on 2021-04-18 by the reprex package (v1.0.0)

@jameslamb
Collaborator

Ah yes, you are absolutely right! Are you interested in contributing a change to the documentation? I think others would benefit from that note.

It would just be updating

#' \item{\code{num_threads}: Number of threads for LightGBM. For the best speed, set this to
#' the number of real CPU cores, not the number of threads (most
#' CPU using hyper-threading to generate 2 threads per CPU core).}
to say something like

the number of real CPU cores (\code{parallel::detectCores(logical = FALSE)})
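Applied to the existing comment, the updated block might look roughly like this (exact wording is up to you):

#' \item{\code{num_threads}: Number of threads for LightGBM. For the best speed, set this to
#' the number of real CPU cores (\code{parallel::detectCores(logical = FALSE)}), not the
#' number of threads (most CPU using hyper-threading to generate 2 threads per CPU core).}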

And then re-generating the documentation files with commands like this:

sh build-cran-package.sh
R CMD INSTALL --with-keep.source lightgbm_*.tar.gz
cd R-package
Rscript -e "roxygen2::roxygenize(load = 'installed')"

@issactoast
Contributor Author

@jameslamb Sure! I will do that, thanks!

@jameslamb
Collaborator

Great, thanks so much! Let me know if you run into any issues.

@github-actions

This issue has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 23, 2023