[doc] Add link to Neptune hyperparam tuning guide #4529

Merged: 7 commits, Aug 21, 2021

Conversation

@Blaizzy (Contributor) commented Aug 16, 2021

Hi,

I'm Prince Canuma, a Data Scientist and DevRel at Neptune.ai.

We have created a Neptune + LightGBM integration that allows LightGBM users to automatically log the following metadata to Neptune:

  • training and validation metrics
  • parameters
  • feature names, num_features and num_rows for the train_set
  • hardware consumption (CPU, GPU, Memory)
  • stdout and stderr logs
  • training code and git commit information

We are looking for ways to let users of LightGBM know this integration exists, is available and is up to date. To ensure the latter, we want to collaborate with you in future versions of LightGBM.

Finally, besides the README, are there other ways you would recommend for sharing this information? If so, please let us know!

We want to let LightGBM users know about the Neptune + LightGBM integration.

Kind regards!

@jameslamb (Collaborator) left a comment

Thanks very much! Looks like an interesting tool and we really appreciate the effort you've put in so far on supporting LightGBM.

NOTE: I changed the title of this pull request to be more descriptive of the change. I did this because PR titles become changelog entries in LightGBM releases (e.g. see https://github.com/microsoft/LightGBM/releases/tag/v3.2.1), but in general I think you'll find that open source projects would prefer that your PR titles be descriptive and not just branch names.

I read through https://neptune.ai/blog/lightgbm-parameters-guide and have a few minor notes I'd like you to consider (not necessary for merging this PR though!).

  • https://github.com/huanzhang12/lightgbm-gpu is not actively maintained and has not been updated in 4+ years, so it can't be relied on as accurate documentation of the current state of LightGBM. You might find @szilard and @Laurae2's work in https://github.com/szilard/GBM-perf a more accurate benchmark of LightGBM's performance with different settings and hardware.
  • remove code samples that use filepaths specific to one person's local filesystem, like neptune.init('mjbahmani/LightGBM-hyperparameters')

@jameslamb self-requested a review August 16, 2021 15:18

@jameslamb (Collaborator) left a comment

my mistake, meant to leave a "request changes" review until the suggestion about the link text is accepted

@jameslamb changed the title from "Pc/neptune integration" to "[doc] Add link to Neptune hyperparam tuning guide" Aug 16, 2021
@jameslamb added the doc label Aug 16, 2021
@StrikerRUS (Collaborator) left a comment

I think we need to decide whether we accept commercial products here while being a FOSS project.
@jameslamb

@jameslamb (Collaborator) commented

I think we need to decide whether we accept commercial products here while being a FOSS project.

True, that's a good point I hadn't considered. I wasn't thinking about it because this section of the documentation is just a list of links and I didn't really see it as an endorsement, but I didn't consider the precedent it might set.

That said, I'm personally ok with this PR that is just adding a link to documentation. If someone from AWS or GCP or Azure came here tomorrow and wanted to add a link to a blog post with this level of depth about LightGBM, I'd be ok with that too. If it was just "here is the marketing material about our product and we added LightGBM in a bullet point about supported frameworks", I'd say that isn't something we should accept.

As long as we're not endorsing anything and not implying any sort of affiliation with commercial products, it doesn't bother me.

@StrikerRUS (Collaborator) commented Aug 17, 2021

@jameslamb
Yeah, our docs shouldn't become a free "advertising board", especially for commercial products. That's what I'm worried about.

That said, I'm personally ok with this PR that is just adding a link to documentation.

This PR actually proposes adding two links.
The first one (about which you've commented) leads to the article about LightGBM's params and how to use Neptune with LightGBM. I agree that it's kind of OK to have it in our docs.
The second link leads to the guide about how to use Neptune with LightGBM. We can't place it in our External (Unofficial) Repositories list because it doesn't follow the required format. We place links to projects' GitHub repositories there, not to their docs or guides. I propose replacing that link with one to the Neptune client GitHub repo (I guess the main one among Neptune's repos https://github.com/orgs/neptune-ai/repositories). But it seems that it falls into your "here is the marketing material about our product and we added LightGBM in a bullet point about supported frameworks" category, as it's just a client to the commercial project that supports LightGBM...


@jameslamb (Collaborator) commented

This PR actually proposes adding two links.

Ah! Completely an oversight on my part, I didn't notice the other link under "External Repositories". I don't think anything about this or other commercial products should be added to the "External (Unofficial) Repositories" section.

Co-authored-by: James Lamb <jaylamb20@gmail.com>
@Blaizzy (Contributor, Author) commented Aug 18, 2021

  • https://github.com/huanzhang12/lightgbm-gpu is not actively maintained and has not been updated in 4+ years, so it can't be relied on as accurate documentation of the current state of LightGBM. You might find @szilard and @Laurae2's work in https://github.com/szilard/GBM-perf a more accurate benchmark of LightGBM's performance with different settings and hardware.
  • remove code samples that use filepaths specific to one person's local filesystem, like neptune.init('mjbahmani/LightGBM-hyperparameters')

Thank you for the advice. I have requested an update to the links.

On the second point, there is no need to worry: that string is particular to the hosted Neptune account, not his local filesystem, so only he has access to that workspace and project.

@Blaizzy (Contributor, Author) commented Aug 18, 2021

But it seems that it falls into your "here is the marketing material about our product and we added LightGBM in a bullet point about supported frameworks" category, as it's just a client to the commercial project that supports LightGBM...
@StrikerRUS @jameslamb

Let me start by apologising. I didn't know where to place the integration link, which is why I asked in the first message.
Thank you for guiding me to the right way to do things!

Now, I understand it might feel like it's just marketing and LightGBM is just a line.

But we wanted to do more, and I think we did more than just support LightGBM.
We created an integration that follows your style guide and offers a powerful abstraction for LightGBM users to track metadata produced in their experimentation phase.

As I mentioned in the PR message, I wanted to know whether there is a space in your docs dedicated to such tools (e.g. a loggers page).
Many ML FOSS projects, such as sklearn, fastai, pytorch-lightning and xgboost, are beginning to open such a space.

Today more than ever, ML practitioners need a metadata store (data versioning, experiment tracking and model registry), especially research and production teams that run tens, hundreds or thousands of experiments a day.
Think about those individuals and teams that use LightGBM heavily in their projects!

@Blaizzy (Contributor, Author) commented Aug 18, 2021

Please follow existing consistent format of list entries
@StrikerRUS

Done ✅

@StrikerRUS (Collaborator) commented

@jameslamb Agreed! I believe one link to the params tuning guide with Neptune will be enough.

@jameslamb (Collaborator) commented

Ok great.

@Blaizzy thanks for your patience as we work through this. This isn't a topic that maintainers here have talked about in depth recently.

For this PR, please remove the link you've added in External (Unofficial) Repositories. We're happy to add a link to the thorough guide on tuning LightGBM hyperparameters that you've suggested adding, but at this time we don't have a dedicated space for listing commercial tools that have built LightGBM integrations.

Many ML FOSS projects, such as sklearn, fastai, pytorch-lightning and xgboost, are beginning to open such a space.

If you're interested in continuing the discussion about having a space in LightGBM's documentation which explicitly lists commercial products with LightGBM integrations, please open an issue at https://github.com/microsoft/LightGBM/issues which describes why you think this project should do that, including links to the specific examples you've referenced here.

@Blaizzy (Contributor, Author) commented Aug 20, 2021

Thank you very much @jameslamb @StrikerRUS, I understand and will make the necessary changes!

@Blaizzy (Contributor, Author) commented Aug 20, 2021

Done ✅

@jameslamb @StrikerRUS

@StrikerRUS (Collaborator) left a comment

Thanks for your contribution!

@jameslamb self-requested a review August 21, 2021 22:20

@jameslamb (Collaborator) left a comment

Thanks very much!

@github-actions commented

This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions bot locked as resolved and limited conversation to collaborators Aug 23, 2023

3 participants