[Feature Request] Multiple backend C++ API #3119

Closed
Tracked by #3122
njzjz opened this issue Jan 9, 2024 · 0 comments · Fixed by #3162

njzjz commented Jan 9, 2024

Summary

Propose a multiple-backend C++ API in which the backend is selected according to the model file format.

Detailed Description

Backend independent library

libdeepmd.so has been backend-independent since @wanghan-iapcm designed v1, so we can simply reuse it.

Factory method pattern

Take DeepPot as an example. Move the current DeepPot to DeepPotTf. Create a base class, DeepPotBase, with two subclasses, DeepPotTf and DeepPotPt. Create a wrapper class, DeepPot, that keeps the same API as the current one, so no downstream code needs to change.

The wrapper class may look like:

#include <memory>
#include <string>
#include <vector>

// DeepPotBase, DeepPotTf, DeepPotPt, detect_backend, InputNlist, VALUETYPE
// and deepmd::exception are assumed to be declared elsewhere.
enum Backend {TENSORFLOW, PYTORCH, UNKNOWN};

class DeepPot {
 public:
  // Pick the concrete backend from the model file and construct it.
  DeepPot(const std::string& model,
          const int& gpu_rank,
          const std::string& file_content) {
    Backend backend = detect_backend(model, file_content);
    if (Backend::TENSORFLOW == backend) {
      dp = std::make_unique<DeepPotTf>(model, gpu_rank, file_content);
    } else if (Backend::PYTORCH == backend) {
      dp = std::make_unique<DeepPotPt>(model, gpu_rank, file_content);
    } else {
      throw deepmd::exception("Unknown file type");
    }
  }
  // Forward the call unchanged to the selected backend.
  void compute(
      std::vector<double> &ener,
      std::vector<std::vector<VALUETYPE>> &force,
      std::vector<std::vector<VALUETYPE>> &virial,
      const std::vector<VALUETYPE> &coord,
      const std::vector<int> &atype,
      const std::vector<VALUETYPE> &box,
      const int nghost,
      const InputNlist &lmp_list,
      const int &ago,
      const std::vector<VALUETYPE> &fparam = std::vector<VALUETYPE>(),
      const std::vector<VALUETYPE> &aparam = std::vector<VALUETYPE>())
  {
    dp->compute(ener, force, virial, coord, atype, box, nghost, lmp_list, ago, fparam, aparam);
  }
 private:
  std::unique_ptr<DeepPotBase> dp;
};

The list of public APIs can be found in deepmd.hpp and c_api.cc. Methods that a backend does not implement can throw an exception.
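One possible way to let unimplemented methods throw is to give the base class default bodies that raise, so a backend only overrides what it supports. The sketch below is an illustration under assumptions, not the actual DeepPotBase: the class names, the simplified compute signature, and the cutoff method are hypothetical, and std::runtime_error stands in for deepmd::exception.

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical, simplified stand-in for DeepPotBase.
class DeepPotBaseSketch {
 public:
  virtual ~DeepPotBaseSketch() = default;

  // Every backend must provide the core evaluation.
  virtual void compute(double& ener, const std::vector<double>& coord) = 0;

  // Optional method: the default body throws, so a backend that does not
  // override it reports "not implemented" at call time.
  // (std::runtime_error is used here in place of deepmd::exception.)
  virtual double cutoff() const {
    throw std::runtime_error("cutoff() is not implemented by this backend");
  }
};

// Hypothetical backend that implements compute() only.
class DummyBackendSketch : public DeepPotBaseSketch {
 public:
  void compute(double& ener, const std::vector<double>& coord) override {
    // Placeholder "energy": the number of coordinates passed in.
    ener = static_cast<double>(coord.size());
  }
};
```

With this layout the wrapper class can forward every public method unconditionally, and callers see an exception only when they reach a method the selected backend lacks.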

Backend detection

The backend can be detected from the model file described below:

  • PyTorch: models are stored as zip files, and zip files start with the bytes PK.
  • TensorFlow: there is no simple signature to check, but ProtoBuf should throw an error when it fails to parse the file.

If a backend is not enabled, we should skip it.
If a backend fails to read the file, continue to the next backend (assume there is a list).
If all backends fail, throw the error, with errors of each backend.
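The three fallback rules above could be sketched roughly as follows. This is only an illustration: BackendLoader, looks_like_zip, and load_with_first_backend are hypothetical names, each loader is modeled as a callable that throws on failure, and std::runtime_error stands in for deepmd::exception.

```cpp
#include <functional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: true if the content starts with the zip magic "PK".
inline bool looks_like_zip(const std::string& content) {
  return content.size() >= 2 && content[0] == 'P' && content[1] == 'K';
}

// Hypothetical list entry: a backend name, whether it was enabled at build
// time, and a loader that throws std::exception when it cannot read the file.
struct BackendLoader {
  std::string name;
  bool enabled;
  std::function<void(const std::string&)> load;
};

// Try each enabled backend in order and return the name of the first one
// that reads the file. If all fail, throw with every backend's error.
inline std::string load_with_first_backend(
    const std::vector<BackendLoader>& loaders, const std::string& content) {
  std::string errors;
  for (const auto& backend : loaders) {
    if (!backend.enabled) {
      continue;  // rule 1: skip backends that are not enabled
    }
    try {
      backend.load(content);
      return backend.name;
    } catch (const std::exception& e) {
      // rule 2: remember the error and continue to the next backend
      errors += "\n  " + backend.name + ": " + e.what();
    }
  }
  // rule 3: all backends failed, report the collected errors
  throw std::runtime_error("No backend could read the model file:" + errors);
}
```

Ordering the list with PyTorch first would let the cheap PK check reject non-zip files before the ProtoBuf parser is tried.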

DeepPotModelDevi

DeepPotModelDevi may need to be refactored. The current implementation requires 4 models to come from the same backend, while they may come from different backends.
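One direction for such a refactor, sketched under assumptions, is to store the ensemble as pointers to the common base class so that each member can come from any backend. All names below (EnergyModel, ConstantModel, ModelDeviSketch) are hypothetical, and the energy standard deviation is a simplified stand-in for whatever statistics DeepPotModelDevi actually reports.

```cpp
#include <cmath>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical minimal interface standing in for DeepPotBase.
class EnergyModel {
 public:
  virtual ~EnergyModel() = default;
  virtual double energy(const std::vector<double>& coord) const = 0;
};

// Hypothetical stand-in for a concrete backend: always returns a fixed energy.
class ConstantModel : public EnergyModel {
 public:
  explicit ConstantModel(double value) : value_(value) {}
  double energy(const std::vector<double>&) const override { return value_; }

 private:
  double value_;
};

// Hypothetical ensemble whose members may come from different backends.
class ModelDeviSketch {
 public:
  void add(std::unique_ptr<EnergyModel> model) {
    models_.push_back(std::move(model));
  }

  // Population standard deviation of the per-model energies.
  double energy_std(const std::vector<double>& coord) const {
    if (models_.empty()) {
      return 0.0;
    }
    double mean = 0.0;
    std::vector<double> energies;
    for (const auto& model : models_) {
      energies.push_back(model->energy(coord));
      mean += energies.back();
    }
    mean /= static_cast<double>(energies.size());
    double var = 0.0;
    for (double e : energies) {
      var += (e - mean) * (e - mean);
    }
    return std::sqrt(var / static_cast<double>(energies.size()));
  }

 private:
  std::vector<std::unique_ptr<EnergyModel>> models_;
};
```

Because the container only sees the base interface, backend detection can run once per model file when each member is constructed.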

Further Information, Files, and Links

No response

@njzjz njzjz added this to the v3.0.0 milestone Jan 9, 2024
wanghan-iapcm pushed a commit that referenced this issue Jan 15, 2024
See #3119.
At this time, only TF is supported in the multiple-backend
framework.

---------

Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
njzjz added a commit to njzjz/deepmd-kit that referenced this issue Jan 16, 2024
See deepmodeling#3119

Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
wanghan-iapcm pushed a commit that referenced this issue Jan 17, 2024
See #3119

---------

Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
wanghan-iapcm pushed a commit that referenced this issue Jan 17, 2024
See #3119

---------

Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
@njzjz njzjz linked a pull request Jan 22, 2024 that will close this issue
github-merge-queue bot pushed a commit that referenced this issue Feb 28, 2024
need to test in union environment (tf and pt)
see #3119

---------

Signed-off-by: Lysithea <52808607+CaRoLZhangxy@users.noreply.github.com>
Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
Co-authored-by: Han Wang <92130845+wanghan-iapcm@users.noreply.github.com>
@njzjz njzjz closed this as completed Feb 28, 2024