
GPU memory usage keeps increasing, even when hybridized with static_alloc, when used in Flask debug mode after MXNet 1.6.0.post0 #3760

Closed
kohillyang opened this issue Sep 20, 2020 · 2 comments

Comments

@kohillyang

I created this issue because I'm not sure whether the memory leak is caused by MXNet itself or by Flask.

The original issue in mxnet is apache/mxnet#19159.

I'm using Flask with MXNet to write a server. Since it is a web app, we want the GPU memory to be fully statically allocated.
However, as the title says, I found that the GPU memory usage keeps increasing and eventually raises an OOM error when the MXNet version is 1.6.0.post0 or 1.7.0.

Expected Behavior

The GPU memory usage should remain unchanged.

import mxnet as mx
import numpy as np
import os

os.environ["MXNET_CUDNN_AUTOTUNE_DEFAULT"] = "0"
os.environ["MXNET_GPU_MEM_POOL_TYPE"] = "Round"


class Predictor(object):
    def __init__(self):
        ctx = mx.gpu(0)
        net = mx.gluon.model_zoo.vision.resnet50_v1()
        net.initialize()
        net.collect_params().reset_ctx(ctx)
        net.hybridize(active=True, static_alloc=True)  # static_alloc as described in the title
        max_h = 768
        max_w = 768
        _ = net(mx.nd.zeros(shape=(1, 3, max_h, max_w), ctx=ctx))
        self.ctx = ctx
        self.net = net

    def __call__(self, *args, **kwargs):
        max_h = 768
        max_w = 768
        x_h = np.random.randint(100, max_h)
        x_w = np.random.randint(100, max_w)
        xx = np.random.randn(1, 3, x_h, x_w)
        y = self.net(mx.nd.array(xx, ctx=self.ctx))
        return y.asnumpy().sum()


if __name__ == '__main__':
    import flask
    import tornado.wsgi
    import tornado.httpserver
    import logging
    from flask_cors import CORS

    os.environ["MXNET_CUDNN_AUTOTUNE_DEFAULT"] = "0"
    DEBUG = True
    PORT = 21500
    app = flask.Flask(__name__)
    CORS(app, supports_credentials=True)
    predictor = Predictor()

    @app.route('/test', methods=['POST'])
    def net_forward():
        try:
            r = predictor()
            # Return a valid response; returning None would make Flask raise an error.
            return flask.jsonify({"result": float(r)})
        except Exception as e:
            logging.exception(e)
            print("failed")
            return flask.jsonify(str(e)), 400

    print("starting webserver...")
    if DEBUG:
        app.run(debug=True, host='0.0.0.0', port=PORT)
    else:
        http_server = tornado.httpserver.HTTPServer(
            tornado.wsgi.WSGIContainer(app))
        http_server.listen(PORT, address="0.0.0.0")
        tornado.ioloop.IOLoop.instance().start()

Then run the following client code to send requests to the server:

import json

import requests


def remote_call(url):
    register_data = {"Pic": "xx"}
    data = json.dumps(register_data)
    return requests.post(url, data)


if __name__ == '__main__':
    register_url = 'http://127.0.0.1:21500/test'
    while True:
        try:
            remote_call(register_url)
        except Exception as e:
            print(e)

Actual Behavior

The GPU memory usage keeps increasing as requests are served, until an out-of-memory error is raised.
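
To quantify the growth, the GPU memory usage can be polled while the client loop is running. The script below is only an illustrative sketch (it is not part of the original report); it assumes nvidia-smi is on the PATH and that the model runs on GPU 0:

# monitor_gpu.py -- illustrative sketch, not part of the original report.
# Polls nvidia-smi once per second and prints the used memory of GPU 0,
# so the steady growth can be observed while the client loop is running.
import subprocess
import time


def gpu_mem_used_mib(device_index=0):
    out = subprocess.check_output([
        "nvidia-smi",
        "--query-gpu=memory.used",
        "--format=csv,noheader,nounits",
        "-i", str(device_index),
    ])
    return int(out.decode().strip())


if __name__ == '__main__':
    while True:
        print("GPU 0 used memory: %d MiB" % gpu_mem_used_mib(0))
        time.sleep(1)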

Environment

  • Python version: 3.7.0
  • Flask version: 1.0.2
kohillyang (Author) commented Sep 20, 2020

By the way, if DEBUG is set to False, then everything is OK.
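
One way to narrow this down further (an assumption on my part, not a confirmed diagnosis): Flask's debug mode also enables Werkzeug's reloader, which runs the application in a child process and watches the source files. Keeping the debugger but disabling the reloader would show whether the reloader is the relevant part of debug mode:

# Illustrative sketch (assumption, not a confirmed diagnosis): keep the
# interactive debugger but disable Werkzeug's reloader, which is the part
# of debug mode that restarts and re-imports the application.
app.run(debug=True, use_reloader=False, host='0.0.0.0', port=PORT)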

davidism (Member) commented

This does not appear to be about Flask.

Please use Stack Overflow for questions about your own code. This tracker is for issues related to the project itself. Be sure to include a minimal, complete, and verifiable example.

github-actions bot locked as resolved and limited conversation to collaborators on Nov 14, 2020