[Help]: Segmentation fault when using yolov10 #38

Closed
timarnoldev opened this issue Jul 31, 2024 · 8 comments

@timarnoldev

timarnoldev commented Jul 31, 2024

I have a simple demo which should use this library for inference.
But it always crashes with a segmentation fault.

std::vector<std::pair<std::string, cv::Scalar>> generateLabelColorPairs() {
    std::vector<std::pair<std::string, cv::Scalar>> labelColorPairs;


    auto generateRandomColor = []() {
        std::random_device                 rd;
        std::mt19937                       gen(rd());
        std::uniform_int_distribution<int> dis(0, 255);
        return cv::Scalar(dis(gen), dis(gen), dis(gen));
    };


    labelColorPairs.emplace_back("ball", generateRandomColor());
    labelColorPairs.emplace_back("player_red", generateRandomColor());

    return labelColorPairs;
}

// Visualize detection results
void visualize(cv::Mat& image, const deploy::DetectionResult& result, const std::vector<std::pair<std::string, cv::Scalar>>& labelColorPairs) {
    for (size_t i = 0; i < result.num; ++i) {
        const auto& box       = result.boxes[i];
        int         cls       = result.classes[i];
        float       score     = result.scores[i];
        const auto& label     = labelColorPairs[cls].first;
        const auto& color     = labelColorPairs[cls].second;
        std::string labelText = label + " " + cv::format("%.2f", score);

        // Draw rectangle and label
        int      baseLine;
        cv::Size labelSize = cv::getTextSize(labelText, cv::FONT_HERSHEY_SIMPLEX, 0.6, 1, &baseLine);
        cv::rectangle(image, cv::Point(box.left, box.top), cv::Point(box.right, box.bottom), color, 2, cv::LINE_AA);
        cv::rectangle(image, cv::Point(box.left, box.top - labelSize.height), cv::Point(box.left + labelSize.width, box.top), color, -1);
        cv::putText(image, labelText, cv::Point(box.left, box.top), cv::FONT_HERSHEY_SIMPLEX, 0.6, cv::Scalar(255, 255, 255), 1);
    }
}

void inference(cv::Mat currentImage) {
    std::shared_ptr<deploy::BaseDet> model = std::make_shared<deploy::DeployDet>("../model.engine");
    std::vector<std::pair<std::string, cv::Scalar>> labels = generateLabelColorPairs();

    deploy::Image image(currentImage.data, currentImage.cols, currentImage.rows);
    auto result = model->predict(image);
}
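
A side note on `visualize` above: it indexes `labelColorPairs[cls]` directly, so a class index outside the two-entry table would read out of bounds. This is not the cause identified later in this thread, but a guarded lookup is a cheap safeguard. The sketch below uses only the standard library; `labelFor` and the `"unknown"` fallback are hypothetical names chosen for illustration, not part of the deploy library.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns the label for a class index, or a fallback when the index is
// negative or past the end of the label table (hypothetical helper).
std::string labelFor(const std::vector<std::string>& labels, int cls,
                     const std::string& fallback = "unknown") {
    if (cls < 0 || static_cast<std::size_t>(cls) >= labels.size()) {
        return fallback;
    }
    return labels[static_cast<std::size_t>(cls)];
}
```

With the two labels from the demo, `labelFor(labels, 1)` yields `player_red`, while an index such as 5 yields the fallback instead of undefined behavior.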

I tried it with yolov9 and yolov10, but every time I get the same error:

Process finished with exit code 139 (interrupted by signal 11:SIGSEGV)

Used versions:

NVIDIA-SMI 555.42.06, Driver Version: 555.42.06, CUDA Version: 12.5, TensorRT 10.2
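
For reference, the exit code 139 reported above follows the common shell convention of 128 plus the signal number, and 139 - 128 = 11 is SIGSEGV. A minimal sketch of that arithmetic, assuming the status comes from a shell or IDE that uses this convention; the function names are hypothetical.

```cpp
#include <csignal>
#include <optional>

// Shells conventionally report "terminated by signal N" as status 128 + N.
// Returns the signal number when the status is in that range, else nothing.
std::optional<int> signalFromExitStatus(int status) {
    if (status > 128 && status < 256) {
        return status - 128;
    }
    return std::nullopt;
}

// True when the status corresponds to SIGSEGV under that convention.
bool diedFromSegfault(int status) {
    const auto sig = signalFromExitStatus(status);
    return sig.has_value() && *sig == SIGSEGV;
}
```

Here `signalFromExitStatus(139)` gives 11, which matches the "signal 11:SIGSEGV" text in the message.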

@laugh12321
Owner

laugh12321 commented Jul 31, 2024

@timarnoldev Hi! A segmentation fault can occur due to issues related to memory access, possibly caused by model loading, image processing, or other memory management operations. To help diagnose the issue, I have a few suggestions and questions:

  1. Check if demo/detect/detect.cpp Runs Correctly:
    Can you confirm if the detect.cpp file in your demo runs without encountering segmentation faults? This will help determine if the issue is specifically within the code snippet you provided or elsewhere.

  2. Check ONNX Export for EfficientNMS Plugin:
    Verify whether your exported ONNX model includes the EfficientNMS plugin. This is crucial for correct model inference. If not, please refer to this issue for YOLOv9 or this pull request for YOLOv10 on how to include the EfficientNMS plugin during export.

  3. Provide the Complete Demo Code:
    It would be helpful to see the complete code for your demo, especially the parts involving model loading, data input, prediction, and output. This will allow for a more thorough examination of potential issues.

If possible, please share the full project code or key parts of it so I can assist you further in resolving this issue.

@laugh12321 laugh12321 self-assigned this Jul 31, 2024
@laugh12321 laugh12321 added the help wanted Extra attention is needed label Jul 31, 2024
@timarnoldev
Author

Thanks for your quick response.
The provided example also crashes with my specific model file. At the same time my code works with the provided yolov8s file. I guess something is wrong with my engine file.

I used the ultralytics CLI to convert from .pt to .onnx with this command:

yolo export model=best.pt format=onnx

Then:

trtexec --onnx=model.onnx --saveEngine=model.engine --fp16

@laugh12321
Owner

laugh12321 commented Jul 31, 2024

@timarnoldev The ONNX model exported using yolo export does not include the EfficientNMS plugin and cannot be used for inference with this project. If you need to export an ONNX model with the EfficientNMS plugin, you can install the trtyolo CLI using pip install tensorrt_yolo and follow the instructions in model_export.md for model export. If you want to achieve the same inference speed in Python as in C++, you should refer to the installing-tensorrt_yolo guide to build version 4.0 of tensorrt_yolo.

@timarnoldev
Author

timarnoldev commented Jul 31, 2024

I followed the steps from #28 exactly, but it still crashes.

error: execv(/home/tk/TensorRT-YOLO/demo/detect/build/linux/x86_64/release/detect -e model.engine -i images -o output -l labels.txt) failed(-1)

This is the only output I get from the demo/detect/detect.cpp executable. Is there anything I can provide to you to troubleshoot the issue? @laugh12321

Steps to reproduce:

  • Download yolov10s.pt from THU-MIG/yolov10
  • Train on dataset using ultralytics cli: yolo task=detect mode=train epochs=200 batch=32 plots=True model=yolov10s.pt data=data.yaml
  • Set up the custom yolov10 environment as mentioned above
  • Export the model to ONNX: yolo export model=best.pt format=onnx opset=13 simplify max_det=100 conf=0.17 iou=0.65 nms
  • Export to an .engine file: trtexec --onnx=best.onnx --saveEngine=models/model.engine --fp16
  • Run inference using the detection example: xmake run -P . detect -e models/model.engine -i images -o output -l labels.txt --cudaGraph

@timarnoldev timarnoldev changed the title C++ program crashes with segmentation fault Segmentation fault when using yolov10 Jul 31, 2024
@laugh12321
Owner

@timarnoldev Your process seems correct. Please confirm the following:

  1. Does the exported ONNX model have four output nodes: num_dets, det_boxes, det_scores, and det_classes, as shown in the example?

image

  2. Before running demo/detect, did you follow the steps in the deploy-build guide to compile the deploy module?

The image below shows an example of inference using demo/detect after converting the yolov10s model to an engine file, following your provided steps.

image
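
The first check above (four output nodes) can also be done programmatically rather than by eye. The sketch below only compares a list of names against the four expected ones; how the names are obtained is left out and is an assumption (for example, they might be read from the built engine via TensorRT's `ICudaEngine::getIOTensorName`, or copied from Netron). `missingDetectionOutputs` is a hypothetical helper, not part of this project.

```cpp
#include <algorithm>
#include <array>
#include <string>
#include <vector>

// Returns the expected EfficientNMS output names that are absent from
// outputNames. An empty result means all four outputs are present.
std::vector<std::string> missingDetectionOutputs(
    const std::vector<std::string>& outputNames) {
    static const std::array<std::string, 4> expected = {
        "num_dets", "det_boxes", "det_scores", "det_classes"};

    std::vector<std::string> missing;
    for (const auto& name : expected) {
        const bool found = std::find(outputNames.begin(), outputNames.end(),
                                     name) != outputNames.end();
        if (!found) {
            missing.push_back(name);
        }
    }
    return missing;
}
```

A model exported without the plugin would typically expose a single raw output (say, a hypothetical `output0`), in which case all four names come back as missing.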

@laugh12321 laugh12321 changed the title Segmentation fault when using yolov10 [Help]: Segmentation fault when using yolov10 Aug 1, 2024
@timarnoldev
Author

timarnoldev commented Aug 1, 2024

I can confirm that the outputs shown in Netron seem odd, but I have no clue why.

Screenshot 2024-08-01 at 07 58 07

Does the environment have to be activated when I run the deploy commands?

I also just ran the export again, confirming that I am using your custom version of the yolov10 repo, but the output in Netron still looks the same.

@laugh12321
Owner

@timarnoldev I suspect you are using the yolo CLI from the THU-MIG yolov10 repository instead of the yolo CLI from the laugh12321 yolov10 repository (nms branch).

Please follow these steps:

git clone -b nms https://github.com/laugh12321/yolov10.git
cd yolov10
conda create -n yolov10 python=3.9
conda activate yolov10
pip install -r requirements.txt
pip install -e .
yolo export model=best.pt format=onnx opset=13 simplify max_det=100 conf=0.17 iou=0.65 nms

@timarnoldev
Author

Indeed, that was the problem. I guess I first installed the yolo CLI from the repo and then switched the branch.
Thank you very much.
