
Unable to load engine in C++ API #4339

Open
ninono12345 opened this issue Jan 26, 2025 · 4 comments
Assignees: kevinch-nv
Labels: Engine Build (issues with engine build), triaged (issue has been triaged by maintainers)

ninono12345 commented Jan 26, 2025

Description

A year ago I was working on a small project: I converted a model to TensorRT in Python and ran inference in C++, and it worked. Now, with TensorRT 10 out, I am facing problems. I am still able to convert ONNX to TensorRT and run inference through the Python API, but I cannot load the engine through the C++ API.

The code is untouched; it worked last time, and as far as I know the same code should work from TensorRT 8.6 to 10. Could it be that I am now using Visual Studio 2022? Last time I built with CMake on Windows/Linux and everything worked.
```cpp
#include <fstream>
#include <iostream>
#include <stdexcept>
#include <tuple>
#include <vector>

#include <NvInfer.h>

std::tuple<nvinfer1::ICudaEngine*, nvinfer1::IExecutionContext*> load_feature_extractor(std::string engine_file_name)
{
    std::cout << engine_file_name << std::endl;

    // Open the serialized engine and seek to the end to get its size.
    std::ifstream file(engine_file_name, std::ios::binary | std::ios::ate);
    if (!file) {
        std::cout << "failed to load engine" << std::endl;
        throw std::runtime_error("failed to load engine");
    }

    std::streamsize size = file.tellg();
    file.seekg(0, std::ios::beg);

    // Read the whole plan file into host memory.
    std::vector<char> buffer(size);
    if (!file.read(buffer.data(), size)) {
        throw std::runtime_error("unable to read engine");
    }

    // NOTE: TensorRT requires the logger to outlive the runtime and engine,
    // so a function-local logger like this is risky.
    Logger m_l = Logger();
    nvinfer1::IRuntime* runtime = nvinfer1::createInferRuntime(m_l);

    // Deserialize the plan into an engine. This is where the error below is logged.
    nvinfer1::ICudaEngine* engine = runtime->deserializeCudaEngine(buffer.data(), buffer.size());
    if (!engine) {
        std::cout << "engine is null" << std::endl;
        throw std::runtime_error("engine is null");
    }

    for (int i = 0; i < engine->getNbIOTensors(); i++) {
        std::cout << engine->getIOTensorName(i) << std::endl;
    }

    nvinfer1::IExecutionContext* context = engine->createExecutionContext();

    return std::make_tuple(engine, context);
}
```

output:
IRuntime::deserializeCudaEngine: Error Code 1: Internal Error (Unexpected call to stub loadRunner for ShuffleRunner.)
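
The "Unexpected call to stub loadRunner" message could indicate that the plan was built by one TensorRT version and deserialized by another, for example when the headers the program compiles against and the nvinfer DLL it loads at runtime come from different installs. A minimal diagnostic sketch (not from the original thread, assuming only the public TensorRT headers): print the header version next to the version reported by the loaded library and check that they match:

```cpp
#include <cstdio>

#include <NvInfer.h>  // pulls in NvInferVersion.h, which defines NV_TENSORRT_VERSION

int main()
{
    // Version encoded in the headers this program was compiled against.
    std::printf("header version : %d\n", NV_TENSORRT_VERSION);
    // Version reported by the nvinfer library actually loaded at runtime.
    std::printf("library version: %d\n", getInferLibVersion());
    return 0;
}
```

If the two numbers differ, the program is loading a different TensorRT than it was compiled against, which would be consistent with internal deserialization errors.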

I also tried to create the engine directly from the C++ API:

```cpp
#include <iostream>

#include <NvInfer.h>
#include <NvOnnxParser.h>

using namespace nvinfer1;
using namespace nvonnxparser;

class Logger : public ILogger
{
    void log(Severity severity, const char* msg) noexcept override
    {
        // suppress info-level messages
        if (severity <= Severity::kWARNING)
            std::cout << msg << std::endl;
    }
} logger;

int main()
{
    IBuilder* builder = createInferBuilder(logger);
    auto flags = 1U << static_cast<uint32_t>(NetworkDefinitionCreationFlag::kEXPLICIT_BATCH);
    INetworkDefinition* network = builder->createNetworkV2(flags);

    IParser* parser = createParser(*network, logger);

    parser->parseFromFile("feature_extractor_tompnet_50.onnx",
                          static_cast<int32_t>(ILogger::Severity::kWARNING));
    for (int32_t i = 0; i < parser->getNbErrors(); ++i)
    {
        std::cout << parser->getError(i)->desc() << std::endl;
    }

    IBuilderConfig* config = builder->createBuilderConfig();
    // config->setMemoryPoolLimit(MemoryPoolType::kWORKSPACE, 1U << 20);
    // config->setMemoryPoolLimit(MemoryPoolType::kTACTIC_SHARED_MEMORY, 48 << 10);

    IHostMemory* serializedModel = builder->buildSerializedNetwork(*network, *config);

    ICudaEngine* engine = builder->buildEngineWithConfig(*network, *config); // THIS FAILS
    return 0;
}
```

The engine is successfully serialized, but loading it still fails.

buildEngineWithConfig outputs:
Unexpected Internal Error: [virtualMemoryBuffer.cpp::nvinfer1::StdVirtualMemoryBufferImpl::~StdVirtualMemoryBufferImpl::123] Error Code 1: Cuda Driver (TensorRT internal error)
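
One workaround worth noting here: buildEngineWithConfig has been deprecated since TensorRT 8.0 in favor of buildSerializedNetwork, so the already-serialized plan can be turned into an engine through an IRuntime instead; this is the flow the TensorRT samples use. A minimal sketch, assuming the builder, network, config, and logger objects from the snippet above:

```cpp
// Minimal sketch (assumes builder, network, config, and logger from the snippet
// above): build the serialized plan once, then deserialize it through IRuntime
// instead of calling the deprecated buildEngineWithConfig.
IHostMemory* plan = builder->buildSerializedNetwork(*network, *config);
if (plan != nullptr)
{
    IRuntime* runtime = createInferRuntime(logger);
    ICudaEngine* engine = runtime->deserializeCudaEngine(plan->data(), plan->size());
    if (engine == nullptr)
    {
        std::cout << "deserializeCudaEngine failed" << std::endl;
    }
}
```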

If I print all logs, I get:

[screenshot of the full build log]

If config->setMemoryPoolLimit(MemoryPoolType::kWORKSPACE, 1U << 20) is uncommented, then both buildSerializedNetwork and buildEngineWithConfig also output:
UNSUPPORTED_STATE: Skipping tactic 0 due to insufficient memory on requested size of 10616832 detected for tactic 0x0000000000000000.

Keep in mind that with the Python API I can convert and run inference successfully; it is the C++ API that fails.

Could this be due to Visual Studio?

P.S. trtexec is also able to convert and run inference. The most important thing for me is to be able to run inference in C++; I can convert the engine in Python.

Thank you

Environment

TensorRT Version: 10.7

NVIDIA GPU: GTX 1660 Ti

NVIDIA Driver Version: 561.19

CUDA Version: 12.4

CUDNN Version: 8.9.7

Operating System: Windows 10

IDE: Visual Studio 2022

Python Version (if applicable): 3.10

PyTorch Version (if applicable): latest, with CUDA 12.4

ONNX Model link:
https://drive.google.com/file/d/1S2O6FAm5tbzkbFUUIcuZiAmdSUcSSosa/view?usp=sharing

ninono12345 (Author) commented:

@zerollzeng I remember you were really helpful to me last time; perhaps you could give me insight into what I am missing. Thank you

kevinch-nv added the Engine Build and triaged labels on Feb 5, 2025
kevinch-nv self-assigned this on Feb 5, 2025
kevinch-nv (Collaborator) commented:

The API usage looks fine to me; the errors suggest an environment issue:

[virtualMemoryBuffer.cpp::nvinfer1::StdVirtualMemoryBufferImpl::~StdVirtualMemoryBufferImpl::123] Error Code 1: Cuda Driver (TensorRT internal error)

UNSUPPORTED_STATE: Skipping tactic 0 due to insufficient memory on requested size of 10616832 detected for tactic 0x0000000000000000

Can you double check that your program in Visual Studio is linking properly to all the TensorRT and CUDA libraries?
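
(On the second message: with kWORKSPACE capped at 1U << 20 bytes, i.e. 1 MiB, skipping a tactic that asks for 10616832 bytes, roughly 10 MiB, is expected behavior rather than a separate failure.) For the linking question, one quick sanity check that can run inside the same Visual Studio project is to query the CUDA driver and runtime versions the process actually sees; a minimal sketch, assuming the project links against cudart:

```cpp
#include <cstdio>
#include <cuda_runtime_api.h>

int main()
{
    int driverVersion = 0;
    int runtimeVersion = 0;

    // Highest CUDA version supported by the installed display driver.
    cudaDriverGetVersion(&driverVersion);
    // Version of the CUDA runtime (cudart) this program linked against.
    cudaRuntimeGetVersion(&runtimeVersion);

    std::printf("CUDA driver supports : %d\n", driverVersion);
    std::printf("CUDA runtime linked  : %d\n", runtimeVersion);

    int deviceCount = 0;
    cudaError_t err = cudaGetDeviceCount(&deviceCount);
    std::printf("cudaGetDeviceCount   : %s (%d device(s))\n", cudaGetErrorString(err), deviceCount);
    return 0;
}
```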

ninono12345 (Author) commented Feb 6, 2025

Thank you for your answer. You see, there is not a lot of information on Visual Studio; almost everything out there uses CMake.

Here is what my import settings look like; please tell me if you notice anything that should be changed:

[three screenshots of the Visual Studio project's include, library, and linker settings]

Of course I should probably switch to CMake, but this is a very strange error: I can import the libraries and create an nvinfer1::IRuntime, yet creating an engine fails. Likewise, building a serialized TensorRT network from ONNX succeeds, but loading it fails. Both paths fail at the point of creating the ICudaEngine.

ninono12345 (Author) commented:

I can indeed confirm that a simple CMake build works; the engine loads without error. But I still wonder what I could have missed in the Visual Studio setup.
