
Issues using IoBindings #66

Closed
guillaume-be opened this issue Jul 22, 2023 · 3 comments · Fixed by #78
Labels: documentation (Improvements or additions to documentation), p: high (high priority)

@guillaume-be (Contributor) commented Jul 22, 2023

Hello,

I am trying to work with the IoBindings that were recently added, and I am facing a few issues. I could not find documentation or examples in the crate illustrating how this works -- I am attempting to reproduce a minimal example in Python using onnxruntime: the gist can be found here. I am attaching the tiny onnx model file (net.zip) to this issue, but it can be recreated by running the notebook linked above.

Here are the current issues I am facing:

  1. It seems that the constructor for IoBinding is not public:
     pub(crate) fn new(session: &'s Session) -> OrtResult<Self> {
     I am working off a local copy and removed the (crate) visibility modifier, but am I missing something about how the IoBinding should be created?
  2. The Drop implementation for IoBinding does not cover bound inputs and outputs. I believe bind_input needs to take ownership of its value in the current implementation:
     pub fn bind_input<'a, 'b: 'a, 'c: 'b, S: AsRef<str> + Clone + Debug>(&'a mut self, name: S, ort_value: Value<'b>) -> OrtResult<()> {
         [...]
     }
  3. With these local changes I am still unable to run the model with IO bindings: the output does not seem to be properly constructed/populated: Failed to get tensor type and shape: the ort_value must contain a constructed tensor or sparse tensor

The small Rust binary I am using for testing is included below for reference. It requires adding the tch dependency to access libtorch; please let me know if you have any issues doing so:

use anyhow;
use ndarray::{ArrayD, CowArray};
use ort::{AllocationDevice, AllocatorType, Environment, ExecutionProvider, GraphOptimizationLevel, IoBinding, MemoryInfo, MemType, SessionBuilder, Value};
use ort::tensor::OrtOwnedTensor;

fn main() -> anyhow::Result<()> {
    tracing_subscriber::fmt::init();

    let environment = Environment::builder()
        .with_name("test")
        .with_execution_providers([ExecutionProvider::CUDA(Default::default())])
        .build()?
        .into_arc();

    let session = SessionBuilder::new(&environment)?
        .with_optimization_level(GraphOptimizationLevel::Level1)?
        .with_intra_threads(1)?.with_model_from_file("path/to/net.onnx")?;

    let input_tensor = tch::Tensor::arange(16 * 2, (tch::Kind::Float, tch::Device::cuda_if_available())).view([16, 2]);

    // First option: ndarray
    let input_array: ArrayD<f32> = input_tensor.as_ref().try_into()?;
    let input_cow_array = CowArray::from(&input_array);
    let output_array: OrtOwnedTensor<f32, _> = session.run(vec![Value::from_array(session.allocator(), &input_cow_array)?])?[0].try_extract()?;
    println!("{:?}", output_array);

    // Second option: IO Bindings
    let mut io_bindings = IoBinding::new(&session)?;

    let value = Value::from_array(session.allocator(), &input_cow_array)?;
    let _ = io_bindings.bind_input("some_input", value)?;
    let output_mem_info = MemoryInfo::new(AllocationDevice::CPU, 0, AllocatorType::Device, MemType::Default)?;
    let _ = io_bindings.bind_output("some_output", output_mem_info)?;

    let outputs = io_bindings.outputs()?;

    for (output_name, output_value) in outputs {
        let output_array: OrtOwnedTensor<f32, _> = output_value.try_extract()?;
        println!("{output_name}: {output_array:?}");
    }

    Ok(())
}

I have also tried extracting the values from the output memory info as follows:

	for (_, output_value) in outputs {
		let output_tensor = unsafe {
			Tensor::from_blob(output_value.ptr() as *const u8, &[16, 5], &[5, 1], Kind::Float, Device::Cpu)
		};
		output_tensor.print();
	}

but the values from the tensor are incorrect, so I guess the memory is not read from the right location.

Finally, I think it would be great to be able to run something similar to the Python interface: create two torch tensors (an input and a placeholder output), register pointers to these tensors in the IO binding, and have session.run() populate the output tensor. This would probably require being allowed to pass raw pointers to the IO binding; maybe a "dangerous" module could be created to allow such a use case.

Thank you

@decahedron1 decahedron1 self-assigned this Jul 23, 2023
@decahedron1 decahedron1 added documentation Improvements or additions to documentation p: high high priority labels Jul 23, 2023
@decahedron1 (Member)

Sorry for the delay.

The proper usage of IoBinding as of 1.15.1 is as follows.

use anyhow;
use ndarray::{ArrayD, CowArray};
use ort::tensor::OrtOwnedTensor;
use ort::{AllocationDevice, AllocatorType, Environment, ExecutionProvider, GraphOptimizationLevel, IoBinding, MemType, MemoryInfo, SessionBuilder, Value};

fn main() -> anyhow::Result<()> {
	tracing_subscriber::fmt::init();

	let environment = Environment::builder()
		.with_name("test")
		.with_execution_providers([ExecutionProvider::CUDA(Default::default())])
		.build()?
		.into_arc();

	let session = SessionBuilder::new(&environment)?
		.with_optimization_level(GraphOptimizationLevel::Level1)?
		.with_intra_threads(1)?
		.with_model_from_file("examples/net.onnx")?;

	let input_tensor = tch::Tensor::arange(16 * 2, (tch::Kind::Float, tch::Device::cuda_if_available())).view([16, 2]);

	// First option: ndarray
	let input_array: ArrayD<f32> = input_tensor.as_ref().try_into()?;
	let input_cow_array = CowArray::from(&input_array);
	let output_array: OrtOwnedTensor<f32, _> = session.run(vec![Value::from_array(session.allocator(), &input_cow_array)?])?[0].try_extract()?;
	println!("{:?}", output_array);

	// Second option: IO Bindings
-	let mut io_bindings = IoBinding::new(&session)?;
+	let mut io_bindings = session.bind()?;

	let value = Value::from_array(session.allocator(), &input_cow_array)?;
-	let _ = io_bindings.bind_input("some_input", value)?;
+	let _ = io_bindings.bind_input("some_input", &value)?;
	let output_mem_info = MemoryInfo::new(AllocationDevice::CPU, 0, AllocatorType::Device, MemType::Default)?;
	let _ = io_bindings.bind_output("some_output", output_mem_info)?;
+
+	session.run_with_binding(&io_bindings)?;

	let outputs = io_bindings.outputs()?;

	for (output_name, output_value) in outputs {
		let output_array: OrtOwnedTensor<f32, _> = output_value.try_extract()?;
		println!("{output_name}: {output_array:?}");
	}

	Ok(())
}
OrtOwnedTensor { data: TensorPtr { ptr: 0x12b198def50, array_view: [[-0.89867944, 0.28273344, -0.9206959, -0.35467649, 0.20861459],
 [-0.18996829, 0.26876986, -1.2060621, -0.7395382, -0.6321051],
 [0.5187429, 0.25480628, -1.4914281, -1.1243999, -1.4728248],
 [1.2274541, 0.2408427, -1.7767943, -1.5092617, -2.3135445],
 [1.9361652, 0.22687912, -2.0621605, -1.8941233, -3.1542642],
 [2.644876, 0.21291554, -2.3475266, -2.278985, -3.994984],
 [3.3535876, 0.19895196, -2.6328926, -2.663847, -4.835704],
 [4.062299, 0.18498838, -2.9182587, -3.0487087, -5.6764235],
 [4.7710094, 0.1710248, -3.203625, -3.4335701, -6.5171432],
 [5.4797206, 0.1570617, -3.4889908, -3.8184323, -7.357863],
 [6.1884317, 0.14309764, -3.774357, -4.2032933, -8.198583],
 [6.897143, 0.12913358, -4.0597234, -4.5881553, -9.039303],
 [7.605855, 0.11517048, -4.3450894, -4.973017, -9.880022],
 [8.314566, 0.101207376, -4.630455, -5.357878, -10.720741],
 [9.023276, 0.08724332, -4.915822, -5.7427406, -11.561461],
 [9.731987, 0.07327926, -5.201188, -6.1276016, -12.402181]], shape=[16, 5], strides=[5, 1], layout=Cc (0x5), dynamic ndim=2 } }
some_output: OrtOwnedTensor { data: TensorPtr { ptr: 0x12b198de990, array_view: [[-0.89867944, 0.28273344, -0.9206959, -0.35467649, 0.20861459],
 [-0.18996829, 0.26876986, -1.2060621, -0.7395382, -0.6321051],
 [0.5187429, 0.25480628, -1.4914281, -1.1243999, -1.4728248],
 [1.2274541, 0.2408427, -1.7767943, -1.5092617, -2.3135445],
 [1.9361652, 0.22687912, -2.0621605, -1.8941233, -3.1542642],
 [2.644876, 0.21291554, -2.3475266, -2.278985, -3.994984],
 [3.3535876, 0.19895196, -2.6328926, -2.663847, -4.835704],
 [4.062299, 0.18498838, -2.9182587, -3.0487087, -5.6764235],
 [4.7710094, 0.1710248, -3.203625, -3.4335701, -6.5171432],
 [5.4797206, 0.1570617, -3.4889908, -3.8184323, -7.357863],
 [6.1884317, 0.14309764, -3.774357, -4.2032933, -8.198583],
 [6.897143, 0.12913358, -4.0597234, -4.5881553, -9.039303],
 [7.605855, 0.11517048, -4.3450894, -4.973017, -9.880022],
 [8.314566, 0.101207376, -4.630455, -5.357878, -10.720741],
 [9.023276, 0.08724332, -4.915822, -5.7427406, -11.561461],
 [9.731987, 0.07327926, -5.201188, -6.1276016, -12.402181]], shape=[16, 5], strides=[5, 1], layout=Cc (0x5), dynamic ndim=2 } }

The Drop implementation for IoBinding does not cover bound inputs and outputs. I believe bind_input needs to take ownership of its value in the current implementation

I'm not sure if it does. BindInput appears to copy the input values, and the outputs are "owned" by the C++ IoBinding class, so they get destroyed in ReleaseIoBinding. But I'm probably misunderstanding something.

Finally, I think it would be great to be able to run something similar to the Python interface: create two torch tensors (an input and a placeholder output), register pointers to these tensors in the IO binding, and have session.run() populate the output tensor. This would probably require being allowed to pass raw pointers to the IO binding; maybe a "dangerous" module could be created to allow such a use case.

I'll look into this 🙂

@guillaume-be (Contributor, Author)

Thank you very much!

Regarding the Drop implementation, I updated my examples as per your guidelines and I am still facing the following error:

use anyhow;
use ndarray::{ArrayD, CowArray};
use ort::tensor::OrtOwnedTensor;
use ort::{AllocationDevice, AllocatorType, Environment, ExecutionProvider, GraphOptimizationLevel, IoBinding, MemType, MemoryInfo, SessionBuilder, Value};

fn main() -> anyhow::Result<()> {
	tracing_subscriber::fmt::init();

	let environment = Environment::builder()
		.with_name("test")
		.with_execution_providers([ExecutionProvider::CUDA(Default::default())])
		.build()?
		.into_arc();

	let session = SessionBuilder::new(&environment)?
		.with_optimization_level(GraphOptimizationLevel::Level1)?
		.with_intra_threads(1)?
		.with_model_from_file("examples/net.onnx")?;

	let input_tensor = tch::Tensor::arange(16 * 2, (tch::Kind::Float, tch::Device::cuda_if_available())).view([16, 2]);

	// First option: ndarray
	let input_array: ArrayD<f32> = input_tensor.as_ref().try_into()?;
	let input_cow_array = CowArray::from(&input_array);
	let output_array: OrtOwnedTensor<f32, _> = session.run(vec![Value::from_array(session.allocator(), &input_cow_array)?])?[0].try_extract()?;
	println!("{:?}", output_array);

	// Second option: IO Bindings
	let mut io_bindings = session.bind()?;

	let value = Value::from_array(session.allocator(), &input_cow_array)?;
	let _ = io_bindings.bind_input("some_input", &value)?;
	let output_mem_info = MemoryInfo::new(AllocationDevice::CPU, 0, AllocatorType::Device, MemType::Default)?;
	let _ = io_bindings.bind_output("some_output", output_mem_info)?;

	session.run_with_binding(&io_bindings)?;

	let outputs = io_bindings.outputs()?;

	for (output_name, output_value) in outputs {
		let output_array: OrtOwnedTensor<f32, _> = output_value.try_extract()?;
		println!("{output_name}: {output_array:?}");
	}

	Ok(())
}

Does not compile:

error[E0597]: `value` does not live long enough
  --> examples\tch_example.rs:32:47
   |
31 |     let value = Value::from_array(session.allocator(), &input_cow_array)?;
   |         ----- binding `value` declared here
32 |     let _ = io_bindings.bind_input("some_input", &value)?;
   |                                                  ^^^^^^ borrowed value does not live long enough
...
46 | }
   | -
   | |
   | `value` dropped here while still borrowed
   | borrow might be used here, when `value` is dropped and runs the `Drop` code for type `Value`

Updating bind_input to take ownership of the Value solves the issue.


Chikage0o0 commented Jul 31, 2023

What should I put as the input name?

Edit: the input name has to be looked up in the model itself.

thread 'inference::onnx::test_io_bind' panicked at 'called `Result::unwrap()` on an `Err` value: CreateIoBinding(Msg("Failed to find input name in the mapping: some_input"))', ai-assistant\src\inference\onnx.rs:180:57
stack backtrace:
   0: std::panicking::begin_panic_handler
             at /rustc/8ede3aae28fe6e4d52b38157d7bfe0d3bceef225/library\std\src\panicking.rs:593
   1: core::panicking::panic_fmt
             at /rustc/8ede3aae28fe6e4d52b38157d7bfe0d3bceef225/library\core\src\panicking.rs:67
   2: core::result::unwrap_failed
             at /rustc/8ede3aae28fe6e4d52b38157d7bfe0d3bceef225/library\core\src\result.rs:1651
   3: enum2$<core::result::Result<tuple$<>,enum2$<ort::error::OrtError> > >::unwrap<tuple$<>,enum2$<ort::error::OrtError> >
             at /rustc/8ede3aae28fe6e4d52b38157d7bfe0d3bceef225\library\core\src\result.rs:1076
   4: ai_assistant::inference::onnx::test_io_bind
             at .\src\inference\onnx.rs:180
   5: ai_assistant::inference::onnx::test_io_bind::closure$0
             at .\src\inference\onnx.rs:170
   6: core::ops::function::FnOnce::call_once<ai_assistant::inference::onnx::test_io_bind::closure_env$0,tuple$<> >
             at /rustc/8ede3aae28fe6e4d52b38157d7bfe0d3bceef225\library\core\src\ops\function.rs:250
   7: core::ops::function::FnOnce::call_once
             at /rustc/8ede3aae28fe6e4d52b38157d7bfe0d3bceef225/library\core\src\ops\function.rs:250
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
test inference::onnx::test_io_bind ... FAILED
    let onnx = OnnxRuntime::new(r"best.onnx", 416, 0.5, 0.5, 0).unwrap();
    let array: ndarray::ArrayBase<ndarray::OwnedRepr<f32>, ndarray::Dim<ndarray::IxDynImpl>> =
        Array::zeros((1, 3, 416, 416)).into_dyn();

    // let result = onnx.inference(array).unwrap();

    let mut io_bindings = onnx.session.bind().unwrap();
    let array = CowArray::from(array);
    let value = Value::from_array(onnx.session.allocator(), &array).unwrap();
    let _ = io_bindings.bind_input("some_input", value).unwrap();

@decahedron1 decahedron1 moved this to Todo in ort v2.0 Aug 5, 2023
@decahedron1 decahedron1 moved this from Todo to Done in ort v2.0 Aug 5, 2023
@decahedron1 decahedron1 linked a pull request Aug 20, 2023 that will close this issue (merged)
@decahedron1 decahedron1 mentioned this issue Oct 27, 2023 (merged)