Issues: robertknight/rten
Support u8 x u8 non range-reduced operands for MatMulInteger
quantization (Issues related to support for quantized data types or operations)
#596 opened Feb 12, 2025 by robertknight

MatMulNBits support for 4-bit quantization
quantization (Issues related to support for quantized data types or operations)
#578 opened Feb 4, 2025 by robertknight

Adjust matmul tile size on Arm
performance (Issues that affect model inference or loading performance)
#553 opened Jan 26, 2025 by robertknight

Reduce use of `unsafe` in portable SIMD library
safety (Issues related to use of `unsafe` code)
#549 opened Jan 25, 2025 by robertknight

Support prepacked weights for non-MatMul operators
performance (Issues that affect model inference or loading performance)
#484 opened Dec 25, 2024 by robertknight (1 of 5 tasks)

Support Llama 3 tokenizer (implement `ignore_merges` behavior)
tokenizers (Issues related to the rten-text tokenization crate)
#453 opened Dec 8, 2024 by robertknight

Align tokenizer pipeline and terminology with Hugging Face tokenizers
tokenizers (Issues related to the rten-text tokenization crate)
#427 opened Dec 2, 2024 by robertknight

Support fusing Transpose + MatMul where both inputs are transposed
performance (Issues that affect model inference or loading performance)
#398 opened Oct 29, 2024 by robertknight

Fuse pointwise operations into matmul / convolution operations
performance (Issues that affect model inference or loading performance)
#371 opened Sep 21, 2024 by robertknight

Implement better depthwise convolution kernels
performance (Issues that affect model inference or loading performance)
#370 opened Sep 21, 2024 by robertknight

Align ReduceMin / ReduceMax etc. handling of empty tensors with spec
spec compliance (Issues with RTen behavior not matching the ONNX specifications)
#341 opened Sep 1, 2024 by robertknight

Memoize or precompute subgraphs that depend only on input shapes
#270 opened Jul 5, 2024 by robertknight

Share implementations for operators based on data type width
#244 opened Jun 21, 2024 by robertknight

Make unary ops more efficient with non-contiguous inputs
performance (Issues that affect model inference or loading performance)
#192 opened May 20, 2024 by robertknight (1 of 2 tasks)

Output a more helpful error if an operator is unavailable due to build features
usability
#154 opened May 6, 2024 by robertknight

Run tests under AddressSanitizer (and possibly other sanitizers)
qa (Quality / correctness checks)
#151 opened May 5, 2024 by robertknight

Validate operator input counts
tooling (Tools for debugging / profiling etc.)
#133 opened Apr 29, 2024 by robertknight

Enable re-using pool across graph executions
performance (Issues that affect model inference or loading performance)
#122 opened Apr 26, 2024 by robertknight

Provide better APIs for working with models that have many inputs / outputs
usability
#71 opened Mar 30, 2024 by robertknight

Document rten CLI tool
documentation (Improvements or additions to documentation)
#52 opened Feb 8, 2024 by robertknight