Release MediaPipe v0.10.15 · google-ai-edge/mediapipe

Build changes

Fix unwanted dependency on GPU libraries.
Adds TwoTapFirFilterCalculator.
Add public visibility to graph_service headers.
Disable ASAN, TSAN and MSAN tests which take more than 10 minutes.

Framework and core calculator improvements

Update PointToForeign with an optional cleanup object.
Enable BeginLoopCalculator for move-only types (e.g. Tensor) without Packet::Consume usage and copyable types without copying unless it's a fundamental type.
Ensure proper release of resources in case of multiple AHWB reads.
Enables the configuration of GpuBufferPool options via GpuResources::Create().
Bugfix to correctly handle landmark projection in the non-square case.
Add a utility to wait for a sync (represented by an FD).
Change a RET_CHECK to RET_CHECK_EQ
KinematicPathSolver: Avoid overshooting target
Introduce GetDefaultGpuExecutor(GpuResources) to allow executing all calculators on MP GPU thread.
No destruction for static ahwb_usage_track_.
Unbind framebuffer in Affine Transformation Runner GL
Move/isolate ahwb_usage_track_ into tensor_ahwb
Guard ahwb_tensor_track_ with mutex.
Add SidePacketConnectionTest
Update C++ Graph Builder to support executors and support input/output stream handlers.
Node::Input/OutputStreamHandler -> Node::SetInput/OutputStreamHandler
Add Packet::Share() method in replacement of SharedPtrWithPacket() function.
Default to high-performance power preference hint for WebGL contexts. For some computers with dual GPUs (like MBP2019), this will more frequently give us the higher performance GPU, which is generally preferable for most of our use cases (realtime rendering and ML), since speed is more critical than power consumption. If necessary, the user can override this setting by requesting their canvas' WebGL context manually before initializing the graph.
Introduce input_scale parameter to SpectrogramCalculator.
Improve documentation of graph options
Add an option to PackMediaSequenceCalculator to add empty clip labels instead of ignoring them. This is useful when we want to distinguish processing errors from no-detections.
Updates language detection headers
Fix dangling error reporter pointer in memory mapped models
Fix for possible infinite stall using setOptions immediately before a loadLoraModel call.
Add relu1p5 op, abs op, Log op, mdspan and Lhs Broadcast Sub with test
Fix missing member move in Tensor class
Add support for single Tensor output streams for ImageToTensorCalculator.
Fix some compilation errors in WebGPU code. These changes are all minor.
Add single tensor output support to tensor_converter_calculator.
Replace QCHECK with ABSL_QCHECK and CHECK with ABSL_CHECK.
Fix a bug in TensorAHWB that triggers a crash with multiple delayed AHWB readers followed by a CPU reader.
Fixes an unnecessary allocation of GraphServiceManager in case it is adopted from the calculator context.
Fix triggering of DFATAL message.
Remove xnn_enable_avx512fp16=false from .bazelrc
Replace uses of TfLiteOperatorCreate with TfLiteOperatorCreateWithData
Compile with '--keep_going' in setup.py
Update ndk version so that our open source users get the best possible performance out of mediapipe.
Correct address of android ndk
Replace absl::make_unique with std::make_unique in tensor.cc and tensor_ahwb.cc.
LLM decode benchmarks fill the cache with a predefined number of tokens before starting decoding.
Add logic to drop the offending non-monotonically increasing timestamp in the MicrophoneHelper.
Make packet payload const.
Pass flag to indicate that consuming op may support prepacked GEMM.
Get timestamp from OpenCV VideoCapture after first frame is read.
Update XNNPack and cpuinfo
Update TensorFlow to 2024-07-18.
Remove deprecated TfLiteOperatorCreateWithData function
Add option to use shifted window in SpectrogramCalculator.
Move AhwbUsage struct and helper methods into a separate library.
Make fields in PacketGetter.Pair public.
The GraphProfiler may be destroyed before the task is executed in the executor.
Introduce flag in MicrophoneHelper to drop non-increasing timestamps.
llm_test - add batch size of 8 for BM_Llm_QCINT8/512/128
Add method to create MP Tensor from TfLite tensor specs
Refactors AHardwareBufferView class to be instantiated with a TensorAhwbUsage pointer.
Refactor LlmBuilder to have one graph
Add expected_seq_len param to ComputeLogits()
Fix mediapipe::file::Exists() for >2GB files on Windows.
Bump XNNPACK and KleidiAI versions.
Update MP demo app to acquire wake lock
Replace mediapipe::StatusOr with absl::StatusOr
Sync on ssbo_written_ before mapping an AHWB to a CpuReadView.
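
The MicrophoneHelper entries above concern packets whose timestamps fail to increase. MediaPipe streams expect strictly increasing packet timestamps, so one way to handle an offending packet is to drop it. The following is a minimal Python sketch of that idea for illustration only; it is not the MicrophoneHelper implementation, and the function name `drop_non_increasing` and the `(timestamp, payload)` tuple layout are hypothetical.

```python
from typing import Iterable, List, Optional, Tuple

# Hypothetical packet shape for this sketch: (timestamp, payload).
Packet = Tuple[int, bytes]


def drop_non_increasing(packets: Iterable[Packet]) -> List[Packet]:
    """Keep only packets whose timestamp is strictly greater than the
    timestamp of the last packet that was kept."""
    kept: List[Packet] = []
    last_ts: Optional[int] = None
    for ts, payload in packets:
        if last_ts is not None and ts <= last_ts:
            # Offending packet: its timestamp repeats or goes backwards.
            continue
        kept.append((ts, payload))
        last_ts = ts
    return kept
```

Comparing against the last kept timestamp, rather than the last seen one, means a single out-of-order packet does not cause later valid packets to be discarded.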

MediaPipe Tasks update

This section highlights changes that are specific to a single platform and don't propagate to other platforms.

Android

Bump targetSdkVersion to 34 throughout MediaPipe.

iOS

Updated documentation in iOS audio classifier
Added iOS holistic landmarker to vision framework build
Changed method name in MPPAudioClassifierResult
Added audio classifier options helpers
Added audio classifier result helpers
Added method to create audio record MPPAudioTaskRunner
Removed unused imports in MPPAudioTaskRunner
Added iOS audio embedder result, classifier result, classifier options, embedder options, embedder options helpers, classifier header and embedder result helpers
Add missing argument for num_draft_tokens.

Javascript

Set quantization bits for LoRA weight conversion to match those specified
Warn on adding packets to a closed input stream instead of silently dropping packets.
Enable experimental support for Chromium WGSL subgroups in LLM API, when available.
Support multi-response generation.

Python

Add prompt template to llm bundler.

Bug fixes

class_weights flag causes a crash for multiclass case

Model Maker changes

Rename old BinaryAUC metric to BinarySparseAUC (used by text_classifier) and create a new BinaryAUC metric which does not expect sparse inputs.
Allow configuration of num_parallel_calls and cycle_length in hparams
Improve python code format.
Use tf.io.gfile.GFile for writing metadata file in image classifier.
Change SparsePrecision metric to BinarySparsePrecision metric, and same for SparseRecall->BinarySparseRecall in the core library. We only care about these metrics in the binary case, so this change makes the metric classnames more accurate for their intended usage.
Support multilabel model training in text classifier
Create and add metrics for multi-class case
Support a customized best model monitor for multiclass cases
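
The metric renames above separate metrics that take sparse inputs from those that do not. As a rough illustration only, and assuming that "sparse" here means integer class indices as labels paired with one score per class, the sketch below reduces such inputs to the per-example binary label and positive-class score that a non-sparse binary metric would consume. The helper name `sparse_to_binary` and its `class_id` parameter are hypothetical and are not part of Model Maker.

```python
from typing import List, Sequence, Tuple


def sparse_to_binary(
    y_true: Sequence[int],
    y_pred: Sequence[Sequence[float]],
    class_id: int = 1,
) -> Tuple[List[int], List[float]]:
    """Convert integer class labels and per-class scores into binary
    labels and scores for the class given by `class_id`.

    Assumption for this sketch: y_true holds one class index per example
    and y_pred holds one row of per-class scores per example.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    labels = [1 if label == class_id else 0 for label in y_true]
    scores = [float(row[class_id]) for row in y_pred]
    return labels, scores
```

For example, labels `[0, 1, 1]` with score rows `[[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]` reduce to binary labels `[0, 1, 1]` and positive-class scores `[0.1, 0.8, 0.4]`.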

MediaPipe Dependencies

Update WASM files
