Intel® Optimizations for TensorFlow* 1.15.0 UP1

@chuanqi129 chuanqi129 released this 29 Aug 16:00
· 91 commits to tf-1.15-maint since this release

This maintenance release of Intel® Optimizations for TensorFlow* 1.15 UP1 is based on the TensorFlow v1.15.0up1 tag (https://github.com//tensorflow.git), built with support for the oneAPI Deep Neural Network Library (oneDNN). This revision contains the following features and fixes:

New functionality and usability improvements:

• Support for oneDNN version 1.4.0 and integration work with TensorFlow.
• Optimized Bfloat16 data type for the MKL backend.
• Add Eigen Bfloat16 vectorization for better performance.
• Unit test pass rate matches the official TensorFlow 1.15 release.
• Add comparison and cast op fusion.
• Add Pad+Conv fusion for bf16.
• Replace tensorflow::bfloat16 with Eigen::bfloat16.
• Add MKL support to auto_mixed_precision.
• Adding MklTanh op.
• Threadpool changes for pooling ops.
• Threadpool support for mkl_conv_bwd ops.
• Threadpool support for relu, eltwise and softmax.
• Threadpool api support for misc ops.
• Threadpool support for quantize, dequantize and transpose op.
• Threadpool api implementation for concat and fused batchnorm op.
• Enable DepthwiseConv2D bfloat16 fusions.
• Implement new DNNL1.x MatMul primitive cache.
• Enable BF16 Softmax/SoftmaxGrad.
• Enabling Conv2D bfloat16 fusions.
• Support MatMul fusion for bfloat16 type.
• Enabling conv2D (NCHW format) fusion in grappler remapper.
• Fusing BN and Relu in mkl path.
• Enable DepthwiseConv2D + BiasAdd (+ Relu/Relu6/Elu) fusion.
• Make BFloat16 support for MatMul and BatchMatMul conditionally compatible by removing macros that were guarding DNNLv1.2 specific code.
• Support MKL Quantized Matmul With Bias and Dequantize Op and DNNL 1.0.
• Upgrading RequantizePerChannel Op with API changes in MKLDNN 1.0.
• Changes for DNNL1.x fused_batch_norm and Pooling Ops(Max and Avg).
• Updating QuantizeV2 and Dequantize Ops with API changes in MKLDNN 1.0.
• DNNL 1.0 integration for the Concat op.
• Updating MatMul kernels with MKLDNN 1.x API changes.
• DNNL 1.0 op support for Softmax, Identity_op, and Lrn ops.
• Add support for Conv backward with DNNL 1.0.
• MKL-DNNL v1.0 integration with AddN ops.
• Slice and Reshape op support with MKLDNN 1.0.
• MKL-DNN v1.0 integration with pooling ops.
• Relu op MKL-DNN 1.x integration.
• DNNL1.x integration for tf_conv_ops.h and transpose.cc.
• Add weight cache for FP32 MatMul.
• Use buffer as primitive key.
• Avoid unnecessary data reorders.
• Create a partial key for output_scale.
• MatMul, QMatMul, and fused-op support for the threadpool API.
• Batch Matmul enhancements.
• Remove duplicate registration for softmax bf16 op.
• Optimization for MirrorPad op.
• Adding BFloat16 unit tests for MKL layout pass.
• Add FP32 fusion of MatMul and Relu.
• Transpose + Maxpool3D + Transpose fusion.
• Reimplement CompareMklDnnLayouts.
• Enable TF_NUM_INTEROP_THREADS for MKL-DNN backend.
• Reuse input tensor in mkl conv2d.
• Conditionally enabling bfloat16.
• Add primitive cache for mkl concat.
• Reverting bias cache optimization.
• Add primitive cache for mkl softmax.
• Enable FP32 FusedMatMul for MKL-DNN.
• Supporting MatMul, Transpose and Softmax with BFloat16 type.
• Add support for Addv2.
• Integrated MKL input conversion op with MKL-DNN v1.x.
• Update Keras LayerNorm to use FusedBatchNorm.
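Several of the bfloat16 items above (the Eigen vectorization, the Eigen::bfloat16 migration, the BF16 fusions) rely on the same property of the format: bfloat16 is the top 16 bits of an IEEE-754 float32, so conversion is a truncation with rounding. As an illustration only (a pure-Python sketch, not code from this release):

```python
import struct

def float32_to_bfloat16_bits(x: float) -> int:
    """Keep the top 16 bits of the IEEE-754 float32 encoding,
    applying round-to-nearest-even on the discarded low 16 bits."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    lsb = (bits >> 16) & 1              # low bit of the surviving half
    rounding_bias = 0x7FFF + lsb        # ties round to even
    return ((bits + rounding_bias) >> 16) & 0xFFFF

def bfloat16_bits_to_float32(b: int) -> float:
    """Widen a bfloat16 bit pattern back to float32 by
    zero-filling the low 16 mantissa bits."""
    return struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))[0]
```

Because the exponent field is unchanged, bfloat16 keeps float32's dynamic range while sacrificing mantissa precision (about 2-3 decimal digits), which is why deep-learning kernels can often switch to it without loss-scaling.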

Bug fixes:

• Fix 3 unit test failures:
  //tensorflow/python/kernel_tests:svd_op_test
  //tensorflow/python:layers_normalization_test
  //tensorflow/python/ops/parallel_for:gradients_test
• Fix a bug in MklConcat.
• Fix a bug in MklMaxPoolGrad.
• Fix bfloat16 build failure in MklRelu.
• Fix dequantize op regression issue.
• Fix incorrect DNNL1.2 integration in pooling backprop.
• Fix bfloat16 integration in MatMul and BatchMatMul.
• Fix a bug in MKL Concat op.
• Fix performance regression in DNNL 1.0 due to lack of a primitive cache on reorder.
• Fix build error.
• Fix compilation for DNNL 1.0.
• Fix Shape compilation issue in MKL build.
• Fix a bug in Elu Op.
• Fix MatMul and Elu fusion issue.
• Fix memory leak.
• Fix Eigen related compilation error.
• Fix unit test check_numerics_callback_test.
• Fix bias cache accuracy issue.
• Fix missing libiomp5 issue due to missing deps.
• Fix dequantize accuracy issue and re-enable this op.
• Fix quantize accuracy loss.
• Fix spurious omp thread spawning.
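Several items above touch OpenMP and threadpool behavior (the spurious OMP thread spawning fix, and TF_NUM_INTEROP_THREADS support in the feature list). When running this build, CPU threading is typically pinned down through environment variables; the values below are illustrative for one socket, not prescriptive:

```shell
# Illustrative settings; tune for your core count and workload.
export OMP_NUM_THREADS=28         # threads used by oneDNN/OpenMP kernels
export KMP_BLOCKTIME=1            # ms a worker spins before sleeping
export KMP_AFFINITY="granularity=fine,compact,1,0"
export TF_NUM_INTEROP_THREADS=2   # independent TF op-level parallelism
```

Setting these before launching the Python process avoids oversubscription between TensorFlow's inter-op threadpool and the OpenMP team used inside oneDNN primitives.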

Additional security and performance patches:

• Upgrade SQLite to 3.33.0 to fix CVE-2020-11656.
• Upgrade curl to 7.71.1 to fix CVE-2019-15601.
• Remove fft2d.

Known issues:

• The 2 unit test failures below remain; they also fail on the TensorFlow 2.3 branch.
//tensorflow/python/kernel_tests:relu_op_test
//tensorflow/python/debug:analyzer_cli_test