MetalPerformanceShaders iOS xcode9 beta4
Sebastien Pouliot edited this page Jul 24, 2017
# MetalPerformanceShaders.framework
diff -ruN /Applications/Xcode9-beta3.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS.sdk/System/Library/Frameworks/MetalPerformanceShaders.framework/Headers/MetalPerformanceShaders.h /Applications/Xcode9-beta4.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS.sdk/System/Library/Frameworks/MetalPerformanceShaders.framework/Headers/MetalPerformanceShaders.h
--- /Applications/Xcode9-beta3.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS.sdk/System/Library/Frameworks/MetalPerformanceShaders.framework/Headers/MetalPerformanceShaders.h 2017-06-30 01:02:30.000000000 -0400
+++ /Applications/Xcode9-beta4.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS.sdk/System/Library/Frameworks/MetalPerformanceShaders.framework/Headers/MetalPerformanceShaders.h 2017-07-14 04:30:41.000000000 -0400
@@ -56,6 +56,21 @@
* - collection of kernels to implement and run neural networks using previously obtained training data, on the GPU
* - new image processing filters to perform color-conversion and for building a gaussian pyramid
*
+ * In iOS11, MetalPerformanceShaders.framework adds support for the following kernels:
+ * - Image Processing Filters: FindKeypoints, Statistics (Min-Max, Mean-Variance, Mean), Arithmetic Operations, Bilinear scale
+ * Histogram filter takes a minPixelThresholdValue when computing histogram
+ * - Linear Algebra Primitives: Triangular, LU and Cholesky Solvers, LU and Cholesky Decomposition
+ * Support for multiple input types for Matrix-Matrix Multiplication
+ * Matrix-Vector Multiply (gemv)
+ * - Convolution Neural Networks: New Neuron Functions: HardSigmoid, SoftELU, ELU, PReLU, ReLUN
+ * Convolution Transpose, Depth-wise Convolution, Dilated Convolution, Sub-pixel Convolution
+ * Dilated Pooling, Upsampling
+ * - Recurrent Neural Networks
+ * - A neural network graph API that makes it easy to create and execute neural networks on the GPU
+ *
+ * The MetalPerformanceShaders.framework is now also available as API in macOS 10.13. All primitives/filters supported
+ * by the framework in iOS 11 are also avalable on macOS 10.13.
+ *
* @subsection subsection_usingMPS Using MPS
* To use MPS:
* @code
@@ -175,7 +190,7 @@
* label, should one be required. From this are derived the MPSUnaryImageKernel and MPSBinaryImageKernel
* sub-classes which define shared behavior for most image processing kernels (filters) such as
* edging modes, clipping and tiling support for image operations that consume one or two source textures.
- * Neither these or the MPSKernel are typically be used directly. They just provide API abstraction
+ * Neither these or the MPSKernel are typically used directly. They just provide API abstraction
* and in some cases may allow some level of polymorphic manipulation of MPS image kernel objects.
*
* Subclasses of the MPSUnaryImageKernel and MPSBinaryImageKernel provide specialized -init and -encode
@@ -282,22 +297,24 @@
*
* @subsubsection subsubsection_options MPSKernelOptions
* Each MPSKernel takes a MPSKernelOptions bit mask to indicate various options to use when running the filter:
- *
+ * @code
* typedef NS_OPTIONS(NSUInteger, MPSKernelOptions)
*
- * MPSKernelOptionsNone Use default options
+ * MPSKernelOptionsNone
+ * Use default options
*
- * MPSKernelOptionsSkipAPIValidation Do not spend time looking at parameters passed to MPS
- * for errors.
- *
- * MPSKernelOptionsAllowReducedPrecision When possible, MPSKernels use a higher precision data representation internally than
- * the destination storage format to avoid excessive accumulation of computational
- * rounding error in the result. MPSKernelOptionsAllowReducedPrecision advises the
- * MPSKernel that the destination storage format already has too much precision for
- * what is ultimately required downstream, and the MPSKernel may use reduced precision
- * internally when it feels that a less precise result would yield better performance.
- * When enabled, the precision of the result may vary by hardware and operating system.
+ * MPSKernelOptionsSkipAPIValidation
+ * Do not spend time looking at parameters passed to MPS for errors.
*
+ * MPSKernelOptionsAllowReducedPrecision
+ * When possible, MPSKernels use a higher precision data representation internally than
+ * the destination storage format to avoid excessive accumulation of computational
+ * rounding error in the result. MPSKernelOptionsAllowReducedPrecision advises the
+ * MPSKernel that the destination storage format already has too much precision for
+ * what is ultimately required downstream, and the MPSKernel may use reduced precision
+ * internally when it feels that a less precise result would yield better performance.
+ * When enabled, the precision of the result may vary by hardware and operating system.
+ * @endcode
* @section subsection_availableFilters Available MPSKernels
*
* @subsection subsection_convolution Image Convolution
@@ -575,6 +592,8 @@
* MPSCNNNeuronSoftPlus <MPSNeuralNetwork/MPSCNNConvolution.h> A parametric SoftPlus neuron activation function a*log(1+e**(b*x))
* MPSCNNNeuronSoftSign <MPSNeuralNetwork/MPSCNNConvolution.h> A SoftSign neuron activation function x/(1+|x|)
* MPSCNNNeuronELU <MPSNeuralNetwork/MPSCNNConvolution.h> A parametric ELU neuron activation function x<0 ? (a*(e**x-1)) : x
+ * MPSCNNNeuronReLUN <MPSNeuralNetwork/MPSCNNConvolution.h> A rectified linear N neuron activation function min((x >= 0 ? x : a * x), b)
+ * MPSCNNNeuronPReLU <MPSNeuralNetwork/MPSCNNConvolution.h> ReLU, except a different a value is provided for each feature channel
* MPSCNNConvolution <MPSNeuralNetwork/MPSCNNConvolution.h> A 4D convolution tensor
* MPSCNNConvolutionTranspose <MPSNeuralNetwork/MPSCNNConvolution.h> A 4D convolution transpose tensor
* MPSCNNFullyConnected <MPSNeuralNetwork/MPSCNNConvolution.h> A fully connected CNN layer
@@ -611,13 +630,10 @@
* the application can make a large MPSImage or MPSTemporaryImage and fill in parts of it with multiple layers
* (as long as the destination feature channel offset is a multiple of 4).
*
- * The standard MPSCNNConvolution operator also does dilated convolution and sub-pixel convolution. There are
- * also bit-wise convolution operators that can use only a single bit for precision of the weights. The
- * precision of the image can be reduced to 1 bit in this case as well. The bit {0,1} represents {-1,1}.
- *
- * @subsection subsection_RNN Recurrent Neural Networks
- *
- * @subsection subsection_matrix_primitives Matrix Primitives
+ * The standard MPSCNNConvolution operator also does dilated convolution, sub-pixel convolution and
+ * depth-wise convolution. There are also bit-wise convolution operators that can use only a single bit
+ * for precision of the weights. The precision of the image can be reduced to 1 bit in this case as well.
+ * The bit {0,1} represents {-1,1}.
*
* Some CNN Tips:
* - Think carefully about the edge mode requested for pooling layers. The default is clamp to zero, but there
@@ -645,6 +661,11 @@
* - Because MPS encodes its work in place in your MTLCommandBuffer, you always have the option to insert your own
* code in between MPSCNNKernels as a Metal shader for tasks not covered by MPS. You need not use MPS for everything.
*
+ *
+ * @subsection subsection_RNN Recurrent Neural Networks
+ *
+ * @subsection subsection_matrix_primitives Matrix Primitives
+ *
* @section section_validation MPS API validation
* MPS uses the same API validation layer that Metal uses to alert you to API mistakes while
* you are developing your code. While this option is turned on (Xcode: Edit Scheme: options: Metal API Validation),
@@ -869,6 +890,25 @@
* be exceptionally costly because the wait for new work to appear allows the GPU clock to spin down.
* Factor of two or more performance increases are common with -addCompletedHandler:.
*
+ * A graph can also be encoded using the higher level -[MPSNNGraph executeAsyncWithSourceImages:completionHandler:]
+ * which requires minimal experience with Metal. Assuming you have already gotten a list of MPSImages as input
+ * to your graph (typically one), you may use that instead:
+ *
+ * @code
+ * MPSImage * result = [k[0] executeAsyncWithSourceImages: @[image]
+ * completionHandler: ^(MPSImage * __nullable i, NSError * __nullable error ){
+ * if( error)
+ * MyLogError("Error: -computeAsyncWithSourceImages:completionHandler: failed: %s\n\t",
+ * [error.localizedDescription cStringUsingEncoding: NSASCIIStringEncoding],
+ * [error.localizedFailureReason cStringUsingEncoding: NSASCIIStringEncoding]);
+ *
+ * MyProcessResult(i);
+ * }];
+ * @endcode
+ * The image returned directly from the left hand side of -executeAsyncWithSourceImages:completionHandler:
+ * and passed into the completion hander are the same. The contents of the image will be valid once the
+ * completion handler is called.
+ *
* @section subsection_mpsnngraph_sizing MPSNNGraph intermediate image sizing and centering
* The MPSNNGraph will automatically size and center the intermediate images that appear in the graph.
* However, different neural network frameworks do so differently. In addition, some filters may