[Android+NN] RenderScript - Parallel computing framework
Requirements and applications
Ref: Artistic style transfer & other advanced image editing
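This involves heterogeneous programming on phones. It appears to be supported from Android 7.0 onward, and there is code on GitHub.
When performance hits a bottleneck, consider the parallel approach in this post as an optimization.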
- Healing Brush
- Artistic Style Transfer

First, some background on OpenCL, although Android still seems to favor RenderScript.
OpenCL
Ref: https://www.zhihu.com/question/20958771
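For CPU+GPU heterogeneous programming on Android, the main platforms at present are the following:
- Google Play: OpenCL-Z, an Android app on Google Play
- Wandoujia: free download of "OpenCL-Z" for Android (this link includes a list of phone GPU support)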

Most mobile GPU vendors like ARM (Mali), Imagination Tech (PowerVR) and Qualcomm (Adreno) have or are working on embedded OpenCL drivers and tools.
On Android, you can use the NDK, with C++ support to access the OpenCL libraries.
There is also an alternative for Android, which is Renderscript. It's a language developed by Google, so it's embedded in every Android device, and capable of accessing the GPU since version 4.2.2 if I remember correctly.
iOS should be OpenCL capable, since Apple is one of the inventors of the language. I never tried this though.
Is parallel computing possible in mobile hardware like Android or IOS phones? Available from: https://www.researchgate.net/post/Is_parallel_computing_possible_in_mobile_hardware_like_Android_or_IOS_phones [accessed Apr 7, 2017].
Other links:
Checking Android phone and GPU info. Although AMD, Intel, NVIDIA, Apple and others support OpenCL, Google does not seem to support it much.
RenderScript
Introduction
RenderScript is a parallel computing framework that is widely used in image-processing Android applications.
Meanwhile, Deep Neural Net (DNN) based image filters, which traditionally run on desktops or servers, are gaining more and more attention.
With RenderScript's CPU and GPU acceleration, these compute-intensive applications are now feasible on mobile devices.
What you will learn
- How to implement a Convolutional Neural Net on top of RenderScript.
- Other advanced image-editing implementations, such as Healing Brush, built on top of RenderScript.
- Tips to improve performance.
- How to use RenderScript Support Library.
What you will need
- Android Studio 2.2.3 or above
- Build Tools 25.0.3 or above
- Android device running 7.0 (Nougat) or above
BLAS implementation
Make it faster
So far the code still uses the straightforward convolution implementation. It is easy to understand, but rather slow. Fortunately, there is a much faster way to implement 2D convolution: turn it into a matrix-matrix multiplication.
In general, the implementation contains two steps:
- Convert the input image to a column-matrix by duplicating the data according to the shape of the convolution kernel. This operation is known as im2col.
- Multiply the column-matrix by the kernel matrix.
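The two steps above can be sketched in plain Java. This is only a minimal illustration, not the codelab's convolve2DGEMM: it assumes a single-channel image, stride 1, no padding, and one filter, and the class and method names (Im2ColConv, im2col, matmul, convolveGemm, convolveDirect) are made up for this example. Each image patch is stored as one row of the unrolled matrix, which is just one possible layout.

```java
import java.util.Arrays;

/** Illustrative single-channel 2D convolution via im2col + matrix multiply. */
final class Im2ColConv {

    /**
     * Copies every kh x kw patch of a row-major h x w image into one row of a
     * (outH * outW) x (kh * kw) matrix. Stride 1, no padding.
     */
    static float[] im2col(float[] img, int h, int w, int kh, int kw) {
        int outH = h - kh + 1;
        int outW = w - kw + 1;
        float[] cols = new float[outH * outW * kh * kw];
        int idx = 0;
        for (int y = 0; y < outH; y++) {
            for (int x = 0; x < outW; x++) {
                for (int ky = 0; ky < kh; ky++) {
                    for (int kx = 0; kx < kw; kx++) {
                        cols[idx++] = img[(y + ky) * w + (x + kx)];
                    }
                }
            }
        }
        return cols;
    }

    /** Naive row-major product: C (m x n) = A (m x k) * B (k x n). */
    static float[] matmul(float[] a, float[] b, int m, int k, int n) {
        float[] c = new float[m * n];
        for (int i = 0; i < m; i++) {
            for (int j = 0; j < n; j++) {
                float acc = 0f;
                for (int p = 0; p < k; p++) {
                    acc += a[i * k + p] * b[p * n + j];
                }
                c[i * n + j] = acc;
            }
        }
        return c;
    }

    /** Convolution (no kernel flip, as is usual in neural nets) as im2col plus one multiply. */
    static float[] convolveGemm(float[] img, int h, int w, float[] kernel, int kh, int kw) {
        int outH = h - kh + 1;
        int outW = w - kw + 1;
        float[] cols = im2col(img, h, w, kh, kw);
        // The flattened kernel is a (kh * kw) x 1 matrix; more filters would add more columns.
        return matmul(cols, kernel, outH * outW, kh * kw, 1);
    }

    /** Direct sliding-window convolution, kept only to cross-check the result. */
    static float[] convolveDirect(float[] img, int h, int w, float[] kernel, int kh, int kw) {
        int outH = h - kh + 1;
        int outW = w - kw + 1;
        float[] out = new float[outH * outW];
        for (int y = 0; y < outH; y++) {
            for (int x = 0; x < outW; x++) {
                float acc = 0f;
                for (int ky = 0; ky < kh; ky++) {
                    for (int kx = 0; kx < kw; kx++) {
                        acc += img[(y + ky) * w + (x + kx)] * kernel[ky * kw + kx];
                    }
                }
                out[y * outW + x] = acc;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        float[] img = {1, 2, 3, 4, 5, 6, 7, 8, 9};   // 3 x 3 image
        float[] kernel = {1, 0, 0, 1};                // 2 x 2 kernel
        System.out.println(Arrays.toString(convolveGemm(img, 3, 3, kernel, 2, 2)));
        // prints [6.0, 8.0, 12.0, 14.0]
    }
}
```

The unrolled matrix duplicates overlapping pixels, so it trades extra memory for the chance to hand all the arithmetic to one well-optimized matrix multiply.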
Now let's make the change:
Open Convolution2D.java and look for "// TODO Step2: Use convolve2DGEMM instead." Replace the convolve2D call with convolve2DGEMM:
// Before:
// Allocation out_alloc = convolve2D(img_padded, img_h, img_w);
// After:
Allocation out_alloc = convolve2DGEMM(img_padded, img_h, img_w);
RenderScript provides a high-performance BLAS implementation on Android, especially for GEMM (GEneral Matrix Multiplication).
For details, see https://developer.android.com/reference/android/renderscript/ScriptIntrinsicBLAS.html
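For reference, a single-precision GEMM computes C = alpha * A * B + beta * C. The plain-Java sketch below only illustrates that arithmetic on row-major arrays; it is not the RenderScript intrinsic, and the class name SgemmReference and its sgemm signature are hypothetical. As far as I recall, the intrinsic itself is called as ScriptIntrinsicBLAS.SGEMM(TransA, TransB, alpha, A, B, beta, C) with Allocation arguments; check the linked reference before relying on that.

```java
import java.util.Arrays;

/** Plain-Java reference for the arithmetic of SGEMM: C = alpha * A * B + beta * C. */
final class SgemmReference {

    /**
     * Updates c in place. a is m x k, b is k x n, c is m x n, all row-major.
     * No transposition is supported in this sketch.
     */
    static void sgemm(float alpha, float[] a, float[] b, float beta, float[] c,
                      int m, int k, int n) {
        for (int i = 0; i < m; i++) {
            for (int j = 0; j < n; j++) {
                float acc = 0f;
                for (int p = 0; p < k; p++) {
                    acc += a[i * k + p] * b[p * n + j];
                }
                c[i * n + j] = alpha * acc + beta * c[i * n + j];
            }
        }
    }

    public static void main(String[] args) {
        float[] a = {1, 2, 3, 4};   // 2 x 2
        float[] b = {5, 6, 7, 8};   // 2 x 2
        float[] c = new float[4];   // 2 x 2, starts at zero
        sgemm(1f, a, b, 0f, c, 2, 2, 2);
        System.out.println(Arrays.toString(c));
        // prints [19.0, 22.0, 43.0, 50.0]
    }
}
```

With alpha = 1 and beta = 0 this reduces to a plain matrix product, which is the case a convolution via im2col needs.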
Summary
As you have already seen, RenderScript is capable of many things, from image processing to deep learning.
As a recap, RenderScript has several strong points:
- Easy to use programming model.
- High performance CPU / GPU acceleration.
- Backward compatible to Android 2.3 via the RenderScript Support Library.
- Highly efficient intrinsics: Blur, YuvToRgba, BLAS, etc.
More details can be found on: https://developer.android.com/guide/topics/renderscript/compute.html
What is RenderScript
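To develop with OpenCL on Android, the device must ship a libopencl.so file (that is, the OpenCL driver). Few Android phones currently include it, because although AMD, Intel, NVIDIA, Apple and others support OpenCL, Google does not seem to support it much.
On mobile, Google has RenderScript, an API that is also built on the idea of heterogeneous computing. Its advantage is good portability across Android versions, though its performance is slightly below OpenCL. Google is more likely to promote its own API, so most Android phones support RenderScript while few support OpenCL. (There is a small "opencl info" app online that can tell you whether a phone supports OpenCL.)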
Java:
// Invert every pixel one at a time on a single thread.
int width = mInBitmap.getWidth();
int height = mInBitmap.getHeight();
for (int x = 0; x < width; x++) {
    for (int y = 0; y < height; y++) {
        int color = mInBitmap.getPixel(x, y);
        // Invert each channel (Color is android.graphics.Color).
        int r = 255 - Color.red(color);
        int g = 255 - Color.green(color);
        int b = 255 - Color.blue(color);
        int c = Color.rgb(r, g, b);
        mOutBitmap.setPixel(x, y, c);
    }
}
RenderScript:
#pragma version(1)
#pragma rs java_package_name(com.hc.renderscript)

uchar4 __attribute__((kernel)) invert(uchar4 in)
{
    uchar4 out = in;
    out.r = 255 - in.r;
    out.g = 255 - in.g;
    out.b = 255 - in.b;
    return out;
}
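What separates the two versions is the programming model: the kernel describes the work for one pixel, and the runtime decides how to spread the pixels across cores. A rough plain-Java analogue of that model, using parallel streams instead of RenderScript, is sketched below. It assumes pixels are packed ARGB ints (as returned by Bitmap.getPixels), and the class name ParallelInvert is hypothetical. It does not use the GPU and says nothing about RenderScript's actual scheduling.

```java
import java.util.Arrays;

/** Per-pixel "kernel" applied in parallel over packed ARGB pixels (illustrative only). */
final class ParallelInvert {

    /** Inverts R, G and B of one pixel and keeps alpha, like out = in in the kernel above. */
    static int invertPixel(int argb) {
        return (argb & 0xFF000000) | (~argb & 0x00FFFFFF);
    }

    /** Applies invertPixel to every element; the stream runtime splits the work across threads. */
    static int[] invert(int[] pixels) {
        return Arrays.stream(pixels)
                .parallel()
                .map(ParallelInvert::invertPixel)
                .toArray();
    }

    public static void main(String[] args) {
        int[] out = invert(new int[] {0xFF102030});
        System.out.println(Integer.toHexString(out[0]));
        // prints ffefdfcf
    }
}
```

Because each output pixel depends only on the matching input pixel, the order of evaluation does not matter, which is what makes this kind of kernel easy to parallelize.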
Open questions:
Does TensorFlow's convolution operation perform poorly on Android?
The trivial implementation in RenderScript is already much faster than single threaded C++ / Java code. But in the next steps, we will make convolution even faster (~ 4X or more, actually!).
So it seems there are now two convolution implementations?
