[Android+NN] RenderScript - Parallel computing framework

Requirements and applications

Ref: Artistic style transfer & other advanced image editing

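This involves heterogeneous programming on mobile phones, which appears to be supported from Android 7.0 onward; the code is available on GitHub.

When performance hits a bottleneck, consider the parallel approach described in this post as an optimization.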

Healing Brush

Artistic Style Transfer

Some background on OpenCL first, although Android still seems to favor RenderScript.

OpenCL

Ref: https://www.zhihu.com/question/20958771

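For CPU+GPU heterogeneous programming on Android, the main platforms currently are the following:

If you are not sure whether your device supports OpenCL, you can check it with OpenCL-Z Android. The app shows detailed OpenCL device information and runs a micro-benchmark to measure the device's compute capability. Download links:

Google Play: OpenCL-Z (Android app on Google Play)
Wandoujia: OpenCL-Z for Android, free download (this page includes a list of phone GPU support status)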

Ref: http://blog.csdn.net/dj0379/article/details/39484061

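Using OpenCL on Android to call the GPU for acceleration

A program such as a Sobel filter is used to demonstrate OpenCL-based parallelization on the Android platform.

Goal: compile the parallel algorithm implemented in OpenCL into a libSobelFilter.so (any lib***.so name works) that an Android project can call, then invoke the algorithm in that file from the app to run it in parallel.

The parallel speedup is decent!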
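When porting a filter like this to OpenCL, it helps to have a serial CPU reference to compare the parallel output against. The following is a minimal plain-Java sketch of a Sobel filter, not the code from the referenced post: the class name SobelReference, the grayscale int[] layout, and the |gx| + |gy| magnitude approximation clamped to 255 are all assumptions made for illustration.

```java
// Serial CPU reference for a 3x3 Sobel edge filter.
// Hypothetical helper for checking a parallel implementation; not from the referenced post.
final class SobelReference {

    // gray: row-major 8-bit grayscale values (0..255), size w*h.
    // Border pixels are left at 0 because the 3x3 window does not fit there.
    static int[] sobel(int[] gray, int w, int h) {
        int[] out = new int[w * h];
        for (int y = 1; y < h - 1; y++) {
            for (int x = 1; x < w - 1; x++) {
                int p00 = gray[(y - 1) * w + (x - 1)];
                int p01 = gray[(y - 1) * w + x];
                int p02 = gray[(y - 1) * w + (x + 1)];
                int p10 = gray[y * w + (x - 1)];
                int p12 = gray[y * w + (x + 1)];
                int p20 = gray[(y + 1) * w + (x - 1)];
                int p21 = gray[(y + 1) * w + x];
                int p22 = gray[(y + 1) * w + (x + 1)];

                // Horizontal and vertical gradients from the two Sobel masks.
                int gx = (p02 + 2 * p12 + p22) - (p00 + 2 * p10 + p20);
                int gy = (p20 + 2 * p21 + p22) - (p00 + 2 * p01 + p02);

                // Assumption: |gx| + |gy| approximates the gradient magnitude.
                out[y * w + x] = Math.min(255, Math.abs(gx) + Math.abs(gy));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // 3x3 image with a vertical edge between the 2nd and 3rd columns.
        int[] img = {0, 0, 255, 0, 0, 255, 0, 0, 255};
        System.out.println(java.util.Arrays.toString(sobel(img, 3, 3)));
    }
}
```

Every output pixel depends only on its own 3x3 neighbourhood, which is why the filter maps naturally onto one OpenCL work-item per pixel.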

Ref: https://streamcomputing.eu/blog/2014-06-30/opencl-support-recent-android-smartphones/

Is parallel computing possible in mobile hardware like Android or IOS phones?

Most mobile GPU vendors like ARM (Mali), Imagination Tech (PowerVR) and Qualcomm (Adreno) have or are working on embedded OpenCL drivers and tools.

On Android, you can use the NDK, with C++ support to access the OpenCL libraries.

There is also an alternative for Android, which is Renderscript. It's a language developed by Google, so it's embedded in every Android device, and capable of accessing the GPU since version 4.2.2 if I remember correctly.

iOS should be OpenCL capable, since Apple is one of the inventors of the language. I never tried this though. 

Is parallel computing possible in mobile hardware like Android or IOS phones?. Available from: https://www.researchgate.net/post/Is_parallel_computing_possible_in_mobile_hardware_like_Android_or_IOS_phones [accessed Apr 7, 2017].


Other links:

Checking Android phone and GPU info

OpenCL-Z

Installing OpenCL on Linux: Ubuntu 14.04 + OpenCL 1.1

Although AMD, Intel, NVIDIA, Apple and others support OpenCL, Google does not seem to support it much.

RenderScript

Introduction

RenderScript is a parallel computing framework that is widely used in image-processing Android applications.

On the other hand, image filters based on deep neural networks (DNNs), which traditionally run on desktops or servers, are gaining more and more attention.

With the help of RenderScript's CPU and GPU acceleration, these compute-intensive applications are now feasible on mobile devices.

What you will learn

- How to implement a convolutional neural network on top of RenderScript.
- How other advanced image editing features, such as Healing Brush, are built on top of RenderScript.
- Tips to improve performance.
- How to use the RenderScript Support Library.

What you will need

- Android Studio 2.2.3 or above
- Build Tools 25.0.3 or above
- Android device running 7.0 (Nougat) or above

BLAS implementation

Make it faster

The implementation so far still uses straightforward convolution. It is easy to understand, but rather slow. Fortunately, there is a much faster way to implement 2D convolution: turning it into a matrix-matrix multiplication.

In general, the implementation contains two steps:

1. Convert the input image to a column-matrix, by duplicating the data based on the shape of the convolution kernel. This operation is known as im2col.
2. Multiply the column-matrix with the kernel matrix.
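The two steps above can be sketched in plain Java, independent of RenderScript. This is a minimal single-channel illustration with stride 1 and no padding, not the codelab's convolve2DGEMM: the class Im2ColSketch and all of its method names are hypothetical, and the naive triple-loop gemm only stands in for an optimized BLAS routine.

```java
// Single-channel, stride-1, no-padding sketch. All names are illustrative
// and are not taken from the codelab's Convolution2D.java.
final class Im2ColSketch {

    // Reference: direct sliding-window convolution (cross-correlation form).
    static float[] convolveDirect(float[] img, int h, int w,
                                  float[] kernel, int kh, int kw) {
        int oh = h - kh + 1, ow = w - kw + 1;
        float[] out = new float[oh * ow];
        for (int y = 0; y < oh; y++) {
            for (int x = 0; x < ow; x++) {
                float sum = 0f;
                for (int i = 0; i < kh; i++) {
                    for (int j = 0; j < kw; j++) {
                        sum += kernel[i * kw + j] * img[(y + i) * w + (x + j)];
                    }
                }
                out[y * ow + x] = sum;
            }
        }
        return out;
    }

    // Step 1 (im2col): each output position becomes one column holding the
    // kh*kw input values under the kernel window.
    // Result is row-major with shape (kh*kw) x (oh*ow).
    static float[] im2col(float[] img, int h, int w, int kh, int kw) {
        int oh = h - kh + 1, ow = w - kw + 1;
        int cols = oh * ow;
        float[] col = new float[kh * kw * cols];
        for (int i = 0; i < kh; i++) {
            for (int j = 0; j < kw; j++) {
                int row = i * kw + j;
                for (int y = 0; y < oh; y++) {
                    for (int x = 0; x < ow; x++) {
                        col[row * cols + y * ow + x] = img[(y + i) * w + (x + j)];
                    }
                }
            }
        }
        return col;
    }

    // Step 2 (GEMM): C = A * B with A (m x k) and B (k x n), all row-major.
    static float[] gemm(float[] a, float[] b, int m, int k, int n) {
        float[] c = new float[m * n];
        for (int r = 0; r < m; r++) {
            for (int q = 0; q < n; q++) {
                float sum = 0f;
                for (int p = 0; p < k; p++) {
                    sum += a[r * k + p] * b[p * n + q];
                }
                c[r * n + q] = sum;
            }
        }
        return c;
    }

    // Convolution as im2col followed by a (1 x kh*kw) * (kh*kw x oh*ow) product.
    static float[] convolveGemm(float[] img, int h, int w,
                                float[] kernel, int kh, int kw) {
        int oh = h - kh + 1, ow = w - kw + 1;
        float[] col = im2col(img, h, w, kh, kw);
        return gemm(kernel, col, 1, kh * kw, oh * ow);
    }

    public static void main(String[] args) {
        float[] img = new float[16];
        for (int i = 0; i < img.length; i++) img[i] = i + 1;
        float[] kernel = {1f, 0f, 0f, -1f};
        System.out.println(java.util.Arrays.toString(convolveGemm(img, 4, 4, kernel, 2, 2)));
    }
}
```

With several output channels, the kernel matrix gets one row per filter, so a single GEMM call computes every filter at once. The price is memory: im2col duplicates each input value up to kh*kw times.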

Now let's make the change:

Open Convolution2D.java, look for " // TODO Step2: Use convolve2DGEMM instead. ", and replace

Allocation out_alloc = convolve2D(img_padded, img_h, img_w);

with

Allocation out_alloc = convolve2DGEMM(img_padded, img_h, img_w);

RenderScript provides a high-performance BLAS implementation on Android, especially for GEMM (GEneral Matrix Multiplication).

For details, see https://developer.android.com/reference/android/renderscript/ScriptIntrinsicBLAS.html

Summary

As you have already seen, RenderScript is capable of many things, from image processing to deep learning.

As a recap, RenderScript has several strong points:

- Easy-to-use programming model.
- High-performance CPU / GPU acceleration.
- Backward compatibility down to Android 2.3 through the RenderScript Support Library.
- Highly efficient intrinsics: Blur, YuvToRgba, BLAS, etc.

More details can be found on: https://developer.android.com/guide/topics/renderscript/compute.html

What is RenderScript

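To develop with OpenCL on Android, the phone must ship a libopencl.so file (that is, the OpenCL driver). Few Android phones currently include this file, because although AMD, Intel, NVIDIA, Apple and others support OpenCL, Google does not seem to support it much.

On mobile, Google has RenderScript, an API that is also built on the idea of heterogeneous computing. Its advantage is good portability across all kinds of Android systems, but its performance is slightly below OpenCL's. Google is more likely to promote its own API, so most Android phones support RenderScript while few support OpenCL. (There is a small "OpenCL info" app online that can tell you whether a phone supports OpenCL.)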

Java:

int width  = mInBitmap.getWidth();
int height = mInBitmap.getHeight();

// Invert every pixel serially, one getPixel/setPixel call at a time.
for (int x = 0; x < width; x++) {
    for (int y = 0; y < height; y++) {
        int color = mInBitmap.getPixel(x, y);
        int r = 255 - Color.red(color);
        int g = 255 - Color.green(color);
        int b = 255 - Color.blue(color);
        // Color.rgb() produces a fully opaque color from the inverted channels.
        int c = Color.rgb(r, g, b);
        mOutBitmap.setPixel(x, y, c);
    }
}

RenderScript:

#pragma version(1)
#pragma rs java_package_name(com.hc.renderscript)

uchar4 __attribute__((kernel)) invert(uchar4 in)
{
  uchar4 out = in;
  out.r = 255-in.r;
  out.g = 255-in.g;
  out.b = 255-in.b;
  return out;
}
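The kernel above describes what happens to one pixel, and the runtime applies it to every pixel independently. As a rough analogy only, the same per-element model can be sketched in plain Java with a parallel stream. This is not how RenderScript is invoked from an app: the class InvertSketch and its methods are hypothetical, and pixels are assumed to be packed ARGB ints as returned by Bitmap.getPixel.

```java
// Plain-Java analogy of a per-pixel kernel; no Android dependencies.
// Hypothetical names; assumes pixels are packed as 0xAARRGGBB ints.
final class InvertSketch {

    // The "kernel": keeps alpha, inverts the red, green and blue channels.
    static int invertPixel(int argb) {
        return (argb & 0xFF000000) | (~argb & 0x00FFFFFF);
    }

    // The "launch": applies invertPixel to each element. Elements are
    // independent, so the work can be spread across CPU cores.
    static int[] invertAll(int[] pixels) {
        int[] out = new int[pixels.length];
        java.util.stream.IntStream.range(0, pixels.length)
                .parallel()
                .forEach(i -> out[i] = invertPixel(pixels[i]));
        return out;
    }

    public static void main(String[] args) {
        int[] pixels = {0xFF102030, 0xFF000000, 0x80FFFFFF};
        for (int p : invertAll(pixels)) {
            System.out.println(Integer.toHexString(p));
        }
    }
}
```

Because no pixel depends on another, a runtime is free to schedule the work on however many CPU cores or GPU lanes are available, which is where the speedup over the nested getPixel/setPixel loop comes from.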

Open questions:

Does TensorFlow's convolution not perform well on Android?

The trivial implementation in RenderScript is already much faster than single threaded C++ / Java code. But in the next steps, we will make convolution even faster (~ 4X or more, actually!).

So there now seem to be two convolution implementations?

posted @ 2017-08-26 09:55 郝壹贰叁
