默如诉

August 30, 2014

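Summary: This week I mainly implemented a BP feedforward neural network in CUDA, and ran into quite a few problems along the way. 1. In batch gradient descent, when updating the weights and biases I did not divide the accumulated error terms and bias terms by the total number of samples, so every update was far larger than the true value and the program's outputs ended up all 1s or all 0s. I finally found this bug by running the code in MATLAB and checking it against Prof. Li Chunguang's (李春光) neural computation course slides. 2. CUDA running with multiple blocks and multiple threads...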

posted @ 2014-08-30 01:24 by 默如诉 | Views (436) | Comments (0) | Recommended (0)
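The first bug in the summary above (forgetting to divide the accumulated terms by the sample count) can be illustrated with a small host-side sketch. This is not the post's code: the function name `applyBatchUpdate` and its parameters are hypothetical, and plain C++ is used instead of a CUDA kernel so that it runs anywhere.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: one batch gradient-descent step for a weight vector.
// gradSum[i] holds the gradient of weight i summed over every sample in the batch.
void applyBatchUpdate(std::vector<float>& weights,
                      const std::vector<float>& gradSum,
                      std::size_t numSamples,
                      float learningRate) {
    assert(numSamples > 0);
    assert(weights.size() == gradSum.size());
    for (std::size_t i = 0; i < weights.size(); ++i) {
        // Average the accumulated gradient before stepping. Skipping this
        // division makes the step numSamples times too large.
        const float meanGrad = gradSum[i] / static_cast<float>(numSamples);
        weights[i] -= learningRate * meanGrad;
    }
}
```

The same averaging would apply to the bias terms. With sigmoid-style outputs, oversized steps can push activations into saturation, which may be consistent with the all-1 or all-0 outputs described in the summary.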

August 24, 2014

20140824

Summary: Matrix transpose: __global__ void TransDtD(float*des, float*src, int srcH, int srcW){ int idx = blockIdx.x*blockDim.x + threadIdx.x; // if srcH*srcW > BLOCK_NUM*THREA...

posted @ 2014-08-24 22:08 by 默如诉 | Views (293) | Comments (0) | Recommended (0)
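The kernel in the summary above is cut off, so its body is unknown. As a point of comparison, here is a CPU reference for the index mapping a one-thread-per-element transpose would typically use. It assumes row-major storage; the name `transposeHost` is hypothetical, and the loop variable `idx` plays the role of the kernel's flattened thread index.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CPU reference transpose (hypothetical helper, row-major storage assumed).
// src is srcH x srcW; des becomes srcW x srcH.
void transposeHost(std::vector<float>& des, const std::vector<float>& src,
                   int srcH, int srcW) {
    assert(srcH > 0 && srcW > 0);
    assert(src.size() == static_cast<std::size_t>(srcH) * srcW);
    des.resize(src.size());
    for (int idx = 0; idx < srcH * srcW; ++idx) {
        const int row = idx / srcW;        // row of the element in src
        const int col = idx % srcW;        // column of the element in src
        des[col * srcH + row] = src[idx];  // element (row, col) moves to (col, row)
    }
}
```

A reference like this is handy for checking a GPU result element by element after copying it back to the host.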

August 18, 2014

[Repost] Hadoop Study Notes

Summary: http://www.cnblogs.com/zjfstudio/p/3859704.html

posted @ 2014-08-18 08:53 by 默如诉 | Views (102) | Comments (0) | Recommended (0)

August 17, 2014

[Repost] Short Notes on CUDA Optimization

Summary: http://blog.csdn.net/gamesdev/article/category/1778017 Computes the sum of squares of DATA_SIZE = 1048576 random int values (4 MB of data). #define DATA_SIZE 1048576 #define THREAD_NUM 256 If multiple...

posted @ 2014-08-17 22:09 by 默如诉 | Views (292) | Comments (0) | Recommended (0)
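The sum-of-squares task in the summary above is usually parallelized by letting each of the `THREAD_NUM` threads accumulate a strided slice of the data and then adding up the partial sums. The following is a CPU emulation of that pattern, not the reposted article's code: `sumOfSquaresStrided` is a hypothetical name, and the 64-bit accumulator is an assumption made here to keep the example free of overflow.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CPU emulation of a strided parallel reduction (hypothetical helper).
// "Thread" tid handles elements tid, tid + threadNum, tid + 2*threadNum, ...
long long sumOfSquaresStrided(const std::vector<int>& data, int threadNum) {
    assert(threadNum > 0);
    std::vector<long long> partial(static_cast<std::size_t>(threadNum), 0);
    for (int tid = 0; tid < threadNum; ++tid) {
        for (std::size_t i = static_cast<std::size_t>(tid); i < data.size();
             i += static_cast<std::size_t>(threadNum)) {
            partial[static_cast<std::size_t>(tid)] +=
                static_cast<long long>(data[i]) * data[i];
        }
    }
    // Final step: combine the per-thread partial sums.
    long long total = 0;
    for (long long p : partial) total += p;
    return total;
}
```

On a GPU the strided assignment lets neighbouring threads read neighbouring addresses, which is generally friendlier to memory coalescing than giving each thread one contiguous chunk.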

[Repost] Key Points of CUDA Program Optimization

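Summary: Points to consider when optimizing CUDA programs: Precision: use double precision only in the critical steps and keep single-precision floating point everywhere else, to balance instruction throughput against precision; at present, GPU single-precision performance far exceeds double-precision performance, and the instruction throughput of integer multiplication, modulo, and remainder operations is also fairly limited. In scientific computing, because the volume of data is enormous, double or even quadruple precision is often needed to obtain reliable results; at present...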

posted @ 2014-08-17 22:07 by 默如诉 | Views (1008) | Comments (0) | Recommended (0)
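One common reading of "double precision only in the critical steps" from the summary above is to keep the data in single precision and promote only the accumulator. The sketch below shows that idea on the host; both function names are hypothetical and the choice of summation as the critical step is an assumption for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Everything in single precision: the running sum can lose low-order
// digits as it grows relative to the individual terms.
float sumAllFloat(const std::vector<float>& v) {
    assert(!v.empty());
    float acc = 0.0f;
    for (float x : v) acc += x;
    return acc;
}

// Mixed precision: the data stays float, only the accumulator is double.
double sumWithDoubleAccumulator(const std::vector<float>& v) {
    assert(!v.empty());
    double acc = 0.0;
    for (float x : v) acc += x;  // each float is widened before the add
    return acc;
}
```

Calling both on a long vector of identical small values, such as one million copies of `0.1f`, is a simple way to see how much the two accumulators diverge on a given machine.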

cuBLAS Matrix Multiplication API Explained

Summary: #include "cuda_runtime.h" #include "device_launch_parameters.h" #include #include #include "cublas_v2.h" void multiCPU(float *c, float *a, float *b, unsi...

posted @ 2014-08-17 00:19 by 默如诉 | Views (1164) | Comments (0) | Recommended (0)
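A frequent stumbling block with cuBLAS matrix multiplication is that cuBLAS assumes column-major storage while C arrays are usually row-major. The excerpt above is truncated, so it is unclear how the post handles this; the sketch below only illustrates the usual operand-swap trick on the CPU. `gemmColMajor` is a hypothetical stand-in for the no-transpose case of a BLAS-style GEMM with alpha = 1 and beta = 0, not a real cuBLAS call.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Column-major GEMM without transposes: C (m x n) = A (m x k) * B (k x n),
// with leading dimensions lda, ldb, ldc. Hypothetical CPU stand-in.
void gemmColMajor(int m, int n, int k,
                  const float* A, int lda,
                  const float* B, int ldb,
                  float* C, int ldc) {
    assert(m > 0 && n > 0 && k > 0);
    for (int col = 0; col < n; ++col) {
        for (int row = 0; row < m; ++row) {
            float acc = 0.0f;
            for (int p = 0; p < k; ++p) acc += A[row + p * lda] * B[p + col * ldb];
            C[row + col * ldc] = acc;
        }
    }
}

// Row-major C (m x n) = A (m x k) * B (k x n) using only the column-major routine.
// A row-major buffer read as column-major is the transpose, and C^T = B^T * A^T,
// so the operands and the m/n dimensions are swapped.
void multiplyRowMajor(std::vector<float>& C, const std::vector<float>& A,
                      const std::vector<float>& B, int m, int n, int k) {
    C.assign(static_cast<std::size_t>(m) * n, 0.0f);
    gemmColMajor(n, m, k, B.data(), n, A.data(), k, C.data(), n);
}
```

The same swap (pass B first, then A, with the dimensions exchanged) is what lets a row-major program use a column-major GEMM without any explicit transposes.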

August 16, 2014

CUDA Matrix Multiplication: Complete Code

Summary: #include "cuda_runtime.h" #include "device_launch_parameters.h" #include #include #include #include "cublas_v2.h" #define BLOCK_SIZE 16 cudaError_t multiC...

posted @ 2014-08-16 21:06 by 默如诉 | Views (1916) | Comments (0) | Recommended (0)
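The `BLOCK_SIZE 16` definition in the excerpt above suggests a tiled multiplication, although the truncated summary does not confirm it. Under that assumption, the loop structure of a tiled multiply looks roughly like the CPU sketch below. The name `multiplyTiled`, the square-matrix restriction, and row-major storage are all choices made here for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr int kBlockSize = 16;  // mirrors BLOCK_SIZE in the excerpt

// Row-major C (n x n) = A * B, processed in kBlockSize x kBlockSize tiles.
void multiplyTiled(std::vector<float>& C, const std::vector<float>& A,
                   const std::vector<float>& B, int n) {
    assert(n > 0);
    assert(A.size() == static_cast<std::size_t>(n) * n);
    assert(B.size() == A.size());
    C.assign(A.size(), 0.0f);
    for (int i0 = 0; i0 < n; i0 += kBlockSize) {
        const int iEnd = std::min(i0 + kBlockSize, n);  // clamp partial edge tiles
        for (int j0 = 0; j0 < n; j0 += kBlockSize) {
            const int jEnd = std::min(j0 + kBlockSize, n);
            for (int k0 = 0; k0 < n; k0 += kBlockSize) {
                const int kEnd = std::min(k0 + kBlockSize, n);
                // Accumulate the contribution of one pair of tiles.
                for (int i = i0; i < iEnd; ++i) {
                    for (int j = j0; j < jEnd; ++j) {
                        float acc = C[i * n + j];
                        for (int k = k0; k < kEnd; ++k) acc += A[i * n + k] * B[k * n + j];
                        C[i * n + j] = acc;
                    }
                }
            }
        }
    }
}
```

In a CUDA kernel the two inner tiles would typically be staged in shared memory by one thread block; here the tiling only changes the traversal order, which is enough to show the index arithmetic and the handling of sizes that are not multiples of 16.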

CUDA Matrix Multiplication

Summary: #include "cuda_runtime.h" #include "device_launch_parameters.h" #include #include #include "cublas_v2.h" #define BLOCK_SIZE 16 /***************/ Using cuBLAS's built-in...

posted @ 2014-08-16 20:58 by 默如诉 | Views (655) | Comments (0) | Recommended (0)

October 24, 2013

Binary Search and Median Finding

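Summary: Pay attention to the conditions under which the recursion stops: either the target is found at the midpoint and m is returned, or start==end and the matching index is returned, or the target is still not found and -1 is returned. int Binary_search(int *a,int start,int end,int x){ if(start==end){ if(a[start]==x) return start; else return -1; } int m=(start+end)/2; if(x>a[m]){ return Binary_search(a,m+1,end,x); } else if(x...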

posted @ 2013-10-24 22:51 by 默如诉 | Views (934) | Comments (0) | Recommended (0)
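The three stopping cases listed in the summary above (found at the midpoint, narrowed down to a match, or not found) can also be written as a loop. This iterative variant is an alternative sketch, not the post's recursive code; the name `binarySearch` and the `std::vector` interface are choices made here.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Returns an index of x in the ascending-sorted vector a, or -1 if x is absent.
int binarySearch(const std::vector<int>& a, int x) {
    assert(std::is_sorted(a.begin(), a.end()));
    int start = 0;
    int end = static_cast<int>(a.size()) - 1;
    while (start <= end) {
        const int m = start + (end - start) / 2;  // avoids overflow of start + end
        if (a[m] == x) return m;                  // found at the midpoint
        if (x > a[m]) {
            start = m + 1;                        // continue in the right half
        } else {
            end = m - 1;                          // continue in the left half
        }
    }
    return -1;                                    // range is empty: not found
}
```

Because the loop condition is `start <= end`, an empty input falls straight through to the `-1` case without a separate check.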

Merge Sort

Summary: Merging two sorted arrays into a single sorted array: int *combination(int *a,int n1,int *b,int n2){ int *c=new int[n1+n2]; int i=0; int j=0; int count=0; while((i<n1)&&(j<n2)){ if(a[i]<b[j]){ c[count++]=a[i++]; } else{ c[count++]=b[j++]; } } if(count<n1+n2){ while(i<n1){ c[count++]=a[i++]; } while...

posted @ 2013-10-24 19:37 by 默如诉 | Views (124) | Comments (0) | Recommended (0)
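The merge step from the summary above can be sketched with `std::vector`, which removes the manual `new[]` and the element counter. This is an illustrative variant rather than the post's code; the name `mergeSorted` is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Merges two ascending-sorted vectors into one ascending-sorted vector.
std::vector<int> mergeSorted(const std::vector<int>& a, const std::vector<int>& b) {
    assert(std::is_sorted(a.begin(), a.end()));
    assert(std::is_sorted(b.begin(), b.end()));
    std::vector<int> c;
    c.reserve(a.size() + b.size());
    std::size_t i = 0;
    std::size_t j = 0;
    // Take the smaller head element until one input is exhausted.
    while (i < a.size() && j < b.size()) {
        if (a[i] < b[j]) {
            c.push_back(a[i++]);
        } else {
            c.push_back(b[j++]);
        }
    }
    // At most one of these loops runs: copy whatever is left over.
    while (i < a.size()) c.push_back(a[i++]);
    while (j < b.size()) c.push_back(b[j++]);
    return c;
}
```

The two tail loops need no extra guard on the element count, since each one is a no-op when its input is already exhausted.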

相顾无言，以默如诉 (gazing at each other without a word, letting silence speak)
