Article category - R

R language technical learning and exchange
Summary: The Stan project for statistical computation has a great collection of curated case studies which anybody can contribute to, maybe even me, I was thin... Read more
posted @ 2017-08-15 10:21 payton数据之旅 Views (188) Comments (0) Recommendations (0)
Summary: When I first started using R there was no such thing as the tidyverse. Although some of the tidyverse packages were available independently, I learned... Read more
posted @ 2017-08-15 10:13 payton数据之旅 Views (167) Comments (0) Recommendations (0)
Summary: Many introductions to Bayesian analysis use relatively simple didactic examples (e.g. making inference about the probability of success given bernoull... Read more
posted @ 2017-08-09 13:40 payton数据之旅 Views (327) Comments (0) Recommendations (0)
Summary: In my last posts ([here](http://flovv.github.io/Logo_detection_deep_learning/) and here), I described how one can detect logos in images with R. The fir... Read more
posted @ 2017-08-03 10:38 payton数据之旅 Views (310) Comments (0) Recommendations (0)
Summary: As promised in my last post, here is a short guide with some tips and tricks for building a documentation website for an R package using pkgdown. In th... Read more
posted @ 2017-08-02 14:40 payton数据之旅 Views (225) Comments (0) Recommendations (0)
Summary: In this post, I'd like to focus on data munging, i.e. the process of acquiring and arranging data (typically in a tidy manner) prior to data analysis. Read more
posted @ 2017-08-02 14:36 payton数据之旅 Views (186) Comments (0) Recommendations (0)
Summary: In the third part in a series on Tidy Time Series Analysis, we’ll use the runCor function from TTR to investigate rolling (dynamic) correlations. We’ll... Read more
posted @ 2017-07-31 10:04 payton数据之旅 Views (424) Comments (0) Recommendations (0)
Summary: For R users, there hasn’t been a production-grade solution for deep learning (sorry MXNET). This post introduces the Keras interface for R and how it c... Read more
posted @ 2017-06-09 15:51 payton数据之旅 Views (378) Comments (0) Recommendations (0)
Summary: Contents: Database connection error: negative length vectors are not allowed; Database connection error: first argument is not an open RODBC channel; Database connection error: incorrect number of dimensions; RStudio... Read more
posted @ 2017-05-29 00:45 payton数据之旅 Views (4464) Comments (0) Recommendations (0)
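Summary: Overview. The data.table package is an extremely high-performance data processing package; its data manipulation code is exceptionally concise and very fast. Because data.table's syntax is built mainly around [], some usages differ from the base functions, so I did not cover it in the previous two topics and am treating it separately instead. In this series I will explain in detail how data.table differs from the base functions and systematically cover data.t... Read more
posted @ 2017-05-26 10:27 payton数据之旅 Views (1309) Comments (0) Recommendations (0)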
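Summary: Amazon designated MXNet as its official deep learning platform, and on January 23 MXNet became an Apache incubator project. These developments have undoubtedly pushed MXNet into the deep learning boom and made it a much-hyped project. Naturally, learning MXNet is well worth the effort. Haha, go deep learning! The following languages are currently supported: Python, R, C++, Julia, Scala. Python R... Read more
posted @ 2017-04-12 15:38 payton数据之旅 Views (1196) Comments (0) Recommendations (0)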
Summary: An increasing amount of data is available on the web. Web scraping is a technique developed to extract data from web pages automatically and transforming... Read more
posted @ 2017-04-11 09:58 payton数据之旅 Views (395) Comments (0) Recommendations (0)
Summary: Introduction. If you ask any experienced analytics or data science professional, what differentiates a good model from a bad model – chances are that y... Read more
posted @ 2017-04-10 09:47 payton数据之旅 Views (205) Comments (0) Recommendations (0)
Summary: Introduction. Reinforcement learning has recently gained a great deal of traction in studies that call for human-like learning. In settings where an ex... Read more
posted @ 2017-04-10 09:45 payton数据之旅 Views (179) Comments (0) Recommendations (0)
Summary: Introduction. If you have spent some time in machine learning and data science, you would have definitely come across imbalanced class distribution. Th... Read more
posted @ 2017-03-17 13:49 payton数据之旅 Views (503) Comments (0) Recommendations (0)
Summary: Machine learning uses so-called features (i.e. variables or attributes) to generate predictive models. Using a suitable combination of features is ess... Read more
posted @ 2017-03-17 10:46 payton数据之旅 Views (291) Comments (0) Recommendations (0)
Summary: This week I want to show how to run machine learning applications on a Spark cluster. I am using the sparklyr package, which provides a handy interfac... Read more
posted @ 2017-02-20 11:23 payton数据之旅 Views (237) Comments (0) Recommendations (0)
Summary: Introduction. Over the last 12 months, I have been participating in a number of machine learning hackathons on Analytics Vidhya and Kaggle competitions... Read more
posted @ 2017-02-15 14:53 payton数据之旅 Views (464) Comments (0) Recommendations (0)
Summary: For the interactive presentation version, click here. # dir <- 'F:/Projects/Rpackage/streamlineR' dir <- 'C:/Users/Jianhua/Dropbox/work_doc/Rpackage/streamlineR' knitr::opts_chunk$set(echo = TR... Read more
posted @ 2017-01-20 14:03 payton数据之旅 Views (785) Comments (0) Recommendations (0)
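Summary: Classification models are among the most widely used algorithms in data mining; common classification algorithms include logistic models, decision trees, random forests, neural networks, and boosting. With so many algorithms available for the same dataset, how do we judge which model is the more reasonable choice? This article covers the commonly used model validation tools, mainly the confusion matrix, the ROC curve, lift, gain, and the KS statistic. Read more
posted @ 2016-12-20 20:09 payton数据之旅 Views (709) Comments (0) Recommendations (0)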