Post category - R
R language learning and sharing
Summary: I want to consider an approach to forecasting that I really like and frequently use. It allows you to include promo campaigns (or other activities and ot...
Summary: This is the third post about LifeCycle Grids. You can find the first post about the sense of LifeCycle Grids and A-Z process for creating and visualizi...
Summary: We studied a very powerful approach for customer segmentation in the previous post, which is based on the customer’s lifecycle. We used two metrics: fre...
Summary: I want to share a very powerful approach for customer segmentation in this post. It is based on the customer’s lifecycle, specifically on frequency and recen...
Summary: I love interactive pivot tables. That is the number one reason why I keep using spreadsheet software. The ability to look at data quickly in lots of di...
Summary: Targeted learning methods build machine-learning-based estimators of parameters defined as features of the probability distribution of the data, while...
Summary: The Central Limit Theorem (CLT), and the concept of the sampling distribution, are critical for understanding why statistical inference works. There a...
Summary: Machine learning is a branch of computer science that studies the design of algorithms that can learn. Typical machine learning tasks are concept lear...
Summary: As always a more colourful version of this post is available on rpubs. Even if LM are very simple models at the basis of many more complex ones, LM stil...
Summary: What Is It? A hash table, or associative array, is a well-known key-value data structure. In R there is no equivalent, but you do have some options. You...
Summary: This is to continue on the topic of using the melt/cast functions in reshape to convert between the long and wide formats of a data frame. Here is the example I ...
Summary: Statistical approaches to randomised controlled trial analysis. The statistical approach used in the design and analysis of the vast majority of clinica...
Summary: Today is a good day to start parallelizing your code. I’ve been using the parallel package since its integration with R (v. 2.14.0) and it’s much easie...
Summary: From original post @ http://analyticsblog.mecglobal.it/analytics-tools/bashr/ In the world of data analysis, the term automation runs hand in hand with ...
Summary: dplyr 0.4.0. January 9, 2015, in Uncategorized. I’m very pleased to announce that dplyr 0.4.0 is now available from CRAN. Get the latest version by running:...
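Summary: We have previously discussed several classification algorithms: KNN, decision trees, naive Bayes, SVM, ANN, and logistic regression. With so many classification algorithms, we naturally need to consider which one performs best. To evaluate classification algorithms, we need criteria for the evaluation. So far, our discussion of classification effectiveness has been based on the classification success rate, but is this metric sound?...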
Summary: This post builds on a previous post, but can be read and understood independently. As part of my course on statistical learning, we created 3D graphics ...
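Summary: (1) The C4.5 algorithm has the following characteristics. Input variables (independent variables): categorical or continuous. Output variable (target variable): categorical. Continuous variable handling: discretization into N equal parts. Branch type: multi-way. Splitting criterion: information gain ratio (after the split, the target variable's values vary less, so purity is high). Pre-pruning: whether the number of leaf nodes is below a threshold. Post-pruning: uses the confidence method and reduce...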
Summary: Joanna Zhao’s and Jenny Bryan’s R graph catalog is meant to be a complement to the physical book, Creating More Effective Graphs, but it’s a really nice ...
Summary: In preparation for an R Workgroup meeting, I started thinking about what would be my "Top 5 R Functions". I ruled out the functions for basic mechanics...
