bootstrapping

In this R Markdown code, the following snippets embody the idea of bootstrapping:

This part uses bootstrapping to estimate the median of the ave column in the Active and Repressed states:

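active_med <- c()
repress_med <- c()
for (b in 1:100) {
  # Resample with replacement, drawing as many values as there are observations.
  # length() of a data frame counts its columns, so measure the ave column itself.
  active_sample <- sample(active_rep$ave, size = length(active_rep$ave), replace = TRUE)
  repress_sample <- sample(repress_rep$ave, size = length(repress_rep$ave), replace = TRUE)
  # Store the median of each resample
  active_med <- c(active_med, median(active_sample))
  repress_med <- c(repress_med, median(repress_sample))
}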

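This part uses repeated random sampling to estimate the distribution of the number of 1s in result, from which the mean and standard deviation are then computed: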

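num_count <- c()
for (b in 1:1000) {
  # Resample result with replacement at its own length (the original hard-coded 276)
  sample_num <- sample(result, length(result), replace = TRUE)
  # Count the 1s in this resample
  num_count <- c(num_count, sum(sample_num == 1))
}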
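The mean and standard deviation mentioned above are not computed in the snippet itself. A minimal sketch of that step follows, assuming a made-up 0/1 vector `result` (the counts 200 and 67 are hypothetical and serve only as an illustration):

```r
set.seed(1)  # only so that the illustration is reproducible

# Hypothetical data: 200 zeros and 67 ones (not from the original analysis)
result <- c(rep(0, 200), rep(1, 67))

num_count <- numeric(1000)
for (b in 1:1000) {
  sample_num <- sample(result, length(result), replace = TRUE)
  num_count[b] <- sum(sample_num == 1)  # number of 1s in this resample
}

count_mean <- mean(num_count)  # centre of the bootstrap distribution
count_sd   <- sd(num_count)    # bootstrap estimate of the standard error
```

Preallocating `num_count` with `numeric(1000)` avoids growing the vector inside the loop; the result is the same as appending with `c()`.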

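This part uses a nested loop to compute the lower and upper bounds of the 95% confidence interval for every movie in the movie dataset, which is another application of bootstrapping: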

min_list <- c()
max_list <- c()
n_total <- 267  # total number of students; the original mixed 267 and 276, 267 is assumed here
for (i in seq_along(movie$students)) {
  # Build a 0/1 vector: one 1 per student counted for movie i, 0 for everyone else
  size1 <- movie$students[i]
  size0 <- n_total - size1
  result <- c(rep(0, size0), rep(1, size1))
  num_count <- c()
  for (b in 1:1000) {
    # Resample at the original size and count the 1s
    sample_num <- sample(result, length(result), replace = TRUE)
    num_count <- c(num_count, sum(sample_num == 1))
  }
  # Percentile bootstrap: the 2.5% and 97.5% quantiles bound the 95% interval
  quan <- quantile(num_count, probs = c(0.025, 0.975))
  min_list <- c(min_list, quan[[1]])
  max_list <- c(max_list, quan[[2]])
}

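Bootstrapping is a statistical method that estimates the distribution of a statistic by repeatedly drawing random samples, with replacement, from the dataset. In the code above it is used to estimate medians, to estimate the distribution of a count, and to build confidence intervals.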
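
The three snippets share one recipe: resample with replacement, compute a statistic, repeat, then read quantiles off the collected values. As a rough sketch, that recipe could be wrapped in a helper like the one below. The function name `boot_ci`, its parameters, and the simulated data are all hypothetical and not part of the original code:

```r
# Hypothetical helper: percentile bootstrap confidence interval for any statistic
boot_ci <- function(x, stat = median, n_boot = 1000, level = 0.95) {
  # Each replicate resamples x at its original length and applies the statistic
  boot_stats <- replicate(n_boot, stat(sample(x, length(x), replace = TRUE)))
  alpha <- 1 - level
  quantile(boot_stats, probs = c(alpha / 2, 1 - alpha / 2))
}

set.seed(42)
x <- rnorm(50, mean = 10, sd = 2)  # made-up data for illustration
ci <- boot_ci(x)                   # 95% interval for the median
ci_mean <- boot_ci(x, stat = mean) # same recipe, different statistic
```

Passing the statistic as an argument is a design choice that lets the same loop serve medians, means, or proportions.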

Using bootstrapping in place of the chi-squared test when the conditions for the chi-squared test are not met

Null hypothesis: there is no difference between the proportions of students who are satisfied with an early and with a late opening time
Alternative hypothesis: there is a difference between the two proportions, or students prefer a late opening time (the latter would be the equivalent of a one-tailed test)

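You can build vectors from the data and then use bootstrapping to plot the distributions and confidence intervals.
When comparing bootstrapping results, prefer proportions over raw counts (normalize, because the two groups may have different totals).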
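
To illustrate why proportions are preferable, here is a small sketch with two groups of different sizes. The vectors `early_presses` and `late_presses` and all their counts are invented for this example:

```r
set.seed(7)

# Hypothetical button presses: 1 = satisfied, 0 = not satisfied
early_presses <- c(rep(1, 30), rep(0, 50))   # 80 presses in total
late_presses  <- c(rep(1, 60), rep(0, 60))   # 120 presses in total

# Bootstrapped counts scale with group size, so they are not directly comparable
early_counts <- replicate(1000, sum(sample(early_presses, replace = TRUE)))
late_counts  <- replicate(1000, sum(sample(late_presses, replace = TRUE)))

# Dividing by each group's own size puts both on the same 0-to-1 scale
early_props <- early_counts / length(early_presses)
late_props  <- late_counts / length(late_presses)

early_ci <- quantile(early_props, probs = c(0.025, 0.975))
late_ci  <- quantile(late_props, probs = c(0.025, 0.975))
```

Taking `mean()` of each 0/1 resample instead of `sum()` would give the proportion in one step.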

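# Start with empty vectors so the loop can append to them
first_bootstraps <- c()
second_bootstraps <- c()
for (a in 1:100) {
  # Assuming 0/1 coding, the mean of a resample is its proportion of satisfied presses
  first_sample <- mean(sample(first_results, length(first_results), replace = TRUE))
  second_sample <- mean(sample(second_results, length(second_results), replace = TRUE))
  first_bootstraps <- c(first_bootstraps, first_sample)
  second_bootstraps <- c(second_bootstraps, second_sample)
}
# Upper 95% bound for the first group and lower 95% bound for the second
first_upper <- quantile(first_bootstraps, probs = c(0.975))
second_lower <- quantile(second_bootstraps, probs = c(0.025))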

boxplot(
first_bootstraps,
second_bootstraps,
notch = T,
names = c('early', 'late'),
ylab = 'Prop. of satisfied button presses'
)
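
One plausible reading of the two quantiles computed before the boxplot is that they check whether the 95% intervals of the two groups overlap. The sketch below shows that check with invented 0/1 data (the group sizes and counts are hypothetical), plus an interval for the difference itself as an alternative:

```r
set.seed(123)

# Hypothetical 0/1 outcomes (1 = satisfied); not the original data
first_results  <- c(rep(1, 40), rep(0, 60))   # early opening
second_results <- c(rep(1, 70), rep(0, 30))   # late opening

first_bootstraps  <- replicate(1000, mean(sample(first_results, replace = TRUE)))
second_bootstraps <- replicate(1000, mean(sample(second_results, replace = TRUE)))

first_upper  <- quantile(first_bootstraps, probs = 0.975)
second_lower <- quantile(second_bootstraps, probs = 0.025)

# TRUE when the 95% intervals do not overlap, with 'late' above 'early'
intervals_separate <- unname(first_upper < second_lower)

# Alternative: bootstrap interval for the difference in proportions (late - early)
diff_ci <- quantile(second_bootstraps - first_bootstraps, probs = c(0.025, 0.975))
```

Non-overlapping intervals are a conservative criterion; an interval for the difference that excludes 0 is the more direct evidence against the null hypothesis.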

posted @ 2024-05-30 09:11 chen生信
