A Draft

Analysis of Prehistoric Human Remains

Introduction

This report analyzes remains from a prehistoric human population: a single femur found during an excavation. The archaeologists have provided the height distributions of males and females from previous findings. Using these data, we estimate the sex of the individual from the length of the femur and attach a measure of confidence to that estimate. We also suggest steps that should be taken to confirm the estimate.

Data

The given data is as follows:

  • Height (in cm) is normally distributed within each sex (male or female).
  • Mean height for males: 167.4 cm
  • Standard deviation for males: 5.5 cm
  • Mean height for females: 155.3 cm
  • Standard deviation for females: 5.2 cm
  • Length of the femur found: 42.8 cm
  • The femur is typically 26.74% of a person's height.

Analysis

To estimate the sex of the individual based on the length of the femur, we need to calculate the height implied by the femur length and compare it with the height distributions of males and females.

# Calculate the implied height from the femur length
implied_height <- 42.8 / 0.2674

# Print the implied height
implied_height

The implied height from the femur length is about 160.06 cm (42.8 / 0.2674, rounded to two decimals).

Next, we need to calculate the z-scores for the implied height with respect to the male and female height distributions. The z-score represents the number of standard deviations the implied height is away from the mean of each distribution.
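
For reference, this is the standard z-score definition. The symbols are chosen here for illustration and are not in the original: x is the implied height, and μ and σ are the mean and standard deviation of the sex-specific height distribution.

$$ z = \frac{x - \mu}{\sigma} $$

A negative z-score means the implied height lies below that group's mean; a positive one means it lies above.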

# Calculate z-scores
z_score_male <- (implied_height - 167.4) / 5.5
z_score_female <- (implied_height - 155.3) / 5.2

# Print z-scores
cat("Z-score for males:", round(z_score_male, 2), "\n")
cat("Z-score for females:", round(z_score_female, 2), "\n")
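
We then compute, for each sex, the upper-tail probability: the probability that an individual of that sex is at least as tall as the implied height.

# Calculate probabilities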
prob_male <- pnorm(implied_height, mean = 167.4, sd = 5.5, lower.tail = FALSE)
prob_female <- pnorm(implied_height, mean = 155.3, sd = 5.2, lower.tail = FALSE)

# Print probabilities
cat("Probability of observing this height or greater for males:", round(prob_male, 4), "\n")
cat("Probability of observing this height or greater for females:", round(prob_female, 4), "\n")
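
The tail probabilities above describe how unusual the implied height is within each group, but they do not say directly which sex is more likely. One possible way to get a confidence measure is Bayes' theorem applied to the normal densities. The sketch below is only an illustration under two assumptions that are not in the source: equal prior probabilities for male and female, and a femur-to-height ratio that is treated as exact. The variable names (`lik_male`, `prior_male`, `post_male`, and so on) are introduced here for the example.

```r
# Quantities taken from the Data section
femur_length   <- 42.8
femur_ratio    <- 0.2674
implied_height <- femur_length / femur_ratio

# Likelihood of the implied height under each sex-specific normal distribution
lik_male   <- dnorm(implied_height, mean = 167.4, sd = 5.5)
lik_female <- dnorm(implied_height, mean = 155.3, sd = 5.2)

# ASSUMPTION (not stated in the source): both sexes are equally likely a priori
prior_male   <- 0.5
prior_female <- 0.5

# Bayes' theorem: posterior = likelihood * prior / evidence
evidence    <- lik_male * prior_male + lik_female * prior_female
post_male   <- lik_male * prior_male / evidence
post_female <- lik_female * prior_female / evidence

cat("Posterior probability of male:", round(post_male, 3), "\n")
cat("Posterior probability of female:", round(post_female, 3), "\n")
```

Under these assumptions the posterior favors female, at roughly 0.63, which is only moderate confidence. A different prior, or allowing for variation in the femur-to-height ratio, would change the result.
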
W_G <- 0.9
W_L <- 0.1
E_G <- 0.6
E_L <- 0.4

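# Count individuals by sex (rows) and genotype (columns).
# Assumes a data frame `genotype` with columns `sex` and `genotype` is already loaded.
new_genotype <- data.frame(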
  WT = c(nrow(subset(genotype, sex == "female" & genotype == "WT")),
         nrow(subset(genotype, sex == "male" & genotype == "WT"))),
  het = c(nrow(subset(genotype, sex == "female" & genotype == "het")),
          nrow(subset(genotype, sex == "male" & genotype == "het"))),
  mut = c(nrow(subset(genotype, sex == "female" & genotype == "mut")),
          nrow(subset(genotype, sex == "male" & genotype == "mut"))),
  row.names = c("female", "male")
)
new_genotype
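# Load the data and check for missing values and duplicated rows
library(ggplot2)
vmn <- read.csv("vmndata.csv")
anyNA(vmn)
anyDuplicated(vmn)

# Scatter plot of hap1 against hap2, colored by the original type
ggplot(data = vmn, aes(x = hap1, y = hap2, color = type)) + geom_point() + ggtitle("Original classification")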

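# Keep columns 2 and 3 as the clustering features
dots <- vmn[, c(2, 3)]
library(factoextra)
set.seed(123)
# Elbow plot of within-cluster sum of squares; the dashed line marks k = 4
fviz_nbclust(dots, kmeans, method = "wss") +
  geom_vline(xintercept = 4, linetype = 2)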
 

$P(A \mid B) = \frac{P(B \mid A) \cdot P(A)}{P(B)}$

vmndata <- read.csv("vmndata.csv")
dots <- vmndata[,c(2,3)]



# Fit k-means with 5 clusters and inspect the fitted model
km_model <- kmeans(dots, centers = 5)

km_model



# Attach the cluster labels and plot the points colored by cluster
vmndata$cluster <- as.factor(km_model$cluster)
ggplot(data = vmndata, aes(x = hap1, y = hap2, color = cluster)) + geom_point() + ggtitle("Cluster classification")
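
The two plots can also be compared numerically by cross-tabulating the original labels against the cluster assignments. The sketch below shows the idea on simulated stand-in data, because `vmndata.csv` is not available here. The data frame `toy`, its two groups "A" and "B", and the choice of `nstart = 25` are assumptions made for the example, not part of the original analysis.

```r
# Hypothetical stand-in for vmndata.csv: two well-separated simulated groups
set.seed(123)
toy <- data.frame(
  type = rep(c("A", "B"), each = 50),
  hap1 = c(rnorm(50, mean = 0), rnorm(50, mean = 10)),
  hap2 = c(rnorm(50, mean = 0), rnorm(50, mean = 10))
)

# k-means with several random starts to reduce sensitivity to the initial centers
km_toy <- kmeans(toy[, c("hap1", "hap2")], centers = 2, nstart = 25)
toy$cluster <- as.factor(km_toy$cluster)

# Cross-tabulate original labels (rows) against cluster assignments (columns)
agreement <- table(original = toy$type, cluster = toy$cluster)
agreement
```

On the real data the same `table()` call would take `vmndata$type` and `vmndata$cluster`. Cluster numbers are arbitrary, so agreement shows up as each row concentrating in one column rather than as a matching diagonal.
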
Posted 2024-05-30 13:26 by 人类81