【原】MicrobiomeStatPlot | 散點(diǎn)圖教程Scatter plot tutorial

宏基因組 2024-12-09 發(fā)布于北京

展開全文

散點(diǎn)圖簡介

散點(diǎn)圖（Scatter Plot）是一種統(tǒng)計圖形，用于展示兩個變量之間的關(guān)系。它通過二維坐標(biāo)系中的點(diǎn)來表示觀察數(shù)據(jù)，每個點(diǎn)的位置由兩個變量的取值決定，橫坐標(biāo)表示自變量（X 軸），縱坐標(biāo)表示因變量（Y 軸）。散點(diǎn)圖的應(yīng)用：相關(guān)性分析,判斷兩個變量之間是否存在某種關(guān)聯(lián)，以及這種關(guān)聯(lián)是正相關(guān)、負(fù)相關(guān)還是無關(guān)。模型擬合,通過觀察散點(diǎn)圖，可以選擇合適的回歸模型（如線性回歸、非線性回歸等）來擬合數(shù)據(jù)。異常檢測,可以通過觀察散點(diǎn)圖中的異常點(diǎn)來檢測潛在的異常數(shù)據(jù)或錯誤觀測值。

標(biāo)簽：#微生物組數(shù)據(jù)分析 #MicrobiomeStatPlot #散點(diǎn)圖 #R語言可視化 #Scatter plot

作者：First draft(初稿)：Defeng Bai(白德鳳)；Proofreading(校對)：Ma Chuang(馬闖) and Jiani Xun(荀佳妮)；Text tutorial(文字教程)：Defeng Bai(白德鳳)

源代碼及測試數(shù)據(jù)鏈接：

https://github.com/YongxinLiu/MicrobiomeStatPlot/項(xiàng)目中目錄 3.Visualization_and_interpretation/ScatterPlot

或公眾號后臺回復(fù)“MicrobiomeStatPlot”領(lǐng)取

散點(diǎn)圖應(yīng)用案例1

這是Patrick Lehodey在2020年發(fā)表于Biogeosciences上的文章，第一作者為Audrey Delpech，題目為：Influence of oceanic conditions in the energy transfer efficiency estimation of a micronekton model. https://bg./articles/17/833/2020/

圖 5 | 顯示了在平面 (a) (??,??)和(b)(??,??)上基于核密度估計的散點(diǎn)圖及邊緣分布，分別對應(yīng)表3中實(shí)驗(yàn)3a、3b、3c 和 3d 的配置。

綠色圓點(diǎn)代表微生物，六邊形代表目標(biāo)基因（調(diào)整后的Benjamini-Hochberg P < .05），六邊形內(nèi)的黑色三角形代表每個基因所涉及的免疫功能。邊的顏色表示微生物節(jié)點(diǎn)和基因節(jié)點(diǎn)的Spearman相關(guān)性。BCR表示B細(xì)胞受體；TCR表示T細(xì)胞受體。

結(jié)果

圖 5 顯示，每種配置的次要變量分布都非常接近，足以進(jìn)行實(shí)驗(yàn)比較，避免了任何交叉相關(guān)的風(fēng)險。

散點(diǎn)圖應(yīng)用案例2

這是Masahira Hattori 在2022年發(fā)表于NC上的文章，第一作者為Suguru Nishijima，題目為：Extensive gut virome variation and its associations with host and environmental factors in a population-level cohort https:///10.1038/s41467-022-32832-W

圖 2 | 從 4198 個人類腸道元基因組中重建的噬菌體基因組概覽

從 4198 個全元基因組數(shù)據(jù)集中重建的噬菌體基因組（n = 4709）的基因組大小和 GC 含量。頂部和右側(cè)的柱狀圖分別描述了基因組大小和 GC 含量的分布情況。

結(jié)果

對序列相似度大于 95% 的 4709條噬菌體序列進(jìn)行聚類，生成了1347個病毒操作分類單元（vOTUs）（相當(dāng)于物種水平）（圖 1a 和補(bǔ)充圖 2a，補(bǔ)充數(shù)據(jù) 4）。

散點(diǎn)圖應(yīng)用案例3

這是Damian R. plichta 在2023年發(fā)表于Nature Microbiology上的文章，第一作者為Joachim Johansen ，題目為：Centenarians have a diverse gut virome with the potential to modulate metabolism and promote healthy lifespan https://www./articles/s41467-024-45793-z

圖 3 | 噬菌體特征與獨(dú)特的百歲老人細(xì)菌群落相關(guān)群落相關(guān)。

一些富集的溫和 vOTUs 的 PtW 豐度（log10 標(biāo)度）與 MSP 宿主豐度呈相關(guān)趨勢，如 E. boltea vOTUs 豐度的 Spearman 等級相關(guān)性為 0.51（P = 4.82 × 10-66）、C. scindens（P = 4.49 × 10-08）和 P. distasonis（P = 6.00 × 10-11）。

結(jié)果

推測這一趨勢可能反映了病毒細(xì)菌宿主的豐度。因此，我們將預(yù)測的溫帶病毒的病毒特征與其預(yù)測的宿主相關(guān)聯(lián)。預(yù)測的溫帶病毒的病毒特征與其預(yù)測的宿主相關(guān)聯(lián)。發(fā)現(xiàn)C.scindens等過量細(xì)菌的特征與相關(guān)病毒有顯著相關(guān)性（Cor = 0.30, P = 4.49 × 10-08）。在 Akkermansia muciniphila中也發(fā)現(xiàn)了類似的趨勢（Cor = 0.25、 P=6.23×10-11）、Enterocloster bolteae（Cor=0.51，P=4.82×10-66）和 Parabacteroides distasonis（Cor = 0.35，P = 6.00 × 10-11）（圖 3d）。

散點(diǎn)圖R語言實(shí)戰(zhàn)

源代碼及測試數(shù)據(jù)鏈接：

https://github.com/YongxinLiu/MicrobiomeStatPlot/

或公眾號后臺回復(fù)“MicrobiomeStatPlot”領(lǐng)取

軟件包安裝

# 基于CRAN安裝R包，檢測沒有則安裝p_list = c("ggplot2", "ggpubr", "ggpmisc", "doBy", "FactoMineR", "factoextra",            "tidyverse", "ggExtra", "vegan", "cowplot", "MASS", "scales","showtext","grid")for(p in p_list){if (!requireNamespace(p)){install.packages(p)}    library(p, character.only = TRUE, quietly = TRUE, warn.conflicts = FALSE)}
# 加載R包 Load the packagesuppressWarnings(suppressMessages(library(ggplot2)))suppressWarnings(suppressMessages(library(ggpubr)))suppressWarnings(suppressMessages(library(ggpmisc)))suppressWarnings(suppressMessages(library(doBy)))suppressWarnings(suppressMessages(library(FactoMineR)))suppressWarnings(suppressMessages(library(factoextra)))suppressWarnings(suppressMessages(library(tidyverse)))suppressWarnings(suppressMessages(library(ggExtra)))suppressWarnings(suppressMessages(library(vegan)))suppressWarnings(suppressMessages(library(cowplot)))suppressWarnings(suppressMessages(library(MASS)))suppressWarnings(suppressMessages(library(scales)))suppressWarnings(suppressMessages(library(showtext)))suppressWarnings(suppressMessages(library(grid)))

實(shí)戰(zhàn)1

散點(diǎn)圖加擬合線

# 讀取數(shù)據(jù)# Load dataScatterplot <- read.table("data/figure1e.txt", header = TRUE, sep = "\t", comment.char = "")colnames(Scatterplot) <- c("PROGENY", "KEGG")
# 散點(diǎn)圖# Plotp1 <- ggplot(Scatterplot, aes(x = PROGENY, y = KEGG)) +  geom_point(size = 3.5, color = "#1F78B4") +    geom_smooth(method = "lm", color = "#E31A1C", size = 1.3, linetype = "dashed") +   stat_cor(method = "pearson",           label.sep = '\n',           p.accuracy = 0.001,           r.digits = 3,           label.x = 9.2,           size = 5, color = "darkred") +  scale_x_continuous(limits = c(8.9, 10.6), expand = c(0, 0)) +  scale_y_continuous(limits = c(6.2, 9.8), expand = c(0, 0)) +  labs(x = "MAPK-PROGENy GES", y = "IL-17 GES") +  ggtitle("Scatter Plot of PROGENy vs IL-17 GES") +   theme_minimal(base_size = 16) +    theme(    axis.text = element_text(color = "black", size = 13),    axis.title = element_text(size = 15, face = "bold"),    plot.title = element_text(size = 17, face = "bold", hjust = 0.5),    panel.border = element_rect(color = "gray", fill = NA, size = 0.8)   )
# 將圖形保存為PDF文件# Save plotggsave("results/scatterplot01.pdf", plot = p1, width = 6, height = 4.5)

實(shí)戰(zhàn)2

散點(diǎn)圖加直方圖

參考：https://mp.weixin.qq.com/s/3wBXO3gHWj2TpgHqJdBm6w https://mp.weixin.qq.com/s/k6XH4KQaJ6BSDwNO_xUx4w

# 創(chuàng)建示例數(shù)據(jù)# Create sample datacovariance <- matrix(c(1, -0.2, -0.2, 1), nrow = 2, byrow = TRUE)data <- mvrnorm(n = 1000, mu = c(0.5, 3), covariance)data <- as.data.frame(data)colnames(data) <- c("Genome size", "GC content")
# 添加分組信息# Add group infodata$group <- "Host unknown"data$group[order(data$`Genome size`)[c(sample(10:30, 10), sample(500:1000, 300))]] <- "Bacteroidetes"data$group[order(data$`Genome size`)[c(sample(30:60, 20), sample(200:800, 300))]] <- "Firmicutes"data$group[order(data$`Genome size`)[c(sample(40:80, 20), sample(100:200, 20))]] <- "Actinobacteria"data$group[order(data$`Genome size`)[sample(1:1000, 10)]] <- "Proteobacteria"
# 設(shè)置顏色# Set colorcolors <- c("Bacteroidetes" = "#EB746A",              "Firmicutes" = "#7AA82C",                "Actinobacteria" = "#1EB5B8",            "Proteobacteria" = "#A07DB7",            "Host unknown" = "#AAAAAA")  
# 散點(diǎn)圖# Plotp21 <- ggplot(data) +  geom_point(aes(`Genome size`, `GC content`, color = group), alpha = 0.8, size = 3.5) +  scale_color_manual(values = colors) +  theme_minimal() +  theme(    legend.position = "top",    legend.title = element_blank(),    axis.text = element_text(size = 12),    axis.title = element_text(size = 14, face = "bold"),    axis.line = element_line(size = 0.6),     panel.grid = element_line(color = "grey90")  ) +  labs(x = "Genome Size", y = "GC Content")
# 上方直方圖# Top bar plotp22 <- ggplot(data) +  geom_histogram(aes(`Genome size`), binwidth = 0.2, fill = "#EB746A", color = "white", size = 0.8) +  theme_minimal() +  theme(    axis.title.x = element_blank(),    axis.text.x = element_blank(),    axis.ticks.x = element_blank(),    axis.line = element_line(size = 0.6),   # 添加刻度線    panel.grid = element_blank()  ) +  ylab("# of vOTUs")
# 右側(cè)直方圖# Right bar plotp23 <- ggplot(data) +  geom_histogram(aes(`GC content`), binwidth = 0.2, fill = "#A07DB7", color = "white", size = 0.8) +  theme_minimal() +  theme(    axis.title.y = element_blank(),    axis.text.y = element_blank(),    axis.ticks.y = element_blank(),    axis.line = element_line(size = 0.6),   # 添加刻度線    panel.grid = element_blank()  ) +  coord_flip() +  ylab("# of vOTUs")
# 拼圖# Patchworkp24 <- ggdraw() +   draw_plot(p21, 0.05, 0, .75, .8) +  draw_plot(p22, 0.05, 0.8, .75, .2) +  draw_plot(p23, 0.8, 0, .2, .8)
# 保存優(yōu)化后的圖像為PDF# Save plotggsave("results/scatter_histogram_high.pdf", plot = p24, height = 6, width = 8)

實(shí)戰(zhàn)3

散點(diǎn)圖加密度圖

# 數(shù)據(jù)為“2.Scatter plot plus histogram 散點(diǎn)圖加直方圖”使用的數(shù)據(jù)# Data inherted from "2.Scatter plot plus histogram 散點(diǎn)圖加直方圖"
# 繪制密度圖# Plot density plotp31 <- ggplot(data, aes(x = `Genome size`, color = group, fill=group)) +  geom_density(alpha = 0.4, size = 1) +  scale_y_continuous(expand = c(0,0))+  scale_color_manual(values = c("#f28c88","#ffbe5d","#9db1c1","#d6d6d6","#77bda1"))+  scale_fill_manual(values =alpha(c("#f28c88","#ffbe5d","#9db1c1","#d6d6d6","#77bda1"),0.6) )+  theme_classic()+  theme(panel.background = element_blank(),        panel.grid.major = element_blank(),        panel.grid.minor = element_blank(),        axis.ticks = element_line(size=0.6,colour = "black"),        axis.text.y = element_text(size=10,colour = "black"),        axis.title.y = element_text(size=16),        axis.title.x = element_blank(),        axis.ticks.x = element_blank(),        axis.text.x = element_blank(),    legend.position = "none"    )#p31
p32 <- ggplot(data, aes(x = `GC content`, color = group, fill=group)) +  geom_density(alpha = 0.4, size = 1) +  scale_y_continuous(expand = c(0,0))+  scale_color_manual(values = c("#f28c88","#ffbe5d","#9db1c1","#d6d6d6","#77bda1"))+  scale_fill_manual(values =alpha(c("#f28c88","#ffbe5d","#9db1c1","#d6d6d6","#77bda1"),0.6) )+  theme_classic()+  theme(panel.background = element_blank(),        panel.grid.major = element_blank(),        panel.grid.minor = element_blank(),        axis.ticks = element_line(size=0.6,colour = "black"),        axis.text.x = element_text(size = 12,colour = "black"),        axis.title.y = element_blank(),        axis.ticks.y = element_blank(),        axis.text.y = element_blank(),    legend.position = "none"    )+coord_flip()#p32
# 拼圖# Patchworkp33 <- ggdraw() +   draw_plot_label("a", size = 20)+  draw_plot(p21, 0.05, 0, .75, .8)+  draw_plot(p31, 0.05, 0.8, .75, .2)+  draw_plot(p32, 0.8, 0, .2, .8)ggsave("results/plot03.pdf", height = 5, width = 8, plot = p33)#p33

實(shí)戰(zhàn)4

散點(diǎn)圖加箱線圖

# 載入數(shù)據(jù)# Load datadistance_mat=read.table(paste0("data/OTUID_beta_diversity_yjb_all.txt"), header=T, row.names=1, sep="\t", comment.char="")distance_mat[1:3, 1:3]metadata=read.table("data/metadata_double_single3.txt", header=T, row.names=1, sep="\t", comment.char="", stringsAsFactors=F)
df <- metadatadf <- df[rownames(df)%in%rownames(distance_mat), ]
# 進(jìn)行PCoA分析(PCoA analysis)# pcoa_result <- cmdscale(distance_mat, eig = TRUE, k = 2)# pcoa_result2 <- pcoa_result[["points"]]
# bray_dis <- vegdist(varespec, method = 'bray')#基于Bray-curtis 距離測算(Based on Bray-Curtis distance measurement)nmds_dis <- metaMDS(distance_mat, k = 2)#NMDS排序計算，一般定義 2 個維度(NMDS sorting calculation generally defines 2 dimensions)#> Run 0 stress 0.2595632 #> Run 1 stress 0.2705372 #> Run 2 stress 0.2719117 #> Run 3 stress 0.2767482 #> Run 4 stress 0.2696427 #> Run 5 stress 0.270412 #> Run 6 stress 0.2639526 #> Run 7 stress 0.276302 #> Run 8 stress 0.2705819 #> Run 9 stress 0.2721413 #> Run 10 stress 0.2796006 #> Run 11 stress 0.2667527 #> Run 12 stress 0.2614544 #> Run 13 stress 0.2842278 #> Run 14 stress 0.2627725 #> Run 15 stress 0.2724872 #> Run 16 stress 0.2630389 #> Run 17 stress 0.2624092 #> Run 18 stress 0.2691989 #> Run 19 stress 0.2830036 #> Run 20 stress 0.2712429 #> *** Best solution was not repeated -- monoMDS stopping criteria:#>      7: no. of iterations >= maxit#>     12: stress ratio > sratmax#>      1: scale factor of the gradient < sfgrminnmds_dis_site <- data.frame(nmds_dis$points)#計算樣點(diǎn)得分(Calculate sample point scores)
df$nmds1 <- nmds_dis_site[, 1]df$nmds2 <- nmds_dis_site[, 2]
# 定義顏色# Define colorcolors <- c("Patients" = "#FF6F61", "Healthy" = "#6B5B95")
# 繪制PCoA散點(diǎn)圖# Draw PCoA scatter plotp41 <- ggplot(df, aes(x = nmds1, y = nmds2, color = Group)) +  geom_point(alpha = 0.7, size = 3) +  stat_ellipse(level = 0.68) +  scale_color_manual(values = colors) +  theme_bw(base_size = 15) +   labs(x = paste0("NMDS1"),       y = paste0("NMDS2")       ) +  theme(    legend.position = "bottom",    legend.title = element_text(size = 14, face = "bold"),    legend.text = element_text(size = 12),    panel.grid.major = element_blank(),    panel.grid.minor = element_blank(),    axis.text = element_text(size = 12),    axis.title = element_text(size = 14, face = "bold"),    plot.title = element_text(size = 16, face = "bold", hjust = 0.5),    plot.margin = unit(c(1, 1, 1, 1), "cm")  )
# 添加邊緣箱線圖# Add marginal box plotsp42 <- ggMarginal(p41, type = "boxplot", margins = "both", groupColour = TRUE, groupFill = TRUE)
# 顯示圖(Show plot)# print(p)ggsave("results/NMDS_plot01.pdf", plot = p42, width = 6, height = 6)

實(shí)戰(zhàn)5

分組散點(diǎn)圖添加誤差棒和平均值

參考：https://mp.weixin.qq.com/s/e6CVI23i-qgWANTaDXAJdw

# 載入數(shù)據(jù)# Load datatest_design <- read.csv("data/test_design.csv",row.names = 1)df2 <- test_design[,c(2,5)]
# 統(tǒng)計平均值# Calculate the average valuemean_df <- aggregate(df2[,2],by=list(df2[,1]),FUN=mean)rownames(mean_df) <- mean_df[,1]
# 統(tǒng)計標(biāo)準(zhǔn)差# Standard deviationsd_df <- aggregate(df2[,2],by=list(df2[,1]),FUN=sd)rownames(sd_df) <- sd_df[,1]sd_df <- sd_df[,-1]
# 計算標(biāo)準(zhǔn)誤# Calculate the standard errorsd_df<-sd_df/2 se<-as.data.frame(sd_df)
# 合并數(shù)據(jù)# Merge datadf1<-cbind(mean_df,se)colnames(df1)<-c("group","mean","se")colnames(df2)<-c("group","Gene_Abundance")
# 繪圖# Plotp51 <- ggplot()+stat_summary(fun = "mean",geom = "crossbar", mapping = aes(x=df1$group,y=df1$mean),                      width=0.65,size=1.0, color = "gray2")+  geom_jitter(data=df2,mapping=aes(x=group,y=Gene_Abundance,fill = group,colour = group,shape = group),              size = 3.5, height = 0.01,width = 0.1)+   scale_color_manual(values = c("#EB746A","#7AA82C","#1EB5B8","#A07DB7"))+   geom_errorbar(data=df1,mapping=aes(x = group,ymin = mean-se, ymax = mean+se), width = 0.2,                  color = "gray2",size=1)+  labs(y="Abundance", x="")+  theme_bw(base_line_size = 1.05,base_rect_size =1.05)+  theme(panel.grid.major=element_blank(),panel.grid.minor=element_blank())+  theme(axis.text=element_text(colour='black',size=12))
# 改變樣式# Change the stylep52 <- ggplot()+stat_summary(fun = "mean",geom = "crossbar", mapping = aes(x=df1$group,y=df1$mean),                      width=0.2,size=0.5, height =0.01 ,color = c("#EB746A","#7AA82C","#1EB5B8","#A07DB7"))+  geom_jitter(data=df2,mapping=aes(x=group,y=Gene_Abundance,fill = group,colour = group),              size = 3.5,height = 0.01,width = 0.1, shape = 20)+   scale_color_manual(values = c("#EB746A","#7AA82C","#1EB5B8","#A07DB7"))+   geom_errorbar(data=df1,mapping=aes(x = group,ymin = mean-se, ymax = mean+se), width = 0.2,                  color = c("#EB746A","#7AA82C","#1EB5B8","#A07DB7"),size=1)+  labs(y="Abundance", x="")+  theme_classic()+  theme(panel.grid.major=element_blank(),panel.grid.minor=element_blank())+  theme(axis.text=element_text(colour='black',size=12))
# 改變樣式# Change the stylep53 <- ggplot(df2, aes(group, Gene_Abundance))+  stat_summary(fun.data="mean_sdl", fun.args = list(mult=1), geom="pointrange", color = "pink",size = 1.2)+  stat_summary(fun="mean", fun.args = list(mult=1), geom="point", color = "white",size = 4)+  geom_jitter(data=df2,mapping=aes(x=group,y=Gene_Abundance,fill = group,colour = group),              size = 3.5,height = 0.01,width = 0.1, shape = 20)+   scale_color_manual(values = c("#EB746A","#7AA82C","#1EB5B8","#A07DB7"))+  labs(y="Abundance", x="")+  theme_classic()
# 添加顯著性標(biāo)記# Add significancep54 <- ggplot()+stat_summary(fun = "mean",geom = "crossbar", mapping = aes(x=df1$group,y=df1$mean),                      width=0.2,size=0.5, height =0.01 ,color = c("#EB746A","#7AA82C","#1EB5B8","#A07DB7"))+  geom_jitter(data=df2,mapping=aes(x=group,y=Gene_Abundance,fill = group,colour = group),              size = 3.5,height = 0.01,width = 0.1)+   scale_color_manual(values = c("#EB746A","#7AA82C","#1EB5B8","#A07DB7"))+   geom_errorbar(data=df1,mapping=aes(x = group,ymin = mean-se, ymax = mean+se), width = 0.2,                  color = c("#EB746A","#7AA82C","#1EB5B8","#A07DB7"),size=1)+  labs(y="Abundance", x="")+  theme_classic()+  scale_y_continuous(limits = c(0, 50))+  geom_signif(data = df2, mapping=aes(x=group,y=Gene_Abundance),               comparisons = list(c("AA", "BB"),                                  c("AA", "CC"),                                 c("AA", "DD")                                 ),              map_signif_level=F,               tip_length=c(0,0,0,0,0,0,0,0,0,0,0,0),               y_position = c(45,40,35),               size=1,              textsize = 4,              test = "wilcox.test"              )ggsave("results/scatter_plot_error_bar.pdf", p54, width = 6, height = 6)

實(shí)戰(zhàn)6

質(zhì)心+誤差棒1

例：Wang, T., et al. (2024). “Divergent age-associated and metabolism-associated gut microbiome signatures modulate cardiovascular disease risk.” Nat Med 30(6): 1722-1731.https:///10.1038/s41591-024-03038-y

參考：https://mp.weixin.qq.com/s/h2EwRvvM7BXVlwpzC7_gNQ

# 讀取數(shù)據(jù)# Load datamydata <- read.csv("data/pca_data.csv", header = TRUE)data1 <- mydata[, 1:10]data <- scale(data1)
# 進(jìn)行PCA分析# Perform PCA analysisres.pca <- PCA(data, graph = FALSE)
# 提取樣本PC1、PC2坐標(biāo)和分組信息# Extract sample PC1, PC2 coordinates and grouping informationresult <- data.frame(res.pca$ind$coord)pc1 <- result[, 1]pc2 <- result[, 2]group <- mydata[, 11]
# 合并數(shù)據(jù)用于繪圖# Combining data for plottingplotdata <- data.frame(pc1 = pc1, pc2 = pc2, group = group)
# 計算均值、標(biāo)準(zhǔn)差、標(biāo)準(zhǔn)誤差和95%置信區(qū)間# Calculation of mean, standard deviation, standard error and 95% confidence intervalse <- function(x) sd(x) / sqrt(length(x))conf_95 <- function(x) t.test(x, conf.level = 0.95)$conf.int[2]
sumdata <- summaryBy(pc1 + pc2 ~ group, data = plotdata, FUN = c(mean, sd, se, conf_95))
# 繪圖# Plotp61 <- ggplot(sumdata, aes(x = pc1.mean, y = pc2.mean, color = group)) +  geom_point(size = 3) +  scale_color_manual(values = c("#1b9e77", "#d95f02", "#7570b3", "#e7298a", "#66a61e", "#e6ab02")) +  #theme_minimal() +  theme_bw()+  #theme_classic()+  labs(x = 'PC1', y = 'PC2') +  theme(    text = element_text(family = "Arial", size = 16),    axis.title = element_text(face = "bold"),    axis.ticks.length = unit(0.2, "cm"),    axis.text = element_text(color = "black"),    legend.position = "right",    legend.title = element_blank()  )
# 添加誤差棒# Add error barsp62 <- p61 +  geom_errorbarh(aes(xmin = pc1.mean - pc1.sd, xmax = pc1.mean + pc1.sd), height = 0.1, size = 0.8) +  geom_errorbar(aes(ymin = pc2.mean - pc2.sd, ymax = pc2.mean + pc2.sd), width = 0.1, size = 0.8)
# 添加誤差棒# Adding error barsp63 <- p61 +  geom_errorbarh(aes(xmin = pc1.mean - pc1.se, xmax = pc1.mean + pc1.se), height = 0.1, size = 0.8) +  geom_errorbar(aes(ymin = pc2.mean - pc2.se, ymax = pc2.mean + pc2.se), width = 0.1, size = 0.8)
# 添加誤差棒#  Add error barsshowtext_auto()p64 <- p61 +  geom_errorbarh(aes(xmin = pc1.mean - pc1.conf_95, xmax = pc1.mean + pc1.conf_95), height = 0.1, size = 0.8) +  geom_errorbar(aes(ymin = pc2.mean - pc2.conf_95, ymax = pc2.mean + pc2.conf_95), width = 0.1, size = 0.8) +  theme(text = element_text(family = "Arial", size = 16)) 
# 保存圖像為PDF文件# Save as PDFggsave("results/Centroid_Error_Bar.pdf", plot = p64, width = 7, height = 4)

實(shí)戰(zhàn)7

質(zhì)心+誤差棒2

參考：https://mp.weixin.qq.com/s/7SHgJXhctRCi4QnHyaAZug

例：P?rn, J., Verhoeven, J.T.A., Butterbach-Bahl, K. et al. Nitrogen-rich organic soils under warm well-drained conditions are global nitrous oxide emission hotspots. Nat Commun 9, 1135 (2018).https:///10.1038/s41467-018-03540-1

# 加載數(shù)據(jù)# Load datadata1 <- read.csv("data/test_data2.csv",header = T)
# 重新計算質(zhì)心數(shù)據(jù)# Recalculate the centre of mass dataX <- aggregate(X ~ group, data1, function(x) c(mean = mean(x), sd = sd(x)))Y <- aggregate(Y ~ group, data1, function(x) c(mean = mean(x), sd = sd(x)))data2 <- data.frame(  group=X$group,  X=X$X[,1],  X_error=X$X[,2],  Y=Y$Y[,1],  Y_error=Y$Y[,2])
# 繪圖# Plotp71 <- ggplot(data2, aes(X, Y)) +  geom_errorbar(aes(xmin = X - X_error, xmax = X + X_error, color = group),                 linewidth = 0.6, width = 0.2, alpha = 0.7) +  geom_errorbar(aes(ymin = Y - Y_error, ymax = Y + Y_error, color = group),                 linewidth = 0.6, width = 0.2, alpha = 0.7) +  geom_point(aes(color = group), size = 3, alpha = 0.8) +  scale_color_manual(values = c("#D55E00", "#0072B2", "#F0E442", "#009E73",                                "#CC79A7", "#56B4E9", "#E69F00", "#999999",                                "#1A85FF", "#FF6666", "#66CC99", "#006400",                                "#8B0000", "#DAA520", "#556B2F", "#FF4500",                                "#4682B4", "#9400D3")) +  geom_smooth(method = "lm", color = "darkblue", fill = "#ADD8E6", se = TRUE,               formula = y ~ x, linetype = "dashed", alpha = 0.3) +  stat_poly_eq(formula = y ~ x,                aes(label = paste(after_stat(eq.label), after_stat(adj.rr.label), sep = "~~~")),                parse = TRUE, size = 4, label.x.npc = "right", label.y.npc = "top") +  labs(x = " category_A", y = " category_B", color = "group") +  #theme_minimal(base_size = 15) +  theme_bw()+  theme(    legend.position = "right",            legend.title = element_text(size = 12, face = "bold"),    legend.text = element_text(size = 10),     axis.title = element_text(size = 16),       axis.text = element_text(size = 14),        panel.grid.major = element_line(size = 0.5, linetype = "dotted", color = "gray80"),      panel.grid.minor = element_blank()        )
# 保存PDF# Save as PDFggsave("results/Centroid_Error_Bar02.pdf", plot = p71, width = 8, height = 6, dpi = 300)

使用此腳本，請引用下文：

Yong-Xin Liu, Lei Chen, Tengfei Ma, Xiaofang Li, Maosheng Zheng, Xin Zhou, Liang Chen, Xubo Qian, Jiao Xi, Hongye Lu, Huiluo Cao, Xiaoya Ma, Bian Bian, Pengfan Zhang, Jiqiu Wu, Ren-You Gan, Baolei Jia, Linyang Sun, Zhicheng Ju, Yunyun Gao, Tao Wen, Tong Chen. 2023. EasyAmplicon: An easy-to-use, open-source, reproducible, and community-based pipeline for amplicon data analysis in microbiome research. iMeta 2: e83. https:///10.1002/imt2.83