在分析物種和基因相關(guān)性,或者分析多界物種與基因或者其它因子的相關(guān)性時(shí),常常用到Spearman相關(guān)性網(wǎng)絡(luò)來展示多個(gè)分組之間的相關(guān)性。這里從實(shí)際案例出發(fā),嘗試用Gephi和R軟件實(shí)現(xiàn)物種和基因的Spearman相關(guān)性網(wǎng)絡(luò)分析。 標(biāo)簽:#微生物組數(shù)據(jù)分析 #MicrobiomeStatPlot #物種和基因Spearman相關(guān)網(wǎng)絡(luò)分析 #R語言可視化 #Species and Gene Spearman Correlation Network Analysis 作者:First draft(初稿):Defeng Bai(白德鳳);Proofreading(校對):Ma Chuang(馬闖) and Jiani Xun(荀佳妮);Text tutorial(文字教程):Defeng Bai(白德鳳) 源代碼及測試數(shù)據(jù)鏈接: https://github.com/YongxinLiu/MicrobiomeStatPlot/項(xiàng)目中目錄 3.Visualization_and_interpretation/SpearmanCorrelationNetworkAnalysis 或公眾號后臺(tái)回復(fù)“MicrobiomeStatPlot”領(lǐng)取 物種和基因Spearman相關(guān)網(wǎng)絡(luò)分析案例 這是來自于上海交通大學(xué)醫(yī)學(xué)院Haoyan Chen和Jie Hong課題組2023年發(fā)表于Cell Host & Microbe上的一篇論文。論文題目為:Multi-kingdom gut microbiota analyses define bacterial-fungal interplay and microbial markers of pan-cancer immunotherapy across cohorts. https:///10.1016/j.chom.2023.10.005 圖 6 | 阻斷反應(yīng)者和無反應(yīng)者中多界標(biāo)記和代謝差異豐度 KO 基因的共現(xiàn)網(wǎng)絡(luò)。 細(xì)菌節(jié)點(diǎn)用綠色表示,真菌節(jié)點(diǎn)用藍(lán)色表示,KO基因用紅色表示。正相關(guān)性用橙色表示,負(fù)相關(guān)性用藍(lán)色表示。 為了探索代謝功能與微生物群之間的關(guān)系,研究人員評估了差異代謝 KO 基因與多界標(biāo)記之間的相關(guān)性。注意到,真菌裂殖酵母 (Schizosaccharomyces octosporus) 是響應(yīng)者多界網(wǎng)絡(luò)的中心 (圖 5B),與 2 個(gè) KO 基因呈正相關(guān)。在無響應(yīng)者中沒有觀察到這種情況 (圖 6C),這表明裂殖酵母的富集及其代謝活動(dòng)可能對響應(yīng)者具有特異性。 源代碼及測試數(shù)據(jù)鏈接: https://github.com/YongxinLiu/MicrobiomeStatPlot/ 或公眾號后臺(tái)回復(fù)“MicrobiomeStatPlot”領(lǐng)取 # 基于CRAN安裝R包,檢測沒有則安裝 p_list = c("igraph","Hmisc","psych","dplyr","tidyr") for(p in p_list){if (!requireNamespace(p)){install.packages(p)} library(p, character.only = TRUE, quietly = TRUE, warn.conflicts = FALSE)}
# 加載R包 Load the package suppressWarnings(suppressMessages(library(igraph))) suppressWarnings(suppressMessages(library(Hmisc))) suppressWarnings(suppressMessages(library(psych))) suppressWarnings(suppressMessages(library(dplyr))) suppressWarnings(suppressMessages(library(tidyr)))
# 載入數(shù)據(jù) # Load data mic <- read.table("data/Species_data.txt", sep="\t", header=T, check.names=F,row.names = 1) mic = apply(mic, 2, function(x) x/100) gene <- read.table("data/KO_data.txt", sep="\t", header=T, check.names=F,row.names = 1) group <- read.table("data/group.txt", sep="\t", header=T, check.names=F) mic <- as.data.frame(t(mic)) mic$sample <- rownames(mic) gene <- as.data.frame(t(gene)) gene$sample <- rownames(gene) df <- merge(mic, gene, by = "sample") rownames(df) <- df$sample df <- df[-1] head(df) # 計(jì)算相關(guān)性并以p>0.05作為篩選閾值進(jìn)行數(shù)據(jù)處理 data<-as.matrix(df) cor<- corr.test(data, method="spearman",adjust="BH") data.cor <- as.data.frame(cor$r) r.cor<-data.frame(cor$r)[91:117,1:90] p.cor<-data.frame(cor$p)[91:117,1:90] r.cor[p.cor>0.05] <- 0 r.cor[abs(r.cor) < 0.3] <- 0 # 構(gòu)建網(wǎng)絡(luò)連接屬性及節(jié)點(diǎn)屬性 # 將數(shù)據(jù)轉(zhuǎn)換為long format進(jìn)行合并并添加連接屬性 r.cor$from = rownames(r.cor) p.cor$from = rownames(p.cor) p_value <- p.cor %>% gather(key = "to", value = "p", -from) %>% data.frame() #p_value$FDR <- p.adjust(p_value$p,"BH") p_value <- p_value[, -3] cor.data<- r.cor %>% gather(key = "to", value = "r", -from) %>% data.frame() %>% left_join(p_value, by=c("from","to")) %>% #diff$p.value <- p.adjust(diff$p.value,"BH") #filter(FDR <= 1e-5, from != to) %>% #filter(p <= 0.001, from != to) %>% mutate( linecolor = ifelse(r > 0,"positive","negative"), linesize = abs(r) ) cor.data <- cor.data[abs(cor.data$r)>0.3, ] write.csv(cor.data, "results/Species_KO_all_correlations_0.2.csv") ###設(shè)置節(jié)點(diǎn)屬性 vertices <- c(as.character(cor.data$from),as.character(cor.data$to)) %>% as_tibble() %>% group_by(value) %>% summarise() colnames(vertices) <- "name" vertices <- vertices %>% left_join(group,by="name") vertices$group <- factor(vertices$group, levels = c("Species","KO" )) vertices <- vertices %>% arrange(group) #構(gòu)建graph數(shù)據(jù)結(jié)構(gòu)并添加網(wǎng)絡(luò)基礎(chǔ)屬性、保存數(shù)據(jù) ###構(gòu)建graph數(shù)據(jù)結(jié)構(gòu) graph <- graph_from_data_frame(cor.data, vertices = vertices, directed = FALSE ) E(graph)$weight <- abs(E(graph)$r) V(graph)$label <- V(graph)$name ###保存數(shù)據(jù) #write_graph(graph, "Healthy_180_net13_new0911.graphml", format="graphml") write_graph(graph, "results/Species_KO_0.2.graphml", format="graphml") # 可視化方式1:基于Gephi軟件進(jìn)行可視化 https://gephi.org/ # 可視化方式2:利用igraph進(jìn)行可視化 g <- graph # 準(zhǔn)備網(wǎng)絡(luò)圖布局?jǐn)?shù)據(jù) # Preparing network diagram layout data。 layout1 <- layout_in_circle(g) layout5 <- layout_with_graphopt(g) ## 設(shè)置繪圖顏色 ## Setting the drawing color color <- c("#879b56","#ce77ad") names(color) <- unique(V(g)$group) V(g)$point.col <- color[match(V(g)$group,names(color))] ## 邊顏色按照相關(guān)性正負(fù)設(shè)置 ## The edge color is set according to the positive or negative correlation #E(g)$color <- ifelse(E(g)$linecolor == "positive","#ff878c",rgb(0,147,0,maxColorValue = 255)) E(g)$color <- ifelse(E(g)$linecolor == "positive","#ff878c","#5ea6c2") pdf("results/network_group_graphopt.pdf",family = "Times",width = 10,height = 12) par(mar=c(5,2,1,2)) plot.igraph(g, layout=layout5, vertex.color=V(g)$point.col, vertex.border=V(g)$point.col, vertex.size=6, vertex.frame.color="white", vertex.label=g$name, vertex.label.cex=0.8, vertex.label.dist=0, vertex.label.degree = pi/2, vertex.label.col="black", edge.arrow.size=0.5, edge.width=abs(E(g)$r)*6, ) # 設(shè)置圖例 legend( title = "group", list(x = min(layout1[,1])-0.05, y = min(layout1[,2])-0.05), legend = c(unique(V(g)$group)), fill = color, #pch=1 ) legend( title = "|r-value|", list(x = min(layout1[,1])+0.6, y = min(layout1[,2])-0.05), legend = c(0.2,0.4,0.6,0.8,1.0), col = "black", lty=1, lwd=c(0.2,0.4,0.6,0.8,1.0)*4, ) legend( title = "Correlation (±)", list(x = min(layout1[,1])+1.0, y = min(layout1[,2])-0.05), legend = c("positive","negative"), col = c("#ff878c",rgb(0,147,0,maxColorValue = 255)), lty=1, lwd=1 ) dev.off() #> png #> 2
Yong-Xin Liu, Lei Chen, Tengfei Ma, Xiaofang Li, Maosheng Zheng, Xin Zhou, Liang Chen, Xubo Qian, Jiao Xi, Hongye Lu, Huiluo Cao, Xiaoya Ma, Bian Bian, Pengfan Zhang, Jiqiu Wu, Ren-You Gan, Baolei Jia, Linyang Sun, Zhicheng Ju, Yunyun Gao, Tao Wen, Tong Chen. 2023. EasyAmplicon: An easy-to-use, open-source, reproducible, and community-based pipeline for amplicon data analysis in microbiome research. iMeta 2: e83. https:///10.1002/imt2.83 Copyright 2016-2024 Defeng Bai baidefeng@caas.cn, Chuang Ma 22720765@stu.ahau.edu.cn, Jiani Xun 15231572937@163.com, Yong-Xin Liu liuyongxin@caas.cn 本公眾號現(xiàn)全面開放投稿,希望文章作者講出自己的科研故事,分享論文的精華與亮點(diǎn)。投稿請聯(lián)系小編(微信號:yongxinliu 或 meta-genomics)
|