諏圖——ggcor簡介（十四）

生物_醫(yī)藥_科研 2020-04-02

展開全文

這一期會很簡單，主要是提供一種mantel檢驗的替代方案——procrutes 檢驗。procrutes 檢驗在圖像識別領域用的比較多，用來比較兩組數(shù)據(jù)之間的相似性，本質(zhì)上說是通過平移、旋轉(zhuǎn)、縮放等一系列變換，使原數(shù)據(jù)集的點匹配到目標數(shù)據(jù)集上，并使其和對應點的誤差平方和最小。和mantel類似，p值也是通過置換檢驗得到的。原理的東西確實懂得不多，只能班門弄斧，強行解釋。

安裝

這部分內(nèi)容應該是0.9.4就加進去了，一直沒有介紹，若是0.9.4版本，可以不用更新。

if(!require(ggcor)) {
  devtools::install_github('houyunhuang/ggcor')
}
## 若之前安裝過老版本，加上`force = TRUE`參數(shù)

packageVersion('ggcor')

## [1] '0.9.5'

檢驗

ggcor并不會像vegan、ade4包一樣，詳細的分析一組procrutes 檢驗的結果，而是用來處理多組之間比較和檢驗。默認情況下，spec整體當成一組，env是每列一組，然后分析每個spec組和每個env組之間的關系，也就是說默認情況下是比較的spec整體和env每個單列之間的相似性。procrutes_test()和mantel_test()函數(shù)之所以有如此變態(tài)的設置，是因為science的那個組合圖，但是這樣也會誤導，讓人覺得這兩種檢驗就該這樣使用。

library(ggcor)
args(procrutes_test)

## function (spec, env, procrutes.fun = 'protest', spec.select = NULL,

## env.select = NULL, spec.pre.fun = 'identity',

## spec.pre.params = list(),

## env.pre.fun = spec.pre.fun,

## env.pre.params = spec.pre.params,

##           ...)
## NULL

data('varespec', package = 'vegan')
data('varechem', package = 'vegan')
procrutes_test(varespec, varechem)

## # A tibble: 14 x 4
##    spec  env          r p.value
##  * <chr> <chr>    <dbl>   <dbl>
##  1 spec  N        0.326   0.072
##  2 spec  P        0.329   0.041
##  3 spec  K        0.301   0.092
##  4 spec  Ca       0.324   0.059
##  5 spec  Mg       0.302   0.087
##  6 spec  S        0.251   0.21
##  7 spec  Al       0.452   0.003
##  8 spec  Fe       0.410   0.01
##  9 spec  Mn       0.443   0.003
## 10 spec  Zn       0.269   0.152
## 11 spec  Mo       0.139   0.76
## 12 spec  Baresoil 0.428   0.006
## 13 spec  Humdepth 0.436   0.008
## 14 spec  pH       0.358   0.016

若我們想自己設置spec中的分組順序，這和mantel_test()完全一致，通過spec.select參數(shù)設置，同樣的道理也適用于env。再次強調(diào)，每個分組都要指定名字，否則內(nèi)部直接自動命名。

## 自定義spec分組
procrutes_test(varespec, varechem, 
               spec.select = list(A = 1:7, B = 8:40))

## # A tibble: 28 x 4
##    spec  env       r p.value
##  * <chr> <chr> <dbl>   <dbl>
##  1 A     N     0.301   0.073
##  2 A     P     0.107   0.926
##  3 A     K     0.260   0.191
##  4 A     Ca    0.219   0.314
##  5 A     Mg    0.205   0.378
##  6 A     S     0.211   0.415
##  7 A     Al    0.392   0.006
##  8 A     Fe    0.309   0.08
##  9 A     Mn    0.202   0.424
## 10 A     Zn    0.104   0.895
## # … with 18 more rows

## 自定義spec和env分組
procrutes_test(varespec, varechem, 
               spec.select = list(A = 1:7, B = 8:40),
               env.select = list(Env01 = 1:7, Env02 = 8:14))

## # A tibble: 4 x 4
##   spec  env       r p.value
## * <chr> <chr> <dbl>   <dbl>
## 1 A     Env01 0.382   0.063
## 2 A     Env02 0.367   0.119
## 3 B     Env01 0.476   0.007
## 4 B     Env02 0.524   0.001

預處理

procrutes檢驗比mantel檢驗復雜的地方是，mantel是先進行距離變換，而procrutes可以做任何變換，只要保證輸出結果是矩陣即可。目前我看到的有主成份降維和monoMDS處理。procrutes_test()函數(shù)提供了非常靈活的接口，支持自定義預處理方式。默認情況下，spec和env的預處理函數(shù)完全一樣，都是identity，即不做任何變換。

我們先來看看主成份變換的例子。dudi_pca()函數(shù)是ggcor封裝的ade4::dudi.pca()函數(shù)，若是要自定義主成份分析參數(shù)，可以通過spec.pre.params處理。

procrutes_test(varespec, varechem, 
               spec.select = list(A = 1:7, B = 8:40),
               env.select = list(Env01 = 1:7, Env02 = 8:14), 
               spec.pre.fun = 'dudi_pca')

## # A tibble: 4 x 4
##   spec  env       r p.value
## * <chr> <chr> <dbl>   <dbl>
## 1 A     Env01 0.454   0.106
## 2 A     Env02 0.445   0.252
## 3 B     Env01 0.496   0.036
## 4 B     Env02 0.557   0.003

## 求pca之前先中心標準化
procrutes_test(varespec, varechem, 
               spec.select = list(A = 1:7, B = 8:40),
               env.select = list(Env01 = 1:7, Env02 = 8:14), 
               spec.pre.fun = 'dudi_pca',
               spec.pre.params = list(center = TRUE))

## # A tibble: 4 x 4
##   spec  env       r p.value
## * <chr> <chr> <dbl>   <dbl>
## 1 A     Env01 0.454   0.096
## 2 A     Env02 0.445   0.245
## 3 B     Env01 0.496   0.044
## 4 B     Env02 0.557   0.001

monoMDS變換也很簡單，只需要調(diào)整下spec.pre.fun參數(shù)就行。

procrutes_test(varespec, varechem, 
               spec.select = list(A = 1:7, B = 8:40),
               env.select = list(Env01 = 1:7, Env02 = 8:14), 
               spec.pre.fun = 'mono_mds')

## # A tibble: 4 x 4
##   spec  env       r p.value
## * <chr> <chr> <dbl>   <dbl>
## 1 A     Env01 0.416   0.029
## 2 A     Env02 0.358   0.091
## 3 B     Env01 0.521   0.006
## 4 B     Env02 0.532   0.002

也有可能，ggcor直接支持的上述三種預處理方式都不能滿足你的需求，那么你需要自定義預處理函數(shù)。這里我們定義一個行極值標準化的預處理函數(shù)作為例子，這個只是表面可以這樣定義，不具備理論意義。

row_scale <- function(x, na.rm = TRUE) {
  if(!is.matrix(x))
    x <- as.matrix(x)
  row.max <- apply(x, 1, max, na.rm = na.rm)
  row.min <- apply(x, 1, min, na.rm = na.rm)
  (x - row.min) / (row.max - row.min)
}

procrutes_test(varespec, varechem, 
               spec.select = list(A = 1:7, B = 8:40),
               env.select = list(Env01 = 1:7, Env02 = 8:14), 
               spec.pre.fun = 'row_scale')

## # A tibble: 4 x 4
##   spec  env       r p.value
## * <chr> <chr> <dbl>   <dbl>
## 1 A     Env01 0.359   0.109
## 2 A     Env02 0.320   0.389
## 3 B     Env01 0.474   0.006
## 4 B     Env02 0.512   0.003

組合圖

既然procrutes檢驗是作為mantel檢驗的替代，那么當然也是可以很方便的來做組合圖的，看個例子，不啰嗦了。

library(ggplot2)
corr <- fortify_cor(varechem, type = 'upper') ## 只能是上下三角 我們保留上三角
procrutes <- procrutes_test(varespec, varechem,
                            spec.select = list(Spec01 = 1:7,
                                               Spec02 = 8:18,
                                               Spec03 = 19:37,
                                               Spec04 = 38:44),
                            spec.pre.fun = 'mono_mds') %>%
  combination_layout(cor_tbl = corr) %>%
  mutate(xend = xend + 1,
         rd = cut(r, breaks = c(-Inf, 0.2, 0.4, Inf),
                  labels = c('< 0.2', '0.2 - 0.4', '>= 0.4')),
         pd = cut(p.value, breaks = c(-Inf, 0.01, 0.05, Inf),
                  labels = c('< 0.01', '0.01 - 0.05', '>= 0.05')))

## 先關閉映射繼承
options(ggcor.link.inherit.aes = FALSE)
quickcor(corr) + geom_square() +
  geom_link(aes(colour = pd, size = rd), data = procrutes,
            curvature = 0.05) +
  geom_link_point(data = procrutes) +
  geom_start_label(aes(x = x - 0.5), hjust = 1, data = procrutes) +
  scale_size_manual(values = c(0.5, 1, 2)) +
  scale_colour_manual(values = c('#D95F02', '#1B9E77', '#A2A2A288')) +
  guides(size = guide_legend(title = 'Procrutes' r',
                             override.aes = list(colour = 'grey35'),
                             order = 2),
         colour = guide_legend(title = 'Procrutes' p',
                               override.aes = list(size = 3),
                               order = 1),
         fill = guide_colorbar(title = 'Pearson's r', order = 3)) +
  expand_axis(x = -6)