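A hallmark of R is its ability to produce polished statistical graphics, and the ggplot2 package was built specifically for plotting.
This post is a brief tour of ggplot2's basic syntax.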
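Layers
First, the plotting logic of ggplot2: as in Photoshop, a plot is built by stacking layers. Layers are joined with "+", and together they form the final figure.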
library(ggplot2)
ggplot(data=mtcars,aes(x=wt,y=mpg))
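Using mtcars as an example, ggplot() creates the base layer with wt on the x-axis and mpg on the y-axis. On its own this draws only the axes, with no data.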
ggplot(data=mtcars,aes(x=wt,y=mpg)) + geom_point()
Adding geom_point() to the base layer with "+" gives a simple scatter plot. Further layers can be added later to build more complex figures.
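To make the layering idea concrete, here is a minimal sketch (not from the original post) that builds the same plot step by step and counts the layers stored in the plot object. The object names base and p, and the extra geom_smooth() layer, are chosen only for illustration.

```r
library(ggplot2)

# Base plot: data and mappings only, no layers yet
base <- ggplot(mtcars, aes(x = wt, y = mpg))

# Each "+" appends one layer to the plot object
p <- base +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE)

length(base$layers)  # 0
length(p$layers)     # 2
```

Because a plot is an ordinary R object, a partially built plot such as base can be stored and reused with different layers.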
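Mappings
A mapping (aesthetic mapping) ties a variable in the data to a visual property of the plot, so that the data is displayed through that property.
Mappings are declared with aes(). Besides the basic x and y, other aesthetics include color, fill, alpha, size, shape, linetype, and so on.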
ggplot(data=mtcars,aes(x=wt,y=mpg,color=as.factor(am)))+ geom_point()
Keeping x and y and adding a color mapping on a factor (am) gives a scatter plot in two colors, one per class.
ggplot(data=mtcars,aes(x=wt,y=mpg,shape=as.factor(cyl)))+ geom_point()
Keeping x and y and adding a shape mapping on cyl, which has three levels, gives a scatter plot with three point shapes.
ggplot(data=mtcars,aes(x=wt,y=mpg,size=qsec,alpha=hp))+ geom_point()
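Two independent mappings can also be combined, here transparency (alpha) and point size.
Only point aesthetics are shown here; lines, areas, and text have their own mappings, which will be introduced as they come up.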
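A related point that is easy to trip over is the difference between mapping an aesthetic to data inside aes() and setting it to a constant outside aes(). The sketch below is an illustration added here, not part of the original examples; it uses layer_data() to inspect the colours each layer ends up with.

```r
library(ggplot2)

# Mapping: colour is driven by a data column, so a scale and a legend are created
p_map <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point(aes(colour = factor(am)))

# Setting: colour is a constant passed outside aes(); no scale and no legend
p_set <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point(colour = "blue")

# layer_data() returns the data a layer is drawn with, after scales are applied
unique(layer_data(p_map)$colour)  # two colours, one per level of am
unique(layer_data(p_set)$colour)  # "blue"
```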
Geoms and statistical transformations
The geom_ family of functions draws a specific type of plot.
Plot | Function |
---|---|
Scatter plot | geom_point() |
Line chart | geom_line() |
Box plot | geom_boxplot() |
Density plot | geom_density() |
Bar chart | geom_bar() |
Violin plot | geom_violin() |
... | ... |
library(gridExtra)
library(ggplot2)
p1 <- ggplot(mtcars, aes(wt, mpg))+geom_point()
p2 <- ggplot(economics, aes(date, unemploy)) + geom_line()
p3 <- ggplot(mpg, aes(class, hwy))+geom_boxplot()
p4 <- ggplot(diamonds, aes(carat))+geom_density()
p5 <- ggplot(mpg, aes(class))+geom_bar()
p6 <- ggplot(mtcars, aes(mpg, factor(cyl)))+geom_violin()
grid.arrange(p1,p2,p3,p4,p5,p6,nrow=2,ncol=3)
The code above shows six common plot types.
Taking geom_point() as an example, here is a quick look at its parameters.
## ?geom_point
geom_point(
mapping = NULL,
data = NULL,
stat = "identity",
position = "identity",
...,
na.rm = FALSE,
show.legend = NA,
inherit.aes = TRUE
)
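- mapping, inherit.aes: mapping takes an aes() call that sets layer-specific mappings; with inherit.aes = TRUE (the default) the layer also inherits the mappings declared in ggplot().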
ggplot(mtcars, aes(wt, mpg))+geom_point(aes(shape=factor(cyl)),color="green",size=3)+geom_smooth()
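Inside geom_point() you can declare mappings again (local mappings are combined with the global ones and override them where they overlap), and you can also pass fixed values such as color and size to change the appearance. A smoothed curve (geom_smooth()) can be layered on top of the points.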
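The global versus local distinction can be checked directly. In this sketch (an added illustration; p_global and p_local are names made up here), the same colour mapping is declared once in ggplot() and once only inside geom_point(), and the number of groups seen by geom_smooth() is compared.

```r
library(ggplot2)

# Global mapping: colour is declared in ggplot(), so every layer inherits it
# and geom_smooth() fits one line per cyl group
p_global <- ggplot(mtcars, aes(wt, mpg, colour = factor(cyl))) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE)

# Local mapping: colour is declared only inside geom_point(), so
# geom_smooth() sees no grouping and fits a single line
p_local <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point(aes(colour = factor(cyl))) +
  geom_smooth(method = "lm", se = FALSE)

# Number of groups in the smoothing layer (layer 2) of each plot
length(unique(layer_data(p_global, 2)$group))  # 3
length(unique(layer_data(p_local, 2)$group))   # 1
```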
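Both geom_ (geometric objects) and stat_ (statistical transformations) functions add layers, and they produce similar results.
- geom_ functions focus on what is drawn; the stat argument selects the statistical transformation.
- stat_ functions focus on the transformation; the geom argument selects what is drawn.
The two are interchangeable, and the help pages document each pair together.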
geom_bar(
mapping = NULL,
data = NULL,
stat = "count",
position = "stack",
...,
width = NULL,
na.rm = FALSE,
orientation = NA,
show.legend = NA,
inherit.aes = TRUE
)
stat_count(
mapping = NULL,
data = NULL,
geom = "bar",
position = "stack",
...,
width = NULL,
na.rm = FALSE,
orientation = NA,
show.legend = NA,
inherit.aes = TRUE
)
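geom_bar with stat = "count" and stat_count with geom = "bar" produce the same bar chart.
- geom_bar uses the count transformation by default, converting the data to frequencies before drawing the bars.
- stat_count applies the count transformation and uses the bar geom by default to draw the result.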
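As a quick check of that equivalence, the following sketch (added for illustration, using the mpg data from the earlier examples) builds the chart both ways and compares the counts computed for each layer.

```r
library(ggplot2)

# Same chart, written from the geom side and from the stat side
p_geom <- ggplot(mpg, aes(class)) + geom_bar()               # stat = "count" by default
p_stat <- ggplot(mpg, aes(class)) + stat_count(geom = "bar")

# Both layers carry the same computed counts
counts_geom <- layer_data(p_geom)$count
counts_stat <- layer_data(p_stat)$count
identical(counts_geom, counts_stat)
```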
Scales
Most of the plots so far used default settings; the scale_ family of functions adjusts the details.
Axes
1. Axis ticks and labels
For continuous variables, use scale_x_continuous and scale_y_continuous.
scale_x_continuous(
name = waiver(),
breaks = waiver(),
minor_breaks = waiver(),
n.breaks = NULL,
labels = waiver(),
limits = NULL,
expand = waiver(),
oob = censor,
na.value = NA_real_,
trans = "identity",
guide = waiver(),
position = "bottom",
sec.axis = waiver()
)
- name: sets the axis title; labs() achieves the same result.
library(gridExtra)
p1 <- ggplot(mtcars, aes(wt, mpg))+geom_point()+scale_x_continuous(name="AAA")
p2 <- ggplot(mtcars, aes(wt, mpg))+geom_point()+labs(x="BBB")
grid.arrange(p1,p2,ncol=2)
- breaks: sets the positions of the axis ticks; combined with labels, it also sets the text shown at each tick.
ggplot(mpg, aes(displ, hwy))+geom_point()+ scale_x_continuous(breaks = c(2, 4, 6),labels = c("two", "four", "six"))
- limits: restricts the range of the axis; xlim() has the same effect.
library(gridExtra)
p1 <- ggplot(mtcars, aes(wt, mpg))+geom_point()+scale_x_continuous(name="AAA",limits=c(1,7)) ## limit the x-axis to 1 through 7
p2 <- ggplot(mtcars, aes(wt, mpg))+geom_point()+xlim(1,8)
grid.arrange(p1,p2,ncol=2)
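One consequence of the oob = censor default in the signature above is that scale limits discard values outside the range, whereas coord_cartesian() (covered later under coordinate systems) only zooms the view. The sketch below is an added illustration with made-up limits of 2 to 4; it counts how many x values survive in each case.

```r
library(ggplot2)

# Scale limits: values outside [2, 4] are censored (turned into NA)
p_scale <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  scale_x_continuous(limits = c(2, 4))

# Coordinate limits: the view is zoomed, but all data is kept
p_coord <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  coord_cartesian(xlim = c(2, 4))

kept_scale <- sum(!is.na(layer_data(p_scale)$x))  # only cars with wt in [2, 4]
kept_coord <- sum(!is.na(layer_data(p_coord)$x))  # all 32 cars
c(scale = kept_scale, coord = kept_coord)
```

The difference matters most for layers that compute statistics, such as geom_smooth() or geom_boxplot(), because censored values are excluded from the computation.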
library(gridExtra)
df <- data.frame(
x = rnorm(10) * 100000,
y = seq(0, 1, length.out = 10)
)
p1 <- ggplot(df, aes(y, x)) + geom_point()+scale_x_continuous(labels = scales::percent,name="percent")
p2 <- ggplot(df, aes(y, x)) + geom_point()+scale_x_continuous(labels = scales::dollar,name="dollar")
grid.arrange(p1,p2,ncol=2)
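Passing scales::percent or scales::dollar to labels formats the x-axis tick labels as percentages or dollar amounts, respectively.
- trans: the axis transformation. Built-in options include:
"asn", "atanh", "boxcox", "date", "exp", "hms", "identity", "log", "log10", "log1p", "log2", "logit", "modulus", "probability", "probit", "pseudo_log", "reciprocal", "reverse", "sqrt", "time"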
p1 <- ggplot(mtcars, aes(wt, mpg))+geom_point()+scale_x_continuous(name="None")
p2 <- ggplot(mtcars, aes(wt, mpg))+geom_point()+scale_x_continuous(name="log2",trans="log2")
grid.arrange(p1,p2,ncol=2)
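To see what a transformation does to the data, the sketch below (an added illustration) compares the x positions used for drawing with log2 of the original values. It keeps the argument name trans used in this post; recent ggplot2 releases appear to prefer the name transform, so treat the argument name as version-dependent.

```r
library(ggplot2)

p_log <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  scale_x_continuous(name = "log2", trans = "log2")

# Positions are computed on the transformed scale,
# while the tick labels still show the original units
x_drawn <- layer_data(p_log)$x
all.equal(x_drawn, log2(mtcars$wt))
```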
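- position: sets where the axis is drawn: "top" or "bottom" for the x-axis, "left" or "right" for the y-axis
For discrete data
scale_x_discrete() and scale_y_discrete() work much like their continuous counterparts, and most of the parameters carry over.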
ggplot(diamonds, aes(cut))+geom_bar()+scale_x_discrete("Cut",labels = c("Fair" = "F","Good" = "G","Very Good" = "VG","Perfect" = "P","Ideal" = "I"))
## replaces the axis title and the tick labels
For date variables
Use scale_x_date() and scale_y_date().
scale_x_date(
name = waiver(),
breaks = waiver(),
date_breaks = waiver(),
labels = waiver(),
date_labels = waiver(),
minor_breaks = waiver(),
date_minor_breaks = waiver(),
limits = NULL,
expand = waiver(),
oob = censor,
guide = waiver(),
position = "bottom",
sec.axis = waiver()
)
library(gridExtra)
last_month <- Sys.Date() - 0:29 ## the 30 days up to and including today
df <- data.frame(
date = last_month,
price = runif(30)
) ## one random value for each of the 30 dates
base <- ggplot(df, aes(date, price)) +
geom_line()
p1 <- base + scale_x_date(date_labels = "%b %d")+labs(title="p1")
p2 <- base + scale_x_date(date_breaks = "1 week", date_labels = "%W")+labs(title="p2")
p3 <- base + scale_x_date(date_minor_breaks = "1 day")+labs(title="p3")
p4 <- base + scale_x_date(limits = c(Sys.Date() - 7, NA))+labs(title="p4")
grid.arrange(p1,p2,p3,p4,ncol=2)
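In p1, date_labels formats the date labels using strftime codes.
In p2, date_breaks sets the spacing of the major breaks.
In p3, date_minor_breaks sets the spacing of the minor breaks between them.
In p4, limits restricts the axis to the last seven days.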
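The date_labels strings are the same format codes that base R's format() accepts for Date objects, so they can be tried out on a single date before being used in a scale. The sketch below is an added illustration with an arbitrary fixed date and a made-up two-week series.

```r
library(ggplot2)

d <- as.Date("2024-03-05")
format(d, "%Y-%m-%d")  # four-digit year, month, day
format(d, "%m/%d")     # numeric month and day
format(d, "%b %d")     # abbreviated month name (locale-dependent) and day
format(d, "%W")        # week of the year, counting from Monday

# The same codes inside a date scale, on a fixed two-week series
df <- data.frame(
  date  = as.Date("2024-03-01") + 0:13,
  price = seq(1, 2, length.out = 14)
)
p <- ggplot(df, aes(date, price)) +
  geom_line() +
  scale_x_date(date_breaks = "1 week", date_labels = "%m/%d")
p
```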
Plot titles
labs() sets or changes the text elements of a plot.
labs(
...,
title = waiver(),
subtitle = waiver(),
caption = waiver(),
tag = waiver(),
alt = waiver(),
alt_insight = waiver()
)
p <- ggplot(mtcars, aes(mpg, wt, colour = cyl)) + geom_point()
p
p_re <- p+labs(x="XXX",y="YYY",title="title",subtitle="subtitle",
tag="tag",caption="caption",colour="colour",alt="This is alt")
p_re
##############
> get_alt_text(p_re)
[1] "This is alt"
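Comparing the two plots shows where tag, title, subtitle, and the other elements are placed.
The alt argument holds a text description of the plot. It is not drawn on the figure; retrieve it with get_alt_text().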
Colors
1. Color gradients
scale_colour_gradient(
...,
low = "#132B43",
high = "#56B1F7",
space = "Lab",
na.value = "grey50",
guide = "colourbar",
aesthetics = "colour"
)
- guide: the legend type, "colourbar" for continuous scales or "legend" for discrete ones
- aesthetics: the aesthetic the scale applies to, "fill" or "colour"
ggplot(mpg, aes(displ, hwy, color = hwy))+geom_point()+scale_color_gradient(low = "#132B43", high = "#56B1F7",guide="colourbar")
A scatter plot colored on a gradient from "#132B43" to "#56B1F7".
2. Palette colors
scale_colour_brewer(
...,
type = "seq",
palette = 1,
direction = 1,
aesthetics = "colour"
)
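- type: seq (sequential), div (diverging), or qual (qualitative)
- palette: the palette to use, given either by name or as a number indexing into the list of palettes for the chosen type. You can also build your own palette.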
library(RColorBrewer)
display.brewer.all() ## show all available palettes
- direction: the order of the colors, 1 for the palette's normal order, -1 for reversed
library(gridExtra)
dsamp <- diamonds[sample(nrow(diamonds), 1000), ]
d <- ggplot(dsamp, aes(carat, price)) + geom_point(aes(colour = clarity)) + labs(title="default")
d1 <- ggplot(dsamp, aes(carat, price)) + geom_point(aes(colour = clarity)) + scale_color_brewer(palette="BuGn") +labs(title="BuGn")
d2 <- ggplot(dsamp, aes(carat, price)) + geom_point(aes(colour = clarity)) + scale_color_brewer(palette="Set1") + labs(title="Set1")
d3 <- ggplot(dsamp, aes(carat, price)) + geom_point(aes(colour = clarity)) + scale_color_brewer(palette="PiYG") + labs(title="PiYG")
grid.arrange(d,d1,d2,d3,nrow=2)
Adjusting other aesthetics
scale_alpha(), scale_shape(), and scale_size()
p1 <- ggplot(mpg, aes(displ, hwy)) + geom_point(aes(alpha = year))
p2 <- ggplot(mpg, aes(displ, hwy)) + geom_point(aes(alpha = year)) + scale_alpha(range = c(0.4, 0.8))
grid.arrange(p1,p2)
This restricts the alpha range to 0.4 through 0.8.
dsmall <- diamonds[sample(nrow(diamonds), 100), ]
d <- ggplot(dsmall, aes(carat, price)) + geom_point(aes(shape = cut))
d1 <- ggplot(dsmall, aes(carat, price)) + geom_point(aes(shape = cut)) + scale_shape(solid=FALSE)
grid.arrange(d,d1)
The solid argument controls whether the point shapes are filled (solid) or hollow.
p1 <- ggplot(mpg, aes(displ, hwy, size = hwy)) + geom_point()
p2 <- ggplot(mpg, aes(displ, hwy, size = hwy)) + geom_point() + scale_size(range=c(0,10))
p3 <- ggplot(mpg, aes(displ, hwy, size = hwy)) + geom_point() + scale_size_binned()
grid.arrange(p1,p2,p3)
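The range argument sets the minimum and maximum point size.
scale_size_binned() bins the values so the legend is easier to read.
Coordinate systems
ggplot2 provides several functions for changing the coordinate system.
1. coord_cartesian
The default Cartesian coordinate system.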
coord_cartesian(
xlim = NULL,
ylim = NULL,
expand = TRUE,
default = FALSE,
clip = "on"
)
p1 <- ggplot(mtcars, aes(disp, wt))+geom_point()+geom_smooth()
p2 <- ggplot(mtcars, aes(disp, wt))+geom_point()+geom_smooth()+coord_cartesian(expand=FALSE)
grid.arrange(p1,p2)
- clip: whether drawing is clipped to the panel. With "on" (the default), anything outside the panel is hidden; with "off", it is drawn.
p1 <- ggplot(mtcars, aes(disp, wt))+geom_point()+geom_smooth()+coord_cartesian(expand=FALSE,clip="off")+labs(title = "clip=\"off\"")
p2 <- ggplot(mtcars, aes(disp, wt))+geom_point()+geom_smooth()+coord_cartesian(expand=FALSE)+labs(title = "clip=\"on\"")
grid.arrange(p1,p2)
2. coord_fixed()
Fixes the aspect ratio between the x- and y-axis units.
coord_fixed(ratio = 1, xlim = NULL, ylim = NULL, expand = TRUE, clip = "on")
- ratio: the y/x ratio, 1 by default, meaning one unit on the x-axis is drawn the same length as one unit on the y-axis
p1 <- ggplot(mtcars, aes(mpg, wt)) + geom_point() + labs(title="default")
p2 <- ggplot(mtcars, aes(mpg, wt)) + geom_point()+coord_fixed() + labs(title="ratio=1")
p3 <- ggplot(mtcars, aes(mpg, wt)) + geom_point()+coord_fixed(ratio=5) + labs(title="ratio=5")
grid.arrange(p1,p2,p3)
3. coord_flip()
Swaps the x- and y-axes.
p1 <- ggplot(diamonds, aes(cut, price))+geom_boxplot()
p2 <- ggplot(diamonds, aes(cut, price))+geom_boxplot()+coord_flip()
grid.arrange(p1,p2)
4. coord_polar
Converts to polar coordinates.
coord_polar(theta = "x", start = 0, direction = 1, clip = "on")
pie <- ggplot(mtcars, aes(x = factor(1), fill = factor(cyl)))+geom_bar(width = 1)
p1 <- pie + coord_polar()
p2 <- pie + coord_polar(theta="y")
grid.arrange(p1,p2)
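Converting a stacked bar chart to polar coordinates with theta = "y", so that the y values map to the angle, gives a pie chart.
- direction: the drawing direction, 1 for clockwise, -1 for counterclockwise
A rose chart is drawn the same way, by converting a bar chart to polar coordinates.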
p1 <- ggplot(mpg,aes(class,fill=model))+geom_bar()+theme(legend.position="none")
p2 <- ggplot(mpg,aes(class,fill=model))+geom_bar()+coord_polar()+theme(legend.position="none")
grid.arrange(p1,p2)
Themes
ggplot2 plots have a gray background by default. The built-in theme_ functions provide ready-made themes, and theme() lets you customize your own.
library(patchwork)
mtcars2 <- within(mtcars, {
vs <- factor(vs, labels = c("V-shaped", "Straight"))
am <- factor(am, labels = c("Automatic", "Manual"))
cyl <- factor(cyl)
gear <- factor(gear)
})
p <- ggplot(mtcars2) + geom_point(aes(x = wt, y = mpg, colour = gear))
p1 <- p+theme_gray()+labs(title="theme_gray")
p2 <- p+theme_bw()+labs(title="theme_bw")
p3 <- p+theme_linedraw()+labs(title="theme_linedraw")
p4 <- p+theme_light()+labs(title="theme_light")
p5 <- p+theme_dark()+labs(title="theme_dark")
p6 <- p+theme_minimal()+labs(title="theme_minimal")
p7 <- p+theme_classic()+labs(title="theme_classic")
p8 <- p+theme_void()+labs(title="theme_void")
p9 <- p+theme_test()+labs(title="theme_test")
(p1 / p4 / p7) | (p2 / p5 / p8) | (p3 / p6 / p9)
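Those are nine of the built-in themes.
theme(......)
The range of things theme() can customize is too large to cover here; that is left for a separate post.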
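Until then, here is a small taste of theme(), offered only as a sketch: it starts from a preset theme and overrides three elements. The particular elements and values are arbitrary choices made for illustration.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(wt, mpg, colour = factor(gear))) +
  geom_point() +
  labs(title = "Custom theme", colour = "gear") +
  theme_bw() +                                              # start from a preset theme
  theme(
    plot.title = element_text(face = "bold", hjust = 0.5),  # bold, centred title
    panel.grid.minor = element_blank(),                     # drop minor grid lines
    legend.position = "bottom"                              # move the legend below the panel
  )
p
```

Order matters here: a complete theme such as theme_bw() replaces earlier settings, so the theme() call with the overrides comes after it.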
Annotations
An annotation is a special layer that does not inherit the global settings. Use annotate() to add annotations to a plot.
annotate(
geom,
x = NULL,
y = NULL,
xmin = NULL,
xmax = NULL,
ymin = NULL,
ymax = NULL,
xend = NULL,
yend = NULL,
...,
na.rm = FALSE
)
- The main argument is geom, which sets the type of annotation to add, such as text ("text") or a rectangle ("rect"). The remaining arguments depend on the geom.
ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point() +
annotate("text", x = 4, y = 20, label = "text", size = 10, colour = "green")
This adds the text "text" at (4, 20); the text style can be customized as well.
ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point() +
annotate("rect", xmin = 3, xmax = 4.2, ymin = 12, ymax = 21,alpha = .2,fill="green")
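This draws a rectangle with a given transparency and fill color.
ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point() +
annotate("segment", x = 2.5, xend = 4, y = 15, yend = 25,colour = "blue",size=1)
"segment" draws a line segment between two points.
Among the geom_ functions, geom_abline() (slope and intercept), geom_hline() (horizontal line), and geom_vline() (vertical line) can also draw reference lines.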
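A minimal sketch of those three reference-line geoms follows. It is an added illustration: the lines are placed at the means of mpg and wt, and the slope and intercept come from a simple lm() fit, all chosen only to have something to draw.

```r
library(ggplot2)

fit <- lm(mpg ~ wt, data = mtcars)  # simple linear fit, used only for the abline

p <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  geom_hline(yintercept = mean(mtcars$mpg), linetype = "dashed") +  # horizontal line
  geom_vline(xintercept = mean(mtcars$wt), linetype = "dashed") +   # vertical line
  geom_abline(intercept = coef(fit)[["(Intercept)"]],
              slope = coef(fit)[["wt"]], colour = "blue")           # fitted line
p
```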
Legends
1. Continuous variables
guide_colourbar() or guide_colorbar()
guide_colorbar(
title = waiver(),
title.position = NULL,
title.theme = NULL,
title.hjust = NULL,
title.vjust = NULL,
label = TRUE,
label.position = NULL,
label.theme = NULL,
label.hjust = NULL,
label.vjust = NULL,
barwidth = NULL,
barheight = NULL,
nbin = 300,
raster = TRUE,
frame.colour = NULL,
frame.linewidth = 0.5,
frame.linetype = 1,
ticks = TRUE,
ticks.colour = "white",
ticks.linewidth = 0.5,
draw.ulim = TRUE,
draw.llim = TRUE,
direction = NULL,
default.unit = "line",
reverse = FALSE,
order = 0,
available_aes = c("colour", "color", "fill"),
...
)
- barwidth, barheight: the width and height of the colour bar
p <- ggplot(mtcars,aes(drat,mpg,fill=qsec))+geom_point(shape=21) ## shape 21 has a fill, so the fill mapping is visible
p1 <- p + guides(fill = guide_colourbar(title="title",label=FALSE,
title.position="bottom",barwidth=1,
frame.colour = "black",ticks = FALSE))
grid.arrange(p,p1)
The guide function must be passed to guides(), keyed by the aesthetic it applies to.
2. Discrete variables
guide_legend(
title = waiver(),
title.position = NULL,
title.theme = NULL,
title.hjust = NULL,
title.vjust = NULL,
label = TRUE,
label.position = NULL,
label.theme = NULL,
label.hjust = NULL,
label.vjust = NULL,
keywidth = NULL,
keyheight = NULL,
direction = NULL,
default.unit = "line",
override.aes = list(),
nrow = NULL,
ncol = NULL,
byrow = FALSE,
reverse = FALSE,
order = 0,
...
)
- keywidth, keyheight: the size of the box around each legend key
p1 <- ggplot(mtcars, aes(drat, mpg, colour = factor(cyl)))+geom_point()
p2 <- ggplot(mtcars, aes(drat, mpg, colour = factor(cyl)))+geom_point()+guides(colour=guide_legend(title = "title",keyheight=2))
grid.arrange(p1,p2)
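Faceting
Faceting draws several subplots according to a grouping in the data, so that additional variables can be shown in a set of two-dimensional panels.
There are two faceting functions, facet_grid() and facet_wrap().
1. facet_grid()
Lays the panels out in a grid, with variables defining the rows and the columns.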
facet_grid(
rows = NULL,
cols = NULL,
scales = "fixed",
space = "fixed",
shrink = TRUE,
labeller = "label_value",
as.table = TRUE,
switch = NULL,
drop = TRUE,
margins = FALSE,
facets = NULL
)
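- rows, cols: the variables that define the rows and columns, given with vars(), or with the formula shorthand row_variable ~ column_variable, where . stands for an empty side
- scales: whether panels share axes; "fixed" shares them, "free" does not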
p<-ggplot(mpg, aes(cty, hwy)) + geom_point(size=2,alpha=0.4)
p1 <- p + facet_grid(rows=vars(fl))
p2 <- p + facet_grid(.~fl)
p3 <- p + facet_grid(vars(drv),vars(fl))
p4 <- p + facet_grid(drv~fl,scales="free")
grid.arrange(p1,p2,p3,p4,ncol=2)
2. facet_wrap()
Creates one panel per group, then wraps the panels into rows in order.
Formula shorthand: ~ variable1 + variable2
facet_wrap(
facets,
nrow = NULL,
ncol = NULL,
scales = "fixed",
shrink = TRUE,
labeller = "label_value",
as.table = TRUE,
switch = NULL,
drop = TRUE,
dir = "h",
strip.position = "top"
)
p <- ggplot(mpg, aes(displ, hwy)) + geom_point()
p1 <- p + facet_wrap(vars(class))
p2 <- p + facet_wrap(vars(class), nrow = 4)
p3 <- p + facet_wrap(~cyl+drv)
p4 <- p + facet_wrap(vars(cyl, drv), labeller = "label_both")
grid.arrange(p1,p2,p3,p4,nrow=2)
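One practical difference between the two functions is how they treat combinations that do not occur in the data. The sketch below is an added illustration; it facets the same plot by cyl and drv both ways and counts the panels.

```r
library(ggplot2)

p <- ggplot(mpg, aes(displ, hwy)) + geom_point()

# facet_wrap() creates one panel per combination that occurs in the data
p_wrap <- p + facet_wrap(vars(cyl, drv))
# facet_grid() creates every row x column cell, including empty ones
p_grid <- p + facet_grid(rows = vars(drv), cols = vars(cyl))

n_wrap   <- length(unique(layer_data(p_wrap)$PANEL))          # panels drawn by facet_wrap
n_combos <- nrow(unique(mpg[c("cyl", "drv")]))                # combinations present in mpg
n_cells  <- length(unique(mpg$cyl)) * length(unique(mpg$drv)) # cells in the facet_grid layout
c(wrap = n_wrap, combos = n_combos, grid_cells = n_cells)
```

facet_wrap() suits a single grouping with many levels, while facet_grid() suits two variables whose full cross is meaningful.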
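That covers the basic syntax.
There are many more details to ggplot2, such as custom themes and the individual geom_ functions. I will write those up as I use them, or study them separately later.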