Counting with awk arrays

土心園 2018-07-25

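Process the file contents below: extract the domain from each line, count how often each domain occurs, and sort the result (an interview question from Baidu and Sohu):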

http://www./index.html
http://www./1.html
http://post./index.html
http://mp3./index.html
http://www./3.html
http://post./2.html

Expected result:

mp3. 1
post. 2
www. 3

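Approach:

Extract the domain
1. Split each line on the slash and take the second field (the domain)

Process it
1. Create an array
2. Use the second field (the domain) as the array subscript
3. Count occurrences with an i++ style increment
4. Print the result once counting is done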

1. View the file to be processed

[root@martin ~]# cat test.txt
http://www./index.html
http://www./1.html
http://post./index.html
http://mp3./index.html
http://www./3.html
http://post./2.html

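2. Use the slash as the field separator and take the second field. The + means "one or more in a row", so // counts as a single separator.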

[root@martin ~]# awk -F "/+" '{print $2}' test.txt
www.
www.
post.
mp3.
www.
post.
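
To see why the domain lands in $2, it can help to print the fields of a single line under both separators. The sketch below assumes a POSIX awk and reuses the first line of test.txt. A multi-character separator such as "/+" is treated as a regular expression, while a plain "/" splits on every slash and leaves an empty field between the two slashes.

```shell
# One sample line, taken from test.txt as shown above.
line='http://www./index.html'

# With -F "/+", the "//" after "http:" is a single separator: 3 fields.
with_plus=$(printf '%s\n' "$line" | awk -F "/+" '{print NF ": " $1 " | " $2 " | " $3}')

# With -F "/", the empty string between the two slashes becomes field 2: 4 fields.
with_single=$(printf '%s\n' "$line" | awk -F "/" '{print NF ": " $1 " | " $2 " | " $3}')

printf '%s\n%s\n' "$with_plus" "$with_single"
```

With a plain "/" the domain would therefore be in $3 rather than $2.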

3. Create the array and count

[root@martin ~]# awk -F "/+" '{hotel[$2]}' test.txt             # create the array (prints nothing)
[root@martin ~]# awk -F "/+" '{hotel[$2];print $2}' test.txt    # create the array and print each subscript
www.
www.
post.
mp3.
www.
post.
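
Merely referencing hotel[$2] is enough to create the element, which is why the first command above prints nothing yet still builds the array. As a rough check, assuming the same six lines as test.txt, the sketch below counts the distinct subscripts in an END block (the variable names n and k are only illustrative).

```shell
# Feed the six sample lines to awk and count how many distinct subscripts exist.
distinct=$(printf '%s\n' \
    'http://www./index.html' \
    'http://www./1.html' \
    'http://post./index.html' \
    'http://mp3./index.html' \
    'http://www./3.html' \
    'http://post./2.html' |
    awk -F "/+" '
        { hotel[$2] }                        # a bare reference creates the element
        END { n = 0; for (k in hotel) n++    # one iteration per distinct subscript
              print n }')
echo "$distinct"
```

Six input lines collapse to three elements because repeated domains reuse the same subscript.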

[root@martin ~]# awk -F "/+" '{hotel[$2]++}' test.txt                    # increment the counter stored under each subscript
[root@martin ~]# awk -F "/+" '{hotel[$2]++;print $2,hotel[$2]}' test.txt # print the subscript and its running count
www. 1
www. 2
post. 1
mp3. 1
www. 3
post. 2

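$2 is the second field of each line, so its value changes from line to line. hotel[$2]++ works like i++, except that the variable i is replaced by the array element hotel[$2].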
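
The counter needs no initialisation because awk treats an unset array element as 0 in a numeric context. A minimal sketch of that behaviour, using a BEGIN block so no input file is needed (the variable name before is only illustrative):

```shell
result=$(awk 'BEGIN {
    before = hotel["www."]++   # post-increment: yields the old value (0), then stores 1
    hotel["www."]++            # a second occurrence raises the stored value to 2
    print before, hotel["www."]
}')
echo "$result"
```

This is why the first www. line in the running output shows 1 rather than an error or an empty value.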

4. Once counting is done, use a for loop to print each distinct subscript and its count

[root@martin ~]# awk -F "/+" '{hotel[$2]++}END{for(pole in hotel) print pole,hotel[pole]}' test.txt
mp3. 1
post. 2
www. 3

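Tidy up the display: sort by count and align the columns (sort -k2 compares the counts as text, which is fine for single digits; add -n for larger counts)
[root@martin ~]# awk -F "/+" '{hotel[$2]++}END{for(pole in hotel) print pole,hotel[pole]}' test.txt|sort -k2|column -t
mp3.   1
post.  2
www.   3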
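
The order in which for (... in ...) visits the subscripts is not specified by awk, so it is safer to let sort decide the final order. The sketch below is one possible arrangement, not the only one: it wraps the command in a hypothetical helper named count_domains, sorts by count in descending numeric order, and breaks ties by domain name.

```shell
# count_domains is a hypothetical helper name; it reads URLs on stdin.
count_domains() {
    awk -F "/+" '{hotel[$2]++} END {for (pole in hotel) print pole, hotel[pole]}' |
        sort -k2,2nr -k1,1   # numeric, descending on the count; then by domain
}

sorted=$(printf '%s\n' \
    'http://www./index.html' \
    'http://www./1.html' \
    'http://post./index.html' \
    'http://mp3./index.html' \
    'http://www./3.html' \
    'http://post./2.html' | count_domains)
echo "$sorted"
```

Against the real file this would be run as count_domains < test.txt, with column -t appended if aligned output is wanted.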

5. Find the 10 most-used commands in the Linux shell history

[root@martin ~]# history|awk '{order[$2]++}END{for(n in order) print n,order[n]}'|sort -rnk2|head|column -t
awk                          54
history|awk                  44
[                            22
ll                           19
rpm                          12
yum                          8
w                            6
uname                        6
history                      6
/etc/rc.d/init.d/keepalived  5
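
Note that $2 is only the first whitespace-separated word after the history number, which is why history|awk (typed without spaces) shows up as a command of its own. Because history is a shell builtin that is usually empty in non-interactive scripts, the sketch below reads history-style lines from stdin instead. The helper name top_commands and the five sample lines are made up for illustration.

```shell
# top_commands is a hypothetical helper: reads "NUMBER COMMAND ARGS..." lines on stdin
# and prints the N most frequent commands (default 10).
top_commands() {
    awk '{order[$2]++} END {for (n in order) print n, order[n]}' |
        sort -k2,2nr -k1,1 | head -n "${1:-10}"
}

# Made-up sample input in the same layout that history prints.
top=$(printf '%s\n' \
    '  1  ls -l' \
    '  2  awk 1 file' \
    '  3  ls' \
    '  4  uname -r' \
    '  5  ls /tmp' | top_commands 2)
echo "$top"
```

In an interactive shell the equivalent call would be history | top_commands, optionally followed by column -t.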