How Can a Beginner Get Started with Tensorboard? A Must-Have for Deep Learning Visualization!

LibraryPKU 2018-11-18

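Brief Overview

Once we understand roughly what the functions for building a neural network do and are familiar with how neural networks work, we are more or less able to write one ourselves. Before moving on to more complex network structures, two more things are worth learning. The first is the pair of functions for saving a checkpoint mid-training and loading it back afterwards. They determine whether the results of a large investment of compute and time can be reused, so this step must be understood thoroughly. The second is Tensorboard, a tool shipped with Tensorflow that efficiently visualizes the data flow graph, including computed results and the distribution of the data, which gives us a much clearer line of reasoning when hunting for errors. This section therefore revolves around these two topics: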

Checkpoint

tf.train.Saver().save()
tf.train.Saver().restore()

Tensorboard

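The former works like meeting minutes: it records the training process selectively, which not only prevents earlier work from being lost but also lets us look inside the training process and search its evolution for ways to improve the algorithm. The latter is a core tool for laying out the computation in the browser, which speeds up both development as a whole and parameter tuning.

p.s. If you have no suitable hardware at hand, using a cloud computing service such as AWS or FloydHub is highly recommended.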

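1. Checkpoint

Early on, models are usually simple and train very quickly, and we do not care much about how the parameters change along the way. With complex neural networks, however, saving the parameters partway through training becomes very important. It is like playing a video game: we want to save our progress so that, if we die halfway, we can resume from the save point.

1-1. Save checkpoints

The same applies to deep learning, where training often takes several days or even a week. If the machine goes down or anything else interrupts training, we can restart from the recorded checkpoint. Checkpoints are also very handy when we want to analyze how the parameters evolve during training. The class to use is: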

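tf.train.Saver(max_to_keep=None)  file extension: ".ckpt"

.Saver({'save_w': weight}): a dictionary in the parentheses specifies which variables to save
max_to_keep=None: the maximum number of recent checkpoints to keep; None or 0 keeps all of them
keep_checkpoint_every_n_hours=1: in addition to the most recent checkpoints, keeps one checkpoint for every N hours of training

Variables are stored in binary form in files named with .ckpt, which contain each variable's name and the value of the corresponding tensor. Once an instance of this class is created, its methods for saving and loading these files can be called: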

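tf.train.Saver().save(sess, './file_directory/file_name', global_step=int(num))

sess: the session whose parameters are to be saved
'./file_directory/file_name': the save path, given relative to the directory from which the training .py file is run; if the target folder does not exist, it is created automatically. The official tutorial suggests writing the file name together with its suffix, as in the code below
global_step: a number that is appended to the checkpoint file name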
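To make the effect of global_step on file names concrete, here is a minimal plain-Python sketch that needs no Tensorflow. The helper names checkpoint_prefix and expected_files are hypothetical, and the list of suffixes assumes the default checkpoint format written by a single machine; the exact set of files may differ across versions and setups.

```python
import os


def checkpoint_prefix(save_path, global_step=None):
    """Return the prefix used for one save, e.g. './checkpoint/linear.ckpt-100'."""
    if global_step is None:
        return save_path
    return "{}-{}".format(save_path, global_step)


def expected_files(prefix):
    """List the files one save typically leaves behind (assumed default format)."""
    directory = os.path.dirname(prefix)
    return [
        prefix + ".index",
        prefix + ".data-00000-of-00001",
        prefix + ".meta",
        # A small bookkeeping file shared by all saves in the directory.
        os.path.join(directory, "checkpoint"),
    ]


prefix = checkpoint_prefix("./checkpoint/linear.ckpt", global_step=100)
print(prefix)
for name in expected_files(prefix):
    print(name)
```

When restoring, only the prefix (here ./checkpoint/linear.ckpt-100) is passed, never one of the individual file names.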

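!!! When saving these parameters, it is very important to declare their data types explicitly. This determines whether they can be loaded back smoothly later; without an explicit declaration, errors are likely.

The code below shows how to save checkpoints: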

import numpy as np
import tensorflow as tf

# Synthetic data for the line y = 0.1 * x + 0.3
x_data = np.random.rand(100).astype(np.float32)
y_data = x_data * 0.1 + 0.3

weight = tf.Variable(tf.random_uniform(shape=[1], minval=-1.0, maxval=1.0),
                     dtype=np.float32, name='weight')
bias = tf.Variable(tf.zeros(shape=[1]), dtype=np.float32, name='bias')
y = weight * x_data + bias

loss = tf.reduce_mean(tf.square(y - y_data))
optimizer = tf.train.GradientDescentOptimizer(0.5)
training = optimizer.minimize(loss)

sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)

# Create Saver instances whose save() method writes the checkpoints
saver = tf.train.Saver(max_to_keep=2)
# Save only the listed variables, under the keys 'a_name' and 'b_name'
save_w = tf.train.Saver({'a_name': weight, 'b_name': bias})

for step in range(101):
    sess.run(training)
    if step % 10 == 0:
        print('Round {}, weight: {}, bias: {}'
              .format(step, sess.run(weight[0]), sess.run(bias[0])))
        saver.save(sess=sess, save_path='./checkpoint/linear.ckpt', global_step=step)
        save_w.save(sess=sess, save_path='./weight/linear.ckpt', global_step=step)

# Final save without a step suffix
saver.save(sess, './checkpoint/linear.ckpt')
sess.close()

Round 0, weight: -0.4444888234138489, bias: 0.8327240943908691
Round 10, weight: -0.20793604850769043, bias: 0.46674099564552307
Round 20, weight: -0.0472208596765995, bias: 0.37971681356430054
Round 30, weight: 0.0296153761446476, bias: 0.3381116986274719
Round 40, weight: 0.06634990125894547, bias: 0.3182207942008972
Round 50, weight: 0.08391226083040237, bias: 0.3087111711502075
Round 60, weight: 0.09230863302946091, bias: 0.30416470766067505
Round 70, weight: 0.09632284939289093, bias: 0.3019911050796509
Round 80, weight: 0.09824200719594955, bias: 0.30095192790031433
Round 90, weight: 0.09915953129529953, bias: 0.30045512318611145
Round 100, weight: 0.09959817677736282, bias: 0.3002175986766815

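Checkpoint paths should be written in the form "./…/…/…". The leading ./ in particular must be included, otherwise errors occur on some platforms. After the code finishes, the folders and file names we specified appear under the directory from which the .py file was run, as shown in the figure below:

When nothing is specified, the default is tf.train.Saver(max_to_keep=5), so only the five most recently saved checkpoints are kept and older ones are deleted automatically.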
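The rotation controlled by max_to_keep can be pictured with a small plain-Python simulation. This is only a toy model of the behavior described above, not Tensorflow's implementation, and simulate_rotation is a hypothetical name.

```python
from collections import deque


def simulate_rotation(steps, max_to_keep=5):
    """Return (kept, deleted) checkpoint names after saving at every step in `steps`."""
    kept = deque()
    deleted = []
    for step in steps:
        kept.append("linear.ckpt-{}".format(step))
        # None or 0 means "never delete", matching the parameter description.
        if max_to_keep and len(kept) > max_to_keep:
            deleted.append(kept.popleft())
    return list(kept), deleted


# Saving every 10 steps from 0 to 100 with max_to_keep=2, as in the code above.
kept, deleted = simulate_rotation(range(0, 101, 10), max_to_keep=2)
print(kept)
print(len(deleted))

# With the default of 5, the five newest saves survive.
kept_default, _ = simulate_rotation(range(0, 101, 10))
print(kept_default)
```

With max_to_keep=2 only the saves from steps 90 and 100 remain in this model, which is why a later restore must name a checkpoint that still exists on disk.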

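1-2. Read checkpoints

Once the files are saved, the next step is to read the files shown in the figure above. What is stored in them maps back onto tf.Variable() objects with exactly the same names and attributes as before saving, and a restored variable does not even need to be initialized, so the save is very complete. Just remember: "an object with the same variable name must already exist in the code, and its data type and shape must be identical."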

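Reading is just as straightforward: create a tf.train.Saver() instance and call its .restore() method. After loading, the saved parameters come back to life in our code.

tf.train.Saver().restore(sess, './file_directory/file_name')

sess: the session into which the saved content should be loaded
'./file_directory/file_name': the saved file to load

p.s. If the global_step argument was used when saving, the number must be included in the file name when loading, as in the code below.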

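There are three situations when loading a saved file:

Most direct: create an instance with tf.train.Saver(), then call .restore() with the matching name to return to the recorded point in training
If the first method fails: fall back on the saved .meta file and use the .restore() method of the instance returned by tf.train.import_meta_graph(), which likewise returns to the recorded point in training
Loading a file that holds only some of the parameters: before creating the instance, declare in the parentheses of tf.train.Saver() a dictionary whose keys and variables exactly match those used for the partial save, then call .restore() to return to the recorded point in training

The detailed code is shown below: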

import tensorflow as tf

# tf.reset_default_graph()
# The variables must exist beforehand, with the same names, dtypes and shapes
weight = tf.Variable([33], dtype=tf.float32, name='weight')
bias = tf.Variable([3], dtype=tf.float32, name='bias')

saver_1 = tf.train.Saver()
# saver = tf.train.import_meta_graph('./checkpoint/linear.ckpt-100.meta')
# Restore only bias, which was stored under the key 'b_name'
saver_2 = tf.train.Saver({'b_name': bias})#, 'a_name': weight})
# init = tf.global_variables_initializer()

sess = tf.Session()
# sess.run(init)
# saver_1.restore(sess, save_path=tf.train.latest_checkpoint('./checkpoint'))
saver_1.restore(sess, './checkpoint/linear.ckpt-100')
saver_2.restore(sess, './weight/linear.ckpt-80')
print(sess.run(weight))
print(sess.run(bias))
sess.close()


# Running a variable that was neither restored nor initialized fails:
# print(sess.run(bias))

# ### ----- Result as follow ----- ###
# FailedPreconditionError: 
# Attempting to use uninitialized value Variable
# [[Node: _retval_Variable_0_0 = _Retval[T=DT_FLOAT, index=0, 
#   _device='/job:localhost/replica:0/task:0/device:CPU:0'](Variable)]]

/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/importlib/_bootstrap.py:205: RuntimeWarning: compiletime version 3.5 of module 'tensorflow.python.framework.fast_tensor_util' does not match runtime version 3.6
  return f(*args, **kwds)
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters

INFO:tensorflow:Restoring parameters from ./checkpoint/linear.ckpt-100
INFO:tensorflow:Restoring parameters from ./weight/linear.ckpt-80
[0.10062683]
[0.2989355]

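As can be seen, if the content was not loaded successfully, running a variable with sess.run() reports that the variable is uninitialized, so take particular care. In addition, if the same variable is restored more than once, the last restore wins. In the code above this happens to bias: it is restored last from the step-80 checkpoint, so its value is less accurate than the weight restored from the step-100 checkpoint.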
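The "last restore wins" behavior and the role of the dictionary keys can be sketched in plain Python. This is a toy model only: the checkpoints are ordinary dictionaries, the restore helper is hypothetical, and the numbers are illustrative placeholders rather than real training results.

```python
def restore(variables, checkpoint, name_map=None):
    """Copy values from `checkpoint` into `variables`.

    name_map maps a checkpoint key to a variable name; by default each
    variable is looked up under its own name.
    """
    if name_map is None:
        name_map = {name: name for name in variables}
    for key, var_name in name_map.items():
        variables[var_name] = checkpoint[key]


# Hypothetical checkpoint contents (illustrative values only).
full_ckpt_100 = {"weight": 0.1006, "bias": 0.2998}
partial_ckpt_80 = {"a_name": 0.1011, "b_name": 0.2989}

variables = {"weight": None, "bias": None}
restore(variables, full_ckpt_100)                        # plays the role of saver_1
restore(variables, partial_ckpt_80, {"b_name": "bias"})  # plays the role of saver_2
print(variables)
```

In this model weight keeps the value from the first restore, while bias ends up with the value from the second one, because only the key b_name was mapped.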

ValueError: At least two variables have the same name: Variable

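It took a lot of time to track down the root cause of this error when loading parameters back. The reason was that the tf.Variable() declarations were not exactly the same: I had focused only on the data format, but the node names must match exactly as well! If no name was given when saving, give none when restoring; if a name such as name='a_name' was given, it must match exactly!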

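Three key points about saving and restoring:

The official site suggests adding the .ckpt suffix only to distinguish checkpoint files from other files; it does not actually set the file extension, so parameters saved without the suffix can still be loaded
The save produces three files on some computers and four on others, and their full names do not exactly match the name given when saving. There is no need to worry: when loading, just use the name given at save time and ignore the differing file names
When loading parameters, inspect as strictly as a judge whether the receiving variables are exactly the same as when the parameters were saved, including the names of the named nodes and the data type set for each node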
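The strict matching in the last point can be expressed as a small pre-flight check. The sketch below is plain Python and does not read real checkpoint files: it assumes we already know the (dtype, shape) of each saved entry, and check_compatible is a hypothetical helper rather than a Tensorflow function.

```python
def check_compatible(saved, declared):
    """Return a list of problems; an empty list means the declarations match."""
    problems = []
    for name, (dtype, shape) in declared.items():
        if name not in saved:
            problems.append("{}: not found in checkpoint".format(name))
            continue
        saved_dtype, saved_shape = saved[name]
        if saved_dtype != dtype:
            problems.append("{}: dtype {} != {}".format(name, dtype, saved_dtype))
        if saved_shape != shape:
            problems.append("{}: shape {} != {}".format(name, shape, saved_shape))
    return problems


# What was saved (assumed), and two candidate sets of declarations.
saved = {"weight": ("float32", (1,)), "bias": ("float32", (1,))}
matching = {"weight": ("float32", (1,)), "bias": ("float32", (1,))}
mismatched = {"Variable": ("float32", (1,)), "bias": ("float64", (1,))}

print(check_compatible(saved, matching))
print(check_compatible(saved, mismatched))
```

The second call reports an unnamed variable (which falls back to the default name Variable) and a dtype mismatch, the two kinds of mistakes described above.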

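p.s. When running the code in Jupyter Notebook, make sure the training process has not already been started before calling .restore; the variables must still be empty for the parameters to load smoothly.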

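Save / Restore Related Useful Functions

1. tf.train.latest_checkpoint('./…/…')

Returns the full path of the most recently saved checkpoint file in the given directory, as a string. The path passed in must be the folder, not the file itself. So it is usually enough to locate the folder holding the checkpoints and pass the string this function returns to .restore(); the latest saved parameters are then loaded.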
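To show why a folder (and not a file) is passed in, here is a simplified plain-Python sketch of the lookup. It assumes the folder contains a small text file named checkpoint whose first entry, model_checkpoint_path, names the newest save. The function latest_checkpoint_sketch is hypothetical and skips checks that the real function may perform, such as verifying that the checkpoint files themselves exist.

```python
import os
import re
import tempfile


def latest_checkpoint_sketch(checkpoint_dir):
    """Return the newest checkpoint prefix recorded in `checkpoint_dir`, or None."""
    state_file = os.path.join(checkpoint_dir, "checkpoint")
    if not os.path.exists(state_file):
        return None
    with open(state_file) as handle:
        for line in handle:
            match = re.match(r'model_checkpoint_path:\s*"(.*)"', line.strip())
            if match:
                path = match.group(1)
                # Relative entries are resolved against the folder itself.
                if not os.path.isabs(path):
                    path = os.path.join(checkpoint_dir, path)
                return path
    return None


with tempfile.TemporaryDirectory() as tmp:
    # Write a bookkeeping file in the assumed layout.
    with open(os.path.join(tmp, "checkpoint"), "w") as handle:
        handle.write('model_checkpoint_path: "linear.ckpt-100"\n')
        handle.write('all_model_checkpoint_paths: "linear.ckpt-90"\n')
        handle.write('all_model_checkpoint_paths: "linear.ckpt-100"\n')
    found = latest_checkpoint_sketch(tmp)
    missing = latest_checkpoint_sketch(os.path.join(tmp, "nothing_here"))

print(os.path.basename(found))
print(missing)
```

The returned string is a prefix such as …/linear.ckpt-100, which is exactly the form that .restore() expects.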

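2. Tensorboard Visualization Tool

When first using Tensorflow, the first two concepts we meet are surely the node and the edge. By adding tensors and operations to them, we eventually drive the computer to produce the result we want. However, these computations are abstract concepts in our heads, and in the computer they are just line after line of code, neither of which gives us much intuition. Tensorboard is the lubricant in between: it builds a solid bridge between rigid code and intuition and lets some light into the black box of a neural network's computation. All it takes is to add some scope settings to the existing code, export a file, and run the tensorboard command in the terminal; Tensorflow then automatically builds a complete data dashboard for us to use in the browser.

The purpose of this extra code is to assign the nodes and edges to their respective scopes, which come in the following two kinds: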

node names
name scopes

Continuing with the code above, scopes are added as follows:

import numpy as np
import tensorflow as tf

# Create a graph and set this graph as the default one.
# By doing this, the original default graph would not be called
graph = tf.Graph()
with graph.as_default():
    x_data = np.random.rand(100).astype(np.float32)
    y_data = x_data * 0.1 + 0.3

    with tf.name_scope('linear'):
        weight = tf.Variable(tf.random_uniform(shape=[1], minval=-1.0, maxval=1.0),
                             dtype=np.float32, name='weight')
        bias = tf.Variable(tf.zeros(shape=[1]), dtype=np.float32, name='bias')
        y = weight * x_data + bias

    with tf.name_scope('gradient_descent'):
        loss = tf.reduce_mean(tf.square(y - y_data), name='loss')
        optimizer = tf.train.GradientDescentOptimizer(0.5)
        training = optimizer.minimize(loss)
        init = tf.global_variables_initializer()


sess = tf.Session(graph=graph)

# Write the graph structure to the log directory read by Tensorboard
writer = tf.summary.FileWriter('/Users/kcl/Documents/Python_Projects/01_AI_Tutorials/tb')
writer.add_graph(graph)

sess.run(init)

for step in range(101):
    sess.run(training)
    # Print progress every 10 steps
    if step % 10 == 0:
        print('Round {}, weight: {}, bias: {}'
              .format(step, sess.run(weight[0]), sess.run(bias[0])))

writer.close()
sess.close()

Round 0, weight: 0.3617558479309082, bias: 0.21680830419063568
Round 10, weight: 0.22135570645332336, bias: 0.23605786263942719
Round 20, weight: 0.1615721732378006, bias: 0.26755768060684204
Round 30, weight: 0.1312398761510849, bias: 0.283539742231369
Round 40, weight: 0.11585018038749695, bias: 0.2916485667228699
Round 50, weight: 0.10804189741611481, bias: 0.2957627475261688
Round 60, weight: 0.1040802150964737, bias: 0.2978501617908478
Round 70, weight: 0.10207019001245499, bias: 0.2989092469215393
Round 80, weight: 0.10105035454034805, bias: 0.29944658279418945
Round 90, weight: 0.10053293406963348, bias: 0.2997192144393921
Round 100, weight: 0.10027039051055908, bias: 0.29985755681991577

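Running the code creates a file in the specified directory. This file can only be opened from a terminal (Terminal or CMD) with one of the following commands:
tensorboard --logdir='/Users/01_AI_Tutorials/tb' or tensorboard --logdir /Users/01_AI_Tutorials/tb, where the path is the directory passed to tf.summary.FileWriter(). The terminal then starts a local server and prints a URL at which the Tensorboard page can be opened in the browser. The figure below shows Tensorboard for the code above: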

A name can be attached to every Tensor, while tf.name_scope() bundles a group of Tensors into a larger enclosing scope.

p.s. One point deserves attention: only tf nodes and operations, or expressions connected to such nodes, are recorded in Tensorboard. Expressions that have nothing to do with tf, such as x_data and y_data, do not appear in the data flow graph.
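How a scope turns into part of a node's displayed name can be illustrated without Tensorflow. The following is a toy stand-in, assuming only that scopes are joined to node names with "/". The functions name_scope and node_name here are hypothetical plain-Python helpers, not the tf ones.

```python
from contextlib import contextmanager

_scope_stack = []


@contextmanager
def name_scope(name):
    """Toy stand-in for a name scope: push a prefix while the block runs."""
    _scope_stack.append(name)
    try:
        yield
    finally:
        _scope_stack.pop()


def node_name(name):
    """Join the active scopes and the node name with '/'."""
    return "/".join(_scope_stack + [name])


with name_scope("linear"):
    weight_name = node_name("weight")
    bias_name = node_name("bias")

with name_scope("gradient_descent"):
    loss_name = node_name("loss")

print(weight_name, bias_name, loss_name)
```

Nodes that share a prefix are the ones Tensorboard can collapse into a single box, which is what makes large graphs readable.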

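Dive into the Details of Graph

The complete steps for building the data flow graph above are actually as follows:

Create a tf.Graph() object
Use the object's .as_default() method to make it the default graph
Define the required nodes in the graph and give them names
Use tf.name_scope() and similar methods to name larger scopes; note that the names must not contain spaces
Use tf.summary.FileWriter() to create a file under the specified path
Assign the FileWriter from the previous step to a name, then use .add_graph() to add the graph created for the session

Only then can we go to the terminal, as described above, and open Tensorboard on the specified path to inspect the data flow graph. Typical code is less involved because Tensorflow already provides a default data flow graph, which saves a lot of trouble.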

Unveil the default Graph

1. Graph is the foundation of All

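Whenever we build our own data flow graph with Tensorflow, it is built on a single graph. This graph is like a boundless canvas on which we freely create tensors and operations. Even when we do not set up a graph explicitly, the tf framework has already generated one for us, which guarantees that every node and operation lands in this graph.

If, however, we set up the graph scope ourselves from start to finish, as in the example above, we must check that every node and operation is created on that graph. Once something falls outside the graph, the tf framework cannot tell at run time which graph it belongs to.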

2. Calculation and the Graph

Which scope a node or operation is assigned to ultimately only changes how it is displayed in Tensorboard; it has no effect on the computation itself.

In addition, the graph written to Tensorboard contains only the structure of the data flow graph, not the results computed by sess.run(). So whether the code that writes the file comes before or after sess.run(), the same structure is shown, without any computed values.

Import Data

So far, what we have created on the 'graph' is only the data flow graph; no actual data has entered Tensorboard yet. Values are written to Tensorboard through the tf.summary family of functions. Writing data with these functions resembles calling sess.run(): for a given object, each call writes only one value. The finished chart shows a continuous curve because a Python loop produced the sequence of values. The family includes the following functions:

tf.summary.scalar()
tf.summary.image()
tf.summary.audio()
tf.summary.histogram()
tf.summary.tensor_summary()

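The functions listed above share the same first two arguments, …('a_name/node', object). The first is a string, and whatever it contains becomes the name of the chart shown on the board. The second is a computed object; whatever is placed here has its changes over the course of training plotted on that chart.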
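Conceptually, each name passed as the first argument ends up as a series of (step, value) pairs that the chart plots. The plain-Python sketch below imitates that bookkeeping. ToyWriter and add_scalar are hypothetical names for illustration, not part of tf.summary, and the loss values are made up.

```python
from collections import defaultdict


class ToyWriter:
    """Collects (step, value) pairs per tag, roughly what a scalar chart plots."""

    def __init__(self):
        self.series = defaultdict(list)

    def add_scalar(self, tag, value, step):
        self.series[tag].append((step, value))


writer = ToyWriter()
for step in range(5):
    loss = 1.0 / (step + 1)  # hypothetical, steadily shrinking loss
    writer.add_scalar("linear_loss", loss, step)

print(writer.series["linear_loss"])
```

One call records one point; the curve only appears because the loop calls it repeatedly with an increasing step.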

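With a simple data flow graph we can still count how many values need to go into Tensorboard, but for a huge network such as Inception we need a tool that gathers every object to be written and writes them in one go: tf.summary.merge_all(). Think of the five functions listed above as scissors that cut out the values we want to capture. Pasting them onto the board one at a time is too tedious, so we first merge all the cut-out values into one, paste them with a .FileWriter() in one go, and update them step by step with .add_summary(). The code below shows how these functions are used: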

import numpy as np
import tensorflow as tf

graph = tf.Graph()
with graph.as_default():
    x_data = np.random.rand(100).astype(np.float32)
    y_data = x_data * 0.1 + 0.3

    with tf.name_scope('linear'):
        weight = tf.Variable(tf.random_uniform(shape=[1], minval=-1.0, maxval=1.0),
                             dtype=np.float32, name='weight')
        # In order to watch the changes of weight, use the histogram method
        tf.summary.histogram('linear_weight', weight)

        bias = tf.Variable(tf.zeros(shape=[1]), dtype=np.float32, name='bias')
        # Mind that we don't need to assign the result to a name
        tf.summary.histogram('linear_bias', bias)
        y = weight * x_data + bias

    with tf.name_scope('gradient_descent'):
        loss = tf.reduce_mean(tf.square(y - y_data), name='loss')
        # The loss changes in each optimization step
        tf.summary.scalar('linear_loss', loss)
        optimizer = tf.train.GradientDescentOptimizer(0.5)
        training = optimizer.minimize(loss)

    ### --! Mind the graph scope if we are using the graph set by ourselves !-- ###
    init = tf.global_variables_initializer()
    # As we can see above, there are 3 summaries in total to be merged
    merge = tf.summary.merge_all()

# If we are using the default graph, 'graph=graph' is not needed
sess = tf.Session(graph=graph)

# The writer should be created only once, outside of the for loop
writer = tf.summary.FileWriter('/Users/kcl/Documents/Python_Projects/01_AI_Tutorials/tb')
# If the default graph is used, 'graph' should be replaced by 'sess.graph'
writer.add_graph(graph)

sess.run(init)

for step in range(101):
    if step % 2 == 0:
        # To record the values throughout training, evaluate the merged
        # summary op and pass the result to add_summary with the current step.
        m = sess.run(merge)
        writer.add_summary(m, step)

    sess.run(training)
    if step % 10 == 0:
        print('Round {}, weight: {}, bias: {}'
              .format(step, sess.run(weight[0]), sess.run(bias[0])))

writer.close()
sess.close()

Round 0, weight: 0.14186275005340576, bias: 0.3834246098995209
Round 10, weight: 0.09872959554195404, bias: 0.3006848394870758
Round 20, weight: 0.09932643920183182, bias: 0.3003629744052887
Round 30, weight: 0.09964289516210556, bias: 0.3001924455165863
Round 40, weight: 0.09981069713830948, bias: 0.30010202527046204
Round 50, weight: 0.09989965707063675, bias: 0.30005407333374023
Round 60, weight: 0.09994679689407349, bias: 0.3000286817550659
Round 70, weight: 0.09997180849313736, bias: 0.3000152111053467
Round 80, weight: 0.09998505562543869, bias: 0.30000805854797363
Round 90, weight: 0.0999920666217804, bias: 0.3000042736530304
Round 100, weight: 0.09999579191207886, bias: 0.3000022768974304

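p.s. When using a self-defined Graph, be very careful: a misplaced scope can crash the program.

In the code, we can choose how many steps to wait between evaluating the merged summaries and writing them to the file, which controls the resolution of the charts shown in Tensorboard. The values written to the file by the different tf.summary methods are then displayed in the browser.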
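The effect of the logging interval can be tried out without Tensorflow. The sketch below redoes the same linear fit with hand-written gradient descent in plain Python and stores a (step, loss) pair every log_every steps, standing in for the add_summary calls. The variable names log_every and records are hypothetical, and the random seed is fixed only to make the run repeatable.

```python
import random

random.seed(0)
x_data = [random.random() for _ in range(100)]
y_data = [0.1 * x + 0.3 for x in x_data]

weight, bias = random.uniform(-1.0, 1.0), 0.0
learning_rate = 0.5
log_every = 2   # hypothetical name for the "step % 2" interval
records = []    # (step, loss) pairs, standing in for add_summary calls

for step in range(101):
    errors = [weight * x + bias - y for x, y in zip(x_data, y_data)]
    loss = sum(e * e for e in errors) / len(errors)
    if step % log_every == 0:
        records.append((step, loss))
    # Gradients of the mean squared error with respect to weight and bias.
    grad_w = 2.0 * sum(e * x for e, x in zip(errors, x_data)) / len(errors)
    grad_b = 2.0 * sum(errors) / len(errors)
    weight -= learning_rate * grad_w
    bias -= learning_rate * grad_b

print(len(records), records[0], records[-1])
```

Raising log_every gives fewer points and a coarser curve, while lowering it gives a finer curve at the cost of more writes.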

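The many tools Tensorboard provides make it easier to find problems while training a model and to decide, from how the process evolves, in which direction to optimize it. The tool is an impressive piece of work that Google invested a great deal of manpower in, and one of the tools that other deep learning frameworks did not offer at the time. With another framework, a user who wants to watch, say, the loss value change has to build the axes from scratch and plot the values bit by bit with a tool such as matplotlib, which in turn lowers the execution efficiency of the code: you cannot have both. Tensorboard, by contrast, is built into tf, is efficient and compatible, and is well worth the time to explore.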
