Keras as a simplified interface to TensorFlow

田杰4 2017-01-26

Article information

Original article: https://blog./keras-as-a-simplified-interface-to-tensorflow-tutorial.html

Author: Francois Chollet

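Using Keras as part of a TensorFlow workflow

If TensorFlow is your primary framework and you are looking for a simple, high-level model definition interface to make your life easier, this tutorial is for you.

Keras layers and models are fully compatible with pure TensorFlow tensors. As a result, Keras works well as a model definition layer for TensorFlow, and can even be used alongside other TensorFlow libraries.

Note that this tutorial assumes you have configured Keras to use the TensorFlow backend. If you do not know how to do that, see the Keras backend configuration instructions.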

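Calling Keras layers on TensorFlow tensors

Let's start with a simple example: MNIST digit classification. We will build a TensorFlow classifier from a stack of Keras fully-connected (Dense) layers. We first create a TensorFlow session and register it with Keras: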

import tensorflow as tf
sess = tf.Session()

from keras import backend as K
K.set_session(sess)

Now we can start building the model in TensorFlow:

# this placeholder will contain our input digits, as flat vectors
img = tf.placeholder(tf.float32, shape=(None, 784))

We can then use Keras layers to speed up the model definition:

from keras.layers import Dense

# Keras layers can be called on TensorFlow tensors:
x = Dense(128, activation='relu')(img)  # fully-connected layer with 128 units and ReLU activation
x = Dense(128, activation='relu')(x)
preds = Dense(10, activation='softmax')(x)  # output layer with 10 units and a softmax activation

We define a placeholder for the labels, and the loss function:

labels = tf.placeholder(tf.float32, shape=(None, 10))

from keras.objectives import categorical_crossentropy
loss = tf.reduce_mean(categorical_crossentropy(labels, preds))
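
To make the loss concrete, here is a minimal pure-Python sketch of what averaging a categorical cross-entropy over a batch computes. The function names mirror the ones used above, but these are simplified stand-ins written for illustration (including the `eps` clipping value), not the Keras or TensorFlow implementations, and the toy data is made up.

```python
import math


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Per-sample cross-entropy between one-hot targets and predicted probabilities."""
    losses = []
    for target_row, pred_row in zip(y_true, y_pred):
        # Clip predictions away from 0 so that log() stays finite (eps is an assumption).
        losses.append(-sum(t * math.log(max(p, eps))
                           for t, p in zip(target_row, pred_row)))
    return losses


def reduce_mean(values):
    """Average the per-sample losses into a single scalar."""
    return sum(values) / len(values)


# Two samples, three classes (toy data, not MNIST).
toy_labels = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
toy_preds = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]

toy_loss = reduce_mean(categorical_crossentropy(toy_labels, toy_preds))
print(toy_loss)  # mean of -log(0.7) and -log(0.8)
```

Only the probability assigned to the correct class contributes to each sample's loss, which is why one-hot labels are used here.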

Then we can train the model with a TensorFlow optimizer:

from tensorflow.examples.tutorials.mnist import input_data
mnist_data = input_data.read_data_sets('MNIST_data', one_hot=True)

train_step = tf.train.GradientDescentOptimizer(0.5).minimize(loss)
with sess.as_default():
    for i in range(100):
        batch = mnist_data.train.next_batch(50)
        train_step.run(feed_dict={img: batch[0],
                                  labels: batch[1]})
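
As a rough illustration of what each training step does, the sketch below applies the plain gradient descent rule (parameter minus learning rate times gradient) to a one-parameter least-squares problem in pure Python. The data, the learning rate, and the helper `loss_and_grad` are hypothetical and chosen for this example; the TensorFlow optimizer computes gradients automatically instead of by hand.

```python
def loss_and_grad(w, xs, ys):
    """Mean squared error of the model y = w * x, and its derivative with respect to w."""
    n = float(len(xs))
    loss = sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / n
    grad = sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n
    return loss, grad


# Toy data generated by y = 2 * x, so the best weight is 2.0.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

w = 0.0
learning_rate = 0.05  # illustrative value, unrelated to the 0.5 used above
for step in range(100):
    loss_value, grad = loss_and_grad(w, xs, ys)
    # One gradient descent step: move against the gradient.
    w = w - learning_rate * grad

print(w)
```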

Finally, let's evaluate the model:

from keras.metrics import categorical_accuracy as accuracy

acc_value = accuracy(labels, preds)
with sess.as_default():
    print(acc_value.eval(feed_dict={img: mnist_data.test.images,
                                    labels: mnist_data.test.labels}))
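
For reference, categorical accuracy compares the index of the largest predicted probability with the index of the 1 in the one-hot label. The following is a small pure-Python sketch of that idea with made-up data; it is an illustration only, not the Keras implementation.

```python
def argmax(row):
    """Index of the largest value in a list."""
    return max(range(len(row)), key=lambda i: row[i])


def categorical_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class matches the one-hot label."""
    matches = [argmax(t) == argmax(p) for t, p in zip(y_true, y_pred)]
    return sum(matches) / float(len(matches))


# Three samples, three classes; the last prediction is wrong.
sample_labels = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
sample_preds = [[0.9, 0.05, 0.05], [0.2, 0.7, 0.1], [0.6, 0.3, 0.1]]

sample_accuracy = categorical_accuracy(sample_labels, sample_preds)
print(sample_accuracy)
```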

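Here we are only using Keras as a shortcut for generating ops that map tensors to tensors. The optimization itself is done entirely with a native TensorFlow optimizer rather than a Keras optimizer, and we never need a Keras Model at all.

A note on native TensorFlow optimizers versus Keras optimizers: somewhat counter-intuitively, Keras optimizers tend to be about 5-10% faster than TensorFlow optimizers. The difference is small enough that, in practice, it hardly matters which one you use.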

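Different behavior during training and testing

Some Keras layers, such as BatchNormalization and Dropout, behave differently at training time and at test time. You can check whether a layer depends on the learning phase by printing layer.uses_learning_phase, which is True if the layer behaves differently in training mode and test mode.

If your model contains such layers, you need to specify which mode the model should run in. The learning phase is exposed through the Keras backend: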

from keras import backend as K
print(K.learning_phase())

To set the mode, pass 1 (training mode) or 0 (test mode) for the learning phase in feed_dict:

# train mode
train_step.run(feed_dict={img: batch[0], labels: batch[1], K.learning_phase(): 1})

For instance, here is how to add Dropout layers to the previous MNIST model:

from keras.layers import Dropout
from keras import backend as K

img = tf.placeholder(tf.float32, shape=(None, 784))
labels = tf.placeholder(tf.float32, shape=(None, 10))

x = Dense(128, activation='relu')(img)
x = Dropout(0.5)(x)
x = Dense(128, activation='relu')(x)
x = Dropout(0.5)(x)
preds = Dense(10, activation='softmax')(x)

loss = tf.reduce_mean(categorical_crossentropy(labels, preds))

train_step = tf.train.GradientDescentOptimizer(0.5).minimize(loss)
with sess.as_default():
    for i in range(100):
        batch = mnist_data.train.next_batch(50)
        train_step.run(feed_dict={img: batch[0],
                                  labels: batch[1],
                                  K.learning_phase(): 1})

acc_value = accuracy(labels, preds)
with sess.as_default():
    print(acc_value.eval(feed_dict={img: mnist_data.test.images,
                                    labels: mnist_data.test.labels,
                                    K.learning_phase(): 0}))
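
To see why the learning phase matters, here is a minimal pure-Python sketch of a dropout function that switches on a learning-phase flag. It assumes the common "inverted dropout" formulation (surviving units are rescaled by 1 / keep probability during training, and the input passes through unchanged at test time). The `dropout` function and its arguments are hypothetical, written only for this illustration.

```python
import random


def dropout(values, rate, learning_phase, rng):
    """Random masking in training mode (1), identity in test mode (0)."""
    if learning_phase == 0:
        # Test mode: dropout does nothing.
        return list(values)
    keep_prob = 1.0 - rate
    # Training mode: drop each unit with probability `rate`,
    # and rescale the survivors so the expected value is unchanged.
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


rng = random.Random(0)
activations = [1.0, 2.0, 3.0, 4.0]

train_out = dropout(activations, 0.5, learning_phase=1, rng=rng)
test_out = dropout(activations, 0.5, learning_phase=0, rng=rng)

print(train_out)  # each entry is either 0.0 or twice the input
print(test_out)   # identical to the input
```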

Compatibility with name scopes and device scopes

Keras layers and models are fully compatible with TensorFlow name scopes. For example:

from keras.layers import LSTM

x = tf.placeholder(tf.float32, shape=(None, 20, 64))
with tf.name_scope('block1'):
    y = LSTM(32, name='mylstm')(x)

The weights of our LSTM layer will then be named block1/mylstm_W_i, block1/mylstm_U_i, and so on. Similarly, device scopes work as you would expect:

with tf.device('/gpu:0'):
    x = tf.placeholder(tf.float32, shape=(None, 20, 64))
    y = LSTM(32)(x)  # all ops / variables in the LSTM layer will live on GPU:0

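Compatibility with graph scopes

Any Keras layer or model defined inside a TensorFlow graph scope will have all of its variables and operations created as part of that graph. For example, the following works as you would expect: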

from keras.layers import LSTM
import tensorflow as tf

my_graph = tf.Graph()
with my_graph.as_default():
    x = tf.placeholder(tf.float32, shape=(None, 20, 64))
    y = LSTM(32)(x)  # all ops / variables in the LSTM layer are created as part of our graph

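Compatibility with variable scopes

Variable sharing should be done by calling the same Keras layer or model instance multiple times, not through TensorFlow variable scopes. A TensorFlow variable scope has no effect on a Keras layer or model. For more on weight sharing in Keras, see the Keras documentation on shared layers.

Keras shares weights by reusing the same layer or model object. Here is an example: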

# instantiate a Keras layer
lstm = LSTM(32)

# instantiate two TF placeholders
x = tf.placeholder(tf.float32, shape=(None, 20, 64))
y = tf.placeholder(tf.float32, shape=(None, 20, 64))

# encode the two tensors with the *same* LSTM weights
x_encoded = lstm(x)
y_encoded = lstm(y)
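
The object-reuse idea can be shown without TensorFlow at all. In the sketch below, `TinyDense` is a hypothetical stand-in for a layer (it is not a Keras class): its weights are created once in the constructor, so every call on the same instance uses the same weights, while a second instance gets its own.

```python
import random


class TinyDense(object):
    """Hypothetical stand-in for a layer: weights are created once, at construction."""

    def __init__(self, input_dim, units, seed=0):
        rng = random.Random(seed)
        self.weights = [[rng.uniform(-1.0, 1.0) for _ in range(units)]
                        for _ in range(input_dim)]

    def __call__(self, inputs):
        # Plain matrix-vector product using the weights stored on this instance.
        units = len(self.weights[0])
        return [sum(x * self.weights[i][j] for i, x in enumerate(inputs))
                for j in range(units)]


shared = TinyDense(input_dim=3, units=2)
x_encoded = shared([1.0, 0.0, 0.0])
y_encoded = shared([1.0, 0.0, 0.0])   # same instance, hence same weights

other = TinyDense(input_dim=3, units=2, seed=1)
other_encoded = other([1.0, 0.0, 0.0])  # new instance, hence different weights

print(x_encoded == y_encoded)      # True
print(x_encoded == other_encoded)  # False
```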

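Collecting trainable weights and state updates

Some Keras layers, such as stateful RNNs and BatchNormalization layers, have internal updates that must be run as part of each training step. These updates are stored as a list of tensor tuples in layer.updates. You should generate assign ops for them so that they are run at every training step. Here is an example: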

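from keras.layers import BatchNormalization

layer = BatchNormalization()  # keep a reference to the layer instance
y = layer(x)  # y is the output tensor; the updates live on the layer

update_ops = []
for old_value, new_value in layer.updates:
    update_ops.append(tf.assign(old_value, new_value))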
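
The (old value, new value) pattern can be sketched in pure Python. The example below assumes the update is an exponential moving average of the batch mean, which is the usual way batch normalization tracks running statistics; the `moving_average_update` helper, the `state` dictionary, and the momentum value are hypothetical and stand in for TensorFlow variables and assign ops.

```python
def moving_average_update(old_value, batch_value, momentum):
    """New running statistic as an exponential moving average."""
    return momentum * old_value + (1.0 - momentum) * batch_value


# Plain dict standing in for the layer's non-trainable state.
state = {"running_mean": 0.0}
batches = [[1.0, 3.0], [2.0, 4.0]]  # batch means are 2.0 and 3.0

for batch in batches:
    batch_mean = sum(batch) / len(batch)
    # Collect (name, new value) pairs, analogous to the (old, new) tuples above.
    updates = [("running_mean",
                moving_average_update(state["running_mean"], batch_mean, momentum=0.9))]
    # Apply every update once per training step, like running the assign ops.
    for name, new_value in updates:
        state[name] = new_value

print(state["running_mean"])
```

If these updates are never run, the running statistics keep their initial values, which is why they must be executed alongside the training op.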

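Note that if you are using a Keras model, model.updates behaves the same way and collects the updates of all layers in the model.

In addition, if you need to explicitly collect a layer's trainable weights, you can do so via layer.trainable_weights (or model.trainable_weights for a model), which is a list of TensorFlow Variable objects: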

from keras.layers import Dense

layer = Dense(32)  # instantiate a layer
y = layer(x)  # call it on a tensor; this creates its weights
print(layer.trainable_weights)  # list of TensorFlow Variables

Knowing this allows you to implement your own training routine based on a TensorFlow optimizer.

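Using Keras models with TensorFlow

Converting a Keras Sequential model for use in a TensorFlow workflow

Suppose you already have a trained Keras model, such as VGG-16, and you want to use it in your TensorFlow project. How should you proceed?

First, note that if your pre-trained weights include convolutional layers trained with Theano, you need to flip the convolution kernels when loading the weights. This is because Theano and TensorFlow implement convolution differently: TensorFlow, like Caffe, actually computes a correlation. The original article links to a detailed example.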
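
The relationship between convolution and correlation can be checked on a one-dimensional toy case. The pure-Python sketch below (with hypothetical helper names and made-up data) shows that a true convolution gives the same result as a correlation with the kernel reversed, which is why kernels trained under one convention must be flipped before being used under the other. For 2D kernels the flip applies to both spatial axes.

```python
def correlate_valid(signal, kernel):
    """Sliding dot product: the kernel is applied as-is."""
    k = len(kernel)
    n = len(signal) - k + 1
    return [sum(signal[i + j] * kernel[j] for j in range(k)) for i in range(n)]


def convolve_valid(signal, kernel):
    """True convolution: the kernel is traversed in reverse order."""
    k = len(kernel)
    n = len(signal) - k + 1
    return [sum(signal[i + j] * kernel[k - 1 - j] for j in range(k)) for i in range(n)]


signal = [1.0, 2.0, 3.0, 4.0, 5.0]
kernel = [1.0, 0.0, -1.0]
flipped_kernel = kernel[::-1]

print(convolve_valid(signal, kernel))           # [2.0, 2.0, 2.0]
print(correlate_valid(signal, kernel))          # [-2.0, -2.0, -2.0]
print(correlate_valid(signal, flipped_kernel))  # [2.0, 2.0, 2.0]
```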

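Suppose you start from the following Keras model and want to modify it so that it takes a specific TensorFlow tensor, my_input_tensor, as its input. This tensor could be a data feeder op, for instance, or the output of another TensorFlow model.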

from keras.models import Sequential

# this is our initial Keras model
model = Sequential()
model.add(Dense(32, activation='relu', input_dim=784))
model.add(Dense(10, activation='softmax'))

You only need to call set_input on the first layer after instantiating it, to set its input, and then build the rest of the model on top of it:

# this is our modified Keras model
model = Sequential()
first_layer = Dense(32, activation='relu', input_dim=784)
first_layer.set_input(my_input_tensor)

# build the rest of the model as before
model.add(first_layer)
model.add(Dense(10, activation='softmax'))

At this stage, you can call model.load_weights(weights_file) to load your pre-trained weights.

Then you will probably want to collect the model's output tensor:

output_tensor = model.output

Calling a Keras model on a TensorFlow tensor

A Keras model behaves like a Keras layer, and can therefore be called on TensorFlow tensors:

from keras.models import Sequential

model = Sequential()
model.add(Dense(32, activation='relu', input_dim=784))
model.add(Dense(10, activation='softmax'))

# this works!
x = tf.placeholder(tf.float32, shape=(None, 784))
y = model(x)

Note that by calling a model you reuse both its architecture and its weights. When you call a model on a tensor, you create new ops on top of that tensor, and these ops reuse the TensorFlow Variable instances already present in the model.

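Multi-GPU and distributed training

Assigning parts of a Keras model to different GPUs

TensorFlow device scopes are fully compatible with Keras layers and models, so you can use them to assign specific parts of a graph to different GPUs. Here is a simple example: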

with tf.device('/gpu:0'):
    x = tf.placeholder(tf.float32, shape=(None, 20, 64))
    y = LSTM(32)(x)  # all ops in the LSTM layer will live on GPU:0

with tf.device('/gpu:1'):
    x = tf.placeholder(tf.float32, shape=(None, 20, 64))
    y = LSTM(32)(x)  # all ops in the LSTM layer will live on GPU:1

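Note that the variables created by the LSTM layers will not live on the GPU: regardless of where TensorFlow variables are created, they always live on the CPU, and TensorFlow handles the transfers between devices implicitly.

If you want to train multiple replicas of the same model on different GPUs while sharing the same weights across replicas, first instantiate your model (or layers) under one device scope, then call that same model instance multiple times under different GPU device scopes, for example: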

with tf.device('/cpu:0'):
    x = tf.placeholder(tf.float32, shape=(None, 784))

    # shared model living on CPU:0
    # it won't actually be run during training; it acts as an op template
    # and as a repository for shared variables
    model = Sequential()
    model.add(Dense(32, activation='relu', input_dim=784))
    model.add(Dense(10, activation='softmax'))

# replica 0
with tf.device('/gpu:0'):
    output_0 = model(x)  # all ops in the replica will live on GPU:0

# replica 1
with tf.device('/gpu:1'):
    output_1 = model(x)  # all ops in the replica will live on GPU:1

# merge outputs on CPU
with tf.device('/cpu:0'):
    preds = 0.5 * (output_0 + output_1)

# we only run the `preds` tensor, so that only the two
# replicas on GPU get run (plus the merge op on CPU)
output_value = sess.run([preds], feed_dict={x: data})

Distributed training

You can easily make use of TensorFlow distributed training by registering with Keras a session that is linked to a cluster:

server = tf.train.Server.create_local_server()
sess = tf.Session(server.target)

from keras import backend as K
K.set_session(sess)

For details on configuring TensorFlow for distributed training, see the TensorFlow documentation on distributed training.

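Exporting a model with TensorFlow-serving

TensorFlow-Serving is a tool developed by Google for deploying TensorFlow models in a production environment.

Any Keras model can be exported with TensorFlow-serving (as long as it has only one input and one output, which is a limitation of TF-serving), whether or not it was built as part of a TensorFlow workflow. In fact, you could even train your Keras model with Theano, then switch to the TensorFlow backend and export the model.

If your graph makes use of the Keras learning phase (different behavior at training time and test time), the first thing to do before exporting is to hard-code the learning phase into the graph (as 0, i.e. test mode). This is done by 1) registering a constant learning phase with the Keras backend, and 2) re-building the model afterwards.

Here is how this works in practice: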

from keras import backend as K

K.set_learning_phase(0)  # all new operations will be in test mode from now on

# serialize the model and get its weights, for quick re-building
config = previous_model.get_config()
weights = previous_model.get_weights()

# re-build a model where the learning phase is now hard-coded to 0
from keras.models import model_from_config
new_model = model_from_config(config)
new_model.set_weights(weights)

We can now use TensorFlow-serving to export the model, following the official tutorial:

from tensorflow_serving.session_bundle import exporter

export_path = ... # where to save the exported graph
export_version = ... # version number (integer)

saver = tf.train.Saver(sharded=True)
model_exporter = exporter.Exporter(saver)
# export the re-built model, whose learning phase is hard-coded to 0
signature = exporter.classification_signature(input_tensor=new_model.input,
                                              scores_tensor=new_model.output)
model_exporter.init(sess.graph.as_graph_def(),
                    default_graph_signature=signature)
model_exporter.export(export_path, tf.constant(export_version), sess)

If you would like to see a new topic covered in this tutorial, reach out to me on Twitter.
