Getting started with the Keras Sequential model
https:///getting-started/sequential-model-guide/
The Sequential model is a linear stack of layers.
You can create a Sequential model by passing a list of layer instances to the constructor:
from keras.models import Sequential
from keras.layers import Dense, Activation
model = Sequential([
    Dense(32, input_shape=(784,)),
    Activation('relu'),
    Dense(10),
    Activation('softmax'),
])
You can also simply add layers via the .add() method:
model = Sequential()
model.add(Dense(32, input_dim=784))
model.add(Activation('relu'))
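To make the "linear stack" idea concrete, here is a minimal sketch in plain Python (no Keras required) that walks through the Dense(32) and Dense(10) stack from the constructor example and tracks each layer's output width and parameter count. The helper name dense_param_count is hypothetical; it only assumes that a fully connected layer holds one weight per input-output pair plus one bias per unit, and that Activation layers add no parameters.

```python
# Hypothetical helper: weights plus biases of a fully connected layer.
def dense_param_count(input_width, units):
    return input_width * units + units

input_width = 784        # matches input_shape=(784,) in the example above
layer_units = [32, 10]   # Dense(32) followed by Dense(10)

widths = []
params = []
current = input_width
for units in layer_units:
    params.append(dense_param_count(current, units))
    widths.append(units)
    current = units      # each layer's output feeds the next layer

print(widths)       # [32, 10]
print(params)       # [25120, 330]
print(sum(params))  # 25450
```

Only the first width (784) has to be supplied by hand; every later layer derives its input width from the layer before it.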
Specifying the input shape
The model needs to know what input shape it should expect. For this reason, the first layer in a Sequential model (and only the first, because following layers can do automatic shape inference) needs to receive information about its input shape. There are several possible ways to do this:
- Pass an input_shape argument to the first layer. This is a shape tuple (a tuple of integers or None entries, where None indicates that any positive integer may be expected). In input_shape, the batch dimension is not included.
- Some 2D layers, such as Dense, support the specification of their input shape via the argument input_dim, and some 3D temporal layers support the arguments input_dim and input_length.
- If you ever need to specify a fixed batch size for your inputs (this is useful for stateful recurrent networks), you can pass a batch_size argument to a layer. If you pass both batch_size=32 and input_shape=(6, 8) to a layer, it will then expect every batch of inputs to have the batch shape (32, 6, 8).
As such, the following snippets are strictly equivalent:
model = Sequential()
model.add(Dense(32, input_shape=(784,)))
model = Sequential()
model.add(Dense(32, input_dim=784))
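The relationship between input_shape, input_dim and batch_size described above can be sketched in plain Python. The function full_batch_shape is a hypothetical illustration, not part of Keras; it assumes that input_dim=n is shorthand for input_shape=(n,) and that an unspecified batch size is represented by None.

```python
# Hypothetical helper, not part of Keras: combine the optional batch size
# with the per-sample input shape to get the full batch shape.
def full_batch_shape(input_shape=None, input_dim=None, batch_size=None):
    if input_shape is None:
        if input_dim is None:
            raise ValueError("provide input_shape or input_dim")
        input_shape = (input_dim,)  # input_dim=784 means input_shape=(784,)
    return (batch_size,) + tuple(input_shape)

print(full_batch_shape(input_shape=(784,)))                 # (None, 784)
print(full_batch_shape(input_dim=784))                      # (None, 784)
print(full_batch_shape(input_shape=(6, 8), batch_size=32))  # (32, 6, 8)
```

The first two calls return the same shape, which is why the two snippets above build the same model.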
Compilation
Before training a model, you need to configure the learning process, which is done via the compile method. It receives three arguments:
- An optimizer. This could be the string identifier of an existing optimizer (such as rmsprop or adagrad), or an instance of the Optimizer class. See: optimizers.
- A loss function. This is the objective that the model will try to minimize. It can be the string identifier of an existing loss function (such as categorical_crossentropy or mse), or it can be an objective function. See: losses.
- A list of metrics. For any classification problem you will want to set this to metrics=['accuracy']. A metric could be the string identifier of an existing metric or a custom metric function.
# For a multi-class classification problem
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
# For a binary classification problem
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])
# For a mean squared error regression problem
model.compile(optimizer='rmsprop',
              loss='mse')
# For custom metrics
import keras.backend as K
def mean_pred(y_true, y_pred):
    # Custom metric: the mean of the predicted values
    return K.mean(y_pred)
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy', mean_pred])
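As a rough illustration of what the three loss identifiers used above measure, the following plain-Python sketch computes each loss for a single sample. These are textbook definitions written for this guide, not the Keras implementations; real implementations typically also clip predictions for numerical stability and average over the batch.

```python
import math

def categorical_crossentropy(y_true, y_pred):
    # y_true is one-hot, y_pred is a probability distribution over classes.
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred))

def binary_crossentropy(y_true, y_pred):
    # y_true is 0 or 1, y_pred is the predicted probability of class 1.
    return -(y_true * math.log(y_pred) + (1 - y_true) * math.log(1 - y_pred))

def mse(y_true, y_pred):
    # Mean of the squared differences between targets and predictions.
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

cce = categorical_crossentropy([0, 1, 0], [0.1, 0.8, 0.1])  # equals -log(0.8)
bce = binary_crossentropy(1, 0.9)                           # equals -log(0.9)
err = mse([1.0, 2.0], [1.5, 2.5])                           # equals 0.25
print(cce, bce, err)
```

In each case a lower value means the prediction is closer to the target, which is what the optimizer tries to achieve.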
Training
Keras models are trained on NumPy arrays of input data and labels. For training a model, you will typically use the fit function. Read its documentation here.
# For a single-input model with 2 classes (binary classification):
model = Sequential()
model.add(Dense(32, activation='relu', input_dim=100))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])
# Generate dummy data
import numpy as np
data = np.random.random((1000, 100))
labels = np.random.randint(2, size=(1000, 1))
# Train the model, iterating on the data in batches of 32 samples
model.fit(data, labels, epochs=10, batch_size=32)
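The effect of epochs=10 and batch_size=32 on 1000 samples can be worked out by hand. This small sketch assumes one weight update per batch and that the final, smaller batch of each epoch is still used.

```python
import math

num_samples = 1000  # rows in `data` above
batch_size = 32
epochs = 10

batches_per_epoch = math.ceil(num_samples / batch_size)
last_batch_size = num_samples - (batches_per_epoch - 1) * batch_size
total_updates = batches_per_epoch * epochs

print(batches_per_epoch)  # 32
print(last_batch_size)    # 8
print(total_updates)      # 320
```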
# For a single-input model with 10 classes (categorical classification):
model = Sequential()
model.add(Dense(32, activation='relu', input_dim=100))
model.add(Dense(10, activation='softmax'))
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
# Generate dummy data
import keras  # needed for keras.utils.to_categorical below
import numpy as np
data = np.random.random((1000, 100))
labels = np.random.randint(10, size=(1000, 1))
# Convert labels to categorical one-hot encoding
one_hot_labels = keras.utils.to_categorical(labels, num_classes=10)
# Train the model, iterating on the data in batches of 32 samples
model.fit(data, one_hot_labels, epochs=10, batch_size=32)
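The one-hot conversion step can be pictured with a short plain-Python sketch. The function one_hot is a hypothetical stand-in that mimics the effect of to_categorical on a flat list of integer labels; it is not the Keras implementation.

```python
def one_hot(labels, num_classes):
    # Hypothetical stand-in for the effect of to_categorical on integer labels.
    encoded = []
    for label in labels:
        row = [0.0] * num_classes
        row[label] = 1.0  # mark the position of the true class
        encoded.append(row)
    return encoded

print(one_hot([0, 2, 1], num_classes=3))
# [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
```

Each row has the same length as the softmax output of the model, which is what categorical_crossentropy expects.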
Examples
Here are a few examples to get you started!
In the examples folder, you will also find example models for real datasets:
- CIFAR10 small images classification: Convolutional Neural Network (CNN) with realtime data augmentation
- IMDB movie review sentiment classification: LSTM over sequences of words
- Reuters newswires topic classification: Multilayer Perceptron (MLP)
- MNIST handwritten digits classification: MLP & CNN
- Character-level text generation with LSTM
...and more.
Multilayer Perceptron (MLP) for multi-class softmax classification:
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.optimizers import SGD
# Generate dummy data
import numpy as np
x_train = np.random.random((1000, 20))
y_train = keras.utils.to_categorical(np.random.randint(10, size=(1000, 1)), num_classes=10)
x_test = np.random.random((100, 20))
y_test = keras.utils.to_categorical(np.random.randint(10, size=(100, 1)), num_classes=10)
model = Sequential()
# Dense(64) is a fully-connected layer with 64 hidden units.
# in the first layer, you must specify the expected input data shape:
# here, 20-dimensional vectors.
model.add(Dense(64, activation='relu', input_dim=20))
model.add(Dropout(0.5))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))
sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy',
              optimizer=sgd,
              metrics=['accuracy'])
model.fit(x_train, y_train,
          epochs=20,
          batch_size=128)
score = model.evaluate(x_test, y_test, batch_size=128)
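The final Dense(10, activation='softmax') layer turns ten raw scores into class probabilities. A minimal plain-Python sketch of the softmax function, written for illustration with three scores instead of ten:

```python
import math

def softmax(logits):
    # Subtract the maximum for numerical stability; the result is unchanged.
    largest = max(logits)
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
predicted_class = probs.index(max(probs))  # index of the most likely class
print(probs, predicted_class)
```

The outputs are positive and sum to 1, so they can be compared directly with one-hot labels.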
MLP for binary classification:
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Dropout
# Generate dummy data
x_train = np.random.random((1000, 20))
y_train = np.random.randint(2, size=(1000, 1))
x_test = np.random.random((100, 20))
y_test = np.random.randint(2, size=(100, 1))
model = Sequential()
model.add(Dense(64, input_dim=20, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])
model.fit(x_train, y_train,
          epochs=20,
          batch_size=128)
score = model.evaluate(x_test, y_test, batch_size=128)
VGG-like convnet:
import numpy as np
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.optimizers import SGD
# Generate dummy data
x_train = np.random.random((100, 100, 100, 3))
y_train = keras.utils.to_categorical(np.random.randint(10, size=(100, 1)), num_classes=10)
x_test = np.random.random((20, 100, 100, 3))
y_test = keras.utils.to_categorical(np.random.randint(10, size=(20, 1)), num_classes=10)
model = Sequential()
# input: 100x100 images with 3 channels -> (100, 100, 3) tensors.
# this applies 32 convolution filters of size 3x3 each.
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(100, 100, 3)))
model.add(Conv2D(32, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(256, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))
sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd)
model.fit(x_train, y_train, batch_size=32, epochs=10)
score = model.evaluate(x_test, y_test, batch_size=32)
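It can help to trace how the spatial size of the 100x100 input shrinks through the convnet above before it reaches Flatten. The sketch below assumes unpadded, stride-1 convolutions and non-overlapping 2x2 pooling, which matches the layer arguments shown when the layers' default padding and strides are used; the helper names are hypothetical.

```python
def conv_valid(size, kernel):
    # Spatial size after a stride-1 convolution without padding.
    return size - kernel + 1

def pool(size, pool_size):
    # Spatial size after non-overlapping pooling (stride equals pool size).
    return size // pool_size

size = 100                    # 100x100 input images
size = conv_valid(size, 3)    # 98
size = conv_valid(size, 3)    # 96
size = pool(size, 2)          # 48
size = conv_valid(size, 3)    # 46
size = conv_valid(size, 3)    # 44
size = pool(size, 2)          # 22
flattened = size * size * 64  # 64 filters in the last Conv2D layer

print(size, flattened)  # 22 30976
```

Under these assumptions, Flatten hands a vector of 30976 values to Dense(256).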
Sequence classification with LSTM:
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.layers import Embedding
from keras.layers import LSTM
max_features = 1024
model = Sequential()
model.add(Embedding(max_features, output_dim=256))
model.add(LSTM(128))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))
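model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])
# x_train, y_train, x_test and y_test are not defined in this snippet; they are
# assumed to be integer word-index sequences (values below max_features) and binary labels.
model.fit(x_train, y_train, batch_size=16, epochs=10)
score = model.evaluate(x_test, y_test, batch_size=16)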
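The Embedding layer in the example above can be thought of as a trainable lookup table from integer word indices to vectors. The following plain-Python sketch illustrates only the lookup; the table values are random placeholders, output_dim is reduced to 4 for readability, and the names table and embed are hypothetical.

```python
import random

max_features = 1024  # vocabulary size, as in the example above
output_dim = 4       # kept small here; the example above uses 256

random.seed(0)
# Hypothetical lookup table: one vector of length output_dim per word index.
table = [[random.uniform(-0.05, 0.05) for _ in range(output_dim)]
         for _ in range(max_features)]

def embed(sequence):
    # Replace every integer word index by its vector.
    return [table[index] for index in sequence]

sequence = [5, 17, 5, 1023]
vectors = embed(sequence)
print(len(vectors), len(vectors[0]))  # 4 4
```

A sequence of length 4 becomes 4 vectors, and repeated indices (here 5) map to the same vector; the LSTM then consumes these vectors one timestep at a time.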
Sequence classification with 1D convolutions:
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.layers import Embedding
from keras.layers import Conv1D, GlobalAveragePooling1D, MaxPooling1D
seq_length = 64
model = Sequential()
model.add(Conv1D(64, 3, activation='relu', input_shape=(seq_length, 100)))
model.add(Conv1D(64, 3, activation='relu'))
model.add(MaxPooling1D(3))
model.add(Conv1D(128, 3, activation='relu'))
model.add(Conv1D(128, 3, activation='relu'))
model.add(GlobalAveragePooling1D())
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))
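model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])
# x_train, y_train, x_test and y_test are not defined in this snippet; the inputs are
# assumed to have shape (num_samples, seq_length, 100) and the labels to be binary.
model.fit(x_train, y_train, batch_size=16, epochs=10)
score = model.evaluate(x_test, y_test, batch_size=16)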
Stacked LSTM for sequence classification
In this model, we stack 3 LSTM layers on top of each other, making the model capable of learning higher-level temporal representations.
The first two LSTMs return their full output sequences, but the last one only returns the last step in its output sequence, thus dropping the temporal dimension (i.e. converting the input sequence into a single vector).
from keras.models import Sequential
from keras.layers import LSTM, Dense
import numpy as np
data_dim = 16
timesteps = 8
num_classes = 10
# expected input data shape: (batch_size, timesteps, data_dim)
model = Sequential()
model.add(LSTM(32, return_sequences=True,
               input_shape=(timesteps, data_dim)))  # returns a sequence of vectors of dimension 32
model.add(LSTM(32, return_sequences=True))  # returns a sequence of vectors of dimension 32
model.add(LSTM(32))  # returns a single vector of dimension 32
model.add(Dense(10, activation='softmax'))
model.compile(loss='categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])
# Generate dummy training data
x_train = np.random.random((1000, timesteps, data_dim))
y_train = np.random.random((1000, num_classes))
# Generate dummy validation data
x_val = np.random.random((100, timesteps, data_dim))
y_val = np.random.random((100, num_classes))
model.fit(x_train, y_train,
          batch_size=64, epochs=5,
          validation_data=(x_val, y_val))
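The effect of return_sequences on the shapes flowing through the stack above can be traced with a small sketch. The function lstm_output_shape is a hypothetical helper that only does shape bookkeeping (batch dimension omitted); it does not run an LSTM.

```python
def lstm_output_shape(input_shape, units, return_sequences=False):
    # input_shape is (timesteps, features), without the batch dimension.
    timesteps, _ = input_shape
    if return_sequences:
        return (timesteps, units)  # one vector per timestep
    return (units,)                # only the last timestep's vector

input_shape = (8, 16)  # (timesteps, data_dim)
first = lstm_output_shape(input_shape, 32, return_sequences=True)
second = lstm_output_shape(first, 32, return_sequences=True)
third = lstm_output_shape(second, 32)

print(first, second, third)  # (8, 32) (8, 32) (32,)
```

The first two layers keep the temporal dimension so that the next LSTM still receives a sequence; the third drops it, leaving a single vector for the Dense layer.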
Same stacked LSTM model, rendered "stateful"
A stateful recurrent model is one for which the internal states (memories) obtained after processing a batch of samples are reused as initial states for the samples of the next batch. This makes it possible to process longer sequences while keeping computational complexity manageable.
You can read more about stateful RNNs in the FAQ.
from keras.models import Sequential
from keras.layers import LSTM, Dense
import numpy as np
data_dim = 16
timesteps = 8
num_classes = 10
batch_size = 32
# Expected input batch shape: (batch_size, timesteps, data_dim)
# Note that we have to provide the full batch_input_shape since the network is stateful.
# The sample of index i in batch k is the follow-up for the sample i in batch k-1.
model = Sequential()
model.add(LSTM(32, return_sequences=True, stateful=True,
               batch_input_shape=(batch_size, timesteps, data_dim)))
model.add(LSTM(32, return_sequences=True, stateful=True))
model.add(LSTM(32, stateful=True))
model.add(Dense(10, activation='softmax'))
model.compile(loss='categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])
# Generate dummy training data
x_train = np.random.random((batch_size * 10, timesteps, data_dim))
y_train = np.random.random((batch_size * 10, num_classes))
# Generate dummy validation data
x_val = np.random.random((batch_size * 3, timesteps, data_dim))
y_val = np.random.random((batch_size * 3, num_classes))
model.fit(x_train, y_train,
          batch_size=batch_size, epochs=5, shuffle=False,
          validation_data=(x_val, y_val))
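The requirement that sample i in batch k continues sample i in batch k-1 determines how long sequences have to be cut up before training a stateful model. The sketch below shows one possible arrangement in plain Python; make_stateful_batches is a hypothetical helper, and the data are toy integer sequences rather than the random arrays used above.

```python
def make_stateful_batches(series, timesteps):
    # series: list of equally long sequences, one per batch slot.
    # Returns a list of batches; batch k holds window k of every sequence,
    # so slot i of batch k continues slot i of batch k-1.
    length = len(series[0])
    num_windows = length // timesteps
    batches = []
    for k in range(num_windows):
        start = k * timesteps
        batches.append([seq[start:start + timesteps] for seq in series])
    return batches

# Two long sequences (batch size 2), cut into windows of 3 timesteps.
series = [list(range(0, 9)), list(range(100, 109))]
batches = make_stateful_batches(series, timesteps=3)

print(batches[0])  # [[0, 1, 2], [100, 101, 102]]
print(batches[1])  # [[3, 4, 5], [103, 104, 105]]
```

This ordering is only preserved if the data are fed without shuffling, which is why the fit call above passes shuffle=False.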