Anna-Lena Popkes is a graduate student in computer science at the University of Bonn, Germany, whose work focuses on machine learning and neural networks.
Translated by 林椿眄 | Produced by 人工智能頭條
Introduction: Python is often described as the language closest to AI. Anna-Lena Popkes recently shared on GitHub her notes on implementing seven machine learning algorithms in Python (3.6 and above), complete with full code. None of the implementations rely on other machine learning libraries. The notes are intended to give readers a basic understanding of the algorithms and their underlying structure, not to provide the most efficient implementations.
Softmax regression is also known as multinomial or multi-class logistic regression.
Given:
- a dataset $\{(\boldsymbol{x}^{(1)}, y^{(1)}), \dots, (\boldsymbol{x}^{(m)}, y^{(m)})\}$
- where each $\boldsymbol{x}^{(i)}$ is a $d$-dimensional feature vector
- and $y^{(i)}$ is the target class of $\boldsymbol{x}^{(i)}$; with $K$ classes, $y^{(i)} \in \{0, 1, \dots, K-1\}$
A softmax regression model has the following properties:
- a separate, real-valued weight vector $\boldsymbol{w}_k$ for each class $k$; the weight vectors are typically stored as rows of a weight matrix $\boldsymbol{W}$
- a separate, real-valued bias $b_k$ for each class
- the softmax function as its activation function
- the cross-entropy loss function
Training a softmax regression model consists of several steps. First (step 0), the model parameters are initialized. The remaining steps are then repeated until a specified number of training iterations is reached or the parameters have converged.
Step 0: Initialize the weight matrix and bias values with zeros (or small random values).
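For concreteness, here is a minimal NumPy sketch of this initialization. The sizes n_features = 2 and n_classes = 4 are illustrative (chosen to match the blob dataset used further below), and the names W and b are not part of the original notes:

import numpy as np

n_features, n_classes = 2, 4           # illustrative sizes
W = np.zeros((n_classes, n_features))  # one weight vector per class, stored as a row
b = np.zeros((1, n_classes))           # one bias per class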
Step 1: For each class $k$, compute a linear combination of the input features and the class's weight vector, i.e. compute a score for every class for each training example. For class $k$ and input vector $\boldsymbol{x}^{(i)}$ the score is:

$score_k(\boldsymbol{x}^{(i)}) = \boldsymbol{w}_k \cdot \boldsymbol{x}^{(i)} + b_k$

where $\boldsymbol{w}_k$ is the weight vector of class $k$ and $\cdot$ denotes the dot product.
Using vectorization and broadcasting, we can compute the scores for all classes and all training examples at once:

$\boldsymbol{scores} = \boldsymbol{X} \cdot \boldsymbol{W}^T + \boldsymbol{b}$

where $\boldsymbol{X}$ is a matrix of shape $(n_{samples}, n_{features})$ holding all training examples, and $\boldsymbol{W}$ is a matrix of shape $(n_{classes}, n_{features})$ holding the weight vector of each class.
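Continuing the illustrative sketch from step 0, the vectorized score computation might look like this (X here is just a small random matrix standing in for the training data):

X = np.random.randn(5, n_features)  # 5 illustrative samples
scores = np.dot(X, W.T) + b         # shape (n_samples, n_classes); broadcasting adds b to every row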
Step 2: Apply the softmax activation function to turn the scores into probabilities. The probability that input vector $\boldsymbol{x}^{(i)}$ belongs to class $k$ is:

$\hat{p}_k(\boldsymbol{x}^{(i)}) = \dfrac{\exp\big(score_k(\boldsymbol{x}^{(i)})\big)}{\sum_{j=1}^{K} \exp\big(score_j(\boldsymbol{x}^{(i)})\big)}$
Again, vectorization lets us compute the probabilities for all classes and training examples at once. The class predicted by the model for $\boldsymbol{x}^{(i)}$ is simply the class with the highest probability.
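A small sketch of the softmax step, continuing the names from above. Note that subtracting the row-wise maximum is a common numerical-stability trick added here only for illustration; the full implementation below applies np.exp to the raw scores directly:

def softmax(scores):
    # subtract the row-wise max for numerical stability (does not change the result)
    shifted = scores - np.max(scores, axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=1, keepdims=True)

probs = softmax(scores)               # each row sums to 1
y_predict = np.argmax(probs, axis=1)  # class with the highest probability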
Step 3: Compute the loss over the whole training set.
We want the model to predict a high probability for the target class and low probabilities for the other classes. This is achieved with the cross-entropy loss function:

$J(\boldsymbol{W}, b) = -\dfrac{1}{m} \sum_{i=1}^{m} \sum_{k=1}^{K} y_k^{(i)} \log\big(\hat{p}_k^{(i)}\big)$

In this formula the target labels are one-hot encoded, so $y_k^{(i)}$ is 1 if the target class of $\boldsymbol{x}^{(i)}$ is $k$, and 0 otherwise.
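A sketch of the one-hot encoding and the loss computation, continuing from the snippets above (the integer labels y are made up purely for illustration):

n_samples = probs.shape[0]
y = np.random.randint(0, n_classes, size=n_samples)  # illustrative integer labels

y_one_hot = np.zeros((n_samples, n_classes))
y_one_hot[np.arange(n_samples), y] = 1               # one-hot encode the targets

loss = -(1 / n_samples) * np.sum(y_one_hot * np.log(probs))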
Step 4: Compute the gradient of the loss function with respect to each weight vector and bias.
A detailed explanation of this derivation can be found here (http://ufldl.stanford.edu/tutorial/supervised/SoftmaxRegression/).
The general formula for class $k$ is:

$\nabla_{\boldsymbol{w}_k} J(\boldsymbol{W}, b) = \dfrac{1}{m} \sum_{i=1}^{m} \boldsymbol{x}^{(i)} \big(\hat{p}_k^{(i)} - y_k^{(i)}\big)$

For the bias terms, the inputs $\boldsymbol{x}^{(i)}$ are simply replaced by 1.
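In vectorized form, these gradients reduce to the two lines below; essentially the same expressions appear inside the training loop of the full implementation further down:

# gradient w.r.t. the weights: average of x^(i) * (p_k - y_k) over all samples
dW = (1 / n_samples) * np.dot(X.T, probs - y_one_hot)    # shape (n_features, n_classes)
# gradient w.r.t. the biases: the inputs are replaced by 1, so we just average the errors
db = (1 / n_samples) * np.sum(probs - y_one_hot, axis=0)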
Step 5: For each class $k$, update the weights and bias:

$\boldsymbol{w}_k = \boldsymbol{w}_k - \eta \, \nabla_{\boldsymbol{w}_k} J, \qquad b_k = b_k - \eta \, \nabla_{b_k} J$
where $\eta$ is the learning rate.
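Completing the sketch, the update step with an illustrative learning rate of 0.1:

learning_rate = 0.1
W = W - learning_rate * dW.T  # dW.T matches W's (n_classes, n_features) layout
b = b - learning_rate * db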
In [1]:
from sklearn.datasets import load_iris
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_blobs
import matplotlib.pyplot as plt

np.random.seed(13)
Dataset
In [2]:
X, y_true = make_blobs(centers=4, n_samples=5000)

fig = plt.figure(figsize=(8,6))
plt.scatter(X[:,0], X[:,1], c=y_true)
plt.title('Dataset')
plt.xlabel('First feature')
plt.ylabel('Second feature')
plt.show()
In [3]:
# reshape targets to get column vector with shape (n_samples, 1)
y_true = y_true[:, np.newaxis]
# Split the data into a training and test set
X_train, X_test, y_train, y_test = train_test_split(X, y_true)

print(f'Shape X_train: {X_train.shape}')
print(f'Shape y_train: {y_train.shape}')
print(f'Shape X_test: {X_test.shape}')
print(f'Shape y_test: {y_test.shape}')
Shape X_train: (3750, 2)
Shape y_train: (3750, 1)
Shape X_test: (1250, 2)
Shape y_test: (1250, 1)
Softmax regression class
class SoftmaxRegressor:

    def __init__(self):
        pass

    def train(self, X, y_true, n_classes, n_iters=10, learning_rate=0.1):
        '''
        Trains a multinomial logistic regression model on a given set of training data
        '''
        self.n_samples, n_features = X.shape
        self.n_classes = n_classes

        self.weights = np.random.rand(self.n_classes, n_features)
        self.bias = np.zeros((1, self.n_classes))
        all_losses = []

        for i in range(n_iters):
            scores = self.compute_scores(X)
            probs = self.softmax(scores)
            y_predict = np.argmax(probs, axis=1)[:, np.newaxis]
            y_one_hot = self.one_hot(y_true)

            loss = self.cross_entropy(y_one_hot, probs)
            all_losses.append(loss)

            dw = (1 / self.n_samples) * np.dot(X.T, (probs - y_one_hot))
            db = (1 / self.n_samples) * np.sum(probs - y_one_hot, axis=0)

            self.weights = self.weights - learning_rate * dw.T
            self.bias = self.bias - learning_rate * db

            if i % 100 == 0:
                print(f'Iteration number: {i}, loss: {np.round(loss, 4)}')

        return self.weights, self.bias, all_losses

    def predict(self, X):
        '''
        Predict class labels for samples in X.

        Args:
            X: numpy array of shape (n_samples, n_features)
        Returns:
            numpy array of shape (n_samples, 1) with predicted classes
        '''
        scores = self.compute_scores(X)
        probs = self.softmax(scores)
        return np.argmax(probs, axis=1)[:, np.newaxis]

    def softmax(self, scores):
        '''
        Transforms matrix of predicted scores to matrix of probabilities

        Args:
            scores: numpy array of shape (n_samples, n_classes) with unnormalized scores
        Returns:
            softmax: numpy array of shape (n_samples, n_classes) with probabilities
        '''
        exp = np.exp(scores)
        sum_exp = np.sum(np.exp(scores), axis=1, keepdims=True)
        softmax = exp / sum_exp

        return softmax

    def compute_scores(self, X):
        '''
        Computes class scores for samples in X

        Args:
            X: numpy array of shape (n_samples, n_features)
        Returns:
            scores: numpy array of shape (n_samples, n_classes)
        '''
        return np.dot(X, self.weights.T) + self.bias

    def cross_entropy(self, y_true, scores):
        loss = - (1 / self.n_samples) * np.sum(y_true * np.log(scores))
        return loss

    def one_hot(self, y):
        '''
        Transforms vector y of labels to one-hot encoded matrix
        '''
        one_hot = np.zeros((self.n_samples, self.n_classes))
        one_hot[np.arange(self.n_samples), y.T] = 1
        return one_hot
Initialize and train the model
regressor = SoftmaxRegressor()
w_trained, b_trained, loss = regressor.train(X_train, y_train, learning_rate=0.1, n_iters=800, n_classes=4)

fig = plt.figure(figsize=(8,6))
plt.plot(np.arange(800), loss)
plt.title('Development of loss during training')
plt.xlabel('Number of iterations')
plt.ylabel('Loss')
plt.show()

Iteration number: 0, loss: 1.393
Iteration number: 100, loss: 0.2051
Iteration number: 200, loss: 0.1605
Iteration number: 300, loss: 0.1371
Iteration number: 400, loss: 0.121
Iteration number: 500, loss: 0.1087
Iteration number: 600, loss: 0.0989
Iteration number: 700, loss: 0.0909
Testing the model
n_test_samples, _ = X_test.shape
y_predict = regressor.predict(X_test)
print(f'Classification accuracy on test set: {(np.sum(y_predict == y_test)/n_test_samples) * 100}%')
Classification accuracy on test set: 99.03999999999999%
Original notes: https://github.com/zotroneneis/machine_learning_basics