

Gradient Descent Method in Machine Learning

 預(yù)見未來的我 2018-03-19

Introduction

This article is a summary of Week 2 of Andrew Ng's neural network course on Coursera. Week 2 mainly covers how to deal with logistic regression problems such as binary classification.

Logistic regression has wide applications throughout our daily life: predicting housing prices, for instance, or estimating how many goods you could sell this month. The problem is this: given a set of input data and the real output results, you are asked to fit a linear function that makes the predictions and the real results as close as possible. Binary classification works under the same conditions, the only difference being that the output takes only two values, conventionally 0 and 1. One way to solve this mathematically is to apply least squares, and according to the formula we could get our coefficients. But what if there are 100 training examples, each containing 5 features? I don't know whether there is a formula to apply; even if it exists, it must be prohibitively sophisticated and hard to compute. So here comes gradient descent.
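As a preview of where this is going, here is a small sketch, in Python, of gradient descent fitting a linear function to exactly that kind of dataset: 100 examples with 5 features each. The data here is randomly generated by me purely for illustration, and the update rule it uses is the one explained in the next section.

import numpy as np

# Hypothetical data: 100 training examples, 5 features each,
# with outputs produced by an unknown linear function plus a bias.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
true_w = np.array([1.0, -2.0, 0.5, 3.0, -1.0])
y = X @ true_w + 4.0

# Gradient descent on the mean squared error between predictions and y.
w = np.zeros(5)          # coefficients to learn
b = 0.0                  # bias to learn
alpha = 0.01             # learning rate
for _ in range(2000):
    pred = X @ w + b
    grad_w = 2 * X.T @ (pred - y) / len(y)
    grad_b = 2 * (pred - y).mean()
    w -= alpha * grad_w
    b -= alpha * grad_b

print(w, b)   # w ends up close to true_w, and b close to 4.0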

Gradient Descent

As we know, the gradient of a function points in its steepest ascending direction, so the negative gradient points in the steepest descending direction. This is just like coming down a mountain: you would always choose the steepest way down because it is the fastest way. Gradient descent's philosophy lies here. In each step, you move in the steepest descending direction, then you look around, find the direction that is steepest at your current position, and repeat this until you get the wanted result. In this case, the result is the minimum value we can get for the error between the estimated output and the real output.
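In code, that loop is very short. Below is a minimal sketch of the generic update rule in Python; the function names and the fixed number of steps are my own illustrative choices.

# Generic gradient descent: repeatedly step against the gradient.
def gradient_descent(grad_f, x0, alpha=0.01, steps=1000):
    x = x0
    for _ in range(steps):
        x = x - alpha * grad_f(x)   # move in the steepest descending direction
    return x

# Example: minimise g(x) = (x - 2)**2, whose gradient is 2*(x - 2).
# Starting from x0 = 0.0, this returns a value very close to 2.
print(gradient_descent(lambda x: 2 * (x - 2), x0=0.0))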

How does gradient descent really work? Here is an example, and I am sure that having seen it, you will understand gradient descent and be able to write a piece of code using it.

 

Problem: find a value of x such that f(x) = 3 + 14x - 5x^2 = 0, with initial guess x = 0.

You may be scoffing at this because it is too simple to use as an illustration. But I have to say that beginners are better off starting with simple examples, especially ones you could also handle with another method.

OK, let's get started. First, let's define what the error is. Obviously, the error here is the difference between f(x) and 0, which is just f(x). But defined that way, the error could be negative, which is not good, for we want it to be always positive. So we define the error function, which is also called the cost function,

L(x) = f(x)^2 = (3 + 14x - 5x^2)^2

According to gradient descent, we need to take the steepest descending direction at each step, so we have to compute the derivative of L. By the chain rule, L'(x) = 2 f(x) f'(x) with f'(x) = 14 - 10x, which gives

L'(x) = 2(3 + 14x - 5x^2)(14 - 10x)

So, applying gradient descent, x should be updated according to the following rule:

x = x - alpha * L'(x) = x - alpha * 2(3 + 14x - 5x^2)(14 - 10x),

where alpha, which decides how big a step you are going to take, is called the learning rate.

 

The whole process is that you examine the difference between the real output and the predicted output. Usually you will get a very bad result, since it is only the first guess. In this case, x = 0, which means error = 9 (3 squared). If you are satisfied with this result, then quit the programme (but I am sure no one is satisfied with this); otherwise, go to the updating rule, update x, and look at the new result. In this case, if I take alpha = 0.01, then the new value of x should be 0 - 2*0.01*3*14 = -0.84. Use this x again to compute the error, and if you are not satisfied with it, update x and do it again!
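Just to check this arithmetic, here is a tiny sketch of that first update step in Python; the variable names are mine, and alpha = 0.01 matches the value used above.

x = 0.0
alpha = 0.01
f = 3 + 14*x - 5*x**2          # f(0) = 3
error = f**2                   # initial error = 9
grad = 2 * f * (14 - 10*x)     # L'(0) = 2 * 3 * 14 = 84
x = x - alpha * grad           # 0 - 0.01 * 84 = -0.84
print(error, x)                # prints 9.0 -0.84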

 

Some careful readers may notice that this function must have two different roots. Yes, you are right. But with this approach we can only get one of them. If you want the other root, what should you do? Think about it for a moment.

Yes, we could make the initial x = 10, or, if you are not sure how large the root is, you could make it as big as you want (but do not let Python complain about overflow), to make sure you get the positive root!
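For instance, here is a quick sketch of the same update rule started from x = 10. The learning rate below is deliberately smaller than before (my own choice): with a large starting point the gradient is large, and a bigger step would overshoot. This run converges to the positive root, x = 3.

x = 10.0
alpha = 0.00001                          # small step size for a large initial x
accuracy = 0.00000000001
error = (3 + 14*x - 5*x**2)**2
while error > accuracy:
    x -= 2 * (3 + 14*x - 5*x**2) * (14 - 10*x) * alpha
    error = (3 + 14*x - 5*x**2)**2
print(x)   # approximately 3, the positive root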

Background

We all want our machines to be more intelligent, meaning that they can adjust their actions according to the environment, that they can be cleverer than you and tell you whether something will work or not, and that they can even diagnose a disease!

I am enthusiastic about that: making machines able to do things humans can do, or even things we cannot. For example, based on some statistical data, a machine can tell the probability of whether your tumor is benign or malignant, which, to be honest, I am unable to do.

Using the code

This piece of code is for the example illustrated above. You could code it yourself; actually, it's quite simple.

x = 2                               # initial guess
alpha = 0.001                       # learning rate
error = (3 + 14*x - 5*(x**2))**2    # cost L(x) = f(x)^2
count = 0                           # number of iterations taken
accuracy = 0.00000000001            # stop once the error is small enough
while error > accuracy or error < -accuracy:
    # update rule: x = x - alpha * L'(x) = x - alpha * 2 * f(x) * f'(x)
    x -= (-10*x + 14) * (-5*x**2 + 14*x + 3) * alpha * 2
    error = (3 + 14*x - 5*(x**2))**2
    count += 1
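If you run the snippet and print the result (the print below is my addition, not part of the original code), x ends up very close to 3, the positive root, and count tells you how many updates it took.

print(x, count)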

 

Forecast

Next time, I will write an article about how to use gradient descent to predict whether a tumour is malignant or benign. See you!

 
