一区二区三区日韩精品-日韩经典一区二区三区-五月激情综合丁香婷婷-欧美精品中文字幕专区

分享

使用Excel+VBA對網(wǎng)頁進行操作

 心靈地圖sxh 2018-02-23

對于Excel和VBA我所知有限,僅能解決自己遇到的一些問題,并不一定適用于所有場景。以下內(nèi)容建立在了解基本VBA使用以及HTML語言知識的基礎(chǔ)上:

一、前期準(zhǔn)備

就我所知,VBA并不能操作任意瀏覽器及網(wǎng)頁,我們所能做的僅僅是對IE進行一些操作,是的,僅僅是IE。不要告訴我電腦上沒有IE,那樣就可以Exit Sub了。就像Python用import、C#用using一樣,VBA也需要引用一些庫才能對IE進行操作,不過好在同屬微軟產(chǎn)品,所以我們能很簡便的利用VBA自帶的一些庫。

首先我們要做的就是在VBA中引用Micorsoft Internet Controls,看這個名字就知道是幫助我們控制IE頁面用的。

二、網(wǎng)頁操作

引用Micorsoft Internet Controls之后,我們就可以對頁面為所欲為了,不過首頁我們要有個頁面,上帝說要有頁面!

1、打開網(wǎng)頁

我們以在百度搜索“扯乎”關(guān)鍵詞為例:

    With CreateObject("internetexplorer.application")
        .Visible = True
        .Navigate "https://www.baidu.com/s?wd=扯乎"
'關(guān)閉網(wǎng)頁
'       .Quit
    End With

代碼很簡單,先創(chuàng)建一個IE對象,然后給一些屬性賦值。Visible是可見性,說的是在對網(wǎng)頁進行操作時,這個網(wǎng)頁是不是會被看見。熟練之后可以設(shè)置為False,不僅讓程序在跑的時候有種神秘感(并沒有),還能稍微加快一點速度。

不過有一點要記住,這個網(wǎng)頁我們打開之后并沒有關(guān)閉,也就是說程序結(jié)束后需要手動關(guān)閉,如果網(wǎng)頁不可見是無法手動關(guān)閉的。代碼中注釋的部分就是關(guān)閉網(wǎng)頁用的。Navigate不用多說就是URL。

我們必須要等網(wǎng)頁完全加載完才能開始信息的抓取,這個時候使用到:(從這里開始,所有的代碼都需要寫在With代碼塊中

        While .ReadyState <> 4 Or .Busy
            DoEvents
        Wend

Busy是網(wǎng)頁忙碌狀態(tài),ReadyState是HTTP的5種就緒狀態(tài),對應(yīng)如下:

  • 0:請求未初始化(還沒有調(diào)用 open())。
  • 1:請求已經(jīng)建立,但是還沒有發(fā)送(還沒有調(diào)用 send())。
  • 2:請求已發(fā)送,正在處理中(通?,F(xiàn)在可以從響應(yīng)中獲取內(nèi)容頭)。
  • 3:請求在處理中;通常響應(yīng)中已有部分數(shù)據(jù)可用了,但是服務(wù)器還沒有完成響應(yīng)的生成。
  • 4:響應(yīng)已完成;您可以獲取并使用服務(wù)器的響應(yīng)了。

2、獲取信息

我們先把頁面中的所有內(nèi)容抓下來,后期篩選出有用的部分再慢慢給抓取添加條件。

        Set dmt = .Document
        For i = 0 To dmt.all.Length - 1
            Set htMent = dmt.all(i)
            With ActiveSheet
                .Cells(i + 2, "A") = htMent.tagName
                .Cells(i + 2, "B") = TypeName(htMent)
                .Cells(i + 2, "C") = htMent.ID
                .Cells(i + 2, "D") = htMent.Name
                .Cells(i + 2, "E") = htMent.Value
                .Cells(i + 2, "F") = htMent.Text
                .Cells(i + 2, "G") = htMent.innerText
            End With
        Next i

這塊代碼和JS有些相似,需要從IE.Document.all中把頁面上所有節(jié)點找出來。這里也提供其他幾種方法:

  • getElementById("IDName"):返回第一個內(nèi)部標(biāo)有IDName的標(biāo)簽
  • getElementsByName("a") :返回所有的<a>標(biāo)簽,返回值為集合
  • getElementsByClassName("css"):返回所有樣式名稱為css的標(biāo)簽,返回值為集合

這些都是在抓取了全部頁面內(nèi)容后幫助篩選有效信息時使用起來比較方便的。當(dāng)然all還是最好用的,因為all也存在all("IDName")以及all.IDName等等用法。

上面代碼部分返回的屬性值都是HTML基本內(nèi)容,就不一一解釋了。

3、填充信息

網(wǎng)抓神器當(dāng)然還是Python,大部分人使用Excel的目的還是在于對頁面內(nèi)容進行自動填充,直接讓表格提交網(wǎng)頁,問卷錄入之類的工作都省心不少。在抓取了頁面內(nèi)容之后,想填充更加是易如反掌的事情,只需要直接給頁面標(biāo)簽的Value屬性賦值就可以了。

不過網(wǎng)頁中除了文本框,可能還存在一些其他沒有Value的標(biāo)簽,比如:下拉菜單、單選框。給這些內(nèi)容賦值就需要一些基本的HTML知識了。

'下拉菜單選擇
.all("select")(0).Selected = True
'單選按鈕選擇
.all("radio").Checked = True
'復(fù)選按鈕選擇
.all("checkbox").Checked = True

下拉菜單是select標(biāo)簽,每個選項都在一個option標(biāo)簽里,所以返回一個集合,需要選中某個選項就要修改對應(yīng)的Selected屬性為True。單選和復(fù)選按鈕都是input標(biāo)簽,區(qū)別在于類型分別是radio和checkbox,要選中某個選項需要修改對應(yīng)的Checked屬性。

三、數(shù)據(jù)接口

有時候我們能直接拿到一些API,通過API返回數(shù)據(jù)當(dāng)然比打開網(wǎng)頁更方便快捷,所使用的方法也有一些不太一樣。

1、請求接口

比如我從網(wǎng)上得到一個能通過城市查詢免費WIFI的API,通過Excel接口訪問就使用下面的代碼:(雖然是免費的,為了避免麻煩還是把我的AppKey隱去了)

  Dim http
  Set http = CreateObject("Microsoft.XMLHTTP")
  http.Open "GET", "http://api./Wifi/QueryByCity", False
  http.setRequestHeader "CONTENT-TYPE", "application/x-www-form-urlencoded"
  http.send "key=[AppKey]&city=北京&page=1"

這時我們創(chuàng)建的對象就不再是IE,而是HTTP對象。這里用的是ajax的Open方法,GET是數(shù)據(jù)發(fā)送方式,第二個參數(shù)是接口地址,第三個參數(shù)是指定請求方式是否為異步。如果這個API有賬號密碼,分別寫在第四第五個參數(shù)。

setRequestHeader就是給接口發(fā)送一個HTTP協(xié)議頭文件,最后send的內(nèi)容是接口參數(shù)。當(dāng)然,這個QueryString也可以直接寫在URL里,send一個空字符串就可以了。

2、接口返回

接口返回獲取的方式很簡單:

  If http.Status = 200 Then Range("A1").Value = http.responseText

這里的HTTP狀態(tài)又變成200了,和之前說好的不一樣啊摔~有興趣可以自己查查具體有哪些。

不過接口返回要么是JSON要么是XML,Excel處理起來十分不方便。這里提供一個處理JSON的方法,是從網(wǎng)上找來的類模塊,具體內(nèi)容放在附錄里。在添加了這個clsJSON類模塊后,對JSON的處理就變得十分簡單了。

將上面的代碼改成:

  If http.Status = 200 Then
     Dim json$
     json = http.responseText
     Dim objJSON As New clsJSON, dicJSON As Object
     Set dicJSON = objJSON.parse(json)
     
     For i = 1 To dicJSON("result")("data").Count
       Sheet1.Cells(i + 1, 1) = dicJSON("result")("data")(i)("name")
       Sheet1.Cells(i + 1, 2) = dicJSON("result")("data")(i)("intro")
       Sheet1.Cells(i + 1, 3) = dicJSON("result")("data")(i)("address")
     Next i
  End If

接口返回的示例我也放在附錄里了,根據(jù)接口返回的對象名、數(shù)組名去修改dicJSON后面的內(nèi)容就可以了。這個處理JSON的模塊用的是VBA中字典+集合的原理,所以數(shù)據(jù)處理后的調(diào)用方式也參照字典和集合。

以上是我用Excel+VBA進行網(wǎng)頁操作的一些個人經(jīng)驗,希望能幫助到一些有需要的人。有什么錯漏的地方,也希望知乎大牛批評指正。

附錄一:VBA處理JSON的類模塊

Option Explicit
'================================
' VBA處理JSON文件的類模塊
'
' http://www.
'================================
Const INVALID_JSON      As Long = 1
Const INVALID_OBJECT    As Long = 2
Const INVALID_ARRAY     As Long = 3
Const INVALID_BOOLEAN   As Long = 4
Const INVALID_NULL      As Long = 5
Const INVALID_KEY       As Long = 6

Private Sub Class_Initialize()

End Sub

Private Sub Class_Terminate()

End Sub

Public Function parse(ByRef str As String) As Object

    Dim index As Long
    index = 1
    
    On Error Resume Next

    Call skipChar(str, index)
    Select Case Mid(str, index, 1)
    Case "{"
        Set parse = parseObject(str, index)
    Case "["
        Set parse = parseArray(str, index)
    End Select

End Function

Private Function parseObject(ByRef str As String, ByRef index As Long) As Object

    Set parseObject = CreateObject("Scripting.Dictionary")
    
    ' "{"
    Call skipChar(str, index)
    If Mid(str, index, 1) <> "{" Then Err.Raise vbObjectError + INVALID_OBJECT, Description:="char " & index & " : " & Mid(str, index)
    index = index + 1
    
    Do
    
        Call skipChar(str, index)
        If "}" = Mid(str, index, 1) Then
            index = index + 1
            Exit Do
        ElseIf "," = Mid(str, index, 1) Then
            index = index + 1
            Call skipChar(str, index)
        End If
        
        Dim key As String
        
        ' add key/value pair
        parseObject.Add key:=parseKey(str, index), Item:=parseValue(str, index)
        
    Loop

End Function

Private Function parseArray(ByRef str As String, ByRef index As Long) As Collection

    Set parseArray = New Collection
    
    ' "["
    Call skipChar(str, index)
    If Mid(str, index, 1) <> "[" Then Err.Raise vbObjectError + INVALID_ARRAY, Description:="char " & index & " : " + Mid(str, index)
    index = index + 1
    
    Do
        
        Call skipChar(str, index)
        If "]" = Mid(str, index, 1) Then
            index = index + 1
            Exit Do
        ElseIf "," = Mid(str, index, 1) Then
            index = index + 1
            Call skipChar(str, index)
        End If
        
        ' add value
        parseArray.Add parseValue(str, index)
        
    Loop

End Function

Private Function parseValue(ByRef str As String, ByRef index As Long)

    Call skipChar(str, index)
    
    Select Case Mid(str, index, 1)
    Case "{"
        Set parseValue = parseObject(str, index)
    Case "["
        Set parseValue = parseArray(str, index)
    Case """", "'"
        parseValue = parseString(str, index)
    Case "t", "f"
        parseValue = parseBoolean(str, index)
    Case "n"
        parseValue = parseNull(str, index)
    Case Else
        parseValue = parseNumber(str, index)
    End Select

End Function

Private Function parseString(ByRef str As String, ByRef index As Long) As String

    Dim quote   As String
    Dim char    As String
    Dim code    As String
    
    Call skipChar(str, index)
    quote = Mid(str, index, 1)
    index = index + 1
    Do While index > 0 And index <= Len(str)
        char = Mid(str, index, 1)
        Select Case (char)
        Case "\"
            index = index + 1
            char = Mid(str, index, 1)
            Select Case (char)
            Case """", "\\", "/"
                parseString = parseString & char
                index = index + 1
            Case "b"
                parseString = parseString & vbBack
                index = index + 1
            Case "f"
                parseString = parseString & vbFormFeed
                index = index + 1
            Case "n"
                parseString = parseString & vbNewLine
                index = index + 1
            Case "r"
                parseString = parseString & vbCr
                index = index + 1
            Case "t"
                parseString = parseString & vbTab
                index = index + 1
            Case "u"
                index = index + 1
                code = Mid(str, index, 4)
                parseString = parseString & ChrW(Val("&h" + code))
                index = index + 4
            End Select
        Case quote
            index = index + 1
            Exit Function
        Case Else
            parseString = parseString & char
            index = index + 1
        End Select
    Loop

End Function

Private Function parseNumber(ByRef str As String, ByRef index As Long)

    Dim value   As String
    Dim char    As String
    
    Call skipChar(str, index)
    Do While index > 0 And index <= Len(str)
        char = Mid(str, index, 1)
        If InStr("+-0123456789.eE", char) Then
            value = value & char
            index = index + 1
        Else
            If InStr(value, ".") Or InStr(value, "e") Or InStr(value, "E") Then
                parseNumber = CDbl(value)
            Else
                parseNumber = CInt(value)
            End If
            Exit Function
        End If
    Loop


End Function

Private Function parseBoolean(ByRef str As String, ByRef index As Long) As Boolean

    Call skipChar(str, index)
    If Mid(str, index, 4) = "true" Then
        parseBoolean = True
        index = index + 4
    ElseIf Mid(str, index, 5) = "false" Then
        parseBoolean = False
        index = index + 5
    Else
        Err.Raise vbObjectError + INVALID_BOOLEAN, Description:="char " & index & " : " & Mid(str, index)
    End If

End Function

Private Function parseNull(ByRef str As String, ByRef index As Long)

    Call skipChar(str, index)
    If Mid(str, index, 4) = "null" Then
        parseNull = Null
        index = index + 4
    Else
        Err.Raise vbObjectError + INVALID_NULL, Description:="char " & index & " : " & Mid(str, index)
    End If

End Function

Private Function parseKey(ByRef str As String, ByRef index As Long) As String

    Dim dquote  As Boolean
    Dim squote  As Boolean
    Dim char    As String
    
    Call skipChar(str, index)
    Do While index > 0 And index <= Len(str)
        char = Mid(str, index, 1)
        Select Case (char)
        Case """"
            dquote = Not dquote
            index = index + 1
            If Not dquote Then
                Call skipChar(str, index)
                If Mid(str, index, 1) <> ":" Then
                    Err.Raise vbObjectError + INVALID_KEY, Description:="char " & index & " : " & parseKey
                End If
            End If
        Case "'"
            squote = Not squote
            index = index + 1
            If Not squote Then
                Call skipChar(str, index)
                If Mid(str, index, 1) <> ":" Then
                    Err.Raise vbObjectError + INVALID_KEY, Description:="char " & index & " : " & parseKey
                End If
            End If
        Case ":"
            If Not dquote And Not squote Then
                index = index + 1
                Exit Do
            End If
        Case Else
            If InStr(vbCrLf & vbCr & vbLf & vbTab & " ", char) Then
            Else
                parseKey = parseKey & char
            End If
            index = index + 1
        End Select
    Loop

End Function

Public Sub skipChar(ByRef str As String, ByRef index As Long)

    While index > 0 And index <= Len(str) And InStr(vbCrLf & vbCr & vbLf & vbTab & " ", Mid(str, index, 1))
        index = index + 1
    Wend

End Sub

Public Function toString(ByRef obj As Variant) As String

    Select Case VarType(obj)
        Case vbNull
            toString = "null"
        Case vbDate
            toString = """" & CStr(obj) & """"
        Case vbString
            toString = """" & encode(obj) & """"
        Case vbObject
            Dim bFI, i
            bFI = True
            If TypeName(obj) = "Dictionary" Then
                toString = toString & "{"
                Dim keys
                keys = obj.keys
                For i = 0 To obj.Count - 1
                    If bFI Then bFI = False Else toString = toString & ","
                    Dim key
                    key = keys(i)
                    toString = toString & """" & key & """:" & toString(obj(key))
                Next i
                toString = toString & "}"
            ElseIf TypeName(obj) = "Collection" Then
                toString = toString & "["
                Dim value
                For Each value In obj
                    If bFI Then bFI = False Else toString = toString & ","
                    toString = toString & toString(value)
                Next value
                toString = toString & "]"
            End If
        Case vbBoolean
            If obj Then toString = "true" Else toString = "false"
        Case vbVariant, vbArray, vbArray + vbVariant
            Dim sEB
            toString = multiArray(obj, 1, "", sEB)
        Case Else
            toString = Replace(obj, ",", ".")
    End Select

End Function

Private Function encode(str) As String
    
    Dim i, j, aL1, aL2, c, p

    aL1 = Array(&H22, &H5C, &H2F, &H8, &HC, &HA, &HD, &H9)
    aL2 = Array(&H22, &H5C, &H2F, &H62, &H66, &H6E, &H72, &H74)
    For i = 1 To Len(str)
        p = True
        c = Mid(str, i, 1)
        For j = 0 To 7
            If c = Chr(aL1(j)) Then
                encode = encode & "\" & Chr(aL2(j))
                p = False
                Exit For
            End If
        Next

        If p Then
            Dim a
            a = AscW(c)
            If a > 31 And a < 127 Then
                encode = encode & c
            ElseIf a > -1 Or a < 65535 Then
                encode = encode & "\u" & String(4 - Len(Hex(a)), "0") & Hex(a)
            End If
        End If
    Next
End Function

Private Function multiArray(aBD, iBC, sPS, ByRef sPT)   ' Array BoDy, Integer BaseCount, String PoSition
    Dim iDU, iDL, i ' Integer DimensionUBound, Integer DimensionLBound
    On Error Resume Next
    iDL = LBound(aBD, iBC)
    iDU = UBound(aBD, iBC)
    
    Dim sPB1, sPB2  ' String PointBuffer1, String PointBuffer2
    If Err.Number = 9 Then
        sPB1 = sPT & sPS
        For i = 1 To Len(sPB1)
            If i <> 1 Then sPB2 = sPB2 & ","
            sPB2 = sPB2 & Mid(sPB1, i, 1)
        Next
'        multiArray = multiArray & toString(Eval("aBD(" & sPB2 & ")"))
        multiArray = multiArray & toString(aBD(sPB2))
    Else
        sPT = sPT & sPS
        multiArray = multiArray & "["
        For i = iDL To iDU
            multiArray = multiArray & multiArray(aBD, iBC + 1, i, sPT)
            If i < iDU Then multiArray = multiArray & ","
        Next
        multiArray = multiArray & "]"
        sPT = Left(sPT, iBC - 2)
    End If
    Err.Clear
End Function

附錄二:JSON返回示例

{
"resultcode":"200",
"reason":"ReturnSuccessd!",
"result":{
"data":[
{
"name":"北京市法雨合",
"intro":"法雨合0層",
"address":"北京市朝陽區(qū)朝陽區(qū)三里屯",
"google_lat":"39.9372423",
"google_lon":"116.4480615",
"baidu_lat":"39.942952987502",
"baidu_lon":"116.45464108129",
"province":"北京市",
"city":"北京市"
},
{
"name":"北京朝陽西壩河光熙門北里",
"intro":"朝陽西壩河光熙門北里34-8號0層",
"address":"北京市朝陽區(qū)朝陽區(qū)西壩河光熙門北里34號-8號0層",
"google_lat":"39.9635121",
"google_lon":"116.435895",
"baidu_lat":"39.969407173324",
"baidu_lon":"116.44243487981",
"province":"北京市",
"city":"北京市"
},
{
"name":"北京朝陽三里屯北街",
"intro":"",
"address":"北京市朝陽區(qū)朝陽三里屯北街8號0層",
"google_lat":"39.9254286",
"google_lon":"116.4605935",
"baidu_lat":"39.931073085771",
"baidu_lon":"116.46719483818",
"province":"北京市",
"city":"北京市"
},
{
"name":"北京大都酒吧街",
"intro":"",
"address":"北京市朝陽區(qū)元大都酒吧街11號",
"google_lat":"39.975984",
"google_lon":"116.424389",
"baidu_lat":"39.982089966811",
"baidu_lon":"116.43086831752",
"province":"北京市",
"city":"北京市"
},
{
"name":"北京西城前海北沿",
"intro":"",
"address":"北京市西城區(qū)西城前海北沿10號0層",
"google_lat":"39.9369032",
"google_lon":"116.3919335",
"baidu_lat":"39.943215619704",
"baidu_lon":"116.39830652238",
"province":"北京市",
"city":"北京市"
},
{
"name":"北京市西城后海南沿36號對面",
"intro":"后海南沿36號對面0層",
"address":"北京市西城區(qū)后海南沿36號",
"google_lat":"39.9396792",
"google_lon":"116.389129",
"baidu_lat":"39.945967638433",
"baidu_lon":"116.39551153315",
"province":"北京市",
"city":"北京市"
},
{
"name":"北京市賽百味",
"intro":"ok",
"address":"北京市西城區(qū)中關(guān)村東路18號",
"google_lat":"39.9810991",
"google_lon":"116.3333866",
"baidu_lat":"39.9867766224",
"baidu_lon":"116.34001632032",
"province":"北京市",
"city":"北京市"
},
{
"name":"北京市光華路數(shù)碼01",
"intro":"",
"address":"北京市朝陽區(qū)光華路數(shù)碼01大廈0層",
"google_lat":"39.9132392",
"google_lon":"116.4592309",
"baidu_lat":"39.918885961978",
"baidu_lon":"116.46583845234",
"province":"北京市",
"city":"北京市"
},
{
"name":"北京市盛銘幫逸園會館",
"intro":"盛銘幫逸園會館0",
"address":"北京市朝陽區(qū)逸園25號",
"google_lat":"39.8710876",
"google_lon":"116.4602965",
"baidu_lat":"39.876744728506",
"baidu_lon":"116.46693498949",
"province":"北京市",
"city":"北京市"
},
{
"name":"北京市地平線酒吧",
"intro":"",
"address":"北京市朝陽區(qū)朝陽三里屯北街70號",
"google_lat":"39.9254286",
"google_lon":"116.4605935",
"baidu_lat":"39.931073085771",
"baidu_lon":"116.46719483818",
"province":"北京市",
"city":"北京市"
}
],
"pageinfo":{
"pnums":20,
"current":1
	}
	}
}

網(wǎng)頁抓取VBA

    本站是提供個人知識管理的網(wǎng)絡(luò)存儲空間,所有內(nèi)容均由用戶發(fā)布,不代表本站觀點。請注意甄別內(nèi)容中的聯(lián)系方式、誘導(dǎo)購買等信息,謹防詐騙。如發(fā)現(xiàn)有害或侵權(quán)內(nèi)容,請點擊一鍵舉報。
    轉(zhuǎn)藏 分享 獻花(0

    0條評論

    發(fā)表

    請遵守用戶 評論公約

    類似文章 更多

    99久久国产综合精品二区| 欧美成人高清在线播放| 国产一区二区三区丝袜不卡| 国产成人午夜av一区二区| 99国产高清不卡视频| 91精品视频全国免费| 少妇人妻一级片一区二区三区| 精品精品国产自在久久高清| 国产色偷丝袜麻豆亚洲| 女生更色还是男生更色| 欧美日韩乱码一区二区三区| 91播色在线免费播放| 欧美一级日韩中文字幕| 久久福利视频视频一区二区| 久久精品国产在热亚洲| 国产精品一区二区三区日韩av| 日韩精品视频香蕉视频| 亚洲国产色婷婷久久精品| 亚洲精品一区二区三区日韩| 一个人的久久精彩视频 | 日韩国产亚洲一区二区三区| 亚洲欧美中文字幕精品| 日韩不卡一区二区视频| 亚洲欧洲在线一区二区三区| 日韩中文字幕欧美亚洲| 深夜视频在线观看免费你懂| 国产精品自拍杆香蕉视频| 久久精品欧美一区二区三不卡 | 欧美日韩精品久久亚洲区熟妇人| 黑人巨大精品欧美一区二区区| 国产一区欧美一区日本道| 亚洲精品成人午夜久久| 国产午夜精品福利免费不| 蜜桃av人妻精品一区二区三区| 日本精品中文字幕在线视频| 国产在线成人免费高清观看av| 色小姐干香蕉在线综合网| 又色又爽又黄的三级视频| 亚洲第一视频少妇人妻系列 | 精品高清美女精品国产区| 亚洲精品成人福利在线|