+ Reply to Thread
Results 1 to 2 of 2

Getting runtime error on reading html code

Hybrid View

  1. #1
    Forum Contributor
    Join Date
    01-18-2013
    Location
    Prague, Czech rep.
    MS-Off Ver
    Excel 2003 - 2016
    Posts
    138

    Getting runtime error on reading html code

    Hi guys,

    so I had this code to search the source code of websites for relevant information, then I upgraded to w10 + office16 and BANG its not working, spent half of the day to find the reason and solution but failed. So here I´m, hopeless and tired. any ideas what Microsoft F*ed up again? and most importantly, how to fix it?

    here is the code:

    Sub Harvest()
    
    Dim InternetWindow As InternetExplorerMedium
    Dim HTML_Code As String
    Dim Start_of_Code As Long
    Dim End_of_Code As Long
    Dim The_URL As String
    Dim Main_Web As String
    Dim Sub_Web() As Variant
    Dim Cap As Long
    Dim Street_Name As String
    Dim Row_No As Long
    
    Set InternetWindow = Nothing
    
    ReDim Sub_Web(O To 99999)
    Debug.Print Time
    
    Application.ScreenUpdating = False
    Set InternetWindow = New InternetExplorerMedium
    'GetObject ("new:{D5E8041D-920F-45e9-B8FB-B1DEB82C6E5E}")
    'Set InternetWindow = New InternetExplorer
    'CreateObject("InternetExplorer.Application")
    
    j = 1
    
    For i = 1 To 1
        If i = 1 Then
            Main_Web = "http://www.praha.cz/mestske-casti-praha?q=%2Fmestske-casti-praha"
        Else
            Main_Web = "http://www.praha.cz/mestske-casti-praha/strana-" & i & "?q=%2Fmestske-casti-praha"
        End If
            'InternetWindow.Visible = True
        InternetWindow.navigate (Main_Web)
        
        Do
        DoEvents
        
        Loop Until InternetWindow.ReadyState = READYSTATE_COMPLETE
        
        HTML_Code = InternetWindow.Document.Body.InnerHTML
        
        Do
        DoEvents
        
        Start_of_Code = InStr(HTML_Code, "Část obce Praha - ")
        If Start_of_Code > 0 Then
            The_URL = Mid(HTML_Code, Start_of_Code + 18, Len(HTML_Code))
            HTML_Code = The_URL
            The_URL = "http://www.praha.cz/mestske-casti-praha/cast-obce-praha-" & Mid(The_URL, 1, InStr(The_URL, "</a>"))
            Sub_Web(j) = The_URL
            HTML_Code = Mid(The_URL, InStr(The_URL, "</a>"), Len(The_URL))
            j = j + 1
        End If
        
        Debug.Print The_URL
        
        Loop Until Start_of_Code = 0
    Next i
    
    Cap = j
    
    For i = 1 To j
        If Sub_Web(i) <> "" Then
            'InternetWindow.Visible = True
            InternetWindow.navigate (Sub_Web(i))
            
            Do
            DoEvents
            
            Loop Until InternetWindow.ReadyState = READYSTATE_COMPLETE
    
    On Error GoTo skip1
            
            HTML_Code = InternetWindow.Document.Body.InnerHTML
            Start_of_Code = InStr(HTML_Code, "<b>Ulice")
            End_of_Code = Len(HTML_Code)
            HTML_Code = Mid(HTML_Code, Start_of_Code, End_of_Code)
                  
    On Error GoTo 0
                  
            Start_of_Code = InStr(HTML_Code, "contentpagetitle")
            If Start_of_Code > 0 Then
                HTML_Code = Mid(HTML_Code, Start_of_Code + 2, End_of_Code)
                Street_Name = Mid(HTML_Code, 1, InStr(HTML_Code, "<") - 1)
            Else
                Street_Name = ""
            End If
            Row_No = WorksheetFunction.CountA(Sheets(2).Columns(2)) + 1
            Sheets(2).ceels(Row_No, 2).Value = Replace(Sub_Web(i), "http://www.praha.cz/mestske-casti-praha/cast-obce-praha-", "")
            Sheets(2).Cells(Row_No, 3).Value = Street_Name
        End If
    skip1:
    On Error GoTo 0
    Next i
    Application.ScreenUpdating = True
    
    Debug.Print Time
    
    End Sub
    I always get the error around the first loop where the code is waiting for the IE to be ready to proceed, respectively uploading the innerHTML(OR it gets stucked infinitely) - depends on if I used createobject (error) or new internetexplorer (stucked). I tried changing the protection setting in IE, nothing works.

    The outcome should be list of all streets in Prague with the city district where it belongs (there is supposedly 7,5K streets, so you see I rly dont want to go manualy through all of them).

    Thanks for any ideas that might lead to solution, cause right now I´m rly desperate. I used the same code (just with different url) about 3 months ago and worked just fine, harvested database of 25K rows in about 2-3 hours.

    Best regards

    Soul

  2. #2
    Forum Contributor
    Join Date
    01-18-2013
    Location
    Prague, Czech rep.
    MS-Off Ver
    Excel 2003 - 2016
    Posts
    138

    Re: Getting runtime error on reading html code

    Nvm, solved it myself...:-D...apart from some minor mistakes like there was no loop to actualy collect all the street (or neglecting diacritics), the main trick to get past whatever S* microsoft deployed was to use CreateObject("MSXML2.XMLHTTP") instead of any other option that I´ve posted above. With this u want to reference the Microsoft XML, v3.0.

    for those more curious, here goes working code:

    (btw. took exactly 44 seconds to collect 7.189 rows)

    Sub Harvest()
    
    Dim InternetWindow As MSXML2.XMLHTTP
    Dim HTML_Code As String
    Dim Start_of_Code As Long
    Dim End_of_Code As Long
    Dim The_URL As String
    Dim Main_Web As String
    Dim Sub_Web() As Variant
    Dim District_Name() As Variant
    Dim Cap As Long
    Dim Street_Name As String
    Dim Dist_Name As String
    Dim Row_No As Long
    
    Set InternetWindow = Nothing
    
    ReDim Sub_Web(O To 99999)
    ReDim District_Name(O To 99999)
    Debug.Print Time
    
    Application.ScreenUpdating = False
    Set InternetWindow = CreateObject("MSXML2.XMLHTTP")
    
    j = 1
    
    For i = 1 To 7
        If i = 1 Then
            Main_Web = "http://www.praha.cz/mestske-casti-praha?q=%2Fmestske-casti-praha"
        Else
            Main_Web = "http://www.praha.cz/mestske-casti-praha/strana-" & i & "?q=%2Fmestske-casti-praha"
        End If
        InternetWindow.Open "GET", Main_Web, False
        InternetWindow.Send
        
        Do
        DoEvents
        
        Loop Until InternetWindow.ReadyState = READYSTATE_COMPLETE
        
        HTML_Code = InternetWindow.responseText
        
        Do
        DoEvents
        
        Start_of_Code = InStr(HTML_Code, "/mestske-casti-praha/cast-obce-praha")
        If Start_of_Code > 0 Then
            The_URL = Mid(HTML_Code, Start_of_Code, Len(HTML_Code))
            HTML_Code = The_URL
            Start_of_Code = InStr(HTML_Code, "Část obce Praha") + 18
            End_of_Code = InStr(HTML_Code, "</a>") - (Start_of_Code)
            District_Name(j) = Mid(HTML_Code, Start_of_Code, End_of_Code)
            The_URL = "http://www.praha.cz" & Mid(HTML_Code, 1, InStr(HTML_Code, "contentpagetitle") - 10)
            Sub_Web(j) = The_URL
            HTML_Code = Mid(HTML_Code, InStr(HTML_Code, "</a>"), Len(HTML_Code))
            j = j + 1
        End If
           
        Loop Until Start_of_Code = 0
    Next i
    
    Cap = j
    
    
    For i = 1 To j
        If Sub_Web(i) <> "" Then
            InternetWindow.Open "GET", Sub_Web(i), False
            InternetWindow.Send
            
            Do
            DoEvents
            
            Loop Until InternetWindow.ReadyState = READYSTATE_COMPLETE
    
    On Error GoTo skip1
            
            HTML_Code = InternetWindow.responseText
            Start_of_Code = InStr(HTML_Code, "Ulice:")
            End_of_Code = Len(HTML_Code)
            HTML_Code = Mid(HTML_Code, Start_of_Code, End_of_Code)
                  
    On Error GoTo 0
                          
            Do
            DoEvents
            Start_of_Code = InStr(HTML_Code, "contentpagetitle")
            If Start_of_Code > 0 Then
                HTML_Code = Mid(HTML_Code, Start_of_Code + 18, End_of_Code)
                Street_Name = Mid(HTML_Code, 1, InStr(HTML_Code, "<") - 1)
                Dist_Name = District_Name(i)
            Else
                Street_Name = ""
                Dist_Name = ""
            End If
            Row_No = WorksheetFunction.CountA(Sheets(2).Columns(2)) + 1
            Sheets(2).Cells(Row_No, 2).Value = Dist_Name
            Sheets(2).Cells(Row_No, 3).Value = Street_Name
            Loop Until Start_of_Code = 0
        End If
    skip1:
    On Error GoTo 0
    Next i
    Application.ScreenUpdating = True
    
    Debug.Print Time
    
    End Sub
    Regards,
    Soul

    If you liked my help, consider raising my reputation by clicking that button, and please don?t forget to mark this thread [SOLVED], when you?re done here.

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Similar Threads

  1. Runtime error 13 problem with a code
    By elmnas in forum Excel Programming / VBA / Macros
    Replies: 2
    Last Post: 07-28-2015, 07:54 AM
  2. [SOLVED] Getting a runtime error with this code
    By rinser in forum Excel Programming / VBA / Macros
    Replies: 2
    Last Post: 02-20-2013, 10:22 AM
  3. [SOLVED] Runtime Error on Grouping code
    By dezzer_18 in forum Excel Programming / VBA / Macros
    Replies: 3
    Last Post: 12-14-2012, 11:52 AM
  4. [SOLVED] Range error in code, runs alone but not inside my full program, giving runtime error 1004
    By charizzardd in forum Excel Programming / VBA / Macros
    Replies: 3
    Last Post: 08-23-2012, 03:34 PM
  5. Reading HTML source code
    By Chinnick in forum Excel Programming / VBA / Macros
    Replies: 3
    Last Post: 02-11-2010, 03:38 PM
  6. Tweaking VBA Code for HTML Table Cell Reading
    By ShannonLarson in forum Excel Programming / VBA / Macros
    Replies: 5
    Last Post: 12-07-2009, 05:08 PM
  7. runtime error 9 problem with this code
    By cuewoz in forum Excel Programming / VBA / Macros
    Replies: 0
    Last Post: 10-03-2006, 09:54 AM

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1