+ Reply to Thread
Results 1 to 5 of 5

Thread: Pull Data from Website Source

  1. #1
    Registered User
    Join Date
    04-17-2011
    Location
    Pittsburgh, PA, USA
    MS-Off Ver
    Excel 2010
    Posts
    3

    Question Pull Data from Website Source

    I have a column of URL's and would like to write a function that hits an API to return a single piece of data for each of them. The API is a simple querystring-based interface, and I want the entire response added to a cell.

    A simple example would be to pull the <Title> element from any URL. (WITHOUT the use of the Web Query data connection.) That doesn't work in this case. My goal is to have an easy way to read a cell with a URL and return a piece of HTML content.

    I've been searching on this for a few days and haven't found a solution that seems to work.

    Any help or links to related posts are appreciated.
    Last edited by JasonGreen; 04-18-2011 at 12:02 PM. Reason: Solved
    - Jason Green
    - BusinessHut.com

  2. #2
    Forum Moderator Leith Ross's Avatar
    Join Date
    01-15-2005
    Location
    San Francisco, Ca
    MS-Off Ver
    2000, 2003, & read 2007
    Posts
    15,979

    Re: Pull Data from Website Source

    Hello Jason,

    Welcome to the Forum!

    Are using Windows?
    Are using Internet Explorer as your browser?
    Can you post a sample workbook with before and after examples?
    Sincerely,
    Leith Ross

    Remember To Do the Following....

    1. Use code tags. Place [CODE] before the first line of code and [/CODE] after the last line of code.
    2. Thank those who have helped you by clicking the Star below the post.
    3. Please mark your post [SOLVED] if it has been answered satisfactorily.


    Old Scottish Proverb...
    Luathaid gu deanamh maille! (Rushing causes delays!)

  3. #3
    Registered User
    Join Date
    04-17-2011
    Location
    Pittsburgh, PA, USA
    MS-Off Ver
    Excel 2010
    Posts
    3

    Re: Pull Data from Website Source

    Thanks!

    I am using Windows XP or Windows 7 with Excel 2007 or 2010.

    Unfortunately, I don't have any samples, as I can't seem to get anything close to working.

    Just asking in general what method someone else would use to get this data. It seems like it should be extremely simple, but I can't find any examples of someone making this work.

    It would be something like the "GetData" formula in Google Docs, where the only parameter is the URL and it returns the text of the resulting page. I can parse out what I need after we get this first part working.
    - Jason Green
    - BusinessHut.com

  4. #4
    Forum Moderator Leith Ross's Avatar
    Join Date
    01-15-2005
    Location
    San Francisco, Ca
    MS-Off Ver
    2000, 2003, & read 2007
    Posts
    15,979

    Re: Pull Data from Website Source

    Hello Jason,

    This macro uses the Windows API to return the page source text of the URL and its status. The macro has an option to allow redirects from the URL you provide. The default for this flag is False.

    Copy this code into a standard VBA module in your project.
    Macro Code
    'Written: March 15, 2011
    'Author:  Leith Ross
    
    Public PageSource As String
    Public httpRequest As Object
    
    Function GetURLStatus(ByVal URL As String, Optional AllowRedirects As Boolean)
    
      Const WinHttpRequestOption_UserAgentString = 0
      Const WinHttpRequestOption_EnableRedirects = 6
      
      
        On Error Resume Next
           Set httpRequest = CreateObject("WinHttp.WinHttpRequest.5.1")
             If httpRequest Is Nothing Then
                Set httpRequest = CreateObject("WinHttp.WinHttpRequest.5")
             End If
           Err.Clear
        On Error GoTo 0
    
        httpRequest.Option(WinHttpRequestOption_UserAgentString) = "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"
        httpRequest.Option(WinHttpRequestOption_EnableRedirects) = AllowRedirects
    
    
      'Clear any pervious web page source information
        PageSource = ""
        
            'Add protocol if missing
              If InStr(1, URL, "://") = 0 Then
                 URL = "http://" & URL
              End If
    
              'Launch the HTTP httpRequest synchronously
                On Error Resume Next
                  httpRequest.Open "GET", URL, False
                  If Err.Number <> 0 Then
                    'Handle connection errors
                      GetURLStatus = Err.Description
                      Err.Clear
                      Exit Function
                  End If
                On Error GoTo 0
               
              'Send the http httpRequest for server status
                On Error Resume Next
                  httpRequest.Send
                  httpRequest.WaitForResponse
                  If Err.Number <> 0 Then
                    ' Handle server errors
                      PageSource = "Error"
                      GetURLStatus = Err.Description
                      Err.Clear
                  Else
                    'Show HTTP response info
                      GetURLStatus = httpRequest.Status & " - " & httpRequest.StatusText
                    'Save the web page text
                      PageSource = httpRequest.responsetext
                  End If
                On Error GoTo 0
                
    End Function

    Example
    Sub GetPageSource()
    
      Dim Status As String
      Dim Text As String
      
        Status = GetURLStatus(URL:="www.google.com", AllowRedirects:=True)
        Text = PageSource
             
    End Sub
    Sincerely,
    Leith Ross

    Remember To Do the Following....

    1. Use code tags. Place [CODE] before the first line of code and [/CODE] after the last line of code.
    2. Thank those who have helped you by clicking the Star below the post.
    3. Please mark your post [SOLVED] if it has been answered satisfactorily.


    Old Scottish Proverb...
    Luathaid gu deanamh maille! (Rushing causes delays!)

  5. #5
    Registered User
    Join Date
    04-17-2011
    Location
    Pittsburgh, PA, USA
    MS-Off Ver
    Excel 2010
    Posts
    3

    Thumbs up Re: Pull Data from Website Source

    Leith,

    Thank you so much! This worked perfectly.

    I was able to get the URL status right "out of the box", and was able to change the section below to return the source code. Exactly what I was looking for!

    GetURLStatus = httpRequest.responsetext
    Thanks again for your help!
    *And even for adding the error handling as well. You are awesome!
    - Jason Green
    - BusinessHut.com

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Tags for this Thread

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.2.0