
web page data scraping every day data required please support

  1. #1
    Registered User
    Join Date
    04-21-2012
    Location
    India
    MS-Off Ver
    Excel 2007
    Posts
    55

    web page data scraping every day data required please support

I need code to download vehicle data from the Vahan website:

    https://vahan.parivahan.gov.in/vahan...portview.xhtml

First select the State, then the RTO name, then Maker, Month Wise and the year 2025, then click Refresh.

    On the left side, tick another 4-6 options and click Refresh again.

    Then download the Excel file.

    Repeat the same: change the RTO name, refresh, tick the 4-6 left-side options, refresh, and download the next Excel file.


    Every day I need to download the data for 4 states, covering all of their RTOs; a rough sketch of that outer loop follows below.

    Please help and support.
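
    In short, the daily routine is a nested loop over states and RTOs. Here is a rough Python sketch of just that outer loop; select_state, list_rtos, select_rto, apply_left_filters, refresh and download_excel are hypothetical helpers that would still have to be written against the site's real element ids:

    STATES = ["Haryana", "Uttar Pradesh", "Punjab", "Maharashtra"]

    def daily_run():
        for state_name in STATES:              # 4 states per day
            select_state(state_name)           # hypothetical helper
            for rto in list_rtos(state_name):  # hypothetical helper: all RTOs of the state
                select_rto(rto)                # hypothetical helper
                refresh()                      # refresh after the main filters
                apply_left_filters()           # hypothetical helper: tick the 4-6 left-side options
                refresh()                      # refresh again
                download_excel()               # hypothetical helper: click the Excel export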

  2. #2
    Registered User
    Join Date
    07-24-2023
    Location
    Oirschot, Netherlands
    MS-Off Ver
    2016
    Posts
    18

    Re: web page data scraping every day data required please support

    I can't open the link.

I get data daily from a certain website where I have an account. My default browser is Chrome and I run Office 365, and I use Selenium; without it, Excel can't drive a browser to scrape a site like this. I want to help you, but I'm not going to offer a complete solution. Just like me, you'll have to read up a bit yourself. Let me know if you're interested.

  3. #3
    Registered User
    Join Date
    04-21-2012
    Location
    India
    MS-Off Ver
    Excel 2007
    Posts
    55

    Re: web page data scraping every day data required please support

I'm interested, but Office 365 is not on my office system, and Selenium isn't installed either.


Link:
    https://vahan.parivahan.gov.in/vahan...portview.xhtml

Required filters on the website:
    Type: Actual Value
    State: 4 states, one at a time (Haryana, UP, Punjab & Maharashtra)
    RTO: one by one (all of them)
    Y-axis: Maker
    X-axis: Month Wise
    Year Type: Calendar Year
    Year: 2025
    Then click Refresh.

    On the left side, select multiple options, i.e. Motor Car, LUXURY CAB, MAXI CAB, MOTOR CAB, then click Refresh again and download the Excel file.

    Then repeat: change the RTO name, refresh, choose the left-side options again, refresh, and download the next Excel file. A sketch of those filter clicks is shown below.
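
    Those filters map onto the same pattern the script below uses for the state dropdown: click the dropdown by id, click the option by id, then refresh. A minimal sketch of that sequence, reusing the script's safe_click_by_id() and refresh() helpers; every id here is an assumption and has to be read from the live page with the browser's inspector, because PrimeFaces generates the j_idt numbers:

    # All ids below are assumptions -- inspect the live page for the real ones
    safe_click_by_id("yaxisVar")         # Y-axis dropdown (assumed id)
    safe_click_by_id("yaxisVar_1")       # "Maker" option (assumed id)
    safe_click_by_id("xaxisVar")         # X-axis dropdown (assumed id)
    safe_click_by_id("xaxisVar_4")       # "Month Wise" option (assumed id)
    safe_click_by_id("selectedYear")     # Year dropdown (assumed id)
    safe_click_by_id("selectedYear_1")   # "2025" option (assumed id)
    refresh()

    # Left-side vehicle-class checkboxes, located by their visible label text
    # (also an assumption about the page structure -- verify in the inspector)
    for label in ["Motor Car", "LUXURY CAB", "MAXI CAB", "MOTOR CAB"]:
        driver.find_element(By.XPATH, f"//label[normalize-space()='{label}']").click()
    refresh()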


I also found some code, but I don't know how it works yet. While searching Google I came across this script for scraping Vahan data:

import csv
    import sys
    import time

    from selenium import webdriver
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import Select, WebDriverWait
    from webdriver_manager.chrome import ChromeDriverManager


    filename = "rto_file.csv"
    url = 'https://vahan.parivahan.gov.in/vahan4dashboard/vahan/view/reportview.xhtml'
    refresh_button_id = "j_idt63"


    ### Helper tools ###
    def init_webdriver():
        # Options to make scraping faster
        options = Options()
        options.add_argument("--disable-extensions")
        options.add_argument("--headless")
        options.add_argument("--allow-insecure-localhost")
        driver = webdriver.Chrome(options=options, service=Service(ChromeDriverManager().install()))
        return driver

    # Finds the element by id, waits at most 10 s for it to become
    # clickable, then clicks it. If the element never appears (or the
    # page renders too slowly), reports the offending id and exits.
    def safe_click_by_id(elem_id: str):
        try:
            WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.ID, elem_id))).click()
        except TimeoutException:
            print(f"Timed out waiting for element with id: {elem_id}")
            sys.exit(1)

    def choose_elem_from_dropdown(dropdown_id, elem_id):
        safe_click_by_id(dropdown_id)   # open the dropdown
        # Searching by visible text rather than by id would be more stable
        # long-term, since PrimeFaces regenerates these ids
        safe_click_by_id(elem_id)       # click the wanted option
        safe_click_by_id(dropdown_id)   # close the dropdown again

    # Clicks the refresh button
    def refresh():
        safe_click_by_id(refresh_button_id)


    ### Functions that interact with the website ###
    # Starts up the actual page
    def init_page():
        driver = init_webdriver()
        driver.get(url)
        return driver

    # Extracts data from a dropdown option label to enter into the CSV.
    # This is a simple interpreter, requiring labels to be in the
    # expected "NAME - XX123(...)" format.
    def RTO_label_interpreter(label):
        RTO_details = label.split(" - ")

        # Dealing with exceptions where RTOs have " - " in their names
        if len(RTO_details) > 2:
            RTO_details = [" - ".join(RTO_details[:-1]), RTO_details[-1]]

        RTO_name = RTO_details[0]
        end_index = RTO_details[1].find('(')
        if end_index == -1:
            print("RTO naming format changed in dropdown menu")
            print(RTO_details)
            sys.exit(1)
        state_code = RTO_details[1][0:2]
        RTO_code = RTO_details[1][2:end_index]

        return RTO_name, state_code, RTO_code

    # Loops through every RTO of the chosen state and appends one CSV row each
    def get_rto_codes(state_name):
        rto_select = Select(driver.find_element(by=By.ID, value="selectedRto_input"))
        # Option 0 is the placeholder, which we don't want
        num_rto_options = len(rto_select.options)
        with open(filename, 'a', newline='') as csvfile:
            csvwriter = csv.writer(csvfile)
            # Loop through the relevant options in the RTO dropdown menu
            for idx in range(1, num_rto_options):
                RTO_label = rto_select.options[idx].get_attribute("text")
                RTO_name, state_code, RTO_code = RTO_label_interpreter(RTO_label)
                csvwriter.writerow([state_name, RTO_name, state_code, RTO_code])

    # Loops through all available states in the dropdown menu
    def state():
        state_select = Select(driver.find_element(by=By.ID, value="j_idt33_input"))
        # Option 0 is the placeholder, which we don't want
        num_state_options = len(state_select.options)
        # Loop through the relevant options in the state dropdown menu
        for idx in range(1, num_state_options):
            state_label = state_select.options[idx].get_attribute('text')
            state_name = state_label[:state_label.find('(')]
            choose_elem_from_dropdown("j_idt33", f"j_idt33_{idx}")
            time.sleep(5)   # give the RTO list time to repopulate
            get_rto_codes(state_name)
            refresh()


    def main():
        # Write the CSV header once; get_rto_codes() appends the rows
        fields = ['State', 'RTO', 'State Code', 'RTO Code']
        with open(filename, 'w', newline='') as csvfile:
            csv.writer(csvfile).writerow(fields)

        # The helper functions above all use this module-level driver
        global driver
        driver = init_page()
        state()
        time.sleep(1)
        return True

    if __name__ == "__main__":
        main()
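
    A note on this script: as found, it only harvests the state and RTO codes into rto_file.csv; it never clicks the Excel export. Install the two dependencies with "pip install selenium webdriver-manager" and run it with plain Python. For the export step, Chrome also has to be told where to save downloads. A minimal sketch of those extra pieces; the export-button id here is an assumption and must be read from the live page with the browser's inspector:

    import os

    # Send Chrome's downloads to a known folder instead of the default
    download_dir = os.path.abspath("vahan_downloads")
    os.makedirs(download_dir, exist_ok=True)

    options = Options()
    options.add_argument("--headless=new")
    options.add_experimental_option("prefs", {
        "download.default_directory": download_dir,
        "download.prompt_for_download": False,
    })
    driver = webdriver.Chrome(options=options, service=Service(ChromeDriverManager().install()))

    # After the filters are chosen and refreshed, click the Excel export
    # icon. "vhgroupdownload" is an assumed id -- verify it on the page.
    safe_click_by_id("vhgroupdownload")
    time.sleep(10)   # crude wait for the file to finish downloading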
    Last edited by sonu_kumar444; 06-17-2025 at 06:38 AM.

  4. #4
    Registered User
    Join Date
    07-24-2023
    Location
    Oirschot, Netherlands
    MS-Off Ver
    2016
    Posts
    18

    Re: web page data scraping every day data required please support

I still can't open the link without changing my system, and I'm not going to do that. I'll create an Office Word file explaining, step by step and with images, how to log in to a certain website and read out the data. This code also works with Office 2016, and SeleniumBasic can be downloaded for free here: https://github.com/florentbr/Seleniu...s/tag/v2.0.9.0
    Every time Chrome gets an update, the ChromeDriver needs an update as well. I've written a macro that performs this update automatically when necessary. If necessary, you can download the latest ChromeDriver here (the bottom one in the stable list is the most recent): https://googlechromelabs.github.io/c...esting/#stable
    Put that ChromeDriver in the SeleniumBasic installation folder, which by default is C:\Users\YOURNAME\AppData\Local\SeleniumBasic
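
    For the Python script in post #3, this version-matching problem is what the webdriver-manager package solves: ChromeDriverManager().install() downloads and caches a chromedriver that matches the installed Chrome, so no manual update is needed on that route. A minimal sketch:

    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    from webdriver_manager.chrome import ChromeDriverManager

    # Downloads (and caches) a chromedriver matching the installed Chrome,
    # so the driver stays in sync across Chrome updates
    driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
    print(driver.capabilities["browserVersion"])   # quick sanity check
    driver.quit()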

    I'm going to create the Office Word file now. That should point you in the right direction. I'll post it when it's done.

  5. #5
    Registered User
    Join Date
    07-24-2023
    Location
    Oirschot, Netherlands
    MS-Off Ver
    2016
    Posts
    18

    Re: web page data scraping every day data required please support

You have a PM.
