RSelenium & Web Scraping











I'm trying to scrape data from a website and am running into trouble. I can navigate the site using RSelenium; my code is below. I want to scrape the option names from each drop-down so that I can store them in an object and loop over them.



library(RSelenium)
library(XML)

# Start a Selenium server and open a browser session
rd <- rsDriver()
remDr <- rd[["client"]]

url <- "https://kvk.icar.gov.in/facilities_list.aspx"

# JavaScript that returns the outer HTML of the element passed to it
jsScript <- "var element = arguments[0]; return element.outerHTML;"

remDr$navigate(url)

# First drop down
stateEle <- remDr$findElement("id", "ContentPlaceHolder1_ddlState")
stateHTML <- remDr$executeScript(jsScript, list(stateEle))[[1]]
statedoc <- htmlParse(stateHTML, asText = TRUE)
# <option> has no "name" attribute; the visible name is the node text
states <- xpathSApply(statedoc, "//option", xmlValue)
stateEle$clickElement()
stateEle$sendKeysToElement(list(states[[30]]))
stateEle$clickElement()
Sys.sleep(2)  # crude pause, assuming the district list reloads after a state is chosen

# Second drop down
distEle <- remDr$findElement("id", "ContentPlaceHolder1_ddlDistrict")
distHTML <- remDr$executeScript(jsScript, list(distEle))[[1]]
distdoc <- htmlParse(distHTML, asText = TRUE)
# sendKeysToElement types visible text, so collect the option labels
districts <- xpathSApply(distdoc, "//option", xmlValue)
distEle$clickElement()
distEle$sendKeysToElement(list(districts[[2]]))
distEle$clickElement()
Sys.sleep(2)  # same assumption as above, for the KVK list

# Third drop down
kvkEle <- remDr$findElement("id", "ContentPlaceHolder1_ddlKvk")
kvkHTML <- remDr$executeScript(jsScript, list(kvkEle))[[1]]
kvkdoc <- htmlParse(kvkHTML, asText = TRUE)
kvk <- xpathSApply(kvkdoc, "//option", xmlValue)
kvkEle$clickElement()
kvkEle$sendKeysToElement(list(kvk[[2]]))
kvkEle$clickElement()

# Submit the selected values
submitEle <- remDr$findElement("id", "ContentPlaceHolder1_btnSubmit")
submitEle$clickElement()


I also want to scrape the results into a data frame.










  • You could try just clicking on the select list, without the value = part, and sending the first few keys of your value (say ANDAMAN). Check that the value you want is highlighted, then send the Enter key. For this you will want to read about the sendKeys functionality of Selenium.
    – Nutle
    Nov 22 at 10:58

  • I tried this. Now I want to store the names from the drop-downs so that I can run a loop.
    – Paritosh Sharma
    Nov 29 at 12:07

  • Check out my answer, and please accept it if it does what you're looking for, to close the issue.
    – Nutle
    Nov 29 at 13:03















r for-loop web-scraping rvest rselenium






edited Nov 29 at 12:06
asked Nov 22 at 6:23
Paritosh Sharma

1 Answer
Starting from your code, locate the state drop-down:



stateEle<-remDr$findElement("id", "ContentPlaceHolder1_ddlState")


From here, to get all the option names for looping, use:



library(magrittr)
stateEle$getElementText()[[1]] %>% strsplit(., '\n')


This returns a list holding one character vector of option names. You can then unlist it and drop the "--Select--" placeholder:



stateEle$getElementText()[[1]] %>% strsplit(., '\n') %>% unlist %>% setdiff(., '--Select--')


Repeat this for all other select lists.
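
The same steps can be wrapped in a small helper and reused for each select list. What follows is only a sketch under a few assumptions: `get_options` is a hypothetical helper name, the option labels are made up, and `fakeEle` is a stand-in object that mimics the `getElementText()` method so the snippet runs without a browser. With a live session you would pass `stateEle`, `distEle` or `kvkEle` instead.

```r
# Hypothetical helper: return the option labels of a select element,
# dropping empty lines and the placeholder entry.
get_options <- function(ele, placeholder = "--Select--") {
  txt <- ele$getElementText()[[1]]                  # all labels in one string
  labels <- trimws(unlist(strsplit(txt, "\n", fixed = TRUE)))
  # setdiff() also removes duplicate labels, if there are any
  setdiff(labels[nzchar(labels)], placeholder)
}

# Stand-in for a webElement (assumption, for illustration only)
fakeEle <- list(
  getElementText = function() list("--Select--\nSTATE A\nSTATE B")
)

states <- get_options(fakeEle)

for (s in states) {
  # With a live session: stateEle$sendKeysToElement(list(s)), wait for the
  # next list to reload, then call get_options(distEle), and so on.
  message("Would select: ", s)
}
```

Keeping the cleanup in one function means the placeholder handling is written once, and the loop body only has to deal with selecting a value and reading the next list.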






answered Nov 29 at 12:42
Nutle
