rvest - R - Web Scrape of job board


I am trying to get a list of companies and jobs into a table from indeed.com's job board.

I am using the rvest package with a URL base of http://www.indeed.com/jobs?q=proprietary+trader&

    install.packages("gtools")
    install.packages("rvest")
    library(rvest)
    library(gtools)

    mydata <- read.csv("setup.csv", header = TRUE)

    url_base <- "http://www.indeed.com/jobs?q=proprietary+trader&"
    names <- mydata$page

    results <- data.frame()
    for (name in names) {
      url <- paste0(url_base, name)

      # read_html() replaces the older html() function in rvest
      title.results <- url %>%
        read_html() %>%
        html_nodes(".jobtitle") %>%
        html_text()

      company.results <- url %>%
        read_html() %>%
        html_nodes(".company") %>%
        html_text()

      # attempted concatenation of the per-page results
      results <- smartbind(company.results, title.results)
      results3 <- data.frame(company = company.results, title = title.results)
    }

    new <- results(company = company, title = title)

I am looping over the pages and trying to concatenate the results. For some reason it is not grabbing all of the jobs, and it is mixing up companies and jobs.

It might be because you make two separate requests for the page. You should change the middle part of your code to:

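    # Fetch and parse the page once, then reuse the parsed document
    # (read_html() replaces the older html() function in rvest)
    page <- url %>%
      read_html()
    title.results <- page %>%
      html_nodes(".jobtitle") %>%
      html_text()
    company.results <- page %>%
      html_nodes(".company") %>%
      html_text()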

When I do that, it seems to give me 10 jobs, and the companies match. Otherwise, can you give an example of a query URL that doesn't work?
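
For the looping and concatenation part of the question, one possible approach is to build a small data frame per page and bind them all at the end. The following is only a minimal sketch under some assumptions: `extract_jobs` is a hypothetical helper name, the `.result` selector for a single job card is a guess about the page markup (only `.jobtitle` and `.company` come from the question), and the inline HTML strings stand in for real pages so the example runs without a network request. Selecting within each card means a job with no company yields `NA` instead of shifting the two columns out of alignment.

```r
library(rvest)

# Hypothetical helper: one row per job card from an already-parsed page.
# ".result" is an assumed selector for a job card; ".jobtitle" and
# ".company" are the selectors used in the question.
extract_jobs <- function(page) {
  cards <- html_nodes(page, ".result")
  data.frame(
    # html_node() (singular) returns one match per card, or a missing
    # node that html_text() turns into NA
    company = html_text(html_node(cards, ".company"), trim = TRUE),
    title   = html_text(html_node(cards, ".jobtitle"), trim = TRUE),
    stringsAsFactors = FALSE
  )
}

# Stand-in HTML (made up for illustration) instead of live requests.
sample_pages <- c(
  '<div class="result"><a class="jobtitle">Trader</a><span class="company">Acme</span></div>
   <div class="result"><a class="jobtitle">Quant</a></div>',
  '<div class="result"><a class="jobtitle">Analyst</a><span class="company">Globex</span></div>'
)

# Parse each page exactly once, extract its rows, then bind all pages.
per_page <- lapply(sample_pages, function(html) extract_jobs(read_html(html)))
results <- do.call(rbind, per_page)
print(results)
```

With real pages, `sample_pages` would presumably be replaced by the URLs built from `paste0(url_base, names)`, since `read_html()` also accepts a URL. Whether `.result` matches the actual job cards would need to be checked against the live markup.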

