html - Data Mining from Tripadvisor: source code information with no attribute


I am attempting to mine review data from TripAdvisor. I am following Hadley Wickham's code (found here). I have got it working for hotel reviews.

However, when I apply it to my case (e.g. Pichavaram Mangrove Forest), the dates all come out as NAs. I have found that the problem is that the dates in the reviews' source code do not have the attribute 'title'. None of the sites I am searching have an attribute tag for the date information. Rather, when I view the pages' source code, the dates are found in the following line:

    <span class="ratingDate">Reviewed 16 May 2015

Does anyone know how I can modify the code to fetch the date information? The date scraping section of Hadley's code is:

    date <- reviews %>%
      html_node(".rating .ratingDate") %>%
      html_attr("title") %>%
      strptime("%b %d, %Y") %>%
      as.POSIXct()

I am new to R (and coding in general), so I would appreciate any help.

It's not fair to expect the examples to work 100% of the time, given that websites are always changing.

Anyhow... here's a solution that works today...

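    library("rvest")

    url <- "http://www.tripadvisor.com/attraction_review-g790280-d2408767-reviews-pichavaram_mangrove_forest-chidambaram_tamil_nadu.html"

    # read_html() is the current name for what older rvest releases called html()
    read_html(url) %>%
      html_node(".rating .ratingDate") %>%
      # the date is in the element text, not in a title attribute
      html_text() %>%
      # the literal "Reviewed" prefix is part of the format string
      strptime("Reviewed %B %d, %Y") %>%
      as.POSIXct()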
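To see why the original code returned NA and why reading the text fixes it, here is a minimal offline sketch. It assumes markup modelled on the line quoted in the question (the wrapping `div` and the variable names are made up for illustration, not taken from a live page), and it uses a day-first format because the quoted date is "16 May 2015" rather than "May 16, 2015". The format would need adjusting to whatever the page being scraped actually shows.

```r
library(rvest)

# Assumed markup, modelled on the line quoted in the question (not a live page)
snippet <- read_html(
  '<div class="rating"><span class="ratingDate">Reviewed 16 May 2015</span></div>'
)
node <- html_node(snippet, ".rating .ratingDate")

# The span has no title attribute, so this is NA and any date parsed from it is NA too
title_attr <- html_attr(node, "title")

# The date is in the element text instead
date_text <- html_text(node)

# Day-first format; %B relies on English month names in the current locale
date <- as.POSIXct(strptime(date_text, "Reviewed %d %B %Y", tz = "UTC"))
```

If the dates still come out as NA, printing `date_text` first is probably the quickest check, since the format string has to match it exactly, including the "Reviewed" prefix and any comma.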
