Stream(ish) Tweets location to Google Earth

In a previous post I detailed how to plot the location of tweets on Google Earth. When running the code I noticed that changes made to the KML file from the R console were instantly displayed on Google Earth, without having to re-write the KML file or re-open Google Earth. So I thought about streaming and plotting the location of tweets in near real-time.

Featured packages: streamR, ROAuth, spacetime, sp and plotKML.

EDIT: This is now neatly wrapped into a package: tweets2earth. GitHub and manual.

Because I haven’t figured out how to plot the tweets while the stream is open, I have to open and close the stream every 5 seconds, hence “stream~ish”.

First we’ll load the libraries.

libs <- c("streamR", "ROAuth", "spacetime", "sp", "plotKML")
lapply(libs, library, character.only=TRUE)

Then we’ll have to get our token. Get the consumer_key and consumer_secret from your app created here.

requestURL <- "https://api.twitter.com/oauth/request_token"
accessURL <- "https://api.twitter.com/oauth/access_token"
authURL <- "https://api.twitter.com/oauth/authorize"
my_oauth <- OAuthFactory$new(consumerKey = "your_consumer_key", consumerSecret = "your_consumer_secret",
requestURL = requestURL, accessURL = accessURL, authURL = authURL)
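# Optional: fetch a CA bundle into the working directory. Note that the
# handshake below uses the bundle shipped with RCurl, not this file.
download.file(url="http://curl.haxx.se/ca/cacert.pem",
              destfile="cacert.pem")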
my_oauth$handshake(cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl"))
save(my_oauth, file = "my_oauth.Rdata") #Save oauth for future sessions
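
The saved file is what makes the handshake a one-off. A minimal sketch of the save and reload round trip, using a plain list as a stand-in for the real credential object and a temporary file rather than the working directory (both are assumptions made so the snippet runs without a Twitter app):

```r
# Stand-in for the real OAuth object (placeholder values, not real credentials)
my_oauth <- list(consumerKey = "your_consumer_key",
                 consumerSecret = "your_consumer_secret")

# Save it, as in the post, but to a temporary location for this sketch
oauth.file <- file.path(tempdir(), "my_oauth.Rdata")
save(my_oauth, file = oauth.file)

# Simulate a fresh session: the object is gone...
rm(my_oauth)

# ...and load() restores it under its original name
load(oauth.file)
exists("my_oauth")
```

In a later session, load("my_oauth.Rdata") would take the place of the handshake step, assuming the stored credentials are still valid.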

Now for the stream. First we set up the stopping condition of the while() loop: it runs until Sys.Date(), today’s date, + x days. I’ve set x to one, so the loop stops when the date rolls over. The connection to the Streaming API is opened for 5 seconds, then closed so that we can parse the tweets (from JSON). Why 5 seconds? Because we are limited to 180 connections per 15 minutes, which works out to one connection every 5 seconds. The if() statement checks whether any tweets were returned, so that an empty file does not break the loop.
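
The 5-second figure follows directly from the connection limit quoted above. A small worked check (the variable and function names are made up for illustration):

```r
# Figures quoted in the post: 180 connections per 15-minute window
max.connections <- 180
window.seconds  <- 15 * 60

# Shortest cycle length that stays within the limit
min.cycle <- window.seconds / max.connections
min.cycle

# Connections opened per window if one full cycle lasts `cycle` seconds
connections.per.window <- function(cycle) window.seconds / cycle
connections.per.window(5)
connections.per.window(6)  # a slightly slower cycle leaves headroom
```

Since each pass of the loop also spends some time parsing and plotting, the real cycle is a little longer than 5 seconds, which keeps the connection count under the limit.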

The stream opens and closes, so how far behind “real-time” are we exactly? There is a 30 to 60 second cache on Twitter’s servers, and parsing the tweets as well as plotting them is extremely fast, so I would say that we are roughly 36 to 66 seconds behind (cache + stream + parse, i.e. 30 to 60 + 5 + 1). Parsing this few tweets takes a fraction of a second (I haven’t run system.time() on it #lazy).
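
The parse time above is an estimate, not a measurement. A sketch of how one might time it with system.time(); the parse.stand.in() function is a hypothetical placeholder for parseTweets(json.file), used here only so the snippet runs without a captured file:

```r
# Stand-in workload: build a small data frame of 50 fake "tweets".
# In the real loop the timed expression would be parseTweets(json.file).
parse.stand.in <- function(n = 50) {
  data.frame(id = seq_len(n),
             text = paste("tweet", seq_len(n)),
             stringsAsFactors = FALSE)
}

# system.time() evaluates the expression and reports how long it took;
# the assignment inside still happens in the calling environment
timing <- system.time(fake <- parse.stand.in())

# Wall-clock seconds for the call
timing["elapsed"]
```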

We use the filterStream() function from the streamR package to get the tweets.

## date when loop will be stopped - System date + 1 day (up to 24 hours of streaming)
end.date <- Sys.Date() + 1
tw.df <- data.frame()

## continue running until current date is end.date
while (Sys.Date() < end.date){
  ## preparing file name so that it mentions the time, down to the second:
  ## filterStream() appends to an existing file, so each pass needs its own
  current.time <- format(Sys.time(), "%Y_%m_%d_%H_%M_%S")
  json.file <- paste("scotland_tweets_", current.time, ".json", sep="")
  ## capture tweets mentioning Scotland
  filterStream(file=json.file, track="scotland", oauth=my_oauth, timeout=5)
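  ## tweets parsed in this pass; stays empty if nothing was captured
  df <- data.frame()
  if (file.exists(json.file) && file.info(json.file)$size > 0) {
    df <- parseTweets(json.file)
    #Remove unknown locations (only the coordinates need to be present)
    df <- df[!is.na(df$place_lon) & !is.na(df$place_lat), ]
    #Bind all files
    tw.df <- rbind(tw.df, df)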
  } else {
    print ("no tweets found")
  }
  if (length(df$place_lon) >= 1) {
    #get icon
    shape = "http://maps.google.com/mapfiles/kml/pal2/icon18.png"
    #bloody timestamp: kept in its own vector so tw.df keeps the same
    #columns as df and the rbind() above still works on the next pass
    tw.time <- as.POSIXct(strptime(substr(tw.df$created_at, 12, 19), "%H:%M:%S"))
    #Plot
    sp <- SpatialPoints(tw.df[,c("place_lon","place_lat")])
    proj4string(sp) <- CRS("+proj=longlat +datum=WGS84")
    df_st <- STIDF(sp, time = tw.time, data = tw.df[,c("name","retweet_count")])
    #Plot KML - opens Google Earth on loop[1] then updates for following
    plotKML(df_st, dtime = 24*3600, points_names=tw.df$name, LabelScale = .4)
  } else {
    Sys.sleep(1)
    print("No geo-located tweets found, restart stream")
  }
}
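
Two steps inside the loop, dropping tweets without a location and deriving a time stamp from created_at, can be tried offline. The sketch below uses a toy data frame that mimics a few parseTweets() columns; every value in it is made up for illustration.

```r
# Toy data frame mimicking a few parseTweets() columns (values are made up)
toy <- data.frame(
  name       = c("a", "b", "c"),
  created_at = c("Tue Jan 13 20:09:00 +0000 2015",
                 "Tue Jan 13 20:09:03 +0000 2015",
                 "Tue Jan 13 20:09:04 +0000 2015"),
  place_lon  = c(-3.19, NA, -4.25),
  place_lat  = c(55.95, NA, 55.86),
  in_reply_to_screen_name = c(NA, NA, "someone"),
  stringsAsFactors = FALSE
)

# complete.cases() looks at every column, so an NA anywhere drops the row,
# including tweets that do have coordinates
strict <- toy[complete.cases(toy), ]
nrow(strict)

# Testing only the coordinate columns keeps every geo-located tweet
located <- toy[!is.na(toy$place_lon) & !is.na(toy$place_lat), ]
nrow(located)

# Characters 12 to 19 of created_at hold the HH:MM:SS part
hms <- substr(located$created_at, 12, 19)
hms

# strptime() without a date component fills in today's date
tw.time <- as.POSIXct(strptime(hms, "%H:%M:%S"))
format(tw.time, "%H:%M:%S")
```

Because only the clock time is kept, every tweet is stamped with the date on which the code runs, which is worth keeping in mind if the loop is left running across midnight.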

The above will “stream” tweet locations to Google Earth. Please let me know if you have an idea how to process the tweets while the stream is open.

Discussion

4 thoughts on “Stream(ish) Tweets location to Google Earth”

  1. One way to circumvent the issue and achieve near real-time is to open another instance of R and have it periodically read the json.file and push the data to Google Earth.

    Hope this helps

    Posted by Stephane | March 7, 2015, 8:09 pm
