TwitterMap documentation
TwitterMap has three components:
- Resolves the geolocation of tweets from raw JSON files in twittermap/gnosis/src/main/resources/raw/.
- Gets streamed tweets, parses the JSON result, resolves the geotag by asking Gnosis, then ingests the tweets into AsterixDB.
- Following the Play Framework requirement, the HTML source is located in twittermap/app/views/index.scala.html. It is a scripted HTML file (.scala.html) that is rendered by the framework. Since we use Angular to control the main logic, we keep few scripts in this file; the main logic is implemented in JavaScript.

The JavaScript code is located in the twittermap/web/public/javascripts/ folder. app.js is the entry point. Each front-end component is implemented as an Angular directive. The purpose of each folder is described below.
The common module defines an Angular service that communicates with the back-end server by sending JSON requests over a WebSocket connection. It defines:
- a query function that can be called to send JSON requests to the Neo server;
- a ws.onmessage function that receives JSON messages from the Neo server and updates the corresponding global values.
Examples of how to query Cloudberry are available at Query Cloudberry.
It also defines several global values (e.g. mapResults, timeResults) to store the results. The UI of dependent modules can be bound to specific values by using the Angular watch function.
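The request and response flow of the common module can be pictured with a minimal sketch. This is plain JavaScript rather than the actual Angular service: createCommonService is a hypothetical name, and the { key, value } message shape is an assumption made only for illustration. The socket argument is any object with a send method and an assignable onmessage handler, which is the shape of the browser WebSocket API.

```javascript
// Sketch only: a plain-JavaScript stand-in for the Angular service in the common module.
// createCommonService and the { key, value } message shape are assumptions.
function createCommonService(socket) {
  const service = {
    // Global values that UI components can watch.
    mapResults: [],
    timeResults: [],

    // Send a JSON request to the server over the socket.
    query(request) {
      socket.send(JSON.stringify(request));
    },
  };

  // Receive a JSON message and update the matching global value.
  socket.onmessage = (event) => {
    const message = JSON.parse(event.data);
    const known = Object.prototype.hasOwnProperty.call(service, message.key);
    // Only overwrite result values, never the query function itself.
    if (known && typeof service[message.key] !== 'function') {
      service[message.key] = message.value;
    }
  };

  return service;
}
```

Passing the socket in, rather than creating it inside the service, keeps the sketch runnable without a server: a fake object with send and onmessage is enough to exercise it.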
The map directive is implemented by extending the existing Angular leaflet-directive. Initially, it loads the state and county shapes by requesting the resource files from the Neo server. Then, whenever the map is zoomed in, zoomed out, or dragged, it calls the query function in the common module. It also watches the mapResults value so that the draw function is called whenever the results change.
The directive that controls the search box and autocomplete.
The directive that shows the time series chart, implemented using dc.js.
It controls the hashtag and the live tweets.
Cache is an Angular service that provides city polygon data to the map directive. It caches the city polygons requested by users. The next time a user requests data that is already in the cache, the response is served from the cache rather than by sending an HTTP request to the middleware. If the requested data is not in the cache, the cache requests data for the requested area plus some surrounding region (pre-fetching) from the middleware and stores the result. If the user then requests a nearby region, it will already be in the cache. This reduces the number of requests to the middleware and speeds up rendering when a user's requests are concentrated on a particular area.
The data structure that stores the GeoJSON data is an rTree. When the cache becomes full, we completely empty the cache and start over. For cache replacement, we consider both temporal and spatial data before removing a region.
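The hit, miss, pre-fetch and flush behaviour can be sketched as follows. This is an illustration under several assumptions, not the service's code: a plain array stands in for the rTree, fetchPolygons is a hypothetical synchronous callback in place of the HTTP request to the middleware, and the margin and maxPolygons options are invented for the example. A box is { minX, minY, maxX, maxY }.

```javascript
// Sketch only: an array replaces the rTree, and fetchPolygons is a hypothetical
// synchronous callback standing in for the HTTP request to the middleware.
function contains(outer, inner) {
  return outer.minX <= inner.minX && outer.minY <= inner.minY &&
         outer.maxX >= inner.maxX && outer.maxY >= inner.maxY;
}

function intersects(a, b) {
  return a.minX <= b.maxX && b.minX <= a.maxX &&
         a.minY <= b.maxY && b.minY <= a.maxY;
}

function createCityCache(fetchPolygons, { margin = 1, maxPolygons = 1000 } = {}) {
  let fetchedBoxes = []; // regions already requested from the middleware
  let polygons = [];     // cached items of the form { id, box }

  return {
    get(box) {
      const covered = fetchedBoxes.some((fetched) => contains(fetched, box));
      if (!covered) {
        // Cache miss: request the area plus a surrounding margin (pre-fetching).
        const expanded = {
          minX: box.minX - margin, minY: box.minY - margin,
          maxX: box.maxX + margin, maxY: box.maxY + margin,
        };
        const fetched = fetchPolygons(expanded);
        // When the cache would overflow, empty it completely and start over.
        if (polygons.length + fetched.length > maxPolygons) {
          fetchedBoxes = [];
          polygons = [];
        }
        const known = new Set(polygons.map((p) => p.id));
        polygons.push(...fetched.filter((p) => !known.has(p.id)));
        fetchedBoxes.push(expanded);
      }
      // Serve the answer from the cached polygons.
      return polygons.filter((p) => intersects(p.box, box));
    },
  };
}
```

A request counts as a hit only when it lies entirely inside a region fetched earlier, which is why the margin matters: the larger the pre-fetched border, the more nearby pans and zooms are answered without another request.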
The Play Framework documentation may be helpful.
- Visit WebJars and search for the WebJars version of the library you want to use.
- Copy the line in the "Build Tool" column (the build tool we use is sbt) to "cloudberry/examples/twittermap/project/dependencies.scala".
- Stop the twittermap server and restart it. The build tool will download the new library.
- Add the required .js file to the head tag of "cloudberry/examples/twittermap/web/app/views/twittermap/main.scala.html". If you don't know where the .js file is located, check the folder "cloudberry/examples/twittermap/web/target/web/web-modules/main/webjars/lib".
- After that, you can use the library in your code.
An experimental demo that makes each state clickable.
To use AsterixDB's data feed, we need to open a socket using AQL to listen for connections. For an example AQL script, see cloudberry/examples/twittermap/noah/src/main/resources/twitter/aql/feed.aql. Then create a socket adapter client to connect to AsterixDB's socket and send records to AsterixDB through it.
FeedSocketAdapterClient opens a socket connection to AsterixDB and sends records through it. It contains three important functions:
- initialize(): should be called after creating a new FeedSocketAdapterClient object. It sets up the socket connection with AsterixDB.
- ingest(String record): sends a record to AsterixDB through the socket.
- finalized(): should be called after the feed ends. It closes the socket.
Both FileFeedDriver and TwitterFeedStreamDriver create a FeedSocketAdapterClient object and call its ingest function to send records to AsterixDB.
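The life cycle of the three calls (initialize, ingest, finalized) can be illustrated with a small Node.js stand-in. This is a sketch under stated assumptions, not the FeedSocketAdapterClient class itself: the class name FeedSocketClientSketch and the connect parameter are hypothetical, and the record is written exactly as given, so any framing such as a trailing newline is left to the caller.

```javascript
const net = require('net');

// Sketch only: a Node.js stand-in that mirrors the three calls described above.
// The connect parameter is a hypothetical hook so a fake socket can replace net.connect.
class FeedSocketClientSketch {
  constructor(host, port, connect = net.connect) {
    this.host = host;
    this.port = port;
    this.connectFn = connect;
    this.socket = null;
  }

  // Set up the socket connection; call once after constructing the object.
  initialize() {
    const connect = this.connectFn;
    this.socket = connect(this.port, this.host);
  }

  // Send one record through the socket, exactly as given.
  ingest(record) {
    if (this.socket === null) {
      throw new Error('initialize() must be called before ingest()');
    }
    this.socket.write(record);
  }

  // Close the socket after the feed ends.
  finalized() {
    if (this.socket !== null) {
      this.socket.end();
      this.socket = null;
    }
  }
}
```

The guard in ingest makes the required call order explicit: a record sent before initialize fails immediately instead of being silently lost.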
FileFeedDriver feeds data from an ADM file to AsterixDB. First, it initializes a FeedSocketAdapterClient. Then, it reads records from the file line by line and calls FeedSocketAdapterClient.ingest to send each record to AsterixDB.
To use the FileFeedDriver, run fileFeed.sh from cloudberry/examples/twittermap/script.
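The file-feeding loop can be sketched in a few lines. The helper feedFile is hypothetical, and client stands for any object that offers the initialize(), ingest(record) and finalized() functions described above. For brevity the sketch reads the whole file at once and skips blank lines; both choices are assumptions, and a large file would normally be streamed.

```javascript
const fs = require('fs');

// Sketch only: feedFile is a hypothetical helper, and `client` is any object with
// initialize(), ingest(record) and finalized() functions.
function feedFile(path, client) {
  client.initialize();
  let count = 0;
  try {
    // Reading the whole file is enough for a sketch; a large file would be streamed.
    const lines = fs.readFileSync(path, 'utf8').split(/\r?\n/);
    for (const line of lines) {
      if (line.trim() === '') continue; // assumption: blank lines are not records
      client.ingest(line);
      count += 1;
    }
  } finally {
    // Close the connection even if reading or ingesting fails.
    client.finalized();
  }
  return count; // number of records sent
}
```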
TwitterFeedStreamDriver is the current pipeline, which fetches real-time Twitter data and feeds it to AsterixDB. The procedure is:
- Use the Twitter streaming API to fetch real-time Twitter data.
- For every tweet, geotag it and convert it from JSON format to ADM format.
- Call FeedSocketAdapterClient.ingest to send the record to AsterixDB.
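The per-tweet part of the steps above can be sketched as a single function. Everything here is illustrative: processTweet, geotag and toAdm are hypothetical names supplied by the caller, and the real geotagging and JSON-to-ADM conversion are not shown.

```javascript
// Sketch only: geotag and toAdm are hypothetical helpers passed in by the caller;
// `client` is any object with an ingest(record) function.
function processTweet(rawJson, { geotag, toAdm, client }) {
  const tweet = JSON.parse(rawJson); // parse the streamed JSON result
  const tagged = geotag(tweet);      // resolve the tweet's geotag
  const record = toAdm(tagged);      // convert from JSON format to ADM format
  client.ingest(record);             // send the record to AsterixDB
  return record;
}
```

Keeping the helpers as parameters lets each stage be replaced or tested on its own, for example by passing a client that only collects records in an array.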
To use TwitterFeedStreamDriver, modify and run streamFeed.sh as per the instructions here