Data Collection
This page really needs one of those old-school "under construction" images (and, for this weekend, an "under snow" one too).
A warning
The first thing to note is that Twitter does not guarantee that it returns all matching Tweets, whatever method is used (unless you have direct access to their "fire hose", which I don't). There is no public information, to my knowledge, about exactly how Twitter determines which tweets to include or exclude; the best I have found is the documentation on Twitter Search Best Practises. This filtering should be taken into account when looking at the analysis presented here. You can see its effect when comparing the values I get for the most-popular retweets with those from Twitter: my retweet counts are up to 10 percent lower.
What is a sensible search?
A few years ago there was discussion amongst the attendees whether to use #AAS or #AAS<meeting number>, but fortunately the AAS have been promoting the use of the latter for the last few meetings. This makes data collection easier (assuming people read the tweets of the AAS Office!). As I also wanted to follow the participation in the first ever hack day at AAS, I decided to track the following three terms: aas221, aas and 221, and hackaas. Note that case does not matter, that I am not actually searching on the hash tags, and having the middle term means that I would select a tweet saying "The 221st meeting of the AAS was awesome!". An open question is how much irrelevant material was returned by this approach; anecdotal evidence suggests the rate is low but I should try to quantify this.
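As a rough illustration, the three terms can be combined into a single value for the Streaming API's track parameter, where commas separate alternatives (OR) and spaces within a phrase require all the words (AND). This is a Python sketch rather than the Haskell code used by astrosearch, and the helper name track_parameter is made up for this example.

```python
def track_parameter(phrases):
    """Join phrases into one track value.

    Commas act as OR between phrases; spaces inside a phrase act as AND.
    Case does not matter to the search, so everything is lower-cased.
    """
    cleaned = []
    for phrase in phrases:
        terms = phrase.lower().split()
        if terms:  # skip empty phrases
            cleaned.append(" ".join(terms))
    return ",".join(cleaned)


# The three terms tracked for this meeting.
track = track_parameter(["aas221", "aas 221", "hackaas"])
print(track)
```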
As an aside, the use of #AAS for the meetings, although having the advantage of saving 3 characters, can lead to a lot of noise due to unrelated tweets. I did not try it this year but it has been the case in previous years.
For users following along to a meeting, I would suggest using the search #aas<meeting number> -RT to hide re-tweets, since they form a significant fraction of the volume, but as I am interested in seeing what is re-tweeted, and by whom, I've left these in.
How did I search?
Unlike for the AAS 219 meeting, my primary analysis was based on a search using the Twitter Streaming API. The code to do this is available at my astrosearch Bitbucket account. The process requires running the astroserver, which acts as the database, and then astrosearch, which deals with the Twitter search. Building it is likely to be non-trivial since it is written in Haskell, which most Astronomers will not have installed on their system, and relies on several patched modules, which are listed in the README. In previous searches this approach was not robust; rather than fixing the issues I ended up treating the search service as a daemon which would be automatically restarted whenever it crashed. This only happened once during the run, much later than the main conference, and led to a down time of less than 2 seconds.
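The "restart it whenever it crashes" approach can be sketched in a few lines of Python. This is only an illustration of the idea, not the mechanism I actually used: the function name supervise, the pause between restarts, and the example command line are all assumptions.

```python
import subprocess
import time


def supervise(command, max_restarts=None, pause=1.0, run=subprocess.call):
    """Run command, restarting it each time it exits with a non-zero status.

    Returns the number of restarts once the command exits cleanly.
    The run argument exists so the loop can be tried out without
    launching a real process.
    """
    restarts = 0
    while True:
        status = run(command)
        if status == 0:
            return restarts
        if max_restarts is not None and restarts >= max_restarts:
            raise RuntimeError("giving up after %d restarts" % restarts)
        restarts += 1
        time.sleep(pause)  # brief pause so a broken service does not spin


# Hypothetical usage; the real astrosearch arguments are not shown here.
# supervise(["astrosearch"])
```

Keeping the pause short matters for a streaming search, since any tweets sent while the service is down are lost.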
Unlike previous meetings I did not run a search using the Twitter REST API with my grabtweets code since it was not needed. However, I did take advantage of the TAGSExplorer archive and visualization system which does use the REST API.
I did not use the Archivist site since this service seems to have evolved since I last used it, and didn't offer something that met my needs.
When was the search run?
The search started at 2013-01-02 14:20:56.941684 UTC and ended around Fri Feb 1 20:34:01 EST 2013 (apologies for the mix in precision, time zone, and format). As explained below there are actually two tweets included in the dataset that were made before the search started; they have been left in since they are not going to significantly bias the results of any analysis I present.
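To put the two time stamps on the same footing, here is a small Python sketch that converts both to UTC and works out how long the search ran. It assumes EST means a fixed offset of UTC-5, and the function names parse_start and parse_end are just for this example.

```python
from datetime import datetime, timedelta, timezone

# Assumption: EST is a fixed five hours behind UTC.
EST = timezone(timedelta(hours=-5), "EST")


def parse_start(text):
    """Parse a stamp such as '2013-01-02 14:20:56.941684 UTC'."""
    stamp = datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f UTC")
    return stamp.replace(tzinfo=timezone.utc)


def parse_end(text):
    """Parse a stamp such as 'Fri Feb 1 20:34:01 EST 2013'."""
    stamp = datetime.strptime(text, "%a %b %d %H:%M:%S EST %Y")
    return stamp.replace(tzinfo=EST)


start = parse_start("2013-01-02 14:20:56.941684 UTC")
end = parse_end("Fri Feb 1 20:34:01 EST 2013")

print(end.astimezone(timezone.utc).isoformat())
print(end - start)
```

Since the end time is only approximate, the result should be read as roughly 30 days and 11 hours of searching.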
Transformation
Since I used the Twitter Streaming API, the search continually produced results, which were written to disk as JSON (in previous versions I had tried converting them to Haskell data structures but to simplify things I did no processing in the search program). At intervals I would process all the matches, creating an RDF graph of the results, which was passed to a 4store instance. This instance was then queried using SPARQL to produce the results, as described in the Analysis section.
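To give a flavour of the JSON-to-RDF step, here is a minimal Python sketch that turns one stored tweet into N-Triples statements. It is not the Haskell code I used: the example.org namespaces and the property names (author, text, retweetOf) are placeholders invented for this example, and only the id_str, text, user and retweeted_status fields from Twitter's JSON are read.

```python
import json

# Placeholder namespaces, made up for this sketch.
TWEET = "http://example.org/tweet/"
USER = "http://example.org/user/"
VOCAB = "http://example.org/vocab#"


def literal(text):
    """Encode a string as an N-Triples literal."""
    escaped = (text.replace("\\", "\\\\")
                   .replace('"', '\\"')
                   .replace("\n", "\\n")
                   .replace("\r", "\\r"))
    return '"' + escaped + '"'


def tweet_to_ntriples(line):
    """Turn one line of stored JSON into a list of N-Triples statements."""
    status = json.loads(line)
    subject = "<" + TWEET + status["id_str"] + ">"
    author = "<" + USER + status["user"]["screen_name"] + ">"
    triples = [
        (subject, "<" + VOCAB + "author>", author),
        (subject, "<" + VOCAB + "text>", literal(status["text"])),
    ]
    original = status.get("retweeted_status")
    if original is not None:
        # Link the retweet to the tweet it repeats.
        target = "<" + TWEET + original["id_str"] + ">"
        triples.append((subject, "<" + VOCAB + "retweetOf>", target))
    return [" ".join(triple) + " ." for triple in triples]
```

Output in this form can be loaded into a triple store and then queried with SPARQL.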
I used the 1.1 POST statuses/filter API for the search, using the track parameter. When transforming the JSON from Twitter, two types were processed: tweets and retweets (although you can see other message types, I didn't). The retweets include full information on the original tweet; as well as letting me link the two tweets together in the RDF this let me find a few tweets which had not been matched by Twitter. Two of these are due to re-tweets of messages which were sent before the search was started.
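The way the "missing" tweets fall out of the retweets can be sketched as follows. Again this is illustrative Python rather than my Haskell code, and the function names are invented; it relies only on the id_str and retweeted_status fields of Twitter's JSON, and treats anything that is not a tweet or retweet (such as delete or limit notices) as "other".

```python
import json


def classify(message):
    """Label a decoded streaming message as 'retweet', 'tweet' or 'other'."""
    if "retweeted_status" in message:
        return "retweet"
    if "id_str" in message and "text" in message:
        return "tweet"
    return "other"  # e.g. delete or limit notices, which are skipped


def find_missing(lines):
    """Return the original tweets seen only inside retweets, keyed by id."""
    seen = set()    # ids of everything the search delivered directly
    embedded = {}   # originals carried inside retweets
    for line in lines:
        message = json.loads(line)
        kind = classify(message)
        if kind == "other":
            continue
        seen.add(message["id_str"])
        if kind == "retweet":
            original = message["retweeted_status"]
            embedded[original["id_str"]] = original
    return {tid: tweet for tid, tweet in embedded.items() if tid not in seen}
```

Grouping the returned tweets by their user's screen_name then gives a per-author list of what the search missed.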
Missing tweets and the AAS
Overall, 30 "missing" tweets were found, so it is not a huge number, but they do indicate a phenomenon observed at this meeting (and which I have also seen with other AAS meetings), namely that the AAS Twitter accounts do not seem to have achieved enough "twitter-cred" to be included in searches:
twitter experts: @aas_office is having some visibility issues. they've been tweeting but not showing up in #aas221 search. can u help?
— Kelle Cruz (@kellecruz) January 10, 2013
Below are the missing tweets, grouped by author, and excluding the two that were from before the main search started.
Note that all the posts from AAS_Office and AAS_Press were missed by the search (i.e. they were only found because they were retweeted). This means I will have missed any posts from these two accounts that were not retweeted.
- AAS Executive Office
  - Explore Long Beach #aas221 visitlongbeach.com
    — AAS Executive Office (@AAS_Office) January 5, 2013
  - We're trending!! #aas221 twitter.com/AAS_Office/sta…
    — AAS Executive Office (@AAS_Office) January 7, 2013
  - Reminder: Enter to win $100 AMEX GC by following us before 8pm tonight @aas_office. Must be registered at #aas221 #randomdrawing
    — AAS Executive Office (@AAS_Office) January 8, 2013
  - SPS Evening of Undergraduate Science #aas221 twitter.com/AAS_Office/sta…
    — AAS Executive Office (@AAS_Office) January 9, 2013
  - Trouble in the Blue today at 12:45pm MOVED to Ballroom E #aas221
    — AAS Executive Office (@AAS_Office) January 9, 2013
  - Tonight's Space Science & Public Policy Talk at 8:00pm MOVED to 103B #aas221
    — AAS Executive Office (@AAS_Office) January 9, 2013
  - Battery running low? Check out the new charging station at #aas221 sponsored by #northropgrumman
    — AAS Executive Office (@AAS_Office) January 10, 2013
  - 2013 Rodger Doxsey Travel Prize Winners & Runner-Ups #aas221 aas.org/grants/rodger_…
    — AAS Executive Office (@AAS_Office) January 18, 2013
- AAS Press Office
  - FYI: Correct hashtag for the 221st American Astronomical Society (AAS) meeting now under way in Long Beach, CA, is #aas221, not #aas.
    — AAS Press Office (@AAS_Press) January 6, 2013
  - NRAO: Massive Outburst in Neighbor Galaxy [NGC 660] Surprises Astronomers. #aas221 tinyurl.com/a38vmlk
    — AAS Press Office (@AAS_Press) January 7, 2013
  - #aas221 press-conference webcast problems seem to be behind us now. tinyurl.com/bkkwfb6
    — AAS Press Office (@AAS_Press) January 7, 2013
  - CfA: At Least One in Six Stars Has an Earth-Sized Planet.#AAS221tinyurl.com/axcy6vn
    — AAS Press Office (@AAS_Press) January 7, 2013
  - Did you know that AAS press conferences are open to all attendees? We're in room 204, Long Beach Convention Center. #aas221
    — AAS Press Office (@AAS_Press) January 7, 2013
  - JPL: NASA'S Kepler Discovers 461 New Planet Candidates.#AAS221tinyurl.com/bfvp4tx
    — AAS Press Office (@AAS_Press) January 8, 2013
  - NASA/CXC: New Chandra Movie Features Neutron Star Action [Vela pulsar's jet]. #aas221 tinyurl.com/ac94jzb tinyurl.com/a5rure2
    — AAS Press Office (@AAS_Press) January 8, 2013
  - NWU: Radio wave technique uncovers shadows of clouds and stars in Milky Way’s center. #aas221 tinyurl.com/by4dlo9
    — AAS Press Office (@AAS_Press) January 8, 2013
  - UTA: UT Arlington Researchers Try new Approach For Simulating Supernovas #AAS221goo.gl/k0xgo
    — AAS Press Office (@AAS_Press) January 8, 2013
  - NASA/JPL: NASA, ESA Telescopes Find Evidence for Asteroid Belt Around Vega. #aas221 tinyurl.com/b2jf8ry
    — AAS Press Office (@AAS_Press) January 8, 2013
  - UCB: Exocomets may be as common as exoplanets. #aas221 tinyurl.com/avfg6w5
    — AAS Press Office (@AAS_Press) January 8, 2013
  - CfA: First "Bone" of the Milky Way Identified. #AAS221tinyurl.com/ajuyx4j
    — AAS Press Office (@AAS_Press) January 8, 2013
  - Including on-site registrants, attendee count at #aas221 AAS meeting in Long Beach is now 2,929. aas.org/meetings/aas221
    — AAS Press Office (@AAS_Press) January 9, 2013
  - This morning's AAS press conference (10:30 am, Room 204) is on supernovae & dark energy & features Nobel laureate Saul Perlmutter. #aas221
    — AAS Press Office (@AAS_Press) January 9, 2013
  - LBNL: The Farthest Supernova Yet for Measuring Cosmic History. #aas221 tinyurl.com/a35pwlj
    — AAS Press Office (@AAS_Press) January 9, 2013
  - NRAO: Mapping the Milky Way - Radio Telescopes Give Clues to Structure, History. #aas221 tinyurl.com/awkjcca
    — AAS Press Office (@AAS_Press) January 9, 2013
  - Gemini: Next-Generation Adaptive Optics Brings Remarkable Details to Light in Stellar Nursery. #aas221 gemini.edu/node/11925
    — AAS Press Office (@AAS_Press) January 9, 2013
  - Keck: Surprise! Earth-sized Planets Are Common. #aas221 keckobservatory.org/news/surprise_…
    — AAS Press Office (@AAS_Press) January 10, 2013
  - Caltech: A Cloudy Mystery - A puzzling cloud near the galaxy's center may hold clues to how stars are born. #aas221 caltech.edu/content/cloudy…
    — AAS Press Office (@AAS_Press) January 11, 2013
  - AAS: News-briefing videos from #aas221 in Long Beach, Jan. 7-10, are now on our archived-press-conferences page: aas.org/press/archived…
    — AAS Press Office (@AAS_Press) January 22, 2013
So, what should the AAS Twitter accounts do?
So, it looks like the AAS accounts need to improve their "Twitteriness", presumably by tweeting regularly outside the conference and by being involved in conversations (i.e. replying to, and being replied to by, other accounts), although this is a guess on my part. I wonder whether other scholarly societies see this (or have seen this)?
Analysis
To write.
Credits
The data collection and analysis code is written in Haskell, using version 7.4.2 of the GHC Haskell compiler, and uses a bunch of packages from the Haskell package database (Hackage).
The visualizations presented on this web site use the d3.js JavaScript library to create groovy data-driven documents. I have also used Gephi and BioFabric to visualize and explore the user network (i.e. the hair ball and matrix views).
Last, but not least, thank you to all the Astronomers who use Twitter to discuss the meeting, and those who followed along.