A first attempt at an experiment, and not a particularly rigorous one at that, in tracking information flows through Twitter.
On Monday afternoon (31 August), Australian time, a new YouTube video was publicised*. There’s nothing particularly unusual in that, except that this particular video concerned Perth. The capital city of Western Australia, Perth is both extremely isolated and not always seen as the most exciting of places – it is often scathingly referred to using terms such as ‘Dullsville’. So, when a three-minute video mocking aspects of Perth life and making up other information (possibly qualifying as what John Hartley describes as silly citizenship, but that’s for another time) hit YouTube, it quickly spread through Twitter, Facebook, and into the blogosphere, as Perth locals and expats (of which I am one) became aware of it.
So, this gently mocking, amusing video was made, and people watched it and told their friends. This can be tracked anecdotally: my personal experience of the video started at around 5pm Brisbane time, when Tama re-tweeted the link to the video. (All times from now on are Brisbane time, even though this concerns Perth data – what I grabbed from Twitter was in my local time, and since I was collecting the data manually I did not want to overcomplicate things by converting times. For Perth time, subtract two hours from Brisbane time.) At this point, the RT was at least three steps down the line from its source, and the video itself was at around 350 views. Within a couple of hours, it had appeared three times on my friend feed in Facebook; within 24 hours it was up to 9000 views on YouTube, within 48 hours it had reached 35,000, and it was at over 48,000 views at the time of writing. Links were also appearing in friends’ blog posts, and as the video spread, the media coverage grew too**. However, this isn’t the most precise or admissible way of measuring what had happened.
The most visible signs of people noticing the video and telling other people, at least from Brisbane, were on Twitter and Facebook. Searching Facebook for data was not the most successful of tasks; indeed, the variety of privacy settings can make content such as posted links hard to locate. Casually browsing LiveJournal posts and using blog search engines provided more results, but the re-tweeting activity on Twitter was the most immediately enticing option – it may be advantageous to return to the blogs and grab that data too, for comparison, but for now the only data source is Twitter.
The data set covering ‘This is Perth’-related tweets was obtained through multiple searches of Twitter, repeated over a couple of days to track new tweets. Without claiming to be exhaustive, these searches attempted to locate as many as possible of the tweets made between 31 August and 3 September that linked to the video, or discussed it or the articles on the West and PerthNow covering it. Search terms included ‘This is Perth’, #thisisperth, and the various bit.ly and tinyurl addresses linking to the video, while further tweets were found by following the RT trails. The advantage of Twitter over Facebook was the prevalence of publicly accessible tweets; where locked posts were found, they were not included in the sample. However, if an RT included a user who had locked posts, the user was still included in the network created to show, where possible, the Twitter users acting as source nodes and hubs.
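Following the RT trails was done by hand here, but the step could be sketched in code. The following is only a minimal illustration, assuming tweets use the old-style "RT @user" convention; the function names and the sample tweet are hypothetical and are not part of the collected data.

```python
import re

# Matches old-style re-tweet markers such as "RT @bob:" or "rt @alice".
RT_PATTERN = re.compile(r"\bRT\s*@(\w+)", re.IGNORECASE)


def rt_chain(author, text):
    """Return the users in a re-tweet trail, most recent poster first."""
    return [author] + RT_PATTERN.findall(text)


def rt_edges(author, text):
    """Return directed (source, re-tweeter) pairs implied by the trail."""
    chain = rt_chain(author, text)
    # chain[i] re-tweeted chain[i + 1], so the link runs chain[i + 1] -> chain[i].
    return [(chain[i + 1], chain[i]) for i in range(len(chain) - 1)]


# Hypothetical example tweet, posted by the made-up user "carol".
print(rt_edges("carol", "RT @bob: RT @alice: This is Perth"))
```

Each pair points from the user whose tweet was passed on to the user who passed it on, which matches the one-way links used in the network visualisation.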
After the latest round of searches, carried out at 2pm Brisbane time, 227 tweets had been collected, not including those made by bots***. These had been made by, or took material from, 201 Twitter users. Of these users, 149 had specified a unique location, or made it apparent in their tweets – unsurprisingly, the majority of posts from which location could be determined came from Perth (92 tweets), with Sydney (16) and Melbourne (12) the next highest contributing cities. Only nine tweets came from users who declared a location outside Australia, with content being posted from the US, UK, Singapore, Canada, the Netherlands, and Malaysia. This distribution may reflect the localised nature of the video – for example, without knowing anything about Perth, a viewer may not find the video entertaining or interesting. Similarly, for people in or from Perth, seeing a video sending up their town may have created some kind of connection with the video, which in turn meant that it was passed on to friends, sharing the joke.
While geographically the mentions of the video were centred on Perth, time-wise the four hours after the video was first tweeted saw the highest activity; the earliest mention found in these searches was at 3.55pm on 31 August, with 25 additional tweets by 5pm and 41 between 5pm and 6pm. These coincided with the novelty of the video, spreading it when there was a good chance other people hadn’t seen it, and also with the end of the working day in Perth (peaking between 3pm and 4pm Perth time). The WA-dominance of the coverage can be seen in the graph above. The graph depicts the number of tweets in hourly blocks; the periods of little or no activity correspond with the early hours of the morning, while the small increases in posting on Tuesday are during the work day and, in particular, the 7pm – 10pm period – however, these periods still contain fewer than 10 tweets an hour relating to the video. [The graph does not feature the last tweets from Wednesday night, when A Current Affair had a story on the video, as the exact time posted could not be determined, being given in the format ‘around 16 hours ago’.]
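The hourly blocks were counted manually for this post. As a rough sketch of how the same binning, and the two-hour shift to Perth time, might be done in code: the timestamps below are illustrative placeholders (only the 3.55pm first mention comes from the data), and the year 2009 is an assumption.

```python
from collections import Counter
from datetime import datetime, timedelta

# Brisbane is two hours ahead of Perth for the dates covered here.
BRISBANE_AHEAD_OF_PERTH = timedelta(hours=2)


def hourly_counts(timestamps):
    """Count tweets per hourly block, keyed by the start of each hour."""
    return Counter(
        ts.replace(minute=0, second=0, microsecond=0) for ts in timestamps
    )


def to_perth(brisbane_time):
    """Convert a naive Brisbane-time datetime to Perth time."""
    return brisbane_time - BRISBANE_AHEAD_OF_PERTH


# Placeholder Brisbane-time timestamps, not the real sample.
sample = [
    datetime(2009, 8, 31, 15, 55),  # earliest mention found in the searches
    datetime(2009, 8, 31, 16, 10),
    datetime(2009, 8, 31, 16, 40),
    datetime(2009, 8, 31, 17, 5),
]

for hour, count in sorted(hourly_counts(sample).items()):
    print(hour.strftime("%d %b %H:00"), "Brisbane /",
          to_perth(hour).strftime("%H:00"), "Perth:", count)
```

Keeping everything in Brisbane time and converting only for display mirrors the approach taken in the post, and avoids mixing time zones inside the counts.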
While the video hits continued to increase over the period covered here, Twitter coverage died down quickly, with occasional flurries of re-tweets as people who had not seen it earlier discovered it and passed it on. However, the longest chains of re-tweets occurred in the first hours of the Twitter activity. The network visualisation above shows each Twitter user (excluding bots) featured in the sample as a node. The visualisation uses directed edges – the connections are not necessarily reciprocal links between users, but show a one-way link from one source user to a second user who may have either directly replied to a tweet or re-tweeted the work of the first user. Many nodes are not connected to others, having posted once and not been re-tweeted, or not having discussed the video further with other users (at least, in a way that the particular searches used here would have found). There are also several small groups of two or three nodes, showing one user responding to or re-tweeting the post of another user. Most notably, there is a large, connected system of nodes in the middle of the visualisation, and for the most part these are connections that were made, or built on those made, in the first few hours of the Twitter coverage.
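The split into unconnected nodes, small groups and one large connected system can also be computed rather than read off the picture. The sketch below is not how the visualisation was produced (that was done in GUESS); it is a plain-Python illustration that ignores edge direction when grouping users, and the user names are hypothetical.

```python
from collections import defaultdict, deque


def weak_components(nodes, edges):
    """Group users into connected sets, ignoring the direction of each edge."""
    neighbours = defaultdict(set)
    for src, dst in edges:
        neighbours[src].add(dst)
        neighbours[dst].add(src)

    # Include users who only ever appear inside an edge.
    all_nodes = set(nodes) | {user for edge in edges for user in edge}
    seen = set()
    components = []
    for start in sorted(all_nodes):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        group = set()
        while queue:  # breadth-first walk over one connected group
            user = queue.popleft()
            group.add(user)
            for other in neighbours[user]:
                if other not in seen:
                    seen.add(other)
                    queue.append(other)
        components.append(group)
    # Largest group first; single-user groups are the unconnected nodes.
    return sorted(components, key=len, reverse=True)


# Hypothetical users: a -> b -> c form a chain, d and e posted in isolation.
groups = weak_components(["a", "b", "c", "d", "e"], [("a", "b"), ("b", "c")])
print(groups[0], len(groups))
```

The first entry in the result corresponds to the large central cluster, while any group of size one is a user who posted without being re-tweeted or replied to in the sample.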
This closer look at the visualisation shows several paths for information flows, originating at a few source nodes. The longest paths contain nine nodes – starting at SixThousand, the Perth edition of a national network of subcultural e-newsletters and guides, re-tweets flowed through people connected with The West Australian, and eventually crossed the country, reaching, for example, Fake Stephen Conroy, a popular Australian user satirising the Federal Communications Minister. It took only three hours from when SixThousand posted the first link to reach the end of these longest paths – and by that point the number of tweets per hour covering the video was already declining.
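The longest paths were found by inspecting the visualisation. A minimal sketch of how the same thing could be computed from the directed edges is given below, assuming the re-tweet network contains no cycles (a reasonable guess for re-tweets, but an assumption all the same); the edge list is a made-up example.

```python
from functools import lru_cache


def longest_path(edges):
    """Return one longest chain of users in an acyclic directed network."""
    successors = {}
    for src, dst in edges:
        successors.setdefault(src, []).append(dst)
        successors.setdefault(dst, [])

    @lru_cache(maxsize=None)
    def best_from(user):
        # Longest chain starting at `user`; assumes there are no cycles,
        # otherwise this recursion would never terminate.
        best = ()
        for nxt in successors[user]:
            candidate = best_from(nxt)
            if len(candidate) > len(best):
                best = candidate
        return (user,) + best

    return list(max((best_from(u) for u in successors), key=len, default=()))


# Hypothetical edges: "source" is re-tweeted by "a" and "d", "a" by "b", and so on.
path = longest_path([("source", "a"), ("a", "b"), ("b", "c"), ("source", "d")])
print(path, len(path))
```

Path length here is counted in nodes, in the same way as the nine-node paths described above.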
The point of this exercise was not to claim anything about the nature of interpersonal communication using Twitter, or in Perth, or anything of that nature. For one thing, the data set is far too small to draw any conclusions about information flows, while not looking at data from additional sources such as Facebook or blogs means that a wider overview of the spread of the This is Perth video is lacking. Similarly, private communication such as email (the primary way I personally told friends about it) is not represented here. The main aim, instead, was to examine how to mine data from Twitter and what to do with it. The work here is a useful starting point for carrying out larger processes, ideally using automated tools such as NodeXL. One particular aspect I would have liked to cover here, and may do so later, is a comparison of the main connected group in the visualisation above with the actual followers of these users, to see whether what is depicted above shows information crossing groups or whether there is a high degree of interlinking amongst a group of friends.
In the meantime, what is shown is a short-lived burst of activity surrounding an amusing video about Perth that quickly spread amongst a number of people either from or with connections to Perth, and then became a less prominent topic. While some coverage, such as last night’s A Current Affair story, and discussion of the video has appeared since the peak buzz surrounding it, activity hit a definite peak very early on – possibly reaching saturation point amongst a small audience? – and as the video itself has continued to gain hits, there just might not be any need to keep publicising it…
The network visualisation was made using GUESS, and the graph using ManyEyes.
* And possibly uploaded; the video’s page says 30 August, as opposed to 31, but there may be time difference issues.
** For example, stories posted on PerthNow and the West online, radio coverage on Nova 93.7, and a story on A Current Affair.
*** This may be a point of contention, as bots may be seen as further publicising the content and making it visible to more users, but for this initial work they have been excluded as the chain of re-tweets ended with them.