Visual Memes in Social Media: Tracking Real-World News in YouTube Videos

Lexing Xie, Australian National University; Matthew Hill, IBM Research; Apostol Natsev, IBM Research; John R. Smith, IBM Research; John R. Kender, Columbia University

ABSTRACT
We propose visual memes, or frequently reposted short video segments, for tracking large-scale video remixing in social media. Visual memes are extracted by novel and highly scalable detection algorithms that we develop, with over 96% precision and 80% recall. We monitor real-world events on YouTube, and we model interactions using a graph model over memes, with people and content as nodes and meme postings as links. This allows us to define several measures of influence. These abstractions, applied to more than two million video shots from several large-scale event datasets, enable us to quantify and efficiently extract several important observations: over half of the videos contain remixed content, which appears rapidly; video view counts, particularly high ones, are poorly correlated with the virality of content; the influence of traditional news media versus citizen journalists varies from event to event; iconic single images of an event are easily extracted; and content that will have a long lifespan can be predicted within a day after it first appears. Visual memes can be applied to a number of social media scenarios: brand monitoring, social buzz tracking, and ranking content and users, among others.

Categories and Subject Descriptors: J.4 [Social and Behavioral Sciences]: Sociology; I.4.9 [Image Processing and Computer Vision]: Applications

General Terms: Algorithms, Measurement, Experimentation.

1. INTRODUCTION
Important happenings from around the world are increasingly captured on video and uploaded to news and social media sites.
The ease of publishing and sharing videos has outpaced the progress of modern search engines, collaborative tagging sites, and content aggregation services, leaving users to deal with a deluge of content [2]. This information overload problem is particularly prominent for linear media (such as audio, video, and animations), where at-a-glance impressions are hard to develop and are often unreliable. While text-based information networks such as Twitter rely on retweets [5, 18], hashtags, mentions, or trackbacks to identify influence and trending topics [1], similar functions for large video-sharing repositories are lacking. A reliable video-based quote tracking and popularity analysis system would find immediate practical applications in many domains: selecting the most typical video for a given topic or collection; measuring influence and ranking people in news events; improving targeted advertising based on page/author influence; and denoising video search and query-expansion results, to name a few.

We propose to use visual memes for making sense of video buzz. A meme is defined as a cultural unit (e.g., an idea, value, or pattern of behavior) that is passed from one person to another in social settings. We define a visual meme as a short segment of video that is frequently remixed and reposted by more than one author.

(Area chair: Mor Naaman. MM '11, November 28 - December 1, 2011, Scottsdale, Arizona, USA. Copyright 2011 ACM.)
Video-making requires significant effort and time, so we regard reposting a video meme as a deeper stamp of approval or awareness than simply viewing a video, leaving a comment, giving a rating, or sending a tweet. Example video memes are shown in Figures 1, 2 and 3, represented in a static keyframe format. We can see that each meme instance is semantically consistent, despite many variations in the videos that contain them, such as size, coloring, captions, editing, and so on.

Figure 1 summarizes the approach proposed in this paper. We develop a large-scale event monitoring system for video content, using generic text queries as a pre-filter for content collection on a given topic (Box D). We deploy this system for YouTube, and collect large video datasets over a range of topics. We then perform fast visual meme detection on tens of thousands of videos and millions of video shots (Box A). We showcase the potential applications of visual memes using a network model over the meme videos and authors (Box B). Using this model, we derive graph metrics that capture content influence and user roles. Using such visual meme extraction and exploitation strategies, we have made several observations on real-world news event collections (Box C), such as: over half of the event videos contain remixed content, and about 70% of authors participate in video remixing; video view counts are a poor proxy for the likelihood of a video being reposted; over 50% of memes are discovered and re-posted within 3 hours after their first appearance; and meme influence indices can be used to delineate the roles of different user groups, such as mavens or connectors who play notable roles in social change [14]. We use features derived from the meme network model to predict the lifespan of memes, evaluated by the area under the ROC curve (AUC).

Figure 1: Overview of visual meme tracking and analysis in social event streams. In Box A, the border color of meme clusters denotes the event they are from: green (top): Iran; orange (bottom): SwineFlu.

The main contributions of this work are as follows:
- We propose visual memes as a novel tool to track large-scale video remixing in social media.
- We implement a scalable system that can extract all memes from over a million video shots in a few hours on a single CPU.
- We design and implement the first large-scale event-based social video monitoring and visual content analysis system.
- We propose an application of visual memes by building a network model on videos and authors, which can in turn be used to characterize user roles and predict meme lifespan.
- We conduct empirical evaluations with several large event datasets, producing observations about the percentage of video remix, user participation, the timing of video meme production, meme popularity against traditional metrics, and different user group roles.

2. RELATED WORK
This work relates to active research areas in both multimedia analysis and social media mining. YouTube has been the focal platform for many social network monitoring studies. The first large-scale YouTube measurement study [6] characterized content category distributions, and tracked exact duplicates of popular videos. Benevenuto et al. studied video response actions on YouTube using metadata [3], and De Choudhury et al. monitored user comments to determine interesting conversations [10]. Recently, early views of YouTube videos have been used to predict ultimate popularity, characterized by view counts [25].

Quoting, duplication, and reposting are popular phenomena in online information networks. One well-known example is retweeting on micro-blogs [5, 18], where users often quote the original text message verbatim, having little freedom for remixing and context changes within the 140-character limit. Another example is MemeTracker [19], which tracks the lifecycles of popular phrases among blogs and news websites. Prior studies have shown that the frequency of video reuse can be used as an implicit video quality indicator [23]. However, none of the prior work has defined the unit for a retweet or meme on a video-sharing network.

Tracking near-duplicates in images and video has been a problem of interest since the early years of content-based retrieval. Recent focus has been on user-dependent definitions of duplicates [8], speeding up detection on image sequences, frames, or local image points [26], and scaling out to web-scale computations using large compute clusters [20]. We note, however, that most prior work in this area is concerned with optimizing the retrieval accuracy of detecting near-duplicate frames or sequences, rather than tracking large-scale duplication behavior. Kennedy and Chang [17] tracked the editing and provenance of images on the web, with a focus on distinguishing different types of image edits and their ideological perspective. Our work, in comparison, tracks large-scale video remixes using both content and metadata such as authorship and creation time, and focuses on inferring social roles in video propagation.

Several recent works have looked at YouTube phenomena: Biel and Gatica-Perez [4] focused on individual social behavior such as non-verbal cues, while Hong et al. [15] presented content summarization by monitoring a query over time. In comparison, we use visual memes to capture the behavior of large groups and to track information dissemination.
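As a concrete illustration of the network model over meme videos and authors mentioned above, the sketch below derives a minimal influence index from (author, meme, timestamp) postings. It is a simplified stand-in, not the paper's exact model: the rule that later posters implicitly reference the earliest poster of a meme, and the influence count itself, are illustrative assumptions.

```python
from collections import defaultdict

def build_meme_graph(postings):
    """Derive a toy influence index from (author, meme_id, timestamp) postings.

    Each repost of a meme is treated as an implicit link back to the meme's
    earliest poster; an author's influence is the number of such links they
    receive (an illustrative proxy, not the paper's exact metric).
    """
    # Earliest poster per meme, found by scanning postings in time order.
    first_poster = {}  # meme_id -> (timestamp, author)
    for author, meme, t in sorted(postings, key=lambda p: p[2]):
        first_poster.setdefault(meme, (t, author))

    # Count reposts flowing back to each meme's originator.
    influence = defaultdict(int)
    for author, meme, t in postings:
        _, origin = first_poster[meme]
        if author != origin:
            influence[origin] += 1
    return dict(influence)

posts = [("cnn", "m1", 1), ("alice", "m1", 2), ("bob", "m1", 3),
         ("alice", "m2", 4), ("bob", "m2", 5)]
print(build_meme_graph(posts))  # -> {'cnn': 2, 'alice': 1}
```

Richer variants (e.g. weighting links by repost delay, or computing centrality over the resulting graph) follow the same structure.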
3. VISUAL MEMES AND VIDEO REMIXES
Visual memes are defined as frequently reposted video segments or images. It has been observed that users tend to create curated selections based on what they liked or thought was important ([24], page 270). News event collections are particularly suited for studying large-scale user curation, since remixing is more prevalent here than in video genres designed for self-expression, such as video blogs. The unit of interaction appears to be video segments, consisting of one or a few contiguous shots. The remixed shots typically contain minor modifications, including video formatting changes (such as aspect ratio, color, contrast, gamma) and video production edits (such as the superimposition of text, captions, borders, and transition effects). Most of these are well known as the targets of visual copy detection benchmarks [22].

Figure 2: Visual meme shots and meme clusters. (Left) Two YouTube videos that share multiple different memes. Note that it is impossible to tell from metadata or the YouTube video page that they shared content, and that the appearance of the remixed shots (bottom row) has large variations. (Right) A sample of other meme keyframes corresponding to one of the meme shots, and the number of videos containing this meme over time: 193 videos in total between June 13 and August 11.

In this paper, meme refers both to individual instances, visualized as representative icons (as in Figure 2 Left and Figure 3), and to the entire equivalence class of reposted near-duplicate video segments, visualized as clusters of keyframes (as in Figure 1 and Figure 2 Right). Intuitively, re-posting is a stronger endorsement, requiring much more effort than simply viewing, commenting on, or linking to the video content.
A re-posted visual meme is an explicit statement of mutual awareness, or a relevance statement on a subject of mutual interest. Hence, memes can be used to study virality, lifetimes and timeliness, influential originators, and (in)equality of reference.

4. MONITORING EVENTS ON YOUTUBE
YouTube has become a virtual worldwide bazaar for video content of almost every type. With more than 48 hours of video being added every minute [2], it is a living marketplace of ideas and a vibrant recorder of current events. We use text queries to pre-filter content, thus making the scale of monitoring feasible. We use a few generic, time-insensitive text queries as content pre-filters. The queries are manually designed to capture the topic theme, as well as the generally understood causes, phenomena, and consequences of the topic. For example, our queries on the global warming topic consist of global warming, climate change, green house gas, and CO2 emission, whereas the swine flu topic expands into swine flu, H1N1, H1N1 travel advisory, swine flu vaccination, and so on. We aim to create queries covering the main invariant aspects of a topic; automatic time-varying query expansion is left for future work.

We use the YouTube API to extract video entries for each query, sorted by relevance and recency. The API will return up to 1000 entries per query, so varying the sorting criteria helps to increase content coverage and diversity. The retrieved video entries are those responding to keyword queries based on YouTube's proprietary algorithm, and often contain entries not directly relevant to the event being monitored. We filter the results to restrict the video database to unique videos, removing redundant entries that responded to multiple queries or whose YouTube identifier matched one that had previously been gathered. Then, for each unique video, we segment it into shots using thresholded color histogram differences. For each shot we randomly select and extract a frame as the keyframe, and extract visual features from each keyframe. We process the XML metadata associated with each video, and extract information such as author, publish date, view counts, and free-text title and descriptions.

We use the term buzz to refer to all the videos that respond to keyword queries on YouTube, although their content may not be directly related to the target event or topic of interest. We use the term meme videos to refer to videos containing one or more memes. The volumes of buzz and memes are telling indicators of event evolution in the real world, and we present a few examples in Figure 3. Figure 3(a) graphs the volume of all unique videos acquired according to their upload date. There are local peaks on the Swine Flu topic during April-May 2009, when new cases were spreading over the globe, and in October-November 2009, when vaccination first became available in the US, followed by a steady volume decrease. For the 21-month period shown for Pakistan politics, there are two notable peaks: in December 2007, at the time of the assassination of Benazir Bhutto; and in February-May 2009, during a series of crises, including serial bombings, an attack on the Sri Lanka cricket team, and nation-wide protests.

Figure 3(b) tracks and illustrates the volume of meme videos for the Iranian Politics topic (dataset Iran3 in Table 1). The number of meme videos is significant: hundreds to thousands per day. There are three prominent peaks in June-August 2009, corresponding to important events in the real world. The first mid-June peak reflects a highly controversial election prompting massive protests and violent clashes. A second mid-June peak captures a viral amateur video of the shooting of Neda Soltan, which became the symbol for the whole event. A third peak in mid-July corresponds to a Friday prayer sermon which drew over two million people, an event described as the most critical and turbulent Friday prayer in the history of contemporary Iran.
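The shot segmentation step in the acquisition pipeline above uses thresholded color histogram differences. A minimal sketch follows; the L1 distance and the threshold value are illustrative assumptions, since the exact settings are not given here.

```python
def segment_shots(hists, threshold=0.5):
    """Split a frame sequence into shots by thresholding the L1 distance
    between consecutive color histograms (a sketch of the idea; the
    distance metric and threshold are assumed, not the paper's values)."""
    boundaries = [0]
    for i in range(1, len(hists)):
        d = sum(abs(a - b) for a, b in zip(hists[i], hists[i - 1]))
        if d > threshold:          # large histogram change -> shot boundary
            boundaries.append(i)
    # Return (start, end) frame-index ranges, one per shot.
    return list(zip(boundaries, boundaries[1:] + [len(hists)]))

# Two frames of one color, then two of another: two shots.
shots = segment_shots([[1, 0], [1, 0], [0, 1], [0, 1]], threshold=0.5)
```

A keyframe per shot would then be picked at random from each returned range, as described above.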
5. SCALABLE VISUAL MEME DETECTION
Detecting visual memes in a large video collection is a non-trivial problem. There are two main challenges. First, remixing online video segments changes their visual appearance, adding noise as the video is edited and re-compressed.

Footnote 2: See the timeline of the 2009 Iranian election protests.

Figure 3: Volume of event buzz and visual memes. (a) Event buzz: number of new videos uploaded daily for two topics (swine flu epidemic; Pakistan). (b) Number of videos containing visual memes on the Iran3 topic, illustrated with representative memes on a timeline, June-August 2009.

We address the first challenge by preprocessing each keyframe: removing frame borders of uniform colors; normalizing the aspect ratio; performing de-noising; and applying contrast-limited histogram equalization to correct for contrast and gamma differences. We use a frame similarity metric based on the color correlogram [16], which captures the local spatial correlation of pairs of colors. The color correlogram is rotation-, scale-, and, to some extent, viewpoint-invariant. It was designed to tolerate moderate changes in appearance and shape that are largely color-preserving, e.g., viewpoint changes, camera zoom, noise, and compression, and, to a smaller degree, shifts, crops, and aspect-ratio changes. We also use a "cross" layout that extracts the descriptor only from the horizontal and vertical central image stripes, thereby emphasizing the center portion of the image while disregarding the corners. This layout improves robustness with respect to text and logo overlay, borders, crops, and shifts. It is also invariant to horizontal and vertical flips, while capturing some spatial layout information. We extract the auto-correlogram in a 166-dimensional perceptually quantized HSV color space; the resulting descriptor with the cross layout has 332 dimensions.
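The "cross" layout just described can be sketched as follows. The `describe` callback stands in for the 166-dimensional color auto-correlogram, which is not implemented here, and `stripe_frac` is an assumed stripe width that the text does not specify.

```python
def cross_layout_descriptor(img, describe, stripe_frac=0.33):
    """Extract a descriptor from the central horizontal and vertical stripes
    of an image and concatenate them (the 'cross' layout). `img` is a 2-D
    list of pixel values; `describe` maps an image region to a feature
    vector, e.g. a 166-d correlogram, giving a 332-d result."""
    h, w = len(img), len(img[0])
    sh = max(1, int(h * stripe_frac))     # stripe height
    sw = max(1, int(w * stripe_frac))     # stripe width
    top, left = (h - sh) // 2, (w - sw) // 2
    horiz = img[top:top + sh]                        # central horizontal stripe
    vert = [row[left:left + sw] for row in img]      # central vertical stripe
    return describe(horiz) + describe(vert)

# Toy stand-in descriptor: sum of pixel values in the region.
desc = cross_layout_descriptor([[1, 1, 1]] * 3, lambda reg: [sum(map(sum, reg))])
```

Because only the two central stripes are described, the four corner regions never enter the descriptor, which is the source of the robustness to logos, borders, and crops noted above.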
The result of the above processing (Figure 4 Box A) is a set of features, one per input frame. Furthermore, we use query-adaptive thresholding on the L2 distance of the correlogram features to generate a binary judgement for each candidate pair of frames as to whether they are a near-duplicate pair. This corresponds to Figure 4 Box D. The main purpose of this threshold-tuning step is to relax the match threshold for complex query frames and to tighten the threshold for visually simple frames (e.g., blank frames at the extreme). For a given video keyframe q and its correlogram feature f_q, the threshold for determining matches is parameterized as T_q = τ · ||f_q||_2 / ||f_max||_2, where || · ||_2 is the L2 vector norm, f_max is the collection max feature vector, composed of the largest observed coefficients for each dimension, and τ is a global distance threshold tuned on an independent validation dataset. The ||f_q||_2 term scales τ based on the information content of q: it lowers the effective threshold for frames that are visually simple, such as frames with uniform colors or simple charts, which can otherwise lead to false or trivial matches. At the same time, it increases the threshold for highly complex query frames.

Figure 4: Flow diagram for visual meme detection.

Second, finding all pairs of near-duplicates by matching all N shots against each other has a complexity of O(N²), which is infeasible for collections containing millions of shots. Our operational definition of a meme is a reposted video segment that starts and ends at shot boundaries. This definition motivates our processing pipeline of using a single keyframe to represent a video shot (Section 4) without sacrificing matching quality, as the feature-based shot detector is generally robust to intra-shot changes but sensitive to large inter-shot variations in visual appearance. Our process for detecting video memes is outlined in Figure 4.
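The query-adaptive threshold T_q = τ · ||f_q||_2 / ||f_max||_2 translates directly into code. The feature vectors and τ below are toy values for illustration only.

```python
import math

def l2(v):
    """L2 norm of a feature vector."""
    return math.sqrt(sum(x * x for x in v))

def adaptive_threshold(f_q, f_max, tau):
    """T_q = tau * ||f_q||_2 / ||f_max||_2: visually simple frames
    (small feature norm) get a tighter threshold, complex ones a looser one."""
    return tau * l2(f_q) / l2(f_max)

def is_near_duplicate(f_q, f_c, f_max, tau):
    """Declare a candidate frame a near-duplicate of the query when the L2
    feature distance falls below the query's adaptive threshold."""
    dist = l2([a - b for a, b in zip(f_q, f_c)])
    return dist <= adaptive_threshold(f_q, f_max, tau)
```

With toy features, an exact match passes and a very different frame fails: `is_near_duplicate([1, 0], [1, 0], [1, 1], 0.5)` is true, while `is_near_duplicate([1, 0], [0, 1], [1, 1], 0.5)` is not.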
The input to this system is a set of video frames, and the output splits this set into two parts. The first part consists of a collection of meme clusters, where frames in the same cluster are considered near-duplicates of each other. The second part consists of the rest of the frames, which are not considered near-duplicates of any other frame. Blocks A and D address the robust matching challenge using color correlogram features and query-adaptive thresholding, and blocks B, C and E address the scalability challenge using approximate nearest-neighbor search.
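The final grouping step can be sketched as below: given the near-duplicate pairs that survive the adaptive-threshold test (produced at scale by the approximate nearest-neighbor blocks, and assumed here as input), union-find partitions the frames into meme clusters and non-meme singletons.

```python
def meme_clusters(n_frames, duplicate_pairs):
    """Group frames into meme clusters by union-find over near-duplicate
    pairs; frames left in singleton sets are the non-memes. A sketch of
    the clustering output described in the text, not the paper's exact code."""
    parent = list(range(n_frames))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in duplicate_pairs:           # union each near-duplicate pair
        parent[find(a)] = find(b)

    groups = {}
    for i in range(n_frames):
        groups.setdefault(find(i), []).append(i)
    clusters = [g for g in groups.values() if len(g) > 1]       # meme clusters
    singletons = [g[0] for g in groups.values() if len(g) == 1] # non-memes
    return clusters, singletons

# Frames 0-2 share a meme, 3-4 share another, frame 5 is unique.
clusters, singles = meme_clusters(6, [(0, 1), (1, 2), (3, 4)])
```

Because union-find is near-linear in the number of pairs, the overall cost is dominated by candidate-pair generation, which is exactly what the approximate nearest-neighbor stage bounds.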