Vol. 7, No. 1 (2013)
 
http://www.eludamos.org
 
 
On the Validity of Metacritic in Assessing Game Value
Adams Greenwood-Ericksen, Scott R. Poorman, Roy Papp
Eludamos. Journal for Computer Game Culture. 2013; 7 (1), pp. 101-127
 
 

In January 2001, the website Metacritic was launched with the goal of providing consumers with the ability to see a collection of game reviews in one location. The goal was admirable. Game reviews have long been scattered across a myriad of print and online media, and a consumer seeking several reviewer perspectives on the same game had to check multiple, unrelated information sources and then make judgments regarding the quality, accuracy, and content of each review in order to formulate an informed opinion on the quality of a product. Further, the scattered nature of such reviews meant that customers were often unable to easily identify which publications might have reviewed a game, making the process of determining which games to purchase an onerous chore. It appears that the founders of Metacritic hoped to change this paradigm by finding, indexing, and summarizing the scores provided by dozens of print and electronic media sources into a single, overall metascore. However, in recent years Metacritic has increasingly come under fire from critics who allege that it has become a harmful influence on the industry and that it fails to appropriately assess the value of individual games (Dodson 2006; Periera 2012; McDonald 2012). Therefore, the goal of the present work is to assess the scientific validity and empirical value of Metacritic as a tool to assess game value to both consumers and the industry.
 
The Origins of Metareview
The theory and practice of meta analysis was originally developed by scientists over a century ago (the first meta analysis is commonly attributed to the mathematician Karl Pearson and was conducted in 1904). The value of a meta analysis is twofold. First, it is able to aggregate many studies together statistically, allowing for a succinct and coherent analysis of the state of a body of research. Second, when many studies in an area show small effect sizes (the difference between possible outcomes is very small), meta analyses allow for stronger inferences to be made by looking at many less convincing studies together.
Scientific meta analyses are highly technical, in large part because of the complexity of the information being studied. However, by the close of the 20th century, a number of individuals and organizations on the web had noticed that a similar principle could be applied to the explosion of online and print reviews of popular consumer media. An early pioneer in this area was RottenTomatoes.com, which indexed, collected, and displayed movie reviews. The site opened on an amateur basis in 1999 (Lazarus 2001), and quickly became a popular source for movie information. Metacritic began in 2001 as an attempt by cofounders Marc Doyle, Julia Doyle Roberts, and Jason Deitz to extend the concept to a broader set of media (Wingfield 2007).
 
Why Metacritic Needs Assessment
The importance of Metacritic has grown significantly in recent years. Key figures in the game industry have made no secret of their concern with the scores assigned by Metacritic to games with which they have been involved. Interestingly, it appears that there is broad acceptance in the industry not only of the notion that Metacritic score impacts sales (Murdoch 2010; Wingfield 2007; Everiss 2008) but also that Metacritic is not a reliable assessment of game quality (Dodson 2006; Periera 2012; McDonald 2012). The present work is intended to address both of these issues through two related approaches. First, a correlational analysis of the relationship between Metacritic metascore and sales is presented, aimed at assessing the historical value of Metacritic scores as an indicator of financial game value. Subsequently, a comprehensive assessment of the scientific validity of the process by which Metacritic aggregates scores is used to identify areas of methodological weakness in the metascore production process. Taken together, these two analyses lead to the conclusion that while Metacritic is a strong predictor of sales, there are also significant flaws in the system by which metascores are produced. The implications of these findings are also discussed.
 
Review of Literature
The stated goal of Metacritic is "helping consumers make an informed decision about how to spend their money on entertainment--by providing access to thousands of reviews in a number of entertainment genres" (Doyle 2011). Recently, however, Metacritic has come in for criticism from industry figures who argue that Metacritic is flawed and negatively impacting the health of the game industry (Dodson 2006; Periera 2012; McDonald 2012).
 
The Perception of Metacritic Score Impact on Sales
The internet is rife with opinions on the impact that Metacritic has on game sales, many of them from apparent industry insiders. Regardless of the actual ground truth of the situation, the general perception of the relationship between sales and scores is worthy of discussion because of the impact that the opinions of decision makers can have on industry policies.
Overall, the general perception seems to strongly favor a clear link between sales and scores. John Riccitiello, CEO of Electronic Arts, pointed out in a 2009 interview that "the best selling games in this industry last year were all 80 [Metacritic metascore] and above." Julian Murdoch's 2008 GamePro article "Metacritic: Gaming the Score" cites an interesting point made publicly by Robin Kaminsky, at the time VP of Marketing at Activision. During a presentation at DICE, a well-regarded gaming business conference, Kaminsky declared that "for every additional five points over an 80 percent average review score, sales may as much as double" (Murdoch 2010). Similar sentiments have been attributed to Robert Kotick, CEO of Activision, who said "for every 5 percentage points [in metacritic score] above 80%, Activision found sales of a game roughly doubled" (Wingfield 2007; Everiss 2008). Peter Moore, a senior executive at EA Sports, initially espoused the use of Metacritic-based quality metrics, but subsequently argued that they had become overused and might not be ideal development metrics (Dring 2010).
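The doubling rule quoted above can be written out as simple arithmetic. The following is a minimal sketch of the claim as stated by the executives, not an empirical model of sales. The function name `claimed_sales_multiplier` and the choice to return a multiplier of 1 at or below a score of 80 are assumptions made purely for illustration.

```python
def claimed_sales_multiplier(metascore, threshold=80.0, step=5.0):
    """Sales multiplier implied by the quoted rule of thumb.

    Assumption: sales double for every `step` points above `threshold`;
    scores at or below the threshold are treated as the baseline (1.0).
    """
    if metascore <= threshold:
        return 1.0
    return 2.0 ** ((metascore - threshold) / step)


# Print the implied multiplier relative to a game scoring exactly 80.
for score in (80, 85, 90, 95):
    print(score, claimed_sales_multiplier(score))
```

Under this reading, a game scoring 95 would be expected to sell as much as eight times what an otherwise comparable game scoring 80 sells, which shows how steep the claimed relationship is.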
 
The Results of Perceived Sales Impact on Industry Policy
Due to this high level of acceptance of a direct relationship between sales and scores, it appears that at least some industry figures and studios have taken the apparently logical step of connecting scores to studio and employee valuation, and have implemented policies to support and incentivize high-scoring games. After all, the argument goes, if scores equal sales, then scores equal value, and employees and studios should be incentivized to produce value by emphasizing the importance of metascores. John Riccitiello, EA's outspoken CEO, has commented publicly not only on the impact of Metacritic scores on sales, but also on studio policy decisions, notably those related to compensation. "There are definitely bonuses attached to scores," he asserted in a 2009 interview that appeared on Industrygamers.com. Other sources have cited similar trends (Everiss 2008; Wingfield 2007).
There are certainly a number of possible implications of this trend. First, the impact on individual studios working with larger publishers can be significant. Fallout: New Vegas, the critically well-received fan favorite from Obsidian Entertainment, was reportedly developed on contract to publisher Bethesda Softworks for a straight payment plus a bonus if the Metacritic metascore reached a value of 85. Unfortunately for Obsidian, the game apparently failed to meet that goal by one point, receiving a score of 84 (Gilbert 2012). Interestingly, it appears that the original source for this information, a tweet from Obsidian veteran Chris Avellone, has since been removed. As of March 15th 2012, it could be found at https://twitter.com/#!/ChrisAvellone/status/180062439394643968, but as of the time of this writing it is no longer accessible at that address.
Metacritic scores may also have a broader impact on the external perception of viability or success for publishers or developers in the broader business community. For instance, THQ's Homefront received disappointing Metacritic metascores in the low to mid 70s across multiple platforms, apparently leading to upwards of a 20% drop in share price for the company (Pham and Fritz 2011; Baker 2011). The opposite effect has been observed as well: Take-Two Interactive's stock price jumped 20% the week following the release of the critically acclaimed Bioshock (Wingfield 2007). Note that despite widespread acceptance of claims to the contrary, there is no empirically valid way to connect these kinds of financial outcomes to metascores directly. Since metascores are based on widely distributed reviews from independent critics and publications, it is just as reasonable to argue that the general response of critics (or potential purchasers themselves) or other factors such as seasonal buying patterns, marketing strategy, or word of mouth were responsible for the effect. Ultimately, these cases underscore the connection between perceived game quality and sales. Given that Metacritic is an aggregate indicator of critical response, it seems reasonable to suggest that metascore and sales might be connected. As yet, however, there appears to have been no attempt to publish a broad analysis of the link between scores and sales, an oversight the present work aims to correct.
An interesting trend associated with this apparent relationship appears to be the tendency of some companies to develop design strategies explicitly aimed at maximizing Metacritic score. Tim Heaton, studio director of Australia-based Creative Assembly (CA), has indicated in interviews that CA uses a strategy that specifically links features of games in production to projected metascore points, and tracks expected metascore throughout the development process. The system is apparently used to estimate the impact of features with a very high degree of granularity, such that the impact of some features is estimated down to at least the 0.5% metascore level (Nutt 2012).
Additionally, it appears that Metacritic metascores are also being used to determine hiring and compensation for individual employees. As discussed above, John Riccitiello, the CEO of Electronic Arts, has asserted this in the past (Brightman 2009), and similar claims have been advanced elsewhere (Everiss 2008; Wingfield 2007). Ultimately, it is probably fair to say that individual developers may increasingly find that their compensation is directly tied to the Metacritic scores of the games on which they work. This has been received in some quarters with hostility (Dodson 2006; McDonald 2012), but arguably represents a case of publishers and studios rewarding value with monetary compensation, assuming, of course, that metascores are indeed a valid measurement of product value. Similarly, cases have recently emerged where Metacritic scores have been explicitly linked to hiring decisions. On July 27, 2012, Irrational Games posted a job listing for a design manager which included the qualification requirement "credit on at least one game with an 85+ average Metacritic review score" (Graft et al. 2012). This reliance on Metacritic to drive hiring and compensation decisions for individuals raises further issues of fairness, especially in the context of the ongoing questions regarding the reliability of Metacritic metascores as an indicator of quality.
Unsurprisingly, this has resulted in a number of cases where developers or studios have resorted to tampering with Metacritic scores. In at least four documented cases, studio employees have been caught submitting user reviews for games they helped develop without acknowledging their studio affiliation (Sinclair 2011; Fahey 2011). It is unclear whether these individuals were acting with the knowledge of the leadership of the studios or publishers responsible for the games in question.
Ultimately, it seems reasonable to suggest that the perception of Metacritic throughout the game industry as an important metric of game quality has resulted in a broad swath of policies impacting everything from marketing strategy to the use of certain development approaches and metrics, and even to employee and studio compensation. As such, it seems that the influence of Metacritic on policies and decision-making in the game industry is both pervasive and powerful.
 
Criticism of Metacritic
Given the scope of the financial impact on all levels of the game industry associated with Metacritic metascores, it seems obvious that the fairness of this assessment system should be carefully examined. Certainly there is considerable criticism voiced among industry insiders at conferences and offices, although the authors have found this to be more true of off-the-record verbal communication than in written publications. Some notable published criticisms do exist, of course. Joe Dodson's 2006 criticism of metareviews in general and Metacritic in particular deserves note (Dodson 2006). While hardly an unbiased (or even fair) criticism of metareview sites, the article did raise awareness of the controversy and made some reasonable points. It also may serve as a rough indicator of one branch of sentiment among game reviewers regarding metareviews.
It is clear that there are doubts about the validity of Metacritic as a source of unbiased feedback, even among those who promote its use. John Riccitiello, for instance, who has been quoted previously in strong support of using Metacritic scores for various purposes, has also expressed reservations about its validity. "I'm a huge believer in quality, although I don't think Metacritic measures it the best for everything we do" (Brightman 2009). Peter Moore of EA Sports has been quoted as expressing reservations on the subject as well (Dring 2010). Given the increasing prevalence of Metacritic metascores as a primary indicator of game quality for both customers and industry decision-makers, and the financial implications thereof, it appears vital that a better understanding of the nature, origins, and validity of Metacritic scores be developed.
 
Methods
The goal was to investigate whether a correlational link exists between game metascores obtained from Metacritic's website (http://www.metacritic.com/) and sales data as obtained from the website VGChartz (http://www.vgchartz.com/). These sources were chosen in part because they are readily accessible to members of the industry and the general public, which should make it easier for others to replicate and extend the current work independently.
 
Sampling
A random sample of 196 games was drawn from Metacritic. Games were selected from the Action, RPG, and FPS genres, as defined by Metacritic's internal classification system. Only games released for the Xbox 360 and PlayStation 3 consoles were chosen because of the relative similarity of marketing, delivery, and control systems between titles released for the two platforms. A listing of the games included in the sample, as well as associated sales and score data, is included in Appendix A, below. Sales data were then obtained (in millions of units) from the website VGChartz. Metacritic score and sales data were collected in August of 2010, and reflect the information available from those sources at that time.
 
Analysis Approach
The data collected were analyzed in a three-step process. First, they were plotted on a graph to allow visual identification of patterns and characteristics of the data. Then, a statistical measure known as "Pearson's r," or the "Pearson Product-Moment Correlation Coefficient," was applied to the untransformed data to quantify the correlation between the two data sets. Finally, the same measure was applied again after a logarithmic transformation of both variables.
 
Visual Analysis
The data were plotted in order to visually identify broad patterns. Plots were produced for the entire data set (N = 196), as well as for each individual combination of platform (PS3, Xbox 360) and genre (Action, RPG, and FPS). Plots of the data are presented below, grouped by genre (Figure 1) and by platform (Figure 2). Visual inspection of the data appeared to show a meaningful geometric or exponential relationship between sales and scores.
 
Quantitative Analysis
The collected data were subsequently analyzed using Pearson's Product-Moment Correlation Coefficient (PMCC, or Pearson's r). Because the visual analysis indicated a pronounced curve to the data set, analysis was split into two parts. First, a bivariate correlation using the untransformed data set was used to assess the linear relationship, disregarding the visible curve. Such an analysis involves the least amount of processing of the data and therefore might be seen as a more conservative statistical analysis approach. However, such an approach would be expected to underestimate the relationship between the variables, and furthermore violates the assumption of linearity inherent in the PMCC. Therefore, a second analysis was completed after applying a logarithmic transformation to both variables to "flatten out" the curve of the data. This approach, although it involves more processing of the data, should be expected to yield a more accurate coefficient of correlation. Both analyses are presented so that readers can judge for themselves which they prefer. Note that these two reported analyses should be seen as alternative approaches, rather than one confirming or reinforcing the findings of the other.
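The two-part procedure described above can be sketched in a few lines of Python. This is an illustrative sketch only: the data are hypothetical values generated from a power-law curve, not the study sample, and the helper name `pearson_r` and the use of base-10 logarithms are assumptions (the base of the logarithm does not affect the correlation coefficient).

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient for two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical metascores and sales (millions of units) following a curve.
scores = [50, 60, 70, 80, 90]
sales = [0.1 * (s / 50) ** 6 for s in scores]

# Part 1: correlation on the untransformed data.
r_raw = pearson_r(scores, sales)

# Part 2: correlation after log-transforming both variables.
r_log = pearson_r([math.log10(s) for s in scores],
                  [math.log10(v) for v in sales])

print("untransformed r:", round(r_raw, 3))
print("log-transformed r:", round(r_log, 3))
```

Because the made-up sales values lie exactly on a curve, the untransformed coefficient falls short of 1 while the log-transformed coefficient is essentially 1. This is the sense in which a linear statistic applied to curved data underestimates the strength of the relationship.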
 
Results
Bivariate Correlational Analysis
A Pearson's product-moment correlation coefficient (PMCC) was calculated on the untransformed data set and showed a significant positive correlational relationship between sales and scores, r = .55, p < .005. Of course, given the apparent curvature of the data, it is expected that the relationship between sales and scores might be seriously underestimated by this procedure, given that the PMCC assumes a linear relationship between data sets. However, it was expected that the results of this rather unsophisticated analysis approach using untransformed data would nonetheless show a meaningful relationship and would help alleviate any concerns about the conservativeness of subsequent transformation-based analyses.
 
Transformation of Data
Because the data plot suggests a nonlinear relationship between metascore and sales, the above analysis on untransformed data almost certainly underestimates the strength of the relationship between the two variables, as linearity is an assumption of the PMCC. Although the analysis on the untransformed data set still shows a significant correlation, in the interests of fully understanding the relationship between the variables, a more satisfying and accurate approach can be achieved by transforming the data to achieve linearity before calculating the bivariate correlation. In this case, a log transformation was chosen because of its efficacy in linearizing curvilinear data sets. The transformed results also suggested a significant positive relationship between sales and scores, r = .72, p < .005. The increase in the reported r value for the PMCC indicates an even stronger relationship between sales and scores than that suggested by the analysis on untransformed data.
 
Analysis Summary
The results of our analyses are shown below. Visual analysis of the graph shows an apparent geometric or exponential relationship between game sales and metascore, such that higher metascores are associated with higher sales. Additionally, the curve appears to have a "break point" somewhere around 80% where the rate of increase in sales begins to trend strongly upwards.
The initial correlational analysis showed a correlation of .55 on a scale of -1 to 1, which is generally considered to be a reasonably large correlation. The correlation was statistically significant at the .005 level (a criterion ten times more stringent than is typical for these analyses). However, because correlational statistics are designed for linear data rather than curvilinear data, we also performed a second analysis after applying a mathematical procedure known as a "log transformation" to "straighten" the data set. A correlational analysis of the transformed data set revealed a new correlation of .72, far higher than even the initial estimate. This result was also statistically significant at the .005 level, indicating a very high level of confidence in the result.
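As a rough check on the reported significance levels, a correlation coefficient can be converted to a t statistic with n - 2 degrees of freedom using the standard formula t = r * sqrt((n - 2) / (1 - r^2)). The sketch below assumes that all 196 games entered both analyses, which the text does not state explicitly; the function name `t_from_r` is likewise an assumption for the example.

```python
import math


def t_from_r(r, n):
    """t statistic (df = n - 2) for testing whether a Pearson r differs from zero."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


n = 196  # assumed sample size for both analyses
for label, r in (("untransformed", 0.55), ("log-transformed", 0.72)):
    shared_variance = r * r  # proportion of variance shared by the two variables
    print(label, "r^2 =", round(shared_variance, 4),
          "t =", round(t_from_r(r, n), 2))
```

Both t values come out far above the critical value for a two-tailed test at the .005 level with this many degrees of freedom, which is below 3. The squared coefficients indicate that roughly 30% of the variance is shared in the untransformed analysis and roughly 52% in the log-transformed analysis.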
 
Figure 1. Metacritic Score versus Sales (in Millions) by Genre
 
Figure 2. Metacritic Score versus Sales (in Millions) by Platform
 
Discussion
The dual approach used in the present work was intended to examine the issues surrounding Metacritic scores from both a qualitative and quantitative perspective. The quantitative examination of the mathematical relationship between sales and scores using publicly available data was intended to address the issue from an empirical and number-driven perspective. The tight coupling between sales and scores strongly suggests that Metacritic is a valuable tool for assessing (and possibly predicting) game value in terms of critical acclaim, sales, and return on investment for studios and publishers. While the quantitative analysis above has provided strong evidence of a significant relationship between sales and scores, such an approach cannot shed light on the validity or reliability of the procedures by which Metacritic calculates metascores. To address these concerns, a qualitative analysis of the metascore generation processes was conducted. By carefully examining the validity issues with Metacritic from a scientific perspective, it was hoped that insights could be gained into how score validity and reviewer intent were preserved or distorted at each step in the process, as well as how this process would impact the overall value of Metacritic as a tool for decision-makers in the game industry.
 
Qualitative Analysis of Validity
Scientists typically discuss the quality of a measure or argument in terms of causal "validity," or simply "validity." Scientists generally recognize five subcategories to validity, each of which pertains to a specific aspect of the measurement or argument in question. Because Metacritic is essentially drawing a conclusion about the quality of a game based on a rating developed using a mathematical argument (Metacritic's proprietary formula) which incorporates a number of measured data points (individual scores), it is vulnerable to concerns about the validity of the process used to make these determinations. Since not everyone is a scientist, a discussion of causal validity as it applies to an assessment of Metacritic is included below.
Internal Validity is about whether a measurement is being assessed in such a way as to determine the appropriate cause for a given effect. An example of this is the classic chicken-egg problem: do chickens cause eggs, or do eggs cause chickens? In the case of Metacritic, key questions include, for instance, which review sites are being polled, whether external events have an impact on individual reviews (other reviews, reviewer-developer relationships, etc.), and other, similar concerns.
Construct Validity is about whether a documented scale is measuring what it is supposed to be measuring, or something else entirely. IQ tests, for instance, are notorious for measuring things other than intelligence (educational background or ethnicity, for instance). In the case of Metacritic, one interesting question is how the different scales used by different reviewers and publications are "normalized" to fit Metacritic's 100-point scale, and whether distortion of the reviewer's intent occurs during the process.
External Validity focuses on whether a measurement or finding is likely to generalize outside of the specific conditions where the test occurred. In the case of Metacritic, one key question is whether reviewers are a good approximation of customers with regard to the things they like and dislike. Another, partially addressed above, is whether metascore correlates with other real-world measures of game success, such as sales or awards.
Face Validity is an indicator of how good a measurement or argument appears to be. This is similar to Stephen Colbert's concept of "truthiness" (which as of 2011 appeared in the Oxford English Dictionary). Just as an idea that is "truthy" appears to be or "feels like" the truth, whether it is actually true or not, a measurement or argument that shows good "face validity" seems like it should be right, regardless of whether or not it actually is. Metacritic typically enjoys high face validity in many circles, as it appears (on the surface at least) to be an unbiased aggregate overall score.
Statistical Conclusion Validity assesses whether the mathematical or statistical procedures used on the data are appropriate. This can be highly technical in the case of complicated experiments, but in the context of Metacritic this mostly boils down to whether Metacritic's approach to the mathematical aggregation of game review information could reasonably be expected to yield an accurate representation or assessment of game quality.
 
How Metacritic metascores are calculated
Marc Doyle and other Metacritic employees have been reasonably forthright on the subject of exactly how Metacritic calculates metascores. The website itself presents a layman's description of the process: "We carefully curate a large group of the world's most respected critics, assign scores to their reviews, and apply a weighted average to summarize the range of their opinions" (Metacritic 2012a). The site goes on to explain that:
Metascore is a weighted average in that we assign more importance, or weight, to some critics and publications than others, based on their quality and overall stature. (Metacritic 2012a)
This is an important point, as it illustrates one of the aspects of this process that often attracts the strongest criticism and confusion. By applying a mathematical "weight" to each individual score, Metacritic is asserting that the opinions of some publications or critics are more important than others. Predictably, this is not received well in all circles (Dodson 2006). Regardless, it appears that Metacritic follows the steps illustrated in Table 1 below when preparing and delivering a metascore.
 
| Step | Action taken by Metacritic |
| --- | --- |
| 1 | Identify "trusted" publications and critics from which it will draw scores. |
| 2 | Assign a "weight" to each of these based on how much Metacritic trusts or respects their work and judgment. |
| 3 | Gather individual reviews from these publications and critics. |
| 4 | Apply Metacritic's conversion scales to the original publication score. |
| 5 | Aggregate all scores into a weighted average using the individual scores from step 3 and the weights from step 2. |
| 6 | Publish these metascores on their website at metacritic.com. |

Table 1: Steps in Metacritic's metascore creation process
 
The ultimate outcome of this process is a single measurement that incorporates not only the individual score contributed by the critic or publication, but also Metacritic's assessment of the worth or reliability of that source.
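The aggregation step can be illustrated with a generic weighted average. Metacritic's actual weights and any further adjustments are proprietary, so the scores, the weights, and the function name `weighted_metascore` below are all hypothetical and serve only to show how weighting moves the result away from a plain mean.

```python
def weighted_metascore(reviews):
    """Weighted average of (score_0_to_100, weight) pairs.

    The weights stand in for the importance assigned to each critic;
    the values used below are invented for illustration.
    """
    total_weight = sum(weight for _, weight in reviews)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    return sum(score * weight for score, weight in reviews) / total_weight


# Three hypothetical converted review scores with hypothetical weights.
reviews = [(90, 1.5), (80, 1.0), (70, 0.5)]

weighted = weighted_metascore(reviews)
unweighted = weighted_metascore([(score, 1.0) for score, _ in reviews])

print("weighted:", round(weighted, 2))
print("unweighted:", round(unweighted, 2))
```

With these made-up numbers the weighted result is about 83.3, compared with a plain mean of 80, because the most heavily weighted source also gave the highest score. The same mechanism allows a single heavily weighted source to pull the aggregate in either direction.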
 
The validity of Metacritic metascores
As with any process related to subjective criticism, there are a number of areas of concern with regard to the calculation of Metacritic scores. Table 2 below summarizes some relevant concerns at each step of the broader assessment process (including the contribution of the original critic or publication).
 
| Action | Associated Potential Threats to Validity |
| --- | --- |
| Individual reviewer assigns a score based on their own opinion and scoring system. | Reviewer can be biased for or against the game, genre, series, studio, or publisher for any of a number of reasons. |
| | Reviewer can be influenced by previous iterations in a series. |
| | Reviewer can be influenced by other published scores for the game in question. |
| Metacritic gathers scores from individual sites. | Metacritic may miss a score from a publication or critic that they intend to track. |
| | Some important or useful scores may not be considered because Metacritic does not track them. |
| | Metacritic staff may misinterpret a reviewer's intent when assigning a score to reviews in which no quantitative score is provided. |
| Metacritic applies conversions to 100-point scale. | Metacritic's conversion system may distort the reviewer's intent (see Tables 3, 4, and 5 below). |
| Metacritic aggregates all scores into a weighted average. | Weighting may not accurately represent the general consensus of reviewers. |
| | Weights are assigned at the discretion of Metacritic and criteria for weighting are not transparent. |
| | A single highly divergent score from a highly-weighted publication can distort the overall metascore. |
| Metacritic publishes the metascore. | Consumers can misunderstand the meaning, relevance, or importance of a metascore. |

Table 2: Potential threats to validity associated with Metacritic's metareview process
 
The first, and in some ways the most basic, potential problem with Metacritic metascores is the inherently subjective nature of critical review. Not all critics agree on the quality of a given art object, product, or service (games could potentially be categorized as any of these, depending on features and/or distribution approach). An examination of the possibilities for a breakdown during critical review is well beyond the scope of this work, and isn't entirely germane to the issue of Metacritic's validity specifically, but is still important to note. At a minimum, there are several types of issues related to critical review at a basic level that need to be considered:
1.  Issues of reviewer bias stemming from reviewer attitudes toward the game, publisher, genre, development studio, or content area.
2.  Reviewer experience with game genres, games, or criticism in general.
3.  Editorial pressure stemming from personal or financial relationships between publishers or studios and publications.
4.  Reviewer peer pressure stemming from previously published reviews of the game in question.
All of these issues should be matters of concern when considering the reliability and accuracy of game reviews, and these represent fertile topics for future research. However, the focus of the present work is on the impact that Metacritic itself as an organization or information source has on the process.
 
Gathering Individual Reviews
Even if all reviews are reasonably on-target, a number of other potential pitfalls emerge as these reviews make their way into Metacritic's database. First, Metacritic makes it clear that they do not track reviews from all publications. The actual requirements for inclusion in Metacritic are not entirely transparent, but appear to include publication reputation, subjectively assessed review quality, and review quantity (Metacritic 2010c). Therefore, it is entirely possible that a review for a given game may appear in an untracked publication or source, and would therefore not be included in Metacritic's score. Further, although representatives of Metacritic have previously stated that there are certain publications that are regularly checked for reviews (Metacritic 2012b), Metacritic's staff may not become aware of a particular review of a game that appears in a tracked publication, either as a result of an oversight or because the review is inaccessible to them. Therefore, many relevant reviews may be overlooked either because the publication in question is not tracked, or because of a failure in the review collection or tracking process.
Even when a review is identified, there are certain cases where reviewers do not assign a score to a game. Under those circumstances, it is Metacritic's policy to also assign a score to a review when none exists based on a subjective assessment of reviewer intent by Metacritic staff Metacritic 2010a). Given the inherently inconsistent nature of subjective assessment, and the lack of inside knowledge of the reviewer's state of mind on the part of Metacritic staff, it is obviously possible that the reviewer's intent may not be appropriately understood and documented, posing a serious threat to validity.
 
Score Conversion
Different reviewers and publications use widely varying methods for quantifying the quality of a game. In cases where a quantitative score is available directly from the reviewer, Metacritic generally needs to convert the score used by the publication or reviewer into Metacritic's 0-100 scale format in order for it to be included in their database. Metacritic clearly lists their conversion system on their website with tables for 4-star scales (Table 4), traditional A-F scholastic grading scales (Table 5), and the rather obvious 1-10 scale conversion (Table 3). The conversion systems for other scales (thumbs up/thumbs down, go/wait/don't go, buy/rent/skip, etc.) are not included on Metacritic's website, and do not appear to be published elsewhere. It seems likely that these represent cases where Metacritic staff subjectively assign a 0-100 score directly, consistent with their policy as documented in Metacritic (2010a).
The translation of scores from one scale to another is a very problematic process from a validity perspective. Many of the difficulties lie in the distinction between the perceptions of the reviewer, the general public, and Metacritic staff on the meaning of certain specific ratings, particularly when there are preexisting problems with the scale used in the review.
This is nowhere seen more clearly than in the A-F conversion scale used by Metacritic, although it can be argued that the difficulty in this case is not really of Metacritic's making. The traditional A-F scholastic grading scale has long been known to be quite seriously flawed, most obviously with regard to a problem known as "restriction of range." Typically, a score of 100-90 is seen as an "A," an 89-80 as a "B," a 79-70 as a "C," and so on. Some scales use "+" and "-" modifiers to increase the granularity of the score, such that a 100-97 is seen as an A+, a 96-94 as an A, a 92-90 as an A-, an 89-87 as a B+, an 86-84 as a B, and so on. Regardless of which type of A-F scale is used, it quickly becomes apparent that the lowest possible grade (an "F") includes the entire set of scores from 50-0, making the range covered by "F" anywhere from 5-15 times larger (depending on whether or not one includes "+" and "-" grades) than any other category.
While broad exposure has conditioned individuals in the United States and other countries which commonly use this scale to accept the qualitative value of each of these grade categories, translating scores from varying rating systems into a true 0-100 scale represents a serious challenge, because the A-F system is actually based on only half of the 100 point range, as everything at or below 50 is simply an "F." This puts Metacritic in the unenviable position of choosing between using only half of their overall scale (thereby potentially artificially inflating the score above what was intended by the reviewer), or having to redefine the numeric values associated with each letter grade contrary to the established public perception of their value. By choosing the latter approach, Metacritic faces the validity challenge of a serious discrepancy between the numbers many reviewers expect will be associated with a letter grade, and those that are actually applied by Metacritic.
For instance, it can be seen in Table 5 that Metacritic assigns a score of 75 to a game rated as a B, and 67 to a game rated as a B-. This is of course confusing to individuals who consider a "B" in the context of the A-F scholastic grading scale to be a reasonably good outcome (typically, an 86-84%). By contrast, a 75 is seen as a weak grade. This discrepancy is even more pronounced for games with B-, C, D, and F ratings. The result is that Metacritic's conversion system may distort either the perception of the user as to what a score means, the intent of the reviewer, or both. This particular discrepancy has been widely documented elsewhere (Wingfield 2007; Boesky 2008).
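The size of this gap can be illustrated with a short calculation. The sketch below is only an illustration under stated assumptions: it takes the scholastic percentage bands quoted earlier in this section and the converted values published in Table 5, and treats the midpoint of each band as the value a reader would expect. The names `SCHOLASTIC_BANDS`, `METACRITIC_VALUE`, and `conversion_gap` are hypothetical and are not part of any Metacritic system.

```python
# Scholastic percentage bands as quoted in the text: grade -> (high, low).
SCHOLASTIC_BANDS = {
    "A-": (92, 90),
    "B+": (89, 87),
    "B": (86, 84),
}

# Metacritic's converted values for the same grades (Table 5).
METACRITIC_VALUE = {"A-": 91, "B+": 83, "B": 75}


def conversion_gap(grade: str) -> float:
    """Return the scholastic band midpoint minus Metacritic's converted score.

    Using the band midpoint as the "expected" value is an assumption made
    for illustration only.
    """
    high, low = SCHOLASTIC_BANDS[grade]
    midpoint = (high + low) / 2
    return midpoint - METACRITIC_VALUE[grade]


for grade in SCHOLASTIC_BANDS:
    print(grade, conversion_gap(grade))
```

Under these assumptions the gap is zero for an A-, 5 points for a B+, and 10 points for a B, which is consistent with the observation that the discrepancy grows as the letter grade falls.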
Other scales have their own conversion problems. On a 4-star scale, removing a single star drops a game to a 75% rating once the conversion is applied (see Table 4). This low level of granularity may artificially deflate a score contrary to the intent of the reviewer, and may additionally cause confusion among users of Metacritic.
 
Publication's rating | Metacritic's rating
10 | 100
9  | 90
8  | 80
7  | 70
6  | 60
5  | 50
4  | 40
3  | 30
2  | 20
1  | 10
0  | 0
Table 3: Metacritic score conversion: 10 to 100 point scale (from Metacritic 2012a)
 
Publication's rating | Metacritic's rating
4 stars   | 100
3.5 stars | 88
3 stars   | 75
2.5 stars | 63
2 stars   | 50
1.5 stars | 38
1 star    | 25
0.5 stars | 12
0 stars   | 0
Table 4: Metacritic score conversion: X out of 4 Stars to 100 point scale (from Metacritic 2012a)
 
Publication's rating | Metacritic's rating
A or A+ | 100
A-      | 91
B+      | 83
B       | 75
B-      | 67
C+      | 58
C       | 50
C-      | 42
D+      | 33
D       | 25
D-      | 16
F+      | 8
F or F- | 0
Table 5: Metacritic score conversion: A-F to 100 point scale (from Metacritic 2012a)
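The three published conversions amount to simple lookups, which can be sketched as follows. This is a minimal illustration assuming the values in Tables 3 to 5; the function name `to_metascore_scale` and the scale labels `"ten_point"`, `"four_star"`, and `"letter"` are hypothetical, and the sketch deliberately refuses any scale for which no conversion table has been published.

```python
# Table 5 values. D = 25 is read from the row between D+ and D- in Table 5.
LETTER_TO_SCORE = {
    "A+": 100, "A": 100, "A-": 91,
    "B+": 83, "B": 75, "B-": 67,
    "C+": 58, "C": 50, "C-": 42,
    "D+": 33, "D": 25, "D-": 16,
    "F+": 8, "F": 0, "F-": 0,
}

# Table 4 values, keyed by number of stars out of 4.
STARS_TO_SCORE = {
    4.0: 100, 3.5: 88, 3.0: 75, 2.5: 63, 2.0: 50,
    1.5: 38, 1.0: 25, 0.5: 12, 0.0: 0,
}


def to_metascore_scale(rating, scale: str) -> int:
    """Convert a publication's rating to the 0-100 scale using Tables 3-5."""
    if scale == "ten_point":
        # Table 3 lists whole-number ratings from 0 to 10 only.
        if rating not in range(0, 11):
            raise ValueError(f"rating {rating!r} is not listed in Table 3")
        return int(rating) * 10
    if scale == "four_star":
        return STARS_TO_SCORE[float(rating)]
    if scale == "letter":
        return LETTER_TO_SCORE[str(rating).strip().upper()]
    # Thumbs up/down, buy/rent/skip, etc. have no published table.
    raise ValueError(f"no published conversion for scale {scale!r}")


print(to_metascore_scale("B", "letter"))    # 75
print(to_metascore_scale(3, "four_star"))   # 75
print(to_metascore_scale(7, "ten_point"))   # 70
```

Note that a "B" and a 3-star review both land on 75, while a 7 out of 10 lands on 70, even though many readers would regard all three as broadly similar verdicts.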
 
Score Aggregation and Weighting
Metacritic has been quite open and consistent in stating that their metascores are calculated using a weighted average (Metacritic 2012a), in which each score is multiplied by a coefficient representing the quality or importance of the individual score in assessing the game as a whole. However, Metacritic has previously refused to comment on the specific weights they apply to various publications or reviewers in calculating metascores (Metacritic 2010b). This is understandable for several reasons. First, the weighting represents a proprietary system for Metacritic and could be seen as a form of intellectual property. Second, it represents a potentially volatile issue with regard to the public perception of certain reviewers and publications. This type of rating system inherently implies value judgments about the quality of publications and reviewers, which makes the information highly sensitive and potentially controversial. Additionally, little information is currently available regarding the exact process by which Metacritic assigns and maintains these weights.
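To make the idea of a weighted average concrete, the following sketch computes one from a list of converted scores. It is an illustration only: the weights are invented for the example, since Metacritic does not publish theirs, and dividing by the sum of the weights is the textbook form of a weighted mean, assumed here rather than confirmed by Metacritic. The function name `weighted_metascore` is hypothetical.

```python
def weighted_metascore(reviews):
    """Return the weighted mean of (score, weight) pairs.

    Each score is a 0-100 value; each weight is a non-negative coefficient
    standing in for the importance assigned to the publication.
    """
    total_weight = sum(weight for _, weight in reviews)
    if total_weight <= 0:
        raise ValueError("at least one review must have a positive weight")
    return sum(score * weight for score, weight in reviews) / total_weight


# Hypothetical weights: the first publication counts three times as much
# as the last one.
reviews = [(90, 1.5), (75, 1.0), (60, 0.5)]

print(weighted_metascore(reviews))              # 80.0
print(sum(score for score, _ in reviews) / 3)   # 75.0 (unweighted mean)
```

With these invented weights the same three reviews yield 80 rather than the unweighted 75, which shows why undisclosed weights make a published metascore difficult to verify independently.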
 
Score Presentation
Even once this process is complete, one potential problem remains: the color code Metacritic applies based on the numeric metascore a game receives. Scores in the 100-75 range are displayed to the user in green text, scores from 74 to 50 in yellow, and scores of 49 and below in red (see Table 6 below for the detailed breakdown as published by Metacritic). This may be seen as implying a more judgmental assessment on the part of Metacritic, such that green games are "good," yellow games are "moderate" in quality, and red games are "bad." Additionally, the sharpness of the category boundaries means that a game that scores a 74, and thereby misses the green category by a single point (1/100th of the scale), gets the same color code as a game that scores a 50. Taken together, these problems could distort user perceptions, such that users of Metacritic's site may perceive certain games as being much better than others when the actual difference is much more subtle.
 
General meaning of score | Metascore range (games) | Color
Universal Acclaim             | 90 - 100 | Green
Generally Favorable Reviews   | 75 - 89  | Green
Mixed or Average Reviews      | 50 - 74  | Yellow
Generally Unfavorable Reviews | 20 - 49  | Red
Overwhelming Dislike          | 0 - 19   | Red
Table 6: Metacritic score conversion: 100 point scale to color code (from Metacritic 2012a)
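The banding in Table 6 can be expressed as a simple threshold function, which makes the boundary effect easy to see. This is a minimal sketch based only on the published ranges; the function name `metascore_color` is hypothetical.

```python
def metascore_color(score: int) -> str:
    """Map a 0-100 metascore to the color band listed in Table 6."""
    if not 0 <= score <= 100:
        raise ValueError("metascore must be between 0 and 100")
    if score >= 75:
        return "green"
    if score >= 50:
        return "yellow"
    return "red"


print(metascore_color(75))  # green
print(metascore_color(74))  # yellow
print(metascore_color(50))  # yellow
```

A one-point difference between 74 and 75 changes the color, while the 24-point difference between 50 and 74 does not.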
 
Qualitative Validity Assessment
The investigation of validity in the first part of the paper identified a number of flaws in the methodology used by Metacritic to calculate metascores. Many of these deal explicitly with the translation of the intent of the reviewer to a 0-100 numeric score, but threats to validity and accuracy have been identified at every step of the process. Taken together, these findings raise concerns regarding the accuracy and validity of metascores as representations of the aggregate opinion of the community of game reviewers regarding the quality and value of specific games.
The issues associated with the translation of various reviewer scales to Metacritic's 100-point scale are particularly worrisome, not only because they affect the actual reliability of Metacritic as an assessment tool, with regard to both how appropriate the assessment process is (internal validity) and how reliable the measurements are (construct validity), but also because they seem to have a broad impact on the perception of the reliability of Metacritic as a whole (face validity).
However, these analyses are only half the story - equally important is direct observation of the actual accuracy and consistency with which Metacritic metascores predict or correlate with actual game sales.
 
Empirical Results
In general, the results of the statistical analysis in the results section above showed a very strong relationship between sales and scores, regardless of genre or platform. This is a very important point, as much of the criticism of Metacritic metascores centers around the idea that they fail to accurately represent product value. Our results showed fairly conclusively that there was a tight coupling between sales and scores, such that games with higher Metacritic metascores tended to have higher sales as well, across genre and platform. Accordingly, despite the threats to validity noted in earlier sections, it is difficult to argue against the value of Metacritic as an assessment tool when it shows itself to be such a clear bellwether of financial success in games.
Some important caveats exist, however. First, by the nature of the mechanisms of assessment available to the authors, the data collected are necessarily observational; that is to say, they allow us to talk about the correlation between sales and scores, but do not allow us to say definitively whether high scores cause high sales, or the converse, or whether a more complicated relationship exists involving other factors such as marketing or media exposure. One obvious interpretation would be that both high scores and high sales are correlated with game quality, which, while gratifying to proponents of Metacritic, is unfortunately only one of many possible explanations. Ultimately, the most likely interpretations would appear to be that (1) Metacritic is driving sales, (2) Metacritic is predicting sales, (3) Metacritic score and sales are both being driven by a third factor such as game quality, reviewer bias, or marketing activity, or (4) some combination of the above factors is in play. Regardless, the important point is that the strong relationship between the two would seem to suggest that Metacritic is a good benchmark for studios and publishers interested in assessing the financial value of individual games, whatever the industry or general public may think of its suitability as a measure of game quality.
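For readers who wish to explore the relationship themselves, the sketch below computes a Pearson correlation coefficient between metascore and sales for a handful of rows taken from Appendix A. It is not a reproduction of the analysis reported in this paper: the choice of Pearson's r, the seven-row sample, and the names `SAMPLE` and `pearson_r` are assumptions made purely for illustration, and, as noted above, a correlation of this kind says nothing about the direction of causation.

```python
import math

# (metascore, sales in millions) for seven Xbox 360 action titles listed in
# Appendix A: Naughty Bear, Dark Void, Mafia II, Prototype, Resident Evil 5,
# Red Dead Redemption, Grand Theft Auto IV.
SAMPLE = [
    (43, 0.19), (59, 0.21), (74, 0.56), (78, 1.20),
    (85, 2.86), (95, 3.61), (98, 8.12),
]


def pearson_r(pairs):
    """Return the Pearson correlation coefficient for (x, y) pairs."""
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)


print(round(pearson_r(SAMPLE), 2))
```

Because sales figures are heavily skewed toward a few very large titles, a rank-based measure or a log transform of sales might be a reasonable alternative; that choice is left open here.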
 
Correcting the Flaws in Metacritic
Despite the identification of serious concerns regarding the process Metacritic uses in gathering and aggregating scores, it is difficult in many cases to see how Metacritic could act to address them. Many of these, such as the score translation problem, arise either from the inherent drawbacks of metareviews in general, or as a result of decisions made by individual reviewers whose choices are outside of the control of Metacritic as an organization. For instance, an A-F rating scale is broadly regarded as a flawed scale, replete with validity issues on multiple levels, yet it continues to be used by some publications and reviewers (as well as most school districts in the United States). No action on the part of Metacritic, other than entirely excluding any score not formatted as a 0-100 scale, would address such scale translation issues entirely. Further, were Metacritic to take that drastic step, it would arguably produce a far worse outcome by providing a score based only on a few specific sources of reviews and excluding a large number of valid perspectives on a given game title.
One method of addressing criticisms of the "one size fits all" model of metascore generation (i.e. that it assumes that all users have the same tastes) might be the adoption of a more sophisticated individualized approach, in which users are provided with relative ratings for games based on their stated preferences or user review history. This could have the dual benefit of defusing the "absolute measurement" value that has attracted so much negative attention to Metacritic while providing a more personally relevant score to each particular user. The potential improvement in industry acceptance and specific user-focus might well be worth the increased complexity inherent in implementing such an approach.
Additionally, greater transparency about the weights and formula Metacritic uses to calculate metascores could help to reduce the mystery of how scores are calculated (and could thereby reduce suspicion on the part of industry members and users).
 
Conclusions
Overall, debate on this issue will almost certainly continue. However, a few things can be said with a fair degree of confidence. First, Metacritic's process for gathering, translating, and aggregating scores appears to be flawed at several levels. That being said, in many cases it is unclear how precisely these flaws could or should be addressed. Other issues may well be systemic to the community of game reviewers and publications. This factor may be particularly problematic to address because these individuals and groups do not appear to adhere in many cases to basic standards of journalistic and editorial professionalism. Examples of such standards which are routinely neglected by industry-targeted publications include avoiding or disclosing conflicts of interest on the part of reviewers and publications, clearly differentiating between paid or advertising content and news or opinion material, and consistently requiring relevant educational credentials or certifications of reviewers.
Ultimately, it is difficult to escape the conclusion that the strong empirical evidence for a close link between sales and scores argues strongly for the value of Metacritic as an assessment tool. Accordingly, it is naïve to expect publishers or other decision makers in the industry to abandon Metacritic as a yardstick anytime in the foreseeable future. Indeed, one might expect them to adopt the tool more fully in that role. The cases of Homefront and Bioshock also clearly indicate that financial markets and the broader business community consider Metacritic to be an important indicator of product quality and therefore company health, and are likely to continue to make judgments of the value of games based on metascores. This cannot help but further raise the profile and importance of Metacritic scores among shareholders, executives, and the general public. Additionally, the financial success of Metacritic and its high visibility indicate that it has come to play a significant, if not central, role in driving consumer purchasing decisions. Future research could productively establish the precise nature of that relationship, but in the meantime the industry should probably expect the influence of Metacritic to increase, rather than decrease.
One additional note of caution is in order as well: since Metacritic acknowledges that their metascore formula is based on a weighted average, it may well be the case that the only intellectual property of value that the company possesses, aside from its current visibility, is the proprietary list of reviewer weightings they use to derive these scores. The simplicity of Metacritic's approach to calculating aggregated metareviews may therefore make it vulnerable to upstart competitors who use more sophisticated approaches to calculate, display, or visualize data. If someone else finds a better, more easily accessible way to do what Metacritic currently does, the organization could quickly experience a ruinous fall from its current ascendancy.
 
Appendices
Appendix A: Scores vs. Sales Data
Genre | Game | System | Score | Sales (in millions)
Action | Naughty Bear | XBOX 360 | 43 | 0.19
Action | The Lord of the Rings: Conquest | XBOX 360 | 55 | 0.59
Action | Dark Void | XBOX 360 | 59 | 0.21
Action | Avatar: The Game | XBOX 360 | 61 | 0.58
Action | Transformers: Revenge of the Fallen | XBOX 360 | 61 | 0.51
Action | Ninja Blade | XBOX 360 | 68 | 0.23
Action | Deadly Premonition | XBOX 360 | 69 | 0.12
Action | Silent Hill: Homecoming | XBOX 360 | 70 | 0.38
Action | Dante's Inferno | XBOX 360 | 73 | 0.6
Action | Star Wars: The Force Unleashed | XBOX 360 | 73 | 2.41
Action | Mafia II | XBOX 360 | 74 | 0.56
Action | Prince of Persia: The Forgotten Sands | XBOX 360 | 74 | 0.3
Action | X-Men Origins: Wolverine | XBOX 360 | 75 | 0.58
Action | Prototype | XBOX 360 | 78 | 1.2
Action | Dead Rising 2 | XBOX 360 | 79 | 0.69
Action | Ghostbusters: The Video Game | XBOX 360 | 79 | 0.54
Action | Mirror's Edge | XBOX 360 | 79 | 1.08
Action | Assassin's Creed | XBOX 360 | 81 | 4.97
Action | Just Cause 2 | XBOX 360 | 81 | 0.76
Action | Ninja Gaiden II | XBOX 360 | 81 | 0.98
Action | Saints Row 2 | XBOX 360 | 81 | 1.98
Action | Brutal Legend | XBOX 360 | 82 | 0.7
Action | Alan Wake | XBOX 360 | 83 | 0.81
Action | Castlevania: Lords of Shadow | XBOX 360 | 83 | 0.21
Action | Darksiders | XBOX 360 | 83 | 0.71
Action | Devil May Cry 4 | XBOX 360 | 84 | 1.27
Action | Dead Rising | XBOX 360 | 85 | 1.82
Action | Resident Evil 5 | XBOX 360 | 85 | 2.86
Action | Tom Clancy's Splinter Cell: Conviction | XBOX 360 | 85 | 1.55
Action | Tom Clancy's Splinter Cell: Double Agent | XBOX 360 | 85 | 1.18
Action | Assassin's Creed II | XBOX 360 | 90 | 4.28
Action | Bayonetta | XBOX 360 | 90 | 0.71
Action | Batman: Arkham Asylum | XBOX 360 | 92 | 1.65
Action | Red Dead Redemption | XBOX 360 | 95 | 3.61
Action | Grand Theft Auto IV | XBOX 360 | 98 | 8.12
Action | Iron Man 2 | PS3 | 41 | 0.16
Action | Naughty Bear | PS3 | 43 | 0.16
Action | Lair | PS3 | 53 | 0.41
Action | Fist of the North Star: Ken's Rage | PS3 | 58 | 0.56
Action | Way of the Samurai 3 | PS3 | 58 | 0.44
Action | Dark Void | PS3 | 59 | 0.2
Action | Avatar: The Game | PS3 | 60 | 0.63
Action | Dynasty Warriors 6 Empires | PS3 | 62 | 0.28
Action | Transformers: Revenge of the Fallen | PS3 | 63 | 0.44
Action | Silent Hill: Homecoming | PS3 | 64 | 0.23
Action | Star Wars: The Force Unleashed | PS3 | 71 | 1.63
Action | X-Men Origins: Wolverine | PS3 | 73 | 0.59
Action | Dante's Inferno | PS3 | 75 | 0.72
Action | Prince of Persia: The Forgotten Sands | PS3 | 75 | 0.41
Action | Spider-Man: Shattered Dimensions | PS3 | 75 | 0.17
Action | Ghostbusters: The Video Game | PS3 | 78 | 0.62
Action | Heavenly Sword | PS3 | 79 | 1.44
Action | Mirror's Edge | PS3 | 79 | 0.9
Action | Prototype | PS3 | 79 | 0.98
Action | Assassin's Creed | PS3 | 81 | 3.83
Action | Darksiders | PS3 | 82 | 0.77
Action | Saints Row 2 | PS3 | 82 | 1.17
Action | Brutal Legend | PS3 | 83 | 0.55
Action | Just Cause 2 | PS3 | 83 | 0.83
Action | Ninja Gaiden Sigma 2 | PS3 | 83 | 0.53
Action | Devil May Cry 4 | PS3 | 84 | 1.31
Action | Castlevania: Lords of Shadow | PS3 | 85 | 0.26
Action | Infamous | PS3 | 85 | 1.71
Action | Resident Evil 5 | PS3 | 86 | 3.52
Action | Bayonetta | PS3 | 87 | 0.78
Action | Uncharted: Drake's Fortune | PS3 | 88 | 3.35
Action | Assassin's Creed II | PS3 | 91 | 4.05
Action | Batman: Arkham Asylum | PS3 | 91 | 2.08
Action | God of War III | PS3 | 92 | 3.13
Action | Metal Gear Solid 4: Guns of the Patriots | PS3 | 94 | 5
Action | Red Dead Redemption | PS3 | 95 | 3.01
Action | Uncharted 2: Among Thieves | PS3 | 96 | 3.81
Action | Grand Theft Auto IV | PS3 | 98 | 6.91
FPS | ShellShock 2: Blood Trails | XBOX 360 | 30 | 0.09
FPS | History Channel: Battle for the Pacific | XBOX 360 | 35 | 0.05
FPS | Hour of Victory | XBOX 360 | 37 | 0.14
FPS | America's Army: True Soldiers | XBOX 360 | 43 | 0.1
FPS | NPPL Championship Paintball 2009 | XBOX 360 | 44 | 0.11
FPS | Legendary | XBOX 360 | 47 | 0.08
FPS | History Civil War: Secret Missions | XBOX 360 | 51 | 0.15
FPS | Conflict: Denied Ops | XBOX 360 | 52 | 0.18
FPS | History Channel: Civil War - A Nation Divided | XBOX 360 | 53 | 0.14
FPS | Velvet Assassin | XBOX 360 | 56 | 0.13
FPS | 007: Quantum of Solace | XBOX 360 | 65 | 1.14
FPS | Section 8 | XBOX 360 | 69 | 0.22
FPS | Wolfenstein | XBOX 360 | 72 | 0.45
FPS | Medal of Honor | XBOX 360 | 74 | 1.45
FPS | Frontlines: Fuel of War | XBOX 360 | 75 | 0.54
FPS | Brothers in Arms: Hell's Highway | XBOX 360 | 76 | 0.84
FPS | Operation Flashpoint: Dragon Rising | XBOX 360 | 76 | 0.85
FPS | Singularity | XBOX 360 | 76 | 0.13
FPS | Call of Juarez: Bound in Blood | XBOX 360 | 77 | 0.55
FPS | Metro 2033 | XBOX 360 | 77 | 0.32
FPS | Prey | XBOX 360 | 79 | 0.3
FPS | Perfect Dark Zero | XBOX 360 | 81 | 1.32
FPS | Call of Duty 3 | XBOX 360 | 82 | 2.36
FPS | The Chronicles of Riddick: Assault on Dark Athena | XBOX 360 | 82 | 0.25
FPS | Tom Clancy's Rainbow Six: Vegas 2 | XBOX 360 | 82 | 2.39
FPS | Unreal Tournament III | XBOX 360 | 82 | 0.46
FPS | Halo 3: ODST | XBOX 360 | 83 | 5.75
FPS | Battlefield: Bad Company | XBOX 360 | 84 | 1.4
FPS | Borderlands | XBOX 360 | 84 | 1.94
FPS | Call of Duty: World at War | XBOX 360 | 84 | 6.57
FPS | Far Cry 2 | XBOX 360 | 85 | 1.54
FPS | FEAR 2: Project Origin | XBOX 360 | 85 | 0.47
FPS | Tom Clancy's Ghost Recon Advanced Warfighter 2 | XBOX 360 | 86 | 1.48
FPS | Battlefield: Bad Company 2 | XBOX 360 | 88 | 2.65
FPS | Bioshock 2 | XBOX 360 | 88 | 1.52
FPS | Call of Duty 2 | XBOX 360 | 89 | 2.47
FPS | Left 4 Dead | XBOX 360 | 89 | 2.92
FPS | Left 4 Dead 2 | XBOX 360 | 89 | 3
FPS | Tom Clancy's Ghost Recon Advanced Warfighter | XBOX 360 | 90 | 2.28
FPS | Halo Reach | XBOX 360 | 91 | 6.27
FPS | Call of Duty: Modern Warfare 2 | XBOX 360 | 94 | 11.87
FPS | Halo 3 | XBOX 360 | 94 | 11.26
FPS | Bioshock | XBOX 360 | 96 | 2.6
FPS | Rogue Warrior | PS3 | 27 | 0.08
FPS | Soldier of Fortune: Payback | PS3 | 50 | 0.08
FPS | 007: Quantum of Solace | PS3 | 65 | 0.89
FPS | Wolfenstein | PS3 | 71 | 0.42
FPS | Medal of Honor | PS3 | 75 | 1.24
FPS | Brothers in Arms: Hell's Highway | PS3 | 76 | 0.73
FPS | MAG | PS3 | 76 | 0.97
FPS | Operation Flashpoint: Dragon Rising | PS3 | 76 | 0.73
FPS | Singularity | PS3 | 77 | 0.12
FPS | Call of Juarez: Bound in Blood | PS3 | 78 | 0.64
FPS | FEAR 2: Project Origin | PS3 | 79 | 0.34
FPS | Call of Duty 3 | PS3 | 80 | 0.7
FPS | The Chronicles of Riddick: Assault on Dark Athena | PS3 | 80 | 0.16
FPS | Tom Clancy's Rainbow Six: Vegas 2 | PS3 | 81 | 1.2
FPS | Condemned 2: Bloodshot | PS3 | 82 | 0.31
FPS | Borderlands | PS3 | 83 | 0.82
FPS | Call of Duty: World at War | PS3 | 85 | 4.29
FPS | Far Cry 2 | PS3 | 85 | 1.13
FPS | Resistance | PS3 | 86 | 3.71
FPS | Unreal Tournament III | PS3 | 86 | 0.57
FPS | Resistance 2 | PS3 | 87 | 2.01
FPS | Battlefield: Bad Company 2 | PS3 | 88 | 1.95
FPS | Bioshock 2 | PS3 | 88 | 0.82
FPS | Killzone 2 | PS3 | 91 | 2.42
FPS | Bioshock | PS3 | 94 | 0.72
FPS | Call of Duty: Modern Warfare 2 | PS3 | 94 | 8.86
RPG | Operation Darkness | XBOX 360 | 46 | 0.03
RPG | Two Worlds | XBOX 360 | 50 | 0.48
RPG | Kingdom Under Fire: Circle of Doom | XBOX 360 | 55 | 0.32
RPG | Spectral Force 3 | XBOX 360 | 59 | 0.07
RPG | Risen | XBOX 360 | 60 | 0.12
RPG | Divinity II: Ego Draconis | XBOX 360 | 62 | 0.14
RPG | Alpha Protocol | XBOX 360 | 63 | 0.19
RPG | Phantasy Star Universe | XBOX 360 | 64 | 0.1
RPG | Too Human | XBOX 360 | 65 | 0.72
RPG | Final Fantasy XI: Online | XBOX 360 | 66 | 0.22
RPG | The Last Remnant | XBOX 360 | 66 | 0.64
RPG | Nier | XBOX 360 | 67 | 0.14
RPG | Infinite Undiscovery | XBOX 360 | 68 | 0.6
RPG | Enchanted Arms | XBOX 360 | 69 | 0.19
RPG | Record of Agarest War | XBOX 360 | 71 | 0.14
RPG | Sacred 2: Fallen Angel | XBOX 360 | 71 | 0.43
RPG | Star Ocean: The Last Hope | XBOX 360 | 72 | 0.64
RPG | Marvel Ultimate Alliance 2 | XBOX 360 | 73 | 0.74
RPG | Resonance of Fate | XBOX 360 | 74 | 0.2
RPG | Culdcept SAGA | XBOX 360 | 75 | 0.17
RPG | Lost Odyssey | XBOX 360 | 78 | 0.84
RPG | Blue Dragon | XBOX 360 | 79 | 0.56
RPG | Eternal Sonata | XBOX 360 | 79 | 0.25
RPG | Tales of Vesperia | XBOX 360 | 79 | 0.54
RPG | Final Fantasy XIII | XBOX 360 | 82 | 1.62
RPG | Marvel: Ultimate Alliance | XBOX 360 | 82 | 2.48
RPG | Dragon Age: Origins | XBOX 360 | 86 | 1.86
RPG | Fable II | XBOX 360 | 89 | 3.9
RPG | Mass Effect 2 | XBOX 360 | 91 | 2.21
RPG | Fallout 3 | XBOX 360 | 93 | 3.4
RPG | The Elder Scrolls IV: Oblivion | XBOX 360 | 94 | 3.43
RPG | Mass Effect | XBOX 360 | 96 | 2.32
RPG | Last Rebellion | PS3 | 44 | 0.05
RPG | Untold Legends: Dark Kingdom | PS3 | 58 | 0.13
RPG | Trinity Universe | PS3 | 62 | 0.09
RPG | Enchanted Arms | PS3 | 64 | 0.21
RPG | White Knight Chronicles: International Edition | PS3 | 64 | 0.68
RPG | Atelier Rorona: Alchemist of Arland | PS3 | 65 | 0.16
RPG | Record of Agarest War | PS3 | 67 | 0.03
RPG | Nier | PS3 | 68 | 0.31
RPG | Sacred 2: Fallen Angel | PS3 | 70 | 0.42
RPG | Alpha Protocol | PS3 | 72 | 0.19
RPG | Resonance of Fate | PS3 | 72 | 0.43
RPG | Marvel Ultimate Alliance 2 | PS3 | 74 | 0.59
RPG | Star Ocean: The Last Hope | PS3 | 74 | 0.4
RPG | Folklore | PS3 | 75 | 0.21
RPG | 3D Dot Game Heroes | PS3 | 77 | 0.26
RPG | Marvel: Ultimate Alliance | PS3 | 78 | 0.31
RPG | Eternal Sonata | PS3 | 80 | 0.17
RPG | Final Fantasy XIII | PS3 | 83 | 4.37
RPG | Valkyria Chronicles | PS3 | 86 | 0.94
RPG | Dragon Age: Origins | PS3 | 87 | 1
RPG | Demon's Souls | PS3 | 89 | 0.82
RPG | Fallout 3 | PS3 | 90 | 2.37
RPG | The Elder Scrolls IV: Oblivion | PS3 | 94 | 1.99
 
References
Baker, L. (2011) THQ shares fall on reviews of "Homefront" war game. Reuters, March 15, 2011. Available at: http://www.reuters.com/article/2011/03/15/us-thq-shares-idUSTRE72E7E620110315 [Accessed: 9 May 2011].
Brightman, J. (2009) Interview: John Riccitiello on E3, fighting piracy, metacritic, and more. Available at: http://www.industrygamers.com/news/interview-john-riccitiello-on-e3-fighting-piracy-metacritic-and-more/3/ [Accessed: 8 February 2012].
Boesky, K. (2008) Opinion: Why EA, the industry shouldn't rely on metacritic. Gamasutra, May 23, 2008. Available at: http://www.gamasutra.com/php-bin/news_index.php?story=18562 [Accessed: 15 August 2011].
Dodson, J. (2006) Mind over meta. GameRevolution, July 14, 2006. Available at: http://www.gamerevolution.com/features/mind_over_meta [Accessed: 9 May 2011].
Doyle, M. (2011) About Metacritic. Available at: http://www.metacritic.org/about/ [Accessed: 27 September 2011].
Dring, C. (2010) EA's Moore: Metacritic mania a slippery slope. Develop, 7/20/2010. Available at: http://www.develop-online.net/news/35425/EAs-Moore-Metacritic-mania-a-slippery-slope [Accessed: 18 August 2012].
Everiss, B. (2008) Metacritic has changed the games industry. Available at: http://www.bruceongames.com/2008/06/04/metacritic-has-changed-the-games-industry/ [Accessed: 8 February 2012].
Fahey, M. (2011) Dragon Age II dev rates his own game on Metacritic, EA bets Obama voted for himself too. Kotaku, 3/15/2011. Available at: http://kotaku.com/5782097/dragon-age-ii-dev-rates-his-own-game-on-metacritic-ea-bets-obama-voted-for-himself-too [Accessed: 18 August 2012].
Gilbert, B. (2012) Obsidian missed Fallout: New Vegas Metacritic bonus by one point. Available at: http://www.joystiq.com/2012/03/15/obsidian-missed-fallout-new-vegas-metacritic-bonus-by-one-point/ [Accessed: 9 May 2011].
Graft, K., Sheffield, B., Nutt, C., Rose, M., Cifaldi, F., Caoili, E., Alexander, L., Miller, P., Curtis, T. (2012) Ask Gamasutra: 84 Metacritic need not apply. Gamasutra, 7/24/2012. Available at: http://gamasutra.com/view/news/174829/Ask_Gamasutra_84_Metacritic_need_not_apply.php [Accessed: 20 August 2012].
Greenwood-Ericksen, A. (2011) On the Role of Metacritic in the Game Industry. Available at: http://www.fsoblogs.com/gdms/?currentPage=4 [Accessed: 15 August 2011].
Lazarus, D. (2001) "San Francisco Chronicle, 2001". Sfgate.com, cited in Wikipedia. Available at: http://en.wikipedia.org/wiki/Rotten_Tomatoes 8/15/2011 [Accessed: 4 December 2009].
McDonald, K. (2012) Is metacritic ruining the games industry? IGN, 6/16/2012. Available at: http://www.ign.com/articles/2012/07/16/is-metacritic-ruining-the-games-industry?utm_source=Monday%20newsletter&utm_medium=email&utm_campaign=7.17%20Dynamic%20Newsletter_NO%20FNAME_5949_280581_280628&utm_content=16600107 [Accessed: 18 August 2012].
Metacritic. (2010a) I read Manohla Dargis' review of [MOVIE NAME] and I swear it sounded like a 90... why did you say she gave it an 80? Available at: https://metacritic.custhelp.com/app/answers/detail/a_id/1501/session/L3Nuby8wL3NpZC9DOFVxQkczaw [Accessed: 20 May 2012].
Metacritic. (2010b) Can you tell me how each of the different critics are weighted in your formula? Available at: https://metacritic.custhelp.com/app/answers/detail/a_id/1507/session/L3Nuby8wL3NpZC9DOFVxQkczaw [Accessed: 27 May 2012].
Metacritic. (2010c) Why don't you have 97 reviews for every movie like those other websites do? Available at: http://metacritic.custhelp.com/app/answers/detail/a_id/1510/session/L2F2LzEvdGltZS8xMzM5NDI2MTYyL3NpZC90anp0NnAtaw%3D%3D/sno/0 [Accessed: 13 June 2012].
Metacritic. (2012a) How we create the metascore magic. Available at: http://www.metacritic.com/about-metascores [Accessed: 20 May 2012].
Metacritic. (2012b) Which critics and publications are included in your calculations? Available at: https://metacritic.custhelp.com/app/answers/detail/a_id/1508/session/L3Nuby8wL3NpZC9DOFVxQkczaw [Accessed: 13 June 2012].
Murdoch, J. (2010) Metacritic: Gaming the score. Gamepro. Available at: http://www.gamepro.com/article/features/214841/gaming-the-score-metacritic/ [Accessed: 28 September 2011].
Nutt, C. (2012) How Creative Assembly's process breeds quality. Gamasutra, 4/30/2012. Available at: http://gamasutra.com/view/feature/169354/how_creative_assemblys_process_.php [Accessed: 30 April 2012].
O'Rourke, K. (2007) "An historical perspective on meta-analysis: dealing quantitatively with varying study results". Journal of the Royal Society of Medicine, 100 (12): 579-582. doi:10.1258/jrsm.100.12.579. PMID 18065712.
Periera, C. (2012) OP-ED: Metacritic presents real problems for the industry. 1UP, 7/16/2012. Available at: http://www.1up.com/news/metacritic-presents-problems-industry [Accessed: 18 August 2012].
Pham, A., & Fritz, B. (2011) Bad reviews of Homefront send THQ shares tumbling. The Los Angeles Times, March 16, 2011. Available at: http://articles.latimes.com/2011/mar/16/business/la-fi-ct-thq-homefront-20110316 [Accessed: 9 May 2012].
Sinclair, B. (2011) Jurassic Park user reviews abused. Gamespot, 11/17/2012. Available at: http://uk.gamespot.com/features/jurassic-park-user-reviews-abused-6346288/?tag=updates%3Beditor%3Ball%3Btitle%3B2 [Accessed: 18 August 2011].
Wingfield, N. (2007) High scores matter to game makers, too. The Wall Street Journal, September 20, 2007. Available at: http://online.wsj.com/public/article/SB119024844874433247-EnpxM1F6fI9YZDofC7VnyPzVrGQ_20070920.html [Accessed: 15 August 2012].