The Fuzzy Math of Ratings & Reviews

My wife and I took the kids to see Monsters vs. Aliens recently. Seth Rogen as Bob the Blob was funny, and Hugh Laurie was excellent as Dr. Cockroach. A handful of great one liners, but overall a middling effort in the kid movie genre.

The next day – after an appropriate period of contemplation to absorb the deeper lessons of an animated, 3D tale – I asked my wife what she would score the movie on a 5 point scale. She said 2.

I asked her what she would score it on a 10 point scale. She said 6.

I asked her what she would score it on a 100 point scale. She said 70.

2, 6, 70

So what? Well, the statistically-astute readers of Convince & Convert already figured it out, but my wife’s 3 scores – given at the same time, by the same person, for the same movie – are in reality quite different when normalized on a 100 point scale.

2 out of 5 normalizes to 40% on a 100 point scale.

6 out of 10 normalizes to 60% on a 100 point scale.

70 out of 100 of course normalizes to 70% on a 100 point scale.

Thus, it appears that my wife’s appreciation for the cinematic grandeur that is Monsters vs. Aliens jumps by 75% (the relative increase from 40 to 70) based solely on the width of the rating scale.
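
The arithmetic above is easy to check in a few lines. Here is a minimal Python sketch; the function name `normalize` is made up for illustration, and it assumes (as the numbers above do) that a score is simply divided by the top of its scale.

```python
def normalize(score, scale_max):
    """Express a score as a percentage of the top of its scale."""
    return score * 100 / scale_max

# The three scores given for the same movie: (score, top of scale)
scores = [(2, 5), (6, 10), (70, 100)]
normalized = [normalize(score, scale_max) for score, scale_max in scores]
print(normalized)

# Relative increase from the 5 point result to the 100 point result
low, high = normalized[0], normalized[-1]
relative_jump = (high - low) / low * 100
print(f"{relative_jump:.0f}% jump from the 5 point to the 100 point scale")
```

Note that the 75% figure is a relative change (30 points on a base of 40), not a 75 point difference.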

A 3 Star Hotel, 1 For Each Cockroach

We confront this issue of scale bias every day. Netflix attempts to recommend movies on a 5 point scale (and they have offered a million dollars for improvements to their matching algorithm). TripAdvisor and most other travel sites ask consumers to rate hotels and attractions on a 5 point scale. The Hottie and the Fatso restaurant review blog and podcast authored by my wife and me also uses a 5 point scale.

Interestingly, PriceLine uses a 10 point scale.

Review aggregators like Metacritic and Rotten Tomatoes use a 100 point scale.

As consumers become more and more prone to critiquing everything around them (Forrester research indicates that 37% of U.S. Internet users fall into the “critic” category on their Social Technographics Ladder), a 75% swing in consumer preference based on rating scale could mean real money for businesses.

Imagine a hotel has 9 ratings on TripAdvisor. Four of the ratings are 4 stars. Five of the ratings are 3 stars. The average rating is 3.44, which would round up to a 3.5 score. However, if a 10th rating is added at 1 star, it would bring the average down to 3.20, rounded down to an overall score of 3.
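
To see how one review moves the displayed score, here is a small Python sketch of that hotel example. It assumes the site shows the plain average rounded to the nearest half star; that rounding rule is an assumption for illustration, not TripAdvisor's documented method, and `displayed_score` is a hypothetical helper name.

```python
def displayed_score(ratings):
    """Return (average, average rounded to the nearest half star).

    The half star rounding is an assumed display rule, not a documented one.
    """
    average = sum(ratings) / len(ratings)
    return average, round(average * 2) / 2

ratings = [4] * 4 + [3] * 5             # four 4 star and five 3 star reviews
before = displayed_score(ratings)
after = displayed_score(ratings + [1])  # one additional 1 star review

print(f"before: average {before[0]:.2f}, displayed {before[1]}")
print(f"after:  average {after[0]:.2f}, displayed {after[1]}")
```

Under that assumption, a single 1 star review is enough to push the hotel across the rounding boundary from 3.5 to 3.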

You might give an especially bad hotel a 1 on a 5 point scale. But you would probably not give the same hotel a 2 on a 10 point scale (which is the same 20% rating). You would probably also not give the hotel a 20 on a 100 point scale. Why? Because 1 out of 5 seems normal. Two out of 10 and 20 out of 100 seem punitive and unnecessary.

What’s the big deal, you say? I researched hotels in Copenhagen recently and the first hotel rated 3.5 on TripAdvisor ranks 16th out of 130 total hotels. The first hotel rated 3 ranks 41st. A potentially significant difference in visibility and bookings.

To me, the 5 point scale is too restrictive and allows for little nuance. And the 100 point scale is too big. Scores almost never fall below 50, because most people innately think of education scoring and its 90 = A, 80 = B scheme when using 100 point scales. The 100 point wine criticism system popularized by Robert Parker is victimized by this scenario. I’ve watched nearly every episode of Gary Vaynerchuk’s WineLibraryTV and even “bad” wines never score below 70 points. In essence, the 100 point scale is really a 30 or 40 point scale from 60/70 to 100. Not optimal.
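
One way to picture that compression: if scores only ever land between a floor and 100, stretching that band back out to 0 through 100 shows how much room the scale really offers. A rough Python sketch, assuming a floor of 70 as in the wine example above; `rescale` is a hypothetical helper for illustration, not something any critic actually uses.

```python
def rescale(score, floor=70, ceiling=100):
    """Map a score from the band actually in use onto a full 0-100 range."""
    return (score - floor) * 100 / (ceiling - floor)

# A "bad" 70, a middling 85, and a perfect 100 on the compressed scale
for score in (70, 85, 100):
    print(score, "->", rescale(score))
```

Seen this way, an 85 sits at the midpoint of the usable band rather than near the top.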

I prefer the 10 point scale. It provides enough room to make important distinctions, but not so much room it throws off the accuracy of the scale.

Maybe I need to change my restaurant reviews to a 10 point system…

(photo by Jelene)

  • http://amymengel.com/ amymengel

    Jay, this post is really intriguing. The five-point system probably is too restrictive. I wonder how much the visual presentation has to do with it. I’ve taken 10-point-scale surveys before and when you have a list of questions/statements with 10 radio buttons after each, it just looks so cluttered and I tend not to complete the survey. If I’m just asked to enter a number in a text box, the survey “looks” easier and I’m more likely to fill it out. The 5-point scale generally can be presented a little more cleanly and usually doesn’t look as daunting.

    I think that I tend to be more punitive than laudatory. If I stay at a hotel and have a poor experience, I have no qualms giving it a 1 (no matter if 5- or 10-point scale). But I rarely rate good experiences with the top score of 5 or 10. I think I have higher expectations for the top end of the scale and need to be totally or completely wowed to warrant giving a top score.

    Unfortunately, brands and services often cannot control the scoring system that review systems use, so hotels are at the mercy of TripAdvisor’s 5-point scale and wines are at the mercy of the Parker 100-point scale. What’s likely to happen, if not already, is that brands will start trying to game these systems to their advantage to move up in the rankings… incentivizing people to write reviews, writing ghost reviews, etc. These measures may improve a ranking, but do nothing to actually change the company’s approach to serving customers better or improving its product.

    Thanks for a very thought-provoking post.

    @amymengel

  • http://www.catywampus.com/ Doug Cholewa

    While the 5-point system may seem too restrictive from a stats perspective, I’d venture that it’s more manageable from a usability perspective.

    Studies show that too much choice creates anxiety; and while a 10-point system falls strictly in line with the grading system that we’ve all grown up with in school, do the average critics really want to put that much thought into their reviews?

    Personally, I’d rather give them a less-perfect 5-point scale than risk having them get so stressed out that they don’t bother to post a review at all.

    @catywampus

  • Jacki Mieler

    Great post! I do agree with you on the 10 point scale probably being more effective than the 5 point scale. When I’ve rated things on a 10-point scale, I’m not afraid to get down into 1’s, 2’s and 3’s if something is truly deserving of that rating. I think the main problem with a 5 point scale is that we equate it with the verbiage that we often find associated with these scales – 5=Excellent, 4=Good, 3=Indifference, 2=Poor, 1=Awful. Especially when it comes to travel and food, I have so many more emotions between “Excellent” and “Good” and separating them by one number doesn’t really do it justice.

    But, with Trip Advisor, I put much more credibility in the comments than I do in the actual number ratings. For instance, I recently planned my entire wedding/honeymoon in Belize based on what people said on TripAdvisor. I pored over the site for days, taking in all of the comments, trying to make an informed decision. At the end of the day, we ended up choosing the property that was perfect for us. Sure, there were some negative comments about the property and some poor ratings, but I balanced those with the positive and took into consideration what they were complaining about.

    On a related note, I have not done my review of this property on Trip Advisor yet. To make a long story short, we had a situation where a staff member stole some money out of our suitcase. Yes, this detracted from our overall experience, but I loved the place so much that I would go back again (provided they install safes). So, in the numbers ratings, I would rate them extremely high, but I would put my caution about protecting your valuables in the comments. If someone was basing their decisions purely on the numbers, they would see me as having a near perfect experience, even though we had money stolen from us. Bottom line – it’s not just about the numbers, you have to take other things into consideration when making travel decisions.

    @JackiMieler

  • http://stevesme.name/ Steve Douglas

    Jay, I think you are on to something, but you’re looking at it from the wrong angle. You’re looking at it from the angle of what most companies are doing. If you look at Best Buy, they do reviews from a granular perspective (price, ease of use, weight, etc.). If you took this granular approach to restaurants, you could look at last health food score, overall display of food, how busy the establishment was and more.

    Just my thoughts, hope you come up with the ultimate solution.

  • http://twitter.com/lisamloeffler/status/1438702697 Lisa Loeffler

    If you are a #s guy or gal you’ll appreciate @jaybaer blog post about ratings & reviews. Ugh. Too early for math. http://tinyurl.com/czg3ks

  • http://www.colinandbethany.com/ Colin Jensen

    It’s because of school. When you ask her vote out of 5, her “what grade would I give it” filter doesn’t kick in, and neither then her “but 40% is an F, and I’m sure the people who made these movies are good people” filter. I say that as one more teacher who always inflates grades for exactly that reason–an A student will repent if you give him a B, but a D student will never try again if you give him an F.

  • http://twitter.com/farwalker/status/1440192838 Dale Walker

    Interesting blog post by @jaybear about how the scale of a rating system affects the results http://tr.im/ia1Q

  • http://twitter.com/charityhisle/status/1440899015 Charity Hisle

    @jaybaer Thought provoking post on customer reviews and scales. http://tinyurl.com/czg3ks

  • http://www.twitter.com/lena_ Lena

    Interesting post…it would be curious to run a broader survey of more people that also included up/down voting.

    I feel like when someone believes they’re more of an expert in a subject, providing the nuance of a star rating (5 for possibly a lower level of expertise, 10 for greater) is beneficial. But barring that, “yes I’d recommend this” or “no I wouldn’t” is a much simpler decision to make for most people, and a lot of ratings sites are testing this theory out.

  • http://twitter.com/jessenewhart/status/1447178031 Jesse Newhart

    The Fuzzy Math of Ratings & Reviews: http://bit.ly/MivN by @JayBaer

  • http://twitter.com/meepbobeep/status/1447211985 Mary Pat Campbell

    I can’t say I’m surprised. Framing is important. RT @JesseNewhart: The Fuzzy Math of Ratings & Reviews: http://bit.ly/MivN by @JayBaer

  • http://twitter.com/mailgeek/status/1447238668 mailgeek

    RT @JesseNewhart: The Fuzzy Math of Ratings & Reviews: http://bit.ly/MivN by @JayBaer

  • http://twitter.com/docudramaqueen/status/1447245226 docudramaqueen
  • http://twitter.com/gidjin/status/1447245924 John Gedeon

    Interesting. I like 5pts with 1/2s so 10. RT @JesseNewhart: The Fuzzy Math of Ratings & Reviews: http://bit.ly/MivN by @JayBaer

  • http://twitter.com/purecognition/status/1447307842 Nathan Lauffer

    RT: @JesseNewhart: The Fuzzy Math of Ratings & Reviews: http://bit.ly/MivN by @JayBaer

  • http://hftesting.com/ Regis Magyar

    The problem with all rating scales is that the respondent is confronted with the question “Compared to what?” In my profession I avoid traditional rating scales and use a paired comparison procedure in which the user is asked to compare two items at a time and simply decide if one item is better, worse or the same as another item. This procedure then quantitatively ranks each item in a preference listing with scores showing how much greater or less each item is relative to another item. In the current text, the user should have been asked to compare several films of the desired type to see how the Monsters movie compared to other similar animated flicks.

  • http://www.freshnetworks.com/ Charlie Osmond

    Interesting findings. But I think the real problems with ratings and reviews are even deeper. It is covered in more detail here:

    the lies behind online ratings and reviews

  • http://twitter.com/jose602 jose

    Good read! Similarly, I find the ranking of news stories on Yahoo and news sites frustratingly imprecise. When people click the stars to rank a news item are they rating the content? The writing? The subject of the news item? It muddies the notion of quality (wrt specific news items) to be that imprecise, I think.

    At least with Digg and other websites, it’s clearly A) you recommend this and want other people to read this or B) you don’t recommend it and think it would waste people’s time.

  • http://twitter.com/echerub/status/6904661946 Leonard Chu

    5 Stars? 10/10? 100%? The rating scale you use makes a big difference. http://bit.ly/8tYj4U
