Next Glass: A Step In Solving Wine Rating’s Genetic Issues

NOTE: This article uses “wine” to avoid having to type “beer or wine” everywhere. But, unless noted otherwise, a reference to wine refers to both beer and wine.

The Next Glass wine and beer selection app is a step toward more accurate wine recommendations, but faces several substantial hurdles on the path to accuracy, scalability and profitability.

Two Steps Toward Accuracy

Next Glass’s two biggest advances toward accuracy are:

  1. It partly addresses the genetics puzzle (Inherited Taste Chaos Sabotages Recommendations), and
  2. It uses a simple, even-numbered rating scale that eliminates wishy-washy opinions that are of no real use for recommendations (Rating The Rating Systems).


Working The Genetics Puzzle

As examined in our genetics backgrounder, inherited variations mean that even sommeliers and other experts frequently describe different sensations and qualities from the same wine.

This hinders both consistent evaluations and accurate recommendations. (See: An Examination of Judge Reliability at a major US Wine Competition and “Wine-tasting: it’s junk science”.)

Those genetic variables further impair accuracy when compounded by other factors such as education, experience and peer pressure.

Substituting Mr. G.C. MassSpec For Humans

Substituting a machine and computer addresses part of the inaccuracy issue.

Instead of a human expert, Next Glass uses the laboratory techniques of gas chromatography and mass spectrometry (GC/MS) to determine some of the chemical compounds in each particular wine (see The Science of Satisfaction on their web site).

[Image omitted. Source: Next Glass]

In the Next Glass system, the combined GC/MS instruments chart a wine’s chemical profile and, theoretically, provide a data-based, fixed point for describing the wine.

This instrument-based determination at the wine end addresses but does not solve the biggest flaws inherent in human-based profile matching systems.

Human Genetics Is More Complicated Than A Mass Spec

There are literally thousands of compounds in wine. It’s not feasible to include all of those in a GC/MS analysis of a wine. Therefore, Next Glass must pick and choose among the compounds they “think” are most relevant.

Not enough information is available about the number of compounds assessed and the adequacy of the algorithm that selects relevant compounds used in the profile.

Partial Analysis = Partial Accuracy

The choice of a subset of compounds to measure in the mass spec system introduces a “creator bias” in the resulting test protocol.

Because the human nose and tongue can distinguish compounds present in parts per billion and parts per trillion, omitting those trace compounds will result in an inaccurate match between the GC/MS profile and human perception.

Single-Dimensional Analysis: Not An Adequate Model Of Human Perception

The largest genetic-based flaw in the Next Glass procedure is the fact that it relies on a “flat” analysis of chemical compounds. This contrasts with human taste perception which is based on a multi-dimensional process involving:

  • taste buds,
  • olfactory sense receptors,
  • physical “mouth feel” sensations,
  • secondary olfactory sensations from the back of the palate as alcohol and other volatile organic compounds filter up into the nasal cavity from the throat.

This article explains the billions of individual taste and smell variations inherent in genetic differences from person to person.

Next: Calibrate The Consumer

As with most wine apps, Next Glass users begin by using the phone app to capture an image of the label.

Once the product is recognized, the user rates it on a 4-point scale.

Why four points? Why not 100 or some other number? Because those scales carry psychological baggage and many opportunities for misunderstanding that introduce their own biases (For more, see: Rating The Rating Systems).

Also see: Peer pressure brings wine scores toward the middle.

Third: Keep Your Data Clean

UNLIKE most other wine apps, Next Glass does NOT show users the average rating of a wine before they enter their own review. The common practice of showing a wine's average rating before the user rates it severely biases ratings and damages their value for accurate recommendations. In addition, “average” ratings are useless because no wine drinker is average. Each is genetically unique.

This piece from the MIT/Sloan Management Review discusses these sorts of bias inherent in online reviews: The Problem With Online Ratings.

Fourth: Force A Decision

The Next Glass FAQs correctly note that odd-numbered scales encourage people to be wishy-washy and gravitate toward the middle.

In addition, scales with too many points (10 or 100 or in between) are subject to myriad personal biases arising from psychology, interpretation and personal experience.

In this instance, Next Glass has the right idea, but a faulty implementation.

The app itself offers no specifics on how to interpret their four-star system and leaves that up to the individual user.

The FAQs offer one of the many possible interpretations:

“Here is what a lot of our users are adopting as their scale:

  1. “I strongly dislike this beer. It’s one of my least favorites. Can I pour it out and get something else?

  2. “I would never buy this, but I’ll drink it since it’s in front of me. Overall, I don’t enjoy the taste.

  3. “I like this. It’s not an all time favorite, but it’s a good, reliable wine/beer that I would drink again.

  4. “I love this. I want more hours in the day to keep drinking it.”

Setback #1: More Words = Less Accuracy

The Tribes backgrounders Words = Big Trouble and Rating The Rating Systems explain why more words mean greater confusion and lower predictive value.

My work designing, deploying, and debugging two previous online profile matching systems (including a Facebook app) indicates that “action” choices produce better predictive data than qualitative, sensory opinions.

For that reason the Tribes four-point system would offer more accuracy to Next Glass:

  • I would serve this for a special occasion.
  • I would buy this again.
  • I would not buy this again.
  • I would warn friends not to buy this.

This boils down to very short, specific actions as opposed to lengthier, “fuzzier” options:

  • Recommend
  • Buy
  • Not buy
  • Warn
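As a rough illustration, the four action choices above could be stored as an ordered scale. This is only a sketch: the numeric values, the names ActionRating and is_positive, and the idea of splitting the scale into "like" and "dislike" halves are assumptions made for the example, not a description of how Tribes or Next Glass store ratings.

```python
from enum import IntEnum


class ActionRating(IntEnum):
    """Four-point action scale; the numeric values are illustrative assumptions."""
    WARN = 1       # "I would warn friends not to buy this."
    NOT_BUY = 2    # "I would not buy this again."
    BUY = 3        # "I would buy this again."
    RECOMMEND = 4  # "I would serve this for a special occasion."


def is_positive(rating: ActionRating) -> bool:
    """With an even number of points there is no neutral midpoint:
    every rating falls on the 'like' or the 'dislike' side."""
    return rating >= ActionRating.BUY


if __name__ == "__main__":
    for rating in ActionRating:
        side = "like" if is_positive(rating) else "dislike"
        print(f"{rating.name}: {side}")
```

Because the scale has an even number of points, every rating forces a decision one way or the other, which is the property the article argues makes the data useful for recommendations.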

Setback #2: Social Insecurity

In addition to inherent bias, an important 2014 study (Social pressure stops Facebook users recommending products on social media sites) shows that users in general avoid making online product recommendations. And those who do recommend are biased by peer pressure and other psychological factors.

The media are ready to reinforce insecurities and social stress with articles like “Forget the sweet swill, try a well-crafted rosé” or “How to Not Sound Stupid When Ordering Wine.”

The Tribes anonymous social networking preference selection algorithm solves setbacks #1 and #2. It also solves the biggest challenge faced by Next Glass: Scaling.

Scaling: Next Glass’s Biggest Challenge

While Next Glass solves some of the genetic and other flaws inherent in systems that rely on human experts to rate and describe wines, it still faces the same scaling problem:

More than 100,000 new wines are introduced into the American market every year. Fewer than 25,000 of those are ever rated. (For more details, see: Most Wines Have NEVER Been Rated By Critics.)

The Next Glass web site is silent on the number of wines in its database, but this article from USA Today puts the combined number of beers and wines in its database at 23,000.

While that’s a substantial accomplishment for a new app, it still falls very short of covering the 100,000 new wines introduced in the U.S. each year.

In addition, the U.S. Alcohol and Tobacco Tax and Trade Bureau’s (TTB) Label Certification web site says that more than 14,000 new labels for beer and related beverages (ales, porters, etc.; TTB product classes 901-906) had been submitted as of November 30, 2014.

That’s not to denigrate Next Glass’s substantial technical accomplishments: their Science of Satisfaction section describes how they have tweaked their system to enable the analysis of 160 samples per day, which is an impressive number.

But that would mean that it would take one machine 625 days (roughly 21 months) to analyze just one year’s worth of new wines. And that would not include downtime for maintenance.
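The arithmetic behind that estimate can be checked with a few lines of Python. The two input figures come from this article; the 30-day month and the function name days_to_analyze are simplifying assumptions made for the sketch.

```python
import math

NEW_WINES_PER_YEAR = 100_000  # new wines introduced in the U.S. each year (figure cited above)
SAMPLES_PER_DAY = 160         # throughput Next Glass reports for its GC/MS setup


def days_to_analyze(wines: int, samples_per_day: int) -> int:
    """Days one machine needs to work through `wines` samples, with no downtime."""
    return math.ceil(wines / samples_per_day)


days = days_to_analyze(NEW_WINES_PER_YEAR, SAMPLES_PER_DAY)
months = days / 30  # assumption: 30-day months, for a rough figure only

print(f"{days} days, about {months:.1f} months")  # 625 days, about 20.8 months
```

Any allowance for maintenance or reruns only lengthens the estimate, and the count excludes beer entirely.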

Of course, with enough money, most anything can be scaled.

According to Next Glass’s CTO, a reconditioned GC/MS unit like the one they use (Thermo Scientific Orbitrap) usually costs around $280k to $300k. A new one costs around $400k.

More machines also mean more people to operate them and more very costly supplies such as the columns and ultra-pure reagents needed for precise measurements.
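For a rough sense of the equipment side alone, the sketch below estimates how many machines would be needed to analyze a year of new wines within a year, and what they would cost at the prices quoted above. The working_days parameter and the function name machines_needed are assumptions made for illustration. Staff, supplies, sample acquisition, beer, and the existing backlog are all left out.

```python
import math

WINES_PER_YEAR = 100_000                 # new wines per year (figure cited in this article)
SAMPLES_PER_DAY = 160                    # reported throughput of one machine
RECONDITIONED_COST = (280_000, 300_000)  # quoted price range per reconditioned unit, USD
NEW_COST = 400_000                       # quoted price per new unit, USD


def machines_needed(wines: int, samples_per_day: int, working_days: int = 365) -> int:
    """Machines required to analyze `wines` samples within `working_days` days."""
    return math.ceil(wines / (samples_per_day * working_days))


n = machines_needed(WINES_PER_YEAR, SAMPLES_PER_DAY)
low, high = (n * cost for cost in RECONDITIONED_COST)

print(f"{n} machines: ${low:,} to ${high:,} reconditioned, ${n * NEW_COST:,} new")
```

Lowering working_days to allow for maintenance and weekends raises the machine count. For example, machines_needed(100_000, 160, working_days=250) returns 3.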

Scaling also requires a larger support staff for acquiring wines, payments for wine and shipments, organizing and coordinating testing and other expenses.

No Plans To Sample Every Wine

Next Glass’s CTO Forrest Maready told us:

“We have no intentions of sampling every new wine/beer/liquor created every year. We want enough coverage that an average user could find something they like in any given situation (grocer, wine shop, restaurant, etc.).

“Our standard for whether we are sampling enough will be if you walk into an average grocer and using our app can’t find a new wine/beer you like, then we need to increase our sampling percentage.

“Obviously we don’t stop once we hit that threshold, but we also are not bent upon sampling every single thing created. It’s just not feasible (or necessary in our opinion).”

Accurate Crowdsourcing: The Economical Scaling Solution

The primary goal of a “perfect” wine recommendation system is that consumers get a recommendation almost all the time. This includes wine buyers of all motivations, from the expert to the average supermarket shopper looking for a wine they will like without needing a graduate degree in enology.

The rapid proliferation of wines at retail means that experts of all sorts will fail a large and valuable portion of the buyer base whether the expert is a human or a lab instrument.

A recommendation system that relies on only 20% of the available products will remain JAWA: “Just Another Wine App.”

Crowdsourcing is the solution.

But the crowdsourcing implementations used by every major wine app so far fail to provide accurate recommendations. This is because of a series of fatal flaws that afflict them as well as the collaborative filtering methods used by Amazon, Netflix and others.

For a complete discussion, see Fatal Flaws In Current Recommendation Systems.

Enhance Crowdsourcing Without Replacing Existing System

The Tribes system is designed to supplement and enhance (not replace) this and any other existing app or recommendation system. (See: What Problems Does Tribes Solve & How?)

People who like to publicly discuss and rate their wines and other products can continue to do so without changes to an app or site’s core programming.

In the case of Next Glass, the app could offer users the ability to keep their ratings anonymous. Those who want to post ratings publicly could be offered the option of entering a separate and anonymous rating which would be used by the Tribes algorithm for tribal assignment and recommendations.
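One way to picture that option is a rating record that carries both a public and an anonymous value. This is only a sketch under stated assumptions: the RatingEntry fields, the rating_for_matching helper, and the fallback to the public rating when no anonymous one exists are all hypothetical. It does not reproduce the Tribes algorithm or any Next Glass data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RatingEntry:
    """Hypothetical record holding both kinds of rating for one user and one wine."""
    user_id: str
    wine_id: str
    public_rating: Optional[int] = None   # shown with the user's name, if they choose to post
    private_rating: Optional[int] = None  # anonymous; never displayed to other users


def rating_for_matching(entry: RatingEntry) -> Optional[int]:
    """Pick the rating a matching algorithm would consume.

    Assumption: the anonymous rating wins when present, because it is the
    one least affected by peer pressure; otherwise use the public rating.
    """
    if entry.private_rating is not None:
        return entry.private_rating
    return entry.public_rating


if __name__ == "__main__":
    entry = RatingEntry("user-1", "wine-42", public_rating=2, private_rating=4)
    print(rating_for_matching(entry))  # 4
```

The public rating stays untouched for display, so the existing app's review features would not need to change.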

Thus, someone affected by the “Cupcake Syndrome” can share a preference anonymously without fear. That private and “honest” expression of their preference also increases the odds of getting better recommendations from the Tribes algorithm.

Cupcake Syndrome?

One of the most common emails from consumer users of the SavvyTaste Facebook app (discontinued in 2011) addressed vino-insecurity over what I call the “Cupcake Syndrome.”

“I really like the app and its attitude,” wrote one reader. “But I have not used it [the FB app] because my name gets attached to my review. I really don’t want my friends to know that I really like Cupcake Cabernet better than their really expensive French reds.”

That consumer asked if he could post anonymously and still get recommendations. He explained that he was a top executive at a Fortune 500 corporation and had an image to maintain.

Consumer Friction & A Final Thought

An initial test of the Next Glass app revealed the scaling issues.

First of all, most of the initial wines the app asked me to rate were ones I had not tried, and most of the ones I had tried were ones I disliked. That limited the training of my profile. The app also failed to recognize more than half of the wines I scanned from bottles on hand.

Next Glass says that the more wines rated, the better recommendations will be.

The need to take a trip to a supermarket or other retailer in search of more wines to scan is a big consumer friction point that will deter acceptance.

Nevertheless, I went to a Lucky Supermarket in search of the wines that the app recommended for me.

I found none of them. That was more frustrating friction.

It’s a reality that wine inventory changes rapidly with the season, with the arrival of new vintages and the demand to maximize profits per linear foot of shelf space. That latter demand is software driven and can change in an instant.

Next Glass desperately needs a strategic alliance with sites like Wine Searcher or Snooth to provide retail sources to buy recommended wines. Other apps have those sorts of arrangements, but their recommendations have the accuracy of a back-alley crapshoot.

Finally, one picky comment: the interface and user experience also need tweaking, especially so that users can simply snap a photo as in other apps. The active image process Next Glass uses is a bit of a hassle.