Entries in Restaurant Ratings (25)

Wednesday
Apr 23, 2008

Restaurant Girl's Ratings, WTF?

Since she debuted as the New York Daily News restaurant critic last year, Danyelle Freeman (a/k/a “Restaurant Girl”) has taken plenty of flak. In a recent interview, Robert Sietsema, the Village Voice’s veteran critic, ripped into her:

I think she was thrust into a very important position without having a lot of experience and perhaps chosen for extraneous reasons. Her writing has been improving, but still she seems to take an a priori, frivolous attitude towards the material. And the fact that she did choose to be recognized is, to me, like, really horrible… I presume that part of her being non-anonymous is that she goes into a restaurant under her own name, flashes her cleavage, and they just bring her free food.

You could fill a book with her tortured prose, like this howler in her review of Dovetail: “The rosy fish, grilled à la plancha, is exhilarated by a creamy horseradish gribiche (egg and mustard sauce) and bursts of caviar.” The fish was exhilarated? Actually, I thought the poor fish would rather still be swimming.

Fully alive to the problem, Freeman has the fix: she’s changed her rating system to a best-of-five stars, replacing the former best-of-four. I believe it happened just this week, with three-of-five for Elettaria. She’s also gone back and revised her old reviews retroactively—only the stars; not the grammar. That Dovetail review, formerly 3-of-4, is now 4-of-5.

Her old scale allowed half-stars, but it seems the new one does not. Merkato 55 is a winner, rounded up to 3-of-5 from 2½-of-4. South Gate is a loser, rounded down to 1-of-5 from 1½-of-4. Most perplexing is Adour, which she didn’t seem to like, but which gets the benefit of rounding to 3-of-5 from 2½-of-4.
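As a rough check on those conversions, here is a minimal sketch that rescales each old rating proportionally and rounds to the nearest whole star. Proportional rescaling is purely an assumption for illustration; the Daily News has not said how the conversion was done, and the function name `rescale` is hypothetical.

```python
import math

def rescale(old_stars, old_max=4, new_max=5):
    """Scale a rating proportionally, then round half up to a whole star."""
    scaled = old_stars * new_max / old_max
    return math.floor(scaled + 0.5)

# Old ratings (out of 4) and published new ratings (out of 5),
# taken from the examples mentioned in this post.
published = {
    "Dovetail": (3.0, 4),
    "Merkato 55": (2.5, 3),
    "South Gate": (1.5, 1),
    "Adour": (2.5, 3),
}

for name, (old, new) in published.items():
    print(f"{name}: {old} of 4 -> predicted {rescale(old)} of 5, published {new} of 5")
```

Under that assumption, three of the four examples match. South Gate comes out at 2-of-5, one star above what it actually received.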

We weren’t expecting to revise our star-system roundup quite this quickly. But revise it we will.

Update: RG explained to Eater.com: “The New York Daily News has newly implemented a five star rating system for all critical reviews (theater, movies, restaurants,) thus eliminating half stars…I have adjusted my system accordingly as well as readjusted all formerly filed reviews to the new system in order to maintain consistency.”

Monday
Apr 14, 2008

Stars: Here and Elsewhere

Many media outlets, including this blog, use stars to rate restaurants. I’ve posted explanations of the star system before, but recently it struck me how different the various “star systems” are, despite their surface similarity.

This can cause quite a bit of confusion. In the Michelin Guide, one star is a significant honor, but in The New York Daily News one star means “disappointing”. The New York Times system is somewhere in between: one star means “good”. Three stars or more always signifies an extremely good restaurant, except in Time Out New York, where the scale goes up to six, and therefore three is just mediocre.

I thought it was time that all of these systems were summarized in one place. First, here’s a summary, with the systems listed roughly in declining order of pedigree. Beneath the name of each source, I provide a link to a page, or pages, where it’s explained in more detail:

Michelin Guide
[see here]
Range: 1–3
***     Excellent cuisine and worth the journey
**      Excellent cooking and worth a detour
*       A very good restaurant in its category

New York Times
[see here, here]
Range: 0–4
****    Extraordinary
***     Excellent
**      Very Good
*       Good
Zero    Satisfactory, Fair or Poor

New York Magazine
[see here]
Range: 0–5
*****   Ethereal; almost perfect
****    Exceptional; consistently elite
***     Generally excellent
**      Very good
*       Good
Zero    (No explanation)

New York Daily News
[see here, here]
Range: 1–5 (formerly 1–4)
(No explanation of the new scale. The former 1–4 scale was:)
****    Sheer perfection
***½    Truly exceptional
***     Outstanding
**½     Great night out
**      A safe bet
*½      Hit or miss
*       Disappointing

Time Out New York
[see here]
Range: 1–6
(No explanation)

Bloomberg
[see here]
Range: 0–4
****    Incomparable food, service, ambience
***     First-class of its kind
**      Good, reliable
*       Fair
Zero    Poor

I listed the Michelin system first, for though it is relatively new to New York, it has been employed in Europe since the 1930s. To be a “Michelin-starred” restaurant is an internationally recognized honor, and it’s a distinction many European tourists rely on. However, it’s the system that New Yorkers pay the least attention to.

The newspaper and magazine systems all operate on the same basic idea, though the meaning of a rating varies widely, depending on how high they go (four, five, or six), how low they go (one or zero), and whether they accommodate half-stars (only the Daily News did, and it dropped them in April 2008).

The New York Times is the dean of the star-bestowing media. It has been handing out stars since 1963. At the other extreme is Bloomberg, which seems to have inaugurated its star system just a few weeks ago with a three-star review of Jean Georges.

But most of the others are pretty new, too. New York Magazine’s star system debuted in 2006. The Daily News resumed restaurant reviews in the summer of 2007, after a lengthy period when the paper had none, and moved to its current five-star scale in April 2008. Time Out New York’s peculiar six-star system bowed in mid-2006.

New York Times

Although the Times system is 45 years old, it has changed often. Leonard Kim, the eGullet Society’s star system historian, has charted its many tweaks. The original system was just three stars, with the fourth added in 1964. In 1971, critic Raymond Sokolov added a separate rating of one-to-four “triangles” for service, atmosphere and décor. The triangles were dropped in 1973.

There’s more to the Times stars than the cryptic explanation printed in the newspaper every week. Thanks again to Leonard Kim, we know that Mimi Sheraton’s typical rating was one star (almost half of her reviews), with about 20–25% two stars, and another 20–25% zero stars. Three and four stars were given out quite infrequently. Kim thinks that “this is the most sensible system.”

Under Sheraton’s successor, Bryan Miller, the percentage of zero-star reviews dropped to around 10–15%, with one or two stars being handed out with about equal frequency (35–40%). Under the next critic, Ruth Reichl, two-star reviews were given out over half the time. She also gave out three stars a fairly generous 15% of the time, while zero-star reviews became a great rarity (4–5%).

William Grimes brought the stars back under control, reducing the frequency of three-star awards and increasing the frequency of one star. However, he continued Reichl’s practice of giving out zero stars only rarely. The current critic, Frank Bruni, awards stars at frequencies more or less comparable to Grimes’s, but he comes down hard on luxury restaurants while being quite generous with two-star ratings for extremely casual places.

Because the zero-star rating is so seldom used, some of the other ratings have lost their nominal meanings. For decades, one star has supposedly meant “good,” but Frank Bruni’s one-star reviews seldom sound good. It’s the rare restaurant nowadays that would be pleased to receive just one star. The Times’s zero-star reviews carry an additional label: Satisfactory, Fair, or Poor. (No other paper does this.) Bruni has given “Poor” only twice, he has never given “Fair,” and his “Satisfactory” reviews never sound very “satisfied.”

Some papers don’t reprint the definitions every week, but the Times does, along with this blurb: “Ratings range from zero to four stars and reflect the reviewer’s reaction to food, ambience and service, with price taken into consideration.”

That blurb has changed over the years. In 1974, it said, “The restaurants reviewed here each Friday are rated four stars to none, based on the author’s reaction to cuisine, atmosphere and price in relation to comparable establishments.” The words “comparable establishments” were dropped in 1984, though you could argue that the system in spirit still operates that way.

Bruni has explained what the stars mean to him:

There are no assigned percentages for food versus service versus ambience. The star ratings take into consideration all of those elements, giving primary importance to food, to come to a conclusion about how excited I would be to return to the restaurant. The number of stars chart ever greater degrees of excitement.

This cannot be the full explanation, because you’ll read two-star reviews in which he sounds extremely excited (e.g., Franny’s), and you’ll read other two-star reviews that sound like he hates the place (e.g., Gordon Ramsay). That’s because there is an unwritten “rule of expectation.” An expensive luxury restaurant expects to get at least three stars, and his review is written in light of that. In his Ramsay review, it sounded like he was “taking away” the implied third star. A two-star rating is supposed to mean “very good,” but the tenor of the review was “not good at all.”

There seems to be an unwritten rule that a three-star restaurant needs to have most, if not all, the trappings of “traditional luxury.” Frank Bruni has relaxed that requirement on occasion, but you haven’t seen a three-star pizzeria yet. There also seems to be an unwritten rule that there can be no more than half-a-dozen four-star restaurants at any given time. Four stars means “extraordinary,” and by its nature, it can’t often be given out.

None of the New York media updates obsolete ratings systematically, but the Times at least tries. In the first three months of 2008, four of Bruni’s reviews were updates, including two of restaurants that Bruni himself had previously reviewed. New York Magazine’s Adam Platt, as far as I know, has never updated any of his own ratings in several years on the job. However, with only 52 reviewing slots per year, most of them covering just a single restaurant, the Times often goes many years between updates, even when there have been significant intervening events.

For instance, the Times’s blurb for three-star Oceana says, “The good ship Oceana — a two-story town house decorated to resemble a yacht — has found a new surge of energy.” That “surge” dates from William Grimes’s 2003 review. The chef who supplied that surge of energy, Cornelius Gallagher, left Oceana in 2006, but the obsolete review remains in place. Major shakeups at three-star restaurants are infrequent enough that you’d think the Times could take note of them, but apparently Mr. Bruni doesn’t think so.

Two other changes over the years have limited the paper’s ability to keep its ratings updated. Up to the end of Bryan Miller’s tenure, the critic normally reviewed two restaurants per week. When Ruth Reichl arrived in 1993, she switched to one longer review of a single restaurant per week, dramatically reducing the paper’s bandwidth to update previously given ratings. Frank Bruni has revived the custom of the double review, but he uses it only a handful of times per year.

The other change is in the way that casual, inexpensive restaurants are handled. Historically, the Times restaurant critic’s beat had a clear emphasis on “fine dining.” That became more explicit in 1992, when Eric Asimov started the “$25 and Under” column. Asimov used to cover serious, though inexpensive, restaurants. His successor, Peter Meehan, has been marginalized. His column now appears only every other week, and it is usually relegated to extremely humble eateries in the outer boroughs. As a result of this change, Frank Bruni now covers everything from delis to high-end French luxury palaces.

The Times does not give star ratings to its “$25-and-under” restaurants. New York Magazine has a separate star system for casual places (signified by one-to-five “hollow” stars). No other media outlet has a separate rating system, or indeed a separate critic, for casual dining.

Other Media

Since all of the other star systems are fairly new, there is nothing like the kind of historical perspective we have for the Times.

New York Magazine

The New York Magazine star system debuted with the January 1, 2006, issue. Adam Platt retroactively put star ratings on 101 restaurants, including many that he’d never reviewed himself. He surely couldn’t have paid the minimum of three contemporaneous visits that the Times requires of its critics.

All reviews since then have been rated on Platt’s zero-to-five system. The only explanation given for a five-point scale, rather than the traditional four, is that, “We chose to use five stars, instead of three or four, because the more levels of discrimination, or so the thinking goes, the more useful the list.” But despite having an extra step on his ladder, Platt has actually given three stars less frequently than Bruni. And except in his 101-restaurant retroactive list, he has yet to give out five stars, while he has given four only once.

The Others

The New York Daily News re-instated its restaurant reviewing column in August 2007, with Danyelle Freeman (a/k/a “Restaurant Girl”) handing out the stars. Initially, she used a one-to-four system, with half-stars allowed. In April 2008, she abruptly switched to a five-star system without an explanation, retroactively re-rating all of her previous reviews. Her system doesn’t go down to zero, so a one-star rating from Freeman is like a zero-star rating from Bruni or Platt.

Time Out New York instituted a six-star scale, which has never been explained, but its critics are fairly promiscuous with three- and four-star ratings, making the TONY ratings entirely meaningless in relation to everyone else’s.

Bloomberg seems to have inaugurated a new zero-to-four star system with an April 2008 three-star review of Jean Georges, but it is too new for us to draw any conclusions.

This Blog

The rating system on this blog is similar to that employed by The Times. I award zero to four stars, with one star intended to signify a good restaurant, not “fair” (Bloomberg) or “disappointing” (NYDN). However, I endorse Adam Platt’s comment that “one star for a restaurant with elite aspirations is really not much better than no star at all.” Unlike the Times, I use half-stars, and I give separate ratings for service and ambiance, in addition to an overall rating.

Thursday
Jan 25, 2007

Stars — Here and Elsewhere

Note: Click here for an updated look at the star system.

After my post on Momofuku Ssäm Bar, a commenter wondered how my enthusiasm for the restaurant could be reconciled with my 1½-star rating on a four-star scale. “Overall, this review makes little to no sense,” she wrote.

There’s a separate page where I explain my Rating System. I’m using the same four-star scale the New York Times employs. Many other media outlets follow the identical system, or something close to it. While the system is far from perfect, it does allow comparisons (for those who care) between my ratings and other people’s.

If all you know is that the top rating is four stars, you might think that 1½ stars is pretty bad. It’s an understandable reaction, but nevertheless incorrect. One star means “good,” and two stars means “very good.” There is nothing inconsistent about writing an enthusiastic review and awarding 1½ stars. After all, I’m saying the restaurant is “better than just ‘good’.”

If you read more of this blog (index here), you’ll find other enthusiastic 1½-star reviews. This is consistent with the mainstream press. The New York Times doesn’t use half-stars, but Frank Bruni, the current critic, has written numerous two-star rave reviews. I am quite sure that if the mainstream critics review Momofuku Ssäm Bar, it will earn either one star or two, with two being more likely.

The New York Times rating system has its critics. When the same system has to accommodate restaurants as different as Momofuku Ssäm Bar and Per Se, perhaps the rating by itself doesn’t mean very much. You could argue that if Momofuku is the best damned ssäm bar in town, it ought to get four stars. But that’s not the system we have, and you just have to get used to it. I could set up my own system, but I’m sure I’d make different mistakes, and then my ratings would have no connection to anyone else’s.

Star ratings are not great carriers of information. But if you are accustomed to the system, the ratings at the bottom of my reviews do facilitate comparisons with what other media outlets have done.

Saturday
Jan 7, 2006

What the Stars Mean

I’ve employed a variation on the system found in The New York Times and many other newspapers:

**** Extraordinary
*** Excellent
** Very Good
* Good
(zero) Satisfactory, Fair, or Poor

Like some newspapers (but not the Times), I award half-stars to further discriminate between rating categories. Similar to Zagat (but not most newspapers), I consider the food, service, and ambiance separately, in addition to awarding an overall rating.

I attach greater significance to the food rating than to service or ambiance. If service and/or ambiance are only a bit better/worse than the food rating, then the overall rating will simply be the same as the food rating. However, if I feel that service/ambiance make a significant difference, I adjust the overall rating accordingly.

Here’s a bit more on what the stars mean to me:

One star: Good in its category; worth a look in its neighborhood, but not worth a special trip.

Two stars: One of the city’s better restaurants in its category. More than just “good for the neighborhood.” A “minor destination,” though possibly with some significant limitations. Worth going at least somewhat out of your way.

Three stars: The city’s best, or very close to the best, of its kind. A special experience. A destination in every respect, without any serious limitations. Nationally, or perhaps even internationally recognized (or deserves to be).

Four stars: A transcendent experience, one of the world’s best. Worth a trip to New York in its own right.

For service and ambiance, I award stars based on my views of what is generally expected for a restaurant in its category. Service, I think, is self-explanatory. Ambiance refers to décor and related issues, such as the noise level, spacing of tables, and so forth.

One and two stars are not bad ratings. They literally mean “good” and “very good” respectively.

Like the Times, I take price into account, but I am not as price sensitive as Frank Bruni. If something is “very good,” it doesn’t suddenly become “bad” because I think the restaurant is over-charging for it. I usually mention prices—at least for what I ordered—and you can decide for yourself if you think it’s worth it. In borderline cases, I may award a slightly higher rating for a great bargain, or a slightly lower one for egregiously over-priced fare.

Unlike the Times, I don’t limit the star system to “$25-and-over” restaurants. The Times isn’t entirely consistent about this anyway. And I will sometimes rate restaurants that the professional critics didn’t bother to review.

Note: For more on the stars, see this post.

Thursday
May 20, 2004

The Trouble With Zagat

The Zagat Guide is a wonderful restaurant directory. It allows you to search on a wide variety of criteria (neighborhood, cuisine, etc.), and it provides just about all of the basic information you need (address, phone number, hours, map, price range). The comments provided, although brief, are often witty and scathingly accurate.

But the one area where Zagat falls down is the statistic most often quoted, and for which Zagat is best known: the numeric ratings of each restaurant. Zagat separately rates Food, Décor and Service on a 1-to-30 scale. If properly used, this scale would provide sufficient amplitude to distinguish the neighborhood taco stand from Alain Ducasse and Per Se. In practice, it does nothing of the kind. This is ironic, given that restaurants love to post their so-called “Zagat rating,” and some will say that they’re “Zagat rated.” What is this so-called “rating”?

For starters, Zagat is a raw popularity contest, with very little guidance given to the voters. Someone who thinks Olive Garden is a pretty good restaurant is going to rate all of Little Italy off-the-charts, while an experienced high-end diner will pooh-pooh anyplace that lacks a chef’s tasting menu. The upshot is you have a hot dog stand like Gray’s Papaya carrying a Zagat food rating of 20 out of 30, which (according to Zagat’s own definitions) is supposed to mean “very good to excellent,” when the highest rating in New York is just 28.

Zagat’s own voting mechanism is largely at fault. Individual voters are allowed to vote on a 0-to-3 scale. Zagat says that “1” is supposed to mean “good,” but psychologically a “1” vote feels like “below average.” People will realize that “3” must be pretty damned good, so there’s a tendency for almost everything to get rated “2”. For the final rating, Zagat multiplies the average by 10 and rounds off, resulting in the familiar 1-to-30 scale.
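The arithmetic described above can be sketched in a few lines. This is only an illustration of the average, multiply-by-10 and round step as described here; the function name `zagat_score` and the sample ballots are made up for the example and are not Zagat data.

```python
import math

def zagat_score(votes):
    """Average a list of 0-to-3 votes, multiply by 10, round to a whole number."""
    if not votes:
        raise ValueError("need at least one vote")
    if any(v not in (0, 1, 2, 3) for v in votes):
        raise ValueError("votes must be 0, 1, 2 or 3")
    average = sum(votes) / len(votes)
    return math.floor(average * 10 + 0.5)  # round half up

# Hypothetical ballots: most voters settle on "2" for both places.
taco_stand = [2, 2, 2, 1, 2, 2, 3, 2, 2, 2]
fine_dining = [2, 3, 2, 3, 2, 3, 3, 2, 3, 2]

print(zagat_score(taco_stand))
print(zagat_score(fine_dining))
```

With these made-up ballots the two places score 20 and 25. Even when half the voters give the second place the top mark, it lands only five points above the first on a nominal 30-point scale.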

A look at the details shows that this is a serious problem. Of the 1,454 restaurants in Zagat’s 2003 New York guide, 74% of them carry a food rating between 18 and 23. What’s more, 97% of them carry a food rating of 16 or higher, and none carry a food rating worse than 9. The upshot is that what’s claimed to be a 1–30 scale is, for all practical purposes, a 16–28 scale. You can safely say that any restaurant with a Zagat rating of 25 or higher is very good. But ratings below 25, which is almost all of them, are in an undifferentiated scrum, and aren’t statistically significant.

Oddly, voters are considerably more discriminating in their Décor ratings: just 62% of New York restaurants have a Décor rating 16 or higher, and just 36% are clustered in the 18–23 range. When it comes to Service, Zagat voters rate about 80% of restaurants 16 or higher, and 54% are in the 18–23 range. So the Zagat Service ratings are nearly, but not quite, as useless as the Food ratings, while the Décor ratings actually do seem to mean something.

The pernicious tendency of the ratings to cluster around 20 is shown in the following graph:

I am not sure why voters are least discriminating about the one thing that should matter most at a restaurant, the food, but perhaps it’s because the qualities that make food great are awfully difficult to describe. Yet, everyone knows an ugly room when they see it.

I think Zagat would be considerably more reliable if they collected votes on the same scale they report, from 0 to 30. Voters would then tend to rate an average restaurant “15”, instead of “2”. The higher Zagat ratings would be harder to get, and the scale overall would be a lot more meaningful.

[Update: After I posted this, a colleague on eGullet observed that the Zagat food ratings are almost a proper bell curve, if you consider “average” to be 20 rather than 15. The problem is that the standard deviation is only about 2, which means that the scale simply fails to offer a meaningful spread between the best and the worst.]
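As a quick sanity check on that observation, here is a small sketch that treats the food ratings as a continuous normal curve with a mean of 20 and a standard deviation of 2. Those two figures come from the update above; treating whole-number ratings as a smooth curve is a simplifying assumption on my part.

```python
from statistics import NormalDist

# Assumption: food ratings follow a normal curve, mean 20, standard deviation 2.
ratings = NormalDist(mu=20, sigma=2)

share_18_to_23 = ratings.cdf(23) - ratings.cdf(18)
share_16_or_higher = 1 - ratings.cdf(16)

print(f"between 18 and 23: {share_18_to_23:.1%}")
print(f"16 or higher: {share_16_or_higher:.1%}")
```

The model puts about 77% of restaurants between 18 and 23 and about 98% at 16 or higher, in the same neighborhood as the 74% and 97% counted from the 2003 guide.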

Of course, there’s no chance of Tim and Nina actually changing anything, so the Zagat ratings will continue to be the least useful part of what is otherwise a very useful service.
