How 'Average' Price Misleads

By Dave Fratello January 25th, 2013.

We go off on little data-nerd rants from time to time here at MBC, and today will be one of those days.

We recently reviewed a print publication that offered 2012 real estate statistics for Manhattan Beach. (No names necessary; we have a substantive point to make, not a need for finger-pointing.)

This print pub offered the number of sales in various South Bay markets for 2012 and the "average price" of sales in each city.

Average price. You may instinctively know that it's irrelevant. But how irrelevant, and why?

First, why it's irrelevant. We have incredibly diverse housing stock in Manhattan Beach. If you include townhomes, then you've got $450K subunits of condo buildings along MBB, you have $800K old cottages along Marine, you've got all those postwar cottages spread around East MB and west of Sepulveda, but you've also got newer development, Hill Section estates, beach-close view homes, and something called The Strand which hosts homes that fetch $10M or more at times.

So "average" price is going to be highly dependent on how many little box condos or Strand homes or Hill Section mega manses happen to sell. Average, after all, takes the total dollar value of sales in the city and divides by the number of sales. Put a few extra of one kind of sale in the mix – a quirk from year to year – and you skew the data high or low.
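To see how touchy an average is, here's a minimal Python sketch. The prices are made up for illustration (they are not MLS data), and `average_price` is just a name we picked.

```python
def average_price(prices):
    """Total dollar value of sales divided by the number of sales."""
    return sum(prices) / len(prices)


# Hypothetical year: a mix of condos, cottages and newer homes (not real MLS data).
typical_year = [450_000, 800_000, 1_200_000, 1_500_000, 1_800_000, 2_250_000]

# Same market, but two Strand-type sales happen to close that year.
strand_heavy_year = typical_year + [10_000_000, 10_000_000]

avg_typical = average_price(typical_year)
avg_strand_heavy = average_price(strand_heavy_year)

print(f"Typical mix:      ${avg_typical:,.0f}")
print(f"Strand-heavy mix: ${avg_strand_heavy:,.0f}")
```

Nothing about the first six homes changed, yet in this made-up example the average more than doubles.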

Second, there's an MLS problem. You're going to calculate averages from MLS data. This is a pretty solid database, the best we have, and one where the rules are becoming more stringent (good), but human error is still rife.

1600 The Strand: This $10.9M Sale Appears Twice in 2011 Data
Looking at "average" sale prices, maybe the biggest problem is duplicate entries. In 2011, for instance, 1600 The Strand sold once, for $10.9M. But it appears in the MLS database twice. Yikes.

So your "average" sale price for 2011 is going to have 2 x $10.9M instead of 1 x $10.9M as part of the numerator. This single error will add $27,525 to the "average" sale price of all residential properties sold citywide in 2011 (268). It adds $48,017 to the "average" sale price for SFRs west of Sepulveda.
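Here's a rough sketch of how one double-counted sale moves an average. Only the $10.9M price comes from the example above; the other sales are hypothetical, and `duplicate_bump` is a helper name invented for this illustration, so don't expect it to reproduce the exact dollar figures we quoted.

```python
def average_price(prices):
    """Total dollar value of sales divided by the number of sales."""
    return sum(prices) / len(prices)


def duplicate_bump(prices, duplicated_price):
    """Dollars added to the average when one sale is counted twice.

    `prices` is the clean list, which already includes the sale once.
    """
    clean_avg = average_price(prices)
    inflated_avg = average_price(prices + [duplicated_price])
    return inflated_avg - clean_avg


# Hypothetical clean list: nine $1.9M sales plus one $10.9M sale (not real MLS data).
clean_sales = [1_900_000] * 9 + [10_900_000]

bump = duplicate_bump(clean_sales, 10_900_000)
print(f"Clean average:  ${average_price(clean_sales):,.0f}")
print(f"Bump from dupe: ${bump:,.0f}")
```

The bump always works out to (duplicated price minus clean average) divided by (clean count plus one). So the fewer sales in the slice, the more a single dupe hurts, which fits the bigger hit to the west-of-Sepulveda SFR figure than to the citywide one.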

The only way to get around the duplicate-entry problem is to hand-craft your data run, not let the computer do it for you. That's how Dave produces median price figures, but we're doubting that anyone grabbing average price data is scrubbing for dupes.
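A script can't replace the hand-scrubbing, but it can at least point at suspects. This is a sketch only: the record layout (`address`, `price`) and the placeholder addresses are assumptions for illustration, not the MLS export format, and matching on address plus price is a simple heuristic we're choosing here, not an MLS rule.

```python
from collections import Counter


def flag_possible_duplicates(sales):
    """Return (address, price) pairs that appear more than once.

    Addresses are lowercased and trimmed so small typing differences
    don't hide a repeat. Flagged pairs still need a human look.
    """
    keys = [(s["address"].strip().lower(), s["price"]) for s in sales]
    counts = Counter(keys)
    return [key for key, n in counts.items() if n > 1]


# Hypothetical records; only the 1600 The Strand price comes from the post.
sales = [
    {"address": "1600 The Strand", "price": 10_900_000},
    {"address": "123 Example St", "price": 1_500_000},
    {"address": "1600 THE STRAND ", "price": 10_900_000},
    {"address": "456 Sample Ave", "price": 800_000},
]

suspects = flag_possible_duplicates(sales)
print(suspects)
```

A flagged pair is only a candidate, since a legitimate resale at the same price is possible, which is why the final call stays with a person.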

So let's say an error of 3%/yr. is somehow acceptable, and maybe you will calmly assume that every year has about the same kind of data problem, so it all "averages" out. (What is this, statistics, or horseshoes, where "close" is fine?)

Then we need to look at the last big problem(s).

Average price seems to drastically overstate prices (for Manhattan Beach anyway), while understating shifts in pricing over time.

That's a pretty astonishing double-whammy of a problem.

We'll compare average to median prices. Median price is an imperfect indicator, but it's the least imperfect. (We've said before that medians are "the worst indicator, except for all the others," with a wink and nod to Churchill.)

Median price is what's relied on most widely for establishing housing market trends. It's just a tiny bit trickier to calculate than an average, which is 3rd grade math. Maybe this is why people would publish average instead of median prices – less work.

But you get what you work for.
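For the curious, the "tiny bit trickier" part looks roughly like this in Python. The prices are hypothetical and `median_price` is our own name for the helper; the standard library's `statistics.median` does the same job and is used here only as a cross-check.

```python
import statistics


def median_price(prices):
    """Middle value of the sorted prices (mean of the two middle values if even)."""
    ordered = sorted(prices)
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hypothetical sales, including one Strand-sized outlier (not real MLS data).
sample = [1_800_000, 800_000, 10_000_000, 1_200_000, 1_500_000]

median = median_price(sample)
average = sum(sample) / len(sample)

print(f"Median:  ${median:,.0f}")
print(f"Average: ${average:,.0f}")
print("Matches statistics.median:", median == statistics.median(sample))
```

The one big sale drags the average far above what most of these homes sold for, while the median stays put at the middle sale.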

Let's pull out just a few simple examples. For this report we're looking exclusively at full-year data for SFRs west of Sepulveda.

For 2007, you see a median sale price of $1.952M and an average price of $2.190M.

Whoops, that's a 12% difference.

It's worse for 2012: Median price was $1.639M, average price $1.991M, a whopping 21% difference.

So that print publication accurately reported the average price at near $2M, but it's 21% above the more commonly used figure.
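If you want to check the gap yourself, it's the average's premium over the median, expressed as a share of the median. A quick sketch using the 2012 figures quoted above (`premium_over_median` is a name made up for this example):

```python
def premium_over_median(average, median):
    """Percent by which the average exceeds the median."""
    return (average - median) / median * 100


# 2012, SFRs west of Sepulveda, as quoted in this post.
gap_2012 = premium_over_median(average=1_991_000, median=1_639_000)

print(f"2012: average is {gap_2012:.0f}% above the median")
```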

Now let's look at a price trend.

Everyone seems to know that the period from 2007-2009 saw the biggest price drops across Manhattan Beach and among individual properties.

Median price tells that story somewhat. Yes, even here in MB west of Sepulveda, the median price for SFRs dropped 18% between year-end 2007 and year-end 2009.

But using average price data, you see a drop of only 12%.

When you compare these composite figures to individual properties, the median price drop is a much better reflection. We saw plenty of real-world cases of bubble-era purchases resold into the depressed market of 2008-2009 at 15-20% discounts.

So average price understates the market's decline during that period by about one-third: it shows a 12% drop where the median shows 18%.

That's not acceptably close for horseshoes, but maybe for hand grenades.

Takeaway #1: Don't place any faith in the relevance of "average" price data. Feel free to look away when someone prints the number.

Takeaway #2: Your blog author, Dave, is a little persnickety, the kind of guy who tries to convince his kids that "nerd" is not necessarily a bad word.

Takeaway #3: After trashing "average" pricing, MBC had best produce that final run of median price data for west of Sepulveda that Dave promised. (Yes, coming Monday!)
In case you missed our median price post for year-end 2012 – all of MB – here's a link to the data: "2012 Wrap: More Sales, Tiny Median Bump."

