Monday, February 11, 2008

The Random Sample - No Longer Random or Accurate

For years in research we have relied on the "Random Sample." It is the core of statistically based research that relies on a sample instead of a census. We all know the theory: in a world of limited options, we only have to talk to a sample of the population to get a confident estimate of its behavior.

When the telephone came along, sampling became easier. Suddenly research became pretty simple: get a phone bank together, get a list of the telephone numbers in the community, put together a random formula to grab the names, and call away. At first it was easy. Everyone had a phone, very few had private or unlisted numbers, answering machines were rather complicated newfangled equipment, and caller ID wasn't available.

Now we are in a world where the land line phone is a relic. Over the last 20 years the database has been torn apart by more and more unlisted numbers, do-not-call lists, excessive telemarketing, caller ID, answering machines, and the biggest issue of all, THE CELL PHONE. So where do we go from here in research?

Perhaps the first realization we need to make is that a simple random sample is really no longer available. In most of our research we stratify the sample in some manner, perhaps to make sure we hit our sample goals. At Arbitron, the sample goals involve ethnic groups, population distribution across the market, and demographic groups. In our radio product research we also limit ourselves to target audiences and P1 preferences based on the goals of the project.

So why are we still clinging to the idea that we need a random sample for research to be accurate? What we are really doing in nearly all research today is building panels: groups of respondents that fit some picture of the world we are looking at, realizing we can't see the whole picture on any subject in today's world.

I bring all of this up because we need to start looking at a different model for our panels. It will have to start with Arbitron, which is still clinging to the old land line phone database.

Arbitron now hopes to get a somewhat wider view than the shrinking land line database offers by going back to home-address-based databases in combination with the land line database. By comparing the two, they can find out who the cell-phone-only households are and try to get them into the panel by going around the phone.
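To make the comparison concrete, here is a minimal sketch of the idea in Python. It is not Arbitron's actual procedure. The record layout, the `normalize` helper, and the sample addresses are all made up for illustration. Note that an address with no land line match is only a candidate: it could be a cell-phone-only household, but it could also have an unlisted number or no phone at all.

```python
def normalize(address):
    """Crude address normalizer (hypothetical): lowercase and collapse whitespace."""
    return " ".join(address.lower().split())


def find_cell_only_candidates(address_frame, landline_frame):
    """Return address records that have no match in the land line list.

    These are candidates only. A missing match may also mean an
    unlisted number or a household with no phone at all.
    """
    landline_addresses = {normalize(rec["address"]) for rec in landline_frame}
    return [
        rec for rec in address_frame
        if normalize(rec["address"]) not in landline_addresses
    ]


# Made-up example data.
address_frame = [
    {"address": "12 Oak St"},
    {"address": "45 Elm  Ave"},
    {"address": "9 Pine Rd"},
]
landline_frame = [
    {"address": "12 oak st", "phone": "555-0101"},
    {"address": "45 Elm Ave", "phone": "555-0102"},
]

candidates = find_cell_only_candidates(address_frame, landline_frame)
```

In this toy data only "9 Pine Rd" has no land line match, so it is the one household that would be approached by mail instead of by phone.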

The problem here is that, while we all probably have a mailbox, it suffers from many of the same problems we have in using the telephone. We screen the mail closely and quickly sort out unimportant items, and email competes with the mailbox for a lot of communication needs.

We're going to have to reach out and build a complete database, and it's going to take more than an outdated phone and address list. We need to communicate with the audience and the sample on the internet. It's their new community, and we need to explore sampling tools there quickly. It's a fast-moving world, and unless Arbitron starts to experiment here and moves a lot quicker, we will continue to live in the past with the audience estimates we sell with.

There are many experiments to tackle, including building panels using email, text messages, social networks, and reward-based incentives. Yet the radio research community seems to look at the web options with an evil, doubting eye. Look at Nielsen's web site research, where they solicit the panel purely online. You apply to join and, if chosen, are rewarded for putting a collector on your browser that relays your activity back to Nielsen for their data. In their TV research they gather the panel by door-to-door solicitation, by phone, and via email. And once you are in the panel you stay for years, much like PPM.

How about building a big panel that we pick from for the diary system, which will still be in place in 150 to 200 markets for at least the next 5-10 years? If we had a nicely proportioned panel of 10,000 in many of the markets where we have 1,000 diaries in a report, we could draw from that panel and suddenly it would be easy to find enough males or Hispanics or 25-34s. We would have a group that knows how to complete the survey, is better compensated (since we don't have to go to all the expense of random dialing through the whole database), and is reliable to respond. That is a much better answer than striving for a Random Sample in a world where no sample is random any more.
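Here is a rough sketch of what drawing a diary sample from a standing panel could look like. Everything in it is assumed for illustration: the panel records, the sex and age cells, and the quota numbers are invented, and a real market would set its targets from population estimates.

```python
import random
from collections import defaultdict


def draw_quota_sample(panel, quotas, key, seed=None):
    """Randomly draw a fixed number of panelists from each demographic cell.

    panel  -- list of panelist records (dicts)
    quotas -- {cell: number of diaries wanted from that cell}
    key    -- function mapping a panelist record to its cell
    """
    rng = random.Random(seed)
    by_cell = defaultdict(list)
    for member in panel:
        by_cell[key(member)].append(member)

    sample = []
    for cell, target in quotas.items():
        candidates = by_cell.get(cell, [])
        if len(candidates) < target:
            raise ValueError(f"panel has too few members in cell {cell!r}")
        # random.sample draws without replacement, so nobody is picked twice.
        sample.extend(rng.sample(candidates, target))
    return sample


# Made-up panel of 10,000 spread evenly over six sex/age cells.
AGE_GROUPS = ["18-24", "25-34", "35-44"]
panel = [
    {"id": i, "sex": "M" if i % 2 else "F", "age": AGE_GROUPS[i % 3]}
    for i in range(10_000)
]

# Hypothetical diary targets that add up to 1,000.
quotas = {
    ("M", "18-24"): 150, ("M", "25-34"): 200, ("M", "35-44"): 150,
    ("F", "18-24"): 150, ("F", "25-34"): 200, ("F", "35-44"): 150,
}

sample = draw_quota_sample(
    panel, quotas, key=lambda m: (m["sex"], m["age"]), seed=2008
)
```

The draw inside each cell is still random. What changes is the frame: instead of dialing through the whole phone database to find 200 men aged 25-34, we pick them from panelists we already know are in that group.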
