March 06, 2008

More on the famous file-sharing paper

About five months ago I posted about the famous file-sharing paper by Oberholzer-Gee and Strumpf. I noted that the paper had been criticized in detail by Stan Liebowitz but that Liebowitz was hampered because he could not obtain the data from O-G and S.

The story has now been picked up by Handelsblatt, a leading German financial publication. For those of you who don't read German--I don't--here are some excerpts of the article in English. (The translation is courtesy of Binghamton Univ. associate professor Florenz Plassmann and is posted with his permission. I thank Drexel Univ. professor Bruce McCullough for forwarding the translation to me.)

Recently, this paper has sparked a heated discussion. The relevance of the debate extends far beyond the paper in question. It questions the reliability of empirical studies in economics, and may ultimately challenge the way in which the crème de la crème of scientific journals deals with scientific evidence.

The key question is: how can a study that is based on secret data that nobody has double-checked be printed without close examination by one of the most prestigious economics journals? This is especially puzzling because the supplier of the data has a special interest in a certain result. The study of the two economists from Harvard and Kansas is based on proprietary data on music downloads, which the authors received from the file sharing services "MixmasterFlame" and "FlameNap."

. . . .

Liebowitz knew of the filesharing study before it was published because it had been circulated as a working paper. In his letter he told Levitt that, despite repeated requests, the authors did not provide him with an opportunity to check their results. Could he please use his influence as editor of the "JPE" to make such checks possible? Levitt declined to tell Handelsblatt whether he followed up on this request.

It appears that he did not. Even one year after publication, the authors still keep their data to themselves. Oberholzer-Gee told Handelsblatt that they had to sign an agreement not to share the data to get them from the file sharing service. The authors argued that they had to "protect their sources" and declined to provide Handelsblatt with either a copy of the agreement or the name of a reference at the file sharing service who could confirm their version.

Liebowitz pressed Levitt, the editor of the "JPE," to at least correct several mistakes and ambiguities before publishing the paper.

For example, the authors write that about half the reductions in music CD sales are the result of the increase in market share of music discount stores with smaller inventories. Liebowitz argues that this cannot possibly be correct. He calculates that, even under extreme assumptions, the reduction in inventories can at most account for one-sixth of the decrease in sales. "It is unbelievable that a top-journal like the "JPE" would publish such claims without any evidence," Liebowitz complains in his letter, and he points Levitt to an entire series of additional errors or ambiguities.

Levitt forwarded Liebowitz’ letter to the authors, who ignored it—their study was published with only minor changes. Since then, file sharing services can refer to an academic paper in one of the top economics journals to defend themselves against the music industry.

In principle, like many other journals, the "JPE" requires that authors publish not only their results but also disclose the data and the methods that they use to derive them. However, this requirement does not apply to Oberholzer-Gee and Strumpf—their paper was accepted before the requirement became binding. "This has nothing to do with science," criticizes Bruce McCullough, professor of decision sciences at Drexel University in Philadelphia. "Without scrutiny, there can be no science," says the expert on the replicability of empirical results in economics.

The translation of the entire article is here (Word .doc).

Comments

"The key question is: how can a study that is based on secret data that nobody has double-checked be printed without close examination by one of the most prestigious economics journals?"

Most papers are not "double-checked" before publication. They are reviewed (anonymously). But at best, they are double-checked (replicated) only after publication. Even if the data had been available, the study could well have been published, and likely would have been.

"This is especially puzzling because the supplier of the data has a special interest in a certain result."

If the suggestion is true (that the data are rigged), then even had the data been available, any replication would have found similar results. Of course, perhaps the rigging would have been detected.

I don't see how the authors or the editor can be blamed. Any fault lies with the file sharing companies who should perhaps make their confidential data available to some additional researchers.

Should we believe the study less (than we would had it been replicated)? Probably. Should this have been sufficient to deny publication? My opinion is no.

But what do I know...

Roger

I'm with Roger. Don't get mad, get even. If you think the data/analysis were bad, get new data and do a better job. Surely a thorough, careful, and replicable analysis of this question which delivers strikingly different results would find a place in a reputable journal somewhere. Note: the Liebowitz paper isn't this, but rather it's a critique of the original Oberholzer-Gee and Strumpf piece. It's always easier (and generally less fruitful) to tear down than to build up.

Newmark -- with his blogfull of compromised ability to think critically -- is hardly the person to point a judgmental finger at the scholarship of a colleague. More appropriately, Newmark might consider retiring and sparing his profession its continuing embarrassment over his sophomoric inability to parse knowledge.

This isn't uncommon, as pretty much any of us who has conducted, published, or reviewed articles for the economic and social science literature can attest. Consider the case of health economics, where studies may require the use of confidential patient data that cannot legally be shared without external approvals due to HIPAA constraints. Think of a cost-effectiveness study of treatment for hip fractures, which may require identifiable medical records and charge data in order to link physician, EMS, hospital, home health, and rehab institutional records to chart the totality of care for an injury.

Confidentiality agreements may also be necessary to get access to the data in the first place - for example, if the dataset contains confidential business information or the researchers wish to gain candid information on practices that may be illegal (such as copyright violations in this case?).

Access to data is an ideal, but like any ideal too rigid an adherence may undercut the ability to actually DO research. As the saying goes, "The Perfect is the enemy of the Good" - in the real world, we often have to satisfice rather than satisfy our quest for the ideal.

Lack of access, however, does not mean that the work cannot be replicated. Replication of the work does not require retesting the hypotheses using the exact same dataset. Others are free to go out and negotiate access to independent data sources and retest the hypothesis. IMHO, meta-analysis of the results of multiple studies using independent data will answer the question in a more definitive fashion than simply redoing the statistics using the same data. Any individual economic study is flawed by nature - we are forced by the assumptions and limitations of statistical techniques and cognitive processes to act as reductionists. Only by evaluating multiple independent and overlapping studies can we really begin to understand the true role of a phenomenon in the broader economic world.
