Advancing Knowledge Is Difficult- Valutrics

Dr. Stephen Scherer, from Toronto’s world-renowned Hospital for Sick Children, is, it turns out, one of the world’s leading researchers into autism, or to be more precise, into autism spectrum disorder (ASD). The study of autism is a relatively young field, with a body of research going back only seventy years or so. The somewhat scattered nature of the research is representative of the challenges faced by the investigators of any mystery. With a mystery, it is difficult to know where to start. What is this condition? Where does it originate? How does it operate? What makes children with ASD behave as they do? Is it the result of a virus? Is it the way their parents treat them? Is it a response to environmental conditions, diet, or even the vaccinations administered to them in infancy? Is it genetic? How can a researcher even begin to think about understanding and treating this mysterious disorder?

With so much unknown, investigators have to consider every possible contributor to the condition, because they do not yet know what to leave out. Anything omitted might turn out to be a key to unlocking the mystery. With little to go on, investigators typically seek a pattern in the chaos and look deeply into that pattern for answers. Based on our brief conversation, I began to suspect that Scherer approached his mystery—and the search for patterns within it—a little differently and perhaps more successfully than most of us do. What sets Scherer apart, I thought, is that he looks for patterns in the data that other researchers discard as extraneous.

Scherer’s unconventional approach has met with unconventional success. He was just forty-four years old when I sat with him at an awards dinner, where he was to receive the evening’s capstone award. Yet he already had more than 240 peer-reviewed journal articles on his résumé, cited by other scholars more than fifteen thousand times. His list of awards is too long to recite. 1 I was so intrigued by Scherer’s work and methods that I arranged to meet him again at his laboratory. I wanted to learn how he thinks about turning medical mysteries into heuristics and algorithms.

We started the meeting in his office, where I learned more about what the self-deprecating Scherer calls his “garbage can approach.” Scherer credits his breakthroughs in understanding ASD to his focus on data that others tend to toss into the trash. He believes that the answers to mysteries can often be found in the “outlier” data that does not seem to fit comfortably within one of the categories he or others have constructed. To illustrate his point, he drew a simple scatter plot diagram and then drew large circles around the dots that seemed to fit together. Finally, he pointed to a few dots farthest removed from the clusters of data. “These are the really interesting things,” he said. “These are the guys that we study.” 2

To help me understand more concretely what he was saying, we walked from his office to his laboratory. And what a lab it was. The second we passed through the security door, we stepped out of what felt like a normal office building into the future. There were complex machines stacked high on racking systems as far as I could see. Modern medical research is a capital-intensive game. Scherer showed me a gene microchip high-definition imaging device that he uses to compare DNA sequences from individuals with ASD against control samples from people without the condition. He showed me slides of the results, pointing out the large peaks that indicated missing DNA on the chromosomes of a patient with ASD—deletions that have been studied intensively by scientists around the world. His focus, however, is on the data that other researchers set aside as random, statistically insignificant, or simply not germane.

“Look at this,” he said, leaning over a slide of a genetic sequence. “What else do you see besides the peaks? You see all of these other little things. In every experiment, you see these little spurious things come up. Everybody else throws them away.” His experience as a geneticist tells him that the discarded data may contain clues to the genetic anomaly underlying ASD. “In genetics,” he explained, “nature often protects things through redundancy, complexity, and variability, all of which cloud our resolution for interpretation. So those individual, hard-to-reproduce variations are like a signpost: this is what you should be looking at.”

After twenty years of research, Scherer has started to see a pattern emerge from those little differences. With a “logical leap of the mind,” he has formulated a heuristic, which is that the key to the mystery is to be found in deletions and duplications in certain genes (“copy number variations,” as geneticists call them) found in children with ASD. Those genetic anomalies, he infers, predispose autistic children to have developmental imbalances leading to unique behavioral tendencies including repetitive overanalysis of aspects of their world, such as numerical patterns or arrangements of objects. Rather than process information and move on, as most of us do, these individuals settle on a feature of their mental or physical environment and dig in, obsessively analyzing and reanalyzing the information to the exclusion of the rest of the world. The medical term for this obsessive, repetitive focus is perseveration.

Scherer’s heuristic is typical in that it makes no attempt to encompass the width and breadth of the problem at hand. He has left out large tracts of investigative territory, such as possible environmental causes and sections of missing DNA. To wrestle the mystery down to manageable size, he focused on understanding what autistic children do. His answer: they overanalyze or perseverate. From there, he examined why they do so, and from his discovery of a pattern in seemingly unrelated bits of nonconforming genetic data, he has concluded that the answer is to be found in the copy number variations of people with ASD. By limiting his focus, he has advanced to a working heuristic, while many of his colleagues worldwide are still probing the mystery and cycling through numerous potential angles of attack.

Scherer moved from mystery to heuristic by focusing not on what was common from experiment to experiment (the replicable peaks in the genetic sequence) but on what was different: the oddball findings that stood out from the others as unexplained and unexpected. As he says, “evolution doesn’t tolerate junk for too long, so all data, even the outliers, need to be considered.” Ultimately, he was interested in producing not a reliable outcome but a valid one. He wanted to generate a theory on the origins of ASD that would move knowledge forward.

This distinction between reliability and validity is at the heart of the innovation dilemma, for medical researchers and businesspeople alike. The challenge is how to balance the irresolvable tension between operating within the current knowledge stage and moving through the knowledge funnel. The tension can’t be fully resolved but only balanced and managed, because reliability and validity are inherently incompatible.

The goal of reliability is to produce consistent, predictable outcomes. A perfectly reliable blood-testing procedure would produce the same test results each of a hundred times, if a blood sample were divided into a hundred portions and tested successively using the procedure. A perfectly reliable political poll would produce the same result from five different random samples of voters. Reliability, in this context, is achieved by narrowing the scope of the test to what can be measured in a replicable, quantitative way and by eliminating as much subjectivity, judgment, and bias as possible.

The goal of validity, on the other hand, is to produce outcomes that meet a desired objective. A perfectly valid system produces a result that is shown, through the passage of time, to be correct. A valid blood test is one that assesses whether the subject actually has hepatitis B or not. A perfectly valid political poll would predict in advance the winner of the election and the winning percentage. Validity is difficult to achieve with only quantitative measures, because those measures strip away nuance and context. Typically, to achieve a valid outcome, one must incorporate some aspects of the subjectivity and judgment that are eschewed in the quest for a reliable outcome.
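For readers who think in numbers, the contrast can be sketched in a few lines of Python. This is only an illustration, not anything drawn from Scherer's work: the readings, the "true" value, and the function names `reliability_spread` and `validity_error` are all hypothetical. Reliability is approximated here as the spread of repeated readings, and validity as the gap between the average reading and the truth.

```python
from statistics import mean, pstdev


def reliability_spread(readings):
    """Spread of repeated readings; 0.0 means perfectly repeatable."""
    return pstdev(readings)


def validity_error(readings, true_value):
    """Gap between the average reading and the truth; 0.0 means on target."""
    return abs(mean(readings) - true_value)


# Hypothetical data: five repeated measurements of a quantity whose
# true value is assumed to be 50.0.
true_value = 50.0
test_a = [42.0, 42.0, 42.0, 42.0, 42.0]  # same answer every time, but wrong
test_b = [46.0, 54.0, 49.0, 51.0, 50.0]  # varies, yet centred on the truth

for name, readings in (("A", test_a), ("B", test_b)):
    print(
        name,
        "spread:", round(reliability_spread(readings), 2),
        "error:", round(validity_error(readings, true_value), 2),
    )
```

Under these made-up numbers, test A is perfectly reliable yet consistently off target, while test B is less repeatable but lands on the right answer on average. Neither number alone tells you whether a procedure is good.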

To illustrate the distinction between reliability and validity a little more clearly, let’s return to medical research, Scherer’s field. Scherer was interested in validity, even at the expense of greater reliability. He looked at small differences, rather than big similarities across test results. This emphasis on getting to a valid answer enabled him to move from one stage of the knowledge funnel (mystery) to the next (heuristic). But consider the kind of large-scale, groundbreaking work that requires hundreds of participants around the world to work together, refining knowledge within one stage. Here, reliability would be paramount, as it was in the execution of the Human Genome Project.

The Human Genome Project, the great global medical project of the past decade, was a publicly funded endeavor (alongside a parallel, privately funded effort headed by Dr. J. Craig Venter, a pioneer in gene-sequencing technology) that used donor DNA from more than seven hundred anonymous individuals to create a single mosaic sequence representative of universal human DNA. 3 The model constructed from that massive data set smoothed out individual differences to assemble a merged sequence meant to reflect the genetic information we all have in common. Note that by creating a seven-hundred-person composite, the effort wiped out every piece of “garbage can” data that Scherer would have used to establish his findings. That is a key price of reliability: the simplification or conformity that enables consistent replicability also leaves out knowledge that is necessary for greater validity.

Unlike Scherer’s work, which to date has progressed along the knowledge funnel only from mystery to heuristic, the Human Genome Project represented the careful application and refinement of an algorithm. That is in fact what made the project possible. The Human Genome Project scientists reduced the task of gene sequencing to a specific algorithm, which enabled them to assign research teams worldwide the task of applying that sequencing algorithm to a particular piece of the overall genome. This would not have been possible if the sequencing technique had remained a heuristic; in that case, the teams would have utilized many different approaches based on their expert judgment in applying the heuristic. Operating the relatively efficient and clear algorithm, on the other hand, required little application of judgment.

While it was an epic and mammoth project, there was little doubt that the researchers would eventually succeed in assembling the human genetic model because their method was reliable. The real question was how soon they would complete the sequence. Thanks in part to competitive pressure from Venter’s parallel project, the Human Genome Project finished ahead of schedule. In scientific research as in business, pushing knowledge from heuristic to algorithm generates impressive efficiencies.

The Human Genome Project has its parallels in the business world, where large groups of highly trained people employ algorithms in pursuit of reliable results. Case in point: today’s complex, elaborate, firmwide software. Enterprise resource planning (ERP) systems keep track of all corporate data in a single database and spit out comprehensive reports on everything from inventory levels to sales by product to cost absorption by area. Customer relationship management (CRM) systems purport to ensure that a company knows exactly who its customers are, what each is buying, and what more it could sell to them. Six Sigma programs and total quality management (TQM) systems knock the waste out of an organization’s systems, and knowledge management (KM) systems (attempt to) organize all the knowledge in a corporation. Those and other tools enable the modern corporation to crunch data objectively and extrapolate from the past to make “scientific” predictions about the future, all part of the quest for reliability.

Commercial enterprises seek validity too, of course, usually classifying the activity as research and development. Pharmaceutical companies spend billions of dollars each year staring into the mysteries of diseases. Consumer packaged-goods companies like Unilever and Colgate spend billions each year to explore the mysteries of consumer desires and the products that might satisfy them. Similar efforts are under way at information technology companies, medical device companies, and other companies in research-intensive sectors. In each case, their inquiries are considered high-risk activity, because they lack a formal production process. The corporation cannot define the resources or time frame required to solve the mystery, which means that debt financing cannot be used as a source of funding for the exploration. Debt must be paid back on a predetermined schedule, and thus exploration, which has no schedule, requires equity financing, which has no fixed schedule for paying equity providers or even any assurance that the equity providers will ever be repaid. Given this dynamic, it’s no wonder that the business world chooses reliability over validity.
