The Dark Secret of Bioinformatics

Last week, I gave a presentation on my current research. My brief introduction ended with comments that are a mantra to many grad students: “…which will hopefully be my dissertation project.” The talk went well enough—the audience was just other grad students—and I got a lot of good feedback. At the start of the question-and-answer period, one of the girls raised her hand and asked, “Do you have a hypothesis?”

This seems like a perfectly reasonable question. As any fifth grader can tell you, the scientific method starts off by forming a hypothesis. Then you conduct experiments and gather evidence to support or refute your hypothesis. So a Ph.D. candidate should obviously have a hypothesis about his dissertation project, right?

I chuckled a little at the question. “No,” I said, “this is discovery research.”

Discovery research is pretty much the opposite of traditional, hypothesis-driven science. Instead of having an idea about the results before performing an experiment, the researcher says, “Hey, let’s see what happens if we do this!”

Biology is such a complicated subject that sometimes we’re reduced to this sort of mentality, blindly pushing forward with what seems at the time to be a good idea. This is especially true in bioinformatics, where we’re frequently trying to squeeze meaning from an incomprehensible mass of data.

Some very esteemed researchers have trumpeted the advantages of discovery research and the wonderful things that are possible without the encumbrance of hypotheses. That makes me shiver a little bit. The hypothesis is the core of the scientific method, and if we give that up, are we really doing science? I may be doing discovery research, but I feel somewhat guilty about it, like I’m a charlatan and it’s only a matter of time until I’m found out.

The girl who asked the question at my presentation works in a protein engineering lab, so it’s a lot of bench work and a lot of hypotheses. “Man,” she responded to me, “I’m in the wrong field.”

2 Comments

  1. DocWedley on 10 May 2007 at 8:14 am

    Mmmm, I hope she wasn’t trying to pull the standard “Gotcha!” question that is all too common among competitively minded grad students.

    It is a fair question. I recently had a grad student tell me that a seemingly innocent question I asked her (how she planned to test the computational models she was developing using historical data when future events were scarce) still haunts her to this day. I felt kinda bad about it. She said she was glad I asked because it was a good question precisely because it was difficult to answer, but still, I didn’t intend to haunt her with it.

    Still, I think you could probably craft a hypothesis to fit your work. It might sound a little cheeseball (“We hypothesize that mining dataset X using technique Y will yield useful candidate widgets for further testing…”) but, strictly speaking, it’s a hypothesis, and it’s testable. Either you find useful widgets, or you don’t. Sometimes you might have an underlying hypothesis that you’re trying to test using the bioinformatic approach, and you might be able to simply address that underlying hypothesis, i.e., “We hypothesize that a novel widget with the following properties exists which will make a cell do XYZ…” Then you go about trying to identify a suitable widget using your bioinformatics approach.

    It might be useful, because there are some PhDs out there who sharpen their knives, waiting to eviscerate a student who hasn’t armed himself or herself with some sort of hypothesis. :)

    DocWedley

  2. Phil on 10 May 2007 at 9:30 am

    I gave that presentation at the beginning of April, and since then I’ve gotten more data, refined my expectations, and kinda gone in a different direction. I do actually have a hypothesis, although it is broad like you described (“I expect distribution X.”). Even better, I have a null hypothesis (“If these events happened randomly, we would get distribution Z”)! Unfortunately, with my preliminary data, X looks an awful lot like Z. I’m doing another analysis now, which will hopefully separate X and Z much more.

    It also helps that I’ve been working closely with my co-advisor and some of the students in his lab. They’re all ecologists and evolutionists—much more “pure” science—so they’re very focused on hypotheses and such. My lab, as a technology and engineering lab, is more concerned about results and applications and not so much theories. :)
