…and then there were two!

July 30th, 2013 by Tal Yarkoni

Last year when I launched my lab (which, full disclosure, is really just me, plus some of my friends who were kind enough to let me plaster their names and faces on my website), I decided to call it the Psychoinformatics Lab (or PILab for short and pretentious), because, well, why not. It seemed to nicely capture what my research is about: psychology and informatics. But it wasn’t an entirely comfortable decision, because a non-trivial portion of my brain was quite convinced that everyone was going to laugh at me. And even now, after more than a year of saying I’m a “psychoinformatician” whenever anyone asks me what I do, I still feel a little bit fraudulent each time–as if I’d just said I was a member of the Estonian Cosmonaut program, or the president of the Build-a-Bear fan club*.

But then… just last week… everything suddenly changed! All in one fell swoop–in one tiny little nudge of a shove-this-on-the-internet button, things became magically better. And now colors are vibrating**, birds are chirping merry chirping songs–no, wait, those are actually cicadas–and the world is basking in a pleasant red glow of humming monitors and five-star Amazon reviews. Or something like that. I’m not so good with the metaphors.

Why so upbeat, you ask? Well, because as of this writing, there is no longer just the one lone Psychoinformatics Lab. No! Now there are not one, not three, not seven Psychoinformatics Labs, but… two! There are two Psychoinformatics Labs. The good Dr. Michael Hanke (of PyMVPA and NeuroDebian fame) has just finished putting the last coat of paint on the inside of his brand new cage–er, Psychoinformatics Lab–at Otto-von-Guericke University in Magdeburg, Germany. No, really***: his startup package didn’t include any money for paint, so he had to barter his considerable programming skills for three buckets of Going to the Chapel (yes, that’s a real paint color).

The good Dr. Hanke drifts through interstellar space in search of new psychoinformatic horizons.

Anyway, in case you can’t tell, I’m quite excited about this. Not because it’s a sign that informatics approaches are making headway in psychology, or that pretty soon every psychology lab will have a high-performance computing cluster hiding in its closet (one can dream, right?). No sir. I’m excited for two much more pedestrian reasons. First, because from now on, any time anyone makes fun of me for calling myself a psychoinformatician, I’ll be able to say, with a straight face, well it’s not just me, you know–there are multiple ones of us doing this here research-type thing with the data and the psychology and the computers. And second, because Michael is such a smart and hardworking guy that I’m pretty sure he’s going to legitimize this whole enterprise and drag me along for the ride with him, so I won’t have to do anything else myself. Which is good, because if laziness were an Olympic sport, I’d never leave the starting block.

No, but in all seriousness, Michael is an excellent scientist and an exceptional human being, and I couldn’t be happier for him in his new job as Lord Director of All Things Psychoinformatic (Eastern Division). You might think I’m only saying this because he just launched the world’s second PILab, complete with quote from yours truly on said lab’s website front page. Well, you’d be right. But still. He’s a pretty good guy, and I’m sure we’re going to see amazing things coming out of Magdeburg.

Now if anyone wants to launch PILab #3 (maybe in Asia or South America?), just let me know, and I’ll make you the same offer I made Michael: an envelope full of $1 bills (well, you know, I’m an academic–I can’t afford Benjamins just yet) and a blog post full of ridiculous superlatives.

 

* Perhaps that’s not a good analogy, because that one may actually exist.

** But seriously, in real life, colors should not vibrate. If you ever notice colors vibrating, drive to the nearest emergency room and tell them you’re seeing colors vibrating.

*** No, not really.

what do you get when you put 1,000 psychologists together in one journal?

April 5th, 2013 by Tal Yarkoni

I’m working on a TOP SEKKRIT* project involving large-scale data mining of the psychology literature. I don’t have anything to say about the TOP SEKKRIT* project just yet, but I will say that in the process of extracting certain information I needed in order to do certain things I won’t talk about, I ended up with certain kinds of data that are useful for certain other tangential analyses. Just for fun, I threw some co-authorship data from 2,000+ Psychological Science articles into the d3.js blender, and out popped an interactive network graph of all researchers who have published at least 2 papers in Psych Science in the last 10 years**. It looks like this:

[Figure: interactive co-authorship network graph for Psychological Science]

You can click on the image to take a closer (and interactive) look.

I don’t think this is very useful for anything right now, but if nothing else, it’s fun to drag Adam Galinsky around the screen and watch half of the field come along for the ride. There are plenty of other more interesting things one could do with this, though, and it’s also quite easy to generate the same graph for other journals, so I expect to have more to say about this later on.
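
For the curious, the preprocessing involved is not very complicated. Here’s a rough sketch–in Python, with a toy input format (one author list per paper) and made-up names throughout–of how one might boil co-authorship data down to the nodes-and-links JSON that d3.js force layouts expect. This isn’t the exact code I used; it’s just meant to illustrate the general idea:

import json
from collections import Counter
from itertools import combinations

# Toy input: one list of authors per paper (names are made up)
papers = [["A. Smith", "B. Jones"], ["B. Jones", "C. Lee", "A. Smith"]]

# Keep only authors with at least 2 papers
paper_counts = Counter(a for authors in papers for a in set(authors))
keep = {a for a, n in paper_counts.items() if n >= 2}

# Count co-authorships among the remaining authors
pair_counts = Counter()
for authors in papers:
    for a, b in combinations(sorted(set(authors) & keep), 2):
        pair_counts[(a, b)] += 1

# Emit the {"nodes": [...], "links": [...]} structure d3 force layouts expect
nodes = sorted({a for pair in pair_counts for a in pair})
index = {a: i for i, a in enumerate(nodes)}
graph = {
    "nodes": [{"name": a, "papers": paper_counts[a]} for a in nodes],
    "links": [{"source": index[a], "target": index[b], "value": n}
              for (a, b), n in pair_counts.items()],
}
json.dump(graph, open("coauthorship_graph.json", "w"))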

 

* It’s not really TOP SEKKRIT at all–it just sounds more exciting that way.

** Or, more accurately, researchers who have co-authored at least 2 Psych Science papers with other researchers who meet the same criterion. Otherwise we’d have even more nodes in the graph, and as you can see, it’s already pretty messy.

the truth is not optional: five bad reasons (and one mediocre one) for defending the status quo

March 12th, 2013 by Tal Yarkoni

You could be forgiven for thinking that academic psychologists have all suddenly turned into professional whistleblowers. Everywhere you look, interesting new papers are cropping up purporting to describe this or that common-yet-shady methodological practice, and telling us what we can collectively do to solve the problem and improve the quality of the published literature. In just the last year or so, Uri Simonsohn introduced new techniques for detecting fraud, and used those tools to identify at least 3 cases of high-profile, unabashed data forgery. Simmons and colleagues reported simulations demonstrating that standard exploitation of research degrees of freedom in analysis can produce extremely high rates of false positive findings. Pashler and colleagues developed a “Psych file drawer” repository for tracking replication attempts. Several researchers raised trenchant questions about the veracity and/or magnitude of many high-profile psychological findings such as John Bargh’s famous social priming effects. Wicherts and colleagues showed that authors of psychology articles who are less willing to share their data upon request are more likely to make basic statistical errors in their papers. And so on and so forth. The flood shows no signs of abating; just last week, the APS journal Perspectives on Psychological Science announced that it’s introducing a new “Registered Replication Report” section that will commit to publishing pre-registered high-quality replication attempts, irrespective of their outcome.

Personally, I think these are all very welcome developments for psychological science. They’re solid indications that we psychologists are going to be able to police ourselves successfully in the face of some pretty serious problems, and they bode well for the long-term health of our discipline. My sense is that the majority of other researchers–perhaps the vast majority–share this sentiment. Still, as with any zeitgeist shift, there are always naysayers. In discussing these various developments and initiatives with other people, I’ve found myself arguing, with somewhat surprising frequency, with people who for various reasons think it’s not such a good thing that Uri Simonsohn is trying to catch fraudsters, or that social priming findings are being questioned, or that the consequences of flexible analyses are being exposed. Since many of the arguments I’ve come across tend to recur, I thought I’d summarize the most common ones here–along with the rebuttals I usually offer for why, with one possible exception, the arguments for giving a pass to sloppy-but-common methodological practices are not very compelling.

“But everyone does it, so how bad can it be?”

We typically assume that long-standing conventions must exist for some good reason, so when someone raises doubts about some widespread practice, it’s quite natural to question the person raising the doubts rather than the practice itself. Could it really, truly be (we say) that there’s something deeply strange and misguided about using p values? Is it really possible that the reporting practices converged on by thousands of researchers in tens of thousands of neuroimaging articles might leave something to be desired? Could failing to correct for the many researcher degrees of freedom associated with most datasets really inflate the false positive rate so dramatically?

The answer to all these questions, of course, is yes–or at least, we should allow that it could be yes. It is, in principle, entirely possible for an entire scientific field to regularly do things in a way that isn’t very good. There are domains where appeals to convention or consensus make perfect sense, because there are few good reasons to do things a certain way except inasmuch as other people do them the same way. If everyone else in your country drives on the right side of the road, you may want to consider driving on the right side of the road too. But science is not one of those domains. In science, there is no intrinsic benefit to doing things just for the sake of convention. In fact, almost by definition, major scientific advances are ones that tend to buck convention and suggest things that other researchers may not have considered possible or likely.

In the context of common methodological practice, it’s no defense at all to say but everyone does it this way, because there are usually relatively objective standards by which we can gauge the quality of our methods, and it’s readily apparent that there are many cases where the consensus approach leaves something to be desired. For instance, you can’t really justify failing to correct for multiple comparisons when you report a single test that’s just barely significant at p < .05 on the grounds that nobody else corrects for multiple comparisons in your field. That may be a valid explanation for why your paper successfully got published (i.e., reviewers didn’t want to hold your feet to the fire for something they themselves are guilty of in their own work), but it’s not a valid defense of the actual science. If you run a t-test on randomly generated data 20 times, you will, on average, get a significant result, p < .05, once. It does no one any good to argue that because the convention in a field is to allow multiple testing–or to ignore statistical power, or to report only p values and not effect sizes, or to omit mention of conditions that didn’t ‘work’, and so on–it’s okay to ignore the issue. There’s a perfectly reasonable question as to whether it’s a smart career move to start imposing methodological rigor on your work unilaterally (see below), but there’s no question that the mere presence of consensus or convention surrounding a methodological practice does not make that practice okay from a scientific standpoint.
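
If you doubt that arithmetic, it takes only a dozen lines of code to check it by simulation. Here’s a quick sketch (using numpy and scipy; the sample sizes and number of runs are arbitrary):

import numpy as np
from scipy import stats

np.random.seed(0)
n_batches, n_tests, n_per_group = 1000, 20, 30

false_positives = []
for _ in range(n_batches):
    sig = 0
    for _ in range(n_tests):
        # two groups drawn from the same distribution, so any 'effect' is noise
        a = np.random.randn(n_per_group)
        b = np.random.randn(n_per_group)
        _, p = stats.ttest_ind(a, b)
        sig += p < .05
    false_positives.append(sig)

# ~1.0: about one significant result per batch of 20 tests, on average
print(np.mean(false_positives))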

“But psychology would break if we could only report results that were truly predicted a priori!”

This is a defense that has some plausibility at first blush. It’s certainly true that if you force researchers to correct for multiple comparisons properly, and report the many analyses they actually conducted–and not just those that “worked”–a lot of stuff that used to get through the filter will now get caught in the net. So, by definition, it would be harder to detect unexpected effects in one’s data–even when those unexpected effects are, in some sense, ‘real’. But the important thing to keep in mind is that raising the bar for what constitutes a believable finding doesn’t actually prevent researchers from discovering unexpected new effects; all it means is that it becomes harder to report post-hoc results as pre-hoc results. It’s not at all clear why forcing researchers to put in more effort validating their own unexpected finding is a bad thing.

In fact, forcing researchers to go the extra mile in this way would have one exceedingly important benefit for the field as a whole: it would shift the onus of determining whether an unexpected result is plausible enough to warrant pursuing away from the community as a whole, and towards the individual researcher who discovered the result in the first place. As it stands right now, if I discover an unexpected result (p < .05!) that I can make up a compelling story for, there’s a reasonable chance I might be able to get that single result into a short paper in, say, Psychological Science. And reap all the benefits that attend getting a paper into a “high-impact” journal. So in practice there’s very little penalty to publishing questionable results, even if I myself am not entirely (or even mostly) convinced that those results are reliable. This state of affairs is, to put it mildly, not A Good Thing.

In contrast, if you as an editor or reviewer start insisting that I run another study that directly tests and replicates my unexpected finding before you’re willing to publish my result, I now actually have something at stake. Because it takes time and money to run new studies, I’m probably not going to bother to follow up on my unexpected finding unless I really believe it. Which is exactly as it should be: I’m the guy who discovered the effect, and I know about all the corners I have or haven’t cut in order to produce it; so if anyone should make the decision about whether to spend more taxpayer money chasing the result, it should be me. You, as the reviewer, are not in a great position to know how plausible the effect truly is, because you have no idea how many different types of analyses I attempted before I got something to ‘work’, or how many failed studies I ran that I didn’t tell you about. Given the huge asymmetry in information, it seems perfectly reasonable for reviewers to say, You think you have a really cool and unexpected effect that you found a compelling story for? Great; go and directly replicate it yourself and then we’ll talk.

“But mistakes happen, and people could get falsely accused!”

Some people don’t like the idea of a guy like Simonsohn running around and busting people’s data fabrication operations for the simple reason that they worry that the kind of approach Simonsohn used to detect fraud is just not that well-tested, and that if we’re not careful, innocent people could get swept up in the net. I think this concern stems from fundamentally good intentions, but once again, I think it’s also misguided.

For one thing, it’s important to note that, despite all the press, Simonsohn hasn’t actually done anything qualitatively different from what other whistleblowers or skeptics have done in the past. He may have suggested new techniques that improve the efficiency with which cheating can be detected, but it’s not as though he invented the ability to report or investigate other researchers for suspected misconduct. Researchers suspicious of other researchers’ findings have always used qualitatively similar arguments to raise concerns. They’ve said things like, hey, look, this is a pattern of data that just couldn’t arise by chance, or, the numbers are too similar across different conditions.

More to the point, perhaps, no one is seriously suggesting that independent observers shouldn’t be allowed to raise their concerns about possible misconduct with journal editors, professional organizations, and universities. There really isn’t any viable alternative. Naysayers who worry that innocent people might end up ensnared by false accusations presumably aren’t suggesting that we do away with all of the existing mechanisms for ensuring accountability; but since the role of people like Simonsohn is only to raise suspicion and provide evidence (and not to do the actual investigating or firing), it’s clear that there’s no way to regulate this type of behavior even if we wanted to (which I would argue we don’t). If I wanted to spend the rest of my life scanning the statistical minutiae of psychology articles for evidence of misconduct and reporting it to the appropriate authorities (and I can assure you that I most certainly don’t), there would be nothing anyone could do to stop me, nor should there be. Remember that accusing someone of misconduct is something anyone can do, but establishing that misconduct has actually occurred is a serious task that requires careful internal investigation. No one–certainly not Simonsohn–is suggesting that a routine statistical test should be all it takes to end someone’s career. In fact, Simonsohn himself has noted that he identified a 4th case of likely fraud that he dutifully reported to the appropriate authorities only to be met with complete silence. Given all the incentives universities and journals have to look the other way when accusations of fraud are made, I suspect we should be much more concerned about the false negative rate than the false positive rate when it comes to fraud.

“But it hurts the public’s perception of our field!”

Sometimes people argue that even if the field does have some serious methodological problems, we still shouldn’t discuss them publicly, because doing so is likely to instill a somewhat negative view of psychological research in the public at large. The unspoken implication being that, if the public starts to lose confidence in psychology, fewer students will enroll in psychology courses, fewer faculty positions will be created to teach students, and grant funding to psychologists will decrease. So, by airing our dirty laundry in public, we’re only hurting ourselves. I had an email exchange with a well-known researcher to exactly this effect a few years back in the aftermath of the Vul et al “voodoo correlations” paper–a paper I commented on to the effect that the problem was even worse than suggested. The argument my correspondent raised was, in effect, that we (i.e., neuroimaging researchers) are all at the mercy of agencies like NIH to keep us employed, and if it starts to look like we’re clowning around, the unemployment rate for people with PhDs in cognitive neuroscience might start to rise precipitously.

While I obviously wouldn’t want anyone to lose their job or their funding solely because of a change in public perception, I can’t say I’m very sympathetic to this kind of argument. The problem is that it places short-term preservation of the status quo above both the long-term health of the field and the public’s interest. For one thing, I think you have to be quite optimistic to believe that some of the questionable methodological practices that are relatively widespread in psychology (data snooping, selective reporting, etc.) are going to sort themselves out naturally if we just look the other way and let nature run its course. The obvious reason for skepticism in this regard is that many of the same criticisms have been around for decades, and it’s not clear that anything much has improved. Maybe the best example of this is Sedlmeier and Gigerenzer’s 1989 paper entitled “Do studies of statistical power have an effect on the power of studies?”, in which the authors convincingly showed that despite three decades of work by luminaries like Jacob Cohen advocating power analyses, statistical power had not risen appreciably in psychology studies. The presence of such unwelcome demonstrations suggests that sweeping our problems under the rug in the hopes that someone (the mice?) will unobtrusively take care of them for us is wishful thinking.

In any case, even if problems did tend to solve themselves when hidden away from the prying eyes of the media and public, the bigger problem with what we might call the “saving face” defense is that it is, fundamentally, an abuse of taxpayers’ trust. As with so many other things, Richard Feynman summed up the issue eloquently in his famous Cargo Cult Science commencement speech:

For example, I was a little surprised when I was talking to a friend who was going to go on the radio. He does work on cosmology and astronomy, and he wondered how he would explain what the applications of this work were. “Well,” I said, “there aren’t any.” He said, “Yes, but then we won’t get support for more research of this kind.” I think that’s kind of dishonest. If you’re representing yourself as a scientist, then you should explain to the layman what you’re doing–and if they don’t want to support you under those circumstances, then that’s their decision.

The fact of the matter is that our livelihoods as researchers depend directly on the goodwill of the public. And the taxpayers are not funding our research so that we can “discover” interesting-sounding but ultimately unreplicable effects. They’re funding our research so that we can learn more about the human mind and hopefully be able to fix it when it breaks. If a large part of the profession is routinely employing practices that are at odds with those goals, it’s not clear why taxpayers should be footing the bill. From this perspective, it might actually be a good thing for the field to revise its standards, even if (in the worst-case scenario) that causes a short-term contraction in employment.

“But unreliable effects will just fail to replicate, so what’s the big deal?”

This is a surprisingly common defense of sloppy methodology, maybe the single most common one. It’s also an enormous cop-out, since it pre-empts the need to think seriously about what you’re doing in the short term. The idea is that, since no single study is definitive, and a consensus about the reality or magnitude of most effects usually doesn’t develop until many studies have been conducted, it’s reasonable to impose a fairly low bar on initial reports and then wait and see what happens in subsequent replication efforts.

I think this is a nice ideal, but things just don’t seem to work out that way in practice. For one thing, there doesn’t seem to be much of a penalty for publishing high-profile results that later fail to replicate. The reason, I suspect, is that we’re inclined to give researchers the benefit of the doubt: surely (we say to ourselves), Jane Doe did her best, and we like Jane, so why should we question the work she produces? If we’re really so skeptical about her findings, shouldn’t we go replicate them ourselves, or wait for someone else to do it?

While this seems like an agreeable and fair-minded attitude, it isn’t actually a terribly good way to look at things. Granted, if you really did put in your best effort–dotted all your i’s and crossed all your t’s–and still ended up reporting a false result, we shouldn’t punish you for it. I don’t think anyone is seriously suggesting that researchers who inadvertently publish false findings should be ostracized or shunned. On the other hand, it’s not clear why we should continue to celebrate scientists who ‘discover’ interesting effects that later turn out not to replicate. If someone builds a career on the discovery of one or more seemingly important findings, and those findings later turn out to be wrong, the appropriate attitude is to update our beliefs about the merit of that person’s work. As it stands, we rarely seem to do this.

In any case, the bigger problem with appeals to replication is that the delay between initial publication of an exciting finding and subsequent consensus disconfirmation can be very long, and often spans entire careers. Waiting decades for history to prove an influential idea wrong is a very bad idea if the available alternative is to nip the idea in the bud by requiring stronger evidence up front.

There are many notable examples of this in the literature. A well-publicized recent one is John Bargh’s work on the motor effects of priming people with elderly stereotypes–namely, that priming people with words related to old age makes them walk away from the experiment more slowly. Bargh’s original paper was published in 1996, and according to Google Scholar, has now been cited over 2,000 times. It has undoubtedly been hugely influential in directing many psychologists’ research programs in certain directions (in many cases, in directions that are equally counterintuitive and also now seem open to question). And yet it’s taken over 15 years for a consensus to develop that the original effect is at the very least much smaller in magnitude than originally reported, and potentially so small as to be, for all intents and purposes, “not real”. I don’t know who reviewed Bargh’s paper back in 1996, but I suspect that if they ever considered the seemingly implausible size of the effect being reported, they might well have thought to themselves, well, I’m not sure I believe it, but that’s okay–time will tell. Time did tell, of course; but time is kind of lazy, so it took fifteen years for it to tell. In an alternate universe, a reviewer might have said, well, this is a striking finding, but the effect seems implausibly large; I would like you to try to directly replicate it in your lab with a much larger sample first. I recognize that this is onerous and annoying, but my primary responsibility is to ensure that only reliable findings get into the literature, and inconveniencing you seems like a small price to pay. Plus, if the effect is really what you say it is, people will be all the more likely to believe you later on.

Or take the actor-observer asymmetry, which appears in just about every introductory psychology textbook written in the last 20 – 30 years. It states that people are relatively more likely to attribute their own behavior to situational factors, and relatively more likely to attribute other agents’ behaviors to those agents’ dispositions. When I slip and fall, it’s because the floor was wet; when you slip and fall, it’s because you’re dumb and clumsy. This putative asymmetry was introduced and discussed at length in a book by Jones and Nisbett in 1971, and hundreds of studies have investigated it at this point. And yet a 2006 meta-analysis by Malle suggested that the cumulative evidence for the actor-observer asymmetry is actually very weak. There are some specific circumstances under which you might see something like the postulated effect, but what is quite clear is that it’s nowhere near strong enough an effect to justify being routinely invoked by psychologists and even laypeople to explain individual episodes of behavior. Unfortunately, at this point it’s almost impossible to dislodge the actor-observer asymmetry from the psyche of most researchers–a reality underscored by the fact that the Jones and Nisbett book has been cited nearly 3,000 times, whereas the 2006 meta-analysis has been cited only 96 times (a very low rate for an important and well-executed meta-analysis published in Psychological Bulletin).

The fact that it can take many years–whether 15 or 45–for a literature to build up to the point where we’re even in a position to suggest with any confidence that an initially exciting finding could be wrong means that we should be very hesitant to appeal to long-term replication as an arbiter of truth. Replication may be the gold standard in the very long term, but in the short and medium term, appealing to replication is a huge cop-out. If you can see problems with an analysis right now that cast aspersions on a study’s results, it’s an abdication of responsibility to downplay your concerns and wait for someone else to come along and spend a lot more time and money trying to replicate the study. You should point out now why you have concerns. If the authors can address them, the results will look all the better for it. And if the authors can’t address your concerns, well, then, you’ve just done science a service. If it helps, don’t think of it as a matter of saying mean things about someone else’s work, or of asserting your own ego; think of it as potentially preventing a lot of very smart people from wasting a lot of time chasing down garden paths–and also saving a lot of taxpayer money. Remember that our job as scientists is not to make other scientists’ lives easy in the hopes they’ll repay the favor when we submit our own papers; it’s to establish and apply standards that produce convergence on the truth in the shortest amount of time possible.

“But it would hurt my career to be meticulously honest about everything I do!”

Unlike the other considerations listed above, I think the concern that being honest carries a price when it comes to doing research has a good deal of merit to it. Given the aforementioned delay between initial publication and later disconfirmation of findings (which even in the best case is usually longer than the delay between obtaining a tenure-track position and coming up for tenure), researchers have many incentives to emphasize expediency and good story-telling over accuracy, and it would be disingenuous to suggest otherwise. No malevolence or outright fraud is implied here, mind you; the point is just that if you keep second-guessing and double-checking your analyses, or insist on routinely collecting more data than other researchers might think is necessary, you will very often find that results that could have made a bit of a splash given less rigor are actually not particularly interesting upon careful cross-examination. Which means that researchers who have, shall we say, less of a natural inclination to second-guess, double-check, and cross-examine their own work will, to some degree, be more likely to publish results that make a bit of a splash (it would be nice to believe that pre-publication peer review filters out sloppy work, but empirically, it just ain’t so). So this is a classic tragedy of the commons: what’s good for a given individual, career-wise, is clearly bad for the community as a whole.

I wish I had a good solution to this problem, but I don’t think there are any quick fixes. The long-term solution, as many people have observed, is to restructure the incentives governing scientific research in such a way that individual and communal benefits are directly aligned. Unfortunately, that’s easier said than done. I’ve written a lot both in papers (1, 2, 3) and on this blog (see posts linked here) about various ways we might achieve this kind of realignment, but what’s clear is that it will be a long and difficult process. For the foreseeable future, it will continue to be an understandable though highly lamentable defense to say that the cost of maintaining a career in science is that one sometimes has to play the game the same way everyone else plays the game, even if it’s clear that the rules everyone plays by are detrimental to the communal good.

 

Anyway, this may all sound a bit depressing, but I really don’t think it should be taken as such. Personally I’m actually very optimistic about the prospects for large-scale changes in the way we produce and evaluate science within the next few years. I do think we’re going to collectively figure out how to do science in a way that directly rewards people for employing research practices that are maximally beneficial to the scientific community as a whole. But I also think that for this kind of change to take place, we first need to accept that many of the defenses we routinely give for using iffy methodological practices are just not all that compelling.

the seedy underbelly

March 2nd, 2013 by Tal Yarkoni

This is fiction. Science will return shortly.


Cornelius Kipling doesn’t take No for an answer. He usually takes several of them–several No’s strung together in rapid sequence, each one louder and more adamant than the last one.

“No,” I told him over dinner at the Rhubarb Club one foggy evening. “No, no, no. I won’t bankroll your efforts to build a new warp drive.”

“But the last one almost worked,” Kip said pleadingly. “I almost had it down before the hull gave way.”

I conceded that it was a clever idea; everyone before Kip had always thought of warp drives as something you put on spaceships. Kip decided to break the mold by placing one on a hydrofoil. Which, naturally, made the boat too heavy to rise above the surface of the water. In fact, it made the boat too heavy to do anything but sink.

“Admittedly, the sinking thing is a small problem,” he said, as if reading my thoughts. “But I’m working on a way to adjust for the extra weight and get it to rise clear out of the water.”

“Good,” I said. “Because lifting the boat out of the water seems like a pretty important step on the road to getting it to travel through space at light speed.”

“Actually, it’s the only remaining technical hurdle,” said Kip. “Once it’s out of the water, everything’s already taken care of. I’ve got onboard fission reactors for power, and a tentative deal to use the International Space Station for supplies. Virgin Galactic is ready to license the technology as soon as we pull off a successful trial run. And there’s an arrangement with James Cameron’s new asteroid mining company to supply us with fuel as we boldly go where… well, you know.”

“Right,” I said, punching my spoon into my crème brûlée in frustration. The crème brûlée retaliated by splattering itself all over my face and jacket.

“See, this kind of thing wouldn’t happen to you if you invested in my company,” Kip helpfully suggested as he passed me an extra napkin. “You’d have so much money other people would feed you. People with ten or fifteen years of experience wielding dessert spoons.”


After dinner we headed downtown. Kip said there was a new bar called Zygote he wanted to show me.

“Actually, it’s not a new bar per se,” he explained as we were leaving the Rhubarb. “It’s new to me. Turns out it’s been here for several years, but you have to know someone to get in. And that someone has to be willing to sponsor you. They review your biography, look up your criminal record, make sure you’re the kind of person they want at the bar, and so on.”

“Sounds like an arranged marriage.”

“You’re not too far off. When you’re first accepted as a member, you’re supposed to give Zygote a dowry of $2,000.”

“That’s a joke, right?” I asked.

“Yes. There’s no dowry. Just the fee.”

“Two thousand dollars? Really?”

“Well, more like fifty a year. But same principle.”

We walked down the mall in silence. I could feel the insoles of my shoes wrapping themselves around my feet, and I knew they were desperately warning me to get away from Kip while I still had a limited amount of sobriety and dignity left.

“How would anyone manage to keep a place like that secret?” I asked. “Especially on the mall.”

“They hire hit men,” Kip said solemnly.

I suspected he was joking, but couldn’t swear to it. I mean, if you didn’t know Kip, you would probably have thought that the idea of putting a warp drive on a hydrofoil was also a big joke.

Kip led us into one of the alleys off Pearl Street, where he quickly located an unobtrusive metal panel set into the wall just below eye level. The panel opened inwards when we pushed it. Behind the panel, we found a faint smell of old candles and a flight of stairs. At the bottom of the stairs–which turned out to run three stories down–we came to another door. This one didn’t open when we pushed it. Instead, Kip knocked on it three times. Then twice more. Then four times.

“Secret code?” I asked.

“No. Obsessive-compulsive disorder.”

The door swung open.

“Evening, Ashraf,” Kip said to the doorman as we stepped through. Ashraf was a tiny Middle Eastern man, very well dressed. Suede pants, cashmere scarf, fedora on his head. Feather in the fedora. The works. I guess when your bar is located behind a false wall three stories below grade, you don’t really need a lot of muscle to keep the peasants out; you knock them out with panache.

“Welcome to Zygote,” Ashraf said. His bland tone made it clear that, truthfully, he wasn’t at all interested in welcoming anyone anywhere. Which made him exactly the kind of person an establishment like this would want as its doorman.

Inside, the bar was mostly empty. There were twelve or fifteen patrons scattered across various booths and animal-print couches. They all took great care not to make eye contact with us as we entered.

“I have to confess,” I whispered to Kip as we made our way to the bar. “Until about three seconds ago, I didn’t really believe you that this place existed.”

“No worries,” he said. “Until about three seconds ago, it had no idea you existed either.”

He looked around.

“Actually, I’m still not sure it knows you exist,” he added apologetically.

“I feel like I’m giving everyone the flu just by standing here,” I told him.

We took a seat at the end of the bar and motioned to the bartender, who looked to be high on a designer drug chemically related to apathy. She eventually wandered over to us–but not before stopping to inspect the countertop, a stack of coasters with pictures of archaeological sites on them, a rack of brandy snifters, and the water running from the faucet.

“Two mojitos and a strawberry daiquiri,” Kip said when she finally got close enough to yell at.

“Who’s the strawberry daiquiri for,” I asked.

“Me. They’re all for me. Why, did you want a drink too?”

I did, so I ordered the special–a pink cocktail called a Flamingo. Each Flamingo came in a tall Flamingo-shaped glass that couldn’t stand up by itself, so you had to keep holding it until you finished it. Once you were done, you could lay the glass on its side on the counter and watch it leak its remaining pink guts out onto the tile. This act was, I gathered from Kip, a kind of rite of passage at Zygote.

“This is a very fancy place,” I said to no one in particular.

“You should have seen it before the gang fights,” the bartender said before walking back to the snifter rack. I had high hopes she would eventually get around to filling our order.

“Gang fights?”

“Yes,” Kip said. “Gang fights. Used to be big old gang fights in here every other week. They trashed the place several times.”

“It’s like there’s this whole seedy underbelly to Boulder that I never knew existed.”

“Oh, this is nothing. It goes much deeper than this. You haven’t seen the seedy underbelly of this place until you’ve tried to convince a bunch of old money hippies to finance your mass-produced elevator-sized vaporizer. You haven’t squinted into the sun or tasted the shadow of death on your shoulder until you’ve taken on the Bicycle Triads of North Boulder single-file in a dark alley. And you haven’t tried to scratch the dirt off your soul–unsuccessfully, mind you–until you’ve held all-night bargaining sessions with local black hat hacker groups to negotiate the purchase of mission-critical zero-day exploits.”

“Well, that may all be true,” I said. “But I don’t think you’ve done any of those things either.”

I should have known better than to question Kip’s credibility; he spent the next fifteen minutes reminding me of the many times he’d risked his life, liberty, and (nonexistent) fortune fighting to suppress the darkest forces in Northern Colorado in the service of the greater good of mankind.

After that, he launched into his standard routine of trying to get me to buy into the latest round of his inane startup ideas. He told me, in no particular order, about his plans to import, bottle and sell the finest grade Kazakh sand as a replacement for the substandard stuff currently found on American kindergarten sandlots; to run a “reverse tourism” operation that would fly in members of distant cultures to visit disabled would-be travelers in the comfort of their own living rooms (tentative slogan: if the customer can’t come to Muhammad, Muhammad must come to the customer); and to create giant grappling hooks that could pull Australia closer to the West Coast so that Kip could speculate in airline stocks and make billions of dollars once shorter flights inevitably caused Los Angeles-Sydney routes to triple in passenger volume.

I freely confess that my recollection of the finer points of the various revenue enhancement plans Kip proposed that night is not the best. I was a little bit distracted by a woman at the far end of the bar who kept gesturing towards me the whole time Kip was talking. Actually, she wasn’t so much gesturing towards me as gently massaging her neck. But she only did it when I happened to look at her. At one point, she licked her index finger and rubbed it on her neck, giving me a pointed look.

After about forty-five minutes of this, I finally worked up the courage to interrupt Kip’s explanation of how and why the federal government could solve all of America’s economic problems overnight by convincing Balinese children to invest in discarded high school football uniforms.

“Look,” I told him, pointing down to the other side of the bar. “You see? This is why I don’t go to bars any more now that I’m married. Attractive women hit on me, and I hate to disappoint them.”

I raised my left hand and deliberately stroked my wedding band in full view.

The lady at the far end didn’t take the hint. Quite the opposite; she pushed back her bar stool and came over to us.

“Christ,” I whispered.

Kip smirked quietly.

“Hi,” said the woman. “I’m Suzanne.”

“Hi,” I said. “I’m flattered. And also married.”

“I see that. I also see that you have some food in your… neckbeard. It looks like whipped cream. At least I hope that’s what it is. I was trying to let you know from down there, so you could wipe it off without embarrassing yourself any further. But apparently you’d rather embarrass yourself.”

“It’s crème brûlée,” I mumbled.

“Weak,” said Suzanne, turning around. “Very weak.”

After she’d left, I wiped my neck on my sleeve and looked at Kip. He looked back at me with a big grin on his face.

“I don’t suppose the thought crossed your mind, at any point in the last hour, to tell me I had crème brûlée in my beard.”

“You mean your neckbeard?”

“Yes,” I sighed, making a mental note to shave more often. “That.”

“It certainly crossed my mind,” Kip said. “Actually, it crossed my mind several times. But each time it crossed, it just waved hello and kept right on going.”

“You know you’re an asshole, right?”

“Whatever you say, Captain Neckbeard.”

“Alright then,” I sighed. “Let’s get out of here. It’s past my curfew anyway. Do you remember where I left my car?”

“No need,” said Kip, putting on his jacket and clapping his hand to my shoulder. “My hydrofoil’s parked in the Spruce lot around the block. The new warp drive is in. Walk with me and I’ll give you a ride. As long as you don’t mind pushing for the first fifty yards.”

the Neurosynth viewer goes modular and open source

February 24th, 2013 by Tal Yarkoni

If you’ve visited the Neurosynth website lately, you may have noticed that it looks… the same way it’s always looked. It hasn’t really changed in the last ~20 months, despite the vague promise on the front page that in the next few months, we’re going to do X, Y, Z to improve the functionality. The lack of updates is not by design; it’s because until recently I didn’t have much time to work on Neurosynth. Now that much of my time is committed to the project, things are moving ahead pretty nicely, though the changes behind the scenes aren’t reflected in any user-end improvements yet.

The github repo is now regularly updated and even gets the occasional contribution from someone other than myself; I expect that to ramp up considerably in the coming months. You can already use the code to run your own automated meta-analyses fairly easily; e.g., with everything set up right (follow the Readme and examples in the repo), the following lines of code:

import cPickle
from neurosynth.analysis import meta  # module path assumed from the usage below

dataset = cPickle.load(open('dataset.pkl', 'rb'))
# get_ids_by_expression is available once you've followed the repo's Readme/examples
studies = get_ids_by_expression("memory* &~ (wm|working|episod*)", threshold=0.001)
ma = meta.MetaAnalysis(dataset, studies)
ma.save_results('memory')

…will perform an automated meta-analysis of all studies in the Neurosynth database that use the term ‘memory’ at a frequency of 1 in 1,000 words or greater, but don’t use the terms wm or working, or words that start with ‘episod’ (e.g., episodic). You can perform queries that nest to arbitrary depths, so it’s a pretty powerful engine for quickly generating customized meta-analyses, subject to all of the usual caveats surrounding Neurosynth (i.e., that the underlying data are very noisy, that terms aren’t mental states, etc.).
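
For example, here’s what a (completely made-up) nested query might look like, using the same operators as above–the specific terms are just for illustration:

# A hypothetical nested query: terms starting with 'emotion' or 'affect',
# excluding studies that mention pain or terms starting with 'nocicept'
studies = get_ids_by_expression("(emotion*|affect*) &~ (pain|nocicept*)", threshold=0.001)
ma = meta.MetaAnalysis(dataset, studies)
ma.save_results('emotion_not_pain')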

Anyway, with the core tools coming along, I’ve started to turn back to other elements of the project, starting with the image viewer. Yesterday I pushed the first commit of a new version of the viewer that’s currently on the Neurosynth website. In the next few weeks, this new version will be replacing the current version of the viewer, along with a bunch of other changes to the website.

A live demo of the new viewer is available here. It’s not much to look at right now, but behind the scenes, it’s actually a huge improvement on the old viewer in a number of ways:

  • The code is completely refactored and is all nice and object-oriented now. It’s also in CoffeeScript, which is an alternative and (if you’re coming from a Python or Ruby background) much more readable syntax for JavaScript. The source code is on github and contributions are very much encouraged. Like most scientists, I’m generally loath to share my code publicly because I think it sucks most of the time. But I actually feel pretty good about this code. It’s not good code by any stretch, but I think it rises to the level of ‘mostly sensible’, which is about as much as I can hope for.
  • The viewer now handles multiple layers simultaneously, with the ability to hide and show layers, reorder them by dragging, vary the transparency, assign different color palettes, etc. These features have been staples of offline viewers pretty much since the prehistoric beginnings of fMRI time, but they aren’t available in the current Neurosynth viewer or most other online viewers I’m aware of, so this is a nice addition.
  • The architecture is modular, so that it should be quite easy in future to drop in other alternative views onto the data without having to muck about with the app logic. E.g., adding a 3D WebGL-based view to complement the current 2D slice-based HTML5 canvas approach is on the near-term agenda.
  • The resolution of the viewer is now higher–up from 4 mm to 2 mm (which is the most common native resolution used in packages like SPM and FSL). The original motivation for downsampling to 4 mm in the prior viewer was to keep filesize to a minimum and speed up the initial loading of images. But at some point I realized, hey, we’re living in the 21st century; people have fast internet connections now. So now the files are all in 2 mm resolution, which has the unpleasant effect of increasing file sizes by a factor of about 8, but also has the pleasant effect of making it so that you can actually tell what the hell you’re looking at.

Most importantly, there’s now a clean, and near-complete, separation between the HTML/CSS content and the JavaScript code. Which means that you can now effectively drop the viewer into just about any HTML page with just a few lines of code. So in theory, you can have basically the same viewer you see in the demo just by sticking something like the following into your page:

 viewer = Viewer.get('#layer_list', '.layer_settings');
 viewer.addView('#view_axial', 2);
 viewer.addView('#view_coronal', 1);
 viewer.addView('#view_sagittal', 0);
 viewer.addSlider('opacity', '.slider#opacity', 'horizontal', 'false', 0, 1, 1, 0.05);
 viewer.addSlider('pos-threshold', '.slider#pos-threshold', 'horizontal', 'false', 0, 1, 0, 0.01);
 viewer.addSlider('neg-threshold', '.slider#neg-threshold', 'horizontal', 'false', 0, 1, 0, 0.01);
 viewer.addColorSelect('#color_palette');
 viewer.addDataField('voxelValue', '#data_current_value');
 viewer.addDataField('currentCoords', '#data_current_coords');
 viewer.loadImageFromJSON('data/MNI152.json', 'MNI152 2mm', 'gray');
 viewer.loadImageFromJSON('data/emotion_meta.json', 'emotion meta-analysis', 'bright lights');
 viewer.loadImageFromJSON('data/language_meta.json', 'language meta-analysis', 'hot and cold');
 viewer.paint();

Well, okay, there are some other dependencies and styling stuff you’re not seeing. But all of that stuff is included in the example folder here. And of course, you can modify any of the HTML/CSS you see in the example; the whole point is that you can now easily style the viewer however you want it, without having to worry about any of the app logic.

What’s also nice about this is that you can easily pick and choose which of the viewer’s features you want to include in your page; nothing will (or at least, should) break no matter what you do. So, for example, you could decide you only want to display a single view showing only axial slices; or to allow users to manipulate the threshold of layers but not their opacity; or to show the current position of the crosshairs but not the corresponding voxel value; and so on. All you have to do is include or exclude the various addSlider() and addData() lines you see above.

Of course, it wouldn’t be a mediocre open source project if it didn’t have some important limitations I’ve been hiding from you until near the very end of this post (hoping, of course, that you wouldn’t bother to read this far down). The biggest limitation is that the viewer expects images to be in JSON format rather than a binary format like NIFTI or Analyze. This is a temporary headache until I or someone else can find the time and motivation to adapt one of the JavaScript NIFTI readers that are already out there (e.g., Satra Ghosh’s parser for xtk), but for now, if you want to load your own images, you’re going to have to take the extra step of first converting them to JSON. Fortunately, the core Neurosynth Python package has an img_to_json() method in the imageutils module that will read in a NIFTI or Analyze volume and produce a JSON string in the expected format. Although I’m pretty sure it doesn’t handle orientation properly for some images, so don’t be surprised if your images look wonky. (And more importantly, if you fix the orientation issue, please commit your changes to the repo.)
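
In practice, the conversion step might look something like the following. This is a rough sketch–I’m assuming imageutils lives at neurosynth.base.imageutils and that img_to_json() takes a filename and returns a JSON string, so check the package source if that’s not quite right:

# Rough sketch of the NIFTI-to-JSON conversion step described above.
# The import path and exact signature are assumptions; see the Neurosynth repo.
from neurosynth.base import imageutils

json_data = imageutils.img_to_json('my_meta_analysis.nii.gz')
open('my_meta_analysis.json', 'w').write(json_data)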

In any case, as long as you’re comfortable with a bit of HTML/CSS/JavaScript hacking, the example/ folder in the github repo has everything you need to drop the viewer into your own pages. If you do use this code internally, please let me know! Partly for my own edification, but mostly because when I write my annual progress reports to the NIH, it’s nice to be able to truthfully say, “hey, look, people are actually using this neat thing we built with taxpayer money.”

several half-truths, and one blatant, unrepentant lie about my recent whereabouts

February 15th, 2013 by Tal Yarkoni

Apparently time does a thing that is much like flying. Seems like just yesterday I was sitting here in this chair, sipping on martinis, and pleasantly humming old show tunes while cranking out several high-quality blog posts an hour–er, a mediocre blog post every week or two. But then! Then I got distracted! And blinked! And fell asleep in my chair! And then when I looked up again, 8 months had passed! With no blog posts!

Granted, on the Badness Scale, which runs from 1 to Imminent Apocalypse, this one clocks in at a solid 1.04. But still, eight months is a long time to be gone–about 3,000 internet years. So I figured I’d write a short post about the events of the past eight months before setting about the business of trying (and perhaps failing) to post here more regularly. Also, to keep things interesting, I’ve thrown in one fake bullet. See if you can spot the impostor.

  • I started my own lab! You can tell it’s a completely legitimate scientific operation because it has (a) a fancy new website, (b) other members besides me (some of whom I admittedly had to coerce into ‘joining’), and (c) weekly meetings. (As far as I can tell, these are all the necessary requirements for official labhood.) I decided to call my very legitimate scientific lab the Psychoinformatics Lab. Partly because I like how it sounds, and partly because it’s vaguely descriptive of the research I do. But mostly because it results in a catchy abbreviation: PILab. (It’s pronounced Pieeeeeeeeeee lab–the last 10 e’s are silent.)
  • I’ve been slowly writing and re-writing the Neurosynth codebase. Neurosynth is a thing made out of software that lets neuroimaging researchers very crudely stitch together one giant brain image out of other smaller brain images. It’s kind of like a collage, except that unlike most collages, in this case the sum is usually not more than its parts. In fact, the sum tends to look a lot like its parts. In any case, with some hard work and a very large serving of good luck, I managed to land an R01 grant from the NIH last summer, which will allow me to continue stitching images for a few more years. From my perspective, this is a very good thing, for two reasons. First, because it means I’m not unemployed right now (I’m a big fan of employment, you see); and secondly, because I’m finding the stitching surprisingly enjoyable. If you enjoy stitching software into brain images, please help out.
  • I published a bunch of papers in 2012, so, according to my CV at least, it was a good year for me professionally. Actually, I think it was a deceptively good year–meaning, I don’t think I did any more work than I did in previous years, but various factors (old projects coming to fruition, a bunch of papers all getting accepted at the same time, etc.) conspired to produce more publications in 2012. This kind of stuff has a tendency to balance out in fairly short order though, so I fully expect to rack up a grand total of zero publications in 2013.
  • I went to Iceland! And England! And France! And Germany! And the Netherlands! And Canada! And Austin, Texas! Plus some other places. I know many people spend a lot of their time on the road and think hopping across various oceans is no big deal, but, well, it is to me, so BACK OFF. Anyway, it’s been nice to have the opportunity to travel more. And to combine business and pleasure. I am not one of those people–I think you call them ‘sane’–who prefer to keep their work life and their personal life cleanly compartmentalized, and try to cram all their work into specific parts of the year and then save a few days or weeks here and there to do nothing but roll around on the beach or ski down frighteningly tall mountains. I find I’m happiest when I get to spend one part of the day giving a talk or meeting with some people to discuss the way the edges of the brain blur when you shake your head, and then another part of the day roaming around De Jordaan asking passers-by, in a stilted Dutch, “where can I find some more of those baby cheeses?”
  • On a more personal note (as the archives of this blog will attest, I have no shame when it comes to publicly divulging embarrassing personal details), my wife and I celebrated our fifth anniversary a few weeks ago. I think this one is called the congratulations, you haven’t killed each other yet! anniversary. Next up: the ten year anniversary, also known as the but seriously, how are you both still alive? decennial. Fortunately we’re not particularly sentimental people, so we celebrated our wooden achievement with some sushi, some sake, and only 500 of our closest friends–er, an early bedtime (no, seriously–we went to bed early; that’s not a euphemism for anything).
  • I contracted a bad case of vampirism while doing some prospecting work in the Yukon last summer. The details are a little bit sketchy, but I have a vague suspicion it happened on that one occasion when I was out gold panning in the middle of the night under a full moon and was brutally attacked by a man-sized bat that bit me several times on the neck. At least, that’s my best guess. But, whatever–now that my disease is in full bloom, it’s not so bad any more. I’ve become mostly nocturnal, and I have to snack on the blood of an unsuspecting undergraduate student once every month or two to keep from wasting away. But it seems like a small price to pay in return for eternal life, superhuman strength, and really pasty skin.
  • Overall, I’m enjoying myself quite a bit. I recently read somewhere that people are, on average, happiest in their 30s. I also recently read somewhere else that people are, on average, least happy in their 30s. I resolve this apparent contradiction by simply opting to believe the first thing, because in my estimation, I am, on average, happiest in my 30s.

Ok, enough self-indulgent rambling. Looking over this list, it wasn’t even a very eventful eight months, so I really have no excuse for dropping the ball on this blogging thing. I will now attempt to resume posting one to two posts a month about brain imaging, correlograms, and schweizel units. This might be a good cue for you to hit the UNSUBSCRIBE button.

unconference in Leipzig! no bathroom breaks!

June 11th, 2012 by Tal Yarkoni

Südfriedhof von Leipzig [HDR]

Many (most?) regular readers of this blog have probably been to at least one academic conference. Some of you even have the misfortune of attending conferences regularly. And a still-smaller fraction of you scholarly deviants might conceivably even enjoy the freakish experience. You know, that whole thing where you get to roam around the streets of some fancy city for a few days seeing old friends, learning about exciting new scientific findings, and completely ignoring the manuscripts and reviews piling up on your desk in your absence. It’s a loathsome, soul-scorching experience. Unfortunately it’s part of the job description for most scientists, so we shoulder the burden without complaining too loudly to the government agencies that force us to go to these things.

This post, thankfully, isn’t about a conference. In fact, it’s about the opposite of a conference, which is… an UNCONFERENCE. An unconference is a social event type of thing that strips away all of the unpleasant features of a regular conference–you know, the fancy dinners, free drinks, and stimulating conversation–and replaces them with a much more authentic academic experience. An authentic experience in which you spend the bulk of your time situated in a 10′ x 10′ room (3 m x 3 m for non-Imperialists) with 10 – 12 other academics, and no one’s allowed to leave the room, eat anything, or take bathroom breaks until someone in the room comes up with a brilliant discovery and wins a Nobel prize. This lasts for 3 days (plus however long it takes for the Nobel to be awarded), and you pay $1200 for the privilege ($1160 if you’re a post-doc or graduate student). Believe me when I tell you that it’s a life-changing experience.

Okay, I exaggerate a bit. Most of those things aren’t true. Here’s one explanation of what an unconference actually is:

An unconference is a participant-driven meeting. The term “unconference” has been applied, or self-applied, to a wide range of gatherings that try to avoid one or more aspects of a conventional conference, such as high fees, sponsored presentations, and top-down organization. For example, in 2006, CNNMoney applied the term to diverse events including Foo Camp, BarCamp, Bloggercon, and Mashup Camp.

So basically, my description was accurate up until the part where I said there were no bathroom breaks.

Anyway, I’m going somewhere with this, I promise. Specifically, I’m going to Leipzig, Germany! In September! And you should come too!

The happy occasion is Brainhack 2012, an unconference organized by the creative minds over at the Neuro Bureau–coordinators of such fine projects as the Brain Art Competition at OHBM (2012 incarnation going on in Beijing right now!) and the admittedly less memorable CNS 2007 Surplus Brain Yard Sale (guess what–turns out selling human brains out of the back of an unmarked van violates all kinds of New York City ordinances!).

Okay, as you can probably tell, I don’t quite have this event promotion thing down yet. So in the interest of ensuring that more than 3 people actually attend this thing, I’ll just shut up now and paste the official description from the Brainhack website:

The Neuro Bureau is proud to announce the 2012 Brainhack, to be held from September 1-4 at the Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany.

Brainhack 2012 is a unique workshop with the goals of fostering interdisciplinary collaboration and open neuroscience. The structure builds from the concepts of an unconference and a hackathon: The term “unconference” refers to the fact that most of the content will be dynamically created by the participants — a hackathon is an event where participants collaborate intensively on science-related projects.

Participants from all disciplines related to neuroimaging are welcome. Ideal participants span in range from graduate students to professors across any disciplines willing to contribute (e.g., mathematics, computer science, engineering, neuroscience, psychology, psychiatry, neurology, medicine, art, etc…). The primary requirement is a desire to work in close collaborations with researchers outside of your specialization in order to address neuroscience questions that are beyond the expertise of a single discipline.

In all seriousness though, I think this will be a blast, and I’m really looking forward to it. I’m contributing the full Neurosynth dataset as one of the resources participants will have access to (more on that in a later post), and I’m excited to see what we collectively come up with. I bet it’ll be at least three times as awesome as the Surplus Brain Yard Sale–though maybe not quite as lucrative.

 

 

p.s. I’ll probably also be in Amsterdam, Paris, and Geneva in late August/early September; if you live in one of these fine places and want to show me around, drop me an email. I’ll buy you lunch! Well, except in Geneva. If you live in Geneva, I won’t buy you lunch, because I can’t afford lunch in Geneva. You’ll buy yourself a nice Swiss lunch made of clockwork and gold, and then maybe I’ll buy you a toothpick.

R, the master troll of statistical languages

June 8th, 2012 by Tal Yarkoni

Warning: what follows is a somewhat technical discussion of my love-hate relationship with the R statistical language, in which I somehow manage to waste 2,400 words talking about a single line of code. Reader discretion is advised.

I’ve been using R to do most of my statistical analysis for about 7 or 8 years now–ever since I was a newbie grad student and one of the senior grad students in my lab introduced me to it. Despite having spent hundreds (thousands?) of hours in R, I have to confess that I’ve never set aside much time to really learn it very well; what basic competence I’ve developed has been acquired almost entirely by reading the inline help and consulting the Oracle of Bacon–er, Google–when I run into problems. I’m not very good at setting aside time for reading articles or books or working my way through other people’s code (probably the best way to learn), so the net result is that I don’t know R nearly as well as I should.

That said, if I’ve learned one thing about R, it’s that R is all about flexibility: almost any task can be accomplished in a dozen different ways. I don’t mean that in the trivial sense that pretty much any substantive programming problem can be solved in any number of ways in just about any language; I mean that for even very simple and well-defined tasks involving just one or two lines of code there are often many different approaches.

To illustrate, consider the simple task of selecting a column from a data frame (data frames in R are basically just fancy tables). Suppose you have a dataset that looks like this:
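
Something like this, say (the exact values below are invented; the only detail that matters here is that the first column, flavor, holds strings while the remaining columns hold numbers):

# A made-up stand-in for the ice.cream data frame; only the column types matter
ice.cream = data.frame(
    flavor = c("chocolate", "vanilla", "strawberry", "pistachio"),
    rating = c(9, 7, 8, 6),
    calories = c(270, 250, 230, 260),
    stringsAsFactors = FALSE  # keep flavor as a plain character column
)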

In most languages, there would be one standard way of pulling columns out of this table. Just one unambiguous way: if you don’t know it, you won’t be able to work with data at all, so odds are you’re going to learn it pretty quickly. R doesn’t work that way. In R there are many ways to do almost everything, including selecting a column from a data frame (one of the most basic operations imaginable!). Here are four of them:
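
A sketch, using the hypothetical ice.cream data frame above (all four expressions pull out the same flavor column):

ice.cream[, 1]           # by column index
ice.cream[, "flavor"]    # by column name
ice.cream$flavor         # dollar-sign access to a named column
ice.cream[["flavor"]]    # double-bracket, list-style extraction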

 

I won’t bother to explain all of these; the point is that, as you can see, they all return the same result (namely, the first column of the ice.cream data frame, named ‘flavor’).

This type of flexibility enables incredibly powerful, terse code once you know R reasonably well; unfortunately, it also makes for an extremely steep learning curve. You might wonder why that would be–after all, at its core, R still lets you do things the way most other languages do them. In the above example, you don’t have to use anything other than the simple index-based approach (i.e., data[,1]), which is the way most other languages that have some kind of data table or matrix object (e.g., MATLAB, Python/NumPy, etc.) would prefer you to do it. So why should the extra flexibility present any problems?

The answer is that when you’re trying to learn a new programming language, you typically do it in large part by reading other people’s code–and nothing is more frustrating to a newbie when learning a language than trying to figure out why sometimes people select columns in a data frame by index and other times they select them by name, or why sometimes people refer to named properties with a dollar sign and other times they wrap them in a vector or double square brackets. There are good reasons to have all of these different idioms, but you wouldn’t know that if you’re new to R and your expectation, quite reasonably, is that if two expressions look very different, they should do very different things. The flexibility that experienced R users love is very confusing to a newcomer. Most other languages don’t have that problem, because there’s only one way to do everything (or at least, far fewer ways than in R).

Thankfully, I’m long past the point where R syntax is perpetually confusing. I’m now well into the phase where it’s only frequently confusing, and I even have high hopes of one day making it to the point where it barely confuses me at all. But I was reminded of the steepness of that initial learning curve the other day while helping my wife use R to do some regression analyses for her thesis. Rather than explaining what she was doing, suffice it to say that she needed to write a function that, among other things, takes a data frame as input and retains only the numeric columns for subsequent analysis. Data frames in R are actually lists under the hood, so they can have mixed types (i.e., you can have string columns and numeric columns and factors all in the same data frame; R lists basically work like hashes or dictionaries in other loosely-typed languages like Python or Ruby). So you can run into problems if you haphazardly try to perform numerical computations on non-numerical columns (e.g., good luck computing the mean of ‘cat’, ‘dog’, and ‘giraffe’), and hence, pre-emptive selection of only the valid numeric columns is required.
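
To make that concrete, here’s a quick sketch using the hypothetical ice.cream columns from above:

is.list(ice.cream)        # TRUE: under the hood, a data frame really is a list of columns
class(ice.cream$flavor)   # "character"
class(ice.cream$rating)   # "numeric" -- different types coexisting in one object
mean(ice.cream$flavor)    # NA (plus a warning): you can't take the mean of strings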

Now, in most languages (including R), you can solve this problem very easily using a loop. In fact, in many languages, you would have to use an explicit for-loop; there wouldn’t be any other way to do it. In R, you might do it like this*:

numeric_cols = rep(FALSE, ncol(ice.cream))  # pre-allocate a logical vector, one slot per column
for (i in 1:ncol(ice.cream)) numeric_cols[i] = is.numeric(ice.cream[,i])  # record whether each column is numeric

We allocate memory for the result, then loop over each column and check whether or not it’s numeric, saving the result. Once we’ve done that, we can select only the numeric columns from our data frame with ice.cream[,numeric_cols].

This is a perfectly sensible way to solve the problem, and as you can see, it’s not particularly onerous to write out. But of course, no self-respecting R user would write an explicit loop that way, because R provides you with any number of other tools to do the job more efficiently. So instead of saying “just loop over the columns and check if is.numeric() is true for each one,” when my wife asked me how to solve her problem, I cleverly said “use apply(), of course!”

apply() is an incredibly useful built-in function that implicitly loops over one or more margins of a matrix; in theory, you should be able to do the same work as the above two lines of code with just the following one line:

apply(ice.cream, 2, is.numeric)

Here the first argument is the data we’re passing in, the third argument is the function we want to apply to the data (is.numeric()), and the second argument is the margin over which we want to apply that function (1 = rows, 2 = columns, etc.). And just like that, we’ve cut the length of our code in half!

Unfortunately, when my wife tried to use apply(), her script broke. It didn’t break in any obvious way, mind you (i.e., with a crash and an error message); instead, the apply() call returned a perfectly good vector. It’s just that all of the values in that vector were FALSE. Meaning, R had decided that none of the columns in my wife’s data frame were numeric–which was most certainly incorrect. And because the code wasn’t throwing an error, and the apply() call was embedded within a longer function, it wasn’t obvious to my wife–as an R newbie and a novice programmer–what had gone wrong. From her perspective, the regression analyses she was trying to run with lm() were breaking with strange messages. So she spent a couple of hours trying to debug her code before asking me for help.

Anyway, I took a look at the help documentation, and the source of the problem turned out to be the following: apply() only operates over matrices or vectors, and not on data frames. So when you pass a data frame to apply() as the input, it’s implicitly converted to a matrix. Unfortunately, because matrices can only contain values of one data type, any data frame that has at least one string column will end up being converted to a string (or, in R’s nomenclature, character) matrix. And so now when we apply the is.numeric() function to each column of the matrix, the answer is always going to be FALSE, because all of the columns have been converted to character vectors. So apply() is actually doing exactly what it’s supposed to; it’s just that it doesn’t deign to tell you that it’s implicitly casting your data frame to a matrix before doing anything else. The upshot is that unless you carefully read the apply() documentation and have a basic understanding of data types (which, if you’ve just started dabbling in R, you may well not), you’re hosed.
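
You can see the coercion for yourself with a quick sketch (again using the hypothetical ice.cream data frame from above):

# Before looping, apply() silently does something equivalent to this:
m = as.matrix(ice.cream)
class(ice.cream$rating)    # "numeric" in the data frame...
class(m[, "rating"])       # ..."character" in the matrix, because a matrix can only hold one type
is.numeric(m[, "rating"])  # FALSE, which is why every column came back FALSE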

At this point I could have–and probably should have–thrown in the towel and just suggested to my wife that she use an explicit loop. But that would have dealt a mortal blow to my pride as an experienced-if-not-yet-guru-level R user. So of course I did what any self-respecting programmer does: I went and googled it. And the first thing I came across was the all.is.numeric() function in the Hmisc package, which has the following description:

Tests, without issuing warnings, whether all elements of a character vector are legal numeric values.

Perfect! So now the solution to my wife’s problem became this:

library(Hmisc)
apply(ice.cream, 2, all.is.numeric)

…which had the desirable property of actually working. But it still wasn’t very satisfactory, because it requires loading a pretty large library (Hmisc) with a bunch of dependencies just to do something very simple that should really be doable in the base R distribution. So I googled some more. And came across a relevant Stack Exchange answer, which had the following simple solution to my wife’s exact problem:

sapply(ice.cream, is.numeric)

You’ll notice that this is virtually identical to the apply() approach that failed. That’s no coincidence; it turns out that sapply() is a close cousin of apply() that operates on lists rather than matrices (strictly speaking, it’s a simplified wrapper around lapply()). And since data frames are actually lists, there’s no problem passing in a data frame and iterating over its columns. So just like that, we have an elegant one-line solution to the original problem that doesn’t invoke any explicit loops or third-party packages.
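
And for completeness, here’s roughly what the finished step looks like, i.e., keeping only the numeric columns of the (hypothetical) ice.cream data frame:

ice.cream[, sapply(ice.cream, is.numeric)]   # drop the non-numeric columns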

Now, having used apply() a million times, I probably should have known about sapply(). And actually, it turns out I did know about sapply–in 2009. A Spotlight search reveals that I used it in some code I wrote for my dissertation analyses. But that was 2009, back when I was smart. In 2012, I’m the kind of person who uses apply() a dozen times a day, and is vaguely aware that R has a million related built-in functions like sapply(), tapply(), lapply(), and vapply(), yet still has absolutely no idea what all of those actually do. In other words, in 2012, I’m the kind of experienced R user that you might generously call “not very good at R”, and, less generously, “dumb”.

On the plus side, the end product is undeniably cool, right? There are very few languages in which you could achieve so much functionality so compactly right out of the box. And this isn’t an isolated case; base R includes a zillion high-level functions to do similarly complex things with data in a fraction of the code you’d need to write in most other languages. Once you throw in the thousands of high-quality user-contributed packages, there’s nothing else like it in the world of statistical computing.

Anyway, this inordinately long story does have a point to it, I promise, so let me sum up:

  • If I had just ignored the desire to be efficient and clever, and had told my wife to solve the problem the way she’d solve it in most other languages–with a simple for-loop–it would have taken her a couple of minutes to figure out, and she’d probably never have run into any problems.
  • If I’d known R slightly better, I would have told my wife to use sapply(). This would have taken her 10 seconds and she’d definitely never have run into any problems.
  • BUT: because I knew enough R to be clever but not enough R to avoid being stupid, I created an entirely avoidable problem that consumed a couple of hours of my wife’s time. Of course, now she knows about both apply() and sapply(), so you could argue that in the long run, I’ve probably still saved her time. (I’d say she also learned something about her husband’s stubborn insistence on pretending he knows what he’s doing, but she’s already the world-leading expert on that topic.)

Anyway, this anecdote is basically a microcosm of my entire experience with R. I suspect many other people will relate. Basically what it boils down to is that R gives you a certain amount of rope to work with. If you don’t know what you’re doing at all, you will most likely end up accidentally hanging yourself with that rope. If, on the other hand, you’re a veritable R guru, you will most likely use that rope to tie some really fancy knots, scale tall buildings, fashion yourself a space tuxedo, and, eventually, colonize brave new statistical worlds. For everyone in between novice and guru (e.g., me), using R on a regular basis is a continual exercise in alternately thinking “this is fucking awesome” and banging your head against the wall in frustration at the sheer stupidity (either your own, or that of the people who designed this awful language). But the good news is that the longer you use R, the more of the former and the fewer of the latter experiences you have. And at the end of the day, it’s totally worth it: the language is powerful enough to make you forget all of the weird syntax, strange naming conventions, choking on large datasets, and issues with data type conversions.

Oh, except when your wife is yelling at–sorry, gently reprimanding–you for wasting several hours of her time on a problem she could have solved herself in 5 minutes if you hadn’t insisted that she do it the idiomatic R way. Then you remember exactly why R is the master troll of statistical languages.

 

 

* R users will probably notice that I use the = operator for assignment instead of the <- operator even though the latter is the officially prescribed way to do it in R (i.e., a <- 2 is favored over a = 2). That’s because these two idioms are interchangeable in all but one (rare) use case, and personally I prefer to avoid extra keystrokes whenever possible. But the fact that you can do even basic assignment in two completely different ways in R drives home the point about how pathologically flexible–and, to a new user, confusing–the language is.
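
For the curious, here’s a minimal sketch of the kind of situation where the two operators do part ways: inside a function call, = names an argument, whereas <- actually performs an assignment.

median(x = 1:10)    # 'x' here is just the name of median()'s argument; no variable x is created
median(x <- 1:10)   # assigns 1:10 to a variable x in the workspace, then takes its median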

what I’ve learned from a failed job search

May 29th, 2012 by Tal Yarkoni

For the last few months, I’ve been getting a steady stream of emails in my inbox that go something like this:

Dear Dr. Yarkoni,

We recently concluded our search for the position of Assistant Grand Poobah of Academic Sciences in the Area of Multidisciplinary Widget Theory. We received over seventy-five thousand applications, most of them from truly exceptional candidates whose expertise and experience would have been welcomed with open arms at any institution of higher learning–or, for that matter, by the governing board of a small planet. After a very careful search process (which most assuredly did not involve a round or two on the golf course every afternoon, and most certainly did not culminate in a wild insertion of an arm into a hat filled with balled-up names), we regret to inform you that we are unable to offer you this position. This should not be taken to imply that your intellectual ability or accomplishments are in any way inferior to those of the person whom we ultimately did offer the position to (or rather, persons–you see, we actually offered the job to six people before someone accepted it); what we were attempting to optimize, we hope you understand, was not the quality of the candidate we hired, but a mythical thing called ‘fit’ between yourself and ourselves. Or, to put it another way, it’s not you, it’s us.

We wish you all the best in your future endeavors, and rest assured that if we have another opening in future, we will celebrate your reapplication by once again balling your name up and tossing it into a hat along with seventy-five thousand others.

These letters are typically so warm and fuzzy that it’s hard to feel bad about them. I mean, yes, they’re basically telling me I failed at something, but then, how often does anyone ever actually tell me I’m an impressive, accomplished, human being? Never! If every failure in my life was accompanied by this kind of note, I’d be much more willing to try new things. Though, truth be told, I probably wouldn’t try very hard at anything; it would be worth failing in advance just to get this kind of affirmation.

Anyway, the reason I’ve been getting these letters, as you might surmise, is that I’ve been applying for academic jobs. I’ve been doing this for two years now, and will be doing it for a third year in a row in a few months, which I’m pretty sure qualifies me a world-recognized expert on the process. So in the interest of helping other people achieve the same prowess at failing to secure employment, I’ve decided to share some of the lessons I’ve learned here. This missive comes to you with all of the standard caveats and qualifiers–like, for example, that you should be sitting down when you read this; that you should double-check with people you actually respect to make sure any of this makes sense; and, most importantly, that you’ve completely lost your mind if you try to actually apply any of this ‘knowledge’ to your own personal situation. With that in mind, Here’s What I’ve Learned:

1. The academic job market is really, really, bad. No, seriously. I’ve heard from people at several major research universities that they received anywhere from 150 to 500 applications for individual positions (the latter for open-area positions). Some proportion of these applications come from people who have no real shot at the position, but a huge proportion are from truly exceptional candidates with many, many publications, awards, and glowing letters of recommendation. People who you would think have a bright future in research ahead of them. Except that many of them don’t actually have a bright future in research ahead of them, because these days all of that stuff apparently isn’t enough to land a tenure-track position–and often, isn’t even enough to land an interview.

Okay, to be fair, the situation isn’t quite that bad across the board. For one thing, I was quite selective about my job search this past year. I applied for 22 positions, which may sound like a lot, but there were a lot of ads last year, and I know people with similar backgrounds to mine who applied to 50 – 80 positions and could have expanded their searches still further. So, depending on what kind of position you’re aiming for–particularly if you’re interested in a teaching-heavy position at a small school–the market may actually be quite reasonable at the moment. What I’m talking about here really only applies to people looking for research-intensive positions at major research universities. And specifically, to people looking for jobs primarily in the North American market. I recognize that’s probably a minority of people graduating with PhDs in psychology, but since it’s my blog, you’ll have to live with my peculiar little biases. With that qualifier in mind, I’ll reiterate again: the market sucks right now.

2. I’m not as awesome as I thought I was. Lest you think I’ve suddenly turned humble, let me reassure you that I still think I’m pretty awesome–and I can back that up with hard evidence, because I currently have about 20 emails in my inbox from fancy-pants search committee members telling me what a wonderful, accomplished human being I am. I just don’t think I’m as awesome as I thought I was a year ago. Mind you, I’m not quite so delusional that I expected to have my choice of jobs going in, but I did think I had a decent enough record–twenty-odd publications, some neat projects, a couple of major grant proposals submitted (and one that looks very likely to get funded)–to land at least one or two interviews. I was wrong. Which means I’ve had to take my ego down a peg or two. On balance, that’s probably not a bad thing.

3. It’s hard to get hired without a conventional research program. Although I didn’t get any interviews, I did hear back informally from a couple of places (in addition to those wonderful form letters, I mean), and I’ve had hallway conversations with many people who’ve sat on search committees before. The general feedback has been that my work focuses too much on methods development and not enough on substantive questions. This doesn’t really come as a surprise; back when I was putting together my research statement and application materials, pretty much everyone I talked to strongly advised me to focus on a content area first and play down my methods work, because, they said, no one really hires people who predominantly work on methods–at least in psychology. I thought (and still think) this is excellent advice, and in fact it’s exactly the same advice I give to other people if they make the mistake of asking me for my opinion. But ultimately, I went ahead and marketed myself as a methods person anyway. My reasoning was that I wouldn’t want to show up for a new job having sold myself as a person who does A, B, and C, and then mostly did X, Y, and Z, with only a touch of A thrown in. Or, you know, to put it in more cliched terms, I want people to like me for meeeeeee.

I’m still satisfied with this strategy, even if it ends up costing me a few interviews and a job offer or two (admittedly, this is a bit presumptuous–more likely than not, I wouldn’t have gotten any interviews this time around no matter how I’d framed my application). I do the kind of work I do because I enjoy it and think it’s important; I’m pretty happy where I am, so I don’t feel compelled to–how can I put this diplomatically–fib to search committees. Which isn’t to say that I’m laboring under any illusion that you always have to be completely truthful when applying for jobs; I’m fully aware that selling yourself–er, framing your application around your strengths and telling people what they want to hear to some extent–is a natural and reasonable thing to do. So I’m not saying this out of any bitterness or naivete; I’m just explaining why I chose to go the honest route that was unlikely to land me a job as opposed to the slightly less honest route that was very slightly more likely to land me a job.

4. There’s a large element of luck involved in landing an academic job. Or, for that matter, pretty much any other kind of job. I’m not saying it’s all luck, of course; far from it. In practice, a single group of maybe three dozen people seem to end up filling the bulk of interview slots at major research universities in any given year. Which is to say, while the majority of applicants will go without any interviews at all, some people end up with a dozen or more of them. So it’s clearly very far from a random process; in the long run, better candidates are much more likely to get jobs. But for any given job, the odds of getting an interview and/or job offer depend on any number of factors that you have little or no control over: what particular area the department wants to shore up; what courses need to be taught; how your personality meshes with the people who interview you; which candidate a particular search committee member idiosyncratically happens to take a shining to, and so on. Over the last few months, I’ve found it useful to occasionally remind myself of this fact when my inbox doth overfloweth with rejection letters. Of course, there’s a very thin line between justifiably attributing your negative outcomes to bad luck and failing to take responsibility for things that are under your control, so it’s worth using the power of self-serving rationalization sparingly.

 

In any case, those vacuous observations–sorry, lessons–aside, my plan at this point is still to keep doing essentially the same thing I’ve done the last two years, which consists of (i) putting together what I hope is a strong, if somewhat unconventional, application package; (ii) applying for jobs very selectively–only to places that I think I’d be as happy or happier at than I am in my current position; and (iii) otherwise spending as little of my time as possible thinking about my future employment status, and as much of it as possible concentrating on my research and personal life.

I don’t pretend to think this is a good strategy in general; it’s just what I’ve settled on and am happy with for the moment. But ask me again a year from now and who knows, maybe I’ll be roaming around downtown Boulder fishing quarters out of the creek for lunch money. In the meantime, I hope this rather uneventful report of my rather uneventful job-seeking experience thus far is of some small use to someone else. Oh, and if you’re on a search committee and think you want to offer me a job, I’m happy to negotiate the terms of my employment in the comments below.

Big Pitch or Big Lottery? The unenviable task of evaluating the grant review system

May 26th, 2012 by Tal Yarkoni

This week’s issue of Science has an interesting article on The Big Pitch–a pilot NSF initiative to determine whether anonymizing proposals and dramatically cutting down their length (from 15 pages to 2) has a substantial impact on the results of the review process. The answer appears to be an unequivocal yes. From the article:

What happens is a lot, according to the first two rounds of the Big Pitch. NSF’s grant reviewers who evaluated short, anonymized proposals picked a largely different set of projects to fund compared with those chosen by reviewers presented with standard, full-length versions of the same proposals.

Not surprisingly, the researchers who did well under the abbreviated format are pretty pleased:

Shirley Taylor, an awardee during the evolution round of the Big Pitch, says a comparison of the reviews she got on the two versions of her proposal convinced her that anonymity had worked in her favor. An associate professor of microbiology at Virginia Commonwealth University in Richmond, Taylor had failed twice to win funding from the National Institutes of Health to study the role of an enzyme in modifying mitochondrial DNA.

Both times, she says, reviewers questioned the validity of her preliminary results because she had few publications to her credit. Some reviews of her full proposal to NSF expressed the same concern. Without a biographical sketch, Taylor says, reviewers of the anonymous proposal could “focus on the novelty of the science, and this is what allowed my proposal to be funded.”

Broadly speaking, there are two ways to interpret the divergent results of the standard and abbreviated review. The charitable interpretation is that the change in format is, in fact, beneficial, inasmuch as it eliminates prior reputation as one source of bias and forces reviewers to focus on the big picture rather than on small methodological details. Of course, as Prof-Like Substance points out in an excellent post, one could mount a pretty reasonable argument that this isn’t necessarily a good thing. After all, a scientist’s past publication record is likely to be a good predictor of their future success, so it’s not clear that proposals should be anonymous when large amounts of money are on the line (and there are other ways to counteract the bias against newbies–e.g., NIH’s approach of explicitly giving New Investigators a payline boost until they get their first R01). And similarly, some scientists might be good at coming up with big ideas that sound plausible at first blush and not so good at actually carrying out the research program required to bring those big ideas to fruition. Still, at the very least, if we’re being charitable, The Big Pitch certainly does seem like a very different kind of approach to review.

The less charitable interpretation is that the reason the ratings of the standard and abbreviated proposals showed very little correlation is that the latter approach is just fundamentally unreliable. If you suppose that it’s just not possible to reliably distinguish a very good proposal from a somewhat good one on the basis of just 2 pages, it makes perfect sense that 2-page and 15-page proposal ratings don’t correlate much–since you’re basically selecting at random in the 2-page case. Understandably, researchers who happen to fare well under the 2-page format are unlikely to see it that way; they’ll probably come up with many plausible-sounding reasons why a shorter format just makes more sense (just like most researchers who tend to do well with the 15-page format probably think it’s the only sensible way for NSF to conduct its business). We humans are all very good at finding self-serving rationalizations for things, after all.

Personally I don’t have very strong feelings about the substantive merits of short versus long-format review–though I guess I do find it hard to believe that 2-page proposals could be ranked very reliably given that some very strange things seem to happen with alarming frequency even with 12- and 15-page proposals. But it’s an empirical question, and I’d love to see relevant data. In principle, the NSF could have obtained that data by having two parallel review panels rate all of the 2-page proposals (or even 4 panels, since one would also like to know how reliable the normal review process is). That would allow the agency to directly quantify the reliability of the ratings by looking at their cross-panel consistency. Absent that kind of data, it’s very hard to know whether the results Science reports on are different because 2-page review emphasizes different (but important) things, or because a rating process based on an extended 2-page abstract just amounts to a glorified lottery.
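
To put that concretely, here’s a toy sketch (entirely made-up numbers, not NSF data) of what such a cross-panel check would quantify: if two independent panels rate the same 2-page proposals and their scores barely correlate, then the short-format ratings are mostly noise.

set.seed(1)
true_quality = rnorm(100)                    # hypothetical 'true' merit of 100 proposals
panel_A = true_quality + rnorm(100, sd = 2)  # each panel sees merit plus a lot of rater noise
panel_B = true_quality + rnorm(100, sd = 2)
cor(panel_A, panel_B)                        # a low cross-panel correlation means unreliable ratings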

Alternatively, and perhaps more pragmatically, NSF could just wait a few years to see how the projects funded under the pilot program turn out (and I’m guessing this is part of their plan). I.e., do the researchers who do well under the 2-page format end up producing science as good as (or better than) the researchers who do well under the current system? This sounds like a reasonable approach in principle, but the major problem is that we’re only talking about a total of ~25 funded proposals (across two different review panels), so it’s unclear that there will be enough data to draw any firm conclusions. Certainly many scientists (including me) are likely to feel a bit uneasy at the thought that NSF might end up making major decisions about how to allocate billions of dollars on the basis of two dozen grants.

Anyway, skepticism aside, this isn’t really meant as a criticism of NSF so much as an acknowledgment of the fact that the problem in question is a really, really difficult one. The task of continually evaluating and improving the grant review process is not one anyone should want to take on lightly. If time and money were no object, every proposed change (like dramatically shortened proposals) would be extensively tested on a large scale and directly compared to the current approach before being implemented. Unfortunately, flying thousands of scientists to Washington D.C. is a very expensive business (to say nothing of all the surrounding costs), and I imagine that testing out a substantively different kind of review process on a large scale could easily run into the tens of millions of dollars. In a sense, the funding agencies can’t really win. On the one hand, if they only ever pilot new approaches on a small scale, they never get enough empirical data to confidently back major changes in policy. On the other hand, if they pilot new approaches on a large scale and those approaches end up failing to improve on the current system (as is the fate of most innovative new ideas), the funding agencies get hammered by politicians and scientists alike for wasting taxpayer money in an already-harsh funding climate.

I don’t know what the solution is (or if there is one), but if nothing else, I do think it’s a good thing that NSF and NIH continue to actively tinker with their various processes. After all, if there’s anything most researchers can agree on, it’s that the current system is very far from perfect.