* Data partners: young scientists have a partner in another lab, with whom they swap data. The goal is to see if their data documentation is good enough that their partner can reproduce their main analysis with minimal interaction.
* Five-year plan: When a project is part-way through, students must give a brief report that details what they have done to ensure that the data and analyses will be comprehensible to members of the lab in five years' time, after they have left.
* Submission check: At first submission of an article based on the project, advisors should discuss with their advisees the pros and cons of opening their data, and how the data will be promoted online, if it will be open.
Betrayed by our habits
Other habits, however, can keep us from doing better science. Scientists value openness, at least in the abstract. Many scientists have had the frustrating experience of *closed* science: for instance, colleagues who do not share their data. Yet most science is not open, even though many tools to facilitate open science are freely available.
To us, the reason seems obvious. Open science does not bring great immediate reward, and open practices are not part of most scientists' habits. This is natural; many scientists were trained before openness was easy and expected. Our habits were formed without an expectation, for instance, that our data would be open to everyone. Analyses are messy, badly documented, and full of ad hoc solutions to problems that we decided to improve later. If you weren't expecting data to be open, then making it so requires work.
When we are faced with opening our data at paper submission or publication, then, our habits betray us. Our values may say "we should be open", but our real choice is *not* between open science and closed science; it is between "hours of work now with uncertain payoff" and "no work now, and maybe no one will ask for it." The result is not a free choice about open science. Our habits have encumbered our choice with irrelevant issues, such as "I don't feel like doing this work right now. I'll do something else more fun," and everything else is more fun.
If we had habits that were more attuned to the expectation of scientific openness, we might be able to do better. Forming such habits later in a career takes work, but forming them early in one's career is much easier. We suggest here a number of things that senior researchers who run labs can do to help their advisees build better habits. None of these things require much work, but we believe that they can help ensure the next generation of scientists has better habits than the current one.
Helping young scientists form better habits: three ideas
* They require minimal effort on the part of an advisor.
* They require little *marginal* effort from a young scientist. They may even save effort, since they will encourage good practices and help avoid mistakes.
* They encourage development of a "theory of scientific mind": How do other scientists think about data and materials? What would they expect of a data set? Will others understand what I've done?
* They help young scientists *truly* have a choice about whether to be open. By the time the choice must be made, no extra work is necessary. The decision can be driven by the arguments for or against open science, instead of mere momentary pragmatic concerns.
These are roughly ordered by when they would arise in an advisee's training. We should emphasize that none of them requires an advisor to promote it. Young scientists can do these things without their advisor's support to help build good habits.
Data partners
When collecting and analyzing their data, students should plan to share the data with their data partner, along with a short report containing an initial methods section and a primary analysis (but without numbers). The data partner will be expected to reproduce the primary analysis *without* interaction with the student. The data will need to be well documented, and the analysis sufficiently detailed, for the primary analysis to be reproducible. Details such as how the data are to be cleaned will be critical.
Once the data partner has attempted to reproduce the primary analysis, the two can discuss what was lacking. What could have been clearer? If the results could not be reproduced, why not? This will build the students' understanding of data analysis, develop their theory of scientific mind, and catch many mistakes early in a project. As a side benefit, the student has now created substantial documentation of their data set: precisely the information necessary for releasing data to others.
The five-year plan
A lab runs on data; old materials (including stimuli), data, and analyses should be archived clearly enough that someone from the lab who, years later, wants to use the materials or data, or to reproduce the analysis, can do so. This is part of being a good lab citizen.
When a project is mature, advisors should give the student time in a lab meeting to answer the question "What have you done to ensure that this project -- including the materials, data, and statistical analysis -- will be usable in five years?" This encourages students to think of the long-term usefulness of their data to others. Over the years, a formal meeting may become unnecessary as lab standards become more geared toward openness.
The submission check
If the decision is to open the data, then the next question should be, "How will you promote the data and materials from the project?" For a young scientist thinking of the next stage of their career, promotion is critical. One of the advantages of open data is that it yields another product of the research that can be promoted; open materials and open code provide others. The student should be encouraged to think about how these can be used to their advantage, and to follow through with their promotion ideas.