Analysis
Programming
Generating figures
Doing statistics
fMRI
Literature search
Writing papers
Meetings
Quick References
-
Start with a clear and universal directory structure for organizing your analysis, data, figures, etc. Here is a template you can follow for a transparent directory structure.
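If you want to script the layout so every project starts out the same way, here is a minimal sketch. The folder names and the `make_project` helper are assumptions made up for illustration, not the linked template; swap in whatever your lab agrees on.

```python
from pathlib import Path

# Hypothetical layout: these names are illustrative, not the linked template.
SUBDIRS = ("data/raw", "data/processed", "code", "notebooks", "figures", "docs")


def make_project(root, subdirs=SUBDIRS):
    """Create the project skeleton under `root` and return the folders found there."""
    root = Path(root)
    for sub in subdirs:
        # parents=True builds intermediate folders; exist_ok=True makes reruns safe.
        (root / sub).mkdir(parents=True, exist_ok=True)
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_dir()
    )
```

Calling `make_project("my_study")` twice is harmless, so you can also use it to bring an older project in line with the template.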
-
Atom is a very powerful, free text editor that integrates seamlessly with GitHub. Use it for writing experimental code, scanner code, bash scripts and so on. (Hint: run scripts from Atom with the "script" package.)
-
Use jupyter notebooks for development and for analysis pipelines. Install Kyle Dunovan's jupyter themes to make your notebooks pretty and work faster.
-
Make an "autopilot" script for your analyses, so that figures (and even posters if you are feeling ambitious) are updated in real time while the data is collected. Write a cron job to execute an autopilot script that integrates newly collected data and updates analyses, perhaps with an email summarizing the results sent to you or your advisor. You can find some autopilot examples here.
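The core of an autopilot script can be sketched as below. This is an assumption about how one might do it, not one of the linked examples: it keeps a JSON manifest of files it has already seen and runs your analysis only on new ones. The function names, the manifest, the `*.csv` pattern and the crontab path are all hypothetical, and the email step is left out.

```python
import json
from pathlib import Path

# Hypothetical crontab entry that would run this every 30 minutes:
#   */30 * * * * /usr/bin/python3 /path/to/autopilot.py


def find_new_files(data_dir, manifest_path, pattern="*.csv"):
    """Return (names of files not yet in the manifest, names already seen)."""
    manifest_path = Path(manifest_path)
    seen = set(json.loads(manifest_path.read_text())) if manifest_path.exists() else set()
    current = {p.name for p in Path(data_dir).glob(pattern)}
    return sorted(current - seen), seen


def run_autopilot(data_dir, manifest_path, analyze):
    """Call `analyze(path)` on each new file, then record it in the manifest."""
    new_files, seen = find_new_files(data_dir, manifest_path)
    for name in new_files:
        analyze(Path(data_dir) / name)
    Path(manifest_path).write_text(json.dumps(sorted(seen | set(new_files))))
    return new_files
```

Here `analyze` is whatever function regenerates your figures for one data file; a second run with no new data does nothing.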
-
Make a startup file for your jupyter notebooks that preloads modules like numpy and scipy to save you time and also so that your figures are always publication quality, from the get go, without modification. The config file can specify font sizes, legends, color themes etc.
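A startup file along these lines is one way to do it. IPython runs any `.py` file placed in `~/.ipython/profile_default/startup/` when a kernel starts; the file name and the particular style values below are assumptions, so tune them to your journal or lab style.

```python
# Hypothetical file: ~/.ipython/profile_default/startup/00-lab-defaults.py
import numpy as np  # preload the modules you always use (add scipy, pandas, ...)
import matplotlib as mpl
import matplotlib.pyplot as plt

# Illustrative defaults so figures look publication-ready without tweaking.
mpl.rcParams.update({
    "font.size": 12,
    "axes.labelsize": 14,
    "axes.titlesize": 14,
    "legend.frameon": False,     # no box around legends
    "axes.spines.top": False,    # drop the top and right axis lines
    "axes.spines.right": False,
    "savefig.dpi": 300,
    "savefig.bbox": "tight",     # trim surrounding whitespace on save
    "svg.fonttype": "none",      # keep text editable in exported SVGs
})
```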
-
Start using GitHub. It is excellent for version control and for sharing (instead of having analysis_v4_p3.2_final.py you just have analysis.py). Other researchers can replicate exactly what you did. This will also save you time, for example when someone emails you asking for your code.
-
Need to sync files across your various lab computers/clusters and the laptop you use at home, and don't want to use Dropbox? Use rsync instead. For example:
rsync -zavr -e ssh --delete --include '*/' --include='*include_these_files.[ext]' --exclude='*' [local_dir] [remote_server]:[remote_dir]
-
You or your lab may be most familiar with Matlab. It is worth considering a switch to Python. Python offers simpler syntax, enables system-wide interfacing, and is free and open source; for these reasons more and more scientists are using it. Because anyone can run your code without a license, replication is far easier with Python than with Matlab.
-
Thomas Wiecki provides a great introduction to becoming a python data ninja.
-
Python Data Science Handbook by Jake VanderPlas. An excellent introduction to IPython, NumPy, pandas, Matplotlib, scikit-learn and machine learning for anyone who has rudimentary Python skills and is working towards mastering data analysis. These notebooks are absolutely free and contain the entire book!
-
Anaconda provides a scientific distribution of python that enables high performance computing and analysis.
-
Become a pro at bash shortcuts; it will seriously save you a lot of time.
-
Simulate data and make sure that your analysis works the way you think that it is working.
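A minimal sketch of the idea, assuming a simple linear model as a stand-in for your real analysis: generate data from parameters you choose, run the analysis, and check that you get those parameters back. The function name and default values are made up for illustration.

```python
import numpy as np


def simulate_and_recover(true_slope=2.0, true_intercept=-1.0,
                         n=500, noise_sd=0.5, seed=0):
    """Simulate y = intercept + slope * x + noise, then fit a line to it."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-3, 3, size=n)
    y = true_intercept + true_slope * x + rng.normal(0, noise_sd, size=n)
    # np.polyfit returns coefficients from the highest degree down.
    slope_hat, intercept_hat = np.polyfit(x, y, deg=1)
    return slope_hat, intercept_hat
```

If the recovered values are far from the ones you put in, the problem is in your pipeline, not in your data. Try it again with more noise or fewer samples to see where the analysis starts to break down.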
-
Use hotkeys for google, gmail, atom, & jupyter notebooks. Consider a mechanical keyboard so your labmates love you, then hotkey some more.
-
Not sure how to code something? It may already have an answer on Stack Overflow.
-
Access anything, anywhere on your computer with minimal effort using keyboard launchers like Albert for Linux and Alfred for Mac.
-
Data visualization has been made very easy with matplotlib and seaborn, a library built on top of it.
-
Save your figures in a vector format such as SVG or EPS. PNGs are raster images: they do not scale well and are very hard to modify afterwards.
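With matplotlib this takes one `savefig` call per format. The `save_vector` helper below is a hypothetical convenience wrapper, not part of matplotlib.

```python
from pathlib import Path

import matplotlib.pyplot as plt


def save_vector(fig, stem, formats=("svg", "eps")):
    """Save `fig` once per vector format, e.g. stem.svg and stem.eps."""
    paths = []
    for ext in formats:
        path = Path(f"{stem}.{ext}")
        fig.savefig(path, format=ext, bbox_inches="tight")
        paths.append(path)
    return paths
```

For example, `save_vector(fig, "figures/fig1")` writes `figures/fig1.svg` and `figures/fig1.eps`, assuming the `figures` folder already exists. Note that EPS does not support transparency, so stick to SVG for figures that use alpha.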
-
Learn to love Bayesian statistics, if you don't already. This is an introduction to Bayesianism vs. frequentism written by Jake VanderPlas, an astrophysicist and Python developer.
-
Beware of p-values and null hypothesis significance testing in general. Read this paper for some of the problems with p-values if you are not familiar with the controversy. Here and here Andrew Gelman sheds light on ways to proceed.
-
As soon as possible, understand bootstrapping, cross-validation and permutation tests. Here are some lecture notes that look at these topics in the context of multivariate pattern analysis in fMRI.
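To make two of these concrete, here is a minimal sketch of a percentile bootstrap confidence interval and a two-sample permutation test on the difference in means. The function names and defaults are illustrative and not taken from the lecture notes; cross-validation is not covered here.

```python
import numpy as np


def bootstrap_ci(x, stat=np.mean, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI: resample x with replacement, recompute stat."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x)
    idx = rng.integers(0, len(x), size=(n_boot, len(x)))
    boots = np.array([stat(x[i]) for i in idx])
    return np.quantile(boots, [alpha / 2, 1 - alpha / 2])


def permutation_test(a, b, n_perm=2000, seed=0):
    """Two-sided p-value for mean(a) - mean(b) under shuffled group labels."""
    rng = np.random.default_rng(seed)
    a, b = np.asarray(a), np.asarray(b)
    observed = a.mean() - b.mean()
    pooled = np.concatenate([a, b])
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        diff = perm[:len(a)].mean() - perm[len(a):].mean()
        count += abs(diff) >= abs(observed)
    # Add one to numerator and denominator so the p-value is never exactly 0.
    return (count + 1) / (n_perm + 1)
```

Both rely on resampling the data you have rather than on distributional assumptions, which is why they carry over so easily to pattern-analysis settings.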
-
Rob Kass, CMU statistics faculty, has written Ten Simple Rules for Effective Statistical Practice. They are extremely useful.
-
Are you still not using hierarchical Bayes? Everything is a trivial case of hierarchical Bayesian inference. Thomas Wiecki will show you the way.
-
Frequent DataTau for interesting news on data analysis.
-
Your stats question may already have an answer over on Cross Validated.
-
See these Ten simple rules for structuring papers, written by Konrad Kording and Brett Mensh.
-
Share your work with your friends as well as your enemies, the latter might give you even better criticism.
-
Organize your dataset using the BIDS format - this will make your data more accessible to both your collaborators and the field at large.
-
If you cannot write down the general linear model you are using from scratch and solve it in closed form, learn how.
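As a reminder of what that looks like: for the model y = Xβ + ε with a full-rank design matrix X, the ordinary least squares solution is β̂ = (XᵀX)⁻¹Xᵀy. A minimal numpy sketch follows; the function name is made up, and this is the plain OLS case with no noise whitening.

```python
import numpy as np


def ols_closed_form(X, y):
    """Solve the normal equations (X'X) beta = X'y for beta."""
    # np.linalg.solve is more stable than explicitly inverting X'X.
    return np.linalg.solve(X.T @ X, X.T @ y)
```

Here `X` has one row per observation (e.g. time point) and one column per regressor, including a column of ones if you want an intercept.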
-
PyCortex provides excellent visualization of your results, projected onto the cortical surface and rendered dynamically in your browser.
-
Before you start down some major project that you will be committed to for years, understand the current literature in your topic. Understand very clearly why you are going to do what you are going to do.
-
Find articles on arXiv before they are officially published.
-
You can search the literature with PubMed and Google Scholar.
-
You will need a citation manager early on. Paperpile is a good one that is well integrated with PubMed.
- @tdverstynen cognitive neuroscience, imaging (DSI/fMRI)
- @KordingLab neural data science, computational modeling
- @StatModeling statistics, hierarchical Bayesian modeling, R
- @Neuro_Skeptic neuroscience in general
- @NKriegeskorte fMRI, RSA, deep learning
- @jakevdp python, data analysis, astrophysics
- @fonnesbeck statistical analysis in python
-
Never show up empty handed to meetings with your PI.
-
Have a clear objective for every meeting, and make sure everyone else knows it as well.
-
Be able to show some evidence of your productivity.
-
You will have some days or weeks where nothing works. I have found that in those cases it is productive to have a "rainy day" folder containing interesting analyses/figures you have not yet shown.
- Matrix Cookbook: A useful reference for facts about matrices.