The Python data pipeline defined for BrodyLab.
- Request access to DataJoint: https://frevvo-prod.princeton.edu/frevvo/web/tn/pu.nplc/u/84fd5e8d-587a-4f6a-a802-0c3d2819e8fe/app/_sO14QHzSEemyQZ_M7RLPOg/formtype/_-XYdEEK2Eeqtf7JjRFmYDQ/popupform
- Install conda on your system: https://conda.io/projects/conda/en/latest/user-guide/install/index.html
- If running on Windows, install Git
- (Optional, for ERDs) Install graphviz
- Open a new terminal
- Clone this repository, either with HTTPS (`https://github.com/Brody-Lab/bl_pipeline_python.git`) or SSH (`git@github.com:Brody-Lab/bl_pipeline_python.git`)
- If you cannot clone repositories with SSH, see GitHub's documentation on setting up SSH keys (overview and Windows-specific setup)
- Create a conda environment: `conda create -n bl_pipeline_python_env python=3.10`
  - the name can be changed; just keep it consistent for your kernel below
  - this repository has been updated to use Python 3.10 (3.7 was previously in use)
- Activate the environment (do this each time you use the project): `conda activate bl_pipeline_python_env`
- Change directory to this repository: `cd bl_pipeline_python`
- Install the primary required libraries (this will take a few minutes): `pip install -e .`
- Install the jupyter and ipykernel libraries, then register the kernel:
  - `conda install -c conda-forge jupyterlab`
  - `conda install -c anaconda ipykernel`
  - `python -m ipykernel install --user --name=bl_pipeline_python_env` (allows you to run notebooks on the environment kernel)
- Run the configuration notebook at `notebooks/00-datajoint-configuration.ipynb`
  - make sure you select the correct kernel from the top-right menu
- If you have installation issues, especially on a Windows machine, see here
- Additional libraries you may want to install: `pip install seaborn pyarrow openpyxl`
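Under the hood, the configuration notebook stores your database connection settings with DataJoint's `dj.config`. A minimal sketch of that step is shown below; the host name and credentials here are placeholders, not the lab's actual values:

```python
import datajoint as dj

# Placeholder values -- use the host and credentials you received
# with your DataJoint access request.
dj.config["database.host"] = "datajoint.example.edu"
dj.config["database.user"] = "your_username"
dj.config["database.password"] = "your_password"

# Persist the settings to dj_local_conf.json in the current directory
# so future sessions pick them up automatically.
dj.config.save_local()
```

Running the notebook once per machine is enough; DataJoint reloads the saved configuration on import.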
We have created some tutorial notebooks to help you start working with DataJoint:
- Querying data (strongly recommended):
  - `jupyter notebook notebooks/tutorials/Explore_Sessions_Data.ipynb`
  - `jupyter notebook "notebooks/tutorials/1-Explore U19 data pipeline with DataJoint.ipynb"`
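The tutorial notebooks walk through DataJoint's restrict-and-fetch query pattern; a minimal sketch is below. The schema and table names here are hypothetical (the notebooks show the actual ones), and a live database connection configured as above is required:

```python
import datajoint as dj

# Hypothetical schema name -- see the tutorial notebooks for the
# schemas actually used in the Brody Lab pipeline.
bdata = dj.create_virtual_module("bdata", "bdata")

# Restrict the sessions table with a condition string, then fetch
# the matching rows as a list of dictionaries.
recent = bdata.Sessions & 'session_date > "2021-01-01"'
rows = recent.fetch(as_dict=True)
```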
- This portion of the workflow uses DataJoint's standard Python package `element-array-ephys`.
- This workflow uses the `ephys_chronic` module from `element-array-ephys`.
`bl_pipeline_python/datajoint01_pipeline/process/process.py`:
- Copies data from the source tables, through shadow tables, into the new tables
- A shadow table allows renaming of the primary key
- A shadow table has the same definition as the new table, except that its primary key matches the source table's
- For each shadow table, set the keys as secondary fields when they are not used as primary keys
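The renaming step above can be sketched in plain Python. This is a conceptual illustration, not the pipeline's actual code, and the field names are hypothetical:

```python
def to_shadow(row, key_map):
    """Copy a source-table row, renaming primary-key fields per key_map.

    Fields not listed in key_map are carried over unchanged, mirroring
    how a shadow table keeps the new table's definition while exposing
    the source table's primary key under a new name.
    """
    return {key_map.get(field, field): value for field, value in row.items()}

# Hypothetical source row and primary-key mapping
source_row = {"sessid": 101, "ratname": "B123", "protocol": "PBups"}
shadow_row = to_shadow(source_row, {"sessid": "session_id", "ratname": "subject_name"})
# shadow_row == {"session_id": 101, "subject_name": "B123", "protocol": "PBups"}
```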
Data-integrity issues can be debugged with `bl_pipeline_python/notebooks/debugging_data_integrity.ipynb`.