
Data Science CLAUDE.md

A CLAUDE.md template for data science projects, covering notebook conventions, data handling, and reproducibility.


Project Instructions

Notebooks

  • Keep notebooks focused on one analysis or experiment.
  • Clear all outputs before committing, so cell outputs never end up in git.
  • Move reusable logic into .py modules. Notebooks are for exploration and presentation, not library code.
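
One way to enforce the output-clearing rule is a small script that strips outputs before a commit. The sketch below is illustrative only: it assumes the standard nbformat 4 JSON layout (code cells with `outputs` and `execution_count` keys), and the helper names `strip_outputs` and `strip_file` are hypothetical. Dedicated tools such as nbstripout cover the same ground more thoroughly.

```python
import json
from pathlib import Path
from typing import Any


def strip_outputs(notebook: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of a notebook dict with code-cell outputs removed."""
    # A JSON round trip gives a deep copy, so the input is left untouched.
    cleaned = json.loads(json.dumps(notebook))
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


def strip_file(path: Path) -> None:
    """Rewrite a .ipynb file in place without outputs (hypothetical helper)."""
    notebook = json.loads(path.read_text(encoding="utf-8"))
    cleaned = strip_outputs(notebook)
    path.write_text(json.dumps(cleaned, indent=1) + "\n", encoding="utf-8")
```

Calling `strip_file(Path("analysis.ipynb"))` from a pre-commit hook would apply it to a single notebook; the filename here is only an example.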

Data

  • Never commit raw data to the repo. Document where to get it and how to set it up.
  • Use relative paths for data files. Don’t hardcode absolute paths.
  • Document data schemas and assumptions in comments or a data dictionary.
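
The relative-path rule can be sketched as below. This assumes data lives in a `data/` directory at the project root and that the root can be recognized by a marker such as `.git`. Both are assumptions about the layout, and `find_project_root` and `data_path` are hypothetical helper names, not part of the template.

```python
from pathlib import Path


def find_project_root(start: Path, marker: str = ".git") -> Path:
    """Walk upward from `start` until a directory containing `marker` is found."""
    start = start.resolve()
    for candidate in (start, *start.parents):
        if (candidate / marker).exists():
            return candidate
    raise FileNotFoundError(f"no {marker!r} found at or above {start}")


def data_path(root: Path, *parts: str) -> Path:
    """Build a path under the project's data/ directory (assumed layout)."""
    return root.joinpath("data", *parts)
```

A notebook in a nested folder could then call `data_path(find_project_root(Path.cwd()), "raw", "sales.csv")` and get the same file on any machine, with no absolute path in the code.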

Reproducibility

  • Pin all dependencies with exact versions.
  • Set random seeds for any stochastic process.
  • Document the steps to reproduce results from scratch.
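
For the random-seed rule, a minimal sketch might look like the following. It assumes the project uses only Python's `random` module and NumPy. Other frameworks have their own seeding calls that would need to be added, and `seed_everything` is a hypothetical name.

```python
import random

import numpy as np


def seed_everything(seed: int = 42) -> np.random.Generator:
    """Seed the stdlib and NumPy generators; return a dedicated NumPy Generator."""
    random.seed(seed)
    # Legacy global NumPy state, still relied on by some libraries.
    np.random.seed(seed)
    # Preferred modern API: pass this generator explicitly to code that samples.
    return np.random.default_rng(seed)
```

Calling it once at the top of a notebook or script, and passing the returned generator to functions that sample, keeps the source of randomness explicit.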

Code Quality

  • Add type hints to functions and docstrings to anything non-obvious.
  • Use pandas and numpy idiomatically. Avoid row-by-row loops when vectorized operations work.
  • Keep data transformations in named functions, not inline chains that span 20 lines.
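
The code-quality points above can be illustrated together: typed, named transformation functions built on vectorized column arithmetic rather than row loops. The column names (`price`, `quantity`, `revenue`, `is_large`) and the threshold are invented for the example.

```python
import pandas as pd


def add_revenue(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with a `revenue` column computed as price * quantity."""
    # Column arithmetic is vectorized; no per-row loop is needed.
    return df.assign(revenue=df["price"] * df["quantity"])


def flag_large_orders(df: pd.DataFrame, threshold: float) -> pd.DataFrame:
    """Return a copy with a boolean `is_large` column."""
    return df.assign(is_large=df["revenue"] >= threshold)


orders = pd.DataFrame({"price": [2.5, 10.0, 4.0], "quantity": [4, 1, 10]})

# Each step is a named function, so the pipeline reads as a list of intentions.
result = orders.pipe(add_revenue).pipe(flag_large_orders, threshold=20.0)
```

Because `assign` returns a new frame, `orders` itself is left unchanged, which makes each step easy to test in isolation.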

Git

  • Write descriptive commit messages: “add feature importance analysis” not “update notebook.”
  • Use .gitignore for data files, model artifacts, and notebook checkpoints.
