This section documents the technical standards, workflows, and review practices that TAs and mentors are expected to enforce in the Data Science Clinic.
Python Best Practices
Common Python pitfalls and language-specific patterns TAs should watch for: mutable defaults, scoping issues, and error handling.
Code Style and Structure
Guidelines for organizing code, naming conventions, documentation standards, and professional presentation.
Data Validation and Types
Type annotations and Pydantic validation patterns for ensuring data quality and code clarity.
Configuration Management
Managing settings, secrets, and environment-specific configuration with Pydantic Settings.
Debugging and Testing
Disciplined debugging practices and lightweight testing approaches for data science projects.
Code Review
Guidelines for TAs reviewing pull requests and student repositories.
AI Usage Guidelines
Guidelines for handling AI-assisted code in student projects and code reviews.
GitHub Repo Management
Branching strategies, PR workflows, and common scenarios TAs encounter with student repositories.
Docker and Make
Container setup patterns and Make targets for reproducible data science workflows.
Dependencies and Environments
Managing Python dependencies, Docker environments, and reproducible setups.
Paths and I/O in Containers
Path handling in Docker environments and separating I/O operations from computation logic.
Interview Exercise
Code review prompt used in TA interviews and training.
Examples
Small runnable setups for TAs to practice before working with students.
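As a taste of the kind of issue covered under Python Best Practices, the mutable-default pitfall can be sketched in a few lines. This is a minimal illustration, not material taken from the clinic's own examples; the function names `append_bad` and `append_good` are hypothetical.

```python
def append_bad(item, bucket=[]):
    # The default list is created once, when the function is defined,
    # so every call that omits `bucket` shares the same list object.
    bucket.append(item)
    return bucket


def append_good(item, bucket=None):
    # Use None as a sentinel and build a fresh list on each call.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first_bad = append_bad("a")
second_bad = append_bad("b")    # same list object as first_bad

first_good = append_good("a")
second_good = append_good("b")  # independent list
```

In the first version the two calls return the same list, so the second call also changes what the first caller is holding. The `None` sentinel avoids that sharing and is the pattern a reviewer would usually suggest.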