Webpage for the University of Chicago Data Science Clinic
Hosted on GitHub Pages — Theme by orderedlist
This document contains information on how to prepare you computer for the data science clinic. Note that if you do not have these set up properly your grade may be penalized.
Importantly there may be alternatives to the software listed below that has similar functionality. In the case of you using an alternative you will not receive support from the clinic staff/TAs/etc. We strongly recommend you use the options below.
You need to have access to a command line terminal for many of the tools that are used. If you have a Mac you can find the command line / terminal using the terminal
application.
On Windows machines you will need to install Windows Subsystem for Linux (“WSL”) and Ubuntu. To do this, follow the instructions here. Importantly windows has a terminal called PowerShell which is not the same as a unix terminal. If you aren’t sure which one you are running, the windows version’s prompt will generally looks something like C:\
.
Verification: Make sure that you can open your terminal app and type in the following without getting an error:
/bin/bash --version
Additional Windows Verification: Make sure that you can complete the above and open PowerShell in your terminal app and run:
wsl printf 'Default shell: $0\nUsername: $USER\nHome Directory: $(cd ~ && pwd)'
This should generate a return of:
Default shell: /bin/bash
Username: YOUR_WSL_USERNAME
Home Directory: /home/YOUR_WSL_USERNAME
Where YOUR_WSL_USERNAME
is the username you picked when setting up WSL. It should not be root
If one of these is incorrect, please go to troubleshooting instructions
Additionally, open File Explorer, scroll to the bottom left, select ‘Linux’, ‘Ubuntu’, ‘home’, then right click on your username and select ‘Pin to Quick Access’. Now your ubuntu home directory should appear in the top/middle left of file explorer.
Our default IDE is Visual Studio Code. For both PC and Macs you need to follow the link here.
Verification: Make sure that you can open Visual Studio Code and can open and save a file.
We expect students to have access to command line / terminal versions of git. While there are visual ways to access git (such as TortoiseGit, etc.) we expect git to be available on the command line when debugging issues.
More information on git can be found here. Note that git (probably) will not need to be installed as it is frequently installed as part of another package.
Verification: Make sure that you can open your terminal app and type in the following without getting an error:
git --version
On Mac you should install docker desktop which is relatively straightforward to install.
On Windows you will need to follow the instructions here for how to install docker on WSL.
Verification: Open up your terminal and type in the following command. If it returns without an error then Docker is installed.
docker --version
Many projects use make as a way to simplify project development.
On both Mac and WSL systems with Ubuntu make (should) be installed by default. To verify check the instructions at the end of this section.
If not present on WSL/Ubuntu systems you will need to install the build-essentials
package, which can be done by typing the following at the command prompt or using the Ubuntu installer:
sudo apt-get install build-essential
On Mac systems, make will also generally be installed, but if it is not then type
Verification: Open your terminal and type in the following command. If it returns without an error than make is installed:
make --version
We will use SSH keys to authenticate to github and (if applicable) the DSI Cluster. Please see these instructions.
Verification: Open your terminal and type the following command:
ssh -T git@github.com
If this returns your username and something to the effect of You've Successfully Authenticated
then it has worked.
If you need to access the cluster then you will need to request an account (which should have already been done for you). You will then need to set up SSH keys and verify that you can SSH into the machine. Note that the step-by-step instructions for how to do this are included in the same SSH Keys docs as in section 6.
Note that as part of these instructions you will add your SSH key to github. This is a required part of this process.
Verification: Open your terminal and type in the following command:
ssh fe.ds
If you have set this up correctly you should be connected to the AI cluster and see something like CNET@fe01:~$
. After this, verify you set up ssh keys correctly:
ssh-add -l
ssh -T git@github.com
These commands should return something like 256 SHA256:sdlfjkwljflsdfkjs;flkjs;lfj user@host (ED25519)
and Hi USERNAME! You've successfully authenticated ...
After this, to verify that you have access to the cluster, type in the following at that prompt:
srun -p general --pty /bin/bash
If the above command works then you should see something like CNET@g007:~$
as the prompt. NOTE: you may get the error srun: error: Lookup failed: Unknown host
, but you can ignore it. If you are NOT properly set up you will see srun: error: Unable to allocate resources: Invalid account or account/partition combination specified
.
Make sure to type in exit
when you are done!
You should have your GitHub repository cloned to the correct location(s).
Verification: Open your GitHub repository in VS Code.