Why do we need version control?#
Because this….#
… leads to this#
Version management best practices#
Why is version management important?#
You can revert to a working version if something breaks.
It benefits team collaboration.
It improves efficiency.
How should we manage changes?#
Keeping track of changes:#
Back up (almost) everything created by a human as soon as it is created.
Keep changes small.
Share changes frequently.
Create, maintain and use a checklist for saving and sharing changes to the project.
Store each project in a folder that is mirrored off the researchers’ working machine.
This list comes from “Keeping track of changes” in swcarpentry’s paper “Good Enough Practices in Scientific Computing”.
Exercise 1: Manual versioning#
Versions can be managed either by hand or by using a Version Control System (VCS). To illustrate the workings of a VCS we start with an exercise using manual versioning. The goals of this exercise are:
Practice with versioning best practices
Understand the limitations of manual version management
1A Setting up the project#
We have set up a shared folder on the JupyterHub used for this course that is accessible to all participants of the course.
Go to the shared folder and create a folder named simple_trigonometry_YOURNAME, where you replace YOURNAME with your name. This folder is your project folder.
Add a file called CHANGELOG.txt to your project folder with timestamped changes to your project.
Create a subfolder called current, which is your latest version of your project.
1B Single user version tracking#
Whenever you make a significant change:
Copy the entire project (current folder) to a directory that is date-time-stamped.
Update CHANGELOG.txt with a timestamped note on the changes.
This will result in your project folder looking like this:
.
|-- project_name
|   |-- current
|   |   |-- ...project content as described earlier...
|   |-- 171106_130000
|   |   |-- ...content of 'current' on Nov 6, 2017 1pm
|   |-- 171108_110000
|   |   |-- ...content of 'current' on Nov 8, 2017 11am
And your CHANGELOG.txt to look something like this:
## 2016-04-08
* Switched to cubic interpolation as default.
* Moved question about family's TB history to end of questionnaire.
## 2016-04-06
* Added option for cubic interpolation.
* Removed question about staph exposure (can be inferred from blood test results).
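The two manual steps above (copy current, then update the changelog) could be automated along the following lines. This is only a sketch: the function name snapshot and its arguments are made up for illustration, the folder name follows the YYMMDD_HHMMSS pattern of the listing above, and the note is appended at the end of CHANGELOG.txt rather than inserted at the top as in the example.

```python
import shutil
import tempfile
from datetime import datetime
from pathlib import Path


def snapshot(project_dir, note, now=None):
    """Copy 'current' to a date-time-stamped folder and log the change."""
    project_dir = Path(project_dir)
    now = now or datetime.now()
    # Folder name in the YYMMDD_HHMMSS style used in the listing.
    target = project_dir / now.strftime("%y%m%d_%H%M%S")
    shutil.copytree(project_dir / "current", target)
    # Append a timestamped note to the changelog (created if missing).
    with open(project_dir / "CHANGELOG.txt", "a") as log:
        log.write(f"## {now:%Y-%m-%d %H:%M}\n* {note}\n")
    return target


# Demonstration in a throwaway directory, so no real project is touched.
demo_root = Path(tempfile.mkdtemp())
(demo_root / "current").mkdir()
(demo_root / "current" / "test.py").write_text("print('hello world')\n")
snapshot_dir = snapshot(demo_root, "Added test.py.",
                        now=datetime(2017, 11, 6, 13, 0, 0))
print(snapshot_dir.name)
```

Even with a helper like this you still have to remember to run it after every significant change, which is part of what the exercise is meant to show.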
Create a new file called
test.py
Add the text
print('hello world')
1C Practice basic version control using trigonometry#
Add your changes (copy current to a new date-time-stamped folder and update CHANGELOG.txt) every time you finish a bullet point.
Add a function to test.py to calculate the circumference of a circle. Add your changes.
Add a function to test.py to calculate the surface area of a circle. Add your changes.
Create a new file called script.py that is empty. Add your changes.
Add some print statement to script.py and execute it. Add your changes.
Import test.py into script.py and call the functions in test.py and print the output. Add your changes.
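If you get stuck, the end result of these steps might look roughly like the sketch below. The function names circumference and area are assumptions (the exercise does not prescribe names), and both files are shown in one block so it can be run as is. Docstrings are left out on purpose.

```python
import math


# --- this part would live in test.py ---
def circumference(radius):
    # Circumference of a circle: 2 * pi * r
    return 2 * math.pi * radius


def area(radius):
    # Surface area of a circle: pi * r^2
    return math.pi * radius ** 2


# --- this part would live in script.py, after a line such as
# --- `from test import circumference, area`
print(circumference(1.0))
print(area(2.0))
```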
1D Collaborating on a project, resolving conflicts#
Work with your partner#
Agree on which of the two project folders you will continue to work in. For the rest of this exercise both of you will work in this single project folder.
Creating and resolving a conflict#
Person A makes a temporary copy of test.py called test_A.py, and person B makes a temporary copy of test.py called test_B.py.
Person A and B each edit their temporary copies:
Person A adds a docstring to the function that calculates the circumference of the circle.
Person B adds a docstring to the function that calculates the surface area of the circle.
Person A and B now collaborate to merge the files test_A.py and test_B.py, so as to incorporate both their individual changes, and save the result into the original file test.py.
Think about how you would do this if each of you were making more complex changes. What about if you were both editing the same lines?
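One way to see what needs merging is to compare the two copies line by line. The sketch below uses Python's standard difflib module on made-up file contents (the function bodies and docstrings are assumptions, not the course solution). Lines prefixed with - exist only in test_A.py, lines prefixed with + only in test_B.py.

```python
import difflib

# Hypothetical contents of the two temporary copies, one string per line.
test_a = [
    "def circumference(radius):",
    '    """Return the circumference of a circle."""',
    "    return 2 * 3.14159 * radius",
    "",
    "def area(radius):",
    "    return 3.14159 * radius ** 2",
]
test_b = [
    "def circumference(radius):",
    "    return 2 * 3.14159 * radius",
    "",
    "def area(radius):",
    '    """Return the surface area of a circle."""',
    "    return 3.14159 * radius ** 2",
]

# unified_diff yields header lines followed by the differing regions.
diff = list(difflib.unified_diff(test_a, test_b,
                                 fromfile="test_A.py", tofile="test_B.py",
                                 lineterm=""))
print("\n".join(diff))
```

The diff only shows where the files differ. Deciding which lines end up in test.py is still manual work.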
More practice (optional)#
Both work in the same project folder; use script.py to test your functionality.
Add a function that plots a circle
Add save to png functionality to the plot function
Make the plotting function more fancy (add units, labels etc)
Add surface calculation for other shapes (triangle, square, pentagon, hexagon … )
Add circumference calculation for same shapes.
N.B. Once frustration sets in for enough people we will move on to Git.
End of exercise 1#
Problems with manual version control#
It requires a lot of discipline
It is virtually impossible to resolve conflicts
What is Git ?#
A VCS stores all historical versions of a file#
Git stores snapshots called commits
The differences between files in 2 commits are human readable if they are text files (e.g., .txt, .py, .tex)
Central authority#
For collaborating on code or papers it is useful to have a central authoritative version
Know what the latest version is
Know what other people are working on
Easier to maintain than emailing around versions
Git is a “distributed” VCS#
Every copy of the repository contains the complete history
You can keep working if the internet is down
You don’t lose your data (and history) if the server dies
A git “repository” is a folder which has files it keeps track of#
You choose which files to track
Looks like a normal folder but there is a hidden folder (.git) inside with the history
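As a small illustration of the hidden folder, the sketch below checks whether a folder contains a .git directory. The helper name looks_like_git_repo is made up, and the check is deliberately simplified: in some setups (for example submodules or linked worktrees) .git is a file rather than a folder, and this check would miss those.

```python
import tempfile
from pathlib import Path


def looks_like_git_repo(folder):
    """Return True if the folder contains a .git directory."""
    return (Path(folder) / ".git").is_dir()


demo = Path(tempfile.mkdtemp())
before = looks_like_git_repo(demo)   # plain folder, no history yet
(demo / ".git").mkdir()              # stand-in for what `git init` creates
after = looks_like_git_repo(demo)
print(before, after)
```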
GitHub is a web-based Git repository hosting service#
Web hosting
Open issues / bug reports
Suggest changes to projects
Free private repositories for academic users
Convenient tools
Diff viewer
Commit browser
Exercise 2: Basic single user git#
Setting up a new git repository using GitHub + clone
A basic single user workflow involving: committing, pulling and pushing your changes
2A Setting up git settings#
Only need to do this once (per machine)!
Set up your git config
git config --global user.email "you@example.com"
git config --global user.name "Your Name"
You can check your config using git config --list. Use this to check if you are now pointing to your repository on GitHub
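The output of git config --list is one key=value pair per line, so it is easy to inspect from Python as well. The sketch below parses a made-up sample of such output; the helper name parse_config_list and the sample values are assumptions. If a key occurs more than once, the last value wins here.

```python
def parse_config_list(output):
    """Turn `git config --list` style output into a dict of key -> value."""
    settings = {}
    for line in output.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines without an '=' sign
            settings[key] = value
    return settings


# Made-up sample; real output will contain more keys.
sample = """user.email=you@example.com
user.name=Your Name
remote.origin.url=git@github.com:..."""

settings = parse_config_list(sample)
print(settings["user.name"])
print(settings.get("remote.origin.url", "no remote configured"))
```

A remote.origin.url entry only shows up when the command is run inside a cloned repository.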
Add your SSH public key to your GitHub account to allow you to access your repository without entering a password every time
We have pre-generated an SSH key for you to use during this course. You will find the public part of the key in your home directory in the file id_rsa.pub (find instructions how to do that yourself here)
Click on the file id_rsa.pub in the file browser to open it
Copy the contents of this file to the clipboard with Ctrl-C
Go to settings/keys and click the New SSH key button
Paste the contents of id_rsa.pub into the text area labeled Key. You may enter anything you like into the title field (e.g. casimir course jupyterhub)
2B Setting up a project#
Create a new repository on GitHub.
Go to GitHub and log in
Click create a new repository
Name it : “Casimir-programming”
Add readme.md
Add .gitignore
Add a license (e.g., MIT) (optional)
Clone the repository into your home directory
Go to the page for your Casimir-programming repository on GitHub
Click the Clone or download button
Verify that the popup title is Clone with SSH. If it is Clone with HTTPS click on the Use SSH link in the corner of the popup
Copy the URL in the text field of the popup. It should look like git@github.com:...
Open a terminal and type git clone, then paste the URL that you copied from GitHub, then hit Enter.
This will create a copy of the entire repository in a new folder
2B My first commit#
Create a new file called
test.py
Add the text
print('hello world')
Commit and sync your changes
Type
git status
Type
git add test.py
Type
git commit -m 'my first commit'
Type
git pull
Type
git push
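The same sequence can be written down as data, which makes the order of the steps explicit. The sketch below only builds and prints the commands and does not run git. The helper name commit_and_sync_commands is made up for illustration.

```python
import shlex


def commit_and_sync_commands(files, message):
    """Return the git commands for one commit-and-sync round, in order."""
    return [
        ["git", "status"],                 # see what changed
        ["git", "add", *files],            # stage the chosen files
        ["git", "commit", "-m", message],  # record a snapshot locally
        ["git", "pull"],                   # fetch and merge remote changes
        ["git", "push"],                   # publish your commits
    ]


commands = commit_and_sync_commands(["test.py"], "my first commit")
for argv in commands:
    # shlex.join quotes arguments that contain spaces, such as the message.
    print(shlex.join(argv))

# To execute them inside a repository, one option would be:
#     import subprocess
#     for argv in commands:
#         subprocess.run(argv, check=True)
```

Pulling before pushing means that changes made by others are merged into your local copy first.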
View the commit history
Using the terminal
Type git log. See if you understand what you see.
Type git log -p -2. This shows the changes introduced by the last 2 commits
Take a look at “Viewing the Commit History” to see other useful options
Using GitHub
Click on Commits, open your latest commit.
Click browse files to browse your code at the time of the commit.
Go to Graphs/network, this shows you a line with all commits.
Very useful once we move on to multi-user workflows.
Open a file and check out history
this shows a list of all commits that changed that specific file.
Open a file and look at blame
2C Practice basic git using trigonometry#
Commit your changes every time you finish a bullet point. Use the flowchart shown below.
Add a function to test.py to calculate the circumference of a circle. Commit your changes.
Add a function to test.py to calculate the surface area of a circle. Commit your changes.
Create a new file called script.py that is empty. Commit your changes.
Add some print statement to script.py and execute it. Commit your changes.
Import test.py into script.py and call the functions in test.py and print the output. Commit your changes.
Take a look at your repository on GitHub to see an overview of your work
End of exercise 2#
Git-Flow chart#
There is a git flowchart (in pdf, pptx and png) in the day3/img folder.
**N.B. Avoid using the GitHub App, it gets you into all kinds of trouble!**
Exercise 3: Multiple users, resolving conflicts#
Working with multiple users on a single branch
Resolving conflicts
Work with your partner#
Add your partner as a collaborator#
Go to the repository of person A on GitHub.
Go to settings/collaborators. Enter the GitHub ID of person B and make them a collaborator (write access).
Person B clones the repository of person A (look at exercise 2B if you forgot how to clone)
Creating and resolving a conflict#
Both persons will add a docstring (with the triple quotes) to the function that calculates the surface of the circle
Person A commits, pulls and pushes.
Person B commits and pulls; this will raise a conflict. Resolve this conflict.
Look at the GitHub network graph to see what happened.
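When a pull raises a conflict, git writes both versions into the file, separated by marker lines starting with <<<<<<<, ======= and >>>>>>>. Resolving the conflict means editing the file so that it contains the lines you want and no markers, then adding and committing it. The sketch below shows what such a file can look like and separates the two sides. The file contents, the commit id and the helper name split_conflict are made up for illustration.

```python
def split_conflict(lines):
    """Split conflicted lines into (ours, theirs), dropping the markers."""
    ours, theirs = [], []
    side = None  # None outside a conflict, else "ours" or "theirs"
    for line in lines:
        if line.startswith("<<<<<<< "):
            side = "ours"
        elif line.startswith("=======") and side == "ours":
            side = "theirs"
        elif line.startswith(">>>>>>> "):
            side = None
        elif side == "ours":
            ours.append(line)
        elif side == "theirs":
            theirs.append(line)
        else:
            # Lines outside the conflict are common to both versions.
            ours.append(line)
            theirs.append(line)
    return ours, theirs


# Hypothetical state of test.py on person B's machine after the pull.
conflicted = [
    "def area(radius):",
    "<<<<<<< HEAD",
    '    """Surface area of a circle (person B)."""',
    "=======",
    '    """Surface area of a circle (person A)."""',
    ">>>>>>> 1a2b3c4",
    "    return 3.14159 * radius ** 2",
]

ours, theirs = split_conflict(conflicted)
print("\n".join(ours))
print("\n".join(theirs))
```

The part between <<<<<<< and ======= is your local version, and the part between ======= and >>>>>>> is the version that was pulled in.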
More practice#
Both work on the same repository, use script.py to test your functionality.
Add a function that plots a circle
Add save to png functionality to the plot function
Make the plotting function more fancy (add units, labels etc)
Add surface calculation for other shapes (triangle, square, pentagon, hexagon … )
Add circumference calculation for same shapes.
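For the last two items, one possible starting point is to treat all shapes as regular polygons, so that a single pair of functions covers triangle, square, pentagon and hexagon. That restriction and the function names below are assumptions, not part of the exercise. The area of a regular polygon with n sides of length s is n·s² / (4·tan(π/n)).

```python
import math


def polygon_circumference(n_sides, side_length):
    # Perimeter of a regular polygon: number of sides times side length.
    return n_sides * side_length


def polygon_area(n_sides, side_length):
    # Area of a regular n-gon: n * s^2 / (4 * tan(pi / n)).
    return n_sides * side_length ** 2 / (4 * math.tan(math.pi / n_sides))


shapes = [("triangle", 3), ("square", 4), ("pentagon", 5), ("hexagon", 6)]
for name, n in shapes:
    print(name, polygon_circumference(n, 1.0), polygon_area(n, 1.0))
```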