git
source code version control
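Written by Linus Torvalds in 2005 to manage Linux kernel development after the project stopped using BitKeeper. Since then it has widely replaced svn and cvs.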
todo:
- streamline the submodule section
- streamline the internals section
- coordinate the “how-to” sections
- coordinate workflow section with “projects:developer guide” section
Overview
Components of git:
- repository
- branch
- commit
- tag
- remote
- submodule
- worktree
Other terms:
- working copy - sometimes used to refer to the current worktree
- workflow - the procedures and conventions used by a development team
More:
- index - ?
- HEAD - ?
- detached head - ?
- commit-ish
- SHA-1 value
- HEAD
- refs
- branch = ref
- Three object types: blobs, trees, commits
- fourth: tags
- refs: branch, tag, remote
repository
bare
a bare repository is a folder named project.git
a non-bare repository is a subfolder named .git, co-residing in the project folder along with the working copy
git init
git clone
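A minimal sketch of the bare vs non-bare difference. The scratch folder from mktemp and the names project and project.git are placeholders, not anything from a real setup.

```shell
set -e
demo=$(mktemp -d)                 # scratch folder (hypothetical)
cd "$demo"

git init -q --bare project.git    # bare: the folder itself is the repository
git init -q project               # non-bare: the repository lives in project/.git

bare_flag=$(git -C project.git rev-parse --is-bare-repository)
work_flag=$(git -C project rev-parse --is-bare-repository)
echo "project.git bare? $bare_flag"
echo "project     bare? $work_flag"
ls -d project/.git project.git/objects
```

The bare folder holds objects, refs, HEAD and so on directly. The non-bare one hides the same things inside .git.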
remote
A remote is a shortcut label for another repository.
git remote add <name> <repository>   # add a new remote
git remote -v                        # list the remotes
git remote remove <name>             # delete a remote
This makes it easy to copy a branch from one repository to another.
git push <remote> <branch>   # copy a branch into the remote repository
git pull <remote> <branch>   # copy a branch from the remote repository
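A small sketch of the remote commands, assuming a local bare folder stands in for the other repository. The names central.git and local are made up for the demo.

```shell
set -e
demo=$(mktemp -d)                          # scratch folder (hypothetical)
cd "$demo"
git init -q --bare central.git             # stands in for the other repository
git init -q local
cd local
git symbolic-ref HEAD refs/heads/master    # make sure the branch is named master
git config user.email "demo@example.com"
git config user.name "demo"
echo hello > readme.txt
git add readme.txt
git commit -q -m 'first commit'

git remote add origin "$demo/central.git"  # the shortcut label
git remote -v
git push -q origin master                  # copy master into the remote
pushed=$(git -C "$demo/central.git" rev-parse master)
local_head=$(git rev-parse master)
echo "remote master: $pushed"

git remote remove origin                   # only the label is deleted
remotes_left=$(git remote | wc -l)
```

Removing the remote deletes the label only. The commits already pushed stay in central.git.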
branch
A repository is subdivided into branches.
You give each branch an arbitrary name.
By convention, we usually keep one master branch named master or main or trunk or central or something like that. This branch lasts forever.
Other branches come and go. Branches can be diffed and merged with each other.
Working with branches:
git branch newfeature            # create a new branch
git diff master..newfeature      # compare two branches
git checkout master              # make this branch the default, and
                                 # replace the working copy
git merge newfeature             # merge a branch into the default branch
git branch -d newfeature         # delete a branch
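The branch commands can be tried end to end in a throwaway repository. This is only a sketch; the file name and commit messages are invented.

```shell
set -e
demo=$(mktemp -d)                          # scratch folder (hypothetical)
cd "$demo"
git init -q project
cd project
git symbolic-ref HEAD refs/heads/master    # make sure the branch is named master
git config user.email "demo@example.com"
git config user.name "demo"
echo one > file.txt
git add file.txt
git commit -q -m 'base'

git branch newfeature                      # create
git checkout -q newfeature                 # switch
echo two >> file.txt
git commit -q -am 'feature work'
git checkout -q master
git diff --stat master..newfeature         # compare
git merge -q newfeature                    # merge (a fast-forward here)
git branch -d newfeature                   # delete
branches=$(git for-each-ref --format='%(refname:short)' refs/heads)
lines=$(wc -l < file.txt)
```

Deleting the branch removes only the name. The commit made on it is now part of master.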
commit
The “commit” is the unit of version control. Each commit is a version of the source code.
Each commit is given a unique name by sha1: a 20-byte hash, written as 40 hexadecimal characters.
To display the history of commits on a project:
$ git log
$ git log --pretty=oneline
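A quick way to look at a commit name, sketched in a scratch repository. It assumes the default SHA-1 object format, where the full name is 40 hex digits and any unique prefix also works.

```shell
set -e
demo=$(mktemp -d)                          # scratch folder (hypothetical)
cd "$demo"
git init -q project
cd project
git symbolic-ref HEAD refs/heads/master
git config user.email "demo@example.com"
git config user.name "demo"
echo hello > file.txt
git add file.txt
git commit -q -m 'first commit'

full=$(git rev-parse HEAD)                 # full commit name
short=$(git rev-parse --short HEAD)        # abbreviated form of the same name
echo "$full"
echo "$short"
echo "length: ${#full}"
git log --pretty=oneline
```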
tag
A tag is a human-readable label attached to a specific commit.
The list of commits can become long. Tags can be used to give it some organization.
For example, a tag might be a release number.
git tag 1.0       # add a tag
git log           # list of commits and tags
git tag           # list of tags
git show          # details of latest commit
git tag -d 1.0    # delete a tag
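A sketch showing that a tag sticks to the commit it was put on, even after more commits arrive. The repository and file names are placeholders.

```shell
set -e
demo=$(mktemp -d)                          # scratch folder (hypothetical)
cd "$demo"
git init -q project
cd project
git symbolic-ref HEAD refs/heads/master
git config user.email "demo@example.com"
git config user.name "demo"
echo one > file.txt
git add file.txt
git commit -q -m 'release candidate'
first=$(git rev-parse HEAD)

git tag 1.0                                # label the current commit
echo two >> file.txt
git commit -q -am 'work after the release'

tagged=$(git rev-parse 1.0)                # still the first commit
git log --pretty=oneline --decorate
git tag -d 1.0                             # delete the label, not the commit
tags_left=$(git tag | wc -l)
```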
worktree
A worktree is an external folder containing your source code files.
AKA “working copy” or “working tree”.
A worktree can be created and deleted using git-worktree commands.
git worktree add hotfix    # creates a subfolder named hotfix
                           # creates a branch named hotfix
                           # does a checkout of the branch into the worktree
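The three effects of git worktree add can be checked in a scratch repository. A sketch only; project and file.txt are made-up names.

```shell
set -e
demo=$(mktemp -d)                          # scratch folder (hypothetical)
cd "$demo"
git init -q project
cd project
git symbolic-ref HEAD refs/heads/master
git config user.email "demo@example.com"
git config user.name "demo"
echo hello > file.txt
git add file.txt
git commit -q -m 'first commit'

git worktree add hotfix                    # subfolder + branch + checkout
git worktree list
wt_branch=$(git -C hotfix rev-parse --abbrev-ref HEAD)
wt_count=$(git worktree list | wc -l)      # main worktree plus hotfix
main_branch=$(git rev-parse --abbrev-ref HEAD)
```

The main worktree stays on master while the new folder has hotfix checked out, so both branches can be edited side by side.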
A repository is sometimes paired with a worktree.
A bare repository has no paired worktree. You can push to it and pull from it. You cannot run status, merge, commit, diff, etc. in it, because these commands act on the paired worktree.
push vs pull
git push - pushes commits from one repo to another
git pull - pulls changes and merges them into a local worktree
if we wanted to do a pull, we could pull into webprod or webdev, then push from there to voycgit
merge vs rebase
Suppose development is ongoing in the master branch while some special feature development is going on in the feature branch.
When you eventually bring the feature back into master, do a merge, to maintain the true history.
In the meantime, to bring master changes into the feature, do a rebase.
Handle the conflicts now, in the feature branch, so the final feature merge will run without conflicts.
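A sketch of that routine with one commit on each side. The --no-ff flag is my assumption: after a rebase git would otherwise fast-forward, and no merge commit would be recorded. File names are invented.

```shell
set -e
demo=$(mktemp -d)                          # scratch folder (hypothetical)
cd "$demo"
git init -q project
cd project
git symbolic-ref HEAD refs/heads/master
git config user.email "demo@example.com"
git config user.name "demo"
echo base > a.txt
git add a.txt
git commit -q -m 'base'

git checkout -q -b feature                 # feature development starts
echo feature > f.txt
git add f.txt
git commit -q -m 'feature work'
git checkout -q master                     # meanwhile, master moves on
echo fix > m.txt
git add m.txt
git commit -q -m 'master work'

git checkout -q feature
git rebase -q master                       # in the meantime: replay feature on top of master

git checkout -q master
git merge -q --no-ff -m 'merge feature' feature   # at the end: a real merge commit
words=$(git rev-list --parents -n 1 HEAD | wc -w) # commit + 2 parents = 3
git log --oneline --graph
```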
How to use
Create a new repository
create a remote repository in the git folder
cd webapps/git/repos
git init --bare flash.git
create a local repository in the flash folder
cd webapps/flash
git init
git config user.email "john@hagstrand.com"
git config user.name "jhagstrand"
git add .
git commit -m 'Initial file load.'
in the local repository, point the name 'origin' to the remote repository
git remote add origin https://git.hagstrand.com/flash.git
git remote     # list all remotes. A remote is a "shortcut" to another repository.
in the local repository, push all files up to the remote repository
git push origin master
git push -u origin master    # same push, and also sets origin/master as the upstream of master
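What -u adds can be seen by asking git for the upstream of master after the push. A sketch with a local bare folder in place of the real server; the paths are placeholders.

```shell
set -e
demo=$(mktemp -d)                          # scratch folder (hypothetical)
cd "$demo"
git init -q --bare flash.git               # stands in for the remote repository
git init -q flash
cd flash
git symbolic-ref HEAD refs/heads/master
git config user.email "demo@example.com"
git config user.name "demo"
echo hello > index.html
git add .
git commit -q -m 'Initial file load.'

git remote add origin "$demo/flash.git"
git push -q -u origin master               # push and remember origin/master
upstream=$(git rev-parse --abbrev-ref 'master@{upstream}')
echo "master tracks: $upstream"
```

With the upstream recorded, a later plain git push or git pull on master knows which remote branch to use.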
Workflow: develop, request pull, deploy
in the local repository, create a branch to work on
git branch               # list all branches, asterisk by current one
git branch mybranch      # create new branch
git checkout mybranch    # switch to the new branch
in the local repository, remove a file
git rm normalize.css
in the local repository, commit changes on mybranch
git add .     # adds any changed or new files to staging
git commit -m 'new file changes'
in the local repository, merge mybranch into master
git checkout master
git merge mybranch
git branch -d mybranch
git status
git log
in the local repository, bring local master up to date
git pull --rebase origin master
in the local repository, push completed work in master up to central repository
git push origin master
note: changes in the working directory are
List all files in the master branch
git ls-tree -r master --name-only
Diff
git diff # before the add
git add .
git diff HEAD # after the add
git diff -U0 # do not display context
git diff -w # ignore whitespace
git diff -w --word-diff-regex='[^[:space:]]' # ignore additional whitespace changes
In .gitconfig, add
[core]
whitespace = -trailing-space,-indent-with-non-tab,-tab-in-indent
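A sketch of what -w does, using a file where only the spacing changed. It counts changed lines rather than relying on the exact diff output; the file content is invented.

```shell
set -e
demo=$(mktemp -d)                          # scratch folder (hypothetical)
cd "$demo"
git init -q project
cd project
git symbolic-ref HEAD refs/heads/master
git config user.email "demo@example.com"
git config user.name "demo"
printf 'a b\n' > file.txt
git add file.txt
git commit -q -m 'first commit'

printf 'a    b\n' > file.txt               # same words, different spacing

# count the +/- content lines, skipping the ---/+++ header lines
plain=$(git diff | grep -c '^[-+][^-+]' || true)
nows=$(git diff -w | grep -c '^[-+][^-+]' || true)
echo "changed lines without -w: $plain"
echo "changed lines with -w:    $nows"
```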
superproject and submodule
Add submodule to super-project
cd webapps/peg
git submodule add https://github.com/necolas/normalize.css
git submodule add https://git.hagstrand.com/repos/accounts.git
git commit -am 'added submodules'
git submodule update --init                  # init + update info in .git/config. (You may also edit manually.)
git submodule foreach git pull origin master # pull latest files for all submodules
submodules
git submodule update
This gets the url from .git/config [jslib] section. And it gets the commit When we do a git
all-in-one
git clone --recurse-submodules https://git.voyc.com/barecentral/layout
Does the clone and then does a
git submodule update --init --recursive
Move a Superproject to a New GitServer
For example, we recently moved from github to gitlab.
A superproject contains submodules.
Each submodule is identified by name, path, url, and commit-ish.
You specify the name, path and url in the .gitmodules file.
Git retains the commit-ish in the database, as you can see with git ls-tree master.
Step 1. manually edit the .gitmodules file.
Here we specify the new server url.
Step 2. git submodule init
This copies any submodule entries that are new in .gitmodules into the .git/config file. A url already present in .git/config is not overwritten here.
Step 3. git submodule sync
This copies the new url from .gitmodules into .git/config and changes the remotes in the submodules.
Step 4. git submodule update
This pulls a version of submodule jslib into the jslib folder,
using the url found in .git/config and the jslib commit-ish found in the layout/.git database.
This version may not be the latest, and therefore we continue with step 5.
Step 5. git submodule foreach git pull origin master
This pulls the latest commit from the jslib/.git.
Step 6. Add, commit and push the layout superproject.
$ git status
modified: .gitmodules
modified: jslib (new commits)
git add .
git commit -m 'Pull latest submodules'
git push origin master
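Steps 1 and 3 can be tried locally. This is a sketch under loud assumptions: two local bare folders play the old and new server, git config -f replaces the manual edit of .gitmodules, and protocol.file.allow=always is set only because recent git versions refuse local-path submodules otherwise.

```shell
set -e
demo=$(mktemp -d)                          # scratch folder (hypothetical)
cd "$demo"
allow="-c protocol.file.allow=always"      # only needed because the demo uses local paths

git init -q jslib-src                      # source of the submodule
cd jslib-src
git symbolic-ref HEAD refs/heads/master
git config user.email "demo@example.com"
git config user.name "demo"
echo 'var a = 1;' > lib.js
git add lib.js
git commit -q -m 'jslib: first commit'
cd ..
git clone -q --bare jslib-src old-server-jslib.git   # pretend old server
git clone -q --bare jslib-src new-server-jslib.git   # pretend new server

git init -q layout                         # the superproject
cd layout
git symbolic-ref HEAD refs/heads/master
git config user.email "demo@example.com"
git config user.name "demo"
git $allow submodule --quiet add "$demo/old-server-jslib.git" jslib
git commit -q -m 'add jslib submodule'

# Step 1: point .gitmodules at the new server
git config -f .gitmodules submodule.jslib.url "$demo/new-server-jslib.git"
# Step 3: push that url into .git/config and into the submodule's remote
git submodule sync
url_config=$(git config submodule.jslib.url)
url_remote=$(git -C jslib config remote.origin.url)
echo ".git/config url:  $url_config"
echo "submodule origin: $url_remote"
```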
overview
A superproject is one that has submodules. The submodules are listed in .gitmodules file of the superproject. They are also listed in .git/config file, but not right away.
.gitmodules comes down with the clone
.git/config gets updated when you do the git submodule init
create sub
create super
add sub to super - makes a clone of the sub
we now have three gits: two of the sub and one of the super. if they have been pushed, then we have six gits. Ultimately I want to keep them all in sync, though during development they will drift out of sync.
To solve the “detached head” situation:
git checkout master
“commit recorded in the superproject” - what? where?
A superproject has submodules.
A submodule is a full-fledged git project and can be treated as such. It has been cloned from a remote project. You can modify, commit, push, pull just like any other git project.
In addition, a pointer to the submodule is contained in the superproject.
In three places:
- the .gitmodules file
- the .git/config file
- the working tree
When you make changes in the submodule, you then:
- commit in the submodule
- commit in the superproject
- push the submodule
- push the superproject
- optional: pull into any other clones of the submodule project
Likewise, if you pull fresh changes into the submodule, you also want to do a commit in the superproject.
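A sketch of the first two steps (commit in the submodule, then commit in the superproject), showing the commit recorded in the superproject before and after. Names follow the layout/jslib example, the repositories are local scratch folders, and protocol.file.allow=always is only there because recent git versions refuse local-path submodules otherwise.

```shell
set -e
demo=$(mktemp -d)                          # scratch folder (hypothetical)
cd "$demo"
allow="-c protocol.file.allow=always"      # only needed because the demo uses local paths

git init -q jslib-src                      # the submodule's own project
cd jslib-src
git symbolic-ref HEAD refs/heads/master
git config user.email "demo@example.com"
git config user.name "demo"
echo 'var a = 1;' > lib.js
git add lib.js
git commit -q -m 'jslib: first commit'
cd ..

git init -q layout                         # the superproject
cd layout
git symbolic-ref HEAD refs/heads/master
git config user.email "demo@example.com"
git config user.name "demo"
git $allow submodule --quiet add "$demo/jslib-src" jslib
git commit -q -m 'add jslib submodule'
before=$(git ls-tree master jslib | awk '{print $3}')

# commit in the submodule
git -C jslib config user.email "demo@example.com"
git -C jslib config user.name "demo"
echo 'var b = 2;' >> jslib/lib.js
git -C jslib commit -q -am 'jslib: second commit'

# commit in the superproject
git status --short                         # jslib shows as modified
git add jslib
git commit -q -m 'record new jslib commit'
after=$(git ls-tree master jslib | awk '{print $3}')
sub_head=$(git -C jslib rev-parse HEAD)
echo "recorded before: $before"
echo "recorded after:  $after"
```

The two pushes would follow in the same order: submodule first, then superproject, so the recorded commit exists on the server before anything points at it.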
Submodules can be nested in a hierarchy.
Some submodule commands offer a --recursive option
so that all sub-submodules in the hierarchy are processed in the same way.
clone a superproject
$ git clone <projectname>
Create a clone of the project. Checkout all files of the current branch. If the project is a superproject,
- the .gitmodules file is present.
- an empty subdirectory for each submodule is present.
$ git submodule init
Adds the submodule info to the .git/config file.
$ git submodule update
what does the submodule update command do? Three things:
- clone missing submodules
- fetch missing commits in submodules
- update the working tree of the submodules
How is this “update” done? it depends:
- checkout (default)
- rebase
- merge
- custom
- none
The default is checkout, as in “the commit recorded in the superproject will be checked out in the submodule on a detached head.”
$ git submodule update
Cloning into '/home/john/webapps/test/model/html/icon'...
Cloning into '/home/john/webapps/test/model/html/jslib'...
Cloning into '/home/john/webapps/test/model/html/minimal'...
Cloning into '/home/john/webapps/test/model/php/account'...
Submodule path 'html/icon': checked out '850cd6e9b06629ec0a5bff5b1ff66f725747a9fa'
Submodule path 'html/jslib': checked out '84fa035aef9537eae68b9652a74f3e09cbcbd398'
Submodule path 'html/minimal': checked out 'ed76e62d530f32ffabcbd5a615e4fb81e9d1cfc2'
Submodule path 'php/account': checked out 'ef11ada975ff25f3c2bd5ab566c0149d0c351168'
It's almost a clone. One of the differences is that .git is not a directory. It is a text file that contains a pointer to the real .git directory.
$ cat .git
gitdir: ../../.git/modules/html/jslib
$ ls ../../.git/modules/html/jslib
branches config description HEAD hooks index info logs objects packed-refs refs
Also, the HEAD file contains the sha1 identifier of the commit that was checked out.
$ cat .git/modules/html/jslib/HEAD
84fa035aef9537eae68b9652a74f3e09cbcbd398
This is by definition a detached head.
After doing the git checkout master:
$ cat .git/modules/html/jslib/HEAD
ref: refs/heads/master
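The clone, init, update sequence can be reproduced locally to look at the empty folder, the .git text file and the detached head. A sketch with scratch repositories named after the layout/jslib example; protocol.file.allow=always is only there because recent git versions refuse local-path submodules otherwise.

```shell
set -e
demo=$(mktemp -d)                          # scratch folder (hypothetical)
cd "$demo"
allow="-c protocol.file.allow=always"      # only needed because the demo uses local paths

git init -q jslib-src                      # the submodule's own project
cd jslib-src
git symbolic-ref HEAD refs/heads/master
git config user.email "demo@example.com"
git config user.name "demo"
echo 'var a = 1;' > lib.js
git add lib.js
git commit -q -m 'jslib: first commit'
cd ..

git init -q layout                         # the superproject
cd layout
git symbolic-ref HEAD refs/heads/master
git config user.email "demo@example.com"
git config user.name "demo"
git $allow submodule --quiet add "$demo/jslib-src" jslib
git commit -q -m 'add jslib submodule'
cd ..

git clone -q "$demo/layout" "$demo/copy"   # plain clone of the superproject
cd copy
files_before=$(ls -A jslib | wc -l)        # submodule folder exists but is empty
git $allow submodule --quiet update --init # init + update in one go
gitfile=$(cat jslib/.git)                  # a text file, not a directory
echo "$gitfile"
if git -C jslib symbolic-ref -q HEAD >/dev/null; then
  state=attached
else
  state=detached
fi
echo "jslib head is $state"
```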
Detached Head
Who wants a detached head? Not me. The solution to this state is:
git checkout master
The term “detached head” means the project is not pointed to a branch, but to a specific commit.
The git submodule update command leaves each submodule project in a detached head state.
$ git status
HEAD detached at ed76e62
nothing to commit, working tree clean
The solution is to explicitly checkout the master branch.
$ git checkout master
Previous HEAD position was ed76e62 removed extraneous semicolon from css
Switched to branch 'master'
Your branch is up to date with 'origin/master'.
$ git status
On branch master
Your branch is up to date with 'origin/master'.
nothing to commit, working tree clean
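The same state can be produced without submodules by checking out a commit instead of a branch. A sketch in a scratch repository; it reads .git/HEAD before and after the fix.

```shell
set -e
demo=$(mktemp -d)                          # scratch folder (hypothetical)
cd "$demo"
git init -q project
cd project
git symbolic-ref HEAD refs/heads/master
git config user.email "demo@example.com"
git config user.name "demo"
echo hello > file.txt
git add file.txt
git commit -q -m 'first commit'
sha=$(git rev-parse HEAD)

git checkout -q "$sha"                     # check out a commit, not a branch
head_detached=$(cat .git/HEAD)             # HEAD now holds the bare sha1
if git symbolic-ref -q HEAD >/dev/null; then state=attached; else state=detached; fi
echo "HEAD: $head_detached ($state)"

git checkout -q master                     # the fix: point HEAD at a branch again
head_attached=$(cat .git/HEAD)
echo "HEAD: $head_attached"
```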
Clone a super-project
cd webapps
git clone https://git.hagstrand.com/peg.git
cd peg
git submodule update --init --recursive
git submodule foreach git pull origin master # pull latest files for all submodules
Pull latest submodule files into super-project
git submodule foreach git pull origin master # pull latest files for all submodules
git add .
git commit -m 'pull latest' # this changes the commit hash identifier in .git/FETCH_HEAD
git push origin master
Begin
cd webapps/drzinn
git clone https://github.com/voyc/model # start with model
mv model drzinn # rename to drzinn
cd drzinn
git submodule update --init # copies entries from .gitmodules to .git/config, and pulls each
Combinations
Some git command options effectively execute multiple git commands simultaneously. Here are some examples.
1. git add .
2. git commit -m 'latest fix'
1+2. git commit -a -m 'latest fix' # combine add and commit (tracked files only; new files still need git add)

1. git clone <project>
2. git submodule init
3. git submodule update
2+3. git submodule update --init
1+2+3. git clone --recurse-submodules <project>

1. git fetch # download new commits from the remote
2. git merge # merge them into the current branch
1+2. git pull # fetch and merge
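The fetch + merge pair can be watched step by step with two clones of one central repository. A sketch only; central.git, alice and bob are invented names.

```shell
set -e
demo=$(mktemp -d)                          # scratch folder (hypothetical)
cd "$demo"
git init -q --bare central.git
git -C central.git symbolic-ref HEAD refs/heads/master

git init -q alice                          # first developer
cd alice
git symbolic-ref HEAD refs/heads/master
git config user.email "demo@example.com"
git config user.name "demo"
echo one > file.txt
git add file.txt
git commit -q -m 'first'
git remote add origin "$demo/central.git"
git push -q origin master
cd ..

git clone -q "$demo/central.git" bob       # second developer

cd alice                                   # alice publishes more work
echo two >> file.txt
git commit -q -am 'second'
git push -q origin master

cd ../bob
git fetch -q origin                        # 1. download, local files untouched
behind=$(git rev-list --count HEAD..origin/master)
git merge -q origin/master                 # 2. bring the branch and files up to date
bob_head=$(git rev-parse HEAD)
alice_head=$(git -C "$demo/alice" rev-parse HEAD)
echo "bob was $behind commit(s) behind after the fetch"
```

A single git pull origin master in bob would have done both steps.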
Fork
In github ui, fork repo from voyc to hagstrand. A fork is a server-side clone.
Locally:
git clone https://github.com/hagstrand/flash flashbang # specified target <directory>
git remote add upstream https://github.com/voyc/flash # second remote
git pull --rebase upstream master # pull changes from upstream, update local, no commit
# pull does a fetch and then, with --rebase, a rebase instead of a merge. No need to commit.
git push origin master
Todo: Rename?
Todo: In github UI, “Pull Request” to bring forked project back into upstream.
See the current changes
git diff # before staging (git add *)
git diff --staged # after staging
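A sketch that puts one staged and one unstaged change in the same file, to show which command sees which. It counts added lines instead of printing the whole diff; the file content is invented.

```shell
set -e
demo=$(mktemp -d)                          # scratch folder (hypothetical)
cd "$demo"
git init -q project
cd project
git symbolic-ref HEAD refs/heads/master
git config user.email "demo@example.com"
git config user.name "demo"
echo base > file.txt
git add file.txt
git commit -q -m 'first commit'

echo staged >> file.txt
git add file.txt                           # this change is now in the index
echo unstaged >> file.txt                  # this one is only in the worktree

# count added content lines, skipping the +++ header line
in_worktree=$(git diff | grep -c '^+[^+]' || true)
in_index=$(git diff --staged | grep -c '^+[^+]' || true)
since_commit=$(git diff HEAD | grep -c '^+[^+]' || true)
echo "git diff:          $in_worktree added line(s)"
echo "git diff --staged: $in_index added line(s)"
echo "git diff HEAD:     $since_commit added line(s)"
```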
Alternative Workflows
Centralized Workflow
- One central repository. One branch named “master”.
- Each developer has his own copy of the central repository, so he can branch, work, and commit without connecting to the central repository. This is helpful for worldwide distributed teams who may not have good internet connectivity.
Feature Branch Workflow
- Same as above, but with a feature branch in the central repository.
- Push changes to the feature branch instead of master. Now your local work is backed up in the central repository.
- After push to feature branch, do a Pull-Request. This is like a code review.
When approved,
GitFlow Workflow
Same as Feature Branch Workflow, plus, it assigns very specific roles to different branches and defines how and when they should interact. In addition to feature branches, it uses individual branches for preparing, maintaining, and recording releases. Of course, you also get to leverage all the benefits of the Feature Branch Workflow: pull requests, isolated experiments, and more efficient collaboration.
Forking Workflow
Fundamentally different than the other workflows discussed in this tutorial. Instead of using a single server-side repository to act as the “central” codebase, it gives every developer a server-side repository. This means that each contributor has not one, but two Git repositories: a private local one and a public server-side one.
Protocol
Three choices: ssh, https, git.
$ git clone ssh://voyccom@az1-ss8.a2hosting.com:7822/home/voyccom/voycgit/jslib
$ git clone https://gitlab.com/voyc/jslib
The git protocol is a daemon that ships with git. It is similar to ssh but has no authentication and no encryption.
Web Access
HTML Interface
product | open source | comment |
---|---|---|
gitweb | yes | included in git |
github | no | bought by Microsoft |
gitlab | yes | |
Web Hosting Service
product | max GB | num users |
---|---|---|
github | 15 | 31M |
gitlab | 5 | 16M |
.git Folder
total 68
drwxrwxr-x  9 john john 4096 Aug 29 19:04 .
drwxrwxr-x  4 john john 4096 Aug 29 19:09 ..
drwxrwxr-x  2 john john 4096 Aug 28 21:54 branches
-rw-rw-r--  1 john john   19 Aug 29 12:51 COMMIT_EDITMSG
-rw-rw-r--  1 john john  296 Aug 29 12:51 config
-rw-rw-r--  1 john john   73 Aug 28 21:54 description
-rw-rw-r--  1 john john  121 Aug 29 11:38 FETCH_HEAD
-rw-rw-r--  1 john john   24 Aug 29 19:04 HEAD
drwxrwxr-x  2 john john 4096 Aug 28 21:54 hooks
-rw-rw-r--  1 john john 2818 Aug 29 19:04 index
drwxrwxr-x  2 john john 4096 Aug 28 21:54 info
drwxrwxr-x  3 john john 4096 Aug 28 21:54 logs
drwxrwxr-x 15 john john 4096 Aug 29 12:51 objects
-rw-rw-r--  1 john john   41 Aug 29 11:38 ORIG_HEAD
-rw-rw-r--  1 john john  114 Aug 29 12:51 packed-refs
drwxrwxr-x  5 john john 4096 Aug 28 21:54 refs
drwxrwxr-x  3 john john 4096 Aug 29 14:38 worktrees
.git
- branches
- refs
  - heads
  - remotes
  - tags
- worktrees
files
-rw-rw-r-- 1 john john   19 Aug 29 12:51 COMMIT_EDITMSG
-rw-rw-r-- 1 john john  296 Aug 29 12:51 config
-rw-rw-r-- 1 john john   73 Aug 28 21:54 description
-rw-rw-r-- 1 john john  121 Aug 29 11:38 FETCH_HEAD
-rw-rw-r-- 1 john john   24 Aug 29 19:04 HEAD
-rw-rw-r-- 1 john john 2818 Aug 29 19:04 index
-rw-rw-r-- 1 john john   41 Aug 29 11:38 ORIG_HEAD
-rw-rw-r-- 1 john john  114 Aug 29 12:51 packed-refs
external files
.gitignore
.gitmodules
git internals
tree = directory
blob = file
tree-ish - a string that refers to a tree
blob-ish - a string that refers to a blob
commit-ish - a string that refers to a commit
For example, a commit is identified by an sha1 string, like:
ef182a11517482f92ba1457a11d8361cc5dbacb5
But it can also be identified by other refs and revs and I don't know what all.
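The object types can be inspected with git cat-file -t, which prints the type behind any name git can resolve. A sketch in a scratch repository; the tag name v1 and the file name are placeholders, and the tag is annotated so that it gets its own object.

```shell
set -e
demo=$(mktemp -d)                          # scratch folder (hypothetical)
cd "$demo"
git init -q project
cd project
git symbolic-ref HEAD refs/heads/master
git config user.email "demo@example.com"
git config user.name "demo"
echo hello > file.txt
git add file.txt
git commit -q -m 'first commit'
git tag -a -m 'release' v1                 # annotated tag

t_blob=$(git cat-file -t HEAD:file.txt)    # a file
t_tree=$(git cat-file -t 'HEAD^{tree}')    # the top directory of the commit
t_commit=$(git cat-file -t HEAD)           # the commit itself
t_tag=$(git cat-file -t v1)                # the tag object
echo "$t_blob $t_tree $t_commit $t_tag"
git cat-file -p 'HEAD^{tree}'              # a tree lists blobs and other trees
```

HEAD, HEAD:file.txt, HEAD^{tree} and v1 are all examples of naming an object by something other than its sha1 string.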
Example
My git named layout has one submodule named jslib.
$ git clone https://gitlab.com/voyc/layout
$ cd layout
$ ls -al
total 56
drwxrwxr-x  5 john john 4096 Aug 31 13:13 .
drwxrwxr-x 19 john john 4096 Aug 31 13:13 ..
-rw-rw-r--  1 john john 5430 Aug 31 13:13 favicon.ico
drwxrwxr-x  8 john john 4096 Aug 31 13:13 .git
-rw-rw-r--  1 john john   11 Aug 31 13:13 .gitignore
-rw-rw-r--  1 john john   71 Aug 31 13:13 .gitmodules
drwxrwxr-x  2 john john 4096 Aug 31 13:13 i
-rw-rw-r--  1 john john  920 Aug 31 13:13 index.html
drwxrwxr-x  2 john john 4096 Aug 31 13:13 jslib
-rw-rw-r--  1 john john  798 Aug 31 13:13 layout.css
-rw-rw-r--  1 john john 5104 Aug 31 13:13 layout.js
-rw-rw-r--  1 john john  167 Aug 31 13:13 README.md
$ git ls-tree master
100644 blob 49148a942cccfa1a9aca4de2b0001c600ccf9e7b .gitignore
100644 blob 84a45f6bcbd708b61267f1fb14e8f9b743497a3b .gitmodules
100644 blob 07b12b1bd93eed621f7aec713d4af55b75954bf1 README.md
100644 blob f131618830be9987dc2c1b2993464964e146f9b1 favicon.ico
040000 tree 0853532278f591a3e6bd15bc5fd7672032be896e i
100644 blob ce2db1424b3be75df3e1eeefc6126a1c5fdeabd2 index.html
160000 commit ef182a11517482f92ba1457a11d8361cc5dbacb5 jslib
100644 blob dda9c6be32d0857673c9359ecbc47119271072b7 layout.css
100644 blob 2eab57b5802ff2c99c261705af1ecadfcd264a61 layout.js
$ git log --oneline
d556386278b2abaa95e051e7c8cde89b1ea98e8d (HEAD -> master, origin/master, origin/HEAD) Fix dragger
8753a13de258b4663de6386f3aacb178aea74f8e Update README.md
24cb5418f6fcc6d2ddd292593d412ec760b57028 Initial file load
69d88ec081a75a52b8b7661da0251325906de3b2 Initial commit
In the listings above jslib shows as an empty folder in the worktree,
and as a commit in the .git database.
That is the commit that was originally pulled into this super-project.