How to upload large files to GitHub?
While uploading large files to GitHub, I ran into its file size limits. GitHub blocks any file larger than 100 MB and shows a warning for files larger than 50 MB.
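Before committing, it can help to scan the working tree for files that would hit these limits. The sketch below is one way to do that with `find`. The function name `find_large_files` and its defaults are my own choices for illustration, not something Git or GitHub provides, and it assumes a `find` that accepts the `M` size suffix (GNU and BSD `find` both do).

```shell
# Hypothetical helper: list files under a directory that are larger than a
# given size in MiB, skipping Git's own .git directory.
# Usage: find_large_files [dir] [size-in-MiB]
find_large_files() (
  dir="${1:-.}"
  limit_mib="${2:-50}"   # 50 MB is where GitHub starts warning
  # -name .git -prune skips the object store; -size +NM keeps files above N MiB
  find "$dir" -name .git -prune -o -type f -size +"${limit_mib}M" -print
)
```

Running `find_large_files . 100` instead lists only the files that GitHub would reject outright.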
I got the file size error when I tried to push a commit that contained the large files. I then removed the files and pushed a new commit, but no luck: once the error had appeared, I could not push any changes at all. After searching for a while, I realized that the large files were still in the commit history, so deleting them in a new commit made no difference. I had to clean the commit history to remove them completely.
That is how I came to know about BFG Repo-Cleaner, which removes every file larger than a specified size from a repository's history. It is meant for removing (1) very large files and (2) passwords, credentials, and other private data, and it is described as a 10–720x faster alternative to git-filter-branch.
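To see for yourself that a deleted file is still sitting in the history, you can list the blobs Git has stored across all commits. This is a minimal sketch built from standard Git plumbing commands (`git rev-list` and `git cat-file`). The function name `list_large_blobs` and its arguments are hypothetical.

```shell
# Hypothetical helper: print "<size-in-bytes> <path>" for every blob in the
# history that is larger than a threshold, biggest first.
# Usage: list_large_blobs [threshold-bytes] [repo-dir]
list_large_blobs() (
  threshold="${1:-104857600}"   # default: 100 MiB
  repo="${2:-.}"
  # rev-list names every object reachable from any ref, with its path;
  # cat-file adds each object's type and uncompressed size.
  git -C "$repo" rev-list --objects --all |
    git -C "$repo" cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
    awk -v t="$threshold" '$1 == "blob" && $3 + 0 > t + 0 {
      size = $3
      sub(/^[^ ]+ [^ ]+ [^ ]+ /, "")   # drop type, hash and size; keep the path
      print size, $0
    }' |
    sort -rn
)
```

If a file shows up here but is no longer in the working tree, it lives only in the history, which is exactly the situation that blocked my push.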
After downloading the jar file, all I had to do was run the commands below:
java -jar bfg.jar --strip-blobs-bigger-than 100M some-big-repo.git
cd some-big-repo.git
git reflog expire --expire=now --all && git gc --prune=now --aggressive
After cleaning the repository, I could finally push my changes, but without the large files. So uploading the large files themselves was still left to do. Git LFS (Large File Storage) came to the rescue here. LFS stores the actual file contents on a separate remote server and keeps only small pointer files to them in the repository.
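To make the pointer idea concrete: a Git LFS pointer is a tiny text file holding a spec version line, the SHA-256 hash of the real content, and its size in bytes. Git LFS writes these itself, so the sketch below is only an illustration of the format. The function name `lfs_pointer_for` is hypothetical, and it assumes `sha256sum` is available (on macOS, `shasum -a 256` is the usual substitute).

```shell
# Hypothetical helper: print the pointer text that would stand in for a file.
# Usage: lfs_pointer_for <file>
lfs_pointer_for() (
  file="$1"
  oid=$(sha256sum "$file" | cut -d ' ' -f 1)    # content hash, hex encoded
  size=$(wc -c < "$file" | tr -d ' ')           # size in bytes
  printf 'version https://git-lfs.github.com/spec/v1\n'
  printf 'oid sha256:%s\n' "$oid"
  printf 'size %s\n' "$size"
)
```

The repository only ever stores these three lines per tracked file, which is why it stays small no matter how big the originals are.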
To use Git LFS, I first installed it locally. After that, the command below needs to be run once per user account.
git lfs install
Then, I needed to track the large files I wanted to upload.
git lfs track "*.pkl"
This creates a .gitattributes file in the repository root, which tells Git to handle every file matching the pattern (here, all .pkl files) through LFS. After that, the usual add and commit work as before; just remember to commit .gitattributes too. Finally, I pushed the LFS objects to the LFS server first and then pushed the pointer files to the repository.
git lfs push --all origin master
git push origin master
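A check that is worth running before the commit step: after `git lfs track "*.pkl"`, .gitattributes should contain a line like `*.pkl filter=lfs diff=lfs merge=lfs -text`, and plain `git check-attr` can confirm that a given path is covered by it. This is a small sketch. The function name `uses_lfs` and the file name `model.pkl` are hypothetical.

```shell
# Hypothetical helper: succeed if Git would route the given path through LFS.
# Run it from inside the repository.
# Usage: uses_lfs <path>
uses_lfs() {
  # git check-attr prints "<path>: filter: <value>"
  git check-attr filter -- "$1" | grep -q ': filter: lfs$'
}

# Example use (prints nothing if the pattern does not cover the file):
#   uses_lfs model.pkl && echo "model.pkl will be stored in LFS"
```

If a large file is not covered, a plain `git add` stores it as a normal blob and the push fails with the same size error as before.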
One thing to note: Git LFS should be installed on every machine where this repo will be cloned or pulled. Otherwise, the checkout will contain only the small pointer files, because downloading the real contents from the LFS server is handled by LFS itself.
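If a file looks suspiciously small after a clone, you can test whether it is still an LFS pointer rather than the real content. The sketch below only inspects the first line of the file and does not need Git LFS to be installed. The function name `is_lfs_pointer` is hypothetical.

```shell
# Hypothetical helper: succeed if the file is a Git LFS pointer, i.e. its
# first line is the LFS spec version line.
# Usage: is_lfs_pointer <file>
is_lfs_pointer() {
  [ -f "$1" ] || return 1
  # Pointers are tiny, so looking at the first 200 bytes is enough
  head -c 200 "$1" | head -n 1 |
    grep -qx 'version https://git-lfs\.github\.com/spec/v1'
}
```

When this succeeds for a tracked file, installing Git LFS and running `git lfs pull` in the repository should replace the pointer with the actual content.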
In addition, GitHub recommends keeping a repository as small as possible. We should not bloat the repository with unnecessary large files.