Removing Unneeded Large Files from a Git Repository

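Over the past couple of days I have been tidying up many of my company's old code repositories and uploading them to an online project platform. Storage space there is limited, so I first needed to delete the unneeded large files from some of the repositories.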

1. Find the large files

Run the following command in the repository directory:

# In a bare repository, replace .git/objects/pack/pack-*.idx with objects/pack/pack-*.idx
git verify-pack -v .git/objects/pack/pack-*.idx | sort -k 3 -g | tail -5

The output looks like this. The columns are, in order: object ID, object type, object size in bytes, size in the packfile, and offset in the packfile.

f628e3087f5e32c2c84f2f7d534e744df31511e1 blob   348670 347901 444010
3e2109358e393b52d0583c28cae8d3b36a4d3c41 blob   503391 146406 59027
21e3ba4aad85cc3ae2cb7e666196b008cac0fbae blob   752709 150644 288525
2832523cb6c5daa0dffac74e4388a11e20af9f64 blob   1415831 297634 10476233
0a160bb8b890e3347121f4c6113e7292f4a279df blob   22888095 9603172 791911

Use the object ID (SHA-1) to look up the file path:

> git rev-list --objects --all | grep 0a160bb8b890e3347121f4c6113e7292f4a279df
0a160bb8b890e3347121f4c6113e7292f4a279df my-project/src/assets/images/地球1.jpg

The two steps above can be combined into one script:

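#!/bin/bash
# Print the paths of the five largest objects stored in the packfiles.
# "git rev-parse --git-dir" resolves to .git in a working tree and to . in a bare repository.
GIT_DIR_PATH=$(git rev-parse --git-dir)
git verify-pack -v "${GIT_DIR_PATH}"/objects/pack/pack-*.idx | sort -k 3 -g | tail -5 |
while read -r sha _; do
        # Object lines start with a 40-character SHA-1; skip anything else
        if [ "${#sha}" -eq 40 ]; then
                git rev-list --objects --all | grep "${sha}"
        fi
done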
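
As an alternative to parsing the packfile index, the sketch below asks Git for the size of every blob reachable from any ref and prints the largest ones together with their paths. It swaps `git verify-pack` for `git cat-file --batch-check`, so the sizes shown are uncompressed object sizes, not sizes in the packfile. The function name `list_largest_blobs` is a hypothetical helper, not part of Git.

```shell
#!/bin/bash
# Sketch: list the N largest blobs in history with their paths.
# list_largest_blobs is a hypothetical helper name, not a Git command.
list_largest_blobs() {
    local count="${1:-5}"
    # rev-list prints "<object-id> <path>" for every reachable object;
    # cat-file --batch-check adds the type and size, and %(rest) passes the path through.
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
        awk '$1 == "blob"' |
        sort -k 3 -n |
        tail -n "${count}"
}
```

Calling `list_largest_blobs 5` inside a repository prints one line per blob in the form `blob <object-id> <size-in-bytes> <path>`, smallest first.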

2. Remove the large files from every branch

Run the following command to remove the file from every commit on every branch:

# Several file names can be listed. Add -r to git rm to remove a directory recursively. Glob patterns such as *.dll *.pdb also work.
git filter-branch --force --index-filter 'git rm --cached --ignore-unmatch path/to/file' --prune-empty --tag-name-filter cat -- --all

Sample output:

> git filter-branch --force --index-filter 'git rm --cached --ignore-unmatch my-project/src/assets/images/地球1.jpg' --prune-empty --tag-name-filter cat -- --all
WARNING: git-filter-branch has a glut of gotchas generating mangled history
         rewrites.  Hit Ctrl-C before proceeding to abort, then use an
         alternative filtering tool such as 'git filter-repo'
         (https://github.com/newren/git-filter-repo/) instead.  See the
         filter-branch manual page for more details; to squelch this warning,
         set FILTER_BRANCH_SQUELCH_WARNING=1.
Proceeding with filter-branch...

Rewrite b3012a355f8abbd77aafd3bbe061b2a9d6fc2209 (6/75) (0 seconds passed, remaining 0 predicted)    rm 'my-project/src/assets/images/地球1.jpg'
Rewrite 2f46fe9d8f7e41e0fc4fc0c974dfe9d53a7ae16e (7/75) (0 seconds passed, remaining 0 predicted)    rm 'my-project/src/assets/images/地球1.jpg'
Rewrite 48eacdbaac7be54447becb6ba2299c7af7342018 (8/75) (0 seconds passed, remaining 0 predicted)    rm 'my-project/src/assets/images/地球1.jpg'
Rewrite 33dc9bc35ef8ba91d8a7a3ee438cffbc76ae731c (9/75) (0 seconds passed, remaining 0 predicted)    rm 'my-project/src/assets/images/地球1.jpg'
Rewrite 66797c81a7d0661b598ae1208e890a789cc5a431 (10/75) (0 seconds passed, remaining 0 predicted)    rm 'my-project/src/assets/images/地球1.jpg'
Rewrite 7b256cae45d2ffd8066e46e0908be61e8550eb88 (11/75) (0 seconds passed, remaining 0 predicted)    rm 'my-project/src/assets/images/地球1.jpg'
Rewrite e3356d4320f060c9d4bf92ae0c6a2ecd02d62e30 (12/75) (0 seconds passed, remaining 0 predicted)    rm 'my-project/src/assets/images/地球1.jpg'
Rewrite 5b86b0a737ce4d42cd9554cddcdd3da0c637ebf6 (13/75) (1 seconds passed, remaining 4 predicted)    rm 'my-project/src/assets/images/地球1.jpg'
Rewrite 77f6eff022bcfdf28a450829a8f6e678d37e1b29 (13/75) (1 seconds passed, remaining 4 predicted)    rm 'my-project/src/assets/images/地球1.jpg'
Rewrite b270af3f52c39115024d211d75b67af235ee4e86 (13/75) (1 seconds passed, remaining 4 predicted)    rm 'my-project/src/assets/images/地球1.jpg'
Rewrite 0ce2b7e593d7e679e2b4f4d0f1d214ad3c301818 (13/75) (1 seconds passed, remaining 4 predicted)    rm 'my-project/src/assets/images/地球1.jpg'
Rewrite 26f080f7fc6d4feb6833d8db6aa4b8a55f44f092 (13/75) (1 seconds passed, remaining 4 predicted)    rm 'my-project/src/assets/images/地球1.jpg'
Rewrite 1281c094b358354e108e1461ee0dd395dd3b7127 (13/75) (1 seconds passed, remaining 4 predicted)    rm 'my-project/src/assets/images/地球1.jpg'
Rewrite d0166fea8ae38c73e615e95e73e6913df1e37901 (13/75) (1 seconds passed, remaining 4 predicted)    rm 'my-project/src/assets/images/地球1.jpg'
Rewrite 6a5aafcc51e68a6bf3b3ac08ad0e3a01f8b240fc (55/75) (2 seconds passed, remaining 0 predicted)    rm 'my-project/src/assets/images/地球1.jpg'
Rewrite 22e415058e646ab3870c0ea4c0ffca179ad0a739 (55/75) (2 seconds passed, remaining 0 predicted)    rm 'my-project/src/assets/images/地球1.jpg'
Rewrite 83cf6f3db88c423184d9a1eec22c5c23841d492a (55/75) (2 seconds passed, remaining 0 predicted)
Ref 'refs/heads/20180115' was rewritten
Ref 'refs/heads/ese_v2_sge20180111' was rewritten
Ref 'refs/heads/master' was rewritten
Ref 'refs/heads/online_sge20180111' was rewritten
Ref 'refs/heads/show_localhost' was rewritten
Ref 'refs/heads/show_lyd20180315' was rewritten
Ref 'refs/tags/attachment' was rewritten
Ref 'refs/tags/ese_v1.03_online20180321' was rewritten
Ref 'refs/tags/ese_v1.07_show20180321' was rewritten
Ref 'refs/tags/ese_v1.08_online20180323' was rewritten
Ref 'refs/tags/ese_v1.08_show20180323' was rewritten
Ref 'refs/tags/ese_v1.09_online20180323' was rewritten
Ref 'refs/tags/ese_v1.10_show20180528' was rewritten
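
After the rewrite it can be useful to confirm that no branch or tag still contains the file. A minimal sketch follows, assuming a hypothetical helper named `path_in_history`. It deliberately checks `--branches --tags` instead of `--all`, because `git filter-branch` keeps backups of the original refs under `refs/original`, and those still point at the old commits at this stage.

```shell
#!/bin/bash
# Sketch: succeed if any commit reachable from a branch or tag touches the given path.
# path_in_history is a hypothetical helper name, not a Git command.
path_in_history() {
    local path="$1"
    # --branches --tags skips refs/original, where filter-branch keeps its backups
    [ -n "$(git log --branches --tags --format=%H -- "${path}")" ]
}
```

For example, `path_in_history 'my-project/src/assets/images/地球1.jpg' || echo "removed"` should print `removed` once the rewrite has succeeded.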

3. Delete the backup refs and the reflog

git for-each-ref --format='delete %(refname)' refs/original | git update-ref --stdin
git reflog expire --expire=now --all
# Or, as a single command line
git for-each-ref --format='delete %(refname)' refs/original | git update-ref --stdin && git reflog expire --expire=now --all
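
To check that this step worked, one option is to count the refs left under `refs/original`. The sketch below assumes a hypothetical helper named `backup_refs_remaining`; it only reads refs and changes nothing.

```shell
#!/bin/bash
# Sketch: print how many backup refs filter-branch left under refs/original.
# backup_refs_remaining is a hypothetical helper name, not a Git command.
backup_refs_remaining() {
    # tr strips the padding that some wc implementations add
    git for-each-ref --format='%(refname)' refs/original | wc -l | tr -d ' '
}
```

A result of `0` means the backup refs are gone and no longer keep the old commits reachable.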

4. Garbage collection

The previous two steps removed every reference to the large files, so running garbage collection now has a clearly visible effect:

git gc --prune=now
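
To measure the effect, the packfile size can be read from `git count-objects -v`, which reports `size-pack` in KiB. The sketch below wraps that in a hypothetical helper named `pack_size_kib`; run it before and after `git gc` and compare the two numbers.

```shell
#!/bin/bash
# Sketch: print the total size of the repository's packfiles in KiB.
# pack_size_kib is a hypothetical helper name, not a Git command.
pack_size_kib() {
    # count-objects -v prints "size-pack: <KiB>" among other statistics
    git count-objects -v | awk '$1 == "size-pack:" {print $2}'
}
```

For a human-readable summary, `git count-objects -vH` prints the same statistics with size units attached.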

The repository no longer contains the large files. You can now push it to a new remote, or force-push it to the existing remote.

git remote add newremote https://xxxxxx.com/xxxx/xxx.git
git push -u --all newremote
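
The commands above cover the new-remote case and push branches only. For the force-push case, a minimal sketch might look like the following; the helper name `push_rewritten_history` and the default remote name `origin` are assumptions for illustration. Force-pushing replaces the history on the remote, so existing clones of the old history will need to be re-cloned or reset.

```shell
#!/bin/bash
# Sketch: force-push all rewritten branches and tags to an existing remote.
# push_rewritten_history is a hypothetical helper; "origin" is an assumed default.
push_rewritten_history() {
    local remote="${1:-origin}"
    # --all pushes every branch; tags need a separate push
    git push --force --all "${remote}" &&
        git push --force --tags "${remote}"
}
```

Tags are pushed separately because `--all` covers branches only.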

posted @ 2020-07-23 16:35 乌合之众