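Learning Git
Git's basic mechanics
For learning Git, this book is highly recommended; it is available in many languages.
Next is this website.
Finally, once you know how to use Git, you can also help develop a Git implementation:
libra, part of mega, is a Git-like tool written in pure Rust. It already implements the basic features but is not yet mature.
Project: mega/mercury/src/internal/object at main · web3infra-foundation/mega. Contributions are welcome.
A simple commit
Git stores all of its objects under the .git/objects directory. How the directories and files inside objects are laid out is explained below.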
root@65f8e5d2f4eb:/git# pwd
/git
root@65f8e5d2f4eb:/git# git init
hint: Using 'master' as the name for the initial branch. This default branch name
hint: is subject to change. To configure the initial branch name to use in all
hint: of your new repositories, which will suppress this warning, call:
hint:
hint: git config --global init.defaultBranch <name>
hint:
hint: Names commonly chosen instead of 'master' are 'main', 'trunk' and
hint: 'development'. The just-created branch can be renamed via this command:
hint:
hint: git branch -m <name>
Initialized empty Git repository in /git/.git/
root@65f8e5d2f4eb:/git# echo "hello">hello.txt
root@65f8e5d2f4eb:/git# cat hello.txt
hello
root@65f8e5d2f4eb:/git# git add .
root@65f8e5d2f4eb:/git# git commit -m "hello"
[master (root-commit) 4543b9b] hello
1 file changed, 1 insertion(+)
create mode 100644 hello.txt
root@65f8e5d2f4eb:/git/.git/objects# ls
45 aa ce info pack
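The directory names in the listing above (45, aa, ce) are the first two characters of the three object hashes created by the commit. As a minimal sketch of that mapping, assuming a helper called `loose_object_path` (a hypothetical name used only for illustration, not part of Git or mega), the path of a loose object could be derived like this:

```rust
use std::path::PathBuf;

/// Map a 40-character hex object id to its loose-object path.
/// `loose_object_path` is a hypothetical helper for illustration.
/// Returns `None` unless the id is exactly 40 hex characters.
fn loose_object_path(git_dir: &str, hex_id: &str) -> Option<PathBuf> {
    let is_hex = hex_id.bytes().all(|b| b.is_ascii_hexdigit());
    if hex_id.len() != 40 || !is_hex {
        return None;
    }
    // First 2 characters: directory name. Remaining 38: file name.
    let (dir, file) = hex_id.split_at(2);
    Some(PathBuf::from(git_dir).join("objects").join(dir).join(file))
}

fn main() {
    let ids = [
        "4543b9bdd8267c1c5bc7c457509517dca8781aaf", // commit
        "aaa96ced2d9a1c8e72c56b253a0e2fe78393feb7", // tree
        "ce013625030ba8dba906f756967f9e9ca394464a", // blob
    ];
    for id in ids.iter() {
        match loose_object_path(".git", id) {
            Some(path) => println!("{}", path.display()),
            None => println!("not a full object id: {}", id),
        }
    }
}
```

Abbreviated hashes such as 4543b9b are rejected here on purpose; resolving a prefix requires scanning the directory, which this sketch leaves out.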
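Git also provides commands for inspecting objects. (Git identifies each object by a 40-character hexadecimal SHA-1 hash, and every loose object file is stored zlib-compressed.)
# Show the type of an object
git cat-file -t <40-character hash>
# Show the object's content in human-readable form
git cat-file -p <40-character hash>
# Use these two commands to inspect every object created by the commit above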
root@65f8e5d2f4eb:/git/.git/objects# git cat-file -t 4543b9bdd8267c1c5bc7c457509517dca8781aaf
commit
root@65f8e5d2f4eb:/git/.git/objects# git cat-file -t aaa96ced2d9a1c8e72c56b253a0e2fe78393feb7
tree
root@65f8e5d2f4eb:/git/.git/objects# git cat-file -t ce013625030ba8dba906f756967f9e9ca394464a
blob
root@65f8e5d2f4eb:/git/.git/objects# git cat-file -p ce013625030ba8dba906f756967f9e9ca394464a
hello
root@65f8e5d2f4eb:/git/.git/objects# git cat-file -p aaa96ced2d9a1c8e72c56b253a0e2fe78393feb7
100644 blob ce013625030ba8dba906f756967f9e9ca394464a hello.txt
root@65f8e5d2f4eb:/git/.git/objects# git cat-file -p 4543b9bdd8267c1c5bc7c457509517dca8781aaf
tree aaa96ced2d9a1c8e72c56b253a0e2fe78393feb7
author yyjeqhc <1772413353@qq.com> 1747620563 +0800
committer yyjeqhc <1772413353@qq.com> 1747620563 +0800
hello
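# Note also that the commit automatically updates .git/refs/heads: the current branch's ref is moved to the new commit. As long as HEAD is not detached onto a specific commit, the HEAD file points to the branch name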
root@65f8e5d2f4eb:/git/.git# ls
COMMIT_EDITMSG HEAD branches config description hooks index info logs objects refs
root@65f8e5d2f4eb:/git/.git# cat HEAD
ref: refs/heads/master
root@65f8e5d2f4eb:/git/.git# ls refs/
heads tags
root@65f8e5d2f4eb:/git/.git# ls refs/heads/
master
root@65f8e5d2f4eb:/git/.git# cat refs/heads/master
4543b9bdd8267c1c5bc7c457509517dca8781aaf
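The HEAD file therefore has two possible shapes: a `ref: ` line naming a branch, or a bare commit hash when HEAD is detached. A small sketch of telling them apart, where the `Head` enum and `parse_head` function are illustrative names assumed for this example:

```rust
/// The two shapes the HEAD file can take (illustrative type).
#[derive(Debug, PartialEq)]
enum Head {
    /// HEAD names a ref, e.g. "refs/heads/master".
    Symbolic(String),
    /// HEAD holds a commit hash directly (detached HEAD).
    Detached(String),
}

/// Parse the text content of a HEAD file.
fn parse_head(content: &str) -> Option<Head> {
    let line = content.trim_end();
    if let Some(target) = line.strip_prefix("ref: ") {
        return Some(Head::Symbolic(target.to_string()));
    }
    if line.len() == 40 && line.bytes().all(|b| b.is_ascii_hexdigit()) {
        return Some(Head::Detached(line.to_string()));
    }
    None
}

fn main() {
    // Content taken from the transcript above.
    println!("{:?}", parse_head("ref: refs/heads/master\n"));
    // What HEAD would contain after checking out the commit directly.
    println!("{:?}", parse_head("4543b9bdd8267c1c5bc7c457509517dca8781aaf\n"));
}
```

In the symbolic case the commit hash is found by reading the named ref file, as the `cat refs/heads/master` step above does.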
# Finally, besides Git's own commands, you can also zlib-decompress an object file directly and inspect the raw bytes (for example as hex)
(base) root@ubuntu:~# zlib-flate
Command 'zlib-flate' not found, but can be installed with:
apt install qpdf
root@65f8e5d2f4eb:/git/.git/objects# zlib-flate -uncompress < 45/43b9bdd8267c1c5bc7c457509517dca8781aaf
commit 160tree aaa96ced2d9a1c8e72c56b253a0e2fe78393feb7
author yyjeqhc <1772413353@qq.com> 1747620563 +0800
committer yyjeqhc <1772413353@qq.com> 1747620563 +0800
hello
root@65f8e5d2f4eb:/git/.git/objects# zlib-flate -uncompress < aa/a96ced2d9a1c8e72c56b253a0e2fe78393feb7
tree 37100644 hello.txt<20 raw SHA-1 bytes, not printable>
root@65f8e5d2f4eb:/git/.git/objects#
root@65f8e5d2f4eb:/git/.git/objects# zlib-flate -uncompress < ce/013625030ba8dba906f756967f9e9ca394464a
blob 6hello
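The decompressed output always starts with the object type, a space, the payload size, and a NUL byte (invisible in the terminal, which is why it reads "blob 6hello"). A minimal sketch of splitting such bytes apart, with `split_object` being a name assumed for this example:

```rust
/// Split a decompressed loose object into (type, declared size, payload).
/// `split_object` is an illustrative helper, not a Git or mega API.
fn split_object(raw: &[u8]) -> Result<(&str, usize, &[u8]), String> {
    // The header ends at the first NUL byte.
    let nul = raw
        .iter()
        .position(|&b| b == 0)
        .ok_or("missing NUL after header")?;
    let header = std::str::from_utf8(&raw[..nul]).map_err(|e| e.to_string())?;
    let (kind, size_text) = header.split_once(' ').ok_or("header has no space")?;
    let size: usize = size_text
        .parse()
        .map_err(|_| format!("bad size: {}", size_text))?;
    let payload = &raw[nul + 1..];
    if payload.len() != size {
        return Err(format!("size says {} but payload has {} bytes", size, payload.len()));
    }
    Ok((kind, size, payload))
}

fn main() {
    // Same bytes as the decompressed blob in the transcript above.
    let raw = b"blob 6\0hello\n";
    match split_object(raw) {
        Ok((kind, size, payload)) => {
            println!("type={} size={} payload={:?}", kind, size, String::from_utf8_lossy(payload))
        }
        Err(e) => println!("invalid object: {}", e),
    }
}
```

Decompression itself is left out because the Rust standard library has no zlib support; the input here is assumed to be already decompressed.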
First, how Git objects are stored
mega/libra/src/utils/client_storage.rs at main · web3infra-foundation/mega
#[derive(PartialEq, Eq, Hash, Debug, Clone, Copy, Serialize, Deserialize)]
pub enum ObjectType {
    Commit = 1,
    Tree,
    Blob,
    Tag,
    // ... (variants not used here are omitted)
}
pub fn put(
    &self,
    obj_id: &SHA1,
    content: &[u8],
    obj_type: ObjectType,
) -> Result<String, io::Error> {
    let path = self.get_obj_path(obj_id);
    let dir = path.parent().unwrap();
    fs::create_dir_all(dir)?;
    let header = format!("{} {}\0", obj_type, content.len());
    let full_content = [header.as_bytes().to_vec(), Vec::from(content)].concat();
    let mut file = fs::File::create(&path)?;
    file.write_all(&Self::compress_zlib(&full_content)?)?;
    Ok(path.to_str().unwrap().to_string())
}
// Every object Git saves under objects has the form:
// object type + space + data length as a decimal string + '\0' + data, and the whole thing is then zlib-compressed.
// The file is named after the 40-character hash: the first 2 characters are the directory name and the remaining 38 are the file name.
Next, the SHA1 struct
In Git every object has such a hash value, and Git uses this hash to identify the object.
mega/mercury/src/hash.rs at main · web3infra-foundation/mega
#[derive(
    Clone, Copy, Debug, PartialEq, Eq, Hash, PartialOrd, Ord, Default, Deserialize, Serialize,
)]
pub struct SHA1(pub [u8; 20]);
impl SHA1 {
    // The size of the SHA-1 hash value in bytes
    pub const SIZE: usize = 20;
    /// Calculate the SHA-1 hash of the byte slice, then create a Hash value
    pub fn new(data: &[u8]) -> SHA1 {
        let h = sha1::Sha1::digest(data);
        SHA1::from_bytes(h.as_slice())
    }
    // Compare with put(): let header = format!("{} {}\0", obj_type, content.len());
    // This function likewise builds the complete header + data.
    // In other words, the hash is computed over the full content before zlib compression (full_content in put()).
    // The 40-character hash string then gives the directory name (first 2 characters) and the
    // file name (remaining 38), and that file holds the zlib-compressed full content.
    pub fn from_type_and_data(object_type: ObjectType, data: &[u8]) -> SHA1 {
        let mut d: Vec<u8> = Vec::new();
        d.extend(object_type.to_data().unwrap());
        d.push(b' ');
        d.extend(data.len().to_string().as_bytes());
        d.push(b'\x00');
        d.extend(data);
        SHA1::new(&d)
    }
    //...
}
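To check the claim that the hash covers the header plus the data before compression, the object id can be recomputed by hand. The sketch below is an assumption-laden illustration: it carries its own small SHA-1 routine so that it runs with the standard library alone (mega uses the `sha1` crate instead), and `object_id` is a name chosen for this example.

```rust
/// Minimal SHA-1 over a byte slice, for illustration only.
fn sha1(data: &[u8]) -> [u8; 20] {
    let mut h: [u32; 5] = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0];
    // Padding: 0x80, zeros up to 56 mod 64, then the bit length as big-endian u64.
    let bit_len = (data.len() as u64) * 8;
    let mut msg = data.to_vec();
    msg.push(0x80);
    while msg.len() % 64 != 56 {
        msg.push(0);
    }
    msg.extend_from_slice(&bit_len.to_be_bytes());
    for chunk in msg.chunks(64) {
        let mut w = [0u32; 80];
        for i in 0..16 {
            w[i] = u32::from_be_bytes([chunk[4 * i], chunk[4 * i + 1], chunk[4 * i + 2], chunk[4 * i + 3]]);
        }
        for i in 16..80 {
            w[i] = (w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16]).rotate_left(1);
        }
        let (mut a, mut b, mut c, mut d, mut e) = (h[0], h[1], h[2], h[3], h[4]);
        for i in 0..80 {
            let (f, k) = match i {
                0..=19 => ((b & c) | (!b & d), 0x5A827999u32),
                20..=39 => (b ^ c ^ d, 0x6ED9EBA1u32),
                40..=59 => ((b & c) | (b & d) | (c & d), 0x8F1BBCDCu32),
                _ => (b ^ c ^ d, 0xCA62C1D6u32),
            };
            let temp = a
                .rotate_left(5)
                .wrapping_add(f)
                .wrapping_add(e)
                .wrapping_add(k)
                .wrapping_add(w[i]);
            e = d;
            d = c;
            c = b.rotate_left(30);
            b = a;
            a = temp;
        }
        h[0] = h[0].wrapping_add(a);
        h[1] = h[1].wrapping_add(b);
        h[2] = h[2].wrapping_add(c);
        h[3] = h[3].wrapping_add(d);
        h[4] = h[4].wrapping_add(e);
    }
    let mut out = [0u8; 20];
    for (i, word) in h.iter().enumerate() {
        out[4 * i..4 * i + 4].copy_from_slice(&word.to_be_bytes());
    }
    out
}

/// Hash "<type> <len>\0" + data, i.e. the bytes that exist before zlib compression.
fn object_id(kind: &str, data: &[u8]) -> String {
    let mut full = format!("{} {}\0", kind, data.len()).into_bytes();
    full.extend_from_slice(data);
    sha1(&full).iter().map(|b| format!("{:02x}", b)).collect()
}

fn main() {
    // Should match the blob hash of hello.txt from the first commit.
    println!("{}", object_id("blob", b"hello\n"));
}
```

Because the compressed bytes never enter the hash, two Git implementations with different zlib settings still agree on every object id.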
Now let's look at how each kind of Git object is laid out:
First the blob, which holds the contents of a regular file
mega/mercury/src/internal/object/blob.rs at main · web3infra-foundation/mega
/// **The Blob Object**
#[derive(Eq, Debug, Clone)]
pub struct Blob {
    pub id: SHA1,
    pub data: Vec<u8>,
}
impl ObjectTrait for Blob {
    /// Creates a new object from a byte slice.
    fn from_bytes(data: &[u8], hash: SHA1) -> Result<Self, GitError> { /* body omitted in this excerpt */ }
    /// Returns the Blob type
    fn get_type(&self) -> ObjectType {
        ObjectType::Blob
    }
    fn get_size(&self) -> usize {
        self.data.len()
    }
    fn to_data(&self) -> Result<Vec<u8>, GitError> {
        Ok(self.data.clone())
    }
}
/*
root@65f8e5d2f4eb:/git/.git/objects# zlib-flate -uncompress < ce/013625030ba8dba906f756967f9e9ca394464a |
> hexdump -C
00000000 62 6c 6f 62 20 36 00 68 65 6c 6c 6f 0a |blob 6.hello.|
0000000d
*/
// Storing a blob is simple: its data is just the file's byte stream, with no file name or other metadata.
Next the commit, which represents one git commit
mega/mercury/src/internal/object/commit.rs at main · web3infra-foundation/mega
/// The `Commit` struct is used to represent a commit object.
///
/// - The tree object SHA points to the top level tree for this commit, which reflects the complete
/// state of the repository at the time of the commit. The tree object in turn points to blobs and
/// subtrees which represent the files in the repository.
/// - The parent commit SHAs allow Git to construct a linked list of commits and build the full
/// commit history. By chaining together commits in this fashion, Git is able to represent the entire
/// history of a repository with a single commit object at its root.
/// - The author and committer fields contain the name, email address, timestamp and timezone.
/// - The message field contains the commit message, which maybe include signed or DCO.
#[derive(Eq, Debug, Clone, Serialize, Deserialize)]
pub struct Commit {
    pub id: SHA1,
    pub tree_id: SHA1,
    pub parent_commit_ids: Vec<SHA1>,
    pub author: Signature,
    pub committer: Signature,
    pub message: String,
}
impl ObjectTrait for Commit {
    fn from_bytes(data: &[u8], hash: SHA1) -> Result<Self, GitError> { /* body omitted in this excerpt */ }
    fn get_type(&self) -> ObjectType {
        ObjectType::Commit
    }
    fn get_size(&self) -> usize {
        0
    }
    /// [Git-Internals-Git-Objects](https://git-scm.com/book/en/v2/Git-Internals-Git-Objects)
    fn to_data(&self) -> Result<Vec<u8>, GitError> {
        let mut data = Vec::new();
        data.extend(b"tree ");
        data.extend(self.tree_id.to_string().as_bytes());
        data.extend(&[0x0a]);
        for parent_tree_id in &self.parent_commit_ids {
            data.extend(b"parent ");
            data.extend(parent_tree_id.to_string().as_bytes());
            data.extend(&[0x0a]);
        }
        data.extend(self.author.to_data()?);
        data.extend(&[0x0a]);
        data.extend(self.committer.to_data()?);
        data.extend(&[0x0a]);
        // Important! or Git Server can't parse & reply: unpack-objects abnormal exit
        // We can move [0x0a] to message instead here.
        // data.extend(&[0x0a]);
        data.extend(self.message.as_bytes());
        Ok(data)
    }
}
/// This matches Git's own format. The data.extend(&[0x0a]) call above is commented out because a
/// newline is prepended to the commit message when the commit is created:
/// mega/common/src/utils.rs
/// Format commit message with GPG signature<br>
/// There must be a `blank line`(\n) before `message`, or remote unpack failed.<br>
/// If there is `GPG signature`,
/// `blank line` should be placed between `signature` and `message`
pub fn format_commit_msg(msg: &str, gpg_sig: Option<&str>) -> String {
    match gpg_sig {
        None => {
            format!("\n{}", msg)
        }
        Some(gpg) => {
            format!("{}\n\n{}", gpg, msg)
        }
    }
}
/*
root@65f8e5d2f4eb:/git/.git/objects# zlib-flate -uncompress < 45/43b9bdd8267c1c5bc7c457509517dca8781aaf |
> hexdump -C
00000000 63 6f 6d 6d 69 74 20 31 36 30 00 74 72 65 65 20 |commit 160.tree |
00000010 61 61 61 39 36 63 65 64 32 64 39 61 31 63 38 65 |aaa96ced2d9a1c8e|
00000020 37 32 63 35 36 62 32 35 33 61 30 65 32 66 65 37 |72c56b253a0e2fe7|
00000030 38 33 39 33 66 65 62 37 0a 61 75 74 68 6f 72 20 |8393feb7.author |
00000040 79 79 6a 65 71 68 63 20 3c 31 37 37 32 34 31 33 |yyjeqhc <1772413|
00000050 33 35 33 40 71 71 2e 63 6f 6d 3e 20 31 37 34 37 |353@qq.com> 1747|
00000060 36 32 30 35 36 33 20 2b 30 38 30 30 0a 63 6f 6d |620563 +0800.com|
00000070 6d 69 74 74 65 72 20 79 79 6a 65 71 68 63 20 3c |mitter yyjeqhc <|
00000080 31 37 37 32 34 31 33 33 35 33 40 71 71 2e 63 6f |1772413353@qq.co|
00000090 6d 3e 20 31 37 34 37 36 32 30 35 36 33 20 2b 30 |m> 1747620563 +0|
000000a0 38 30 30 0a 0a 68 65 6c 6c 6f 0a |800..hello.|
000000ab
*/
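// As you can see, after zlib decompression the commit object also follows the format
// object type + space + data length as a decimal string + '\0' + data.
// The data at the end is exactly what ObjectTrait's to_data() produces.
// Because this is the first commit, there is no parent line.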
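The "commit 160" header in the hexdump can be reproduced by assembling the same lines by hand. The following is a sketch only: `commit_payload` is an illustrative function that takes pre-formatted signature strings, whereas mega's `Commit::to_data` works on its own `Signature` type.

```rust
/// Assemble the payload of a commit object (everything after the "commit <len>\0" header).
/// Signatures are passed as ready-made "name <email> timestamp timezone" strings.
fn commit_payload(
    tree: &str,
    parents: &[&str],
    author: &str,
    committer: &str,
    message: &str,
) -> Vec<u8> {
    let mut out = String::new();
    out.push_str(&format!("tree {}\n", tree));
    // A root commit has no parent lines; a merge commit has several.
    for parent in parents {
        out.push_str(&format!("parent {}\n", parent));
    }
    out.push_str(&format!("author {}\n", author));
    out.push_str(&format!("committer {}\n", committer));
    // Blank line separating the headers from the message.
    out.push('\n');
    out.push_str(message);
    out.into_bytes()
}

fn main() {
    // Values copied from the `git cat-file -p` output of the first commit.
    let who = "yyjeqhc <1772413353@qq.com> 1747620563 +0800";
    let payload = commit_payload(
        "aaa96ced2d9a1c8e72c56b253a0e2fe78393feb7",
        &[],
        who,
        who,
        "hello\n",
    );
    // The length printed here is the number that appears in the object header.
    println!("commit {}", payload.len());
}
```

Adding one parent would lengthen the payload by 48 bytes: "parent ", the 40-character hash, and a newline.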
Next, Git's tag object
# There are two kinds of git tag; a lightweight tag is not stored as a Git object
/*
root@65f8e5d2f4eb:/git/.git/objects# ls
45 aa ce d7 e1 info pack
root@65f8e5d2f4eb:/git/.git/objects# git tag 1406
# As you can see, creating the tag did not add any object
root@65f8e5d2f4eb:/git/.git/objects# ls
45 aa ce d7 e1 info pack
root@65f8e5d2f4eb:/git/.git# ls refs/tags
1406 v1 v1.4
# A lightweight tag's ref file stores the hash of the object it points to
root@65f8e5d2f4eb:/git/.git# cat refs/tags/1406
4543b9bdd8267c1c5bc7c457509517dca8781aaf
# An annotated tag's ref file stores the hash of the tag object itself
root@65f8e5d2f4eb:/git/.git# cat refs/tags/v1
d738d7b8e4c018fc9e9655c0b399d77379f54d31
root@65f8e5d2f4eb:/git/.git# cat refs/tags/v1.4
e1bca2fee2ce797abf62626a8b2b77124798dbdd
*/
root@65f8e5d2f4eb:/git/.git/objects# git tag -a v1.4 -m "my version 1.4"
root@65f8e5d2f4eb:/git/.git/objects# ls
45 aa ce e1 info pack
root@65f8e5d2f4eb:/git/.git/objects# git cat-file -t e1bca2fee2ce797abf62626a8b2b77124798dbdd
tag
root@65f8e5d2f4eb:/git/.git/objects# git cat-file -p e1bca2fee2ce797abf62626a8b2b77124798dbdd
object 4543b9bdd8267c1c5bc7c457509517dca8781aaf
type commit
tag v1.4
tagger yyjeqhc <1772413353@qq.com> 1747628214 +0800
my version 1.4
root@65f8e5d2f4eb:/git/.git/objects# zlib-flate -uncompress < e1/bca2fee2ce797abf62626a8b2b77124798dbdd | hexdump -C
00000000 74 61 67 20 31 33 37 00 6f 62 6a 65 63 74 20 34 |tag 137.object 4|
00000010 35 34 33 62 39 62 64 64 38 32 36 37 63 31 63 35 |543b9bdd8267c1c5|
00000020 62 63 37 63 34 35 37 35 30 39 35 31 37 64 63 61 |bc7c457509517dca|
00000030 38 37 38 31 61 61 66 0a 74 79 70 65 20 63 6f 6d |8781aaf.type com|
00000040 6d 69 74 0a 74 61 67 20 76 31 2e 34 0a 74 61 67 |mit.tag v1.4.tag|
00000050 67 65 72 20 79 79 6a 65 71 68 63 20 3c 31 37 37 |ger yyjeqhc <177|
00000060 32 34 31 33 33 35 33 40 71 71 2e 63 6f 6d 3e 20 |2413353@qq.com> |
00000070 31 37 34 37 36 32 38 32 31 34 20 2b 30 38 30 30 |1747628214 +0800|
00000080 0a 0a 6d 79 20 76 65 72 73 69 6f 6e 20 31 2e 34 |..my version 1.4|
00000090 0a |.|
00000091
root@65f8e5d2f4eb:/git/.git/objects# git tag -a v1 ce0136 -m "tag a blob"
root@65f8e5d2f4eb:/git/.git/objects# ls
45 aa ce d7 e1 info pack
root@65f8e5d2f4eb:/git/.git/objects# git cat-file -t d738d7b8e4c018fc9e9655c0b399d77379f54d31
tag
root@65f8e5d2f4eb:/git/.git/objects# git cat-file -p d738d7b8e4c018fc9e9655c0b399d77379f54d31
object ce013625030ba8dba906f756967f9e9ca394464a
type blob
tag v1
tagger yyjeqhc <1772413353@qq.com> 1747628604 +0800
tag a blob
root@65f8e5d2f4eb:/git/.git/objects# zlib-flate -uncompress < d7/38d7b8e4c018fc9e9655c0b399d77379f54d31 | hexdump -C
00000000 74 61 67 20 31 32 39 00 6f 62 6a 65 63 74 20 63 |tag 129.object c|
00000010 65 30 31 33 36 32 35 30 33 30 62 61 38 64 62 61 |e013625030ba8dba|
00000020 39 30 36 66 37 35 36 39 36 37 66 39 65 39 63 61 |906f756967f9e9ca|
00000030 33 39 34 34 36 34 61 0a 74 79 70 65 20 62 6c 6f |394464a.type blo|
00000040 62 0a 74 61 67 20 76 31 0a 74 61 67 67 65 72 20 |b.tag v1.tagger |
00000050 79 79 6a 65 71 68 63 20 3c 31 37 37 32 34 31 33 |yyjeqhc <1772413|
00000060 33 35 33 40 71 71 2e 63 6f 6d 3e 20 31 37 34 37 |353@qq.com> 1747|
00000070 36 32 38 36 30 34 20 2b 30 38 30 30 0a 0a 74 61 |628604 +0800..ta|
00000080 67 20 61 20 62 6c 6f 62 0a |g a blob.|
00000089
// An annotated git tag is also stored as a Git object.
// What git cat-file shows matches what zlib decompression shows.
Now the actual code
mega/mercury/src/internal/object/tag.rs at main · web3infra-foundation/mega
/// The tag object is used to represent an annotated tag
#[derive(Eq, Debug, Clone)]
pub struct Tag {
    pub id: SHA1,
    pub object_hash: SHA1,
    pub object_type: ObjectType,
    pub tag_name: String,
    pub tagger: Signature,
    pub message: String,
}
impl ObjectTrait for Tag {
    //...
    fn to_data(&self) -> Result<Vec<u8>, GitError> {
        let mut data = Vec::new();
        data.extend_from_slice("object".as_bytes());
        data.extend_from_slice(0x20u8.to_be_bytes().as_ref());
        data.extend_from_slice(self.object_hash.to_string().as_bytes());
        data.extend_from_slice(0x0au8.to_be_bytes().as_ref());
        data.extend_from_slice("type".as_bytes());
        data.extend_from_slice(0x20u8.to_be_bytes().as_ref());
        data.extend_from_slice(self.object_type.to_string().as_bytes());
        data.extend_from_slice(0x0au8.to_be_bytes().as_ref());
        data.extend_from_slice("tag".as_bytes());
        data.extend_from_slice(0x20u8.to_be_bytes().as_ref());
        data.extend_from_slice(self.tag_name.as_bytes());
        data.extend_from_slice(0x0au8.to_be_bytes().as_ref());
        data.extend_from_slice(self.tagger.to_data()?.as_ref());
        data.extend_from_slice(0x0au8.to_be_bytes().as_ref());
        data.extend_from_slice(self.message.as_bytes());
        Ok(data)
    }
}
// The zlib-decompressed bytes above show that this is exactly how Git stores an annotated tag
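Going the other way, the text payload of an annotated tag can be split back into its fields. This is a rough sketch under the assumption that the payload is valid UTF-8 and has the four header lines shown above; `TagInfo` and `parse_tag` are illustrative names, not mega's API.

```rust
/// Fields of an annotated tag payload (illustrative type).
#[derive(Debug, PartialEq)]
struct TagInfo {
    object: String,
    kind: String,
    name: String,
    tagger: String,
    message: String,
}

/// Parse the payload that follows the "tag <len>\0" header.
fn parse_tag(payload: &str) -> Option<TagInfo> {
    // Headers and message are separated by the first blank line.
    let (headers, message) = payload.split_once("\n\n")?;
    let (mut object, mut kind, mut name, mut tagger) = (None, None, None, None);
    for line in headers.lines() {
        let (key, value) = line.split_once(' ')?;
        match key {
            "object" => object = Some(value.to_string()),
            "type" => kind = Some(value.to_string()),
            "tag" => name = Some(value.to_string()),
            "tagger" => tagger = Some(value.to_string()),
            _ => {} // unknown headers are ignored in this sketch
        }
    }
    Some(TagInfo {
        object: object?,
        kind: kind?,
        name: name?,
        tagger: tagger?,
        message: message.to_string(),
    })
}

fn main() {
    // Payload of the v1.4 tag, copied from the hexdump above.
    let payload = "object 4543b9bdd8267c1c5bc7c457509517dca8781aaf\n\
                   type commit\n\
                   tag v1.4\n\
                   tagger yyjeqhc <1772413353@qq.com> 1747628214 +0800\n\
                   \n\
                   my version 1.4\n";
    println!("payload length: {}", payload.len());
    println!("{:?}", parse_tag(payload));
}
```

The printed payload length is the number stored in the "tag <len>" header of the object file.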
Finally, the tree
mega/mercury/src/internal/object/tree.rs at main · web3infra-foundation/mega
//! In Git, a tree object is used to represent the state of a directory at a specific point in time.
//! It stores information about the files and directories within that directory, including their names,
//! permissions, and the IDs of the objects that represent their contents.
//!
//! A tree object can contain other tree objects as well as blob objects, which represent the contents
//! of individual files. The object IDs of these child objects are stored within the tree object itself.
//!
//! When you make a commit in Git, you create a new tree object that represents the state of the
//! repository at that point in time. The new commit points to this tree, and its parent is
//! typically the commit that recorded the previous state of the repository.
//!
//! Git uses the tree object to efficiently store and manage the contents of a repository. By
//! representing the contents of a directory as a tree object, Git can quickly determine which files
//! have been added, modified, or deleted between two points in time. This allows Git to perform
//! operations like merging and rebasing more quickly and accurately.
//!
use crate::errors::GitError;
use crate::hash::SHA1;
use crate::internal::object::ObjectTrait;
use crate::internal::object::ObjectType;
use colored::Colorize;
use encoding_rs::GBK;
use serde::Deserialize;
use serde::Serialize;
use std::fmt::Display;
/// In Git, the mode field in a tree object's entry specifies the type of the object represented by
/// that entry. The mode is an octal number of up to six digits that encodes both the type and the
/// permissions of the object: the leading digits specify the object type, and the last three
/// digits specify the Unix permission bits.
#[derive(PartialEq, Eq, Debug, Clone, Copy, Serialize, Deserialize)]
pub enum TreeItemMode {
    Blob,
    BlobExecutable,
    Tree,
    Commit,
    Link,
}
impl Display for TreeItemMode {
    fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
        let _print = match *self {
            TreeItemMode::Blob => "blob",
            TreeItemMode::BlobExecutable => "blob executable",
            TreeItemMode::Tree => "tree",
            TreeItemMode::Commit => "commit",
            TreeItemMode::Link => "link",
        };
        write!(f, "{}", String::from(_print).blue())
    }
}
impl TreeItemMode {
    /// Convert a 32-bit mode to a TreeItemType
    ///
    /// |0100000000000000| (040000)| Directory|
    /// |1000000110100100| (100644)| Regular non-executable file|
    /// |1000000110110100| (100664)| Regular non-executable group-writeable file|
    /// |1000000111101101| (100755)| Regular executable file|
    /// |1010000000000000| (120000)| Symbolic link|
    /// |1110000000000000| (160000)| Gitlink|
    /// ---
    /// # GitLink
    /// Gitlink, also known as a submodule, is a feature in Git that allows you to include a Git
    /// repository as a subdirectory within another Git repository. This is useful when you want to
    /// incorporate code from another project into your own project, without having to manually copy
    /// the code into your repository.
    ///
    /// When you add a submodule to your Git repository, Git stores a reference to the other
    /// repository at a specific commit. This means that your repository will always point to a
    /// specific version of the other repository, even if changes are made to the submodule's code
    /// in the future.
    ///
    /// To work with a submodule in Git, you use commands like git submodule add, git submodule
    /// update, and git submodule init. These commands allow you to add a submodule to your repository,
    /// update it to the latest version, and initialize it for use.
    ///
    /// Submodules can be a powerful tool for managing dependencies between different projects and
    /// components. However, they can also add complexity to your workflow, so it's important to
    /// understand how they work and when to use them.
    pub fn tree_item_type_from_bytes(mode: &[u8]) -> Result<TreeItemMode, GitError> {
        Ok(match mode {
            b"40000" => TreeItemMode::Tree,
            b"100644" => TreeItemMode::Blob,
            b"100755" => TreeItemMode::BlobExecutable,
            b"120000" => TreeItemMode::Link,
            b"160000" => TreeItemMode::Commit,
            b"100664" => TreeItemMode::Blob,
            b"100640" => TreeItemMode::Blob,
            _ => {
                return Err(GitError::InvalidTreeItem(
                    String::from_utf8(mode.to_vec()).unwrap(),
                ));
            }
        })
    }
    /// 32-bit mode, split into (high to low bits):
    /// - 4-bit object type: valid values in binary are 1000 (regular file), 1010 (symbolic link) and 1110 (gitlink)
    /// - 3-bit unused
    /// - 9-bit unix permission: Only 0755 and 0644 are valid for regular files. Symbolic links and gitlink have value 0 in this field.
    pub fn to_bytes(self) -> &'static [u8] {
        match self {
            TreeItemMode::Blob => b"100644",
            TreeItemMode::BlobExecutable => b"100755",
            TreeItemMode::Link => b"120000",
            TreeItemMode::Tree => b"40000",
            TreeItemMode::Commit => b"160000",
        }
    }
}
/// A tree object contains a list of entries, one for each file or directory in the tree. Each entry
/// in the file represents an entry in the tree, and each entry has the following format:
///
/// ```bash
/// <mode> <name>\0<binary object ID>
/// ```
/// - `<mode>` is the mode of the object, written as octal ASCII digits without a leading zero (a
///   directory is stored as `40000`, although `git cat-file -p` prints it as `040000`). The leading
///   digits represent the object type (tree, blob, etc.), and the remaining digits represent the file mode or permissions.
/// - `<name>` is the name of the object.
/// - `\0` is a null byte separator.
/// - `<binary object ID>` is the ID of the object that represents the contents of the file or
///   directory, represented as a binary SHA-1 hash.
///
/// # Example
/// ```bash
/// 100644 hello-world\0<blob object ID>
/// 40000 data\0<tree object ID>
/// ```
#[derive(PartialEq, Eq, Debug, Clone, Serialize, Deserialize)]
pub struct TreeItem {
    pub mode: TreeItemMode,
    pub id: SHA1,
    pub name: String,
}
impl TreeItem {
    // Create a new TreeItem from a mode, id and name
    pub fn new(mode: TreeItemMode, id: SHA1, name: String) -> Self {
        TreeItem { mode, id, name }
    }
    /// Create a new TreeItem from a byte vector, split into a mode, id and name, the TreeItem format is:
    ///
    /// ```bash
    /// <mode> <name>\0<binary object ID>
    /// ```
    ///
    pub fn from_bytes(bytes: &[u8]) -> Result<Self, GitError> { /* body omitted in this excerpt */ }
    pub fn to_data(&self) -> Vec<u8> {
        let mut bytes = Vec::new();
        bytes.extend_from_slice(self.mode.to_bytes());
        bytes.push(b' ');
        bytes.extend_from_slice(self.name.as_bytes());
        bytes.push(b'\0');
        bytes.extend_from_slice(&self.id.to_data());
        bytes
    }
    pub fn is_tree(&self) -> bool {
        self.mode == TreeItemMode::Tree
    }
}
/// A tree object is a Git object that represents a directory. It contains a list of entries, one
/// for each file or directory in the tree.
#[derive(Eq, Debug, Clone, Serialize, Deserialize)]
pub struct Tree {
    pub id: SHA1,
    pub tree_items: Vec<TreeItem>,
}
impl Tree {
    pub fn from_tree_items(tree_items: Vec<TreeItem>) -> Result<Self, GitError> {
        if tree_items.is_empty() {
            return Err(GitError::EmptyTreeItems(
                "When export tree object to meta, the items is empty"
                    .parse()
                    .unwrap(),
            ));
        }
        let mut data = Vec::new();
        for item in &tree_items {
            data.extend_from_slice(item.to_data().as_slice());
        }
        Ok(Tree {
            id: SHA1::from_type_and_data(ObjectType::Tree, &data),
            tree_items,
        })
    }
    /// After the subdirectory is changed, the hash value of the tree is recalculated.
    pub fn rehash(&mut self) {
        let mut data = Vec::new();
        for item in &self.tree_items {
            data.extend_from_slice(item.to_data().as_slice());
        }
        self.id = SHA1::from_type_and_data(ObjectType::Tree, &data);
    }
}
impl TryFrom<&[u8]> for Tree {
    type Error = GitError;
    fn try_from(data: &[u8]) -> Result<Self, Self::Error> {
        let h = SHA1::from_type_and_data(ObjectType::Tree, data);
        Tree::from_bytes(data, h)
    }
}
impl ObjectTrait for Tree {
    fn from_bytes(data: &[u8], hash: SHA1) -> Result<Self, GitError> { /* body omitted in this excerpt */ }
    fn to_data(&self) -> Result<Vec<u8>, GitError> {
        let mut data: Vec<u8> = Vec::new();
        for item in &self.tree_items {
            data.extend_from_slice(item.to_data().as_slice());
            //data.push(b'\0');
        }
        Ok(data)
    }
}
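The mode strings matched in `tree_item_type_from_bytes` are octal numbers whose top four bits carry the object type and whose low nine bits carry the Unix permissions. A small sketch of that split, assuming a helper named `decode_mode` that exists only in this example:

```rust
/// Split a tree-entry mode such as b"100644" into (object-type bits, permission bits).
/// `decode_mode` is an illustrative helper, not part of mega.
fn decode_mode(mode: &[u8]) -> Option<(u32, u32)> {
    let text = std::str::from_utf8(mode).ok()?;
    // The stored mode is ASCII octal, so parse it with radix 8.
    let value = u32::from_str_radix(text, 8).ok()?;
    // Bits 15..12: object type. Bits 8..0: Unix permissions.
    Some((value >> 12, value & 0o777))
}

fn main() {
    let modes: [&[u8]; 5] = [b"40000", b"100644", b"100755", b"120000", b"160000"];
    for mode in modes.iter() {
        if let Some((kind, perms)) = decode_mode(mode) {
            println!(
                "{:>6}  type={:04b}  perms={:03o}",
                String::from_utf8_lossy(mode),
                kind,
                perms
            );
        }
    }
}
```

The type bits printed for the five modes correspond to the binary table in the doc comment of `tree_item_type_from_bytes`: 0100 for a directory, 1000 for a regular file, 1010 for a symbolic link, and 1110 for a gitlink.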
/*
Tree is the most complex structure: a Tree can hold many TreeItems, and each TreeItem's mode is an enum, so an item may itself refer to another Tree
root@65f8e5d2f4eb:/git/.git/objects# git cat-file -t aaa96ced2d9a1c8e72c56b253a0e2fe78393feb7
tree
root@65f8e5d2f4eb:/git/.git/objects# git cat-file -p aaa96ced2d9a1c8e72c56b253a0e2fe78393feb7
100644 blob ce013625030ba8dba906f756967f9e9ca394464a hello.txt
root@65f8e5d2f4eb:/git/.git/objects# zlib-flate -uncompress < aa/a96ced2d9a1c8e72c56b253a0e2fe78393feb7 | hexdump -C
00000000 74 72 65 65 20 33 37 00 31 30 30 36 34 34 20 68 |tree 37.100644 h|
00000010 65 6c 6c 6f 2e 74 78 74 00 ce 01 36 25 03 0b a8 |ello.txt...6%...|
00000020 db a9 06 f7 56 96 7f 9e 9c a3 94 46 4a |....V......FJ|
0000002d
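This Tree is very simple, so the structure is not obvious from it.
Concretely it is: object type + space + total length of all TreeItems as a decimal string + '\0' + treeItem + treeItem + ...
That is, the data is just the TreeItems laid out one after another; see TreeItem::to_data() for the layout of each item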
*/
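Since the entries are simply concatenated, a reader has to walk the payload entry by entry: mode up to the space, name up to the NUL, then exactly 20 raw hash bytes. The sketch below illustrates this with assumed names (`TreeEntry`, `parse_tree`); it is not the `from_bytes` implementation from mega, whose body is not shown here.

```rust
/// One decoded tree entry (illustrative type; the id is kept as hex text).
#[derive(Debug, PartialEq)]
struct TreeEntry {
    mode: String,
    name: String,
    id: String,
}

/// Walk a tree payload (the bytes after "tree <len>\0") entry by entry.
fn parse_tree(mut data: &[u8]) -> Result<Vec<TreeEntry>, String> {
    let mut entries = Vec::new();
    while !data.is_empty() {
        let space = data
            .iter()
            .position(|&b| b == b' ')
            .ok_or("missing space after mode")?;
        let nul = data
            .iter()
            .position(|&b| b == 0)
            .ok_or("missing NUL after name")?;
        // The NUL must follow the space, and 20 hash bytes must follow the NUL.
        if nul < space || data.len() < nul + 21 {
            return Err("truncated or malformed entry".to_string());
        }
        entries.push(TreeEntry {
            mode: String::from_utf8_lossy(&data[..space]).into_owned(),
            name: String::from_utf8_lossy(&data[space + 1..nul]).into_owned(),
            id: data[nul + 1..nul + 21]
                .iter()
                .map(|b| format!("{:02x}", b))
                .collect(),
        });
        data = &data[nul + 21..];
    }
    Ok(entries)
}

fn main() {
    // The 37-byte payload from the hexdump above.
    let mut payload = b"100644 hello.txt\0".to_vec();
    payload.extend_from_slice(&[
        0xce, 0x01, 0x36, 0x25, 0x03, 0x0b, 0xa8, 0xdb, 0xa9, 0x06,
        0xf7, 0x56, 0x96, 0x7f, 0x9e, 0x9c, 0xa3, 0x94, 0x46, 0x4a,
    ]);
    println!("payload length: {}", payload.len());
    println!("{:?}", parse_tree(&payload));
}
```

The hash is stored as 20 raw bytes rather than 40 hex characters, which is why the terminal showed unprintable characters when the tree was decompressed directly.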
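Finally, a few of Git's basic low-level commands
# First, a quick look at Git's staging area and working tree
# After git init, creating files and directories modifies the working tree
# git add a.txt and the like move files into the staging area (the index); whatever is staged is included in the next commit
# git status reports the differences between the last commit and the staging area, and between the staging area and the working tree
# The data below is the same as what was echoed into hello.txt earlier, so the hash is the same too
# -w writes the object into the objects directory; --stdin reads the content from standard input
# git hash-object -w a.txt would write a.txt into the objects directory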
root@65f8e5d2f4eb:/test# echo "hello" | git hash-object -w --stdin
ce013625030ba8dba906f756967f9e9ca394464a
root@65f8e5d2f4eb:/test/.git/objects# git cat-file -t ce013625030ba8dba906f756967f9e9ca394464a
blob
root@65f8e5d2f4eb:/test/.git/objects# git cat-file -p ce013625030ba8dba906f756967f9e9ca394464a
hello
root@65f8e5d2f4eb:/test/.git/objects# zlib-flate -uncompress < ce/013625030ba8dba906f756967f9e9ca394464a
blob 6hello
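# As you can see, the object written by hash-object is identical to the one written by git add
# An object written by hash-object can be placed into the staging area explicitly, under a file name you specify
# hash-object stores only the content, none of the file's other metadata
# For details of these commands, see Pro Git (mentioned above) or ask an AI assistant
root@65f8e5d2f4eb:/test# git update-index --add --cacheinfo 100644 ce013625030ba8dba906f756967f9e9ca394464a hello.txt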
root@65f8e5d2f4eb:/test# git status
On branch master
No commits yet
Changes to be committed:
(use "git rm --cached <file>..." to unstage)
new file: hello.txt
Changes not staged for commit:
(use "git add/rm <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
deleted: hello.txt
# A tree is also pure data, so its hash matches the tree created by git commit earlier
root@65f8e5d2f4eb:/test# git write-tree
aaa96ced2d9a1c8e72c56b253a0e2fe78393feb7
root@65f8e5d2f4eb:/test/.git/objects# git cat-file -t aaa96ced2d9a1c8e72c56b253a0e2fe78393feb7
tree
root@65f8e5d2f4eb:/test/.git/objects# git cat-file -p aaa96ced2d9a1c8e72c56b253a0e2fe78393feb7
100644 blob ce013625030ba8dba906f756967f9e9ca394464a hello.txt
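# git commit-tree takes the tree's hash (an unambiguous prefix works as well). A commit also records timestamps, so its hash is bound to differ from the earlier commit's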
root@65f8e5d2f4eb:/test# echo "hello" | git commit-tree aaa96c
bca409430870de8bd98aa020304e4b028919a38f
# Strange: the commit object exists, so why is the staging area still not clean? Read on
root@65f8e5d2f4eb:/test# git status
On branch master
No commits yet
Changes to be committed:
(use "git rm --cached <file>..." to unstage)
new file: hello.txt
Changes not staged for commit:
(use "git add/rm <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
deleted: hello.txt
root@65f8e5d2f4eb:/test# ls
root@65f8e5d2f4eb:/test/.git/objects# ls
aa bc ce info pack
root@65f8e5d2f4eb:/test/.git/objects# git cat-file -t bca409430870de8bd98aa020304e4b028919a38f
commit
root@65f8e5d2f4eb:/test/.git/objects# git cat-file -p bca409430870de8bd98aa020304e4b028919a38f
tree aaa96ced2d9a1c8e72c56b253a0e2fe78393feb7
author yyjeqhc <1772413353@qq.com> 1747622664 +0800
committer yyjeqhc <1772413353@qq.com> 1747622664 +0800
hello
root@65f8e5d2f4eb:/test/.git# cat HEAD
ref: refs/heads/master
root@65f8e5d2f4eb:/test/.git# ls refs/
heads tags
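# The repository is still in its freshly initialized state. The default branch is master, but with the low-level commands you have to create the master branch ref yourself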
root@65f8e5d2f4eb:/test/.git# ls refs/heads/
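# Point the master branch at the commit just created (git update-ref refs/heads/master <hash> does the same more safely)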
root@65f8e5d2f4eb:/test# echo bca409430870de8bd98aa020304e4b028919a38f > .git/refs/heads/master
# Clearly, the output of git status is now different.
root@65f8e5d2f4eb:/test# git status
On branch master
Changes not staged for commit:
(use "git add/rm <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
deleted: hello.txt
no changes added to commit (use "git add" and/or "git commit -a")
root@65f8e5d2f4eb:/test/.git# ls refs/heads/
master
root@65f8e5d2f4eb:/test/.git# cat refs/heads/master
bca409430870de8bd98aa020304e4b028919a38f
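The change in `git status` output follows from what status compares: the tree of the commit HEAD resolves to, the staging area, and the working tree. The toy model below is a deliberately simplified sketch (the `Snapshot` type and the `diff` and `status` functions are assumptions for illustration; real Git also handles untracked files, file metadata, and more), but it reproduces both outputs from the transcript:

```rust
use std::collections::BTreeMap;

/// A snapshot maps a path to the id of its content (toy model).
type Snapshot = BTreeMap<String, String>;

/// Describe how `new` differs from `old`, one line per changed path.
fn diff(old: &Snapshot, new: &Snapshot) -> Vec<String> {
    let mut changes = Vec::new();
    for (path, id) in new {
        match old.get(path) {
            None => changes.push(format!("new file: {}", path)),
            Some(old_id) if old_id != id => changes.push(format!("modified: {}", path)),
            _ => {}
        }
    }
    for path in old.keys() {
        if !new.contains_key(path) {
            changes.push(format!("deleted: {}", path));
        }
    }
    changes
}

/// Returns (changes to be committed, changes not staged for commit).
fn status(head: &Snapshot, index: &Snapshot, worktree: &Snapshot) -> (Vec<String>, Vec<String>) {
    (diff(head, index), diff(index, worktree))
}

fn main() {
    let empty = Snapshot::new();
    let mut index = Snapshot::new();
    index.insert(
        "hello.txt".to_string(),
        "ce013625030ba8dba906f756967f9e9ca394464a".to_string(),
    );

    // Before refs/heads/master exists, HEAD resolves to no commit: an empty tree.
    println!("before: {:?}", status(&empty, &index, &empty));

    // After the ref is written, HEAD's tree equals the index, so nothing is staged.
    let head = index.clone();
    println!("after:  {:?}", status(&head, &index, &empty));
}
```

In both cases hello.txt shows up as deleted in the working tree, because the blob was created with hash-object from standard input and no file was ever written to /test.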