Rust study notes
Installation
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
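After installing Rust, run rustup doc to open the documentation in your browser. Click The Rust Programming Language there to read the web version of the introductory book.
Upgrade: rustup update
Install the nightly toolchain: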
rustup toolchain install nightly
List the installed toolchains:
rustup show
References:
https://rust-lang.github.io/rustup/basics.html#keeping-rust-up-to-date
https://rust-lang.github.io/rustup/concepts/channels.html
https://stackoverflow.com/questions/66681150/how-to-tell-cargo-to-use-nightly
Uninstall
rustup self uninstall
cargo
Documentation
cargo doc --open
Generates the project's documentation and opens it in the browser.
Creating a new project
cargo new <project-name>
Cargo.toml
version
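Specifies the crate's version. When a crate is hosted on GitHub and several consecutive commits share the same version, it appears that only the earliest of those commits is used as that version of the crate. So if you make further changes after pushing a version to GitHub, you need to bump the version before users can pick up the new changes.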
[dev-dependencies]
Defines dependencies that are used only in tests: https://stackoverflow.com/questions/29857002/how-to-define-test-only-dependencies
Blocking waiting for file lock on package cache
rm -rf ~/.cargo/registry/index/*
rm ~/.cargo/.package-cache
publish
cargo login
cargo publish
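Standard library
String methods
- trim: removes leading and trailing whitespace.
- parse: converts the string to a specific type. The target type is inferred from the variable the result is assigned to, or can be given explicitly as parse::<T>().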
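A minimal sketch of trim and parse working together. The helper name parse_number is made up for illustration; the point is that the target type of parse comes from type inference (here, the function's return type) or from an explicit turbofish.

```rust
// Hypothetical helper: parse a possibly padded string into an i32.
fn parse_number(input: &str) -> Option<i32> {
    // trim() strips leading and trailing whitespace, including a trailing newline.
    // parse() infers its target type (i32) from the return type of this function.
    input.trim().parse().ok()
}

fn main() {
    // Target type taken from the variable annotation.
    let n: i32 = " 42\n".trim().parse().unwrap();
    // Target type given explicitly with the turbofish syntax.
    let x = "3.5".parse::<f64>().unwrap();
    println!("{} {} {:?}", n, x, parse_number("abc"));
}
```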
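Sorting
Sorting comes in two flavours, stable and unstable. A stable sort keeps equal elements in their original relative order; an unstable sort does not guarantee this.
For a stable sort use sort_by: https://doc.rust-lang.org/std/primitive.slice.html#method.sort_by
For an unstable sort use sort_unstable_by: https://doc.rust-lang.org/std/primitive.slice.html#method.sort_unstable_by
Both have a worst-case time complexity of \(O(n \log(n))\).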
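To make the difference concrete, here is a small sketch. The function names and the (key, tag) tuples are illustrative only. Sorting by the key alone with sort_by keeps elements with equal keys in their original order; sort_unstable_by gives no such guarantee, so it is only used here where ties cannot be distinguished.

```rust
// Stable: elements with equal keys keep their original relative order.
fn sort_by_key_stable(mut v: Vec<(u32, char)>) -> Vec<(u32, char)> {
    v.sort_by(|x, y| x.0.cmp(&y.0));
    v
}

// Unstable: fine when equal elements are indistinguishable anyway.
fn sort_descending(mut v: Vec<i32>) -> Vec<i32> {
    v.sort_unstable_by(|x, y| y.cmp(x));
    v
}

fn main() {
    let v = vec![(2, 'a'), (1, 'b'), (2, 'c'), (1, 'd')];
    println!("{:?}", sort_by_key_stable(v));
    println!("{:?}", sort_descending(vec![3, 1, 2]));
}
```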
Entry API
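Take the Entry API of BTreeMap as an example. For basic usage see the standard library documentation: https://doc.rust-lang.org/stable/std/collections/struct.BTreeMap.html#method.entry
The basic and_modify and or_insert_with interfaces have one problem, though: only one of the two closures ever runs, yet ownership of a single object cannot be handed to both of them. One workaround, if the object has an empty state, is to call or_insert with the empty value first and then apply the modification. Source: https://users.rust-lang.org/t/hashmap-entry-api-and-ownership/81368
A more general approach is to match on the returned Entry to check whether it is Occupied or Vacant, so the compiler can see that the two cases are mutually exclusive.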
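A minimal sketch of the match approach, with a made-up helper push_or_create. The owned String s is moved in exactly one of the two arms, which the borrow checker accepts; writing the same thing with and_modify(...).or_insert_with(...) would try to move s into both closures and fail to compile.

```rust
use std::collections::btree_map::{BTreeMap, Entry};

// Hypothetical helper: append `s` to the list under `key`,
// or create the list if the key is absent.
fn push_or_create(m: &mut BTreeMap<u32, Vec<String>>, key: u32, s: String) {
    match m.entry(key) {
        // `s` is moved here ...
        Entry::Occupied(mut e) => e.get_mut().push(s),
        // ... or here, never both.
        Entry::Vacant(e) => {
            e.insert(vec![s]);
        }
    }
}

fn main() {
    let mut m = BTreeMap::new();
    push_or_create(&mut m, 1, "a".to_string());
    push_or_create(&mut m, 1, "b".to_string());
    println!("{:?}", m);
}
```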
Another match example: modify and optionally remove
use std::collections::btree_map::{self, BTreeMap};

fn pop(m: &mut BTreeMap<u32, Vec<u32>>, key: u32) -> Option<u32> {
    match m.entry(key) {
        btree_map::Entry::Occupied(mut entry) => {
            let values = entry.get_mut();
            let ret = values.pop();
            // Remove the key once its list becomes empty.
            if values.is_empty() {
                entry.remove();
            }
            ret
        }
        btree_map::Entry::Vacant(_) => None,
    }
}

fn main() {
    let mut m: BTreeMap<u32, Vec<u32>> = BTreeMap::new();
    m.insert(1, vec![2, 3]);
    assert_eq!(pop(&mut m, 1), Some(3));
    assert_eq!(pop(&mut m, 1), Some(2));
    assert!(m.is_empty());
}
References:
https://doc.rust-lang.org/stable/std/collections/btree_map/enum.Entry.html
https://doc.rust-lang.org/stable/std/collections/btree_map/struct.VacantEntry.html
mpsc
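Requirement: read data in one thread and send it to another thread for processing.
My approach: send and receive through an mpsc channel.
Pitfall: an mpsc channel never blocks the sender because its buffer is unbounded. Reading the data turned out to be far faster than processing it, so the queue kept growing and consumed a large amount of memory.
Solution: use sync_channel: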
pub fn sync_channel<T>(bound: usize) -> (SyncSender<T>, Receiver<T>)
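The bound parameter is the number of messages the channel can buffer (a count, not a size in bytes). Once the buffer is full, send blocks until the receiver takes a message.
Documentation: https://doc.rust-lang.org/stable/std/sync/mpsc/index.html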
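A small sketch of the bounded channel in use. The function sum_through_channel and its parameters are made up for illustration. With a small bound, the producer blocks in send whenever the buffer is full, so memory use stays bounded regardless of how fast it produces.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

// Hypothetical example: a producer thread sends 0..n, the caller sums them.
fn sum_through_channel(n: u64, bound: usize) -> u64 {
    let (tx, rx) = sync_channel::<u64>(bound);
    let producer = thread::spawn(move || {
        for i in 0..n {
            // Blocks while `bound` messages are already waiting in the buffer.
            tx.send(i).unwrap();
        }
        // `tx` is dropped here, which ends the receiver's iterator.
    });
    let total: u64 = rx.iter().sum();
    producer.join().unwrap();
    total
}

fn main() {
    println!("{}", sum_through_channel(100, 2));
}
```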
Crates
dyn_struct
https://www.reddit.com/r/rust/comments/qbj84o/dyn_struct_create_types_whose_size_is_determined/
https://github.com/nolanderc/dyn_struct
enum_iterator
Can give the number of possible values of an enum.
num-derive
Can convert an enum to a primitive type.
serde
Specifying a field name
use serde::Deserialize;

#[derive(Deserialize)]
struct Info {
    // Maps the JSON key "num-run-op" to this field.
    #[serde(rename = "num-run-op")]
    num_run_op: usize,
}
This way, when reading JSON, the num-run-op key in the JSON is mapped to num_run_op.
Documentation: https://serde.rs/field-attrs.html
clap
Official documentation: https://docs.rs/clap/latest/clap/
Using derive: https://docs.rs/clap/latest/clap/_derive/index.html
#[arg(...)]
short
Uses the first letter of the field name as the flag by default.
You can also set the flag explicitly with short = 'x'.
long
Uses the field name with underscores replaced by - as the flag by default. You can also set the flag explicitly with long = "xxx".
default_value_t
default_value_t [= <expr>]
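The _t suffix presumably stands for "typed": the default is given as a value of the field's type rather than as a string.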
Positional arguments
A field without short, long, and the like is a positional argument by default.
Optional arguments
Declare the field as Option<T>.
API guidelines
Generic reader/writer functions take R: Read and W: Write by value (C-RW-VALUE)
What is the reason for C-RW-VALUE?
Type conversions
int <-> [u8]
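This entry was left without content. One common way to do it, sketched here with made-up helper names, is the to_le_bytes / from_le_bytes family on the integer types (to_be_bytes and to_ne_bytes exist as well). Little-endian is an arbitrary choice for this example.

```rust
use std::convert::TryInto;

// Integer -> bytes, little-endian.
fn u32_to_bytes(x: u32) -> [u8; 4] {
    x.to_le_bytes()
}

// Bytes -> integer; returns None unless the slice is exactly 4 bytes long.
fn bytes_to_u32(b: &[u8]) -> Option<u32> {
    let arr: [u8; 4] = b.try_into().ok()?;
    Some(u32::from_le_bytes(arr))
}

fn main() {
    let bytes = u32_to_bytes(0x1234_5678);
    println!("{:x?} {:?}", bytes, bytes_to_u32(&bytes));
}
```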
Vec<u8> -> String
https://stackoverflow.com/questions/19076719/how-do-i-convert-a-vector-of-bytes-u8-to-a-string
https://doc.rust-lang.org/stable/std/string/struct.String.html#method.from_utf8
Vec<T> -> [T; N]
Use try_into: https://stackoverflow.com/questions/29570607/is-there-a-good-way-to-convert-a-vect-to-an-array
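A short sketch of the try_into conversion; the helper name to_array3 and the length 3 are arbitrary. The conversion succeeds only when the Vec has exactly N elements; otherwise the original Vec is handed back as the error value.

```rust
use std::convert::TryInto;

// Hypothetical helper: succeeds only if `v` has exactly 3 elements.
fn to_array3(v: Vec<i32>) -> Result<[i32; 3], Vec<i32>> {
    v.try_into()
}

fn main() {
    println!("{:?}", to_array3(vec![1, 2, 3]));
    println!("{:?}", to_array3(vec![1, 2]));
}
```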
char -> u8
https://users.rust-lang.org/t/how-to-convert-char-to-u8/50195
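Converting a C string to String
use std::ffi::CStr;
use std::os::raw::c_char;
// hello() is assumed to be an extern "C" function, declared elsewhere,
// that returns a pointer to a NUL-terminated string.
let c_buf: *const c_char = unsafe { hello() };
let c_str: &CStr = unsafe { CStr::from_ptr(c_buf) };
let str_slice: &str = c_str.to_str().unwrap();
let str_buf: String = str_slice.to_owned(); // if necessary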
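The snippet above depends on an external C function. For a version that runs on its own, the sketch below uses CString to produce a valid C pointer; the helper c_ptr_to_string is a made-up name, and returning None for a null pointer or invalid UTF-8 is just one possible policy.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

// Hypothetical helper: copy a NUL-terminated C string into an owned String.
// The caller must guarantee that a non-null `ptr` points to a valid C string.
fn c_ptr_to_string(ptr: *const c_char) -> Option<String> {
    if ptr.is_null() {
        return None;
    }
    let c_str: &CStr = unsafe { CStr::from_ptr(ptr) };
    // to_str() fails if the bytes are not valid UTF-8.
    c_str.to_str().ok().map(|s| s.to_owned())
}

fn main() {
    // CString stands in for memory that would normally come from C code.
    let owned = CString::new("hello").unwrap();
    println!("{:?}", c_ptr_to_string(owned.as_ptr()));
}
```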
Syntax
_ is a wildcard

In a pattern such as Err(_), it matches every Err regardless of what is inside.
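The code this note originally referred to is missing, so here is a small stand-in (the function describe is made up) showing _ inside Err(_) and Ok(_):

```rust
fn describe(r: Result<i32, String>) -> &'static str {
    match r {
        Ok(n) if n >= 0 => "non-negative",
        // `_` ignores the remaining Ok payloads.
        Ok(_) => "negative",
        // `_` matches any error value without binding it.
        Err(_) => "error",
    }
}

fn main() {
    println!("{}", describe(Ok(1)));
    println!("{}", describe(Err("boom".to_string())));
}
```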
https://users.rust-lang.org/t/calling-function-in-struct-field-requires-extra-parenthesis/14214/2
I/O
Reading command-line arguments
use std::io;
use std::env;
use std::error::Error;
fn main() -> Result<(), Box<dyn Error>> {
let mut args = env::args();
let arg0 = args.next().unwrap();
// args.len(): Returns the exact remaining length of the iterator.
if args.len() != 1 {
eprintln!("{} dump-file", arg0);
return Err(Box::new(io::Error::new(
io::ErrorKind::Other,
"Invalid arguments",
)));
}
let file_path = args.next().unwrap();
println!("{}", file_path);
Ok(())
}
trait
A Rust trait defines which interfaces a type provides. Once a trait is defined, it can be implemented for existing types:
trait A {
    fn a() -> i32;
}

impl A for f32 {
    fn a() -> i32 {
        2333
    }
}

fn main() {
    // 2333
    println!("{}", f32::a());
}
Related:
https://users.rust-lang.org/t/box-with-a-trait-object-requires-static-lifetime/35261
Associated type
trait A {
type T;
}
If B: A, T can usually be accessed as B::T. Inside a generic parameter list, however, the fully qualified form <B as A>::T is needed. Example:
trait A {
type T;
}
struct C<B: A, C = <B as A>::T> { a: B, at: C }
Universal function call syntax (UFCS)
Documentation: https://doc.rust-lang.org/reference/expressions/call-expr.html#disambiguating-function-calls
Mainly used to call a method of one specific trait:
<T as TraitA>::method_name(xxx)
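A minimal sketch of when this is needed. The traits Pilot and Wizard and the type Human are made up for the example. Both traits define a method called name, so a plain h.name() is ambiguous and the fully qualified form selects one.

```rust
trait Pilot {
    fn name(&self) -> String;
}

trait Wizard {
    fn name(&self) -> String;
}

struct Human;

impl Pilot for Human {
    fn name(&self) -> String {
        "pilot".to_string()
    }
}

impl Wizard for Human {
    fn name(&self) -> String {
        "wizard".to_string()
    }
}

fn main() {
    let h = Human;
    // `h.name()` would not compile here: multiple applicable methods in scope.
    println!("{}", <Human as Pilot>::name(&h));
    println!("{}", <Human as Wizard>::name(&h));
}
```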
Constraining the associated types of different types to be equal
This feature is not implemented yet: https://github.com/rust-lang/rust/issues/20041
FnOnce, FnMut, Fn
https://stackoverflow.com/questions/30177395/when-does-a-closure-implement-fn-fnmut-and-fnonce
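To build an array of functions without boxing, though, the elements have to be of an fn pointer type, which covers plain functions and closures that capture nothing; closures that capture their environment need something like Box<dyn Fn>: https://stackoverflow.com/questions/31736656/how-to-implement-a-vector-array-of-functions-in-rust-when-the-functions-co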
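A short sketch of both cases, with made-up names throughout: an array of fn pointers (plain functions plus one non-capturing closure, which coerces to a fn pointer), and a Vec of boxed closures for the case where a closure captures a variable.

```rust
fn add(a: i32, b: i32) -> i32 {
    a + b
}

fn sub(a: i32, b: i32) -> i32 {
    a - b
}

// Array of fn pointers: plain functions and non-capturing closures both fit.
fn apply_all(a: i32, b: i32) -> Vec<i32> {
    let ops: [fn(i32, i32) -> i32; 3] = [add, sub, |a, b| a * b];
    ops.iter().map(|f| f(a, b)).collect()
}

// Capturing closures have unique, unnameable types, so they are boxed as trait objects.
fn apply_boxed(x: i32, k: i32) -> Vec<i32> {
    let ops: Vec<Box<dyn Fn(i32) -> i32>> = vec![Box::new(move |v| v + k), Box::new(|v| v * 2)];
    ops.iter().map(|f| f(x)).collect()
}

fn main() {
    println!("{:?}", apply_all(6, 3));
    println!("{:?}", apply_boxed(5, 10));
}
```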
Higher-Rank Trait Bounds (HRTBs)
Official documentation: https://doc.rust-lang.org/nomicon/hrtb.html
Basic syntax: T: for<'a> TraitName<'a>
This means T must satisfy the trait bound for every possible lifetime. Example:
use std::ops::SubAssign;
fn func<T>(a: &mut T, b: &T)
where
T: for<'a> SubAssign<&'a T>,
{
*a -= b;
}
fn main() {
let mut a = 2;
let b = 1;
func(&mut a, &b);
println!("{}", a);
}
Multithreading
channel
The select! that went with the standard library's mpsc has been deprecated. Consider using crossbeam-channel instead: https://docs.rs/crossbeam-channel/latest/crossbeam_channel/
select: https://docs.rs/crossbeam-channel/latest/crossbeam_channel/macro.select.html
Error handling
Letting main work with multiple Error types
use std::error::Error;
fn main() -> Result<(), Box<dyn Error>> {
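A complete sketch of this pattern; the function parse_pair is made up. The ? operator converts each concrete error (ParseIntError and ParseFloatError here) into Box<dyn Error> automatically, so a single return type covers both.

```rust
use std::error::Error;

// Two different error types flow through the same `?` into Box<dyn Error>.
fn parse_pair(a: &str, b: &str) -> Result<f64, Box<dyn Error>> {
    let x: i32 = a.trim().parse()?; // may fail with ParseIntError
    let y: f64 = b.trim().parse()?; // may fail with ParseFloatError
    Ok(x as f64 + y)
}

fn main() -> Result<(), Box<dyn Error>> {
    println!("{}", parse_pair("1", "2.5")?);
    Ok(())
}
```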
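Sending multiple kinds of Error through a channel
Box<dyn Error> is not Send, so it cannot be sent through a channel to another thread (Box<dyn Error + Send> can). An alternative is to enumerate the kinds of Error that can occur and hand-write an enum for them, which can then be sent: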
enum FlushError {
Bincode(bincode::Error),
Io(io::Error),
}
impl From<bincode::Error> for FlushError {
fn from(e: bincode::Error) -> Self {
Self::Bincode(e)
}
}
impl From<io::Error> for FlushError {
fn from(e: io::Error) -> Self {
Self::Io(e)
}
}
References:
https://fettblog.eu/rust-enums-wrapping-errors/
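The FlushError snippet above depends on bincode. The sketch below shows the same idea end to end using only standard library errors; WorkerError, work, and run_in_thread are made-up names. The worker thread sends its Result, including the error enum, back over the channel.

```rust
use std::io;
use std::num::ParseIntError;
use std::sync::mpsc;
use std::thread;

#[derive(Debug)]
enum WorkerError {
    Io(io::Error),
    Parse(ParseIntError),
}

impl From<io::Error> for WorkerError {
    fn from(e: io::Error) -> Self {
        Self::Io(e)
    }
}

impl From<ParseIntError> for WorkerError {
    fn from(e: ParseIntError) -> Self {
        Self::Parse(e)
    }
}

// Produces either kind of error; `?` and `.into()` use the From impls above.
fn work(input: &str) -> Result<i32, WorkerError> {
    if input.is_empty() {
        return Err(io::Error::new(io::ErrorKind::UnexpectedEof, "empty input").into());
    }
    Ok(input.parse::<i32>()?)
}

// The enum is Send because both wrapped error types are Send.
fn run_in_thread(input: &'static str) -> Result<i32, WorkerError> {
    let (tx, rx) = mpsc::channel();
    let t = thread::spawn(move || {
        tx.send(work(input)).unwrap();
    });
    let result = rx.recv().unwrap();
    t.join().unwrap();
    result
}

fn main() {
    println!("{:?}", run_in_thread("12"));
    println!("{:?}", run_in_thread("x"));
}
```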
Getting mutable references to multiple elements of a Vec
For example, to get mutable references to a[1] and a[3], you can use an iterator:
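fn main() {
    let mut a = vec![0, 1, 2, 3, 4, 5];
    let mut iter = a.iter_mut();
    // nth(1) consumes indices 0 and 1 and returns a[1].
    let a1 = iter.nth(1).unwrap();
    // The iterator now starts at index 2, so a[3] is its element 3 - 1 - 1.
    let a3 = iter.nth(3 - 1 - 1).unwrap();
    *a1 = -1;
    *a3 = -1;
    println!("{:?}", a);
}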
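You can also use the nightly feature get_many_mut (as far as I know, it was later renamed get_disjoint_mut and stabilized in Rust 1.86):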
#![feature(get_many_mut)]
fn main() {
let mut a = vec![0, 1, 2, 3, 4, 5];
let [a1, a3] = a.get_many_mut([1, 3]).unwrap();
*a1 = -1;
*a3 = -1;
println!("{:?}", a);
}
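Another option on stable Rust is split_at_mut, which splits a slice into two non-overlapping mutable halves. This is a sketch with a made-up helper two_mut that requires i < j:

```rust
// Hypothetical helper: mutable references to s[i] and s[j], requiring i < j.
fn two_mut<T>(s: &mut [T], i: usize, j: usize) -> (&mut T, &mut T) {
    assert!(i < j && j < s.len());
    // `left` covers indices 0..j and `right` covers j.., so they cannot overlap.
    let (left, right) = s.split_at_mut(j);
    (&mut left[i], &mut right[0])
}

fn main() {
    let mut a = vec![0, 1, 2, 3, 4, 5];
    let (a1, a3) = two_mut(&mut a, 1, 3);
    *a1 = -1;
    *a3 = -1;
    println!("{:?}", a);
}
```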
Default values for struct fields
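This entry was left without content. One common approach on stable Rust, sketched here with a made-up Config type, is to implement the Default trait and use struct update syntax to override only the fields that differ:

```rust
#[derive(Debug, PartialEq)]
struct Config {
    threads: usize,
    verbose: bool,
    name: String,
}

impl Default for Config {
    fn default() -> Self {
        Config {
            threads: 4,
            verbose: false,
            name: "default".to_string(),
        }
    }
}

// Override one field and take the rest from Default.
fn verbose_config() -> Config {
    Config {
        verbose: true,
        ..Default::default()
    }
}

fn main() {
    println!("{:?}", verbose_config());
}
```

When every field's own default (0, false, empty string, and so on) is acceptable, #[derive(Default)] avoids writing the impl by hand.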
Generating random numbers
Use the rand crate. Documentation: https://docs.rs/rand/latest/rand/
Basic usage: https://docs.rs/rand/latest/rand/#quick-start
Built-in random number generators: https://docs.rs/rand/latest/rand/rngs/index.html
If you need to set the random seed, rand::rngs::StdRng is usually sufficient. Documentation: https://docs.rs/rand/latest/rand/rngs/struct.StdRng.html
Some built-in distributions: https://docs.rs/rand/latest/rand/distributions/index.html
The commonly used uniform distribution: https://docs.rs/rand/latest/rand/distributions/struct.Uniform.html
lower_bound / upper_bound
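This entry was left without content. On a sorted slice, the equivalents of C++ lower_bound and upper_bound can be sketched with partition_point from the standard library; the two wrapper functions below are made-up names.

```rust
// Index of the first element that is >= x (C++ lower_bound).
fn lower_bound(v: &[i32], x: i32) -> usize {
    v.partition_point(|&e| e < x)
}

// Index of the first element that is > x (C++ upper_bound).
fn upper_bound(v: &[i32], x: i32) -> usize {
    v.partition_point(|&e| e <= x)
}

fn main() {
    // The slice must already be sorted.
    let v = [1, 2, 2, 2, 5];
    println!("{} {}", lower_bound(&v, 2), upper_bound(&v, 2));
}
```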
Module
https://doc.rust-lang.org/book/ch07-05-separating-modules-into-different-files.html
Conditional compilation
Official documentation: https://doc.rust-lang.org/reference/conditional-compilation.html
https://stackoverflow.com/questions/29857002/how-to-define-test-only-dependencies
Derive only in tests: #[cfg_attr(test, derive(Deserialize))]. Source: https://www.reddit.com/r/rust/comments/nwywqx/conditionally_derive_for_integration_tests/
Impl only in tests:
#[cfg(test)]
impl Default for Status {
Other
https://stackoverflow.com/questions/28185854/how-do-i-test-crates-with-no-std
Unstable features
generic_const_exprs
Projects that need it:
https://github.com/seekstar/counter-timer-cpp
Together with array-macro, the timers can be stored in an array instead of a Vec.
RFC
Multiple Attributes in an Attribute Container (postponed)
Support for types that must not be dropped: https://github.com/rust-lang/rfcs/pull/776 (postponed)
Improving Entry API to get the keys back when they are unused
Known issues
Non-lexical lifetimes (NLL)
Source: https://blog.rust-lang.org/2022/08/05/nll-by-default.html
fn last_or_push<'a>(vec: &'a mut Vec<String>) -> &'a String {
if let Some(s) = vec.last() { // borrows vec
// returning s here forces vec to be borrowed
// for the rest of the function, even though it
// shouldn't have to be
return s;
}
// Because vec is borrowed, this call to vec.push gives
// an error!
vec.push("".to_string()); // ERROR
vec.last().unwrap()
}
error[E0502]: cannot borrow `*vec` as mutable because it is also borrowed as immutable
--> a.rs:11:5
|
1 | fn last_or_push<'a>(vec: &'a mut Vec<String>) -> &'a String {
| -- lifetime `'a` defined here
2 | if let Some(s) = vec.last() { // borrows vec
| ---------- immutable borrow occurs here
...
6 | return s;
| - returning this value requires that `*vec` is borrowed for `'a`
...
11 | vec.push("".to_string()); // ERROR
| ^^^^^^^^^^^^^^^^^^^^^^^^ mutable borrow occurs here
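This happens because, after s borrows vec, s is only conditionally returned, yet the compiler still extends the borrow of vec across all branches. The other branch, which never borrows vec, is therefore also treated as borrowing it, and compilation fails.
Reportedly the next-generation borrow checker, Polonius, can solve this. For now the only workaround is to defer the borrow of vec: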
fn last_or_push<'a>(vec: &'a mut Vec<String>) -> &'a String {
if !vec.is_empty() {
let s = vec.last().unwrap(); // borrows vec
return s; // extends the borrow
}
// In this branch, the borrow has never happened, so even
// though it is extended, it doesn't cover this call;
// the code compiles.
//
// Note the subtle difference with the previous example:
// in that code, the borrow *always* happened, but it was
// only *conditionally* returned (but the compiler lost track
// of the fact that it was a conditional return).
//
// In this example, the *borrow itself* is conditional.
vec.push("".to_string());
vec.last().unwrap()
}
fn main() { }
Complex code cannot be executed while debugging
https://stackoverflow.com/questions/68232945/execute-a-statement-while-debugging-in-rust
Raw pointers are !Sync and !Send
https://doc.rust-lang.org/nomicon/send-and-sync.html
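The main purpose is to prevent a struct containing raw pointers from being automatically marked as thread-safe.
So if you need to share a raw pointer between threads, and you can guarantee that access to what it points to is already synchronized, you can write a wrapper: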
struct ThreadSafePtr<T>(*mut T);
unsafe impl<T> Send for ThreadSafePtr<T> {}
unsafe impl<T> Sync for ThreadSafePtr<T> {}
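Personally, I think raw pointers themselves should be thread-safe, and the compiler should instead refuse to automatically mark structs containing raw pointers as thread-safe.
Related discussion: https://internals.rust-lang.org/t/shouldnt-pointers-be-send-sync-or/8818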
drop takes a mutable reference rather than ownership
https://stackoverflow.com/questions/30905826/why-does-drop-take-mut-self-instead-of-self
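This is to prevent the compiler from automatically calling drop again at the end of drop.
If you need to consume a field inside drop, you can put that field in an Option. Alternatively, wrap the field in ManuallyDrop and take it (using unsafe) inside drop: https://users.rust-lang.org/t/can-drop-handler-take-ownership-of-a-field/74301/7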
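A minimal sketch of the Option approach; Worker, spawn, and run are made-up names. JoinHandle::join takes the handle by value, which drop cannot provide through &mut self, so the handle is stored in an Option and moved out with take().

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};

struct Worker {
    // Wrapped in Option so that drop can move the handle out.
    handle: Option<JoinHandle<()>>,
}

impl Worker {
    fn spawn(done: Arc<AtomicBool>) -> Self {
        let handle = thread::spawn(move || done.store(true, Ordering::SeqCst));
        Worker {
            handle: Some(handle),
        }
    }
}

impl Drop for Worker {
    fn drop(&mut self) {
        // take() leaves None behind and gives us the handle by value.
        if let Some(h) = self.handle.take() {
            let _ = h.join();
        }
    }
}

// Returns true if the thread had finished by the time the Worker was dropped.
fn run() -> bool {
    let done = Arc::new(AtomicBool::new(false));
    drop(Worker::spawn(Arc::clone(&done)));
    done.load(Ordering::SeqCst)
}

fn main() {
    println!("{}", run());
}
```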
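I think the best design would be for drop to take ownership, with the compiler special-casing it so that drop is not called again at the end of drop. The Rust core developers, however, felt that this would require too many changes to the compiler: https://github.com/rust-lang/rust/issues/4330
With a custom Drop::drop, you cannot take ownership of an individual field
Copying a mutable reference field of a struct mutably borrows the struct
For example: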
struct S<'a> {
m: &'a mut i32,
}
impl<'a> S<'a> {
fn f1<'b>(&'b mut self) -> &'a mut i32 {
let new_m: &'a mut i32 = self.m;
new_m
}
}
fn f2(m: &mut i32) -> &mut i32 {
let mut s = S { m };
s.f1()
}
fn main() {
let mut m = 2;
*f2(&mut m) = 3;
println!("{}", m);
}
This fails with:
error: lifetime may not live long enough
--> test.rs:6:20
|
4 | impl<'a> S<'a> {
| -- lifetime `'a` defined here
5 | fn f1<'b>(&'b mut self) -> &'a mut i32 {
| -- lifetime `'b` defined here
6 | let new_m: &'a mut i32 = self.m;
| ^^^^^^^^^^^ type annotation requires that `'b` must outlive `'a`
|
= help: consider adding the following bound: `'b: 'a`
error: aborting due to 1 previous error
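Clearly we cannot change this to 'b: 'a, because s is a local variable and its lifetime 'b is shorter than 'a.
The error arises because let new_m = self.m is not a plain copy (a &mut reference is not Copy); it is an implicit reborrow that hands the write permission for *self.m over to new_m. The compiler has to guarantee that self is neither read nor written until that permission is handed back, so it makes new_m mutably borrow self and relies on the borrow mechanism to enforce this. And for new_m to mutably borrow self, self has to outlive new_m.
I think a new concept could be introduced here: mutability transfer. At let new_m = self.m, we would say that the mutability of self is transferred to new_m. While an object is in the "mutability transferred" state, reading or writing it is not allowed. This would avoid affecting the lifetime of new_m.
For now, the only way out in this situation is to have f1 consume self: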
struct S<'a> {
m: &'a mut i32,
}
impl<'a> S<'a> {
fn f1(self) -> &'a mut i32 {
let new_m: &'a mut i32 = self.m;
new_m
}
}
fn f2(m: &mut i32) -> &mut i32 {
let s = S { m };
s.f1()
}
fn main() {
let mut m = 2;
*f2(&mut m) = 3;
println!("{}", m);
}
Note that copying an immutable reference field is a real copy and does not need to borrow the whole struct, so the problem does not occur. For example, the following code compiles:
struct S<'a> {
m: &'a i32,
}
impl<'a> S<'a> {
fn f1<'b>(&'b self) -> &'a i32 {
let new_m: &'a i32 = self.m;
new_m
}
}
fn f2(m: &i32) -> &i32 {
let s = S { m };
s.f1()
}
fn main() {
let m = 2;
println!("{}", f2(&m));
}
However, if S::m is changed to a mutable reference:
struct S<'a> {
m: &'a mut i32,
}
impl<'a> S<'a> {
fn f1<'b>(&'b self) -> &'a i32 {
let new_m: &'a i32 = self.m;
new_m
}
}
fn f2(m: &mut i32) -> &i32 {
let s = S { m };
s.f1()
}
fn main() {
let mut m = 2;
println!("{}", f2(&mut m));
}
then even copying it into an immutable reference requires handing over the write permission, so the whole self has to be borrowed, which leads to the same lifetime problem as above:
error: lifetime may not live long enough
--> test.rs:6:20
|
4 | impl<'a> S<'a> {
| -- lifetime `'a` defined here
5 | fn f1<'b>(&'b self) -> &'a i32 {
| -- lifetime `'b` defined here
6 | let new_m: &'a i32 = self.m;
| ^^^^^^^ type annotation requires that `'b` must outlive `'a`
|
= help: consider adding the following bound: `'b: 'a`
error: aborting due to 1 previous error
Here, too, the only fix is to have f1 consume self:
fn f1(self) -> &'a i32 {
