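13_Concurrent Programming
1. What Is a Goroutine
In Java or C++, concurrent programming usually means maintaining a thread pool yourself, wrapping each unit of work as a task, and scheduling threads to run those tasks while managing context switches. All of this takes considerable mental effort from the programmer.
Go's goroutine is a mechanism that removes this burden. A goroutine is conceptually similar to a thread, but it is scheduled and managed by the Go runtime, which multiplexes goroutines onto OS threads and spreads the work across the available CPUs.
An OS thread generally has a fixed-size stack (commonly 2 MB), whereas a goroutine starts its life with a very small stack (typically 2 KB). A goroutine's stack is not fixed: it grows and shrinks on demand, up to a default limit of 1 GB on 64-bit systems.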
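Because goroutines are so cheap, starting thousands of them is routine. The following is a minimal sketch rather than a benchmark; the function name sumSquares and the mutex-protected total are illustrative assumptions, not part of the original notes.

```go
package main

import (
	"fmt"
	"sync"
)

// sumSquares starts one goroutine per value in 1..n and adds up the squares.
// (Hypothetical helper, used only to show how cheap goroutines are to start.)
func sumSquares(n int) int {
	var (
		wg    sync.WaitGroup
		mu    sync.Mutex
		total int
	)
	for i := 1; i <= n; i++ {
		wg.Add(1)
		go func(v int) { // every goroutine begins with a small, growable stack
			defer wg.Done()
			mu.Lock()
			total += v * v
			mu.Unlock()
		}(i)
	}
	wg.Wait() // block until every goroutine has finished
	return total
}

func main() {
	// Ten thousand goroutines would be expensive as OS threads.
	fmt.Println(sumSquares(10000))
}
```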
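2. Goroutine Scheduling
GPM is the scheduling system implemented by the Go runtime itself, as opposed to the operating system's scheduling of OS threads.
- 1. G is simply a goroutine. Besides the goroutine's own state, it holds information such as its binding to the P it runs on.
- 2. P manages a queue of goroutines and stores the context in which the current goroutine runs (function pointer, stack address, and stack bounds). P schedules the goroutines in its own queue, for example pausing a goroutine that has used the CPU for a long time and running the next one. When its own queue is empty it takes work from the global queue, and when the global queue is also empty it steals work from the queues of other Ps.
- 3. M (machine) is the Go runtime's abstraction of an OS kernel thread. An M generally maps one-to-one to a kernel thread, and a goroutine ultimately runs on an M.
P and M generally also correspond one-to-one: a P manages a group of Gs and runs them on its M. When a G blocks on an M for a long time, the runtime creates a new M, and the P that owned the blocked G moves its other Gs onto the new M. Once the old G unblocks, or is considered dead, the old M is reclaimed.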
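The number of Ps is set with runtime.GOMAXPROCS. Since Go 1.5 it defaults to the number of logical CPUs; early releases capped it at 256, and the cap was removed in Go 1.10. Under heavy concurrency the runtime may create additional Ms, but not many, because switching too often costs more than it gains. A program can call runtime.GOMAXPROCS() to set the number of logical CPU cores it uses concurrently.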
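The setting can be inspected and changed at run time. The sketch below uses only the standard runtime package; setProcs and currentProcs are hypothetical helper names introduced for illustration.

```go
package main

import (
	"fmt"
	"runtime"
)

// setProcs sets GOMAXPROCS to n and returns the previous setting.
func setProcs(n int) int {
	return runtime.GOMAXPROCS(n)
}

// currentProcs reports the current setting; passing 0 only queries the value.
func currentProcs() int {
	return runtime.GOMAXPROCS(0)
}

func main() {
	fmt.Println("logical CPUs:", runtime.NumCPU())
	fmt.Println("GOMAXPROCS at start:", currentProcs())
	prev := setProcs(2)
	fmt.Println("previous:", prev, "now:", currentProcs())
}
```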
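From a scheduling point of view alone, Go's advantage over other languages is that OS threads are scheduled by the OS kernel, while goroutines are scheduled by the Go runtime's own scheduler. This scheduler uses a technique called m:n scheduling (multiplexing m goroutines onto n OS threads). A key property is that goroutine scheduling happens in user space, so it avoids frequent switches between kernel mode and user mode. Memory allocation and release are likewise served from a large memory pool maintained in user space rather than by calling the system malloc directly (unless the pool itself has to change size), so the cost is much lower than scheduling OS threads. The scheduler also makes good use of multi-core hardware by spreading goroutines roughly evenly across the threads, and goroutines themselves are extremely lightweight. Together these give Go its scheduling performance.
The relationship between OS threads and goroutines in Go:
- one OS thread serves many user-space goroutines;
- a Go program can use several OS threads at the same time;
- goroutines and OS threads have a many-to-many (m:n) relationship: m goroutines are scheduled onto n OS threads.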
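// On a multi-core machine limited to one logical core, the two tasks share that core; here one task typically finishes before the other starts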
func a() {
for i := 1; i < 10; i++ {
fmt.Println("A:", i)
}
}
func b() {
for i := 1; i < 10; i++ {
fmt.Println("B:", i)
}
}
func main() {
runtime.GOMAXPROCS(1)
go a()
go b()
time.Sleep(time.Second)
}
// With two logical cores, the two tasks can be processed at the same time
func a() {
for i := 1; i < 10; i++ {
fmt.Println("A:", i)
}
}
func b() {
for i := 1; i < 10; i++ {
fmt.Println("B:", i)
}
}
func main() {
runtime.GOMAXPROCS(2)
go a()
go b()
time.Sleep(time.Second)
}
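3. channel
Go can exchange data through shared memory, but memory shared between goroutines is prone to data races. To keep the exchange correct, the memory has to be protected with a mutex, which can hurt performance.
Go's concurrency model is CSP, which advocates sharing memory by communicating rather than communicating by sharing memory. A channel is the communication mechanism that lets one goroutine send a value to another goroutine.
3.1 Unbuffered channels
An unbuffered channel is also called a blocking channel. A send on an unbuffered channel blocks until a receiver is ready; if no goroutine can ever receive, every goroutine ends up blocked and the runtime reports a deadlock.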
func main() {
ch := make(chan int)
ch <- 10
fmt.Println("sent successfully")
}
//output
fatal error: all goroutines are asleep - deadlock!
goroutine 1 [chan send]:
main.main()
.../src/github.com/pprof/studygo/day06/channel02/main.go:8 +0x54
After adding a receiver:
func recv(c chan int) {
	ret := <-c
	fmt.Println("received", ret)
}
func main() {
	ch := make(chan int)
	go recv(ch) // start a goroutine that receives from the channel
	ch <- 10
	fmt.Println("sent successfully")
}
3.2 Buffered channels
A buffered channel is created by giving the channel a capacity:
func main() {
	ch := make(chan int, 1) // create a buffered channel with capacity 1
	ch <- 10
	fmt.Println("sent successfully")
}
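A buffered channel accepts sends without blocking only until its buffer is full. The sketch below makes that limit visible with a non-blocking send; the helper name fill is an assumption made for this illustration.

```go
package main

import "fmt"

// fill sends values into ch until its buffer is full and reports
// how many sends succeeded without blocking.
func fill(ch chan int) int {
	sent := 0
	for {
		select {
		case ch <- sent:
			sent++
		default: // buffer is full; a plain send would block here
			return sent
		}
	}
}

func main() {
	ch := make(chan int, 3)
	n := fill(ch)
	fmt.Println(n, len(ch), cap(ch)) // sends accepted, items queued, capacity
	fmt.Println(<-ch)                // values leave the buffer in FIFO order
}
```

len(ch) reports how many values are currently queued and cap(ch) reports the buffer size, so both can be used to observe how full the channel is.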
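Receiving values from a channel in a loop with for-range:
// channel exercise
func main() {
	ch1 := make(chan int)
	ch2 := make(chan int)
	// start a goroutine that sends the numbers 0 to 99 into ch1
	go func() {
		for i := 0; i < 100; i++ {
			ch1 <- i
		}
		close(ch1)
	}()
	// start a goroutine that receives from ch1 and sends the square of each value into ch2
	go func() {
		for {
			i, ok := <-ch1 // after the channel is closed and drained, ok is false
			if !ok {
				break
			}
			ch2 <- i * i
		}
		close(ch2)
	}()
	// receive from ch2 in the main goroutine and print each value
	for i := range ch2 { // the for-range loop exits once the channel is closed
		fmt.Println(i)
	}
}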
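3.3 Unidirectional channels
- chan<- int: a send-only channel; values can be sent but not received.
- <-chan int: a receive-only channel; values can be received but not sent.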
func counter(out chan<- int) {
for i := 0; i < 100; i++ {
out <- i
}
close(out)
}
func squarer(out chan<- int, in <-chan int) {
for i := range in {
out <- i * i
}
close(out)
}
func printer(in <-chan int) {
for i := range in {
fmt.Println(i)
}
}
func main() {
ch1 := make(chan int)
ch2 := make(chan int)
go counter(ch1)
go squarer(ch2, ch1)
printer(ch2)
}
3.4 Producer-consumer model
Create a worker pool and use channels to deliver jobs and collect results.
package main
import (
"fmt"
"math/rand"
)
type Job struct {
Id int
Randnum int
}
type Result struct {
job *Job
sum int
}
func main() {
// create the job and result channels
jobChannel := make(chan *Job, 100)
resultChannel := make(chan *Result, 100)
// create the worker pool
createWorkPool(64, jobChannel, resultChannel)
// start a goroutine that prints the results
go func(resultChannel chan *Result) {
// range over the result channel and print each result
for result := range resultChannel {
fmt.Printf("job id:%v randnum:%v result:%d\n", result.job.Id,
result.job.Randnum, result.sum)
}
}(resultChannel)
// generate random jobs and feed them to the worker pool (this loop never ends)
var id int
for {
id++
r_num := rand.Int()
job := &Job{
Id: id,
Randnum: r_num,
}
jobChannel <- job
}
}
// worker pool
func createWorkPool(num int, jobChannel chan *Job, resultChannel chan *Result) {
for i := 0; i < num; i++ {
go func(jobChannel chan *Job, resultChannel chan *Result) {
// process jobs
for job := range jobChannel {
// sum the decimal digits of the random number
r_num := job.Randnum
var sum int
for r_num != 0 {
sum += r_num % 10
r_num /= 10
}
// send the result back over the channel
r := &Result{
sum: sum,
job: job,
}
resultChannel <- r
}
}(jobChannel, resultChannel)
}
}
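The program above produces jobs forever. For a finite batch, the same pattern needs an orderly shutdown: close the job channel, wait for the workers, then close the result channel. The following is a minimal sketch under those assumptions; runPool and digitSum are hypothetical names, and results are returned in a map purely to keep the example small.

```go
package main

import (
	"fmt"
	"sync"
)

// digitSum returns the sum of the decimal digits of a non-negative n.
func digitSum(n int) int {
	sum := 0
	for n != 0 {
		sum += n % 10
		n /= 10
	}
	return sum
}

// runPool processes inputs with the given number of workers and
// returns a map from each input to its digit sum.
func runPool(workers int, inputs []int) map[int]int {
	jobs := make(chan int, len(inputs))
	results := make(chan [2]int, len(inputs)) // buffered so workers never block

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for n := range jobs { // ends when jobs is closed and drained
				results <- [2]int{n, digitSum(n)}
			}
		}()
	}
	for _, n := range inputs {
		jobs <- n
	}
	close(jobs)    // no more work will arrive
	wg.Wait()      // every worker has returned
	close(results) // safe: nobody sends on results any more

	out := make(map[int]int, len(inputs))
	for r := range results {
		out[r[0]] = r[1]
	}
	return out
}

func main() {
	fmt.Println(runPool(4, []int{123, 9999, 40}))
}
```

Closing the channel from the sending side is the design choice that lets every for-range loop terminate on its own.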
3.5 Timers and tickers
package main
import (
"fmt"
"time"
)
func main() {
//1. basic timer usage
//timer1 := time.NewTimer(2 * time.Second)
//t1 := time.Now()
//fmt.Printf("t1:%v\n", t1)
//t2 := <-timer1.C // continues after a two-second delay
//fmt.Printf("t2:%v\n", t2)
//2. a timer fires only once
//timer1 := time.NewTimer(2 * time.Second)
//for { // the second receive blocks forever and causes a deadlock
// <-timer1.C
// fmt.Println("time is up")
//}
//3. using a timer as a delay
//time.Sleep(time.Second)
//timer3 := time.NewTimer(2 * time.Second)
////fmt.Println(timer3)
//<-timer3.C
//fmt.Println("2 seconds elapsed")
//<-time.After(2 * time.Second) // delay of 2 seconds
//fmt.Println("2 seconds elapsed")
//4. stopping a timer
//timer4 := time.NewTimer(2 * time.Second)
//go func() {
// <-timer4.C
// fmt.Println("timer fired")
//}()
//b := timer4.Stop() // stop the timer
//if b {
// fmt.Println("timer4 has been stopped")
//}
//5. resetting a timer
//fmt.Println(time.Now())
//timer5 := time.NewTimer(5 * time.Second)
//timer5.Reset(2 * time.Second)
//fmt.Println(time.Now())
//fmt.Println(<-timer5.C)
//for {
//
//}
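	// a ticker fires repeatedly until it is stopped
	ticker := time.NewTicker(2 * time.Second)
	done := make(chan struct{})
	// child goroutine
	go func() {
		defer close(done)
		for i := 1; i <= 5; i++ {
			fmt.Println(<-ticker.C)
		}
		// Stop does not close ticker.C, so the loop must end on its own
		ticker.Stop()
	}()
	<-done // wait for the child goroutine instead of spinning in an empty loop
}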
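A common use of time.After is to bound how long a receive may wait. The sketch below combines it with select; waitResult is a hypothetical helper name, and the timeout values are arbitrary.

```go
package main

import (
	"fmt"
	"time"
)

// waitResult waits for a value on ch but gives up after timeout.
// The boolean result reports whether a value actually arrived.
func waitResult(ch <-chan int, timeout time.Duration) (int, bool) {
	select {
	case v := <-ch:
		return v, true
	case <-time.After(timeout): // fires once after the given duration
		return 0, false
	}
}

func main() {
	ch := make(chan int, 1)
	ch <- 42
	fmt.Println(waitResult(ch, 50*time.Millisecond)) // a value is ready
	fmt.Println(waitResult(ch, 50*time.Millisecond)) // nothing arrives, so this times out
}
```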
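4. Concurrent Data Handling
4.1 Write locks and read locks
Critical-section data that is accessed concurrently is protected with a mutex (sync.Mutex) or a read-write mutex (sync.RWMutex).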
package main
import (
"fmt"
"sync"
"time"
)
var (
x int64
wg sync.WaitGroup
lock sync.Mutex
rwlock sync.RWMutex
)
func read() { // take the read lock
rwlock.RLock()
time.Sleep(10 * time.Millisecond)
rwlock.RUnlock()
wg.Done()
}
func write() { // take the write lock
rwlock.Lock()
x = x + 1
//time.Sleep(10 * time.Millisecond)
rwlock.Unlock()
wg.Done()
}
func main() {
start := time.Now()
for i := 0; i < 4000; i++ {
wg.Add(1)
go write()
}
for i := 0; i < 1000; i++ {
wg.Add(1)
go read()
}
wg.Wait()
fmt.Println(x)
end := time.Now()
fmt.Println(end.Sub(start))
}
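In practice the lock usually lives next to the data it protects. The following sketch wraps a map in a struct guarded by sync.RWMutex; SafeCounter and its methods are hypothetical names chosen for this illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// SafeCounter is a map of counters guarded by a read-write mutex.
type SafeCounter struct {
	mu sync.RWMutex
	m  map[string]int
}

// NewSafeCounter returns an empty counter ready for use.
func NewSafeCounter() *SafeCounter {
	return &SafeCounter{m: make(map[string]int)}
}

// Inc takes the write lock because it modifies the map.
func (c *SafeCounter) Inc(key string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.m[key]++
}

// Get takes only the read lock, so many readers can proceed together.
func (c *SafeCounter) Get(key string) int {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return c.m[key]
}

func main() {
	c := NewSafeCounter()
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.Inc("hits")
		}()
	}
	wg.Wait()
	fmt.Println(c.Get("hits"))
}
```

A read-write mutex tends to pay off when reads far outnumber writes; with mostly writes, a plain sync.Mutex is usually sufficient.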
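4.2 The sync package
sync.WaitGroup
In Go, sync.WaitGroup synchronizes concurrent tasks. It maintains an internal counter: when N concurrent tasks are started the counter is increased by N, each task calls Done() when it finishes to decrease the counter by one, and Wait() blocks until the counter reaches zero.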
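The counter protocol can be sketched as follows. This is a minimal example, not taken from the original notes; runAll is a hypothetical name, and each goroutine writes only to its own slice element so no extra locking is needed.

```go
package main

import (
	"fmt"
	"sync"
)

// runAll runs n tasks concurrently and returns each task's result by index.
func runAll(n int) []int {
	results := make([]int, n)
	var wg sync.WaitGroup
	wg.Add(n) // the counter starts at n
	for i := 0; i < n; i++ {
		go func(idx int) {
			defer wg.Done()        // decrement the counter when the task ends
			results[idx] = idx * 2 // each goroutine touches only its own slot
		}(i)
	}
	wg.Wait() // returns once the counter is back to zero
	return results
}

func main() {
	fmt.Println(runAll(5))
}
```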
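sync.Once
Under high concurrency some operations must run exactly once, such as loading a configuration file once or closing a channel once. The Do method of sync.Once provides this guarantee. The example below loads a set of icons once: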
var icons map[string]image.Image
var loadIconsOnce sync.Once
func loadIcons() {
icons = map[string]image.Image{
"left": loadIcon("left.png"),
"up": loadIcon("up.png"),
"right": loadIcon("right.png"),
"down": loadIcon("down.png"),
}
}
// Icon is safe for concurrent use
func Icon(name string) image.Image {
loadIconsOnce.Do(loadIcons) // Do runs loadIcons only once, even under concurrent calls
return icons[name]
}
Next, look at the source of sync.Once (this listing matches older Go releases; details may differ in newer ones):
// sync.Once source
type Once struct {
// done indicates whether the action has been performed.
// It is first in the struct because it is used in the hot path.
// The hot path is inlined at every call site.
// Placing done first allows more compact instructions on some architectures (amd64/386),
// and fewer instructions (to calculate offset) on other architectures.
done uint32 // 0 means not yet executed, 1 means executed
m Mutex // mutex
}
// Do calls the function f if and only if Do is being called for the
// first time for this instance of Once. In other words, given
// var once Once
// if once.Do(f) is called multiple times, only the first call will invoke f,
// even if f has a different value in each invocation. A new instance of
// Once is required for each function to execute.
//
// Do is intended for initialization that must be run exactly once. Since f
// is niladic, it may be necessary to use a function literal to capture the
// arguments to a function to be invoked by Do:
// config.once.Do(func() { config.init(filename) })
//
// Because no call to Do returns until the one call to f returns, if f causes
// Do to be called, it will deadlock.
//
// If f panics, Do considers it to have returned; future calls of Do return
// without calling f.
//
func (o *Once) Do(f func()) {
// Note: Here is an incorrect implementation of Do:
//
// if atomic.CompareAndSwapUint32(&o.done, 0, 1) {
// f()
// }
//
// Do guarantees that when it returns, f has finished.
// This implementation would not implement that guarantee:
// given two simultaneous calls, the winner of the cas would
// call f, and the second would return immediately, without
// waiting for the first's call to f to complete.
// This is why the slow path falls back to a mutex, and why
// the atomic.StoreUint32 must be delayed until after f returns.
// atomically read done to check whether f has already run
if atomic.LoadUint32(&o.done) == 0 {
// Outlined slow-path to allow inlining of the fast-path.
o.doSlow(f)
}
}
func (o *Once) doSlow(f func()) {
o.m.Lock() // acquire the mutex
defer o.m.Unlock() // release it before the function returns
if o.done == 0 {
defer atomic.StoreUint32(&o.done, 1) // set done to 1 after f has run
f() // run the given function
}
}
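The effect of this fast path plus mutex design can be observed from the outside. In the sketch below many goroutines race to call Do, yet the function runs once; initCount is a hypothetical helper written for this illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// initCount calls once.Do from n goroutines and returns how many
// times the initialization function actually ran.
func initCount(n int) int {
	var (
		once  sync.Once
		wg    sync.WaitGroup
		calls int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Only the first caller executes the function; the others
			// wait until it has returned and then skip it.
			once.Do(func() { calls++ })
		}()
	}
	wg.Wait()
	return calls
}

func main() {
	fmt.Println(initCount(50))
}
```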
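4.3 The atomic package
For simple variables a mutex costs relatively more: under contention, locking can park and reschedule the waiting goroutine, which is comparatively expensive. For basic data types, concurrency safety can instead be achieved with atomic operations, which complete entirely in user space.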
// SwapInt32 atomically stores new into *addr and returns the previous *addr value.
func SwapInt32(addr *int32, new int32) (old int32)
// SwapInt64 atomically stores new into *addr and returns the previous *addr value.
func SwapInt64(addr *int64, new int64) (old int64)
// SwapUint32 atomically stores new into *addr and returns the previous *addr value.
func SwapUint32(addr *uint32, new uint32) (old uint32)
// SwapUint64 atomically stores new into *addr and returns the previous *addr value.
func SwapUint64(addr *uint64, new uint64) (old uint64)
// SwapUintptr atomically stores new into *addr and returns the previous *addr value.
func SwapUintptr(addr *uintptr, new uintptr) (old uintptr)
// SwapPointer atomically stores new into *addr and returns the previous *addr value.
func SwapPointer(addr *unsafe.Pointer, new unsafe.Pointer) (old unsafe.Pointer)
// CompareAndSwapInt32 executes the compare-and-swap operation for an int32 value.
func CompareAndSwapInt32(addr *int32, old, new int32) (swapped bool)
// CompareAndSwapInt64 executes the compare-and-swap operation for an int64 value.
func CompareAndSwapInt64(addr *int64, old, new int64) (swapped bool)
// CompareAndSwapUint32 executes the compare-and-swap operation for a uint32 value.
func CompareAndSwapUint32(addr *uint32, old, new uint32) (swapped bool)
// CompareAndSwapUint64 executes the compare-and-swap operation for a uint64 value.
func CompareAndSwapUint64(addr *uint64, old, new uint64) (swapped bool)
// CompareAndSwapUintptr executes the compare-and-swap operation for a uintptr value.
func CompareAndSwapUintptr(addr *uintptr, old, new uintptr) (swapped bool)
// CompareAndSwapPointer executes the compare-and-swap operation for a unsafe.Pointer value.
func CompareAndSwapPointer(addr *unsafe.Pointer, old, new unsafe.Pointer) (swapped bool)
// AddInt32 atomically adds delta to *addr and returns the new value.
func AddInt32(addr *int32, delta int32) (new int32)
// AddUint32 atomically adds delta to *addr and returns the new value.
// To subtract a signed positive constant value c from x, do AddUint32(&x, ^uint32(c-1)).
// In particular, to decrement x, do AddUint32(&x, ^uint32(0)).
func AddUint32(addr *uint32, delta uint32) (new uint32)
// AddInt64 atomically adds delta to *addr and returns the new value.
func AddInt64(addr *int64, delta int64) (new int64)
// AddUint64 atomically adds delta to *addr and returns the new value.
// To subtract a signed positive constant value c from x, do AddUint64(&x, ^uint64(c-1)).
// In particular, to decrement x, do AddUint64(&x, ^uint64(0)).
func AddUint64(addr *uint64, delta uint64) (new uint64)
// AddUintptr atomically adds delta to *addr and returns the new value.
func AddUintptr(addr *uintptr, delta uintptr) (new uintptr)
// LoadInt32 atomically loads *addr.
func LoadInt32(addr *int32) (val int32)
// LoadInt64 atomically loads *addr.
func LoadInt64(addr *int64) (val int64)
// LoadUint32 atomically loads *addr.
func LoadUint32(addr *uint32) (val uint32)
// LoadUint64 atomically loads *addr.
func LoadUint64(addr *uint64) (val uint64)
// LoadUintptr atomically loads *addr.
func LoadUintptr(addr *uintptr) (val uintptr)
// LoadPointer atomically loads *addr.
func LoadPointer(addr *unsafe.Pointer) (val unsafe.Pointer)
// StoreInt32 atomically stores val into *addr.
func StoreInt32(addr *int32, val int32)
// StoreInt64 atomically stores val into *addr.
func StoreInt64(addr *int64, val int64)
// StoreUint32 atomically stores val into *addr.
func StoreUint32(addr *uint32, val uint32)
// StoreUint64 atomically stores val into *addr.
func StoreUint64(addr *uint64, val uint64)
// StoreUintptr atomically stores val into *addr.
func StoreUintptr(addr *uintptr, val uintptr)
// StorePointer atomically stores val into *addr.
func StorePointer(addr *unsafe.Pointer, val unsafe.Pointer)
Using atomic for concurrency-safe reads and writes of data:
func TestAtomic(t *testing.T) {
var shareBufPtr unsafe.Pointer
writeDataFn := func() {
data := []int{}
for i := 0; i < 100; i++ {
data = append(data, i)
}
atomic.StorePointer(&shareBufPtr, unsafe.Pointer(&data)) // atomically publish the address of the newly written data
//fmt.Printf("write end.....:%x\n", unsafe.Pointer(&data))
}
readDataFn := func() {
data := atomic.LoadPointer(&shareBufPtr) // atomically load the address of the latest data
fmt.Println(data, *(*[]int)(data))
}
writeDataFn()
var wg sync.WaitGroup
for i := 0; i < 5; i++ {
wg.Add(1) // 5 writer goroutines are started in total
go func() {
for i := 0; i < 5; i++ { // each goroutine writes 5 times
writeDataFn()
time.Sleep(time.Millisecond * 100)
}
wg.Done()
}()
wg.Add(1) // 5 reader goroutines are started in total
go func() {
for i := 0; i < 5; i++ { // each goroutine reads 5 times
readDataFn()
time.Sleep(time.Millisecond * 100)
}
wg.Done()
}()
}
wg.Wait()
}
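The output shows that although some reads return the same data more than once, the readers never interfere with the writers.
Performance comparison between the mutex approach and the atomic approach: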
package main
import (
"fmt"
"sync"
"sync/atomic"
"time"
)
var x int64
var l sync.Mutex
var wg sync.WaitGroup
// mutex version
func mutexAdd() {
l.Lock()
x++
l.Unlock()
wg.Done()
}
// atomic version
func atomicAdd() {
atomic.AddInt64(&x, 1)
wg.Done()
}
func main() {
start := time.Now()
for i := 0; i < 10000; i++ {
wg.Add(1)
//go atomicAdd()
go mutexAdd()
}
wg.Wait()
end := time.Now()
fmt.Println(x)
fmt.Println(end.Sub(start))
}
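Besides Add, the compare-and-swap functions listed earlier allow lock-free updates through a retry loop. The following sketch re-implements an add on top of CompareAndSwapInt64 purely for illustration (atomic.AddInt64 is the direct way to do this); casAdd and countTo are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// casAdd adds delta to *addr with a compare-and-swap retry loop
// and returns the new value.
func casAdd(addr *int64, delta int64) int64 {
	for {
		old := atomic.LoadInt64(addr)
		if atomic.CompareAndSwapInt64(addr, old, old+delta) {
			return old + delta
		}
		// Another goroutine changed the value in between; try again.
	}
}

// countTo starts n goroutines that each add 1 and returns the final total.
func countTo(n int) int64 {
	var x int64
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			casAdd(&x, 1)
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&x)
}

func main() {
	fmt.Println(countTo(10000))
}
```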
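5. GMP Principles
6. Concurrency Case Study
This section shows a concurrent crawler built with channels and goroutines. In practice, buffered channels can hold the data being passed around, and sync.WaitGroup can synchronize the concurrent tasks.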
package main
import (
"fmt"
"io/ioutil"
"net/http"
"regexp"
"strconv"
"strings"
"sync"
"time"
)
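/**
Crawl website images concurrently
Approach: 1. initialize the data channels
2. crawlers: 26 goroutines push image links into the channel
3. task-counting goroutine: checks whether all 26 tasks are done and then closes the data channel
4. downloaders: read links from the channel and download them
*/
var (
chanImgUrls chan string // channel of image URLs
wg sync.WaitGroup
chanTask chan string // channel of finished tasks
rexImg = `/uploads/allimg/[^"]+?(\.((jpg)|(png)|(jpeg)|(gif)|(bmp)))` // regular expression for image URLs
)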
// error handling with a closure
func Try(f func(), handler func(interface{})) {
defer func() {
if err := recover(); err != nil {
handler(err)
}
}()
f()
}
// collect all image URLs on a page
func getImgUrls(webUrl string) (imgUrls []string) {
	Try(func() {
		// 1. read the page content
		resp, err := http.Get(webUrl)
		if err != nil {
			panic(err) // recovered and reported by Try
		}
		defer resp.Body.Close()
		pageBytes, err := ioutil.ReadAll(resp.Body)
		if err != nil {
			panic(err)
		}
		pageStr := string(pageBytes)
		// 2. extract the image URLs
		regex := regexp.MustCompile(rexImg)
		urlStrs := regex.FindAllStringSubmatch(pageStr, -1) // note: this returns a two-dimensional slice
		fmt.Printf("found %d results in total\n", len(urlStrs))
		for _, match := range urlStrs {
			imgUrls = append(imgUrls, "https://pic.netbian.com"+match[0])
		}
	}, func(err interface{}) {
		fmt.Println("crawl pic urls: ", err)
	})
	return
}
// crawler goroutines push image links into the channel
func sendTask(webUrl string) {
	imgUrls := getImgUrls(webUrl)
	// feed the image URLs into the channel
	for _, url := range imgUrls {
		chanImgUrls <- url
	}
	// all URLs sent: mark this page's task as finished
	chanTask <- webUrl
	wg.Done()
}
// check whether all crawl tasks have finished
func checkTaskOk() {
var count int
for {
task := <-chanTask
fmt.Printf("完成了%s的图片URL爬取任务\n", task)
count++
if count == 26 {
close(chanImgUrls)
break
}
}
close(chanTask)
wg.Done()
}
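// download images
func downloadPic() {
	defer wg.Done() // always signal completion, even if a download panics
	Try(func() {
		for url := range chanImgUrls {
			picName := getPicName(url)
			resp, err := http.Get(url)
			if err != nil {
				fmt.Println("downloadPic: ", err)
				continue
			}
			byteArray, err := ioutil.ReadAll(resp.Body)
			resp.Body.Close() // release the connection before the next download
			if err != nil {
				fmt.Println("downloadPic: ", err)
				continue
			}
			fileName := "D:\\其它\\crawl\\gotest\\" + picName
			if err := ioutil.WriteFile(fileName, byteArray, 0664); err != nil {
				fmt.Println("downloadPic: ", err)
				continue
			}
			fmt.Printf("%s downloaded\n", fileName)
		}
	}, func(err interface{}) {
		fmt.Println("downloadPic: ", err)
	})
}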
// build a new, timestamp-prefixed file name for the image
func getPicName(url string) string {
index := strings.LastIndex(url, "/")
filename := url[index+1:]
timePrefix := strconv.Itoa(int(time.Now().UnixNano()))
filename = timePrefix + "_" + filename
return filename
}
func main() {
chanImgUrls = make(chan string, 1000000)
chanTask = make(chan string, 26)
// 1. crawler goroutines
for i := 2; i <= 27; i++ {
wg.Add(1)
go sendTask("https://pic.netbian.com/4kfengjing/index_" + strconv.Itoa(i) + ".html")
}
// 2. task-counting goroutine
wg.Add(1)
go checkTaskOk()
// 3. download goroutines
for i := 0; i < 5; i++ {
wg.Add(1)
go downloadPic()
}
wg.Wait()
}
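7. Pipe-Filter Framework Architecture
7.1 Characteristics
This architecture suits data-processing and data-analysis systems. Filters are loosely coupled to the data and are connected to one another by pipes. If the filters live in a distributed system, they are invoked over network connections; if they process data asynchronously, a buffer is needed to hold data between them; if they are called within one process, plain function calls suffice.
7.2 Example
The task is to use the pipe-filter architecture to parse the string "1,2,3" and compute the sum of its numbers:
Design: to decouple the data from the filters, Filter is an interface whose method processes a Request and hands its result to the next Filter. Filters are connected by a Pipeline. The Pipeline struct holds the collection of all Filters and also implements the Filter interface's Process method: it takes the initial Request, calls each Filter in turn, and finally returns the Response.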
//filter.go
type Response interface{}
type Request interface{}
type Filter interface {
Process(data Request) (Response, error)
}
//split_filter.go
/**
Filter that splits a string
*/
var SplitFilterWrongFormatError error = errors.New("input data has the wrong format for splitting")
type SplitFilter struct {
delimiter string
}
func NewSplitFilter(delimiter string) *SplitFilter {
return &SplitFilter{delimiter: delimiter}
}
func (sf *SplitFilter) Process(data Request) (Response, error) {
// check the data type first
str, ok := data.(string)
if !ok {
return nil, SplitFilterWrongFormatError
}
parts := strings.Split(str, sf.delimiter)
return parts, nil
}
//to_int_filter.go
/**
Filter that converts a string slice into an int slice
*/
var ToIntFilterWrongFormat error = errors.New("input string slice has the wrong format for conversion")
type ToIntFilter struct {
}
func NewToIntFilter() *ToIntFilter {
return &ToIntFilter{}
}
func (tif *ToIntFilter) Process(data Request) (Response, error) {
parts, ok := data.([]string)
if !ok {
return nil, ToIntFilterWrongFormat
}
ret := []int{}
for _, part := range parts {
num, err := strconv.Atoi(part)
if err != nil {
return nil, err
}
ret = append(ret, num)
}
return ret, nil
}
//sum_filter.go
/**
Filter that sums the values of an int slice
*/
var SumFilterWrongFormat error = errors.New("input int slice has the wrong format for summing")
type SumFilter struct {
}
func NewSumFilter() *SumFilter {
return &SumFilter{}
}
func (sf *SumFilter) Process(data Request) (Response, error) {
parts, ok := data.([]int)
if !ok {
return nil, SumFilterWrongFormat
}
res := 0
for _, part := range parts {
res += part
}
return res, nil
}
//straight_pipeline.go
/**
Pipeline: collects the Filters and runs them in order
*/
type StraightFilterPipeline struct {
Name string
Filters *[]Filter
}
func NewStraightFilterPipeline(name string, filters ...Filter) *StraightFilterPipeline {
return &StraightFilterPipeline{
Name: name,
Filters: &filters,
}
}
func (sfp *StraightFilterPipeline) Process(data Request) (Response, error) {
	var ret interface{}
	var err error
	for _, filter := range *sfp.Filters {
		ret, err = filter.Process(data)
		if err != nil {
			return ret, err // propagate the error instead of discarding it
		}
		data = ret // the output of one filter is the input of the next
	}
	return ret, nil
}
// final test
func TestPipelineFilters(t *testing.T) {
str := "1,2,3"
splitFilter := testContext.NewSplitFilter(",")
toIntFilter := testContext.NewToIntFilter()
sumFilter := testContext.NewSumFilter()
straightPipeline := testContext.NewStraightFilterPipeline("sp", splitFilter, toIntFilter, sumFilter)
ret, err := straightPipeline.Process(str)
if err != nil {
t.Fatal(err)
}
if ret != 6 {
t.Fatalf("The expected is 6, but the result is %d\n", ret)
}
}
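8. Micro-Kernel Architecture
8.1 Characteristics
It applies when:
- the kernel contains the common workflow or general-purpose logic;
- the variable or extensible parts are designed as extension points;
- the behavior of each extension point is abstracted into an interface.
8.2 Example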
//agent.go
package testMicroKernel
import (
"context"
"errors"
"fmt"
"strings"
"sync"
)
const (
Waiting = 0
Running = 1
)
var WrongStateError = errors.New("can not take the operation in the current state")
// error isolation
type CollectorsError struct {
CollectorErrors []error
}
// return all error messages joined together
func (ce CollectorsError) Error() string {
var strs []string
for _, err := range ce.CollectorErrors {
strs = append(strs, err.Error())
}
return strings.Join(strs, ";")
}
// event struct
type Event struct {
Source string
Content string
}
// interface for delivering events
type EventReceiver interface {
OnEvent(evt Event)
}
// Collector interface
type Collector interface {
Init(evtReceiver EventReceiver) error
Start(agtCtx context.Context) error
Stop() error
Destory() error
}
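// the agent (kernel)
type Agent struct {
collectors map[string]Collector // registered Collectors, keyed by name
evtBuf chan Event // channel that receives every Event
ctx context.Context // context
cancel context.CancelFunc // function that cancels the context
state int // current state of the agent
}
// create an Agent with the given event-channel capacity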
func NewAgent(sizeEvtBuf int) *Agent {
agt := Agent{
collectors: map[string]Collector{},
evtBuf: make(chan Event, sizeEvtBuf),
state: Waiting,
}
return &agt
}
// start the agent
func (agt *Agent) Start() error {
if agt.state != Waiting {
return WrongStateError
}
agt.state = Running
agt.ctx, agt.cancel = context.WithCancel(context.Background())
go agt.EventProcessGoroutine() // process events in a separate goroutine
return agt.StartCollectors()
}
// stop the agent
func (agt *Agent) Stop() error {
if agt.state != Running {
return WrongStateError
}
agt.state = Waiting
agt.cancel()
return agt.StopCollectors()
}
// destroy the agent
func (agt *Agent) Destory() error {
if agt.state != Waiting {
return WrongStateError
}
return agt.DestoryCollectors()
}
// event-processing loop
func (agt *Agent) EventProcessGoroutine() {
var evtSeg [10]Event // print once for every 10 received Events
for {
for i := 0; i < 10; i++ {
select {
case evtSeg[i] = <-agt.evtBuf:
case <-agt.ctx.Done():
return
}
}
fmt.Println(evtSeg)
}
}
// deliver an event to the Agent
func (agt *Agent) OnEvent(evt Event) {
agt.evtBuf <- evt
}
// register a Collector
func (agt *Agent) RegisterCollector(name string, collector Collector) error {
if agt.state != Waiting {
return WrongStateError
}
agt.collectors[name] = collector
return collector.Init(agt)
}
// start all Collectors
func (agt *Agent) StartCollectors() error {
	var errs CollectorsError
	var mutex sync.Mutex // protects errs.CollectorErrors
	for name, collector := range agt.collectors {
		// one goroutine per Collector; Start may block until the context is cancelled
		go func(name string, collector Collector, ctx context.Context) {
			err := collector.Start(ctx) // local err, so goroutines do not share it
			if err == nil {
				return
			}
			mutex.Lock() // lock first, then defer the unlock
			defer mutex.Unlock()
			errs.CollectorErrors = append(errs.CollectorErrors, errors.New(name+":"+err.Error()))
		}(name, collector, agt.ctx)
	}
	// errors reported after this point are not part of the returned value
	mutex.Lock()
	defer mutex.Unlock()
	return errs
}
// stop all Collectors
func (agt *Agent) StopCollectors() error {
var err error
var errs CollectorsError
for name, collector := range agt.collectors {
if err = collector.Stop(); err != nil {
errs.CollectorErrors = append(errs.CollectorErrors, errors.New(name+":"+err.Error()))
}
}
if len(errs.CollectorErrors) == 0 {
return nil
}
return errs
}
func (agt *Agent) DestoryCollectors() error {
var err error
var errs CollectorsError
for name, collector := range agt.collectors {
if err = collector.Destory(); err != nil {
errs.CollectorErrors = append(errs.CollectorErrors, errors.New(name+":"+err.Error()))
}
}
if len(errs.CollectorErrors) == 0 {
return nil
}
return errs
}
//agent_test.go
package testMicroKernel
import (
"context"
"errors"
"fmt"
"testing"
"time"
)
// example Collector
type DemoCollector struct {
evtReceiver EventReceiver
agtCtx context.Context
stopChan chan struct{}
name string
content string
}
func NewDemoCollector(name string, content string) *DemoCollector {
return &DemoCollector{
name: name,
content: content,
stopChan: make(chan struct{}),
}
}
// initialize
func (c *DemoCollector) Init(evtReceiver EventReceiver) error {
fmt.Println("initialize collector", c.name)
c.evtReceiver = evtReceiver
return nil
}
// start running
func (c *DemoCollector) Start(agtCtx context.Context) error {
	fmt.Println("start collector", c.name)
	for {
		select {
		case <-agtCtx.Done(): // when the agent stops, every Collector stops
			c.stopChan <- struct{}{}
			return nil // a bare break would only leave the select, not the loop
		default: // each Collector does its own work
			time.Sleep(time.Millisecond * 50)
			c.evtReceiver.OnEvent(Event{c.name, c.content})
		}
	}
}
// stop running
func (c *DemoCollector) Stop() error {
fmt.Println("stop collector", c.name)
select {
case <-c.stopChan:
return nil
case <-time.After(1 * time.Second):
return errors.New("failed to stop for timeout")
}
}
// destroy
func (c *DemoCollector) Destory() error {
fmt.Println(c.name, "Release resources")
return nil
}
func TestAgent(t *testing.T) {
agt := NewAgent(100)
c1 := NewDemoCollector("c1", "1")
c2 := NewDemoCollector("c2", "2")
agt.RegisterCollector("c1", c1)
agt.RegisterCollector("c2", c2)
err := (agt.Start()).(CollectorsError)
if len(err.CollectorErrors) > 0 {
for _, errItem := range err.CollectorErrors {
fmt.Printf("start error %v\n", errItem)
}
}
fmt.Println("start Agent again:", agt.Start())
time.Sleep(1 * time.Second)
agt.Stop()
agt.Destory()
}
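9. JSON Parsing
9.1 Go's built-in JSON package
Go ships its own JSON encoding and decoding functions, but the implementation relies heavily on reflection, and that much reflection can hurt performance under high concurrency. It is therefore worth considering a dedicated set of JSON parsing functions instead.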
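For reference, the reflection-based standard library path looks like this. The sketch uses only encoding/json; the Profile struct and the roundTrip helper are hypothetical and exist only for this illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Profile is an illustrative struct; the tags control the JSON field names.
type Profile struct {
	Name   string   `json:"name"`
	Age    int      `json:"age"`
	Skills []string `json:"skills,omitempty"` // left out of the output when empty
}

// roundTrip decodes a JSON document into a Profile and encodes it again.
// Both steps inspect the struct through reflection at run time.
func roundTrip(in string) (Profile, string, error) {
	var p Profile
	if err := json.Unmarshal([]byte(in), &p); err != nil {
		return p, "", err
	}
	out, err := json.Marshal(p)
	return p, string(out), err
}

func main() {
	p, out, err := roundTrip(`{"name":"Mike","age":12}`)
	fmt.Println(p.Name, p.Age, out, err)
}
```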
9.2 EasyJson
Installing the tool:
# for Go < 1.17
go get -u github.com/mailru/easyjson/...
# for Go >= 1.17
go get github.com/mailru/easyjson
go install github.com/mailru/easyjson/...@latest
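After installation, the easyjson executable (easyjson.exe on Windows) appears in the $GOPATH/bin directory.
To use it, run: easyjson -all <file name>.go
This generates a file containing the UnmarshalJSON() and MarshalJSON() functions for JSON parsing.
The following options are available:
-lower_camel_case: lower-case the first letter of struct field names, e.g. Name => name.
-build_tags string: add the given build tags to the head of the generated Go file.
-no_std_marshalers: do not generate MarshalJSON/UnmarshalJSON functions for the structs.
-omit_empty: leave unset fields out of the JSON; otherwise such a field is written with the zero value of its type.
-output_filename: set the name of the generated file.
-pkg: generate easyjson code for the structs in the package that are marked with `//easyjson:json`.
-snake_case: use snake_case field names, e.g. `Name_Student` becomes `name_student`.
Usage and performance test: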
// structs to be parsed or converted
package testJSON
type BasicInfo struct {
Name string `json:"name"`
Age int `json:"age"`
}
type JobInfo struct {
Skills []string `json:"skills"`
}
type Employee struct {
BasicInfo BasicInfo `json:"basic_info"`
JobInfo JobInfo `json:"job_info"`
}
Run: easyjson -all struct_def.go
This generates struct_def_easyjson.go alongside struct_def.go.
//json_test.go
package testJSON
import (
"encoding/json"
"fmt"
"testing"
)
var jsonStr = `{"basic_info":{"name":"Mike","age":12},"job_info":{"skills":["Java","Go","C"]}}`
func TestEasyJson(t *testing.T) {
e := Employee{}
e.UnmarshalJSON([]byte(jsonStr))
fmt.Println(e)
if v, err := e.MarshalJSON(); err != nil {
t.Error(err)
} else {
fmt.Println(string(v))
}
}
func BenchmarkEmbeddedJson(b *testing.B) { // uses Go's built-in JSON functions
b.ResetTimer()
e := new(Employee)
for i := 0; i < b.N; i++ {
err := json.Unmarshal([]byte(jsonStr), e)
if err != nil {
b.Error(err)
}
if _, err = json.Marshal(e); err != nil {
b.Error(err)
}
}
}
func BenchmarkEasyJson(b *testing.B) { // uses the JSON functions generated by EasyJson
b.ResetTimer()
e := Employee{}
for i := 0; i < b.N; i++ {
err := e.UnmarshalJSON([]byte(jsonStr))
if err != nil {
b.Error(err)
}
if _, err = e.MarshalJSON(); err != nil {
b.Error(err)
}
}
}