13_Concurrent Programming

1. What Is a Goroutine

In Java or C++, when we want to write concurrent code we usually have to maintain a thread pool ourselves, wrap each piece of work into a task by hand, schedule threads to run those tasks, and manage the context switching between them, all of which costs programmers a great deal of mental effort.

Go's goroutine is the mechanism that lifts this burden. A goroutine is conceptually similar to a thread, but goroutines are scheduled and managed by the Go runtime, which intelligently distributes the work in goroutines across the available CPUs.

An OS thread usually has a fixed-size stack (typically 2MB). A goroutine, by contrast, starts its life with a very small stack (typically 2KB), and that stack is not fixed: it grows and shrinks on demand, with an upper limit that can reach 1GB.
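
A minimal sketch of launching goroutines with the go keyword (the hello function and the closing time.Sleep are only illustrative; sync.WaitGroup, covered in section 4.2, is the cleaner way to wait):

package main

import (
	"fmt"
	"time"
)

// hello runs concurrently with main once launched via the go keyword;
// each goroutine starts with a tiny stack that grows on demand.
func hello(id int) {
	fmt.Println("hello from goroutine", id)
}

func main() {
	for i := 0; i < 5; i++ {
		go hello(i) // launch a new goroutine; main does not wait for it
	}
	time.Sleep(time.Second) // crude way to keep main alive long enough to see the output
}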

2. Goroutine Scheduling

GPM is implemented at the level of the Go runtime; it is Go's own scheduling system, distinct from the operating system scheduling OS threads.

  • 1. G is easy to understand: it is a goroutine. Besides the goroutine's own information it also stores things like its binding to the P it runs on.
  • 2. P manages a queue of goroutines and stores the running context of the current goroutine (function pointer, stack address and stack bounds). P performs some scheduling over the queue it manages (for example pausing a goroutine that has held the CPU for a long time and running the next one). When its own queue is drained it fetches work from the global queue, and when the global queue is empty as well it steals tasks from other Ps' queues.
  • 3. M (machine) is the Go runtime's abstraction of an OS kernel thread. An M generally maps one-to-one to a kernel thread, and a goroutine ultimately has to be placed on an M to run.

P and M also generally correspond one-to-one. Their relationship is: a P manages a group of Gs and is mounted on an M to run. When a G blocks on an M for a long time, the runtime creates a new M, and the P that owns the blocked G mounts its remaining Gs on the new M. When the old G finishes blocking, or is considered dead, the old M is reclaimed.

The number of Ps is set with runtime.GOMAXPROCS (at most 256); since Go 1.5 it defaults to the number of CPU cores. Under heavy concurrency the runtime creates some extra Ps and Ms, but not many, because switching too often would cost more than it gains. In a program you can call runtime.GOMAXPROCS() to set how many logical CPU cores it uses when running concurrently.

Purely in terms of thread scheduling, Go's advantage over other languages is that OS threads are scheduled by the OS kernel, whereas goroutines are scheduled by the Go runtime's own scheduler, which uses a technique called m:n scheduling (multiplexing/scheduling m goroutines onto n OS threads). A key point is that goroutine scheduling happens entirely in user space, with no frequent switching between kernel mode and user mode; even memory allocation and freeing are served from a large memory pool maintained in user space, without calling the system's malloc directly (unless the pool has to change), so the cost is far lower than scheduling OS threads. The scheduler also makes full use of multi-core hardware by spreading goroutines roughly evenly across the physical threads, and goroutines themselves are extremely lightweight; all of this underpins the performance of Go's scheduling.

The relationship between OS threads and goroutines in Go:

  • one OS thread corresponds to many user-space goroutines;
  • a Go program can use several OS threads at the same time;
  • goroutines and OS threads form an m:n (many-to-many) relationship: m goroutines are scheduled onto n OS threads.
//With GOMAXPROCS(1) even a multi-core machine handles the two tasks on a single logical core: one task has to finish before the other one runs
func a() {
    for i := 1; i < 10; i++ {
        fmt.Println("A:", i)
    }
}

func b() {
    for i := 1; i < 10; i++ {
        fmt.Println("B:", i)
    }
}

func main() {
    runtime.GOMAXPROCS(1)
    go a()
    go b()
    time.Sleep(time.Second)
}  


//With two logical cores, the two tasks are processed at the same time
func a() {
    for i := 1; i < 10; i++ {
        fmt.Println("A:", i)
    }
}

func b() {
    for i := 1; i < 10; i++ {
        fmt.Println("B:", i)
    }
}

func main() {
    runtime.GOMAXPROCS(2)
    go a()
    go b()
    time.Sleep(time.Second)
}  

3. Channels

Go can exchange data between goroutines through shared memory, but shared memory easily leads to data races between goroutines. To keep the exchanged data correct the memory has to be protected with mutexes, which hurts performance.

Go's concurrency model is CSP, which advocates sharing memory by communicating instead of communicating by sharing memory. A channel is the communication mechanism that lets one goroutine send a specific value to another goroutine.

3.1 Unbuffered Channels

An unbuffered channel is also called a blocking channel: a send on it can only complete while someone is receiving the value; otherwise the send blocks, and if no receiver ever appears the program deadlocks.

func main() {
    ch := make(chan int)
    ch <- 10
    fmt.Println("发送成功")
}   

//output
fatal error: all goroutines are asleep - deadlock!

goroutine 1 [chan send]:
main.main()
.../src/github.com/pprof/studygo/day06/channel02/main.go:8 +0x54   

After adding a receiver:

func recv(c chan int) {
    ret := <-c
    fmt.Println("received", ret)
}
func main() {
    ch := make(chan int)
    go recv(ch) // start a goroutine that receives from the channel
    ch <- 10
    fmt.Println("send succeeded")
}

3.2 Buffered Channels

A buffered channel is created by giving the channel a capacity:

func main() {
    ch := make(chan int, 1) // create a buffered channel with capacity 1
    ch <- 10
    fmt.Println("send succeeded")
}

Use for-range to receive values from a channel in a loop:

// channel exercise
func main() {
    ch1 := make(chan int)
    ch2 := make(chan int)
    // start a goroutine that sends the numbers 0..99 into ch1
    go func() {
        for i := 0; i < 100; i++ {
            ch1 <- i
        }
        close(ch1)
    }()
    // start a goroutine that receives from ch1 and sends the square of each value into ch2
    go func() {
        for {
            i, ok := <-ch1 // once the channel is closed, further receives report ok=false
            if !ok {
                break
            }
            ch2 <- i * i
        }
        close(ch2)
    }()
    // receive from ch2 in the main goroutine and print the values
    for i := range ch2 { // the for range loop exits when the channel is closed
        fmt.Println(i)
    }
}

3.3 Unidirectional Channels

  • chan<- int: a send-only channel; it can be sent to but not received from
  • <-chan int: a receive-only channel; it can be received from but not sent to
func counter(out chan<- int) {
    for i := 0; i < 100; i++ {
        out <- i
    }
    close(out)
}

func squarer(out chan<- int, in <-chan int) {
    for i := range in {
        out <- i * i
    }
    close(out)
}
func printer(in <-chan int) {
    for i := range in {
        fmt.Println(i)
    }
}

func main() {
    ch1 := make(chan int)
    ch2 := make(chan int)
    go counter(ch1)
    go squarer(ch2, ch1)
    printer(ch2)
}   

3.4 Producer-Consumer Model

Create a worker pool and use channels to hand out jobs and collect the results.

package main

import (
	"fmt"
	"math/rand"
)

type Job struct {
	Id      int
	Randnum int
}

type Result struct {
	job *Job
	sum int
}

func main() {
	// create the job channel and the result channel
	jobChannel := make(chan *Job, 100)
	resultChannel := make(chan *Result, 100)
	// create the worker pool
	createWorkPool(64, jobChannel, resultChannel)
	// start a goroutine that prints the results
	go func(resultChannel chan *Result) {
		// range over the result channel and print each result
		for result := range resultChannel {
			fmt.Printf("job id:%v randnum:%v result:%d\n", result.job.Id,
				result.job.Randnum, result.sum)
		}
	}(resultChannel)
	// keep generating random jobs and feeding them into the pool
	var id int
	for {
		id++
		r_num := rand.Int()
		job := &Job{
			Id:      id,
			Randnum: r_num,
		}
		jobChannel <- job
	}
}

// createWorkPool starts num worker goroutines that consume jobs and emit results
func createWorkPool(num int, jobChannel chan *Job, resultChannel chan *Result) {
	for i := 0; i < num; i++ {
		go func(jobChannel chan *Job, resultChannel chan *Result) {
			// do the work
			for job := range jobChannel {
				// sum the digits of the random number
				r_num := job.Randnum
				var sum int
				for r_num != 0 {
					sum += r_num % 10
					r_num /= 10
				}
				// send the result back through the channel
				r := &Result{
					sum: sum,
					job: job,
				}
				resultChannel <- r
			}
		}(jobChannel, resultChannel)
	}
}

3.5 Timers and Tickers

package main

import (
	"fmt"
	"time"
)

func main() {
	//1. basic timer usage
	//timer1 := time.NewTimer(2 * time.Second)
	//t1 := time.Now()
	//fmt.Printf("t1:%v\n", t1)
	//t2 := <-timer1.C //fires after a two-second delay
	//fmt.Printf("t2:%v\n", t2)

	//2. a timer fires only once
	//timer1 := time.NewTimer(2 * time.Second)
	//for { //this loop deadlocks: the timer never fires again
	//	<-timer1.C
	//	fmt.Println("time is up")
	//}

	//3. using a timer as a delay
	//time.Sleep(time.Second)
	//timer3 := time.NewTimer(2 * time.Second)
	////fmt.Println(timer3)
	//<-timer3.C
	//fmt.Println("2 seconds passed")
	//<-time.After(2 * time.Second) //wait 2 seconds
	//fmt.Println("2 seconds passed")

	//4. stopping a timer
	//timer4 := time.NewTimer(2 * time.Second)
	//go func() {
	//	<-timer4.C
	//	fmt.Println("timer fired")
	//}()
	//b := timer4.Stop() //stop the timer
	//if b {
	//	fmt.Println("timer4 has been stopped")
	//}

	//5. resetting a timer
	//fmt.Println(time.Now())
	//timer5 := time.NewTimer(5 * time.Second)
	//timer5.Reset(2 * time.Second)
	//fmt.Println(time.Now())
	//fmt.Println(<-timer5.C)
	//for {
	//
	//}

	// a ticker fires repeatedly, once per interval
	ticker := time.NewTicker(2 * time.Second)
	i := 0
	// child goroutine
	go func() {
		for {
			//<-ticker.C
			i++
			fmt.Println(<-ticker.C)
			if i == 5 {
				//stop the ticker
				ticker.Stop()
			}
		}
	}()
	for {
	}
}

4. Concurrency Safety

4.1 Mutexes and Read/Write Locks

Data in critical sections is protected with a mutex or a read/write mutex so that it can be handled safely under concurrency.

package main

import (
	"fmt"
	"sync"
	"time"
)

var (
	x      int64
	wg     sync.WaitGroup
	lock   sync.Mutex
	rwlock sync.RWMutex
)

func read() { // acquire the read lock
	rwlock.RLock()
	time.Sleep(10 * time.Millisecond)
	rwlock.RUnlock()
	wg.Done()
}

func write() { // acquire the write lock
	rwlock.Lock()
	x = x + 1
	//time.Sleep(10 * time.Millisecond)
	rwlock.Unlock()
	wg.Done()
}

func main() {
	start := time.Now()
	for i := 0; i < 4000; i++ {
		wg.Add(1)
		go write()
	}

	for i := 0; i < 1000; i++ {
		wg.Add(1)
		go read()
	}

	wg.Wait()
	fmt.Println(x)
	end := time.Now()
	fmt.Println(end.Sub(start))
}

4.2 The sync Package

sync.WaitGroup

In Go, sync.WaitGroup can be used to synchronize concurrent tasks. It maintains an internal counter: when we start N concurrent tasks we add N to the counter, and the Wait() method then blocks until all of those tasks have finished.
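
A minimal sketch of the pattern (the worker function here is made up for illustration):

package main

import (
	"fmt"
	"sync"
)

var wg sync.WaitGroup

func worker(id int) {
	defer wg.Done() // decrement the counter when this task finishes
	fmt.Println("worker", id, "finished")
}

func main() {
	for i := 0; i < 5; i++ {
		wg.Add(1) // add 1 to the counter for each task we start
		go worker(i)
	}
	wg.Wait() // block until the counter drops back to zero
}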

sync.Once

In high-concurrency scenarios we sometimes need to guarantee that an operation runs exactly once, for example loading a configuration file once or closing a channel once; sync.Once provides this. Here is the classic icon-loading (configuration) example:

var icons map[string]image.Image

var loadIconsOnce sync.Once

func loadIcons() {
    icons = map[string]image.Image{
        "left":  loadIcon("left.png"),
        "up":    loadIcon("up.png"),
        "right": loadIcon("right.png"),
        "down":  loadIcon("down.png"),
    }
}

// Icon is safe for concurrent use
func Icon(name string) image.Image {
    loadIconsOnce.Do(loadIcons)  // even when called concurrently, Do executes loadIcons only once
    return icons[name]
}

Next, a look at the sync.Once source code:

// sync.Once source code

type Once struct {
	// done indicates whether the action has been performed.
	// It is first in the struct because it is used in the hot path.
	// The hot path is inlined at every call site.
	// Placing done first allows more compact instructions on some architectures (amd64/386),
	// and fewer instructions (to calculate offset) on other architectures.
	done uint32     // 0 means not yet executed, 1 means executed
	m    Mutex      // mutex
}

// Do calls the function f if and only if Do is being called for the
// first time for this instance of Once. In other words, given
// 	var once Once
// if once.Do(f) is called multiple times, only the first call will invoke f,
// even if f has a different value in each invocation. A new instance of
// Once is required for each function to execute.
//
// Do is intended for initialization that must be run exactly once. Since f
// is niladic, it may be necessary to use a function literal to capture the
// arguments to a function to be invoked by Do:
// 	config.once.Do(func() { config.init(filename) })
//
// Because no call to Do returns until the one call to f returns, if f causes
// Do to be called, it will deadlock.
//
// If f panics, Do considers it to have returned; future calls of Do return
// without calling f.
//
func (o *Once) Do(f func()) {
	// Note: Here is an incorrect implementation of Do:
	//
	//	if atomic.CompareAndSwapUint32(&o.done, 0, 1) {
	//		f()
	//	}
	//
	// Do guarantees that when it returns, f has finished.
	// This implementation would not implement that guarantee:
	// given two simultaneous calls, the winner of the cas would
	// call f, and the second would return immediately, without
	// waiting for the first's call to f to complete.
	// This is why the slow path falls back to a mutex, and why
	// the atomic.StoreUint32 must be delayed until after f returns.

    // atomically load done to check whether f has already run
	if atomic.LoadUint32(&o.done) == 0 {
		// Outlined slow-path to allow inlining of the fast-path.
		o.doSlow(f)
	}
}

func (o *Once) doSlow(f func()) {
	o.m.Lock()    // acquire the mutex
	defer o.m.Unlock()  // unlock before the function returns
	if o.done == 0 {
		defer atomic.StoreUint32(&o.done, 1)  // set done to 1 after f has run once
		f() // run the supplied function
	}
}

4.3 The atomic Package

For basic-type variables, a mutex has a comparatively heavy performance cost, because locking can involve context switches into the kernel, which is expensive. For basic data types we can instead use atomic operations to guarantee concurrency safety, since atomic operations complete entirely in user space.

// SwapInt32 atomically stores new into *addr and returns the previous *addr value.
func SwapInt32(addr *int32, new int32) (old int32)

// SwapInt64 atomically stores new into *addr and returns the previous *addr value.
func SwapInt64(addr *int64, new int64) (old int64)

// SwapUint32 atomically stores new into *addr and returns the previous *addr value.
func SwapUint32(addr *uint32, new uint32) (old uint32)

// SwapUint64 atomically stores new into *addr and returns the previous *addr value.
func SwapUint64(addr *uint64, new uint64) (old uint64)

// SwapUintptr atomically stores new into *addr and returns the previous *addr value.
func SwapUintptr(addr *uintptr, new uintptr) (old uintptr)

// SwapPointer atomically stores new into *addr and returns the previous *addr value.
func SwapPointer(addr *unsafe.Pointer, new unsafe.Pointer) (old unsafe.Pointer)

// CompareAndSwapInt32 executes the compare-and-swap operation for an int32 value.
func CompareAndSwapInt32(addr *int32, old, new int32) (swapped bool)

// CompareAndSwapInt64 executes the compare-and-swap operation for an int64 value.
func CompareAndSwapInt64(addr *int64, old, new int64) (swapped bool)

// CompareAndSwapUint32 executes the compare-and-swap operation for a uint32 value.
func CompareAndSwapUint32(addr *uint32, old, new uint32) (swapped bool)

// CompareAndSwapUint64 executes the compare-and-swap operation for a uint64 value.
func CompareAndSwapUint64(addr *uint64, old, new uint64) (swapped bool)

// CompareAndSwapUintptr executes the compare-and-swap operation for a uintptr value.
func CompareAndSwapUintptr(addr *uintptr, old, new uintptr) (swapped bool)

// CompareAndSwapPointer executes the compare-and-swap operation for a unsafe.Pointer value.
func CompareAndSwapPointer(addr *unsafe.Pointer, old, new unsafe.Pointer) (swapped bool)

// AddInt32 atomically adds delta to *addr and returns the new value.
func AddInt32(addr *int32, delta int32) (new int32)

// AddUint32 atomically adds delta to *addr and returns the new value.
// To subtract a signed positive constant value c from x, do AddUint32(&x, ^uint32(c-1)).
// In particular, to decrement x, do AddUint32(&x, ^uint32(0)).
func AddUint32(addr *uint32, delta uint32) (new uint32)

// AddInt64 atomically adds delta to *addr and returns the new value.
func AddInt64(addr *int64, delta int64) (new int64)

// AddUint64 atomically adds delta to *addr and returns the new value.
// To subtract a signed positive constant value c from x, do AddUint64(&x, ^uint64(c-1)).
// In particular, to decrement x, do AddUint64(&x, ^uint64(0)).
func AddUint64(addr *uint64, delta uint64) (new uint64)

// AddUintptr atomically adds delta to *addr and returns the new value.
func AddUintptr(addr *uintptr, delta uintptr) (new uintptr)

// LoadInt32 atomically loads *addr.
func LoadInt32(addr *int32) (val int32)

// LoadInt64 atomically loads *addr.
func LoadInt64(addr *int64) (val int64)

// LoadUint32 atomically loads *addr.
func LoadUint32(addr *uint32) (val uint32)

// LoadUint64 atomically loads *addr.
func LoadUint64(addr *uint64) (val uint64)

// LoadUintptr atomically loads *addr.
func LoadUintptr(addr *uintptr) (val uintptr)

// LoadPointer atomically loads *addr.
func LoadPointer(addr *unsafe.Pointer) (val unsafe.Pointer)

// StoreInt32 atomically stores val into *addr.
func StoreInt32(addr *int32, val int32)

// StoreInt64 atomically stores val into *addr.
func StoreInt64(addr *int64, val int64)

// StoreUint32 atomically stores val into *addr.
func StoreUint32(addr *uint32, val uint32)

// StoreUint64 atomically stores val into *addr.
func StoreUint64(addr *uint64, val uint64)

// StoreUintptr atomically stores val into *addr.
func StoreUintptr(addr *uintptr, val uintptr)

// StorePointer atomically stores val into *addr.
func StorePointer(addr *unsafe.Pointer, val unsafe.Pointer)

Using atomic to read and write shared data safely under concurrency

func TestAtomic(t *testing.T) {
	var shareBufPtr unsafe.Pointer
	writeDataFn := func() {
		data := []int{}
		for i := 0; i < 100; i++ {
			data = append(data, i)
		}
		atomic.StorePointer(&shareBufPtr, unsafe.Pointer(&data)) // atomically publish the address of the freshly written data
		//fmt.Printf("write end.....:%x\n", unsafe.Pointer(&data))
	}
	readDataFn := func() {
		data := atomic.LoadPointer(&shareBufPtr) // atomically load the address of the most recently written data
		fmt.Println(data, *(*[]int)(data))
	}
	writeDataFn()
	var wg sync.WaitGroup
	for i := 0; i < 5; i++ {
		wg.Add(1) // add 1 for each writer goroutine (5 in total)
		go func() {
			for i := 0; i < 5; i++ {  // each goroutine writes 5 times
				writeDataFn()
				time.Sleep(time.Millisecond * 100)
			}
			wg.Done()
		}()
		wg.Add(1) // add 1 for each reader goroutine (5 in total)
		go func() {
			for i := 0; i < 5; i++ {  // each goroutine reads 5 times
				readDataFn()
				time.Sleep(time.Millisecond * 100)
			}
			wg.Done()
		}()
	}
	wg.Wait()
}

You can see that although reads appear repeatedly in the output, they never interfere with the concurrent writes.


Performance comparison between the mutex version and the atomic version:

package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

var x int64
var l sync.Mutex
var wg sync.WaitGroup

// version protected by a mutex
func mutexAdd() {
	l.Lock()
	x++
	l.Unlock()
	wg.Done()
}

// version using an atomic operation
func atomicAdd() {
	atomic.AddInt64(&x, 1)
	wg.Done()
}

func main() {
	start := time.Now()
	for i := 0; i < 10000; i++ {
		wg.Add(1)
		//go atomicAdd()
		go mutexAdd()
	}
	wg.Wait()
	end := time.Now()
	fmt.Println(x)
	fmt.Println(end.Sub(start))
}



5. GMP Internals

6. Concurrency Case Study

Below is a concurrent web-image crawler built with channels and goroutines. In practice, buffered channels are used to pass the data around, combined with sync.WaitGroup to synchronize the concurrent tasks.

package main

import (
	"fmt"
	"io/ioutil"
	"net/http"
	"regexp"
	"strconv"
	"strings"
	"sync"
	"time"
)

/**
Concurrently crawl images from a website.
Idea: 1. initialize the data channel
      2. crawler output: 26 goroutines push image URLs into the channel
      3. bookkeeping goroutine: check whether all 26 tasks are done; if so, close the data channel
      4. download goroutines: read URLs from the channel and download them
*/
var (
	chanImgUrls chan string //channel of image URLs
	wg          sync.WaitGroup
	chanTask    chan string                                                    //channel of task-completion notifications
	rexImg      = `/uploads/allimg/[^"]+?(\.((jpg)|(png)|(jpeg)|(gif)|(bmp)))` //regexp for image URLs
)
)

// Try runs f and handles any panic with the given handler (closure-based error handling)
func Try(f func(), handler func(interface{})) {
	defer func() {
		if err := recover(); err != nil {
			handler(err)
		}
	}()
	f()
}

// getImgUrls fetches all image URLs found on the given page
func getImgUrls(webUrl string) (imgUrls []string) {
	Try(func() {
		//1. read the page content
		resp, _ := http.Get(webUrl)
		defer resp.Body.Close()
		pageBytes, _ := ioutil.ReadAll(resp.Body)
		pageStr := string(pageBytes)

		//2. parse the page and extract the image URLs
		regex := regexp.MustCompile(rexImg)
		urlStrs := regex.FindAllStringSubmatch(pageStr, -1) //note: this returns a two-dimensional slice
		fmt.Printf("found %d matches in total\n", len(urlStrs))
		for _, url := range urlStrs {
			url := "https://pic.netbian.com" + url[0]
			imgUrls = append(imgUrls, url)
		}
	}, func(err interface{}) {
		fmt.Println("crawl pic urls: ", err)
	})
	return
}

// sendTask is run by multiple goroutines; each pushes the image URLs of one page into the channel
func sendTask(webUrl string) {
	imgUrls := getImgUrls(webUrl)
	// feed the image URLs into the channel
	for _, url := range imgUrls {
		chanImgUrls <- url
	}
	// all URLs sent: mark this site's crawl task as finished
	chanTask <- webUrl
	wg.Done()
}

// checkTaskOk waits until every crawl task reports completion
func checkTaskOk() {
	var count int
	for {
		task := <-chanTask
		fmt.Printf("finished crawling image URLs from %s\n", task)
		count++
		if count == 26 {
			close(chanImgUrls)
			break
		}
	}
	close(chanTask)
	wg.Done()
}

// downloadPic downloads images from the URL channel until it is closed
func downloadPic() {
	defer wg.Done() // decrement the counter even if a panic is recovered below
	Try(func() {
		for url := range chanImgUrls {
			picName := getPicName(url)
			resp, _ := http.Get(url)
			byteArray, _ := ioutil.ReadAll(resp.Body)
			fileName := "D:\\其它\\crawl\\gotest\\" + picName
			ioutil.WriteFile(fileName, byteArray, 0664)
			fmt.Printf("%s downloaded\n", fileName)
		}
	}, func(err interface{}) {
		fmt.Println("downloadPic: ", err)
	})
}

// getPicName builds a unique file name for an image URL
func getPicName(url string) string {
	index := strings.LastIndex(url, "/")
	filename := url[index+1:]
	timePrefix := strconv.Itoa(int(time.Now().UnixNano()))
	filename = timePrefix + "_" + filename
	return filename
}

func main() {
	chanImgUrls = make(chan string, 1000000)
	chanTask = make(chan string, 26)
	//1. crawler goroutines
	for i := 2; i <= 27; i++ {
		wg.Add(1)
		go sendTask("https://pic.netbian.com/4kfengjing/index_" + strconv.Itoa(i) + ".html")
	}
	//2. bookkeeping goroutine
	wg.Add(1)
	go checkTaskOk()
	//3. download goroutines
	for i := 0; i < 5; i++ {
		wg.Add(1)
		go downloadPic()
	}
	wg.Wait()
}

7. Pipe-Filter Architecture

7.1 Characteristics

This architecture suits data-processing and data-analysis systems. Each filter is loosely coupled from the data, and filters are connected to each other through pipes. If the filters run in a distributed system they have to be invoked over the network; if the filters process data asynchronously, a buffer is also needed between them to stage the data; if a filter is called within the same process, it can simply be invoked as a function.

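As a rough sketch of the asynchronous, buffer-connected variant described above, the snippet below wires two stages together with a buffered channel acting as the pipe and buffer; the stage and its buffer size are invented for illustration and are separate from the example in 7.2.

package main

import "fmt"

// square is one filter stage: it consumes values from the upstream pipe
// and emits results on its own buffered output channel.
func square(in <-chan int) <-chan int {
	out := make(chan int, 10) // the buffer decouples this stage from the next one
	go func() {
		defer close(out)
		for v := range in {
			out <- v * v
		}
	}()
	return out
}

func main() {
	src := make(chan int, 10) // source pipe feeding the first filter
	go func() {
		defer close(src)
		for i := 1; i <= 5; i++ {
			src <- i
		}
	}()
	for v := range square(src) { // the downstream consumer drains the last pipe
		fmt.Println(v)
	}
}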

7.2 Example

The task: use the pipe-filter architecture to parse the string "1,2,3" and compute the sum of its numbers:


Design: to decouple the data from the filters, Filter is designed as an interface. A filter's Process method operates on a Request, and its result is passed on to the next filter. The filters are connected through a pipeline: the Pipeline struct holds the initial Request and the collection of all filters, and it also implements the Filter interface's Process method, calling each filter in turn to process the data and finally producing the Response.

//filter.go
type Response interface{}
type Request interface{}

type Filter interface {
	Process(data Request) (Response, error)
}
//split_filter.go
/**
Filter that splits a string
*/
var SplitFilterWrongFormatError error = errors.New("wrong input format for the split filter")

type SplitFilter struct {
	delimiter string
}

func NewSplitFilter(delimiter string) *SplitFilter {
	return &SplitFilter{delimiter: delimiter}
}

func (sf *SplitFilter) Process(data Request) (Response, error) {
	// check the input format first
	str, ok := data.(string)
	if !ok {
		return nil, SplitFilterWrongFormatError
	}
	parts := strings.Split(str, sf.delimiter)
	return parts, nil
}
//to_int_filter.go
/**
Filter that converts a string slice into an int slice
*/
var ToIntFilterWrongFormat error = errors.New("wrong format for the string slice to convert")

type ToIntFilter struct {
}

func NewToIntFilter() *ToIntFilter {
	return &ToIntFilter{}
}

func (tif *ToIntFilter) Process(data Request) (Response, error) {
	parts, ok := data.([]string)
	if !ok {
		return nil, ToIntFilterWrongFormat
	}
	ret := []int{}
	for _, part := range parts {
		num, err := strconv.Atoi(part)
		if err != nil {
			return nil, err
		}
		ret = append(ret, num)
	}
	return ret, nil
}
//sum_filter.go
/**
Filter that sums the values of an int slice
*/
var SumFilterWrongFormat error = errors.New("wrong format for the int slice to sum")

type SumFilter struct {
}

func NewSumFilter() *SumFilter {
	return &SumFilter{}
}

func (sf *SumFilter) Process(data Request) (Response, error) {
	parts, ok := data.([]int)
	if !ok {
		return nil, SumFilterWrongFormat
	}
	res := 0
	for _, part := range parts {
		res += part
	}
	return res, nil
}
//straight_pipeline.go
/**
Pipeline: collects the filters and runs them in order
*/
type StraightFilterPipeline struct {
	Name    string
	Filters *[]Filter
}

func NewStraightFilterPipeline(name string, filters ...Filter) *StraightFilterPipeline {
	return &StraightFilterPipeline{
		Name:    name,
		Filters: &filters,
	}
}

func (sfp *StraightFilterPipeline) Process(data Request) (Response, error) {
	var ret interface{}
	var err error
	for _, filter := range *sfp.Filters {
		ret, err = filter.Process(data)
		if err != nil {
			return ret, err
		}
		data = ret
	}
	return ret, nil
}
//final test
func TestPipelineFilters(t *testing.T) {
	str := "1,2,3"
	splitFilter := testContext.NewSplitFilter(",")
	toIntFilter := testContext.NewToIntFilter()
	sumFilter := testContext.NewSumFilter()
	straightPipeline := testContext.NewStraightFilterPipeline("sp", splitFilter, toIntFilter, sumFilter)
	ret, err := straightPipeline.Process(str)
	if err != nil {
		t.Fatal(err)
	}
	if ret != 6 {
		t.Fatalf("The expected is 6, but the result is %d\n", ret)
	}
}

8. Micro-Kernel Architecture

8.1 Characteristics


It is a good fit in the following situations:

  • the kernel contains the common flow or shared logic;
  • the variable or extensible parts are planned as extension points;
  • the behavior of each extension point is abstracted behind a defined interface.


8.2 Example

//agent.go
package testMicroKernel

import (
	"context"
	"errors"
	"fmt"
	"strings"
	"sync"
)

const (
	Waiting = 0
	Running = 1
)

var WrongStateError = errors.New("can not take the operation in the current state")

//error isolation: collect the errors of all collectors
type CollectorsError struct {
	CollectorErrors []error
}

//Error joins all collected error messages
func (ce CollectorsError) Error() string {
	var strs []string
	for _, err := range ce.CollectorErrors {
		strs = append(strs, err.Error())
	}
	return strings.Join(strs, ";")
}

//Event is the data structure passed from collectors to the agent
type Event struct {
	Source  string
	Content string
}

//EventReceiver is the interface through which events are delivered
type EventReceiver interface {
	OnEvent(evt Event)
}

//Collector is the extension-point interface
type Collector interface {
	Init(evtReceiver EventReceiver) error
	Start(agtCtx context.Context) error
	Stop() error
	Destory() error
}

//Agent is the micro-kernel that drives everything
type Agent struct {
	collectors map[string]Collector //the registered collectors
	evtBuf     chan Event           //channel that receives all events
	ctx        context.Context      //context
	cancel     context.CancelFunc   //function that cancels the context
	state      int                  //current state of the agent
}

//NewAgent creates the agent and sets the capacity of the event channel
func NewAgent(sizeEvtBuf int) *Agent {
	agt := Agent{
		collectors: map[string]Collector{},
		evtBuf:     make(chan Event, sizeEvtBuf),
		state:      Waiting,
	}
	return &agt
}

//Start starts the agent
func (agt *Agent) Start() error {
	if agt.state != Waiting {
		return WrongStateError
	}
	agt.state = Running
	agt.ctx, agt.cancel = context.WithCancel(context.Background())
	go agt.EventProcessGoroutine() //start a separate goroutine that handles events while the agent runs
	return agt.StartCollectors()
}

//Stop stops the agent
func (agt *Agent) Stop() error {
	if agt.state != Running {
		return WrongStateError
	}
	agt.state = Waiting
	agt.cancel()
	return agt.StopCollectors()
}

//Destory tears down the agent
func (agt *Agent) Destory() error {
	if agt.state != Waiting {
		return WrongStateError
	}
	return agt.DestoryCollectors()
}

//EventProcessGoroutine is the event-listening loop
func (agt *Agent) EventProcessGoroutine() {
	var evtSeg [10]Event //print once for every 10 events received
	for {
		for i := 0; i < 10; i++ {
			select {
			case evtSeg[i] = <-agt.evtBuf:
			case <-agt.ctx.Done():
				return
			}
		}
		fmt.Println(evtSeg)
	}
}

//OnEvent passes an event to the agent
func (agt *Agent) OnEvent(evt Event) {
	agt.evtBuf <- evt
}

//RegisterCollector registers a collector with the agent
func (agt *Agent) RegisterCollector(name string, collector Collector) error {
	if agt.state != Waiting {
		return WrongStateError
	}
	agt.collectors[name] = collector
	return collector.Init(agt)
}

//StartCollectors starts every registered collector in its own goroutine
func (agt *Agent) StartCollectors() error {
	var errs CollectorsError
	var mutex sync.Mutex //guards errs while collectors start concurrently
	for name, collector := range agt.collectors {
		go func(name string, collector Collector, ctx context.Context) {
			// each goroutine keeps its own err to avoid a data race on a shared variable
			err := collector.Start(ctx)
			mutex.Lock() //append to CollectorErrors safely under the lock
			defer mutex.Unlock()
			if err != nil {
				errs.CollectorErrors = append(errs.CollectorErrors, errors.New(name+":"+err.Error()))
			}
		}(name, collector, agt.ctx)
	}
	return errs
}

//StopCollectors stops all collectors
func (agt *Agent) StopCollectors() error {
	var err error
	var errs CollectorsError
	for name, collector := range agt.collectors {
		if err = collector.Stop(); err != nil {
			errs.CollectorErrors = append(errs.CollectorErrors, errors.New(name+":"+err.Error()))
		}
	}
	if len(errs.CollectorErrors) == 0 {
		return nil
	}
	return errs
}

func (agt *Agent) DestoryCollectors() error {
	var err error
	var errs CollectorsError
	for name, collector := range agt.collectors {
		if err = collector.Destory(); err != nil {
			errs.CollectorErrors = append(errs.CollectorErrors, errors.New(name+":"+err.Error()))
		}
	}
	if len(errs.CollectorErrors) == 0 {
		return nil
	}
	return errs
}
//agent_test.go
package testMicroKernel

import (
	"context"
	"errors"
	"fmt"
	"testing"
	"time"
)

//DemoCollector is a sample Collector implementation
type DemoCollector struct {
	evtReceiver EventReceiver
	agtCtx      context.Context
	stopChan    chan struct{}
	name        string
	content     string
}

func NewDemoCollector(name string, content string) *DemoCollector {
	return &DemoCollector{
		name:     name,
		content:  content,
		stopChan: make(chan struct{}),
	}
}

//Init initializes the collector
func (c *DemoCollector) Init(evtReceiver EventReceiver) error {
	fmt.Println("initialize collector", c.name)
	c.evtReceiver = evtReceiver
	return nil
}

//Start runs the collector's work loop until the agent's context is done
func (c *DemoCollector) Start(agtCtx context.Context) error {
	fmt.Println("start collector", c.name)
	for {
		select {
		case <-agtCtx.Done(): // when the agent stops, every collector stops
			c.stopChan <- struct{}{}
			return nil // a bare break would only leave the select and keep looping
		default: // each collector does its own work
			time.Sleep(time.Millisecond * 50)
			c.evtReceiver.OnEvent(Event{c.name, c.content})
		}
	}
}

//Stop terminates the collector
func (c *DemoCollector) Stop() error {
	fmt.Println("stop collector", c.name)
	select {
	case <-c.stopChan:
		return nil
	case <-time.After(1 * time.Second):
		return errors.New("failed to stop for timeout")
	}
}

//Destory releases the collector's resources
func (c *DemoCollector) Destory() error {
	fmt.Println(c.name, "Release resources")
	return nil
}
func TestAgent(t *testing.T) {
	agt := NewAgent(100)
	c1 := NewDemoCollector("c1", "1")
	c2 := NewDemoCollector("c2", "2")
	agt.RegisterCollector("c1", c1)
	agt.RegisterCollector("c2", c2)
	err := (agt.Start()).(CollectorsError)
	if len(err.CollectorErrors) > 0 {
		for _, errItem := range err.CollectorErrors {
			fmt.Printf("start error %v\n", errItem)
		}
	}
	fmt.Println("start Agent again:", agt.Start())
	time.Sleep(1 * time.Second)
	agt.Stop()
	agt.Destory()
}

9. JSON Configuration Parsing

9.1 Go's Built-in JSON Parser

Go ships with its own set of JSON encoding and decoding functions, but their implementation relies heavily on reflection, and heavy use of reflection hurts performance under high concurrency. That is the motivation for using a dedicated, generated set of JSON parsing functions instead.

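For reference, a minimal sketch of the built-in encoding/json API that the benchmark in 9.2 compares against (the Config struct here is made up for illustration):

package main

import (
	"encoding/json"
	"fmt"
)

type Config struct {
	Name string `json:"name"`
	Port int    `json:"port"`
}

func main() {
	// Unmarshal inspects the target struct with reflection at run time,
	// which is the overhead that generated code such as easyjson avoids.
	var c Config
	if err := json.Unmarshal([]byte(`{"name":"svc","port":8080}`), &c); err != nil {
		fmt.Println(err)
		return
	}
	b, _ := json.Marshal(c) // Marshal also walks the struct via reflection
	fmt.Println(c, string(b))
}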

9.2 EasyJson


Installing the tool:

# for Go < 1.17 
go get -u github.com/mailru/easyjson/...

# for Go >= 1.17 
go get github.com/mailru/easyjson
go install github.com/mailru/easyjson/...@latest

After installation, the easyjson.exe executable is generated under the $GOPATH/bin directory:


To use it, run easyjson -all <filename>.go; this generates a file containing the UnmarshalJSON() and MarshalJSON() parsing functions.

Note the following options:

-lower_camel_case:    lower-case the first letter of struct field names in the JSON output, e.g. Name => name.
-build_tags string:   write the given string into the header of the generated Go file.
-no_std_marshalers:   do not generate MarshalJSON/UnmarshalJSON methods for the structs.
-omit_empty:          omit fields that were never assigned instead of emitting the field type's zero value.
-output_filename:     set the name of the generated file.
-pkg:                 generate easyjson code for every struct in the package marked with `//easyjson:json`.
-snake_case:          convert fields such as `Name_Student` to `name_student`.

Usage and benchmark:

//struct definitions to be parsed and serialized
package testJSON

type BasicInfo struct {
	Name string `json:"name"`
	Age  int    `json:"age"`
}
type JobInfo struct {
	Skills []string `json:"skills"`
}
type Employee struct {
	BasicInfo BasicInfo `json:"basic_info"`
	JobInfo   JobInfo   `json:"job_info"`
}

Running easyjson -all struct_def.go generates struct_def_easyjson.go next to the source file.

//json_test.go
package testJSON

import (
	"encoding/json"
	"fmt"
	"testing"
)

var jsonStr = `{"basic_info":{"name":"Mike","age":12},"job_info":{"skills":["Java","Go","C"]}}`

func TestEasyJson(t *testing.T) {
	e := Employee{}
	e.UnmarshalJSON([]byte(jsonStr))
	fmt.Println(e)
	if v, err := e.MarshalJSON(); err != nil {
		t.Error(err)
	} else {
		fmt.Println(string(v))
	}
}

func BenchmarkEmbeddedJson(b *testing.B) {   // uses Go's built-in JSON functions
	b.ResetTimer()
	e := new(Employee)
	for i := 0; i < b.N; i++ {
		err := json.Unmarshal([]byte(jsonStr), e)
		if err != nil {
			b.Error(err)
		}
		if _, err = json.Marshal(e); err != nil {
			b.Error(err)
		}
	}
}

func BenchmarkEasyJson(b *testing.B) {     // uses the JSON functions generated by easyjson
	b.ResetTimer()
	e := Employee{}
	for i := 0; i < b.N; i++ {
		err := e.UnmarshalJSON([]byte(jsonStr))
		if err != nil {
			b.Error(err)
		}
		if _, err = e.MarshalJSON(); err != nil {
			b.Error(err)
		}
	}
}