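JSON serialization, seen through one requirement

Preface

While building a business feature, I had to integrate with two systems whose JSON responses share most of their fields, except that a few fields have the same name but different types. To avoid writing the same code twice, I started wondering whether the struct definitions could be reused.

Anonymous embedded structs in Go

At the language level, Go can do this with anonymous embedded structs, for example: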

type Base struct {
    Field int
    F2 string
    F3 float64
}
type Type1 struct {
    Base
    Field string
}

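Because a field in the outer struct shadows a field of the same name in the embedded struct, Type1.Field refers to the string field, not the int one. The inner field is still reachable through the explicit selector Type1.Base.Field.

This lets us put the fields the two systems share into Base and declare the differing fields in each system's own struct, which gives us the reuse we wanted.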
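The shadowing rule can be checked without any JSON involved. The following is a minimal sketch that reuses the Base and Type1 definitions from above; the describe helper is made up for illustration only.

```go
package main

import "fmt"

// Base holds the fields shared by both upstream systems.
type Base struct {
	Field int
	F2    string
	F3    float64
}

// Type1 embeds Base and redeclares Field with a different type.
type Type1 struct {
	Base
	Field string
}

// describe (a hypothetical helper) reports the types reached through
// the outer selector and through the explicit Base selector.
func describe(d Type1) (outer, inner string) {
	return fmt.Sprintf("%T", d.Field), fmt.Sprintf("%T", d.Base.Field)
}

func main() {
	d := Type1{Base: Base{Field: 123, F2: "Base_F2"}, Field: "UP_LOAD"}
	outer, inner := describe(d)
	fmt.Println(outer, inner) // string int
	fmt.Println(d.F2)         // Base_F2, promoted from Base
}
```

Fields that are not redeclared, such as F2, are promoted and can be used as if they belonged to Type1 directly.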

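So the language can express this. But the data travels between several systems, so it has to be serialized, and since we use JSON the question becomes: how does the json package treat a struct shaped like this? Is it supported at all?

Basic JSON usage

JSON is the most common payload format in HTTP requests, and Go supports it directly in the standard library. The encoding/json package provides two basic functions, Marshal and Unmarshal, for serialization and deserialization.

Let's test them against the types defined above: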

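d := Type1{
   Base: Base{
      Field: 123,
      F2:    "Base_F2",
      F3:    0.45,
   },
   Field: "UP_LOAD",
}

// Serialize; the result is named data so it does not shadow the bytes package.
data, err := json.Marshal(d)
if err != nil {
   panic(err)
}
fmt.Println(string(data))

// Deserialize the same payload back into a fresh value.
d2 := Type1{}
if err := json.Unmarshal(data, &d2); err != nil {
   panic(err)
}
fmt.Printf("%#v", d2)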

Output:

{"F2":"Base_F2","F3":0.45,"Field":"UP_LOAD"}
util.Type1{Base:util.Base{Field:0, F2:"Base_F2", F3:0.45}, Field:"UP_LOAD"}

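The serialized output shows that when field names collide, the outer struct's field wins by default, and the fields of the anonymous struct are serialized at the same level as the outer struct's fields. Deserialization behaves the same way: the value goes to the outer field, which is why Base.Field is still 0 in d2.

So this experiment already solves the original problem nicely. But as a developer with standards (that is, one who is afraid of causing an incident), I still want to confirm that this behavior is by design rather than an accident, and take the chance to learn how the json package is implemented.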

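Some rules of JSON serialization

The doc comment on encoding/json.Marshal is fairly detailed; here is a summary.

Each field can control how it is serialized through the json key in its struct tag.

The part before the first comma is the name the field is serialized under; whatever follows the comma is a list of options.

If the tag is exactly -, the field is always ignored. Note that if the tag is written as -, (with a trailing comma), the field is still serialized, under the name -.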

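The available options are:

string: the field's value is stored inside a JSON string. It only applies to fields of string, floating-point, integer, or boolean type.

omitempty: the field is left out when it holds an empty value (false, 0, a nil pointer or interface, or an empty array, slice, map, or string).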
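The tag rules so far can be seen side by side in one struct. This is a small sketch; the record type and the encode helper are invented for the example and are not part of any real API.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// record is a hypothetical type that exercises each tag form described above.
type record struct {
	Renamed string `json:"renamed"`        // serialized under the key "renamed"
	Skipped string `json:"-"`              // never serialized
	Dash    string `json:"-,"`             // serialized under the key "-"
	ID      int64  `json:"id,string"`      // number wrapped in a JSON string
	Note    string `json:"note,omitempty"` // dropped while it is empty
}

// encode marshals r and returns the JSON text, or the error message.
func encode(r record) string {
	b, err := json.Marshal(r)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	r := record{Renamed: "a", Skipped: "b", Dash: "c", ID: 42}
	fmt.Println(encode(r)) // {"renamed":"a","-":"c","id":"42"}

	r.Note = "n"
	fmt.Println(encode(r)) // {"renamed":"a","-":"c","id":"42","note":"n"}
}
```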

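Some types (channel, complex, function) cannot be encoded, so they must not appear in a struct that is serialized as-is; Marshal returns an UnsupportedTypeError for them. They can still be present if the ignore rule above is used to skip them.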

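Only fields that are visible under Go's export rules can be serialized.

A field whose name starts with a lowercase letter is unexported and will not be serialized.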
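The two rules above fail in different ways: an unsupported type produces an error, while an unexported field is skipped silently. A minimal sketch follows; job, badJob, marshal, and isUnsupported are names made up for the example.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// job shows the two safe cases: an unexported field and an ignored channel.
type job struct {
	Name   string
	secret string        // unexported, so it is skipped silently
	Done   chan struct{} `json:"-"` // unsupported type, excluded explicitly
}

// badJob exposes a channel without ignoring it.
type badJob struct {
	Name string
	Done chan struct{}
}

// marshal returns the JSON text for v together with any encoding error.
func marshal(v interface{}) (string, error) {
	b, err := json.Marshal(v)
	return string(b), err
}

// isUnsupported reports whether err is a *json.UnsupportedTypeError.
func isUnsupported(err error) bool {
	var ute *json.UnsupportedTypeError
	return errors.As(err, &ute)
}

func main() {
	out, err := marshal(job{Name: "sync", secret: "s3cr3t"})
	fmt.Println(out, err) // {"Name":"sync"} <nil>

	_, err = marshal(badJob{Name: "sync"})
	fmt.Println(isUnsupported(err)) // true
}
```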

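When several fields would end up with the same name at the same level, the following rules apply (the code is in encoding/json.dominantField):

Only the fields at the shallowest nesting level are considered.

If any of those fields has a json tag, only the tagged fields are considered.

If more than one field is still left after this filtering, all of them are ignored, and no error is reported.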
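Each of the three rules can be observed with a pair of embedded structs. The types below (A, B, C and the three wrappers) are invented for this sketch.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type A struct{ X int }
type B struct{ X int }
type C struct {
	X int `json:"X"` // same name as A.X, but it comes from a tag
}

// shallowWins: the outer X is less nested than A.X.
type shallowWins struct {
	A
	X string
}

// taggedWins: A.X and C.X are equally nested, only C.X is tagged.
type taggedWins struct {
	A
	C
}

// ambiguous: A.X and B.X are equally nested and both untagged.
type ambiguous struct {
	A
	B
}

// encode marshals v and returns the JSON text, or the error message.
func encode(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	fmt.Println(encode(shallowWins{A: A{X: 1}, X: "outer"})) // {"X":"outer"}
	fmt.Println(encode(taggedWins{A: A{X: 1}, C: C{X: 2}}))  // {"X":2}
	fmt.Println(encode(ambiguous{A: A{X: 1}, B: B{X: 2}}))   // {}
}
```

The last case is the one to watch out for: the field disappears from the output without any error.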

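JSON serialization

JSON serialization is a recursive process.

Let's start at the entry point of serialization, json.Marshal.

The function first takes an encoding/json.encodeState from a global pool. Pooling lets the struct's memory be reused: the same encode state is used throughout the encoding, and it is put back into the pool once encoding is done.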
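The pooling idea itself is easy to reproduce with sync.Pool. The sketch below is not the library's code: bufPool and render are hypothetical names, and a plain bytes.Buffer stands in for encodeState.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool is a hypothetical stand-in for the pool of encode states.
var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

// render borrows a buffer, writes into it, and returns it to the pool.
func render(name string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset() // a recycled buffer may still hold old data
	defer bufPool.Put(buf)

	fmt.Fprintf(buf, "{%q:true}", name)
	// String copies the bytes, so the result stays valid after Put.
	return buf.String()
}

func main() {
	fmt.Println(render("ok")) // {"ok":true}
}
```

The important detail is copying the result out before the buffer goes back to the pool, since another goroutine may reuse it immediately.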

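Judging from the call stack, every path eventually reaches typeEncoder, which recursively finds the encoding function for each type involved.

The json package keeps a global cache that maps a type to its encoding function. If a type has been analyzed before, it does not need to be analyzed again: space is traded for time.

What looks odd in this function is its use of a sync.WaitGroup. The comment there explains it clearly: it handles recursive types. Encoders are built recursively, so for a struct type that refers to itself, building the encoder would otherwise recurse forever. A placeholder that waits on the WaitGroup is stored in the cache first and is replaced once the real encoder is ready.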
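A linked-list node is the typical self-referential type this handling exists for. The sketch below only shows that such a type marshals fine from the caller's side; Node and encode are made-up names.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Node refers to its own type through the Next pointer.
type Node struct {
	Value int
	Next  *Node `json:"next,omitempty"`
}

// encode marshals the list starting at n.
func encode(n *Node) string {
	b, err := json.Marshal(n)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	list := &Node{Value: 1, Next: &Node{Value: 2}}
	fmt.Println(encode(list)) // {"Value":1,"next":{"Value":2}}
}
```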

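Next, newTypeEncoder builds the encoder for the type. It works in a few steps:

If the type being encoded implements json.Marshaler or encoding.TextMarshaler, its custom encoding is used.

Otherwise a built-in JSON encoder is chosen by kind. The supported encoders are: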

switch t.Kind() {
case reflect.Bool:
   return boolEncoder
case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64:
   return intEncoder
case reflect.Uint, reflect.Uint8, reflect.Uint16, reflect.Uint32, reflect.Uint64, reflect.Uintptr:
   return uintEncoder
case reflect.Float32:
   return float32Encoder
case reflect.Float64:
   return float64Encoder
case reflect.String:
   return stringEncoder
case reflect.Interface:
   return interfaceEncoder
case reflect.Struct:
   return newStructEncoder(t)
case reflect.Map:
   return newMapEncoder(t)
case reflect.Slice:
   return newSliceEncoder(t)
case reflect.Array:
   return newArrayEncoder(t)
case reflect.Ptr:
   return newPtrEncoder(t)
default:
   return unsupportedTypeEncoder
}
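The first branch, a type that supplies its own encoding, can be tried with both interfaces. This is a sketch with invented types (Cents, Status, order); Cents assumes non-negative amounts to keep the formatting simple.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Cents implements json.Marshaler and emits a decimal JSON number.
// Assumption: the amount is never negative.
type Cents int64

func (c Cents) MarshalJSON() ([]byte, error) {
	return []byte(fmt.Sprintf("%d.%02d", c/100, c%100)), nil
}

// Status implements encoding.TextMarshaler; the text becomes a JSON string.
type Status int

const (
	Inactive Status = iota
	Active
)

func (s Status) MarshalText() ([]byte, error) {
	if s == Active {
		return []byte("active"), nil
	}
	return []byte("inactive"), nil
}

// order combines both custom types in one struct.
type order struct {
	Price Cents
	State Status
}

// encode marshals o and returns the JSON text, or the error message.
func encode(o order) string {
	b, err := json.Marshal(o)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	fmt.Println(encode(order{Price: 1234, State: Active})) // {"Price":12.34,"State":"active"}
}
```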

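A simple encoder such as boolEncoder has little to do: it writes true or false straight into the encoding/json.encodeState.

A complex encoder such as structEncoder has more work: it first has to analyze the struct's layout and then encode the fields one by one according to that layout.

First, the encoding step: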

func (se structEncoder) encode(e *encodeState, v reflect.Value, opts encOpts) {
   next := byte('{')
FieldLoop:
   // Walk every field and encode each one recursively
   for i := range se.fields.list {
      f := &se.fields.list[i]
 
      // Find the nested struct field by following f.index.
      // For an embedded struct, f.index has more than one element. For example,
      // the index of Base.Field in Type1 above would be [0, 0].
      fv := v
      for _, i := range f.index {
         if fv.Kind() == reflect.Ptr {
            if fv.IsNil() {
               continue FieldLoop
            }
            fv = fv.Elem()
         }
         fv = fv.Field(i)
      }
 
      if f.omitEmpty && isEmptyValue(fv) {
         continue
      }
      e.WriteByte(next)
      next = ','
      if opts.escapeHTML {
         e.WriteString(f.nameEscHTML)
      } else {
         e.WriteString(f.nameNonEsc) // write the name, in the form `"name":`
      }
      opts.quoted = f.quoted
      f.encoder(e, fv, opts) // encode the value with the field's encoder
   }
   if next == '{' {
      e.WriteString("{}") // empty struct
   } else {
      e.WriteByte('}')
   }
}
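The index path that the loop above follows is the same notion the reflect package exposes as StructField.Index. The sketch below reuses Base and Type1 from earlier; indexOf is a hypothetical helper.

```go
package main

import (
	"fmt"
	"reflect"
)

type Base struct {
	Field int
	F2    string
	F3    float64
}

type Type1 struct {
	Base
	Field string
}

// indexOf returns the index path reflect uses to reach the named field
// of Type1, or nil when no such field is found.
func indexOf(name string) []int {
	f, ok := reflect.TypeOf(Type1{}).FieldByName(name)
	if !ok {
		return nil
	}
	return f.Index
}

func main() {
	fmt.Println(indexOf("F2"))    // [0 1]: field 0 (Base), then field 1 (F2)
	fmt.Println(indexOf("Field")) // [1]: the outer field shadows Base.Field

	// The shadowed field is still reachable by walking the path [0, 0].
	d := Type1{Base: Base{Field: 123}, Field: "UP_LOAD"}
	fmt.Println(reflect.ValueOf(d).FieldByIndex([]int{0, 0}).Int()) // 123
}
```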

Now the step that analyzes the struct layout, which walks the struct's fields with a breadth-first search (BFS):

// typeFields returns a list of fields that JSON should recognize for the given type.
// The algorithm is breadth-first search over the set of structs to include - the top struct
// and then any reachable anonymous structs.
func typeFields(t reflect.Type) structFields {
   // Anonymous fields to explore at the current level and the next.
   current := []field{}
   next := []field{{typ: t}}
 
   // Count of queued names for current level and the next.
   var count, nextCount map[reflect.Type]int
 
   // Types already visited at an earlier level.
   visited := map[reflect.Type]bool{}
 
   // Fields found.
   var fields []field
 
   // Buffer to run HTMLEscape on field names.
   var nameEscBuf bytes.Buffer
 
   for len(next) > 0 {
      current, next = next, current[:0]
      count, nextCount = nextCount, map[reflect.Type]int{}
 
      for _, f := range current {
         if visited[f.typ] {
            continue
         }
         visited[f.typ] = true
 
         // Scan f.typ for fields to include.
         for i := 0; i < f.typ.NumField(); i++ {
            sf := f.typ.Field(i)
            isUnexported := sf.PkgPath != ""
            if sf.Anonymous {
               t := sf.Type
               if t.Kind() == reflect.Ptr {
                  t = t.Elem()
               }
               if isUnexported && t.Kind() != reflect.Struct {
                  // Ignore embedded fields of unexported non-struct types.
                  continue
               }
               // Do not ignore embedded fields of unexported struct types
               // since they may have exported fields.
            } else if isUnexported {
               // Ignore unexported non-embedded fields.
               continue
            }
            // Parse the tag: skip the field if it is "-", and check whether the string option applies
            ...
 
            // Record found field and index sequence.
            if name != "" || !sf.Anonymous || ft.Kind() != reflect.Struct {
                // Initialize the field's attributes
               tagged := name != ""
               if name == "" {
                  name = sf.Name 
               }
               field := field{
                  name:      name,
                  tag:       tagged,
                  ...
 
               fields = append(fields, field)
               if count[f.typ] > 1 {
                  // If there were multiple instances, add a second,
                  // so that the annihilation code will see a duplicate.
                  // It only cares about the distinction between 1 or 2,
                  // so don't bother generating any more copies.
                  fields = append(fields, fields[len(fields)-1])
               }
               continue
            }
 
            // Record new anonymous struct to explore in next round.
            nextCount[ft]++
            if nextCount[ft] == 1 {
               next = append(next, field{name: ft.Name(), index: index, typ: ft})
            }
         }
      }
   }
 
   // Sort by name, then len(index) (the embedding depth), then tag presence,
   // so that fields with duplicate names are easy to filter out afterwards
   sort.Slice(fields, func(i, j int) bool {
      x := fields
      // sort field by name, breaking ties with depth, then
      // breaking ties with "name came from json tag", then
      // breaking ties with index sequence.
      if x[i].name != x[j].name {
         return x[i].name < x[j].name
      }
      if len(x[i].index) != len(x[j].index) {
         return len(x[i].index) < len(x[j].index)
      }
      if x[i].tag != x[j].tag {
         return x[i].tag
      }
      return byIndex(x).Less(i, j)
   })
   out := fields[:0]
   for advance, i := 0, 0; i < len(fields); i += advance {
      // One iteration per name.
      // Find the sequence of fields with the name of this first field.
     ....
   }
 
   fields = out
   sort.Sort(byIndex(fields))
 
   for i := range fields {
      f := &fields[i]
      f.encoder = typeEncoder(typeByIndex(t, f.index)) // recursively resolve each field's encoder
   }
   ...
   return structFields{fields, nameIndex}
}
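To tie the sort order back to the duplicate-name rules listed earlier, here is a simplified model of the selection step. It is not the library's code: candidate, owner, and resolve are hypothetical, and only the three sort keys named above (name, depth, tag) are modeled.

```go
package main

import (
	"fmt"
	"sort"
)

// candidate is a hypothetical, stripped-down model of a discovered field.
type candidate struct {
	name   string
	depth  int    // nesting depth, i.e. the length of the index path
	tagged bool   // true when the name came from a json tag
	owner  string // struct that declares the field, for display only
}

// resolve sorts the candidates and keeps at most one per name:
// the shallowest wins, a tagged one beats an untagged one at the
// same depth, and a tie on both drops the name entirely.
func resolve(fields []candidate) []candidate {
	sort.SliceStable(fields, func(i, j int) bool {
		a, b := fields[i], fields[j]
		if a.name != b.name {
			return a.name < b.name
		}
		if a.depth != b.depth {
			return a.depth < b.depth
		}
		return a.tagged && !b.tagged
	})

	var out []candidate
	for i := 0; i < len(fields); {
		// Collect the run of candidates that share this name.
		j := i + 1
		for j < len(fields) && fields[j].name == fields[i].name {
			j++
		}
		group := fields[i:j]
		tie := len(group) > 1 &&
			group[0].depth == group[1].depth &&
			group[0].tagged == group[1].tagged
		if !tie {
			out = append(out, group[0])
		}
		i = j
	}
	return out
}

func main() {
	type1 := []candidate{
		{name: "Field", depth: 2, owner: "Base"},
		{name: "F2", depth: 2, owner: "Base"},
		{name: "Field", depth: 1, owner: "Type1"},
	}
	for _, c := range resolve(type1) {
		fmt.Println(c.name, "from", c.owner)
	}
	// F2 from Base
	// Field from Type1
}
```

Under this model the Type1 result from the beginning of the article follows directly: Type1.Field is shallower than Base.Field, so it is the one that gets serialized.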

posted on 2020-09-13 11:27 RayFong
