A Slimmed-Down StringBuilder for Faster String Concatenation

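Purpose

When concatenating strings frequently, we usually replace String += String with StringBuilder to improve performance.

In my own code, though, I found that most of the time the only StringBuilder method I use is Append.

In some extreme cases I want even more speed, and StringBuilder is a good place to start.

So I wanted a simpler replacement for StringBuilder, which I named QuickStringWriter.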

 

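Defining the solution

Besides Append, StringBuilder has many other methods, such as Insert and AppendFormat.

QuickStringWriter is only meant to replace simple string += operations, so I will ignore those methods. I only need to reimplement Append and make each overload faster than StringBuilder's.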

 

  Initial design

class QuickStringWriter : IDisposable
{
    public QuickStringWriter Append(bool val);
    public QuickStringWriter Append(byte val);
    public QuickStringWriter Append(char val);
    public QuickStringWriter Append(DateTime val);
    public QuickStringWriter Append(DateTime val, string format);
    public QuickStringWriter Append(decimal val);
    public QuickStringWriter Append(double val);
    public QuickStringWriter Append(Guid val);
    public QuickStringWriter Append(Guid val, string format);
    public QuickStringWriter Append(short val);
    public QuickStringWriter Append(int val);
    public QuickStringWriter Append(long val);
    public QuickStringWriter Append(sbyte val);
    public QuickStringWriter Append(float val);
    public QuickStringWriter Append(string val);
    public QuickStringWriter Append(ushort val);
    public QuickStringWriter Append(uint val);
    public QuickStringWriter Append(ulong val);
    public QuickStringWriter Clear();
    void Dispose();
    string ToString();
}

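  Structure

QuickStringWriter uses a char array as its buffer (Buff).

A Position field holds the current write position, which is also the number of characters written so far.

ToString is overridden to return the buffer contents from index 0 up to Position as a string.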

char[] Buff;
int Position;

public override string ToString()
{
    return new string(Buff, 0, Position);
}
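
To make the buffer-plus-position idea concrete, here is a minimal sketch. The class name TinyWriter, the initial size of 16, and the plain doubling rule are hypothetical choices for illustration only; they are not the final QuickStringWriter.

```csharp
using System;

// Hypothetical illustration only: a char buffer plus a write position.
public sealed class TinyWriter
{
    private char[] _buff = new char[16]; // backing buffer
    private int _position;               // number of chars written so far

    public TinyWriter Append(char c)
    {
        if (_position == _buff.Length)
        {
            // Buffer is full: double it (Array.Resize copies the old contents).
            Array.Resize(ref _buff, _buff.Length * 2);
        }
        _buff[_position++] = c;
        return this; // returning this allows chained calls
    }

    // Only the first _position chars belong to the result.
    public override string ToString()
    {
        return new string(_buff, 0, _position);
    }
}
```

Calling new TinyWriter().Append('a').Append('b').ToString() yields "ab"; the unused tail of the buffer never reaches the resulting string.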

  Managing the buffer

With a buffer, we have to handle the case where it runs out of space.

I designed two methods to deal with this.

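// Set the buffer capacity
void SetCapacity(int capacity)
{
    if (capacity > Buff.Length)
    {
        if (capacity > 6000 * 10000)   // 60 million chars
        {
            throw new OutOfMemoryException("QuickStringWriter capacity cannot exceed 60 million characters");
        }
    }
    var newbuff = new char[capacity];
    Array.Copy(Buff, 0, newbuff, 0, Math.Min(Position, capacity));
    Buff = newbuff;
    Position = Math.Min(Position, Buff.Length);
}
// Double the capacity, adding at most 100,000 chars per step
void ToDouble()
{
    SetCapacity(Buff.Length + Math.Min(Buff.Length, 10 * 10000));
}

The first method, SetCapacity, also supports shrinking the buffer, although nothing uses that yet.

The second method doubles the buffer, with one restriction: a single step adds at most 100,000 characters, so once the buffer holds more than 100,000 characters it grows by 100,000 at a time instead of doubling.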
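
The growth rule can be isolated as a pure function, which makes it easy to check on its own. This is a sketch under my own assumptions: the names GrowthPolicy, Next, MaxStep and MaxCapacity are hypothetical, and it reads the rule as "double, but never add more than 100,000 chars in one step".

```csharp
using System;

// Hypothetical helper: computes the next buffer capacity.
public static class GrowthPolicy
{
    public const int MaxStep = 10 * 10000;       // add at most 100,000 chars per step
    public const int MaxCapacity = 6000 * 10000; // hard limit of 60,000,000 chars

    public static int Next(int current)
    {
        // Doubling adds `current` chars; cap that addition at MaxStep,
        // and always add at least one char so an empty buffer can grow.
        int step = Math.Max(1, Math.Min(current, MaxStep));
        long next = (long)current + step;
        if (next > MaxCapacity)
        {
            throw new OutOfMemoryException("capacity limit exceeded");
        }
        return (int)next;
    }
}
```

With this rule a 2,048-char buffer grows to 4,096, while a 200,000-char buffer grows to 300,000 rather than 400,000. The result is always larger than the input, so a full buffer can never shrink by accident.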

// When the buffer is full, grow it
void Try()
{
    if (Position >= Buff.Length)
    {
        ToDouble();
    }
}
// Check the remaining space; if it is not enough, grow to twice the required size
void Check(int count)
{
    var pre = Position + count;
    if (pre >= Buff.Length)
    {
        SetCapacity(pre * 2);
    }
}

 

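These two helper methods make the growth logic convenient to call.

For example, call Try before appending a single character.

Call Check before appending a known number of characters.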

 

  Performance

For performance, all I need is for each method to be faster than its StringBuilder counterpart, which is actually not very hard.

 Handling bool

public QuickStringWriter Append(Boolean val)
{
    if (val)
    {
        Check(4);
        Buff[Position++] = 't';
        Buff[Position++] = 'r';
        Buff[Position++] = 'u';
        Buff[Position++] = 'e';
    }
    else
    {
        Check(5);
        Buff[Position++] = 'f';
        Buff[Position++] = 'a';
        Buff[Position++] = 'l';
        Buff[Position++] = 's';
        Buff[Position++] = 'e';
    }
    return this;
}


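One million appends of false

StringBuilder         19ms

QuickStringWriter  9ms

P.S. The framework converts bool to a string with a capitalized first letter ("True"/"False"); for convenience I write the lowercase form directly.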
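
The difference in casing is easy to see side by side. The helper names below (BoolText, ViaStringBuilder, Lowercase) are hypothetical; the lowercase form is presumably the more convenient one because the writer is later used to build JSON, where the literals are true and false.

```csharp
using System;
using System.Text;

// Hypothetical helpers comparing the two spellings of a bool.
public static class BoolText
{
    // StringBuilder.Append(bool) uses the framework text: "True" / "False".
    public static string ViaStringBuilder(bool value)
    {
        return new StringBuilder().Append(value).ToString();
    }

    // Lowercase literals, the form QuickStringWriter writes.
    public static string Lowercase(bool value)
    {
        return value ? "true" : "false";
    }
}
```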

 Handling DateTime

// Writes the value as "yyyy-MM-dd HH:mm:ss"
public QuickStringWriter Append(DateTime val)
{
    Check(18);
    // Pad the year to four digits
    if (val.Year < 1000)
    {
        Buff[Position++] = '0';
        if (val.Year < 100)
        {
            Buff[Position++] = '0';
            if (val.Year < 10)
            {
                Buff[Position++] = '0';
            }
        }
    }
    Append((long)val.Year);
    Buff[Position++] = '-';

    // Every remaining part is padded to two digits
    if (val.Month < 10)
    {
        Buff[Position++] = '0';
    }
    Append((long)val.Month);
    Buff[Position++] = '-';

    if (val.Day < 10)
    {
        Buff[Position++] = '0';
    }
    Append((long)val.Day);
    Buff[Position++] = ' ';

    if (val.Hour < 10)
    {
        Buff[Position++] = '0';
    }
    Append((long)val.Hour);
    Buff[Position++] = ':';

    if (val.Minute < 10)
    {
        Buff[Position++] = '0';
    }
    Append((long)val.Minute);
    Buff[Position++] = ':';

    if (val.Second < 10)
    {
        Buff[Position++] = '0';
    }
    Append((long)val.Second);
    return this;
}


One hundred thousand appends of DateTime.Now

StringBuilder         90ms

QuickStringWriter  55ms
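
The same fixed-width layout can also be written with a small two-digit helper, which removes the repeated padding branches. This is a variation on the idea, not the code from this article; DateText, WriteTwoDigits and Format are hypothetical names, and the sketch builds a standalone string instead of writing into the shared buffer.

```csharp
using System;

// Hypothetical variation: format a DateTime as "yyyy-MM-dd HH:mm:ss".
public static class DateText
{
    // Writes value (0..99) as exactly two digits at buff[pos]; returns the new position.
    static int WriteTwoDigits(char[] buff, int pos, int value)
    {
        buff[pos++] = (char)('0' + value / 10);
        buff[pos++] = (char)('0' + value % 10);
        return pos;
    }

    public static string Format(DateTime val)
    {
        var buff = new char[19]; // the layout is always 19 chars long
        int pos = 0;
        pos = WriteTwoDigits(buff, pos, val.Year / 100); // first two year digits
        pos = WriteTwoDigits(buff, pos, val.Year % 100); // last two year digits
        buff[pos++] = '-';
        pos = WriteTwoDigits(buff, pos, val.Month);
        buff[pos++] = '-';
        pos = WriteTwoDigits(buff, pos, val.Day);
        buff[pos++] = ' ';
        pos = WriteTwoDigits(buff, pos, val.Hour);
        buff[pos++] = ':';
        pos = WriteTwoDigits(buff, pos, val.Minute);
        buff[pos++] = ':';
        pos = WriteTwoDigits(buff, pos, val.Second);
        return new string(buff, 0, pos);
    }
}
```

The output should match val.ToString("yyyy-MM-dd HH:mm:ss", CultureInfo.InvariantCulture), which is a handy reference to test against.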

 Handling integer types

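Char[] NumberBuff;   // 64-char scratch buffer, filled from the end
public QuickStringWriter Append(Int64 val)
{
    if (val == 0)
    {
        Try();   // make sure there is room for one char
        Buff[Position++] = '0';
        return this;
    }

    var pos = 63;
    if (val < 0)
    {
        Try();   // make sure there is room for the sign
        Buff[Position++] = '-';
        // val % 10 is in -9..0 here, so ~(val % 10) + '1' equals '0' - (val % 10)
        NumberBuff[pos] = (char)(~(val % 10) + '1');
        if (val <= -10)   // at least one more digit remains
        {
            val = val / -10;   // now positive, so the loop below works unchanged
            NumberBuff[--pos] = (char)(val % 10 + '0');
        }
    }
    else
    {
        NumberBuff[pos] = (char)(val % 10 + '0');
    }
    // Emit the remaining digits from least to most significant
    while ((val = val / 10L) != 0L)
    {
        NumberBuff[--pos] = (char)(val % 10L + '0');
    }
    var length = 64 - pos;
    Check(length);
    Array.Copy(NumberBuff, pos, Buff, Position, length);
    Position += length;
    return this;
}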


One million appends    long.MaxValue    sbyte.MaxValue

StringBuilder          190ms            120ms

QuickStringWriter      115ms            33ms
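
The core trick is to produce digits from least to most significant into a scratch buffer filled from the end, then copy the used tail in one go. Below is a standalone sketch of that technique; IntText and Format are hypothetical names, and unlike the method above it handles the sign by taking the absolute value of each remainder, which also covers long.MinValue.

```csharp
using System;

// Hypothetical standalone version of the digit-extraction technique.
public static class IntText
{
    public static string Format(long val)
    {
        var scratch = new char[20]; // 19 digits plus an optional sign is the longest long
        int pos = scratch.Length;
        bool negative = val < 0;
        do
        {
            // For negative input the remainder is in -9..0; taking its absolute
            // value per digit avoids negating val, which would overflow for long.MinValue.
            int digit = (int)(val % 10);
            scratch[--pos] = (char)('0' + Math.Abs(digit));
            val /= 10; // integer division truncates toward zero
        } while (val != 0);
        if (negative)
        {
            scratch[--pos] = '-';
        }
        return new string(scratch, pos, scratch.Length - pos);
    }
}
```

Boundary values such as 0, -10, long.MaxValue and long.MinValue are worth testing explicitly, since hand-written sign handling tends to go wrong at exactly those points.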

 Handling char
public QuickStringWriter Append(Char val)
{
    Try();
    Buff[Position++] = val;
    return this;
}

One million appends    'a'

StringBuilder         7ms

QuickStringWriter  4ms

 Handling String

public QuickStringWriter Append(String val)
{
    if (val == null || val.Length == 0)
    {
        return this;
    }
    else if (val.Length <= 3)
    {
        // Very short strings: copy char by char
        Check(val.Length);
        Buff[Position++] = val[0];
        if (val.Length > 1)
        {
            Buff[Position++] = val[1];
            if (val.Length > 2)
            {
                Buff[Position++] = val[2];
            }
        }
    }
    else
    {
        // Longer strings: bulk copy into the buffer
        Check(val.Length);
        val.CopyTo(0, Buff, Position, val.Length);
        Position += val.Length;
    }
    return this;
}


Hmm... this one is almost identical to StringBuilder.

The remaining types simply convert to a string and go through Append(string).

 Handling other types

public QuickStringWriter Append(Guid val)
{
    Append(val.ToString());
    return this;
}

public QuickStringWriter Append(Decimal val)
{
    Append(val.ToString(System.Globalization.NumberFormatInfo.InvariantInfo));
    return this;
}
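
Passing NumberFormatInfo.InvariantInfo matters because the default conversion follows the current culture, and some cultures use a comma as the decimal separator. The sketch below simulates such a culture with a custom NumberFormatInfo so it does not depend on the machine's settings; NumberText and its method names are hypothetical.

```csharp
using System;
using System.Globalization;

// Hypothetical helpers showing culture-dependent vs invariant number text.
public static class NumberText
{
    // Culture-independent text: always uses '.' as the decimal separator.
    public static string Invariant(decimal val)
    {
        return val.ToString(NumberFormatInfo.InvariantInfo);
    }

    // Simulates a culture whose decimal separator is a comma.
    public static string WithCommaSeparator(decimal val)
    {
        var nfi = new NumberFormatInfo { NumberDecimalSeparator = "," };
        return val.ToString(nfi);
    }
}
```

For 1.5m the first method gives "1.5" and the second gives "1,5"; only the first is safe to embed in formats such as JSON.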


  Plugging it into JsonBuilder

 With everything done, I tried it in my earlier JsonBuilder.

//protected StringBuilder Buff = new StringBuilder(4096);
protected QuickStringWriter Buff = new QuickStringWriter(4096); // character buffer

Only one call site needs to change.

public string ToJsonString(object obj)
{
    //Buff.Length = 0; // how StringBuilder is cleared
    Buff.Clear(); // how QuickStringWriter is cleared
    AppendObject(obj);
    return Buff.ToString();
}

Now let's compare before and after.

  • Original StringBuilder buffer

211ms | 166ms | 162ms | 164ms | 164ms | 160ms | 163ms | 160ms | 179ms | 156ms |

  • After switching to QuickStringWriter

167ms | 134ms | 134ms | 134ms | 135ms | 134ms | 134ms | 133ms | 134ms | 134ms |

P.S. I also ran the test on my slow office machine, and the weaker the hardware, the bigger the gap.
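
The article does not show how these timings were taken. A minimal sketch of one way to collect a row of per-run timings like the ones above is given here; Bench and Measure are hypothetical names, and the untimed warm-up run is my own addition.

```csharp
using System;
using System.Diagnostics;

// Hypothetical timing helper based on System.Diagnostics.Stopwatch.
public static class Bench
{
    // Runs the action once untimed (warm-up), then returns the elapsed
    // milliseconds of each of the following `rounds` runs.
    public static long[] Measure(Action action, int rounds)
    {
        action(); // warm-up, not timed
        var results = new long[rounds];
        for (int i = 0; i < rounds; i++)
        {
            var sw = Stopwatch.StartNew();
            action();
            sw.Stop();
            results[i] = sw.ElapsedMilliseconds;
        }
        return results;
    }
}
```

Passing the same workload once with a StringBuilder and once with a QuickStringWriter, for ten rounds each, gives two rows that can be compared directly. The first value in each row above is noticeably higher than the rest, possibly because of one-time start-up cost; a warm-up run is a common way to keep that out of the numbers.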

  Full code

 QuickStringWriter full code

using System;
using System.Collections.Generic;
using System.Text;

namespace blqw
{
public class QuickStringWriter : IDisposable
{
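// Note: LiteracyGetter and Literacy are not defined in this listing,
// and StringFirstChar is not used by any method below.
static LiteracyGetter StringFirstChar;
static QuickStringWriter()
{
var f = typeof(string).GetField("m_firstChar", System.Reflection.BindingFlags.NonPublic | System.Reflection.BindingFlags.Instance);
StringFirstChar = Literacy.CreateGetter(f);
}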


public QuickStringWriter() : this(2048) { }
/// <summary>
/// Creates a new instance with the specified initial capacity
/// </summary>
/// <param name="capacity"></param>
public QuickStringWriter(int capacity)
{
NumberBuff = new Char[64];
Buff = new Char[capacity];
}

// Set the buffer capacity
void SetCapacity(int capacity)
{
if (capacity > Buff.Length)
{
if (capacity > 6000 * 10000) // 60 million chars
{
throw new OutOfMemoryException("QuickStringWriter capacity cannot exceed 60 million characters");
}
}
var newbuff = new char[capacity];
Array.Copy(Buff, 0, newbuff, 0, Math.Min(Position, capacity));
Buff = newbuff;
Position = Math.Min(Position, Buff.Length);
}
// Double the capacity, adding at most 100,000 chars per step
void ToDouble()
{
SetCapacity(Buff.Length + Math.Min(Buff.Length, 10 * 10000));
}

Char[] NumberBuff;
Char[] Buff;
int Position;

public void Dispose()
{
NumberBuff = null;
Buff = null;
}

public QuickStringWriter Append(Boolean val)
{
if (val)
{
Check(4);
Buff[Position++] = 't';
Buff[Position++] = 'r';
Buff[Position++] = 'u';
Buff[Position++] = 'e';
}
else
{
Check(5);
Buff[Position++] = 'f';
Buff[Position++] = 'a';
Buff[Position++] = 'l';
Buff[Position++] = 's';
Buff[Position++] = 'e';
}
return this;
}
public QuickStringWriter Append(DateTime val)
{
Check(18);
if (val.Year < 1000)
{
Buff[Position++] = '0';
if (val.Year < 100)
{
Buff[Position++] = '0';
if (val.Year < 10)
{
Buff[Position++] = '0';
}
}
}
Append((long)val.Year);
Buff[Position++] = '-';

if (val.Month < 10)
{
Buff[Position++] = '0';
}
Append((long)val.Month);
Buff[Position++] = '-';

if (val.Day < 10)
{
Buff[Position++] = '0';
}
Append((long)val.Day);
Buff[Position++] = ' ';

if (val.Hour < 10)
{
Buff[Position++] = '0';
}
Append((long)val.Hour);
Buff[Position++] = ':';

if (val.Minute < 10)
{
Buff[Position++] = '0';
}
Append((long)val.Minute);
Buff[Position++] = ':';

if (val.Second < 10)
{
Buff[Position++] = '0';
}
Append((long)val.Second);
return this;
}

public QuickStringWriter Append(Guid val)
{
Append(val.ToString());
return this;
}

public QuickStringWriter Append(DateTime val, string format)
{

Append(val.ToString(format));
return this;
}
public QuickStringWriter Append(Guid val, string format)
{
Append(val.ToString(format));
return this;
}

public QuickStringWriter Append(Decimal val)
{
Append(val.ToString(System.Globalization.NumberFormatInfo.InvariantInfo));
return this;
}
public QuickStringWriter Append(Double val)
{
Append(Convert.ToString(val));
return this;
}
public QuickStringWriter Append(Single val)
{
Append(Convert.ToString(val));
return this;
}


public QuickStringWriter Append(SByte val)
{
Append((Int64)val);
return this;
}
public QuickStringWriter Append(Int16 val)
{
Append((Int64)val);
return this;
}
public QuickStringWriter Append(Int32 val)
{
Append((Int64)val);
return this;
}

public override string ToString()
{
return new string(Buff, 0, Position);
}

public QuickStringWriter Append(Int64 val)
{
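if (val == 0)
{
Try(); // make sure there is room for one char
Buff[Position++] = '0';
return this;
}

var pos = 63;
if (val < 0)
{
Try(); // make sure there is room for the sign
Buff[Position++] = '-';
// val % 10 is in -9..0 here, so ~(val % 10) + '1' equals '0' - (val % 10)
NumberBuff[pos] = (char)(~(val % 10) + '1');
if (val <= -10) // at least one more digit remains
{
val = val / -10; // now positive, so the loop below works unchanged
NumberBuff[--pos] = (char)(val % 10 + '0');
}
}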
else
{
NumberBuff[pos] = (char)(val % 10 + '0');
}
while ((val = val / 10L) != 0L)
{
NumberBuff[--pos] = (char)(val % 10L + '0');
}
var length = 64 - pos;
Check(length);
Array.Copy(NumberBuff, pos, Buff, Position, length);
Position += length;
return this;
}
public QuickStringWriter Append(Char val)
{
Try();
Buff[Position++] = val;
return this;
}
public QuickStringWriter Append(String val)
{
if (val == null || val.Length == 0)
{
return this;
}
else if (val.Length <= 3)
{
Check(val.Length);
Buff[Position++] = val[0];
if (val.Length > 1)
{
Buff[Position++] = val[1];
if (val.Length > 2)
{
Buff[Position++] = val[2];
}
}
}
else
{
Check(val.Length);
val.CopyTo(0, Buff, Position, val.Length);
Position += val.Length;
}
return this;
}

 

public QuickStringWriter Append(Byte val)
{
Append((UInt64)val);
return this;
}
public QuickStringWriter Append(UInt16 val)
{
Append((UInt64)val);
return this;
}
public QuickStringWriter Append(UInt32 val)
{
Append((UInt64)val);
return this;
}
public QuickStringWriter Append(UInt64 val)
{
if (val == 0)
{
Try(); // make sure there is room for one char
Buff[Position++] = '0';
return this;
}
var pos = 63;

NumberBuff[pos] = (char)(val % 10 + '0');

while ((val = val / 10L) != 0L)
{
NumberBuff[--pos] = (char)(val % 10L + '0');
}
var length = 64 - pos;
Check(length);
Array.Copy(NumberBuff, pos, Buff, Position, length);
Position += length;
return this;
}

public QuickStringWriter Clear()
{
Position = 0;
return this;
}

// When the buffer is full, grow it
void Try()
{
if (Position >= Buff.Length)
{
ToDouble();
}
}
// Check the remaining space; if it is not enough, grow to twice the required size
void Check(int count)
{
var pre = Position + count;
if (pre >= Buff.Length)
{
SetCapacity(pre * 2);
}
}


}
}


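 And that's it. In many cases it is a slimmed-down, faster StringBuilder that fully meets the need!

If I ever get the chance to release the simple ORM I use personally, you will see that its string concatenation uses this same class.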

 

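In my articles, apart from the pure code, what I want to convey is an idea, an approach to a problem. I hope readers won't limit themselves to the ready-made code here and will pay more attention to the overall line of thinking. Thank you.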
 
Categories: ASP.NET, C#
Posted on 2013-08-26 22:42 by HackerVirus