Erlang的map性能测试

map

Erlang在R17之后,支持了map,并且在R18用hash改进了实现。

使用场景

各种option的配置,比如mysql启动参数。监控树的ChildSpec(官方已用map)。还有一种,就是我个人觉得拿来做索引用,因为lists:keyfind太不靠谱了,k就是id,v就是对象(record)。其他场景,请大家发挥想象。

性能

测试项目地址

https://gitcafe.com/Roowe/erlang-map-test

主要对比了几种Erlang常用的kv做法。

测试结果(R18环境)

find表示可以找到的查找,unfind表示找不到,update表示找得到的更新,add表示找不到要添加。

100条规模

{{dict_add,100},[{min,0.46},{max,1.88},{arithmetic_mean,0.6275999999999998}]}
{{dict_find,100},[{min,0.26},{max,0.72},{arithmetic_mean,0.32360000000000005}]}
{{dict_unfind,100},[{min,0.24},{max,0.44},{arithmetic_mean,0.27279999999999993}]}
{{dict_update,100},[{min,0.48},{max,14.82},{arithmetic_mean,0.8868}]}

{{list_add,100},[{min,2.54},{max,3.56},{arithmetic_mean,2.8532000000000006}]}
{{list_find,100},[{min,0.18},{max,0.54},{arithmetic_mean,0.218}]}
{{list_unfind,100},[{min,0.22},{max,0.34},{arithmetic_mean,0.2512000000000001}]}
{{list_update,100},[{min,1.22},{max,4.98},{arithmetic_mean,1.6080000000000003}]}

{{map_add,100},[{min,0.12},{max,0.26},{arithmetic_mean,0.168}]}
{{map_find,100},[{min,0.1},{max,0.5},{arithmetic_mean,0.17919999999999994}]}
{{map_unfind,100},[{min,0.06},{max,0.12},{arithmetic_mean,0.08160000000000006}]}
{{map_update,100},[{min,0.14},{max,1.28},{arithmetic_mean,0.2156000000000001}]}

10000条规模

{{dict_add,10000},[{min,0.56},{max,1.68},{arithmetic_mean,0.76}]}
{{dict_find,10000},[{min,0.74},{max,1.22},{arithmetic_mean,0.8836000000000002}]}
{{dict_unfind,10000},[{min,0.24},{max,1.22},{arithmetic_mean,0.3724}]}
{{dict_update,10000},[{min,0.64},{max,6.8},{arithmetic_mean,0.9667999999999997}]}

{{list_add,10000},[{min,255.66},{max,651.9},{arithmetic_mean,471.7027999999999}]}
{{list_find,10000},[{min,13.38},{max,149.74},{arithmetic_mean,84.8036}]}
{{list_unfind,10000},[{min,24.68},{max,286.96},{arithmetic_mean,203.0548}]}
{{list_update,10000},[{min,118.78},{max,458.72},{arithmetic_mean,217.12160000000003}]}

{{map_add,10000},[{min,0.26},{max,0.66},{arithmetic_mean,0.31039999999999995}]}
{{map_find,10000},[{min,0.44},{max,1.72},{arithmetic_mean,0.5456000000000001}]}
{{map_unfind,10000},[{min,0.22},{max,0.6},{arithmetic_mean,0.2772}]}
{{map_update,10000},[{min,0.3},{max,0.58},{arithmetic_mean,0.368}]}

结论

总体上,小数据规模,几百左右,三种方法都不太有性能差距,上万的话就很明显了,list就不太中用了。在R17没有优化的map测试的时候,发现依旧比list快,看了底层实现,算法复杂度都是O(n),应该就是链表和数组的差距了,离散内存访问和连续内存访问的开销不一样。