
Query and join operations after sharding

2010-11-16 12:20  Sun Yongyue

I've been working on sharding recently. What I want is to be able to add more machines (nodes) to my system so that it can store more data, in other words to scale out.


I found that many people add a middle layer between their applications and their databases (MySQL, for example). At first glance that seemed perfect for me, but then I got puzzled about query and join operations.


Q1:
How should queries with an offset and a limit be handled, for example:
'select * from xxxx where yyyy order by col_a asc limit num_b, num_c'
Should the middle layer just run 'select * from xxxx where yyyy order by col_a asc limit 0, num_b + num_c' on each machine, merge the sorted results, and return the rows at positions [num_b, num_b + num_c)?
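The merge step described in Q1 could look roughly like the sketch below. It assumes every shard has already returned its first num_b + num_c rows sorted by col_a; the function name merge_page and the sample rows are made up for illustration and are not part of any real middle layer.

```python
import heapq
from itertools import islice
from operator import itemgetter


def merge_page(shard_results, offset, count, key):
    """Merge per-shard results (each already sorted by `key`) and
    return the rows at positions [offset, offset + count)."""
    merged = heapq.merge(*shard_results, key=key)  # lazy k-way merge
    return list(islice(merged, offset, offset + count))


# Hypothetical data: what three shards might return for
# "... order by col_a asc limit 0, 5" (num_b = 2, num_c = 3).
shard_a = [{"col_a": v} for v in (1, 4, 7, 10, 13)]
shard_b = [{"col_a": v} for v in (2, 3, 8, 9, 20)]
shard_c = [{"col_a": v} for v in (5, 6)]  # this shard has fewer matching rows

page = merge_page([shard_a, shard_b, shard_c], offset=2, count=3,
                  key=itemgetter("col_a"))
print([row["col_a"] for row in page])  # [3, 4, 5]
```

Each shard has to return num_b + num_c rows rather than num_c, because in the worst case all of the first num_b + num_c rows of the global order live on a single shard.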


Q2:
How should join operations across different databases be handled?

 

For now, I'm trying to solve the first problem by merging the results in the middle layer. That raises another tough problem: reducing memory usage. I'm still working on it, and suggestions are welcome~
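One possible way to keep memory low in the middle layer is to stream rows from each shard instead of loading every per-shard result set first. The sketch below is only an illustration under assumptions: in-memory SQLite databases stand in for the MySQL shards, the table holds a single col_a column, and make_shard and stream_page are hypothetical names.

```python
import heapq
import sqlite3
from itertools import islice


def make_shard(values):
    """Create an in-memory SQLite database standing in for one shard."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE xxxx (col_a INTEGER)")
    conn.executemany("INSERT INTO xxxx (col_a) VALUES (?)",
                     [(v,) for v in values])
    return conn


def stream_page(shards, offset, count):
    """Return rows [offset, offset + count) of the global col_a order,
    pulling rows from the shard cursors one at a time."""
    cursors = [
        conn.execute(
            "SELECT col_a FROM xxxx ORDER BY col_a ASC LIMIT ?",
            (offset + count,),
        )
        for conn in shards
    ]
    # heapq.merge buffers only one pending row per cursor; rows are
    # 1-tuples, so they compare by col_a.
    merged = heapq.merge(*cursors)
    # islice discards the first `offset` rows as they stream past and
    # stops reading once the page is complete.
    return list(islice(merged, offset, offset + count))


shards = [make_shard([1, 4, 7, 10, 13]),
          make_shard([2, 3, 8, 9, 20]),
          make_shard([5, 6])]
page = stream_page(shards, offset=2, count=3)
print(page)  # [(3,), (4,), (5,)]
```

With this shape the middle layer holds about one row per shard plus the page being returned. Some MySQL client libraries buffer the whole result set by default, in which case an unbuffered (streaming) cursor would be needed to get the same effect. The shards themselves still have to sort and send up to num_b + num_c rows each, so deep offsets remain expensive.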