
Posted by: tailang (西瓜太郎), Board: Database
Subject: Re: question on large tables (>=800 million records, 10 G bytes
Posted at: BBS 未名空间站 (Wed Jan 17 22:03:56 2007)

Have you ever considered a non-DB approach?

The data structure looks pretty straightforward. With some compression, you
should be able to fit all of the data, including an 8-byte unique key per
record, into the memory of a single 64-bit machine with 16 GB of RAM.
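
For a rough idea of the sizes involved, here is a hypothetical fixed-width
layout in Python. Everything about the layout is an assumption, not something
from the original question: cabId is replaced by a dictionary-encoded 32-bit
number, the timestamp is stored as 32-bit Unix seconds, the coordinates as
32-bit floats, and the unique key is taken to be the row position instead of
being stored. It is only one way the compression could go.

```python
import struct

# Hypothetical packed layout (assumption): little-endian, no padding.
# uint32 cab number, uint32 Unix seconds, float32 longitude,
# float32 latitude, 1-byte status.
RECORD = struct.Struct("<IIffc")


def pack_record(cab_no, ts, lon, lat, status):
    """Pack one record into RECORD.size bytes."""
    return RECORD.pack(cab_no, ts, lon, lat, status.encode("ascii"))


def unpack_record(buf, row_id):
    """Read the record at position row_id; the position acts as the unique key."""
    cab_no, ts, lon, lat, status = RECORD.unpack_from(buf, row_id * RECORD.size)
    return cab_no, ts, lon, lat, status.decode("ascii")


def estimated_bytes(n_records):
    """Total size of n_records packed records, in bytes."""
    return n_records * RECORD.size
```

With this layout a record takes 17 bytes, so 800 million records come to
about 13.6 GB before any index is built.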

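Now think about the reverse (inverted) index. You need to build one index
each on cabId, timestamps, longitude, and latitude. Each index maps field
values to the unique keys of the matching records, and each one could fit on
a single machine.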
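
A minimal sketch of what such indices might look like, in Python. The class
names `ExactIndex` and `RangeIndex` are made up for illustration, and plain
in-memory sets and lists stand in for whatever compact structure a real
800-million-record index would need.

```python
import bisect
from collections import defaultdict


class ExactIndex:
    """Maps a discrete value (e.g. a cabId) to the keys of records holding it."""

    def __init__(self):
        self._keys_by_value = defaultdict(set)

    def add(self, value, key):
        self._keys_by_value[value].add(key)

    def lookup(self, value):
        # Return a copy so callers cannot mutate the index.
        return set(self._keys_by_value.get(value, ()))


class RangeIndex:
    """Sorted (value, key) pairs for range lookups on timestamps or coordinates."""

    def __init__(self, pairs):
        self._pairs = sorted(pairs)
        self._values = [value for value, _ in self._pairs]

    def keys_between(self, low, high):
        # Inclusive range [low, high], located by binary search.
        start = bisect.bisect_left(self._values, low)
        stop = bisect.bisect_right(self._values, high)
        return {key for _, key in self._pairs[start:stop]}
```

The exact-match index suits cabId, while the sorted one suits the fields that
are usually queried by range.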

When a query comes in, get the set of matching unique keys from each index
and take the intersection of those sets.
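
The intersection step could be sketched like this in Python. The function
name `intersect_keys` is hypothetical, and starting from the smallest set is
just a common way to keep the work down, not something the approach requires.

```python
def intersect_keys(key_sets):
    """Intersect the key sets returned by each index, smallest set first."""
    if not key_sets:
        return set()
    ordered = sorted(key_sets, key=len)
    result = set(ordered[0])
    for keys in ordered[1:]:
        result &= keys
        if not result:
            # Nothing can match any more, so stop early.
            break
    return result
```

For example, `intersect_keys([{1, 2, 3}, {2, 3, 4}, {3, 2, 9}])` gives
`{2, 3}`, and those keys are what would go to the data host.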

Then do the final lookup on the data host with the remaining keys to return
the result.

I would expect an average execution time under 10 ms. Updating and
serializing records is another thing you may want to consider, but that is
separate from query time.

2c

[ babycry (babycry) wrote: ]
: Hello, this is a data set for data mining.
: I believe the experience from this case should be helpful in general.
: The question is how to make fast queries on large tables
: (>=800 million records, 10G bytes of data)
: with ordinary machines?
: Below are some details:
: There is only one table, with the following fields:
: cabId CHAR(8), timestamps DATETIME, longitude FLOAT, latitude FLOAT,
: status CHAR(1)
: We want to be able to query on cabId, timestamps, latitude, and longitude.
: ...................



--

※ Source: BBS 未名空间站 http://mitbbs.com [FROM: 71.198.]
