Author: wyr (遗忘小资), Board: Database
Subject: Re: question on large tables (>=800 million records, 10 G b
Posted from: BBS 未名空间站 (Sat Jan 20 01:41:41 2007), forwarded
I do not quite follow what you guys are discussing here.
Are you just saying that you have an 800M-record table with no active insert/
update/delete, and you want to query it by PK?
800M rows does not sound extremely huge considering your data file.
If you just want to find a result, then you may build a customized B-Tree to
store your PK. You can easily implement partitioning or hashing here to help
compress the tree, right? With this implementation, you can shrink the
physical size of your index file(s).
And with this index, I believe you can easily point each entry of your PK to
a particular location in your data file(s). With the help of customized hashing
or partitioning, a search on the partitioned index files should not take long.
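
To make the idea concrete, here is a rough sketch in Python. It is only an
illustration under my own assumptions, not a tuned implementation: the PK is
assumed to fit in an unsigned 64-bit integer, the names (NUM_PARTITIONS,
build_index, lookup, the idx_NNN.bin files) are made up, and a sorted
fixed-width index file with binary search stands in for the B-Tree. For 800M
rows you would build each partition with an external sort rather than in
memory as done here.

```python
import os
import struct

NUM_PARTITIONS = 16           # assumed value; choose so each index file stays small
ENTRY = struct.Struct("<QQ")  # one index entry: (primary key, byte offset in data file)


def partition_of(pk: int) -> int:
    """Hash partitioning: route a key to one of the index files."""
    return pk % NUM_PARTITIONS


def build_index(records, data_path, index_dir):
    """Write records to one data file plus one sorted index file per partition."""
    buckets = [[] for _ in range(NUM_PARTITIONS)]
    with open(data_path, "wb") as data:
        for pk, payload in records:
            offset = data.tell()
            # Length-prefixed payload so a reader knows how many bytes to fetch.
            data.write(struct.pack("<I", len(payload)) + payload)
            buckets[partition_of(pk)].append((pk, offset))
    for i, bucket in enumerate(buckets):
        bucket.sort()  # sorted by PK so the file can be binary-searched
        with open(os.path.join(index_dir, f"idx_{i:03d}.bin"), "wb") as idx:
            for pk, offset in bucket:
                idx.write(ENTRY.pack(pk, offset))


def lookup(pk, data_path, index_dir):
    """Binary-search the key's index partition, then seek into the data file."""
    idx_path = os.path.join(index_dir, f"idx_{partition_of(pk):03d}.bin")
    with open(idx_path, "rb") as idx:
        lo, hi = 0, os.path.getsize(idx_path) // ENTRY.size
        while lo < hi:
            mid = (lo + hi) // 2
            idx.seek(mid * ENTRY.size)
            key, offset = ENTRY.unpack(idx.read(ENTRY.size))
            if key == pk:
                with open(data_path, "rb") as data:
                    data.seek(offset)
                    (length,) = struct.unpack("<I", data.read(4))
                    return data.read(length)
            if key < pk:
                lo = mid + 1
            else:
                hi = mid
    return None  # key not present in this partition
```

Each lookup touches only one partition, so the number of seeks grows with the
log of the partition size rather than of the whole table. A real B-Tree would
cut the seeks further by packing many keys into each disk page.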
Finally, hardware is so cheap today, why not simply get a large disk
array and use RAID to improve your IO performance? Your RAM is simply too
small for the problem you described here...
【 babycry (babycry) wrote: 】
: Thanks! I like this suggestion.
: This is actually the approach we are currently using.
: It is pretty ad hoc, however, it saves a lot of software-engineering time.
: We dislike software-engineering since we are not accredited for doing it.
: The current query time is normally 2~5 minutes.
: This query time is not good for webapps,
: but is acceptable for data mining.
: Since we do not update/insert,
: the data integrity issue of having several copies of the same data
: is not a problem.
※ Edited:·wyr edited this post on Jan 20 01:43:20·[FROM: 70.244.]
※ Source:·BBS 未名空间站 mitbbs.com·[FROM: 70.244.]