Details

Type: Improvement
Status: Resolved
Priority: Low
Resolution: Fixed
Description
http://wiki.apache.org/cassandra/CassandraHardware makes mention of very large disks; I do not see how that would be possible.
We have a server-class system with 4 processors, 16 GB RAM, and a 6-disk RAID 5 array (yes, RAID 0 would be faster, but still):
INFO [main] 2010-09-21 12:58:26,348 SSTableReader.java (line 120) Sampling index for /var/lib/cassandra/data/system/LocationInfo-699-Data.db
...
INFO [main] 2010-09-21 13:05:51,333 CassandraDaemon.java (line 124) Binding thrift service to cdbsd07/10.71.71.57:9160
This node has 200 GB of data in two column families, and the time to sample all tables and start up is 7+ minutes. The logging suggests this process happens one SSTable at a time. Additionally, the normal system vitals, mainly disk and CPU, do not look overtaxed.
- Since SSTables are immutable, is there a way the sampling of the tables could be saved and reused? (A rough sketch of the idea follows this list.)
- Could this process be done in parallel for a speedup?
- Can multiple column families be processed at once?
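On the first question: since SSTables never change after they are written, the summary built by sampling could in principle be written to disk next to the SSTable the first time and simply loaded back on later restarts. A minimal sketch of the idea, not the actual Cassandra code; the IndexSummary type, the buildSummary() helper, and the -Summary.db file name are all hypothetical:

{code:java}
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SummaryCache
{
    // Hypothetical stand-in for the in-memory result of index sampling.
    static class IndexSummary implements Serializable
    {
        private static final long serialVersionUID = 1L;
    }

    // Hypothetical stand-in for the expensive pass over the -Index.db file.
    static IndexSummary buildSummary(File indexFile)
    {
        return new IndexSummary();
    }

    // Load a previously saved summary if one exists next to the (immutable) SSTable,
    // otherwise build it once and write it out so the next startup can skip the scan.
    static IndexSummary loadOrBuild(File indexFile) throws IOException, ClassNotFoundException
    {
        File cached = new File(indexFile.getPath().replace("-Index.db", "-Summary.db"));
        if (cached.exists())
        {
            ObjectInputStream in = new ObjectInputStream(
                    new BufferedInputStream(new FileInputStream(cached)));
            try
            {
                return (IndexSummary) in.readObject();
            }
            finally
            {
                in.close();
            }
        }

        IndexSummary summary = buildSummary(indexFile);
        ObjectOutputStream out = new ObjectOutputStream(
                new BufferedOutputStream(new FileOutputStream(cached)));
        try
        {
            out.writeObject(summary);
        }
        finally
        {
            out.close();
        }
        return summary;
    }
}
{code}

Because the SSTable is immutable, the cached summary never goes stale; it only needs to be deleted along with the SSTable when compaction removes it.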
Unless someone has an insanely powerful disk pack, the mention of 2 TB limitations seems out of place. Unless my calculations are wrong (which they usually are), I have pretty decent hardware, and if I had 2 TB of data I would be looking at a 95-minute node startup?
I hope that sampling multiple ColumnFamilies at once would let nodes with at least a few hundred GB of data start up reasonably fast.
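To illustrate the parallel idea, here is a minimal sketch, again not the actual Cassandra code, of sampling many SSTables on a fixed-size thread pool; sampleIndex() stands in for whatever per-file work SSTableReader currently does:

{code:java}
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelSampling
{
    // Hypothetical stand-in for the in-memory result of index sampling.
    static class IndexSummary {}

    // Hypothetical stand-in for sampling one SSTable's -Index.db file.
    static IndexSummary sampleIndex(File indexFile)
    {
        return new IndexSummary();
    }

    // Sample every SSTable on a fixed pool instead of one at a time, so that
    // several files are in flight and neither the disks nor the CPUs sit idle.
    static List<IndexSummary> sampleAll(List<File> indexFiles) throws Exception
    {
        int threads = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try
        {
            List<Future<IndexSummary>> futures = new ArrayList<Future<IndexSummary>>();
            for (final File f : indexFiles)
            {
                futures.add(pool.submit(new Callable<IndexSummary>()
                {
                    public IndexSummary call()
                    {
                        return sampleIndex(f);
                    }
                }));
            }

            List<IndexSummary> summaries = new ArrayList<IndexSummary>();
            for (Future<IndexSummary> future : futures)
                summaries.add(future.get()); // rethrows any failure from a worker
            return summaries;
        }
        finally
        {
            pool.shutdown();
        }
    }
}
{code}

The same pool could be fed SSTables from every column family at once, which would also cover the third question; whether the win comes mainly from overlapping I/O with CPU or from the RAID set serving several streams would need measuring.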