Tuesday, September 24, 2013

Stress Testing Apache Cassandra Database

Read Complete Post From Source: BLog.CripperZ.SG

The stress tool is located in the Apache Cassandra installation folder under tools/bin/cassandra-stress.


Below is a simple stress test script for Ubuntu. It writes the cassandra-stress output and the system statistics collected by dstat (which must be installed) to log files.


 



#!/bin/bash
# $1 is a label that prefixes every log file name.

# Insert phase: dstat logs system stats in the background while the test runs.
dstat -t -r -m -s -d -c -y -p > "/root/$1-vmstat.log" &
PID=$!
./cassandra-stress -c 10 -S 100 -n 30000000 -i 1 -o INSERT > "/root/$1-stress.log"
kill "$PID"

# Read phase
dstat -t -r -m -s -d -c -y -p > "/root/$1-read-vmstat.log" &
PID=$!
./cassandra-stress -c 10 -S 100 -n 30000000 -i 1 -o READ > "/root/$1-read-stress.log"
kill "$PID"

# Range slice phase
dstat -t -r -m -s -d -c -y -p > "/root/$1-rangeslice-vmstat.log" &
PID=$!
./cassandra-stress -c 10 -S 100 -n 30000000 -i 1 -o RANGE_SLICE > "/root/$1-rangeslice-stress.log"
kill "$PID"


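The script takes one argument, a label used to prefix the log file names, and runs three phases (insert, read, range slice). During each phase dstat logs system statistics in the background and is killed once cassandra-stress finishes.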


 


The following reference material on the stress tool is taken from http://www.datastax.com/docs/1.1/references/stress_java.


The cassandra-stress tool is a Java-based stress testing utility for benchmarking and load testing a Cassandra cluster. The binary installation of the tool also includes a daemon, which in larger-scale testing can prevent potential skews in the test results by keeping the JVM warm.


There are different modes of operation:


  • Inserting: Loads test data.

  • Reading: Reads test data.

  • Indexed range slicing: Works with RandomPartitioner on indexed column families.

You can use these modes with or without the cassandra-stressd daemon running (binary installs only).



Usage


  • Packaged installs: cassandra-stress [options]

  • Binary installs: <install_location>/tools/bin/cassandra-stress [options]

The available options are listed at the end of this post.

Using the Daemon Mode


Usage for the daemon mode in binary installs:



<install_location>/tools/bin/cassandra-stressd start|stop|status [-h <host>]


During stress testing, you can keep the daemon running and send commands to it using the --send-to option.




Examples


  • Insert 1,000,000 rows into the given host:


    /tools/bin/cassandra-stress -d 192.168.1.101




    When the number of rows is not specified, one million rows are inserted.


  • Read 1,000,000 rows from the given host:


    tools/bin/cassandra-stress -d 192.168.1.101 -o read





  • Insert 10,000,000 rows across two nodes:


    /tools/bin/cassandra-stress -d 192.168.1.101,192.168.1.102 -n 10000000





  • Insert 10,000,000 rows across two nodes using the daemon mode:


    /tools/bin/cassandra-stress -d 192.168.1.101,192.168.1.102 -n 10000000 --send-to 54.0.0.1
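The multi-node examples above pass the hosts to -d as a comma-separated list with no spaces. When the hosts are held as separate words, a small helper can build that list. This is only a sketch; join_nodes is a made-up name and is not part of cassandra-stress.

```shell
# join_nodes: print all arguments joined by commas, with no spaces.
# Hypothetical helper for building the value of the -d option.
join_nodes() {
    # "$*" joins the arguments using the first character of IFS.
    local IFS=,
    printf '%s\n' "$*"
}
```

For example, the output of join_nodes 192.168.1.101 192.168.1.102 can be passed directly as the argument of -d.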







Interpreting the output of cassandra-stress


The cassandra-stress tool periodically outputs information about the running tests. For example:




7251,725,725,56.1,95.1,191.8,10
19523,1227,1227,41.6,86.1,189.1,21
41348,2182,2182,22.5,75.7,176.0,31
...



Each line reports data for the interval between the last elapsed time and current elapsed time, which is set by the --progress-interval option (default 10 seconds). The following explains this information:


  • total: the total number of operations since the start of the test.

  • interval_op_rate: the number of operations performed per second during the interval.

  • interval_key_rate: the number of keys/rows read or written per second during the interval (normally the same as interval_op_rate unless doing range slices).

  • latency: the average latency for each operation during that interval.

  • elapsed: the number of seconds elapsed since the beginning of the test.
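The log files written by the script can be summarized with a few lines of awk. The sketch below makes some assumptions: the sample lines have seven comma-separated values while the list above names only five, so it treats the first field as total, the fourth as latency and the last as elapsed, and ignores the rest. The function name summarize_stress_log is made up for this example, and the latency figure is a plain unweighted mean of the per-interval averages.

```shell
# Sketch: summarize cassandra-stress progress lines read from stdin.
# Assumes field 1 = total, field 4 = average latency, last field = elapsed.
summarize_stress_log() {
    awk -F, '
        # Use only lines that start with a number and have at least 5 fields;
        # this skips any header line and other non-data output.
        /^[0-9]+,/ && NF >= 5 {
            total = $1
            elapsed = $NF
            sum_latency += $4
            n++
        }
        END {
            if (n == 0 || elapsed == 0) {
                print "no data"
                exit 1
            }
            printf "total=%d elapsed=%ds overall_rate=%.1f avg_latency=%.2f\n",
                total, elapsed, total / elapsed, sum_latency / n
        }'
}
```

It reads from standard input, so it can be fed one of the stress logs produced by the script, for example with summarize_stress_log < /root/run1-stress.log, where run1 stands for whatever label was passed to the script.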


 



Available options (long option, short option: description):

  • --average-size-values, -V: Generate column values of average rather than specific size.

  • --cardinality <CARDINALITY>, -C <CARDINALITY>: Number of unique values stored in columns. Default is 50.

  • --columns <COLUMNS>, -c <COLUMNS>: Number of columns per key. Default is 5.

  • --column-size <COLUMN-SIZE>, -S <COLUMN-SIZE>: Size of column values in bytes. Default is 34.

  • --compaction-strategy <COMPACTION-STRATEGY>, -Z <COMPACTION-STRATEGY>: Specifies which compaction strategy to use.

  • --comparator <COMPARATOR>, -U <COMPARATOR>: Specifies which column comparator to use. Supported types are: TimeUUIDType, AsciiType, and UTF8Type.

  • --compression <COMPRESSION>, -I <COMPRESSION>: Specifies the compression to use for SSTables. Default is no compression.

  • --consistency-level <CONSISTENCY-LEVEL>, -e <CONSISTENCY-LEVEL>: Consistency level to use (ONE, QUORUM, LOCAL_QUORUM, EACH_QUORUM, ALL, ANY). Default is ONE.

  • --create-index <CREATE-INDEX>, -x <CREATE-INDEX>: Type of index to create on column families (KEYS).

  • --enable-cql, -L: Perform queries using CQL (Cassandra Query Language).

  • --family-type <TYPE>, -y <TYPE>: Sets the column family type.

  • --file <FILE>, -f <FILE>: Write output to a given file.

  • --help, -h: Show help.

  • --keep-going, -k: Ignore errors when inserting or reading. When set, --keep-trying has no effect. Default is false.

  • --keep-trying <KEEP-TRYING>, -K <KEEP-TRYING>: Retry the ongoing operation N times in case of failure. Use a positive integer. Default is 10.

  • --keys-per-call <KEYS-PER-CALL>, -g <KEYS-PER-CALL>: Number of keys per call. Default is 1000.

  • --nodes <NODES>, -d <NODES>: Nodes to perform the test against. Must be comma separated with no spaces. Default is localhost.

  • --nodesfile <NODESFILE>, -D <NODESFILE>: File containing host nodes (one per line).

  • --no-replicate-on-write, -W: Set replicate_on_write to false for counters. Only for counters with a consistency level of ONE (CL=ONE).

  • --num-different-keys <NUM-DIFFERENT-KEYS>, -F <NUM-DIFFERENT-KEYS>: Number of different keys. If less than NUM-KEYS, the same key is re-used multiple times. Default is NUM-KEYS.

  • --num-keys <NUMKEYS>, -n <NUMKEYS>: Number of keys to write or read. Default is 1,000,000.

  • --operation <OPERATION>, -o <OPERATION>: Operation to perform: INSERT, READ, RANGE_SLICE, INDEXED_RANGE_SLICE, MULTI_GET, COUNTER_ADD, COUNTER_GET. Default is INSERT.

  • --port <PORT>, -p <PORT>: Thrift port. Default is 9160.

  • --progress-interval <PROGRESS-INTERVAL>, -i <PROGRESS-INTERVAL>: The interval, in seconds, at which progress is output. Default is 10 seconds.

  • --query-names <QUERY-NAMES>, -Q <QUERY-NAMES>: Comma-separated list of column names to retrieve from each row.

  • --random, -r: Use random key generator. When used, --stdev has no effect. Default is false.

  • --replication-factor <REPLICATION-FACTOR>, -l <REPLICATION-FACTOR>: Replication factor to use when creating column families. Default is 1.

  • --replication-strategy <REPLICATION-STRATEGY>, -R <REPLICATION-STRATEGY>: Replication strategy to use (only on insert and when a keyspace does not exist). Default is SimpleStrategy.

  • --send-to <SEND-TO>, -T <SEND-TO>: Sends the command as a request to the cassandra-stressd daemon at the specified IP address. The daemon must already be running at that address.

  • --skip-keys <SKIP-KEYS>, -N <SKIP-KEYS>: Fraction of keys to skip initially. Default is 0.

  • --stdev <STDEV>, -s <STDEV>: Standard deviation. Default is 0.1.

  • --strategy-properties <STRATEGY-PROPERTIES>, -O <STRATEGY-PROPERTIES>: Replication strategy properties in the following format: <dc_name>:<num>,<dc_name>:<num>,… For use with NetworkTopologyStrategy.

  • --threads <THREADS>, -t <THREADS>: Number of threads to use. Default is 50.

  • --unframed, -m: Use unframed transport. Default is false.

  • --use-prepared-statements, -P: (CQL only) Perform queries using prepared statements.
