Running WiredTiger workloads
Running and configuring YCSB
WiredTiger source is at: https://github.com/wiredtiger/wiredtiger
$ git clone git@github.com:wiredtiger/wiredtiger.git -b develop
$ cd wiredtiger
$ mkdir build
$ cd build
$ cmake ../.
$ make
$ cd bench/wtperf
The following commands run the YCSB benchmarks, one per workload:
$ ./wtperf -O ../../../bench/wtperf/runners/ycsb-a.wtperf
$ ./wtperf -O ../../../bench/wtperf/runners/ycsb-b.wtperf
$ ./wtperf -O ../../../bench/wtperf/runners/ycsb-c.wtperf
$ ./wtperf -O ../../../bench/wtperf/runners/ycsb-d.wtperf
$ ./wtperf -O ../../../bench/wtperf/runners/ycsb-e.wtperf
$ ./wtperf -O ../../../bench/wtperf/runners/ycsb-f.wtperf
Here are some useful ways to configure the scripts:
Create the database and run the benchmark separately:
Sometimes you want to create the database before the benchmark run, and then run the benchmark on the existing database. To do that, create two separate .wtperf config files: one that creates the database and one that runs the benchmark on it. Here is an example:
Original ycsb-c.wtperf:
conn_config="cache_size=40G,log=(enabled=true)"
pareto=20
table_config="type=file"
key_sz=100
value_sz=1024
icount=120000000
run_time=3600
threads=((count=20,reads=1))
warmup=120
sample_interval=5
populate_threads=8
report_interval=5
Based on this original configuration, here are the two new files:
This one only creates a database: ycsb-c-create.wtperf
conn_config="cache_size=40G,log=(enabled=true)"
pareto=20
table_config="type=file"
key_sz=100
value_sz=1024
icount=120000000
run_time=10
threads=((count=20,reads=1))
#warmup=120
sample_interval=5
populate_threads=8
report_interval=5
This one runs the benchmark: ycsb-c-run.wtperf
conn_config="cache_size=40G,log=(enabled=true)"
pareto=20
table_config="type=file"
key_sz=100
value_sz=1024
icount=120000000
run_time=60
threads=((count=20,reads=1))
warmup=120
sample_interval=5
create=false
#populate_threads=8
report_interval=5
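The two derived files can also be generated from the original instead of being edited by hand. The following is a minimal sketch, assuming the config has one key=value option per line and that a leading # comments a line out; the helper names (set_option, comment_out, make_create_config, make_run_config) are hypothetical and not part of wtperf.

```python
import re

# Contents of the original ycsb-c.wtperf, one option per line.
ORIGINAL = """\
conn_config="cache_size=40G,log=(enabled=true)"
pareto=20
table_config="type=file"
key_sz=100
value_sz=1024
icount=120000000
run_time=3600
threads=((count=20,reads=1))
warmup=120
sample_interval=5
populate_threads=8
report_interval=5
"""

def set_option(text, key, value):
    """Replace the `key=...` line, or append one if the key is absent."""
    pattern = re.compile(rf"^{re.escape(key)}=.*$", re.MULTILINE)
    line = f"{key}={value}"
    if pattern.search(text):
        return pattern.sub(lambda _m: line, text)
    return text + line + "\n"

def comment_out(text, key):
    """Prefix the `key=...` line with '#'."""
    pattern = re.compile(rf"^({re.escape(key)}=.*)$", re.MULTILINE)
    return pattern.sub(r"#\1", text)

def make_create_config(text):
    # Short run_time and no warmup: this pass only populates the database.
    return comment_out(set_option(text, "run_time", 10), "warmup")

def make_run_config(text):
    # Reuse the existing database: create=false and no populate threads.
    text = set_option(text, "run_time", 60)
    text = set_option(text, "create", "false")
    return comment_out(text, "populate_threads")

if __name__ == "__main__":
    print(make_create_config(ORIGINAL))
    print(make_run_config(ORIGINAL))
```

Writing the two returned strings to ycsb-c-create.wtperf and ycsb-c-run.wtperf gives the same settings as the hand-edited files, except that create=false is appended at the end of the file.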
Adjust the database size:
You can change the database size by increasing or reducing the icount value:
E.g.:
Before: icount=120000000
After: icount=60000000
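To get a rough idea of what an icount value means in bytes, the raw payload can be estimated from the record sizes. This sketch assumes that key_sz and value_sz are per-record sizes in bytes, and it ignores page overhead, logging, and compression, so the actual on-disk size will differ. The function name raw_data_bytes is hypothetical.

```python
GIB = 1024 ** 3

def raw_data_bytes(icount, key_sz, value_sz):
    """Raw key/value payload only; not the on-disk size."""
    return icount * (key_sz + value_sz)

if __name__ == "__main__":
    # key_sz and value_sz as in the ycsb-c.wtperf example above.
    for icount in (120_000_000, 60_000_000):
        size = raw_data_bytes(icount, key_sz=100, value_sz=1024)
        print(f"icount={icount}: about {size / GIB:.1f} GiB of raw data")
```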
Disable logging:
If you don't want to measure logging, disable it by removing the log setting from conn_config:
Before: conn_config="cache_size=40G,log=(enabled=true)"
After: conn_config="cache_size=40G"
Adjust the WiredTiger cache size:
Before: conn_config="cache_size=40G"
After: conn_config="cache_size=20G"
The WiredTiger in-memory cache size will determine how often WiredTiger will have to fetch data from disk.
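When sweeping several cache sizes, or comparing runs with and without logging, the conn_config edits can be scripted. The sketch below works on the value inside the quotes and assumes the simple shape used in these examples (a flat log=(...) group with no nested parentheses); set_cache_size and disable_logging are hypothetical helper names.

```python
import re

def set_cache_size(conn_config, size):
    """Replace the value of the first cache_size=... setting."""
    return re.sub(r'\bcache_size=[^,)"]+', f"cache_size={size}",
                  conn_config, count=1)

def disable_logging(conn_config):
    """Drop a flat log=(...) group together with one adjacent comma."""
    return re.sub(r',log=\([^()]*\)|\blog=\([^()]*\),?', "",
                  conn_config, count=1)

if __name__ == "__main__":
    original = "cache_size=40G,log=(enabled=true)"
    for size in ("40G", "20G", "10G"):
        print(f'conn_config="{set_cache_size(disable_logging(original), size)}"')
```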
Other parameters:
To see what the other parameters do and how to modify them, read the WiredTiger documentation, or look at the other .wtperf files in the bench/wtperf/runners directory.
Running a cache-heavy workload with chunkcache
Get the latest version of WiredTiger develop branch here:
git clone git@github.com:wiredtiger/wiredtiger.git -b develop
Build it as described in the YCSB section above.
To enable the on-disk chunk cache, pass -DENABLE_MEMKIND=1 to cmake.
Running a Workload
Enter the build directory:
cd build/bench/wtperf/
Run a workload with a config file
./wtperf -O <config file>
This runs the workload in the current directory. If you want the database to be created elsewhere, use the -h option, for example:
./wtperf -O <config file> -h /mnt/ssd/<user>/WT_TEST
There are many config files in the <wiredtiger>/bench/wtperf/runners directory.
For example, bench/wtperf/runners/evict-btree.wtperf creates a BTree database and will run a read-only workload on it.
Increasing Workload DB Size
To create a larger database, you can change the icount variable in the configuration file. E.g., to create a 20GB database, just multiply the number in icount by 10:
-icount=10000000
+icount=100000000
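The same proportional reasoning works for other target sizes. The sketch below assumes the database size scales linearly with icount and takes the baseline implied by the example above (icount=10000000 giving roughly 2GB, since ten times that is described as 20GB); the baseline is an inference, so measure the real size of a small run before relying on it. The function name icount_for_size is hypothetical.

```python
def icount_for_size(base_icount, base_size_gb, target_size_gb):
    """Scale icount linearly from a measured (icount, size) baseline."""
    return round(base_icount * target_size_gb / base_size_gb)

if __name__ == "__main__":
    for target_gb in (20, 50, 100):
        icount = icount_for_size(10_000_000, base_size_gb=2,
                                 target_size_gb=target_gb)
        print(f"{target_gb}GB -> icount={icount}")
```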
Pre-Populate Database
It is useful to create a database first and then run on it many times over, so you don’t waste time creating it every time.
- Run the evict-btree.wtperf workload as above.
- Change run_time=120 to something small, e.g., run_time=10.
run_time=120 is the runtime of the benchmark in seconds. While you are only populating the database, the benchmark results do not matter, so you can set it to a very small number.
Once your database is created, you run evict-btree as follows:
- Comment out the line with populate_threads=1 from the file. That way the database will not be populated.
- Set create=false in the config
- When you launch the benchmark, use the -h argument to specify the directory where the database was created.
Suppose you ran the following command to create the database:
./wtperf -O evict-btree.wtperf -h /mnt/ssd/<user>/WT_TEST
Then you created a modified file, evict-btree-workload.wtperf, which looks like the original evict-btree.wtperf but has the populate_threads=1 line commented out and create=false set. Run that workload with the same -h directory, so that wtperf knows where to find the database:
./wtperf -O evict-btree-workload.wtperf -h /mnt/ssd/<user>/WT_TEST
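The populate-once, run-many workflow can be wrapped in a small driver. This is a sketch only: it builds the two command lines using the -O and -h options shown above and prints them; wtperf_command is a hypothetical helper, and the subprocess call is left commented out so the script is safe to run without a wtperf build.

```python
import shlex

def wtperf_command(config_file, home=None, binary="./wtperf"):
    """Build the argument list for one wtperf invocation."""
    cmd = [binary, "-O", config_file]
    if home is not None:
        cmd += ["-h", home]  # directory that holds the database
    return cmd

if __name__ == "__main__":
    home = "/mnt/ssd/<user>/WT_TEST"  # placeholder path, as in the text
    phases = [
        ("populate", wtperf_command("evict-btree.wtperf", home)),
        ("measure", wtperf_command("evict-btree-workload.wtperf", home)),
    ]
    for name, cmd in phases:
        print(f"{name}: {shlex.join(cmd)}")
        # To execute for real:
        # import subprocess; subprocess.run(cmd, check=True)
```

The populate phase needs to run only once; the measure phase can then be repeated against the same directory.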
Enabling the Chunk Cache
To enable the chunk cache, modify the conn_config line. The original line is conn_config="cache_size=50M,eviction=(threads_max=8)", which sets the in-memory cache to 50M (increase it for a larger database). To add the chunk cache, extend the line as follows:
conn_config="cache_size=50M,eviction=(threads_max=8),chunk_cache=(enabled=1,capacity=50GB,chunk_size=2GB,type=FILE,device_path=/mnt/ssd/<user>/CACHE),verbose=[chunkcache]"
This enables a chunk cache that lives on the SSD at the specified path. Alternatively, you can set type=DRAM to keep the chunk cache in memory; this is useful only for testing, not for experiments.
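When experimenting with several chunk cache capacities or chunk sizes, it can help to assemble the conn_config value from its parts. The sketch below uses only the option names that appear in the line above and covers the type=FILE case; with_chunk_cache is a hypothetical helper, not a WiredTiger API.

```python
def with_chunk_cache(conn_config, capacity, chunk_size, device_path,
                     verbose=True):
    """Append an on-disk (type=FILE) chunk_cache group to a conn_config value."""
    chunk_cache = (
        f"chunk_cache=(enabled=1,capacity={capacity},chunk_size={chunk_size},"
        f"type=FILE,device_path={device_path})"
    )
    parts = [conn_config, chunk_cache]
    if verbose:
        parts.append("verbose=[chunkcache]")
    return ",".join(parts)

if __name__ == "__main__":
    base = "cache_size=50M,eviction=(threads_max=8)"
    for capacity in ("10GB", "50GB"):
        value = with_chunk_cache(base, capacity, "2GB", "/mnt/ssd/<user>/CACHE")
        print(f'conn_config="{value}"')
```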