Welcome to BestConfig’s documentation!¶
BestConfig is a system for automatically finding, within a given resource limit, a best configuration setting for a deployed system under a given application workload. BestConfig is designed with an extensible architecture to automate configuration tuning for general systems.
Contents¶
QuickStart¶
Good tools make system performance tuning quicker, easier and cheaper than doing everything manually or by experience.
BestConfig can find better configurations for a specific large-scale system deployed for a given application workload.
- Overview
- BestConfig Tuning – Taking Spark as the example SUT
- Implementing your own sampling/tuning algorithms for BestConfig
Overview¶
Deployment architecture
Here, “deployment environment” refers to the actual running environment of your applications, while the “staging environment” is an environment that is almost the same as the deployment environment but where tests can be run without interfering with the actual application.
The process of deploying BestConfig
The detailed steps of using BestConfig to tune a practical system are as follows, illustrated by the case of Spark tuning.
BestConfig Tuning – Taking Spark as the example SUT¶
Step 1. Deploy shell scripts for the system under tune¶
There are 9 shell scripts in BestConfig, classified into two groups.
One group consists of 5 shell scripts: start.sh, isStart.sh, stop.sh, isClosed.sh and terminateSystem.sh. They are deployed on the system under tune.
The start.sh and stop.sh scripts deployed on the master node differ from those deployed on the worker nodes.
(1) Shell scripts (start.sh and stop.sh) on master node
start.sh(master) – this script will start the system on the master node
stop.sh(master) – this script will stop the system on the master node
(2) Shell scripts (start.sh and stop.sh) on worker node
start.sh(worker) – this script will start the system on the worker node
stop.sh(worker) – this script will stop the system on the worker node
(3) Shell scripts identical on master and worker nodes
isStart.sh – this script will return OK if the system is successfully started
terminateSystem.sh – this script will terminate the system process on the server
isClosed.sh – this script will return OK if the system is successfully terminated
The other group consists of 4 shell scripts: startTest.sh, getTestResult.sh, terminateTest.sh and isFinished.sh. They are deployed on the test node.
startTest.sh – this script will start a test towards the system under tune
isFinished.sh – this script will return OK if the test is done
getTestResult.sh – this script will return performance metrics regarding the test
terminateTest.sh – this script will terminate the testing process
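As a concrete illustration, a minimal isFinished.sh on the test node might look like the sketch below. The result-file path and the exact output contract expected by BestConfig are assumptions here; adapt both to your benchmark driver.

```shell
# Hypothetical sketch of isFinished.sh: report OK once the benchmark
# driver has written its result file. The file path is an assumption.
is_finished() {
    result_file="${1:-/tmp/bestconfig/test.result}"
    if [ -f "$result_file" ]; then
        echo "OK"        # test has produced its result
    else
        echo "RUNNING"   # still waiting for the benchmark to finish
    fi
}
is_finished "$@"
```

The other scripts (isStart.sh, isClosed.sh) follow the same pattern: a cheap check plus an OK marker on stdout that BestConfig can poll.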
Step 2. Implement the ConfigReadin and ConfigWrite interfaces¶
For Spark tuning, we need to implement the ConfigReadin and ConfigWrite interfaces as SparkConfigReadin and SparkConfigWrite.
Next, we need to compile SparkConfigReadin and SparkConfigWrite to bytecode. Then the location (path) of the compiled bytecode needs to be added to the classpath of the BestConfig project.
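To show the shape of such an implementation, here is a hypothetical sketch. The real ConfigReadin and ConfigWrite interfaces in the BestConfig sources define the authoritative method signatures; the interface shapes and the spark-defaults.conf-style "key value" format below are assumptions for illustration only.

```java
import java.util.HashMap;
import java.util.Map;

// Assumed shapes of the two extension points (not the real signatures).
interface ConfigReadinSketch {
    // Parse the SUT's configuration into parameter-name -> value pairs.
    Map<String, String> readIn(String fileContents);
}

interface ConfigWriteSketch {
    // Render a configuration setting back into the SUT's config-file format.
    String writeOut(Map<String, String> config);
}

// A minimal Spark-flavoured implementation pair, reading and emitting the
// whitespace-separated "key value" lines used by spark-defaults.conf.
class SparkConfigSketch implements ConfigReadinSketch, ConfigWriteSketch {
    @Override
    public Map<String, String> readIn(String fileContents) {
        Map<String, String> config = new HashMap<>();
        for (String line : fileContents.split("\n")) {
            line = line.trim();
            if (line.isEmpty() || line.startsWith("#")) continue; // skip comments
            String[] kv = line.split("\\s+", 2);
            if (kv.length == 2) config.put(kv[0], kv[1]);
        }
        return config;
    }

    @Override
    public String writeOut(Map<String, String> config) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : config.entrySet())
            sb.append(e.getKey()).append(' ').append(e.getValue()).append('\n');
        return sb.toString();
    }
}
```

Once compiled, a class like this lets BestConfig read the current Spark setting before a test and write a new candidate setting before the next one.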
Step 3. Specify the parameter set for tuning and their ranges¶
- An example of defaultConfig.yaml (specifying the parameters for tuning)
- An example of defaultConfig.yaml_range (the valid ranges of parameters)
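To give a feel for the two files, here is an illustrative sketch. The parameter names, values and exact syntax accepted by BestConfig are assumptions; check the example files shipped with the project for the authoritative format.

```yaml
# defaultConfig.yaml -- parameters to tune and their default values (illustrative)
spark_executor_memory: 4
spark_executor_cores: 2

# defaultConfig.yaml_range -- valid ranges for the same parameters (illustrative)
spark_executor_memory: [1, 16]
spark_executor_cores: [1, 8]
```

Every parameter listed in defaultConfig.yaml needs a matching range entry, since BestConfig samples candidate values only from within the declared ranges.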
Step 4. Specify the resource limit (sample size and round number) and the tuning environment settings¶
- bestconf.properties
- SUTconfig.properties
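A minimal sketch of what the two properties files might contain is shown below. The key names and values are assumptions for illustration; the templates shipped with BestConfig are authoritative.

```properties
# bestconf.properties -- resource limit for tuning (illustrative key names)
sampleSetSize = 100
tuningRounds = 5

# SUTconfig.properties -- how BestConfig reaches the SUT and the test node
# (illustrative key names and addresses)
sutMasterIP = 192.168.1.10
testNodeIP = 192.168.1.20
```

The resource limit bounds the total number of performance tests BestConfig may run, which is what makes the tuning cost predictable.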
Step 5. Start BestConfig¶
Now, you can start BestConfig. BestConfig will automatically run the tuning process, without requiring any user intervention, until the process ends due to resource exhaustion or unrecoverable environment errors.
BestConfig will output the best configuration setting into files once the tuning is done.
You can start BestConfig with the help of Ant. The detailed instructions are as follows.
(1). cd bestconf-master
(2). ant compile
(3). ant run
Implementing your own sampling/tuning algorithms for BestConfig¶
You can also choose to extend and tailor BestConfig for your specific use cases using your own sampling/tuning algorithms.
- To implement your own sampling algorithms –> extend the abstract class ConfigSampler
- To implement your own tuning algorithms –> implement the interface Optimization
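To illustrate the extension idea, here is a hypothetical sketch of a custom sampler. The real ConfigSampler abstract class and Optimization interface live in the BestConfig sources; the signatures below are assumptions, simplified to a single one-dimensional parameter range.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Assumed shape of the sampler extension point (not the real signature).
abstract class ConfigSamplerSketch {
    // Produce `count` sample points inside [lo, hi] for one parameter.
    abstract List<Double> sample(double lo, double hi, int count);
}

// A divide-and-diverge-style sampler (the idea behind BestConfig's DDS):
// split the range into `count` equal subintervals and draw one random
// point from each, so the samples cover the whole range.
class DivideAndDivergeSketch extends ConfigSamplerSketch {
    private final Random rnd = new Random(42); // fixed seed for reproducibility

    @Override
    List<Double> sample(double lo, double hi, int count) {
        List<Double> points = new ArrayList<>();
        double step = (hi - lo) / count;
        for (int i = 0; i < count; i++)
            points.add(lo + step * (i + rnd.nextDouble())); // one point per subinterval
        return points;
    }
}
```

A custom tuning algorithm would plug in analogously through the Optimization interface, consuming the sampled configurations and their measured performance to decide where to sample next.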
FAQ¶
Use cases¶
BestConfig for Hadoop + Hive¶
Experimental Settings¶
We executed BestConfig for a Hadoop cluster with 4 nodes. The Hadoop cluster consists of 1 master node and 3 slave nodes. All nodes used in our experiment are shown below.
Node | OS | CPU | Memory |
---|---|---|---|
Master | CentOS | 16 Intel(R) Xeon(R) CPU E5620 @ 2.40GHz | 32GB |
Slave 1 | CentOS | 16 Intel(R) Xeon(R) CPU E5620 @ 2.40GHz | 32GB |
Slave 2 | CentOS | 16 Intel(R) Xeon(R) CPU E5620 @ 2.40GHz | 32GB |
Slave 3 | CentOS | 16 Intel(R) Xeon(R) CPU E5620 @ 2.40GHz | 32GB |
Performance Surface¶
We use HiBench, a widely adopted benchmark tool, as the workload generator to produce the target workload for Hadoop+Hive. Figure 1 plots the highly differing performance surface for the Hadoop+Hive Join workload.
The performance surface of Hadoop+Hive under Hibench-Join workload
Test Results¶
The test results of Hadoop under the Join workload: hadoopJoin.arff.
The test results of Hadoop under the PageRank workload: hadoopPageRank.arff.
The test results of Hadoop under the Join workload with 500 samples: join-trainingBestConf.arff and join-BestConfig.arff.
Interface Impl¶
The source files of HadoopConfigReadin and HadoopConfigWrite implement the interfaces of ConfigReadin and ConfigWrite respectively.
BestConfig for Spark¶
Experimental Settings¶
We executed BestConfig for a Spark cluster with 4 nodes. The Spark cluster consists of 1 master node and 3 slave nodes. All nodes used in our experiment are shown below.
Node | OS | CPU | Memory |
---|---|---|---|
Master | CentOS | 16 Intel(R) Xeon(R) CPU E5620 @ 2.40GHz | 32GB |
Slave 1 | CentOS | 16 Intel(R) Xeon(R) CPU E5620 @ 2.40GHz | 32GB |
Slave 2 | CentOS | 16 Intel(R) Xeon(R) CPU E5620 @ 2.40GHz | 32GB |
Slave 3 | CentOS | 16 Intel(R) Xeon(R) CPU E5620 @ 2.40GHz | 32GB |
Performance Surface¶
We use HiBench, a widely adopted benchmark tool, as the workload generator to produce the target workload for Spark. Figure 1 plots the highly differing performance surface for the Spark PageRank workload.
The performance surface of Spark under Hibench-Pagerank workload
Test Results¶
The test result of the Spark PageRank workload: pagerank. The test result of the Spark K-means workload: kmeans.
Interface Impl¶
The source files of SparkConfigReadin and SparkConfigWrite implement the interfaces of ConfigReadin and ConfigWrite respectively.
BestConfig for Cassandra¶
Experimental Settings¶
We executed BestConfig for a Cassandra cluster with 4 nodes: 3 Cassandra server nodes and 1 YCSB client node. All nodes used in our experiment are shown below.
Node | OS | CPU | Memory |
---|---|---|---|
Cassandra 1 | CentOS | 16 Intel(R) Xeon(R) CPU E5620 @ 2.40GHz | 32GB |
Cassandra 2 | CentOS | 16 Intel(R) Xeon(R) CPU E5620 @ 2.40GHz | 32GB |
Cassandra 3 | CentOS | 16 Intel(R) Xeon(R) CPU E5620 @ 2.40GHz | 32GB |
YCSB | CentOS | 16 Intel(R) Xeon(R) CPU E5620 @ 2.40GHz | 32GB |
Performance Surface¶
We use YCSB, a widely adopted benchmark tool, as the workload generator to produce the target workload for Cassandra. Currently, the workload adopted in our test is workloada, and we set recordcount to 17000000 and operationcount to 720000. Figure 1 is the scatter plot of performance for Cassandra under YCSB workloada.
The scatter plot of performance for Cassandra under YCSB workloada.
Test Results¶
The test result of Cassandra under YCSB workloada cassandraYcsba.arff.
Interface Impl¶
The source files of CassandraConfigReadin and CassandraConfigWrite implement the interfaces of ConfigReadin and ConfigWrite respectively.
BestConfig for MySQL¶
Experimental Settings¶
We executed BestConfig for the MySQL system, applying Sysbench to test the performance of MySQL. All nodes used in our experiment are shown below.
Node | OS | CPU | Memory |
---|---|---|---|
MySQL | CentOS | 16 Intel(R) Xeon(R) CPU E5620 @ 2.40GHz | 32GB |
Sysbench | CentOS | 16 Intel(R) Xeon(R) CPU E5620 @ 2.40GHz | 32GB |
Performance Surface¶
We use Sysbench, a widely adopted benchmark tool, as the workload generator to produce the target workload for MySQL. Currently, the test type in our experiment is OLTP and the test mode is simple, and we set num-threads to 16, oltp-table-size to 10000000, and max-time to 300. Figure 1 is the scatter plot of performance for MySQL under the OLTP simple test mode.
The scatter plot of performance for MySQL under OLTP simple test mode
Test Results¶
The result of MySQL under the zipfian read-write workload MySQL_zipfian_readwrite.arff. The result of MySQL under OLTP simple test mode MySQL_OLTP_simple.arff.
Interface Impl¶
The source files of MySQLConfigReadin and MySQLConfigWrite implement the interfaces of ConfigReadin and ConfigWrite respectively.
BestConfig for Tomcat Server¶
Experimental Settings¶
We executed BestConfig for the Tomcat server, applying JMeter to test the performance of the Tomcat server. All nodes used in our experiment are shown below.
Node | OS | CPU | Memory |
---|---|---|---|
Tomcat Server | CentOS | 16 Intel(R) Xeon(R) CPU E5620 @ 2.40GHz | 32GB |
JMeter | CentOS | 16 Intel(R) Xeon(R) CPU E5620 @ 2.40GHz | 32GB |
Performance Surface¶
We use JMeter, a widely adopted benchmark tool, as the workload generator to produce the target workload for Tomcat.
The performance surface of Tomcat under a page navigation workload.
Test Results¶
All the test results of Tomcat under different workloads -> Tomcat_Results.
Interface Impl¶
The source files of TomcatConfigReadin and TomcatConfigWrite implement the interfaces of ConfigReadin and ConfigWrite respectively.
Citing BestConfig¶
Please cite:
@inproceedings{Zhu:2017:BTP:3127479.3128605,
author = {Zhu, Yuqing and Liu, Jianxun and Guo, Mengying and Bao, Yungang and
Ma, Wenlong and Liu, Zhuoyue and Song, Kunpeng and Yang, Yingchun},
title = {BestConfig: Tapping the Performance Potential of Systems via Automatic
Configuration Tuning},
booktitle = {Proceedings of the 2017 Symposium on Cloud Computing},
series = {SoCC '17},
year = {2017},
isbn = {978-1-4503-5028-0},
location = {Santa Clara, California},
pages = {338--350},
numpages = {13},
url = {http://doi.acm.org/10.1145/3127479.3128605},
acmid = {3128605},
publisher = {ACM},
address = {New York, NY, USA},
keywords = {ACT, automatic configuration tuning, performance optimization},
}