We recently added a new API, GetThreadList(), that exposes RocksDB background thread activity. With it, developers can obtain real-time information about currently running compactions and flushes, such as input and output size, elapsed time, and the number of bytes written so far. To illustrate, the sample output of GetThreadList() below is arranged as a table in which each column represents one thread's status:
| ThreadType | High Pri | Low Pri |
| ElapsedTime | 143.459 ms | 607.538 ms |
| OperationProperties | BytesMemtables 4092938, BytesWritten 1050701 | BaseInputLevel 1, BytesRead 4876417, BytesWritten 4140109, IsDeletion 0, IsManual 0, IsTrivialMove 0, JobID 146, OutputLevel 2, TotalInputBytes 4883044 |
In the above output, GetThreadList() reports the activity of two threads: one running a flush job (middle column) and one running a compaction job (right-most column). Each thread status shows basic information about the thread, such as its thread ID, its target DB and column family, the job it is currently doing, and the current status of that job. For instance, thread 140716416169728 is compacting the picachu column family in database db2. The compaction has been running for about 600 ms and has read 4876417 of 4883044 bytes, which indicates that it is about to complete. The stage property indicates which code block the thread is currently executing. For instance, thread 140716416169728 is currently running CompactionJob::Install, which further indicates that the compaction job is almost done.
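The "about to complete" reading above is just the ratio of BytesRead to TotalInputBytes. As a minimal sketch, the helper below (a hypothetical function, not part of RocksDB) computes that ratio from the two property values:

```cpp
#include <cstdint>

// Hypothetical helper, not a RocksDB API: returns the fraction of the
// compaction input that has been read so far, in the range [0, 1] when
// bytes_read <= total_input_bytes.
double CompactionProgress(std::uint64_t bytes_read,
                          std::uint64_t total_input_bytes) {
  if (total_input_bytes == 0) {
    return 0.0;  // Nothing to read yet; avoid dividing by zero.
  }
  return static_cast<double>(bytes_read) /
         static_cast<double>(total_input_bytes);
}
```

With the values from the table, CompactionProgress(4876417, 4883044) comes out at roughly 0.999, which matches the observation that the job is nearly finished.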
Below we briefly describe its API.
How to Enable it?
To enable thread tracking for a RocksDB instance, simply set enable_thread_tracking to true in its DBOptions:
// If true, then the status of the threads involved in this DB will
// be tracked and available via GetThreadList() API.
//
// Default: false
bool enable_thread_tracking;
GetThreadList() is an Env function, declared in include/rocksdb/env.h:
virtual Status GetThreadList(std::vector<ThreadStatus>* thread_list)
Since an Env can be shared across multiple RocksDB instances, the output of
GetThreadList() includes the background activity of all the RocksDB instances
that use the same Env.
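Because the result covers every instance sharing the Env, a caller interested in a single database would typically filter the list by DB name. The sketch below illustrates that idea with a simplified stand-in struct (ThreadInfo and FilterByDb are hypothetical names for illustration, not RocksDB types or functions):

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Simplified, hypothetical stand-in for the parts of a thread status used
// here. It is not the real rocksdb::ThreadStatus.
struct ThreadInfo {
  std::uint64_t thread_id;
  std::string db_name;  // Empty if the thread is not working on any DB.
  std::string cf_name;  // Empty if no column family is involved.
};

// Returns only the entries whose db_name equals the requested database.
std::vector<ThreadInfo> FilterByDb(const std::vector<ThreadInfo>& all,
                                   const std::string& db_name) {
  std::vector<ThreadInfo> result;
  for (const ThreadInfo& info : all) {
    if (info.db_name == db_name) {
      result.push_back(info);
    }
  }
  return result;
}
```

Idle threads report an empty DB name, so they are dropped by any filter on a concrete database name.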
GetThreadList() simply returns a vector of ThreadStatus, each of which describes
the current status of a thread. The ThreadStatus structure, defined in
include/rocksdb/thread_status.h, contains the following information:
// An unique ID for the thread.
const uint64_t thread_id;

// The type of the thread, it could be HIGH_PRIORITY,
// LOW_PRIORITY, and USER
const ThreadType thread_type;

// The name of the DB instance where the thread is currently
// involved with. It would be set to empty string if the thread
// does not involve in any DB operation.
const std::string db_name;

// The name of the column family where the thread is currently
// It would be set to empty string if the thread does not involve
// in any column family.
const std::string cf_name;

// The operation (high-level action) that the current thread is involved.
const OperationType operation_type;

// The elapsed time in micros of the current thread operation.
const uint64_t op_elapsed_micros;

// An integer showing the current stage where the thread is involved
// in the current operation.
const OperationStage operation_stage;

// A list of properties that describe some details about the current
// operation. Same field in op_properties might have different
// meanings for different operations.
uint64_t op_properties[kNumOperationProperties];

// The state (lower-level action) that the current thread is involved.
const StateType state_type;
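Note that op_elapsed_micros is in microseconds, while the sample table earlier shows ElapsedTime in milliseconds. As a small sketch of that conversion, the helper below (FormatElapsedMs is a hypothetical name, not part of RocksDB) renders a microsecond count in the same style as the table:

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper, not a RocksDB API: renders a microsecond count as
// milliseconds with exactly three decimal places.
std::string FormatElapsedMs(std::uint64_t micros) {
  const std::uint64_t whole = micros / 1000;  // Full milliseconds.
  const std::uint64_t frac = micros % 1000;   // Remaining microseconds.
  std::string frac_str = std::to_string(frac);
  // Left-pad the fractional part with zeros to three digits.
  frac_str.insert(0, 3 - frac_str.size(), '0');
  return std::to_string(whole) + "." + frac_str + " ms";
}
```

Integer arithmetic is used instead of floating point so the output is exact for any input value.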
If you are interested in the background thread activity of your RocksDB application, please feel free to give GetThreadList() a try :)