Tuning Presto
This topic provides tips for tuning parallelism and memory in Presto. The tips fall into these categories:
- Tuning Parallelism at a Task Level
- Tuning Parallelism at an Operator Level
- Tuning Memory
- Tips to Avoid Memory Issues
Tuning Parallelism at a Task Level
The number of splits that can run in a cluster at one time = node-scheduler.max-splits-per-node * number of worker nodes.
The node-scheduler.max-splits-per-node property sets the target value for the total number of splits that can run on
any one worker node. Its default value is 100.
If queries are submitted in large batches, or if a connector produces many splits that complete quickly,
set a higher value for node-scheduler.max-splits-per-node. A higher value may improve query
latency because it ensures that the worker nodes have enough splits to keep them fully utilized.
Conversely, a very high value may lower performance because the splits may not be balanced across workers. Ideally, set the value so that there is always at least one split waiting to be processed, but no higher.
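As a quick illustration of the formula above, the following Python sketch computes the cluster-wide split target. The function name and the 10-worker cluster are hypothetical; only the default of 100 splits per node comes from this topic.

```python
def cluster_split_capacity(max_splits_per_node: int, worker_nodes: int) -> int:
    """Target number of splits that can run across the whole cluster at once."""
    if max_splits_per_node <= 0 or worker_nodes <= 0:
        raise ValueError("both arguments must be positive")
    return max_splits_per_node * worker_nodes


# Default node-scheduler.max-splits-per-node (100) on a hypothetical 10-worker cluster.
print(cluster_split_capacity(100, 10))  # 1000
```

Comparing this number with the number of splits your typical queries generate gives a rough sense of whether the workers are likely to sit idle or to queue work.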
Note
If a query on the Hive catalog suffers from low parallelism because too few splits are generated,
you can use hive.max-initial-splits and hive.max-initial-split-size to achieve higher parallelism.
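The effect of the initial split size on parallelism can be approximated with a simplified model. The Python sketch below assumes that the first hive.max-initial-splits splits are cut at hive.max-initial-split-size and that any remaining data is cut at a larger regular split size. It ignores file and block boundaries, so treat the results as rough estimates. The function name and all sizes used here are illustrative assumptions, not values taken from this topic.

```python
import math

MB = 1024 ** 2


def estimate_split_count(total_bytes, initial_splits, initial_split_size, regular_split_size):
    """Rough split-count estimate under the simplified model described above."""
    initial_capacity = initial_splits * initial_split_size
    if total_bytes <= initial_capacity:
        # All data is covered by the smaller initial splits.
        return math.ceil(total_bytes / initial_split_size)
    # Remaining data is cut at the larger regular split size.
    remaining = total_bytes - initial_capacity
    return initial_splits + math.ceil(remaining / regular_split_size)


# Hypothetical 1 GB table: a smaller initial split size yields more splits.
print(estimate_split_count(1024 * MB, 200, 32 * MB, 64 * MB))  # 32
print(estimate_split_count(1024 * MB, 200, 8 * MB, 64 * MB))   # 128
```

In this model, lowering the initial split size quadruples the number of splits for the same data, which is the kind of change that can help a small query use more of the cluster.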
Tuning Parallelism at an Operator Level
Task concurrency is the default local concurrency for parallel operators such as joins and aggregations. Its default value is 16, and the value must be a power of 2. Increase or reduce the value depending on the query concurrency and the utilization of the worker nodes:
- Lower values are better for clusters that run many queries concurrently, because the running queries already keep the cluster nodes busy. In that case, increasing the concurrency adds context switching and other overhead, which slows down query execution.
- Higher values are better for clusters that run just one query or a few queries.
You can set the operator concurrency at the cluster level using the task.concurrency property. You can also specify the
operator concurrency at the session level using the task_concurrency session property.
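The following Python sketch shows one way to validate a candidate value and build the session-level statement. It assumes the constraint is that the value be a power of two, as the Presto documentation states for task.concurrency. The helper names are hypothetical; the SET SESSION statement is standard Presto syntax.

```python
def is_power_of_two(n: int) -> bool:
    """True for 1, 2, 4, 8, ... using the single-set-bit trick."""
    return n > 0 and (n & (n - 1)) == 0


def task_concurrency_statement(value: int) -> str:
    """Build a SET SESSION statement for task_concurrency (hypothetical helper)."""
    if not is_power_of_two(value):
        raise ValueError(f"task_concurrency must be a power of two, got {value}")
    return f"SET SESSION task_concurrency = {value}"


# Halve the default of 16 for a cluster that runs many concurrent queries.
print(task_concurrency_statement(8))  # SET SESSION task_concurrency = 8
```

Running the generated statement before a query changes the concurrency for that session only; the cluster-level task.concurrency setting is unaffected.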
Tuning Memory
Presto features these three Memory Pools to manage the available resources:
- General Pool
- Reserved Pool
- System Pool
All queries are initially submitted to the General Pool. As long as the General Pool has memory, queries continue to run in it. Once the General Pool runs out of memory, the query using the highest amount of memory in it is moved to the Reserved Pool. From then on, that one query runs in the Reserved Pool while the other queries continue to run in the General Pool.
If the General Pool runs out of memory again while the Reserved Pool is running a query, the query using the highest amount of memory in the General Pool is also moved to the Reserved Pool, but it does not resume execution until the query currently running in the Reserved Pool finishes. The Reserved Pool can hold multiple queries, but it executes only one query at a time.
The System Pool provides memory for operations whose memory Presto does not track. Network buffers and I/O buffers are examples of memory that comes from the System Pool.
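The promotion behavior described above can be sketched as a toy model in Python. This is not Presto code: the class, its methods, and the memory figures are hypothetical, and the model tracks only which query is promoted and in what order, not the memory that a promoted query goes on to use.

```python
from collections import deque


class MemoryPools:
    """Toy model of the General and Reserved pools (hypothetical, not Presto code)."""

    def __init__(self, general_capacity):
        self.general_capacity = general_capacity
        self.general = {}        # query id -> memory used in the General Pool
        self.reserved = deque()  # promoted queries; only the first one executes

    def allocate(self, query_id, amount):
        """Add `amount` of memory to a query that runs in the General Pool."""
        if query_id in self.reserved:
            raise ValueError("this toy model does not track Reserved Pool usage")
        self.general[query_id] = self.general.get(query_id, 0) + amount
        # When the General Pool is exhausted, promote its largest query.
        while self.general and sum(self.general.values()) > self.general_capacity:
            biggest = max(self.general, key=self.general.get)
            del self.general[biggest]
            self.reserved.append(biggest)

    def running_in_reserved(self):
        """Return the one query allowed to execute in the Reserved Pool."""
        return self.reserved[0] if self.reserved else None

    def finish_reserved(self):
        """The executing Reserved Pool query finishes; the next one resumes."""
        if self.reserved:
            self.reserved.popleft()


pools = MemoryPools(general_capacity=100)
pools.allocate("q1", 60)
pools.allocate("q2", 30)
pools.allocate("q3", 20)   # General Pool exceeds 100, so q1 is promoted
pools.allocate("q2", 60)   # exhausted again, so q2 is promoted and waits
print(list(pools.reserved))          # ['q1', 'q2']
pools.finish_reserved()
print(pools.running_in_reserved())   # q2
```

The queue makes the key constraint visible: q2 is held in the Reserved Pool but cannot make progress until q1 finishes.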
This table describes the memory parameters.
| Memory Type | Description | Parameter and default value |
|---|---|---|
| maxHeap | It is the JVM container size. | Defaults to up to 70% of Instance Memory |
| System Memory | It is the overhead allocation. | Defaults to 40% of maxHeap |
| Reserved Memory | If General Memory is exhausted and jobs require more memory, Reserved Memory is used by one job at a time to ensure progress until General Memory becomes available. | query.max-memory-per-node |
| Total Query Memory | It denotes the total tracked memory that is used by the query. It is applicable to Presto 0.208 and later versions. | query.max-total-memory-per-node |
| General Memory | It is the first stop for all jobs. | maxHeap - Reserved Memory - System Memory |
| Query Memory | It is the maximum memory for the job across the cluster. | query_max_memory is the session property and query.max-memory is the cluster-level property. |
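To make the relationships in the table concrete, the Python sketch below derives the pool sizes for one node. It treats the 70% figure as the actual heap size even though the table gives it as an upper bound, and the function name, instance size, and Reserved Memory value are hypothetical. Integer megabytes are used to avoid rounding noise.

```python
def memory_layout_mb(instance_mb, reserved_mb, heap_percent=70, system_percent=40):
    """Derive per-node pool sizes in MB from the defaults listed in the table."""
    max_heap = instance_mb * heap_percent // 100      # maxHeap: up to 70% of instance memory
    system = max_heap * system_percent // 100         # System Memory: 40% of maxHeap
    general = max_heap - reserved_mb - system         # General Memory: what is left
    if general <= 0:
        raise ValueError("Reserved and System Memory leave nothing for the General Pool")
    return {"maxHeap": max_heap, "system": system, "reserved": reserved_mb, "general": general}


# Hypothetical node with 100,000 MB of memory and 10,000 MB of Reserved Memory.
print(memory_layout_mb(instance_mb=100_000, reserved_mb=10_000))
# {'maxHeap': 70000, 'system': 28000, 'reserved': 10000, 'general': 32000}
```

The subtraction shows why Reserved Memory and General Memory trade off directly: every megabyte added to one is taken from the other.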
Tips to Avoid Memory Issues
Presto delays jobs when there are not enough split slots to support the dataset. Jobs fail when there is insufficient memory to process the query. If any of the following conditions applies to the current environment, the configuration is not powerful enough, and you can expect jobs to lag or fail.
| Condition | Recommendation |
|---|---|
| Reserved Memory * Number of Nodes < Peak Job Size | As a first recommendation, increase the Reserved Memory. However, increasing the Reserved Memory can impact the concurrency as the General Pool shrinks accordingly. As a second recommendation, use a larger instance. |
| General Memory * Number of Nodes < Average Job Size * Concurrent Jobs | As a first recommendation, reduce the Reserved Memory, which enlarges the General Pool. However, a smaller Reserved Pool leaves less memory for the largest jobs. When it is not possible to shrink the Reserved Pool, use a larger instance. |
| Reserved Memory * Number of Nodes < Query Memory | Adjust the setting |
| Reserved Memory * Number of Nodes < Query Memory Limit | Adjust the setting |
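The conditions in the table can be checked mechanically. The Python sketch below is one possible way to do so; the function name, the message strings, and all figures are hypothetical, and every size is assumed to be in the same unit (GB here). The two Query Memory rows are folded into a single check against the query memory limit.

```python
def memory_risks(reserved, general, nodes, peak_job, average_job,
                 concurrent_jobs, query_memory_limit):
    """Return the conditions from the table that hold for the given sizes."""
    risks = []
    if reserved * nodes < peak_job:
        risks.append("Reserved Memory cannot hold the peak job")
    if general * nodes < average_job * concurrent_jobs:
        risks.append("General Memory cannot hold the concurrent jobs")
    if reserved * nodes < query_memory_limit:
        risks.append("Reserved Memory is below the query memory limit")
    return risks


# Hypothetical 5-node cluster with 10 GB Reserved and 32 GB General Memory per node.
for risk in memory_risks(reserved=10, general=32, nodes=5, peak_job=60,
                         average_job=20, concurrent_jobs=10, query_memory_limit=40):
    print(risk)
# Reserved Memory cannot hold the peak job
# General Memory cannot hold the concurrent jobs
```

An empty result does not guarantee that jobs will succeed; it only means that none of the listed conditions applies.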