Setting the JDBC Connection String
You must set the JDBC connection string for Hive, Presto, and Spark queries.
Setting the Connection String for Hive and Presto Queries (AWS and Azure)
Use the following syntax to set the JDBC connection string for Hive and Presto queries.
jdbc:qubole://<hive or presto>/<Cluster-Label>[/<database>][?propertyName1=propertyValue1[;propertyName2=propertyValue2]...]
In the connection string, <hive or presto> (command type) and the cluster label are mandatory; database name and property name/value are optional.
Note
If you do not specify a database in the connection string, specify the database in the query or use fully qualified table names.
The following is an example of a connection string for Hive and Presto queries.
jdbc:qubole://hive/default/tpcds_orc_500?endpoint=https://api.qubole.com;chunk_size=86
In the above example, https://api.qubole.com is one of the QDS endpoints on AWS. For a list of supported endpoints, see Supported Qubole Endpoints on Different Cloud Providers.
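The syntax above can also be assembled programmatically. The following Java sketch is illustrative only: the class name `QuboleJdbcUrl` and its `build` method are hypothetical and are not part of the Qubole JDBC driver. It does plain string assembly and assumes that labels, database names, and property values need no escaping.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of the Qubole JDBC driver.
final class QuboleJdbcUrl {

    // Builds jdbc:qubole://<hive or presto>/<Cluster-Label>[/<database>][?name=value[;name=value]...]
    static String build(String commandType, String clusterLabel,
                        String database, Map<String, String> properties) {
        // The command type and the cluster label are mandatory.
        if (!"hive".equals(commandType) && !"presto".equals(commandType)) {
            throw new IllegalArgumentException("command type must be hive or presto");
        }
        if (clusterLabel == null || clusterLabel.isEmpty()) {
            throw new IllegalArgumentException("cluster label is mandatory");
        }
        StringBuilder url = new StringBuilder("jdbc:qubole://")
                .append(commandType).append('/').append(clusterLabel);
        // The database name is optional.
        if (database != null && !database.isEmpty()) {
            url.append('/').append(database);
        }
        // Properties are optional; pairs are separated by ';' after a single '?'.
        if (properties != null && !properties.isEmpty()) {
            StringJoiner joined = new StringJoiner(";", "?", "");
            properties.forEach((name, value) -> joined.add(name + "=" + value));
            url.append(joined);
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the properties in insertion order.
        Map<String, String> properties = new LinkedHashMap<>();
        properties.put("endpoint", "https://api.qubole.com");
        properties.put("chunk_size", "86");
        // Reproduces the example connection string shown above.
        System.out.println(build("hive", "default", "tpcds_orc_500", properties));
    }
}
```

Passing `null` for the database and the properties yields the shortest valid form, for example `jdbc:qubole://presto/default`.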
Connection String Properties
| Property Name | Property Value |
|---|---|
| endpoint | Optional only for the https://api.qubole.com endpoint. You must specify the API endpoint for all other QDS-on-AWS endpoints and for other Cloud providers. For the list, see Supported Qubole Endpoints on Different Cloud Providers. |
| chunk_size | The chunk size, in MB, used when streaming large results from Cloud storage. The default value is 100 MB. Reduce it if you face out-of-memory (OOM) issues. |
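To make the defaults in the table concrete, the following Java sketch reads the properties out of a connection string and falls back to the documented defaults when a property is absent. It is an illustration under stated assumptions: `QuboleConnectionProperties` and its methods are hypothetical names, and the parsing shown here is not necessarily how the driver itself processes the string.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that mirrors the documented defaults; not the driver's own parser.
final class QuboleConnectionProperties {
    static final String DEFAULT_ENDPOINT = "https://api.qubole.com";
    static final int DEFAULT_CHUNK_SIZE_MB = 100;

    // Extracts the name=value pairs that follow the first '?' in the connection string.
    static Map<String, String> parse(String url) {
        Map<String, String> properties = new LinkedHashMap<>();
        int queryStart = url.indexOf('?');
        if (queryStart < 0 || queryStart == url.length() - 1) {
            return properties; // no properties were given
        }
        for (String pair : url.substring(queryStart + 1).split(";")) {
            int equals = pair.indexOf('=');
            if (equals > 0) {
                properties.put(pair.substring(0, equals), pair.substring(equals + 1));
            }
        }
        return properties;
    }

    // The endpoint may be omitted only for https://api.qubole.com.
    static String endpoint(Map<String, String> properties) {
        return properties.getOrDefault("endpoint", DEFAULT_ENDPOINT);
    }

    // The chunk size is in MB and defaults to 100.
    static int chunkSizeMb(Map<String, String> properties) {
        String value = properties.get("chunk_size");
        return value == null ? DEFAULT_CHUNK_SIZE_MB : Integer.parseInt(value);
    }

    public static void main(String[] args) {
        Map<String, String> explicit = parse(
                "jdbc:qubole://hive/default/tpcds_orc_500?endpoint=https://api.qubole.com;chunk_size=86");
        System.out.println(endpoint(explicit) + " " + chunkSizeMb(explicit));

        Map<String, String> defaults = parse("jdbc:qubole://presto/default");
        System.out.println(endpoint(defaults) + " " + chunkSizeMb(defaults));
    }
}
```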
Additional Properties (Optional)
In addition, you can:
- Enable Logging as described in Enabling Logging.
- Enable Proxy as described in Enabling the Proxy Connection.
Setting the Connection String for Spark Queries
Use the following syntax to set the JDBC connection string for Spark queries.
jdbc:qubole://spark/<Cluster-Label>/<app-id>[/<database>][?propertyName1=propertyValue1[;propertyName2=propertyValue2]...]
For example:
jdbc:qubole://spark/spark-cluster/85/my-sql?endpoint=https://us.qubole.com;chunk_size=86
Note
Create an app with the configuration parameter zeppelin.spark.maxResult=<A VERY BIG VALUE>. A query can return
only up to the configured maximum number of rows.
In the connection string, spark (command type), the cluster label, and the app-id are mandatory; database name and
property name/value are optional.
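The Spark syntax differs from the Hive and Presto syntax by the extra app-id segment after the cluster label, which this section describes as mandatory. The following Java sketch assembles such a string; `QuboleSparkJdbcUrl` and `build` are hypothetical names, not part of the Qubole JDBC driver, and no escaping is applied.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the Qubole JDBC driver.
final class QuboleSparkJdbcUrl {

    // Builds jdbc:qubole://spark/<Cluster-Label>/<app-id>[/<database>][?name=value[;name=value]...]
    static String build(String clusterLabel, String appId,
                        String database, Map<String, String> properties) {
        if (clusterLabel == null || clusterLabel.isEmpty()) {
            throw new IllegalArgumentException("cluster label is mandatory");
        }
        if (appId == null || appId.isEmpty()) {
            throw new IllegalArgumentException("app-id is mandatory");
        }
        String url = "jdbc:qubole://spark/" + clusterLabel + "/" + appId;
        // The database name is optional.
        if (database != null && !database.isEmpty()) {
            url += "/" + database;
        }
        // Properties are optional; pairs are separated by ';' after a single '?'.
        if (properties != null && !properties.isEmpty()) {
            url += properties.entrySet().stream()
                    .map(entry -> entry.getKey() + "=" + entry.getValue())
                    .collect(Collectors.joining(";", "?", ""));
        }
        return url;
    }

    public static void main(String[] args) {
        Map<String, String> properties = new LinkedHashMap<>();
        properties.put("endpoint", "https://us.qubole.com");
        properties.put("chunk_size", "86");
        // Reproduces the Spark example connection string shown above.
        System.out.println(build("spark-cluster", "85", "my-sql", properties));
    }
}
```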
Note
If you do not specify a database in the connection string, specify the database in the query or use fully qualified table names.
Specifying app-id is mandatory. An app is the main abstraction in the Spark Job Server API; it stores
the configuration for a Spark application. Creating an app returns an app-id. You can also get the app-id by using the
GET API http://api.qubole.com/api/v1.2/apps. See Understanding the Spark Job Server for more information.
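As a rough sketch, the GET call mentioned above could be issued from Java 11 or later with `java.net.http`. Several details here are assumptions that this page does not confirm: the `X-AUTH-TOKEN` header for passing the API token, the hypothetical `QUBOLE_API_TOKEN` environment variable, and the class name `QuboleAppsRequest`. The response format is not described here, so the sketch only prints the raw status and body.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper for listing apps; verify the header name against the QDS API documentation.
final class QuboleAppsRequest {
    // URL as given in the text above.
    static final String APPS_URL = "http://api.qubole.com/api/v1.2/apps";

    // Builds the GET request without sending it.
    static HttpRequest listApps(String apiToken) {
        return HttpRequest.newBuilder(URI.create(APPS_URL))
                .header("X-AUTH-TOKEN", apiToken) // assumption: token is passed in this header
                .header("Accept", "application/json")
                .GET()
                .build();
    }

    public static void main(String[] args) throws Exception {
        String token = System.getenv("QUBOLE_API_TOKEN"); // hypothetical variable name
        if (token == null || token.isEmpty()) {
            System.out.println("Set QUBOLE_API_TOKEN to send the request.");
            return;
        }
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(listApps(token), HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
        System.out.println(response.body()); // look up the app-id in this output
    }
}
```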