Setting the JDBC Connection String

You must set the JDBC connection string for Hive, Presto, and Spark queries.

Setting the Connection String for Hive and Presto Queries (AWS and Azure)

Use the following syntax to set the JDBC connection string for Hive and Presto queries.

jdbc:qubole://<hive or presto>/<Cluster-Label>[/<database>][?propertyName1=propertyValue1[;propertyName2=propertyValue2]...]

In the connection string, <hive or presto> (the command type) and the cluster label are mandatory; the database name and the property name/value pairs are optional.

Note

If you do not specify a database, then in the query, specify either the database or fully-qualified table names.
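As an illustration of the syntax above, the following Java sketch assembles a Hive or Presto connection string from its parts and rejects a missing mandatory part. The class HivePrestoUrl and its build method are hypothetical helpers written for this example; they are not part of the Qubole JDBC driver.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper (not part of the Qubole driver) that assembles a Hive or Presto URL. */
final class HivePrestoUrl {

    /**
     * @param commandType  "hive" or "presto" (mandatory)
     * @param clusterLabel cluster label (mandatory)
     * @param database     optional; pass null or "" to omit it
     * @param properties   optional name/value pairs, emitted in iteration order
     */
    static String build(String commandType, String clusterLabel,
                        String database, Map<String, String> properties) {
        if (!"hive".equals(commandType) && !"presto".equals(commandType)) {
            throw new IllegalArgumentException("command type must be hive or presto");
        }
        if (clusterLabel == null || clusterLabel.isEmpty()) {
            throw new IllegalArgumentException("cluster label is mandatory");
        }
        StringBuilder url = new StringBuilder("jdbc:qubole://")
                .append(commandType).append('/').append(clusterLabel);
        if (database != null && !database.isEmpty()) {
            url.append('/').append(database);
        }
        if (!properties.isEmpty()) {
            // Properties start with '?' and are separated by ';'.
            StringJoiner joined = new StringJoiner(";", "?", "");
            properties.forEach((name, value) -> joined.add(name + "=" + value));
            url.append(joined);
        }
        return url.toString();
    }

    public static void main(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("endpoint", "https://api.qubole.com");
        props.put("chunk_size", "86");
        System.out.println(build("hive", "default", "tpcds_orc_500", props));
    }
}
```

A LinkedHashMap is used so that the properties appear in the order in which they were added. The sketch does not escape special characters in labels or values.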

An example of a connection string for Hive and Presto queries is shown below.

jdbc:qubole://hive/default/tpcds_orc_500?endpoint=https://api.qubole.com;chunk_size=86

In the above example, https://api.qubole.com is one of the QDS endpoints on AWS. For a list of supported endpoints, see Supported Qubole Endpoints on Different Cloud Providers.
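The example connection string above could be used from Java roughly as follows. This is a minimal sketch under several assumptions: the Qubole JDBC driver JAR is on the classpath and registers itself with DriverManager, the driver reads the account API token from the standard password property, and the token is available in an environment variable named QUBOLE_API_TOKEN (a name chosen for this example). The class QuboleQueryExample is hypothetical.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Properties;

/** Hypothetical example class; only the java.sql calls are standard JDBC. */
final class QuboleQueryExample {

    static final String URL =
            "jdbc:qubole://hive/default/tpcds_orc_500?endpoint=https://api.qubole.com;chunk_size=86";

    // Assumption: the driver takes the API token from the "password" property.
    static Properties credentials(String apiToken) {
        Properties props = new Properties();
        props.setProperty("password", apiToken);
        return props;
    }

    /** Runs one query and prints the first column of every row. */
    static void printFirstColumn(String url, String apiToken, String sql) throws SQLException {
        try (Connection conn = DriverManager.getConnection(url, credentials(apiToken));
             Statement stmt = conn.createStatement();
             ResultSet rows = stmt.executeQuery(sql)) {
            while (rows.next()) {
                System.out.println(rows.getString(1));
            }
        }
    }

    public static void main(String[] args) throws SQLException {
        String token = System.getenv("QUBOLE_API_TOKEN"); // assumed variable name
        if (token == null) {
            throw new IllegalStateException("set QUBOLE_API_TOKEN first");
        }
        printFirstColumn(URL, token, "SHOW TABLES");
    }
}
```

Because the URL names the tpcds_orc_500 database, the query can use unqualified table names. Without a database in the URL, the query would have to name the database or use fully-qualified table names.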

Connection String Properties

Property Name | Property Value
endpoint | The API endpoint. You can omit it only when you use the https://api.qubole.com endpoint. You must specify it for any other QDS-on-AWS endpoint and for other Cloud providers. For the list, see Supported Qubole Endpoints on Different Cloud Providers.
chunk_size | The chunk size, in MB, used when streaming large results from Cloud storage. The default value is 100 MB. Reduce the value if you face out-of-memory (OOM) issues.
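To make the documented defaults concrete, the following Java sketch reads the properties from a connection string and falls back to https://api.qubole.com and 100 MB when endpoint or chunk_size is absent. It only mirrors the defaults described in the table; the class ConnectionProperties is hypothetical, and the driver's own parsing may differ.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration of the documented defaults; not the driver's parser. */
final class ConnectionProperties {

    static final String DEFAULT_ENDPOINT = "https://api.qubole.com";
    static final int DEFAULT_CHUNK_SIZE_MB = 100;

    /** Splits the part after '?' into name/value pairs separated by ';'. */
    static Map<String, String> parse(String url) {
        Map<String, String> props = new LinkedHashMap<>();
        int query = url.indexOf('?');
        if (query < 0 || query == url.length() - 1) {
            return props; // no properties given
        }
        for (String pair : url.substring(query + 1).split(";")) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                props.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
        return props;
    }

    static String endpoint(String url) {
        return parse(url).getOrDefault("endpoint", DEFAULT_ENDPOINT);
    }

    static int chunkSizeMb(String url) {
        String value = parse(url).get("chunk_size");
        return value == null ? DEFAULT_CHUNK_SIZE_MB : Integer.parseInt(value);
    }

    public static void main(String[] args) {
        String url = "jdbc:qubole://hive/default/tpcds_orc_500?chunk_size=86";
        System.out.println(endpoint(url));    // endpoint omitted, so the default applies
        System.out.println(chunkSizeMb(url)); // explicit value from the URL
    }
}
```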

Additional Properties (Optional)

In addition, you can:

Setting the Connection String for Spark Queries

Use the following syntax to set the JDBC connection string for Spark queries.

jdbc:qubole://spark/<Cluster-Label>/<app-id>[/<database>][?propertyName1=propertyValue1[;propertyName2=propertyValue2]...]

For example:

jdbc:qubole://spark/spark-cluster/85/my-sql?endpoint=https://us.qubole.com;chunk_size=86

Note

Create an app with the configuration parameter zeppelin.spark.maxResult=<A VERY BIG VALUE>. A query returns at most the configured maximum number of rows.

In the connection string, spark (the command type), the cluster label, and the app-id are mandatory; the database name and the property name/value pairs are optional.
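The Spark syntax can be assembled in the same way. The following Java sketch is a hypothetical helper (SparkUrl is not part of the Qubole driver) that places the app-id directly after the cluster label and requires both to be present.

```java
/** Hypothetical helper (not part of the Qubole driver) that assembles a Spark URL. */
final class SparkUrl {

    /**
     * @param clusterLabel cluster label (mandatory)
     * @param appId        app-id (mandatory)
     * @param database     optional; pass null or "" to omit it
     * @param properties   optional entries of the form "name=value"
     */
    static String build(String clusterLabel, String appId,
                        String database, String... properties) {
        if (clusterLabel == null || clusterLabel.isEmpty()) {
            throw new IllegalArgumentException("cluster label is mandatory");
        }
        if (appId == null || appId.isEmpty()) {
            throw new IllegalArgumentException("app-id is mandatory");
        }
        StringBuilder url = new StringBuilder("jdbc:qubole://spark/")
                .append(clusterLabel).append('/').append(appId);
        if (database != null && !database.isEmpty()) {
            url.append('/').append(database);
        }
        if (properties.length > 0) {
            // Properties start with '?' and are separated by ';'.
            url.append('?').append(String.join(";", properties));
        }
        return url.toString();
    }

    public static void main(String[] args) {
        System.out.println(build("spark-cluster", "85", "my-sql",
                "endpoint=https://us.qubole.com", "chunk_size=86"));
    }
}
```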

Note

If you do not specify a database, then in the query, specify either the database or fully-qualified table names.

Specifying the app-id is mandatory. An app is the main abstraction in the Spark Job Server API; it stores the configuration for a Spark application. Creating an app returns an app-id. You can also retrieve the app id with a GET request to http://api.qubole.com/api/v1.2/apps. See Understanding the Spark Job Server for more information.
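The GET request for the apps list could be issued from Java as sketched below, using the standard java.net.http client (Java 11 or later). The sketch assumes that the API token is sent in an X-AUTH-TOKEN header and is stored in an environment variable named QUBOLE_API_TOKEN; it also calls the endpoint over https. The class ListSparkApps is hypothetical, and the response body is printed as-is because its JSON layout is not described here.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

/** Hypothetical example class that lists apps so that an app-id can be looked up. */
final class ListSparkApps {

    /** Builds the GET request for <endpoint>/api/v1.2/apps. */
    static HttpRequest listAppsRequest(String endpoint, String apiToken) {
        return HttpRequest.newBuilder(URI.create(endpoint + "/api/v1.2/apps"))
                .header("X-AUTH-TOKEN", apiToken) // assumption: token header name
                .header("Accept", "application/json")
                .GET()
                .build();
    }

    public static void main(String[] args) throws Exception {
        String token = System.getenv("QUBOLE_API_TOKEN"); // assumed variable name
        if (token == null) {
            throw new IllegalStateException("set QUBOLE_API_TOKEN first");
        }
        HttpResponse<String> response = HttpClient.newHttpClient().send(
                listAppsRequest("https://api.qubole.com", token),
                HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```

Pass the endpoint that matches your account, for example https://us.qubole.com, so that the apps list comes from the same environment as the connection string.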