
Pulsar


1. Overview

The NB Pulsar driver lets you simulate and run different types of workloads (described below) against a Pulsar cluster through NoSQLBench (NB).

1.1. Issues Tracker

If you have issues or new requirements for this driver, please add them to the Pulsar issue tracker.

2. Execute the NB Pulsar Driver Workload

Running an NB Pulsar driver workload uses a command similar to that of other NB driver types, with some execution parameters that are unique to this driver. The general command has the following format:

<nb_cmd> run driver=pulsar threads=<thread_num> cycles=<cycle_count> web_url=<pulsar_web_svc_url> service_url=<pulsar_svc_url> config=<pulsar_client_config_property_file> yaml=<nb_scenario_yaml_file> [<other_common_NB_execution_parameters>]

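In the above command, make sure the driver type is pulsar and provide the Pulsar driver specific parameters: web_url (the Pulsar web service URL), service_url (the Pulsar service URL), and config (the Pulsar client configuration properties file).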

2.1. NB Pulsar Driver Yaml File High Level Structure

Just like other NB driver types, the actual NB Pulsar workload is defined in a YAML file with the following high level structure:

description: |
  ...

bindings:
  ...

params:
  ...

blocks:
  <block_1>:
    ops:
      op1:
        <OpTypeIdentifier>: "<static_or_dynamic_value>"
        <op_param_1>: "<some_value>"
        <op_param_2>: "<some_value>"
        ...

  <block_2>:
  ...

2.2. NB Pulsar Driver Configuration Parameters

The NB Pulsar driver configuration parameters can be set at 3 different levels: global level, document level, and op (OpTemplate) level.

Please NOTE that when a parameter is specified at multiple levels, the one at the lowest level takes precedence.

2.2.1. Global Level Parameters

The parameters at this level are those listed in the properties file that is passed through the config command line parameter.

The NB Pulsar driver relies on Pulsar's Java Client API to complete its workloads, such as creating/deleting tenants/namespaces/topics, generating messages, creating producers to send messages, and creating consumers to receive messages. The Pulsar client API has different configuration parameters to control the execution behavior. For example, the Pulsar Java client documentation lists all possible configuration parameters for how a Pulsar producer can be created.

All these Pulsar "native" parameters are supported by the NB Pulsar driver via the global configuration properties file (e.g. config.properties). An example of the structure of this file is shown below:

### Schema related configurations - MUST start with prefix "schema."
#schema.key.type=avro
#schema.key.definition=</path/to/avro-key-example.avsc>
schema.type=avro
schema.definition=</path/to/avro-value-example.avsc>

### Pulsar client related configurations - MUST start with prefix "client."
# http://pulsar.apache.org/docs/en/client-libraries-java/#client
client.connectionTimeoutMs=5000
client.authPluginClassName=org.apache.pulsar.client.impl.auth.AuthenticationToken
client.authParams=
# ...

### Producer related configurations (global) - MUST start with prefix "producer."
# http://pulsar.apache.org/docs/en/client-libraries-java/#configure-producer
producer.sendTimeoutMs=
producer.blockIfQueueFull=true
# ...

### Consumer related configurations (global) - MUST start with prefix "consumer."
# http://pulsar.apache.org/docs/en/client-libraries-java/#configure-consumer
consumer.subscriptionInitialPosition=Earliest
consumer.deadLetterPolicy={"maxRedeliverCount":"5","retryLetterTopic":"public/default/retry","deadLetterTopic":"public/default/dlq","initialSubscriptionName":"dlq-sub"}
consumer.ackTimeoutRedeliveryBackoff={"minDelayMs":"10","maxDelayMs":"20","multiplier":"1.2"}
# ...

This file has multiple sections, each corresponding to one category of configuration parameters and identified by a mandatory prefix: schema., client., producer., and consumer.

2.2.2. Document Level Parameters

For the NB Pulsar driver, document level parameters can only be statically bound. Currently, the following document level configuration parameters are supported:

3. NB Pulsar Driver OpTemplates

For the NB Pulsar driver, each OpTemplate has the following format:

blocks:
  <some_block_name>:
    ops:
      <some_op_name>:
        <OpTypeIdentifier>: <tenant|namespace|topic_name>
        <op_param_1>: "<some_value>"
        <op_param_2>: "<some_value>"
        ...

The OpTypeIdentifier determines which NB Pulsar workload type (OpType) to run, and it must be one of the following values:

public enum PulsarOpType {
    AdminTenant,
    AdminNamespace,
    AdminTopic,
    MessageProduce,
    MessageConsume
}

The value assigned to the OpTypeIdentifier is mandatory. Depending on the actual identifier, it is a tenant name, a namespace name, or a topic name.

Each Pulsar OpType may have optional Op specific parameters. Please refer to the example NB Pulsar YAML files for each OpType.
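A sketch of how the identifier values might be written for the admin OpTypes is shown below. The tenant, namespace, topic, block, op, and binding names are all hypothetical, and the "tenant/namespace" and "persistent://tenant/namespace/topic" forms are assumed from standard Pulsar naming rather than stated in this document. Optional Op specific parameters are left out.

```yaml
bindings:
  # Illustrative binding: cycles through the values 0..4 as strings.
  topic_id: Mod(5); ToString()

blocks:
  admin-block:
    ops:
      tenant-op:
        # Static value: a tenant name.
        AdminTenant: "nbtenant"
      namespace-op:
        # Static value: a namespace name, qualified by its tenant.
        AdminNamespace: "nbtenant/nbns"
      topic-op:
        # Dynamic value: the topic name is completed from a binding.
        AdminTopic: "persistent://nbtenant/nbns/topic-{topic_id}"
```

With this layout, the same block can address several topics across cycles because the topic name is resolved per cycle from the topic_id binding.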

4. Message Generation and Schema Support

4.1. Message Generation

A Pulsar message has three main components: message key, message properties, and message payload. Among them, message payload is mandatory when creating a message.

When running the "message producing" workload, the NB Pulsar driver is able to generate a message with its full content via the following OpTemplate level parameters: msg_key, msg_property, and msg_value.

Their actual values can be static or dynamic (as determined by NB data binding rules).
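To make the static versus dynamic distinction concrete, the sketch below sets all three message parts on one MessageProduce op. The binding names, block and op names, the topic, and the "source" property are hypothetical, and the binding functions are only illustrative choices.

```yaml
bindings:
  # Illustrative bindings; any NB binding recipe could be used here.
  mykey: Mod(10); ToString()
  myprop1: NumberNameToString()
  payload: AlphaNumericString(20)

blocks:
  msg-produce-block:
    ops:
      produce-op:
        MessageProduce: "persistent://public/default/nb-example"
        # Dynamic: resolved from the mykey binding on every cycle.
        msg_key: "{mykey}"
        # Mixed: one static property and one bound property.
        msg_property: |
          {
            "source": "nb",
            "prop1": "{myprop1}"
          }
        # Dynamic: a generated 20-character payload.
        msg_value: "{payload}"
```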

For msg_key, its value can be either

For msg_property, its value needs to be a JSON string that contains a list of key-value pairs. An example is as below. Please NOTE that if the provided value is not a valid JSON string, the NB Pulsar driver will ignore it and treat the message as having no properties.

  msg_property: |
    {
      "prop1": "{myprop1}",
      "prop2": "{myprop2}"
    }

For msg_value, its value can be either

4.2. Schema Support

The NB Pulsar driver supports the following Pulsar schema types:

The following 2 global configuration parameters define the required schema type: schema.key.type and schema.type.

The following 2 global configuration parameters define the schema specification (ONLY needed when Avro is the schema type): schema.key.definition and schema.definition.
