
# Azure Data Explorer


## Synopsis

Creates an Azure Data Explorer (Kusto) target that ingests data directly into Azure Data Explorer tables. Supports multiple tables, custom schemas, and various file formats.

## Schema

```yaml
- name: <string>
  description: <string>
  type: azdx
  pipelines: <pipeline[]>
  status: <boolean>
  properties:
    tenant_id: <string>
    client_id: <string>
    client_secret: <string>
    endpoint: <string>
    database: <string>
    table: <string>
    schema: <string>
    type: <string>
    flush_immediately: <boolean>
    timeout: <numeric>
    batch_size: <numeric>
    max_size: <numeric>
    field_format: <string>
    tables:
      - name: <string>
        schema: <string>
    interval: <string|numeric>
    cron: <string>
    debug:
      status: <boolean>
      dont_send_logs: <boolean>
```

## Configuration

The following fields are used to define the target:

| Field | Required | Default | Description |
|---|---|---|---|
| `name` | Y | - | Target name |
| `description` | N | - | Optional description |
| `type` | Y | - | Must be `azdx` |
| `pipelines` | N | - | Optional post-processor pipelines |
| `status` | N | `true` | Enable/disable the target |

### Azure

| Field | Required | Default | Description |
|---|---|---|---|
| `tenant_id` | N(1) | - | Azure tenant ID (required unless using managed identity) |
| `client_id` | N(1) | - | Azure client ID (required unless using managed identity) |
| `client_secret` | N(1) | - | Azure client secret (required unless using managed identity) |
| `endpoint` | Y | - | Azure Data Explorer cluster endpoint |
| `database` | Y | - | Target database name |
| `table` | N(2) | - | Default/fallback table name (catch-all for unmatched events) |
| `schema` | N(2) | - | Table schema definition for the default/fallback table |
| `type` | N | `parquet` | Data format: `parquet`, `json`, `multijson`, or `avro` |

(1) = Required unless the target authenticates with a managed identity.

(2) = Required if you want a catch-all table for unmatched events, or if you are not using the `tables` array.
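For instance, when the collector runs on an Azure resource with a managed identity, the three credential fields can simply be left out. The snippet below is a minimal sketch; it assumes that omitting them is how managed-identity authentication is selected:

```yaml
targets:
  - name: managed_identity_adx
    type: azdx
    properties:
      # tenant_id, client_id, and client_secret omitted:
      # authentication falls back to the managed identity (assumption)
      endpoint: "https://cluster.region.kusto.windows.net"
      database: "logs"
      table: "system_events"
```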

### Ingestion Options

| Field | Required | Default | Description |
|---|---|---|---|
| `flush_immediately` | N | `true` | Send data to ADX without waiting for batch completion |
| `timeout` | N | `30` | Connection timeout in seconds |
| `batch_size` | N | `100000` | Maximum number of messages per batch |
| `max_size` | N | `32MB` | Maximum batch size in bytes (`0` for unlimited) |
| `field_format` | N | - | Data normalization format. See the applicable Normalization section |

### Multiple Tables

You can define multiple tables to ingest data into:

```yaml
targets:
  - name: azdx_multiple_tables
    type: azdx
    properties:
      tables:
        - name: "security_logs"
          schema: "<schema definition>"
        - name: "system_logs"
          schema: "<schema definition>"
```

### Scheduler

| Field | Required | Default | Description |
|---|---|---|---|
| `interval` | N | `realtime` | Execution frequency. See Interval for details |
| `cron` | N | - | Cron expression for scheduled execution. See Cron for details |
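For instance, switching a target from realtime to scheduled execution only requires one of these fields. The snippet below is a minimal sketch: the field placement follows the schema above, and the expression assumes standard 5-field cron syntax (see the Cron documentation for the exact format):

```yaml
targets:
  - name: scheduled_adx
    type: azdx
    properties:
      endpoint: "https://cluster.region.kusto.windows.net"
      database: "logs"
      table: "system_events"
      cron: "0 * * * *" # top of every hour (assumes 5-field cron syntax)
```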

### Debug Options

| Field | Required | Default | Description |
|---|---|---|---|
| `debug.status` | N | `false` | Enable debug logging |
| `debug.dont_send_logs` | N | `false` | Process logs but don't send them to the target (testing) |

## Details

The Azure Data Explorer target supports ingesting data into multiple tables with different schemas. When a log entry includes the `SystemS3` field, its value is used to route the message to the appropriate table.
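For example, a log entry shaped like the following (all fields other than `SystemS3` are illustrative) would be routed to the `security_logs` table, provided that table is defined:

```json
{
  "SystemS3": "security_logs",
  "TimeGenerated": "2024-01-01T12:00:00Z",
  "Computer": "web-01",
  "Message": "Failed login attempt"
}
```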

### Table Routing and Catch-All Behavior

The target uses a routing system to direct events to the appropriate table:

1. **Explicit table matching**: If an event has a `SystemS3` field, the target looks for a table defined in the `tables` array with a matching name.
2. **Catch-all table**: If no matching table is found (or if `SystemS3` is not set), the event is routed to the default table specified at the root level.

The `table` and `schema` properties at the root level serve as a catch-all mechanism. This is particularly useful for automation scenarios where systems may look for specific tables with specific schemas. If no matching table is found in the `tables` array, these events fall back to the default table instead of being dropped.

Example routing logic:

- Event with `SystemS3="security_logs"` → routed to the `security_logs` table, if defined
- Event with `SystemS3="unknown_table"` → routed to the default table, if configured
- Event without `SystemS3` → routed to the default table, if configured

The target automatically validates table existence before starting ingestion. Data is buffered locally until `batch_size` or `max_size` is reached, or until an explicit flush is triggered.

For tables not defined in the configuration, the target can automatically discover them from the database if they exist. The service principal must have appropriate permissions on the database and tables.
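As a sketch, database-level ingestion rights could be granted to the service principal with a Kusto management command along these lines (the database name and IDs are placeholders, and your environment may call for table-level or additional role assignments):

```kusto
// Grant the service principal (app ID; tenant ID) ingestor rights
// on the target database so the azdx target can write to it.
.add database logs ingestors ('aadapp=11111111-1111-1111-1111-111111111111;00000000-0000-0000-0000-000000000000')
```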

### Formats

| Format | Description |
|---|---|
| `json` | Each log entry is written as a separate JSON line (JSONL format) |
| `multijson` | All log entries are written as a single JSON array |
| `avro` | Apache Avro format with schema |
| `parquet` | Apache Parquet columnar format with schema (default) |
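To make the difference between the two JSON variants concrete, the same two events (field names are illustrative) would be serialized as follows. With `json`, one object per line:

```json
{"TimeGenerated":"2024-01-01T00:00:00Z","Message":"first event"}
{"TimeGenerated":"2024-01-01T00:00:01Z","Message":"second event"}
```

With `multijson`, a single array:

```json
[
  {"TimeGenerated":"2024-01-01T00:00:00Z","Message":"first event"},
  {"TimeGenerated":"2024-01-01T00:00:01Z","Message":"second event"}
]
```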
:::warning
Consider cluster capacity when setting batch sizes and timeouts.
:::

## Examples

### Basic

The minimum required configuration for Parquet ingestion:

```yaml
targets:
  - name: basic_adx
    type: azdx
    properties:
      tenant_id: "00000000-0000-0000-0000-000000000000"
      client_id: "11111111-1111-1111-1111-111111111111"
      client_secret: "your-client-secret"
      endpoint: "https://cluster.region.kusto.windows.net"
      database: "logs"
      table: "system_events"
```

### Multiple Tables with Catch-All

Configuration with multiple target tables and a catch-all default table:

```yaml
targets:
  - name: multi_table_adx
    type: azdx
    properties:
      tenant_id: "00000000-0000-0000-0000-000000000000"
      client_id: "11111111-1111-1111-1111-111111111111"
      client_secret: "your-client-secret"
      endpoint: "https://cluster.region.kusto.windows.net"
      database: "logs"
      type: "parquet"
      # Catch-all table for unmatched events
      table: "general_logs"
      schema: "TimeGenerated:datetime,Message:string,Source:string"
      tables:
        - name: "security_events"
          schema: "TimeGenerated:datetime,Computer:string,EventID:int,Message:string"
        - name: "system_events"
          schema: "TimeGenerated:datetime,Computer:string,EventID:int,Message:string"
        - name: "application_events"
          schema: "TimeGenerated:datetime,Computer:string,EventID:int,Message:string"
```

In this example, events with `SystemS3` set to `"security_events"`, `"system_events"`, or `"application_events"` are routed to their respective tables. All other events are routed to the `general_logs` catch-all table.

### High-Volume

Configuration optimized for high-volume ingestion:

```yaml
targets:
  - name: high_volume_adx
    type: azdx
    properties:
      tenant_id: "00000000-0000-0000-0000-000000000000"
      client_id: "11111111-1111-1111-1111-111111111111"
      client_secret: "your-client-secret"
      endpoint: "https://cluster.region.kusto.windows.net"
      database: "logs"
      type: "parquet"
      batch_size: 50000
      max_size: 536870912 # 512 MB
      timeout: 60
      flush_immediately: false
```
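As a rough sanity check on these numbers: at an assumed average event size of 10 KB, a full batch of 50,000 events comes to about 500 MB, just under the 536,870,912-byte (512 MB) `max_size` cap, so the message limit and the byte limit trip at roughly the same point.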

### With Debugging

Configuration with debug options:

```yaml
targets:
  - name: debug_adx
    type: azdx
    properties:
      tenant_id: "00000000-0000-0000-0000-000000000000"
      client_id: "11111111-1111-1111-1111-111111111111"
      client_secret: "your-client-secret"
      endpoint: "https://cluster.region.kusto.windows.net"
      database: "logs"
      debug:
        status: true
        dont_send_logs: true # Test mode that doesn't actually upload
```

### Normalized

Using field normalization before ingestion:

```yaml
targets:
  - name: normalized_adx
    type: azdx
    properties:
      tenant_id: "00000000-0000-0000-0000-000000000000"
      client_id: "11111111-1111-1111-1111-111111111111"
      client_secret: "your-client-secret"
      endpoint: "https://cluster.region.kusto.windows.net"
      database: "logs"
      field_format: "asim"
```

### Automation-Friendly Configuration

Configuration designed for automation tools that expect specific table names:

```yaml
targets:
  - name: automation_adx
    type: azdx
    properties:
      tenant_id: "00000000-0000-0000-0000-000000000000"
      client_id: "11111111-1111-1111-1111-111111111111"
      client_secret: "your-client-secret"
      endpoint: "https://cluster.region.kusto.windows.net"
      database: "logs"
      # Catch-all ensures automation-generated events always have a destination
      table: "automated_events"
      schema: "Timestamp:datetime,EventType:string,Data:string,Source:string"
      tables:
        - name: "monitoring_alerts"
          schema: "Timestamp:datetime,AlertLevel:string,Message:string"
        - name: "deployment_logs"
          schema: "Timestamp:datetime,Service:string,Version:string,Status:string"
```

In this configuration, automation tools looking for `monitoring_alerts` or `deployment_logs` will find their specific tables with the expected schemas. Any other automated events are captured in the `automated_events` catch-all table, ensuring no data is lost.