
# Azure Data Explorer


## Synopsis

Creates an Azure Data Explorer (Kusto) target that ingests data directly into Azure Data Explorer tables. Supports multiple tables, custom schemas, and various file formats.

## Schema

```yaml
- name: <string>
  description: <string>
  type: azdx
  pipelines: <pipeline[]>
  status: <boolean>
  properties:
    tenant_id: <string>
    client_id: <string>
    client_secret: <string>
    endpoint: <string>
    database: <string>
    table: <string>
    schema: <string>
    type: <string>
    flush_immediately: <boolean>
    timeout: <numeric>
    batch_size: <numeric>
    max_size: <numeric>
    field_format: <string>
    tables:
      - name: <string>
        schema: <string>
    interval: <string|numeric>
    cron: <string>
    debug:
      status: <boolean>
      dont_send_logs: <boolean>
```

## Configuration

The following fields are used to define the target:

| Field | Required | Default | Description |
|---|---|---|---|
| `name` | Y | - | Target name |
| `description` | N | - | Optional description |
| `type` | Y | - | Must be `azdx` |
| `pipelines` | N | - | Optional post-processor pipelines |
| `status` | N | `true` | Enable/disable the target |

### Azure

| Field | Required | Default | Description |
|---|---|---|---|
| `tenant_id` | N(1) | - | Azure tenant ID (required unless using managed identity) |
| `client_id` | N(1) | - | Azure client ID (required unless using managed identity) |
| `client_secret` | N(1) | - | Azure client secret (required unless using managed identity) |
| `endpoint` | Y | - | Azure Data Explorer cluster endpoint |
| `database` | Y | - | Target database name |
| `table` | N(2) | - | Default/fallback table name (catch-all for unmatched events) |
| `schema` | N(2) | - | Table schema definition for the default/fallback table |
| `type` | N | `parquet` | Data format: `parquet`, `json`, `multijson`, or `avro` |

(1) = Required unless the target authenticates with a managed identity.

(2) = Required if you want a catch-all table for unmatched events, or if you are not using the `tables` array.
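For instance, when the collector runs on an Azure resource with a managed identity, the three credential fields can simply be left out. The snippet below is a minimal sketch; it assumes that omitting them is how managed-identity authentication is selected:

```yaml
targets:
  - name: managed_identity_adx
    type: azdx
    properties:
      # tenant_id, client_id, and client_secret omitted:
      # authentication falls back to the managed identity (assumption)
      endpoint: "https://cluster.region.kusto.windows.net"
      database: "logs"
      table: "system_events"
```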

### Ingestion Options

| Field | Required | Default | Description |
|---|---|---|---|
| `flush_immediately` | N | `true` | Send data to ADX without waiting for batch completion |
| `timeout` | N | `30` | Connection timeout in seconds |
| `batch_size` | N | `100000` | Maximum number of messages per batch |
| `max_size` | N | `32MB` | Maximum batch size in bytes (`0` for unlimited) |
| `field_format` | N | - | Data normalization format. See the applicable Normalization section |

### Multiple Tables

You can define multiple tables to ingest data into:

```yaml
targets:
  - name: azdx_multiple_tables
    type: azdx
    properties:
      tables:
        - name: "security_logs"
          schema: "<schema definition>"
        - name: "system_logs"
          schema: "<schema definition>"
```

### Scheduler

| Field | Required | Default | Description |
|---|---|---|---|
| `interval` | N | `realtime` | Execution frequency. See Interval for details |
| `cron` | N | - | Cron expression for scheduled execution. See Cron for details |
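For instance, switching a target from realtime to scheduled execution only requires one of these fields. The snippet below is a minimal sketch: the field placement follows the schema above, and the expression assumes standard 5-field cron syntax (see the Cron documentation for the exact format):

```yaml
targets:
  - name: scheduled_adx
    type: azdx
    properties:
      endpoint: "https://cluster.region.kusto.windows.net"
      database: "logs"
      table: "system_events"
      cron: "0 * * * *" # top of every hour (assumes 5-field cron syntax)
```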

### Debug Options

| Field | Required | Default | Description |
|---|---|---|---|
| `debug.status` | N | `false` | Enable debug logging |
| `debug.dont_send_logs` | N | `false` | Process logs but don't send them to the target (testing) |

## Details

The Azure Data Explorer target supports ingesting data into multiple tables with different schemas. When a log entry includes the `SystemS3` field, its value is used to route the message to the appropriate table.
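For example, a log entry shaped like the following (all fields other than `SystemS3` are illustrative) would be routed to the `security_logs` table, provided that table is defined:

```json
{
  "SystemS3": "security_logs",
  "TimeGenerated": "2024-01-01T12:00:00Z",
  "Computer": "web-01",
  "Message": "Failed login attempt"
}
```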

### Table Routing and Catch-All Behavior

The target uses a routing system to direct events to the appropriate table:

1. **Explicit table matching**: If an event has a `SystemS3` field, the target looks for a table defined in the `tables` array with a matching name.
2. **Catch-all table**: If no matching table is found (or if `SystemS3` is not set), the event is routed to the default table specified at the root level.

The `table` and `schema` properties at the root level serve as a catch-all mechanism. This is particularly useful for automation scenarios where systems may look for specific tables with specific schemas. If no matching table is found in the `tables` array, these events fall back to the default table instead of being dropped.

Example routing logic:

- Event with `SystemS3="security_logs"` → routed to the `security_logs` table, if defined
- Event with `SystemS3="unknown_table"` → routed to the default table, if configured
- Event without `SystemS3` → routed to the default table, if configured

The target automatically validates table existence before starting ingestion. Data is buffered locally until `batch_size` or `max_size` is reached, or until an explicit flush is triggered.

For tables not defined in the configuration, the target can automatically discover them from the database if they exist. The service principal must have appropriate permissions on the database and tables.
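As a sketch, database-level ingestion rights could be granted to the service principal with a Kusto management command along these lines (the database name and IDs are placeholders, and your environment may call for table-level or additional role assignments):

```kusto
// Grant the service principal (app ID; tenant ID) ingestor rights
// on the target database so the azdx target can write to it.
.add database logs ingestors ('aadapp=11111111-1111-1111-1111-111111111111;00000000-0000-0000-0000-000000000000')
```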

### Formats

| Format | Description |
|---|---|
| `json` | Each log entry is written as a separate JSON line (JSONL format) |
| `multijson` | All log entries are written as a single JSON array |
| `avro` | Apache Avro format with schema |
| `parquet` | Apache Parquet columnar format with schema (default) |
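To make the difference between the two JSON variants concrete, the same two events (field names are illustrative) would be serialized as follows. With `json`, one object per line:

```json
{"TimeGenerated":"2024-01-01T00:00:00Z","Message":"first event"}
{"TimeGenerated":"2024-01-01T00:00:01Z","Message":"second event"}
```

With `multijson`, a single array:

```json
[
  {"TimeGenerated":"2024-01-01T00:00:00Z","Message":"first event"},
  {"TimeGenerated":"2024-01-01T00:00:01Z","Message":"second event"}
]
```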
:::warning
Consider cluster capacity when setting batch sizes and timeouts.
:::

## Examples

### Basic

The minimum required configuration for Parquet ingestion:

```yaml
targets:
  - name: basic_adx
    type: azdx
    properties:
      tenant_id: "00000000-0000-0000-0000-000000000000"
      client_id: "11111111-1111-1111-1111-111111111111"
      client_secret: "your-client-secret"
      endpoint: "https://cluster.region.kusto.windows.net"
      database: "logs"
      table: "system_events"
```

### Multiple Tables with Catch-All

Configuration with multiple target tables and a catch-all default table:

```yaml
targets:
  - name: multi_table_adx
    type: azdx
    properties:
      tenant_id: "00000000-0000-0000-0000-000000000000"
      client_id: "11111111-1111-1111-1111-111111111111"
      client_secret: "your-client-secret"
      endpoint: "https://cluster.region.kusto.windows.net"
      database: "logs"
      type: "parquet"
      # Catch-all table for unmatched events
      table: "general_logs"
      schema: "TimeGenerated:datetime,Message:string,Source:string"
      tables:
        - name: "security_events"
          schema: "TimeGenerated:datetime,Computer:string,EventID:int,Message:string"
        - name: "system_events"
          schema: "TimeGenerated:datetime,Computer:string,EventID:int,Message:string"
        - name: "application_events"
          schema: "TimeGenerated:datetime,Computer:string,EventID:int,Message:string"
```

In this example, events with `SystemS3` set to `"security_events"`, `"system_events"`, or `"application_events"` are routed to their respective tables. All other events are routed to the `general_logs` catch-all table.

### High-Volume

Configuration optimized for high-volume ingestion:

```yaml
targets:
  - name: high_volume_adx
    type: azdx
    properties:
      tenant_id: "00000000-0000-0000-0000-000000000000"
      client_id: "11111111-1111-1111-1111-111111111111"
      client_secret: "your-client-secret"
      endpoint: "https://cluster.region.kusto.windows.net"
      database: "logs"
      type: "parquet"
      batch_size: 50000
      max_size: 536870912 # 512 MB
      timeout: 60
      flush_immediately: false
```
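As a rough sanity check on these numbers: at an assumed average event size of 10 KB, a full batch of 50,000 events comes to about 500 MB, just under the 536,870,912-byte (512 MB) `max_size` cap, so the message limit and the byte limit trip at roughly the same point.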

### With Debugging

Configuration with debug options:

```yaml
targets:
  - name: debug_adx
    type: azdx
    properties:
      tenant_id: "00000000-0000-0000-0000-000000000000"
      client_id: "11111111-1111-1111-1111-111111111111"
      client_secret: "your-client-secret"
      endpoint: "https://cluster.region.kusto.windows.net"
      database: "logs"
      debug:
        status: true
        dont_send_logs: true # Test mode that doesn't actually upload
```

### Normalized

Using field normalization before ingestion:

```yaml
targets:
  - name: normalized_adx
    type: azdx
    properties:
      tenant_id: "00000000-0000-0000-0000-000000000000"
      client_id: "11111111-1111-1111-1111-111111111111"
      client_secret: "your-client-secret"
      endpoint: "https://cluster.region.kusto.windows.net"
      database: "logs"
      field_format: "asim"
```

### Automation-Friendly Configuration

Configuration designed for automation tools that expect specific table names:

```yaml
targets:
  - name: automation_adx
    type: azdx
    properties:
      tenant_id: "00000000-0000-0000-0000-000000000000"
      client_id: "11111111-1111-1111-1111-111111111111"
      client_secret: "your-client-secret"
      endpoint: "https://cluster.region.kusto.windows.net"
      database: "logs"
      # Catch-all ensures automation-generated events always have a destination
      table: "automated_events"
      schema: "Timestamp:datetime,EventType:string,Data:string,Source:string"
      tables:
        - name: "monitoring_alerts"
          schema: "Timestamp:datetime,AlertLevel:string,Message:string"
        - name: "deployment_logs"
          schema: "Timestamp:datetime,Service:string,Version:string,Status:string"
```

In this configuration, automation tools looking for `monitoring_alerts` or `deployment_logs` will find their specific tables with the expected schemas. Any other automated events are captured in the `automated_events` catch-all table, ensuring no data is lost.