Core Concepts

This page explains the key concepts and terminology used in Anomstack.

Metric Batch

A metric batch is the fundamental unit of configuration in Anomstack. It consists of:

  • A configuration file (config.yaml)
  • A SQL query file (query.sql) or Python ingest function
  • Optional preprocessing function
  • Optional custom configuration

Example structure:

metrics/
  my_metric_batch/
    config.yaml
    query.sql
    preprocess.py   (optional)

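If you use a Python ingest function instead of query.sql, it is a small function that returns your metric values. A minimal sketch, assuming a long-format pandas DataFrame with metric_timestamp, metric_name, and metric_value columns (the function name and column contract here are assumptions; check the Anomstack docs for the exact interface):

    import pandas as pd

    def ingest() -> pd.DataFrame:
        # Replace with a real pull from your source system.
        now = pd.Timestamp.now(tz="UTC")
        return pd.DataFrame(
            {
                "metric_timestamp": [now],
                "metric_name": ["my_metric"],
                "metric_value": [42.0],
            }
        )
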
Jobs

Anomstack runs several types of jobs for each metric batch:

Ingest Job

  • Pulls data from your data source
  • Executes your SQL query or Python function
  • Stores raw data for processing

Train Job

  • Processes historical data
  • Trains anomaly detection models
  • Saves trained models to storage
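
A minimal sketch of the fit-and-save pattern using a PyOD detector (Anomstack's models build on PyOD; the detector choice, stand-in data, and file path below are illustrative, not Anomstack's internals):

    import numpy as np
    from joblib import dump
    from pyod.models.knn import KNN

    history = np.random.normal(100, 5, size=500).reshape(-1, 1)  # stand-in history
    model = KNN()                     # any PyOD detector fits here
    model.fit(history)
    dump(model, "my_metric.joblib")   # persisted to the configured model store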

Score Job

  • Applies trained models to new data
  • Calculates anomaly scores
  • Identifies potential anomalies
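
Continuing the sketch above, the score step loads the saved detector and scores the latest observations (the method names are PyOD's; everything else is illustrative):

    import numpy as np
    from joblib import load

    model = load("my_metric.joblib")
    new_values = np.array([[101.2], [137.9]])

    raw_scores = model.decision_function(new_values)  # raw outlier scores
    probs = model.predict_proba(new_values)[:, 1]     # rescaled to [0, 1]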

Alert Job

  • Evaluates anomaly scores
  • Sends notifications via configured channels
  • Handles alert throttling and snoozing
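
The throttling and snoozing logic boils down to a decision like the sketch below (the function name, threshold, and snooze window are hypothetical, not Anomstack's actual settings):

    from datetime import datetime, timedelta, timezone

    def should_alert(score, last_alert=None, threshold=0.8,
                     snooze=timedelta(hours=1)):
        if score < threshold:
            return False                 # not anomalous enough
        if last_alert and datetime.now(timezone.utc) - last_alert < snooze:
            return False                 # throttled: alerted too recently
        return True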

Change Detection Job

  • Monitors for significant changes in metrics
  • Detects level shifts and trends
  • Triggers alerts for important changes
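
One simple way to picture level-shift detection: compare the means of two adjacent windows and flag a change when they differ by several standard deviations. This sketch is illustrative, not Anomstack's actual algorithm:

    import numpy as np

    def level_shift(values, window=30, cutoff=3.0):
        # values: 1-D numpy array of recent metric values
        before = values[-2 * window:-window]
        after = values[-window:]
        spread = np.std(values[-2 * window:]) or 1e-9  # guard against flat series
        return abs(after.mean() - before.mean()) / spread > cutoff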

Plot Job

  • Generates visualizations of metrics
  • Creates anomaly score plots
  • Produces plots for alerts and the dashboard
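
A sketch of the typical two-panel layout: raw values on top, anomaly scores below, so score spikes line up with the data that caused them (the data and scoring here are stand-ins):

    import matplotlib.pyplot as plt
    import numpy as np

    values = np.random.normal(100, 5, 200)
    scores = np.abs(values - values.mean()) / values.std()  # stand-in scores

    fig, (ax_metric, ax_score) = plt.subplots(2, 1, sharex=True)
    ax_metric.plot(values)
    ax_metric.set_ylabel("metric_value")
    ax_score.plot(scores, color="red")
    ax_score.set_ylabel("anomaly score")
    fig.savefig("my_metric.png")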

Alerts

Alerts are notifications sent when anomalies are detected. They can be configured to:

  • Send via email or Slack
  • Include visualizations
  • Use custom templates
  • Support different severity levels
  • Include LLM-powered analysis

Dashboard

The dashboard provides:

  • Real-time metric visualization
  • Anomaly score monitoring
  • Alert history and management
  • Metric configuration interface
  • Performance analytics

Storage

Anomstack uses storage for:

  • Trained models
  • Configuration files
  • Alert history
  • Performance metrics
  • Dashboard data

Supported storage backends:

  • Local filesystem
  • Google Cloud Storage (GCS)
  • Amazon S3
  • Azure Blob Storage (coming soon)
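
A URL-based storage layer is what makes swapping backends cheap: the same write targets local disk, GCS, or S3 just by changing the prefix. fsspec is used here for illustration; Anomstack's storage code may differ:

    import fsspec

    model_bytes = b"...serialized model..."      # stand-in payload
    url = "file:///tmp/my_metric.joblib"         # or gs://... (gcsfs) / s3://... (s3fs)

    with fsspec.open(url, "wb") as f:            # same call for every backend
        f.write(model_bytes)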

Data Sources

Anomstack supports various data sources:

  • Python (direct integration)
  • BigQuery
  • Snowflake
  • ClickHouse
  • DuckDB
  • SQLite
  • MotherDuck
  • Turso
  • Redshift (coming soon)
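
Whatever the source, a metric batch's query boils down to SQL that returns timestamp/name/value rows. A sketch against DuckDB (the table and metric names are illustrative):

    import duckdb

    df = duckdb.sql("""
        SELECT
            CURRENT_TIMESTAMP AS metric_timestamp,
            'orders_count'    AS metric_name,
            COUNT(*) * 1.0    AS metric_value
        FROM range(100)       -- stand-in for your orders table
    """).df()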

Configuration

Configuration is handled through:

  • YAML files for metric batches
  • Environment variables
  • Command-line arguments
  • Dashboard settings
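
The layering works roughly like the sketch below: batch-level YAML provides defaults, and environment variables override them (the key names are hypothetical, not Anomstack's actual settings):

    import os
    import yaml

    defaults = yaml.safe_load("""
    metric_batch: my_metric_batch
    alert_threshold: 0.8
    """)

    threshold = float(os.environ.get("ALERT_THRESHOLD",
                                     defaults["alert_threshold"]))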

Scheduling

Jobs can be scheduled using:

  • Cron expressions
  • Dagster schedules
  • Manual triggers
  • Event-based triggers
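
Under the hood these map onto Dagster's scheduling primitives. A minimal sketch of a cron-driven schedule (the op, job, and cron string are illustrative):

    from dagster import ScheduleDefinition, job, op

    @op
    def ingest_op():
        ...  # pull and store data

    @job
    def my_ingest_job():
        ingest_op()

    ingest_schedule = ScheduleDefinition(job=my_ingest_job,
                                         cron_schedule="*/10 * * * *")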

LLM Agent

The LLM agent provides:

  • AI-powered anomaly analysis
  • Natural language explanations
  • Automated reporting
  • Intelligent alert prioritization
  • Historical context analysis
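
The core idea is simple to sketch: hand the model the recent values and the anomaly score and ask for a plain-language explanation. The OpenAI client below is only for illustration; Anomstack's agent wiring and model choice may differ:

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model name, swap in your own
        messages=[{
            "role": "user",
            "content": ("Metric my_metric scored 0.93 (threshold 0.8). "
                        "Recent values: 101, 99, 104, 180. "
                        "Explain the likely anomaly in one paragraph."),
        }],
    )
    print(resp.choices[0].message.content)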