Skip to main content

Scripts & Utilities

Anomstack includes a collection of utility scripts for common administrative tasks, database management, system maintenance, and development workflows. These scripts are located in the scripts/ directory.

๐Ÿ“ Directory Structureโ€‹

scripts/
โ”œโ”€โ”€ README.md # Overview of all scripts
โ”œโ”€โ”€ kill_long_running_tasks.py # Terminate stuck Dagster jobs
โ”œโ”€โ”€ reload_config.py # Hot-reload Dagster configuration
โ”œโ”€โ”€ posthog_example.py # Test PostHog integration
โ”œโ”€โ”€ sqlite/ # SQLite database utilities
โ”‚ โ”œโ”€โ”€ create_index.py # Create performance indexes
โ”‚ โ”œโ”€โ”€ list_tables.py # List all database tables
โ”‚ โ”œโ”€โ”€ list_indexes.py # List all database indexes
โ”‚ โ”œโ”€โ”€ qry.py # Execute custom SQL queries
โ”‚ โ””โ”€โ”€ qry.sql # Example/template SQL query
โ””โ”€โ”€ utils/ # System utilities
โ”œโ”€โ”€ reset_docker.sh # Docker environment reset tool
โ”œโ”€โ”€ cleanup_dagster_storage.sh # Clean up Dagster storage
โ””โ”€โ”€ README.md # Utilities documentation

๐Ÿ”ง Administrative Utilitiesโ€‹

Configuration Hot-Reloadโ€‹

Reload Dagster configuration without restarting containers.

Script: reload_config.py

cd scripts/
python reload_config.py

Use Cases:

  • Applied configuration changes to YAML files
  • Updated environment variables
  • Added new metric batches
  • Modified job schedules

Features:

  • Works with running Docker containers
  • Reloads code locations via GraphQL API
  • Provides feedback on reload status
  • Supports custom Dagster host/port configuration

Long-Running Task Managementโ€‹

Terminate or mark as failed any Dagster runs that exceed configured timeouts.

Script: kill_long_running_tasks.py

cd scripts/
python kill_long_running_tasks.py

Use Cases:

  • Clean up stuck or hanging jobs
  • Recover from unresponsive user code servers
  • Free up system resources
  • Maintenance and cleanup

Features:

  • Uses same timeout as the timeout sensor (ANOMSTACK_KILL_RUN_AFTER_MINUTES)
  • Graceful handling of unreachable code servers
  • Detailed logging of termination actions
  • Automatically marks stuck runs as failed when termination isn't possible

๐Ÿ—„๏ธ Database Utilities (SQLite)โ€‹

For users running Anomstack with SQLite as their database backend.

Performance Optimizationโ€‹

Create indexes on common metric columns for better query performance.

Script: sqlite/create_index.py

cd scripts/sqlite/
python create_index.py

Creates indexes on:

  • metric_timestamp - Time-based queries
  • metric_batch - Batch-based filtering
  • metric_type - Type-based filtering (metric, score, alert)
  • Combined indexes for common query patterns

Database Inspectionโ€‹

Explore your SQLite database structure and contents.

List all tables:

cd scripts/sqlite/
python list_tables.py

List all indexes:

cd scripts/sqlite/
python list_indexes.py

Custom SQL Queriesโ€‹

Execute ad-hoc SQL queries against your SQLite database.

Script: sqlite/qry.py

cd scripts/sqlite/
# Edit qry.sql with your custom query first
python qry.py

Example queries:

-- Recent anomalies across all metric batches
SELECT
metric_timestamp,
metric_batch,
metric_name,
metric_value as anomaly_score
FROM metrics
WHERE metric_type = 'alert'
AND metric_value = 1
AND metric_timestamp >= datetime('now', '-7 days')
ORDER BY metric_timestamp DESC;

-- Top metrics by anomaly count
SELECT
metric_batch,
metric_name,
COUNT(*) as anomaly_count
FROM metrics
WHERE metric_type = 'alert'
AND metric_value = 1
GROUP BY metric_batch, metric_name
ORDER BY anomaly_count DESC
LIMIT 10;

๐Ÿณ System Utilitiesโ€‹

Docker Environment Resetโ€‹

Comprehensive Docker reset utility with multiple cleanup levels.

Script: utils/reset_docker.sh

make reset-interactive

Provides a guided menu with options:

  1. ๐Ÿ”„ Gentle - Rebuild containers (safest)
  2. ๐Ÿงน Medium - Remove containers, keep data
  3. โ˜ข๏ธ Nuclear - Remove all data and containers
  4. ๐Ÿ’ฅ Full Nuclear - Nuclear + full Docker cleanup

Direct Reset Levelsโ€‹

Gentle Reset (Safest - keeps all data):

make reset-gentle
# or
./scripts/utils/reset_docker.sh gentle

Medium Reset (Removes containers, preserves data):

make reset-medium
# or
./scripts/utils/reset_docker.sh medium

Nuclear Reset (Removes all data):

make reset-nuclear
# or
./scripts/utils/reset_docker.sh nuclear

Full Nuclear Reset (Maximum cleanup):

make reset-full-nuclear
# or
./scripts/utils/reset_docker.sh full-nuclear

Featuresโ€‹

  • ๐Ÿ“Š Shows current disk usage before cleanup
  • โš ๏ธ Multiple safety confirmations for destructive operations
  • ๐ŸŽจ Colorful output with clear visual feedback
  • ๐Ÿ›ก๏ธ Graceful fallback handling
  • ๐Ÿ’พ Always preserves source code and configuration

Dagster Storage Cleanupโ€‹

Clean up old Dagster storage files and logs.

Script: utils/cleanup_dagster_storage.sh

./scripts/utils/cleanup_dagster_storage.sh

Cleans up:

  • Old compute logs
  • Artifact storage files
  • Event log storage
  • Run storage artifacts

๐Ÿงช Development Utilitiesโ€‹

API Integration Testingโ€‹

Test external service integrations and credentials.

PostHog Integration Test:

cd scripts/
python posthog_example.py

Validates that your PostHog API credentials work correctly.

๐Ÿ“‹ Common Usage Patternsโ€‹

Daily Maintenance Workflowโ€‹

# 1. Clean up any stuck jobs
cd scripts/
python kill_long_running_tasks.py

# 2. Optimize database performance (if using SQLite)
cd sqlite/
python create_index.py

# 3. Check system status
cd ../
python list_tables.py # Verify data is being written

Configuration Update Workflowโ€‹

# 1. Make changes to your YAML configs or .env file
vim metrics/my_metric/config.yaml

# 2. Hot-reload the configuration
cd scripts/
python reload_config.py

# 3. Verify changes took effect in Dagster UI

Database Analysis Workflowโ€‹

# 1. Inspect database structure
cd scripts/sqlite/
python list_tables.py
python list_indexes.py

# 2. Run custom analysis queries
vim qry.sql # Write your analysis query
python qry.py

# 3. Create performance indexes if needed
python create_index.py

System Reset Workflowโ€‹

# 1. Interactive guided reset (recommended)
make reset-interactive

# 2. Or direct reset for automation
make reset-gentle # Safe option
make reset-nuclear # Clean slate option

๐Ÿ› ๏ธ Advanced Usageโ€‹

Custom Environment Configurationโ€‹

Most scripts respect environment variables:

# Custom Dagster endpoint for reload_config.py
export DAGSTER_HOST=dagster.example.com
export DAGSTER_PORT=3000
python reload_config.py

# Custom timeout for kill_long_running_tasks.py
export ANOMSTACK_KILL_RUN_AFTER_MINUTES=120
python kill_long_running_tasks.py

# Custom database path for SQLite scripts
export ANOMSTACK_SQLITE_PATH=/path/to/custom.db
cd sqlite/
python list_tables.py

Automation and Schedulingโ€‹

Scripts can be integrated into automation workflows:

#!/bin/bash
# Daily maintenance script

# Clean up stuck jobs
python /path/to/anomstack/scripts/kill_long_running_tasks.py

# Optimize database
python /path/to/anomstack/scripts/sqlite/create_index.py

# Generate daily report
python /path/to/anomstack/scripts/sqlite/qry.py > daily_report.txt

Docker Integrationโ€‹

Some scripts work seamlessly with Docker:

# Execute scripts inside running containers
docker exec anomstack_code python /opt/dagster/app/scripts/reload_config.py
docker exec anomstack_code python /opt/dagster/app/scripts/kill_long_running_tasks.py

๐Ÿ” Security Considerationsโ€‹

  1. File Permissions: Ensure scripts have appropriate execute permissions
  2. Database Access: SQLite scripts require read/write access to database files
  3. Network Access: Config reload scripts need access to Dagster GraphQL endpoint
  4. Secrets: Never hardcode credentials in custom scripts

๐Ÿ’ก Best Practicesโ€‹

  1. Test First: Always test scripts in development before production use
  2. Backup Data: Create backups before running destructive operations
  3. Monitor Logs: Check script output and logs for errors
  4. Document Changes: Keep track of when and why you run scripts
  5. Version Control: Track any modifications to scripts in version control

๐Ÿ†˜ Troubleshootingโ€‹

Common Issuesโ€‹

Script can't find database:

# Set the correct database path
export ANOMSTACK_SQLITE_PATH=/path/to/your/database.db

Config reload fails:

# Check if Dagster is running and accessible
curl http://localhost:3000/server_info

Permission denied errors:

# Make scripts executable
chmod +x scripts/utils/reset_docker.sh

SQLite locked errors:

# Stop Anomstack before running database scripts
make docker-stop
cd scripts/sqlite/
python your_script.py
make docker

Getting Helpโ€‹

  • Check individual script help: python script_name.py --help
  • Review README files in each directory
  • Check Dagster UI for job status and logs
  • Use interactive modes when available (make reset-interactive)