TUTORIALS 13 min read

Create an AI-Powered Analytics Dashboard: Real-Time Insights Without SQL

Build a dashboard that lets anyone query data in plain English and get instant visualizations. No SQL knowledge required. Full stack tutorial.

By EgoistAI ·
Create an AI-Powered Analytics Dashboard: Real-Time Insights Without SQL

The biggest lie in business intelligence is that “anyone can use our dashboard.” In reality, getting meaningful insights from data still requires SQL skills, understanding of data schemas, and the patience to click through 15 filters to find what you need.

AI changes this completely. Natural language to SQL is now reliable enough for production use. This tutorial builds a dashboard where anyone — marketing managers, executives, customer support leads — types a question in plain English and gets instant, accurate charts and tables.

No SQL. No training. No “please submit a ticket to the data team and wait three business days.”

What We’re Building

Chapter 1: What We're Building

A web-based analytics dashboard with:

  1. Natural language query interface (“What were our top 10 products by revenue last quarter?”)
  2. AI-generated SQL that executes against your database
  3. Automatic chart selection and rendering
  4. Query history and saved reports
  5. Data export (CSV, PNG)
  6. Conversation mode (follow-up questions with context)

Tech Stack

  • Streamlit for the web interface
  • Claude API for natural language to SQL
  • DuckDB for fast analytical queries (works with CSV, Parquet, and databases)
  • Plotly for interactive charts
  • Python 3.11+

Step 1: Project Setup

Chapter 2: Setup

mkdir ai-dashboard && cd ai-dashboard
python -m venv venv
source venv/bin/activate
pip install streamlit anthropic duckdb plotly pandas python-dotenv

Create .env:

ANTHROPIC_API_KEY=sk-ant-your-key

Step 2: Database Setup With DuckDB

Chapter 3: Database

DuckDB is perfect for this project — it’s an embedded analytical database that reads CSV and Parquet files directly, requires no server, and is blazing fast for analytical queries.

Create database.py:

import duckdb
import pandas as pd

def get_connection():
    return duckdb.connect("analytics.db")

def init_sample_data():
    """Create sample data for testing."""
    conn = get_connection()

    conn.execute("""
    CREATE TABLE IF NOT EXISTS orders (
        order_id INTEGER,
        customer_id INTEGER,
        product_name VARCHAR,
        category VARCHAR,
        quantity INTEGER,
        unit_price DECIMAL(10,2),
        total_amount DECIMAL(10,2),
        order_date DATE,
        region VARCHAR,
        status VARCHAR
    )
    """)

    conn.execute("""
    CREATE TABLE IF NOT EXISTS customers (
        customer_id INTEGER,
        name VARCHAR,
        email VARCHAR,
        signup_date DATE,
        plan VARCHAR,
        lifetime_value DECIMAL(10,2)
    )
    """)

    # Insert sample data if empty
    count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    if count == 0:
        # Generate 10,000 sample orders
        conn.execute("""
        INSERT INTO orders
        SELECT
            row_number() OVER () as order_id,
            (random() * 500 + 1)::INTEGER as customer_id,
            (['Widget A','Widget B','Gadget X','Gadget Y','Tool Alpha',
              'Tool Beta','Premium Pack','Starter Kit','Pro Bundle',
              'Enterprise Suite'])[((random()*10)::INTEGER % 10) + 1] as product_name,
            (['Electronics','Software','Hardware','Services','Accessories']
             )[((random()*5)::INTEGER % 5) + 1] as category,
            (random() * 10 + 1)::INTEGER as quantity,
            (random() * 200 + 10)::DECIMAL(10,2) as unit_price,
            0 as total_amount,
            DATE '2025-01-01' + INTERVAL ((random()*450)::INTEGER) DAY as order_date,
            (['North','South','East','West']
             )[((random()*4)::INTEGER % 4) + 1] as region,
            (['completed','pending','cancelled','refunded']
             )[((random()*4)::INTEGER % 4) + 1] as status
        FROM generate_series(1, 10000)
        """)
        conn.execute("UPDATE orders SET total_amount = quantity * unit_price")

    conn.close()

def get_schema():
    """Get database schema for AI context."""
    conn = get_connection()
    tables = conn.execute(
        "SELECT table_name FROM information_schema.tables WHERE table_schema='main'"
    ).fetchall()

    schema = []
    for (table,) in tables:
        cols = conn.execute(f"""
            SELECT column_name, data_type
            FROM information_schema.columns
            WHERE table_name = '{table}'
        """).fetchall()
        col_str = ", ".join([f"{name} ({dtype})" for name, dtype in cols])
        schema.append(f"Table: {table}\nColumns: {col_str}")

    conn.close()
    return "\n\n".join(schema)

def execute_query(sql: str) -> pd.DataFrame:
    """Execute SQL and return DataFrame."""
    conn = get_connection()
    try:
        result = conn.execute(sql).fetchdf()
        return result
    finally:
        conn.close()

Step 3: AI Query Engine

Chapter 4: Query Engine

Create query_engine.py:

import json
from anthropic import Anthropic
from database import get_schema, execute_query

client = Anthropic()

def natural_language_to_sql(question: str, history: list = None) -> dict:
    """Convert natural language question to SQL and chart config."""

    schema = get_schema()

    system_prompt = f"""You are a SQL query generator for DuckDB.
Given a natural language question, generate:
1. A valid DuckDB SQL query
2. A chart recommendation

Database schema:
{schema}

Respond ONLY with valid JSON:
{{
    "sql": "SELECT ...",
    "chart_type": "bar|line|pie|scatter|table",
    "title": "Chart title",
    "x_axis": "column_name_for_x",
    "y_axis": "column_name_for_y",
    "explanation": "Brief explanation of what the query does"
}}

Rules:
- Use DuckDB SQL syntax
- Always include ORDER BY for sorted results
- LIMIT to 100 rows maximum
- Use aliases for readability
- For time series, use line charts
- For comparisons, use bar charts
- For proportions, use pie charts
- For correlations, use scatter plots
- If the question is vague, use table format"""

    messages = []
    if history:
        messages.extend(history)
    messages.append({"role": "user", "content": question})

    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        system=system_prompt,
        messages=messages
    )

    # Parse JSON response
    text = response.content[0].text
    # Handle markdown code blocks
    if "```json" in text:
        text = text.split("```json")[1].split("```")[0]
    elif "```" in text:
        text = text.split("```")[1].split("```")[0]

    return json.loads(text.strip())

Step 4: Build the Streamlit Dashboard

Chapter 5: Dashboard UI

Create app.py:

import streamlit as st
import plotly.express as px
import plotly.graph_objects as go
from database import init_sample_data, execute_query
from query_engine import natural_language_to_sql
import pandas as pd

st.set_page_config(page_title="AI Analytics Dashboard", layout="wide")

# Initialize database
init_sample_data()

# Session state
if "history" not in st.session_state:
    st.session_state.history = []
if "queries" not in st.session_state:
    st.session_state.queries = []

st.title("AI Analytics Dashboard")
st.markdown("Ask questions about your data in plain English.")

# Query input
question = st.text_input(
    "What do you want to know?",
    placeholder="e.g., What are the top 5 products by revenue?"
)

if question:
    with st.spinner("Analyzing your question..."):
        try:
            result = natural_language_to_sql(
                question, st.session_state.history
            )

            # Show the generated SQL
            with st.expander("View SQL Query"):
                st.code(result["sql"], language="sql")
                st.caption(result["explanation"])

            # Execute query
            df = execute_query(result["sql"])

            if df.empty:
                st.warning("No results found.")
            else:
                # Render chart
                chart_type = result.get("chart_type", "table")
                title = result.get("title", "Results")
                x_col = result.get("x_axis")
                y_col = result.get("y_axis")

                if chart_type == "bar" and x_col and y_col:
                    fig = px.bar(df, x=x_col, y=y_col, title=title)
                    st.plotly_chart(fig, use_container_width=True)
                elif chart_type == "line" and x_col and y_col:
                    fig = px.line(df, x=x_col, y=y_col, title=title)
                    st.plotly_chart(fig, use_container_width=True)
                elif chart_type == "pie" and x_col and y_col:
                    fig = px.pie(df, names=x_col, values=y_col, title=title)
                    st.plotly_chart(fig, use_container_width=True)
                elif chart_type == "scatter" and x_col and y_col:
                    fig = px.scatter(df, x=x_col, y=y_col, title=title)
                    st.plotly_chart(fig, use_container_width=True)

                # Always show data table
                st.dataframe(df, use_container_width=True)

                # Export options
                col1, col2 = st.columns(2)
                with col1:
                    csv = df.to_csv(index=False)
                    st.download_button("Download CSV", csv,
                                      "results.csv", "text/csv")

            # Update conversation history
            st.session_state.history.append(
                {"role": "user", "content": question}
            )
            st.session_state.history.append(
                {"role": "assistant", "content": str(result)}
            )
            st.session_state.queries.append({
                "question": question, "result": result
            })

        except Exception as e:
            st.error(f"Error: {str(e)}")

# Sidebar with query history
with st.sidebar:
    st.header("Query History")
    for i, q in enumerate(reversed(st.session_state.queries)):
        with st.expander(q["question"][:50] + "..."):
            st.code(q["result"]["sql"], language="sql")

Step 5: Run and Test

Chapter 6: Testing

streamlit run app.py

Try these queries:

  • “What are the top 10 products by total revenue?”
  • “Show me monthly revenue trends for 2025”
  • “What percentage of orders are in each status?”
  • “Compare revenue by region”
  • “What’s the average order value by category?”
  • “Show me customers who placed more than 5 orders”

The AI generates appropriate SQL, selects the right chart type, and renders interactive visualizations — all from plain English.

Step 6: Security Considerations

Chapter 7: Security

Before deploying to production, address these security concerns:

SQL Injection Prevention

Never execute AI-generated SQL directly on production databases. Use a read-only database connection:

def get_readonly_connection():
    conn = duckdb.connect("analytics.db", read_only=True)
    return conn

Query Validation

Add a validation layer between AI-generated SQL and execution:

FORBIDDEN_KEYWORDS = ["DROP", "DELETE", "INSERT", "UPDATE", "ALTER", "CREATE", "TRUNCATE"]

def validate_sql(sql: str) -> bool:
    upper_sql = sql.upper().strip()
    for keyword in FORBIDDEN_KEYWORDS:
        if keyword in upper_sql:
            return False
    if not upper_sql.startswith("SELECT"):
        return False
    return True

Row Limits

Always enforce maximum row limits to prevent resource exhaustion.

Step 7: Deployment

Chapter 8: Deployment

Create Dockerfile:

FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
EXPOSE 8501
CMD ["streamlit", "run", "app.py", "--server.port=8501", "--server.address=0.0.0.0"]

Deploy to Railway, Render, or any container platform. For internal tools, Streamlit Cloud offers free hosting with GitHub integration.

Extending the Dashboard

Chapter 9: Extensions

Connect Real Data Sources

Replace DuckDB sample data with your actual data:

  • PostgreSQL: Use duckdb.connect().execute("ATTACH 'dbname=mydb ...' AS pg (TYPE POSTGRES)")
  • CSV/Parquet Files: DuckDB reads them directly: SELECT * FROM 'data.csv'
  • APIs: Fetch data, load into DuckDB, query with AI

Add Scheduled Reports

Use Streamlit’s caching and a scheduler to generate daily reports automatically.

Multi-User Support

Add authentication with Streamlit’s built-in auth or use a reverse proxy with SSO.

The Bottom Line

An AI-powered analytics dashboard democratizes data access across your organization. Instead of bottlenecking on the data team for every question, anyone can explore data in plain English.

The build takes about 3-4 hours. The ongoing cost is minimal — Claude API usage for generating SQL queries costs cents per day for typical usage. And the productivity gain is immediate.

Your data team will thank you for reducing “Can you pull this report?” requests by 80%.

Share this article

> Want more like this?

Get the best AI insights delivered weekly.

> Related Articles

Tags

AI dashboarddata visualizationnatural language SQLPythonStreamlittutorial

> Stay in the loop

Weekly AI tools & insights.