SQL Server

This guide walks you through extracting the required metadata and query logs from SQL Server for the Solid analysis process.

Overview

Solid requires access to:

  • The schema of all relevant SQL Server databases
  • Query logs and execution history
  • [Optional] Read-only grants for data profiling

User and Permissions Setup

To run the metadata and query log extraction scripts, you'll need a SQL Server login with specific permissions. It is a security best practice to create a dedicated, low-privilege user for this purpose.

Required Permissions

The user will need the following permissions:

PermissionPurpose
VIEW SERVER STATERequired to query server-wide Dynamic Management Views (DMVs), which are used for extracting cached query plans
VIEW DATABASE STATERequired to access the Query Store within each target database
VIEW DEFINITIONRequired to read metadata from sys catalog views and INFORMATION_SCHEMA views
SELECT (Optional)Required only if you need to collect data samples for profiling

Setup Script

Use the following T-SQL script to create a new login and grant it the necessary permissions.

-- 1. Create a new SQL Server Login
-- This command should be run in the 'master' database
USE master;
CREATE LOGIN solid_user WITH PASSWORD = '<YourStrongPassword>';

-- 2. Grant Server-Level Permissions
-- This allows the user to see server-wide performance and state information
GRANT VIEW SERVER STATE TO solid_user;

-- 3. Create a User and Grant Database-Level Permissions
-- The following commands MUST be run on EACH database you wish to monitor
USE [<YourDatabaseName>]; -- Replace with your database name

-- Create a database user mapped to the server login
CREATE USER solid_user FOR LOGIN solid_user;

-- Grant permissions to view database metadata and query store
GRANT VIEW DATABASE STATE TO solid_user;
GRANT VIEW DEFINITION TO solid_user;

-- 4. [Optional] Grant Read-Only Data Access
-- Run this command on each target database if you need to collect data samples
-- GRANT SELECT TO solid_user;

Important:

  • Replace <YourStrongPassword> with a strong password
  • Replace <YourDatabaseName> with each database you want to monitor
  • Repeat the database-level commands (Step 3) for each database

Metadata Extraction

Gather information from system tables to understand the structure of your databases, including details about tables, columns, data types, schemas, and users.

Option 1: Using INFORMATION_SCHEMA Views (Recommended)

The INFORMATION_SCHEMA views provide a standardized, SQL-92 compliant way to retrieve metadata. This script automatically iterates through all user databases and collects schema information.

DECLARE @DatabaseName NVARCHAR(128);
DECLARE @SQL NVARCHAR(MAX);

-- Cursor to iterate over each database, excluding system databases
DECLARE db_cursor CURSOR FOR
SELECT name 
FROM sys.databases 
WHERE state_desc = 'ONLINE' AND database_id > 4;

OPEN db_cursor;
FETCH NEXT FROM db_cursor INTO @DatabaseName;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- Build dynamic SQL to query INFORMATION_SCHEMA in each database
    SET @SQL = N'
    USE [' + @DatabaseName + '];
    SELECT 
        ''' + @DatabaseName + ''' AS DATABASE_NAME,
        s.SCHEMA_NAME AS TABLE_SCHEMA,
        t.TABLE_NAME AS TABLE_NAME,
        t.TABLE_TYPE AS TABLE_TYPE,
        c.COLUMN_NAME AS COLUMN_NAME,
        c.DATA_TYPE AS DATA_TYPE
    FROM 
        INFORMATION_SCHEMA.SCHEMATA s
    LEFT JOIN 
        INFORMATION_SCHEMA.TABLES t ON s.SCHEMA_NAME = t.TABLE_SCHEMA
    LEFT JOIN 
        INFORMATION_SCHEMA.COLUMNS c ON t.TABLE_NAME = c.TABLE_NAME AND t.TABLE_SCHEMA = c.TABLE_SCHEMA;';

    -- Execute the dynamic SQL
    EXEC sp_executesql @SQL;

    FETCH NEXT FROM db_cursor INTO @DatabaseName;
END;

CLOSE db_cursor;
DEALLOCATE db_cursor;

Option 2: Using sys Catalog Views

The sys schema provides a more detailed, SQL Server-specific view of the metadata. This script collects detailed information about schemas, users, tables, columns, and types.

DECLARE @DatabaseName NVARCHAR(128);

-- Cursor to iterate over each database
DECLARE db_cursor CURSOR FOR
SELECT name FROM sys.databases WHERE state_desc = 'ONLINE';

OPEN db_cursor;
FETCH NEXT FROM db_cursor INTO @DatabaseName;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- Dynamic SQL to retrieve and join metadata from sys views
    EXEC('USE ' + @DatabaseName + ';
    SELECT
        ''Database'' = ''' + @DatabaseName + ''',
        s.name AS schemas_name,
        u.name AS sysusers_owner,
        t.name AS tables_name,
        t.[type] AS tables_type,
        t.type_desc AS tables_type_desc,
        t.create_date AS tables_create_date,
        t.modify_date AS tables_modify_date,
        c.name AS columns_name,
        c.max_length AS columns_max_length,
        c.is_nullable AS columns_is_nullable,
        ty.name AS types_name,
        ty.collation_name AS types_collation_name
    FROM
        sys.tables t
    JOIN
        sys.columns c ON t.object_id = c.object_id
    JOIN
        sys.types ty ON c.user_type_id = ty.user_type_id
    JOIN
        sys.schemas s ON t.schema_id = s.schema_id
    JOIN
        sys.sysusers u ON s.principal_id = u.uid
    ORDER BY
        t.name;');

    FETCH NEXT FROM db_cursor INTO @DatabaseName;
END;

CLOSE db_cursor;
DEALLOCATE db_cursor;

Query Log Extraction

Capturing query history is crucial for understanding data usage. SQL Server offers several methods for this.

Method 1: Query Store (Recommended)

If enabled on your databases, the Query Store is the most reliable method. It captures a detailed history of queries, execution plans, and performance statistics.

Prerequisites:

  • Query Store must be enabled on each database you want to monitor
  • Enable Query Store: ALTER DATABASE [YourDatabase] SET QUERY_STORE = ON;

Extraction Script:

Run this script on each database where Query Store is enabled:

SELECT
    q.query_id,
    q.query_text_id,
    qt.query_sql_text,
    rs.avg_duration,
    rs.execution_type_desc,
    rs.execution_count,
    rs.last_execution_time
FROM sys.query_store_query q
JOIN sys.query_store_query_text qt ON q.query_text_id = qt.query_text_id
JOIN sys.query_store_runtime_stats rs ON q.query_id = rs.query_id
ORDER BY rs.last_execution_time DESC;

Benefits:

  • Persistent query history (survives server restarts)
  • Detailed execution statistics
  • Performance metrics and query plans

Method 2: Dynamic Management Views (DMVs)

DMVs provide information about cached query plans and execution statistics from memory.

Note: This information is volatile and can be evicted from the cache if there is memory pressure.

Extraction Script:

Run this script at the instance level:

SELECT 
    t.[text], 
    s.*, 
    p.*
FROM sys.dm_exec_cached_plans AS p
INNER JOIN sys.dm_exec_query_stats AS s
    ON p.plan_handle = s.plan_handle
CROSS APPLY sys.dm_exec_sql_text(p.plan_handle) AS t
ORDER BY s.last_execution_time DESC;

Limitations:

  • Query history is lost on server restart
  • Cache can be cleared due to memory pressure
  • Limited historical data

Method 3: Other Methods

SQL Server Trace / Extended Events:

  • Can be configured to capture detailed query activity to a file
  • Requires setup and management
  • More complex but provides comprehensive logging

Third-Party Tools:

  • ApexSQL Log
  • Redgate SQL Log Rescue
  • Can read transaction logs to reconstruct query history

Exporting Results

After running the extraction scripts, export the results for upload to Solid:

  1. In SQL Server Management Studio (SSMS):

    • Run the query
    • Right-click on the results grid
    • Select Save Results As
    • Choose CSV format
  2. Using sqlcmd:

   sqlcmd -S ServerName -d DatabaseName -U solid_user -P Password -Q "SELECT * FROM ..." -o output.csv -s"," -w 999
  1. Compress the files:
    • Archive all CSV files (ZIP or GZIP)
    • Upload to Solid Azure Storage container or as directed by your Solid administrator

What Data Does Solid Collect?

Metadata

Solid collects structural information about your databases:

  • Database names and properties
  • Schema names
  • Table names and types (BASE TABLE, VIEW)
  • Column names and data types
  • Column properties (nullable, max length, collation)
  • Creation and modification dates
  • User and ownership information

Query History

Solid collects query execution information:

  • Query text (SQL statements)
  • Execution statistics (count, duration, last execution time)
  • Query plans (when available)
  • User information (who executed the query)
  • Performance metrics (CPU time, I/O statistics)

Data Profiling (Optional)

If SELECT permissions are granted, Solid can sample data:

  • Numeric columns: MIN, MAX, AVG values
  • Text columns: Distinct values, row counts
  • Date columns: MIN and MAX dates
  • Null counts and data quality metrics

Troubleshooting

Permission Issues

If you encounter "permission denied" errors:

  1. Verify server-level permissions:
   SELECT * FROM sys.server_permissions WHERE grantee_principal_id = USER_ID('solid_user');
  1. Verify database-level permissions:
   USE [YourDatabase];
   SELECT * FROM sys.database_permissions WHERE grantee_principal_id = USER_ID('solid_user');
  1. Ensure the user exists in each database:
   SELECT name FROM sys.database_principals WHERE name = 'solid_user';

Query Store Issues

If Query Store data is not available:

  1. Check if Query Store is enabled:
   SELECT name, is_query_store_on FROM sys.databases;
  1. Enable Query Store:
   ALTER DATABASE [YourDatabase] SET QUERY_STORE = ON;
  1. Configure Query Store settings:
   ALTER DATABASE [YourDatabase] 
   SET QUERY_STORE (
       OPERATION_MODE = READ_WRITE,
       DATA_FLUSH_INTERVAL_SECONDS = 900,
       INTERVAL_LENGTH_MINUTES = 60,
       MAX_STORAGE_SIZE_MB = 1000
   );

Connection Issues

If you can't connect to SQL Server:

  1. Verify SQL Server authentication mode (must allow SQL Server authentication)
  2. Check firewall rules and network connectivity
  3. Confirm the login is enabled:
   SELECT name, is_disabled FROM sys.sql_logins WHERE name = 'solid_user';

Security Best Practices

  • Use strong passwords for the solid_user login
  • Grant minimum necessary permissions - avoid granting sysadmin or db_owner roles
  • Rotate passwords regularly according to your organization's policies
  • Use Windows Authentication when possible instead of SQL authentication
  • Monitor login activity through SQL Server audit logs
  • Disable the account when not in use for extended periods
  • Use dedicated service accounts rather than personal accounts
  • Restrict network access to SQL Server ports (default 1433)
  • Enable encryption for SQL Server connections (SSL/TLS)

SQL Server Version Compatibility

This guide is compatible with:

  • SQL Server 2016 and later (for Query Store support)
  • SQL Server 2012 and later (for basic metadata extraction)
  • Azure SQL Database
  • Azure SQL Managed Instance

Note: Query Store is available in SQL Server 2016+ and all Azure SQL offerings.


Additional Resources