SQL Server
This guide walks you through extracting the required metadata and query logs from SQL Server for the Solid analysis process.
Overview
Solid requires access to:
- The schema of all relevant SQL Server databases
- Query logs and execution history
- [Optional] Read-only grants for data profiling
User and Permissions Setup
To run the metadata and query log extraction scripts, you'll need a SQL Server login with specific permissions. It is a security best practice to create a dedicated, low-privilege user for this purpose.
Required Permissions
The user will need the following permissions:
| Permission | Purpose |
|---|---|
| VIEW SERVER STATE | Required to query server-wide Dynamic Management Views (DMVs), which are used for extracting cached query plans |
| VIEW DATABASE STATE | Required to access the Query Store within each target database |
| VIEW DEFINITION | Required to read metadata from sys catalog views and INFORMATION_SCHEMA views |
| SELECT (Optional) | Required only if you need to collect data samples for profiling |
Setup Script
Use the following T-SQL script to create a new login and grant it the necessary permissions.
-- 1. Create a new SQL Server Login
-- This command should be run in the 'master' database
USE master;
CREATE LOGIN solid_user WITH PASSWORD = '<YourStrongPassword>';
-- 2. Grant Server-Level Permissions
-- This allows the user to see server-wide performance and state information
GRANT VIEW SERVER STATE TO solid_user;
-- 3. Create a User and Grant Database-Level Permissions
-- The following commands MUST be run on EACH database you wish to monitor
USE [<YourDatabaseName>]; -- Replace with your database name
-- Create a database user mapped to the server login
CREATE USER solid_user FOR LOGIN solid_user;
-- Grant permissions to view database metadata and query store
GRANT VIEW DATABASE STATE TO solid_user;
GRANT VIEW DEFINITION TO solid_user;
-- 4. [Optional] Grant Read-Only Data Access
-- Run this command on each target database if you need to collect data samples
-- GRANT SELECT TO solid_user;Important:
- Replace
<YourStrongPassword>with a strong password - Replace
<YourDatabaseName>with each database you want to monitor - Repeat the database-level commands (Step 3) for each database
Metadata Extraction
Gather information from system tables to understand the structure of your databases, including details about tables, columns, data types, schemas, and users.
Option 1: Using INFORMATION_SCHEMA Views (Recommended)
The INFORMATION_SCHEMA views provide a standardized, SQL-92 compliant way to retrieve metadata. This script automatically iterates through all user databases and collects schema information.
DECLARE @DatabaseName NVARCHAR(128);
DECLARE @SQL NVARCHAR(MAX);
-- Cursor to iterate over each database, excluding system databases
DECLARE db_cursor CURSOR FOR
SELECT name
FROM sys.databases
WHERE state_desc = 'ONLINE' AND database_id > 4;
OPEN db_cursor;
FETCH NEXT FROM db_cursor INTO @DatabaseName;
WHILE @@FETCH_STATUS = 0
BEGIN
-- Build dynamic SQL to query INFORMATION_SCHEMA in each database
SET @SQL = N'
USE [' + @DatabaseName + '];
SELECT
''' + @DatabaseName + ''' AS DATABASE_NAME,
s.SCHEMA_NAME AS TABLE_SCHEMA,
t.TABLE_NAME AS TABLE_NAME,
t.TABLE_TYPE AS TABLE_TYPE,
c.COLUMN_NAME AS COLUMN_NAME,
c.DATA_TYPE AS DATA_TYPE
FROM
INFORMATION_SCHEMA.SCHEMATA s
LEFT JOIN
INFORMATION_SCHEMA.TABLES t ON s.SCHEMA_NAME = t.TABLE_SCHEMA
LEFT JOIN
INFORMATION_SCHEMA.COLUMNS c ON t.TABLE_NAME = c.TABLE_NAME AND t.TABLE_SCHEMA = c.TABLE_SCHEMA;';
-- Execute the dynamic SQL
EXEC sp_executesql @SQL;
FETCH NEXT FROM db_cursor INTO @DatabaseName;
END;
CLOSE db_cursor;
DEALLOCATE db_cursor;Option 2: Using sys Catalog Views
The sys schema provides a more detailed, SQL Server-specific view of the metadata. This script collects detailed information about schemas, users, tables, columns, and types.
DECLARE @DatabaseName NVARCHAR(128);
-- Cursor to iterate over each database
DECLARE db_cursor CURSOR FOR
SELECT name FROM sys.databases WHERE state_desc = 'ONLINE';
OPEN db_cursor;
FETCH NEXT FROM db_cursor INTO @DatabaseName;
WHILE @@FETCH_STATUS = 0
BEGIN
-- Dynamic SQL to retrieve and join metadata from sys views
EXEC('USE ' + @DatabaseName + ';
SELECT
''Database'' = ''' + @DatabaseName + ''',
s.name AS schemas_name,
u.name AS sysusers_owner,
t.name AS tables_name,
t.[type] AS tables_type,
t.type_desc AS tables_type_desc,
t.create_date AS tables_create_date,
t.modify_date AS tables_modify_date,
c.name AS columns_name,
c.max_length AS columns_max_length,
c.is_nullable AS columns_is_nullable,
ty.name AS types_name,
ty.collation_name AS types_collation_name
FROM
sys.tables t
JOIN
sys.columns c ON t.object_id = c.object_id
JOIN
sys.types ty ON c.user_type_id = ty.user_type_id
JOIN
sys.schemas s ON t.schema_id = s.schema_id
JOIN
sys.sysusers u ON s.principal_id = u.uid
ORDER BY
t.name;');
FETCH NEXT FROM db_cursor INTO @DatabaseName;
END;
CLOSE db_cursor;
DEALLOCATE db_cursor;Query Log Extraction
Capturing query history is crucial for understanding data usage. SQL Server offers several methods for this.
Method 1: Query Store (Recommended)
If enabled on your databases, the Query Store is the most reliable method. It captures a detailed history of queries, execution plans, and performance statistics.
Prerequisites:
- Query Store must be enabled on each database you want to monitor
- Enable Query Store:
ALTER DATABASE [YourDatabase] SET QUERY_STORE = ON;
Extraction Script:
Run this script on each database where Query Store is enabled:
SELECT
q.query_id,
q.query_text_id,
qt.query_sql_text,
rs.avg_duration,
rs.execution_type_desc,
rs.execution_count,
rs.last_execution_time
FROM sys.query_store_query q
JOIN sys.query_store_query_text qt ON q.query_text_id = qt.query_text_id
JOIN sys.query_store_runtime_stats rs ON q.query_id = rs.query_id
ORDER BY rs.last_execution_time DESC;Benefits:
- Persistent query history (survives server restarts)
- Detailed execution statistics
- Performance metrics and query plans
Method 2: Dynamic Management Views (DMVs)
DMVs provide information about cached query plans and execution statistics from memory.
Note: This information is volatile and can be evicted from the cache if there is memory pressure.
Extraction Script:
Run this script at the instance level:
SELECT
t.[text],
s.*,
p.*
FROM sys.dm_exec_cached_plans AS p
INNER JOIN sys.dm_exec_query_stats AS s
ON p.plan_handle = s.plan_handle
CROSS APPLY sys.dm_exec_sql_text(p.plan_handle) AS t
ORDER BY s.last_execution_time DESC;Limitations:
- Query history is lost on server restart
- Cache can be cleared due to memory pressure
- Limited historical data
Method 3: Other Methods
SQL Server Trace / Extended Events:
- Can be configured to capture detailed query activity to a file
- Requires setup and management
- More complex but provides comprehensive logging
Third-Party Tools:
- ApexSQL Log
- Redgate SQL Log Rescue
- Can read transaction logs to reconstruct query history
Exporting Results
After running the extraction scripts, export the results for upload to Solid:
-
In SQL Server Management Studio (SSMS):
- Run the query
- Right-click on the results grid
- Select Save Results As
- Choose CSV format
-
Using sqlcmd:
sqlcmd -S ServerName -d DatabaseName -U solid_user -P Password -Q "SELECT * FROM ..." -o output.csv -s"," -w 999- Compress the files:
- Archive all CSV files (ZIP or GZIP)
- Upload to Solid Azure Storage container or as directed by your Solid administrator
What Data Does Solid Collect?
Metadata
Solid collects structural information about your databases:
- Database names and properties
- Schema names
- Table names and types (BASE TABLE, VIEW)
- Column names and data types
- Column properties (nullable, max length, collation)
- Creation and modification dates
- User and ownership information
Query History
Solid collects query execution information:
- Query text (SQL statements)
- Execution statistics (count, duration, last execution time)
- Query plans (when available)
- User information (who executed the query)
- Performance metrics (CPU time, I/O statistics)
Data Profiling (Optional)
If SELECT permissions are granted, Solid can sample data:
- Numeric columns: MIN, MAX, AVG values
- Text columns: Distinct values, row counts
- Date columns: MIN and MAX dates
- Null counts and data quality metrics
Troubleshooting
Permission Issues
If you encounter "permission denied" errors:
- Verify server-level permissions:
SELECT * FROM sys.server_permissions WHERE grantee_principal_id = USER_ID('solid_user');- Verify database-level permissions:
USE [YourDatabase];
SELECT * FROM sys.database_permissions WHERE grantee_principal_id = USER_ID('solid_user');- Ensure the user exists in each database:
SELECT name FROM sys.database_principals WHERE name = 'solid_user';Query Store Issues
If Query Store data is not available:
- Check if Query Store is enabled:
SELECT name, is_query_store_on FROM sys.databases;- Enable Query Store:
ALTER DATABASE [YourDatabase] SET QUERY_STORE = ON;- Configure Query Store settings:
ALTER DATABASE [YourDatabase]
SET QUERY_STORE (
OPERATION_MODE = READ_WRITE,
DATA_FLUSH_INTERVAL_SECONDS = 900,
INTERVAL_LENGTH_MINUTES = 60,
MAX_STORAGE_SIZE_MB = 1000
);Connection Issues
If you can't connect to SQL Server:
- Verify SQL Server authentication mode (must allow SQL Server authentication)
- Check firewall rules and network connectivity
- Confirm the login is enabled:
SELECT name, is_disabled FROM sys.sql_logins WHERE name = 'solid_user';Security Best Practices
- Use strong passwords for the solid_user login
- Grant minimum necessary permissions - avoid granting sysadmin or db_owner roles
- Rotate passwords regularly according to your organization's policies
- Use Windows Authentication when possible instead of SQL authentication
- Monitor login activity through SQL Server audit logs
- Disable the account when not in use for extended periods
- Use dedicated service accounts rather than personal accounts
- Restrict network access to SQL Server ports (default 1433)
- Enable encryption for SQL Server connections (SSL/TLS)
SQL Server Version Compatibility
This guide is compatible with:
- SQL Server 2016 and later (for Query Store support)
- SQL Server 2012 and later (for basic metadata extraction)
- Azure SQL Database
- Azure SQL Managed Instance
Note: Query Store is available in SQL Server 2016+ and all Azure SQL offerings.
Additional Resources
Updated about 2 months ago
