A lot of stuff to flush out

After 10 years at my prior company, I start work at Microsoft tomorrow as a Premier Field Engineer.  Before I do, I wanted to load the various scripts and projects I've built onto this site.  However, since I left my thumb drive at home (I'm in a hotel in Charlotte right now), and since my employee agreement says that I retain the rights to any script or project that I post, in any phase of completion, before I start tomorrow, I'm going to do a whirlwind tour of my stuff.  Some of these are building blocks of others; some stand alone.

Indexing and statistics:

Script to filter indices by a variety of parameters, with options to return separate or combined data about missing indices, operational stats, usage stats, compression estimates, rebuild/reorganize/drop/create/update-statistics statements, fragmentation, and cache usage, along with various metadata (e.g., size, partition number, compression settings) and categorical columns to simplify pivot tables and other analytics.  Index definitions, with filter conditions and included columns, are also available.
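To give a flavor, here's a minimal sketch of just the usage-stats slice; the real script wraps this in parameters, adds the other data sets, and trims or extends the column list as requested:

```sql
-- One row per index in the current database, with reads vs. writes
-- from sys.dm_db_index_usage_stats (NULLs mean no recorded use).
SELECT  OBJECT_SCHEMA_NAME(i.object_id)  AS schema_name,
        OBJECT_NAME(i.object_id)         AS table_name,
        i.name                           AS index_name,
        i.type_desc,
        us.user_seeks, us.user_scans, us.user_lookups, us.user_updates,
        us.last_user_seek, us.last_user_update
FROM    sys.indexes AS i
LEFT JOIN sys.dm_db_index_usage_stats AS us
       ON  us.object_id   = i.object_id
       AND us.index_id    = i.index_id
       AND us.database_id = DB_ID()
WHERE   OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
ORDER BY us.user_updates DESC;
```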

Scripts and a data model to persist that index data, plus additional data collected through other processes.

SQL Agent job to collect index usage and operational stats, fragmentation data, size/growth data, missing index data, and DDL history/change data on a regular basis.

Database triggers to store DDL activity.  Though used for other purposes too, within the context of indices and statistics these let me pull index/statistics creation, modification, and deletion dates, as well as maintain a history of scheduled and ad hoc rebuilds, reorganizes, and statistics updates.
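A stripped-down sketch of the trigger; the table and trigger names here are illustrative, not the ones I actually use:

```sql
-- Log table plus a database-scoped DDL trigger that captures
-- index/statistics DDL from EVENTDATA().
CREATE TABLE dbo.DDLEventLog
(
    EventLogId  int IDENTITY(1,1) PRIMARY KEY,
    EventTime   datetime2(0)  NOT NULL DEFAULT SYSDATETIME(),
    EventType   nvarchar(128) NOT NULL,
    ObjectName  nvarchar(256) NULL,
    LoginName   nvarchar(128) NULL,
    CommandText nvarchar(max) NULL
);
GO
CREATE TRIGGER trg_LogIndexDDL
ON DATABASE
FOR CREATE_INDEX, ALTER_INDEX, DROP_INDEX,
    CREATE_STATISTICS, UPDATE_STATISTICS, DROP_STATISTICS
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @e xml = EVENTDATA();
    INSERT dbo.DDLEventLog (EventType, ObjectName, LoginName, CommandText)
    VALUES (@e.value('(/EVENT_INSTANCE/EventType)[1]',  'nvarchar(128)'),
            @e.value('(/EVENT_INSTANCE/ObjectName)[1]', 'nvarchar(256)'),
            @e.value('(/EVENT_INSTANCE/LoginName)[1]',  'nvarchar(128)'),
            @e.value('(/EVENT_INSTANCE/TSQLCommand/CommandText)[1]', 'nvarchar(max)'));
END;
```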

Script to manage operational tasks (namely rebuilds, reorganizes, and statistics updates) based on the data collected above.  The logic allows for different actions on different days of the week (e.g., large indices on weekends; offline indices on Sundays only).  Fragmentation and update percentages are captured, and then size, historical usage, and fragmentation data are considered to determine whether to rebuild, reorganize, or skip an index.  For example, large indices that are only seeked, never scanned, are eligible only for reorganizes, to minimize log growth, while indices without any usage data in a defined period of time are skipped completely.  Historical fragmentation rates are reviewed to either suggest or implement changes to fill factor to minimize page splits.
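The decision step, reduced to a sketch: the common 5%/30% thresholds below stand in for the script's parameterized, history-aware rules.

```sql
-- Suggest an action per index from current fragmentation; the full script
-- also weighs size, usage history, and day-of-week rules before acting.
SELECT  OBJECT_NAME(ps.object_id) AS table_name,
        i.name                    AS index_name,
        ps.avg_fragmentation_in_percent,
        ps.page_count,
        CASE WHEN ps.avg_fragmentation_in_percent >= 30 THEN 'REBUILD'
             WHEN ps.avg_fragmentation_in_percent >= 5  THEN 'REORGANIZE'
             ELSE 'SKIP'
        END AS suggested_action
FROM    sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ps
JOIN    sys.indexes AS i
      ON i.object_id = ps.object_id AND i.index_id = ps.index_id
WHERE   ps.page_count >= 1000;  -- ignore tiny indices where fragmentation is noise
```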

Monitoring and reporting processes for the operational tasks:

  • to shut down operational tasks when the log is filling, and to check log size before restarting (see the sketch after this list);
  • to provide operational-level reports on rebuild and update activities (e.g., durations; the number of times tasks were stopped because of log growth; errors generated, such as being chosen as a deadlock victim; indices not completed based on parameters; indices excluded from consideration, with reason codes; and others);
  • and to provide trend-level reports (e.g., indices with frequently high fragmentation rates; unused indices; resource and time analysis of the operational tasks).
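Here's a sketch of the log-space guard from the first bullet; the 70% cutoff is just an illustrative threshold:

```sql
-- Capture DBCC SQLPERF(LOGSPACE) into a temp table and pause the
-- maintenance loop once the current database's log passes a threshold.
CREATE TABLE #logspace
(
    database_name sysname,
    log_size_mb   float,
    log_used_pct  float,
    status        int
);
INSERT #logspace
EXEC ('DBCC SQLPERF(LOGSPACE) WITH NO_INFOMSGS');

IF EXISTS (SELECT 1 FROM #logspace
           WHERE database_name = DB_NAME() AND log_used_pct > 70.0)
BEGIN
    RAISERROR('Log usage over threshold; pausing index maintenance.', 10, 1) WITH NOWAIT;
    -- the real process sets a flag here and rechecks log size before restarting
END
DROP TABLE #logspace;
```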

Reports on activity history for tables/indices/stats: last index rebuild; last statistics update; last reorganize; reasons why an object was not processed; and ad hoc rebuilds (perhaps indicating attempts to improve poor query performance).

Compression framework to test and document findings for indices over time, using the DDL log to retest indices as tables have new columns added, etc., and to generate compression scripts based on parameterized guidelines (index usage and space savings) that choose between row and page compression.
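The estimation step boils down to calls like this (object names are placeholders):

```sql
-- Estimate PAGE compression savings for one index.
EXEC sys.sp_estimate_data_compression_savings
     @schema_name      = 'dbo',
     @object_name      = 'SomeLargeTable',  -- placeholder
     @index_id         = 1,                 -- clustered index
     @partition_number = NULL,              -- all partitions
     @data_compression = 'PAGE';            -- repeat with 'ROW' and compare
```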

Job and script(s) to capture recurring snapshots of table/index row counts and sizes, to allow for trend analysis of growth.
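A sketch of the snapshot step, assuming a history table shaped roughly like this:

```sql
CREATE TABLE dbo.TableSizeHistory
(
    capture_date date          NOT NULL DEFAULT CAST(GETDATE() AS date),
    schema_name  sysname       NOT NULL,
    table_name   sysname       NOT NULL,
    row_count    bigint        NOT NULL,
    reserved_mb  decimal(18,2) NOT NULL
);
GO
-- One snapshot row per user table: row count from the heap/clustered index,
-- reserved space summed across all indexes.
INSERT dbo.TableSizeHistory (schema_name, table_name, row_count, reserved_mb)
SELECT  OBJECT_SCHEMA_NAME(ps.object_id),
        OBJECT_NAME(ps.object_id),
        SUM(CASE WHEN ps.index_id IN (0,1) THEN ps.row_count ELSE 0 END),
        SUM(ps.reserved_page_count) * 8 / 1024.0
FROM    sys.dm_db_partition_stats AS ps
WHERE   OBJECTPROPERTY(ps.object_id, 'IsUserTable') = 1
GROUP BY ps.object_id;
```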

Queries and reports to look at table/index growth data over time, in two ways:

  • by record count/size over time, looking at snapshots taken on a daily, weekly, or monthly basis.
  • by the current data age distribution within a table, grouping record counts by creation or revision date.  This second option is especially useful for identifying tables that need to be archived or purged; both options are good for spotting runaway tables.

————————-

tempdb analysis script (already on the blog).

——————————

Sessions script, pulling data from sys.dm_exec_sessions and sys.dm_exec_requests, which also pulls in lock info, DBCC INPUTBUFFER output, waits, query plans, SQL text, and memory grant information, with a wide set of parameters controlling outputs, sort orders, and other features.
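The core join looks roughly like this; the full script layers the rest in behind parameters:

```sql
SELECT  s.session_id,
        s.login_name,
        s.host_name,
        r.status,
        r.command,
        r.wait_type,
        r.blocking_session_id,
        r.granted_query_memory,   -- in 8 KB pages
        t.text AS sql_text
FROM    sys.dm_exec_sessions AS s
LEFT JOIN sys.dm_exec_requests AS r
       ON r.session_id = s.session_id
OUTER APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE   s.is_user_process = 1
ORDER BY r.blocking_session_id DESC, s.session_id;
```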

————————————

Dynamic delete script (presented at Jax SSUG), which provides a template for managing deletes safely by checking for a wide variety of errors and issues (a skeleton of the loop follows the list), including but not limited to…

  • running in the wrong environment or database
  • not reviewing the count of records to be purged before purging
  • not including a where clause
  • not monitoring log growth
  • not optimizing loop sizes based on performance
  • not persisting IDs deleted (if required)
  • not logging reasons for purging data
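Here's the skeleton of the loop itself, with placeholder table, predicate, and batch size; the real template wires in all of the checks above:

```sql
SET NOCOUNT ON;
DECLARE @batch_size int = 5000,  -- placeholder; tuned from observed timings
        @rows       int = 1;

-- guard against running in the wrong environment or database
IF DB_NAME() <> N'ExpectedDatabase'  -- placeholder name
BEGIN
    RAISERROR('Wrong database - aborting delete.', 16, 1);
    RETURN;
END

-- review the purge count before actually purging
SELECT COUNT(*) AS rows_to_purge
FROM   dbo.SomeTable                               -- placeholder
WHERE  CreatedDate < DATEADD(YEAR, -7, GETDATE()); -- placeholder predicate

WHILE @rows > 0
BEGIN
    DELETE TOP (@batch_size)
    FROM   dbo.SomeTable
    WHERE  CreatedDate < DATEADD(YEAR, -7, GETDATE());

    SET @rows = @@ROWCOUNT;
    -- the full template checks log space here, persists deleted IDs if
    -- required, and logs the reason for the purge
END;
```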

—————————————-

SQL Server error log parser, which provides summary- and detail-level outputs (a collection sketch follows the list) related to

  • backups
  • login failures
  • deadlocks
  • dbcc checkdb
  • other
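The collection step is a thin wrapper around xp_readerrorlog; here's a sketch that counts login failures per day (arguments: 0 = current log, 1 = SQL Server log rather than Agent, then a search string):

```sql
CREATE TABLE #errorlog
(
    log_date     datetime,
    process_info nvarchar(64),
    log_text     nvarchar(max)
);
INSERT #errorlog
EXEC sys.xp_readerrorlog 0, 1, N'Login failed';

SELECT CAST(log_date AS date) AS log_day,
       COUNT(*)               AS failures
FROM   #errorlog
GROUP BY CAST(log_date AS date)
ORDER BY log_day;
DROP TABLE #errorlog;
```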

—————————————–

Scripts to monitor, persist, and alert on identity columns nearing the int limit.
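The check itself is a small query over sys.identity_columns:

```sql
-- Percentage of the int range already consumed, per int identity column.
SELECT  OBJECT_SCHEMA_NAME(ic.object_id)  AS schema_name,
        OBJECT_NAME(ic.object_id)         AS table_name,
        ic.name                           AS column_name,
        CONVERT(bigint, ic.last_value)    AS last_value,
        CONVERT(decimal(5,2),
            CONVERT(bigint, ic.last_value) * 100.0 / 2147483647) AS pct_of_int_used
FROM    sys.identity_columns AS ic
JOIN    sys.types AS t ON t.user_type_id = ic.user_type_id
WHERE   t.name = 'int'
  AND   ic.last_value IS NOT NULL
ORDER BY pct_of_int_used DESC;
```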

—————————————–

Scripts to collect, persist, and alert on file size; volume size; table and index size; data cache size; CPU, I/O, and memory; wait stats; latch stats; perfmon counters (SQL and OS); and other performance and system data (e.g., resource groups, VLFs, tempdb usage), using DMVs and PowerShell scripts.
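As one example of these collectors, a sketch of the wait-stats snapshot (the history table name is illustrative); since the DMV's counters are cumulative since startup, the reports compute deltas between snapshots:

```sql
CREATE TABLE dbo.WaitStatsHistory
(
    capture_time   datetime2(0) NOT NULL DEFAULT SYSDATETIME(),
    wait_type      nvarchar(60) NOT NULL,
    waiting_tasks  bigint       NOT NULL,
    wait_time_ms   bigint       NOT NULL,
    signal_wait_ms bigint       NOT NULL
);
GO
INSERT dbo.WaitStatsHistory (wait_type, waiting_tasks, wait_time_ms, signal_wait_ms)
SELECT  wait_type, waiting_tasks_count, wait_time_ms, signal_wait_time_ms
FROM    sys.dm_os_wait_stats
WHERE   wait_time_ms > 0;   -- the real collector also filters benign wait types
```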

———————————————-

Scripts to collect proc, trigger, function, and ad hoc statement performance at the module and statement level on a recurring basis, along with SQL text and query plans.  Query plan storage is deduplicated using the query hash and query plan hash.  Query plans are parsed to allow easy retrieval of settings; parameters; missing index recommendations; implicit conversions; index use; access methods; degree of parallelism; and high-impact operations (e.g., table spools, key lookups).
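A sketch of the capture query, using the standard offset math to carve each statement out of its batch; the hashes become the dedup keys when the plans are persisted:

```sql
SELECT TOP (50)
        qs.query_hash,
        qs.query_plan_hash,
        qs.execution_count,
        qs.total_worker_time   AS total_cpu_us,
        qs.total_logical_reads,
        SUBSTRING(st.text,
                  (qs.statement_start_offset / 2) + 1,
                  ((CASE qs.statement_end_offset
                        WHEN -1 THEN DATALENGTH(st.text)
                        ELSE qs.statement_end_offset END
                    - qs.statement_start_offset) / 2) + 1) AS statement_text
FROM    sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_worker_time DESC;
```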

Scripts and reports to analyze the query performance data, and to correlate it with the other resource performance data captured as noted above.  For example: correlating hourly spikes in query CPU with spikes in wait stats and in specific resource pools; or correlating general index usage data with the index usage in query plans, to see exactly what is causing certain types of index usage (allowing for index removal after refactoring, for example).

———————————————–

Script to create and report on extended properties, which allow metadata to be stored at the table and index level (e.g., index creation date; table role, such as type table or master data table).
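A sketch of writing and reading one such property; names and values are illustrative:

```sql
-- Tag a table with its role.
EXEC sys.sp_addextendedproperty
     @name  = N'TableRole',
     @value = N'MasterData',
     @level0type = N'SCHEMA', @level0name = N'dbo',
     @level1type = N'TABLE',  @level1name = N'Customer';

-- Report the property across all tables.
SELECT objname, value
FROM   sys.fn_listextendedproperty(N'TableRole', N'SCHEMA', N'dbo',
                                   N'TABLE', NULL, NULL, NULL);
```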

————————————————-

Script to recurse through foreign keys to produce full hierarchies of relationships from an anchoring parent table, with options to generate join statements for fast code generation.
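The recursion is a straightforward CTE; this sketch is simplified to table-level relationships and doesn't guard against circular FK chains:

```sql
WITH fk_tree AS
(
    SELECT  fk.parent_object_id     AS child_id,
            fk.referenced_object_id AS parent_id,
            1 AS lvl
    FROM    sys.foreign_keys AS fk
    WHERE   fk.referenced_object_id = OBJECT_ID('dbo.AnchorTable')  -- placeholder
    UNION ALL
    SELECT  fk.parent_object_id, fk.referenced_object_id, t.lvl + 1
    FROM    sys.foreign_keys AS fk
    JOIN    fk_tree AS t ON fk.referenced_object_id = t.child_id
)
SELECT  lvl,
        OBJECT_NAME(parent_id) AS parent_table,
        OBJECT_NAME(child_id)  AS child_table
FROM    fk_tree
ORDER BY lvl;
```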

————————————————-

Script to compare foreign keys against indices, to find FKs not supported by an index.
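A simplified sketch, treating an FK column as covered if it is the leading key column of any index (the full comparison handles multi-column keys):

```sql
SELECT  OBJECT_NAME(fkc.parent_object_id)                    AS table_name,
        COL_NAME(fkc.parent_object_id, fkc.parent_column_id) AS fk_column,
        fk.name                                              AS fk_name
FROM    sys.foreign_key_columns AS fkc
JOIN    sys.foreign_keys AS fk
      ON fk.object_id = fkc.constraint_object_id
WHERE   NOT EXISTS
        (SELECT 1
         FROM   sys.index_columns AS ic
         WHERE  ic.object_id   = fkc.parent_object_id
           AND  ic.column_id   = fkc.parent_column_id
           AND  ic.key_ordinal = 1);  -- leading column of some index
```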

————————————————

Table-driven backups, similar to Ola Hallengren's.

————————————————-

Template for looping over databases.
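Essentially a cursor over sys.databases plus dynamic SQL; the per-database work below is a placeholder:

```sql
DECLARE @db  sysname,
        @sql nvarchar(max);

DECLARE db_cur CURSOR LOCAL FAST_FORWARD FOR
    SELECT name
    FROM   sys.databases
    WHERE  state_desc = 'ONLINE'
      AND  database_id > 4;   -- skip system databases

OPEN db_cur;
FETCH NEXT FROM db_cur INTO @db;
WHILE @@FETCH_STATUS = 0
BEGIN
    SET @sql = N'USE ' + QUOTENAME(@db) + N'; SELECT DB_NAME() AS current_db;';
    EXEC sys.sp_executesql @sql;   -- placeholder per-database work
    FETCH NEXT FROM db_cur INTO @db;
END
CLOSE db_cur;
DEALLOCATE db_cur;
```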

————————————————–

Model and scripts to populate server inventories and server change histories (e.g., restarts, memory changes, sp_configure changes, upgrades).

—————————————————

Model and scripts to populate database configuration change histories.

————————————————–

Persisted SQL Agent job/step histories: failures, cancellations, and retries, as well as durations.  Analytic reports for durations that change over time; alerts and reports for job failures.

——————————————————

Job metadata change histories, to provide auditors with reports of changes to jobs, steps, and schedules, and to provide alerts for jobs disabled recently that may need to be re-enabled.

——————————————————

Backup size and duration histories, allowing for trend reporting to spot possible problems; longer-term growth analysis for capacity planning; and planning around job schedule overlaps.
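The raw trend data comes straight out of msdb; a sketch for full backups:

```sql
SELECT  bs.database_name,
        CAST(bs.backup_start_date AS date)  AS backup_day,
        bs.backup_size / 1048576.0          AS backup_mb,
        DATEDIFF(SECOND, bs.backup_start_date, bs.backup_finish_date) AS duration_s
FROM    msdb.dbo.backupset AS bs
WHERE   bs.type = 'D'   -- full database backups
ORDER BY bs.database_name, backup_day;
```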

—————————————————-


Hi!

Hey, y’all!  I’m John Kauffman, a Database Architect in Jacksonville, FL.  I’ve been working with SQL Server for 9 years.  I started as a report developer and quickly realized the difference between T-SQL that gets the right results and T-SQL that performs well while doing so.  My quest to improve query performance got me deeper into SQL Server internals, and I transitioned into DBA work about 6 years ago.

Now, I spend a lot of time writing scripts to provide visibility into SQL Server internals, to collect internals data for long-term trend analysis, or to simplify recurring tasks.  I’ve gotten a lot of ‘Hey, that’s useful.  Do you have a blog?’ comments when I share my scripts, but I never got around to starting one.  But when Paul Randal says it, I figured I should listen!

I hope you find the scripts useful.  Know that I write scripts to work for me, so I’m not used to thinking about making them generic enough to handle different versions, different collations, etc.  If you’ve got problems or enhancements, I’d appreciate the constructive feedback.

JK