Paul and I chatted with Hilton Giesenow about SQL Server, Sharepoint and Involuntary DBAs.  Paul blogged a bit more details about it in the post titled: Paul and Kimberly interview from MVP Summit. And, you can get the actual interview here:

Episode 21 – Diving Into SQL Server With Paul Randal & Kim Tripp

Enjoy!
kt

OK, I've had a bit of a tangent going on optimizing stored procedures and its all revolved around adhoc plan caching and potentially wasted cache due to the single-use plans created (see the most recent posts titled:Plan cache and optimizing for adhoc workloads and Plan cache, adhoc workloads and clearing the single-use plan cache bloat). To be more specific, these are statements that when generated (via adhoc or prepared statements) don't end up being parameterized (in general - only relatively simple statements can be effectively parameterized and when your statements complex, SQL Server won't parameterize them - this can be good and in some systems this can be terribly bad). But, now you know how bad... you know how much of your cache you're wasting but where is it going and are there any similarities to these statements? Should you (or can you?) bring them together to see if there is a commonality that can be leveraged?

In general, the answer is "it depends" (you knew that was coming!). However, I do have good news, you can definitely gauge what's actually going on and how many of the statements are in fact, similar. This might allow you to leverage another database option (forced parameterization) and/or determine that you really NEED stored procedures. However, that might end up being more complex as it will require rewriting the app (or at least some of the app). Again, the good news here is that you'll be able to tell which part of the app.

Take for example the following:

DBCC FREEPROCCACHE
go
SELECT * FROM dbo.Member WHERE Lastname = 'Tripp';
go
SELECT
* FROM dbo.Member WHERE Lastname = 'Tripped'
;
go
SELECT
* FROM dbo.Member WHERE Lastname = 'Tripper'
;
go
SELECT
* FROM dbo.Member WHERE Lastname = 'Tripps'
;
go
SELECT
* FROM dbo.Member WHERE Lastname = 'Falls'
;
go

SELECT cp.objtype, cp.cacheobjtype, cp.size_in_bytes, cp.refcounts, cp.usecounts, st.text --, *
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE cp.objtype IN ('Adhoc', 'Prepared'
)
        AND st.text LIKE '%from dbo.member%' 
        AND st.text NOT LIKE
'%SELECT cp.objecttype%'
ORDER BY cp.objtype
go

objtype       cacheobjtype          size_in_bytes   refcounts  usecounts     text
Adhoc         Compiled Plan         24576               2              1                 SELECT * FROM dbo.Member WHERE Lastname = 'Falls'; 
Adhoc         Compiled Plan         24576               2              1                 SELECT * FROM dbo.Member WHERE Lastname = 'Tripps'; 
Adhoc         Compiled Plan         24576               2              1                 SELECT * FROM dbo.Member WHERE Lastname = 'Tripper'; 
Adhoc         Compiled Plan         24576               2              1                 SELECT * FROM dbo.Member WHERE Lastname = 'Tripped'; 
Adhoc         Compiled Plan         24576               2              1                 SELECT * FROM dbo.Member WHERE Lastname = 'Tripp'; 

The first thing to notice is that each of these queries has a *different* statement in the adhoc plan cache. The second is to recognize that each of these statements take 24K in the plan cache. If we were to change to "optimize for ad hoc workloads" then things might be a bit better:

sp_configure 'optimize for ad hoc workloads', 1
go
reconfigure
go
DBCC
FREEPROCCACHE
go
SELECT * FROM dbo.Member WHERE Lastname = 'Tripp';
go
SELECT
* FROM dbo.Member WHERE Lastname = 'Tripped'
;
go
SELECT
* FROM dbo.Member WHERE Lastname = 'Tripper'
;
go
SELECT
* FROM dbo.Member WHERE Lastname = 'Tripps'
;
go
SELECT
* FROM dbo.Member WHERE Lastname = 'Falls'
;
go

SELECT cp.objtype, cp.cacheobjtype, cp.size_in_bytes, cp.refcounts, cp.usecounts, st.text --, *
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE cp.objtype IN ('Adhoc', 'Prepared'
)
        AND st.text LIKE '%from dbo.member%' 
        AND st.text NOT LIKE
'%SELECT cp.objecttype%'
ORDER BY cp.objtype
go

objtype       cacheobjtype         size_in_bytes   refcounts  usecounts     text
Adhoc         Compiled Plan Stub  320                 2              1                 SELECT * FROM dbo.Member WHERE Lastname = 'Falls'; 
Adhoc         Compiled Plan Stub  320                 2              1                 SELECT * FROM dbo.Member WHERE Lastname = 'Tripps'; 
Adhoc         Compiled Plan Stub  320                 2              1                 SELECT * FROM dbo.Member WHERE Lastname = 'Tripper'; 
Adhoc         Compiled Plan Stub  320                 2              1                 SELECT * FROM dbo.Member WHERE Lastname = 'Tripped'; 
Adhoc         Compiled Plan Stub  320                 2              1                 SELECT * FROM dbo.Member WHERE Lastname = 'Tripp'; 

But, this is not the only issue. Even though each statement will only be in the cache once and will only take 320 bytes (instead of 24K) the cumulative effect of this might be significant. So, how can you see this? If you try to aggregate over these statements then you'll only be able to use the first 40-50 characters while that works here it definitely won't work for more complex statements. Until - SQL Server 2008. In 2008, you can use the query_hash to find similar statements and their overall affect on the query cache.

SELECT qs2.query_hash AS [Query Hash]
        , SUM(qs2.size_in_bytes) AS [Total Cache Size]
        , SUM(qs2.total_worker_time)/SUM(qs2.execution_count) AS [Avg CPU Time]
        , SUM(qs2.total_elapsed_time)/SUM(qs2.execution_count) AS [Avg Duration]
        , COUNT(*) AS [Number of plans] 
        , MIN(qs2.statement_text) AS [Statement Text]
FROM (SELECT qs.*, cp.size_in_bytes, SUBSTRING(ST.text, (QS.statement_start_offset/2) + 1, ((CASE statement_end_offset WHEN -1 THEN DATALENGTH(st.text) 
                ELSE QS.statement_end_offset END - QS.statement_start_offset)/2) + 1) AS statement_text 
                FROM sys.dm_exec_cached_plans AS cp 
                        JOIN sys.dm_exec_query_stats AS qs ON cp.plan_handle = qs.plan_handle
                CROSS APPLY sys.dm_exec_sql_text(QS.sql_handle) as ST) as qs2
GROUP BY qs2.query_hash ORDER BY [Avg Duration] DESC;

This is incredibly important because this will give you the information to determine what SIMILAR query (or queries) are consuming the cache and which ones have the highest cumulative effect. And, if you see a statement that is consistent and can be optimized using indexes (and consistently using the same indexes) what you might try first is "forced parameterization." This is a DATABASE-level option (available in 2005+) that significantly improves but still doesn't guarantee that a statement will be parameterized. See the Books Online for Forced Parameterization (SQL Server 2008 Forced Parameterization and SQL Server 2005 Forced Parameterization) for the rules on when SQL Server parameterizes. However, there are many statements that will. If you have A LOT of plan cache bloat AND you think your plans might benefit from forced parameterization - this is definitely something you can try.

Again, the good news here is that the QP will try harder to parameterize but there are still many cases where it won't. In this case, SQL Server parameterizes this statement (it becomes a Prepared statement) as:

(@0 varchar(8000))select * from dbo . Member where Lastname = @0

You can see this through the following query/results:

SELECT st.text, cp.objtype, cp.cacheobjtype, cp.size_in_bytes, cp.refcounts, cp.usecounts, qp.query_plan --, *
FROM sys.dm_exec_cached_plans AS cp
        CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
        CROSS APPLY sys.dm_exec_query_plan(plan_handle) AS qp
WHERE cp.objtype IN ('Adhoc', 'Prepared'
)
        AND st.text LIKE
'%member%'
ORDER BY cp.objtype
go

text objtype cacheobjtype size_in_bytes refcounts usecounts query_plan
SELECT * FROM dbo.Member WHERE Lastname = 'Falls';   Adhoc Compiled Plan 24576 2 1 <ShowPlanXML xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan" Version="1.1" Build="10.0.2531.0"><BatchSequence><Batch><Statements><StmtSimple StatementText="SELECT * FROM dbo.Member WHERE Lastname = 'Falls';&#xD;&#xA;" StatementId="1" StatementCompId="1" StatementType="SELECT" ParameterizedPlanHandle="0x0600050088DDF510B8C0630A000000000000000000000000" ParameterizedText="(@0 varchar(8000))select * from dbo . Member where Lastname = @0" /></Statements></Batch></BatchSequence></ShowPlanXML>
SELECT * FROM dbo.Member WHERE Lastname = 'Tripps';   Adhoc Compiled Plan 24576 2 1 <ShowPlanXML xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan" Version="1.1" Build="10.0.2531.0"><BatchSequence><Batch><Statements><StmtSimple StatementText="SELECT * FROM dbo.Member WHERE Lastname = 'Tripps';&#xD;&#xA;" StatementId="1" StatementCompId="1" StatementType="SELECT" ParameterizedPlanHandle="0x0600050088DDF510B8C0630A000000000000000000000000" ParameterizedText="(@0 varchar(8000))select * from dbo . Member where Lastname = @0" /></Statements></Batch></BatchSequence></ShowPlanXML>
SELECT * FROM dbo.Member WHERE Lastname = 'Tripper';   Adhoc Compiled Plan 24576 2 1 <ShowPlanXML xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan" Version="1.1" Build="10.0.2531.0"><BatchSequence><Batch><Statements><StmtSimple StatementText="SELECT * FROM dbo.Member WHERE Lastname = 'Tripper';&#xD;&#xA;" StatementId="1" StatementCompId="1" StatementType="SELECT" ParameterizedPlanHandle="0x0600050088DDF510B8C0630A000000000000000000000000" ParameterizedText="(@0 varchar(8000))select * from dbo . Member where Lastname = @0" /></Statements></Batch></BatchSequence></ShowPlanXML>
SELECT * FROM dbo.Member WHERE Lastname = 'Tripped';   Adhoc Compiled Plan 24576 2 1 <ShowPlanXML xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan" Version="1.1" Build="10.0.2531.0"><BatchSequence><Batch><Statements><StmtSimple StatementText="SELECT * FROM dbo.Member WHERE Lastname = 'Tripped';&#xD;&#xA;" StatementId="1" StatementCompId="1" StatementType="SELECT" ParameterizedPlanHandle="0x0600050088DDF510B8C0630A000000000000000000000000" ParameterizedText="(@0 varchar(8000))select * from dbo . Member where Lastname = @0" /></Statements></Batch></BatchSequence></ShowPlanXML>
SELECT * FROM dbo.Member WHERE Lastname = 'Tripp';   Adhoc Compiled Plan 24576 2 1 <ShowPlanXML xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan" Version="1.1" Build="10.0.2531.0"><BatchSequence><Batch><Statements><StmtSimple StatementText="SELECT * FROM dbo.Member WHERE Lastname = 'Tripp';&#xD;&#xA;" StatementId="1" StatementCompId="1" StatementType="SELECT" ParameterizedPlanHandle="0x0600050088DDF510B8C0630A000000000000000000000000" ParameterizedText="(@0 varchar(8000))select * from dbo . Member where Lastname = @0" /></Statements></Batch></BatchSequence></ShowPlanXML>
(@0 varchar(8000))select * from dbo . Member where Lastname = @0 Prepared Compiled Plan 40960 7 10 <ShowPlanXML xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan" Version="1.1" Build="10.0.2531.0"><BatchSequence><Batch><Statements><StmtSimple StatementText="(@0 varchar(8000))select * from dbo . Member where Lastname = @0" StatementId="1" StatementCompId="1" StatementType="SELECT" StatementSubTreeCost="0.00657038" StatementEstRows="1" StatementOptmLevel="FULL" QueryHash="0xDEF0805C7B74A31E" QueryPlanHash="0x785C8E320D853B97" StatementOptmEarlyAbortReason="GoodEnoughPlanFound"><StatementSetOptions QUOTED_IDENTIFIER="true" ARITHABORT="true" CONCAT_NULL_YIELDS_NULL="true" ANSI_NULLS="true" ANSI_PADDING="true" ANSI_WARNINGS="true" NUMERIC_ROUNDABORT="false" /><QueryPlan CachedPlanSize="24" CompileTime="1" CompileCPU="1" CompileMemory="136"><RelOp NodeId="0" PhysicalOp="Nested Loops" LogicalOp="Inner Join" EstimateRows="1" EstimateIO="0" EstimateCPU="4.18e-006" AvgRowSize="173" EstimatedTotalSubtreeCost="0.00657038" Parallel="0" EstimateRebinds="0" EstimateRewinds="0"><OutputList><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="member_no" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="lastname" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="firstname" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="middleinitial" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="street" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="city" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="state_prov" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="country" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="mail_code" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="phone_no" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="photograph" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="issue_dt" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="expr_dt" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="region_no" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="corp_no" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="prev_balance" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="curr_balance" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="member_code" /></OutputList><NestedLoops Optimized="0"><OuterReferences><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="member_no" /></OuterReferences><RelOp NodeId="1" PhysicalOp="Index Seek" LogicalOp="Index Seek" EstimateRows="1" EstimateIO="0.003125" EstimateCPU="0.0001581" AvgRowSize="20" EstimatedTotalSubtreeCost="0.0032831" TableCardinality="10000" Parallel="0" EstimateRebinds="0" EstimateRewinds="0"><OutputList><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="member_no" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="lastname" /></OutputList><IndexScan Ordered="1" ScanDirection="FORWARD" ForcedIndex="0" ForceSeek="0" NoExpandHint="0"><DefinedValues><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="member_no" /></DefinedValue><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="lastname" /></DefinedValue></DefinedValues><Object Database="[credit]" Schema="[dbo]" Table="[member]" Index="[IX_Member_Lastname]" IndexKind="NonClustered" /><SeekPredicates><SeekPredicateNew><SeekKeys><Prefix ScanType="EQ"><RangeColumns><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="lastname" /></RangeColumns><RangeExpressions><ScalarOperator ScalarString="[@0]"><Identifier><ColumnReference Column="@0" /></Identifier></ScalarOperator></RangeExpressions></Prefix></SeekKeys></SeekPredicateNew></SeekPredicates></IndexScan></RelOp><RelOp NodeId="3" PhysicalOp="Clustered Index Seek" LogicalOp="Clustered Index Seek" EstimateRows="1" EstimateIO="0.003125" EstimateCPU="0.0001581" AvgRowSize="161" EstimatedTotalSubtreeCost="0.0032831" TableCardinality="10000" Parallel="0" EstimateRebinds="0" EstimateRewinds="0"><OutputList><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="firstname" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="middleinitial" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="street" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="city" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="state_prov" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="country" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="mail_code" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="phone_no" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="photograph" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="issue_dt" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="expr_dt" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="region_no" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="corp_no" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="prev_balance" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="curr_balance" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="member_code" /></OutputList><IndexScan Lookup="1" Ordered="1" ScanDirection="FORWARD" ForcedIndex="0" ForceSeek="0" NoExpandHint="0"><DefinedValues><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="firstname" /></DefinedValue><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="middleinitial" /></DefinedValue><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="street" /></DefinedValue><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="city" /></DefinedValue><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="state_prov" /></DefinedValue><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="country" /></DefinedValue><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="mail_code" /></DefinedValue><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="phone_no" /></DefinedValue><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="photograph" /></DefinedValue><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="issue_dt" /></DefinedValue><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="expr_dt" /></DefinedValue><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="region_no" /></DefinedValue><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="corp_no" /></DefinedValue><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="prev_balance" /></DefinedValue><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="curr_balance" /></DefinedValue><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="member_code" /></DefinedValue></DefinedValues><Object Database="[credit]" Schema="[dbo]" Table="[member]" Index="[member_ident]" TableReferenceId="-1" IndexKind="Clustered" /><SeekPredicates><SeekPredicateNew><SeekKeys><Prefix ScanType="EQ"><RangeColumns><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="member_no" /></RangeColumns><RangeExpressions><ScalarOperator ScalarString="[credit].[dbo].[member].[member_no]"><Identifier><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="member_no" /></Identifier></ScalarOperator></RangeExpressions></Prefix></SeekKeys></SeekPredicateNew></SeekPredicates></IndexScan></RelOp></NestedLoops></RelOp><ParameterList><ColumnReference Column="@0" ParameterCompiledValue="'Tripp'" /></ParameterList></QueryPlan></StmtSimple></Statements></Batch></BatchSequence></ShowPlanXML>

Ideally, you'll end up with less wasted cache because all adhoc statements from here - will use the Prepared version of the statement. This also means that they'll run with a compiled plan (so you'll also save compile time) and - if the plans are consistent then all of this is good. However, if other values (for example 'Smith') require a completely different plan - then this could actually be worse (this is the "It Depends" part). But, if you still have statements that do have consistent execution plans (because you know the data and/or you know exactly how this particular data is being used) then EITHER sp_executesql or a stored procedure would effectively force the parameterization as well and it would say to me that you really understand your data/application. But, if you're wrong... anytime you force the parameterize (and increase re-use of a plan - you could end up forcing the re-use of a bad (and possibly REALLY, REALLY, REALLY, REALLY, REALLY, REALLY, REALLY, REALLY, REALLY bad plan) :)).

So, while you solve one problem - you could create another one (see the blog post titled: Using the OPTION (RECOMPILE) option for a statement for more information). If the plans are NOT consistent then forcing a plan (through either forced parameterization, sp_executesql OR a regular stored procedure) can result in poor performance. When writing the stored procedure (and through good testing processes), you'll be able to create better and more effective code. This is truly my preferred method. And, I'm just about to wrap up my series with roughly 3 more posts (at least that's what I currently have planned but you guys have certainly been steering me in this direction for a bit :-)).

Thanks for reading,
kt

In my post describing the new SQL Server 2008 configuration option [optimize for adhoc workloads] titled: Plan cache and optimizing for adhoc workloads, I asked for you to send me your plan cache numbers. In addition to the comments on the post, I received a lot of feedback in email (and lots of comments - and even a related blog posts). The general state of the cache for many was that only 10s of MB were wasted, for a couple it was 100s MB and for a few it was GBs of memory - completely wasted. I've taken some of the worst numbers and compiled them below:

 CacheType Total Plans  Total MBs  Avg Use Count   Total MBs - USE Count 1   Total Plans - USE Count 1  % of Total
Prepared

1,541

7,925.85

1,245

7,858.43

1,448

99.15

Adhoc

86,624

8,592.48

5

7,332.59

76,145

85.34

Prepared

148,527

7,428.82

444

5,074.25

94,851

68.30

Adhoc

63,471

4,565.00

1,328

4,161.54

50,737

91.16

Adhoc

1,358

2,704.27

3,676

2,673.25

425

98.85

Adhoc                

48,140

6,233.64

173

2,106.05

20,493

33.79

Prepared

18,639

2,590.82

127

2,027.72

13,614

78.27

Prepared

18,010

3,237.18

354

1,944.48

9,561

60.07

Adhoc

17,392

2,417.38

100

1,787.33

7,741

73.94

Prepared

6,276

1,875.71

59,370

1,560.66

4,020

83.20

Prepared

16,832

2,844.61

10,829

1,507.63

5,732

53.00

Prepared

3,075

1,681.56

46,262

1,427.33

604

84.88

Adhoc

29,047

1,923.84

39

1,362.09

22,827

70.80

Adhoc 

17,028

1,255.60

185

1,160.85

15,845

92.45

Adhoc

118,838

2,073.25

2,813

1,128.47

41,212

54.43

Adhoc                

13,895

1,300.72

11

875.45

9,351

67.31

Prepared 

8,266

476.90

31

462.82

8,031

97.05

Adhoc 

8,865

507.63

6

293.98

4,738

57.91

Adhoc 

10,066

401.54

9

282.87

5,717

70.45

Adhoc 

18,676

506.66

12,463

260.48

7,938

51.41

Adhoc 

2,113

223.62

142

207.36

2,024

92.73

Adhoc 

10,107

240.27

76

157.51

6,073

65.55

Prepared            

1,888

154.91

4

125.08

1,438

80.74

Prepared            

1,561

195.27

14,761

123.70

1,089

63.35

Prepared 

742

125.30

14,517

109.39

593

87.30

Prepared 

22,566

838.79

357

99.87

505

11.91

Adhoc   

478

109.01

2

92.62

423

84.96

Adhoc 

2,917

173.43

26,651

82.73

1,313

47.70

Adhoc 

2,906

122.81

24

77.08

1,911

62.76

Prepared 

609

98.06

106

75.59

360

77.09

Adhoc   

638

75.74

24

52.04

414

68.71

Prepared             

322

23.12

8

17.82

275

77.09

First, some of these do have the "optimize for adhoc workloads" option set. However, not too many. And, for those that do have this set, the single-use plan cache was for the amount of space taken by the compile plan stubs - not actual execution plans. Typically, these are significantly smaller (typically only a matter of bytes) rather than 10s-100s of kilobytes. So, while the amount of cache wasted is reduced to only a fraction of what it would have been without the option set, there's still the potential for a lot of cache wasted.

Another thing to notice is that the only interesting nmumbers came from these two types of cache: Adhoc and Prepared. This is to be expected - the numbers for single-use plan cache are worst for the scenarios where statements are auto parameterized and "prepared" but not often re-used (where the statements submitted vary wildly because of different client applications/executions). But, while it is expected, what can you do about this? Unfortunately, not much - without changing the client application interface to the data (meaning - use stored procedures).

However, what you might benefit from (even if you have the option set) is periodically clearing the cache. However, you don't necessarily need to clear all of the cache - you can just clear the Adhoc plan cache.

To clear just the "SQL Plans" from the plan cache, use:

DBCC FREESYSTEMCACHE('SQL Plans')

If you want to clear all of the cache, you can use:

DBCC FREEPROCCACHE

And, in SQL Server 2005/2008 there are a host of combinations that will allow you to clear as much or as little as you'd like - down to just a single plan (using the plan_handle) in SQL Server 2008. For more information, check out the BOL topic: DBCC FREEPROCCACHE.

But, if you want to be more clever about it, you might want to automate the process of clearing these 'SQL Plans' when they're wasting more than a certain percentage of the memory that SQL Server's using OR more simply, once the single-use plan cache reaches a certain size (depending on the amount of memory that you have you might choose something as low as 500MB or possibly as high as 2-3GB). To help you out, I decided to write the queries to do this and give you a few options from which to choose. And, it's definitely easier to write more interesting code in 2008 because of some of the new memory-related DMVs (specifically: sys.dm_os_sys_memory and sys.dm_os_process_memory). However, you can still do this fairly easily in both editions.

So, let me summarize:
1) If you're running SQL Server 2008 and you have cache being wasted by single-use plans, be sure to use the new "optimize for adhoc workloads."
2) If you still find that you're wasting 100s of MB or GB of cache, consider creating a job that programmatically checks cache and then clears the 'SQL Plans' from cache based on one of the options/code below.

  1.  
    1. This is probably the easiest and most simplfied option.
    2. This is probably overkill for most folks but probably won't hurt either, especially if you have a lot of other types of cache being wasted by single-use plans.
    3. This is probably the most interesting and uses some of the new DMVs to see how much of the ACTUAL working cache is going to single-use plans. However, this is only for 2008. Could I write it for 2005, yes, but it's not really worth it. I'd just go with option 1 (or possibly 2) for 2005.

1. Clearing *JUST* the 'SQL Plans' based on *just* the amount of Adhoc/Prepared single-use plans (2005/2008):

DECLARE @MB decimal(19,3)
        , @Count bigint
        , @StrMB nvarchar(20)

SELECT @MB = sum(cast((CASE WHEN usecounts = 1 AND objtype IN ('Adhoc', 'Prepared') THEN size_in_bytes ELSE 0 END) as decimal(12,2)))/1024/1024 
        , @Count = sum(CASE WHEN usecounts = 1 AND objtype IN ('Adhoc', 'Prepared') THEN 1 ELSE 0 END)
        , @StrMB = convert(nvarchar(20), @MB)
FROM sys.dm_exec_cached_plans

IF @MB > 10
        BEGIN
                DBCC FREESYSTEMCACHE('SQL Plans') 
                RAISERROR ('%s MB was allocated to single-use plan cache. Single-use plans have been cleared.', 10, 1, @StrMB)
       
END
ELSE
        BEGIN
                RAISERROR ('Only %s MB is allocated to single-use plan cache - no need to clear cache now.', 10, 1, @StrMB)
               
-- Note: this is only a warning message and not an actual error.
        END
go

2. Clearing *ALL* of your cache based on the total amount of wasted by single-use plans (2005/2008):

DECLARE @MB decimal(19,3)
        , @Count bigint
        , @StrMB nvarchar(20)

SELECT @MB = sum(cast((CASE WHEN usecounts = 1 THEN size_in_bytes ELSE 0 END) as decimal(12,2)))/1024/1024 
        , @Count = sum(CASE WHEN usecounts = 1 THEN 1 ELSE 0 END)
        , @StrMB = convert(nvarchar(20), @MB)
FROM sys.dm_exec_cached_plans

IF @MB > 1000
        DBCC FREEPROCCACHE
ELSE
        RAISERROR ('Only %s MB is allocated to single-use plan cache - no need to clear cache now.', 10, 1, @StrMB)
go

3. Stored Procedure to report/track + logic to go into a job based on percentage OR MB of wasted cache (2008 only):

This I put into a script that you can play with here (sp_SQLskills_CheckPlanCache.sql (3.67 kb)).

Enjoy!
kt

 

Paul and I enjoyed Dublin so much in 2008 and 2009 that we're going back for two weeks this time - and teaching three classes while we're there! We'll be working with our fabulous friends (Carmel, Sandra, and Bob) at Prodata and the event will be hosted by the nice folks at Microsoft Ireland. We'll also be doing a user group presentation, as usual, and you can register for this on the MTUG website. Not sure what we'll talk about at the user group yet (probably SQL Server related :-) - suggestions?

If you register for any of the classes, be sure to use this special discount code "KTB" which gets you 5% off and we'll know you saw the class on my blog.

We decided to structure the first class as a 4-day version of our popular Immersion Event and the following week do two deeper classes focusing on performance and disaster recovery. And, even with as much content as we've recorded and mentioned online, our "live" Immersion Events are more in-depth and interactive. You'll get a lot of information in a relatively short period of time and you'll be able to go back and immediately apply it. We base our content on what really works in production and what we've learned from actual implementations/architectures.

We've taught several classes this year already and had rave reviews of the content, our knowledge, and presentation style. Check them out for yourself (with quotes extracted from their blog posts):

  • Greg Gonzalez, President and CEO of SQL Sentry wrote a long blog post about the Immersion Event he attended in Boston in April:
    • "When it comes to the instructors themselves, Kimberly and Paul simply have no equal.  Not only are they both ultimate authorities, but they have endless enthusiasm about the material, and spot on delivery.  If either ever got tired they never showed it, even after going all day and all week.  We witnessed countless demos over the course of the week, some extremely involved, multi-step processes, and I can’t recall a one that didn’t go the way it was supposed to."
    • "These are not beginner presenters, and they put an extreme amount of preparation and attention to detail into everything that they do.  Completely, utterly professional."
    • "You might think that with this extreme level of skill comes extreme levels of egotism and lack of patience.  Nothing could be further from the truth. ... They simply know how to teach, and are approachable, humble, and patient."
  • Fellow-MVP Aaron Bertrand discussing the same Immersion Event:
    • "The additional insight around what's printed in the book or in Books Online is invaluable; and the experience Paul and Kimberly have had with real live customers yields a lot more information and things to watch out for than you'd ever get from the documentation alone."
  • Newly-minted MCM Robert Davis of Microsoft explaining how the previous training he'd received from us at Microsoft helped him pass his MCM certification (which we also teach):
    • "I try to take the SQLSkills classes when I can. I highly recommend to everyone that they take advantage of any training opportunities that they see from this group."
  • Newly-minted MCM Brent Ozar of Quest summing up the first week of the recent MCM class that we taught:
    • "To call them good trainers is an epic understatement.  They know how to deliver very, very technical material in ways that illustrate it well.  I had to stop Paul at one point and ask him how long it took to build a particular slide because the animations were so good at conveying a hard-to-describe process."

We also have a few testimonials and quotes from past customers and finally, here's the press release from last year's class. Phew!

4-Day Immersion Event, June 28-July 1 2010, Dublin, Ireland: See the Prodata site for full details and for registration links and options.

Day 1: SQL Server Internals

  • On-disk structures: how the data is stored
    • Records, pages, allocation bitmaps
  • Index internals: how the data is organized
  • Logging and recovery: how the data is protected
    • What is logging?
    • How recovery works
    • How the transaction log is structured
    • Recovery models 

Day 2: Designing for Performance

  • Choosing the RIGHT Data Type
    • Understanding data types
    • Character data, overflow and LOB
    • LOB considerations
    • Date and time data types
    • Heterogeneous data types
    • Sparse columns
    • Application inconsistencies in types
  • Table & Index Partitioning Strategies
    • Concepts/motivating factors in Partitioning
    • SQL Server 2000+ Partitioned Views
    • SQL Server 2005+ Partitioned Tables
    • Partitioning Design Techniques Combined
    • Partitioned and non-partitioned Indexes
    • Implementing the Sliding Window Scenario

Day 3: Indexing for Performance

  • Data Access
    • Data Access Patterns
    • Covering
    • Using INCLUDE (SQL Server 2005+)
    • Using Filters (SQL Server 2008+)
  • Indexing Strategies
    • Indexing for AND
    • Indexing for OR
    • Indexing for Joins
    • Indexing for Aggregates
    • Indexed Views (Overview)

Day 4: Essential Database Maintenance

  • Data and log file provisioning and management
    • Database layout considerations
    • Growing and shrinking
    • Instant initialization
  • Tempdb
  • Index and statistics maintenance
  • Using backup and restore
  • Consistency checking

Performance Optimization Masterclass, July 5-6 2010, Dublin, Ireland: See the Prodata site for full details and for registration links and options.

(This is *NOT* duplicate content to the 4-day class - it's deeper knowledge and requires the equivalent knowledge to the 4-day class to get the best out of it.)

Part 1: Optimizing Procedural Code

  • Module 1: Batches and Plan Cache
    • Statement execution and plan caching
    • sp_executesql and dynamic string execution
  • Module 2: Optmizing Procedural Code
  • Understanding stored procedures
    • Understanding recompilation
    • Creating an optimal plan
    • Forcing plans & plan guides

Part 2: Resource Monitoring and Troubleshooting

  • Module 1: Waits & Queues
    • What are waits and queues?
    • What should you be looking for?
  • Module 2: Locking / blocking
    • Locking overview
    • Examples of blocking scenarios
    • Troubleshooting blocking
    • Blocking avoidance strategies
  • Module 3: Resource governor
  • Module 4: Extended Events
  • Module 5: Performance Data Collection

Part 3: Are Your Index Strategies Working?

  • Module 1: index cleanup
    • Index usage and consolidation
  • Module 2: Index health
    • What is fragmentation
    • Detecting fragmentation
    • Removing fragmentation
  • Module 3: Missing indexes
    • DTA

Disaster Recovery Masterclass, July 7-8 2010, Dublin, Ireland: See the Prodata site for full details and for registration links and options.

(This is *NOT* duplicate content to the 4-day class - it's deeper knowledge and really requires the equivalent knowledge to the 4-day class to get the best out of it.)

Part I: Fundamentals

  • Module 1: Internals
    • Introduction to database structures
    • Introduction to logging and recovery
    • How recovery models affect disaster recovery
  • Module 2: Data Storage
    • Choosing a RAID level
    • Storage tuning for fast recovery

Part II: Strategy

  • Module 3: Planning a disaster recovery strategy
    • Requirements and limitations
    • Testing
  • Module 4: Architecting for recoverability
    • Partial database availability
    • Planning a backup strategy
    • Database snapshots
    • Utilizing high-availability technologies

Part III: Practicalities

  • Module 5: Recovering from a disaster with backups
    • Tail-of-the-log backups
    • Determining restore sequence
    • Point-in-time restore
    • Piecemeal restore
    • System databases
  • Module 6: Recovering from a disaster without backups
    • Interpreting DBCC CHECKDB output
    • Using database repair
    • Using EMERGENCY mode

We really hope to see you at one or more of these classes! And, if you're interested in attending two or more classes - special discounts are available. Additionally, we're also offering discounts to folks that attended last year's Dublin Immersion Event - so that they can continue where they left of and add on one or both of the 2-day classes offered in the second week. Feel free to shoot either of us an email [Kimberly OR Paul at SQLskills dot com] OR use one of the contact pages on the ProdData site. You can also find this event on LinkedIn here.

We look forward to seeing you there!
kt (and pr)

Categories:
Events | SQL Server 2008

I mentioned that servers receiving a lot of dynamic constructs (typically those generated by client SQL generaters/ORM/LINQ, etc.) can start to consume too much plan cache and have problems with "single-use plans" in my last post titled: Statement execution and why you should use stored procedures. I also mentioned that SQL Server 2008 has an option/feature specifically to help reduce the bloat on the plan cache by only storing a query plan hash the first time a plan is created.

First - let's check to see how your plan cache is currently allocated:

SELECT objtype AS [CacheType]
        , count_big(*) AS [Total Plans]
        , sum(cast(size_in_bytes as decimal(12,2)))/1024/1024 AS [Total MBs]
        , avg(usecounts) AS [Avg Use Count]
        , sum(cast((CASE WHEN usecounts = 1 THEN size_in_bytes ELSE 0 END) as decimal(12,2)))/1024/1024 AS [Total MBs - USE Count 1]
        , sum(CASE WHEN usecounts = 1 THEN 1 ELSE 0 END) AS [Total Plans - USE Count 1]
FROM sys.
dm_exec_cached_plans
GROUP BY objtype
ORDER BY [Total MBs - USE Count 1]
DESC
go

This statement will show you how much of your cache is allocated to single use plans... and, I'd love to hear what your numbers are... this query works in 2005 and 2008; however, the primary solution I'm describing here (optimize for adhoc workloads) will only work in SQL Server 2008.

If you have a lot of your cache going to plans that are only executed once, then it's time to clean up the cache and take better advantage of it with plans that are more consistent and more stable. Ideally, this means using more stored procedures and writing these stored procedures effectively for better performance (I'm currently in the midst of doing this in my Optimizing Procedural Code category). However, if you absolutely must use a lot of adhoc SQL, then you should consider upgrading and turning on this new option. It's an advanced configuration option so you won't see it until you "show advanced options" and it's set using sp_configure. There have been some other really good posts out there on how to use this and what this is so I'm just going to bring together some great resources for you to read. The most important post to read (and it's especially important for those on you on versions of SQL Server 2000 or SQL Server 2005 RTM/SP1 [er... why aren't you on SP2?]) is that plan cache can get out of control. 2005 SP2 and 2008 reduce the total size but there no upper limit (which is again - a GREAT reason for the addition of "optimize for adhoc workloads"). Kalen Delaney talks about how things really work in her SQL Server 2005 SP2 post titled: Did You Know? SP2 does NOT limit the amount of plan cache you can have (key word there is LIMIT). Again, SQL Server doesn't set an upper limit but it does [drastically] reduce the total size that's possible (as of SP2). As for even more details on plan caching, recompilation and SQL Server 2008's better cache control - check out Greg Low's (blog|twitter) whitepaper titled: Plan Caching in SQL Server 2008. Taken STRAIGHT from the first section of the whitepaper:

SQL Server 2008 and SQL Server 2005 SP2
* 75% of visible target memory from 0-4GB + 10% of visible target memory from 4Gb-64GB + 5% of visible target memory > 64GB
 
SQL Server 2005 RTM and SQL Server 2005 SP1
* 75% of visible target memory from 0-8GB + 50% of visible target memory from 8Gb-64GB + 25% of visible target memory > 64GB
 
SQL Server 2000
* SQL Server 2000 4GB upper cap on the plan cache 

Finally, lots of additional posts on this topic will give you even more details:

However, I'm still really interested in seeing your numbers from the query above - let me know!

NEW/ADDITIONAL REQUEST: Let me know your max server memory setting as well as the total memory available on the box?

Thanks for reading!
kt

Well, back in April of 2009 I blogged Social networking, keeping up with friends/family and getting more tech info! and I essentially said that I was going to limit myself to only Facebook and Twitter. Low and behold, I've succumbed to yet another network - LinkedIn. The SQL Server Community is *incredibly* strong online: in forums, twitter groups (such as #sql, #sqlserver, #sqlpass, #sqlhelp) and in many of these online commnunities. And, with so many events and so little time to find where everyone is at and when - these social tools can really help to keep you informed.

So, as Paul just blogged a few minutes ago "Just when I thought I'd discovered all the online networks that could successfully prevent me from doing useful work, I re-discovered LinkedIn this morning after Kimberly got sucked in, dragged me in as well. It makes sense I suppose as a more serious way to keep in touch with people we met at conferences and online." And, so it goes, here are my accounts if you want to keep in touch through other methods:

And, as usual - you can find me here for most of my techie blog posts and as many resource posts as I can create! And, on SQL Server Magazine on our new blog for SQL Server Questions Answered (I just blogged about this earlier today here).

Cheers!
kt

PS - That's *REALLY* it. I will NOT join any other social networks... ever. (well, until someone convinces me of why I need to :)). Never say never (er, except for the database option: autoshrink! ;-))

Categories:
Personal | Social Networking

In March, Paul and I began blogging in a new column for SQL Server Magazine. This column is solely a Q&A based column... yes, we'll still write feature articles and yes, we'll still be blogging on each of our own blogs (in fact, have you noticed that I've actually blogged more this month than almost any other since I started blogging 6+ years ago...).

Anyway, we've been answering some of the most common questions we receive and we're putting lots of details in our answers - even code to help you solve the problem. Below is a list of our first few posts and there's more to come... well, if you keep asking us questions!

Submit your questions through the link on the left-hand side of the page (where it says: EMAIL PAUL AND KIMBERLY).  

Enjoy!
kt

I'm speaking at a few user groups over the next couple of months... are you local?

Vienna, Austria - April 29, 2010

  • Presentation: SQL Server 2005/2008: Index Internals
  • Presenter: Kimberly L. Tripp (yep, just me!)
  • Information/details: Vienna SQL Server User Group 

Houston, Texas (the Woodlands) - May 18, 2010

Dublin, Ireland - June 29, 2010

I/we hope to see you at one of these sessions!
kt

Categories:
Events

OK, so, this is where we're at...

When a client application makes a request to SQL Server there are a few ways in which they can do it:

  1. Submit an adhoc transact-SQL statement
    • This statement can be parameterized and the parameterized statement can be deemed:
      • Safe: These are statements where the optimizer has evaluated the plan and feels that changes to values will not affect the execution plan. And, there are quite a few rules that you must meet for this to be the case. And, as a result, this is NOT something I really rely on and/or aim for (for optimization - I'd rather use stored procedures). You can monitor this through the Perfmon counter: SQLServer:SQL Statistics (Safe Auto-Params/sec).
      • Unsafe: These are statements where the optimizer feels that subsequent values will affect the execution plan. And, for adhoc statements many will be deemed unsafe. You can monitor this through the Perfmon counter: SQLServer:SQL Statistics (Unsafe Auto-Params/sec).
    • This statement might not end up being parameterized (this is not likely and the "failed parameterized statements" count should be low - you can monitor it through the Perfmon counter: SQLServer:SQL Statistics (Failed Auto-Params/sec).
  2. Submit a statement via sp_executesql that parameterizes the statement through defined parameters - this creates a plan that is neither safe nor unsafe but is ALWAYS an exact match for subsequent executions (even when the parameter values change). This is where I generally have problems with it...
  3. Execute a stored proceudre which has one or more of the above in it and/or DSE in it. This is where I think things start to get interesting and this is really the reson for why I started this series on optimizing procedural code.

First, and foremost, I believe that stored procedures are a SIGNIFICANTLY better way to develop high-end applications that absolutely must scale. Don't get me wrong, I understand the benefits of some of the tools out there that can generate your SQL and allow more rapid development of your application. And, if the application isn't trying to handle thousands of inserts/sec then this might work out really well. However, you still have to be careful. Many of these applications (that auto-generate SQL code) rely (very heavily in some cases) on sp_executesql and this can result in poor performance. Additionally, when there's A LOT of adhoc SQL then you can end up with a lot of your cache going to "single-plan use" where quite a bit of it is wasted (NOTE: there's a SQL Server 2008 configuration option [called "optimize for adhoc workloads"] that you really need to know about IF you have a lot of adhoc statements. This can significantly reduce the amount of cache that's wasted on plans that only execute once... I'll do a post on that feature NEXT). Again, while there are quite a few benefits the performance problems can become SEVERE. OK, I've definitely said this before but why am I rehashing this?

Because this is where we're at - we've looked at some of the problems with DSE in my post titled: Little Bobby Tables, SQL Injection and EXECUTE AS. Then, we started to look at EXEC and sp_executesql - how are they different? in the second post. Finally, I started to show *statement-level* recompilation using OPTION (RECOMPILE) in the post titled: Using the OPTION (RECOMPILE) option for a statement. The real reason for why I started this series is to prove: where, why and how stored procedures really shine (we're getting close!).

Reasons to use stored procedures:

  • Centralization - Sharing common code between applications asnd only need to make your changes ONCE! And, there's no possibility of different business rules being enforced in one application and not another and/or for incorrect business rules to be "out there" in your applications.
  • Simplification - Users should not need to know your schema. They shoudn't care that you've denormalized something and they should reap the benenfits through YOUR creating an access method that is simple and works even if you end up changing back and normalizing. The database schema should be isolated from the users...
  • Change control - You should be able to change your schema without breaking applications. This about this one a bit - this is EXACTLY the tactic that SQL Server takes with it's own internal tables. We access them trhough stored procedurs, views (aka catalog views) and functions. If they need to change the underlying schema then we don't need to worry - as long as they also make the sp, function or view return the same way it always has. This has complicated things over time. First, the SQL team has bascially decided that (for backward compatiblity) the sps will never handle new features (and, yes, this is HORRIBLY annoying (and, it's the reason for why I've rewriten sp_helpindex here [and, fyi, I have an even better/newer/shinier version coming soon]). And, they've had to add functions like DATABASEPROPERTYEX for the "extended" properties that were added around recovery models in SQL Server 2000. However, as long as the funtionality of the interface stays the same then our applications don't break. That's incredibly important and probably one of my favorite reasons to create some separation between the schema and the users (and/or interface).
  • Security - This is HUGE. And, it requires an understanding of object chaining and permissions. The general idea is that when objects are all within the same schema (and therefore have the same owner) that when execution rights are given on a stored procedure the caller does not need explicit permissions to the base objects (execution is essentially granted because the ownership chain is not broken). Another way to think about it is that your granting rights to execute a "process" and one that you have control (and ownershp) over all of the object involved. So, SQL Server only requires execute rights on the sp. However, this ONLY goes for explicity statements and not statements that are built dynamically. For those, the CALLER needs explicit permission OR the procedure needs to be created with EXECUTE AS. So, if you plan to use EXECUTE AS be sure to read my post titled: Little Bobby Tables, SQL Injection and EXECUTE AS.

But, more than anything else - you CAN use them for better performance. No, it's not solely because the might have an already compiled (and therefore optimized) plan that's available at runtime (yes, that CAN help) but compilation (or recompilation) can often be a small portion of runtime compared to the execution of a bad plan. And, this is where optimizing the procedural code can help. The bad news is that you have to know what you're doing. And, I could make some arguments that SQL Server could do some things better here but what I've been doing with this series is getting you familar with the options that will finally come together in the next a post titled: Stored Procedure Performance. (NOTE: I'll update this with a corrected title/link once the post is completed. I will try to get this this within the next couple of days.)

Thanks for reading!!
kt

This event is scheduled for TOMORROW (April 21) at 7:30am PDT. So, if you haven't already put this on your schedule - you might want to! And, if you miss it, a replay will be available for 30 days beginning one hour after the end of the call.

Here's the official press release. And, here's the official SQL Server Virtual Pressroom.

Enjoy,
kt

In addition to the SQL Server 2008 Database Infrastructure and Scalability content that Paul and I delivered for SQL Server 2008, and the content that our colleague Bob Beachemin (blog) delivered (Clinic 10164: Essential SQL Server 2008 for Developers), our colleague Stacia Misner (blog|twitter) also delivered content for SQL Server 2008 titled and available as: Clinic 10261: Introduction to SQL Server 2008 BI Solution Development.

From here - there are many directions in which you can go. For SQL Server 2008 R2 (which is primarily a BI release - but, there are a few other topics on which to focus), you can check out these FREE eLearning classes:

And, I'm sure more Microsoft eLearning resources will continue to become available on SQL Server 2008 R2 given that it will release next month!

Finally, last, but definitely not least - consider checking out the new eBook Introducing Microsoft SQL Server 2008 R2 written by our good friends Ross Mistry (blog|twitter) and Stacia Misner (blog|twitter).

Enjoy!
kt

In addition to the SQL Server 2008 Database Infrastructure and Scalability content that Paul and I delivered for SQL Server 2008, our colleague Bob Beachemin (blog) also delivered complementary content under the Developer track. This course is officially titled and available as: Clinic 10164: Essential SQL Server 2008 for Developers.

Additionally, this content has been updated for SQL Server 2008 R2. The entire "jumpstart" content has been managed by Roger Doherty (blog) and he created a very comprehensive page at Channel 9 for this which is titled and available as: Microsoft SQL Server 2008 R2 Update for Developers Training Course. He's done a few blog posts about demos and content so be sure to check out his blog as well!

And, I still have a bit more coming on resources/content!

Enjoy,
kt

When SQL Server 2008 was in beta, Paul and I worked on a project to create content that would "jumpstart" folks in learning and working with SQL Server 2008. We delivered these "new features" courses at various locations and finally, we recorded the content. All of this content is now available FOR FREE from Microsoft through their eLearning outlet. This course is officially titled and available as: Clinic 10259: SQL Server 2008: Database Infrastructure and Scalability.

There are 12 hours of lecture and demo content... lots of good stuff.

Modules & Lessons

Availability Enhancements

  • Module overview
  • Demo: Database Mirroring automatic page repair
  • Database Mirroring enhancements continued, automatic page repair, log stream
  • Backup compression and using Transparent Data Encryption (TDE) to encrypt data
  • Hot-add CPU, DBCC CHECKDB enhancements, failover clustering enhancements
  • Peer-to-peer replication enhancements
  • Demo: Peer-to-peer replication

Policy-Based Management and Multi-Server Administration

  • Module overview
  • Central management server
  • Demo: Central management server
  • Central management server metadata, multi-server script execution
  • Demo: Multi-server script execution
  • Policy-based management overview, conditions, facets and policies
  • Policy-based management demo: Facets and conditions
  • Applying facets and conditions to policies, policy enforcement
  • Demo: Applying facets and conditions to policies
  • Policy based management architecture and execution
  • Demo: Applying policies to multiple servers
  • Policy based management advanced topics

Performance Data Collection

  • Module overview, management data warehouse
  • Demo: Creating and configuring a management data warehouse
  • Data collection overview and architecture
  • Demo: Configuring system data collection on an instance
  • Demo: Transact-SQL custom data collection
  • Data collection reports and module summary

Resource Management and Troubleshooting Enhancements

  • Module Overview
  • Demo: Resource Governor
  • Extended events, predicates, actions, targets, types, maps, sessions

Security Enhancements

  • Module overview
  • Transparent data encryption
  • Demo: Transparent data encryption
  • Backups and transparent data encryption, transparent data encryption limitations
  • Integrated authentication enhancements, SQL Server 2008 auditing improvements
  • Demo: SQL Server 2008 all actions audited
  • Security Summary

Management Implications of New Features Part 1

  • Module overview, spatial indexes, integrated full-text search
  • Problems addressed by sparse columns
  • Demo: Sparse columns and column sets
  • Sparse columns and filtered indexes, best practices for sparse columns
  • Demo: Constructing tabular data sets with sparse columns
  • Limitations of sparse columns and interaction with other features
  • Overview of filtered indexes
  • Demo: Using filtered indexes with sparse columns
  • Filtered statistics, monitoring filtered indexes and statistics
  • Demo: Using filtered indexes for improved index coverage

Management Implications of New Features Part 2

  • Change tracking overview, change tracking best practices
  • Change data capture (CDC) overview
  • Demo: Change Data Capture
  • Blob storage with Filestream overview, enabling Filestream
  • Filestream security, filestream and transactions, monitoring filestream storage
  • Demo: Filestream manageability demo

Scalability Enhancements

  • Data compression overview, data compression trade-offs
  • Demo: Estimating data compression space savings
  • Estimating data compressing space savings, choosing row compression
  • Demo: Enabling compression and comparing performance
  • Enabling and disabling compression, data compression options and syntax
  • Service broker improvements, Service Broker conversation priority
  • Query optimizer enhancements, MERGE overview
  • Forcing query optimizer plans with plan guides
  • Demo: Using plan guides
  • Star join optimizer improvements
  • Partitioning enhancements overview, partition-aligned indexed views
  • Partitioning and lock escalation enhancements
  • Demo: Improving concurrency with partition lock escalation
  • Module review

There are a few more resources to mention. I've got a few more posts coming on this right away!

Enjoy!!
kt

Last week we completed another SQL Server Magazine/SQLConnections conference and already, it's time to start planning for the Fall show. Once again, Paul and I will be managing the SQL side of the conference and we'd like to invite you to submit abstracts for sessions at the conference. The Fall 2010 SQL Connections conference will be held in Las Vegas from November 1st through 5th (2010).

The conference will focus heavily on SQL Server 2008. Abstracts are still welcome on best practices for SQL Server 2005 and how to upgrade and migrate applications from SQL Server 200x to SQL Server 2008. However, we will consider all topics - from design to performance to troubleshooting to architectures to programming. Also, we're likely to closely follow our past conference format with three primary themes/tracks: Infrastructure and Scalability, Programming, and Business Intelligence. You should aim for the intermediate to advanced audience for any SQL Server 2005 materials and for SQL Server 2008 we'll consider any level - especially depending on the newness of the release. We're also looking at creating a "Foundations" track. It's titled: .NET Foundations but I think there's much that can be said for SQL topics and/or best practices in SQL for .NET developers.

For submitting session abstracts, please use this URL: 
http://www.deeptraining.com/devconnections/abstracts

The tool will be open from now until to April 26th (yes, just one week!), after which we won't accept any abstracts - no exceptions. If you have used this site before and have forgotten your password, you can have your secret word mailed to you. It's better to do this then to create a new account.

Please keep each abstracts under 200 words and in one paragraph. Please do not use bulleted items or line breaks, and please use a spell-checker. If you have an issue with the site, please follow up with an email to paul@sqlskills.com. Comments are disabled for this post.

Please create a new speaker record (if you haven't submitted before) and submit at least 3 abstracts; however, it will help your chances of being selected if you submit 5 or 6 abstracts. Please do not submit abstracts for sessions that you have previously presented at the Fall show.

What you will get if selected:
- $500 per conference talk. (Additional compensation for pre/post conference workshops.)
- Coach airfare and hotel stay paid by the conference
- Free admission to all of the co-located conferences
- Speaker party
- The adoration of attendees
-
etc.
 
Thanks for your continued support for our SQL Connections conference and we look forward to seeing your bright and shiny *new* abstracts!

Thanks,
Kimberly L. Tripp & Paul S. Randal
SQL Connections Conference Chairs

I started this series with the post titled: Little Bobby Tables, SQL Injection and EXECUTE AS. I then moved to discussing some of the differences with the post titled: EXEC and sp_executesql - how are they different?

Today, I want to address a few of the comments as well as continue with a few tips and tricks using these commands.

First off - could we have helped the performance of the sp_executesql statement?

Yes...

If we know that a statement returns a varying amount of data (based on the parameters supplied) then we can use the SQL Server 2005 feature WITH RECOMPILE to tell SQL Server that the statement being executed should have it's own plan created and that prior plans (if they exist) should not reuse the statement. It also tells SQL Server that this particular plan is a "single-use plan" that should not be reused for subsequent users. To see the combination of all of these things - I'll use some of the DMVs that track plan cache and plan utilization.

DBCC FREEPROCCACHE
go

SELECT st.text, qs.EXECUTION_COUNT --, qs.*, cp.*
FROM sys.dm_exec_query_stats AS
qs
CROSS APPLY sys.dm_exec_sql_text(sql_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(plan_handle) AS cp
WHERE st.text like
'%FROM dbo.member%'
go

Right now, this query returns 0 rows.

I'll execute the following and then recheck the plan cache:

DECLARE @ExecStr nvarchar(4000)
SELECT @ExecStr =
'SELECT * FROM dbo.member WHERE lastname LIKE @lastname'
EXEC sp_executesql @ExecStr, N'@lastname varchar(15)', 'Tripp'
go

Now, we have a row for our parameterized query plan:

text                                                                                                                                                            EXECUTION_COUNT
(@lastname varchar(15))SELECT * FROM dbo.member WHERE lastname LIKE @lastname              1

So, what is this showing us... it's showing us that there's a plan in cache for this statement. And, if we're interested in seeing the plan, we can remove the commented out cp.* in the query above to get the cp.query_plan column. It shows the following:

<ShowPlanXML xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan" Version="1.1" ...LOTS OF OTHER STUFF THAT I... </ShowPlanXML>  (here's the FULL text of this: query plan xml.sqlplan (11.98 kb))

OK, I know that's not overly helpful... but, you're more than welcome to copy it and save it as a .sqlplan file. Once saved as a .sqlplan file - you can open it directly into SQL Server 2005 or SQL Server 2008's Management Studio. However, the very *nice* feature of SQL Server 2008's Management Studio is that a single click on an XML showplan - will go DIRECTLY into a graphical query plan window:

 

And, once again, we see the optimal plan (to use the index and do a bookmark lookup) because this query is highly selective (only 1 row).

We'll execute the exact same statement again - using the value of Anderson just to get setup to the point where we were last week:

DECLARE @ExecStr nvarchar(4000)
SELECT @ExecStr = 'SELECT * FROM dbo.member WHERE lastname LIKE @lastname'
EXEC sp_executesql @ExecStr, N'@lastname varchar(15)', 'Anderson'
go

And, we see that it uses the EXACT same plan (looking at showplan). In fact, we can see that from checking our plan cache as well:

text                                                                                                                                                            EXECUTION_COUNT
(@lastname varchar(15))SELECT * FROM dbo.member WHERE lastname LIKE @lastname              2

And, while we know that this plan (to use the index) is good for the highly selective value of 'Tripp' it's is not good for the value of Anderson as there are many rows that match. If we suspected this and/or knew this when we were executing (from the client) then we could use OPTION (RECOMPILE) to force SQL Server to get a new plan:

DECLARE @ExecStr nvarchar(4000)
SELECT @ExecStr = 'SELECT * FROM dbo.member WHERE lastname LIKE @lastname OPTION (RECOMPILE)'
EXEC sp_executesql @ExecStr, N'@lastname varchar(15)', 'Anderson'
go

The query plan as a result of this execution is:

And, this is the more optimal plan given this parameter. At ths point, the questions (IMO) are:

  1. Would someone really be able to programmatically guestimate that a parameter submitted from the client is "atypical" and/or that it warrants recompilation? And, I guess I can answer yes - for some parameters - like those with a wildcard. But, if we're just talking about two different values against a single column, this would be guessing statistics of the data.
  2. Should this particular statement always be executed to recompile and never save a plan?
  3. What did SQL Server do with the plan itself?
I'll go with answering #3 first as that one is easy to answer. Using the same statement, I'll again query the plan cache:

text                                                                                                                                                            EXECUTION_COUNT
(@lastname varchar(15))SELECT * FROM dbo.member WHERE lastname LIKE @lastname              2

Even though this is the third time we have executed this statement, this final execution was NOT put in cache. It was used solely for the execution with OPTION (RECOMPILE). And, it will NOT affect future executions. If I go back and execute without the OPTION (RECOMPILE) then I will get the prior plan (to use the index).

Now, the other two questions - these are a lot more interesting and this is where I think that stored procedures should be used. Personally, I think that developers that know the data and know the application - will be a lot better at creating the RIGHT code especially when they understand all of the options available to them.

Here's the way I think about stored procedures and recompilation:

  • If I know that a particular statement always returns the same number of rows and uses the same plan (and, I'd know this from testing), then I'll create the stored procedure normally and let the plan get cached.
  • If I know that a particular statement wildly varies from execution to execution and the optimal plan varies (again, I should know this from testing multiple sample executions), then I'll create the stored procedure normally and I'll use OPTION (RECOMPILE) to make sure that the statement's plan is not cached or saved with the stored procedure. On each execution that stored procedure will get different parameters and the particularly nasty statement will get a new plan on each execution.

However, this is also where things get more challenging. I've often seen stored procedures where people try and use conditional logic to break up the different types of parameters and this doesn't usually work out as well as expected (I'll blog this next). And, this is always where some decide that they want to dynamically build the statement that gets executed - now, we need to determine whether or not we should use sp_executesql or EXEC. And, there are really a couple of options at this point. Ultimately, in one or two more posts - I'll finally show you where EXEC is a clear winner over sp_executesql because even the OPTION (RECOMPILE) doesn't always help ALL kinds of plans (and especially one of the more common types of plans I see).

So, I'm getting closer... at least two more to go here. Thanks for the great questions/comments!
kt

In my last post: Little Bobby Tables, SQL Injection and EXECUTE AS, I wanted to highlight how to avoid SQL Injection when using EXEC. A few people brought up the point that I could have avoided SQL Injection by using sp_executesql. And, while sp_executesql can be used in some cases, it's not always the most ideal from a performance perspective. So, to really show this, I'm going to start with focusing on the similarities and differences of EXEC and sp_executesql. I'll start here with some performance details but you'll find that I'm going to just hit the tip of the iceberg with this one. I'll definitely need another post or two!

First, a quick overview:

sp_executesql (also known as "Forced Statement Caching")

  • Allows for statements to be parameterized
  • Only allows parameters where SQL Server would normally allow parameters; however, this string can be built using forms of dynamic constructs. I'll give more details on this in additional posts.
  • Has strongly typed variables/parameters - and this can reduce injection and offer some performance benefits!
  • Creates a plan on first execution (similar to stored procedures) and subsequent executions reuse this plan

EXEC  (also known as "Dynamic String Execution" or DSE)

  • Allows *any* construct to be built
  • Treats the statement similarly to an adhoc statement. This means that the statement goes through the same process that adHoc statements do - they are parsed, probably parameterized and possibly deemed "safe" for subsequent executions to re-use
  • Does not have strongly typed parameters in the adhoc statement and therefore can cause problems when the statements are executed (I have ways around this)
  • Does not force a plan to be cached.
    • This can be a pro in that SQL Server can create a plan for each execution
    • This can be a con in the SQL Server needs to recompile/optimize for each execution

Let's start with using sp_executesql to parameterize a query where SQL Server would also allow parameters:

DECLARE @ExecStr nvarchar(4000)
SELECT @ExecStr =
'SELECT * FROM dbo.member WHERE lastname LIKE @lastname'
EXEC sp_executesql @ExecStr, N'@lastname varchar(15)',
'Tripp'
go

Because 'Tripp' is a highly selective name, SQL Server uses an index to the lookup of the data:

So, for our next execution, I'll supply a different lastname - a lastname of Anderson. In this database, Anderson is NOT highly selective:

DECLARE @ExecStr nvarchar(4000)
SELECT @ExecStr =
'SELECT * FROM dbo.member WHERE lastname LIKE @lastname'
EXEC sp_executesql @ExecStr, N'@lastname varchar(15)',
'Anderson'
go

However, the query plan looks exactly the same:

Or does it? It turns out that this query plan does look a tiny bit different but it's not very obvious... it's in the thickness of the lines. In this second execution it clearly shows that more data is being passed between the steps. But, is this a bad thing? Maybe, maybe not. Let's drill in a bit deeper. If I hover over the Index Seek (on member.test), I can see the following tooltip:

The key point here is that it shows an "Estimated Number of Rows" of 1.96 but an "Actual Number of Rows" of 385. That's pretty far off... why? Because this statement's plan was determined by the first execution of sp_executesql. Let's try another execution:

Because the query has LIKE in it, we can use wildcards. And, let's do that! This time I'll supply a wildcard of %e%:

DECLARE @ExecStr nvarchar(4000)
SELECT @ExecStr =
'SELECT * FROM dbo.member WHERE lastname LIKE @lastname'
EXEC sp_executesql @ExecStr, N'@lastname varchar(15)',
'%e%'
go

And, again, the query plan looks the same... but with even thicker lines. Yes, I realize... this is not blindingly obvious:

And, by turning on SET STATISTICS IO ON, we can also review the [Logical] IOs performed:

For 'Tripp' 

Table 'member'. Scan count 1, logical reads 8, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

For 'Anderson'

Table 'member'. Scan count 1, logical reads 772, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

For '%e%'

Table 'member'. Scan count 1, logical reads 10019, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

This table only has 144 pages. In the cases of 'Anderson' and '%e%', SQL Server would have been better off doing a table scan. But, because I used sp_executesql I forced SQL Server to do caching. Subsequent users use the plan whether it's optimal or not.

And, there are even more complicated scenarios than this. I plan to keep tackling these issues over the next few days and I'll add quite a bit more to this. However, it's late. So, I'll dangle the carrot for now. Within the next couple of days, I'll show a series of EXEC statements that generate both SAFE and UNSAFE plans.

Thanks for reading!
kt

OK, I know many of you have seen this before (an oldie, but a goodie!):

(image from xkcd.com, with "copy and share" license described here: License)

But, what can you do to prevent this? And, when would this even be possible?

This is possible when DSE (dynamic string execution) occurs. There are still some VERY relevant and important reasons to use DSE and some are performance related (ok, this is another post for another day) but suffice it to say - I use DSE but I also know how to prevent Little Bobby Tables from running amuk within my database.

(April 5 note: after a few comments about sp_executesql, I've modified the following paragraph to explain a bit more... I'm *very* aware of sp_executesql and while it does have some benefits, it also has some negatives! I promise, this is coming soon in another post. I really want this post to focus solely on how to minimize injection when you do NEED to use a dynamically constructed string and there are times when this is the ONLY way to solve certain performance problems. sp_executesql - while often compared to EXEC as a better choice just isn't ALWAYS better and really it's not a simple comparison as they are not equivalent!! In fact, there are some cases where I would specifically avoid sp_executesql... So, I promise... this will be coming up in a blog post soon!! Great comments! Thanks!! :))

So, while the discussion on when, where and why you might use DSE is another one - especially when compared with sp_executesql (yes, I hear another blog post coming - and, well, I'm kind of on a roll right now :)... I want to give you a couple of very cool tips/tricks to reduce the possibilities of problems if/when you do use it (and, there REALLY are places where it's necessary!). It's especially important as there are a few things that many folks just don't even know exist. And, there are 3 parts to this article... I would expect that most apps need QUOTENAME() and/or REPLACE() but ALL apps should consider my recommendations/comments for EXECUTE AS - if/when you're actually using DSE (through EXEC).

  • QUOTENAME()
  • REPLACE()
  • EXECUTE AS

Part I: QUOTENAME()

In a question/thread (this is many moons ago), the following question was asked (NOTE: I have "tweaked" the code to highlight ONLY the problem here AND I've made it so that it won't execute):

I have a procedure that creates a database (below). When people use spaces in their database names it fails. How can I make it work if they supply spaces?

CREATE PROCEDURE dbo.CreateDBProc
(
     @DBName
sysname
)
AS
DECLARE
@ExecStr nvarchar(2000
)
SELECT @ExecStr = N'CREATE DATABASE ' + @DBName
-- + all of the other stuff to place the files, etc...
SELECT @ExecStr
--EXEC(@ExecStr)
go

And, there was a reply to this question where the answer was (and I am NOT recommending this answer. Please do NOT stop reading here otherwise you are asking for a world of pain - and LOTS of SQL Injection):

Just go ahead and do the following. By putting brackets around the name you will solve this problem.

CREATE PROCEDURE dbo.CreateDBProc
(
    
@DBName
sysname
)
AS
DECLARE
@ExecStr nvarchar(2000
)
SELECT @ExecStr = N'CREATE DATABASE [' + @DBName + N']'
-- + all of the other stuff to place the files, etc...
SELECT @ExecStr
--EXEC(@ExecStr)
go

OK, this is a SCARY recommendation. Not just because it won't always work (what if someone actually wants [don't get me wrong - I don't know WHY they'd want to do this but...] to put [brackets] in the name?) but because it's prone to SQL Injection.

This won't work:

EXEC dbo.CreateDBProc N'This is the most stupid :) name I can think of with [brackets] and all sorts of junk! in the name'

And, this is just scary what I can do:

EXEC dbo.CreateDBProc N'fakedbname] DROP DATABASE foo--'

EXEC dbo.CreateDBProc N'[fakedbname] EXEC(''CREATE PROC Test AS SELECT * FROM pubs.dbo.authors'') DROP DATABASE foo --'

So, what should we do?

ANSWER: If you're building strings where identifiers are parameterized (and, there are many reasons for why this many not be the best idea - but, if you absolutely must), USE QUOTENAME()!

So, here's the correct way to code this procedure so that an identifier cannot be subjected to SQL Injection:

CREATE PROCEDURE dbo.CreateDBProc
(
    
@DBName
sysname
)
AS
DECLARE
@ExecStr nvarchar(2000
)
SELECT @DBName = QUOTENAME(@DBName, N']'
)
SELECT @ExecStr = N'CREATE DATABASE ' + @DBName
-- + all of the other stuff to place the files, etc...
SELECT @ExecStr
--EXEC(@ExecStr)
go

Now, no matter what messed up name someone actually WANTS to create - this will protect it such that SQL Injection is not possible. You can try any of these combinations with this version of the proc and while creating a database actually named "[fakedbname] DROP DATABASE foo --" probably isn't desirable, dropping a database isn't either (and I would argue is MUCH worse).

All of these executions are "protected" from SQL Injection:

EXEC dbo.CreateDBProc N'this is my test database name'
EXEC dbo.CreateDBProc N
'[fakedbname] DROP DATABASE foo --'
EXEC dbo.CreateDBProc N
'[fakedbname] EXEC(''CREATE PROC Test AS SELECT * FROM pubs.dbo.authors'') DROP DATABASE foo --'
EXEC dbo.CreateDBProc N'This is the most stupid :) name I can think of with [brackets] and all sorts of junk! in the name'

Note: The first and second execute (if you actually execute instead of just SELECT @ExecStr) *and* create databases with these ugly names. The third and fourth don't execute because some of the injected characters are not supported for file/directory names. However, the most important point is that *all* are protected from SQL Injection!

Part II: REPLACE()

But, what if the string you need to work with isn't an identifier? Can you still use QUOTENAME()? And, the answer is YES, IF the string is compatible with SYSNAME which is an nvarchar(128). However, if you need longer strings or if you are protecting a string which is not an identifier - what else can you do? Well... it's definitely NOT as pretty but it works. The answer here is to use REPLACE. If I had wanted to do this with the procedure above, I would have had to do the following:

CREATE PROCEDURE dbo.CreateDBProcQuotes
(
     
@DBName sysname
)
AS
DECLARE
@ExecStr nvarchar(2000)
SELECT @DBName = REPLACE(@DBName, N'''', N'''''')
SELECT @ExecStr = N'CREATE DATABASE ' + @DBName -- + all of the other stuff to place the files, etc...
SELECT @ExecStr
--EXEC(@ExecStr)
go

However, this will NOT work here because an identifier must be properly quoted within the string and there are only 2 ways to "quote" identifiers. One is the [bracket] and the other is "double quotes" not single. Instead, this is what the procedure will look like (nope! we don't need REPLACE):

CREATE PROCEDURE dbo.CreateDBProcQuotes
(
     
@DBName sysname
)
AS
DECLARE
@ExecStr nvarchar(2000)
SELECT @ExecStr = N'CREATE DATABASE [' + @DBName + N']'-- + all of the other stuff to place the files, etc...
SELECT @ExecStr
--EXEC(@ExecStr)
go

And, SQL only supports brackets by default (without any special settings). To allow double quotes, you can turn on QUOTED_IDENTIFIER (SET QUOTED_IDENTIFIER ON) to allow double quotes instead of brackets but this has a few negative consequences (See SET Options That Affect Results) and personally, I always try to avoid making SET options changes in most of my stored procs.

CREATE PROCEDURE dbo.CreateDBProcQuotes
(
     
@DBName sysname
)
AS
SET QUOTED_IDENTIFIER ON
DECLARE
@ExecStr nvarchar(2000)
SELECT @DBName = REPLACE(@DBName, N'''', '''''')
SELECT @ExecStr = N'CREATE DATABASE "' + @DBName + N'"'-- + all of the other stuff to place the files, etc...
SELECT @ExecStr
--EXEC(@ExecStr)
go

Once again, REPLACE isn't needed. And, to be honest, this becomes a nightmare with " [double quotes] or actual [brackets] in the name. QUOTENAME() deals with those properly. So, with an identifier you really need QUOTENAME(). So, where does replace come in? With actual strings in your DSE. Imagine that you're wanting to pass in a first name but something about your application requires that this happen through DSE (and, this CAN definitely happen - mostly for performance purposes). Don't get me wrong, I just don't have the ability to do all of this within one post but simply put, because of how stored procedures create their execution plans there are OFTEN reasons for why you might want a string to be built dynamically. No, probably not one quite this simple but I'm trying to keep this particular example simple. This is probably what it would look like to start:

CREATE PROCEDURE GetMembers
(
      @FirstName nvarchar(50
)
)
AS
DECLARE
@ExecStr nvarchar(2000
)
SELECT @ExecStr = N'SELECT * FROM dbo.member WHERE FirstName = ' + @FirstName

SELECT @ExecStr
-- EXEC(@ExecStr)
go

And, on first test:

EXEC GetMembers N'Kimberly'

And, on first test it generates: SELECT * FROM dbo.member WHERE FirstName = Kimberly which gives you an error because the firstname isn't properly quoted. So, you think... oh, I need to quote these in my EXEC statement:  

ALTER PROCEDURE GetMembers
(
    
@FirstName nvarchar(50
)
)
AS
DECLARE
@ExecStr nvarchar(2000
)
SELECT @ExecStr = N'SELECT * FROM dbo.member WHERE FirstName = ''' + @FirstName +
N''''
SELECT
@ExecStr
--EXEC(@ExecStr)

go

EXEC GetMembers N'Kimberly'

Ah, that works... for awhile. Then, someone complains that it doesn't work for C'Anne (I just met someone named C'Anne - pronounced SeaAnne) this past week.

EXEC GetMembers N'C''Anne'

And, it generates:

SELECT * FROM dbo.member WHERE FirstName = 'C'Anne'

which gives you a syntax error.

Msg 102, Level 15, State 1, Line 1
Incorrect syntax near 'Anne'.
Msg 105, Level 15, State 1, Line 1
Unclosed quotation mark after the character string ''.

Darn. But, at least this is how you catch it. Instead of with Little Bobby Tables. What if this gets executed:

EXEC GetMembers N'''Robert''); DROP TABLE Students; --'''

Luckily, I got:

Msg 3701, Level 11, State 5, Line 1
Cannot drop the table 'Students', because it does not exist or you do not have permission.

But, what if there had been a table called Students AND the context of the execution was such that the could execute this? Well, that's what's proposed by the cartoon and well... I've *unfortunately* heard of that happening!

So, what do you do? First, you need to protect the string being inserted! This is what you need:

ALTER PROCEDURE GetMembers
(
     @FirstName nvarchar(50
)
)
AS
DECLARE
@ExecStr nvarchar(2000
)
SELECT @ExecStr = N'SELECT * FROM dbo.member WHERE FirstName = ''' + REPLACE(@FirstName, N'''', N'''''') +
N''''
SELECT @ExecStr
--EXEC(@ExecStr)

go

And, this requires that the string be surrounded with quotes AND the string submited have all single quotes replaced with '' [single quote followed immediately by a second single quote]). Yes, I realize that this is starting to look ugly but... what happens with Little Bobby Tables? We get his row. NOTE: We do have other potential performance problems here because the parameter (when it gets thrown back out into the string - is no longer an NVARCHAR, it's only a varchar). So, we should also consider adding an N into the string like the following:

SELECT @ExecStr = N'SELECT * FROM dbo.member WHERE FirstName = N''' + REPLACE(@FirstName, N'''', N'''''') + N''''

And, what if the string being built has numerics or other types - what should we do there? To be honest, I often EXPLICITLY type the variable in what looks like a completely unnecessary CONVERT. However, for performance purposes I *often* do this. Ugh, the future posts are REALLY starting to build from here! Again, I'm not hitting performance in this one so I'm not going to take the bait of the tangent. Another time for sure!

Finally, what if our strings are REALLY complex and we need to do some strange stuff and well... we're not sure that the string can be completely protected? What if we're passing in a much more complicated "piece" of a WHERE clause or a full and customized ORDER BY? What if it's not just a SINGLE value? Ugh... this is much harder but you definitely need EXECUTE AS.

Part III: EXECUTE AS

This is the ultimate in protection for your DSE code. This is where you decide that you're going to allow virtually anything in the WHERE clause or ORDER by but you really want to protect your database. Here, what I need is EXECUTE AS. And I'm not a big fan of this kind of code - but, again, I'm not looking at perf or anything here... I'm *just* looking at the security and potential of the injection.

SQL Server 2005 introduced EXECUTE AS and it has a lot of options - in fact, 4 options:

  • EXECUTE AS Caller
  • EXECUTE AS User
  • EXECUTE AS Self
  • EXECUTE AS Owner

EXECUTE AS Caller is the default. This means that the user that executes the procedure must have the DIRECT rights associated with the Dynamically Executed String. And, so, this is good. This should protect MOST cases altogether. However, what if the users have higher privileges? Well, I'd say that there's already something wrong with your database (because this should NEVER be the case) but... well... I've seen apps (and even RECENTLY) heard of apps that connect as SA. Yes, seriously. So... what can they do... Well, much worse that DROP TABLE. How about DROP DATABASE. Or, well, let's just say that if I found vulnerabilities in that - I could possible get outside of their SQL Server and do even more damage - to files, to the network. So... this is REALLY REALLY REALLY REALLY BAD. I would STRONGLY suggest that anyone that connects as SA - IMMEDIATELY WORK HARD TO CHANGE THIS.

EXECUTE AS Owner is almost as bad (although nothing's as bad as connecting as SA) - especially when the OWNER has elevated rights. So, I'd try to avoid EXECUTE AS Owner unless you SEVERELY restrict who has permissions to the procedure in general. But, don't EXECUTE AS Owner for a stored procedure that's granted EXEC to public. That completely defeats the purpose!

EXECUTE AS Self is a bit strange. But, when a stored procedure is created in a schema, the ownership chaining and privileges follow from the OWNER of the schema. So, what if the author of the stored procedure has rights that the owner doesn't. No, I don't really recommend this but that's where "self" comes in. This is where the procedure can execute under the context of the actual creator (not the owner). However, often these are the same and often they're DBO. This is even worse... Although still, not as bad as connecting as SA.

EXECUTE AS User (where the User is a low-privileged user with very few rights) is good. And, in fact, this will be the answer to this problem!

Yes, even if you connect as SA, I can STILL protect you. Yes, we can still protect this code even when run under the connection context of SA - with EXECUTE AS.

CREATE USER User_GetMembers
WITHOUT
LOGIN
go

GRANT SELECT ON Member TO User_GetMembers
go

ALTER PROCEDURE GetMembers
(
    
@FirstName nvarchar(50)
)
WITH EXECUTE AS N'User_GetMembers'
AS
DECLARE
@ExecStr nvarchar(2000)
SELECT @ExecStr = N'SELECT * FROM dbo.member WHERE FirstName = ''' + REPLACE(@FirstName, '''', '''''') + ''''
SELECT @ExecStr
--EXEC(@ExecStr)
go

And, even if for some reason you can't use REPLACE() this procedure will be limited in what they can do because the user (User_procedurename) has limited rights within the database. In fact, I would recommend that the user ONLY have the necessary rights for THAT stored procedure and I'd even recommend that you create one user for each stored procedure (that has DSE). Yes, that's potentially a lot of users... but, better than the alternative(s)!

Summary/Resources

Finally, please be sure that you're doing some form of auditing if you're using EXECUTE AS. And, you should also do a quick check to see if anyone is ELEVATING user's rights to a more privileged user. Check out these two blog posts for additional information:

And... ah... I'm done! There's a lot here but a lot of tips. You CAN prevent SQL Injection.

Thanks for reading!
kt

In our Boston SQL Training workshop today we're discussing disaster recovery and minimizing data loss. One of the key discussions is around backup/restore so we starting discussing resources and found that there just isn't a great single place where a bunch of resources are listed (ok, maybe there is but we didn't find one) so, while Paul's (blog|twitter) lecturing for a bit - I thought I'd put together a list for you to use!

Webcasts to learn more about backups and disaster recovery:

  • Check out my 10-part MSDN series and/or my 11-part TechNet series that both have some topics around backup/restore. Both series (and their associated blog posts) can be found here. Specifically, check out:
    • MSDN Series, Part 1: Creating a Recoverable Database (Level 200) [NOTE: This is recorded for SQL Server 2005 but still applies to SQL Server 2008.] 
    • MSDN Series, Part 2: Creating a Reliable and Automated Backup Strategy (Level 200) [NOTE: This is recorded for SQL Server 2005 but still applies to SQL Server 2008.]
    • TechNet Series, Part 10: Recovering from Human Error (Level 200) [NOTE: This is recorded for SQL Server 2005 but still applies to SQL Server 2008.]
  • Check out some of our additional recordings that we've done at conferences such as TechEd EMEA (these are also on our webcasts page)
    • TechEd 2005: SQL Server 2005 VLDB Availability and Recovery Strategies (Level 300), presented by me - here.
    • TechEd IT Pro 2008: Surviving Database Corruption (Level 300), presented by Paul - here. 

Whitepapers:

Articles:

Additional blog posts/topics:

Some other interesting issues - what about the application ecosystem (if you're moving to another instance or even another physical server):

  • Logins, etc.? Check out this KB article to help you migrate the logins: How to transfer logins and passwords between instances of SQL Server.
  • What about server-level roles? While the KB article helps you migrate logins to the secondary server - what about any role membership that you've set? These are NOT migrated! This is something that you may want to prepare beforehand. (thanks to fellow SQL Server MVP, Aaron Bertrand (blog|twitter) for the great discussion!)
  • However, even if you get the logins migrated and the system roles scripted, there any other issues that you could run into! What about jobs, operators, service accounts, external batch files, executables - anything and everything that's not *in* the database but keeps that database "ticking" that's NOT backed up nor restored (er, except for filestream - which is external but *IS* backed up by default). So, these are additional things to prepare and test so that your disaster recovery strategy succeeds. 

So, while there are a lot of resources out there - sometimes they can be difficult to find. In fact, it took me about 2 hours to bring everything together here and most of the resources were mine or Paul's... and, in all honesty, there's even more out there!

Enjoy!

Thanks for reading,
kt

Categories:
Backup & Restore | Resources

Theme design by Nukeation based on Jelle Druyts