In addition to the SQL Server 2008 Database Infrastructure and Scalability content that Paul and I delivered for SQL Server 2008, and the content that our colleague Bob Beachemin (blog) delivered (Clinic 10164: Essential SQL Server 2008 for Developers), our colleague Stacia Misner (blog|twitter) also delivered content for SQL Server 2008 titled and available as: Clinic 10261: Introduction to SQL Server 2008 BI Solution Development.

From here - there are many directions in which you can go. For SQL Server 2008 R2 (which is primarily a BI release - but, there are a few other topics on which to focus), you can check out these FREE eLearning classes:

And, I'm sure more Microsoft eLearning resources will continue to become available on SQL Server 2008 R2 given that it will release next month!

Finally, last, but definitely not least - consider checking out the new eBook Introducing Microsoft SQL Server 2008 R2 written by our good friends Ross Mistry (blog|twitter) and Stacia Misner (blog|twitter).

Enjoy!
kt

When SQL Server 2008 was in beta, Paul and I worked on a project to create content that would "jumpstart" folks in learning and working with SQL Server 2008. We delivered these "new features" courses at various locations and finally, we recorded the content. All of this content is now available FOR FREE from Microsoft through their eLearning outlet. This course is officially titled and available as: Clinic 10259: SQL Server 2008: Database Infrastructure and Scalability.

There are 12 hours of lecture and demo content... lots of good stuff.

Modules & Lessons

Availability Enhancements

  • Module overview
  • Demo: Database Mirroring automatic page repair
  • Database Mirroring enhancements continued, automatic page repair, log stream
  • Backup compression and using Transparent Data Encryption (TDE) to encrypt data
  • Hot-add CPU, DBCC CHECKDB enhancements, failover clustering enhancements
  • Peer-to-peer replication enhancements
  • Demo: Peer-to-peer replication

Policy-Based Management and Multi-Server Administration

  • Module overview
  • Central management server
  • Demo: Central management server
  • Central management server metadata, multi-server script execution
  • Demo: Multi-server script execution
  • Policy-based management overview, conditions, facets and policies
  • Policy-based management demo: Facets and conditions
  • Applying facets and conditions to policies, policy enforcement
  • Demo: Applying facets and conditions to policies
  • Policy based management architecture and execution
  • Demo: Applying policies to multiple servers
  • Policy based management advanced topics

Performance Data Collection

  • Module overview, management data warehouse
  • Demo: Creating and configuring a management data warehouse
  • Data collection overview and architecture
  • Demo: Configuring system data collection on an instance
  • Demo: Transact-SQL custom data collection
  • Data collection reports and module summary

Resource Management and Troubleshooting Enhancements

  • Module Overview
  • Demo: Resource Governor
  • Extended events, predicates, actions, targets, types, maps, sessions

Security Enhancements

  • Module overview
  • Transparent data encryption
  • Demo: Transparent data encryption
  • Backups and transparent data encryption, transparent data encryption limitations
  • Integrated authentication enhancements, SQL Server 2008 auditing improvements
  • Demo: SQL Server 2008 all actions audited
  • Security Summary

Management Implications of New Features Part 1

  • Module overview, spatial indexes, integrated full-text search
  • Problems addressed by sparse columns
  • Demo: Sparse columns and column sets
  • Sparse columns and filtered indexes, best practices for sparse columns
  • Demo: Constructing tabular data sets with sparse columns
  • Limitations of sparse columns and interaction with other features
  • Overview of filtered indexes
  • Demo: Using filtered indexes with sparse columns
  • Filtered statistics, monitoring filtered indexes and statistics
  • Demo: Using filtered indexes for improved index coverage

Management Implications of New Features Part 2

  • Change tracking overview, change tracking best practices
  • Change data capture (CDC) overview
  • Demo: Change Data Capture
  • Blob storage with Filestream overview, enabling Filestream
  • Filestream security, filestream and transactions, monitoring filestream storage
  • Demo: Filestream manageability demo

Scalability Enhancements

  • Data compression overview, data compression trade-offs
  • Demo: Estimating data compression space savings
  • Estimating data compressing space savings, choosing row compression
  • Demo: Enabling compression and comparing performance
  • Enabling and disabling compression, data compression options and syntax
  • Service broker improvements, Service Broker conversation priority
  • Query optimizer enhancements, MERGE overview
  • Forcing query optimizer plans with plan guides
  • Demo: Using plan guides
  • Star join optimizer improvements
  • Partitioning enhancements overview, partition-aligned indexed views
  • Partitioning and lock escalation enhancements
  • Demo: Improving concurrency with partition lock escalation
  • Module review

There are a few more resources to mention. I've got a few more posts coming on this right away!

Enjoy!!
kt

Whenever I setup a new machine, I like to set a few default options in SSMS. Here are my favorites:

Tools, Options

  • Environment
    • Fonts and Colors
      • Text Editor font: Lucida Console (a bit thicker and it's a fixed-width font)
      • Text Editor: Selected Text (under display items)
        • Item foreground: Black
        • Item background: Yellow (looks like a highlighter)
      • Sometimes I'll also make the results windows have larger fonts - especially if it's a presentation machine
    • Keyboard
      • Keyboard scheme: SQL Server 2000 (ok, maybe I'm old-school but the QA keyboard shortcuts still seem a lot more natural than the VS keyboard shortcuts... but, if you're more of a VS person, then stick with Standard)
      • Query shortcuts:
        • Ctrl+F1: sp_helpindex2 (if you highlight an object and then hit Ctrl+F1, then it passes the highlighted object in as a parameter. The only negative is that it doesn't delimit it so you can't highlight schema.object unless it's already quoted for the sp 'schema.object')
        • Ctrl+3: SELECT object_name(object_id) AS ObjName, * FROM sys.dm_db_index_physical_stats(db_id(), NULL, NULL, NULL, NULL) ORDER BY 1, 4
        • Ctrl+4: other common queries/procedures...
  • Text Editor
    • All Languages
      • Line numbers
    • Editor Tab and Status Bar - I usually change the colors of the "Group connections" setting to something very noticeable like Fuschia. I've already blogged why here.
  • Query Execution
    • By default, open new queries in SQLCMD mode (however, if you set this - you will disable Intellisense and Transact-SQL Debugging... and, there's no "hint" or warning that you're doing it.)
  • Query Results
    • SQL Server, Results to Grid
      • Include the query in the result set (this will show you what you executed in the results window while you wait for completion... also confirms what you think you executed. This gets annoying for large batches but is sometimes useful to see what you last executed)
      • Include column headers when copying or saving the results (VERY useful if you're pushing data over to Excel, etc.)
      • Display results in a separate tab (this gives you Tabbed mode instead of split-window mode meaning that results will go to a results Tab instead of the lower half of your window. I prefer this when I have a lot of results to review AND when I'm presenting and typically run with a lower screen res.)
        • Switch to results tab after the query executes (I prefer this so that I'm waiting for the results in the results window)
    • SQL Server, Results to Text
      • I set most of the same settings as Results to Grid (Include column headers, Include the query, Display results in separate tab and Switch to the results tab) BUT, I usually turn off Scroll results as received. If I'm looking at rows at the top of the set (while it's still processing) it often takes my cursor down to the end. I also turn this off in Profiler.

The primary reason for this post though - is to make sure that you realize that one specific option - Tools, Options, Query Execution, "By default, open new queries in SQLCMD mode" - can really surprise you. I used to think that this option was really helpful (and benign) because SQLCMD is a superset of Transact-SQL. So, I typically recommended that you turn it on. However, turning on this option disables Intellisense and the Transact-SQL Debugger. It's documented here: http://msdn.microsoft.com/en-us/library/ms174187.aspx (thanks Paul Mestemaker! :) and I thought this was a bit unintuitive at first but it really does make sense. Both the Transact-SQL Debugger and Intellisense are debugging and/or prompting you with Transact-SQL (not SQLCMD commands). So, disabling it seems right (well, I guess I could argue that it could work with Intellisense but debugging would be much harder). However, I just wish there were some better information in the Tools/Options dialog. I'd like my 2 hours back (trying to figure out which option disabled Intellisense, chatting with my friends on fb to see if they knew, filing a connect bug when I figured out that it was SQLCMD, and then learning from Paul Mest that it's actually expected behavior (duh!), changing the Connect bug to say that the dialog might be better with a warning... it was definitely one of those mornings!  :) :) :)

So, my main point for this post - save you the time I lost in trying to figure out where Intellisense went. It's expected behavior (and it does make sense........now :).

Cheers,
kt

Categories:
Manageability | Tools

A couple of weeks ago I wrote a blog post titled Whose job is it anyway? It's an interesting debate and something I've been hearing more and more - that SQL Server is a "set it and forget it" technology - a black box where you just don't need to know how it works to do well with it. In fact, I've even had a few folks comment that they think it would be better to "roll their own" database rather than have to learn how to work in a "general purpose" database. And, while there are certainly lots of different angle to this debate - one fact remains... if you don't know anything about the database on which you're developing (whether it's SQL Server, mySQL, Oracle, whatever), I *PROMISE* you won't have a truly scalable, optimal solution. Why do you think there are so many knobs? It's because there are so many different ways to work with data. There is more than one way to query, more than one way to design. This is also why every answer to a "how should I do this" question starts with "It depends". And, while that seems like a scary response it's actually a good one. It means that you have lots of options - options that can offer many different pros/cons. And, as a result of knowing these pros/cons, you can make better decisions - decisions that will ultimately determine how well you can scale.

So..... while I don't think this debate will EVER be finished (as to WHOSE job it is to know these things), I do think a lot of folks are seeing the effects of not knowing more about their store (and, again, this is NOT limited to SQL Server in any way, shape or form).

At a minimum, hear the discussion on RunAsRadio with Richard, Greg and I and let us know what you think!

Kim Tripp on the Roles of Developers and DBAs with the Database!

Cheers,
kt

This is a tough topic. It's a big topic and more than any other - I think there are a lot of misunderstandings about what the log is for, why it's so critical and ESPECIALLY when/why it gets extrememly large. Simply put, it gets large when it's not managed correctly. OK, there are times when it can become large - even if it is well managed. But, more often than not, when a transaction log is wildly out of control (orders of magnitude larger than the data itself) it indicates a management/maintenance problem.

There are a lot of places where you can go to find out the technical details behind the transaction log but I'm going to target this blog post to the relatively straightforward easy (no, really easy!!) facts about transaction log maintenance.

What kind of transaction log management is right for YOUR database?

First and foremost, you MUST decide whether or not you need to do log backups. SQL Server *requires* you to make some form of decision. Well, I take that back. They don't tell you anywhere that you need to make this decision but the transaction log can get wildly out of control if you don't (see the next section for more details on this one :)).

Why? Transaction log backups will allow you better recovery options in the event of a disaster. If you create a good backup strategy, you should be able to recover from a disaster very close (possibly even up-to-the-minute) to the time of the disaster. Howevre, you are not required to do log backups. Instead you can do only database-level backups and recover with those. That's fine. There's really nothing wrong with that strategy. However, it does mean that you have a greater potential for data loss. Basically, if you decide that you're doing to do weekly full backups - then you need to be OK with losing everything that's happened since your last full backup. If that's OK, then performing full database backups (and never worrying about the log) is absolutely fine.

However, if you want more granular control and more recovery options (again, possibly even up-to-the-minute recovery - which is transactional recovery up to the time of the disaster), then you MUST add transaction log backups into your disaster recovery strategy.

So, make this decision FIRST:

  1. Am I OK with some data loss? (then you're probably OK with just database-level backups... but, you will need to do something else! be sure to keep reading!!!)
  2. Do I want to minimize data loss to the smallest amount possible? (then you're going to want to AUTOMATE transaction log backups)

But I didn't do anything - why is the log WAY out of control (in terms of size)?

OK, even if you consciously make the decision to ONLY do database-level backups, you are NOT DONE!!! In fact, this is actually what led me to do this post. I found these two (relatively dated but interesting nonetheless) MSDN forum discussions for TFS (Team Foundation Server) databases:

    MSDN Forum discussion "Recommended SQL Maintenance Plan": http://social.msdn.microsoft.com/forums/en-US/tfsadmin/thread/b23f7018-3eaa-4596-96e4-728b02cf6211/ 
    MSDN Forum discussion "Huge log files":
http://social.msdn.microsoft.com/forums/en-US/tfsadmin/thread/605d51f7-23fd-470c-945e-53fa7ed5aa87/

And, I know EXACTLY what happened in ALL of these cases (and MANY more... Paul and I see this ALL the time, in fact). In the "Huge log files" thread, there's a database mentioned (TfsWareHouse) with a 124MB mdf and a transaction log of 61.8GB. It didn't mention whether or not there were other data files but my guess is that there weren't. My guess is that they were completely shocked by why the data portion had grown to a size that's 510 TIMES the size of the database... The reason is actually somewhat simple (no pun intended). If you're not going to do transaction log maintenance (meaning transaction log backups), then you need to tell SQL Server that. (This is the part that's completely unexpected.)

When a database is created, SQL Server runs that database in a "pseudo simple recovery model". (Yes, I know - that didn't help.) What that means is that SQL Server automatically clears inactive records from the transaction log once it knows that it no longer needs them. It no longer needs them to be stored in the log because no one is using the log (i.e. you're not doing ANY backups). However, once you do start to do backups (and, people generally start by doing a full database backup), then SQL Server looks to your recovery model to determine what to do with log records. If the recovery model is set to full (and, yes, this is the default), then SQL Server gives you the "full feature set" with regard to backup/restore. SQL Server is expecting YOU to manage the transaction log by backing it up. Once it's backed up, SQL Server can remove the inactive records from the transaction log (and when you do a transaction log backup, it automatically clears the inactive records by default).

So, there are really two choices - and ONLY two choices here:

  1. Perform transaction log backups as part of your maintenance plan
  2. Change the recovery model to the SIMPLE recovery model so that SQL Server clears inactive transactions from the log automatically

Is there anything else to do for the transaction log? 

Yes! If you decide that you want to do transaction log backups then I would recommend a few things. I'd first recommend reading 8 Steps to Better Transaction Log Throughput and when you decide how large your transaction log needs to be, then also read Transaction Log VLFs - too many or too few?. These two posts will help you to create a more appropriately sized log as well as one that won't be prone to performance problems (such as internal VLF fragmentation).

If you want to learn more about the transaction log, I'd suggest a few of Paul's resources (it's probably because he has such a fantastic tech editor... oh, I'm asking for trouble with this comment!! ;-):

  1. Read Paul’s blog post to his TechNet article on Logging & Recovery. It’s a great article that covers a lot of different aspects of logging. He also did a great short video on why the transaction log grows wildly out of control. Here’s a link to the blog post that pulls all of the TechNet resources together: http://www.sqlskills.com/BLOGS/PAUL/post/TechNet-Magazine-feature-article-on-understanding-logging-and-recovery.aspx.
  2. Read Paul’s blog post to his TechNet article on Database Maintenance. It’s a great overview of all of these maintenance tasks and will give you a good overview of what each one does. Here’s a link to the blog post that pulls all of the TechNet resources together: http://www.sqlskills.com/BLOGS/PAUL/post/TechNet-Magazine-Effective-Database-Maintenance-article-and-August-SQL-QA-column.aspx

OK, so, I think that sums up part III. I think that's the last one in the series for now. I'll go through and explain "The Tipping Point" next. However, I was hoping for more results to my brain teasers (in those two posts)!!

Cheers,
kt

When you decide to rebuild or reorganzie an index, you have an option to set something called FILLFACTOR. This is probably the MOST IMPORTANT thing to understand about index maintenance and reducing fragmentation (especially in databases that are prone to it). Unfortunately, we need to dive into some internals to really understand why this is so helpful...

What is FILLFACTOR?

FILLFACTOR defines how full the leaf level pages of the index are filled when rebuilding or reorganizing an index. The leaf level of an index is the largest structure of an index and it's the most prone to fragmentation. A good analogy would be to use the phone book (ah, loosely for this one) and imagine that new people come to the area. The *entire* phone book needs to get reprinted. Obviously this is terribly expensive and therefore it's not done often (well, we all use the internet anyway...). But, imagine if EACH page of the phone book were only filled to 90%. When new people come to the area we could just add their information - on the specific page where it should reside - and the phone book stays relatively intact and we don't have to kill a bunch of trees (OK, you could argue that you would have had to kill more trees to make a phone book that has 10% whitespace...but, let's not go there right now. :). OK, I know this analogy is weak at best but think back to "a page in the phone book" and think of it representing a "page" within SQL Server (quite literally here - SQL Server actually stores data on 8KB pages and indexes are doubly-linked lists of these 8KB pages).

So, if the pages are full and there's nowhere to put a particular row (let's say someone with a lastname of 'Tripp'), how does SQL Server "insert" this new row? They can't possibly shift all of the data down and rewrite all of the pages AFTER 'Tripp' could they? I suppose they could but it would get REALLY expensive - especially if someone with the last name of 'Anderson' moves to town...

How does SQL Server make room for new rows? They do something called a page split. Now, this can be A LOT more technical than I'm going to explain here but simply put, they split the page (on which the row is to be located) into half. The first 50% of the rows STAYS on the current page and the 50% MOVES to a new page (and this new page is probably nowhere near the page being split). However, they keep this list LOGICALLY in order by maintaining a doubly-linked list. So, SQL Server knows where to go when it's reading the data - even if the pages are not physically/sequentially in order. If this happens once or twice, it's not a huge problem. But as it continues to happen, it makes scans of your data more and more expensive and the process of the split can be expensive in and of itself. The biggest problems are:

  1. The data is not physically in order (and can create lot random IOs when scanning)
  2. The pages are no longer very dense (they've gone from 100% to 50%). Splitting does has a positive side in that it's going to take awhile before the page splits again BUT, splitting is meant as a temporary fix and ideally, it shouldn't be happening to every page. But, this also negatively impacts cache - you're only using 50% of the page that's split so now you're wasting half of that page in memory (and, on disk). So, while making a table 10% wider from the start seems bad; it reduces the need for splitting and therefore your table will stay more intact over time. 

What indexes have fragmentation?

Some indexes are prone to more fragmentation than others. For example, indexes with a high-order element (the first column of the index) that is NOT in ever-increasing order. This does NOT mean that ALL indexes should be created such that their first column is ever-increasing (that's NOT a good way to create indexes!). But, it does mean that you should pay special attention to some of your larger tables and therefore larger indexes (meaning ESPECIALLY the clustered index since the clustered index is over ALL of the columns). 

In terms of index management, here are the main things to think about for each index type: 

Rebuilding the clustered index is the most expensive index to rebuild. It's also the one that you'd rather avoid if you can. It has the most limitations... A clustered index CANNOT be rebuilt as an online operation IF the table has ANY LOB columns in it at all. What this means is that the clustered index - if fragmented - must be rebuilt using an offline operation. This translates into downtime for that table - while it's being maintained. So, if you can create a good clustering key EARLY, then you might be able to minimize clustered index maintenance.

Rebuilding a nonclustered index is much easier and likely to be an ONLINE operation. Because nonclustered indexes contain a much smaller number of columns - and rarely ever include a LOB column - they can be managed/maintained through ONLINE index operations. So, even if these are prone to more fragmentation (and, in general, they are) then it's not as big of a deal.

How do I solve this problem?

So, the BEST thing to do is to visually picture the ORDER of the data in the index (just think of it like a phone book - a LIST of records on PAGES - ordered by the KEY of the index). If you can visualize that structure then you can visualize what happens on an INSERT, UPDATE and a DELETE. Once you have a good visual you should be able to define a FILLFACTOR based on these two things:

  1. How much fragmentation is happening (is it becoming massively fragmented quickly) - then set a lower fillfactor (like 70 or 80 - depending on how big the table is)
  2. How frequently are you going to be running maintenance (if you can do this daily or weekly - for some objects) then you might not need much free space. If you can run maintenance against the table frequently then you won't need as much free space.

And, of course, you can directly check the level of fragmentation your table has by using the DMV: dm_db_index_physical_stats (which I referenced in Part I) and ideally, you'll create a procedure that runs through your objects determining how much fragmentation they have and the appropriate FILLFACTOR. Sometimes the best way to do this is to create your own "management table" which lists all of your tables and then have definitions for when to rebuild, when to reorg and what fillfactor to use. There are a few folks that have done things like this. Check out Ola's site here. He's put a lot of effort into a more robust and flexible combination of options for database maintenance. It might be more than you need but it's a GREAT EXAMPLE of all of the things that you can evaluate and consider wrt to maintenance.

Solving this using Database Maintenance Plans

Finally, if you CHOOSE to use a maintenance plan you ONLY have two options:

Reorganize pages with the default amount of free space
Does NOT sound like what it actually does (IMO). It sounds like it uses some form of SQL Server default value. However, it actually uses the "OrigFillFactor" value that's set for the index. This is actually a good thing IF you've set the FILLFACTOR on the CREATE INDEX statement OR on a rebuild. Reorganizing does NOT allow you to set this ONLY a rebuild does. So, if you have set this then this choice is probably best. If you run a query against sysindexes (yes, the backward compatibility view NOT the new sys.indexes catalog view), then you can see the OrigFillFactor setting for ALL of your indexes. (And, it's probably 0 - which sounds strange... 0 actually means 100... I could explain this but just trust me on this one.)

Change free space per page percentage to _____ %
Well, this is certainly an option you can consider. If you've never set the FILLFACTOR (or, even if you have...), this would overwrite it AND change the OrigFillFactor value to the percent you use here. If you decide to take this "sledgehammer" approach then I'd suggest something quite high - like 90-95% so that you don't end up wasting a lot of space. Some indexes (ones that have a high-order element that IS ever-increasing) don't really need free space and this might be completely wasted. So, I think it's better to set this indivually using stored procedures and/or a more custom maintenance procedure.

NOTE/TIP FROM COMMENTS POSTED: While I explain this concept (above) using the index option/setting (called FILLFACTOR) the UI is wanting the FREE SPACE percentage. So, if you want to use 90-95% FULL then be sure to set this value to 5-10%. Using the syntax, you will set page fullness.

OK, I still have at least 2 more in this series and EVEN THOUGH I've been completely sucked into facebook (as Paul blogged here), I'm still trying to manage to work and save SQLskills. Paul, on the other hand, just uploaded more photos. ;-)

Cheers,
kt

OK, it seems as though there's A LOT of confusion about what steps are required for proper database maintenance. And, it seems as though some recommendations are being given as "quick fixes" without any real recommendation for root cause analysis. I'm not saying that the generalizations are horribly wrong but in many cases they're just too broad and/or unspecific to actually be useful (and, well, in all honesty, some of them are just really bad recommendations because they’re so ambiguous). And, in my random internet trolling for the day, I found 4 different references that I want to go through (which is why this is only Part I). For this post, I’ll focus just on Sharepoint.

First, what did I see that’s motivating this post?
I found the following KB article – which was referenced by numerous sites as recommended reading. And, without knowing a lot about SQL (and, that’s NOT meant as a dig at all – it’s just that most apps that sit on SQL don’t ever even recommend that you need to know SQL and I can argue certain aspects of that point as well BUT, wrt to maintenance, it can really become a problem if you don't know a few things about these tasks), it does seem like good reading: Information about the Maintenance Plan Wizard in SQL Server 2005 and about tasks that administrators can perform against SharePoint databases

Here is the part that over-simplifies picking what maintenance tasks to run vs. what not to run:

DIRECTLY TAKEN FROM THIS KB ARTICLE IS THE FOLLOWING:

We have tested these tasks and the effects that these tasks have on database schema and performance. The following table summarizes the results of these tests.

Task                                        Safe to perform this task?
Check database Yes
Reduce a database Yes
Reorganize an index Yes
Clean up the history Yes
Update statistics Yes. However, this task is unnecessary because the SharePoint Timer service performs this task automatically.
Rebuild an index No. The task does not restore existing index options before the rebuild operation. However, you can use scripts that restore index options.
Note This problem was corrected in SQL Server 2005 Service Pack 2.

We used the following criteria to determine whether a task was safe to perform:

  • Whether the task modified the database schema from its default state
  • Whether the task decreased performanceResults may vary depending on the environment.

However, if you use the Maintenance Plan Wizard to perform the tasks that are listed in the table as "safe to perform," you are likely to experience increased performance in SQL Server 2005.

The big problem is: this is just too little information about too many VERY important tasks!

Let me break this down task by task and give you a few other places to go for more information.

Check database

The check database task refers to DBCC CHECKDB. This is definitely an important part of any maintenance plan. And, it really is a safe task to run as it’s NOT corrective by default. However, there is nothing mentioned about how this command may completely flush your buffer pool as it reads all of the pages of all of the objects it’s checking. So, this might impact performance but, of all of the tasks, this is the safest to run and it’s definitely a recommended task.

If you want to learn more about DBCC CHECKDB, check out these things: 

  1. Read Paul’s blog post to his TechNet article on Database Maintenance. It’s a great overview of all of these maintenance tasks and will give you a good overview of what each one does. Here’s a link to the blog post that pulls all of the TechNet resources together: http://www.sqlskills.com/BLOGS/PAUL/post/TechNet-Magazine-Effective-Database-Maintenance-article-and-August-SQL-QA-column.aspx  
  2. Read Paul’s blog post on Myths around causing corruption – so that you can get better insight into where/why the actual corruptions are occurring. 
  3. Finally, if you’re really interested in the internals of CHECKDB and how it works – Paul’s written a ton about it in his CHECKDB from Every Angle category. FYI, 3 of his 9 years on the SQL Server Development Team were spent writing CHECKDB and repair for SQL Server 2005 (so, he definitely knows how it works J). Here’s the link to the category: http://www.sqlskills.com/BLOGS/PAUL/category/CHECKDB-From-Every-Angle.aspx

Reduce a database 

OK, I’m sure I’ll get a lot of responses to this one but IMO, a database maintenance plan SHOULD NEVER INCLUDE A SHRINK.

Let me explain… J
To be honest, I'm not even a fan of manually running database-level shrinks (DBCC SHRINKDATABASE) either. Don't get me wrong - there are ACCEPTABLE times to shrink parts of a database but, in general, I'd recommend only using DBCC SHRINKFILE for individual file-level shrinks. I wouldn't schedule shrinks nor would I EVER turn on [the database option] autoshrink. I don't think shrinks should EVER be automated - either through the database option OR through maintenance plans.

If you need to do regular shrinks - then it's likely that you have some other problem. And, without DIRECTLY addressing this problem, you *might* be making things worse.

This is a bigger discussion and I’ve found a few other references that I want to pull together. I’ll post another post about this within the next day or so – and link to it from here BUT, for right now…Know this – free space is generally GOOD. Excessive free space has happened FOR A REASON. Maybe there’s a pattern to it but often shrinking is worse than just leaving the free space for the next data explosion (a bunch of data comes in, the database grows, the data is archived, the free space remains for the next set of data that comes in).

If you shrink the database you might make things worse by fragmenting everything. Paul’s video that goes with the TechNet article on Database Maintenance shows you the [shocking if you didn’t know this] effect of shrinking a database on indexes.

Reorganize an index, Update statistics and Rebuild an index

These need to be grouped together to start because this KB article does NOT address the impact of running these together. In fact, the problem, if you run these together – is that you MIGHT make things worse. First, let me give you an overview of each:

Reoganizing an index removes fragmentation in the largest part of an index (it’s called the leaf level of the index) and removing fragmentation in this level has the greatest (and positive) effect on range query scans and cache. So, this is really the most important type of fragmentation to remove. However, this is NOT the only way to do it… 

Rebuilding an index completely and totally removes ALL forms of fragmentation in all levels of an index; however, this is the most expensive (yet most effective) way to do it. As a result of rebuilding an index, SQL Server also updates the statistics for the indexes that were rebuilt. Therefore you do not need to update statistics OR reorganize an index if it gets rebuilt.

Updating statistics is important for query processing and optimization. The query processor uses statistics on your data to help determine how many rows will be processed by your query/statement. If SQL Server can accurately estimate the rows, then it can choose a more effective plan. However, if it doesn’t have good statistics, then it may not do as good of a job at accurately estimating rows and therefore it might not come up with as optimal of a plan. So, this is an integral part to good database health. However, some of this might be done via the database option: auto update statistics which is ON by default in SQL Server (and, YES, you should leave this on). Check out this post on: Auto update statistics and auto create statistics - should you leave them on and/or turn them on??

However, if you use a maintenance plan then I really see two problems: 

  1. You’ll end up doing maintenance on things that may not need it. The default behavior for these tasks is just to run them on the selected objects. And, since many people will choose all objects (possibly even of all databases) then you’ll probably select objects that won’t really need this as frequently as you run this maintenance plan. 
  2. You might end up running a combination of things that either – wastes cycles/CPU and a MASSIVE amount of log space (which can translate into all sorts of concerns for DR technologies like database mirroring which will need to send all log rows to the secondary server). For example, if you run ALL three of these things then they’ll have to be run in a certain order (you can change this in a maintenance plan). However, the default order is: Reorganize Index(es), Rebuid Index(es), Update Statistics. This means that the work that’s done by reorganizing is effectively wasted as the rebuild would have taken care of it AND the work that’s done for updating statistics could mean that they update statistics TWICE (during the rebuild AND after) and the end result is WORSE because the update statistics command might use a sampling mechanism to generate statistics (which can lead to LESS EFFECTIVE statistics information). However, this is ONLY if you change the wizard’s default. The default is for the updating statistics command to do a “full scan”. So, even if the statistics will end up being the same – it’s still problematic because it means that for all indexes you’ve just rebuilt – you’ve now updated their statistics TWICE.

SUMMARY

A database maintenance plan is CRITICAL for best performance (especially for databases that are prone to some of the problems corrected by these maintenance tasks (yes, you can read SHAREPOINT into that statement). Sharepoint uses GUIDs as PRIMARY KEYs (read this post to hear more about the side effects of this choice) and as a result, as clustering keys. This means that many Sharepoint tables are prone to [potentially a MASSIVE amount of] fragmentation.

You absolutely need to have a maintenance plan. But, what should it be?

My absolute preference is to NOT use the Database Maintenance Plan Wizard UNLESS you really know what you’re doing. It just doesn’t give enough prescriptive advice. And, if you just select the defaults, you will end up with an inoptimal maintenance plan.

A better approach would be to create your own maintenance plan. If you write the code yourself (or leverage one of the custom ones that are already out on the web) then you can strategically target ONLY the objects that have the warning signs and/or are out of date and you can set when to rebuild vs. when to reorganize (generally people rebuild if a table has more than 30% fragentation and they reorganize when it's less than 30%). Fragmentation is something that can be detected programmatically using the DMV: dm_db_index_physical_stats (in SQL 2005/2008) or by using DBCC SHOWCONTIG (in SQL 7.0/2000). Here are a few places to go to see the more flexible and programmatic way of rebuilding/reorganizing indexes:

Smart Indexing Part II - Conditional Rebuilding a blog post (with conditional index rebuild code) from SQLMCA Bob Duffy (a good friend who is located in Dublin, Ireland and whose wife (Carmel) just had a baby last week – congrats again Bob!! You guys are seriously outnumbered now!!!) here: http://blogs.msdn.com/boduff/archive/2007/06/08/smart-indexing-part-ii-conditional-rebuilding.aspx

Custom Index Defrag / Rebuild Procedures - a blog post with some posted code as well. http://www.sqlstuff.info/post/2008/03/Custom-Index-Defrag--Rebuild-Procedures.aspx

Rebuild and Reorganize Indexes in SQL 2005 – an article (with conditional index rebuild code) from SQL Server Central here: http://www.sqlservercentral.com/scripts/31857/  (NOTE: You will need to become a subscriber to get to this article.)

Rebuild Only the Indexes that Need Help - an article by Andrew Kelly (SQL MVP) on SQL Server Magazine here: http://www.sqlmag.com/articles/index.cfm?articleid=99019&pg=1 (NOTE: You will need to become a suscriber to get to the full text of the article.)

Or, build your own! Check out the BOL topic for the sys.dm_db_index_physical_stats for SQL 2005 here: http://msdn.microsoft.com/en-us/library/ms188917(SQL.90).aspx, Example D has sample code to help you get started! For SQL 2008 it’s here: http://msdn.microsoft.com/en-us/library/ms188917.aspx. It’s still Example D for the sample code to leverage. J

The most important thing I can tell you is that a SMALL amount of time getting familiar with what’s really happening in SQL as well as WHY it’s happening to you WILL BE A LOT MORE PRODUCTIVE then just slamming in a maintenance plan that solves some problems but probably creates others.

Hope this helps!
kt

There's a new feature of SQL Server 2008 that during beta was called a Configuration Server... In RTM it was renamed to a Central Management Server. It's not overly obvious and without knowing how cool it can be - you might skip right by it. And, if you're using SQL Server 2008 to manage 2000, 2005 and 2008 servers - this actually works for all of those (and can be quite cool).

In the "Registered Servers" pane, expand the "Database Engine" option to see this feature. It's basically a separate option in addition to Local Server Groups. Right-click to register a Central Management Server.

A Central Management Servers (at first glance) seems as though it's solely a way to store server groups and registered servers - centrally - so that you (and your team) have easier access to the registration properties of a group of servers. And, in setting it up, adding groups, and registering servers - well, it doesn't seem like it is any different. In fact, that's all it is - in terms of creation. However, there are new options in terms of how to use it. It definitely IS different! In fact, it's in this simplicity (and the options it exposes) that lies its coolness. Let me show you :)

On a VPC (that we typically use for demos/labs, etc. for SQL Server 2008), we have 5 instances installed (SQLDev01, SQLDev02, SQLDev03, SQLExpress and ConfigServer). All instances (except Express) are SQL Server 2008 RTM Developer Edition and they're just simply named instances (there is no special designation (other than its name) for "ConfigServer"). I created the "ConfigServer" instance as a "server to hold configuration and general centralized management features - like Policy-Based Management, the management data warehouse for Performance Data Collection, a Master Server (for Master Server/Target Server Administration), and even a centralized management server. For many of these things you'll want to use a non-Express Edition of SQL Server as some features won't work on SQL Express (I'll explain more on this later). So, for this example, I'm going to use the ConfigServer instance (note: Centralized Management Servers *DO* work with SQL Express - something I previously thought was not possible... and, this is WAY cool). Regardless, the overall features that I plan to use with this server (like Policy-Based Management) require a version other than SQL Express so I'm going to use a Developer Edition instance for this centralized management server.

OK, so in this case, I created two groups (Development and Production) and each group has servers...

If I right-click on ANY level (a specific server, a group, or the Centralized Management Server itself) then you get multiple options as seen above:

  • New Query
  • Object Explorer
  • Evaluate Policies
  • Import Policies

In this case, I'm going to right-click on the Production Group and choose New Query – this opens a query window with one slight difference. The status bar at the bottom has a different color. OK, I bet you won’t even really notice this but a regular query window has a pale yellow status bar. A centralized management server query window has a pale pink status bar. I find that this isn’t overly noticeable – so I change it to a much more bold color (fuschia!). To change this, use Tools, Options, Text Editor, Editor Tab and Status Bar and then change the setting for the Group Connections option under the Status Bar Layout and Colors section. Anything that stands out is preferred. The reason why will be apparent soon…Within this query window, anything I execute will be executed against each of the servers in this group. I can even choose to execute a query against the centralized management server itself and this will execute against every server in every group. Powerful, but potentially-too-easily dangerous.

Here, I’ll execute SELECT @@version against the Production group:

The default behavior is that this is executed against all servers and the results are unioned (or merged). If you want to change this behavior you can also set this in Tools, Options under Query Results, SQL Server, Multiserver Results – “Merge Results” equals True. You can also set whether or not the login name is appended to the results (this defaults to false) in addition to whether or not the server name is appended (this defaults to true). You can change all three of these settings.

So, why did I make the status bar fuschia… what if I execute DROP DATABASE dbname… yes, it will try and drop this database from all servers in the group. While very powerful, this is something of which you should be careful.

OK, so here are the most interesting things about centralized management servers:

1) To designate a server as a CMS - that server must have an msdb. Originally, I thought that you couldn't use SQLExpress but because SQLExpress *does* have an msdb, this does work. However, SQLExpress lacks many other features/capabilities so outside of simplifying connectivity and query access, I'd probably target a higher edition of SQL Server. Even more specifically, if you're interested in where all of the server group information is stored:

Local Server Groups stored in: C:\Documents and Settings\<user>\Application Data\Microsoft\Microsoft SQL Server\100\Tools\Shell\RegSvr.xml

Centralized Management Server details are stored within the msdb in:

dbo.sysmanagement_shared_registered_servers_internal

dbo.sysmanagement_shared_server_groups_internal

2) Not only can you execute queries but you can right-click and choose Object Explorer and each server will be connected to and entered into your Object Explorer window. This will minimize your manually connecting/opening each server one by one.

3) AND, the servers can be servers other than SQL Server 2008... I've connected to 2005 and 2008 and you shouldn't have a problem adding any other SQL Servers - even 7.0...but, I haven't tried that (let me know if it works for 6.5 :).

3) Finally, in addition to executing queries, you can also Evaluate or Import Policies. This is something that can be really powerful. In addition to executing something directly, you can evaluate a policy against a number of servers in one step. Policy-Based Management is something that you'll start (if you haven't already) hearing a lot more about. And, since I'm always behind at blogging - check out these other links/blogs for more info on PBM:

Enjoy and thanks for reading,

kt

Theme design by Nukeation based on Jelle Druyts