SQL Server 2008 has been out for more than a year, and perhaps an overview and assessment is due. From a database administrator’s point of view, what features are standing the test of time?
What’s New and Enduring
SQL Server is a large, complex application, and Microsoft has quite a number of developer teams in the SQL Server group working on each release. So it stands to reason that each release is going to have a lot of new features and enhancements.
It’s easy to find many raw lists of the new SQL Server 2008 features on the Web. Many of us like to turn first to the “What’s New” section in SQL Server 2008 Books Online.
But not all new features are crafted equally. Some of them turn out to be quite useful and real hits in the field, while others seemingly fade into the background, to be rehabilitated in later versions.
In any case, I’d like to take you on a tour of two of the top SQL Server 2008 features that have turned out, in my opinion, to be real winners.
Data Compression
The leader of the feature pack for SQL Server 2008 is pretty clearly data compression. Think again about that name: it’s not “database” but “data” compression. Data compression does not apply to the whole database, but to tables and indexes of your choosing within a database. Not all data in the database can benefit from data compression. But the data that does can benefit quite a lot!
How Data Compression Works
SQL Server 2008 can apply data compression to both tables and indexes. For each, SQL Server 2008 can compress the data at the row or page level. Page level compression incorporates row level compression and adds further compression techniques.
Data compression has resulted in a dramatic decrease in I/O for SQL Server queries.
When SQL Server compresses data, the compressed data is stored on the page in memory and on disk. Only at the time of querying is the data decompressed and held uncompressed in some temporary memory buffer, to be parsed and maybe collected into a results set for a query. So decompression only needs to occur when a compressed row or page must have some data in it scanned or copied out. The rest of the time, the data stays compressed.
Benefits and Costs
Data compression for many users has resulted in a dramatic decrease in I/O for SQL Server data, both I/O from disk as well as I/O in memory when pages have to be searched. That decrease in I/O results in faster query performance, and the relative benefit increases with the size of the data being compressed.
The largest tables and indexes show the greatest benefit, and for them that benefit has proven outstanding in both relational and data warehouse settings. The benefit of data compression on large databases has proven so outstanding that a colleague of mine once said that data compression would turn out to be the “killer feature” of SQL Server 2008, and perhaps it has.
On the downside, SQL Server’s process of compressing and decompressing data is CPU-intensive. Now that could make compression less attractive, but for the most part that doesn’t seem to be the case. There are a couple factors that might mitigate that CPU cost. First, newer database servers offer a lot more CPU power than even a couple years ago, and the overall server architectures have improved as well. I’ve also noticed that the CPU cost of compression even on somewhat older servers tends to be relatively small, and well worth the benefit of the improved query performance.
Invisible to the Application
One last winning aspect of data compression is that it’s invisible to the application and the database. When you tell SQL Server to compress a table, SQL Server still uses the same indexes and it queries in the same way as when uncompressed. As a result, you can apply data compression to databases that were never designed with it in mind, and the application will not know the difference. No wonder it’s a winning feature.
Resource Governor
Not all users of the database deserve equal time. (CPU time, I mean!) Some have more important needs than others. New orders may be more important than customer service, and both more important than management reports.
SQL Server users have been waiting a long time for the ability to somehow prioritize activity within the database server, a very long time, and SQL Server 2008 delivers at least a first installment with the Resource Governor. This is a bit of a sleeper feature in the sense that it can deliver more than you think, even after you learn about its limitations.
With SQL Server 2008’s Resource Governor you can limit the CPU and memory resources available to users. Let’s pause a moment: the CPU resource that can be limited is the share of all the CPU power available to that SQL Server instance. Ok, so far so good. But the case of memory is a little different: the memory resource that can be limited is query memory, not the memory cache that SQL Server uses to buffer data. I’ll come back to this point soon.
Limiting Users
So how does SQL Server Resource Governor limit users? You write a SQL Server user-defined function and register it as a “classifier function.” When you enable Resource Governor, your classifier function gets executed when every user logs in. You also define workload groups to classify users and set your priorities. When your function executes, it needs some kind of choosing logic to decide what type of workload group you want the user to be in. You can base your classification on any of the following:
- The login name.
- Whether the user is a member of a certain Windows group.
- Whether the user is a member of a database server role.
- An application name (provided the user’s connection string supplies it).
- The host name.
The first two are probably the most widely used: the user’s login and whether the user is in a particular Windows group.
SQL Server users have been waiting a long time for the ability to prioritize activity within the database server.
Workload Groups and Resource Pools
In the meantime, you associate those workload groups with various resource pools and place CPU and memory restrictions on the resource pools. There’s always a default resource pool, which behaves like SQL Server without Resource Governor active. All those users that drop into the default pool based on your classifier function have the usual “first come first served” prioritization that you are accustomed to.
Only when you direct user sessions into limited resource pools will they face restrictions, and only when there’s enough competition on the machine to make it meaningful. So if you limit the reporting logins to only 30% CPU maximum on their queries, if no one else is on the system, they can have as much CPU as SQL Server can supply. It’s only when other logins with higher priorities also request resources that the lower priority logins get scaled back.
Memory Limitation
The limitation of CPU resources is something easy to understand because we’re all used to looking at CPU usage in Windows Perfmon (System Monitor) or Task Manager. As I mentioned a few paragraphs above, memory is a bit trickier.
The memory that you limit with Resource Governor is not SQL Server’s memory by itself, or even the data or procedure caches on the server. Instead, the Resource Governor’s memory resource refers to the amount of memory set aside for queries. By default, SQL Server will assign up to 25% of an available query memory area for any given query. Up to three queries can each use 25% of available query memory, but no more, by default. A fourth query trying to get that much memory will generate an error telling the user that there’s just not enough memory to run the query.
Most of us who work on OLTP systems will never see queries trying to use that much memory, though it does happen. It’s much more common to see high memory requirements on data warehouse queries, and that’s where the memory limitation becomes very useful.
You can change that default (normally to lower it), so that more than three of your largest queries can access the system at once, by tinkering with the memory limit percentage values for your resource pools.
Using Resource Governor to Monitor Workload Groups
One surprising use of Resource Governor doesn’t involve trying to limit anyone at all. SQL Server 2008 supplies a number of new Perfmon counters for Resource Governor. The counters to monitor resource usage in the SQLServer:Resource Pool Stats object, and the counters for workload group monitoring are in the SQLServer:Workload Group object.
If you funnel all the logins to the default resource group, no one will have their resources limited. However, the workload counters will still report activity. So even if you are not actively limiting CPU or query memory to your logins, you can observe activity, including CPU and query memory usage, for each workload group you define!
This is a new way, and so far as I can tell the only way, you can see inside a running SQL Server and get a report on CPU and query memory usage. You have to define meaningful workload groups, but once you do that, you can see things that aren’t possible any other way.
Focus on the Enterprise
Now there’s one very important thing to know about these two features: they are available on SQL Server 2008 Enterprise Edition only for production purposes. (For development purposes, they are also available in SQL Server 2008 Developer Edition.)
It appears that Microsoft has made an effort to further differentiate the Enterprise Edition from all other editions by limiting two of the nicer features, and certainly more useful features, to it.
As is often the case, a feature of SQL Server might not be important when you don’t need it. But when you suddenly do need it, you may really need it! Both data compression and the Resource Governor could easily be that kind of feature.