In show #312 Richard and I talked to Andy Leonard about unit testing the database.
Carl Franklin: Andy Leonard is a Solid Quality Mentor, SQL Server MVP, SQL Server database and Integration Services developer, and engineer. He is a co-author of Professional SQL Server 2005 Integration Services, Professional Software Testing with Visual Studio 2005 Team System: Tools for Software Developers and Test Engineers, and MCITP Self-Paced Training Kit (Exam 70-441): Designing Database Solutions by Using Microsoft SQL Server 2005. Andy founded and manages VSTeamSystemCentral.com in August 2005 and maintains two blogs there: Applied Team System and Applied Business Intelligence. He also blogs for SQLBlog.com. Andy’s experience includes Web application architecture and development, Visual Basic, ASP, and ASP.NET; SQL Server Integration Services; data warehouse development using SQL Server 2000 and 2005; and test-driven database development. Welcome Andy.
Andy Leonard: Thanks.
Richard Campbell: People have been asking about more testing, stuff like unit testing so I think database is an interesting twist on it and I really get the sense that this really came about, this is all made possible by Data Dude, the Database Edition of Studio.
Andy Leonard: I would agree that the Database Edition has given us a nice framework for it, but I was doing database testing as far back as five years ago.
Carl Franklin: I was going to say, can’t you write unit test in code the way that you normally would write them against code except that you’re writing data code?
Andy Leonard: You certainly can. You can do it and there’s some good work out there. I’ll give props to Adam Machanic who has written a book about Expert SQL Server 2005 Development for Apress and he has a wonderful chapter in there about why and how to test the database. He is doing it from the application side. He makes careful notes about not writing the test in such a way that when they break, it’s an application error.
Carl Franklin: Yes.
Andy Leonard: That’s sort of my point. I’m not trying to take this thing away from what Adam is saying. Any testing is better than none at all. It’s sort of the old thing about beer with pizza. Bad beer with pizza is still a beer and pizza.
Carl Franklin: Still a beer and pizza, yeah.
Andy Leonard: Yeah. Anyway, Adam has an approach on it and I admire it and I’m a huge fan of Adam’s anyway, but the issues that I see with that is that it’s very easy to write a test in the application layer and have it fail because of something in the application and, as you mentioned, Richard, Data Dude alleviates a lot of that and to go to your point Carl, you could write T-SQL in Query Analyzer and run your own test and when you start doing complex processing and/or ETLs or data warehouses or any other process…
Richard Campbell: Acronym police, ETL.
Andy Leonard: Oh sorry, extract, transform and load.
Richard Campbell: Right.
Andy Leonard: It is what SQL Server Integration Services is built to do. SQL Server Integration Services does a lot more than that, but it does that very well. That’s the idea of copying data from some system of record or a source system and transforming it into some other shape or different style of schema, for instance from a third normal form into a star schema and then loading it into a database again in a new format.
Richard Campbell: Really, the database just makes this so much easier and of course you can do it by hand, but maybe we should just talk irrespective of tools. What does a unit test for a database look like?
Andy Leonard: Well, typically, when you’re doing unit test against the database, you’re testing a stored procedure. That’s the most common use for it and what you’re after is validating the contract. Now, being a former developer, I won’t call myself a developer because I have too much respect for developers. I’ve not done it in years. I’m out of that game. However, it’s validating the contract. Is the API doing what we want it to do? What we expect it to do? In this case, in a stored procedure, we may be passing an ID and expecting a name to come out. In the unit test portion of that, all we would do is validate that when we send in an integer that we get back a varchar or a string and that would be the extent of the unit test. We don’t care on the unit test if it’s the right string.
Carl Franklin: Right.
Andy Leonard: We care about things like maybe the column name right. Does it come back as Name, a column named Name, and does it have a value and is that value a string?
Carl Franklin: Well, that leaves a whole bunch open for issues and areas that aren’t covered by the unit test.
Andy Leonard: That’s true and we covered those in functional tests.
Carl Franklin: Okay.
Andy Leonard: So that’s when we start validating that it did bring back the right name for the integer we passed in.
Carl Franklin: Yeah.
Andy Leonard: You can do a lot of other things inside of the functional test as well. I use a variation of a functional test for what I call exception testing. I want to see if I’m calling a stored procedure that does an insert into a table, that maybe that table is related to another table by a foreign key. I want to see that if I try to insert a row for this parent table that doesn’t have a child record or vice versa, that the foreign key string fires and returns an exception.
Carl Franklin: Right.
Andy Leonard: And that’s gotten a lot easier. When you talk about Data Dude, that really made it a lot easier. I see that there are two big advantages for Data Dude. I’ll name three big advantages. One is that it really gave us this nice wrapper that we can work in. It’s inside of the Visual Studio IDE. I like the Visual Studio IDE a lot. There was some concern, I will say, on behalf of some of my fellow database professionals about losing the ability to do some database development inside of Enterprise Manager when we moved to 2005. I wasn’t in that camp. I was excited about it. I like what we get from the Visual Studio IDE.
Richard Campbell: Well, it’s in everybody’s best interest that we have one IDE that’s common to all of us, that gets all the same features. That’s all good.
Andy Leonard: Amen, yeah. We get such a rich environment and when I’m conducting training on SSIS or SQL Server Database Edition, I make sure that I point out all of these disadvantages that we get. Another good thing about Database Edition is we get to hook into the same testing framework that’s built into the .NET framework and it shows up specifically in Visual Studio for Testers. So we’re connecting to that same framework. There are some differences, but there are a lot of similarities. We’re using a lot of the same code. When you’re writing your unit test and I’m writing mine, we’re using a lot of the same code. I would say the third thing is take it up to the next step into the application life cycle management. We’re both able now to allow the sender a tool like Team Foundation Server and have these tests fired automatically and for the first time, I think, in the database development sphere, we are now able to approach continuous database integration.
Carl Franklin: Wow.
Richard Campbell: Which is a fascinating concept. We had Skinner and Drapers on when Data Dude was first being announced and of course we keep calling it Data Dude, which was its code name, because Visual Studio Team System Database Edition is just a dreadful name.
Carl Franklin: Is that the full name? I think it’s even longer than that.
Richard Campbell: Well, I left out the version number, right? It could be Visual Studio Team System 2008 Database Edition or 2005 Database Edition. That’s why we call it Data Dude.
Andy Leonard: That’s shorter than the original, Visual Studio 2005 Team Edition for Database Professionals.
Carl Franklin: That’s what I thought it was.
Andy Leonard: I think they shortened that because it ran off the box.
Carl Franklin: They want to save on ink.
Richard Campbell: Oh my Lord. What is up with these names?
Carl Franklin: It’s ridiculous. That is probably the most ridiculous one.
Richard Campbell: Yeah, that’s where the naming scheme broke down. It’s finally too long.
Carl Franklin: Somebody said, "All right!"
Richard Campbell: But getting back to the point here, now databases are a full-fledged member of the development team, they’re part of the team suite, so they get all the reporting, they get all the automation, that just sort of falls out for free once you make that commitment to Team System.
Andy Leonard: Exactly, and it’s a great thing. It’s a bit of a hard sell because-I don’t know if you’ve been around many DBAs before.
Richard Campbell: Oh, yeah, I know. I’ve been a DBA. The line is, “Change is good. You go first.”
Andy Leonard: I love that. I’ve done production DBA work, Richard. I was okay at it but I wasn’t very good at, I’ll just be honest.
Richard Campbell: In the end, the problem with this job is if you do it perfectly, this is true of IT in general, if you do your job well, nobody can tell. You are completely invisible.
Andy Leonard: You are correct.
Carl Franklin: Getting back to the last amazing thing you said, which was we’re approaching continuous database integration. Do you mean to say that the days of having testing and staging databases is really over?
Andy Leonard: Oh no, don’t take it that far yet.
Carl Franklin: Okay.
Andy Leonard: What I will say is that what this has allowed us to do-and really Database Edition does a fine job on the schema side of things, but that’s really at best only half of a database application. The other half is the data itself and there’s still enough room in there, enough wiggle room and enough need for the actual data testing, which you can do in Database Edition, to maintain a test database on a staging database and I will never advocate moving away from development integration, you know, testing or QA and then a production layer. I like a four-tier structure. It just catches all of my mistakes usually, or I’ll say all five 9’s. It gets five 9’s and that makes me feel better.
Richard Campbell: The essential difference between data and applications is that data has to be carried forward from one version to the next. It has to be. We can’t just replace it like we can with code.
Andy Leonard: That’s a great point, Richard, and that’s part of the issue with database development and there’s no free lunch right now on doing that. Database Edition does a wonderful job of managing the schema and you could do things like roll the schema back but there is always some coupling there between whatever application is accessing that data schema and certainly the data itself and decoupling that; I don’t know what we do about that.
Richard Campbell: Yeah, I don’t think that it’s actually possible. The thing is roll backs on the database have always been a disaster.
Andy Leonard: Yes and Database Edition doesn’t change that.
Richard Campbell: No, I think it’s still a disaster. Just because you’ve automated it doesn’t mean people haven’t still lost data. It doesn’t mean you’re not face-planted.
Andy Leonard: You’re absolutely correct.
Richard Campbell: But that’s what makes DBAs cranky. A rollback of a database is a total disaster.
Carl Franklin: Yeah.
Andy Leonard: We don’t have a tool yet other than you and I typing a lot and during restores from backups and running queries that select all the data that’s not in the existing database from the restored backup and pushing it into the owner. That’s the way we do it. I don’t know of a better way to do this and, again, I don’t claim to know everything about this, but that’s still an issue and we haven’t gotten away from that. What we do have is a way to add more on the preventative side with Database Edition with the idea of even having unit tests. We’ve gone a little more proactive and maybe we catch the error in a unit test before we deploy and have our customers catch it. I’ve got a slide in my presentation on testing the database that says, "When do you want to find the issue? Through unit test or at integration or at QA or in an e-mail from your soon-to-be former largest customer? Where do you want to find the issue?"
Richard Campbell: Yeah. Well, McConnell’s book, Code Complete and so forth, it said way back when, it’s 200 times more expensive to catch it later, to catch it at the customer level, than it was to catch it at the development level.
Andy Leonard: Absolutely. Yeah, absolutely. I would say it probably applies even more to databases than it does in the application.
Richard Campbell: I totally agree.
Andy Leonard: That data is the life-blood of so many businesses now and there’s lots of good information out there about just how important the databases are and database development itself has gotten so little attention. I don’t think the database community complains about it because by and large, we’re pretty happy being in the background and just as you said, Richard, when we do our jobs perfectly, no one messes with it.
Richard Campbell: Right.
Carl Franklin: Andy, when you were talking about running a unit test site, I take it that you mean you’re writing code. Is it possible to write unit test with other stored procedures? For that matter, is it possible to write a stored procedure that itemizes all of the other stored procedures and then calls them automatically?
Andy Leonard: There certainly is, Carl, and that’s the way I was doing it up until June 1st or so of 2006 or 2007, whenever Microsoft announced that they are going to release Data Dude and they put CTP 1 out within like five days or something. It was nuts. Kudos to the development team there. They did a great job. Yeah, that’s how you did it. You created your own testing framework. I’ve built my own databases that were holding information in there because I want to compare the test run today to the test run next week and in my mind, coming from a developer, I was an Internet Web guy who did a little bit of form stuff in Visual Basic up through version 6 and just dabbled in .NET, but even then, I was doing test first development in VB 6. I’ve jumped the fence and there’s a funny story how I got over that fence but the short version is, at one point in my life, I held the SA password and that made me the most qualified person on the team. So, I was the DBA at that point, I was the database developer and that first serious database development effort was a 1.6 terabyte data warehouse in SQL Server 2000.
The conversation continues online at www.shrinkster.com/vq4.