Backups? What are you talking about?
One of the most basic administrative tasks in SQL Server is maintaining proper backups. A backup is nothing more than a dump of all the data, metadata and objects on a database (like stored procedures, functions, etc.) Not only do you need to make sure that backups of your DBs are taken regularly — but you also need to make sure that those backups are kept on a resilient location (such as RAID 5 arrays) plus, you need to test your backups by restoring them regularly (how frequently really depends on how comfortable you are with your infrastructure) to ensure you can recover in the event of a failure.
How do backups work?
To understand how backups work, and how to create and maintain a proper Recovery Strategy, you need to understand what Recovery Models are. What recovery model you choose for your DB with will greatly impact the amount of data loss in the event of a failure. And as you already know — hardware is unreliable. Failures are bound to happen. As a DBA, you need to make sure that you can satisfy your business requirements for acceptable data loss. Your business rules might dictate that it is OK to lose a whole week’s worth of data in the event of a failure, so a weekly backup would be okay. This is rarely the case. Most businesses perform nightly backups and are okay with losing a whole day’s worth of data. There is talk of some financial institutions backing up their databases every 10 seconds or less!
Now, obviously, backing up an entire 100Gb database every 10 seconds is unrealistic (at least as of time of writing). But that’s where Recovery Models come into play. As you know, SQL databases are comprised of data files and a transaction log. The Recovery Model you choose for your database dictates how much information is kept on the transaction log. In SQL Server, every single transaction performed against the database is written to the transaction log first, and from there to the data file.
On the next post on this series, we will discuss the different types of backups and how those types of backups relate to the recovery model you have chosen for your database.
Ok, What do I choose from?
There are three different recovery models. Simple, Bulk-Logged, and Full. Let’s discuss each one.
The Simple Recovery Model
The Simple recovery model is what it implies. The information on the transaction log is kept to a minimum. If you choose this recovery model, you can only restore your database as it was when you issued your last backup. The types of backup supported by this model are full and differential (more on these in the next post). Space used by transactions in the transaction log is reclaimed automatically as they are written to the data file. Point-in-time recovery (where you can restore the database as it was at a specific time) is not available using this model.
The Full Recovery Model
In the Full recovery model, transactions in the transaction log are kept until you perform what is called a Transaction Log Backup, or tlog backup. In this model, point-in-time recovery is available, so you can recover your database to a specific moment, provided that a tlog backup was taken after that point. In this recovery model, your transaction log will grow indefinitely until you issue tlog backups. Transaction log backups are tipically much smaller than full backups (as they only backup information in the tlog)
The Bulk-Logged Recovery Model
The Bulk-Logged recovery model maintains a higher level of recoverability than the simple recovery model, as you still need to perform tlog backups with it. However, information for bulk-logged operations like BULK INSERT, CREATE INDEX, and SELECT INTO are not kept in the transaction log, but rather in the data files. This means that if a bulk-logged operation happened since your last backup, you will be forced to redo those operations manually. Also, point-in-time recovery is not available with this model. You might want to choose this model if your Bulk operations are done at different times than the normal processing.
Okay, where do I find more information on the Recovery Models?
SQLServerPedia.com has a great wiki on Recovery Models. It is available here.
MSDN has a copy of the entry in Books Online about Recovery Models – check it out here.
SQL Server 2000 Recovery Models Give You Backup-and-Restore Flexibility — rather old article from SQL Server Magazine available at MSDN. Still relevant.
What next??
Stay tuned for the next post on Backups, where we will discuss backup types and how to create a proper Backup and Recovery Strategy.
Couple of clarifications:
1) the amount of transaction log generated in the SIMPLE and BULK_LOGGED recovery models is exactly the same. Only difference is that the log can be cleared (if possible) on a checkpoint, rather than waiting for a log backup. The log space can only be marked as unused once an entire VLF doesn’t contain any active log records, not just because the data pages that the log records affected have been written to the data files.
2) minimally-logged operations do generate transaction log in the BULK_LOGGED recovery model, just not anywhere near as much as when fully-logged
3) point-in-time recovery is available in the BULK_LOGGED recovery model, but not at any point in a log backup that contains a minimally-logged operation.
4) in any recovery model, it’s not just a checkpoint or log backup that says when transaction log records are no longer required – you need to consider database mirroring send queues, transactional replication log reader, change data capture, *other* active transactions preventing log records from committed transactions being overwritten.
You can find a good primer on log architecture and recovery model behavior in the TechNet Magazine article I wrote on Understanding Logging and Recovery in SQL Server: http://technet.microsoft.com/en-us/magazine/2009.02.logging.aspx
Thanks
Hey fellow SQL people, I have finally posted my massive SQL backup scripts that are 1000% better than default maintenance plans that SQL comes with.
-Chuck Lathrope
@SQLGuyChuck