Author Topic: SQL Server Restore Problem (Read 2998 times)

AZRedhawk44 · « **on:** January 21, 2011, 08:27:09 PM »

I know my MS SQL Server pretty well.

I know the difference between FULL and SIMPLE recovery modes.

I know how to shrink down a bloated transaction log:

Code: [Select]

ALTER DATABASE TargetDB SET RECOVERY SIMPLE
BACKUP LOG TargetDB WITH TRUNCATEONLY
DBCC SHRINKFILE (TargetDB_log, 1)

And I know that ideally a smart DBA has a disaster recovery plan for a production database in FULL recovery mode that consists of regular FULL and LOG backups that empty the T-log so it doesn't bloat into eternity and wind up consuming the entire disk system of the database server. It just ain't that hard.

I used to do it for a medical records warehousing company and had several million patient records under my fingertips. They were backed up quite thoroughly, and the T-logs were never more than perhaps 250MB to catch the serialized transactions of the busiest periods of activities between log backups. Each of these databases had MDF files that were anywhere from 5 to 100GB, and I had dozens of them to monitor.

However... the people that sometimes send me databases that I work with... do not apparently know these things.

IMO, a DBA that doesn't know how to keep a transaction log manageable should be stoned to death with old SCSI/Fibre Channel drives, but that's a battle for another day.

The battle I'm trying to fight right now, is:

Assume you have a BAK file made on SQL Server.

When you run RESTORE FILELISTONLY FROM DISK = 'E:\Path\to\backup.bak'

you end up getting this:

Code: [Select]

DBName1	E:\SQLData\DBName.mdf	D	PRIMARY	20971520
DBName1_log	E:\SQLData\DBName_log.ldf	L	NULL	22454526336

Yes, that log file is 22,454,526,336 bytes (22GB) in size.

Yes, the actual MDF file is only 20MB.

All you have is the BAK file.

This idgit spent hours transmitting what should be a 20MB database to you, sending a worthless 22GB log file.

You don't have the MDF file, so you can't use sp_attach_single_file_db to generate a clean log.

Is there any feasible way at all of doing a partial restore and NOT restoring the LDF file?

I deliberately created a bloated T-log in one of my test databases (in FULL recovery mode and the LDF file set to autogrow) with the following looping code and letting it run for 15 minutes:

Code: [Select]

declare @i int

set @i = 1

while @i < 2
begin
	begin transaction
	INSERT INTO junk_table (col1) VALUES ('junk')
	commit transaction
	
	begin transaction
	DELETE FROM junk_table
	WHERE col1 = 'junk'
	commit transaction
end

Then I backed up the database to a BAK file.

I tried:

Code: [Select]

RESTORE DATABASE MDFTEST FILEGROUP = 'PRIMARY'
FROM disk = 'E:\Path\to\backup.bak'
WITH PARTIAL
, MOVE 'DBName1' TO 'E:\SQLData\MDFTEST.mdf'

But that complains that I am not giving a suitable MOVE command for the LDF file.

When I give a MOVE command for the LDF file, then it restores my bloated LDF file in the BAK file.

I tried adding the NORECOVERY flag to the RESTORE command, but I still get the same error about redirecting the LDF file.

Is there any way to issue a RESTORE command that skips population of the LDF file and just creates a new one?

AZRedhawk44 · « **Reply #1 on:** January 24, 2011, 03:13:47 PM »

No one, huh?

I've got a database server running SQL 2000 inside of a Virtual Server 2005R2 environment, restoring a 20GB MDF and 20GB LDF.

It ran lickitysplit while restoring the actual files, but Enterprise Manager is stuck at about the 80 or 85% point and PerfMon.exe shows slow activity in exactly 8KB increments to disk. Funny thing, those 8KB increments. Exactly the same size as the smallest committed disk transaction called a "page" that can be done in MS-SQL.

Database has been doing something in 8KB increments all weekend while I let it restore. This server has restored 20-50GB databases before with no issues. It's a pretty well established test box for development work with no prior drama with large DB's. Performance ain't awesome, but sufficient to our typical expectations. 2GB RAM, 1 CPU thread, 500GB VHD file hosted on a RAID5 array. CPU during the whole restore operation shows sub 5% utilization and memory shows ~1.7GB consumed, ~300MB free, and ~1.6GB pagefile consumed.

I have a sneaky suspicion, fed by the 8KB disk operations, that the SQL Server is performing a COMMIT/CHECKPOINT step by step validation of the MDF file based upon the contents of the LDF file, since the log obviously hasn't been backed up in eons. I can't find any resources that tell me what EXACTLY the RESTORE command does after it's done writing MDF/LDF files but before it puts a database into operational multiuser mode. Some sort of dirty page detection routine, I'm guessing. Though I wouldn't expect a full backup to contain anything like that.

I just have no idea why it's going so danged slow.

PerfMon shows the PhysicalDisk counters are well within the host controller limits. Current Disk Queue Length just kind of sits at 1, then spikes to 3 or 4 every now and then and falls right back down to 1. % disk read time = 0 typically, while % disk write time = 20-25%.

Our whole virtualized database environment is getting shifted to Hyper-V in the next couple weeks and this is really just a legacy server, but it's never been slow before. I keep getting drawn to something going on with this huge LDF file...

Running other SQL tasks against other databases on that server doesn't seem impeded.

RevDisk · « **Reply #2 on:** January 24, 2011, 04:13:46 PM »

Quote from: AZRedhawk44 on January 24, 2011, 03:13:47 PM

I've got a database server running SQL 2000 inside of a Virtual Server 2005R2 environment, restoring a 20GB MDF and 20GB LDF.

::cringe::

Production DB running MS-SQL inside Virtual Server..?

I've just been toying around with MySQL (and leaning towards trying out MariaDB).

AZRedhawk44 · « **Reply #3 on:** January 24, 2011, 04:44:19 PM »

Nothing's "production" for us, Rev.

The only "production" data we have is source code on a file server and customer relations information in 1 small SQL Server database.

Everything else is development sample data from clients that are experiencing errors. Doesn't need to be blazing fast, just fast enough to fix whatever problem they have.

I powered off the VM, added another 512MB RAM to it, started it back up and re-ran the restore. It got done in maybe 10 minutes.

Weird.

RevDisk · « **Reply #4 on:** January 24, 2011, 05:49:54 PM »

Quote from: AZRedhawk44 on January 24, 2011, 04:44:19 PM

Nothing's "production" for us, Rev.

The only "production" data we have is source code on a file server and customer relations information in 1 small SQL Server database.

Everything else is development sample data from clients that are experiencing errors. Doesn't need to be blazing fast, just fast enough to fix whatever problem they have.

I powered off the VM, added another 512MB RAM to it, started it back up and re-ran the restore. It got done in maybe 10 minutes.

Weird.

Ahh, gotcha.

Glad it worked. Damn that's a bit strange. I'd just chalk it up to the environment. The sys or app logs show anything strange?

AZRedhawk44 · « **Reply #5 on:** January 24, 2011, 06:09:38 PM »

Quote from: RevDisk on January 24, 2011, 05:49:54 PM

Ahh, gotcha.

Glad it worked. Damn that's a bit strange. I'd just chalk it up to the environment. The sys or app logs show anything strange?

System log shows "msvmscsi" took a dump right around the same time I started the restore.

Prolly should have checked there before assuming it was an unmanaged log file issue.

Quote

Event Type:   Error
Event Source:   msvmscsi
Event Category:   None
Event ID:   9
Date:      1/21/2011
Time:      1:38:51 PM
User:      N/A

Description:
The device, \Device\Scsi\msvmscsi1, did not respond within the timeout period.

Yeah. I guess that's why we don't run any production servers on Virtual Server 2005.

I guess the data I was querying may have been held in RAM/cache, when I was seeing if SQL was responsive before. Funky.

Oh, well. It's moving to Hyper-V in the next week or two.

RevDisk · « **Reply #6 on:** January 24, 2011, 09:00:30 PM »

Quote from: AZRedhawk44 on January 24, 2011, 06:09:38 PM

System log shows "msvmscsi" took a dump right around the same time I started the restore. Prolly should have checked there before assuming it was an unmanaged log file issue.

Yeah. I guess that's why we don't run any production servers on Virtual Server 2005.

I guess the data I was querying may have been held in RAM/cache, when I was seeing if SQL was responsive before. Funky.

Oh, well. It's moving to Hyper-V in the next week or two.

Virtual Box is a LOT better than Virtual Server. VMWare is honestly the only one I've used that I'd be comfy running as production, but that's strictly based on my familiarity with it.

GigaBuist · « **Reply #7 on:** January 25, 2011, 01:22:16 PM »

Compressing a .BAK with something like 7zip does wonders for the file size. Very handy when you need to move a large DB across a network. I wouldn't be surprised if that 22GB backup turned into something like 1GB after 7zip was done with it.

tyme · « **Reply #8 on:** January 25, 2011, 07:35:57 PM »

Virtualized I/O for a database server (unless it's not virtualized, for instance you pass through an entire pci disk controller to the guest) is not going to win speed awards. The prospect of running a somewhat busy database on a VM frightens me, and is the main reason TFL isn't already running on EC2 or Linode.

Have you disabled transactions (or fsync-after-every-transaction) during restore (I don't know anything about SQL Server specifically)? If not, forced disk sync after every commit if it's committing after every row is going to be mind-bogglingly slow on a virtual disk. That could explain what you're seeing.

@Revdisk: Postgresql. end of story. :)

RevDisk · « **Reply #9 on:** January 25, 2011, 11:41:53 PM »

Quote from: tyme on January 25, 2011, 07:35:57 PM

Virtualized I/O for a database server (unless it's not virtualized, for instance you pass through an entire pci disk controller to the guest) is not going to win speed awards. The prospect of running a somewhat busy database on a VM frightens me, and is the main reason TFL isn't already running on EC2 or Linode.

Have you disabled transactions (or fsync-after-every-transaction) during restore (I don't know anything about SQL Server specifically)? If not, forced disk sync after every commit if it's committing after every row is going to be mind-bogglingly slow on a virtual disk. That could explain what you're seeing.

@Revdisk: Postgresql. end of story. :)

Oddly enough, I'm using Linode and Amazon's cloud service. Very happy with both, actually. I'm mainly using their S3, but slowly expanding to their other services as well. I highly recommend s3sync. EC2, I'm pondering learning so that I can have supercomputing capacity on demand. My back of the envelop calcs guessimate that the available GPUs you can rent from EC2 is roughly equal to the NSA circa mid to late 90s. Nothing to sneeze at.

Yea yea, I'll dip my toes in postgresql. Out of curiosity, why do you like it over say, MariaDB?

tyme · « **Reply #10 on:** January 26, 2011, 08:05:35 AM »

Postgresql has some neat data types. There's nothing I want to do that MariaDB can do better, other than run webapps that are mysql-only (like *cough* vbulletin). I think the pgsql community is better, too: better developers, better admins, most of whom have seen some of the horrors of bad mysql setups and apps first-hand.

I like VMs. I like EC2. Maybe I'm being paranoid about cloud VM I/O performance. It was horrible years ago, but perhaps it's not so bad now. I like S3 too. Have you tried duplicity? It's not quite as easy to use as s3sync, but it uses gnupg for encryption and signing so you can guarantee that nobody on the S3 end could have read or tampered with your backups. The only reason I'd use something like s3sync would be to sync files to s3 for web serving over cloudfront.

RevDisk · « **Reply #11 on:** January 26, 2011, 02:12:14 PM »

Quote from: tyme on January 26, 2011, 08:05:35 AM

Postgresql has some neat data types. There's nothing I want to do that MariaDB can do better, other than run webapps that are mysql-only (like *cough* vbulletin). I think the pgsql community is better, too: better developers, better admins, most of whom have seen some of the horrors of bad mysql setups and apps first-hand.

I like VMs. I like EC2. Maybe I'm being paranoid about cloud VM I/O performance. It was horrible years ago, but perhaps it's not so bad now. I like S3 too. Have you tried duplicity? It's not quite as easy to use as s3sync, but it uses gnupg for encryption and signing so you can guarantee that nobody on the S3 end could have read or tampered with your backups. The only reason I'd use something like s3sync would be to sync files to s3 for web serving over cloudfront.

http://www.revdisk.org/projects/backups/

DAY=`date +%d-%b-%Y`
openssl aes-256-ecb -k CRYPTOPASSWORDHERE -in /root/serverbackup/upload/$DAY-backup.tar.gz -out /root/serverbackup/upload/$DAY.aes256
rm -rf /root/serverbackup/upload/$DAY-backup.tar.gz
ruby /root/s3sync/s3sync.rb -v -s --delete /root/serverbackup/upload/ revdisk:/backup
echo "Done."

I think AES-256 and SSL is a pretty good combo.

But now that I thought about it, I should >> the MD5, SHA-256 and whirlpool hashes to a log. And maybe shoot them into a SQL table and write a webapp to compare to the hashes on the S3 cloud. Hrm. Need to see the logical way of doing that.

Thoughts?

And yea, I'm leery of putting production DBs on any virtualized platforms. I'm told it's not a problem. Yea, ok. I'm still leaving >=1 AD server on iron, and my DBs on iron. Call me paranoid. Everyone does until the SHTF and we have a very nice clean recovery path. Then it's "See how easy that was?", while you reach for the ball peen hammer.

Pharmacology · « **Reply #12 on:** January 26, 2011, 02:34:54 PM »

All of this techno-talk makes all of the cogs in my medicine geared brain shoot sparks!

tyme · « **Reply #13 on:** January 27, 2011, 08:06:33 AM »

Quote from: RevDisk on January 26, 2011, 02:12:14 PM

I think AES-256 and SSL is a pretty good combo.

Anyone who breaks into the server can decrypt all backups, since the symmetric key passphrase is there in plaintext. That's not important if your backups only contain stuff that's available unencrypted on that computer, but some people might have other uses for duplicity, for instance if they're backing up stuff that's not kept on the server indefinitely, and in that case duplicity's built-in public key encryption support could be valuable.

Also, duplicity understands and manages sets of (full backup + differential backups). S3sync doesn't.

Any of that can be scripted using S3sync, but why bother when duplicity does it already?

Quote

But now that I thought about it, I should >> the MD5, SHA-256 and whirlpool hashes to a log. And maybe shoot them into a SQL table and write a webapp to compare to the hashes on the S3 cloud. Hrm. Need to see the logical way of doing that.

With a lot of large files that could get expensive: re-downloading them only to rehash them and validate against a database. Amazon's MD5 ETag header is useless, not only is MD5 broken cryptographically, but I bet Amazon calculates that MD5 once after upload and stores it in metadata, never to be touched again. If someone can hack the S3 backend and replace your file on disk, I suspect Amazon would still return the original MD5 as the ETag:.

I'll feel better when a SHA-3 candidate or two gets implemented in openssl and other command line hash utilities. For the super paranoid in me, I'd like to see a utility that implements all 14 of the the round two and later SHA-3 contest algorithms. I'm particularly distraught that they eliminated Blue Midnight Wish, because I thought it had the coolest name.

RevDisk · « **Reply #14 on:** January 27, 2011, 11:06:23 AM »

Quote from: tyme on January 27, 2011, 08:06:33 AM

Anyone who breaks into the server can decrypt all backups, since the symmetric key passphrase is there in plaintext. That's not important if your backups only contain stuff that's available unencrypted on that computer, but some people might have other uses for duplicity, for instance if they're backing up stuff that's not kept on the server indefinitely, and in that case duplicity's built-in public key encryption support could be valuable.

Also, duplicity understands and manages sets of (full backup + differential backups). S3sync doesn't.

Any of that can be scripted using S3sync, but why bother when duplicity does it already?

Because duplicity is Fedora/Debian/Ubuntu leaning and still in beta. For instance, ncftp and boto are not part of CentOS's app repos. Yea, I could add third party repos or just maintain them by hand. I try very hard to avoid that for ease of management. At the moment, I use exactly ONE app that isn't in CentOS app repos, and that's qmail, which is not subject to many security issues. All of the material being backed up is fairly constant (except the MySQL DBs) and obviously in the clear on the server. >90% of it is directly or indirectly web facing anyways. Only feature that'd be nice to have is differentials, but I do that by hand anyways. I do a light 60 meg upload every night (everything but the web files) and a full once a week, even that is probably overkill but a good practice. I clear out the old material off S3 by hand, which I'd prefer to do.

If duplicity was in the CentOS repos, I'd be using it in a split second. As it is as at the moment, I don't see any benefits that'd justify a complete tear down and restart. Might do so for entertainment purposes at some point, but overall? Just not seeing the benefits of PKI over AES-256/SSL. Also, theoretically, other folks might need to recover the backups in event of I get hit by a truck. Sharing the key and one command line is pretty straight forward. Setting them up with PKI is a bit more involved.

Quote from: tyme on January 27, 2011, 08:06:33 AM

With a lot of large files that could get expensive: re-downloading them only to rehash them and validate against a database. Amazon's MD5 ETag header is useless, not only is MD5 broken cryptographically, but I bet Amazon calculates that MD5 once after upload and stores it in metadata, never to be touched again. If someone can hack the S3 backend and replace your file on disk, I suspect Amazon would still return the original MD5 as the ETag:.

I'll feel better when a SHA-3 candidate or two gets implemented in openssl and other command line hash utilities. For the super paranoid in me, I'd like to see a utility that implements all 14 of the the round two and later SHA-3 contest algorithms. I'm particularly distraught that they eliminated Blue Midnight Wish, because I thought it had the coolest name.

Ayep. That's my concern. It's friggin trivial to add hash functionality. I'll try to hunt down the suggestion box at Amazon and slap it in. Who knows.

Honestly, md5sum and sha256sum are good enough for my purposes. md5sum is weakened, but quite good enough for 99.999999% of applications. Even moreso when you're pairing your hash functions. I like skein for SHA-3 for technical reasons, but yea, BMW has the best name. Rest of the crypto candidates have kinda poor names. Half points for "CRUNCH", tho. NIST has done a pretty good job as of late on crypto competitions.

Thought you might find this interesting: http://www.mat.dtu.dk/people/S.Thomsen/bmw/bmw-pseudo.pdf

Armed Polite Society

News:

Author Topic: SQL Server Restore Problem (Read 2998 times)

AZRedhawk44

SQL Server Restore Problem

AZRedhawk44

Re: SQL Server Restore Problem

RevDisk

Re: SQL Server Restore Problem

AZRedhawk44

Re: SQL Server Restore Problem

RevDisk

Re: SQL Server Restore Problem

AZRedhawk44

Re: SQL Server Restore Problem

RevDisk

Re: SQL Server Restore Problem

GigaBuist

Re: SQL Server Restore Problem

tyme

Re: SQL Server Restore Problem

RevDisk

Re: SQL Server Restore Problem

tyme

Re: SQL Server Restore Problem

RevDisk

Re: SQL Server Restore Problem

Pharmacology

Re: SQL Server Restore Problem

tyme

Re: SQL Server Restore Problem

RevDisk

Re: SQL Server Restore Problem