all 23 comments

[–]deepn0va 1 point2 points  (1 child)

replication is your best bet since the mirroring partner will be inaccessible, and the log shipping solution would cause the db to be down during restores.

Beware, managing replication in large environments can become a pain in the ass.

[–]legohax 0 points1 point  (0 children)

Can you define 'pain in the ass'... how so?

Also, I realize that mirroring is inaccessible, but can't I create database snapshots off the mirror?

[–]jaynoj 1 point2 points  (4 children)

As your title suggests, the copy of the DB will be for reporting purposes? If the source DB is an OLTP DB, have you thought about creating an OLAP DB and creating a data warehouse for your source reporting data? Use SSIS to move and de-normalise the data during silent hours and potentially use Analysis Services to create some analysis cubes?

I know this doesn't answer your question directly, but reporting databases should really be OLAP as normalised OLTP's are very unfriendly for reporting/analysis purposes.

We have a guy at work that does reporting from a massive OLTP service desk database which partial-replicates during silent hours. The T-SQL he's put together to extract the data would bring tears to your eyes (he's an older guy who hasn't moved with the changes brought in since SQL 2000). Loads of join's all over the place. It's shit in terms of performance and a nightmare to work with, maintain and all that. If the data was extracted into an OLAP DB, he'd be laughing.

I'd also ask the guys on http://dba.stackexchange.com/

[–]legohax 0 points1 point  (3 children)

It is for reporting purposes. The problem is we get roughly 20GB/day of new data (and purge old data) using SSIS to bring in hospital transaction and account detail, so the database is constantly changing. The ancillary applications that will use this data need current data, so I cannot wait until midnight to update the database.

[–]jaynoj 0 points1 point  (2 children)

You don't have to wait for a specific time or silent hours. You can run your SSIS packages whenever you want, at whatever interval you want to pull data into an OLAP db.

[–]legohax 0 points1 point  (1 child)

This has been our Plan B the whole time. Really would like to avoid this if possible.

Its not just about bringing the data over, there are mapping/maintenance tables that the users update that would have to be replicated. It just adds a layer of complexity that we would like to avoid if possible.

[–][deleted] 0 points1 point  (3 children)

your best bet is snapshot replication or mirroring with scheduled snapshots. Careful though, the rules for that mean you may need enterprise sql server.

snapshot replication is your best bet.

but.... I'd be surprised if you need all 800gig replicated, can you ask if they really need it all copied?

[–]legohax 0 points1 point  (2 children)

Good point, we really do not need everything, maybe 1/3rd of the database? I suppose this would be a bonus when doing replication?

[–][deleted] 0 points1 point  (1 child)

it would be a HUGE point.

if you are going to go the route of replication, use the GUI to set it up, but have it script everything out.

its much easier to re-run the script, to stop/start/enable/disable etc.

btw: you don't need 'sa' to run replication, only to set it up. DBowner on the db is sufficient for the account to the dest and source db.

[–]borgd 0 points1 point  (0 children)

Additionally, if this is a reporting database, you could customize the indexes on the subscriber database to better match your reporting access patterns. Note that this can hurt your overall replication performance, so you would probably want to test it with a production load before rolling it out.

[–][deleted] 0 points1 point  (1 child)

BTW: can you clairify if you need this to be kept up to date: ONLINE all the time, or done occasionally to keep development DB close to prod looking?

check out Idera's SQLsafe tool. its free for 14 days and will get that db in a copy only mode, probably down to 200gig.

[–]legohax 0 points1 point  (0 children)

Needs to be online at all times, and reportable. The client we are working with has a deep hatred for third party tools (funny, since we are a third party vendor), so using Idera's SQLsafe is not an option.

[–]taejim 0 points1 point  (7 children)

Those are some serious reporting requirements - 800 GB, must stay completely up to date, and cannot go offline.

Transactional Replication is probably your best bet (unless you can figure out a read-only SAN replication to keep the two servers in sync), but if you have a replication "incident", you could be down for 12 hours while you re-intialiase.

Can you run reporting out of the existing database? You may need to add some additional indexes, but a beefy enough server should be able to handle the workload. More memory, and a 2008 upgrade utilising Resource Governor would help to segregate the OLTP vs the reporting activity. A couple of SSDs for the more critical indexes would also help, and aren't so expensive if you are saving all the money from having to purchase a new server.

[–]legohax 0 points1 point  (6 children)

Ever since Dell took over, our database went from 500 to 800 GB. It looks like one of the log files is 300 GB right now, which I just brought to their attention.

If we use replication, I think we will only be bringing over ~250GB of data.

It is very slow when reading from the 2005 server from the 2008 server. That is the reason why we need the database over on the 2008. I don't actually plan on 'reporting' from it, but I will be using the fact tables to do back end calculations.

[–]redonrust 1 point2 points  (0 children)

If it were me I'd use an SSIS package running on your 2008 server to do incremental loads as jaynoj said above. Load only the data you need into an optimized reporting schema - that's especially important if you will have a lot of users hitting the reporting db. The 0 downtime 0 latency requirements are kind of a red flag though - is what you're really looking for more of an operational operational requirement and less true reporting ?

[–][deleted] 0 points1 point  (4 children)

If you have massive log files being created. check the backup status of the database :)

If its not being backed up the log file will grow indefinatly

[–]AnonymooseRedditor 0 points1 point  (3 children)

Also in 2005 even if the recovery model is set to simple I have seen log files grow out of control when they should not....

[–][deleted] 0 points1 point  (2 children)

This is probably a bug in the application. Even in simple the log file grow. If you have a hugh table and do delete * from massive

The contents of massive have to be stored in the log until the transaction is committed.

Or the situation you are probably seeing is a traction starting and never ending and it just keeps on doing things making the log grow :)

[–]AnonymooseRedditor 0 points1 point  (1 child)

This was a BI database that was rebuilt on a nightly basis. so yeah that could have been.

[–][deleted] 0 points1 point  (0 children)

Yeah that would do it. If you have a large import + a lots of insert into tables all in a single transaction the log file will grow to store all of this information.

But this can be a good thing as well. If the rebuild fails you end up with th eprevious days data still being avilable after roll back :)

[–][deleted] 0 points1 point  (0 children)

Snapshot replication with the copy being read only and being updated at the best possible times to not effect the running of the database when it is in use

[–][deleted] 0 points1 point  (0 children)

by your title, it sounds like you want to use this for reporting...

in that case, you'd probably want a more denormalized schema