I’ll start by saying that I swear I posted this at some point in the past but I don’t see it looking back at old posts. This is from Aug ‘19; though, I haven’t seen any release notes on corruption in Azure.
Let’s say you see a request to restore a database backup to an Azure Managed Instance. You do this task and then a few days later the team that asked for the restore says they are having problems connecting to the newly restored backup.
You open SSMS and the database doesn’t show up in Object Explorer. You find the database exists by querying sys.databases; but, why doesn’t it show up in Object Explorer? This is weird but no worries. You attempt to connect to the database with “USE” but get the following error:
Msg 40863, Level 14, State 2, Line 1
Connections to this database are no longer allowed.
This is crazy! What now? Open at ticket with MSFT? This seemed the only choice and what was the root cause? Apparently in Azure Managed Instances, Microsoft will check databases for corruption and will take the database offline if detected.
When in this special offline state there’s no way to access the database and Microsoft must be contacted. You can’t set the DB in recovery mode or change it to ONLINE. Microsoft does “contact” someone to notify that the database was taken offline due to corruption but if you work at a larger company this notification may never reach the right people.
Once the database is back online, it would seem you have a limited time to troubleshoot because the same corruption check will disable the database when it runs again.
To make things worse there are limited options to fix the database.
- You can run CHECKDB in a Managed Instance but they do not allow SINGLE_USER mode, which means you cannot use the REPAIR option with CHECKDB. Ideally you would want to fix the corruption you find but in this scenario CHECKDB would either take too long and the database would offline itself or it would fail due to lack of resources; so, I couldn’t even find out what the corruption was as suspect_pages had no rows.
- You can’t restore with REPLACE because this isn’t allowed.
- Drop existing database? Probably not. In this case there was a process stuck because the tail of the log backup was failing and the the database couldn’t be dropped. This had to be fixed in the background by Microsoft.
It’s 2020, is DBCC CHECKDB relevant? For migrations absolutely! Imagine going into prod for 3 days and then not being able to use those 3 days of data. Luckily this was a test database. Over the years I’ve seen many clients that say “We can’t do CHECKDB because it won’t complete or the database is too big”. Keep in mind you can run CHECK ALLOC, CHECKTABLE, & CHECKCATALOG individually.
What did we learn today?
- ALWAYS run CHECKDB as a premigration step. If you’re given a backup restore it and run CHECKDB.
- PaaS database solutions can be painful in some scenarios because you’ve given up control over some management tasks (easier isn’t always better).
- You can run CHECKDB in a Managed Instance but you can’t use the repair.
- CHECKDB can be too resource intensive so run CHECKALLOC, CHECKCATALOG, and CHECKTABLE independently on a daily basis.
- Microsoft runs corruption checks in the background with your backups (This doesn’t mean CHECKDB isn’t needed. You should still be doing your DBA 101 tasks.)
Side note: Availability Groups offer “Automatic Page Repair” which could be helpful if this had occurred in Azure or was in place where-ever this db backup came from.