SQL Server – Backup to Network Storage – Error – The system cannot find the file specified ( 18210 )

Background

Taking a backup to get our LogShipping coupling back operational, but the backup is failing.

The error message is posted below.

Error Message

 

Event ID Message
18210 BackupMedium::ReportIoError: write failure on backup device ‘\\networkstorage\SQLBackups\MSSQL\HRDBProd\Full\hrdb.bak’. Operating system error 2(The system cannot find the file specified.).

 

Possible Remedy

There are a couple of areas that we will review:

  1. Adjust Network Session Timeout setting
  2. Disable and Stop “SQL Server VSS Service
  3. If on MS Windows 2003, apply KB 843515

Settings

  1. Increase Network Session Timeout
    • Branch :- HKLM\SYSTEM\CurrentControlSet\Services\LanmanWorkstation\Parameters
    • Item :- SessTimeOut
    • Default :- 15
    • Range :- 10–65,535 ( seconds )
    • What to do: Increase to 360
      • This will change the timeout to 6 minutes

It does not appear that the System needs to be restarted for this change to be effectual.

SQL Server VSS Service

Disable SQL Server VSS Service

Services

Hotfix

 

Microsoft

MS Windows 2003

  1. KB 843515
    • Your network connection may be reset when you try to perform a file operation on a remote Windows Server 2003-based computer
      https://support.microsoft.com/en-us/kb/843515

      • This patch alleviates a problem where concurrent SMB Sessions criss cross each other

 

Restore

After making the changes described above our backups are now running successfully.

But, still not out of the woods.

As we started the restore (with stats=1) noticed that the beacon is a bit slow.

Let us see what the metrics looks like…

DB Restore Profiling

Here we query dynamic management views  (dmvs)  to see expected timeline.

Code

No code writing here, as SQLDenis already has code.

SELECT
    d.PERCENT_COMPLETE AS [%Complete],
    d.TOTAL_ELAPSED_TIME/60000 AS ElapsedTimeMin,
    d.ESTIMATED_COMPLETION_TIME/60000   AS TimeRemainingMin,
    d.TOTAL_ELAPSED_TIME*0.00000024 AS ElapsedTimeHours,
    d.ESTIMATED_COMPLETION_TIME*0.00000024  AS TimeRemainingHours,
    s.text AS Command
FROM    sys.dm_exec_requests d
CROSS APPLY sys.dm_exec_sql_text(d.sql_handle)as s
WHERE  d.COMMAND LIKE 'RESTORE DATABASE%'
ORDER   BY 2 desc, 3 DESC

 

Output

RequestsTime

 

Explanation

  1. We already used up 2 hours
  2. And, have 12 more hours to go

 

Review Metrics

Resource Monitor

If you are using MS Windows 2008 and higher, please, please use “Resource Monitor”.

We can quickly see our Disk IO Stats.

ResourceMonitor-Disk

Quick Explanation:

  1. In the example above, we see that the System process is giving us about 10MB/sec for reads.  The data is being read from our backup (.bak) file
  2. The data is being written to our data file at about 10 MB/sec to our datafile (.mdf)

 

Resource Monitor

Performance Monitor

PerformanceMonitor.201601050457PM

Metric:

  1. SQLServer:BackupDevice
    • Device Throughput Bytes/sec at about 11 MB/sec
  2. PhysicalDisk
    • %Disk Read Time at 1.711
    • %Disk Time at 101.563
    • %Disk Write Time at 99.853
    • Avg. Disk Write Queue Length at 4.998
  3. Network Interface
    • Bytes Received/sec of 700 KB/sec
    • Bytes Sent/sec  of 300 KB/sec
    • Bytes Total/sec of 1MB/sec

Explanation

  1. Our backup device throughput matches our Resource Monitor File I/O Stats
  2. Our Network Interface is very tiny
  3. The big numbers is with Disk Utilization
    • Disk Read percentile is only at 1.711
    • Disk Time is at 101.563%
    • And, Disk Write is at 99.853%
    • Confirms that our burden is Disk Writes

 

Restore Stats

RestoreDBStats

Explanation
  1. The complete restore took 17 hours
  2. Thankfully the restore engine gives us a breakdown of pages restored at the filegroup level

 

Fess up

The reason why I wanted to review the backup file stats and compare with network stats is that I stupidly used a UNC path to reference the backup file, while infact the backup file is stored locally on our Log Shipping secondary.

Thankfully, MS Windows is covering for us, and not treating the backup file as a Network resource.

 

Commentary

We see we have problems with Disk I/O throughput.

Is the problem Disk RAID Level?

 

References

Hotfix

  1. Board index ‹ Discontinued and Previous Versions ‹ SQL Backup Previous Versions < Write failure on backup device
    https://forums.red-gate.com/viewtopic.php?p=11826

 

Q/A

  1. Home » SQL Server 7,2000 » Backups » Problems Backing up SQL 2000 over the network…
    http://www.sqlservercentral.com/Forums/Topic532254-24-1.aspx
  2. SBS – 2008 – VSS Backup
    http://serverfault.com/questions/294086/sbs-2008-vss-backup

 

Support

  1. Error message when you run the “vssadmin list writers” command on a Windows Server 2003-based computer: “Error: 0x8000FFFF”
    https://support.microsoft.com/en-us/kb/940184


Microsoft – System Monitor

  1. SQL Server: Backup Device Object
    https://technet.microsoft.com/en-us/library/aa905131(v=sql.80).aspx

 

Blogs

  1. Performance Tuning Tips for SQL Server Backup and Restore
    http://www.sql-server-performance.com/2007/backup-restore-tuning/

 

Leave a Reply

Please log in using one of these methods to post your comment:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s