Friday, August 22, 2014

Windows 2012 Server Recovery

Synergy, Windows Update and Oops
I have been a quietly happy user of Synergy, a small software product that lets you share a single keyboard and a single mouse across multiple computers.

At one time I used it between a Windows Server machine and an OS X server.  These days it is between two Windows computers.

Within the last week I've applied various Microsoft patches, but it took me some days to realise that something was not right on my slave (and old) Virtualisation PC.

Synergy still passes the keyboard on the /master PC/ through to this slave PC, but the shifted symbols on the top number row do not work.  For example, Shift 1 should give ! but gives 1.

Too Much Haste
I tried to install the Synergy 1.5.0 release over the top, but reliably got this error:

The system cannot open the device or file specified

First I tried looking at a detailed installation log, using the verbose logging option of msiexec:

msiexec /i synergy-1.5.0-r2278-Windows-x64.msi /L*V c:\tmp\synergy.log
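A verbose MSI log is long, and a common trick is to search it for "Return value 3", which Windows Installer writes when an action fails. Here is a minimal Python sketch of that search. It assumes the log path from the command above and that the log is UTF-16 encoded (usual for msiexec logs, but adjust the encoding if yours differs). The function names are made up for illustration.

```python
from pathlib import Path


def find_msi_failures(lines, context=5):
    """Return (line_number, lines_leading_up_to_it) for each 'Return value 3' hit."""
    hits = []
    for i, line in enumerate(lines):
        if "Return value 3" in line:
            start = max(0, i - context)
            # Keep a few preceding lines: the real cause is usually just above the hit
            hits.append((i + 1, lines[start:i + 1]))
    return hits


def read_msi_log(path):
    # Assumption: the log is UTF-16; errors="replace" avoids stopping on odd bytes
    return Path(path).read_text(encoding="utf-16", errors="replace").splitlines()


if __name__ == "__main__":
    log = Path(r"c:\tmp\synergy.log")
    if log.exists():
        for lineno, block in find_msi_failures(read_msi_log(log)):
            print(f"--- failure near line {lineno} ---")
            print("\n".join(block))
```

The lines printed just before each hit are normally the ones worth reading first.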

Then, in a moment of bad judgement, I deleted the Synergy 1.4 installation and tried to install 1.5.0 again.

Disaster!  It would not install

SO NOW: I had a Synergy-less system and no means to install it.  Flup!

Go To Bed
As the night rumbled on we passed midnight, then 01.00, and a variety of other desperate measures ensued: checking the firewall, and selectively un-installing fixes based on Synergy problem reports.

The Windows system seemed to become less stable, with Explorer hanging and Ctrl-Alt-Del not working from the primary keyboard  (well, Synergy is gone, so I mean the only keyboard!)

At 02.00 rather than battling on I made the sensible decision to sleep.

When you are tired and desperate, that is exactly the time NOT to keep working on a frustrating problem!

The Big Recovery
Next morning I had a brainwave: since I am making regular backups, why not just restore the entire system onto a new disk?  A new disk would also rule out the disk errors that I saw last night.

This is the target backup disk, yes it is there

And there is a nightly scheduled backup.   I was rather lucky: the backup ran at 20.30 and my disastrous delete of Synergy did not happen until about midnight. Whew!

Okay the Virtualisation / Development server is now at least 6 years old.  But still going!!

Take out the existing OS disk (right) and replace it with another disk, labelled OSX as you will notice.  This disk was inside Agata's now defunct Mac Mini before it failed.

Find the replacement disk and blank it:

DISKPART> sel disk 8
Disk 8 is now the selected disk.

DISKPART> list part

  Partition ###  Type              Size     Offset
  -------------  ----------------  -------  -------
  Partition 1    System             200 MB    20 KB
  Partition 2    Unknown            117 GB   200 MB
  Partition 3    Unknown            619 MB   117 GB
  Partition 4    Unknown            697 MB   118 GB

DISKPART> sel part 1

Partition 1 is now the selected partition.

DISKPART> del part 1 override
(repeated for each partition)

New disk inserted and data disks removed, leaving only the blank OS disk and the Backups disk inside.

On this Gigabyte computer, press F11 at the BIOS screen for the Boot menu and boot from the inserted Windows Storage Server DVD.

(I was going to make a USB key from my .ISO library image, but I actually found a real Server 2012 DVD, so I saved myself the work.)

From the booted DVD, choose to repair the system and follow the graphical screens to select a valid backup.

Select the disk to restore onto.

It takes about 10 minutes nightly to back up, so this can't take long, right?

After over 30 minutes I had lost patience again!  So I went for a little nap.

When I came back I had a Server logon screen and after logon the Synergy logo was back!  Fantastic.

Synergy is not yet fixed, but I can hobble along with the wonky keyboard until hopefully things improve!

Check Dodgy Disk

I thought that I would check the old SSD that was removed which seemed to have been hanging the system.  Was it the OS or the disk?

C:\Windows\system32>chkdsk /?
Checks a disk and displays a status report.

CHKDSK [volume[[path]filename]]] [/F] [/V] [/R] [/X] [/I] [/C] [/L[:size]] [/B] [/scan] [/spotfix]

  volume              Specifies the drive letter (followed by a colon),
                      mount point, or volume name.
  filename            FAT/FAT32 only: Specifies the files to check for
                      fragmentation.
  /F                  Fixes errors on the disk.
  /V                  On FAT/FAT32: Displays the full path and name of every
                      file on the disk.
                      On NTFS: Displays cleanup messages if any.
  /R                  Locates bad sectors and recovers readable information
                      (implies /F, when /scan not specified).
  /L:size             NTFS only:  Changes the log file size to the specified
                      number of kilobytes.  If size is not specified, displays
                      current size.
  /X                  Forces the volume to dismount first if necessary.
                      All opened handles to the volume would then be invalid
                      (implies /F).
  /I                  NTFS only: Performs a less vigorous check of index
                      entries.
  /C                  NTFS only: Skips checking of cycles within the folder
                      structure.
  /B                  NTFS only: Re-evaluates bad clusters on the volume
                      (implies /R)
  /scan               NTFS only: Runs a online scan on the volume
  /forceofflinefix    NTFS only: (Must be used with "/scan")
                      Bypass all online repair; all defects found
                      are queued for offline repair (i.e. "chkdsk /spotfix").
  /perf               NTFS only: (Must be used with "/scan")
                      Uses more system resources to complete a scan as fast as
                      possible. This may have a negative performance impact on
                      other tasks running on the system.
  /spotfix            NTFS only: Runs spot fixing on the volume
  /sdcleanup          NTFS only: Garbage collect unneeded security descriptor
                      data (implies /F).
  /offlinescanandfix  Runs an offline scan and fix on the volume.

The /I or /C switch reduces the amount of time required to run Chkdsk by
skipping certain checks of the volume.

C:\Windows\system32>chkdsk k:
The type of the file system is NTFS.
Volume label is littlepaw.

WARNING!  F parameter not specified.
Running CHKDSK in read-only mode.

Stage 1: Examining basic file system structure ...
  409344 file records processed.
File verification completed.
  5012 large file records processed.
  0 bad file records processed.

Stage 2: Examining file name linkage ...
  493232 index entries processed.
Index verification completed.
  0 unindexed files scanned.
  0 unindexed files recovered.

Stage 3: Examining security descriptors ...
Security descriptor verification completed.
  41945 data files processed.
CHKDSK is verifying Usn Journal...
  40467064 USN bytes processed.
Usn Journal verification completed.

Windows has scanned the file system and found no problems.
No further action is required.

 121886719 KB total disk space.
  90076784 KB in 293275 files.
    156316 KB in 41946 indexes.
         0 KB in bad sectors.
    522383 KB in use by the system.
     65536 KB occupied by the log file.
  31131236 KB available on disk.

      4096 bytes in each allocation unit.
  30471679 total allocation units on disk.
   7782809 allocation units available on disk.

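So /unfortunately/ the disk checked out clean.  Bear in mind that a read-only chkdsk like this only verifies the file system structures; it does not scan the surface for bad sectors (that needs /R).  I will use HD Tune to check the disk further.  But notice the /perf and /spotfix and other new chkdsk options.  I must investigate those.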
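The chkdsk summary above can be sanity-checked with a little arithmetic: the allocation unit counts multiplied by the 4096-byte unit size should agree with the reported KB totals, and the individual usage lines should add up to the total. Here is a minimal Python sketch using only the numbers printed in the report (the variable names are mine).

```python
KB = 1024

# Figures copied from the chkdsk report for volume K:
total_kb = 121_886_719
files_kb = 90_076_784
indexes_kb = 156_316
bad_sectors_kb = 0
system_kb = 522_383          # includes the 65536 KB log file
available_kb = 31_131_236

cluster_bytes = 4096
total_units = 30_471_679
free_units = 7_782_809


def units_to_kb(units, unit_bytes):
    """Convert a count of allocation units to kilobytes."""
    return units * unit_bytes // KB


free_from_units_kb = units_to_kb(free_units, cluster_bytes)
total_from_units_kb = units_to_kb(total_units, cluster_bytes)
sum_of_parts_kb = files_kb + indexes_kb + bad_sectors_kb + system_kb + available_kb
percent_free = 100 * available_kb / total_kb

print(f"free from units:  {free_from_units_kb} KB (reported {available_kb} KB)")
print(f"total from units: {total_from_units_kb} KB (reported {total_kb} KB)")
print(f"sum of parts:     {sum_of_parts_kb} KB")
print(f"free space:       {percent_free:.1f}%")
```

The free space figures match exactly and the total differs by only 3 KB (less than one allocation unit), so the report is at least internally consistent, with roughly a quarter of the volume free.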

What about Baremetal Linux Backups?
With my renewed interest in Linux I am building up a forest of Linux systems, so I had better get my Baremetal Linux recovery sorted out.  I have neglected this so far.   Blog Post solution to follow shortly.

Summary and Learning Points

  • I waited a week before applying the Windows Update, but clearly this was not enough
  • It took me a long time to find the error, and when I did I over-reacted and deleted the Synergy program which, had I thought about it, was unlikely to be the issue
  • Because I am taking a daily full OS backup, automatically scheduled by Windows, I was easily able to recover my system
  • It is of course not enough to back up; occasionally you should test the restore.  I've already tested a full OS restore on this system in the past so I was pretty confident, but to you readers: when did you last test a full OS restore (or perhaps a test restore of some backups you make to the cloud?)
  • Although I was impatient that the restore was taking over 30 minutes (since the backup takes only about 10 minutes), it did fully complete within an hour, which for about 60 GB was not too bad I suppose.  More patience Marcus!
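To put the last point in perspective, here is a rough back-of-envelope calculation in Python. It assumes about 60 GB restored in about 60 minutes, as stated above; both figures are approximate, and the function name is just for illustration.

```python
def throughput_mb_per_s(size_gb, minutes):
    """Average transfer rate in MB/s, taking 1 GB as 1024 MB."""
    return size_gb * 1024 / (minutes * 60)


# Restore: roughly 60 GB in roughly an hour
restore_rate = throughput_mb_per_s(60, 60)
print(f"restore averaged about {restore_rate:.0f} MB/s")

# Hypothetical comparison: what a 10-minute run would need to move all 60 GB
full_copy_rate = throughput_mb_per_s(60, 10)
print(f"copying 60 GB in 10 minutes would need about {full_copy_rate:.0f} MB/s")
```

The gap between the two rates suggests (my assumption, not something I measured) that the nightly backup only transfers blocks that changed since the previous run, whereas a bare-metal restore has to write everything back, so the two times are not really comparable.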