Sunday, April 03, 2016

A troubled return to 4TB Disks



Subtitle: The best laid plans ...

As reported last year, I have transitioned parts of the home computer infrastructure up to 8TB hard disks. In fact, it has been almost a year now on 8TB drives.

Why to 8TB?
Seagate Archive 8TB hard drives are the most cost-effective drives per GB at the moment.

If you check this PDF you will see:

A SATA III 6Gb/sec drive with 128 MB cache and a 150 MB/sec sustained data rate.


And why back again?
The 8TB drive has at least three mediocre numbers:

- 5900 RPM spin speed
- 8TB on 12 Disk Heads
- Load / Unload rated 300K

So two 4TB disks should be a lot faster. Recently I restarted work on alternate OS technologies and testing, primarily using VMware and other virtualisation technologies. Putting VMs on the 8TB disk, shared with other data, is just too slow.

So it is hoped that a migration to faster 4TB disks and using Microsoft Storage Spaces will make everything right.


Which 4TB Disk again?

Of course I am using my old friend, the HGST Enterprise 7K4000.


- Enterprise Disk
- 5 year Warranty
- 7200 RPM Spin Speed
- 170 MB/sec sustained throughput
- Load / Unload rated 600K

Full specs here
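To put the two sustained data rates in perspective, here is a rough back-of-envelope sketch of how long one full sequential pass over each disk would take. It assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and that the quoted sustained rate holds across the whole surface, which is optimistic because throughput usually drops towards the inner tracks. The function name is just for illustration.

```python
def full_pass_hours(capacity_tb, sustained_mb_per_sec):
    """Hours needed to read or write the whole disk once at a constant rate."""
    capacity_bytes = capacity_tb * 10**12          # decimal terabytes
    rate_bytes_per_sec = sustained_mb_per_sec * 10**6
    return capacity_bytes / rate_bytes_per_sec / 3600


# Figures taken from the two spec lists in this post.
print(f"8TB @ 150 MB/sec: {full_pass_hours(8, 150):.1f} hours")  # 14.8 hours
print(f"4TB @ 170 MB/sec: {full_pass_hours(4, 170):.1f} hours")  # 6.5 hours
```

Treat these as best-case lower bounds. The full scans described later in this post took about 10 hours on a 4TB disk.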


Burnin and Errors


What a disaster....

I had never, ever had a problem with HGST Enterprise 4TB hard disks before. Since I have personally used over 20 of them (yes, even I find this number surprising), that is not a bad record. And for the record, what measures did I take before putting this drive into service?

- Installed it in the Development system to check basic function
- Burn-in, i.e. random general use on the Dev System
- I ran a full HD Tune Pro Error Scan. This took more than 10 hours! No errors.
- After a further 100 hours of spinning time without error, paranoid Marcus installed it in the Production System

Then the periodic SMART test at about the 200 hour mark detected the above error. However:


HGST Drive Fitness Test  reported no error. What!


Living in Switzerland, I feared a return would be impossible, so I contacted HGST to ask whether I could return the drive to them directly instead, given that their own program, DFT, showed all was OK.

I received a nice reply that the tools I had used were /more professional/ and that a direct warranty claim would be fine if a return to the shop failed.

I went back to Digitec and they agreed without fuss to replace the drive. My low expectations were thankfully surpassed.

Strangely, the drives were available at Digitec's supplier, but my replacement disk took 14 days to arrive for exchange. Hmm.

Detanglement
In order to provide maximum speed I had previously configured my 4TB drives into Tiered Windows Storage Spaces. I will write this up later, but essentially you can create a single storage object, in my case a 480GB SSD and a 4TB HDD as a combined entity. Over time Windows will automatically migrate the most used blocks from files (i.e. at sub-file level) to the SSD, providing superior disk performance.

Downside: It's pretty awkward to create, and to replace the underlying HDD you need to remove the Storage Space, which essentially means moving the data elsewhere first. (Technically you can add another HDD into the pool, but I did not have a spare one. Plus now was not the time to try it for the first time with real data in case it went to sh**.)




(As you can see above, the initial SSD I was using was my 1.5GB/second NVMe M.2 disk, reviewed here.)


Industry Comment

A good article to read on disk reliability:
https://www.backblaze.com/blog/hard-drive-reliability-q3-2015/


New Testing
Losing data is not a proposition I can tolerate. As a result I've built a makeshift test rig here to test the new 4TB disk more rigorously.







 Short 60 second test

Long 10 hour test!  Passed.

As part of my revised commitment to Linux I am using the  GSmartControl  disk health evaluation tool.   It did indeed take 10 hours for the full test.
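GSmartControl is a graphical front end to smartctl from smartmontools, so the same attribute table can also be checked from a script. The following is only a minimal sketch, not what I actually ran: the function names and the list of watched attributes are my own choices for illustration, and the parser assumes the classic ten-column table that `smartctl -A` prints for ATA drives.

```python
import subprocess

# Attributes often watched as early signs of media trouble.
# This selection is an assumption for illustration, not an official list.
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable")


def parse_smart_attributes(text):
    """Return {attribute_name: raw_value} from the table printed by `smartctl -A`."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # rows whose raw value is not a plain integer are skipped.
        if len(fields) >= 10 and fields[0].isdigit() and fields[9].isdigit():
            attrs[fields[1]] = int(fields[9])
    return attrs


def suspicious(attrs):
    """Return the watched attributes whose raw value is non-zero."""
    return {name: attrs[name] for name in WATCHED if attrs.get(name, 0) > 0}


def read_attributes(device):
    """Run `smartctl -A` on a device such as /dev/sdb (needs root and smartmontools)."""
    # smartctl uses its exit status as a bit mask, so a non-zero code is not treated as fatal here.
    result = subprocess.run(["smartctl", "-A", device], capture_output=True, text=True)
    return parse_smart_attributes(result.stdout)
```

Calling `suspicious(read_attributes("/dev/sdb"))` would return an empty dictionary for a drive with no reallocated or pending sectors. The device name is a placeholder.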


The Big Replacement Picture then

- Get replacement drive
- Exhaustively test the drive, making sure there are no errors
- Put it into the Prod System
- Create a Tiered Storage Space for max speed
- Copy over 4TB of code/data
- Keep both environments running for about 2 weeks
- When happy move to new 4TB disk (from 8TB source)
- Keep monitoring this disk and all disks as normal
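Since both environments run side by side for a while, the copy step can be double-checked before the old source is retired. The sketch below is one possible way to do that, not a tool I used for this migration: it walks the source tree and compares SHA-256 checksums against the copy. The function names are made up for illustration, and reading 4TB twice will of course take many hours.

```python
import hashlib
import os


def file_digest(path, chunk_size=1024 * 1024):
    """SHA-256 of a file, read in chunks so large files do not fill memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def compare_trees(source_root, copy_root):
    """Return (relative_path, reason) for files missing or different in copy_root."""
    problems = []
    for folder, _dirs, files in os.walk(source_root):
        for name in files:
            src = os.path.join(folder, name)
            rel = os.path.relpath(src, source_root)
            dst = os.path.join(copy_root, rel)
            if not os.path.isfile(dst):
                problems.append((rel, "missing"))
            elif file_digest(src) != file_digest(dst):
                problems.append((rel, "differs"))
    return problems
```

An empty list from `compare_trees(old_path, new_path)` means every file on the source has an identical counterpart on the new disk. It does not look for extra files that exist only on the copy.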


Summary
What a palaver!

The move back to 4TB HGST Enterprise drives, followed by an almost immediate disk error, was not welcome.

I did take paranoid precautions before using the now failed 4TB disk, but the second time around I think I have been even more rigorous.