One day, tinkering with my pfSense box, I got quite annoyed at how long it took to boot up and reach the user interface. OK, I had an old 2.5in 320GB HDD from Hitachi; it was bulletproof TBH, but sooo slow.
Not long ago, I came across one of the 2.5in, 500GB Seagate FireCuda SSHD hybrids, the ST500LX025: a fast 8GB SSD inside a normal HDD, up to 140MB/s transfer, SATA III. In one word: NICE! Normally I would not use a full SSD in something like pfSense, due to wear and tear, but the idea that the internal SSD is only used for the most frequently accessed files appealed to me very quickly. Restart the system a few times and the boot files should end up on the internal SSD. Let’s see.
Installation was straightforward as usual, and about 800 hours later I had a peek at the S.M.A.R.T. details and I was gutted. Not again. All that singing and dancing about “green drives” and “save the environment!” usually ends up as a case of “Penny wise, dollar stupid!”. Why? Let me ramble a bit: this is about the third time I have had to dig around the Internet to find out how to disable something I don’t really need, and the info was quite difficult to find. OK, I’ve saved a few watts of electricity by having all those fancy features on, but wasted much more than that looking for a way to disable them. Let me show you the SMART attributes after some 886 working hours.
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   067   064   006    Pre-fail  Always       -       5411719
  3 Spin_Up_Time            0x0003   100   100   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   084   084   020    Old_age   Always       -       17116
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   071   060   045    Pre-fail  Always       -       13991423
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       886 (200 24 0)
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       96
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   099   000    Old_age   Always       -       3
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   053   049   040    Old_age   Always       -       47 (Min/Max 25/51)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       2
193 Load_Cycle_Count        0x0032   092   092   000    Old_age   Always       -       17171
194 Temperature_Celsius     0x0022   047   051   000    Old_age   Always       -       47 (0 22 0 0 0)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       517 (152 84 0)
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       233859569
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       15692817
254 Free_Fall_Sensor        0x0032   100   100   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged
Don’t look at Raw_Read_Error_Rate & Seek_Error_Rate; it is the same rubbish I have seen since my first Seagate drives from the 7200.11 series. Somehow they always show weird numbers that don’t seem to relate to any valid information, so with Seagate I just skip them. What you should be concerned about is Start_Stop_Count & Load_Cycle_Count, with BOTH having over 17000 counts!!! In 886 hours??!! Let us do some mathematics:
886 h (working time) / 24 h = 36.9 days
17116 / 36.9 days = 464 start/stop cycles per day
17171 / 36.9 days = 465 load cycles per day
Those drives are rated for a maximum of 600,000 of those cycles, so:
600,000 / 465 = 1290 days
1290 / 365 days per year = 3.5 years
Basically… the warranty lasts 5 years… but the drive, with all those cycles, may not.
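The same sums can be sketched as a small shell function, in case you want to plug in your own drive's numbers. This is only a rough estimate under the same assumptions as above: the cycle rate stays constant, and the 600000 rating is the default (pass your own as the third argument). The function name estimate_life is made up for this example.

```shell
# Rough estimate: how long until a drive reaches its rated cycle count,
# assuming the cycles keep piling up at the average rate seen so far.
# Arguments: power-on hours, raw cycle count, rated limit (default 600000).
estimate_life() {
    awk -v h="$1" -v c="$2" -v r="${3:-600000}" 'BEGIN {
        per_day = c / (h / 24)          # average cycles per day so far
        years = r / per_day / 365       # time to reach the rated limit
        printf "%.0f cycles/day, %.1f years to reach %d cycles\n", per_day, years, r
    }'
}

# Numbers from the SMART output above: 886 hours, Load_Cycle_Count 17171
estimate_life 886 17171
```

With the figures from this drive it prints 465 cycles/day and 3.5 years, matching the hand calculation.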
Having already had problems with way too much head parking on NAS4Free, the solution here is exactly the same: TURNING THIS BLOODY THING OFF!
New hard drives have SMART options and the ability to turn some features on and off, so here is a quick command for pfSense’s Diagnostics/Command Prompt:
ataidle /dev/ada0
Model:                  ST500LX025-1U717D
Serial:                 ********
Firmware Rev:           SDM1
ATA revision:           ATA-10
LBA 48:                 yes
Geometry:               16383 cyls, 16 heads, 63 spt
Capacity:               465GB
SMART Supported:        yes
SMART Enabled:          yes
Write Cache Supported:  yes
Write Cache Enabled:    yes
APM Supported:          yes
APM Enabled:            yes
AAM Supported:          no
What we need is APM Supported: yes and APM Enabled: yes. This indicates that Advanced Power Management is available and is ON, so the next step is to turn it OFF by issuing the command:
ataidle -P 0 /dev/ada0
ataidle /dev/ada0
...
APM Supported:          yes
APM Enabled:            no
...
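If you would rather not eyeball the whole listing every time, a tiny filter can pull out just the APM state. This is a sketch that assumes the ataidle output keeps the "APM Enabled:" label shown above; apm_state is a name invented for this example.

```shell
# Read `ataidle /dev/adaX` output on stdin and print the "APM Enabled" value.
# Exits non-zero if no such line is found.
apm_state() {
    awk -F ':[ \t]*' '/^APM Enabled:/ { print $2; found = 1 } END { exit !found }'
}

# Live use (device name is an assumption): ataidle /dev/ada0 | apm_state
# Demo with captured text:
printf 'APM Supported: yes\nAPM Enabled: no\n' | apm_state
```

The exit status makes it usable in a script, for example to run the disable command only when the answer is "yes".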
Now… check the SMART attributes and you should see that those counts no longer change as often as before. The setting should also survive a reboot of the machine; at least it has on mine so far.
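To watch only the two counters that matter, the attribute table can be filtered by ID. A minimal sketch, assuming the usual `smartctl -A` column layout (ID in the first column, raw value in the tenth); park_counters is a made-up name.

```shell
# Read `smartctl -A` output on stdin and print "name raw_value" for
# Start_Stop_Count (ID 4) and Load_Cycle_Count (ID 193).
park_counters() {
    awk '$1 == 4 || $1 == 193 { print $2, $10 }'
}

# Live use (device name is an assumption): smartctl -A /dev/ada0 | park_counters
# Demo with two rows captured from this drive:
park_counters <<'EOF'
  4 Start_Stop_Count        0x0032   084   084   020    Old_age   Always       -       17116
193 Load_Cycle_Count        0x0032   092   092   000    Old_age   Always       -       17171
EOF
```

Run it now and again a few hours later, and compare the two readings.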
Three days later, I went back to the SMART info and saw that the drive was parking its heads again. It turned out I had left pfSense’s System/Advanced/Miscellaneous/Hard Drive Standby set to Standby 36, which forced the HDD back into APM mode. Leave this option at “ALWAYS ON”. The next thing: the ataidle command turns off Advanced Power Management, but the “old/normal” power management is still on, and that implies standby timers, where the device goes into normal standby mode as per the older ATA/SATA standards. We can take care of those with this command:
camcontrol standby ada0 -t 3600
This sets the standby timer to 3600 seconds = 1 hour of inactivity.
So, that’s it. It is off, and in machines like NAS4Free or pfSense it should stay off, as they do loads of small writes: the heads stay busy for a short time, park, then wake up again very often, racking up those cycles. The only problem is that not every HDD lets you turn this off; luckily this one does. The last thing is to check that everything has gone to plan by issuing the command:
camcontrol identify ada0
pass0: <ST500LX025-1U717D SDM1> ACS-3 ATA SATA 3.x device
pass0: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)

protocol              ATA/ATAPI-10 SATA 3.x
device model          ST500LX025-1U717D
firmware revision     SDM1
serial number         ********
WWN                   ********
cylinders             16383
heads                 16
sectors/track         63
sector size           logical 512, physical 4096, offset 0
LBA supported         268435455 sectors
LBA48 supported       976773168 sectors
PIO supported         PIO4
DMA supported         WDMA2 UDMA6
media RPM             5400

Feature                      Support  Enabled   Value           Vendor
read ahead                     yes      yes
write cache                    yes      yes
flush cache                    yes      yes
overlap                        no
Tagged Command Queuing (TCQ)   no       no
Native Command Queuing (NCQ)   yes              32 tags
NCQ Queue Management           no
NCQ Streaming                  no
Receive & Send FPDMA Queued    no
SMART                          yes      yes
microcode download             yes      yes
security                       yes      no
power management               yes      yes
advanced power management      yes      no      0/0x00
automatic acoustic management  no       no
media status notification      no       no
power-up in Standby            yes      no
write-read-verify              yes      no      0/0x0
unload                         yes      yes
general purpose logging        yes      yes
free-fall                      no       no
Data Set Management (DSM/TRIM) no
Host Protected Area (HPA)      yes      no      976773168/976773168
HPA - Security                 no
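Only two rows of that listing matter here: "power management" (the old-style one, still enabled) and "advanced power management" (now disabled). A small filter can show just those two. It is a sketch that assumes the feature table keeps the whitespace-separated Support and Enabled columns seen above; pm_rows is a name invented for this example.

```shell
# Read `camcontrol identify adaX` output on stdin and print the Support and
# Enabled columns for classic power management (PM) and advanced power
# management (APM).
pm_rows() {
    awk '/^(advanced )?power management[ \t]/ {
        i = ($1 == "advanced") ? 4 : 3      # field holding the Support column
        name = (i == 4) ? "APM" : "PM"
        print name, "support=" $i, "enabled=" $(i + 1)
    }'
}

# Live use (device name is an assumption): camcontrol identify ada0 | pm_rows
# Demo with the two rows captured from this drive:
pm_rows <<'EOF'
power management               yes      yes
advanced power management      yes      no      0/0x00
power-up in Standby            yes      no
EOF
```

On this drive it should report PM as enabled and APM as supported but not enabled.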
Had a peek at the SMART values and “Houston, we have NO problems” anymore. Both counts in question increased by about 2 over the past 12h. I can live with that… 😉
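For a sense of scale, the new rate can be set against the old one. A quick sketch, assuming the +2 in 12 hours is representative and using the 465 load cycles per day worked out earlier; parking_rate is a made-up name.

```shell
# Daily cycle rate from a counter increase over some hours, compared with an
# earlier daily rate. Arguments: increase, hours, old rate per day.
parking_rate() {
    awk -v d="$1" -v h="$2" -v old="$3" 'BEGIN {
        now = d * 24 / h
        if (now == 0) { print "no new cycles in that period"; exit }
        printf "%.0f cycles/day now vs %d before (about %.0fx fewer)\n", now, old, old / now
    }'
}

# +2 cycles in 12 hours, against 465 per day before the change
parking_rate 2 12 465
```

That works out to roughly 4 cycles a day instead of 465.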