Max safe temp for hard drives?
Moderators: NeilBlanchard, Ralf Hutter, sthayashi, Lawrence Lee
-
- Site Admin
- Posts: 12285
- Joined: Sun Aug 11, 2002 3:26 pm
- Location: Vancouver, BC, Canada
- Contact:
Max safe temp for hard drives?
This is the question that came up in Ralf Hutter's review of the Antec SLK3700BQE. His displeasure at seeing a max temp of 43C for a 2-platter Barracuda IV in the BQE prompted me to post a POINT * COUNTERPOINT addendum at the end of the review. (Please read that before posting further comments here.)
It also prompted me to review a lot of the documentation from HDD manufacturers about safe temperatures. It made me realize that they are pretty cagey on the topic in many ways.
For one, they most commonly talk about "operational temperature" not for the drives themselves (by which I mean the readout from the S.M.A.R.T internal temp diode) but for "ambient." The one exception I've found so far is Seagate:
Seagate specifies in their spec doc 100129212b.pdf for the Barracuda IV -- "Ambient temperature: 0° to 60°C (op.), –40° to 70°C (nonop.)"
And: "Ambient temperature is defined as the temperature of the environment immediately surrounding the drive. Actual drive case temperature should not exceed 69°C (156°F) within the operating ambient conditions."
These are exactly the same as specified for the 7200.7 drives.
The recommended position for measuring the "Actual drive case temperature" is at the bottom center edge of the front edge (see p23 of the pdf -- p.31 as read by Acrobat)
The implication of all the above is that there is a ~10C difference between drive temp and ambient temp.
WD's thermal specs are harder to find, but this document on Thermal Monitoring for Advanced Data Protection refers to the S.M.A.R.T. default warning temp as 60C and shutdown as 65C. This would suggest that WD's maximum ambient temp recommendation would be ~55C...
No temp refs found thus far at Maxtor, but operating ambient temps are 55C max.
So the questions are:
1) what do you think the max long term safe S.M.A.R.T drive temp should be (for most/any modern 7200 rpm drives)?
2) What experience / evidence do you have to support this?
3) can you point to any definitive info regarding max long term safe S.M.A.R.T drive temp?
Or should we simply say anything over 55C is unsafe and how much below that you want to go is a matter of personal comfort -- much like max CPU temp? (The argument is that except for burnouts caused by catastrophic failures like a heatsink falling off, there is little or no evidence of CPUs actually getting damaged by running them close -- say -10% -- to the max die temp for extended periods.) Or do we discriminate between the all-electronic CPU vs. the electro-mechanical hard drive?
Last edited by MikeC on Mon May 31, 2004 9:01 am, edited 3 times in total.
Thanks for the info Mike:
To me it depends on your data and comfort level. For me, if my Maxtor dies it gives me the excuse and right to buy a 'cuda, but once a 'cuda is bought I will prolly mount it in the lower end of the case rather than in the 5.25" drive bay. Plus I do not do any real critical stuff at home like many here do.
Some have suggested keeping your hard drive within 5 to 6C of your case temps. I think that can only be achieved on decoupled drives with a fan blowing on them.
NOTE: For all temps, I mean "idle" values. Like when one is surfing/playing MP3s/watching a video etc., not the highest temps when really pushing the drive to the limits (file copy/defrag). Reader can skip the small text if he/she sees fit.
Firstly, I'd like to point out that there is no simple high limit for drive temperature; as it depends on the ambient temp and the case temp. Say, with 30°C ambient You will be hitting over 40°C for the drive. You can't get the drive easily below 40°C even with active cooling.
The situation changes dramatically, if the ambient is only 20°C. If You are hitting over 40°C then, it's because You have the drive decoupled (or enclosed), or there is no airflow across the drive. Especially decoupling can really affect temps, as the heat has no way to conduct to the case.
Then again, can SMART sensors be trusted? As we all understand, motherboards have their hot and cold spots. The same applies to hard drives too. One of our national computer magazines (MikroBitti) used a thermographic camera to measure temps of various parts. [You can download a few videos here.] Unfortunately there isn't one for a hard drive, but in the magazine they had a picture of a Maxtor D740X (IIRC), and the chips were about 50°C when the drive was out of the case. Placing the sensor near these chips will produce misleading temps (unless the sensor is "fixed").
And, as we all know, different motherboards give different temps for the same CPU. So, we can't do "accurate" direct comparisons between two brands, maybe not even between two different model series from the same manufacturer. I've seen Maxtor drives (new FireBall 3's, IIRC, which are 5400 rpm) that run near 50°C inside a case, and that drive replaced an IBM 75 GXP, which ran at 37°C in the same mounting. One would have to use external sensors, but the placement of these sensors gives again some pitfalls.
One thing to consider is the "10°C rule for electronics/mechanics": A temperature rise of 10°C will halve the expected lifetime. I think that CPUs are designed to work at high temperatures (like transistors in general), but for hard drives there is an "optimum" operating temperature IMHO.
I've once experienced a "cold" hard drive: Once when we got back to work after a weekend, the A/C had acted out (outside temps either rose or dropped by 20°C) and dropped the temp in our office to 10°C. When I booted up the machine, the hard drive made very audible whining. DTemp showed only 14°C when I got to Windows. After the drive temp rose over 22-24°C, the noise dropped considerably. This Maxtor drive (D740X) used to idle at 46°C in 23°C ambient, BTW.
I'd like to use "rise above ambient" for calculating the max. desirable temp. I tend to agree with Ralf; going over 40°C upsets me, as I use a fan to cool the drives. Consider that the ambient temps here are near 20-22°C nearly all the time, so the delta T is roughly 18°C. During summer we had ambient temps near 30°C, so considering that the limit would elevate to close to 50°C, as lower values aren't obtainable without changing the cooling setup. In "normal" cases (=no extra fans or directed airflow), I'd set the limit to ambient + 23°C.
These are all assumptions based on the use of one drive. I've noticed that if one is running two or more drives, the topmost is always running hotter than the lower ones.
Hopefully this wasn't a boring read...
Cheers,
Jan
-
- Site Admin
- Posts: 12285
- Joined: Sun Aug 11, 2002 3:26 pm
- Location: Vancouver, BC, Canada
- Contact:
Jan Kivar wrote: ...there is no simple high limit for drive temperature...
Well, there is if you consider the S.M.A.R.T. temp, which is off the internal temp diode, to be a reasonable representation of internal drive temp. This would naturally include the effect of ambient temp.
Regarding the "10°C rule for electronics/mechanics": A temperature rise of 10°C will halve the expected lifetime.
There is some question about where this originated. I recall reading somewhere about it being pulled out of the air by some smartass contractor for the US military who wanted to sell more electronics cooling gear?... Probably totally distorted.
Here's an insight from a more informed source at X-bit Labs:
Speaking seriously, we can’t help recalling Arrhenius’ rule from the U.S. Department of Defense Military Handbook 217 (this book used to be the court of first instance in all questions concerning electronics reliability). This rule suggests that for the temperature range from –20 to 140C every temperature drop by 10C doubles the life term of the equipment. Military Handbook 217 is no longer used nowadays and the rule shouldn’t be taken directly as is. For example, temperature may vary in different parts of a single PC case. Still, the book had its truth. High temperature of a chip doesn’t tell well on its life term.
By the way, we have mentioned temperature variations inside the PC case. This is more of a problem now than it used to be before. Traditionally, the central processor is the warmest spot, but lately the chipset, the graphics processor, and even the hard disk drive have become very warm, too. Together with a complex pattern of airflows inside, the whole picture is too complicated to fully comply with the “golden rule” about 10C.
-
- Posts: 580
- Joined: Sun Aug 11, 2002 3:26 pm
- Location: USA (Phoenix, AZ)
I am very interested in this subject.
Being a slave to temperature readings is a serious barrier to achieving a quiet PC.
I have a Shuttle SS51G XPC (small form factor) with a grommet-mounted 7200 RPM 80GB Maxtor 6Y080P0.
I use SpeedFan, and I have it set to speed up my 92mm blowhole fan when my Maxtor reports higher than 40C.
Right now, ambient is 28C (NOT from a good lab thermometer, just a digital clock) and the drive is 39C (idle).
But should I even bother spinning up the fan after 40C? Maybe 45C or 50C? It would be great to know the optimum internal temperature for reliability/life. Overcooling beyond that just means unnecessary noise.
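For what it's worth, the "speed up the fan above 40C" setting boils down to a threshold check. Below is a minimal Python sketch of that logic. It is not SpeedFan's actual algorithm or API: the function name, the 3C hysteresis band (added so the fan doesn't flip back and forth when the drive hovers around 40C), and the sample readings are all assumptions for illustration.

```python
def fan_boost_wanted(temp_c: float, boosted: bool,
                     on_above_c: float = 40.0,
                     hysteresis_c: float = 3.0) -> bool:
    """Decide whether the fan should be in its boosted (faster) state.

    Hypothetical helper, not part of SpeedFan. The 40C trigger is the
    setting from the post; the hysteresis band is an assumption.
    """
    if temp_c > on_above_c:
        return True                 # drive is hotter than the trigger: boost
    if temp_c < on_above_c - hysteresis_c:
        return False                # comfortably below the trigger: relax
    return boosted                  # inside the band: keep the current state


# Made-up S.M.A.R.T. readings in degrees C, oldest first.
state = False
for reading in (39, 41, 39, 36):
    state = fan_boost_wanted(reading, state)
    print(reading, "boost" if state else "normal")
```

Raising `on_above_c` to 45 or 50 is the only change needed to try the higher trigger points asked about above; whether those are safe is exactly the open question of this thread.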
MikeC wrote: Well there is if you consider the S.M.A.R.T. temp, which is off the internal temp diode, to be a reasonable representation of internal drive temp. This would naturally include the effect of ambient temp.
I was trying to make a point that You can't easily say what the temp will be if You use Drive A in Case B, with Ambient C, apart from the fact that it will be lower than 55°C in most cases.
MikeC wrote: Regarding the "10°C rule for electronics/mechanics": A temperature rise of 10°C will halve the expected lifetime. There is some question about where this originated. I recall reading somewhere about it being pulled out of the air by some smartass contractor for the US military who wanted to sell more cooling electronic gear?... Probably totally distorted.
I've understood that this rule can be applied especially to the power supplies. Running PSUs with slow fans (or, better yet, without a fan) can kill the PSU sooner.
Jan
-
- Posts: 580
- Joined: Sun Aug 11, 2002 3:26 pm
- Location: USA (Phoenix, AZ)
I've been contemplating a SFF with the same mods, but just haven't had the need to do it yet, as I recently bought an nForce 2 mobo and some new RAM before I got the SFF modding bug. Though I think my next project will be to find an old Philco cathedral radio and put a small mobo in it to see if I can make that my primary silent system.
-
- Site Admin
- Posts: 12285
- Joined: Sun Aug 11, 2002 3:26 pm
- Location: Vancouver, BC, Canada
- Contact:
Jan Kivar wrote: 1 - I was trying to make a point that You can't easily say what the temp will be if You use Drive A in Case B, with Ambient C, apart from the fact that it will be lower than 55°C in most cases.
2 - I've understood that this rule can be applied especially to the power supplies. Running PSUs with slow fans (or, better yet, without a fan) can kill the PSU sooner.
1 - Oh, OK, you mean to predict temps? No, I totally agree: you can't predict it, but you don't need to. You can measure directly with the thermal diodes in almost any modern drive and DTemp. I'd recommend that anyone who is concerned about data safety have a drive with a thermal diode and DTemp or similar, and monitor temps at least from time to time.
2 - I have no quibble with the basic notion that more heat shortens component life -- just the precise expression of +10C = 1/2 life. I would think this depends entirely on how close you are to overheating. Say with a drive rated for safe operation to 60C internal temp. If you run it at 40C instead of 30C, it will halve the lifespan? Somehow I doubt it. But if you run it at 55C, it is much more likely to halve the life compared to running it at 45C, I would think.
MikeC wrote: 2 - I have no quibble with the basic notion that more heat shortens component life -- just the precise expression of +10C = 1/2 life. I would think this depends entirely on how close you are to overheating. Say with a drive rated for safe operation to 60C internal temp. If you run it at 40C instead of 30C, it will halve the lifespan? Somehow I doubt it. But if you run it at 55C, it is much more likely to halve the life compared to running it at 45C, I would think.
Yeah, You're right. I was trying to say this with the "optimum" temperature. Having too low a temperature can hurt the hard drive also. IIRC some guy was using a watercooling setup which had water temp lower than the ambient (and the case was isolated etc.). He mentioned that running the drive only at ~20°C made the drive whine more. I have experienced similar effects, as I mentioned in my post.
45°C is safe, if the HD sees no airflow. With airflow, the temperature will be lower. This was posted today, and clearly shows the importance of having some airflow across the drive also. It's just the level of quietness one wishes to achieve...
Cheers,
Jan
-
- *Lifetime Patron*
- Posts: 1465
- Joined: Sun Mar 09, 2003 12:27 pm
- Location: Reading.England.EU
I can't answer any of Mike's original 3 questions, but figure some thoughts I started in another thread are worth repeating.
1) Based on the relative importance we should place on the reliability of our hdd, it is worth making an effort to keep them running. Despite claims for relatively high operating temps (50/60C) and because of claims that cooler is more likely longer life, I subscribe to the 'over ambient' target. My main thought is this: with a typical hdd power consumption around 10W, I figure a well designed case/airflow/hdd location should easily be able to keep a hdd 10C over ambient, if not 5C over ambient. (OK if we enclose them for silence things change.)
2) Also because these things are mechanical, I figure the rate of change of temp is significant. Seagate 7200.7 says 20C/hour. If you do have a design that runs more than 20C above ambient, then there is every chance that on start (from cold) your hdd will heat faster than is good for it. (So maybe 20C over ambient is a sensible design max for hdd temp?)
-
- SPCR Reviewer
- Posts: 8636
- Joined: Sat Nov 23, 2002 6:33 am
- Location: Sunny SoCal
dukla2000 wrote: I can't answer any of Mike's original 3 questions, but figure some thoughts I started in another thread are worth repeating.
1) Based on the relative importance we should place on the reliability of our hdd, it is worth making an effort to keep them running. Despite claims for relatively high operating temps (50/60C) and because of claims that cooler is more likely longer life, I subscribe to the 'over ambient' target. My main thought is this: with a typical hdd power consumption around 10W, I figure a well designed case/airflow/hdd location should easily be able to keep a hdd 10C over ambient, if not 5C over ambient. (OK if we enclose them for silence things change.)
2) Also because these things are mechanical, I figure the rate of change of temp is significant. Seagate 7200.7 says 20C/hour. If you do have a design that runs more than 20C above ambient, then there is every chance that on start (from cold) your hdd will heat faster than is good for it. (So maybe 20C over ambient is a sensible design max for hdd temp?)
My thoughts exactly. Add this to the counterpoint ("What Price Data Safety") that I posted in my case review and you'll see my position on this issue, which still stands. Since I figured I had nothing new to add to this topic, I haven't posted here until now. So consider this post an exclamation point to my original counterpoint reply.
-
- Site Admin
- Posts: 12285
- Joined: Sun Aug 11, 2002 3:26 pm
- Location: Vancouver, BC, Canada
- Contact:
dukla2000 wrote: (So maybe 20C over ambient is a sensible design max for hdd temp?)
I have no issue with the position that more heat is worse than less heat (as long as we're not dipping down to too cold). The only real question I am asking is what temps are unsafe and is there empirical data to support that?
Both of dukla2000's points are sound. The second combines with the max safe temp spec provided by drive makers to give a much more curtailed high temp, especially if you turn your PC on/off as opposed to running it 24/7. (Constant rotation must be more benign than off/on for most devices like fans and hard drives; the start/stop process involves a huge number of mechanical stresses, from the jerk start to overcome inertia to large temp changes and so on, much like for a car engine, which tends to get the greatest wear & tear from ignition.)
BTW, going back to Ralf's review, we find the stated ambient is 75F = 24C. The worst case temp was 43C or 19C over ambient, within the sensible design max suggested by dukla2000. Add the 5V front fan Ralf favors, and it drops to 37C or just 13C over ambient.
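The arithmetic behind those numbers is easy to recheck. Here is a small Python sketch, using only the figures quoted in this thread (75F ambient, 43C and 37C drive temps, and dukla2000's suggested 20C-over-ambient design max). The function and constant names are just illustrative.

```python
def f_to_c(temp_f: float) -> float:
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32) * 5 / 9


def rise_over_ambient(drive_c: float, ambient_c: float) -> float:
    """How far the drive's S.M.A.R.T. temp sits above room temperature."""
    return drive_c - ambient_c


DESIGN_MAX_RISE_C = 20          # dukla2000's suggested ceiling, not a vendor spec

ambient_c = round(f_to_c(75))   # stated ambient of 75F, rounded to whole degrees C
for label, drive_c in (("no front fan", 43), ("5V front fan", 37)):
    rise = rise_over_ambient(drive_c, ambient_c)
    verdict = "within" if rise <= DESIGN_MAX_RISE_C else "over"
    print(f"{label}: {drive_c}C is {rise}C over ambient ({verdict} the 20C design max)")
```

The same check works for any drive/ambient pair, which is handy if you prefer a "rise above ambient" target to a fixed absolute limit.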
-
- *Lifetime Patron*
- Posts: 5316
- Joined: Sat Jan 18, 2003 2:19 pm
- Location: St Louis (county) Missouri USA
Me being the cheap fellow I am, I value both the data and the drive. The worst scare I ever got in this matter was after a long XP install with a low-powered computer, and a Maxtor drive. (at least 2.5 hrs)
I thought the drive had enough airflow, but it was a new untested setup. For whatever reason I opened the case immediately after the install, and I swear I burned my hand on the drive. No telling how hot it was.
Since that experience, when I install an OS, it's with the side of the case open, and a large house fan blowing in...heh.
I use 40c as a max temp point, for no particular reason except most of my setups with moderate airflow around the drives, stay under that temp.
-
- Posts: 580
- Joined: Sun Aug 11, 2002 3:26 pm
- Location: USA (Phoenix, AZ)
I don't think even the drive manufacturers know.
At work we have loads of computers in vent-blocking locations, filled to the max with dust-balls, and hard drives operating at temperatures I wouldn't dare touch. Not bad for 10 years of 10 hour days, 5 days per week. Of course, today's drives are different with their 1 year warranty period...
-
- Posts: 255
- Joined: Thu Jun 05, 2003 9:45 am
- Location: CA
Bluefront wrote: Me being the cheap fellow I am, I value both the data and the drive
seagate gives the MTBF for 7200.7 as 600,000 power-on hours @ 25'C. that's over 68 years of operation. if we take the "+10'C == 1/2 life" rule seriously, then 7200.7 running @ 35'C will last for 34 years before a failure, 17 years @ 45'C, and over 8 years @ 55'C.
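Those figures can be reproduced with a few lines of Python. This is only a sketch of the back-of-envelope reasoning: it treats the 600,000-hour MTBF as a literal lifetime at 25C, assumes 24/7 operation (8,760 hours per year), and applies the "+10C halves the life" rule as stated. The function name and defaults are made up for illustration.

```python
HOURS_PER_YEAR = 24 * 365   # 8,760 power-on hours per year, assuming 24/7 use


def naive_life_years(temp_c: float,
                     base_hours: float = 600_000,
                     base_temp_c: float = 25.0) -> float:
    """Lifetime in years if every +10C above the base temp halves the life.

    Deliberately naive: MTBF is taken as a literal lifetime, which is the
    simplification being played with here, not how MTBF is defined.
    """
    halvings = (temp_c - base_temp_c) / 10.0
    return base_hours / HOURS_PER_YEAR / 2 ** halvings


for temp_c in (25, 35, 45, 55):
    print(f"{temp_c}C: about {naive_life_years(temp_c):.1f} years")
```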
i would like you all to think back to the systems you were using 8 years ago and drives you were using then. or even better, think back 17 years and recall what system and what sort of drive you were using back in 1986. my system in 1986 was a 80186 with a 20MB disk drive. in 1995 i was running 486/66 with a 1.2GB drive .
now think about how much data you had there. and think about how much data you have now. plot the curve in your mind. it's roughly a 60-100%/year slope, i think. the average drive capacity definitely grows @ 60%/year (i did some survey and data analysis in this area some 6 months ago and my numbers matched the projections from IBM and seagate). now, consider your 120GB 7200.7 drive that probably has 20-30GB of data on it. you will run out of capacity on that drive in about 4 years. at that point the average size of a disk drive in "moderate" price range (say $100-150) will be around 500GB, and you are probably going to be upgrading your system at that point, anyway.
so if you are pro-active and move your valuable data from where it is now to a new disk on a new system every 4 years or so, you should be keeping well away from sudden heat death of your disk drive. if you start noticing a lot of seek delays and recalibration grinding of a drive, and use that as a sign that it's time to migrate the data, you can accommodate even cases of really badly made drives (i am thinking here of my 30GB IBM deskstar that started making nasty noises and grind-seeking after less than 2 years of fairly cool operation).
so unless i am building a system that will be entombed in a wall and has to work unattended for the next 20 years, based on this discussion and my thinking on the matter, i expect that i will not be worrying about a disk drive running @ 55'C.
i am disconnecting the 40mm fan on my formerly and soon again fanless via box as soon as i get home.
josephclemente wrote: I don't think even the drive manufacturer's know.
perhaps we should ask....
-
- Posts: 226
- Joined: Sat Sep 06, 2003 5:59 am
- Location: Finland
I found the following white paper on Maxtor's web site. It discusses choosing a hard drive for DVR/PVR systems and has a lot of stuff that's relevant for silent systems. An interesting read. The temperature graphs seem to pan out with what I'm observing with my two 7200 rpm DiamondMax Plus 9 SATA drives. I have a Sonata case, which has restricted air flow around the drive cages and rubber grommet mounting, and my SMART temps are typically hovering around 45 C.
grandpa_boris wrote: ... i expect that i will not be worrying about a disk drive running @ 55'C...
My thoughts exactly, at least for a Barracuda drive.
I have never cooled a hard drive and I have never experienced a "heat-related failure". Nobody else that I know has ever experienced a "heat-related" hard drive failure. Although we have had manufacturers acknowledge various mechanical failures on occasion, these were inherent (structural) issues and (according to the manufacturer) not related to heat. Even the dreaded DeathStar drives, which I have had plenty of, are now made safe with the recent downloadable IBM firmware fixes.
My own safety-temp values, where I start to get worried about a hard drive, are 56C for a Barracuda drive and 51C for all others. I've run various drives for years, in on/off fashion too, at temps just under these and never had a single issue.
If I ever found a drive hitting these temp-marks, this would indicate a bigger problem, and I would want to find a better case-cooling solution rather than putting a fan in front of the hard drive.
I do understand the feelings of those that worry about cooling their hard drives though, as we are all sensitive to heat when minimizing noise from our PC's. I just think that most people waaaaaaaaaay underestimate the 'workhorse' nature of our hard drives.
-
- *Lifetime Patron*
- Posts: 1465
- Joined: Sun Mar 09, 2003 12:27 pm
- Location: Reading.England.EU
BEWARE THE MAGIC MTBF ILLUSION!!!
al bundy (et al) - Sure in the past/historically/actual data very few (if anyone) have actual heat related hdd failures. But as per investment performance, past experience is no guarantee of the future.
grandpa_boris wrote: ... that's over 68 years of operation. ...
Not necessarily. My stats is virtually zero, but I remember somewhere a good post on what MTBF really means, and this Googled result is more or less what I remember. Now I can't work the arithmetic in the example:
R = exp(-43800/250000) = 0.839289
But bottom line, with a 7200.7 over (say) a 4 year life then the stats is actually saying there is an x% (say 92%? - I can't interpret what exp function is in that equation!) probability my drive will last that long.
By looking after the drive environment I am trying to increase the probability of no failure. Coming back to the 'bad' environments: again the stats is only saying the probability that you will survive is lower, not necessarily zero. In no way am I finger pointing or asserting your drive WILL fail: it is just an inner smugness that I believe my drive has a better chance of lasting 4 years than yours.
[edit] ps - managed to work the arithmetic: it is natural log (e) based. So for 600000 MTBF, 4 year life, probability is 94.3% of operation without failure. [/edit]
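A minimal sketch of the arithmetic above, assuming the constant-failure-rate (exponential) model that R = exp(-t/MTBF) implies. The function name `survival_probability` and the 8,760-hours-per-year conversion are illustrative assumptions, not anything taken from a manufacturer's document.

```python
import math

HOURS_PER_YEAR = 8760  # 24 * 365, ignoring leap days (assumption)

def survival_probability(hours, mtbf_hours):
    """Probability of no failure over `hours`, assuming a constant failure rate of 1/MTBF."""
    return math.exp(-hours / mtbf_hours)

# The Googled example: 5 years (43,800 h) against a 250,000 h MTBF.
print(round(survival_probability(43800, 250000), 6))  # 0.839289

# A 7200.7 at its rated 600,000 h MTBF over a 4-year life.
print(round(survival_probability(4 * HOURS_PER_YEAR, 600000), 3))  # 0.943
```

The model only holds if the failure rate really is constant over the period, which is itself an assumption about the drive's useful life.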
Last edited by dukla2000 on Thu Oct 23, 2003 2:52 am, edited 1 time in total.
-
- Patron of SPCR
- Posts: 700
- Joined: Thu Mar 13, 2003 2:38 pm
- Location: California, US
- Contact:
grandpa_boris wrote:seagate gives the MTBF for 7200.7 as 600,000 power-on hours @ 25'C. that's over 68 years of operation. if we take the "+10'C == 1/2 life" rule seriously, then 7200.7 running @ 35'C will last for 34 years before a failure, 17 years @ 45'C, and over 8 years @ 55'C
I think you may be misinterpreting the MTBF rating. StorageReview has a good article on the topic. Hard drives don't really last half a century.
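For reference, the figures in the quoted post can be reproduced with the sketch below. It only replays the "+10C halves the life" rule of thumb as quoted, deliberately treating MTBF as a lifespan (the very reading being questioned here). The function name and default parameters are hypothetical.

```python
HOURS_PER_YEAR = 8760  # 24 * 365 (assumption)

def halved_life_years(mtbf_hours, temp_c, base_temp_c=25.0, step_c=10.0):
    """Apply the '+10 C halves the life' rule of thumb, treating MTBF as a lifespan."""
    halvings = (temp_c - base_temp_c) / step_c
    return (mtbf_hours / HOURS_PER_YEAR) / (2 ** halvings)

# 600,000 power-on hours at 25 C, then halved for every 10 C above that.
for temp_c in (25, 35, 45, 55):
    print(f"{temp_c}C: {halved_life_years(600000, temp_c):.1f} years")
```

This gives roughly 68, 34, 17 and 8.6 years, matching the quoted numbers.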
-
- Posts: 255
- Joined: Thu Jun 05, 2003 9:45 am
- Location: CA
SometimesWarrior wrote:I think you may be misinterpreting the MTBF rating.
deliberately so. my point is that the disk will be practically useless and subject to replacement with a cheaper, better drive long before it reaches the end of its useful life. so shaving that life span down by running it near the operating limits may not be such a great threat.
-
- Posts: 255
- Joined: Thu Jun 05, 2003 9:45 am
- Location: CA
disk temperature vs longevity
this is very embarrassing. this message was supposed to be a personal note to MikeC, hence the questions about possibly hosting images, obviously incomplete information, and some specifics that i am now editing out. i should be more careful next time. but i decided to leave the message posted because it may be of general interest after all...
in a recent discussion, you said:
MikeC wrote:1 - Oh, ok, you mean to predict temps? No I totally agree, you can't predict it but you don't need to, you can measure directly with thermal diodes in almost any modern drive and DTemp. I'd recommend anyone who has concern about data safety to have a drive with thermal diodes and DTemp or similar and to monitor temps at least from time to time.
actually, you can to some extent. at the USENIX FAST 2003 conference there was a paper presented on calculating power consumption of a disk drive given a work load pattern. it isn't yet available to non-USENIX members and the math was sufficiently baroque that it's probably of little interest to anyone outside of academia.
MikeC wrote:2 - I have no quibble with the basic notion that more heat shortens component life -- just the precise expression of +10C = 1/2 life. I would think this depends entirely on how close you are to overheating. Say with a drive rated for safe operation to 60C internal temp. If you run it at 40C instead of 30C, it will halve the lifespan? Somehow I doubt it. But if you run it at 55C, it is much more likely to halve the life compared to running it at 45C, I would think.
i had an opportunity to ask my contacts within a disk manufacturer's research arm to see what they can find out about the relationship between disk temperatures and disk longevity and reliability, and take a couple of minutes in our recent meeting to give me a synopsis.
turns out they don't have much to tell and all that they did have was proprietary, internal and subject to the usual ugly NDAs.
i don't have any hard numbers or charts to pass on the forums here. they didn't have anything that was publicly available, but promised they'll look for public info they can pass on to me. if that happens and if you are interested in placing it on this site, i'll get it to you.
what it all comes down to is that if a disk has an error rate of X @ 25°C, it is derated to (.44 * X) @ 45°C and (.2 * X) @ 65°C. at the nominal temperature of 25°C, a typical disk's service life is 5 years. they use temperature-induced aging to stress test drives, but they don't publish the data at higher temperatures. the implication is that the service life is guaranteed within the operating range of the drive. the manual for 7200.7 states "Actual drive case temperature should not exceed 69°C (156°F) within the operating ambient conditions.". does that mean i can run my disk @ 68°C for 5 years? they didn't know.
if i get any solid info, it may be worth sharing it with the people here. i have no way of hosting images of charts or pdf copies of papers, if they get me any. would it be possible to have you host them on SPCR if they aren't too big and of sufficiently broad appeal to the people here?
Last edited by grandpa_boris on Fri Oct 31, 2003 10:31 am, edited 1 time in total.
-
- *Lifetime Patron*
- Posts: 1465
- Joined: Sun Mar 09, 2003 12:27 pm
- Location: Reading.England.EU
Interesting stuff. In particular the numbers for the error rate decrease (well I guess the quoted number will decrease => an increase in the number of errors) as temp increases. But after that I feel it is all lies, damned lies and statistics
I can understand they are loath to publish raw data: based on the litigation nature of some societies it is simple to imagine the consequences. And it also sets them up for a willy-waving spec contest with other manufacturers.
But your notes did make me recheck the 7200.7 Sata specs (Publication number: 100270024, Rev. C) which has under Reliability (pg 21) "Mean time between failures (MTBF) 600,000 power-on hours (nominal power, 25°C ambient temperature)". (My emphasis) I am sure the error rate degradation numbers you quote are used to 'normalise' the temperature induced aging. So the following may be misuse of correct data for an incorrect purpose, but what the hell
600000 hours @ 25C becomes
264000 hours @ 45C and
120000 hours @ 65C
Now the probability of no failure during a 5 year life becomes
0.93 @ 25C
0.85 @ 45C and
0.69 @ 65C
And presumably the probability at 65C is starting to be 'significantly low' which is why they spec the environment max as 60C? Now if these stats are 'meaningful' then for every 100 systems delivered, a system builder can expect between 7 and 30 hdd failures in 5 years (depending on the ambient temps): any builders out there with any records of failures?
One pedantry: I suggest your "the implication is that the service life is guaranteed within the operating range of the drive" would be more accurately written as "the implication is that the service life has high statistical probability of no failure within the operating range of the drive". I am sure it is not their intention to guarantee anything!
But certainly anything you can get would interest me: not least even their own numbers or correction of any GIGO I may be propagating!
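The numbers above can be reproduced with a short sketch. It assumes, as the post itself does, that the relayed derating factors (0.44 at 45C, 0.2 at 65C) can be applied directly as multipliers on the rated MTBF, and that failures follow the exponential model R = exp(-t/MTBF). The names `DERATING` and `five_year_survival` are hypothetical.

```python
import math

HOURS_PER_YEAR = 8760       # 24 * 365 (assumption)
RATED_MTBF_HOURS = 600000   # Seagate's 7200.7 figure at 25 C ambient

# Derating factors as relayed in this thread; treating them as MTBF
# multipliers is an assumption, not a manufacturer's published method.
DERATING = {25: 1.0, 45: 0.44, 65: 0.2}

def five_year_survival(temp_c, years=5):
    """P(no failure) over `years`, using the derated MTBF in R = exp(-t/MTBF)."""
    mtbf = RATED_MTBF_HOURS * DERATING[temp_c]
    return math.exp(-years * HOURS_PER_YEAR / mtbf)

for temp_c in sorted(DERATING):
    mtbf = RATED_MTBF_HOURS * DERATING[temp_c]
    print(f"{temp_c}C: MTBF {mtbf:.0f} h, P(no failure in 5 years) = {five_year_survival(temp_c):.2f}")
```

Multiplying one minus each probability by 100 gives the expected failures per 100 systems under these assumptions.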
-
- Posts: 255
- Joined: Thu Jun 05, 2003 9:45 am
- Location: CA
dukla2000 wrote:One pedantry: I suggest your "the implication is that the service life is guaranteed within the operating range of the drive" would be more accurately written as "the implication is that the service life has high statistical probability of no failure within the operating range of the drive". I am sure it is not their intention to guarantee anything!
for the consumer market this is most likely the case (although seagate seems to be willing to make some efforts to improve on that image in some markets). but for the "enterprise" market segment (i.e. major OEMs like Sun, EMC, etc.) they will support and fix drives within their service life.
I try not to be a slave to component temps, but a little OCD kicks in and I get obsessed with it. I think the 55 degree number rings with me, and my WDs seem to be hanging around 43-45 full time. That said, I have been repairing OEM PCs for a long time, and as manufacturers have sought to stuff more stuff into a smaller space I have encountered some really hot and poorly ventilated drives in systems from Dell, Compaq and IBM, and none of their failure rates have struck me as high enough to worry about. I have one client with over 300 GX 50s which have the drive mounted in the front, top half of small desktop cases with practically zero airflow, and a lot of heat from the case seems to flow right UP into these drives. If you shut one of these machines down and check temps they are always in the high 50s or worse, and so far after 2 years of 24/7/364 usage I have only replaced one hard drive for this customer. I won't do the math, but judging by what I have seen from Dell and IBM in the past, if temps that high were really that much of a life shortener, they would find a solution or change case design to reduce warranty replacement cost.
I have to amend that, I ran Dtemp for the first time since I rearranged my cabling and changed heatsinks, both my WD 40 Gig SEs are idling at 28 degrees. Gotta' like that, although I'm not sure how accurate those diodes are? Any opinion?
So after all this debate, it is still undecided and it's left up to what you're comfortable with? But it seems that the general consensus is that 55C is within the safe range, correct?
I ask this because I had just installed a cdrw so I had to re-arrange my setup with 2 hard drives and another cdrom. So my cables are all jumbled now and my hdd's aren't mounted the way I want them.
I'm not sure on what these max safe temperatures are based. I assume this is for the cover gaskets and drive bearing oil?
In the worst case scenario, the actual data on a hard drive should be safe up to approximately 500F, even if the circuitry may not survive. You can probably set the drive in a wood stove for an hour, then send it to a drive recovery service and get all the data back.
Somewhere around 500F there is a problem where the magnetic domains in metal can relax, and cause the data to essentially fade away as the domains begin realigning on the recording surface to a pattern of lowest energy.
I seem to recall that all modern circuit boards are built using a process called wave soldering where the whole board with components is dipped into a huge pool of liquid solder at around 350 degrees F, so the circuitry itself can therefore survive a nonoperating temperature at least that high. Even those ribbon cables in a drive are often wave soldered, so they too can withstand such high heat.
When operating, the circuitry will obviously generate heat, but the heat output is fairly small and stable, since the drive only needs to maintain a constant RPM with fairly low-friction bearings. Therefore, the circuitry itself could probably tolerate running at least at 300 degrees F continuously without failure.
Probably the real temperature concern is for the following two items:
- foam/rubber gaskets/insulators/isolators
- spindle motor/bearing oil
The drive cover is sealed to the frame using a gasket, and in all likelihood, this is a cheap foam-rubber gasket. This probably cannot handle temperatures in excess of 250 degrees F before it begins to melt and bubble.
In the worst case, the foam seals could liquify, flow into the drive, and get on the platters, gumming up the head/arm assembly. Or it could melt and form a gap between the cover plates, and allow dust to get inside that eventually crashes the drive heads.
Additionally, many drives have foam or a plastic insulating sheet under the circuit board to insulate it from the metal drive frame. It is certainly possible for this plastic to shrink or perhaps melt, and perhaps warp the circuit board until the board cracks or touches a wire to bare metal.
The other weak spot is likely the spindle lubricating/bearing oil. Get the drive sufficiently hot, and you can probably boil the oil right out of the drive's motor bearings, evaporating the oil until the bearings are dry and the thing will no longer spin.
Since this oil is already likely very thin and light, it probably does not take too much heat to get the internal bearing pressure high enough until it leaks out of the seals and escapes into the air.
Very old drives develop a problem known as stiction, which is some sort of failure of the bearings. Either the oil has leaked out, or the high heat over the years has caused the oil to form a thick, sticky jelly that makes the spindle difficult to turn. Usually drives with stiction can be manually spun up with inertial techniques, and once running will continue to run, but if stopped for a while will go right back to being stuck again.
-Scalar