I need some help diagnosing my system instability

guises
Posts: 55
Joined: Thu Oct 26, 2006 11:48 am

I need some help diagnosing my system instability

Post by guises » Thu Apr 28, 2011 3:37 pm

Over the last couple of weeks my system has gotten increasingly unstable and I'm out of ideas on what the cause could be, so I'm asking here to try and find someone smarter than me.

I can reliably crash my system within a few minutes of starting Portal 2 (once I get into the 3D portion of the game; I can tool around in the menu for quite a while) or in less than a minute with Tron: Evolution. A less taxing game, Majesty 2, crashes sporadically but not reliably. I can actually play that one as long as I save pretty often. That's in Windows XP; however, the problem is not limited to Windows. Linux (OpenSUSE) is also crashing, though I don't have any taxing 3D games that can make it happen on command.

Because this is happening in different operating systems with different drivers, etc., I think this is a hardware problem. At first I suspected heat, because of the way that it doesn't crash immediately, but I've tried the following:

- Downclocked my CPU to stock, GPU and memory were already at stock. No change.

- Pulled off the side panel and turned a box fan on the system, figuring that I'd be able to rule out any heat problems definitively that way. Temps did go down, but they weren't that high in the first place, and there's no change in stability.

- Lengthy (7+ hour) memtest with no errors.

- Prime 95, no errors.

- Swapped power supplies, no change.

So... All I can think is that some random thing is wrong with either the graphics card or the motherboard. The motherboard is an MSI 945GT Speedster-A4R and the card is an XFX Radeon 5770. Using integrated Realtek audio. System is watercooled with a Reserator 1 along with one 120mm Noctua fan at low speed... Can't think of anything else that'd be relevant. I'd appreciate any flashes of brilliance that any of you might come up with.
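
The process of elimination in this post can be written down as a small sketch. It is only an illustration: the suspect names and the mapping from each check to the component it clears are assumptions made for the example, not an authoritative diagnostic table.

```python
# Hypothetical list of suspects; the names are chosen for illustration only.
SUSPECTS = {"CPU", "RAM", "PSU", "case airflow", "OS/drivers", "GPU", "motherboard"}

# Each check reported in the post, mapped to the suspects it is ASSUMED to clear.
CHECKS_PASSED = {
    "crashes in both Windows XP and OpenSUSE": {"OS/drivers"},
    "CPU downclocked to stock": set(),  # rules out the overclock, not the CPU itself
    "side panel off plus box fan": {"case airflow"},
    "7+ hour memtest, no errors": {"RAM"},
    "Prime95, no errors": {"CPU"},
    "swapped power supplies": {"PSU"},
}


def remaining_suspects(suspects, checks):
    """Return the suspects that no passed check has cleared, sorted by name."""
    cleared = set().union(*checks.values())
    return sorted(suspects - cleared)


print(remaining_suspects(SUSPECTS, CHECKS_PASSED))
```

Under these assumptions, the only suspects left are the graphics card and the motherboard, which matches the conclusion above.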

ces
Posts: 3395
Joined: Thu Feb 04, 2010 6:06 pm
Location: US

Re: I need some help diagnosing my system instability

Post by ces » Thu Apr 28, 2011 4:13 pm

All you can do is swap things in and out to isolate the problem... Just for the heck of it I would try a new install of the operating system on a fresh C: drive, so you at least have a clean base to start from as you test for other things. You can keep your existing boot drive as is.

Zolishoru
Posts: 85
Joined: Sun Feb 06, 2011 11:19 pm
Location: North of the 49th parallel

Re: I need some help diagnosing my system instability

Post by Zolishoru » Thu Apr 28, 2011 10:28 pm

I've had similar problems with my ex-video card (Radeon HD 5750), caused by an improperly seated aftermarket cooler (by me :oops: ).
Try running Furmark; if the system fails, it's time to check the video card.

andymcca
Posts: 404
Joined: Mon Jan 11, 2010 8:19 am
Location: Boston, MA, USA

Re: I need some help diagnosing my system instability

Post by andymcca » Fri Apr 29, 2011 3:54 am

I would lean VERY heavily towards a video card problem. ces is right, switching things out will narrow it down. Since Prime95/memtest worked, you can assume your CPU & RAM are okay. Memory errors would probably be a bit more random, anyway.

Could also be PSU: you are definitely drawing more power when launching a game. Edit: oops, missed that bit of OP

Could also be the motherboard: If you have replaced everything else, then this could be the issue. But don't leap to this, as it is a pain to replace, especially if you didn't have to!

Edit: and by replace, I mean swap out with old parts. Do you have an old video card around?

Edit2: PS have you checked the fan on the video card? Is it working? Is it clogged up with dust? I had a laptop fail one time because of a solid wall of dust and hair (gross) across the fan exhaust.

Edit3: Sorry for the many edits :). You could also just try re-seating the video card. Could be an interface problem.

fumino
Posts: 298
Joined: Thu Aug 05, 2010 4:38 pm
Location: ontario

Re: I need some help diagnosing my system instability

Post by fumino » Fri Apr 29, 2011 4:55 am

hwmonitor.
furmark.
cinebench.
intel burn test.

get these, and let's figure this out.

hwmonitor to monitor temps, which is key.
furmark to see if it's video card specific.
pay attention to how cinebench crashes (if it does). different crashes mean different things. also, make sure to try both the normal tests and the opengl one.
intelburntest, on maximum. nothing will produce more heat, use more cache, and generally tax your cpu and ram this fast and this hard. within 5 passes of IBT you should see failure. that's a lot quicker than running memtest and prime95 for 24hrs+ each.


given the age of your system, it could be capacitor wear on your motherboard. give it a quick look to see if any of the caps are bulging or ruptured.

guises
Posts: 55
Joined: Thu Oct 26, 2006 11:48 am

Re: I need some help diagnosing my system instability

Post by guises » Fri Apr 29, 2011 3:50 pm

Well, all right. Results:

linpack - no errors
cinebench - no errors (doesn't seem to push the GPU that hard, never got above 55 degrees C)

Furmark did the trick, crashing quickly at 71 and 75 degrees when I ran it at 1920x1080 (my monitor's resolution). Did that twice. However, it's a little more interesting than that - at 1280x720 (windowed) it started developing artifacts at roughly the three minute mark (83 degrees) and didn't crash until about four and a half minutes in, same temperature. That's a relatively long time. At the lowest resolution, 640x360, it wouldn't crash at all, even with maximum anti aliasing and "Xtreme burn-in" enabled. Did that twice for about twenty minutes each time before I closed it. At the lowest resolution it hovered around 76 degrees.

I'm not sure that temperature is a great indicator of what's going on: the highest temperature that I witnessed still isn't really all that high for a GPU, and it was stable with no artifacts at the lowest resolution despite running hotter there than it did when it crashed at 1920x1080. So... That's just core temperature, what about the VRMs, you ask? Well, okay, I don't know. This is a watercooled system and there's a waterblock on the GPU, but the VRMs have no cooling on them whatsoever. Before you say "Aha!" let me point out that the VRMs have no cooling by default; the cooler that came with the card only touches the core. Now, the stock cooler is a down-blowing design, so it would likely push some air over them, but I had a big honking box fan pushing a lot of air into the system for all of these tests, so unless the VRMs have been somehow permanently damaged from long-term overheating (is that possible?) then that wouldn't explain it.
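
To make the comparison above concrete, here is a minimal sketch that tabulates the three Furmark runs from this post and asks whether any single threshold on pixel count, or on core temperature, would separate the crashing runs from the stable one. The `Run` class and the `separates` helper are hypothetical names invented for this example, the temperatures are the highest values quoted for each resolution, and three data points are far too few to prove anything.

```python
from dataclasses import dataclass


@dataclass
class Run:
    width: int
    height: int
    peak_temp_c: int  # highest core temperature quoted in the post
    crashed: bool

    @property
    def pixels(self):
        return self.width * self.height


# Values taken from the post; settings were not identical across runs.
RUNS = [
    Run(1920, 1080, 75, True),   # crashed quickly, at 71 and 75 degrees
    Run(1280, 720, 83, True),    # artifacts at ~3 min, crash at ~4.5 min
    Run(640, 360, 76, False),    # ~20 minutes, twice, no crash
]


def separates(runs, key):
    """True if one threshold on key(run) splits crashed runs from stable runs."""
    crashed = [key(r) for r in runs if r.crashed]
    stable = [key(r) for r in runs if not r.crashed]
    return min(crashed) > max(stable) or max(crashed) < min(stable)


print("pixel count separates runs:", separates(RUNS, lambda r: r.pixels))
print("core temperature separates runs:", separates(RUNS, lambda r: r.peak_temp_c))
```

With these numbers, pixel count cleanly splits the crashing runs from the stable one while core temperature does not, which is consistent with the suspicion that core temperature alone is not the trigger.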

Anyway, I think that's a pretty strong indication that the problem is with the graphics card. I can say with some confidence that the power supply isn't the issue: I tried two different power supplies with exactly the same results. Also, physically, I don't see any obvious burn-outs or blown capacitors anywhere.

Thanks for everyone's input. I suppose I'll need to get a new video card, though I'm a little reluctant given that this one seems to work just fine. Only not for very long...
