New hardware - System halts/freeze at random - Wall of text!

Nevyn2
Posts: 43
Joined: Thu Jun 19, 2003 11:40 am
Location: Swe

Post by Nevyn2 » Fri Sep 04, 2009 3:31 am

Hi all!

## General description ##

I just bought some new hardware, and it is giving me a problem that I don't know what to do with. After assembly and installation of an OS, the computer hangs/freezes from time to time. I haven't found a single action that causes this; it is totally random. I could be surfing the web, playing WoW, etc.

When it freezes it stops responding in any way. I can't even toggle Caps Lock, and I have to restart it with the reset/power button. There are no artifacts on the screen and no BSOD, and the audio disappears completely (no leftover stuttering or anything). The freezes/hangs always occur inside Windows; so far it has never happened during POST or while the OS is booting.

## Hardware and OS ##

Case: Antec SLK3700AMB (With two 120mm fans, 2 x Noctua NF-P12 directly plugged in to the mobo)
Mobo: Gigabyte GA-MA790XT-UD4P (Updated to the latest BIOS F5)
RAM: Corsair XMS DDR3 4GB (9-9-9-24) (TW3X4G1333C9A) 2x2GB
Graphics: XFX ATI Radeon HD4770
Processor: AMD Phenom II X3 720 BE (Stock HS and fan with AS5 applied) (Both CPU and chipset run around 40-45 degrees C)
HDD: Samsung SpinPoint 500GB SATA (HD501LJ) No RAID, just one disk.
PSU: Corsair CMPSU-520HX 520W
DVD: Optiarc DVD RW AD-7243S SATA

Nothing is or has been overclocked. I have not made any changes in the BIOS; most of the settings are on automatic.

Windows Vista Ultimate 64-bit SP1
Windows XP Pro 64-bit SP2
Windows XP Pro 32-bit SP3

## Detailed description of problem ##

I started off by installing Windows Vista Ultimate 64-bit SP1 and the installation was a success, no problems there. I then proceeded with the first logon, and 2-3 minutes after that the computer rebooted itself. On its way up after the POST it couldn't find the bootloader for Vista. I then did a number of re-installs with the same result. Finally, after one of these reboots, the system did find the bootloader. After this the system stopped rebooting by itself and allowed me to stay in the system.

The problem now is that the system freezes/hangs at any given moment. This usually happens after a few minutes, so there was basically no time to install drivers and so forth. I managed to check the event viewer but couldn't find any clues to these system halts.

I then tried installing Windows Vista Ultimate 64-bit SP1 on another HDD (Samsung SpinPoint 80GB SATA SP0812C). I had some luck and the system didn't reboot the first time, but the system halts are still present.

I then tried installing Windows XP Pro 32-bit SP3 and later the 64-bit version on the smaller HDD (Samsung SpinPoint 80GB SATA SP0812C), and here there were no reboots after the first logon at all. The system halts are still present in both these OSs, but not nearly as often as in Windows Vista. I am now running Windows XP Pro 64-bit SP2 with all the drivers for the hardware installed, and the system runs fine when it is actually running, but the system halts occur around 1-2 times a day now.

## What has been tried ##

- Ran MEMTEST-86 for one pass, took about one hour with no errors
- Ran HUTIL to check the Samsung SpinPoint 500GB SATA disk, no errors
- Updated the BIOS to the latest version F5.
- Tried running the system with both Optimized defaults and fail-safe defaults.
- Ran all the tests in 3Dmark06 to see if that could trigger any problems, went ok.
- Tried a different HDD, as I wrote earlier.
- Stressed the CPU with StressCPU from UBCD without any problem, ran it for about 30 min.
- I have tried the following different setups with the memory:
  - Bank 1 and 2 - Dual channel mode with Unganged mode
  - Bank 1 and 2 - Dual channel mode with Ganged mode
  - Bank 1 and 3 - Single channel mode with Unganged mode
  - Bank 1 and 3 - Single channel mode with Ganged mode (Currently under testing!)

I hope I got all the info in there, and hopefully someone here can help me with ideas on how to test further and nail down the hardware that is causing this, because I'm running out of ideas and have limited access to replacement hardware for testing. Also, since I don't know what is causing this, I don't know which manufacturer to turn to for help. I can't say exactly why, but to me it looks like a hardware issue.

Almost forgot to mention that I did get some errors in the event viewer that looked like this:

Windows Vista:
Source: WMIxWDM
EventID 106

"Event filter with query "SELECT * FROM __InstanceModificationEvent WITHIN 60 WHERE TargetInstance ISA "Win32_Processor" AND TargetInstance.LoadPercentage > 99" could not be reactivated in namespace "//./root/CIMV2" because of error 0x80041003. Events cannot be delivered through this filter until the problem is corrected."

Windows XP:
Source: WMIxWDM
Event ID: 106

Description:
Machine Check Event reported is a corrected error.

0000: 00000001 00000001 dde4c3f4 01ca2bb5
0010: 00000000 00000000 00000004 00000000
0020: 001d018b 94324c50 2b6eab00 00000001
0030: 00000000 00000000 00000000 00000000
0040: 00000000 00000000 00000000 00000000
0050: 00000000 00000000 00000000 00000000
0060: 00000000 00000000 00000000 00000000

This disappeared when I changed the motherboard connector for the front fan. I noticed that the front fan was running 300-400 rpm slower than the rear fan, so I moved it to a different connector. After that the WMIxWDM log entries disappeared under Windows XP and the fan now runs at the same speed as the rear one. I don't know whether they would have disappeared under Vista too, since I hadn't made this change yet at that point.
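
For anyone who wants to read the raw XP event data rather than just the description, here is a minimal Python sketch. It rests on an assumption that is not confirmed anywhere in this thread: that the dump is a little-endian machine-check record laid out like the Windows MCA_EXCEPTION structure (version, type, FILETIME timestamp, CPU number, bank number, then the MCi_STATUS and MCi_ADDR registers). The status bit positions (VAL = 63, UC = 61, ADDRV = 58) are the standard x86 machine-check ones. The function name `decode_mca` is made up for illustration.

```python
import struct
from datetime import datetime, timedelta, timezone

# First three rows of the event data, copied from the post (32-bit words).
WORDS = """
00000001 00000001 dde4c3f4 01ca2bb5
00000000 00000000 00000004 00000000
001d018b 94324c50 2b6eab00 00000001
"""


def decode_mca(words_text):
    """Decode the dump, ASSUMING a little-endian MCA_EXCEPTION-style record."""
    dwords = [int(word, 16) for word in words_text.split()]
    raw = struct.pack("<%dI" % len(dwords), *dwords)
    # Assumed field order: version, type, FILETIME timestamp, CPU number,
    # reserved, bank number (low byte of 8), MCi_STATUS, MCi_ADDR.
    version, kind, filetime, cpu, _reserved, bank, status, addr = struct.unpack_from(
        "<IIQIIQQQ", raw
    )
    # FILETIME counts 100 ns ticks since 1601-01-01 UTC.
    timestamp = datetime(1601, 1, 1, tzinfo=timezone.utc) + timedelta(
        microseconds=filetime // 10
    )
    return {
        "timestamp": timestamp,
        "cpu": cpu,
        "bank": bank & 0xFF,
        "valid": bool((status >> 63) & 1),          # VAL bit
        "uncorrected": bool((status >> 61) & 1),    # UC bit
        "address_valid": bool((status >> 58) & 1),  # ADDRV bit
        "address": addr,
    }


if __name__ == "__main__":
    for name, value in decode_mca(WORDS).items():
        print(name, "=", hex(value) if name == "address" else value)
```

If the layout assumption holds, the record decodes to a valid, corrected (UC bit clear) error in bank 4 with a timestamp in early September 2009, which at least agrees with the "corrected error" description and the posting date. On AMD processors of this generation bank 4 is, as far as I know, the northbridge/memory-controller bank, but treat that reading as a hint and not a diagnosis.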

Thanks in advance

protellect
Posts: 312
Joined: Tue Jul 24, 2007 3:57 pm
Location: Minnesota

Post by protellect » Fri Sep 04, 2009 10:27 am

Sounds like you have gone through most of the general troubleshooting.

I'll throw you a couple more ideas.

1. I'd try an Ubuntu live CD, see if that crashes.
2. I'd feel the heatsinks on the motherboard.
Just because the CPU is within specs doesn't mean the northbridge isn't roasting.
3. Swap video cards.
4. Try a power supply tester or a different power supply.
5. Swap motherboards or CPU if possible.

kittle
*Lifetime Patron*
Posts: 336
Joined: Thu Nov 09, 2006 4:44 pm
Location: San Jose, CA

Post by kittle » Fri Sep 04, 2009 4:25 pm

I suspect the PSU / wall socket or your motherboard.

Do you have it plugged into a surge suppressor?

Try another PSU.
Try your same PSU in another PC.

Next time it crashes, go back into the BIOS and look at the temps. What are they?

K.Murx
Posts: 177
Joined: Tue Mar 17, 2009 10:26 am
Location: Germany

Post by K.Murx » Fri Sep 04, 2009 7:55 pm

I am using almost exactly the same configuration (the only difference is the drives) without problems under Win7 RC. So it should not be a general problem with your hardware configuration. The Northbridge is fine with similar CPU temps.

My initial suspect would be a faulty MB. However before sending that in for replacement, I would also try a Linux Live CD (or USB stick), to rule out any software issues.

Kate
Posts: 194
Joined: Tue Dec 23, 2008 11:15 pm
Location: *-Home-*

Post by Kate » Fri Sep 04, 2009 8:20 pm

hmmm

I had a similar problem with my computer when i first built it...

I had to remove ALL the memory modules, install Vista with only 2GB (using only ONE slot, not in dual channel), then update to SP1, and re-install the rest of the memory...

Maybe you could try...

Nevyn2
Posts: 43
Joined: Thu Jun 19, 2003 11:40 am
Location: Swe

Post by Nevyn2 » Sun Sep 06, 2009 6:18 am

Hi all and thanks for your replies. I have been away for the weekend, but now I'm back for some more trial and error.

This can now be added to the Tried so far list.

- Bank 1 and 3 - Single channel mode with Ganged mode
- Tested the RAM sticks one by one in single channel mode and in both ganged/unganged mode
- Replaced the SATA cables to the HDD and DVD and tested them on the internal chipset controller and Gigabyte's own SATA controller
- Disconnected the DVD
- Tried a PS/2 keyboard
- Replaced the signal cable between the screen and graphics card.
- Tried a different wall socket, although in the same room. None of the sockets are grounded, but since my old system was plugged into the same wall socket with this very PSU I don't think that is the issue.
- The temps are around 40-45 for the system after a freeze when I reboot and check in the BIOS. I also felt the different heatsinks and they do not feel all that hot.

I have also seen a few new Event 106 warnings in the event log again, the ones I wrote about in my first post. They were gone for a day or two but have now come back. I have also received 2-3 BSODs right after I typed in the password for Windows, when the system starts to load my profile etc. Here is the log from event viewer:

Event ID: 1003
Source: System error
Error signature

BCCode : 1000007e BCP1 : FFFFFFFFC0000005 BCP2 : FFFFFADF827E4256
BCP3 : FFFFFADF90C61B00 BCP4 : FFFFFADF90C61510 OSVer : 5_2_3790
SP : 2_0 Product : 256_1

Maybe someone here is able to decipher this.
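
As a starting point for deciphering it, the fields of that error signature can be pulled apart mechanically. The Python sketch below is only an illustration: the function names are made up, and the two lookup tables contain only the codes that appear in this signature. Per Microsoft's bug check documentation, 0x1000007E is SYSTEM_THREAD_EXCEPTION_NOT_HANDLED_M, its first parameter is the exception code, and the second is the address where the exception occurred. 0xC0000005 is the NTSTATUS value for an access violation.

```python
import re

# Error signature copied from the event log entry in the post.
SIGNATURE = """
BCCode : 1000007e BCP1 : FFFFFFFFC0000005 BCP2 : FFFFFADF827E4256
BCP3 : FFFFFADF90C61B00 BCP4 : FFFFFADF90C61510 OSVer : 5_2_3790
SP : 2_0 Product : 256_1
"""

# Only the codes seen in this thread; not a complete table.
BUGCHECKS = {0x1000007E: "SYSTEM_THREAD_EXCEPTION_NOT_HANDLED_M"}
NTSTATUS = {0xC0000005: "STATUS_ACCESS_VIOLATION"}


def parse_signature(text):
    """Split 'KEY : VALUE' pairs into a dict of strings."""
    return dict(re.findall(r"(\w+)\s*:\s*(\w+)", text))


def summarize(text):
    """Turn the raw fields into a small, readable summary."""
    fields = parse_signature(text)
    bugcheck = int(fields["BCCode"], 16)
    # BCP1 is sign-extended to 64 bits; the NTSTATUS is the low 32 bits.
    exception = int(fields["BCP1"], 16) & 0xFFFFFFFF
    return {
        "bugcheck": BUGCHECKS.get(bugcheck, hex(bugcheck)),
        "exception": NTSTATUS.get(exception, hex(exception)),
        "faulting_address": int(fields["BCP2"], 16),
        "os_version": tuple(int(part) for part in fields["OSVer"].split("_")),
    }


if __name__ == "__main__":
    for key, value in summarize(SIGNATURE).items():
        print(key, "=", hex(value) if key == "faulting_address" else value)
```

The signature alone does not name the faulting driver. For that, the minidump has to be opened in a debugger so the faulting address can be matched to a loaded module.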

Thanks in advance

xan_user
*Lifetime Patron*
Posts: 2269
Joined: Sun May 21, 2006 9:09 am
Location: Northern California.

Post by xan_user » Sun Sep 06, 2009 6:29 am

A single one-hour memtest86+ run would not satisfy me.

lm
Friend of SPCR
Posts: 1251
Joined: Wed Dec 17, 2003 6:14 am
Location: Finland

Post by lm » Sun Sep 06, 2009 9:31 am

Run memtest86+ overnight, i.e. more than 8 hours, preferably over 24 hours.

Do the same with the Prime95 torture test running on each CPU core at the same time, again at least overnight, preferably 24h.

If you can't do the latter because Windows freezes, try safe mode. Do those freezes happen in safe mode too?

Does Linux freeze too?

Strip the system to bare minimum components for testing, remove all unnecessary components.

Nevyn2
Posts: 43
Joined: Thu Jun 19, 2003 11:40 am
Location: Swe

Post by Nevyn2 » Tue Sep 08, 2009 12:26 pm

Now I have been running Memtest86+ for 25h without any errors. I also disabled the Cool'n'Quiet function to see if that was the troublemaker, but I still get the random halts.

I ran Ubuntu from a CD for 3-4 hours without any problems; I will have to try that for a longer period. Tomorrow I will run Prime95 to see how that goes.

I also installed SpeedFan so I can log the temps in a better way, and the hottest temp I saw was 46-47 degrees C when I was gaming.

K.Murx
Posts: 177
Joined: Tue Mar 17, 2009 10:26 am
Location: Germany

Post by K.Murx » Tue Sep 08, 2009 2:45 pm

This knowledge base article may help:
http://support.microsoft.com/kb/889249
(although it basically says "It's the CPU's fault. Go tell AMD")

DanceMan
Posts: 287
Joined: Sun Aug 11, 2002 3:26 pm
Location: Burnaby, BC, Canada

Post by DanceMan » Tue Sep 08, 2009 9:17 pm

When a computer freezes or locks up, I suspect memory. Had it happen on an old IBM PII laptop and traced it to a sodimm that was heat sensitive. It happened to someone on another forum a couple of days ago (AMD X3 and OCZ) and I advised him to try boosting the memory voltage and/or loosen the timings. A slight boost in voltage solved his problem.

You've already done all the right things. Try the ram voltage, and if that does nothing, adjust the timings looser.

Nevyn2
Posts: 43
Joined: Thu Jun 19, 2003 11:40 am
Location: Swe

Post by Nevyn2 » Thu Sep 10, 2009 1:35 pm

I went to asktheramguy (Corsair tech support) and asked them to check my RAM settings, which were fine. He also recommended raising the RAM voltage, so I tried the suggested value, but the random halts are still present.

Btw, I got help deciphering the minidumps from the BSODs, and those are most likely linked to the onboard audio. I have now disabled the onboard audio and the BSODs are gone.

But as I wrote, the random halts are still there.... :x

DanceMan
Posts: 287
Joined: Sun Aug 11, 2002 3:26 pm
Location: Burnaby, BC, Canada

Post by DanceMan » Thu Sep 10, 2009 2:05 pm

Why not try loosening ram timings? What have you got to lose but a little time? Your default is 9-9-9-24? Try 10-10-10-28. If there's no change it's easy to put back and you'll have eliminated another factor.

I've been mildly overclocking on an Intel G965 chipset and despite running the ram (Corsair) slower than rated, about 756 on DDR2-800, I had to change the timings from the default 4-4-4-12 to 5-5-5-15 to make it work. For the other numbers I just went higher, 34 to 40, 4-2 to 5-3. I left the second group of timings alone.
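
To put numbers on what "looser" means here: a timing is counted in memory-clock cycles, so the same count lasts longer at a lower clock. A small Python sketch (the function name `timing_ns` is just for illustration; the only inputs are the timings and speeds already mentioned in this thread):

```python
def timing_ns(cycles, data_rate_mts):
    """Convert a timing given in memory-clock cycles to nanoseconds.

    DDR memory transfers data twice per clock, so the actual clock in MHz
    is half the advertised data rate (e.g. DDR3-1333 runs at about 667 MHz).
    """
    clock_mhz = data_rate_mts / 2
    return cycles * 1000 / clock_mhz


if __name__ == "__main__":
    # CAS latency for the settings discussed above.
    for label, cas, rate in [
        ("DDR3-1333 CL9", 9, 1333),
        ("DDR3-1333 CL10", 10, 1333),
        ("DDR2 at 756, CL4", 4, 756),
        ("DDR2 at 756, CL5", 5, 756),
    ]:
        print(f"{label}: {timing_ns(cas, rate):.2f} ns")
```

Going from 9 to 10 cycles at DDR3-1333 adds about 1.5 ns to the CAS latency, which is a small change to try and easy to revert.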

Nevyn2
Posts: 43
Joined: Thu Jun 19, 2003 11:40 am
Location: Swe

Post by Nevyn2 » Thu Sep 10, 2009 10:06 pm

Maybe I was a bit unclear. I will certainly test loosening the timings of my RAM to what you suggested! It's just that I haven't had the time to do it yet, since this testing takes a lot of time.

Sometimes my system can run clean for 4-8 hours before a system halt, and when it happens I have to reset the changes made and then start testing the new ones.

I really appreciate you taking the time to read my post and making suggestions. I have a week off work coming up, and then there will be a lot of time for testing =)

Again thanks for your input.

ascl
Posts: 279
Joined: Tue May 05, 2009 1:15 am
Location: Sydney, Australia

Post by ascl » Fri Sep 11, 2009 1:41 am

Given you have run memtest for over 24 hours without issues, it seems that RAM is not the problem. It's somewhat suspicious that Ubuntu ran without issues, but then, you didn't run it very long.


You could try something like this:
http://www.stresslinux.org/

It's basically what it sounds like, a small Linux image you can boot from a CD or USB key and run applications to stress the system a bit. Try running one of the CPU stress tests overnight. You can also monitor the heat with tools in the image.

If all of these things pass, and the system stays up... then it suggests that something is dodgy in your Windows install (driver conflict or something).

K.Murx
Posts: 177
Joined: Tue Mar 17, 2009 10:26 am
Location: Germany

Post by K.Murx » Fri Sep 11, 2009 6:59 am

Did you read that KB article I linked to? Because that basically says the CPU is faulty.

I'd try underclocking and/or overvolting the CPU and see what happens.

ascl
Posts: 279
Joined: Tue May 05, 2009 1:15 am
Location: Sydney, Australia

Post by ascl » Fri Sep 11, 2009 8:49 am

APPLIES TO
Microsoft Windows Server 2003, Datacenter Edition (32-bit x86)
Microsoft Windows Server 2003, Enterprise Edition (32-bit x86)
Microsoft Windows Server 2003, Standard Edition (32-bit x86)
I am not sure that KB article is relevant here.

K.Murx
Posts: 177
Joined: Tue Mar 17, 2009 10:26 am
Location: Germany

Post by K.Murx » Fri Sep 11, 2009 10:43 am

ascl wrote:
APPLIES TO
Microsoft Windows Server 2003, Datacenter Edition (32-bit x86)
Microsoft Windows Server 2003, Enterprise Edition (32-bit x86)
Microsoft Windows Server 2003, Standard Edition (32-bit x86)
I am not sure that KB article is relevant here.
XP and Server 2003 are quite similar. The problem description matches. There are no other descriptions of Event ID 106.
Therefore I am pretty sure it is relevant here, otherwise I would not have posted it.

Nevyn2
Posts: 43
Joined: Thu Jun 19, 2003 11:40 am
Location: Swe

Post by Nevyn2 » Fri Sep 11, 2009 1:23 pm

Hi again!!

Now I have been running Prime95. Well, trying to run it would be more accurate =)

I ran the torture test with the default settings (do I need to change anything, maybe?) and here's the result:

1 run - System halt in test 3
2 run - System halt in test 1
3 run - System halt in test 4
4 run - System halt in test 7-8 (Two cores working on test 8 and the 3rd core on test 7)
5 run - System halt in test 7

I had SpeedFan running to monitor the temps and they were in the range 40-55 degrees C. I also felt the different heatsinks and they were not hot or anything. I also checked the BIOS after every reset and it reported temps between 40-45 for the chipset and CPU, so I guess the readings are fairly accurate.

One thing I noticed was that the -12V line was fluctuating a bit; the lowest was -2,90 and the highest was 3,31. I also noticed that right before the test, when the system was idle, it was around -4,13. Is that a problem or is it normal? The other voltages were stable during the test (well, some fluctuation, around +-0.02V). These values are taken from SpeedFan. Here is a sample during test 4:

Vcore1: 1.31V
Vcore2: 1.60V
+3,3V: 3,34V
+5V: 4,92V
+12V: 12,16V

-12V: -3,06 (The most common value)
-5V: 4,05V
+5V: 3,63V
Vbat: 3,25V
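
As a rough sanity check, the clearly labelled readings above can be compared against the ATX tolerances (±5% on +3.3V, +5V and +12V, ±10% on -12V). This is a minimal Python sketch with made-up names. It only looks at the four rails whose labels are unambiguous, and it cannot tell a bad PSU from a miscalibrated motherboard sensor; software readings of the negative rails are often unreliable, so a multimeter or PSU tester would be needed to confirm anything.

```python
# Nominal voltage and allowed fractional deviation per rail (ATX tolerances).
ATX_RAILS = {
    "+3,3V": (3.3, 0.05),
    "+5V": (5.0, 0.05),
    "+12V": (12.0, 0.05),
    "-12V": (-12.0, 0.10),
}

# SpeedFan sample from the post, decimal commas kept as written.
READINGS = {"+3,3V": "3,34V", "+5V": "4,92V", "+12V": "12,16V", "-12V": "-3,06"}


def parse_volts(text):
    """Parse a reading such as '12,16V' or '-3,06' into a float."""
    return float(text.rstrip("Vv").replace(",", "."))


def check_rails(readings):
    """Return {rail: True/False} for whether each reading is within tolerance."""
    result = {}
    for rail, text in readings.items():
        nominal, tolerance = ATX_RAILS[rail]
        deviation = abs(parse_volts(text) - nominal)
        result[rail] = deviation <= abs(nominal) * tolerance
    return result


if __name__ == "__main__":
    for rail, ok in check_rails(READINGS).items():
        print(rail, READINGS[rail], "within tolerance" if ok else "OUT of tolerance")
```

With these numbers the three positive rails pass and only the reported -12V value fails, which points more towards a sensor or scaling problem in the monitoring software than towards the rails that actually feed the CPU and RAM.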

I made some screenshots but it doesn't look like you can post them here =(. At least it looks like I found a way to trigger these system halts.

Does this indicate that the CPU is the malfunctioning part, or could it still be some other part? I mean, even though this app primarily stresses the CPU, I guess it still uses the RAM and the memory subsystem of the mobo, right? I guess it could also be the PSU having trouble delivering power?

Or can I assume that it is the CPU that is the bad guy?? Also, I received a lot of those Event ID: 106 entries during these tests.

FYI, I reloaded the Optimized defaults in the BIOS prior to these tests.

K.Murx, I read those articles about Event ID: 106, and with these test results it sure looks like the CPU is the bad guy. Thanks!

Now i gotta get some sleep before work! Will continue this tomorrow...

Thanks

kittle
*Lifetime Patron*
Posts: 336
Joined: Thu Nov 09, 2006 4:44 pm
Location: San Jose, CA

Post by kittle » Fri Sep 11, 2009 3:18 pm

OK, hopefully what you wrote is a typo...

But if I understand right, the 12v line on your PSU is only providing 3v (THREE volts)... as opposed to 12v?
If so, then it sounds like your PSU has some major problems, which would explain a lot of the resets you are seeing.

But if you meant that the 12v line is 3v under (i.e. it's running at 9v), then you also have too much voltage drop, which is still a problem, just not as bad.

Either way, I would try a new PSU at minimum. I would also look at a PSU tester to see if it's just your MB sensor or if the PSU is really that far out of whack.

I think the allowable range for the 12v line is ±5%, so roughly 11.4v to 12.6v.

DanceMan
Posts: 287
Joined: Sun Aug 11, 2002 3:26 pm
Location: Burnaby, BC, Canada

Post by DanceMan » Fri Sep 11, 2009 4:45 pm

Did you notice the Minus? -12V is a signalling voltage. It does not power the cpu or any other major components.

Nevyn2
Posts: 43
Joined: Thu Jun 19, 2003 11:40 am
Location: Swe

Post by Nevyn2 » Sun Sep 20, 2009 9:29 am

Hi! I'm back again after being away for a while.

I have now lowered the speed of my memory to 800MHz. I turned the memory clock down to x4.00 and the system is much more stable. I did not alter any of the timings.

I ran Prime95 (with the same settings as before) and it ran error-free for 4 hours before I stopped it. Before I lowered the memory clock I wasn't able to get past 3-5 min in Prime95.

Now I only get an occasional system halt; otherwise the system is really stable.

What does this suggest? Is there something wrong with the memory?
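
For reference, here is the arithmetic behind that change as a small Python sketch. It assumes the stock 200 MHz reference clock for this platform (nothing in this thread says it was changed), and the function names are only illustrative. The memory multiplier times the reference clock gives the effective data rate, and the same 9-cycle CAS latency takes longer in real time at the lower clock.

```python
REFERENCE_CLOCK_MHZ = 200  # assumption: stock reference clock, not overclocked


def memory_data_rate(multiplier, reference_mhz=REFERENCE_CLOCK_MHZ):
    """Effective DDR3 data rate for a BIOS memory multiplier such as x4.00."""
    return multiplier * reference_mhz


def cas_ns(cas_cycles, data_rate):
    """CAS latency in nanoseconds; the real clock is half the data rate."""
    return cas_cycles * 2000 / data_rate


if __name__ == "__main__":
    for multiplier in (6.66, 4.00):
        rate = memory_data_rate(multiplier)
        print(f"x{multiplier:.2f}: about {rate:.0f}, CL9 = {cas_ns(9, rate):.1f} ns")
```

So dropping to x4.00 without touching the timings relaxes every timing in absolute terms (CL9 goes from about 13.5 ns to 22.5 ns) and also lowers the load on the CPU's memory controller. That means the improvement by itself cannot separate "marginal RAM" from "marginal memory controller or motherboard".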

DanceMan
Posts: 287
Joined: Sun Aug 11, 2002 3:26 pm
Location: Burnaby, BC, Canada

Post by DanceMan » Tue Sep 22, 2009 11:14 am

Nevyn2 wrote:What does this suggest? Is there something wrong with the memory?
I don't have familiarity with recent AMD overclocking, but in the Intel world the memory speed or memory divider is usually turned down when overclocking, because increasing the FSB will in turn increase the memory speed. You turn it down initially so that when increased it will end up near its rated speed.

In my second post above, despite underclocking my RAM, I had to loosen the memory timings to make my overclock work, and I suspect it's the chipset, the less than stellar G965, and not the RAM. I also think that when you're using 2x2G rather than 2x1G you're going to need a little more voltage or looser timings or both. Run Memtest at your current settings for a couple of hours. If you get no errors, the RAM is fine.

I know you're not overclocking, but even at stock settings using 2x2G instead of 2x1G can demand some small changes.

Nevyn2
Posts: 43
Joined: Thu Jun 19, 2003 11:40 am
Location: Swe

Post by Nevyn2 » Wed Sep 23, 2009 3:23 pm

I just bought one 2GB stick of RAM (KVR1333D3N9/2G) to test with. I tried it with the same settings, 1333MHz 9-9-9-24, with the same result as the Corsair memory =(

I have now tried different memory, a different HDD, and disconnecting the DVD, and now I really don't know what to do.

I also tried 10-10-10-25 + upping the RAM voltage by 0.2V.

Any ideas?
