Help! Unreliable Booting. <SOLVED>

tango charlie
Posts: 35
Joined: Sun Sep 18, 2005 2:39 pm

Post by tango charlie » Fri Apr 14, 2006 11:15 am

Hello all, this isn't a silencing issue, but this forum has been so informative during my pc-building experience that I thought I'd give it a try. Also, I've posted the same topic in the Anandtech forums - I think that's bad etiquette, so I apologize.

So, I'll try to summarize this as quickly as possible - Basically, I cannot rely on my machine to get running solidly in Windows for the first few boots after several hours powered-off. After rebooting a couple times, it seems to be stable - I have yet to see any crashes while running. Admittedly, the most I've pushed it is running COD2 for about twenty minutes, but I've been able to install most of my software and leave it on for some hours. It's worked through memtest overnight without issue.

SPECS:
- AMD X2 4200+
- ASUS A8N-SLI Premium
- OCZ EL Platinum PC3200 2X1GB DDR400 CL2-3-2-5
- EVGA E-GEFORCE 7800GT CO 470MHZ PCI-E 256MB
- Linksys WMP54G Wireless PCI Adapter
- NEC ND-3550A DVD+RW
- Panasonic 1.44MB 3.5IN Floppy
- Seagate 250GB SATA HDD
- Antec P150 Case
- Seasonic S-12 430W PSU

HISTORY:
Originally this system was assembled with a Maxtor DiamondMax 10 HDD and powered by the Antec P150's PSU. The system would crash all the time. I took it into NCIX, from where I'd bought all the parts, and they went about troubleshooting. Initially they replaced the HDD with the Seagate drive, which helped, but it wasn't until the PSU was replaced that the system would even load Windows with some sort of regularity. This whole process took about a month and a half of back and forth (which is why I opted to buy a new PSU rather than wait to diddle around with Antec's RMA process).

So now I have it back, though cold boots are still unpredictable. Thinking it's a RAM issue, I've been trying different timings with the help of the folks at the OCZ Tech Support Forums. Here's the full thread.
However, I'm still having the problems.

Chipset drivers have been updated to 6.65, and BIOS has been updated to 1009, both the latest non-beta updates on the ASUS site.

So, I am posting here in hopes that someone might have had a similar experience or might be able to offer some insight. I would really like the confidence that my system will boot when I need it to, and the peace of mind that my data isn't being messed up.

Any help would be greatly appreciated!

Cheers!
Tony
Last edited by tango charlie on Wed Apr 19, 2006 8:15 am, edited 1 time in total.

NeilBlanchard
Moderator
Posts: 7681
Joined: Mon Dec 09, 2002 7:11 pm
Location: Maynard, MA, Eaarth

Post by NeilBlanchard » Fri Apr 14, 2006 6:32 pm

Hello,

I'd make yourself a copy of the Ultimate Boot CD and run Prime95. If it can run for a few hours w/o any problems, then it's a Windows issue. If it has problems, then it's a hardware issue -- most likely a RAM issue.

tango charlie
Posts: 35
Joined: Sun Sep 18, 2005 2:39 pm

Post by tango charlie » Sat Apr 15, 2006 10:15 am

I'll give that a shot. N00b question, though: Prime 95 is the same as "Mersenne Prime Test", correct?

Also, the machine doesn't seem to have trouble running stably (admittedly under everyday tasks) after the third boot - I just wish it would work the first time. Though I suppose stress-testing it would take any uncertainty away from the warmed-up stability.

tango charlie
Posts: 35
Joined: Sun Sep 18, 2005 2:39 pm

Post by tango charlie » Sat Apr 15, 2006 2:04 pm

Update: I've run Mersenne Prime from 12:30 to 3:00, and all tests passed successfully. This was off a warm boot.

Also, thank you for the link to that UBCD. The thing's fantastic!

Edit: I'll try it on a cold boot and see what happens.

calpchen
Posts: 2
Joined: Sat Apr 15, 2006 7:41 pm

Post by calpchen » Sat Apr 15, 2006 8:14 pm

I once had booting problems with my MSI motherboard where the first boot would lock up in the "Windows booting" screen for about a minute or two before continuing. Subsequent "warm" boots would be fine. It was finally traced to the USB card reader plugged into the motherboard and the "USB KB support" setting in BIOS being on.

The temporary work-around was to disable the "USB KB support" setting in BIOS, but with the latest BIOS version, that problem has been fixed altogether.

Trunks
Posts: 219
Joined: Sat Mar 04, 2006 1:58 am
Location: Baton Rouge, LA

Post by Trunks » Sat Apr 15, 2006 10:36 pm

I agree. Look into the BIOS. I would turn off everything like the NIC and the ports, and look closely at the ACPI, wake-up events, ROM, etc. settings. This would be a good step in troubleshooting.

merlyn
*Lifetime Patron*
Posts: 149
Joined: Mon Jul 12, 2004 2:14 pm
Location: Jersey, UK

Post by merlyn » Sat Apr 15, 2006 11:59 pm

Ya this sounds like an MB-level problem. had the same issue with a P4-based board a few weeks ago. really difficult to troubleshoot cold boot problems. swap the board out if you can. at least give it a thorough physical inspection, look for bad caps etc. if i recall correctly the early NeoHEs had timing problems with some ASUS boards, maybe it's related?

tango charlie
Posts: 35
Joined: Sun Sep 18, 2005 2:39 pm

Post by tango charlie » Sun Apr 16, 2006 9:59 am

Alright, in the BIOS I noticed that the board has a "quickboot" feature, which "skips some tests". I turned it off, and now the machine runs the memory test that we all know and love. It takes a while to go through the 2GB, but on the first boot I get a "Memory Test Failed" message! On the second boot I get into Windows but get a virus scanner message, and on the third boot everything's fine.

I'm going to try different RAM stick/slot combinations. Unfortunately, I don't have any other compatible RAM, but I'll see what I can discover with my two sticks. Current thinking is that contacts (somewhere in the system) are expanding with the increase in heat, leading to the stability. Dunno to what extent this would affect memory contacts, but it's the first place I can try. At least it seems I can rule out the SATA connection.

Thank you for all the helpful suggestions! The NeoHe I had seemed to be one of those problematic PSUs. Since I've swapped PSUs, the severe problems have gone away, but I wonder to what extent the issue is the MOBO.
Also, being a n00b I'm not sure what a lot of the acronyms in the BIOS stand for (like ACPI), and the manual is absolutely no help. Can't we employ some better doc writers? Grr.

tango charlie
Posts: 35
Joined: Sun Sep 18, 2005 2:39 pm

Post by tango charlie » Sun Apr 16, 2006 10:04 am

Just stuck my head in with a flashlight... I'm not sure exactly what I'm looking for as far as "bad caps" go, but everything looks clean and properly connected.

fastturtle
Posts: 198
Joined: Thu May 19, 2005 12:48 pm
Location: Shi-Khan: Vulcan or MosEisley Tattonnie

Post by fastturtle » Sun Apr 16, 2006 3:55 pm

First off, reduce to a single stick of memory to reduce that issue.

Second, go into the BIOS and look for a FailSafe setting. If you have it, that's going to disable almost everything and select the safest settings to get Windows installed/booted.

Next, if Windows boots successfully, start re-installing memory and reboot after each stick. Eventually either Windows will puke or things will go fine. NOTE: Check the memory settings in FailSafe, as some boards tend to be too aggressive/performance-oriented to start.

Fourth: start re-enabling the stuff in the bios you need, one thing at a time until either Windows Pukes or things go fine
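
The one-thing-at-a-time idea in the steps above can be sketched in a few lines of Python. This is only an illustration, not a tool: `first_bad_option`, the option names, and the `boots_ok` callback are all made up for the example, and in practice `boots_ok` is you doing a few cold boots and reporting what happened.

```python
from typing import Callable, FrozenSet, Iterable, Optional


def first_bad_option(
    options: Iterable[str],
    boots_ok: Callable[[FrozenSet[str]], bool],
) -> Optional[str]:
    """Re-enable options one at a time; return the first one that breaks booting.

    `boots_ok` is a hypothetical callback: it receives the set of options
    currently enabled and returns True if the machine booted cleanly.
    """
    enabled = set()
    for option in options:
        enabled.add(option)
        if not boots_ok(frozenset(enabled)):
            return option  # the most recently enabled option is the suspect
    return None  # everything re-enabled without a failed boot


# Simulated run: pretend (purely for illustration) that USB legacy
# support is the setting that breaks cold boots.
def simulated_boot(enabled: FrozenSet[str]) -> bool:
    return "USB legacy support" not in enabled


bios_options = ["onboard NIC", "serial/parallel ports",
                "USB legacy support", "wake-up events"]
culprit = first_bad_option(bios_options, simulated_boot)
print(culprit)
```

Since the failures here only show up on some cold boots, a single clean boot per step proves little; each step would need several cold boots before the next option is re-enabled.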

The only other item I can even think of is the PSU and video card choice. Do you have the auxiliary power connection hooked to the video card, and are you absolutely sure you've got enough juice for that CPU/video combo?

nici
Posts: 3011
Joined: Thu Dec 16, 2004 8:49 am
Location: Suomi Finland Perkele

Post by nici » Sun Apr 16, 2006 4:03 pm

tango charlie wrote:Just stuck my head in with a flashlight... I'm not sure exactly what I'm looking for as far as "bad caps" go, but everything looks clean and properly connected.
Well when looking for bad caps you would be looking at any sort of leak around the capacitors or bulging generally. :)

tango charlie
Posts: 35
Joined: Sun Sep 18, 2005 2:39 pm

Post by tango charlie » Mon Apr 17, 2006 8:55 am

fastturtle: I'll try that. Last night, with just one stick in slot A1, the system did two cold boots without any apparent problems. I will try some more combinations tonight.
I'm not absolutely sure I have enough power, though considering everything I heard regarding hype around wattage ratings, I thought 430 would be enough. What are your thoughts?

nici: heh heh, thank you! Fortunately, I didn't see anything like that.

gud4u
Posts: 30
Joined: Sat Feb 18, 2006 6:57 pm

Post by gud4u » Tue Apr 18, 2006 5:46 am

It looks like the BIOS memory test has diagnosed a problem with memory at current timings and/or voltage on at least one memory stick.

That can be tackled by patient memory testing with Memtest86, one stick at a time, with adjustments to memory timings and/or memory voltage. That means that reliable operation with both sticks will require the same memory settings required by the poorest-performing stick.
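
To make the "poorest-performing stick" point concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: the `StickSettings` record, the voltage figures, and the second stick's timings are invented (only the 2-3-2-5 timings come from the rated spec in the first post). It also assumes that a looser timing or a higher voltage never makes a stick less stable, which is not guaranteed.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class StickSettings:
    """Tightest settings at which one stick passed Memtest86 on its own.

    Hypothetical record; field names follow the usual CL-tRCD-tRP-tRAS order.
    """
    cl: float
    trcd: int
    trp: int
    tras: int
    vdimm: float  # memory voltage in volts


def combined_settings(sticks: Sequence[StickSettings]) -> StickSettings:
    """Take, field by field, the loosest timing / highest voltage any stick needs."""
    if not sticks:
        raise ValueError("need at least one stick")
    return StickSettings(
        cl=max(s.cl for s in sticks),
        trcd=max(s.trcd for s in sticks),
        trp=max(s.trp for s in sticks),
        tras=max(s.tras for s in sticks),
        vdimm=max(s.vdimm for s in sticks),
    )


# Made-up example: stick A is fine at the rated 2-3-2-5,
# stick B only passes at looser timings and a bit more voltage.
stick_a = StickSettings(cl=2, trcd=3, trp=2, tras=5, vdimm=2.6)
stick_b = StickSettings(cl=2.5, trcd=3, trp=3, tras=6, vdimm=2.7)

both = combined_settings([stick_a, stick_b])
print(both)
```

In this made-up case the pair would have to run at stick B's settings, so the better stick gains nothing from being better.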

Cold-boots draw more power from your PS than re-boots, so it is possible you have some PS problems as well. That's difficult to diagnose, except by substitution.

Prime95 will help diagnose a stable Vcore setting, not much else.

Hope this helps!

jaganath
Posts: 5085
Joined: Tue Sep 20, 2005 6:55 am
Location: UK

Post by jaganath » Tue Apr 18, 2006 5:53 am

gud4u wrote:Cold-boots draw more power from your PS than re-boots
Why is that? Is there still some energy stored in the PSU in warm boots?

Le_Gritche
Posts: 140
Joined: Wed Jan 18, 2006 4:57 am
Location: France, Lyon

Post by Le_Gritche » Tue Apr 18, 2006 6:05 am

jaganath wrote:Why is that? Is there still some energy stored in the PSU in warm boots?
Capacitors probably store energy for some time, and besides the HDD and fans are still spinning, whereas on a cold boot they start from still.

Reading the start of the thread, I was going to suggest insufficient load on the 12V rail was the root of the problem or on the contrary an overwhelmed PSU, but now it looks like it's a RAM problem.
I looked at your thread at OCZ. They say your problem is caused by a weird mobo; I would say it's probably a weird RAM problem.

Let us and them know the results of your tests with 1 and 2 RAM sticks, changing memory slots, and even disabling dual channel.
Maybe you will be able to pinpoint the culprit.

tango charlie
Posts: 35
Joined: Sun Sep 18, 2005 2:39 pm

Post by tango charlie » Tue Apr 18, 2006 7:51 am

It is definitely a weird RAM problem. I finally got around to swapping the RAM. Initially I had one stick in slot A1 and the other in B1.

Just Stick 1 in Slot A1:
- passed the memory test, a bit slow to load, but booted without issue.
- 2nd time, booted without issue again.

Just Stick 2 in Slot A1:
- bios rom checksum error
- (warm, this time) windows started without issue.
- (again, but cold) nothing. No screen, no keyboard lights, no mouse lights. Just some whirring fans.
- bios rom checksum error

Back to just Stick 1 in Slot A1:
- memory tested fine, windows booted fine.

It seems safe to say one of those RAM sticks has issues.
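
For anyone who wants to see the tally, here is a quick Python sketch of the results listed above. The function names are made up, and it assumes that every boot not marked "warm" was a cold boot and that each bullet is a single attempt.

```python
from collections import defaultdict

# (stick, boot type, outcome), transcribed from the list above.
# Assumption: boots not marked "warm" were cold boots.
TRIALS = [
    ("stick 1", "cold", "ok"),
    ("stick 1", "cold", "ok"),
    ("stick 2", "cold", "bios rom checksum error"),
    ("stick 2", "warm", "ok"),
    ("stick 2", "cold", "no display, no keyboard or mouse lights"),
    ("stick 2", "cold", "bios rom checksum error"),
    ("stick 1", "cold", "ok"),
]


def failure_counts(trials):
    """Return {stick: (failed boots, total boots)}."""
    counts = defaultdict(lambda: [0, 0])
    for stick, _boot_type, outcome in trials:
        counts[stick][1] += 1
        if outcome != "ok":
            counts[stick][0] += 1
    return {stick: tuple(pair) for stick, pair in counts.items()}


def suspects(trials):
    """Sticks that failed at least one boot."""
    return sorted(s for s, (failed, _total) in failure_counts(trials).items()
                  if failed > 0)


for stick, (failed, total) in sorted(failure_counts(TRIALS).items()):
    print(f"{stick}: {failed} failed out of {total}")
print("suspect:", suspects(TRIALS))
```

Seven boots is a small sample, but all three failures land on stick 2 and its only clean boot was a warm one.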
gud4u wrote:...That means that reliable operation with both sticks will require the same memory settings required by the poorest-performing stick...
Is it standard for one stick in a set to perform better or worse than the other? That seems like a pretty poor deal, if you ask me.

EDIT: It seems unusual that a RAM issue would cause a BIOS ROM checksum error or that whole nothing-when-it-boots issue. I'm no expert, but I'm very suspicious about that. Perhaps someone with more experience can confirm or deny my suspicions.

NeilBlanchard
Moderator
Posts: 7681
Joined: Mon Dec 09, 2002 7:11 pm
Location: Maynard, MA, Eaarth

Post by NeilBlanchard » Tue Apr 18, 2006 8:22 am

Hi Charlie,

Occam's Razor says you got one bad stick, and one good stick o' RAM! :o

tango charlie
Posts: 35
Joined: Sun Sep 18, 2005 2:39 pm

Post by tango charlie » Wed Apr 19, 2006 8:15 am

Ha ha, I gave my old sticks back to NCIX for a new pair, and so far (knock on wood) the system's performing perfectly!
Sincere thank yous to everyone who offered advice and support!

Now all I need to do is quiet the damn thing. Heh heh.
