Help me build a Folding Farm

A forum just for SPCR's folding team... by request.

Moderators: NeilBlanchard, Ralf Hutter, sthayashi, Lawrence Lee

haysdb
Patron of SPCR
Posts: 2425
Joined: Fri Aug 29, 2003 11:09 pm
Location: Earth

Post by haysdb » Fri Dec 19, 2003 4:30 am

CharlieChan wrote:Doing a whois domain name search on www.davidhays.com reveals that a real www.davidhays.com exists at 216.21.229.209. You should rename your network to something else, maybe www.davidhays.biz.
Remember, I told you I was stubborn. [Imagine me jumping up and down, shaking my fist] It's my LOCAL domain name, and I don't give a damn if some other David Hays in Timbuktu has it or not, so long as I never need to visit the "real" www.davidhays.com. And what if I do change it to .biz and then someone grabs THAT name? I really don't want to register my own domain name. [Sits down]

Actually, I probably SHOULD register my own domain name, so then I could host "stuff" off my server, and have a cool email address like [email protected]. I just haven't ever come up with a good reason why I would need to do that.

The real point though, the point I am being stubborn over, is that this domain name is LOCAL, and should be resolved by my dns, and never passed upstream, so I should be able to make it whatever I want. I have that working now, so exactly WHY mr. pcnut is not happy is more a mystery than ever. I'll bet you 10 dollars/pounds/euros that it works after the reboot. There's just no reason left for it not to work. :!:

David

haysdb
Patron of SPCR
Posts: 2425
Joined: Fri Aug 29, 2003 11:09 pm
Location: Earth

Post by haysdb » Fri Dec 19, 2003 4:41 am

Despite the fact that "most people" can just fire up dnsmasq and not have to do ANY configuration (according to the author), here is my dnsmasq configuration file:

[Edit: Removed some lines that either didn't work or were found to be unnecessary.]

Code: Select all

# /etc/dnsmasq.conf
# dnsmasq configuration file
# This file is read by dnsmasq when it starts up


# To configure dnsmasq to act as cache for the host on which it is running,
# put "nameserver 127.0.0.1" in /etc/resolv.conf to force local processes
# to send queries to dnsmasq. Then put the real address(es) in another file
# and run dnsmasq with the --resolv-file option.

resolv-file=/etc/resolv.dnsmasq


# Enable the "Use DHCP assigned addresses" feature of dnsmasq by telling
# it where to find the DHCP leases file. This allows names assigned by (or
# registered with) the DHCP server to be resolved by dnsmasq. This allows
# hosts to be referred to by name without having to assign them a fixed IP
# address.

dhcp-lease=/var/lib/dhcp/dhcpd.leases


# Never pass this domain name to the upstream name server since this name
# is purely local to the local area network. In fact, this domain name 
# is registered to someone else, so it resolves to a real Internet address
# if passed upstream.

server=/davidhays.com/


# Some of the DNS 'root servers', such as Verisign, when they receive
# a request for a domain name which does not exist, will return the IP
# address of a 'domain names for sale' web site. The next line will trap
# these "bogus" IP addresses and instead return "host not found".
# Multiple addresses can be listed.

bogus-nxdomain=64.94.110.11
David
Last edited by haysdb on Sat Dec 20, 2003 8:34 am, edited 4 times in total.

haysdb
Patron of SPCR
Posts: 2425
Joined: Fri Aug 29, 2003 11:09 pm
Location: Earth

Post by haysdb » Fri Dec 19, 2003 4:54 am

I'd have lost that bet.

I remember reading something somewhere about the Windows TCP/IP stack getting corrupted, whatever that means. If that doesn't pan out, I'm doing a clean install of Windows.

David

CharlieChan
Patron of SPCR
Posts: 198
Joined: Sun Jul 13, 2003 2:57 am
Location: East Anglia, UK

Post by CharlieChan » Fri Dec 19, 2003 4:56 am

David,

Change the /etc/resolv.conf on the farm server to,

search davidhays.com
nameserver 192.168.1.100
nameserver 192.168.1.1

Remember to restart the dnsmasq server.

Another interesting point: my setup does not have /etc/resolv.dnsmasq or /etc/dnsmasq.conf. I am running dnsmasq 1-11.1, which was compiled from the .src.rpm file. I just checked the startup script for dnsmasq and it uses the /etc/resolv.conf file. You mentioned earlier that 2 of your PCs on the network were able to access the Internet; is that still true?

Charlie.
Last edited by CharlieChan on Fri Dec 19, 2003 6:17 am, edited 1 time in total.

CharlieChan
Patron of SPCR
Posts: 198
Joined: Sun Jul 13, 2003 2:57 am
Location: East Anglia, UK

Post by CharlieChan » Fri Dec 19, 2003 6:08 am

haysdb wrote: search ks.ok.cox.net*
Where did this come from?

Charlie.

haysdb
Patron of SPCR
Posts: 2425
Joined: Fri Aug 29, 2003 11:09 pm
Location: Earth

Post by haysdb » Fri Dec 19, 2003 6:30 am

CharlieChan wrote:Change the /etc/resolv.conf on the farm server to,

search davidhays.com
nameserver 192.168.1.100
nameserver 192.168.1.1

Remember to restart the dnsmasq server.
I assume you mean /etc/resolv.dnsmasq? Dnsmasq uses a copy of the original /etc/resolv.conf which I named resolv.dnsmasq. With dnsmasq using its own copy, the original resolv.conf was changed to 'nameserver 127.0.0.1' so that local applications could use the local name server.

I tried your suggested change to resolv.dnsmasq, but within seconds of saving that change (no restart necessary, it watches that file like a proverbial hawk), this message showed up in the dnsmasq log:

Code: Select all

dnsmasq: ignoring nameserver 192.168.1.100 - local interface
OK, that makes sense - it'd be a recursive call.

Since the server itself, as well as at least one of the Windows XP machines, are fully functional, I don't think it's a resolv issue. In fact, I don't think it's a dnsmasq issue, or a DHCP issue, at least not on the server.

I followed the instructions in a Microsoft Knowledge Base article to repair a broken TCP/IP stack (netsh int ip reset <logfile>), but it does not appear to have worked. I haven't rebooted it since doing that though, so I guess I should before giving up.

There is one other thing that is bugging me, but I can't put a finger on it. The D-Link router uses the MAC address from the machine that isn't working to get an IP address via DHCP from my ISP. :? :!: When I change that MAC address to something else, it doesn't seem to want to work. If I release and try to renew the IP address, it times out. My ISP wants THAT MAC address, or so it seems. So I wonder if it's not related to that somehow, but I don't know why. Maybe because I have tried everything else and that's the only thing left short of a clean install?

David

haysdb
Patron of SPCR
Posts: 2425
Joined: Fri Aug 29, 2003 11:09 pm
Location: Earth

Post by haysdb » Fri Dec 19, 2003 6:43 am

CharlieChan wrote:
haysdb wrote:search ks.ok.cox.net*
Where did this come from?
Straight out of the original resolv.conf. I didn't put it there, so it must be something Linux got from my router, which got it from my ISP.
Last edited by haysdb on Fri Dec 19, 2003 7:48 am, edited 1 time in total.

haysdb
Patron of SPCR
Posts: 2425
Joined: Fri Aug 29, 2003 11:09 pm
Location: Earth

Post by haysdb » Fri Dec 19, 2003 6:53 am

I keep seeing this block of lines in the dnsmasq log, repeating over and over.

Code: Select all

dnsmasq: query 10.1.168.192.in-addr.arpa from 127.0.0.1
dnsmasq: DHCP 192.168.1.10 is pcnut.davidhays.com
dnsmasq: query pcnut.davidhays.com from 127.0.0.1
dnsmasq: DHCP pcnut.davidhays.com is 192.168.1.10
Why is the server (127.0.0.1) sending that query? And why does it keep sending it?

David

CharlieChan
Patron of SPCR
Posts: 198
Joined: Sun Jul 13, 2003 2:57 am
Location: East Anglia, UK

Post by CharlieChan » Fri Dec 19, 2003 6:57 am

haysdb wrote: Since the server itself, as well as at least one of the Windows XP machines, are fully functional, I don't think it's a resolv issue. In fact, I don't think it's a dnsmasq issue, or a DHCP issue, at least not on the server.
Was the IP for the XP machine assigned using MAC or dynamic IP?
There is one other thing that is bugging me, but I can't put a finger on it. The D-Link router uses the MAC address from the machine that isn't working to get an IP address via DHCP from my ISP. :? :!: When I change that MAC address to something else, it doesn't seem to want to work. If I release and try to renew the IP address, it times out. My ISP wants THAT MAC address, or so it seems. So I wonder if it's not related to that somehow, but I don't know why. Maybe because I have tried everything else and that's the only thing left short of a clean install?
You could try swapping the network card in the PC that is not working with the one in the PC that is working. That way you can verify whether it is the PC or the network card.

Charlie.

CharlieChan
Patron of SPCR
Posts: 198
Joined: Sun Jul 13, 2003 2:57 am
Location: East Anglia, UK

Post by CharlieChan » Fri Dec 19, 2003 7:02 am

haysdb wrote:
CharlieChan wrote:
haysdb wrote: search ks.ok.cox.net*
Where did this come from?
Straight out of the original resolv.conf. I didn't put it there, so it must be something Linux got from my router, which got it from my ISP.
Try removing it with a #. The search usually means 'look in this domain first', replace it with search davidhays.com.

haysdb
Patron of SPCR
Posts: 2425
Joined: Fri Aug 29, 2003 11:09 pm
Location: Earth

Post by haysdb » Fri Dec 19, 2003 7:43 am

CharlieChan wrote:
haysdb wrote:
CharlieChan wrote:Where did this come from?
Straight out of the original resolv.conf. I didn't put it there, so it must be something Linux got from my router, which got it from my ISP.
Try removing it with a #. The search usually means 'look in this domain first', replace it with search davidhays.com.
OK, here is the new file:

Code: Select all

search davidhays.com
nameserver 192.168.1.1
It hasn't changed anything though. in-addr.arpa is still chasing his tail:

Code: Select all

dnsmasq: query 10.1.168.192.in-addr.arpa from 127.0.0.1
dnsmasq: DHCP 192.168.1.10 is pcnut.davidhays.com
dnsmasq: query pcnut.davidhays.com from 127.0.0.1
dnsmasq: DHCP pcnut.davidhays.com is 192.168.1.10
Repeated over and over.

I have determined that in-addr.arpa has something to do with "Reverse DNS lookup" or "inverse resolution", but that doesn't help me much. What service is that query coming from? That I don't know yet.
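For anyone else puzzling over that log: the in-addr.arpa name is simply the client's IP address with its four octets reversed and .in-addr.arpa appended, which is how a reverse (address-to-name) lookup is phrased. A minimal sketch in shell, where reverse_name is a made-up helper for illustration and not anything dnsmasq provides:

```shell
#!/bin/sh
# Build the reverse-lookup (PTR) name for an IPv4 address by reversing
# its four octets and appending ".in-addr.arpa".
# reverse_name is a hypothetical helper, not part of dnsmasq.
reverse_name() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the dotted address into $1..$4
    IFS=$old_ifs
    echo "$4.$3.$2.$1.in-addr.arpa"
}

reverse_name 192.168.1.10   # prints 10.1.168.192.in-addr.arpa
```

That at least confirms the query is a reverse lookup for 192.168.1.10, the address dnsmasq reports for pcnut. It does not say which program on the server is asking.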

David

haysdb
Patron of SPCR
Posts: 2425
Joined: Fri Aug 29, 2003 11:09 pm
Location: Earth

Post by haysdb » Fri Dec 19, 2003 7:47 am

CharlieChan wrote:Was the IP for the XP machine assigned using MAC or dynamic IP?
The two XP machines are configured identically as far as I can tell, both set on full auto.
You could try swapping the network card in the PC that is not working with the one in the PC that is working. That way you can verify whether it is the PC or the network card.
I have a couple of spare NICs lying around, so I could do that without much trouble.

David

Mutt_n_head
Posts: 20
Joined: Thu Nov 13, 2003 2:40 pm

Post by Mutt_n_head » Fri Dec 19, 2003 10:55 am

Charlie, have you actually done this yourself or are you essentially quoting the web pages that Turmelle and others have posted? Because it looks like hays and you are going around in circles and getting nowhere fast.

Reinstall windows... pull the NIC? It should not be this complicated. I've participated in the process of making diskless clients and it wasn't even CLOSE to this hard.

Just hate seeing hays running around in circles and not really making much progress. Eventually I'd think he'll just give up and that would be bad.

I wonder if anyone else has any knowledge of this. Like TRC, Lockheed or someone else.

CharlieChan
Patron of SPCR
Posts: 198
Joined: Sun Jul 13, 2003 2:57 am
Location: East Anglia, UK

Post by CharlieChan » Fri Dec 19, 2003 1:47 pm

Mutt_n_head wrote:Charlie, have you actually done this yourself or are you essentially quoting the web pages that Turmelle and others have posted? Because it looks like hays and you are going around in circles and getting nowhere fast.
Currently I have a diskless folding farm in the garage. The server is a P3 1G with the folding client running as a daemon (finstall). There are a Duron 1.3G, a dual Athlon MP 2200+, and a dual P3 866MHz running as clients. These computers run on subnet 192.168.2. and the configuration files I posted so far are files from the farm server. The farm server is connected to my main network via a wireless link using a USB Prism 2 wireless adapter. This lot will generate about 2000PPW and consume about 500W according to the wattage meter I have connected to it. :D

Charlie.

haysdb
Patron of SPCR
Posts: 2425
Joined: Fri Aug 29, 2003 11:09 pm
Location: Earth

Post by haysdb » Fri Dec 19, 2003 4:58 pm

[Warning: I'm going to go off on a bit of a rant here. Read it or skip down to where I calm down and get back to the topic at hand - my trying to get Internet access restored to my desktop Windows XP machine so that it can resume folding.]

Charlie, I admire your restraint.

Mutt_n_head, if you have anything constructive to offer I'm all ears.

I'm pretty certain that Charlie didn't just fall off any turnip truck. I believe the advice I have received has been sound. Sure, I occasionally wonder where Charlie is coming from, but nearly always I discover later that he was just a step ahead of me. Without his help (and others'), I would be in serious trouble.

I'd like to think I'm not a complete idiot either. I'm new to Linux but have two years of UNIX (Tru64) plus a bazillion years of VAX/VMS, albeit as a programmer and not a "systems" guy, so configuring DHCP and DNS servers is new to me, and I don't have a strong networking background so I don't immediately understand a lot of the concepts. But I have "been around" computers (selling them, assembling them, programming them, using them) since the Apple II, I know how to use 'man' and 'info' and www.google.com, and while I'm asking questions, I am looking for the answers in another window.

I'm NOT saying there isn't a simple solution to this problem, since it's entirely possible it's something really obvious I've simply overlooked. I will suggest that if you have only "seen it done" or "participated in" setting one up, you should walk a mile in my shoes, i.e. do one yourself, from scratch, then come tell us how easy it was.

[Beginning to mellow a bit here, having blown off a bit o' steam...]

It's fairly safe to say I won't give up before I get "the farm" up and running, not in frustration anyway. The specific issue I am having with my desktop machine is a different story. That machine has been two days now without an Internet connection, and I have tried everything I know how to try. If it was working OK except for that, I'd keep banging away at it because I know it can be fixed. In seven years and quite a few PCs, I have never needed to resort to a "clean install" of Windows to fix a problem. However, in this case, this PC has been 'sick' for quite a long time. It takes forever to boot. It crashes occasionally "for no reason". It has had a bazillion applications installed and uninstalled and just plain deleted with a crowbar. My son runs Kazaa, and downloads all kinds of stuff. Whenever I run Ad-aware, it deletes at least 50 things. If/when I do a clean install, it will be for more reasons than to fix just one problem.

But this one problem so totally fascinates me (!) that I'd love to find a solution to it, even if I just turn around and do a clean install anyway.

[OK, I'm 90% down off my rant and heading back toward the off-topic topic Charlie and I were working on before I got off onto my rant...]

I don't know that the issue dnsmasq is having with the inverse name resolution is the cause of the problem with pcnut. I'm inclined to say no because even if I open the TCP/IP properties and enter a fixed IP address (so that DHCP isn't used), and point it to the IP address of the D-Link router (bypassing the Linux DNS), it behaves no differently.

I can see, by watching the dnsmasq log, that a domain name like www.google.com is received from pcnut and resolved to an IP address, and (presumably!) passed back to pcnut, but it's like pcnut never gets the message. It behaves as if its name server is down.

I can transfer files to and from pcnut, browse it from other machines on the LAN, ping it, so I'm comfortable that the network card is working just fine, but if I try to ping www.google.com from pcnut, it times out.

The following three machines (plus another) are connected to a D-Link 4-port broadband router:
  • pcnut, XP Home, Internet access - No
  • psc, XP Pro, Internet access - Yes
  • fahserv, RedHat 9 Linux, Internet access - Yes
pcnut and psc, other than one being 'Home' and one 'Pro', are connected and configured the same. Both are connected to the network the same way, are running the same browser, have the same TCP/IP configurations, and have similar etc/hosts files. Both are treated the same in the /etc/dhcpd.conf file. In fact, neither is mentioned by name anywhere on the Linux server. Both get their configurations the same way from DHCP, unless I've overlooked something on the server, which is entirely possible.

But the name server is definitely having some sort of issue with pcnut that it's NOT having with psc, so something is different; I just haven't been able to figure out what.

[ON TOPIC]

The Arctic Cooling heatsinks arrived today from SVC, so I have everything I need now for one diskless client. The power supplies arrived yesterday. By next week sometime I should have all the parts for 3 "blades".

David

[WHEW!]


[Edited to remove some overly harsh comments :oops:]
Last edited by haysdb on Fri Dec 19, 2003 7:37 pm, edited 3 times in total.

CharlieChan
Patron of SPCR
Posts: 198
Joined: Sun Jul 13, 2003 2:57 am
Location: East Anglia, UK

Post by CharlieChan » Fri Dec 19, 2003 5:32 pm

David,

If you really want to know what pcnut is doing you could install ethereal and analyze the packets on your network. Use apt-get to install ethereal-gnome, which has a nice user interface. I once used this to find what was keeping a dial-on-demand connection on all the time - it turned out to be the ntpd daemon. Another method is to install XP on pcnut using another HD; if the new installation is OK then you know it is the stuff you have on the old XP that's causing the problem.

Charlie.

haysdb
Patron of SPCR
Posts: 2425
Joined: Fri Aug 29, 2003 11:09 pm
Location: Earth

Post by haysdb » Fri Dec 19, 2003 7:21 pm

CharlieChan wrote:If you really want to know what pcnut is doing you could install ethereal and analyze the packets on your network. Use apt-get to install ethereal-gnome, which has a nice user interface. I once used this to find what was keeping a dial-on-demand connection on all the time - it turned out to be the ntpd daemon.
Yes, I am at the point where I need and want to know more than what dnsmasq is telling me. "Analyzing packets" sounds really nerdy, but it's time for me to face up to it - I AM a nerd. "Hey Charlie, you wanna come over Saturday night and we'll analyze some Ethernet packets?"
CharlieChan wrote:Another method is to install XP on pcnut using another HD; if the new installation is OK then you know it is the stuff you have on the old XP that's causing the problem.
I hadn't thought of using a different hard drive. I had thought of the more complicated solution of installing it on another partition on my existing drive and creating a "dual boot". I own a license for DiskMagic. I'm not anxious to use that WD drive again, that I just replaced with a mucho more quieto Samsung Spinpoint.

David

haysdb
Patron of SPCR
Posts: 2425
Joined: Fri Aug 29, 2003 11:09 pm
Location: Earth

Post by haysdb » Fri Dec 19, 2003 10:46 pm

While my Internet connection has been down, I haven't bothered to fire up Outlook Express, but I did tonight on the thought that "maybe it's an Internet Explorer problem". Well, OE "works", but ever so sloooowly. How slow? According to Task Manager, PEAK throughput is .05% of 100Mbps, and an average around .02%, or a whopping 20 Kbps. That's with a little b, as in bits, not Bytes. It took several minutes just to find out I could expand my penis by 3". :lol:
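The arithmetic behind those figures, as a quick sketch. util_to_kbps is a hypothetical helper, and it assumes Task Manager's percentage is relative to the nominal 100 Mbps link speed:

```shell
#!/bin/sh
# Convert a network utilisation figure into kilobits per second.
# $1 = link speed in Mbps
# $2 = utilisation in hundredths of a percent (2 means 0.02%)
# Integer arithmetic only, so the percentage is passed pre-scaled.
util_to_kbps() {
    echo $(( $1 * 1000 * $2 / 10000 ))
}

util_to_kbps 100 2   # average: prints 20
util_to_kbps 100 5   # peak: prints 50
```

So the 0.02% average works out to 20 Kbps and the 0.05% peak to 50 Kbps, both a tiny fraction of what the link can carry.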

That makes it a different problem. What the heck is slowing this machine to a crawl?

Well, that's it for me. I'm doing a clean install so I can get back to my farm.

David
Last edited by haysdb on Sat Dec 20, 2003 8:43 am, edited 1 time in total.

haysdb
Patron of SPCR
Posts: 2425
Joined: Fri Aug 29, 2003 11:09 pm
Location: Earth

Post by haysdb » Sat Dec 20, 2003 7:45 am

CharlieChan wrote:my setup does not have /etc/resolv.dnsmasq, /etc/dnsmasq.conf.
dnsmasq.conf is optional. It does not exist by default. Options can be passed on the command line, so the same thing could be accomplished by customizing the dnsmasq startup script. I created the file so that I could better document my configuration, and so that I didn't need to touch the startup script, a consideration for future upgrades.

resolv.dnsmasq is optional. The primary reason for the "shell game" with resolv.conf is so that applications run on the server itself, such as a browser, are able to use the name service provided by dnsmasq. If the server is a dedicated server, this extra file is not required.

There is one rather significant but overlooked detail about how dnsmasq uses the resolv.conf file, which explains why changing the 'search' line from ks.ok.cox.net to davidhays.com didn't seem to DO anything.
'info dnsmasq' wrote:-r, --resolv-file=<file>
Read the IP addresses of the upstream nameservers from <file>, instead of /etc/resolv.conf.
The only lines relevant to dnsmasq are nameserver ones.
CharlieChan wrote: Change the /etc/resolv.conf on the farm server to,

search davidhays.com
nameserver 192.168.1.100
nameserver 192.168.1.1

Remember to restart the dnsmasq server.
As noted above, per the dnsmasq documentation, the search line is ignored by dnsmasq.
It rejects the second line since it points back to itself. It doesn't "hurt" to have either line, but they are ignored by dnsmasq.
My /etc/resolv.dnsmasq (which is a private resolv file used only by dnsmasq) now contains just the one line:

Code: Select all

nameserver 192.168.1.1
which is the IP address of my broadband router. The router is not actually a name server, but it knows the addresses of my ISP's name servers (which it gets from DHCP), so it forwards the DNS request.

And finally, restarting dnsmasq is not necessary. It "polls" the resolv file for changes, and picks them up almost immediately.

In other words, this aspect of dnsmasq at least, is much simpler than I was trying to make it. :oops:
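To see what dnsmasq will actually take from a resolv-style file, something like the following sketch pulls out just the nameserver lines. upstream_servers is a hypothetical helper name, and the filter only mirrors the documented "nameserver lines only" rule; it does not reproduce dnsmasq's extra check that drops an address belonging to a local interface:

```shell
#!/bin/sh
# Print only the upstream addresses from a resolv-style file on stdin.
# "search" and other lines are skipped, mirroring the documented rule
# that nameserver lines are the only ones relevant to dnsmasq.
# upstream_servers is a hypothetical helper, not a dnsmasq tool.
upstream_servers() {
    awk '$1 == "nameserver" { print $2 }'
}

# Example input: the file Charlie suggested earlier in the thread.
printf 'search davidhays.com\nnameserver 192.168.1.100\nnameserver 192.168.1.1\n' |
    upstream_servers
```

On the server itself the same check would be `upstream_servers < /etc/resolv.dnsmasq`.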

David

haysdb
Patron of SPCR
Posts: 2425
Joined: Fri Aug 29, 2003 11:09 pm
Location: Earth

Post by haysdb » Sat Dec 20, 2003 9:02 am

My desktop machine that I have been fighting with the last few days was much sicker than I realized. Last night I let Windows try to "repair" my installation. Big Mistake. Now it won't even BOOT. It gets to the "updating windows" screen, tells me it will be finished in 39 minutes, then some minutes later reboots. This morning I tried booting Partition Magic from CD-ROM so I could install Windows in a new partition and still get to my files, and it freezes up. I tried to run a disk checker from floppy and it froze up. This was after reverting to very conservative settings in the BIOS for clock speed and memory timing. At this point I don't know what's broken, but something is majorly broken.

I've just run memory tests for 2 hours with no errors, so that doesn't appear to be the problem. I'm running a "fitness test" on the primary hard drive now. So maybe Partition Magic and the first disk test I tried were just foobar.

OK, to the fun stuff. I'm going to see if I can get a "blade" working.

David

CharlieChan
Patron of SPCR
Posts: 198
Joined: Sun Jul 13, 2003 2:57 am
Location: East Anglia, UK

Post by CharlieChan » Sat Dec 20, 2003 9:04 am

haysdb wrote: In other words, this aspect of dnsmasq at least, is much simpler than I was trying to make it. :oops:
Phew, I was getting worried dnsmasq was more complicated than BIND :lol: .

Before you boot one of your blades, you may consider putting a copy of Red Hat on it to test that all the services are working on the farm server. A temporary use for that WD you retired :wink: .

Charlie.

haysdb
Patron of SPCR
Posts: 2425
Joined: Fri Aug 29, 2003 11:09 pm
Location: Earth

Post by haysdb » Sat Dec 20, 2003 1:55 pm

The first blade is booting over the net.

It's dying trying to mount a root file system, but that's not in the least bit surprising since I haven't configured that part yet. Since I didn't expect to get this far this fast, I hadn't read up on that part yet. :D

David

CharlieChan
Patron of SPCR
Posts: 198
Joined: Sun Jul 13, 2003 2:57 am
Location: East Anglia, UK

Post by CharlieChan » Sat Dec 20, 2003 2:14 pm

:D

Did you have to use the long string to boot?

Charlie.

haysdb
Patron of SPCR
Posts: 2425
Joined: Fri Aug 29, 2003 11:09 pm
Location: Earth

Post by haysdb » Sat Dec 20, 2003 3:40 pm

CharlieChan wrote:Did you have to use the long string to boot?
Nope. Don't need option-28 or option-29 either. Here's the PXE boot section from dhcpd.conf

Code: Select all

subnet 192.168.1.0 netmask 255.255.255.0 {
   range 192.168.1.101 192.168.1.151;
 
   group  {
 
      filename  "/lts/pxelinux.0";
 
      host fah01 { hardware ethernet 00:E0:4C:B2:F7:6B; fixed-address 192.168.1.101; }
     #host fah02 { hardware ethernet xx:xx:xx:xx:xx:xx; fixed-address 192.168.1.102; }
   }
}
This will evolve I'm sure, but this is enough to boot.

Do you know offhand Charlie, whether the information that is displayed on the screen of the network boot clients during boot is logged, or CAN be logged, on the server? A lot of stuff scrolls by before I can read it.

Any suggestions on how to make simple power switches?

David

CharlieChan
Patron of SPCR
Posts: 198
Joined: Sun Jul 13, 2003 2:57 am
Location: East Anglia, UK

Post by CharlieChan » Sat Dec 20, 2003 3:49 pm

haysdb wrote: Do you know offhand Charlie, whether the information that is displayed on the screen of the network boot clients during boot is logged, or CAN be logged, on the server? A lot of stuff scrolls by before I can read it.
You need to put this line inside group.

Code: Select all

      option log-servers   192.168.1.100;
I seem to remember you need to alter some file on the server but can't remember which file.
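From memory, and treat this as an assumption to check against the LTSP documentation, the server-side change is telling syslogd to accept messages from the network. On a Red Hat style system that is usually done through the options line in /etc/sysconfig/syslog, roughly like this:

```shell
# /etc/sysconfig/syslog on the server (assumed Red Hat layout)
# -m 0 turns off the periodic "-- MARK --" lines;
# -r lets syslogd receive messages sent over the network by the clients.
SYSLOGD_OPTIONS="-m 0 -r"
```

The syslog service would then need restarting before the clients' messages show up in the server's logs.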

Charlie.

haysdb
Patron of SPCR
Posts: 2425
Joined: Fri Aug 29, 2003 11:09 pm
Location: Earth

Post by haysdb » Sun Dec 21, 2003 1:03 am

What user ids do the diskless clients run under? I am getting Permission denied when the /home/fah/wsxxx is mounted as /fah.

Along the same lines, how do I "log in" to the blades to poke around and see how they are doing?

The error I am getting is

Code: Select all

mount 192.168.1.100:/home/fah/ws001 failed, reason given by server: Permission denied
mount: nfsmount failed, Bad file descriptor
NFS: mount program didn't pass remote address!
mount: Mounting 192.168.1.100:/home/fah/ws001 on /fah failed: Invalid argument
The mount command from /opt/ltsp/i386/etc/rc.local is

Code: Select all

mount -t nfs -o nolock ${NFS_SERVER}:/home/fah/${HOSTNAME} /fah
I was really hoping to get this up and running tonight, but I'm falling asleep, so I guess I will tackle it fresh tomorrow.

David

NeilBlanchard
Moderator
Posts: 7681
Joined: Mon Dec 09, 2002 7:11 pm
Location: Maynard, MA, Eaarth

An interesting article on how Linux boots...

Post by NeilBlanchard » Sun Dec 21, 2003 3:34 am

Hello:

Over at Ars Technica, there is an interesting article on how a Linux computer boots -- and the BIOS part applies to any x86 computer. Good stuff:

http://www.arstechnica.com/etc/linux/index.html

The next article will deal specifically with booting from the network... 8)

CharlieChan
Patron of SPCR
Posts: 198
Joined: Sun Jul 13, 2003 2:57 am
Location: East Anglia, UK

Post by CharlieChan » Sun Dec 21, 2003 4:52 am

haysdb wrote:What user ids do the diskless clients run under?
root.
I am getting Permission denied when the /home/fah/wsxxx is mounted as /fah.
What is in your /etc/exports, /etc/hosts, /etc/hosts.allow, /etc/hosts.deny ?
Along the same lines, how do I "log in" to the blades to poke around and see how they are doing?
You cannot remotely log in to the clients. The clients do not need logging in; they default to a root console once they finish booting.
The error I am getting is

Code: Select all

mount 192.168.1.100:/home/fah/ws001 failed, reason given by server: Permission denied
mount: nfsmount failed, Bad file descriptor
NFS: mount program didn't pass remote address!
mount: Mounting 192.168.1.100:/home/fah/ws001 on /fah failed: Invalid argument
The mount command from /opt/ltsp/i386/etc/rc.local is

Code: Select all

mount -t nfs -o nolock ${NFS_SERVER}:/home/fah/${HOSTNAME} /fah
You have named your clients fahxxx and it is trying to mount wsxxx directories. Generally ws001 mounts directory ws001 on the server so I do not know what changes you have made.

Thinking..... yes, you are trying to mount

Code: Select all

mount -t nfs -o nolock ${NFS_SERVER}:/home/fah/${HOSTNAME} /fah
which is,

192.168.1.100:/home/fah/fah001 /fah

the directory /home/fah/fah001 does not exist on the server :shock:

But that doesn't seem right as your log indicates it is mounting /home/fah/ws001. If you use the export file I posted then the directories could be mounted by any client on 192.168.1.0/255.255.255.0, i.e. any machine on the network. I really need to see the configuration files as it could be a number of reasons, especially since you have altered some of the default configurations.
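One quick thing to rule out on the server is whether the per-client directory that rc.local asks for actually exists under /home/fah. A small sketch, where check_client_dir is a hypothetical helper and the host names are the two that have come up in this thread:

```shell
#!/bin/sh
# Report whether the directory a client will try to mount exists
# under the export base. check_client_dir is a hypothetical helper.
# $1 = export base on the server, $2 = client host name
check_client_dir() {
    base=$1
    host=$2
    if [ -d "$base/$host" ]; then
        echo "ok: $base/$host"
    else
        echo "missing: $base/$host"
    fi
}

# Check both naming schemes mentioned above.
check_client_dir /home/fah ws001
check_client_dir /home/fah fah01
```

If the directory is there, the next suspects are /etc/exports and the hosts.allow / hosts.deny files; running `showmount -e 192.168.1.100` from another Linux box should list what the server is really exporting.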

Charlie.

haysdb
Patron of SPCR
Posts: 2425
Joined: Fri Aug 29, 2003 11:09 pm
Location: Earth

Post by haysdb » Sun Dec 21, 2003 1:59 pm

It's hard to remember where all the configuration files and scripts are, so I've come up with a way to just open ALL of them. I'm not sure how "elegant" this is or isn't, but it works. 8)

Code: Select all

# Folding@Home configuration files and scripts
 
# /home/fah/config
 
gedit \
   /etc/hosts \
   /etc/exports \
   /etc/crontab \
   /etc/dhcpd.conf \
   /etc/resolv.conf \
   /etc/resolv.dnsmasq \
   /etc/dnsmasq.conf \
   /etc/rc.d/init.d/folding \
   /opt/ltsp/i386/etc/lts.conf \
   /opt/ltsp/i386/etc/rc.local \
   /opt/ltsp/i386/etc/rc.d/startfah \
   /var/lib/dhcp/dhcpd.leases \
   /home/fah/setprot \
   &
 
exit
Am I missing any?

David

haysdb
Patron of SPCR
Posts: 2425
Joined: Fri Aug 29, 2003 11:09 pm
Location: Earth

Post by haysdb » Sun Dec 21, 2003 2:19 pm

CharlieChan wrote:I really need to see the configuration files as it could be a number of reasons, especially since you have altered some of the default configurations.
I was naming the hosts fah<nn> but finally admitted that ws<nnn> would be "easier" in the long run since that's what all the original scripts and configuration files assume. I will look around and see if maybe I missed changing this back somewhere.

David
