I was having a strange issue: I would often get prompted for a password when SSHing from one CDF machine to another, even though I had passwordless auth set up. (I was doing all of this from home via VPN.) This was causing all my MPI jobs to fail.
As it turns out, this may be a Kerberos issue. Here is the detailed explanation sent to me by the (super nice!) sysadmin. It may help in case other people experience a similar problem:
First off: you shouldn’t need to set up your own keypair just to
authenticate internally amongst Teaching Labs systems. We
have already set them to trust one another (providing the host
keys match); ssh as the same user amongst any two general-
access hosts won’t demand a password. (There are some
exceptions involving staff-only and special-purpose server
systems, but none of those should affect you.)
There’s something else that could cause trouble, and is a recent
change (that we ought to document better). Did you initially
log in by sitting down in front of a workstation and typing your
password, or did you ssh into some system from outside
(from a CSLab host, perhaps) using a keypair so that you
didn’t type a password?
All lab workstations now use Kerberos authentication to mount
file systems containing users’ home directories. Kerberos
authentication requires a ticket-granting ticket (TGT) that can be
generated only via a process that requires you to type your
password. Whenever you log in (whether at a workstation
console or via ssh) in a way that asks for your password, the
TGT is generated automatically. If you already have a TGT and
ssh from one Teaching Labs system to another, ssh passes
along the TGT. (To be technically correct it uses the TGT from
the first host to generate one for the second host, but the effect
is the same.)
But if you ssh to a Teaching Labs system from outside in a way
that doesn’t involve typing a password, you get no TGT. You
won’t notice this on central systems like wolf (what
teach.cs.toronto.edu points to), because they don’t use Kerberos
mounts. But if you then ssh from wolf to a lab system, you will
have only ‘other’ access to your home directory (or to any other
files), because you have no TGT and therefore Kerberos cannot
grant you access.
You can tell this has happened if the command klist says
No ticket file instead of listing a handful of credentials.
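As an illustrative sketch of that check (not part of the original message): the function names have_tgt and tgt_status and the KLIST override are made up here, and the sketch assumes the installed klist accepts -s (silent mode, exit status only), which both MIT and Heimdal Kerberos provide.

```shell
# Hypothetical helper: succeeds only if klist finds a readable,
# unexpired credential cache. Assumes klist supports -s (silent mode).
# KLIST may be overridden for testing; it defaults to the real klist.
have_tgt() {
    "${KLIST:-klist}" -s
}

# Print a one-line status instead of the full credential list.
tgt_status() {
    if have_tgt; then
        echo "TGT present"
    else
        echo "no TGT: run kinit"
    fi
}
```

Running tgt_status right after logging in shows whether the login produced a ticket.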
You can make a TGT manually by running kinit, which will
prompt for your password, check it with Kerberos, and if all
is well make a TGT. That will be a TGT only for the host where
you run kinit, but ssh will propagate it if you connect anywhere
else (within the Teaching Labs).
To confuse things a little further: if your TGT is made because
you typed a password during login, it is destroyed automatically
when you log out. Likewise for an automatically-created TGT
made by an ssh connection. But if you create a TGT by hand
using kinit, it sticks around for 14 days. So once you run kinit
on (say) wolf, you don’t have to do it again for two weeks,
which can be confusing when you log in on the 15th day and
suddenly what worked yesterday no longer does.
Bottom line: if a student logs in sitting in front of a workstation,
everything should just work. If you or a student logs in from
outside the Teaching Labs with ssh, it is best to
- Connect to teach.cs.toronto.edu, not directly to a workstation.
(From CSLab it is possible to call directly to a workstation; from
anywhere else, only to teach.cs.toronto.edu.)
- If the login didn’t ask for your password, run kinit (on wolf)
before you start using MPI.
I hope this isn’t so long as to be confusing. Please feel free to
teach.cs System Administrator
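For anyone who wants to script the last piece of advice, here is a minimal sketch of a wrapper that makes sure a TGT exists before launching a job. The name with_tgt and the KLIST/KINIT overrides are hypothetical, and it again assumes klist supports -s; treat it as a starting point rather than an official recipe.

```shell
# Hypothetical wrapper: obtain a TGT if none is present, then run the
# command given as arguments. Assumes klist supports -s (silent mode).
# KLIST and KINIT may be overridden; they default to the real tools.
with_tgt() {
    if ! "${KLIST:-klist}" -s; then
        # kinit prompts for the password and, on success, creates the TGT.
        "${KINIT:-kinit}" || return 1
    fi
    "$@"
}
```

On wolf this would be used as something like `with_tgt mpirun -np 4 ./my_job`, where the mpirun arguments and ./my_job are placeholders for your own job.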