This is a collection of notes on various aspects of debugging FreeS/WAN setup and connections. Other sources of information are:
For how to report problems, see the file doc/prob.report.
Error messages generated by KLIPS during the boot sequence are accessible with the dmesg command.
Pluto logs to /var/log/secure and /var/log/messages on most systems.
Check both places to get full information. If you find nothing, check your syslog.conf(5) to see where your system is putting things.
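If the logs are not where you expect, the selector lines in syslog.conf tell you where each facility goes. What follows is a minimal sketch, not a full syslog.conf parser: it ignores priority levels other than "none" and treats anything after a # as a comment. The function name log_destinations and the sample configuration are made up for illustration; your own file will differ.

```python
def log_destinations(conf_text, facility):
    """Return the actions (usually file paths) that receive `facility` messages.

    Simplified reading of syslog.conf: a line matches if any selector names
    the facility (or "*"), unless a later selector on the same line switches
    it off again with the priority "none".
    """
    destinations = []
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        parts = line.split(None, 1)  # selectors, then the action
        if len(parts) != 2:
            continue
        selectors, action = parts
        matched = False
        for selector in selectors.split(";"):
            facilities, _, priority = selector.rpartition(".")
            names = facilities.split(",")
            if facility in names or "*" in names:
                matched = priority != "none"
        if matched:
            # A leading "-" only means "do not sync after each write".
            destinations.append(action.strip().lstrip("-"))
    return destinations


# Hypothetical sample; read your real /etc/syslog.conf instead.
SAMPLE_CONF = """
# everything except mail and authpriv
*.info;mail.none;authpriv.none    /var/log/messages
authpriv.*                        /var/log/secure
"""

print(log_destinations(SAMPLE_CONF, "authpriv"))
```

With this sample, authpriv messages go only to /var/log/secure, because the first line excludes them with authpriv.none.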
Other man pages are on
this list and in
From a message posted to the mailing list Jan 14 2000 by Pluto developer Hugh Redelmeier:
Until ipsec auto and whack/pluto get fixed:

When puzzled by Pluto behaviour, always look in /var/log/secure -- that's the unadulterated story.

To get the whole whack output (almost a subset of the story from Pluto), give auto the --verbose flag on each invocation. Eg:

    ipsec auto --verbose --up sadaisy

Bonus hint: problems snowball. So look for the first problem first, it is likely to be the cause of later problems.

And a final hint: If one side keeps retrying to no avail, it may be because the other is unhappy about something and won't reply. Go look at the other side to figure out what it doesn't like.

Various error messages from Pluto are discussed in the FAQ and the ipsec_pluto(8) man page.
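The "look for the first problem first" hint can be applied mechanically to a long log. This is only a rough sketch: the keyword list is an assumption, not a list of real Pluto message formats, and the sample lines are made-up placeholders rather than actual Pluto output. Adjust the pattern to whatever your own logs contain.

```python
import re

# Assumed keywords; tune these to the messages you actually see.
PROBLEM = re.compile(r"error|fail|cannot|no connection", re.IGNORECASE)


def first_problem(lines, pattern=PROBLEM):
    """Return (line_number, text) for the earliest matching line, or None."""
    for number, text in enumerate(lines, start=1):
        if pattern.search(text):
            return number, text.rstrip("\n")
    return None


# Placeholder lines, NOT real Pluto output.
sample_log = [
    "step one ok\n",
    "step two: cannot find peer (placeholder)\n",
    "step three failed (placeholder)\n",
]

print(first_problem(sample_log))
```

In practice you would pass an open log file instead of the list, for example the /var/log/secure mentioned above.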
Here are the Pluto developer's suggestions for doing this:
Can you get a core dump and use gdb to find out what Pluto was doing when it died?

To get a core dump, you will have to set dumpdir to point to a suitable directory (see ipsec.conf(5)).

To get gdb to tell you interesting stuff:

    $ script
    $ cd dump-directory-you-chose
    $ gdb /usr/local/lib/ipsec/pluto core
    (gdb) where
    (gdb) quit
    $ exit

The resulting output will have been captured by the script command in a file called "typescript". Send it to the list.

Do not delete the core file. I may need to ask you to print out some more relevant stuff.

Note that the dumpdir parameter takes effect only when the IPsec subsystem is restarted -- reboot or ipsec setup restart.
From a mail message from our KLIPS developer:
Here is a catalogue of the types of errors that can occur for which statistics are kept when transmitting and receiving packets via klips. I notice that they are not necessarily logged in the right counter.
. . .

Sources of ifconfig statistics for ipsec devices

rx-errors:
- packet handed to ipsec_rcv that is not an ipsec packet.
- ipsec packet with payload length not modulo 4.
- ipsec packet with bad authenticator length.
- incoming packet with no SA.
- replayed packet.
- incoming authentication failed.
- got esp packet with length not modulo 8.

tx_dropped:
- cannot process ip_options.
- packet ttl expired.
- packet with no eroute.
- eroute with no SA.
- cannot allocate sk_buff.
- cannot allocate kernel memory.
- sk_buff internal error.

The standard counters are:

    struct enet_statistics
    {
        int rx_packets;          /* total packets received */
        int tx_packets;          /* total packets transmitted */
        int rx_errors;           /* bad packets received */
        int tx_errors;           /* packet transmit problems */
        int rx_dropped;          /* no space in linux buffers */
        int tx_dropped;          /* no space available in linux */
        int multicast;           /* multicast packets received */
        int collisions;

        /* detailed rx_errors: */
        int rx_length_errors;
        int rx_over_errors;      /* receiver ring buff overflow */
        int rx_crc_errors;       /* recved pkt with crc error */
        int rx_frame_errors;     /* recv'd frame alignment error */
        int rx_fifo_errors;      /* recv'r fifo overrun */
        int rx_missed_errors;    /* receiver missed packet */

        /* detailed tx_errors */
        int tx_aborted_errors;
        int tx_carrier_errors;
        int tx_fifo_errors;
        int tx_heartbeat_errors;
        int tx_window_errors;
    };

of which I think only the first 6 are useful.
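To watch just the six counters singled out above, you can read them from /proc/net/dev rather than from ifconfig output. The sketch below assumes the usual Linux column layout (eight receive fields followed by eight transmit fields, starting with bytes and packets); very old kernels may lay the file out differently. The function name useful_counters and the sample numbers are invented for illustration.

```python
def useful_counters(proc_net_dev_text, device):
    """Pick the first six enet_statistics counters for one device.

    Assumes each device line looks like
    "name: rx_bytes rx_packets rx_errs rx_drop ... tx_bytes tx_packets tx_errs tx_drop ...".
    Returns None if the device is not listed.
    """
    for line in proc_net_dev_text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep or name.strip() != device:
            continue  # header line or a different interface
        fields = [int(value) for value in rest.split()]
        if len(fields) < 12:
            raise ValueError("unexpected /proc/net/dev layout")
        return {
            "rx_packets": fields[1],
            "tx_packets": fields[9],
            "rx_errors": fields[2],
            "tx_errors": fields[10],
            "rx_dropped": fields[3],
            "tx_dropped": fields[11],
        }
    return None


# Made-up sample in the assumed layout; the numbers mean nothing.
SAMPLE_PROC_NET_DEV = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:    4200      42    0    0    0     0          0         0     4200      42    0    0    0     0       0          0
ipsec0:   98000     700    3    0    0     0          0         0    64000     512    0    7    0     0       0          0
"""

print(useful_counters(SAMPLE_PROC_NET_DEV, "ipsec0"))
```

On a live gateway you would pass the contents of /proc/net/dev instead of the sample. A growing rx_errors or tx_dropped count on an ipsec device points back to the causes in the catalogue above.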
Sometimes you need to test the tunnel between two security gateways. This can be done by having a machine behind one gateway ping a machine behind the other gateway, but this is not always convenient or even possible.
Simply pinging one gateway from the other is not useful. Such a ping does not normally go through the tunnel. The tunnel handles traffic between the two protected subnets, not between the gateways. Depending on the routing in place, a ping might either succeed by finding an unencrypted route or fail because no route is available.
Neither event tells you anything about the tunnel. You can explicitly create an eroute to force such packets through the tunnel, or you can create additional tunnels as described in our configuration document, but those may be unnecessary complications in your situation.
The trick is to explicitly use an IP address for the subnet-side interface of one gateway machine, either as the target of a ping or as the origin of a traceroute. Since that interface is on the protected subnet, the resulting packets do go via the tunnel.
From the mailing list:
> > I have two gateways, SG1 and SG2, with I/Fs i and e (for internal and
> > external), and two hosts, H1 and H2 set up as:
> >
> >   H1-----(i)SG1(e)===========(e)SG2(i)------H2
> >
> > And I want to test a tunnel set up between the H1 subnet and the H2
> > subnet, but the H2 host may not exist yet, or may not be responding.
> >
> > If I ping SG2i from H1, all traffic in both directions is encrypted,
> > testing the tunnel. .....
> > If I understand correctly, this could be accomplished by the 'ping -I'
> > feature of which you spoke earlier or 'traceroute -i'?
>
> Indeed,
>     traceroute -i eth0 -f 20 otherSG
> appears to give me a solution using only N machines, the SGs themselves.
> This is very nice. Note that in this example, eth0 is the *private* (i)
> interface. If you try it with the (e) interface or the ipsec0 interface,
> you won't get the desired result. If you leave off the -f 20, the trace
> will hang in some totally bizarre way.
Some older Linux distributions did not support ping -I, according to mailing list comments. More recent comments indicate that this does now work. For example, you can do:
    ping -I 192.168.10.250 192.168.0.11

to test between the interfaces on the two protected subnets.
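The point of the trick is that both addresses must sit on the protected subnets, otherwise the packets bypass the tunnel. A small sketch of that check follows. The /24 subnet masks are an assumption (the text above gives only the two addresses), and tunnel_ping_command is a hypothetical helper that only builds the command without running it.

```python
import ipaddress


def tunnel_ping_command(source, local_subnet, target, remote_subnet):
    """Build a `ping -I` command, refusing addresses outside the protected subnets."""
    src = ipaddress.ip_address(source)
    dst = ipaddress.ip_address(target)
    if src not in ipaddress.ip_network(local_subnet):
        raise ValueError(f"{src} is not on the local protected subnet {local_subnet}")
    if dst not in ipaddress.ip_network(remote_subnet):
        raise ValueError(f"{dst} is not on the remote protected subnet {remote_subnet}")
    return ["ping", "-I", str(src), str(dst)]


# Addresses from the example above; the /24 masks are assumed.
command = tunnel_ping_command(
    "192.168.10.250", "192.168.10.0/24",
    "192.168.0.11", "192.168.0.0/24",
)
print(" ".join(command))
```

Passing a gateway's external address as the source raises ValueError, which is the mistake the text warns about. The resulting list can be handed to subprocess.run if you want to execute the ping from a script.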
Your mail has inspired me to write a little trouble shooting guide to supplement and connect the existing docs on the subject. Here's v. 1. Comments are welcome.

Steps in Troubleshooting Linux FreeS/WAN:
-----------------------------------------

Finding the Error
-----------------

First, try to find verbose text that describes how things are going wrong or creating unexpected results. Here's how:

While the dialog from ipsec auto --up myconn (or whatever) will tell you where the process fails, it is often not very specific. And for errors that have to do with the use of a conn, you may not even have this.

More information can be gleaned from the log files, usually /var/log/messages or /var/log/secure. On some systems, the logfiles are differently named. To find your error messages, check where your /etc/syslog.conf or equivalent is directing authpriv.

The amount of your error's description in your logs depends on your debug settings, klipsdebug= and plutodebug=, in ipsec.conf. See man ipsec.conf for details. Note that usually, either 'none' or 'all' will be what you want; you don't need to worry about the nuances of the debug options.

If you're having a negotiation problem (as you are, above) plutodebug is most relevant. If you have a connection established but the packets aren't doing what you think they should, play with klipsdebug. See also /doc/ipsec.html#parts for the division of duties within Linux FreeS/WAN.

After raising your debug levels, restart Linux FreeS/WAN to ensure that the conf file is re-read, then re-create the error to generate verbose logs. Proceed to the failure point in the logs and find the handful of lines which succinctly describe how things are going wrong or contrary to your expectation.

Interpreting the Error
----------------------

To interpret this text, use the following resources:

* the FAQ, doc/faq.html. Since the FAQ is constantly being updated, the snapshot may have a new entry relevant to your problem. For example, the faq in today's snapshot addresses several more questions than the version on the site.

* doc/config.html. Instructions for some configurations you can make with Linux FreeS/WAN. See especially doc/config.html#multitunnel, which is useful in a large proportion of the questions we see on the list.

* doc/trouble.html. Debugging instructions and notes. Note that most people now test automatic keying only if that's what they're using in the field, and only revert to manual testing to test unexpected behaviour that seems to be occurring at a very basic level.

* the list archives. There are three: sandelman, nexial, as listed at mail.html, and the archive for the filtered list at exim.org: http://www.exim.org/pipermail/linux-ipsec/ (also listed in the upcoming docs). Each of them works differently, so it's worth checking each. Take a snippet of the text of your error which doesn't include anything site specific, ex. "No connection is known for", and search on this. It's likely you'll find the same answer to someone else's question this way, and it's faster than asking real-time humans ;-)

* Sometimes a quick peek into the code where the error is being generated can be helpful. The pluto code is pretty well documented with comments and meaningful variable names.

Asking for Help
---------------

A combination of the freeswan.org pages mentioned above and an archive search will address nearly every problem. But for those times when you've found something unusual, or your forehead is sore from banging it on your monitor, there's always the mailing list ;-)

When writing the list, remember that more is more -- While sometimes an initial query with a quick description of your intent and error will twig someone's memory of a similar problem, it's often necessary to send a second mail with a complete problem report. See doc/prob.report for details.

Lastly, as a kindness to other list members, you might post a link to a website where you've published your barf file rather than the entire file, if that option's available to you.

Happy trouble shooting,
Claudia
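The archive-search advice in the message above (search on a snippet with nothing site-specific in it) can be roughed out in code. This sketch treats IPv4 addresses, with an optional mask or port, and double-quoted names as site-specific, and keeps the longest remaining fragment. That definition of "site-specific" is an assumption, and the example line is built around the phrase quoted above with documentation-range addresses; it is not verbatim Pluto output.

```python
import re

# Assumed notion of "site specific": quoted names and IPv4 addresses
# with an optional /mask or :port suffix.
SITE_SPECIFIC = re.compile(r'"[^"]*"|\d{1,3}(?:\.\d{1,3}){3}(?:[/:]\d+)?')


def search_snippet(message):
    """Return the longest part of `message` left after removing site-specific text."""
    fragments = [part.strip(" .,:;=-") for part in SITE_SPECIFIC.split(message)]
    return max(fragments, key=len)


# Illustrative line only, not copied from a real log.
example = "No connection is known for 192.0.2.1/32===192.0.2.254"
print(search_snippet(example))
```

For the example line this leaves "No connection is known for", which is the kind of phrase worth pasting into an archive search.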