当前位置:Linux教程 - 网络过滤 - 网络过滤 - IptablesTutorial1.1.16

网络过滤 - IptablesTutorial1.1.16

IptablesTutorial1.1.16
2004-04-23 15:18 pm
来自:Linux文档
现载:Www.8s8s.coM
地址:无名

Copyright 2001-2002 by Oskar Andreasson

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.1; with the Invariant Sections being "Introduction" and all sub-sections, with the Front-Cover Texts being "Original Author: Oskar Andreasson", and with no Back-Cover Texts. A copy of the license is included in the section entitled "GNU Free Documentation License".

All scripts in this tutorial are covered by the GNU General Public License. The scripts are free source; you can redistribute them and/or modify them under the terms of the GNU General Public License as published by the Free Software Foundation, version 2 of the License.

These scripts are distributed in the hope that they will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License within this tutorial, under the section entitled "GNU General Public License"; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA

Dedications

First of all I would like to dedicate this document to my wonderful girlfriend Ninel. She has supported me more than I ever can support her to any degree. I wish I could make you just as happy as you make me.

Second of all, I would like to dedicate this work to all of the incredibly hard working linux developers and maintainers. It is people like those who make this wonderful operating system possible.

Table of Contents
About the author
How to read
Conventions used in this document
1. Introduction

1.1. Why this document was written
1.2. How it was written
1.3. Terms used in this document
2. Preparations

2.1. Where to get iptables
2.2. Kernel setup
2.3. User-land setup

2.3.1. Compiling the user-land applications
2.3.2. Installation on Red Hat 7.1
3. Traversing of tables and chains

3.1. General
3.2. mangle table
3.3. nat table
3.4. Filter table
4. The state machine

4.1. Introduction
4.2. The conntrack entries
4.3. User-land states
4.4. TCP connections
4.5. UDP connections
4.6. ICMP connections
4.7. Default connections
4.8. Complex protocols and connection tracking
5. Saving and restoring large rule-sets

5.1. Speed considerations
5.2. Drawbacks with restore
5.3. iptables-save
5.4. iptables-restore
6. How a rule is built

6.1. Basics
6.2. Tables
6.3. Commands
6.4. Matches

6.4.1. Generic matches
6.4.2. Implicit matches
6.4.3. Explicit matches
6.4.4. Unclean match
6.5. Targets/Jumps

6.5.1. ACCEPT target
6.5.2. DNAT target
6.5.3. DROP target
6.5.4. LOG target
6.5.5. MARK target
6.5.6. MASQUERADE target
6.5.7. MIRROR target
6.5.8. QUEUE target
6.5.9. REDIRECT target
6.5.10. REJECT target
6.5.11. RETURN target
6.5.12. SNAT target
6.5.13. TOS target
6.5.14. TTL target
6.5.15. ULOG target
7. rc.firewall file

7.1. example rc.firewall
7.2. explanation of rc.firewall

7.2.1. Configuration options
7.2.2. Initial loading of extra modules
7.2.3. proc set up
7.2.4. Displacement of rules to different chains
7.2.5. Setting up default policies
7.2.6. Setting up user specified chains in the filter table
7.2.7. INPUT chain
7.2.8. FORWARD chain
7.2.9. OUTPUT chain
7.2.10. PREROUTING chain of the nat table
7.2.11. Starting SNAT and the POSTROUTING chain
8. Example scripts

8.1. rc.firewall.txt script structure

8.1.1. The structure
8.2. rc.firewall.txt
8.3. rc.DMZ.firewall.txt
8.4. rc.DHCP.firewall.txt
8.5. rc.UTIN.firewall.txt
8.6. rc.test-iptables.txt
8.7. rc.flush-iptables.txt
A. Detailed explanations of special commands

A.1. Listing your active rule-set
A.2. Updating and flushing your tables
B. Common problems and questions

B.1. Problems loading modules
B.2. State NEW packets but no SYN bit set
B.3. Internet Service Providers who use assigned IP addresses
B.4. Letting DHCP requests through iptables
B.5. mIRC DCC problems
C. ICMP types
D. Other resources and links
E. Acknowledgments
F. History
G. GNU Free Documentation License

0. PREAMBLE
1. APPLICABILITY AND DEFINITIONS
2. VERBATIM COPYING
3. COPYING IN QUANTITY
4. MODIFICATIONS
5. COMBINING DOCUMENTS
6. COLLECTIONS OF DOCUMENTS
7. AGGREGATION WITH INDEPENDENT WORKS
8. TRANSLATION
9. TERMINATION
10. FUTURE REVISIONS OF THIS LICENSE
How to use this License for your documents
H. GNU General Public License

0. Preamble
1. TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
2. How to Apply These Terms to Your New Programs
I. Example scripts code-base

I.1. Example rc.firewall script
I.2. Example rc.DMZ.firewall script
I.3. Example rc.UTIN.firewall script
I.4. Example rc.DHCP.firewall script
I.5. Example rc.flush-iptables script
I.6. Example rc.test-iptables script

List of Tables
3-1. Forwarded packets
3-2. Destination local host (our own machine)
3-3. Source local host (our own machine)
4-1. User-land states
4-2. Internal states
6-1. Tables
6-2. Commands
6-3. Options
6-4. Generic matches
6-5. TCP matches
6-6. UDP matches
6-7. ICMP matches
6-8. Limit match options
6-9. MAC match options
6-10. Mark match options
6-11. Multiport match options
6-12. Owner match options
6-13. State matches
6-14. TOS matches
6-15. TTL matches
6-16. DNAT target
6-17. LOG target options
6-18. MARK target options
6-19. MASQUERADE target
6-20. REDIRECT target
6-21. REJECT target
6-22. SNAT target
6-23. TOS target
6-24. TTL target
6-25. ULOG target
C-1. ICMP types
About the author

I am someone with too many old computers on his hands. I have my own LAN and want all my machines to be connected to the Internet, whilst at the same time making my LAN fairly secure. The new iptables is a good upgrade from the old ipchains in this regard. With ipchains, you could make a fairly secure network by dropping all incoming packages not destined for given ports. However, things like passive FTP or outgoing DCC in IRC would cause problems. They assign ports on the server, tell the client about it, and then let the client connect. There were some toothing problems in the iptables code that I ran into in the beginning, and in some respects I found the code not quite ready for release in full production. Today, I'd recommend everyone who uses ipchains or even older ipfwadm etc .,to upgrade - unless they are happy with what their current code is capable of and if it does what they need it to.
How to read

This document was written purely so people can start to grasp the wonderful world of iptables. It was never meant to contain information on specific security bugs in iptables or Netfilter. If you find peculiar bugs or behaviors in iptables or any of the subcomponents, you should contact the Netfilter mailing lists and tell them about the problem and they can tell you if this is a real bug or if it has already been fixed. There are very rarely actual security related bugs found in iptables or Netfilter, however, one or two do slip by once in a while. These are properly shown on the front page of the Netfilter main page, and that is where you should go to get information on such topics.

The above also implies that the rule-sets available with this tutorial are not written to deal with actual bugs inside Netfilter. The main goal of them is to simply show how to set up rules in a nice simple fashion that deals with all problems we may run into. For example, this tutorial will not cover how we would close down the HTTP port for the simple reason that Apache happens to be vulnerable in version 1.2.12 (This is covered really, though not for that reason).

This document was simply written to give everyone a good and simple primer at how to get started with iptables, but at the same time it was created to be as complete as possible. It does not contain any targets or matches that are in patch-o-matic for the simple reason that it would require too much effort to keep such a list updated. If you need information about the patch-o-matic updates, you should read the info that comes with it in patch-o-matic as well as the other documentations available on the Netfilter main page.
Conventions used in this document

The following conventions are used in this document when it comes to commands, files and other specific information.

*

Code excerpts and command-outputs are printed like this, with all output in fixed width font and user-written commands in bold typeface:

[blueflux@work1 neigh]$ ls
default eth0 lo
[blueflux@work1 neigh]$


*

All commands and program names in the tutorial are shown in bold typeface.
*

All system items such as hardware, and also kernel internals or abstract system items such as the loopback interface are all shown in an italic typeface.
*

computer-output is formatted in this way in the text.
*

filenames and paths in the file-system are shown like /usr/local/bin/iptables.

Chapter 1. Introduction
1.1. Why this document was written

Well, I found a big empty space in the HOWTO's out there lacking in information about the iptables and Netfilter functions in the new linux 2.4.x kernels. Among other things, I'm going to try to answer questions that some might have about the new possibilities like state matching. Most of this will be illustrated with an example rc.firewall.txt file that you can use in your /etc/rc.d/ scripts. Yes, this file was originally based upon the masquerading HOWTO for those of you who recognize it.

Also, there's a small script that I wrote just in case you screw up as much as I did during the configuration available as rc.flush-iptables.txt.
1.2. How it was written

I've consulted Marc Boucher and others from the core Netfilter team. Many heartfelt thanks to them for their work and for their help on this tutorial, that I wrote and maintain for boingworld.com. This document will guide you through the setup process step by step and hopefully help you to understand some more about the iptables package. I will base most of the stuff here on the example rc.firewall file, since I find that example a good way to learn how to use iptables. I have decided to just follow the basic chains and from there go down into each and one of the chains traversed in each due order. That way the tutorial is a little bit harder to follow, though this way is more logical. Whenever you find something that's hard to understand, just come back to this tutorial.
1.3. Terms used in this document

This document contains a few terms that may need more detailed explanations before you read them. This section will try to cover the most obvious ones and how I have chosen to use them within this document.

Stream - This term refers to a connection that sends and receives packets that are related to eachother in some fashion. Basically, I have used this term for any kind of connection that sends 2 packets or more in both directions. In TCP this may mean a connection that sends a SYN and then replies with an SYN/ACK, but it may also mean a connection that sends a SYN and then replies with an ICMP Host unreachable. In other words, I use this term very loosely.

State - This term refers to which state the packet is in, either according to RFC 793 - Transmission Control Protocol or according to userside states used in Netfilter/iptables. Note that the used states internally, and externally, do not fully follow the RFC 793 specification. The main reason is that Netfilter has to guess itself to the proper states.

User space - With this term I mean everything and anything that takes place outside the kernel. For example, invoking iptables -h takes place outside the kernel, while iptables -A FORWARD -p tcp -j ACCEPT takes place (partially) within the kernel, since a new rule is added to the ruleset.

Kernel space - This is more or less the opposite of User space. This implies the actions that take place within the kernel, and not outside of the kernel.

Userland - See User space.
Chapter 2. Preparations

This chapter is aimed at getting you started and to help you understand the role Netfilter and iptables play in linux today. This chapter should hopefully get you set up and finished to go with your experimentation, and installation of your firewall. Given time and perseverance, you'll then get it to perform exactly as you want it to.
2.1. Where to get iptables

The iptables user-space package can be downloaded from the Netfilter homepage. The iptables package also makes use of kernel space facilities which can be configured into the kernel during make configure. The necessary steps will be discussed a bit further down in this document.
2.2. Kernel setup

To run the pure basics of iptables you need to configure the following options into the kernel while doing make config or one of its related commands:

CONFIG_PACKET - This option allows applications and utilities that needs to work directly to various network devices. Examples of such utilities are tcpdump or snort.

CONFIG_NETFILTER - This option is required if you're going to use your computer as a firewall or gateway to the Internet. In other words, this is most definitely required if for anything in this tutorial to work at all. I assume you will want this, since you are reading this.

And of course you need to add the proper drivers for your interfaces to work properly, i.e. Ethernet adapter, PPP and SLIP interfaces. The above will only add some of the pure basics in iptables. You won't be able to do anything productive to be honest, it just adds the framework to the kernel. If you want to use the more advanced options in Iptables, you need to set up the proper configuration options in your kernel. Here we will show you the options available in a basic 2.4.9 kernel and a brief explanation :

CONFIG_IP_NF_CONNTRACK - This module is needed to make connection tracking. Connection tracking is used by, among other things, NAT and Masquerading. If you need to firewall machines on a LAN you most definitely should mark this option. For example, this module is required by the rc.firewall.txt script to work.

CONFIG_IP_NF_FTP - This module is required if you want to do connection tracking on FTP connections. Since FTP connections are quite hard to do connection tracking on in normal cases, conntrack needs a so called helper, this option compiles the helper. If you do not add this module you won't be able to FTP through a firewall or gateway properly.

CONFIG_IP_NF_IPTABLES - This option is required if you want do any kind of filtering, masquerading or NAT. It adds the whole iptables identification framework to the kernel. Without this you won't be able to do anything at all with iptables.

CONFIG_IP_NF_MATCH_LIMIT - This module isn't exactly required but it's used in the example rc.firewall.txt. This option provides the LIMIT match, that adds the possibility to control how many packets per minute that are to be matched, governed by an appropriate rule. For example, -m limit --limit 3/minute would match a maximum of 3 packets per minute. This module can also be used to avoid certain Denial of Service attacks.

CONFIG_IP_NF_MATCH_MAC - This allows us to match packets based on MAC addresses. Every Ethernet adapter has its own MAC address. We could for instance block packets based on what MAC address is used and block a certain computer pretty well since the MAC address very seldom change. We don't use this option in the rc.firewall.txt example or anywhere else.

CONFIG_IP_NF_MATCH_MARK - This allows us to use a MARK match. For example, if we use the target MARK we could mark a packet and then depending on if this packet is marked further on in the table, we can match based on this mark. This option is the actual match MARK, and further down we will describe the actual target MARK.

CONFIG_IP_NF_MATCH_MULTIPORT - This module allows us to match packets with a whole range of destination ports or source ports. Normally this wouldn't be possible, but with this match it is.

CONFIG_IP_NF_MATCH_TOS - With this match we can match packets based on their TOS field. TOS stands for Type Of Service. TOS can also be set by certain rules in the mangle table and via the ip/tc commands.

CONFIG_IP_NF_MATCH_TCPMSS - This option adds the possibility for us to match TCP packets based on their MSS field.

CONFIG_IP_NF_MATCH_STATE - This is one of the biggest news in comparison to ipchains. With this module we can do state-full matching on packets. For example, if we have already seen traffic in two directions in a TCP connection, this packet will be counted as ESTABLISHED. This module is used extensively in the rc.firewall.txt example.

CONFIG_IP_NF_MATCH_UNCLEAN - This module will add the possibility for us to match IP, TCP, UDP and ICMP packets that don't conform to type or are invalid. We could for example drop these packets, but we never know if they are legitimate or not. Note that this match is still experimental and might not work perfectly in all cases.

CONFIG_IP_NF_MATCH_OWNER - This option will add the possibility for us to do matching based on the owner of a socket. For example, we can allow only the user root to have Internet access. This module was originally just written as an example on what could be done with the new iptables. Note that this match is still experimental and might not work for everyone.

CONFIG_IP_NF_FILTER - This module will add the basic filter table which will enable you to do IP filtering at all. In the filter table you'll find the INPUT, FORWARD and OUTPUT chains. This module is required if you plan to do any kind of filtering on packets that you receive and send.

CONFIG_IP_NF_TARGET_REJECT - This target allows us to specify that an ICMP error message should be sent in reply to incoming packets, instead of plainly dropping them dead to the floor. Keep in mind that TCP connections, as opposed to ICMP and UDP, are always reset or refused with a TCP RST packet.

CONFIG_IP_NF_TARGET_MIRROR - This allows packets to be bounced back to the sender of the packet. For example, if we set up a MIRROR target on destination port HTTP on our INPUT chain and someone tries to access this port, we would bounce his packets back to him and finally he would probably see his own homepage.

CONFIG_IP_NF_NAT - This module allows network address translation, or NAT, in its different forms. This option gives us access to the nat table in iptables. This option is required if we want to do port forwarding, masquerading, etc. Note that this option is not required for firewalling and masquerading of a LAN, but you should have it present unless you are able to provide unique IP addresses for all hosts. Hence, this option is required for the example rc.firewall.txt script to work properly, and most definitely on your network if you do not have the ability to add unique IP addresses as specified above.

CONFIG_IP_NF_TARGET_MASQUERADE - This module adds the MASQUERADE target. For instance if we don't know what IP we have to the Internet this would be the preferred way of getting the IP instead of using DNAT or SNAT. In other words, if we use DHCP, PPP, SLIP or some other connection that assigns us an IP, we need to use this target instead of SNAT. Masquerading gives a slightly higher load on the computer than NAT, but will work without us knowing the IP address in advance.

CONFIG_IP_NF_TARGET_REDIRECT - This target is useful together with application proxies, for example. Instead of letting a packet pass right through, we remap them to go to our local box instead. In other words, we have the possibility to make a transparent proxy this way.

CONFIG_IP_NF_TARGET_LOG - This adds the LOG target and its functionality to iptables. We can use this module to log certain packets to syslogd and hence see what is happening to the packet. This is invaluable for security audits, forensics or debugging a script you are writing.

CONFIG_IP_NF_TARGET_TCPMSS - This option can be used to counter Internet Service Providers and servers who block ICMP Fragmentation Needed packets. This can result in web-pages not getting through, small mails getting through while larger mails don't, ssh works but scp dies after handshake, etc. We can then use the TCPMSS target to overcome this by clamping our MSS (Maximum Segment Size) to the PMTU (Path Maximum Transmit Unit). This way, we'll be able to handle what the authors of Netfilter them selves call "criminally brain-dead ISPs or servers" in the kernel configuration help.

CONFIG_IP_NF_COMPAT_IPCHAINS - Adds a compatibility mode with the obsolescent ipchains. Do not look to this as any real long term solution for solving migration from linux 2.2 kernels to 2.4 kernels, since it may well be gone with kernel 2.6.

CONFIG_IP_NF_COMPAT_IPFWADM - Compatibility mode with obsolescent ipfwadm. Definitely don't look to this as a real long term solution.

As you can see, there is a heap of options. I have briefly explained here what kind of extra behaviors you can expect from each module. These are only the options available in a vanilla linux 2.4.9 kernel. If you would like to take a look at more options, I suggest you look at the patch-o-matic functions in Netfilter user-land which will add heaps of other options in the kernel. POM fixes are additions that are supposed to be added in the kernel in the future but has not quite reached the kernel yet. These functions should be added in the future, but has not quite made it in yet. This may be for various reasons - such as the patch not being stable yet, to Linus Torvalds being unable to keep up, or not wanting to let the patch in to the mainstream kernel yet since it is still experimental.

You will need the following options compiled into your kernel, or as modules, for the rc.firewall.txt script to work. If you need help with the options that the other scripts need, look at the example firewall scripts section.

*

CONFIG_PACKET
*

CONFIG_NETFILTER
*

CONFIG_CONNTRACK
*

CONFIG_IP_NF_FTP
*

CONFIG_IP_NF_IRC
*

CONFIG_IP_NF_IPTABLES
*

CONFIG_IP_NF_FILTER
*

CONFIG_IP_NF_NAT
*

CONFIG_IP_NF_MATCH_STATE
*

CONFIG_IP_NF_TARGET_LOG
*

CONFIG_IP_NF_MATCH_LIMIT
*

CONFIG_IP_NF_TARGET_MASQUERADE

At the very least the above will be required for the rc.firewall.txt script. In the other example scripts I will explain what requirements they have in their respective sections. For now, let's try to stay focused on the main script which you should be studying now.
2.3. User-land setup

First of all, let's look at how we compile the iptables package. It's important to realize that for most part configuration and compilation of iptables goes hand in hand with the kernel configuration and compilation. Certain distributions comes with the iptables package preinstalled, one of these are Red Hat. However, in Red Hat it is disabled per default. We will check closer on how to enable it and take a look at other distributions further on in this chapter.
2.3.1. Compiling the user-land applications

First of all unpack the iptables package. Here, we have used the iptables 1.2.6a package and a vanilla 2.4 kernel. Unpack as usual, using bzip2 -cd iptables-1.2.6a.tar.bz2 | tar -xvf - (this can also be accomplished with the tar -xjvf iptables-1.2.6a.tar.bz2, which should do pretty much the same as the first command. However, this may not work with older versions of tar). The package should now be unpacked properly into a directory named iptables-1.2.6a. For more information read the iptables-1.2.6a/INSTALL file which contains pretty good information on compiling and getting the program to run.

After this, there you have the option of configuring and installing extra modules and options etcetera for the kernel.The step described here will only check and install standard patches that are pending for inclusion to the kernel, there are some even more experimental patches further along, which may only be available when you carry out other steps.

Note

Some of these patches are highly experimental and may not be such a good idea to install them. However, there are heaps of extremely interesting matches and targets in this installation step so don't be afraid of at least looking at them.

To carry out this step we do something like this from the root of the iptables package:

make pending-patches KERNEL_DIR=/usr/src/linux/

The variable KERNEL_DIR should point to the actual place that your kernel source is located at. Normally this should be /usr/src/linux/ but this may vary, and most probably you will know yourself where the kernel source is available.

Note

This only asks about certain patches that are just about to enter the kernel anyway. There might be more patches and additions that the developers of Netfilter are about to add to the kernel, but is a bit further away from actually getting there. One way to install these are by doing the following:

make most-of-pom KERNEL_DIR=/usr/src/linux/

The above command would ask about installing parts of what in Netfilter world is called patch-o-matic, but still skip the most extreme patches that might cause havoc in your kernel. Note that we say ask, because that's what these commands actually do. They ask you before anything is changed in the kernel source. To be able to install all of the patch-o-matic stuff you will need to run the following command:

make patch-o-matic KERNEL_DIR=/usr/src/linux/

Don't forget to read the help for each patch thoroughly before doing anything. Some patches will destroy other patches while others may destroy your kernel if used together with some patches from patch-o-matic etc.

Note

You may totally ignore the above steps if you don't want to patch your kernel, it is in other words not necessary to do the above. However, there are some really interesting things in the patch-o-matic that you may want to look at so there's nothing bad in just running the commands and see what they contain.

After this you are finished doing the patch-o-matic parts of installation, you may now compile a new kernel making use of the new patches that you have added to the source. Don't forget to configure the kernel again since the new patches probably are not added to the configured options. You may wait with the kernel compilation until after the compilation of the user-land program iptables if you feel like it, though.

Continue by compiling the iptables user-land application. To compile iptables you issue a simple command that looks like this:

make KERNEL_DIR=/usr/src/linux/

The user-land application should now compile properly. If not, you are on your own, or you could subscribe to the Netfilter mailing list, where you have the chance of asking for help you with your problems. There are a few things that might go wrong with the installation of iptables, so don't panic if it won't work. Try to think logically about it and find out what's wrong, or get someone to help you.

If everything has worked smoothly, you're ready to install the binaries by now. To do this, you would issue the following command to install them:

make install KERNEL_DIR=/usr/src/linux/

Hopefully everything should work in the program now. To use any of the changes in the iptables user-land applications you should now recompile and reinstall your kernel and modules, if you hadn't done so before. For more information about installing the user-land applications from source, check the INSTALL file in the source which contains excellent information on the subject of installation.
2.3.2. Installation on Red Hat 7.1

Red Hat 7.1 comes preinstalled with a 2.4.x kernel that has Netfilter and iptables compiled in. It also contains all the basic user-land programs and configuration files that is needed to run it. However, the Red Hat people have disabled the whole thing by using the backward compatible ipchains module. Annoying to say the least, and a lot of people keep asking different mailing lists why iptables don't work. So, let's take a brief look at how to turn the ipchains module off and how to install iptables instead.

Note

The default Red Hat 7.1 installation today comes with an hopelessly old version of the user-space applications, so you might want to compile a new version of the applications as well as install a new and custom compiled kernel before fully exploiting iptables.

First of all you will need to turn off the ipchains modules so it won't start in the future. To do this, you will need to change some filenames in the /etc/rc.d/ directory-structure. The following command should do it:

chkconfig --level 0123456 ipchains off

By doing this we move all the soft links that points to the /etc/rc.d/init.d/ipchains script to K92ipchains. The first letter which per default would be S, tells the initscripts to start the script. By changing this to K we tell it to Kill the service instead, or to not run it if it was not previously started. Now the service won't be started in the future.

However, to stop the service from actually running right now we need to run another command. This is the service command which can be used to work on currently running services. We would then issue the following command to stop the ipchains service:

service ipchains stop

Finally, to start the iptables service. First of all, we need to know which run-levels we want it to run in. Normally this would be in run-level 2, 3 and 5. These run-levels are used for the following things:

*

2. Multiuser without NFS or the same as 3 if there is no networking.
*

3. Full multiuser mode, i.e. the normal run-level to run in.
*

5. X11. This is used if you automatically boot into Xwindows.

To make iptables run in these run-levels we would do the following commands:

chkconfig --level 235 iptables on

The above commands would in other words make the iptables service run in run-level 2, 3 and 5. If you'd like the iptables service to run in some other run-level you would have to issue the same command in those. However, none of the other run-levels should be used, so you should not really need to activate it for those run-levels. Level 1 is for single user mode, i.e, when you need to fix a screwed up box. Level 4 should be unused, and level 6 is for shutting the computer down.

To activate the iptables service, we just run the following command:

service iptables start

There are no rules in the iptables script. To add rules to an Red Hat 7.1 box, there is two common ways. Firstly, you could edit the /etc/rc.d/init.d/iptables script. This would have the undesired effect of deleting all the rules if you updated the iptables package by RPM. The other way would be to load the rule-set and then save it with the iptables-save command and then have it loaded automatically by the rc.d scripts.

First we will describe the how to set up iptables by cutting and pasting to the iptables init.d script. To add rules that are to be run when the computer starts the service, you add them under the start) section, or in the start() function. Note, if you add the rules under the start) section don't forget to stop the start() function in the start) section from running. Also, don't forget to edit a the stop) section either which tells the script what to do when the computer is going down for example, or when we are entering a run-level that doesn't require iptables. Also, don't forget to check out the restart section and condrestart. Note that all this work will probably be trashed if you have, for example, Red Hat Network automatically update your packages. It may also be trashed by updating from the iptables RPM package.

The second way of doing the set up would require the following: First of all, make and write a rule-set in a file, or directly with iptables, that will meet your requirements, and don't forget to experiment a bit. When you find a set up that works without problems, or as you can see without bugs, use the iptables-save command. You could either use it normally, i.e. iptables-save > /etc/sysconfig/iptables, which would save the rule-set to the file /etc/sysconfig/iptables. This file is automatically used by the iptables rc.d script to restore the rule-set in the future. The other way is to save the script by doing service iptables save, which would save the script automatically to /etc/sysconfig/iptables. The next time you reboot the computer, the iptables rc.d script will use the command iptables-restore to restore the rule-set from the save-file /etc/sysconfig/iptables. Do not intermix these two methods, since they may heavily damage each other and render your firewall configuration useless.

When all of these steps are finished, you can deinstall the currently installed ipchains and iptables packages. This because we don't want the system to mix up the new iptables user-land application with the old preinstalled iptables applications. This step is only necessary if you are going to install iptables from the source package. It's not unusual that the new and the old package to get mixed up, since the rpm based installation installs the package in non-standard places and won't get overwritten by the installation for the new iptables package. To carry out the deinstallation, do as follows:

rpm -e iptables

And why keep ipchains lying around if you won't be using it any more? Removing it is done the same way as with the old iptables binaries, etc:

rpm -e ipchains

After all this has been completed, you will have finished with the update of the iptables package from source, having followed the source installation instructions. None of the old binaries, libraries or include files etc should be lying around any more.
Chapter 3. Traversing of tables and chains

In this chapter we'll discuss how packets traverse the different chains, and in which order. We will also discuss the order in which the tables are traversed. We'll see how valuable this is later on, when we write our own specific rules. We will also look at the points which certain other components, that also are kernel dependent, enter into the picture. Which is to say the different routing decisions and so on. This is especially necessary if we want to write iptables rules that could change routing patterns/rules for packets; i.e. why and how the packets get routed, good examples of this is DNAT and SNAT. Not to be forgotten are, of course, the TOS bits.
3.1. General

When a packet first enters the firewall, it hits the hardware and then gets passed on to the proper device driver in the kernel. Then the packet starts to go through a series of steps in the kernel, before it is either sent to the correct application (locally), or forwarded to another host - or whatever happens to it. In this example, we're assuming that the packet is destined for another host on another network. The packet goes through the different steps in the following fashion:

Table 3-1. Forwarded packets
Step Table Chain Comment
1 On the wire (i.e., Internet)
2 Comes in on the interface (i.e., eth0)
3 mangle PREROUTING This chain is normally used for mangling packets, i.e., changing TOS and so on.
4 nat PREROUTING This chain is used for Destination Network Address Translation mainly. Source Network Address Translation is done further on. Avoid filtering in this chain since it will be bypassed in certain cases.
5 Routing decision, i.e., is the packet destined for our local host or to be forwarded and where.
6 mangle FORWARD The packet is then sent on to the FORWARD chain of the mangle table. This can be used for very specific needs, where we want to mangle the packets after the initial routing decision, but before the last routing decision made just before the packet is sent out.
7 Filter FORWARD The packet gets routed onto the FORWARD chain. Only forwarded packets go through here, and here we do all the filtering. Note that all traffic that's forwarded goes through here (not only in one direction), so you need to think about it when writing your rule-set.
8 mangle POSTROUTING This chain is used for specific types of packet mangling that we wish to take place after all kinds of routing decisions has been done, but still on this machine.
9 nat POSTROUTING This chain should first and foremost be used for Source Network Address Translation. Avoid doing filtering here, since certain packets might pass this chain without ever hitting it. This is also where Masquerading is done.
10 Goes out on the outgoing interface (i.e., eth1).
11 Out on the wire again (i.e., LAN).

As you can see, there are quite a lot of steps to pass through. The packet can be stopped at any of the iptables chains, or anywhere else if it is malformed; however, we are mainly interested in the iptables aspect of this lot. Do note that there are no specific chains or tables for different interfaces or anything like that. FORWARD is always passed by all packets that are forwarded over this firewall/router. Do not use the INPUT chain to filter on in the previous scenario! INPUT is meant solely for packets to our local host that do not get routed to any other destination.

Now, let us have a look at a packet that is destined for our own local host. It would pass through the following steps before actually being delivered to our application that receives it:

Table 3-2. Destination local host (our own machine)
Step Table Chain Comment
1 On the wire (e.g., Internet)
2 Comes in on the interface (e.g., eth0)
3 mangle PREROUTING This chain is normally used for mangling packets, i.e., changing TOS and so on.
4 nat PREROUTING This chain is used for Destination Network Address Translation mainly. Avoid filtering in this chain since it will be bypassed in certain cases.
5 Routing decision, i.e., is the packet destined for our local host or to be forwarded and where.
6 mangle INPUT At this point, the mangle INPUT chain is hit. We use this chain to mangle packets, after they have been routed, but before they are actually sent to the process on the machine.
7 filter INPUT This is where we do filtering for all incoming traffic destined for our local host. Note that all incoming packets destined for this host pass through this chain, no matter what interface or in which direction they came from.
8 Local process/application (i.e., server/client program)

Note that this time the packet was passed through the INPUT chain instead of the FORWARD chain. Quite logical. Most probably the only thing that's really logical about the traversing of tables and chains in your eyes in the beginning, but if you continue to think about it, you'll find it will get clearer in time.

Finally we look at the outgoing packets from our own local host and what steps they go through.

Table 3-3. Source local host (our own machine)
Step Table Chain Comment
1 Local process/application (i.e., server/client program)
2 Routing decision. What source address to use, what outgoing interface to use, and other necessary information that needs to be gathered.
3 mangle OUTPUT This is where we mangle packets, it is suggested that you do not filter in this chain since it can have side effects.
4 nat OUTPUT This chain can be used to NAT outgoing packets from the firewall itself.
5 Filter OUTPUT This is where we filter packets going out from the local host.
6 mangle POSTROUTING The POSTROUTING chain in the mangle table is mainly used when we want to do mangling on packets before they leave our host, but after the actual routing decisions. This chain will be hit by both packets just traversing the firewall, as well as packets created by the firewall itself.
7 nat POSTROUTING This is where we do Source Network Address Translation as described earlier. It is suggested that you don't do filtering here since it can have side effects, and certain packets might slip through even though you set a default policy of DROP.
8 Goes out on some interface (e.g., eth0)
9 On the wire (e.g., Internet)

We have now seen how the different chains are traversed in three separate scenarios. If we were to figure out a good map of all this, it would look something like this:

To clarify this image, consider this. If we get a packet into the first routing decision that is not destined for the local machine itself, it will be routed through the FORWARD chain. If the packet is, on the other hand, destined for an IP address that the local machine is listening to, we would send the packet through the INPUT chain and to the local machine.

Also worth a note, is the fact that packets may be destined for the local machine, but the destination address may be changed within the PREROUTING chain by doing NAT. Since this takes place before the first routing decision, the packet will be looked upon after this change. Because of this, the routing may be changed before the routing decision is done. Do note, that all packets will be going through one or the other path in this image. If you DNAT a packet back to the same network that it came from, it will still travel through the rest of the chains until it is back out on the network.

Tip

If you feel that you want more information, you could use the rc.test-iptables.txt script. This test script should give you the necessary rules to test how the tables and chains are traversed.
3.2. mangle table

This table should as we've already noted mainly be used for mangling packets. In other words, you may freely use the mangle matches etc that could be used to change TOS (Type Of Service) fields and so on.

Caution

You are strongly advised not to use this table for any filtering; nor will any DNAT, SNAT or Masquerading work in this table.

Targets that are only valid in the mangle table:

*

TOS
*

TTL
*

MARK

The TOS target is used to set and/or change the Type of Service field in the packet. This could be used for setting up policies on the network regarding how a packet should be routed and so on. Note that this has not been perfected and is not really implemented on the Internet and most of the routers don't care about the value in this field, and sometimes, they act faulty on what they get. Don't set this in other words for packets going to the Internet unless you want to make routing decisions on it, with iproute2.

The TTL target is used to change the TTL (Time To Live) field of the packet. We could tell packets to only have a specific TTL and so on. One good reason for this could be that we don't want to give ourself away to nosy Internet Service Providers. Some Internet Service Providers do not like users running multiple computers on one single connection, and there are some Internet Service Providers known to look for a single host generating different TTL values, and take this as one of many signs of multiple computers connected to a single connection.

The MARK target is used to set special mark values to the packet. These marks could then be recognized by the iproute2 programs to do different routing on the packet depending on what mark they have, or if they don't have any. We could also do bandwidth limiting and Class Based Queuing based on these marks.
3.3. nat table

This table should only be used for NAT (Network Address Translation) on different packets. In other words, it should only be used to translate the packet's source field or destination field. Note that, as we have said before, only the first packet in a stream will hit this chain. After this, the rest of the packets will automatically have the same action taken on them as the first packet. The actual targets that do these kind of things are:

*

DNAT
*

SNAT
*

MASQUERADE

The DNAT (Destination Network Address Translation) target is mainly used in cases where you have a public IP and want to redirect accesses to the firewall to some other host (on a DMZ for example). In other words, we change the destination address of the packet and reroute it to the host.

SNAT (Source Network Address Translation) is mainly used for changing the source address of packets. For the most part you'll hide your local networks or DMZ, etc. A very good example would be that of a firewall of which we know outside IP address, but need to substitute our local network's IP numbers with that of our firewall. With this target the firewall will automatically SNAT and De-SNAT the packets, hence making it possible to make connections from the LAN to the Internet. If your network uses 192.168.0.0/netmask for example, the packets would never get back from the Internet, because IANA has regulated these networks (among others) as private and only for use in isolated LANs.

The MASQUERADE target is used in exactly the same way as SNAT, but the MASQUERADE target takes a little bit more overhead to compute. The reason for this, is that each time that the MASQUERADE target gets hit by a packet, it automatically checks for the IP address to use, instead of doing as the SNAT target does - just using the single configured IP address. The MASQUERADE target makes it possible to work properly with Dynamic DHCP IP addresses that your ISP might provide for your PPP, PPPoE or SLIP connections to the Internet.
3.4. Filter table

The filter table is, of course, mainly used for filtering packets. We can match packets and filter them in whatever way we want. There is nothing special to this chain or to packets that might slip through because they are malformed, etc. This is the place that we actually take action against packets and look at what they contain and DROP or /ACCEPT them, depending on their payload. Of course we may also do prior filtering; however, this particular table, is the place for which filtering was designed. Almost all targets are usable in this chain; however do keep in mind that the targets discussed previously too, however, this is the place that was designed for it. Almost all targets are usable in this chain, however, the targets discussed previously in this chapter can only be used in their respective tables. We will be more prolific about the filter table here; however you now know that this table is the right place to do your main filtering.
Chapter 4. The state machine

This chapter will deal with the state machine and explain it in detail. After reading trough it, you should have a complete understanding of how the State machine works. We will also go through a large set of examples on how states are dealt within the state machine itself. These should clarify everything in practice.
4.1. Introduction

The state machine is a special part within iptables that should really not be called the state machine at all, since it is really a connection tracking machine. However, most people recognize it under the first name. Throughout this chapter i will use this names more or less as if they where synonymous. This should not be overly confusing. Connection tracking is done to let the Netfilter framework know the state of a specific connection. Firewalls that implement this are generally called state-full firewalls. A state-full firewall is generally much more secure than non-state-full firewalls since it allows us to write much tighter rule-sets.

Within iptables, packets can be related to tracked connections in four different so called states. These are known as NEW, ESTABLISHED, RELATED and INVALID. We will discuss each of these in more depth later. With the --state match we can easily control who or what is allowed to initiate new sessions.

All of the connection tracking is done by special framework within the kernel called conntrack. conntrack may be loaded either as a module, or as an internal part of the kernel itself. Most of the time, we need and want more specific connection tracking than the default conntrack engine can maintain. Because of this, there are also more specific parts of conntrack that handles the TCP, UDP or ICMP protocols among others. These modules grabs specific, unique, information from the packets, so that they may keep track of each stream of data. The information that conntrack gathers is then used to tell conntrack in which state the stream is currently in. For example, UDP streams are, generally, uniquely identified by their destination IP address, source IP address, destination port and source port.

In previous kernels, we had the possibility to turn on and off defragmentation. However, since iptables and Netfilter were introduced and connection tracking in particular, this option was gotten rid of. The reason for this is that connection tracking can not work properly without defragmenting packets, and hence defragmenting has been incorporated into conntrack and is carried out automatically. It can not be turned off, except by turning off connection tracking. Defragmentation is always carried out if connection tracking is turned on.

All connection tracking is handled in the PREROUTING chain, except locally generated packets which are handled in the OUTPUT chain. What this means is that iptables will do all recalculation of states and so on within the PREROUTING chain. If we send the initial packet in a stream, the state gets set to NEW within the OUTPUT chain, and when we receive a return packet, the state gets changed in the PREROUTING chain to ESTABLISHED, and so on. If the first packet is not originated by ourself, the NEW state is set within the PREROUTING chain of course. So, all state changes and calculations are done within the PREROUTING and OUTPUT chains of the nat table.
4.2. The conntrack entries

Let's take a brief look at a conntrack entry and how to read them in /proc/net/ip_conntrack. This gives a list of all the current entries in your conntrack database. If you have the ip_conntrack module loaded, a cat of /proc/net/ip_conntrack might look like:

tcp 6 117 SYN_SENT src=192.168.1.6 dst=192.168.1.9 sport=32775
dport=22 [UNREPLIED] src=192.168.1.9 dst=192.168.1.6 sport=22
dport=32775 use=2


This example contains all the information that the conntrack module maintains to know which state a specific connection is in. First of all, we have a protocol, which in this case is tcp. Next, the same value in normal decimal coding. After this, we see how long this conntrack entry has to live. This value is set to 117 seconds right now and is decremented regularly until we see more traffic. This value is then reset to the default value for the specific state that it is in at that relevant point of time. Next comes the actual state that this entry is in at the present point of time. In the above mentioned case we are looking at a packet that is in the SYN_SENT state. The internal value of a connection is slightly different from the ones used externally with iptables. The value SYN_SENT tells us that we are looking at a connection that has only seen a TCP SYN packet in one direction. Next, we see the source IP address, destination IP address, source port and destination port. At this point we see a specific keyword that tells us that we have seen no return traffic for this connection. Lastly, we see what we expect of return packets. The information details the source IP address and destination IP address (which are both inverted, since the packet is to be directed back to us). The same thing goes for the source port and destination port of the connection. These are the values that should be of any interest to us.

The connection tracking entries may take on a series of different values, all specified in the conntrack headers available in linux/include/netfilter-ipv4/ip_conntrack*.h files. These values are dependent on which sub-protocol of IP we use. TCP, UDP or ICMP protocols take specific default values as specified in linux/include/netfilter-ipv4/ip_conntrack.h. We will look closer at this when we look at each of the protocols; however, we will not use them extensively through this chapter, since they are not used outside of the conntrack internals. Also, depending on how this state changes, the default value of the time until the connection is destroyed will also change.

Note

Recently there was a new patch made available in iptables patch-o-matic, called tcp-window-tracking. This patch adds, among other things, all of the above timeouts to special sysctl variables, which means that they can be changed on the fly, while the system is still running. Hence, this makes it unnecessary to recompile the kernel every time you want to change the timeouts.

These can be altered via using specific system calls available in the /proc/sys/net/ipv4/netfilter directory. You should in particular look at the /proc/sys/net/ipv4/netfilter/ip_ct_* variables.

When a connection has seen traffic in both directions, the conntrack entry will erase the [UNREPLIED] flag, and then reset it. The entry tells us that the connection has not seen any traffic in both directions, will be replaced by the [ASSURED] flag, to be found close to the end of the entry. The [ASSURED] flag tells us that this connection is assured and that it will not be erased if we reach the maximum possible tracked connections. Thus, connections marked as [ASSURED] will not be erased, contrary to the non assured connections (those not marked as [ASSURED]). How many connections that the connection tracking table can hold depends upon a variable that can be set through the ip-sysctl functions in recent kernels. The default value held by this entry varies heavily depending on how much memory you have. On 128 MB of RAM you will get 8192 possible entries, and at 256 MB of RAM, you will get 16376 entries. You can read and set your settings through the /proc/sys/net/ipv4/ip_conntrack_max setting.
4.3. User-land states

As you have seen, packets may take on several different states within the kernel itself, depending on what protocol we are talking about. However, outside the kernel, we only have the 4 states as described previously. These states can mainly be used in conjunction with the state match which will then be able to match packets based on their current connection tracking state. The valid states are NEW, ESTABLISHED, RELATED and INVALID states. The following table will briefly explain each possible state.

Table 4-1. User-land states
State Explanation
NEW The NEW state tells us that the packet is new in the connection. This means that the first packet that the conntrack module sees will be matched. For example, if we see a SYN packet and it is the first packet in a connection that we see, it will match. However, the packet may as well not be a SYN packet and still be considered NEW. This may lead to certain problems in some instances, but it may also be extremely helpful when we need to pick up lost connections from other firewalls, or when a connection has already timed out, but in reality is not closed.
ESTABLISHED The ESTABLISHED state has seen traffic in both directions and will then continuously match those packets. ESTABLISHED connections are fairly easy to understand. The only requirement to get into an ESTABLISHED state is that one host sends a packet, and that it later on gets a reply from the other host. The NEW state will upon receipt of the reply packet to or through the firewall change to the ESTABLISHED state. ICMP error messages and redirects etc can also be considered as ESTABLISHED, if we have generated a packet that in turn generated the ICMP message.
RELATED The RELATED state is one of the more tricky states. A connection is considered RELATED when it is related to another already ESTABLISHED connection. What this means, is that for a connection to be considered as RELATED, we must first have a connection that is considered ESTABLISHED. The ESTABLISHED connection will then spawn a connection outside of the main connection. The newly spawned connection will then be considered RELATED, if the conntrack module is able to understand that it is RELATED. Some good examples of connections that can be considered as RELATED are the FTP-data connections that are considered RELATED to the FTP control port, and the DCC connections issued through IRC. This could be used to allow ICMP replies, FTP transfers and DCC's to work properly through the firewall. Do note that most TCP protocols and some UDP protocols that rely on this mechanism are quite complex and send connection information within the payload of the TCP or UDP data segments, and hence require special helper modules to be correctly understood.
INVALID The INVALID state means that the packet can not be identified or that it does not have any state. This may be due to several reasons, such as the system running out of memory or ICMP error messages that do not respond to any known connections. Generally, it is a good idea to DROP everything in this state.

These states can be used together with the --state match to match packets based on their connection tracking state. This is what makes the state machine so incredibly strong and efficient for our firewall. Previously, we often had to open up all ports above 1024 to let all traffic back into our local networks again. With the state machine in place this is not necessary any longer, since we can now just open up the firewall for return traffic and not for all kinds of other traffic.
4.4. TCP connections

In this section and the upcoming ones, we will take a closer look at the states and how they are handled for each of the three basic protocols TCP, UDP and ICMP. Also, we will take a closer look at how connections are handled per default, if they can not be classified as either of these three protocols. We have chosen to start out with the TCP protocol since it is a state-full protocol in itself, and has a lot of interesting details with regard to the state machine in iptables.

A TCP connection is always initiated with the 3-way handshake, which establishes and negotiates the actual connection over which data will be sent. The whole session is begun with a SYN packet, then a SYN/ACK packet and finally an ACK packet to acknowledge the whole session establishment. At this point the connection is established and able to start sending data. The big problem is, how does connection tracking hook up into this? Quite simply really.

As far as the user is concerned, connection tracking works basically the same for all connection types. Have a look at the picture below to see exactly what state the stream enters during the different stages of the connection. As you can see, the connection tracking code does not really follow the flow of the TCP connection, from the users viewpoint. Once it has seen one packet(the SYN), it considers the connection as NEW. Once it sees the return packet(SYN/ACK), it considers the connection as ESTABLISHED. If you think about this a second, you will understand why. With this particular implementation, you can allow NEW and ESTABLISHED packets to leave your local network, only allow ESTABLISHED connections back, and that will work perfectly. Conversely, if the connection tracking machine were to consider the whole connection establishment as NEW, we would never really be able to stop outside connections to our local network, since we would have to allow NEW packets back in again. To make things more complicated, there is a number of other internal states that are used for TCP connections inside the kernel, but which are not available for us in User-land. Roughly, they follow the state standards specified within RFC 793 - Transmission Control Protocol at page 21-23. We will consider these in more detail further along in this section.

As you can see, it is really quite simple, seen from the user's point of view. However, looking at the whole construction from the kernel's point of view, it's a little more difficult. Let's look at an example. Consider exactly how the connection states change in the /proc/net/ip_conntrack table. The first state is reported upon receipt of the first SYN packet in a connection.

tcp 6 117 SYN_SENT src=192.168.1.5 dst=192.168.1.35 sport=1031
dport=23 [UNREPLIED] src=192.168.1.35 dst=192.168.1.5 sport=23
dport=1031 use=1


As you can see from the above entry, we have a precise state in which a SYN packet has been sent, (the SYN_SENT flag is set), and to which as yet no reply has been sent (witness the [UNREPLIED] flag). The next internal state will be reached when we see another packet in the other direction.

tcp 6 57 SYN_RECV src=192.168.1.5 dst=192.168.1.35 sport=1031
dport=23 src=192.168.1.35 dst=192.168.1.5 sport=23 dport=1031
use=1


Now we have received a corresponding SYN/ACK in return. As soon as this packet has been received, the state changes once again, this time to SYN_RECV. SYN_RECV tells us that the original SYN was delivered correctly and that the SYN/ACK return packet also got through the firewall properly. Moreover, this connection tracking entry has now seen traffic in both directions and is hence considered as having been replied to. This is not explicit, but rather assumed, as was the [UNREPLIED] flag above. The final step will be reached once we have seen the final ACK in the 3-way handshake.

tcp 6 431999 ESTABLISHED src=192.168.1.5 dst=192.168.1.35
sport=1031 dport=23 src=192.168.1.35 dst=192.168.1.5
sport=23 dport=1031 use=1


In the last example, we have gotten the final ACK in the 3-way handshake and the connection has entered the ESTABLISHED state, as far as the internal mechanisms of iptables are aware. After a few more packets, the connection will also become [ASSURED], as shown in the introduction section of this chapter.

When a TCP connection is closed down, it is done in the following way and takes the following states.

As you can see, the connection is never really closed until the last ACK is sent. Do note that this picture only describes how it is closed down under normal circumstances. A connection may also, for example, be closed by sending a RST(reset), if the connection were to be refused. In this case, the connection would be closed down after a predetermined time.

When the TCP connection has been closed down, the connection enters the TIME_WAIT state, which is per default set to 2 minutes. This is used so that all packets that have gotten out of order can still get through our rule-set, even after the connection has already closed. This is used as a kind of buffer time so that packets that have gotten stuck in one or another congested router can still get to the firewall, or to the other end of the connection.

If the connection is reset by a RST packet, the state is changed to CLOSE. This means that the connection per default have 10 seconds before the whole connection is definitely closed down. RST packets are not acknowledged in any sense, and will break the connection directly. There are also other states than the ones we have told you about so far. Here is the complete list of possible states that a TCP stream may take, and their timeout values.

Table 4-2. Internal states
State Timeout value
NONE 30 minutes
ESTABLISHED 5 days
SYN_SENT 2 minutes
SYN_RECV 60 seconds
FIN_WAIT 2 minutes
TIME_WAIT 2 minutes
CLOSE 10 seconds
CLOSE_WAIT 12 hours
LAST_ACK 30 seconds
LISTEN> 2 minutes

These values are most definitely not absolute. They may change with kernel revisions, and they may also be changed via the proc file-system in the /proc/sys/net/ipv4/netfilter/ip_ct_tcp_* variables. The default values should, however, be fairly well established in practice. These values are set in jiffies (or 1/100th parts of seconds), so 3000 means 30 seconds.

Note

Also note that the User-land side of the state machine does not look at TCP flags set in the TCP packets. This is generally bad, since you may want to allow packets in the NEW state to get through the firewall, but when you specify the NEW flag, you will in most cases mean SYN packets.

This is not what happens with the current state implementation; instead, even a packet with no bit set or an ACK flag, will count as NEW and if you match on NEW packets. This can be used for redundant firewalling and so on, but it is generally extremely bad on your home network, where you only have a single firewall. To get around this behavior, you could use the command explained in the State NEW packets but no SYN bit set section of the Common problems and questions appendix. Another way is to install the tcp-window-tracking extension from patch-o-matic, which will make the firewall able to track states depending on the TCP window settings.
4.5. UDP connections

UDP connections are in them selves not state-full connections, but rather stateless. There are several reasons why, mainly because they don't contain any connection establishment or connection closing; most of all they lack sequencing. Receiving two UDP datagrams in a specific order does not say anything about which order in which they where sent. It is, however, still possible to set states on the connections within the kernel. Let's have a look at how a connection can be tracked and how it might look in conntrack.

As you can see, the connection is brought up almost exactly in the same way as a TCP connection. That is, from the user-land point of view. Internally, conntrack information looks quite a bit different, but intrinsically the details are the same. First of all, let's have a look at the entry after the initial UDP packet has been sent.

udp 17 20 src=192.168.1.2 dst=192.168.1.5 sport=137 dport=1025
[UNREPLIED] src=192.168.1.5 dst=192.168.1.2 sport=1025
dport=137 use=1


As you can see from the first and second values, this is an UDP packet. The first is the protocol name, and the second is protocol number. This is just the same as for TCP connections. The third value marks how many seconds this state entry has to live. After this, we get the values of the packet that we have seen and the future expectations of packets over this connection reaching us from the initiating packet sender. These are the source, destination, source port and destination port. At this point, the [UNREPLIED] flag tells us that there's so far been no response to the packet. Finally, we get a brief list of the expectations for returning packets. Do note that the latter entries are in reverse order to the first values. The timeout at this point is set to 30 seconds, as per default.

udp 17 170 src=192.168.1.2 dst=192.168.1.5 sport=137
dport=1025 src=192.168.1.5 dst=192.168.1.2 sport=1025
dport=137 use=1


At this point the server has seen a reply to the first packet sent out and the connection is now considered as ESTABLISHED. This is not shown in the connection tracking, as you can see. The main difference is that the [UNREPLIED] flag has now gone. Moreover, the default timeout has changed to 180 seconds - but in this example that's by now been decremented to 170 seconds - in 10 seconds' time, it will be 160 seconds. There's one thing that's missing, though, and can change a bit, and that is the [ASSURED] flag described above. For the [ASSURED] flag to be set on a tracked connection, there must have been a small amount of traffic over that connection.

udp 17 175 src=192.168.1.5 dst=195.22.79.2 sport=1025
dport=53 src=195.22.79.2 dst=192.168.1.5 sport=53
dport=1025 [ASSURED] use=1


At this point, the connection has become assured. The connection looks exactly the same as the previous example, except for the [ASSURED] flag. If this connection is not used for 180 seconds, it times out. 180 Seconds is a comparatively low value, but should be sufficient for most use. This value is reset to its full value for each packet that matches the same entry and passes through the firewall, just the same as for all of the internal states.
4.6. ICMP connections

ICMP packets are far from a state-full stream, since they are only used for controlling and should never establish any connections. There are four ICMP types that will generate return packets however, and these have 2 different states. These ICMP messages can take the NEW and ESTABLISHED states. The ICMP types we are talking about are Echo request and reply, Timestamp request and reply, Information request and reply and finally Address mask request and reply. Out of these, the timestamp request and information request are obsolete and could most probably just be dropped. However, the Echo messages are used in several setups such as pinging hosts. Address mask requests are not used often, but could be useful at times and worth allowing. To get an idea of how this could look, have a look at the following image.

As you can see in the above picture, the host sends an echo request to the target, which is considered as NEW by the firewall. The target then responds with a echo reply which the firewall considers as state ESTABLISHED. When the first echo request has been seen, the following state entry goes into the ip_conntrack.

icmp 1 25 src=192.168.1.6 dst=192.168.1.10 type=8 code=0
id=33029 [UNREPLIED] src=192.168.1.10 dst=192.168.1.6
type=0 code=0 id=33029 use=1


This entry looks a little bit different from the standard states for TCP and UDP as you can see. The protocol is there, and the timeout, as well as source and destination addresses. The problem comes after that however. We now have 3 new fields called type, code and id. They are not special in any way, the type field contains the ICMP type and the code field contains the ICMP code. These are all available in ICMP types appendix. The final id field, contains the ICMP ID. Each ICMP packet gets an ID set to it when it is sent, and when the receiver gets the ICMP message, it sets the same ID within the new ICMP message so that the sender will recognize the reply and will be able to connect it with the correct ICMP request.

The next field, we once again recognize as the [UNREPLIED] flag, which we have seen before. Just as before, this flag tells us that we are currently looking at a connection tracking entry that has seen only traffic in one direction. Finally, we see the reply expectation for the reply ICMP packet, which is the inversion of the original source and destination IP addresses. As for the type and code, these are changed to the correct values for the return packet, so an echo request is changed to echo reply and so on. The ICMP ID is preserved from the request packet.

The reply packet is considered as being ESTABLISHED, as we have already explained. However, we can know for sure that after the ICMP reply, there will be absolutely no more legal traffic in the same connection. For this reason, the connection tracking entry is destroyed once the reply has traveled all the way through the Netfilter structure.

In each of the above cases, the request is considered as NEW, while the reply is considered as ESTABLISHED. Let's consider this more closely. When the firewall sees a request packet, it considers it as NEW. When the host sends a reply packet to the request it is considered ESTABLISHED.

Note

Note that this means that the reply packet must match the criterion given by the connection tracking entry to be considered as established, just as with all other traffic types.

ICMP requests has a default timeout of 30 seconds, which you can change in the /proc/sys/net/ipv4/netfilter/ip_ct_icmp_timeout entry. This should in general be a good timeout value, since it will be able to catch most packets in transit.

Another hugely important part of ICMP is the fact that it is used to tell the hosts what happened to specific UDP and TCP connections or connection attempts. For this simple reason, ICMP replies will very often be recognized as RELATED to original connections or connection attempts. A simple example would be the ICMP Host unreachable or ICMP Network unreachable. These should always be spawned back to our host if it attempts an unsuccessful connection to some other host, but the network or host in question could be down, and hence the last router trying to reach the site in question will reply with an ICMP message telling us about it. In this case, the ICMP reply is considered as a RELATED packet. The following picture should explain how it would look.

In the above example, we send out a SYN packet to a specific address. This is considered as a NEW connection by the firewall. However, the network the packet is trying to reach is unreachable, so a router returns a network unreachable ICMP error to us. The connection tracking code can recognize this packet as RELATED. thanks to the already added tracking entry, so the ICMP reply is correctly sent to the client which will then hopefully abort. Meanwhile, the firewall has destroyed the connection tracking entry since it knows this was an error message.

The same behavior as above is experienced with UDP connections if they run into any problem like the above. All ICMP messages sent in reply to UDP connections are considered as RELATED. Consider the following image.

This time an UDP packet is sent to the host. This UDP connection is considered as NEW. However, the network is administratively prohibited by some firewall or router on the way over. Hence, our firewall receives a ICMP Network Prohibited in return. The firewall knows that this ICMP error message is related to the already opened UDP connection and sends it as an RELATED packet to the client. At this point, the firewall destroys the connection tracking entry, and the client receives the ICMP message and should hopefully abort.
4.7. Default connections

In certain cases, the conntrack machine does not know how to handle a specific protocol. This happens if it does not know about that protocol in particular, or doesn't know how it works. In these cases, it goes back to a default behavior. The default behavior is used on, for example, NETBLT, MUX and EGP. This behavior looks pretty much the same as the UDP connection tracking. The first packet is considered NEW, and reply traffic and so forth is considered ESTABLISHED.

When the default behavior is used, all of these packets will attain the same default timeout value. This can be set via the /proc/sys/net/ipv4/netfilter/ip_ct_generic_timeout variable. The default value here is 600 seconds, or 10 minutes. Depending on what traffic you are trying to send over a link that uses the default connection tracking behavior, this might need changing. Especially if you are bouncing traffic through satellites and such, which can take a long time.
4.8. Complex protocols and connection tracking

Certain protocols are more complex than others. What this means when it comes to connection tracking, is that such protocols may be harder to track correctly. Good examples of these are the ICQ, IRC and FTP protocols. Each and every one of these protocols carries information within the actual data payload of the packets, and hence requires special connection tracking helpers to enable it to function correctly.

Let's take the FTP protocol as the first example. The FTP protocol first opens up a single connection that is called the FTP control session. When we issue commands through this session, other ports are opened to carry the rest of the data related to that specific command. These connections can be done in two ways, either actively or passively. When a connection is done actively, the FTP client sends the server a port and IP address to connect to. After this, the FTP client opens up the port and the server connects to that specified port from its own port 20 (known as FTP-Data) and sends the data over it.

The problem here is that the firewall will not know about these extra connections, since they where negotiated within the actual payload of the protocol data. Because of this, the firewall will be unable to know that it should let the server connect to the client over these specific ports.

The solution to this problem is to add a special helper to the connection tracking module which will scan through the data in the control connection for specific syntaxes and information. When it runs into the correct information, it will add that specific information as RELATED and the server will be able to track the connection, thanks to that RELATED entry. Consider the following picture to understand the states when the FTP server has made the connection back to the client.

Passive FTP works the opposite way. The FTP client tells the server that it wants some specific data, upon which the server replies with an IP address to connect to and at what port. The client will, upon receipt of this data, connect to that specific port, from its own port 20(the FTP-data port), and get the data in question. If you have an FTP server behind your firewall, you will in other words require this module in addition to your standard iptables modules to let clients on the Internet connect to the FTP server properly. The same goes if you are extremely restrictive to your users, and only want to let them reach HTTP and FTP servers on the Internet and block all other ports. Consider the following image and its bearing on Passive FTP.

Some conntrack helpers are already available within the kernel itself. More specifically, the FTP and IRC protocols have conntrack helpers as of writing this. If you can not find the conntrack helpers that you need within the kernel itself, you should have a look at the patch-o-matic tree within user-land iptables. The patch-o-matic tree may contain more conntrack helpers, such as for the ntalk or H.323 protocols. If they are not available in the patch-o-matic tree, you have a number of options. Either you can look at the CVS source of iptables, if it has recently gone into that tree, or you can contact the Netfilter-devel mailing list and ask if it is available. If it is not, and there are no plans for adding it, you are left to your own devices and would most probably want to read the Rusty Russell's Unreliable Netfilter Hacking HOW-TO which is linked from the Other resources and links appendix.

Conntrack helpers may either be statically compiled into the kernel, or as modules. If they are compiled as modules, you can load them with the following command

modprobe ip_conntrack_*


Do note that connection tracking has nothing to do with NAT, and hence you may require more modules if you are NAT'ing connections as well. For example, if you were to want to NAT and track FTP connections, you would need the NAT module as well. All NAT helpers starts with ip_nat_ and follow that naming convention; so for example the FTP NAT helper would be named ip_nat_ftp and the IRC module would be named ip_nat_irc. The conntrack helpers follow the same naming convention, and hence the IRC conntrack helper would be named ip_conntrack_irc, while the FTP conntrack helper would be named ip_conntrack_ftp.
Chapter 5. Saving and restoring large rule-sets

The iptables package comes with two more tools that are very useful, specially if you are dealing with larger rule-sets. These two tools are called iptables-save and iptables-restore and are used to save and restore rule-sets to a specific file-format that looks a quite a bit different from the standard shell code that you will see in the rest of this tutorial.
5.1. Speed considerations

One of the largest reasons for using the iptables-save and iptables-restore commands is that they will speed up the loading and saving of larger rule-sets considerably. The main problem with running a shell script that contains iptables rules is that each invocation of iptables within the script will first extract the whole rule-set from the Netfilter kernel space, and after this, it will insert or append rules, or do whatever change to the rule-set that is needed by this specific command. Finally, it will insert the new rule-set from its own memory into kernel space. Using a shell script, this is done for each and every rule that we want to insert, and for each time we do this, it takes more time to extract and insert the rule-set.

To solve this problem, there is the iptables-save and restore commands. The iptables-save command is used to save the rule-set into a specially formatted text-file, and the iptables-restore command is used to load this text-file into kernel again. The best parts of these commands is that they will load and save the rule-set in one single request. iptables-save will grab the whole rule-set from kernel and save it to a file in one single movement. iptables-restore will upload that specific rule-set to kernel in a single movement for each table. In other words, instead of dropping the rule-set out of kernel some 30.000 times, for really large rule-sets, and then upload it to kernel again that many times, we can now save the whole thing into a file in one movement and then upload the whole thing in as little as three movements depending on how many tables you use.

As you can understand, these tools are definitely something for you if you are working on a huge set of rules that needs to be inserted. However, they do have drawbacks that we will discuss more in the next section.
5.2. Drawbacks with restore

As you may have already wondered, can iptables-restore handle any kind of scripting? So far, no, it can not and it will most probably never be able to. This is the main flaw in using iptables-restore since you will not be able to do a huge set of things with these files. For example, what if you have a connection that has a dynamically assigned IP address and you want to grab this dynamic IP every-time the computer boots up and then use that value within your scripts? With iptables-restore, this is more or less impossible.

One possibility to get around this is to make a small script which grabs the values you would like to use in the script, then sed the iptables-restore file for specific keywords and replace them with the values collected via the small script. At this point, you could save it to a temporary file, and then use iptables-restore to load the new values. This causes a lot of problems however, and you will be unable to use iptables-save properly since it would probably erase your manually added keywords in the restore script. It is in other words a clumsy solution.

Another solution is to load the iptables-restore scripts first, and then load a specific shell script that inserts more dynamic rules in their proper places. Of course, as you can understand, this is just as clumsy as the first solution. iptables-restore is simply not very well suited for configurations where IP addresses are dynamically assigned to your firewall or where you want different behaviors depending on configuration options and so on.

Another drawback with iptables-restore and iptables-save is that it is not fully functional as of writing this. The problem is simply that not a lot of people use it as of today and hence there is not a lot of people finding bugs, and in turn some matches and targets will simply be inserted badly, which may lead to some strange behaviors that you did not expect. Even though these problems exist, I would highly recommend using these tools which should work extremely well for most rule-sets as long as they do not contain some of the new targets or matches that it does not know how to handle properly.
5.3. iptables-save

The iptables-save command is, as we have already explained, a tool to save the current rule-set into a file that iptables-restore can use. This command is quite simple really, and takes only two arguments. Take a look at the following example to understand the syntax of the command.

iptables-save [-c] [-t table]

The -c argument tells iptables-save to keep the values specified in the byte and packet counters. This could for example be useful if we would like to reboot our main firewall, but not loose byte and packet counters which we may use for statistical purposes. Issuing a iptables-save command with the -c argument would then make it possible for us to reboot but without breaking our statistical and accounting routines. The default value is, of course, to not keep the counters intact when issuing this command.

The -t argument tells the iptables-save command which tables to save. Without this argument the command will automatically save all tables available into the file. The following is an example on what output you can expect from the iptables-save command if you do not have any rule-set loaded.

# Generated by iptables-save v1.2.6a on Wed Apr 24 10:19:17 2002 *filter :INPUT ACCEPT [404:19766] :FORWARD ACCEPT [0:0] :OUTPUT ACCEPT [530:43376] COMMIT # Completed on Wed Apr 24 10:19:17 2002 # Generated by iptables-save v1.2.6a on Wed Apr 24 10:19:17 2002 *mangle :PREROUTING ACCEPT [451:22060] :INPUT ACCEPT [451:22060] :FORWARD ACCEPT [0:0] :OUTPUT ACCEPT [594:47151] :POSTROUTING ACCEPT [594:47151] COMMIT # Completed on Wed Apr 24 10:19:17 2002 # Generated by iptables-save v1.2.6a on Wed Apr 24 10:19:17 2002 *nat :PREROUTING ACCEPT [0:0] :POSTROUTING ACCEPT [3:450] :OUTPUT ACCEPT [3:450] COMMIT # Completed on Wed Apr 24 10:19:17 2002

This contains a few comments starting with a # sign. Each table is marked like *<table-name>, for example *mangle. Then within each table we have the chain specifications and rules. A chain specification looks like :<chain-name> <chain-policy> [<packet-counter>:<byte-counter>]. The chain-name may be for example PREROUTING, the policy is described previously and can for example be ACCEPT. Finally the packet-counter and byte-counters are the same counters as in the output from iptables -L -v. Finally, each table declaration ends in a COMMIT keyword. The COMMIT keyword tells us that at this point we should commit all rules currently in the pipeline to kernel.

The above example is pretty basic, and hence I believe it is nothing more than proper to show a brief example which contains a very small rule-set. If we would run iptables-save on this, it would look something like this in the output:

# Generated by iptables-save v1.2.6a on Wed Apr 24 10:19:55 2002 *filter :INPUT DROP [1:229] :FORWARD DROP [0:0] :OUTPUT DROP [0:0] -A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT -A FORWARD -i eth0 -m state --state RELATED,ESTABLISHED -j ACCEPT -A FORWARD -i eth1 -m state --state NEW,RELATED,ESTABLISHED -j ACCEPT -A OUTPUT -m state --state NEW,RELATED,ESTABLISHED -j ACCEPT COMMIT # Completed on Wed Apr 24 1