How to setup secure outbound web access

Background

This tutorial has come about as a result of a discussion on the forums. It details setting up a chain of security services on the core which should help secure and optimize web browsing from the internal network.

Please note that much of this is shamelessly plagiarised from others. What I have done is to try to bring their work together and provide a comprehensive end-to-end solution. If you are one of those whose work has been used, please take it as a compliment and feel free to add appropriate credits at the end!

How Browsing Works

Without wishing to state the obvious, a basic understanding of web browsing helps us to understand the setup described here. Browsing is a fairly simple process. The client (known as a browser) sends a request using a protocol called HTTP to a server. By default, this request is sent to port 80. The server responds with the requested file. This file may well contain HTML, which the browser will display, and which may result in the browser making further requests for graphics files and so on.
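
For the curious, a minimal HTTP/1.1 exchange looks roughly like this (www.example.com, /index.html and the length are just placeholders):

   GET /index.html HTTP/1.1
   Host: www.example.com

   HTTP/1.1 200 OK
   Content-Type: text/html
   Content-Length: 1234

   <html> ... the requested page ... </html>

The first block is the browser's request; the second is the server's response carrying the requested file.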

It is also possible to put a third entity in the middle of this chain. This entity is known as a proxy. In this case, the requests from the browser are sent to the proxy. The proxy sends the requests to the server, which responds to the proxy. Finally, the proxy responds to the browser. Proxies are used for many reasons, often security related. The system we will be setting up will consist of a chain of three proxies which will perform the following functions (a sketch of the resulting chain follows the list):

  • Caching. This allows the proxy to store a copy of the files requested. If a second request is received for the same file, it is already held locally and a second request does not need to be sent to the server. This reduces traffic on the external network and also improves performance overall. The caching proxy we will be using is known as Squid.
  • Virus Scanning. As the file will be passing through the proxy, that proxy can examine its contents. In this case, viruses can be scanned for and blocked. The virus-scanning proxy we will be using is called HavP and is used in conjunction with a regular scanner, in our case ClamAV.
  • Content Scanning. As well as being examined for viruses, the text of the HTML can be processed and scored. This allows inappropriate (for example sexual) content to be blocked. We will be using one of the best known content scanners, Dan's Guardian.
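
Putting this together, the request path through the finished chain looks like this (Squid is described below as the "end" of the chain, which implies this ordering):

   Browser -> Dan's Guardian (content) -> HavP + ClamAV (viruses) -> Squid (cache) -> Internet

Each proxy simply treats the next one in line as its "server", which is why the pieces can be configured largely independently.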

Transparent Proxying

There was much discussion on the forum thread concerning this. It is an additional feature which is entirely optional. In order for a proxy to be used, the browser has to "know" to send its requests to the proxy rather than to the actual server. This is achieved by configuring the browser with the proxy's details. There are, however, one or two problems with this.

  1. Each browser must be individually configured. This is not too much of a problem with a home network and not many browsers, but particularly with portable devices which may also be used elsewhere (for example at work), it can be inconvenient to have to keep turning the proxy on and off.
  2. It can be easy to bypass. Without additional firewall rules to prevent direct browsing of the internet, bypassing the proxy (and therefore gaining access to blocked content) is as simple as turning it off in the browser.

The solution to this is known as transparent proxying. This works by having the proxy running on the router (in our case the core) and configuring the firewall to intercept all outbound web traffic (i.e. anything destined for port 80) and redirect it to the proxy. The process is transparent to the end browser / user, hence the name.
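
Under the hood, this interception is just a NAT redirect. We will use Shorewall to manage it later on, but the raw iptables equivalent below illustrates the mechanism. Treat it as a sketch: eth1 (the internal interface) and 8080 (the port the first proxy listens on) are assumptions for illustration:

   # Redirect web traffic arriving on the internal interface to the local proxy
   iptables -t nat -A PREROUTING -i eth1 -p tcp --dport 80 -j REDIRECT --to-port 8080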

NOTE Transparent proxying should not be seen as an alternative to setting a proxy on static machines, only as an addition. There are still ways to circumvent the system (for example, some webservers don't operate on port 80), so transparent proxying is an additional layer of security, not a replacement!

The Software

We will be installing a series of packages. The full details of configuring each will not be discussed here. For an in-depth discussion, please visit the respective websites.
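
Since the LinuxMCE core is Kubuntu-based, everything can be installed with apt-get. As a sketch (exact package names can vary between releases, so check with apt-cache search if in doubt):

   # Install the components of the proxy chain
   sudo apt-get install squid havp clamav dansguardian shorewall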

Squid

Squid is a very powerful caching web proxy server. This means that it keeps a copy of all files it is asked to retrieve (unless the various HTTP headers mean that the file cannot be cached). When a file is requested, Squid first checks to see if it has a cached copy. If so, that is returned; if not, it is fetched from the webserver. It is common for certain websites to be visited frequently, and Squid will rapidly build up cached copies of things like logo graphics, which can (in some cases) drastically reduce the requests to the internet, with resulting speed advantages. If you also pay for data usage, Squid can save money as well! Full details can be found on the Squid website. In our configuration, Squid will be the proxy that actually makes the requests to the internet and is, therefore, the "end" of the chain.
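
As a starting point, the handful of squid.conf directives below cover our use of Squid. This is a sketch, not a complete configuration: the 2048 MB cache size is an arbitrary example, and 192.168.80.0/24 is assumed to be the internal LinuxMCE network (adjust to match yours):

   # /etc/squid/squid.conf (excerpt)
   http_port 3128                               # where HavP will forward requests
   cache_dir ufs /var/spool/squid 2048 16 256   # disk cache: 2048 MB, standard layout
   acl localnet src 192.168.80.0/24             # the internal network
   http_access allow localnet                   # local clients may use the proxy
   http_access allow localhost                  # HavP connects from the core itself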

HavP
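
HavP sits in the middle of our chain: it scans what passes through and hands clean requests on to Squid. Below is a sketch of the relevant havp.config settings, where 8090 (HavP's listening port) and 3128 (Squid's port) are assumptions used throughout this tutorial. Note that a stock havp.config ships with a REMOVETHISLINE entry that must be deleted before HavP will start; it is there to force you to read the file.

   # /etc/havp/havp.config (excerpt)
   PORT 8090                  # where Dan's Guardian will forward requests
   PARENTPROXY 127.0.0.1      # hand requests on to Squid...
   PARENTPORT 3128            # ...listening on this port
   ENABLECLAMLIB true         # scan using the ClamAV library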

ClamAV
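
ClamAV is not a proxy itself; it provides the scanning engine and virus signatures that HavP uses. The main ongoing task is keeping the signature database current, which is done with the freshclam tool (the clamav-freshclam package can also run it as a daemon):

   # Update the ClamAV virus signature database
   sudo freshclam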

Dan's Guardian
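
Dan's Guardian is the first proxy in the chain: it receives the browser traffic (including anything redirected by the transparent proxying rule), filters the content, and forwards what survives to HavP. The entries below are a sketch of the relevant dansguardian.conf settings, reusing the port assumptions from the HavP section:

   # /etc/dansguardian/dansguardian.conf (excerpt)
   filterport = 8080          # where browser (and redirected port 80) traffic arrives
   proxyip = 127.0.0.1        # forward filtered requests to HavP...
   proxyport = 8090           # ...listening on this port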

Shorewall

Installation

I am writing this as I perform the steps on my own core. So, while this line is here, it's a work in progress and as yet unfinished!!!!

Setting up Transparent Proxying
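
The heart of this step is a single Shorewall REDIRECT rule that diverts web traffic from the internal zone to Dan's Guardian. A sketch, assuming Shorewall's conventional zone name loc for the internal network and Dan's Guardian listening on port 8080:

   # /etc/shorewall/rules (excerpt)
   #ACTION     SOURCE   DEST   PROTO   DEST PORT(S)
   REDIRECT    loc      8080   tcp     www

After editing the rules file, run shorewall check to validate the configuration and shorewall restart to apply it.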

--Wierdbeard65 13:31, 8 September 2009 (CEST)