Squid Proxy (Secure, Paranoid and Non-caching)





Squid is a caching proxy for the Web supporting HTTP, HTTPS and FTP. It can be used to protect internal LANs from questionable servers, to keep an account of where clients go, and to control which servers clients are allowed to reach.

Squid allows you to enforce policies on your users. If your policy states that no one can access CNN except at lunch time, between 12 noon and 2pm, then you have that control. If you need to block MySpace or YouTube, or if you only allow the latest version of Firefox outside your network, you have that ability. Squid also allows you to limit the headers a client can send and receive. If you want to block clients from logging into external sites that use HTTP authentication, but still allow them to look at those sites, then filtering the "Authorization" header will do it.

If you are a parent and need to filter web access at home then Squid is the perfect tool. It can run on a separate machine inaccessible to children, thus securing it from tampering. You can set up search parameters that stop pages from loading if certain words are found on the remote page. Pages can be blocked by URL or IP address and you can even set the times your children can access the web. Squid gives you the ability to enforce the rules you set down for your home network. As an added bonus Squid will keep logs of every URL, search query and server your network accesses for future review.

The best part is Squid is Open Source and completely free.





Introduction to the squid.conf

This squid proxy configuration is set up to be a non-caching, secure proxy for HTTP and HTTPS only. The machine has a low latency, high speed and un-metered Internet connection. Since our example network has unlimited and fast bandwidth, we are _not_ going to use caching. This config only allows access from the internal LAN (10.10.10.0/28), applies short timeouts to connections and includes the calomel.org "anti-ad server" modification (commented out by default). To protect our internal browsers squid will deny all headers except those specifically listed and replace the User-Agent header, and optionally the Accept headers, anonymizing our browsers.

Below you will find the link to the squid.conf example file and below that is the same squid.conf file in a text box. Both formats are available to make it easier for you to review the code. This squid.conf is a fully working config file with the exception of setting up a few variables for your environment.

You can download the Squid squid.conf here by doing a "save as" or just clicking on the link and choosing download. Before using the config file take a look at it below, or download it and look at the options. Calomel.org Squid squid.conf

#
### Calomel.org Squid  squid.conf
#
########### squid.conf ###########
#
## interface, port and proxy type
#http_port 10.10.10.1:8080 transparent
http_port 10.10.10.1:8080

## general options
cache_mgr not_to_be_disturbed
client_db on
collapsed_forwarding on
detect_broken_pconn on
dns_defnames on
dns_retransmit_interval 2 seconds
dns_timeout 5 minutes
forwarded_for off
half_closed_clients off
httpd_suppress_version_string on
ignore_unknown_nameservers on
pipeline_prefetch on
retry_on_error on
strip_query_terms off
uri_whitespace strip
visible_hostname localhost

## timeouts
forward_timeout 30 seconds
connect_timeout 30 seconds
read_timeout 30 seconds
request_timeout 30 seconds
persistent_request_timeout 1 minute
client_lifetime 20 hours

## host definitions
acl all src 0.0.0.0/0
acl localhost src 127.0.0.1/255.255.255.255
acl to_localhost dst 127.0.0.0/8

## proxy server client access
acl mynetworks src 127.0.0.0/8 10.10.10.0/28
http_access deny !mynetworks

## max connections per ip
acl maxuserconn src 127.0.0.0/8 10.10.10.0/28
acl limitusercon maxconn 500
http_access deny maxuserconn limitusercon

## disable caching
cache deny all
cache_dir null /tmp

## disable multicast icp
icp_port 0
icp_access deny all

## disable ident lookups
ident_lookup_access deny all

## no-trust for on-the-fly Content-Encoding
acl apache rep_header Server ^Apache
broken_vary_encoding allow apache

## logs
logformat combined [%tl] %>A %{Host}>h "%rm %ru HTTP/%rv" %Hs %<st "%{Referer}>h" "%{User-Agent}>h" %Ss:%Sh
access_log /var/log/squid/access.log combined
cache_store_log /var/log/squid/store.log
cache_log  /var/log/squid/cache.log
logfile_rotate 8

## support files
coredump_dir /tmp
pid_filename /var/log/squid/squid.pid

## ports allowed
acl Safe_ports port 80 443
http_access deny !Safe_ports

## ssl ports/method allowed
acl SSL_ports port 443
acl CONNECT method CONNECT
http_access deny CONNECT !SSL_ports

## protocols allowed
acl Safe_proto proto HTTP SSL
http_access deny !Safe_proto

## browsers allowed
# acl Safe_browser browser ^Mozilla/5\.0.*Firefox/2\.0\.0\.6
# http_access deny !Safe_browser

## disable ads ( //squid_adservers.html )
# acl ads dstdom_regex "/etc/squid/ad_block.txt"
# http_access deny ads
# deny_info TCP_RESET ads

## Banned Sites
# acl Bad_Site dstdom_regex myspace.com youtube.com facebook.com 
# http_access deny Bad_Site

## redirector
# acl my_url dstdomain SITE_NAME.COM
# redirector_access allow my_url
# redirect_children 1
# redirect_rewrites_host_header off
# redirect_program /etc/squid/squid_redirector.pl

## methods allowed
acl Safe_method method CONNECT GET HEAD POST
http_access deny !Safe_method

## allow replies to client requests
http_reply_access allow all

## header re-write
# header_replace Accept */*
# header_replace Accept-Encoding gzip
# header_replace Accept-Language en
header_replace User-Agent OurBrowser/1.0 (Some Name)

## header list ( DENY all -> ALLOW listed )
header_access Accept allow all
header_access Accept-Encoding allow all
header_access Accept-Language allow all
header_access Authorization allow all
header_access Cache-Control allow all
header_access Content-Disposition allow all
header_access Content-Encoding allow all
header_access Content-Length allow all
header_access Content-Location allow all
header_access Content-Range allow all
header_access Content-Type allow all
header_access Cookie allow all
header_access Expires allow all
header_access Host allow all
header_access If-Modified-Since allow all
header_access Location allow all
header_access Range allow all
header_access Referer allow all
header_access Set-Cookie allow all
header_access WWW-Authenticate allow all
header_access All deny all

##########  END  ###########






Getting Started

The following instructions will get squid installed and working with the squid.conf config file listed above. Since entire books are written about squid we cannot go into the definitions of all of the directives in the config file. Once you get squid working, check the bottom of the page for links to the squid directives definitions page.



Doing the install

Step 1: To get started you need to install Squid. Most operating systems have packages (rpm, deb, pkg) for Squid and you can also build it from source (squid-cache.org).

Step 2: Once squid is installed, download the squid.conf config file from above and place it in your squid config directory. This is usually found in /etc/squid/ on most OS distributions.

Step 3: Now we need to edit the squid.conf file and make changes reflecting your environment.

"interface, port and proxy type" : We need to set the IP and port the squid daemon is going to listen on. In our example we listen on 10.10.10.1 port 8080, as that is the proxy's address on the internal network our client machines are on.

"Access Control List" : Next edit the area called "proxy server client access" and look for the directive "acl mynetworks src". This is the access control list (acl) of networks or individual IPs that can access squid. You need to put in the network IPs of your LAN. For example, most internal networks are set up with the IPs 192.168.0.0 to 192.168.0.254. In that case you would make sure the line reads "acl mynetworks src 127.0.0.0/8 192.168.0.0/24".

"The logs" : The log files are going to be placed in /var/log/squid/ and we need to make that directory and make it owned by the squid user. Use "mkdir -p /var/log/squid/" and "chown _squid:_squid /var/log/squid/" for OpenBSD Squid from packages.

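Going back to the "acl mynetworks src" line for a moment: if you are unsure which client addresses a network such as 10.10.10.0/28 actually covers, a quick check from the shell can help before you edit the acl. This is only a sketch. The helper names ip_to_int and in_cidr are our own and are not part of squid, and the test addresses are made up.

```shell
#!/bin/sh
# Hypothetical helpers, not part of squid.

# ip_to_int A.B.C.D -> prints the address as one integer
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the dotted quad into $1 $2 $3 $4
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# in_cidr ADDRESS NETWORK PREFIX -> prints "yes" if ADDRESS is inside NETWORK/PREFIX
in_cidr() {
    host_bits=$(( 32 - $3 ))
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "$2")
    # Two addresses are in the same network when their network bits match.
    if [ $(( addr >> host_bits )) -eq $(( net >> host_bits )) ]; then
        echo yes
    else
        echo no
    fi
}

in_cidr 10.10.10.14 10.10.10.0 28     # inside the example LAN
in_cidr 10.10.10.16 10.10.10.0 28     # one past the last address of the /28
in_cidr 192.168.0.200 192.168.0.0 24  # inside a typical home LAN
```

A /28 covers only 16 addresses (10.10.10.0 to 10.10.10.15), so a client at 10.10.10.16 would be refused by the "http_access deny !mynetworks" rule.
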
Optional: Redirector : A redirector is a program squid will call to do a job. You can use a redirector for many purposes like blocking and redirecting URLs. In this example we are going to have squid pass URLs to the following Perl script to re-write the URL "SITE_NAME.COM" to "localhost:8080". If you run squid on the same machine as a webserver, then you may want to use this method.

The client browser will use the URL SITE_NAME.COM and the requests will actually go to the webserver running on localhost port 8080. Notice that we have added an ACL so that only URLs with the destination domain of SITE_NAME.COM use the redirector, which reduces congestion and keeps squid fast. Squid will also not touch the "Host" header. This means that clients will still see the name "SITE_NAME.COM" in the URL field even though they are getting the data from "localhost port 8080".

You are welcome to cut/paste the following Perl code. This script is called squid_redirector.pl and is placed in /etc/squid/ according to our example.

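#!/usr/bin/perl -p
# -p: read each request line squid sends on stdin and print it back after the substitution
BEGIN { $|=1 }   # unbuffered output, so squid gets each answer right away
s|http://SITE_NAME\.COM|http://localhost:8080|;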
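
Before pointing squid at the redirector it can be worth dry-running the substitution by hand. The sketch below uses sed with the same s||| expression as the Perl script, so it runs without squid or Perl. The function name rewrite and the sample request line are made up for illustration, and the exact fields squid passes after the URL can differ between versions.

```shell
#!/bin/sh
# Stand-in for squid_redirector.pl: the same substitution, done with sed.
rewrite() {
    sed 's|http://SITE_NAME.COM|http://localhost:8080|'
}

# Made-up request line; the URL is the first field squid hands to a redirector.
echo 'http://SITE_NAME.COM/index.html 10.10.10.5/- - GET' | rewrite

# A URL for any other site should pass through unchanged.
echo 'http://www.example.com/ 10.10.10.5/- - GET' | rewrite
```

The real script can be checked the same way by piping a sample line into /etc/squid/squid_redirector.pl.
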

The Headers : Finally, at the bottom of the config file are the headers squid will allow through to the Internet from the clients. At the end of the instructions you can find an explanation of each of the headers used and why you would want to use them.



Configure the clients

The last task is to tell machines on the inside of the LAN that squid is available and ready for use. In our example above our proxy server could be found at "10.10.10.1:8080", so we are going to enter this into our browser's proxy config page. On most browsers there is a menu for setting up access to a proxy server. The problem is that every browser is different and we cannot cover the setup procedures of every browser. The easiest way to find instructions for your browser is to search on Google for the words "proxy server" and the name of your browser. For example, if we were using Firefox we would search for the string "proxy server firefox".







What each "header" means

The following sections are definitions of the headers used in the example file, and at the bottom of the page are questions and answers.

Squid header definitions for squid.conf (calomel.org)

The Accept header is sent by the client to the server to explain what media types or page types the browser is willing to accept. This header is simply the browser "preferring" a set of media or text types in the specified format. The server can honor this request and send the data in the format listed or ignore it completely and send whatever the server has. For privacy concerns we can replace the true header of the client with "*/*", saying that we accept all data types. Keep in mind that header_replace only acts on headers that header_access denies, so the "header_access Accept allow all" line has to be removed or commented out for the replacement to take effect. This option works with all clients.

Example:     header_replace Accept text/plain; q=0.5, text/html, text/x-dvi; q=0.8, text/x-c
Calomel.org: header_replace Accept */*

The Accept-Encoding header is sent by the client to the server to explain what compression encoding the client will accept. Compression makes the data being transferred smaller at the expense of CPU time on the server and the client for compressing and uncompressing the data. The server can honor this request and send the data in the format listed or ignore it completely and send the data uncompressed. For privacy concerns we can replace the true header of the client with a request for "gzip" only. This option works with all clients.

Example:     header_replace Accept-Encoding compress, gzip
Calomel.org: header_replace Accept-Encoding gzip
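
To get a feel for the size trade-off described above, here is a rough sketch that compares a body before and after gzip. The sample body is made up and far more repetitive than a real page, so real savings will be smaller, and we assume the gzip command is installed.

```shell
#!/bin/sh
# Made-up response body: the same line of text repeated 200 times.
body() {
    i=0
    while [ "$i" -lt 200 ]; do
        echo 'Squid is a caching proxy for the Web supporting HTTP, HTTPS and FTP.'
        i=$(( i + 1 ))
    done
}

# $(( ... )) strips the padding some wc implementations print.
plain_bytes=$(( $(body | wc -c) ))
gzip_bytes=$(( $(body | gzip -c | wc -c) ))

echo "uncompressed: $plain_bytes bytes"
echo "gzip:         $gzip_bytes bytes"
```
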

The Accept-Language header is sent by the client to the server to explain what language we would like the page to be in. The server can honor this request and send the data in the language listed or ignore it completely and send whatever the server wants to. For privacy concerns we can replace the true header of the client with our default language of "en". This option works with all clients.

Example:     header_replace Accept-Language da, en-gb;q=0.8, en;q=0.7
Calomel.org: header_replace Accept-Language en

The User-Agent header is sent by the client to tell the server the browser name, browser version, build type, compiler version and other information about the client. For some sites (www.digg.com) this header must be sent in the proper format as seen in the calomel.org example, but it does not have to contain valid or true information. For privacy concerns we can replace the true header of the client with whatever we want as long as it is at least in the form of the calomel.org example. If you want to randomize the User-Agent string then check the home page for a Squid User-Agent randomizer script.

Example:     header_replace User-Agent Mozilla/5.0 (X11; U;) Gecko/20080221 Firefox/2.0.0.9
Calomel.org: header_replace User-Agent OurBrowser/1.0 (Some Name)

The Authorization header is sent by the client to the server with the user name and password for access. It carries the credentials typed into the pop-up user name/password box that WWW-Authenticate triggers. This header is _NOT_ used for the user name and password of sites with scripted login forms like Netflix, digg, and financial institutions. This header _IS_ used to send credentials given in the URL to the server. You will need the Authorization header if you have hosts connecting to sites with ddclient for dyndns updates, or a machine with MythTV that receives TV programming updates from Zap2It Labs. If you see ddclient errors in your logs such as "authorization failed (HTTP/1.0 401 Unauthorized" or "X-UpdateCode: A", the Authorization header is not being allowed through squid.

Calomel.org:   header_access Authorization allow all
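
To see what the Authorization header carries, here is a small sketch of the HTTP "Basic" scheme, in which the user name and password are joined with a colon and base64 encoded. The credentials "user" and "password" are placeholders, the function name is our own, and we assume a base64 command is installed (it is not part of every base system).

```shell
#!/bin/sh
# basic_auth_header USER PASSWORD -> prints the header a client would send
basic_auth_header() {
    printf 'Authorization: Basic %s\n' "$(printf '%s:%s' "$1" "$2" | base64)"
}

basic_auth_header user password
# prints: Authorization: Basic dXNlcjpwYXNzd29yZA==
```

Base64 is an encoding and not encryption, so anyone who can read the header can recover the password.
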



Want to BLOCK ad servers with squid? Make sure to check out the Squid Anti-Ad Server Guide. With a little time and understanding you could easily block 90% of the ads that show up in your browser. You can also set up a proxy auto configuration (PAC) file in the browser using our Proxy Auto Config for Firefox (PAC) "how to".



The Content-Disposition header is an extension to the MIME protocol instructing a MIME user agent on how it should handle a downloaded file. When the browser receives the header, it raises a "file download" dialog box with the file name specified by the server. You only need this header if you use web pages that dynamically name the download file through a scripted process. For example, if the web page dynamically generates a list and specifies the filename as "calomel_file.txt", but you see the file being saved incorrectly as "file_script.pl", then blocking this header might be the problem.

Calomel.org:   header_access Content-Disposition allow all

The Content-Encoding header is sent by the server back to the client to explain what compression method or factor the server is sending the data in. Since we specified the header Accept-Encoding as "gzip" the server should be sending the client the same.

Calomel.org:   header_access Content-Encoding allow all

The Content-Length header is sent by the server back to the client to detail how much data the client should expect to receive. If the server says 1MB of data is being sent and only 0.9MB data arrived the client knows to wait longer or re-request the data.

Calomel.org:   header_access Content-Length allow all

The Content-Location header is sent by the server back to the client specifying the exact URI or relative URL of the client's request. This header is also needed so that the server can notify the client whether the page they have asked for has changed or not. Content-Location works in conjunction with the If-Modified-Since GET request to return the page or return a "304 Not Modified" message.

Calomel.org:   header_access Content-Location allow all

The Content-Range header field allows the remote server to tell a client which part of the file is being sent during a resumed download. A resumed download is answered with the status "206 Partial Content", meaning the server has fulfilled a partial GET request for the client; if this header is blocked the resumed download will fail. The client request MUST include a Range header field indicating the desired range, and MAY have included an If-Range header field to make the request conditional.

Calomel.org:   header_access Content-Range allow all

The Content-Type header field indicates the media type of the entity-body sent to the recipient or, in the case of the HEAD method, the media type that would have been sent had the request been a GET. If the server is sending a text/html page to the client then "text/html" will be sent through this header.

Calomel.org:   header_access Content-Type allow all

The Cookie header is sent by the client to return cookies that a server previously asked it to store. Allowing this header alone is _NOT_ enough for cookies to work. It is used in conjunction with the Set-Cookie header, which is how the server hands the cookie to the client in the first place. A site that requires both the Cookie and Set-Cookie headers is Netflix.com, which will not even let you log in without cookies enabled on the client. Other sites like digg.com and amazon.com will not recognize your client if you try to log into them without this header.

Calomel.org:   header_access Cookie allow all

The Host header is sent from the client to the server specifying the host the client wants to connect to. Some sites use many virtual hosts on one server on a single ip address. If the client does not send the Host header the server does not know which virtual host the client wants to connect to. This is required for most sites.

Calomel.org:   header_access Host allow all



Want to randomize your User-Agent through Squid? Check out our Squid User-Agent Randomizer Script. Hide the true name of your internal browsers and send any string you want to remote web servers.



The If-Modified-Since header allows the client to ask the server if the page being requested is newer than the last time our client downloaded it. If it is, then our client will download it like a standard GET request. If the page is not new then the server will reply with a code "304 Not Modified" and our client will know the copy it has cached is the newest available. This saves the server and the client bandwidth and will make previously cached pages load significantly faster.

Calomel.org:   header_access If-Modified-Since allow all

The Location header is used to redirect the recipient to a location other than the Request-URI for completion of the request or identification of a new resource. This header is sometimes used in conjunction with the Authorization header. For example, the client may log into one server to be authorized and then is redirected with the Location header to another server to access the site or receive the data.

Calomel.org:   header_access Location allow all

The Range header specifies HTTP retrieval requests using conditional or unconditional GET methods and MAY request one or more sub-ranges of the entity, instead of the entire entity. For example, a client may request the first 10KB of a 20MB file, which includes descriptive information about an rpm package, rather than download the entire file. If you are using the "yum" package manager and you see an error similar to "Header is not complete. Trying other mirror." then you need to allow the Range header in squid.

Calomel.org:   header_access Range allow all
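
As an illustration of the "first 10KB" request mentioned above, this sketch prints the Range header a client would send. Byte positions start at zero and the end position is inclusive. The function name range_header is our own.

```shell
#!/bin/sh
# range_header OFFSET COUNT -> Range header asking for COUNT bytes starting at OFFSET
range_header() {
    printf 'Range: bytes=%d-%d\n' "$1" "$(( $1 + $2 - 1 ))"
}

range_header 0 10240        # first 10KB: prints "Range: bytes=0-10239"
range_header 10240 10240    # next 10KB:  prints "Range: bytes=10240-20479"
```
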

The Referer request-header field allows the client to specify, for the server's benefit, the address (URI) of the resource from which the Request-URI was obtained. The header allows a server to generate lists of back-links to resources for interest, logging, optimized caching, or security (like to stop image hijacking). It also allows obsolete or mistyped links to be traced for maintenance.

Calomel.org:   header_access Referer allow all

The Set-Cookie header works in conjunction with the Cookie header. It is sent by the server and asks the client to store a cookie, which the client then returns to that site in the Cookie header on later requests.

Calomel.org:   header_access Set-Cookie allow all

The WWW-Authenticate header is sent by the server to ask for credentials and is what makes the client pop up the window or box for a user name and password. It only allows the client to show the box for entering the credentials, nothing else. Once the user enters the user name/password in the box and hits "accept", the Authorization header actually sends the user name/password to the server. You will need the WWW-Authenticate header if you have hosts connecting to sites like Zap2it or SchedulesDirect.org with MythTV so it can receive updates for TV programming.

Calomel.org:   header_access WWW-Authenticate allow all

The All header is a variable squid uses to define "any" http header. This rule is to deny all headers and is used in conjunction with the above rules. In essence, if the header is not defined above then this rule will block it. Think of this methodology as paranoid mode.

Calomel.org:   header_access All deny all
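
To make the "deny all, allow listed" idea concrete, here is a rough simulation of what happens to a client's request headers. It is not how squid is implemented, only a sketch of the outcome. As we read the squid documentation, header_replace acts on headers that header_access denies, which is why User-Agent is left off the allow list in our config. The sample headers and function names are made up, and only three of the allowed headers are listed to keep it short.

```shell
#!/bin/sh
# Made-up request headers as a browser behind another proxy might send them.
request_headers() {
    printf '%s\n' \
        'Host: www.example.com' \
        'User-Agent: Mozilla/5.0 (X11; U;) Gecko/20080221 Firefox/2.0.0.9' \
        'Accept: text/html' \
        'Via: 1.1 upstream-proxy' \
        'X-Forwarded-For: 10.10.10.5'
}

# Pass allowed headers, swap in the replacement for denied headers that
# have one, and silently drop everything else.
filter_headers() {
    awk -F': ' '
        BEGIN {
            allow["Host"] = 1; allow["Accept"] = 1; allow["Cookie"] = 1
            replace["User-Agent"] = "OurBrowser/1.0 (Some Name)"
        }
        ($1 in allow)   { print; next }                     # header_access ... allow all
        ($1 in replace) { print $1 ": " replace[$1]; next } # header_replace
        { next }                                            # header_access All deny all
    '
}

request_headers | filter_headers
```

Via and X-Forwarded-For never reach the remote server, and the real User-Agent is swapped for the fixed string.
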



Want more speed? Make sure to also check out the Network Speed and Performance Guide. With a little time and understanding you could easily double your firewall's throughput.



Questions?

Where can I find a list of all of the squid directives?

A full listing of the squid configuration directives can be found here on squid-cache.org.

Where can I find a list of all of the header field definitions?

The listing of header definitions is at W3.org in the RFC protocols section.

How do I test if the headers are being changed by squid correctly?

You can test the browser headers at Xhaus's header test page.

How can I start and stop the squid daemon?

squid                -- To start squid
squid -k shutdown    -- To stop squid gracefully
squid -k kill        -- To stop squid immediately
squid -k reconfigure -- To re-read the squid.conf file without restarting squid

What is the best way to rotate the squid log files?

Make sure you have the squid.conf directive "logfile_rotate 8" set. The number "8" means we want to keep 8 copies of the logs. Then set up a cron job like the following to actually rotate the logs. Here we will keep 8 weekly log files and rotate them on Sunday at midnight.
#minute (0-59)
#|   hour (0-23)
#|   |    day of the month (1-31)
#|   |    |   month of the year (1-12 or Jan-Dec)
#|   |    |   |   day of the week (0-6 with 0=Sun or Sun-Sat)
#|   |    |   |   |   commands
#|   |    |   |   |   |
### rotate logs weekly (Sunday at midnight)
00   0    *   *   0   squid -k rotate

Can I setup environmental variables so programs will know squid is available?

Many command line programs will look at the proxy environmental variables and use the proxy defined. For example, in the bash shell (.bashrc or .profile) you can define the following to tell the client to use "squid.proxy.lan" port "8080" for the squid proxy.
 #### .bashrc or .profile
 export http_proxy="http://squid.proxy.lan:8080"
 export https_proxy="http://squid.proxy.lan:8080"
Other programs also have the option of using a configuration file. Wget's config file is /etc/wgetrc. This is an example of the proxy setting:
 #### /etc/wgetrc
 use_proxy = on
 http_proxy = http://squid.proxy.lan:8080/

How do I turn off Firefox's DNS caching?

If you dislike Firefox's DNS caching behavior and your nameservers cache just fine, then disable DNS caching in the browser. Firefox's DNS cache is just another level of expirations to go through. Here's the cross-platform method, should you wish to do so:
In the Firefox configuration URL "about:config" add two new integer entries:
  network.dnsCacheExpiration -> 0
  network.dnsCacheEntries    -> 0




Questions, comments, or suggestions? Contact Calomel.org