Scheduled DB backup with automysqlbackup

If you are running your own MySQL server, setting up a good backup routine is an inescapable responsibility. If you’re not running your own server, check with your hosting provider to verify that they are indeed backing your data up. If your hosting provider does not back up your data and does not at least provide you with a way to roll your own database backups, consider switching to someone that does.

A handy utility that we like to use to back up databases is AutoMySQLBackup, a mature and very effective SourceForge project that manages the backups as well as their compression and organization. It will retain daily, weekly, and monthly backups of some or all of your databases on your file system. Couple this with something to move the database backups offsite, or an effective server imaging schedule, and you’ve got a pretty good MySQL backup process.

Installation

To install the utility on Ubuntu, simply run the command-line sequence:

sudo apt-get install automysqlbackup

If you are running on an older version of Ubuntu or a different flavor of Linux, you can grab it from their download page at http://sourceforge.net/projects/automysqlbackup/files

The installation process will install a binary at /usr/sbin/automysqlbackup and a configuration file at /etc/default/automysqlbackup. Simply issuing

automysqlbackup

at the command-line will attempt to run the backup process.

Configuration

By default, AutoMySQLBackup will use root as the login name to access the databases it backs up. In order for it to authenticate correctly, you may need to add a line like this to the configuration file:

PASSWORD=[your root password]

If you do put your password in the configuration file, you may want to issue a

sudo chmod 600 /etc/default/automysqlbackup

to only allow access to this file by the root user on the server.

AutoMySQLBackup will back up all of your databases and their tables, compress the backups with gzip (bzip2 is also an option), and organize them into three folders – daily, weekly, and monthly – stored under /var/lib/automysqlbackup. You can change where these folders are created by editing the BACKUPDIR configuration variable. Each of these folders will contain a folder for each database on your system.

By default, the daily folder will contain backups from each of the last seven days. The weekly folder will grow to contain the database as it was on Sunday of each of the last fifty-two weeks. Similarly, monthly will contain the database as it stood at the end of each of the last twelve months.

If you only want to back up certain databases, you can specify them in the DBNAMES configuration variable. Conversely, if you want to back up everything except certain databases, you can use the DBEXCLUDE configuration variable to list what to exclude.
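
For example, a trimmed-down version of the configuration file using these variables might look something like this (the database names are purely illustrative):

# /etc/default/automysqlbackup – an illustrative excerpt
BACKUPDIR="/var/lib/automysqlbackup"
# back up only these databases...
DBNAMES="shop_db blog_db"
# ...or back up everything except these
# DBEXCLUDE="test"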

Scheduling

Once you have configured it to your liking, issue a

sudo crontab -e

and add a line like this to your crontab:

0 0 * * * /usr/sbin/automysqlbackup

This will schedule your database backups to occur every night at midnight. Now go get yourself the beer of victory.

Edge Side Includes with Varnish Cache

Overview

When caching content with Varnish, you can set different cache times for different types of content or URL patterns, which gives you some pretty good control over how quickly content is refreshed on your site (especially when combined with a good cache invalidation, or in Varnish nomenclature, purge, strategy).

With Varnish’s implementation of ESI, you can make this caching strategy even more granular by caching different pieces of the same page for differing amounts of time. ESI stands for Edge Side Include, and is a standardized way to include cached content within other cached content at the proxy level.

Just as server-side includes are injected into the page by the web server, ESI content is injected as the page passes through the reverse proxy service – in this case, the Varnish cache. Other proxy services, such as Mongrel, Squid, and Akamai, have also implemented versions of this standard.

An example of why you might want to implement this: you have a page that is largely static, but it needs some kind of dynamic feed embedded in it, or it needs to display user-specific content with very up-to-date, custom information. To accomplish this, you can add an ESI tag to your output markup, instructing Varnish to include the content of another page.

Basic Syntax

Here is what an Edge Side Include tag looks like:

<esi:include src="/foo.php"/>

In this example, we’re instructing Varnish to grab the content of foo.php and place it into the markup where this tag lives.

The strategy normally used when employing ESI tags is that the page that contains the tag is given a high TTL (time to live), meaning that Varnish will cache its content for a long time.  At the same time, the page that is being included (foo.php in this case) is given a short TTL.

The end result is a page that mostly springs from the Varnish cache, but a little piece of it has more frequently updated information.  This is faster, and therefore more scalable than generating the entire page from scratch, but still provides your users with current content.

A couple of words of warning from the Varnish Cache project on this:

  • Do not ESI-process all of your web content, and in particular not binary objects like images, since they could become garbled if random byte strings happen to match the <esi:… syntax.
  • Remember the trailing slash in your ESI include tag – it must be a self-closing XML element.

Configuration

The last piece of the puzzle is to actually enable ESI include processing (or else you’ll just see the ESI tag in your output). To do this, you’ll need to add some code to the vcl_fetch section of your vcl file.

In Varnish 3, it should look something like this:

if (req.url == "/test.php") {
    set beresp.do_esi = true;  /* process page for ESI */
    set beresp.ttl = 24h;      /* long TTL */
} else if (req.url == "/include.php") {
    set beresp.ttl = 1m;       /* short TTL */
}

If you’re using Varnish 2 still, it’ll look like this:

if (req.url ~ "test.php") {
    esi;  /* process page for ESI */
    set beresp.ttl = 24h;  /* long TTL */
} else if (req.url ~ "include.php") {
    set beresp.ttl = 1m;   /* short TTL */
}

Don’t forget to restart Varnish after you’ve made your changes!

Finally, it’s important to note that you can nest ESI tags – in the examples above, include.php could itself include other ESI-driven content. By default, the maximum depth to which you can nest ESIs inside each other is five, but you can change that if need be by setting the max_esi_depth parameter.
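
If you do need to raise it, the parameter can be changed on a running instance with varnishadm, or made persistent by passing the same setting to varnishd with its -p option at startup; the value of 10 below is just an example:

# raise the ESI nesting limit for the running Varnish instance
varnishadm param.set max_esi_depth 10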

Stopping email spoofing with SPF

SPF, or Sender Policy Framework, is a standard created to prevent, or at least reduce, email address spoofing.

Email Spoofing

At my workplace, we’ve managed email for our clients for years, initially self hosted, and later hosted at Rackspace’s excellent Email and Apps group.

In the early to mid 2000s, we were suddenly faced with a rapid upswing in complaints from people receiving emails from themselves or from other people at their company, when those people very clearly hadn’t done the sending. Usually, their first thought was that they had been compromised – a virus on their own system, or their mail credentials stolen. What ended up being the case was that their email was being spoofed.

Spoofing email happens to be ridiculously easy to do – most of the spam you receive is spoofed. If you run your own SMTP server, or have access to an open relay (a usually misconfigured mail server that lets third parties send email to other third parties), you can even try it yourself!

Here is an example command-line session showing how easy it is to spoof an email:

telnet localhost 25
 Trying localhost...
 Connected to localhost.
 Escape character is '^]'.
 220 localhost ESMTP Postfix
 helo localhost
 250 mailer.pelanne.com
 mail from: phil@pelanne.com
 250 2.1.0 Ok
 rcpt to: pelanne@gmail.com
 250 2.1.5 Ok
 data
 354 End data with <CR><LF>.<CR><LF>
 Yo yo...
 .
 250 2.0.0 Ok: queued as EA829A1BD0DB

…and just like that my email is out the door. Now, imagine that I change the from and to addresses to other people and all kinds of mischief can be afoot.

Enter Sender Policy Framework

To prevent this, the SPF standard was created. It combines DNS and mail technology to provide some control over who can send email on your behalf.

On the DNS side of things, a TXT / SPF record is added to the zone file. The assumption here is that only the owner of the domain, or their trusted technology partner, will be able to make such additions to their DNS records, so this is a good place to publish instructions about which mail servers are authorized to send your email.

On the mail service side, the idea is that for every email that comes in, the receiving mail server will check the from address to determine the domain, then do a DNS lookup on that domain, looking for that special record. If it exists, the record will dictate which mail servers are allowed to send email for that domain; if the email does indeed come from one of those servers, it will be allowed through. If it does not, it will be treated with suspicion. The amount of suspicion can also be suggested in the instructions stored in the DNS record.

SPF Specifics

We’ll now look at a quick example of what an SPF record looks like. Back in the day, BIND used TXT records to store information for SPF queries. Since BIND 9.4, a new record type called SPF is available. Currently, it’s identical to the TXT record except for the record type. The RFC recommendation is that both be included in a zone file. For brevity, we’ll simply look at the TXT version, but if you’re implementing in BIND, be sure to include both:

name  ttl  class  TXT  text
name  ttl  class  SPF  text

The text part is what contains the actual information the mail servers are interested in. Let’s look at a couple of examples:

v=spf1 ip4:216.70.64.0/23 ~all

You will always see v=spf1 at the start of an SPF record. Currently there are no other versions available, but in the future there may be.

The ip4 portion says “email from my domain is okay if it is sent by a mail server in the 216.70.64.0/23 range of IP addresses”.

The ~all portion says “for all email that fails this test, mark it as suspected spam”. The part about marking it as spam is all in that tilde. If that tilde were changed to a hyphen, it would be a “hard fail” instead of a “soft fail”, and would recommend that the server reject the email outright. It is usually safer to start with a soft fail and adjust upwards if need be.
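
For comparison, the hard-fail version of the record above would read:

v=spf1 ip4:216.70.64.0/23 -all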

Alternately, you can include someone else’s rules – let’s look at including Google’s:

v=spf1 include:_spf.google.com ~all

This basically says “enforce the rules found in Google’s SPF record at _spf.google.com with a soft fail.”
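
You can see exactly what a domain publishes by querying its TXT records directly – for instance, with dig:

dig +short _spf.google.com TXT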

If we look up Google’s SPF record, we find that it includes two additional SPF records:

include:_netblocks.google.com include:_netblocks6.google.com

Which translate to:

ip4:216.239.32.0/19 ip4:64.233.160.0/19 ip4:66.249.80.0/20 ip4:72.14.192.0/18 ip4:209.85.128.0/17 ip4:66.102.0.0/20 ip4:74.125.0.0/16 ip4:64.18.0.0/20 ip4:207.126.144.0/20 ip4:173.194.0.0/16
ip6:2607:f8b0:4000::/36 ip6:2a00:1450:4000::/36

…respectively.

Using includes is an excellent way to avoid keeping track of your email provider’s valid mail server IP addresses yourself, and is highly recommended.

This touches on the two most popular ways of specifying SPF records, but if you want more details about the mechanisms for enforcing them, hop on over to the record syntax page.

Not all mail service providers check SPF records, but these days most do. If you do not have an SPF record, it won’t affect you negatively other than receiving a whole bunch more spam from yourself – but who wants that? Go create your SPF records now!

Prepping Varnish Cache

When we need to scale the amount of traffic a server can handle, we frequently turn to Varnish Cache for the job. It is a reverse proxy cache, meaning that it usually sits on the same server as your web service, between it and the outside world.

Varnish will take care of caching web content (markup, stylesheets, images – anything you see in the network inspector of your web browser), and then determines whether it needs to bother your web server about subsequent requests or just pull their contents from the cache. As such, a well-tuned Varnish configuration will reduce the amount of work that your web and database services need to do.

It’s stinkin’ fast as well – your configuration is compiled directly into C, and the cache can be configured to live in a reserved block of RAM.

Getting Varnish set up and running is a breeze, and its default settings are very good for most situations. You will, however, want to tune it to your site specifically before you go live. This entails things like making sure it won’t cache administration areas, teaching it what to do when your web server is slow or crashes, and making it aware of application plugins that can work with it or of external headers introduced by things like CDNs – and this does take a little bit of doing.

There are a few useful configuration items that we end up adding to just about every Varnish installation, and they help us as we work at getting Varnish fully tuned.

Pass the visitor’s IP address through

Because Varnish answers on port 80 on the local machine and your web server answers on some other port, in a default setup the web server will see all requests as originating from 127.0.0.1. If you want to pass the IP address of the person talking to Varnish on to the web server, you will need to send it as an extra header; the X-Forwarded-For header is often used for this purpose. As such, in the vcl_recv method of our Varnish configuration, we usually add:

remove req.http.X-Forwarded-For;
set req.http.X-Forwarded-For = client.ip;

This will pass the visitor’s IP address through to Apache/nginx, and you can now get at it with any scripts or .htaccess files you’ve set up. In Apache, you can also add this to the log files by adding:

LogFormat "%{X-Forwarded-For}i %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" varnishcombined

to your apache.conf file, and

CustomLog /var/log/httpd/example.com/access.log varnishcombined

to your individual site configuration. With this in place, your visitor’s IP address is logged and greppable, or available for fail2ban to watch.

Know whether Varnish is caching something or not

Another thing that we frequently need to know is whether Varnish is doing its thing or not. Varnish’s website offers a handy configuration example for this – hit/miss headers. If something is served from Apache directly, Varnish will add a Miss header, and if it comes from Varnish’s cache, it will add a Hit header. To do this, simply paste the following into the vcl_deliver section of your .vcl file and restart Varnish:

if (obj.hits > 0) {
    set resp.http.X-Cache = "HIT";
} else {
    set resp.http.X-Cache = "MISS";
}

You will now see these headers on each item served from your server in the network inspector panel of your web browser.
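
You can also spot-check it from the command line. Assuming the snippet above is in place (and with example.com standing in for your own site), a request like this will print the header:

curl -s -D - -o /dev/null http://example.com/ | grep X-Cache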

Allowing for cache purges

Finally, you may need to clear the cache for a particular URL on occasion, especially as you’re getting things set up. First, you’ll want to define who is actually allowed to clear the cache for a URL – you don’t want just anyone to be able to do this. Do this by creating an access control list called “purge”:

acl purge {
  "localhost";
  "127.0.0.1";
  #add other IP addresses as needed
}

Then add some code to the vcl_recv, vcl_hit, and vcl_miss subroutines to react to PURGE requests:

sub vcl_recv {
    if (req.request == "PURGE") {
        if (!client.ip ~ purge) {
            error 405 "Not allowed.";
        }
        # Perform a cache lookup, which will purge the actual cached object
        return (lookup);
    }
}

sub vcl_hit {
    if (req.request == "PURGE") {
        purge;
        error 200 "Purged.";
    }
}

sub vcl_miss {
    if (req.request == "PURGE") {
        purge;
        error 200 "Purged.";
    }
}

Once these are in place, restart Varnish, and you should be able to perform queries at the command line like this:

curl -X PURGE http://example.com/foo/bar.html

Provided that you’re performing this request from one of the allowed IP addresses in your purge acl, you will clear Varnish’s cache of that item.

Watching files with tail and less

There are many instances in which you may want to watch a file for new writes. In my experience, I’m usually checking for particular activity in an Apache or nginx log file. Frequently the gut-check question of “is my service being hammered?” can be (at least initially) answered with a quick visual sweep of the frequency and types of writes to a log file.

At the command line, there are two common ways of doing this – one using tail, and one using less.

Tail

By default, tail will show you the last 10 lines of a file and then exit.  Now, it may be that that’s all you really need to check — perhaps all you’re looking to vary is how far back you’re looking in the file.  You can change the number of lines tail outputs by using the -n option.

For instance, to show just the last two lines of an nginx log file, use:

tail -n 2 /var/log/nginx/access.log

The real power of tail, however, is found when using it in follow mode. This is done by enabling the -f option:

tail -f /var/log/nginx/access.log

Running tail this way continuously outputs the contents of access.log to your terminal, updating the display as new lines get added to the file by the server process. You can escape this mode by entering ctrl-c, which will drop you back to the prompt.

If you pipe the output to other processes, interesting possibilities open up – for instance, filtering on a particular term with grep:

tail -f /var/log/nginx/access.log | grep beer.jpg

…will output only the hits to beer.jpg that make it to the log file (note that if you’re using a CDN or setting expire headers, this may not be an accurate representation of total hits). Alternatively,

tail -f /var/log/nginx/access.log | ./myscript.sh

…will pipe the output to your own script in which you’ve perhaps written some awk wizardry to slice and dice the lines in the output.
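
As a minimal sketch of that sort of wizardry, here is a one-liner that keeps a running count of hits per client IP address (it assumes the IP is the first field of each log line, as it is in the default log format):

tail -f /var/log/nginx/access.log | awk '{ count[$1]++; print $1, count[$1] }'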

Less

If you would like to work within the file itself, performing searches or paging forward and backwards in the contents, you may want to use the less command:

less +F /var/log/nginx/access.log

The +F option puts your less session into follow mode.  As with tail, new output will appear at the bottom of the screen as it’s written to the file by whatever server process is appending to it.  You’ll know you’re in follow mode because you’ll see this message at the bottom: “Waiting for data… (interrupt to abort)”

Hitting ctrl-c will interrupt this mode and allow you to jump out of follow and into the standard interactive less session, which lets you search (“/”) and jump forward (space bar) and backward (“b”).

To go back into follow mode, hit shift-F.

DNS Troubleshooting with nslookup

From time to time, when working with a client’s website or email services, you will be faced with DNS issues, real or perceived.  Something doesn’t resolve, or resolves incorrectly, and people need to know whether the issue is at the DNS level or somewhere else.

Using the venerable nslookup command, there are two types of queries you can perform that will help you quickly diagnose many of these types of problems.  Both involve using nslookup in interactive mode.

To enter interactive mode, open up a terminal, type nslookup, and hit enter.  You will be placed at the interactive nslookup prompt – a simple greater-than symbol – at which you can enter further commands in an interactive session.

Querytype

The first method is to request particular types of records via nslookup.  To do this, simply enter set querytype=record-type, replacing “record-type” with the type of record that you want to check for.

The ones I tend to have to look for most frequently are A, CNAME and MX records.  For these, you’d be entering set querytype=A, set querytype=MX or set querytype=CNAME, respectively.

You then follow this command with the record you’re specifically looking for.  For example, to get the list of MX records for example.com, you would enter:

nslookup
> set querytype=mx
> example.com

The response would return all of the MX records for example.com.  Hey, you’re acting like a mail server!

Server

The second type of query I find myself performing frequently is one that specifies which server to use when making the queries.  There are times when it’s important to know that you’re not getting cached data from a non-authoritative server.  In such cases, the solution is to query the primary or secondary authoritative servers directly.

To do this with nslookup, use the server command.  In this example, we will query the primary DNS server ns1.example.com:

nslookup
> server ns1.example.com
> example.com

After you enter the server command, all subsequent queries will be made directly to the server you have specified.
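
The two techniques can also be combined in a single, non-interactive call. For instance, to ask ns1.example.com directly for example.com’s MX records:

nslookup -query=mx example.com ns1.example.com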

Determining Reverse IP DNS Server Authority

If you’re running DNS service for a client, you may occasionally be asked to add PTR records to a zone, to add or modify reverse DNS resolution.  Reverse resolution is simply the flip side of forward DNS resolution, or in other words, the mapping of IP addresses back to names.

For example, the reverse lookup of 4.2.2.2 is b.resolvers.Level3.net.

At the DNS server level, this is set up by creating a PTR record in the DNS server that is responsible for the reverse lookups:

For 4.2.2.2, a record would be created in the zone 2.2.4.in-addr.arpa, which would look something like this in BIND:

2  IN  PTR  b.resolvers.Level3.net.

This would map the IP address 4.2.2.2 back to the name b.resolvers.Level3.net.

A frequent misconception is that the same DNS servers that are responsible for the forward-resolving A and CNAME records are also responsible for the reverse-lookup PTR records.  This can be the case, but frequently is not.

Often, the reverse-lookup DNS servers are managed by the ISP – the people who manage the IP address itself.  However, they can delegate authority for smaller blocks of IP addresses, so sometimes it WILL occur that the same server is handling both.

How can you determine what server is responsible for resolving PTR records?  Let’s look at the example of this website’s IP address – 50.57.54.144.

A fairly quick manual way is to perform some recursive nslookups (dig is another great tool for this).  Start by querying any ole root server:

nslookup
> server a.root-servers.net
> 50.57.54.144
Authoritative answers can be found from:
in-addr.arpa    nameserver = d.in-addr-servers.arpa.
in-addr.arpa    nameserver = e.in-addr-servers.arpa.
in-addr.arpa    nameserver = f.in-addr-servers.arpa.
in-addr.arpa    nameserver = b.in-addr-servers.arpa.
in-addr.arpa    nameserver = a.in-addr-servers.arpa.
in-addr.arpa    nameserver = c.in-addr-servers.arpa.

This is telling you “please contact one of these other DNS servers.  They’ll either have the answer or be able to tell you who to contact next if they delegate further”.  So, choose one of the authoritative servers listed, and repeat:

> server d.in-addr-servers.arpa.
> 50.57.54.144
Authoritative answers can be found from:
50.in-addr.arpa    nameserver = u.arin.net.
50.in-addr.arpa    nameserver = z.arin.net.
50.in-addr.arpa    nameserver = r.arin.net.
50.in-addr.arpa    nameserver = y.arin.net.
50.in-addr.arpa    nameserver = v.arin.net.
50.in-addr.arpa    nameserver = w.arin.net.
50.in-addr.arpa    nameserver = t.arin.net.
50.in-addr.arpa    nameserver = x.arin.net.

…and do it again:

> server u.arin.net
> 50.57.54.144
Authoritative answers can be found from:
57.50.in-addr.arpa    nameserver = NS2.RACKSPACE.COM.
57.50.in-addr.arpa    nameserver = NS.RACKSPACE.COM.

One more time!

> server NS.RACKSPACE.COM
> 50.57.54.144
144.54.57.50.in-addr.arpa    name = 50-57-54-144.static.cloud-ips.com.

…and we’ve gotten to our authoritative server for this particular IP address.  If the reverse lookup name needed to be changed for this IP address, Rackspace would be the company to contact.
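
If you’d rather not walk the chain by hand, dig (mentioned above) can perform the same delegation walk in one shot with its +trace option:

dig +trace -x 50.57.54.144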

The question of who to contact for PTR record setups or changes seems to come up for me every couple of months, so this is something I fall back to fairly regularly.

Tools O’ The Trade

About once a year, I like to look back and consider what tools and utilities have impacted my development processes and online existence in a positive way.  This is the list for 2012…

Things: An oldie but a goodie (to-do manager).  I tend to vacillate between this and a todo.txt file synced between devices via SugarSync.  Where Things excels is in its ability to cross-connect to other apps (like saving links to particular emails in Postbox, for instance), and in the fact that it can schedule regular reminders for repeatable tasks.

Phing: This year in the development group, automation and consistency across developers have been the theme.  As such, we’ve been using meta-languages such as SASS, HAML, and CoffeeScript, batching processes into git hooks to make sure consistency checks are done, and running automated documentation via phpDocumentor and docco, and our lives are better for it.  Phing has helped bring these together under a nice PHP-based umbrella.  It is a PEAR project that lets you easily chain processes together, setting dependencies on earlier tasks for later ones.  For instance, want to make sure that your documentation is updated before committing?  Or that all your images are optimized and your CSS/JS minified before pushing to staging?  Phing makes that a breeze.

Git: I really can’t tell you how many times git has saved my bacon over the last few years.  Or how many fewer times I’ve run into merge conflicts compared to when we used SVN.  At work, we use it via Beanstalk.  We do love git.

Teleport: Most of the servers we deploy are using Ubuntu LTS editions.  Ensuring that they’re consistently deployed and configured was always a matter of hoping that the person setting them up (usually me) was doing it the same way each time.  This mostly worked, but human error being what it is, we sometimes ran into server-to-server inconsistencies.  We played a bit with Puppet and Chef, but the Ruby gem Teleport has given us the simplicity and ease of use we were looking for, especially considering our fairly moderate needs.

iTunes Match: With a music collection hovering at 160 gigs, storing music on my phone was looking to become a synchronization nightmare.  Luckily, for $25/yr, Apple is happy to store my music and stream it to my phones and other computers.  Songs in the iTunes catalog are simply streamed at 256 kbps (regardless of the quality of my version), which literally rocks.  Other music is uploaded to the cloud and streamed from there.  With the newest iOS, it doesn’t even save a copy of it on your phone (unless you want it to) – by default it simply streams, saving valuable storage space and rocking the tunes even more quickly.

Music for Programming: At least for me, this podcast has been great.  When I need to get in the zone, I require either silence, or something abstract enough that my brain will not grab onto musical structures or lyrics.  Music for Programming is an irregularly but fairly long-running podcast that fulfills these requirements well via hour to hour-and-a-half long installments.

ievms: It’s impossible to avoid: if you’re developing for the web, you’re going to run into having to hack/enhance perfectly good code into a state that will let it run consistently across various versions of Internet Explorer.  Luckily, Microsoft makes available some barebones images of their operating system with IE ready to go, for each of the last several IEs (6, 7, 8, and 9 thus far).  Using Oracle’s excellent (and free) VirtualBox product, you can run these images, giving yourself the freedom to test easily and locally.  Unfortunately, the installation of these images is a bit of a tedious process.  The ievms project automates said installation, making it as easy as entering one line at the terminal.  Rock and roll.

Sublime Text: There was a time when the entire dev team at NewCity was using Textmate for all of their text editing needs.  Times change, but unfortunately Textmate didn’t.  Fortuitously, there was soon a new player on the scene – Sublime Text.  It ran all of the Textmate bundles, featured an open and active plugin architecture, AND it was under regular and effective development.  Given its intuitive, customizable interface and vibrant community, we haven’t looked back.

Thoughts on Cloudflare

We’ve been using Cloudflare, a combination Akamai-like CDN and security layer that you place in front of your websites.  It’s been about a month now, and here are a few thoughts:

  • Extremely low barrier to entry – fire up an account, change your DNS, and you’re off and running.  Even the free account offers a ton of options, so you can get a good sense of what it can do with no expense.
  • It’s done an amazing job saving on bandwidth – since we put it out in front, its CDN has handled almost 90% of our bandwidth, rather than our server having to send that out.  Memory-hungry Apache (and your battalion of nginx webservers) thanks you, Cloudflare.
  • In doing so, it’s handled about 75% of the requests to the server.
  • Questionable behavior tends to originate in the US during the day, and predominantly from China and Russia at night.  Makes sense, I guess.  US attacks are the majority.
  • As with Google’s mod_pagespeed, you’ll want to go easy on the optimizations and ramp up slowly, testing as you go, as they may break some of your custom client-side functionality or styling.  JS and HTML minification seemed to do the most damage for us.  The Javascript async optimization (aka, Rocket Loader) caused a couple problems as well.  Luckily you can just disable problematic optimizations.
  • The page-specific rules are nice for setting up aggressive caching for most of the site, and less aggressive caching for certain parts.  We also use it to enforce SSL in certain places (their full vs flexible SSL encryption is pretty cool!).

The only real challenge we’ve encountered is the loss of control that clients may face by having to change their DNS over to Cloudflare.  We have run into a couple of situations in which the client wanted us to handle website-specific optimizations, but still needed to retain control of their DNS, which precluded them from switching to Cloudflare.  It would be nice if there were a way to have all Cloudflare accounts under one umbrella account, yet still be able to give individual clients access to Cloudflare DNS management for their domain – that would go far in solving this.

Overall, it’s been a great product to use thus far, and from following their blog, the company is clearly run by smart, innovative folks who are working hard at pushing the boundaries of security and CDN services while keeping things dead simple for the end user.


Dynamic server protection with fail2ban

Fail2ban is a great framework written in Python that watches your server log files and is used for intrusion detection and general bad behavior prevention.

It is configured via text config files, which determine which log files you want it to watch and let you define what constitutes a notable event in each one.  These notable events are specified via regular expressions in a filter config file.  Fail2ban comes with a handy command-line utility, fail2ban-regex, which lets you test how fail2ban will apply your regular expressions to test log snippets you provide.
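
A typical invocation points it at a log file (or a snippet of one) and a filter definition.  For example, to test the stock SSH filter against your auth log:

fail2ban-regex /var/log/auth.log /etc/fail2ban/filter.d/sshd.conf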

Once you’ve defined the events you want fail2ban to look out for, you can then define what action it should take when enough of those events occur.  Potential actions to take can include running scripts, sending emails and dynamically modifying firewall rules.

One of the most common uses of fail2ban is to watch the /var/log/auth.log file, where invalid SSH login attempts are recorded.  If it detects enough failed attempts, fail2ban can ban the offending IP address for a specified time period by dynamically editing the local iptables rules.  Another good use case is to have it ban IPs that generate failed HTTP authentication entries in the Apache error logs (by default in /var/log/apache2/error*.log).

Fail2ban can also be set to automatically unban blocked IP addresses after a certain amount of time.  This will help mitigate cases of false positives, and still provides some protection against automated attacks and scans, which tend to move on after being blocked.

To install fail2ban on an Ubuntu server, simply issue a:

sudo apt-get install fail2ban

Once it’s installed, you’ll find a number of pre-defined filters in /etc/fail2ban/filter.d.  You can determine which ones are active by editing /etc/fail2ban/jail.conf.  The jail.conf file is also where you define how many attempts constitute a ban, and for how long a ban is in effect.
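
Once your jails are enabled and fail2ban is running, you can check on them with its client utility.  Note that the name of the SSH jail may be ssh or sshd, depending on your version:

# list the active jails
sudo fail2ban-client status
# show details, including currently banned IPs, for the SSH jail
sudo fail2ban-client status ssh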

Fail2ban will maintain a log of its activities, including what IPs get banned and unbanned, at /var/log/fail2ban.log.

More information about this great utility can be found at its website: http://www.fail2ban.org/wiki/index.php/Main_Page