Recent Articles


Enhanced spamstats for Munin

I added some extra fields to the stock spamstats plugin for munin-node.  This version tracks legit (ham), spammy, spam, viruses, and DNSBL blocked mail.  The output looks like:

[Image: munin-node spamstats]

The script is after the jump.

Read more »


ClamAV, American Express, and Heuristics.Phishing.Email.SpoofedDomain

ClamAV was doing its job scanning email via amavisd-new.  It was catching all the nasties that folks tend to foist on their fellow net citizens.  Unfortunately, when your spam and virus filters are doing their job, they occasionally catch folks who aren’t malicious, but also aren’t using best practices.  This was the case with American Express.  Emails from American Express to my clients were being classified as “Heuristics.Phishing.Email.SpoofedDomain”.  Searching around the net brought me to several sites where admins had ended up doing crazy things like disabling the heuristic scanning on email in ClamAV, or creating elaborate policy banks in amavis.  Well, I was having none of that.  I like the most correct, simplest solution.  Hopefully, this methodology will help others solve similar issues.

One post I read referenced this document that relates how to create whitelists for Clam’s phishing filters.  That’s a good start.  That same document mentions a utility script that will help isolate why an email is getting picked up by a rule.  Unfortunately, my install didn’t have that script.  A little searching brought me to a copy on GitHub.  Running that led to a laundry list of Python-specific issues, mostly due to my environment.  But, using the script as a guide, I just did it manually.  The following command gave me a goldmine of information.

clamscan -d /var/lib/clamav/ --debug --max-filesize=0 --max-scansize=0 /root/scratch/amex_mail.eml 2>&1 | less

A few items to note about the command:

  • The path after the -d is the location of my AV signatures.
  • The amex_mail.eml is the raw text of the email (headers and all) that I pulled out of our quarantine database.

In that giant slew of output from the clamscan debug, the important part was this:

LibClamAV debug:
LibClamAV debug: Phishing: looking up in whitelist:; host-only:1
LibClamAV debug: Looking up in regex_list:
LibClamAV debug: Lookup result: not in regex list
LibClamAV debug: Phishcheck: Phishing scan result: URLs are way too different
LibClamAV debug: found Possibly Unwanted: Heuristics.Phishing.Email.SpoofedDomain
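
If paging through the whole debug dump gets tedious, a small filter along these lines should narrow it down to the phishing checks.  This is only a sketch: the function name phish_lines and the pattern list are my own guesses based on the excerpt above, so adjust them to whatever your output actually contains.

```shell
# Keep only the phishing-related lines from clamscan's debug output.
# The pattern list is an assumption drawn from the excerpt above.
phish_lines() {
    grep -E 'Phishing|Phishcheck|regex_list|Lookup result|Possibly Unwanted'
}

# Usage (same paths as the command above):
# clamscan -d /var/lib/clamav/ --debug --max-filesize=0 --max-scansize=0 \
#     /root/scratch/amex_mail.eml 2>&1 | phish_lines
```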

The issue is that there are links that display one URL, but link to a different URL altogether.  I added the first pair to a file called daily.wdb in the same directory as my other ClamAV signatures (/var/lib/clamav/ in my case).  With each pair that I added, I would re-run the debug command and discover a new pair.  I ended up with three pairs in there before the emails checked out clean.  Below are the contents of the daily.wdb file.

Here is a more advanced example of a daily.wdb file.
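
For anyone unfamiliar with the format: as I understand the ClamAV phishing signature documentation, each .wdb line is either M:RealHostname:DisplayedHostname (literal hostnames) or X:RealURL:DisplayedURL (regular expressions).  The helper below just prints an M: line.  The name wdb_entry is made up, and the hostnames in the usage comment are placeholders, not the pairs from my file.

```shell
# Print one literal-match whitelist entry in .wdb format.
# $1 = hostname the link really points to
# $2 = hostname displayed in the email
wdb_entry() {
    printf 'M:%s:%s\n' "$1" "$2"
}

# Hypothetical usage; substitute the pair reported by the debug output:
# wdb_entry click.example.net www.example.com >> /var/lib/clamav/daily.wdb
```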

Once I restarted clamd, the AmEx emails started to pass as expected. Hope this helps someone.


AIX, Kerberos, LDAP, and Active Directory

I have an article published on IBM developerWorks detailing a working setup of Kerberos, LDAP, and AD authentication from AIX.


The Raspberry Pi Has Arrived

Due to the insanely late ship date of my Pi, I received the v2 model for being patient.  (The v2 has 512MB of onboard memory vice 256MB on the v1.)  In the first 30 minutes, I was able to load one SD card with RaspBMC (XBMC optimized for the Pi) and another with Raspbian (The project’s build of Debian Wheezy for the Pi).  I didn’t get too much time to poke around with it from there.

Here are some of the uses I have in mind:

  • Media center to stream content from the movie/TV collection on my NAS. (using RaspBMC)
  • Penetration testing platform using PwnPi, Raspberry Pwn, or rolling my own, starting from a base install of Raspbian.
  • A replacement for my always on test/dev server that also acts as the gateway for me to SSH into my home network.
  • A front end for my audio components using an IR sender and a web GUI accessible from my phone. (Those GPIO pin outs are too tempting.)

Odds are, the first unit will end up being a blend of many of those options.  I’m already queued up for 2 more when I make it down the list again.

Because I spend more time at the prompt than consuming media, I think I’ll start with Raspbian and build up the LAMP stack first.

More to come as time allows.  I should probably be studying for my CISSP now…


Filling PDF forms with PHP – Part 1

At work, we have oodles of databases containing untold treasures of information.  We also have boatloads of fillable PDFs for just about every possible task. (It’s government work after all.)  As I walk around, I see these poor souls listlessly going back and forth between their database apps and their PDF forms, copying and pasting.  I finally decided that surely I could help the disenfranchised masses remove ten or twenty Ctl-C/Ctl-V operations from every form they fill out.  Nobody wants to spend their day making “copypasta”.  So, I’m secretly (never tip your hand until successful) undertaking this as a side project. (Because apparently I have something against sleeping.)

So, my initial tests look promising, but I haven’t generated a lick of PHP yet, so don’t get excited.  Hopefully this will give you an idea as to what I’m doing.

Here’s the gist of it:

The Swiss Army Knife of PDF tools, pdftk, can generate an FDF file from a PDF form like so:

pdftk a_pdf_form.pdf generate_fdf output a_pdf_form.fdf

The generated FDF file had a few odd characters (like ^@ and þÿ) that I had to scrub out to make it useful.  That could be my environment; your mileage may vary. (CentOS 5.8, pdftk 1.44, BTW)  If you have the same, I used the following 2 commands in vim to scrub them out:

:%s/^@//g   #Note you get ^@ by typing Ctl-v then Ctl-@
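
If you would rather not open vim, the same scrub can probably be done in one pass from the shell.  This is a sketch under the assumption that ^@ is a NUL byte and þÿ is the byte pair 0xFE 0xFF, and that everything else in the file is plain ASCII; the name clean_fdf is my own.

```shell
# Strip NUL (^@) and the 0xFE 0xFF bytes (displayed as þÿ) from an FDF
# read on stdin.  Assumes the remaining field names and values are ASCII;
# non-ASCII text in the file would be damaged by this.
clean_fdf() {
    LC_ALL=C tr -d '\000\376\377'
}

# Usage (file names are placeholders):
# clean_fdf < raw.fdf > clean.fdf
```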

Now I had a raw, blank FDF to work with, but all my fields were out of order and had names like “TextField[1]” or “CheckBox5[0]”, made uglier by the fact that the names repeat for each row of fields nested inside the PDF’s table structure.  Icky.  My first thought was to enter values for every text field like “textField1” to “textFieldn”, but when I laid the data back over the form, there was no semblance of order and it was going to be a nightmare.  So I went back, filled out the original PDF with descriptive values, regenerated my FDF, and cleaned it up like above.  Now I had an FDF where I could tell up from down, mostly.  The pertinent parts of the FDF look like this:

/V /1
/T (CheckBox5[0])
/V /
/T (CheckBox5[1])
/V (SSN4)
/T (TextField[1])

I still can’t discern the checkboxes, so some trial and error will be necessary there.  The basics of it are thus:  The value is on the “/V” line and the field name is on the “/T” line AFTER it.  Text values go inside parentheses.  For checkboxes, a lone slash means unchecked.  The checked value will depend on your form.  The PDF spec calls for “(Yes)” for a checked box.  The form I was working with uses “/1”.  Trial and error.  Good luck on that.

If you change some values in the FDF and want to try generating a filled form, you use the following syntax:

pdftk original_pdf.pdf fill_form generated_and_modified_fdf.fdf output new_filled_pdf.pdf

If all goes well, you’ll have a neat new filled PDF.

That’s great, you say, but how does this work with PHP?  As I see it, you use the FDF file as a template by inserting placeholders in each of the values.  You parse and replace them, then merge with the original PDF to get your filled form. (using shell_exec() or similar)  I’ll work that part out and write Part II.
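
Until the PHP is written, the placeholder idea can be tried straight from the shell, which is roughly what shell_exec() would end up calling anyway.  This is a rough sketch, not a finished tool: the {{NAME}} placeholder syntax and the function name fill_field are my own inventions, and it does no escaping of parentheses or backslashes in the value, which FDF strings would need.

```shell
# Replace every literal {{NAME}} in an FDF template (stdin) with a value.
# $1 = placeholder name, $2 = replacement value.
# Values are passed through the environment so awk treats them literally.
fill_field() {
    NAME="$1" VALUE="$2" awk '
        BEGIN { ph = "{{" ENVIRON["NAME"] "}}"; val = ENVIRON["VALUE"] }
        {
            out = ""
            # Walk the line, swapping each occurrence of the placeholder.
            while ((i = index($0, ph)) > 0) {
                out = out substr($0, 1, i - 1) val
                $0 = substr($0, i + length(ph))
            }
            print out $0
        }'
}

# Usage (template.fdf, filled.fdf and the field names are placeholders):
# fill_field SSN4 1234 < template.fdf | fill_field LASTNAME Smith > filled.fdf
# pdftk original_pdf.pdf fill_form filled.fdf output new_filled_pdf.pdf
```

Chaining one call per field keeps each step dumb and easy to debug, at the cost of re-reading the template for every field.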

If you work it out first, or already have and really like me, post it in the comments.


Raspberry Pi

With the forthcoming shipment of my long-awaited Raspberry Pi, I am adding a new section under the projects category.

I wish I could have gotten my hands on more than one, because I have had way too much time to dream up uses for them.

More to come…

In the meantime, my buddy already received his and is way better about actually updating his blog at


Pandora from the Command Line

I like having Pandora going pretty much all the time, be it Bach when I’m coding, Techno for sysadmin tasks, or indulging my shameful pop music addiction.  I wanted a way to control Pandora without having to drop out of the shell.  I wanted it for my Mac, but lucked out and found one that works across all the platforms I use.  Pianobar is a command line Pandora client and it works in Mac and Linux. (And Windows too.)

I was having a little trouble building it in Snow Leopard using the instructions from here, when I discovered that it’s already available in MacPorts.  So I installed it with:

sudo port install pianobar

In Linux, you can find links to the repos for your distro of choice on the Pianobar website.

Next, I wanted it to login automatically and start playing when I launched it.  On Mac, you can create a config file at ~/.config/pianobar/config, with contents similar to the following:

autostart_station = 1234567890
password = s3cR3t_sQu1RR3L
user =

To get the station ID for the autostart_station parameter:

  1. Run pianobar
  2. Log in manually
  3. Launch your favorite station
  4. Hit i to see the station and song info.
  5. The station ID will be in parentheses after the station name.

After you’ve got your file saved, you should be able to launch pianobar and have it start playing auto-magically.

Now, my next step was to use at so I could start pianobar at a given time and use it as an alarm clock.

You need to enable atrun on your Mac to use at to schedule jobs. (It’s enabled by default on most Linux distros.)  You can schedule the launch like so:

at 5:30am tomorrow #hit enter
pianobar #hit enter
#hit Ctl+D

If you start pianobar with at, it’s not on an interactive shell so you have no way to interact with it, or so I thought.  You can create a fifo file to pass controls to the process:

mkfifo ~/.config/pianobar/ctl

Once you have that, you can control pianobar by echoing commands into the fifo:

#To pause:
echo p > ~/.config/pianobar/ctl
#To quit:
echo q > ~/.config/pianobar/ctl
#etc. etc.
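
If you find yourself typing those echo lines a lot, a tiny wrapper saves some keystrokes.  This is just a sketch: the function name pianoctl and the PIANOBAR_CTL variable are made up for the example, and the default path is the fifo created above.

```shell
# Send one pianobar key command (p, q, n, ...) through the control fifo.
# PIANOBAR_CTL is a hypothetical override; it defaults to the fifo above.
# Note: the write blocks until something (pianobar) is reading the fifo.
pianoctl() {
    ctl=${PIANOBAR_CTL:-$HOME/.config/pianobar/ctl}
    if [ ! -p "$ctl" ]; then
        echo "no fifo at $ctl" >&2
        return 1
    fi
    printf '%s\n' "$1" > "$ctl"
}

# Usage:
# pianoctl p    # pause
# pianoctl q    # quit
```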

Hopefully that’s food for thought enough to get you started.  Enjoy.


PHP Choose Your Own Adventure

It all started with a whim.  “I wonder if anyone has written a choose your own adventure game in PHP?”  That landed me on Cal Henderson’s “choose” game.  That, in turn, led me to Club-Ubuntu’s fork.  I started playing around with the stock version and found myself making quite a few changes, so I decided to dig in a little deeper and make it an official fork.  I added support for Google Adsense and Analytics, ReCaptcha on the user forms, and an admin page to manage some of the new features as well as some of the copy throughout the site.

I pushed the initial release to GitHub tonight.  You can find the code here.  There is also a working demo available.



Cleaning out the wiki

I started going through my wiki and some old folders on my NAS this week.  I have several old projects that may be of use to someone.  Over the next few weeks, I’m going to try to post a whole slew of stuff.  I’m also working on a few new projects that are nearly ready to be put out into the wild.  So, in short, stay tuned.


Samba Network TIFF Printing

For over a year now, we have been using a samba shared network printer to generate TIFF files from electronic documents so that they can be imported directly into iPerms.  The TIFF printer is simply a script that takes PostScript input from the client machine’s print driver and converts it to an iPerms-compatible TIFF image.  This is primarily useful for an Army installation, but may be relevant if your site is using some other form of document archiving system that uses TIFF images.  I can say that ours has spit out over 30,000 pages.

How it works:

Users set up the printer on their system using a PS print driver.  (On Vista, I usually use the HP Color Laserjet 2500 PS driver.)  When they print a document to the printer, the script generates a PDF, converts the PDF to a TIFF file, and removes the interim PDF.  The completed file is dropped into a share with 0600 file permissions.  I use 0600 because I set the share to only display readable files.  Thus, while everyone is printing to the same folder, they only see their own files when they open the share.  Less chance of PII leakage.
Read more »