Simple Attachment Filtering

Archived Content

This article was written a long time ago and it is no longer being maintained. The contents may not be relevant and links may not work. Thank you for your understanding.

Introduction

Attachment Filtering Script
attfltr.zip / 11kB
The source code discussed in the article. The ZIP package also contains the required smtpreg.vbs event administration script.

The recent attack of the W32.Sobig.F@mm virus confirmed again that virus creators do not need sophisticated infection mechanisms that exploit security vulnerabilities to spread their virus over the Internet.

It is often enough to send the virus in email with a subject like "Thank you!" and attach the virus with a name like thank_you.pif. The less experienced user will open the mail and the attached file—the virus itself.

Fortunately, most corporate networks are nowadays protected against virus attacks by anti-virus software. The anti-virus software updates its virus definitions daily, so these networks are well protected against known viruses.

But what if you get a new virus which is not yet known to the anti-virus software? Ouch! Removing the virus will be a painful process! Would it not be better to have a first line of defense which removes suspicious attachments from incoming mails? I guess you do not want to get .pif or .scr files in email at all.

Exchange does not support attachment filtering based on extensions, so you need third-party software to do this (or you can write your own). I recommend the second option as it is really not that difficult.

In my previous article (HOWTO: Remove Read Receipt Requests on the Server) I introduced SMTP OnArrival event sinks. These event sinks are scripts that are executed for every inbound mail and can manipulate it, which is exactly what we need.

The great plan

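Writing this simple attachment filter script is not a big project that requires compelling 3D UML diagrams, developer team meetings and market research, so our plan will be simple. Our script will:

- check every attachment of each incoming email,
- remove the attachments whose file names are used by W32.Sobig.F@mm, along with any other .pif or .scr attachment,
- append a notice to the email that lists the removed attachments.

Sounds simple.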

Ah, I almost forgot to mention that we will write the script code in JScript (ECMAScript, Microsoft's JavaScript variant). Sorry, no VBScript here, I do not write any code in VBScript unless it is really necessary. If you have ever worked with VBScript and another programming language, I guess I do not have to explain why.

Source code

Here comes our script source code.

<SCRIPT language="JScript">
  // declare "constants"
  var cdoRunNextSink = 0; 
  
  /* entry point called by CDO */
  function ISMTPOnArrival::OnArrival(Msg, Status) {
    // declare variables
    var strInfoText = "";
    var strInfoHTML = "";
    var bIsHTMLMail = false;
    var x = 1;
    var objAttachment;
    var strAttachmentFileName = "";    
    var strAttachmentFileExt = "";
    var st;
    var strHTMLBody = "";
    var strTemp = "";

We will need these variables later in the code, so we declare them in one place.

    // check attachments
    if(Msg.Attachments) {
      // is this a HTML mail?
      try {
        if(Msg.HTMLBodyPart) {
          bIsHTMLMail = true;
        }
      }
      catch(e) { }  // may raise an exception when IMessage.HTMLBodyPart not found

First we check whether there are attachments for this mail, then we detect the mail format (simple text or HTML).

The detection is important because HTML emails usually consist of two text parts: a simple text part for mail agents which do not support displaying HTML emails, and an HTML part for HTML-capable agents, which contains the HTML mail source code. HTML-capable agents usually show only the HTML part of an HTML email. As we need to append the attachment removal notice to the modified email, we also have to modify the HTML part when it is present.

      // check each attachment
      while(x <= Msg.Attachments.Count) {
        objAttachment = Msg.Attachments.Item(x);
        // read the name only if the object exists, so the check below can do its job
        strAttachmentFileName = objAttachment ?
          String(objAttachment.FileName).toLowerCase() : "";

We check every attachment file name in a loop. For easier handling, we store the lowercase version of the file name in the strAttachmentFileName variable. I like short variable names.

        if(objAttachment) {
          if(
            (strAttachmentFileName == "your_document.pif") ||
            (strAttachmentFileName == "document_all.pif") ||
            (strAttachmentFileName == "thank_you.pif") ||
            (strAttachmentFileName == "your_details.pif") ||
            (strAttachmentFileName == "details.pif") ||
            (strAttachmentFileName == "document_9446.pif") ||
            (strAttachmentFileName == "application.pif") ||
            (strAttachmentFileName == "wicked_scr.scr") ||
            (strAttachmentFileName == "movie0045.pif")
          ) {

If the attachment object exists (okay, it will always exist, but who knows...), we examine the file name, and if it matches any of the attachment names used by W32.Sobig.F@mm, we will do something terrible with the attachment...

            // oops, W32.Sobig.F@mm found!
            if(bIsHTMLMail) {
              // add HTML info
              strInfoHTML +=
                "<b>Attachment removed:</b> Possibly W32.Sobig.F@mm virus. " +
                "File name: <i>" + objAttachment.FileName + "</i>, " +
                "content type: <i>" + objAttachment.ContentMediaType +
                "</i><br>\r\n";
            }
            // add simple text info
            strInfoText +=
              "Attachment removed: Possibly W32.Sobig.F@mm virus. File name: " +
              objAttachment.FileName + ", content type: " + 
              objAttachment.ContentMediaType + "\r\n";
            
            // remove attachment
            Msg.Attachments.Delete(x);
          }

...cut it off from the message. Before we do that, we add a new line to the removal reason text. As every incoming email has at least a text part, we always generate the removal reason in simple text format and optionally we add an HTML-formatted text if the incoming message is in HTML format.

We have more to do, because our script removes not only W32.Sobig.F@mm attachments but every .pif and .scr attachment. These file types often carry viruses and we do not actually need to receive them by email.

          else {
            // W32.Sobig.F@mm was not identified, check for other .pif or
            // .scr attachment
            strAttachmentFileExt = 
              strAttachmentFileName.substr(strAttachmentFileName.length - 4, 4);
            if(
              (strAttachmentFileExt == ".pif") ||
              (strAttachmentFileExt == ".scr")
            ) {
              // suspicious attachment
              if(bIsHTMLMail) {
                // add HTML info
                strInfoHTML +=
                  "<b>Attachment removed:</b> Suspicious file extension (" +
                  strAttachmentFileExt + "). " +
                  "File name: <i>" + objAttachment.FileName + "</i>, " +
                  "content type: <i>" +
                  objAttachment.ContentMediaType + "</i><br>\r\n";
              }
              // add simple text info
              strInfoText +=
                "Attachment removed: Suspicious file extension (" +
                strAttachmentFileExt + "). " +
                "File name: " + objAttachment.FileName + ", content type: " +
                objAttachment.ContentMediaType + "\r\n";
              
              // remove attachment
              Msg.Attachments.Delete(x);              
            }

This is the same as what we did with the Sobig attachments, but in this case we examine only the last four characters of the attachment file name (the dot character and the extension, e.g. ".pif").
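The two checks so far (exact Sobig file names first, then the last four characters) could also be written with lookup tables, which makes the lists easier to extend. This is only a sketch in plain JScript-compatible JavaScript; the names SOBIG_NAMES, BLOCKED_EXTENSIONS and classifyAttachment are hypothetical and are not part of the downloadable script.

```javascript
// Hypothetical helper, not part of the original script: lookup tables
// replace the long chains of == comparisons.
var SOBIG_NAMES = {
  "your_document.pif": true,
  "document_all.pif": true,
  "thank_you.pif": true,
  "your_details.pif": true,
  "details.pif": true,
  "document_9446.pif": true,
  "application.pif": true,
  "wicked_scr.scr": true,
  "movie0045.pif": true
};
var BLOCKED_EXTENSIONS = { ".pif": true, ".scr": true };

// Returns "sobig", "extension" or "ok" for the given attachment file name.
function classifyAttachment(fileName) {
  var name = String(fileName).toLowerCase();
  if (SOBIG_NAMES.hasOwnProperty(name)) {
    return "sobig";
  }
  // same rule as the script: look only at the last four characters
  var ext = name.length >= 4 ? name.substring(name.length - 4) : "";
  if (BLOCKED_EXTENSIONS.hasOwnProperty(ext)) {
    return "extension";
  }
  return "ok";
}
```

The return value mirrors the two removal reasons used in the notice text, so the calling code could pick the matching message from it.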

            else {
              // nothing special with this attachment, skip and go for next
              x++;
            }
          }
        }
        else {
          // no object, just skip
          x++;
        }
      }  

A number of nice brackets.

We increment the loop counter (the x variable) only when we skip an attachment (because it passed our test). When we delete an attachment, the counter is not incremented. Why? Because after a deletion the remaining attachments move down by one, so the next attachment now has index x.
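The counter handling is easy to try outside of the mail server. The following sketch uses a hypothetical FakeAttachments object as a stand-in for the 1-based Msg.Attachments collection (the real one is a COM object) and removes every .pif entry with the same loop pattern.

```javascript
// Stand-in for a 1-based collection such as Msg.Attachments. This object is
// an assumption made for illustration only.
function FakeAttachments(names) {
  this.items = names.slice(0);
  this.Count = this.items.length;
}
FakeAttachments.prototype.Item = function (index) {
  return this.items[index - 1];
};
FakeAttachments.prototype.Delete = function (index) {
  this.items.splice(index - 1, 1);   // later items move down by one
  this.Count = this.items.length;
};

// Removes every name ending in ".pif" and returns the removed names.
function removePifFiles(attachments) {
  var removed = [];
  var x = 1;
  while (x <= attachments.Count) {
    var name = attachments.Item(x);
    if (/\.pif$/i.test(name)) {
      removed.push(name);
      attachments.Delete(x);   // no increment: the next item is now at x
    } else {
      x++;
    }
  }
  return removed;
}

var list = new FakeAttachments(["a.pif", "b.pif", "notes.txt", "c.pif"]);
var removed = removePifFiles(list);
```

If the counter were incremented after every deletion, the loop would jump over "b.pif", because that entry moves to index 1 as soon as "a.pif" is deleted.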

Now we have to append the attachment removal information to the bottom of the mail, if we have removed anything. As the information text is blank if we did not do anything with the attachments, we use this variable to detect whether we have modified the email.

      // any attachments removed? add attachment removal info to the bottom
      // of the modified email
      if(strInfoText.length != 0) {
        // add filter message to the message text body
        Msg.TextBody +=
          "\r\n\r\n" + 
          "===================================================\r\n" + 
          "ATTACHMENT FILTER\r\n" +
          "===================================================\r\n" +
          strInfoText;

It is simple. We add two line break sequences (\r\n) and the removal information in simple text format to the end of the text email part.

Modifying the HTML part is a little bit more complex. We have to find the HTML document closing tag first, which is </html> and then insert the removal information immediately in front of the closing tag.

        // add filter message to the message HTML body (if exists)
        if(bIsHTMLMail) {
          // get the decoded HTML stream
          st = Msg.HTMLBodyPart.GetDecodedContentStream();
          // read the text and locate </html>
          strHTMLBody = st.ReadText();

As the HTML part is almost always encoded with quoted-printable or BASE64 encoding, we need to decode it first to be able to look for the </html> tag. This can be done by getting the decoded mail part stream into a text variable (called strHTMLBody here).

          x = strHTMLBody.toLowerCase().lastIndexOf("</html>");
          if(x >= 0) {
            // insert removal text to the stream
            st.Position = x;
            strTemp = st.ReadText();  // text from the </html> tag to the end
            st.Position = x;          // ReadText() left us at the end; go back
            st.WriteText(
              '<div style="font-family: Verdana, Geneva, Arial, Helvetica, ' +
              'sans-serif; font-size: 8pt; margin-top: 15px; ' + 
              'line-height: 160%;">\r\n' +
              '  <div style="background-color: #AA0000; font-weight: bold; ' +
              'color: #FFFFFF; padding: 4px; margin-bottom: 5px;">' +
              'ATTACHMENT FILTER</div>\r\n' +
              strInfoHTML + 
              '</div>\r\n' + 
              strTemp);
            // flush stream
            st.Flush();
          }         
        }

We store the position of the </html> tag in variable x, then we read the entire text from that position to the end of the HTML mail part into variable strTemp. We write the removal information in front of the </html> tag, followed by the original text, and update the HTML mail part contents (Flush()).
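The same insertion can be shown on a plain string, without the stream handling. This is a minimal sketch; insertBeforeClosingHtml is a hypothetical helper name and the HTML in the usage lines is made up for the example.

```javascript
// Hypothetical helper that works on a plain string instead of a stream.
// It inserts the notice in front of the last </html> tag (any letter case).
function insertBeforeClosingHtml(html, notice) {
  var pos = html.toLowerCase().lastIndexOf("</html>");
  if (pos < 0) {
    return html;   // no closing tag found: leave the text untouched
  }
  return html.substring(0, pos) + notice + html.substring(pos);
}

var result = insertBeforeClosingHtml(
  "<HTML><body>Hello</body></HTML>\r\n",
  "<div>ATTACHMENT FILTER</div>\r\n");
var untouched = insertBeforeClosingHtml("just text", "<div>notice</div>");
```

As in the script, nothing is inserted when the closing tag cannot be found. Appending the notice to the end in that case would be another reasonable choice.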

        // save email changes
        Msg.Datasource.Save();

Save any changes we made to the email.

      }
    }
    
    // continue execution with the next sink
    Status = cdoRunNextSink;
  }
</SCRIPT>

Registering the event sink

We can register our event sink as described in my previous event sink article. Please see HOWTO: Remove Read Receipt Requests on the Server.

Summary

Screenshot of a filtered email

On the right side you can see the script in action. I attached 7 files to a test email: 2 "valid" attachments, 3 Sobig-like .pif and .scr files and 2 custom .pif/.scr attachments. The script removed the Sobig-like and the custom .pif/.scr attachments and left the two "valid" attachments intact.

This script is another good example of the cheap custom-tailored tools that you can create yourself. The OnArrival event sinks and Windows scripting are powerful tools that you must not miss in your everyday work.

I found the attachment file names of the W32.Sobig.F@mm virus on the Symantec Security Response pages.

Update, September 2: This morning we were bombed by W32.Sobig.F@mm. NAV anti-virus removed the virus from all emails, but we were still receiving 3 virus reports per minute, so I created a modified version of the script which drops Sobig-like emails on the front-end mail gateway. Scripting is powerful! :)

Update, September 3: The Microsoft Knowledge Base article Q235309 (Outlook E-mail Attachment Security Update) lists further attachment file name extensions that may be dangerous: .ade, .adp, .bas, .bat, .chm, .cmd, .com, .cpl, .crt, .exe, .hlp, .hta, .inf, .ins, .isp, .js, .jse, .lnk, .mda, .mdb, .mde, .mdz, .msc, .msi, .msp, .mst, .pcd, .pif, .reg, .scr, .sct, .shs, .url, .vb, .vbe, .vbs, .wsc, .wsf and .wsh. Thanks to Nathan Silva for bringing this up!
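Some extensions on this list (.js and .vb) are shorter than three letters, so comparing the last four characters of the file name would miss them: the last four characters of "evil.js" are "l.js". A possible way around this, sketched here with the hypothetical helpers getExtension and isDangerous, is to cut the name at the last dot and compare the result with the list.

```javascript
// Extension list taken from the update above; the helper names are
// hypothetical and not part of the original script.
var DANGEROUS_EXTENSIONS = [
  ".ade", ".adp", ".bas", ".bat", ".chm", ".cmd", ".com", ".cpl", ".crt",
  ".exe", ".hlp", ".hta", ".inf", ".ins", ".isp", ".js", ".jse", ".lnk",
  ".mda", ".mdb", ".mde", ".mdz", ".msc", ".msi", ".msp", ".mst", ".pcd",
  ".pif", ".reg", ".scr", ".sct", ".shs", ".url", ".vb", ".vbe", ".vbs",
  ".wsc", ".wsf", ".wsh"
];

// Returns the lowercase extension including the dot, or "" if there is none.
function getExtension(fileName) {
  var name = String(fileName).toLowerCase();
  var dot = name.lastIndexOf(".");
  return dot >= 0 ? name.substring(dot) : "";
}

// Returns true if the file name ends with one of the listed extensions.
function isDangerous(fileName) {
  var ext = getExtension(fileName);
  for (var i = 0; i < DANGEROUS_EXTENSIONS.length; i++) {
    if (DANGEROUS_EXTENSIONS[i] === ext) {
      return true;
    }
  }
  return false;
}
```

In the filter script, a call like isDangerous(objAttachment.FileName) could then take the place of the two-extension comparison.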
