Building an EICAR-Detecting Microengine / Arbiter

Overview

The EICAR test file is used by the antivirus industry to test engines' ability to detect malware. The file is not malicious, but most vendors deliberately flag it as if it were.

The EICAR test file is defined as a file that begins with the following string: X5O!P%@AP[4\PZX54(P^)7CC)7}$EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H* followed by a variable amount of whitespace.
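
The definition above can be sketched as a small standalone predicate. This is a minimal illustration, not part of polyswarm-client: the names `EICAR_SIGNATURE`, `ALLOWED_TRAILING`, and `is_eicar` are hypothetical, and the choice of which bytes count as "whitespace" (space, tab, carriage return, line feed) is an assumption.

```python
# The EICAR signature string, exactly as quoted in the definition above.
EICAR_SIGNATURE = rb"X5O!P%@AP[4\PZX54(P^)7CC)7}$EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*"

# Assumption: "whitespace" means space, tab, carriage return, and line feed.
ALLOWED_TRAILING = b" \t\r\n"


def is_eicar(content: bytes) -> bool:
    """Return True if content is the signature plus optional trailing whitespace."""
    if not content.startswith(EICAR_SIGNATURE):
        return False
    trailing = content[len(EICAR_SIGNATURE):]
    # Every byte after the signature must be one of the allowed whitespace bytes.
    return all(byte in ALLOWED_TRAILING for byte in trailing)
```

A predicate like this could be called from a scan() method in place of an exact equality check, for example `is_eicar(content)` rather than `content == EICAR`.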

Elsewhere, we discuss implementing an Ambassador that submits the EICAR test file.


Implement an EICAR Test File Scanner

Detecting EICAR is as simple as implementing a Scanner class that knows how to identify the EICAR test file.

When you created your participant, participant-template created a Scanner class in your participant's project_slug/package_slug/participant_name_slug.py file*. We're going to edit the Scanner class in this file. Scanner subclasses AbstractScanner, which is provided by polyswarm-client.

* These `_slug` variables, which define your directory structure, are based on your responses to the `cookiecutter` prompts.

You'll need to implement your Scanner's scan() method, instructing it to flag the EICAR test file as malicious.

There are, of course, many ways to identify files that match the EICAR criteria. The following are two examples of how you can write your scan() method to detect EICAR.

String Matching

This method does not detect EICAR test files with appended whitespace. Expanding this example to detect the full range of EICAR test files is left as an exercise to the user.

String matching is used by microengine/eicar.py:

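import base64
import os

# Verdict is provided by the polyswarm-artifact package
from polyswarmartifact.schema.verdict import Verdict
from polyswarmclient.abstractscanner import AbstractScanner, ScanResult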

EICAR = base64.b64decode(b'WDVPIVAlQEFQWzRcUFpYNTQoUF4pN0NDKTd9JEVJQ0FSLVNUQU5EQVJELUFOVElWSVJVUy1URVNULUZJTEUhJEgrSCo=')

class Scanner(AbstractScanner):

    async def scan(self, guid, artifact_type, content, metadata, chain):
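        # Record the scanner's OS and CPU architecture in the result metadata
        # (os.uname() is only available on Unix-like systems)
        sysname, _, _, _, machine = os.uname()
        metadata = Verdict().set_scanner(
            operating_system=sysname,
            architecture=machine)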

        if content == EICAR:
            metadata.set_malware_family('Eicar Test File')
            return ScanResult(bit=True, verdict=True, metadata=metadata.json())

        metadata.set_malware_family('')
        return ScanResult(bit=True, verdict=False, metadata=metadata.json())

SHA-256 Matching

This method does not detect EICAR test files with appended whitespace. Expanding this example to detect the full range of EICAR test files is left as an exercise to the user.

This method compares the SHA-256 digest of the EICAR test file with a known-bad hash:

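import base64
import os

from hashlib import sha256
# Verdict is provided by the polyswarm-artifact package
from polyswarmartifact.schema.verdict import Verdict
from polyswarmclient.abstractscanner import AbstractScanner, ScanResult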

EICAR = base64.b64decode(b'WDVPIVAlQEFQWzRcUFpYNTQoUF4pN0NDKTd9JEVJQ0FSLVNUQU5EQVJELUFOVElWSVJVUy1URVNULUZJTEUhJEgrSCo=')
HASH = sha256(EICAR).hexdigest()

class Scanner(AbstractScanner):

    async def scan(self, guid, artifact_type, content, metadata, chain):
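        # Record the scanner's OS and CPU architecture in the result metadata
        # (os.uname() is only available on Unix-like systems)
        sysname, _, _, _, machine = os.uname()
        metadata = Verdict().set_scanner(
            operating_system=sysname,
            architecture=machine)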

        testhash = sha256(content).hexdigest()
        if testhash == HASH:
            metadata.set_malware_family('Eicar Test File')
            return ScanResult(bit=True, verdict=True, metadata=metadata.json())

        metadata.set_malware_family('')
        return ScanResult(bit=True, verdict=False, metadata=metadata.json())
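
To see why digest matching misses files with appended whitespace, and one possible way to extend it, here is a minimal sketch using only hashlib. The helper names `matches_known_hash` and `matches_after_stripping` are hypothetical, and treating space, tab, carriage return, and line feed as the whitespace to ignore is an assumption.

```python
from hashlib import sha256

# The EICAR signature string, written out instead of base64-decoded.
EICAR = rb"X5O!P%@AP[4\PZX54(P^)7CC)7}$EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*"
KNOWN_HASH = sha256(EICAR).hexdigest()


def matches_known_hash(content: bytes) -> bool:
    """Exact digest comparison: any extra byte changes the digest."""
    return sha256(content).hexdigest() == KNOWN_HASH


def matches_after_stripping(content: bytes) -> bool:
    """Strip trailing whitespace before hashing.

    Assumption: space, tab, CR, and LF are the bytes to ignore.
    """
    return matches_known_hash(content.rstrip(b" \t\r\n"))
```

With these helpers, `matches_known_hash(EICAR + b"\n")` is False because the trailing newline changes the digest, while `matches_after_stripping(EICAR + b"\n")` is True.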

Our scan() method returns a ScanResult object. Its constructor takes the following parameters, which represent our results:

  1. bit: a boolean representing whether the engine wishes to assert on the artifact
  2. verdict: another boolean representing a malicious or benign determination
  3. confidence: a float representing our confidence in our assertion, ranging from 0.0 to 1.0
  4. metadata: a description of our scan results; the examples above pass the output of metadata.json()

Test Your Participant

Next Steps

Housing all of your participant's scan logic in the scan() function is unlikely to scale. Perhaps you have an existing scan engine or external library that houses the details of your scanning logic and you need to call out to that scan engine for results.

Next, we'll wrap the open source ClamAV antivirus engine into a Microengine / Arbiter.