PolySwarm

Level 0: Scratch to EICAR

Microengine “Hello World”

Overview

The “Hello World” of developing an anti-malware solution is invariably detecting the EICAR test file.

This benign file is detected as “malicious” by all major anti-malware products - a safe way to test a positive result.

Our first Microengine will be no different: let’s detect EICAR!

(Optional) review the components of a Microengine →

Building Blocks

This guide will reference and build on:

  • engine-template: The name says it all - this is a convenient template with interactive prompts for creating new engines. We’ll use this in our tutorial.

  • polyswarm-client: The Swiss Army knife of exemplar PolySwarm participants (“clients”). polyswarm-client can function as a microengine (we’ll build on this functionality in this tutorial), an arbiter and an ambassador (we’ll use these to test what we built).

Customize engine-template

Warning: Windows-based engines are currently only supported as AMIs (AWS Machine Images).

The customization process for Windows-based engines assumes you have an AWS account and its ID handy.

We'll be expanding deployment options in the near future, including self-hosted options. Linux-based engines have no such stipulation.

We’re going to cut our Engine from engine-template. To do this, we’ll need cookiecutter:

pip install cookiecutter

With cookiecutter installed, jump-starting your engine from our template is as easy as:

cookiecutter https://github.com/polyswarm/engine-template

Prompts will appear. Here’s how we’ll answer them:

  • engine_name: MyEicarEngine (the name of your engine)
  • engine_name_slug: (accept the default)
  • project_slug: (accept the default)
  • author_org: ACME (or the real name of your organization)
  • author_org_slug: (accept the default)
  • package_slug: (accept the default)
  • author_name: Wile E Coyote (or your real name)
  • author_email: (your email address)
  • platform: answer truthfully - will this Engine run on Linux or Windows?
  • has_backend: 1 for false (see explanation below)
  • aws_account_for_ami: (Windows only) your AWS account ID (for Linux engines, just accept the default)
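If you would rather script these answers than type them, cookiecutter also exposes a Python entry point. The following is a minimal sketch, not part of engine-template itself. It assumes the template's variable names match the prompt names listed above and that the defaults for platform and has_backend suit your engine; check the generated project before relying on it. The generate_engine helper and the ANSWERS dictionary are names made up for this illustration.

```python
# Answers for the prompts we override; every other prompt keeps its default.
# Assumption: the template's variable names match the prompt names shown above.
ANSWERS = {
    "engine_name": "MyEicarEngine",
    "author_org": "ACME",
    "author_name": "Wile E Coyote",
}


def generate_engine(template="https://github.com/polyswarm/engine-template"):
    """Render the template non-interactively and return the new project path."""
    # Imported lazily so this module still loads if cookiecutter is missing.
    from cookiecutter.main import cookiecutter

    # no_input=True skips the interactive prompts and uses the defaults;
    # extra_context overrides the defaults for the keys given in ANSWERS.
    return cookiecutter(template, no_input=True, extra_context=ANSWERS)
```

Calling generate_engine() needs network access to fetch the template. Any prompt not listed in ANSWERS, such as platform, silently takes the template's default, so add it to the dictionary if the default is not what you want.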

One of the prompt items is has_backend, which can be thought of as "has a disjoint backend" and deserves additional explanation.

When wrapping your scan engine, inheritance of polyswarm-client classes and implementation of class functionality are referred to as "frontend" changes. If your scan engine "frontend" must reach out across a network or local socket to a separate process that does the real scanning work (the "backend"), then you have a disjoint "backend" and you should answer true to has_backend. If instead your scan engine can easily be encapsulated in a single Docker image (Linux) or AMI (Windows), then you should select false for has_backend.

[Figure: example of a disjoint frontend / backend]

[Figure: example of only a frontend (has_backend is false)]

You’re all set!

You should find a microengine-myeicarengine directory in your current working directory - this is what we’ll be editing to implement EICAR scan functionality.

If you want to use PyCharm as your IDE, now is a good time to set up a new PyCharm project for this microengine.

Implement an EICAR Scanner & Microengine

Detecting EICAR is as simple as:

  1. implementing a Scanner class that knows how to identify the EICAR test file
  2. implementing a Microengine class that uses this Scanner class

Let’s get started.

Open microengine-myeicarengine/src/(the org slug name)_myeicarengine/__init__.py.

If you used our cookiecutter engine-template from above, you will have some code in your __init__.py.

We will modify this file to implement both our Scanner and Microengine classes:

  • Scanner: our Scanner class. This class will implement our EICAR-detecting logic in its scan function.

  • Microengine: our Microengine class. This class will wrap the aforementioned Scanner to handle all the necessary tasks of being a Microengine that detects EICAR.

Write EICAR Detection Logic

The EICAR test file is defined as a file that contains only the following string: X5O!P%@AP[4\PZX54(P^)7CC)7}$EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*.

There are, of course, many ways to identify files that match this criterion. The scan function’s content parameter contains the entire content of the artifact in question - this is what you’re matching against.
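To make the matching concrete outside of any PolySwarm class, here is a small standalone sketch of two checks against raw bytes. The function names is_eicar and is_eicar_lenient are made up for this illustration. The lenient variant assumes that trailing whitespace (space, tab, CR, LF, Ctrl-Z) up to a total length of 128 bytes is acceptable, which is our reading of the EICAR definition; treat that rule as an assumption, not something this tutorial requires.

```python
# The 68-byte EICAR test string as raw bytes (rb'' keeps the backslash literal).
EICAR = rb'X5O!P%@AP[4\PZX54(P^)7CC)7}$EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*'


def is_eicar(content: bytes) -> bool:
    """Strict check: the artifact is exactly the EICAR string and nothing else."""
    return content == EICAR


def is_eicar_lenient(content: bytes) -> bool:
    """Also accept trailing whitespace, up to 128 bytes in total (assumed rule)."""
    if len(content) > 128 or not content.startswith(EICAR):
        return False
    # Whatever follows the EICAR string must consist only of allowed whitespace.
    return content[len(EICAR):].strip(b" \t\r\n\x1a") == b""
```

The strict check is what the examples in this guide use. The lenient check is only worth considering if you expect test files saved by editors that append a newline.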

The following are two examples of how you can write your scan() function to detect EICAR. Update the code in your __init__.py file with the changes from one of these examples.

The first way is the simplest design and is used in eicar.py:

import base64
from polyswarmclient.abstractmicroengine import AbstractMicroengine
from polyswarmclient.abstractscanner import AbstractScanner, ScanResult

# Decode the EICAR test string from its base64 form.
EICAR = base64.b64decode(b'WDVPIVAlQEFQWzRcUFpYNTQoUF4pN0NDKTd9JEVJQ0FSLVNUQU5EQVJELUFOVElWSVJVUy1URVNULUZJTEUhJEgrSCo=')

class Scanner(AbstractScanner):

    async def scan(self, guid, content, chain):
        # content holds the artifact's raw bytes; an exact match means EICAR.
        if content == EICAR:
            # bit=True: we make an assertion; verdict=True: malicious.
            return ScanResult(bit=True, verdict=True)

        # We still make an assertion, but with a benign verdict.
        return ScanResult(bit=True, verdict=False)


class Microengine(AbstractMicroengine):
    def __init__(self, client, testing=0, scanner=None, chains=None):
        # Always use our EICAR Scanner, regardless of the scanner argument.
        scanner = Scanner()
        super().__init__(client, testing, scanner, chains)

Here’s another way, this time comparing the SHA-256 of the EICAR test file with a known-bad hash:

import base64

from hashlib import sha256
from polyswarmclient.abstractmicroengine import AbstractMicroengine
from polyswarmclient.abstractscanner import AbstractScanner, ScanResult

EICAR = base64.b64decode(b'WDVPIVAlQEFQWzRcUFpYNTQoUF4pN0NDKTd9JEVJQ0FSLVNUQU5EQVJELUFOVElWSVJVUy1URVNULUZJTEUhJEgrSCo=')
# Known-bad hash: the SHA-256 digest of the EICAR test string.
HASH = sha256(EICAR).hexdigest()

class Scanner(AbstractScanner):

    async def scan(self, guid, content, chain):
        # Hash the artifact and compare it with the known-bad hash.
        testhash = sha256(content).hexdigest()
        if testhash == HASH:
            return ScanResult(bit=True, verdict=True)

        # No match: assert that the artifact is benign.
        return ScanResult(bit=True, verdict=False)


class Microengine(AbstractMicroengine):
    def __init__(self, client, testing=0, scanner=None, chains=None):
        # Always use our EICAR Scanner, regardless of the scanner argument.
        scanner = Scanner()
        super().__init__(client, testing, scanner, chains)
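Before wiring anything into a full test environment, it can help to call scan() directly. The sketch below does that without polyswarm-client installed by defining stand-ins for AbstractScanner and ScanResult; those stand-ins, the run_scan helper, and the placeholder guid and chain values are assumptions made for this illustration and are not the real library classes.

```python
import asyncio
import base64
from collections import namedtuple

# Stand-ins so this runs without polyswarm-client; the real classes are
# imported from polyswarmclient.abstractscanner in the engine itself.
ScanResult = namedtuple("ScanResult", ["bit", "verdict"])


class AbstractScanner:
    pass


EICAR = base64.b64decode(b'WDVPIVAlQEFQWzRcUFpYNTQoUF4pN0NDKTd9JEVJQ0FSLVNUQU5EQVJELUFOVElWSVJVUy1URVNULUZJTEUhJEgrSCo=')


class Scanner(AbstractScanner):
    async def scan(self, guid, content, chain):
        if content == EICAR:
            return ScanResult(bit=True, verdict=True)
        return ScanResult(bit=True, verdict=False)


def run_scan(content):
    """Run one scan to completion and return its ScanResult."""
    # guid and chain are unused by this scanner, so placeholders suffice.
    return asyncio.run(Scanner().scan("test-guid", content, "side"))


print(run_scan(EICAR).verdict)           # verdict for the EICAR string
print(run_scan(b"not malware").verdict)  # verdict for arbitrary other bytes
```

This only checks the detection logic in isolation. It says nothing about bidding or network behavior, which the test guides linked later on this page cover.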

Develop a Staking Strategy

At a minimum, Microengines are responsible for: (a) detecting malicious files and (b) rendering assertions with an NCT bid.

Bidding logic is implemented in the Microengine’s bid function.

By default, all assertions are placed with the minimum stake permitted by the community a Microengine is joined to.

Microengines can customize the default behavior by:

  1. Using the default logic, which weights your bid based on the confidence rating for each artifact between the min_bid and max_bid properties of the Microengine object
  2. Overriding the bid function to perform arbitrary logic
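As an illustration of confidence-weighted bidding, the sketch below scales a bid linearly between a minimum and a maximum. It is a standalone function, not the polyswarm-client implementation: the name weighted_bid, the linear formula, and the assumption that confidence lies between 0 and 1 are all ours, and the signature of the real bid function may differ.

```python
def weighted_bid(confidence: float, min_bid: int, max_bid: int) -> int:
    """Scale a bid linearly from min_bid to max_bid by confidence in [0, 1]."""
    # Clamp out-of-range confidence values instead of raising an error.
    confidence = max(0.0, min(1.0, confidence))
    # Bids are kept as integers, so the weighted part is truncated.
    return min_bid + int((max_bid - min_bid) * confidence)
```

With min_bid=100 and max_bid=1100, a confidence of 0.5 yields a bid of 600, while anything at or above 1.0 yields the full 1100.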

Check back soon for an exploration of various bidding strategies.

Finalizing & Testing Your Engine

cookiecutter can only customize engine-template so far - there are a handful of items you’ll need to fill out yourself. We’ve already covered the major items above, but you’ll want to do a quick search for CUSTOMIZE_HERE to ensure all customizations have been made.
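Any text search tool will do for this. If you prefer to stay in Python, here is one possible sketch of a helper that lists every remaining marker; the function name find_markers is made up for this illustration.

```python
from pathlib import Path


def find_markers(root, marker="CUSTOMIZE_HERE"):
    """Return (path, line_number, line) for each line under root with marker."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for number, line in enumerate(text.splitlines(), start=1):
            if marker in line:
                hits.append((str(path), number, line.strip()))
    return hits
```

For example, find_markers("microengine-myeicarengine") returns one tuple per remaining marker, and an empty list means nothing is left to customize.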

Once everything is in place, let’s test our engine:

Test Linux-based Engines →

Test Windows-based Engines →

Next Steps

Implementing scan logic directly in the Scanner class is difficult to manage and scale. Instead, you’ll likely want your Microengine class to call out to an external binary or service that holds the actual scan logic.
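To give a feel for what calling out to an external binary can look like, here is a minimal asyncio sketch. The "binary" is a stand-in: a one-line Python script that exits with status 1 when its input contains the bytes EICAR. The names FAKE_SCANNER_CMD and scan_with_external_binary, and the exit-code convention, are assumptions for this illustration; a real scanner will have its own command line and way of reporting results.

```python
import asyncio
import sys

# Stand-in for a real scanner binary: exits with status 1 when its stdin
# contains the bytes b"EICAR", and with status 0 otherwise.
FAKE_SCANNER_CMD = [
    sys.executable,
    "-c",
    "import sys; sys.exit(1 if b'EICAR' in sys.stdin.buffer.read() else 0)",
]


async def scan_with_external_binary(content: bytes, cmd=FAKE_SCANNER_CMD) -> bool:
    """Pipe content to an external process; True means it flagged the content."""
    proc = await asyncio.create_subprocess_exec(
        *cmd,
        stdin=asyncio.subprocess.PIPE,
        stdout=asyncio.subprocess.DEVNULL,
        stderr=asyncio.subprocess.DEVNULL,
    )
    # communicate() writes the content, closes stdin, and waits for exit.
    await proc.communicate(input=content)
    return proc.returncode == 1
```

Inside an async scan function, the boolean returned by `await scan_with_external_binary(content)` could then be used as the verdict. Spawning a process per artifact is simple but slow, which is one reason a long-running backend service is often preferred.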

Next, we’ll wrap ClamAV into a Microengine →