ClamAV is an open source signature-based engine with a daemon that provides quick analysis of artifacts that it recognizes.
This tutorial will step you through building your second PolySwarm Microengine by means of incorporating ClamAV as an analysis backend.
The PolySwarm marketplace will be a source of previously unseen malware.
Relying on a strictly signature-based engine as your analysis backend, particularly one whose signatures everyone can access (e.g. ClamAV), is unlikely to yield unique insight into "swarmed" artifacts and is therefore unlikely to outperform other engines.
This guide should not be taken as a recommendation for how to approach the marketplace but rather an example of how to incorporate an existing analysis backend into a Microengine skeleton.
This tutorial will walk the reader through building microengine/clamav.py; please refer to clamav.py for the completed work.
clamd Implementation and Integration
Start with a fresh engine-template and give it the engine-name “MyClamAvEngine”.
You should find a microengine-myclamavengine directory in your current working directory. This is what we’ll be editing to implement ClamAV scan functionality.
Edit the __init__.py as we describe below:
We begin our ClamAV analysis backend by importing the clamd module and configuring some globals.
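As a rough sketch of what such globals might look like, the snippet below reads the clamd connection settings from the environment. The variable names `CLAMD_HOST`, `CLAMD_PORT`, and `CLAMD_TIMEOUT`, and the 30-second timeout, are assumptions made for illustration; only the port default (3310, clamd's standard TCP port) comes from ClamAV itself.

```python
import os

try:
    import clamd  # the analysis backend's client library
except ImportError:
    clamd = None  # lets this sketch run even where the package is not installed

# Assumed names and defaults for illustration; adjust to your deployment.
CLAMD_HOST = os.getenv("CLAMD_HOST", "localhost")
CLAMD_PORT = int(os.getenv("CLAMD_PORT", "3310"))  # 3310 is clamd's default TCP port
CLAMD_TIMEOUT = 30.0  # seconds to wait on the daemon before giving up
```

Reading the host and port from the environment keeps the engine code unchanged when clamd runs in a separate container or on another machine.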
If clamd detects a piece of malware, it puts FOUND in result.
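To make the check concrete, here is a minimal sketch that inspects a clamd-style result. It assumes the dictionary shape that the upstream python-clamd package returns from a stream scan, for example `{'stream': ('FOUND', 'Eicar-Test-Signature')}`; the helper name `parse_clamd_result` is hypothetical and not part of any library.

```python
def parse_clamd_result(result):
    """Return (detected, signature_name) for a clamd-style stream result.

    Assumes `result` looks like {'stream': (status, signature)}.
    """
    stream_result = result.get("stream") or ()
    if len(stream_result) >= 2 and stream_result[0] == "FOUND":
        return True, stream_result[1]
    # Anything else (e.g. 'OK', or a missing key) is treated as no detection.
    return False, None


print(parse_clamd_result({"stream": ("FOUND", "Eicar-Test-Signature")}))
print(parse_clamd_result({"stream": ("OK", None)}))
```

Checking the length before indexing keeps the helper from raising on an empty or malformed response.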
Our scan method returns a ScanResult object, whose constructor takes the following parameters representing our results:
bit: a boolean representing a malicious or benign determination
verdict: another boolean representing whether the engine wishes to assert on the artifact
confidence: a float representing our confidence in our assertion, ranging from 0.0 to 1.0
metadata: (optional) string describing the artifact
We leave including ClamAV’s metadata as an exercise to the reader - or check clamav.py :)
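For readers who want a starting point, the sketch below shows how the four parameters fit together for a detection. `ScanResultSketch` is a hypothetical stand-in written for this example, not the real ScanResult class; its defaults and the range check on `confidence` are assumptions.

```python
from dataclasses import dataclass


@dataclass
class ScanResultSketch:
    """Hypothetical stand-in mirroring the four parameters listed above."""
    bit: bool = False
    verdict: bool = False
    confidence: float = 1.0
    metadata: str = ""

    def __post_init__(self):
        # The text gives the valid range for confidence as 0.0 to 1.0.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0.0 and 1.0")


# A detection: both booleans are set, and the signature name rides along as metadata.
detection = ScanResultSketch(
    bit=True,
    verdict=True,
    confidence=1.0,
    metadata="Eicar-Test-Signature",
)
print(detection)
```

Passing the signature name as `metadata` is one possible way to approach the exercise; clamav.py remains the reference.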
The Microengine class is required, but we do not need to modify it, so it is not shown here.
Python 3's Asyncio - It is important that any external calls you make during a scan do not block the event loop.
We forked the clamd project to add support for Python 3's asyncio.
Thus, for this example to run, you need to install our python-clamd project, which provides the clamd package, until our changes are merged upstream.
The command you need is: `pip install git+https://github.com/polyswarm/python-clamd.git@async#egg=clamd`.
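To illustrate why blocking matters, here is a small sketch of a general alternative for backends that offer only a synchronous client: hand the blocking call to a thread pool so the event loop stays free. This is not how our clamd fork works (it adds native asyncio support), and `blocking_scan` is a hypothetical stand-in for a real backend call.

```python
import asyncio
import time


def blocking_scan(content: bytes) -> str:
    """Hypothetical stand-in for a synchronous backend call."""
    time.sleep(0.1)  # simulates waiting on the backend
    return "FOUND" if b"EICAR" in content else "OK"


async def scan(content: bytes) -> str:
    loop = asyncio.get_running_loop()
    # Run the blocking call in the default thread pool so other scans can proceed.
    return await loop.run_in_executor(None, blocking_scan, content)


async def main():
    # The two scans overlap instead of running back to back.
    return await asyncio.gather(scan(b"EICAR test"), scan(b"hello"))


results = asyncio.run(main())
print(results)
```

Calling `blocking_scan` directly inside `scan` would stall every other coroutine for the duration of the call, which is the situation the note above warns against.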
Finalizing & Testing Your Engine
cookiecutter customizes engine-template only so far - there are a handful of items you’ll need to fill out yourself.
We’ve already covered the major items above, but you’ll want to do a quick search for CUSTOMIZE_HERE to ensure all customizations have been made.
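Any text search tool will do for this. If you would rather stay in Python, the sketch below walks the engine directory and lists each remaining marker; the helper name `find_markers` is hypothetical, and the directory name assumes the engine-name chosen earlier.

```python
from pathlib import Path


def find_markers(root, marker="CUSTOMIZE_HERE"):
    """Yield (path, line_number, line) for every line containing the marker."""
    root = Path(root)
    if not root.is_dir():
        return
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for number, line in enumerate(text.splitlines(), start=1):
            if marker in line:
                yield path, number, line.strip()


if __name__ == "__main__":
    for path, number, line in find_markers("microengine-myclamavengine"):
        print(f"{path}:{number}: {line}")
```

An empty listing suggests every marker has been handled.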
Once everything is in place, let’s test our engine: