Optimizing Your Engine's Accuracy and Reliability

Reliability

Operating an Engine on a public community comes with a lot of responsibility. Engines on public communities should be able to respond to the majority of bounties sent to them. Bounties are sent only to Engines that claim to support the Bounty's artifact type and attributes. An Engine that cannot reliably respond will be put into the Failed state. Failed Engines do not receive bounties and must go through the verification process again.

Downtime is often unavoidable, of course, and is acceptable for short infrequent periods. The problem we want to avoid is Engines with frequent outages.

Engines can avoid many of the pitfalls that might cause them to be erroneously marked as Failed. Most importantly, Engines should respond to every bounty sent to them, even if the artifact cannot be scanned. We added unknown as a valid assertion response to account for artifacts that an Engine cannot process, and for failure conditions during processing.

All Engines should assert unknown in the event of recoverable errors, or in any other situation where the artifact cannot be processed, such as running out of time or receiving an unsupported artifact. When asserting unknown, set the bid value to 0.
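The fallback described above can be sketched as a small wrapper around the scanner. This is a minimal illustration, not PolySwarm's engine code: the Assertion class, the assert_or_unknown function, and the scan callable are hypothetical names introduced here, and the sketch assumes the scanner signals timeouts and other failures by raising an exception.

```python
from dataclasses import dataclass
from typing import Callable

# Verdicts a scanner may return besides "unknown".
VERDICTS = {"malicious", "benign", "suspicious"}


@dataclass(frozen=True)
class Assertion:
    verdict: str  # "malicious", "benign", "suspicious", or "unknown"
    bid: int


def assert_or_unknown(scan: Callable[[bytes], str], artifact: bytes, bid: int) -> Assertion:
    """Always produce a response; fall back to unknown with a bid of 0."""
    try:
        verdict = scan(artifact)
    except Exception:
        # Timeouts, unsupported artifacts, and other processing failures
        # (assumed to surface as exceptions) become an unknown assertion.
        return Assertion("unknown", 0)
    if verdict not in VERDICTS:
        # Anything the scanner could not decide is also reported as unknown.
        return Assertion("unknown", 0)
    return Assertion(verdict, bid)
```

With this shape, every bounty gets a response, and the bid is only placed when the scanner actually produced a verdict.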

Be aware, Engines that respond to a high percentage of bounties with an unknown assertion will be marked as Failed. It is an indication that the Engine is either not operating correctly or does not have the correct configuration settings in the PolySwarm UI.

Accuracy

PolySwarm wants to encourage and assist Engine owners in monitoring accuracy and operating a continuous improvement process. To support those goals, PolySwarm provides an API for Engine owners to obtain the list of artifacts their Engine processed. For each artifact, the list includes the Engine's assertion, the ground truth result, and the sha256 hash that can be used to download the artifact.

Access to accuracy data about an Engine is only available to the Engine's owner.

Setup

We use the PolySwarm CLI for these examples, but Engine owners can also use the PolySwarm API library, or any other tool that can communicate with PolySwarm's APIs. The functions in the PolySwarm API library for managing assertions and votes jobs are easy to use.

If you want to see JSON output from the CLI, you can use the --fmt=json or --fmt=pretty-json arguments.

To get started, you need 2 pieces of information:

API key - this must be an API key for the User Account or Team Account that owns the Engine
Engine ID - you can get this by going to the Engines page, clicking on the "My Engines" tab and selecting your Engine's name to view the details.

With that, you are ready to go. Let's first look at the commands available to us:

$ polyswarm engine
Usage: polyswarm engine [OPTIONS] COMMAND [ARGS]...

Options:
  -h, --help  Show this message and exit.

Commands:
  assertions  Interact with engine's assertions.
  votes       Interact with engine's votes.

Arbiter owners would use the votes commands and Microengine owners would use the assertions commands.

Let's work through the process assuming we are a Microengine. What does the assertions command allow us to do?

$ polyswarm engine assertions
Usage: polyswarm engine assertions [OPTIONS] COMMAND [ARGS]...

Options:
  -h, --help  Show this message and exit.

Commands:
  create  Create a new bundle with the consolidated assertions data.
  delete  Delete an assertions bundle.
  get     Get an assertions bundle.
  list    List all assertions bundles for the given engine.

Create Bundle

The first step is to create an "assertions bundle". You create a job that collects the data for your Engine within a given timestamp range. When starting the job, you define the ENGINE_ID, START_DATE, and END_DATE. The range includes the start timestamp and excludes the end timestamp: data stamped exactly at START_DATE is in the results, while data stamped exactly at END_DATE is not. If you omit the hh:mm:ss.usec portion, midnight on that date is used. The collected data is stored in an "assertions bundle", a .csv file that you can download.

$ polyswarm engine assertions create <ENGINE_ID> 2021-07-01 2021-07-08
============================= Assertions Job =============================
Assertions Job id: 16848288564354189
Engine id: <ENGINE_ID>
Created at: 2021-07-28 18:19:03.326115
Start date: 2021-07-01 00:00:00
End date: 2021-07-08 00:00:00

The job you create will be scheduled to run as soon as possible. Best practice is to check back periodically to see if it has completed. Checking once per minute is usually sufficient.

If you automate the creation of bundles, the shortest timeframe each bundle should cover is 1 hour, which means creating no more than one bundle per hour.
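For automation, a sketch like the following could generate the one-hour ranges to pass as START_DATE and END_DATE. The function name hourly_windows is hypothetical, and converting the datetime values into the exact timestamp text the CLI accepts is left out because that format is not shown here. Since the end timestamp is excluded from each job, adjacent windows can share a boundary without counting any record twice.

```python
from datetime import datetime, timedelta
from typing import Iterator, Tuple


def hourly_windows(start: datetime, end: datetime) -> Iterator[Tuple[datetime, datetime]]:
    """Yield consecutive one-hour (start, end) pairs that fit inside [start, end)."""
    step = timedelta(hours=1)
    current = start
    # Only emit full hours; a trailing partial hour is left for a later run.
    while current + step <= end:
        yield current, current + step
        current += step


if __name__ == "__main__":
    for window_start, window_end in hourly_windows(
        datetime(2021, 7, 1, 0, 0), datetime(2021, 7, 1, 3, 0)
    ):
        print(window_start, "->", window_end)
```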

Get or List Bundles

You can use either the get or list operations to determine whether the bundle is complete and available for download. Using get will get the status of a specific bundle by Assertions Job Id, while list will show you all bundles that you've created. If the status details for an Assertions Job include a pre-signed download URL, then it is ready to be downloaded. If there is not a Download URL, then you should check again later.

$ polyswarm engine assertions get 16848288564354189
============================= Assertions Job =============================
Assertions Job id: 16848288564354189
Engine id: <ENGINE_ID>
Created at: 2021-07-28 18:19:03.326115
Start date: 2021-07-01 00:00:00
End date: 2021-07-08 00:00:00

This first attempt shows no download URL, so the bundle is not ready yet. Wait a moment and try again.

$ polyswarm engine assertions get 16848288564354189
============================= Assertions Job =============================
Assertions Job id: 16848288564354189
Engine id: <ENGINE_ID>
Created at: 2021-07-28 18:19:03.326115
Start date: 2021-07-01 00:00:00
End date: 2021-07-08 00:00:00
Download: https://<REALLY LONG URL HERE>
True Positive: 7
True Negative: 0
False Positive: 0
False Negative: 0
Suspicious: 0
Unknown: 0
Total: 7

Here we go, this second attempt includes the download URL and summary stats, so you can download your results.

The pre-signed download URL is recreated each time you perform a GET or LIST operation.

The download URL is valid for 300 seconds.
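The check-and-retry loop above could be automated along these lines. This is a sketch under assumptions: it assumes that get with --fmt=json prints a single JSON object, and that the download URL appears under the storage_path key (the key shown in the delete output on this page) and is absent or empty until the bundle is ready. The function names fetch_job and wait_for_bundle are hypothetical.

```python
import json
import subprocess
import time
from typing import Callable


def fetch_job(job_id: str) -> dict:
    """Run the CLI get command with JSON output and parse the result."""
    completed = subprocess.run(
        ["polyswarm", "--fmt=json", "engine", "assertions", "get", job_id],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(completed.stdout)


def wait_for_bundle(
    job_id: str,
    fetch: Callable[[str], dict] = fetch_job,
    interval_s: float = 60.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the job reports a download URL, then return that URL."""
    for attempt in range(max_attempts):
        job = fetch(job_id)
        # Assumption: the pre-signed URL is exposed as "storage_path".
        url = job.get("storage_path")
        if url:
            return url
        if attempt < max_attempts - 1:
            sleep(interval_s)
    raise TimeoutError(f"assertions job {job_id} not ready after {max_attempts} checks")
```

Because the URL is only valid for 300 seconds, it makes sense to start the download right after wait_for_bundle returns rather than storing the URL for later.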

Before we download the file, let's review the data displayed in the job status details.

True Positive: your result matched the ground truth result of "malicious"
True Negative: your result matched the ground truth result of "benign"
False Positive: your result was "malicious", but the ground truth result was "benign"
False Negative: your result was "benign", but the ground truth result was "malicious"
Suspicious: this is simply a count of the number of times your engine returned "suspicious" as a result. You can ignore this when doing FP/FN analysis.
Unknown: this is simply a count of the number of times your engine returned "unknown" as a result. You can ignore this when doing FP/FN analysis.
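As an illustration of how these counts can be turned into rates, the sketch below uses the field names from the CLI's JSON output (true_positive, false_positive, and so on). The function name summarize and the choice of metrics are assumptions for this example, not a PolySwarm-defined scoring method. Suspicious and unknown results are left out of the accuracy figure, and the unknown share is reported separately because a high share of unknown assertions can lead to the Failed state.

```python
from typing import Optional


def _ratio(numerator: int, denominator: int) -> Optional[float]:
    """Return numerator/denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def summarize(job: dict) -> dict:
    """Derive simple rates from an assertions job summary."""
    tp = job["true_positive"]
    tn = job["true_negative"]
    fp = job["false_positive"]
    fn = job["false_negative"]
    return {
        # Share of malicious/benign assertions that matched ground truth.
        "accuracy": _ratio(tp + tn, tp + tn + fp + fn),
        # Share of ground-truth benign artifacts asserted as malicious.
        "false_positive_rate": _ratio(fp, fp + tn),
        # Share of ground-truth malicious artifacts asserted as benign.
        "false_negative_rate": _ratio(fn, fn + tp),
        # Share of all responses that were unknown.
        "unknown_rate": _ratio(job["unknown"], job["total"]),
    }


if __name__ == "__main__":
    # Counts taken from the example job status on this page.
    example = {
        "true_positive": 7, "true_negative": 0,
        "false_positive": 0, "false_negative": 0,
        "suspicious": 0, "unknown": 0, "total": 7,
    }
    print(summarize(example))
```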

"Suspicious" and "Unknown" are only available for Engines operating in the new Webhook architecture.

Download Bundle

The next step is to download that bundle. You can use any browser or web request tool to download the bundle. The file you download will be named assertions.csv for Assertions jobs and votes.csv for Vote jobs.
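If you prefer to script the download, a minimal sketch using only the Python standard library might look like this. The function name download_bundle and the default timeout are assumptions; the URL is whatever pre-signed download URL the job status currently shows.

```python
import shutil
import urllib.request
from pathlib import Path


def download_bundle(url: str, dest: Path, timeout_s: float = 60.0) -> Path:
    """Stream the bundle at url into dest and return the destination path."""
    with urllib.request.urlopen(url, timeout=timeout_s) as response, open(dest, "wb") as out:
        # Copy in chunks so large bundles are not held in memory at once.
        shutil.copyfileobj(response, out)
    return dest
```

A call such as download_bundle(url, Path("assertions.csv")) would then save the file under the name used for Assertions jobs.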

The content of the .csv file will be similar to the following:

created,artifact_id,sha256,assertion_value,ground_truth_value
2021-07-01 16:07:16.129026,19131110281273461,275a021bbfb6489e54d471899f7db9d1663fc695ec2fe2a2c4538aabf651fd0f,malicious,malicious
2021-07-02 13:53:11.534493,51343509684868491,275a021bbfb6489e54d471899f7db9d1663fc695ec2fe2a2c4538aabf651fd0f,malicious,malicious
2021-07-02 14:13:32.866444,26799056656873371,275a021bbfb6489e54d471899f7db9d1663fc695ec2fe2a2c4538aabf651fd0f,malicious,malicious
2021-07-02 14:26:12.673761,4875488612233743,275a021bbfb6489e54d471899f7db9d1663fc695ec2fe2a2c4538aabf651fd0f,malicious,malicious
2021-07-02 15:14:48.598094,75781614615834141,275a021bbfb6489e54d471899f7db9d1663fc695ec2fe2a2c4538aabf651fd0f,malicious,malicious
2021-07-02 15:45:46.880895,6306841940656975,275a021bbfb6489e54d471899f7db9d1663fc695ec2fe2a2c4538aabf651fd0f,malicious,malicious
2021-07-06 16:05:38.321635,7987631822112379,275a021bbfb6489e54d471899f7db9d1663fc695ec2fe2a2c4538aabf651fd0f,malicious,malicious

The meaning of each value is as follows:

created: date and time the assertion was made
artifact_id: unique reference to the artifact that was scanned
sha256: hash of the artifact that was scanned
assertion_value: Assertion returned by the engine for this artifact; one of malicious, benign, suspicious, or unknown
ground_truth_value: Ground Truth determination; one of malicious, benign, suspicious, or unknown

Analysis

With this data, you can do several things:

Compute accuracy rate of your Engine
Find all False Positive or False Negative assertions made by your Engine
Use the artifact_id or sha256 to download the artifact from PolySwarm.
Count the number of artifacts scanned by your Engine
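Several of these tasks can be sketched with Python's csv module. The column names below come from the bundle header shown earlier; the function names classify and analyze and the "other" bucket for suspicious or unknown values are choices made for this example only.

```python
import csv
import io
from collections import Counter
from typing import List, Tuple


def classify(assertion: str, truth: str) -> str:
    """Label one row as TP, TN, FP, FN, or other."""
    labels = {
        ("malicious", "malicious"): "TP",
        ("benign", "benign"): "TN",
        ("malicious", "benign"): "FP",
        ("benign", "malicious"): "FN",
    }
    # Rows involving suspicious or unknown fall outside FP/FN analysis.
    return labels.get((assertion, truth), "other")


def analyze(csv_text: str) -> Tuple[Counter, List[Tuple[str, str]]]:
    """Return per-label counts and the (label, sha256) of every FP or FN row."""
    counts: Counter = Counter()
    mismatches: List[Tuple[str, str]] = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        label = classify(row["assertion_value"], row["ground_truth_value"])
        counts[label] += 1
        if label in ("FP", "FN"):
            mismatches.append((label, row["sha256"]))
    return counts, mismatches


if __name__ == "__main__":
    with open("assertions.csv", newline="") as handle:
        counts, mismatches = analyze(handle.read())
    print("artifacts scanned:", sum(counts.values()))
    print("counts:", dict(counts))
    for label, sha256 in mismatches:
        print(label, sha256)
```

The sha256 values collected for FP and FN rows are the ones you would use to download the artifacts for closer inspection.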

Bundle Clean up

When you are done downloading and analyzing a bundle, you can delete the assertions job.

$ polyswarm --fmt=pretty-json engine assertions delete 16848288564354189
{
    "created": "2021-07-28T18:19:03.326115",
    "date_end": "2021-07-08T00:00:00",
    "date_start": "2021-07-01T00:00:00",
    "engine_id": "<ENGINE_ID>",
    "false_negative": 0,
    "false_positive": 0,
    "id": "16848288564354189",
    "storage_path": "https://<REALLY LONG URL>",
    "suspicious": 0,
    "total": 7,
    "true_negative": 0,
    "true_positive": 7,
    "unknown": 0
}

Deleting the job returns the same summary as a get, so you can still download the .csv file until the URL expires.

PolySwarm Account Usage Limits

Using this process, the Bundle/Job management operations do not count against your usage limits. You will only require available usage to perform downloads.

Summary

As you saw above, the data you need to perform analysis of your Engine's performance is easily available and in formats that are easy to parse/process.

Next Steps

With your accuracy and reliability improved, next you will want to improve your bidding strategy.