Optimizing Your Engine's Accuracy and Reliability
Reliability
Operating an Engine on a public community comes with a lot of responsibility.
Engines on public communities should be able to respond to the majority of bounties sent to them.
Bounties are sent only to Engines claiming to support the Bounty's artifact type and attributes.
An Engine that cannot reliably respond will be put into the Failed state.
Failed Engines do not receive bounties and must go through the verification process again.
Downtime is often unavoidable, of course, and is acceptable for short infrequent periods. The problem we want to avoid is Engines with frequent outages.
Engines can avoid many of the pitfalls that might cause them to be marked as Failed erroneously.
Most importantly, Engines should respond to all bounties sent to them, even if the artifact cannot be scanned.
We added unknown as a valid assertion response to account for artifacts that an Engine cannot process, or for failure conditions during processing.
All Engines should use unknown in the event of recoverable errors, and in any other situation where the artifact cannot be processed, such as running out of time or receiving an unsupported artifact.
When asserting unknown, set the bid value to 0.
Be aware that Engines that respond to a high percentage of bounties with an unknown assertion will be marked as Failed. It is an indication that the Engine is either not operating correctly or does not have the correct configuration settings in the PolySwarm UI.
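The fallback behavior described above can be sketched in Python as follows. This is only an illustration: the Assertion class, the assert_with_fallback function, and the scan callable are hypothetical names invented for this example and are not part of any PolySwarm library. The sketch also assumes that every verdict other than unknown carries the Engine's normal bid.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Assertion:
    """Hypothetical response shape for this sketch; not a PolySwarm API type."""
    verdict: str  # "malicious", "benign", "suspicious", or "unknown"
    bid: int


def assert_with_fallback(
    scan: Callable[[bytes], str],
    artifact: bytes,
    normal_bid: int,
) -> Assertion:
    """Always return an assertion, falling back to unknown with a bid of 0."""
    try:
        verdict = scan(artifact)
    except Exception:
        # Recoverable error, timeout, or unsupported artifact: still respond.
        return Assertion(verdict="unknown", bid=0)
    if verdict not in ("malicious", "benign", "suspicious"):
        # Any result the scanner could not decide is reported as unknown.
        return Assertion(verdict="unknown", bid=0)
    return Assertion(verdict=verdict, bid=normal_bid)
```

The point of the design is that no code path leaves a bounty unanswered: a failure inside the scanner becomes an unknown assertion with a bid of 0 instead of a missing response.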
Accuracy
PolySwarm wants to encourage and assist Engine owners in monitoring accuracy and operating a continuous improvement process. To support those goals, PolySwarm provides an API for Engine owners to obtain the list of artifacts their Engine processed. For each artifact, the list includes the Engine's assertion, the ground truth result, and the sha256 hash that can be used to download the artifact.
Access to accuracy data about an Engine is only available to the Engine's owner.
Setup
We use the PolySwarm CLI for these examples, but Engine owners can also use the PolySwarm API library, or any other tool that supports communication with PolySwarm's APIs. The functions in the PolySwarm API library to manage assertions and votes jobs are easy to use.
If you want to see JSON output from the CLI, you can use the --fmt=json or --fmt=pretty-json arguments.
To get started, you need 2 pieces of information:
- API key - this must be an API key for the User Account or Team Account that owns the Engine
- Engine ID - you can get this by going to the Engines page, clicking on the "My Engines" tab and selecting your Engine's name to view the details.
With that, you are ready to go. Let's first look at the commands available to us:
$ polyswarm engine
Usage: polyswarm engine [OPTIONS] COMMAND [ARGS]...
Options:
-h, --help Show this message and exit.
Commands:
assertions Interact with engine's assertions.
votes Interact with engine's votes.
Arbiter owners would use the votes commands and Microengine owners would use the assertions commands.
Let's work through the process assuming we are a Microengine.
What does the assertions command allow us to do?
$ polyswarm engine assertions
Usage: polyswarm engine assertions [OPTIONS] COMMAND [ARGS]...
Options:
-h, --help Show this message and exit.
Commands:
create Create a new bundle with the consolidated assertions data.
delete Delete an assertions bundle.
get Get an assertions bundle.
list List all assertions bundles for the given engine.
Create Bundle
The first step is to create an "assertions bundle".
You create a job to collect the data for your Engine within a given timestamp range.
When starting the job, you define the ENGINE_ID, START_DATE, and END_DATE.
Data at the start timestamp is included in the results; data at the end timestamp is not.
If you omit the hh:mm:ss.usec portion, midnight on that date is used.
The collected data will be stored in an "assertions bundle", a .csv file that you can download.
$ polyswarm engine assertions create <ENGINE_ID> 2021-07-01 2021-07-08
============================= Assertions Job =============================
Assertions Job id: 16848288564354189
Engine id: <ENGINE_ID>
Created at: 2021-07-28 18:19:03.326115
Start date: 2021-07-01 00:00:00
End date: 2021-07-08 00:00:00
The job you create will be scheduled to run as soon as possible. Best practice is to check back periodically to see if it has completed. Checking once per minute is usually sufficient.
If you automate the creation of bundles, the shortest timeframe you should process is 1 hour, which means creating at most one bundle per hour.
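Because the start timestamp is included and the end timestamp is excluded, consecutive bundles can share a boundary without counting any assertion twice. The following Python sketch generates such one-hour windows for an automated job. The hourly_windows function is a hypothetical helper written for this example, and the exact timestamp format the CLI accepts should be checked against its help output.

```python
from datetime import datetime, timedelta
from typing import Iterator, Tuple


def hourly_windows(
    start: datetime, end: datetime
) -> Iterator[Tuple[datetime, datetime]]:
    """Yield consecutive (window_start, window_end) pairs of one hour each.

    Each window is treated as half-open: its start is included and its end
    is excluded, so the end of one window can be the start of the next.
    Only full hours are produced; a trailing partial hour is skipped.
    """
    step = timedelta(hours=1)
    cursor = start
    while cursor + step <= end:
        yield cursor, cursor + step
        cursor += step


if __name__ == "__main__":
    # Print the start and end of each window for a three-hour span.
    for begin, finish in hourly_windows(
        datetime(2021, 7, 1), datetime(2021, 7, 1, 3)
    ):
        print(begin.isoformat(sep=" "), "->", finish.isoformat(sep=" "))
```

Each pair could then be passed as the START_DATE and END_DATE of one create command.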
Get or List Bundles
You can use either the get or list operations to determine whether the bundle is complete and available for download.
Using get will get the status of a specific bundle by Assertions Job Id, while list will show you all bundles that you've created.
If the status details for an Assertions Job include a pre-signed download URL, then it is ready to be downloaded.
If there is no Download URL, then you should check again later.
$ polyswarm engine assertions get 16848288564354189
============================= Assertions Job =============================
Assertions Job id: 16848288564354189
Engine id: <ENGINE_ID>
Created at: 2021-07-28 18:19:03.326115
Start date: 2021-07-01 00:00:00
End date: 2021-07-08 00:00:00
In the first attempt, we can see that the job bundle was not ready for download, so wait a moment and try again...
$ polyswarm engine assertions get 16848288564354189
============================= Assertions Job =============================
Assertions Job id: 16848288564354189
Engine id: <ENGINE_ID>
Created at: 2021-07-28 18:19:03.326115
Start date: 2021-07-01 00:00:00
End date: 2021-07-08 00:00:00
Download: https://<REALLY LONG URL HERE>
True Positive: 7
True Negative: 0
False Positive: 0
False Negative: 0
Suspicious: 0
Unknown: 0
Total: 7
Here we go, this second attempt includes the download URL and summary stats, so you can download your results.
The pre-signed download URL is recreated each time you perform a GET or LIST operation.
The download URL is valid for 300 seconds.
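The check-and-retry loop can be automated. The Python sketch below polls once per minute until a download URL appears. It rests on two assumptions that are not confirmed by this page: that get with --fmt=json prints a single JSON object, and that the object carries the URL under the storage_path key, as the pretty-json output of the delete command in this document does. The function names are hypothetical.

```python
import json
import subprocess
import time
from typing import Callable, Optional


def fetch_job(job_id: str) -> dict:
    """Run the CLI get command and parse its output as JSON (assumed shape)."""
    result = subprocess.run(
        ["polyswarm", "--fmt=json", "engine", "assertions", "get", job_id],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)


def wait_for_download_url(
    job_id: str,
    fetch: Callable[[str], dict] = fetch_job,
    interval: float = 60.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Poll the job until it reports a download URL, or give up."""
    for attempt in range(max_attempts):
        job = fetch(job_id)
        # Assumed key; treated as "not ready" when missing or empty.
        url = job.get("storage_path")
        if url:
            return url
        if attempt < max_attempts - 1:
            sleep(interval)
    return None
```

Since the URL is only valid for 300 seconds, it makes sense to start the download immediately after this function returns, or to call get again for a fresh URL.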
Before we download the file, let's review the data displayed in the job status details.
- True Positive: your result matched the ground truth result of "malicious"
- True Negative: your result matched the ground truth result of "benign"
- False Positive: your result was "malicious", but the ground truth result was "benign"
- False Negative: your result was "benign", but the ground truth result was "malicious"
- Suspicious: this is simply a count of the number of times your engine returned "suspicious" as a result. You can ignore this when doing FP/FN analysis.
- Unknown: this is simply a count of the number of times your engine returned "unknown" as a result. You can ignore this when doing FP/FN analysis.
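The counts defined above can be combined into a few rates. The Python sketch below is one possible way to do this; the summarize function is hypothetical, and it assumes the job summary is available as a dictionary using the key names from the JSON output shown later in this document (true_positive, false_negative, and so on). Suspicious and unknown results are left out of the accuracy figure, in line with the advice to ignore them for FP/FN analysis.

```python
from typing import Dict, Optional


def summarize(job: dict) -> Dict[str, Optional[float]]:
    """Derive rates from a job summary; None means the rate is undefined."""
    tp = job["true_positive"]
    tn = job["true_negative"]
    fp = job["false_positive"]
    fn = job["false_negative"]
    total = job["total"]
    decided = tp + tn + fp + fn  # excludes suspicious and unknown results

    return {
        # Share of decided results that matched the ground truth.
        "accuracy": (tp + tn) / decided if decided else None,
        # Share of benign artifacts that were flagged as malicious.
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else None,
        # Share of malicious artifacts that were reported as benign.
        "false_negative_rate": fn / (fn + tp) if (fn + tp) else None,
        # Share of all responses that were unknown.
        "unknown_rate": job["unknown"] / total if total else None,
    }
```

The unknown rate is worth tracking alongside accuracy, because a high percentage of unknown assertions can lead to an Engine being marked as Failed.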
"Suspicious" and "Unknown" are only available for Engines operating in the new Webhook architecture.
Download Bundle
The next step is to download that bundle.
You can use any browser or web request tool to download the bundle.
The file you download will be named assertions.csv for Assertions jobs and votes.csv for Vote jobs.
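As one example of a web request tool, the Python standard library is enough to fetch the bundle. This is a minimal sketch; the download_bundle function and its parameters are hypothetical names chosen for this example.

```python
import shutil
import urllib.request
from pathlib import Path


def download_bundle(url: str, dest: Path, timeout: float = 60.0) -> Path:
    """Stream the bundle at url into dest and return the destination path."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        with open(dest, "wb") as out:
            # Copy in chunks so large bundles are not held in memory.
            shutil.copyfileobj(response, out)
    return dest
```

The url argument would be the pre-signed Download URL from the job status, and dest a local path such as assertions.csv. Because the URL expires after 300 seconds, a failed download may simply mean that a fresh URL is needed.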
The content of the .csv file will be similar to the following:
created,artifact_id,sha256,assertion_value,ground_truth_value
2021-07-01 16:07:16.129026,19131110281273461,275a021bbfb6489e54d471899f7db9d1663fc695ec2fe2a2c4538aabf651fd0f,malicious,malicious
2021-07-02 13:53:11.534493,51343509684868491,275a021bbfb6489e54d471899f7db9d1663fc695ec2fe2a2c4538aabf651fd0f,malicious,malicious
2021-07-02 14:13:32.866444,26799056656873371,275a021bbfb6489e54d471899f7db9d1663fc695ec2fe2a2c4538aabf651fd0f,malicious,malicious
2021-07-02 14:26:12.673761,4875488612233743,275a021bbfb6489e54d471899f7db9d1663fc695ec2fe2a2c4538aabf651fd0f,malicious,malicious
2021-07-02 15:14:48.598094,75781614615834141,275a021bbfb6489e54d471899f7db9d1663fc695ec2fe2a2c4538aabf651fd0f,malicious,malicious
2021-07-02 15:45:46.880895,6306841940656975,275a021bbfb6489e54d471899f7db9d1663fc695ec2fe2a2c4538aabf651fd0f,malicious,malicious
2021-07-06 16:05:38.321635,7987631822112379,275a021bbfb6489e54d471899f7db9d1663fc695ec2fe2a2c4538aabf651fd0f,malicious,malicious
The meaning of each value is as follows:
- created: date the assertion was made
- artifact_id: unique reference to the artifact that was scanned
- sha256: hash of the artifact that was scanned
- assertion_value: assertion returned by the Engine for this artifact; one of malicious, benign, suspicious, or unknown
- ground_truth_value: Ground Truth determination; one of malicious, benign, suspicious, or unknown
Analysis
With this data, you can do several things:
- Compute accuracy rate of your Engine
- Find all False Positive or False Negative assertions made by your Engine
- Use the artifact_id or sha256 to download the artifact from PolySwarm.
- Count the number of artifacts scanned by your Engine
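Most of the items in this list can be derived directly from the .csv file. The Python sketch below shows one way to do it, using the column names from the sample bundle. The function names are hypothetical, and the sketch assumes that rows where either value is suspicious or unknown should be set aside as undecided and not counted as right or wrong.

```python
import csv
from collections import Counter
from typing import Iterable, List, Optional, Tuple

DECIDED = {"malicious", "benign"}


def classify(assertion: str, truth: str) -> str:
    """Label one row as a true/false positive/negative, or undecided."""
    if assertion not in DECIDED or truth not in DECIDED:
        return "undecided"
    if assertion == truth:
        return "true_positive" if truth == "malicious" else "true_negative"
    return "false_positive" if assertion == "malicious" else "false_negative"


def analyze(rows: Iterable[dict]) -> Tuple[Counter, List[dict]]:
    """Count each label and collect the false positive/negative rows."""
    counts: Counter = Counter()
    mistakes: List[dict] = []
    for row in rows:
        label = classify(row["assertion_value"], row["ground_truth_value"])
        counts[label] += 1
        if label in ("false_positive", "false_negative"):
            mistakes.append(row)
    return counts, mistakes


def accuracy(counts: Counter) -> Optional[float]:
    """Share of decided rows that matched the ground truth, if any exist."""
    correct = counts["true_positive"] + counts["true_negative"]
    decided = correct + counts["false_positive"] + counts["false_negative"]
    return correct / decided if decided else None


def load_rows(path: str) -> List[dict]:
    """Read a downloaded bundle into a list of dictionaries keyed by column."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))
```

Calling analyze on the result of load_rows for a downloaded assertions.csv returns the label counts and the rows that need investigation. The sha256 field of each of those rows identifies the artifact to download, and the sum of all counts is the number of artifacts scanned.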
Bundle Clean up
When you are done downloading and analyzing a bundle, you can delete the assertions job.
$ polyswarm --fmt=pretty-json engine assertions delete 16848288564354189
{
"created": "2021-07-28T18:19:03.326115",
"date_end": "2021-07-08T00:00:00",
"date_start": "2021-07-01T00:00:00",
"engine_id": "<ENGINE_ID>",
"false_negative": 0,
"false_positive": 0,
"id": "16848288564354189",
"storage_path": "https://<REALLY LONG URL>",
"suspicious": 0,
"total": 7,
"true_negative": 0,
"true_positive": 7,
"unknown": 0
}
Deleting the job returns the same summary as a get, so you can still download the .csv file until the URL expires.
PolySwarm Account Usage Limits
With this process, the Bundle/Job management operations do not count against your usage limits. You only need available usage to perform downloads.
Summary
As you saw above, the data you need to perform analysis of your Engine's performance is easily available and in formats that are easy to parse/process.
Next Steps
With your accuracy and reliability improved, next you will want to improve your bidding strategy.