
Ambassadors

Target Audience

Ambassadors are the gateway to the PolySwarm marketplace. It is Ambassadors' responsibility to translate queries into marketplace actions and aggregate results on behalf of consumers of PolySwarm threat intelligence (e.g. enterprise customers). If you'd like to act as a conduit to the PolySwarm marketplace on your own behalf or on behalf of third-party consumers, continue reading.

When a consumer uses the PolySwarm web interface or polyswarm-api they will, by default, use an Ambassador hosted by Swarm Technologies, Inc.

Consumers may choose to speak to other Ambassadors using polyswarm-api's --api-uri argument or POLYSWARM_API_URI environment variable.


The Ambassador's Role in the Marketplace

Ambassadors act as intermediaries between consumers of PolySwarm threat intelligence and the PolySwarm marketplace. Broadly, Ambassadors broker artifact uploads, hash searches, hunts and other features as they are developed, creating actionable events in the PolySwarm marketplace on behalf of consumers and then delivering results to those consumers.

Artifact Submission Lifecycle

The artifact submission lifecycle at a conceptual level:

  1. The consumer submits an artifact and, optionally, a PolySwarm community preference to the Ambassador.*
  2. The Ambassador hosts the artifact in a manner that is accessible to microengines in the appropriate community/communities.
  3. The Ambassador creates a bounty for the artifact in each community, specifying: initial bounty amount, a signed transaction that escrows the initial bounty amount from the Ambassador's wallet to the community's BountyRegistry contract**, the bounty's assertion duration, the URI of the artifact, and, optionally, additional metadata (e.g. artifact filetype). This is accomplished by speaking to each community's polyswarmd instance using the polyswarmd API.
  4. Engines in each community must return their assertions on the artifact before the assertion window closes.
  5. Engines submit their assertions (and stake amounts) to the community's polyswarmd.
  6. Each community's polyswarmd instance delivers the aggregated results to the Ambassador.
  7. (Optional) the Ambassador distills the various assertions into digestible intelligence for the consumer.
  8. The Ambassador delivers finished intelligence to the consumer.

* Community assignment is handled by either the Ambassador or the consumer.

**It is the Ambassador's responsibility to ensure that they have adequate funds relayed to each community for the payment of network fees and to cover the initial bounty amount.
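
The fields listed in step 3 can be pictured as a small record created once per community. The following is a minimal sketch only: BountySpec and build_bounties are hypothetical names invented for illustration, the signed escrow transaction is deliberately left out, and nothing here reflects the actual polyswarmd API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class BountySpec:
    """Hypothetical container for the bounty fields described in step 3."""
    community: str
    amount: int                  # initial bounty amount
    duration_blocks: int         # assertion window, in blocks
    artifact_uri: str            # where engines can fetch the artifact
    metadata: Dict[str, str] = field(default_factory=dict)


def build_bounties(artifact_uri: str,
                   communities: List[str],
                   minimums: Dict[str, int],
                   duration_blocks: int = 5,
                   filetype: Optional[str] = None) -> List[BountySpec]:
    """Create one bounty spec per community, using that community's minimum amount."""
    metadata = {"filetype": filetype} if filetype else {}
    return [
        BountySpec(c, minimums[c], duration_blocks, artifact_uri, dict(metadata))
        for c in communities
    ]


# Example values are made up; a real Ambassador would query each community.
specs = build_bounties("ipfs://example-artifact", ["alpha", "beta"],
                       {"alpha": 100, "beta": 250}, filetype="pe")
for spec in specs:
    print(spec.community, spec.amount, spec.duration_blocks)
```

Keeping one record per community makes it easier to check, before posting, that the Ambassador's wallet holds enough funds in each community to cover the bounty and network fees.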

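Hash Search Lifecycle

Providing hash search capability to consumers requires:

  1. An archive of submitted artifacts and the marketplace's responses to those artifacts.
  2. A strategy for handling hash misses.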

The hash search lifecycle at a conceptual level:

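  1. The Ambassador maintains an archive of all incoming queries and responses. This archive contains attributes describing the queries, such as the hash and filetype of each submitted artifact, as well as the responses, such as the mapping of engines to assertions (and NCT stake amounts) over time.
  2. The consumer requests information about a specific artifact hash.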
  3. The Ambassador consults their archive for information about the hash of the file.
  4. If the Ambassador has data about the hash, it returns that data.

The Ambassador may not have data about a hash because:

  1. The hash corresponds to an artifact that was never submitted to the PolySwarm marketplace.
  2. The hash corresponds to an artifact that was not submitted to the PolySwarm marketplace via this Ambassador. Another Ambassador, e.g. Swarm Technologies' Ambassador, may have seen the artifact.
  3. The Ambassador previously handled the artifact but did not archive it for one of various reasons: error, legal requirements, etc.

If your archive does not have the data needed to directly respond to your consumer's query, you may consider:

  1. Offering to resubmit the artifact to the marketplace.
  2. Forwarding the consumer's request to Swarm Technologies' Ambassador or a third-party Ambassador that may have a record of the results.
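
The hash search flow, including the miss case, might be sketched as follows. This is an in-memory illustration under stated assumptions: HashArchive and search are hypothetical names, SHA-256 is assumed as the hash, and a production archive would be a persistent, scalable store.

```python
import hashlib
from typing import Dict, List, Optional


class HashArchive:
    """In-memory stand-in for an Ambassador's archive, keyed by SHA-256."""

    def __init__(self) -> None:
        self._records: Dict[str, List[dict]] = {}

    def record(self, content: bytes, filetype: str, assertions: Dict[str, bool]) -> str:
        """Archive one submission and the engine -> assertion mapping it produced."""
        digest = hashlib.sha256(content).hexdigest()
        self._records.setdefault(digest, []).append(
            {"filetype": filetype, "assertions": dict(assertions)}
        )
        return digest

    def lookup(self, digest: str) -> Optional[List[dict]]:
        """Return archived results for a hash, or None on a miss."""
        return self._records.get(digest.lower())


def search(archive: HashArchive, digest: str) -> dict:
    """Answer a hash query, falling back to a miss strategy when nothing is archived."""
    results = archive.lookup(digest)
    if results is not None:
        return {"status": "hit", "results": results}
    # Miss: the caller might offer resubmission or forward to another Ambassador.
    return {"status": "miss", "options": ["resubmit", "forward"]}


archive = HashArchive()
known = archive.record(b"sample bytes", "text", {"engine-a": False})
print(search(archive, known)["status"])
print(search(archive, "0" * 64)["status"])
```

Storing a list of records per hash leaves room for the same artifact to be seen more than once, which matches the idea of tracking assertions over time.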

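Hunt Lifecycle

Providing hunt capabilities to consumers requires:

  1. An archive of submitted artifacts.
  2. A scalable and economical infrastructure for evaluating consumer-uploaded rules against the archive.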

The hunt lifecycle at a conceptual level:

  1. The consumer submits hunt criteria (e.g. a YARA rule) to the Ambassador.
  2. The Ambassador invokes a search process for the criteria.
  3. The Ambassador returns results to the consumer.

As an Ambassador, it's largely up to you to define how (or whether) you'd like to offer hunting functionality to your consumers. Swarm Technologies' Ambassador will initially support YARA rule scanning. We encourage others to support the same.
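
To make the three hunt steps concrete, here is a deliberately simplified sketch. It swaps in a plain byte-substring check in place of YARA so that it runs with the standard library alone; the hunt function and the sample archive are hypothetical, and a real implementation would compile and evaluate the consumer's YARA rule instead.

```python
from typing import Dict, List


def hunt(archive: Dict[str, bytes], patterns: List[bytes]) -> List[str]:
    """Return the names of archived artifacts that contain every pattern.

    This substring check is a stand-in for evaluating a real YARA rule.
    """
    return sorted(
        name for name, content in archive.items()
        if all(p in content for p in patterns)
    )


# A made-up archive mapping artifact names to their contents.
archived = {
    "a.bin": b"MZ\x90\x00 suspicious-marker payload",
    "b.txt": b"hello world",
    "c.bin": b"MZ\x90\x00 ordinary program",
}
matches = hunt(archived, [b"MZ", b"suspicious-marker"])
print(matches)
```

Scanning every archived artifact per query grows linearly with the archive, which is why the requirements above call for scalable and economical infrastructure.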


Developing Your Ambassador

Hallmarks of a Successful Ambassador

As an Ambassador, you'll be your consumers' interface to the PolySwarm marketplace. It's important that you strive to provide a service that is:

  • Easy to use
  • Scalable
  • Low latency
  • High throughput
  • Cost effective

Developing an Ambassador

Windows-based Ambassadors are not supported; we strongly recommend developing Ambassadors under Linux.

Prerequisites

Configure your Linux-based development environment.

Building on polyswarm-client

polyswarm-client provides a convenient basis for Ambassador development by abstracting polyswarmd API complexities and providing ready-to-use Ambassador examples. By building on top of polyswarm-client, you won't have to worry about maintaining polyswarmd API compatibility, freeing time to focus on your Ambassador's differentiating features and developing your business logic. This tutorial will build on polyswarm-client and use the examples provided therein.

If you'd like to build an Ambassador from scratch, we encourage you to complete this tutorial first and then consult the polyswarmd API for a description of the interfaces your Ambassador must support. It will be your responsibility to track polyswarmd releases and update your API interface as necessary to ensure uninterrupted service to your customers.

The below sections break down the example Ambassadors available in polyswarm-client master as of commit 3c2e432289276f69be96db4e8eb587a997900af9.

All example Ambassadors assume a 1:1 relationship between Ambassadors and communities. As an Ambassador developer, you may want to interface with multiple communities. This and other real-world concerns are covered in a subsequent section.

Example: EICAR-Submitting Ambassador

EICAR is a test file used by the antivirus industry to test engines' ability to detect malware. The file is not malicious, but is flagged as such by many antivirus engines.

Here we'll discuss submitting EICAR as a PolySwarm Ambassador. Elsewhere, we discuss building a Microengine that detects EICAR.

polyswarm-client ships with an EICAR-submitting Ambassador (eicar.py) that persists artifacts on IPFS, a public, distributed file-sharing network.

eicar.py begins:

import base64
import logging
import random
import os

from concurrent.futures import CancelledError
from polyswarmclient.abstractambassador import AbstractAmbassador

logger = logging.getLogger(__name__)

EICAR = base64.b64decode(
    b'WDVPIVAlQEFQWzRcUFpYNTQoUF4pN0NDKTd9JEVJQ0FSLVNUQU5EQVJELUFOVElWSVJVUy1URVNULUZJTEUhJEgrSCo=')
NOT_EICAR = 'this is not malicious'
ARTIFACTS = [('eicar', EICAR), ('not_eicar', NOT_EICAR)]

After some imports and logging configuration, the EICAR string and a string that is clearly not EICAR are hardcoded. These strings are placed in the ARTIFACTS array.
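
As a quick check of what those constants hold, the sketch below decodes the same base64 literal and inspects the result. One detail worth noticing is that EICAR decodes to bytes while NOT_EICAR is a str; the as_bytes helper is an illustrative addition, not part of polyswarm-client, showing one way to normalize the two.

```python
import base64

EICAR = base64.b64decode(
    b'WDVPIVAlQEFQWzRcUFpYNTQoUF4pN0NDKTd9JEVJQ0FSLVNUQU5EQVJELUFOVElWSVJVUy1URVNULUZJTEUhJEgrSCo=')
NOT_EICAR = 'this is not malicious'


def as_bytes(content) -> bytes:
    """Normalize artifact content to bytes (illustrative helper only)."""
    return content if isinstance(content, bytes) else content.encode('utf-8')


# Print the type and byte length of each artifact.
print(type(EICAR).__name__, len(EICAR))
print(type(NOT_EICAR).__name__, len(as_bytes(NOT_EICAR)))
```

Keeping the test string base64-encoded in source presumably also avoids having the source file itself flagged by antivirus tools, though the original does not state this.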

Continuing:

BOUNTY_TEST_DURATION_BLOCKS = int(os.getenv('BOUNTY_TEST_DURATION_BLOCKS', 5))

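eicar.py sets the default assertion duration window to 5 blocks. How long a block takes in wall-clock time is determined by the community's operator. In communities hosted by Swarm Technologies, roughly 1 block is added to the chain per second, so a 5-block window is approximately 5 seconds. This default can be overridden with an environment variable.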
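
The arithmetic is simple enough to sketch. The helper below is hypothetical and assumes the roughly 1 block per second rate quoted above; other communities may use a different block time, so treat BLOCKS_PER_SECOND as a placeholder.

```python
import os

# Assumed block rate, taken from the figure quoted for communities hosted
# by Swarm Technologies; other communities may differ.
BLOCKS_PER_SECOND = 1.0


def assertion_window_seconds(default_blocks: int = 5, env=os.environ) -> float:
    """Convert the assertion window from blocks to approximate seconds."""
    blocks = int(env.get('BOUNTY_TEST_DURATION_BLOCKS', default_blocks))
    return blocks / BLOCKS_PER_SECOND


# With no override the default applies; with an override the variable wins.
print(assertion_window_seconds(env={}))
print(assertion_window_seconds(env={'BOUNTY_TEST_DURATION_BLOCKS': '30'}))
```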

class Ambassador(AbstractAmbassador):
    """Ambassador which submits the EICAR test file"""

    def __init__(self, client, testing=0, chains=None, watchdog=0, submission_rate=30):
        """
        Initialize an EICAR Ambassador
        Args:
            client (`Client`): Client to use
            testing (int): How many test bounties to respond to
            chains (set[str]): Chain(s) to operate on
            watchdog: interval over which a watchdog thread should verify bounty placement on-chain (in number of blocks)
            submission_rate: if nonzero, produce a sleep in the main event loop to prevent the Ambassador from overloading `polyswarmd` during testing
        """
        init_logging([__name__], log_format='json')
        super().__init__(client, testing, chains, watchdog, submission_rate)

eicar.py's Ambassador is built on polyswarm-client's AbstractAmbassador. Among other things, AbstractAmbassador establishes a connection to a hosted polyswarmd and manages this connection through an instance variable named client.

AbstractAmbassador declares a single method, generate_bounties, as abstract. All subclasses of AbstractAmbassador must define this method.
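
The effect of an abstract method can be illustrated with Python's abc module. This is a sketch of the general pattern, not the real class: SketchAbstractAmbassador is a hypothetical stand-in, and how polyswarm-client actually declares and enforces the abstract method is an assumption here.

```python
import asyncio
from abc import ABC, abstractmethod


class SketchAbstractAmbassador(ABC):
    """Hypothetical stand-in; the real AbstractAmbassador does much more."""

    def __init__(self, client):
        self.client = client

    @abstractmethod
    async def generate_bounties(self, chain):
        """Subclasses decide which artifacts to submit and when."""


class Incomplete(SketchAbstractAmbassador):
    pass


class Complete(SketchAbstractAmbassador):
    async def generate_bounties(self, chain):
        return 'would submit bounties on {0}'.format(chain)


try:
    Incomplete(client=None)
    instantiated = True
except TypeError:
    instantiated = False  # abstract method missing, so instantiation fails

result = asyncio.run(Complete(client=None).generate_bounties('side'))
print(instantiated, result)
```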

As you might imagine, eicar.py's implementation of this method is quite simple:

    async def generate_bounties(self, chain):
        """Submit either the EICAR test string or a benign sample

        Args:
            chain (str): Chain sample is being requested from
        """
        amount = await self.client.bounties.parameters[chain].get('bounty_amount_minimum')

        while True:
            try:
                filename, content = random.choice(ARTIFACTS)

                logger.info('Submitting %s', filename)
                ipfs_uri = await self.client.post_artifacts([(filename, content)])
                if not ipfs_uri:
                    logger.error('Error uploading artifact to IPFS, continuing')
                    continue

                await self.push_bounty(amount, ipfs_uri, BOUNTY_TEST_DURATION_BLOCKS, chain)
            except CancelledError:
                logger.warning('Cancel requested')
                break
            except Exception:
                logger.exception('Exception in bounty generation task, continuing')
                continue

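This method does the following (modulo error checking):

  1. Queries polyswarmd for the minimum initial bounty amount
  2. Enters an infinite loop
  3. Randomly selects either the eicar or the not_eicar string as the artifact
  4. Tells polyswarmd to host the artifact on IPFS
  5. Instructs polyswarmd to post a bounty, specifying the initial bounty amount (the minimum allowed), the artifact's URI, the assertion window duration, and the chain*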

*chain refers to which blockchain to post the bounty on: "homechain" or "sidechain". This argument should always be side; it will be removed in a future polyswarm-client release.

Notes:

  1. polyswarm-client-derived Ambassadors are multi-threaded by default, handling events asynchronously. This infinite loop will be isolated to the thread responsible for posting bounties; the remainder of the Ambassador will function normally.
  2. There is no explicit sleep in the loop. This is intentional; the thread responsible for generate_bounties effectively sleeps while blocking on bounty submission each time it calls self.client.post_artifacts (blocking on the IPFS host) and self.push_bounty (blocking on the announcement of the bounty in the marketplace by polyswarmd).
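
The behavior described in these notes can be modeled with plain asyncio, matching the async def signature of generate_bounties. In this sketch, fake_post_artifact is a hypothetical stand-in for the real upload call; the point is only that an infinite loop which awaits on every iteration yields control to other work and ends cleanly when cancelled.

```python
import asyncio


async def fake_post_artifact(name):
    """Stand-in for an upload; awaiting it yields control to the event loop."""
    await asyncio.sleep(0.01)
    return 'uri-for-' + name


async def generate(submitted):
    """Loop forever without an explicit sleep; each await is the pause."""
    while True:
        try:
            uri = await fake_post_artifact('sample')
            submitted.append(uri)
        except asyncio.CancelledError:
            break  # cancellation ends the loop cleanly


async def main():
    submitted = []
    task = asyncio.create_task(generate(submitted))
    await asyncio.sleep(0.1)   # other work proceeds while the loop awaits
    task.cancel()
    await task
    return submitted


submitted = asyncio.run(main())
print(len(submitted) > 0)
```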

eicar.py is a trivial example that does not account for many real-world Ambassador operating concerns. Next, we'll expand on this example to an Ambassador that submits on-disk artifacts.

Example: "Filesystem" Ambassador

polyswarm-client's filesystem.py Ambassador expands on the eicar.py Ambassador, submitting artifacts from a local filesystem.

It begins in a similar fashion:

import logging
import random
import os

from concurrent.futures import CancelledError
from polyswarmclient.abstractambassador import AbstractAmbassador
from polyswarmclient.corpus import DownloadToFileSystemCorpus

logger = logging.getLogger(__name__)

ARTIFACT_DIRECTORY = os.getenv('ARTIFACT_DIRECTORY', 'docker/artifacts')
ARTIFACT_BLACKLIST = os.getenv('ARTIFACT_BLACKLIST', 'truth.db').split(',')
BOUNTY_TEST_DURATION_BLOCKS = int(os.getenv('BOUNTY_TEST_DURATION_BLOCKS', 5))

Again, imports are handled, the bounty duration is hardcoded, and logging is configured. filesystem.py makes use of DownloadToFileSystemCorpus from polyswarmclient.corpus, a helper class that will download, decrypt and extract an artifact collection. Swarm Technologies uses this class internally during continuous integration to ensure that legitimately malicious artifacts are detected as such by microengines.

Continuing:

class Ambassador(AbstractAmbassador):
    """Ambassador which submits artifacts from a directory"""

    def __init__(self, client, testing=0, chains=None, watchdog=0, submission_rate=30):
        """Initialize a filesystem Ambassador
        Args:
            client (`Client`): Client to use
            testing (int): How many test bounties to respond to
            chains (set[str]): Chain(s) to operate on
        """
        init_logging([__name__], log_format='json')
        super().__init__(client, testing, chains, watchdog, submission_rate)

filesystem.py's Ambassador passes a few more parameters to the AbstractAmbassador class that help with testing:

  • testing: when nonzero, this parameter specifies the maximum number of bounties the Ambassador will generate before exiting.
  • watchdog: an interval, in blocks. Bounties placed by this Ambassador are checked against each new block to ensure that the bounty has been successfully placed on-chain.
  • submission_rate: if nonzero, this produces a sleep in the main event loop to prevent the Ambassador from overloading polyswarmd during testing.

Continuing:

        self.artifacts = []
        u = os.getenv("MALICIOUS_BOOTSTRAP_URL")
        if u:
            logger.info("Unpacking malware corpus at {0}".format(u))
            d = DownloadToFileSystemCorpus()
            d.download_and_unpack()
            bfl = d.get_benign_file_list()
            mfl = d.get_malicious_file_list()
            logger.info("Unpacking complete, {0} malicious and {1} benign files".format(len(mfl), len(bfl)))
            self.artifacts = bfl + mfl
        else:
            for root, dirs, files in os.walk(ARTIFACT_DIRECTORY):
                for f in files:
                    self.artifacts.append(os.path.join(root, f))

If the environment variable MALICIOUS_BOOTSTRAP_URL is set, the Ambassador downloads artifacts from a testing repository. If it's not set, the ARTIFACT_DIRECTORY directory is walked relative to the Ambassador's current working directory. The files are collected in preparation for bounty generation.
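
The directory-walking branch can be tried in isolation. The sketch below uses a hypothetical collect_artifacts helper and additionally skips blacklisted filenames; the excerpt defines ARTIFACT_BLACKLIST but does not show where it is applied, so the filtering shown here is an assumption about its intent.

```python
import os
import tempfile


def collect_artifacts(directory, blacklist=()):
    """Walk `directory` and return sorted file paths, skipping blacklisted names."""
    found = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            if name in blacklist:
                continue
            found.append(os.path.join(root, name))
    return sorted(found)


# Build a throwaway directory tree to demonstrate the walk.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, 'nested'))
    for rel in ('a.bin', 'truth.db', os.path.join('nested', 'b.bin')):
        with open(os.path.join(tmp, rel), 'wb') as handle:
            handle.write(b'data')
    names = [os.path.relpath(p, tmp) for p in collect_artifacts(tmp, ['truth.db'])]
    print(names)
```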

filesystem.py overrides the generate_bounties method:

    async def generate_bounties(self, chain):
        """Submit bounty from the filesystem
        Args:
            chain (str): Chain sample is being requested from
        """
        amount = await self.client.bounties.parameters[chain].get('bounty_amount_minimum')

        while True:
            try:
                filename = random.choice(self.artifacts)

                logger.info('Submitting file %s', filename)
                ipfs_uri = await self.client.post_artifacts([(filename, None)])
                if not ipfs_uri:
                    logger.error('Error uploading artifact to IPFS, continuing')
                    continue

                await self.push_bounty(amount, ipfs_uri, BOUNTY_TEST_DURATION_BLOCKS, chain)
            except CancelledError:
                logger.warning('Cancel requested')
                break
            except Exception:
                logger.exception('Exception in bounty generation task, continuing')
                continue

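This is essentially identical to the logic contained within eicar.py, except that each artifact is identified by its path on disk rather than by in-memory content; refer to the previous section for a breakdown.

filesystem.py builds on eicar.py by sourcing artifacts from disk and, optionally, from a remote URI. In a real-world Ambassador, these artifacts would come from the consumer's submissions to the Ambassador.

Building an Ambassador from participant-template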

The easiest way to get started is to build an Ambassador using participant-template. By using the template, your Ambassador will be based on polyswarm-client, allowing you to focus on business logic.

We'll cut our Ambassador from participant-template. To do this, we'll need cookiecutter:

pip install cookiecutter

With cookiecutter installed, jump-starting your Ambassador from our participant-template is as easy as:

cookiecutter https://github.com/polyswarm/participant-template

and answering some prompts. Read about these prompts here.

Choose:

  • participant_type: Ambassador
  • platform: linux (Windows Ambassadors are not supported)
  • participant_name: helloworld
  • Accept the defaults for the remaining prompts

You'll be left with an Ambassador-helloworld directory. Change directory (cd) into Ambassador-helloworld:

$ cd Ambassador-helloworld

Customizing Your Ambassador

Here we'll implement a simple, minimum viable Ambassador, re-creating the EICAR Ambassador described above.

Implementing a minimum viable Ambassador is as simple as implementing your Ambassador's generate_bounties method. This method is found in Ambassador_<participant_name_slug>/src/<author_org_slug>_<participant_name_slug>/__init__.py (Ambassador_helloworld/src/polyswarm_helloworld/__init__.py if you followed the cookiecutter prompts as described above). Production Ambassadors will, of course, need to do far more than this.

Open Ambassador_helloworld/src/polyswarm_helloworld/__init__.py.

Customize the file to include the EICAR and not-EICAR definitions we saw in the EICAR Ambassador:

...
logger = logging.getLogger(__name__)

EICAR = base64.b64decode(
    b'WDVPIVAlQEFQWzRcUFpYNTQoUF4pN0NDKTd9JEVJQ0FSLVNUQU5EQVJELUFOVElWSVJVUy1URVNULUZJTEUhJEgrSCo=')
NOT_EICAR = 'this is not malicious'
ARTIFACTS = [('eicar', EICAR), ('not_eicar', NOT_EICAR)]

BOUNTY_TEST_DURATION_BLOCKS = int(os.getenv('BOUNTY_TEST_DURATION_BLOCKS', 5))
...

Then, customize the generate_bounties method to submit EICAR and not-EICAR:

    async def generate_bounties(self, chain):
        """Submit either the EICAR test string or a benign sample

        Args:
            chain (str): Chain sample is being requested from
        """
        amount = await self.client.bounties.parameters[chain].get('bounty_amount_minimum')

        while True:
            try:
                filename, content = random.choice(ARTIFACTS)

                logger.info('Submitting %s', filename)
                ipfs_uri = await self.client.post_artifacts([(filename, content)])
                if not ipfs_uri:
                    logger.error('Error uploading artifact to IPFS, continuing')
                    continue

                await self.push_bounty(amount, ipfs_uri, BOUNTY_TEST_DURATION_BLOCKS, chain)
            except CancelledError:
                logger.warning('Cancel requested')
                break
            except Exception:
                logger.exception('Exception in bounty generation task, continuing')
                continue

Once these changes are made, you have an EICAR Ambassador built on participant-template!

Next, we'll consider some real-world concerns that go beyond the current scope of this document and then test our EICAR-submitting Ambassador.


Production Ambassador Considerations

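The eicar.py and filesystem.py Ambassadors are proofs of concept that do not address many requirements of production Ambassadors, including, but not limited to:

  1. A consumer-facing API*
  2. The ability to speak to multiple communities
  3. A means of tracking consumer requests for purposes such as rate limiting and billing
  4. Scalable infrastructure that adjusts to demand, ensuring low latency and high throughput
  5. A scalable archive of past queries and results, on which hash search, hunting, and other features are built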
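
As one illustration of item 3, request tracking for rate limiting and billing could start from something like the sliding-window counter below. RequestTracker is a hypothetical, in-memory sketch; a production Ambassador would need persistent, shared state and its own billing rules.

```python
import time
from collections import defaultdict, deque


class RequestTracker:
    """Sliding-window request counter per consumer (illustrative only)."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        self._seen = defaultdict(deque)   # consumer id -> request timestamps
        self.billable = defaultdict(int)  # consumer id -> accepted request count

    def allow(self, consumer_id):
        """Return True and record the request if the consumer is under its limit."""
        now = self.clock()
        stamps = self._seen[consumer_id]
        while stamps and now - stamps[0] >= self.window:
            stamps.popleft()              # drop requests outside the window
        if len(stamps) >= self.limit:
            return False
        stamps.append(now)
        self.billable[consumer_id] += 1
        return True


# A controllable clock keeps the demonstration deterministic.
fake_now = [0.0]
tracker = RequestTracker(limit=2, window_seconds=60, clock=lambda: fake_now[0])
decisions = [tracker.allow('acme') for _ in range(3)]
fake_now[0] = 61.0
decisions.append(tracker.allow('acme'))
print(decisions, tracker.billable['acme'])
```

Injecting the clock makes the limiter easy to test without waiting for real time to pass.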
