Metadata Fields Reference
Searchable fields
This page is an analyst-curated walkthrough of the most useful searchable metadata fields, with current example values and the pivot queries an analyst typically runs once they have a value. It's not the full list.
For the complete, always-current list of every searchable field, run the mapping CLI command:
polyswarm search mapping
or call the equivalent REST endpoint directly:
curl -H "Authorization: $POLYSWARM_API_KEY" \
https://api.polyswarm.network/v3/search/metadata/mappings
Both return every field in the live metadata-* index, with description and category metadata where available. This page is the curated narrative; that endpoint is the canonical reference.
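For scripted use, the same request can be issued from Python. This is a minimal sketch using only the standard library: the URL and the bare Authorization header are copied from the curl example above, while the function names and the assumption that the response body is JSON are illustrative and not taken from PolySwarm documentation.

```python
import json
import urllib.request

# Copied from the curl example above.
MAPPINGS_URL = "https://api.polyswarm.network/v3/search/metadata/mappings"


def build_mappings_request(api_key: str) -> urllib.request.Request:
    """Build (without sending) the GET request for the field mappings."""
    return urllib.request.Request(
        MAPPINGS_URL,
        headers={"Authorization": api_key},
        method="GET",
    )


def fetch_mappings(api_key: str, timeout: float = 30.0):
    """Send the request and decode the body.

    Assumption: the endpoint returns JSON. The response schema is not
    described on this page, so the decoded object is returned untouched.
    """
    request = build_mappings_request(api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_mappings(os.environ["POLYSWARM_API_KEY"])` would mirror the curl invocation; inspect the returned object before relying on any particular key.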
Example values shown below are real values pulled from production indices. URLs, IPs, and domains in sampled value lists are defanged for safe display.
polyunite
polyunite.malware_family
Malware family label derived by polyunite from per-engine verdicts. The field is keyword in the ES mapping, but the search endpoint's query parser lowercases terms before matching, so queries are effectively case-insensitive (lockbit, LOCKBIT, and LockBit all return the same results). polyunite normalizes well-known families to lowercase (lockbit, emotet, rhadamanthys), but any family it doesn't recognize is passed through with the AV vendor's casing, which may be mixed-case (ShiFu, Trojan.DownLoader1, Wapomi). See the curated exemplars table below for current crime and offensive-tool names verified to be lowercase in the index.
- Type: keyword
- Normalizer: none (stored values are lowercase for known families; queries are case-insensitive at the search endpoint)
- Aggregatable: yes
- Value casing: polyunite normalizes well-known families to lowercase; vendor casing passes through for unknowns — queries are case-insensitive regardless
Example: polyunite.malware_family:lockbit
Example values (curated):
| Value | Notes |
|---|---|
| lockbit | ransomware |
| blackcat | ransomware (ALPHV) |
| rhadamanthys | infostealer |
| lummastealer | infostealer |
| vidar | infostealer |
| stealc | infostealer |
| formbook | infostealer |
| agenttesla | infostealer / RAT |
| qakbot | loader / banking trojan |
| emotet | loader |
| njrat | RAT |
| asyncrat | RAT |
| remcos | RAT |
| sliver | C2 framework (offensive tooling) |
| cobaltstrikebeacon | C2 framework (offensive tooling) |
These 15 families have been verified to be stored lowercase in metadata-* (April 2026). polyunite normalizes its known families to lowercase but does not normalize names it doesn't recognize — vendor casing passes through (e.g. ShiFu, Trojan.DownLoader1). Refresh quarterly.
Pivots:
All samples in this family
polyunite.malware_family:{family}
Stored lowercase for known families. Queries are case-insensitive: lockbit, LOCKBIT, and LockBit all match.
Recent activity in this family
polyunite.malware_family:{family} AND artifact.created:[now-30d TO *]
Family + high polyscore (high-confidence variants)
polyunite.malware_family:{family} AND scan.latest_scan.polyscore:[0.9 TO *]
Family + sandbox C2 IPs: find live infrastructure
polyunite.malware_family:{family} AND _exists_:cape_sandbox_v2.extracted_c2_ips
scan
scan.mimetype.mime
MIME type of the artifact
Example: scan.mimetype.mime:"application/pdf"
Pivots:
Scope analysis to a file type before any other filter
scan.mimetype.mime:{value}
Field is text and tokenized on '/'. For exact MIME-type matches, quote the value: scan.mimetype.mime:"application/pdf".
Type + recently-seen + malicious
scan.mimetype.mime:{value} AND scan.last_seen:[now-7d TO *] AND scan.detections.malicious:>1
scan.latest_scan.polyscore
PolyScore from latest scan (0-1, higher = more malicious)
Example: scan.latest_scan.polyscore:[0.5 TO *]
Pivots:
High-confidence malicious only
scan.latest_scan.polyscore:[0.9 TO *]
PolyScore disagrees with engines (high score, few engine detections)
scan.latest_scan.polyscore:[0.8 TO *] AND scan.latest_scan.detections.malicious:[0 TO 2]
These are interesting research candidates: PolyScore caught something engines missed.
hash
hash.md5
A widely used hash function that produces a 128-bit value, typically represented as a 32-character hexadecimal number. Commonly used for file integrity checks, though considered less secure for cryptographic purposes due to known weaknesses.
Example: hash.md5:d41d8cd98f00b204e9800998ecf8427e
Pivots:
Look up everything by MD5
hash.md5:{value}
MD5 collisions are computationally feasible; for high-confidence identity prefer sha256.
hash.sha1
A cryptographic hash function that generates a 160-bit value. Used in many security protocols, but largely replaced by more secure algorithms due to discovered weaknesses.
Example: hash.sha1:da39a3ee5e6b4b0d3255bfef95601890afd80709
Pivots:
Look up everything by SHA1
hash.sha1:{value}
SHA1 collisions are demonstrated but uncommon in malware corpora; sha256 is the safer identifier.
hash.sha256
Part of the SHA-2 family, this hash function produces a 256-bit output. Commonly used for security and integrity checks; considered secure for most modern cryptographic applications.
Example: hash.sha256:9838e53777041620de659421f8b50e87815ff738fcf64478b83d104c2a958f1f
Pivots:
Look up everything by SHA256 (preferred identifier)
hash.sha256:{value}
SHA256 is the canonical artifact identifier across PolySwarm. Use this when you have a choice.
Find dropper / dropped relationships
cape_sandbox_v2.dropped.sha256:{value} OR cape_sandbox_v2.dropped.extracted_files.sha256:{value}
Find every sample that drops this exact payload at runtime, including sub-extracted layers.
hash.ssdeep
A fuzzy hash for similarity matching of files. Identifies files that are similar but not identical. Useful for finding malware variants and modified payloads.
Example: hash.ssdeep:*
Pivots:
Pull candidate ssdeeps for client-side fuzzy match
polyunite.malware_family:{family} AND _exists_:hash.ssdeep
ES does exact match only on hash.ssdeep. Substitute {family} with the family of your starting sample (or any other anchor: time window, mimetype, imphash). Include hash.ssdeep in the result fields, then use python ssdeep.compare() locally to cluster.
Recent ssdeeps in a mimetype bucket
scan.mimetype.mime:{mime} AND scan.last_seen:[now-7d TO *] AND _exists_:hash.ssdeep
Same client-side flow as above, scoped by file type and recency.
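The client-side step can be sketched as a small greedy grouping over the digests returned by either query. The function name, the input shape (sample identifier mapped to its hash.ssdeep value), and the default threshold of 60 are assumptions for illustration; the comparison function is passed in so the sketch runs without the ssdeep package installed.

```python
from typing import Callable, Dict, List


def cluster_by_similarity(
    digests: Dict[str, str],
    compare: Callable[[str, str], int],
    threshold: int = 60,  # assumed cutoff; tune for your corpus
) -> List[List[str]]:
    """Greedily group sample IDs whose fuzzy hashes score >= threshold.

    Each sample joins the first existing cluster that contains at least
    one sufficiently similar member; otherwise it starts a new cluster.
    Clusters are never merged afterwards, so results depend on input order.
    """
    clusters: List[List[str]] = []
    for sample_id, digest in digests.items():
        for cluster in clusters:
            if any(compare(digest, digests[m]) >= threshold for m in cluster):
                cluster.append(sample_id)
                break
        else:
            # No existing cluster was similar enough.
            clusters.append([sample_id])
    return clusters
```

With the python ssdeep bindings installed, `cluster_by_similarity(digests, ssdeep.compare)` plugs in the real scorer, which returns a 0 to 100 match score.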
hash.tlsh
Trend Micro Locality Sensitive Hash — a similarity hash for detecting near-duplicate files. More stable than ssdeep on larger files; useful for repacked-variant hunting.
Example: hash.tlsh:*
Pivots:
Pull candidate TLSH digests for client-side similarity
polyunite.malware_family:{family} AND _exists_:hash.tlsh
ES does exact match only on hash.tlsh. Use this query to retrieve a candidate set with hash.tlsh included, then compute TLSH distance locally with python-tlsh (tlsh.diff(a, b) returns 0 for identical, <70 is typically near-duplicate, <100 is loosely related).
TLSH set scoped by recency / type
scan.mimetype.mime:{mime} AND scan.last_seen:[now-30d TO *] AND _exists_:hash.tlsh
Wider net for variant hunting in a corpus slice; same client-side distance step.
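The distance bands quoted above translate directly into a ranking helper. This is a sketch under stated assumptions: the function names and the "beyond the thresholds" label are invented here, and the distance function is injected so the code runs without python-tlsh.

```python
from typing import Callable, Dict, List, Tuple


def tlsh_relation(distance: int) -> str:
    """Map a TLSH distance onto the rough bands described above."""
    if distance < 0:
        raise ValueError("TLSH distance cannot be negative")
    if distance == 0:
        return "identical"
    if distance < 70:
        return "near-duplicate"
    if distance < 100:
        return "loosely related"
    return "beyond the thresholds"  # label chosen for this sketch


def rank_candidates(
    anchor: str,
    candidates: Dict[str, str],
    diff: Callable[[str, str], int],
) -> List[Tuple[str, int, str]]:
    """Score every candidate digest against the anchor, closest first."""
    scored = [(sid, diff(anchor, digest)) for sid, digest in candidates.items()]
    scored.sort(key=lambda item: item[1])
    return [(sid, dist, tlsh_relation(dist)) for sid, dist in scored]
```

Passing `tlsh.diff` as the `diff` argument applies the real metric to the hash.tlsh values pulled by the queries above.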
Sandbox (CAPE)
cape_sandbox_v2.extracted_c2_ips
C2 / connection-target IPs that the CAPE sandbox extracted from the sample's runtime behavior. Typed ip, so CIDR queries work — quote the CIDR (e.g. "185.244.25.0/24"). This is the canonical sandbox C2 IP field — prefer it over network.hosts.ip (text) for analyst pivots.
Example: cape_sandbox_v2.extracted_c2_ips:5.196.74.210
Pivots:
Other samples that contacted this C2 IP
cape_sandbox_v2.extracted_c2_ips:{value}
Same /24 (adjacent infra often shares operators)
cape_sandbox_v2.extracted_c2_ips:"{value/24}"
Substitute the /24 of {value}, e.g. "185.244.25.0/24". Quote the CIDR; unquoted slashes cause a parse error.
C2 IP + family: count families using this infra
cape_sandbox_v2.extracted_c2_ips:{value} AND _exists_:polyunite.malware_family
Run as a metadata aggregation on polyunite.malware_family to identify shared infrastructure across families.
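Deriving the enclosing /24 by hand is error-prone, so the substitution can be sketched with Python's standard ipaddress module. The helper name is hypothetical; the field name and the quoting rule come from the pivot above.

```python
import ipaddress


def c2_subnet_query(ip: str, prefix_len: int = 24) -> str:
    """Build the quoted-CIDR query for the network containing `ip`."""
    # strict=False lets us pass a host address and get its network back.
    network = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
    # The CIDR is quoted because unquoted slashes cause a parse error.
    return f'cape_sandbox_v2.extracted_c2_ips:"{network}"'
```

For example, `c2_subnet_query("185.244.25.17")` yields `cape_sandbox_v2.extracted_c2_ips:"185.244.25.0/24"`.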
cape_sandbox_v2.suricata_alerts.signature
Name of the signature that triggered the alert.
Example values (curated):
| Value | Notes |
|---|---|
| ET MALWARE Common Stealer Behavior - Source IP Associated with Hosting Provider… | infostealer behavior |
| ET MALWARE Terse alphanumeric executable downloader high likelihood of being ho… | downloader |
| ET MALWARE Win32/Delf.TJJ CnC Domain in DNS Lookup (udo.jxwan.com) | C2 / DNS lookup |
| ET MALWARE Win32/RustMiner Suspicious HTTP Accept Header Observed | cryptominer |
| ET MALWARE Ransom.Win32.Birele.gsg Checkin | ransomware checkin |
Real signature names from the Emerging Threats (ET) ruleset that fired on samples in metadata-* (sampled April 2026). The field is text-typed, so individual tokenized words match. Substring queries that wrap a fragment in * on each side find variants of a signature.
Pivots:
Group corpus by this Suricata signature
cape_sandbox_v2.suricata_alerts.signature:*{value}*
Use wildcards; single-token queries may return empty on this field. Wrap the keyword in `*` on both sides, or use a distinctive fragment of the signature name.
cape_sandbox_v2.dropped.sha256
SHA256 of dropped files
Example: cape_sandbox_v2.dropped.sha256:<sha256>
Pivots:
Find every sample that dropped this exact payload
cape_sandbox_v2.dropped.sha256:{value}
Field is text but hex tokenizes as a single token; exact match works. Prefer this over dropped.md5 if available.
Pivot to the standalone artifact
artifact.sha256:{value}
cape_sandbox_v2.dropped.md5
MD5 of a file written to disk by the sample during sandbox execution. Stored as text; aggregations don't work but exact-match queries do (hex strings tokenize as a single token). Pivot on this to find every sample that drops the same payload.
Example: cape_sandbox_v2.dropped.md5:<md5>
Pivots:
Find every sample that dropped this exact payload
cape_sandbox_v2.dropped.md5:{value}
Strong signal for shared dropper / multi-stage families; the dropped MD5 is the second-stage payload. Field is text; aggregations don't work but exact match does.
Pivot to the standalone artifact (if PolySwarm has scanned the dropped file)
artifact.md5:{value}
cape_sandbox_v2.dropped.extracted_files.sha256
SHA256 of a sub-payload that CAPE extracted from a dropped binary (unpacker output, embedded resources, etc.). One layer deeper than dropped.sha256 — useful when the dropper varies across samples but the unpacked payload is shared. Text-typed; exact match works.
Example: cape_sandbox_v2.dropped.extracted_files.sha256:<sha256>
Pivots:
Find samples sharing this sub-extracted payload
cape_sandbox_v2.dropped.extracted_files.sha256:{value}
One layer deeper than dropped.sha256; useful when the dropper itself varies but the unpacked / embedded payload is shared across the cluster.
Pivot to the standalone artifact
artifact.sha256:{value}
Cross-layer: same payload extracted AND dropped directly
cape_sandbox_v2.dropped.extracted_files.sha256:{value} OR cape_sandbox_v2.dropped.sha256:{value} OR artifact.sha256:{value}
Catches the payload regardless of which layer reported it.
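Because the cross-layer pivot repeats one hash across three fields, it is easy to mistype; a small builder keeps the clauses in sync. The helper name and the up-front hex validation are assumptions added for illustration; the three field names are the ones used in the pivot above.

```python
import re

SHA256_RE = re.compile(r"[0-9a-fA-F]{64}")

# Fields from the cross-layer pivot, deepest layer first.
LAYER_FIELDS = (
    "cape_sandbox_v2.dropped.extracted_files.sha256",
    "cape_sandbox_v2.dropped.sha256",
    "artifact.sha256",
)


def cross_layer_query(sha256: str) -> str:
    """OR the same SHA256 across every layer that can report it."""
    if not SHA256_RE.fullmatch(sha256):
        raise ValueError("expected a 64-character hexadecimal SHA256")
    return " OR ".join(f"{field}:{sha256}" for field in LAYER_FIELDS)
```

Rejecting malformed input before building the query avoids sending a search that can never match.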
cape_sandbox_v2.dropped.filepath
Path on the guest VM where the sample wrote the dropped file. Combined with malware family, often reveals install-path templates (e.g. always %APPDATA%\Microsoft\Windows\
Example: cape_sandbox_v2.dropped.filepath:*\\AppData\\Roaming\\*
Pivots:
Search by path fragment (substring match)
cape_sandbox_v2.dropped.filepath:*\\AppData\\Roaming\\*
Most useful pattern: find samples writing anywhere under a known directory or matching a filename pattern. Wrap fragments in `*` on both sides; escape backslashes for Windows paths.
Family + install-path patterns
polyunite.malware_family:{family} AND _exists_:cape_sandbox_v2.dropped.filepath
Pull every install path a given family uses; common to find a small set of templates per family.
Exact path match
cape_sandbox_v2.dropped.filepath:"%APPDATA%\\Microsoft\\Windows\\update.exe"
Quote when you've already identified a specific path you're hunting.
cape_sandbox_v2.dropped.guest_paths
Every observed file system location the dropped file appeared at during sandbox execution. Broader than dropped.filepath — useful when the sample copies itself to multiple locations. Text-typed.
Example: cape_sandbox_v2.dropped.guest_paths:*\\AppData\\*
Pivots:
Search by path fragment
cape_sandbox_v2.dropped.guest_paths:*\\AppData\\*
guest_paths records every observed location the dropped file appeared at, so it is broader than filepath. Substring search is the typical pattern.
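The path-fragment pivots for dropped.filepath and dropped.guest_paths share two mechanical steps: doubling backslashes and wrapping the fragment in wildcards. A minimal sketch follows; the helper name is hypothetical, and it handles only backslashes, as described above. Other query-syntax characters such as spaces or colons would need their own escaping.

```python
def path_fragment_query(field: str, fragment: str) -> str:
    """Build a substring query for a Windows path fragment."""
    # Each literal backslash in the path becomes an escaped pair.
    escaped = fragment.replace("\\", "\\\\")
    # Wildcards on both sides turn the fragment into a substring match.
    return f"{field}:*{escaped}*"
```

Passing the field name as an argument lets the same helper serve both `cape_sandbox_v2.dropped.filepath` and `cape_sandbox_v2.dropped.guest_paths`.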
cape_sandbox_v2.target.file.yara.name
Name of any YARA rule that matched the sample's executable image inside the CAPE sandbox. Surfaces behavior-based detections — sandbox-evasion checks, packer signatures, embedded shellcode/PE patterns, LNK execution chains — that fire even when AV verdicts miss the sample. Powerful for hunting evasion techniques across families: pivot from a single rule name to every sample exhibiting that technique, regardless of malware family.
- Type: text
- Normalizer: standard analyzer (lowercases tokens; underscores kept)
- Aggregatable: no
Example: cape_sandbox_v2.target.file.yara.name:"INDICATOR_SUSPICIOUS_EXE_SandboxHookingDLL"
Example values (curated):
| Value | Notes |
|---|---|
| INDICATOR_SUSPICIOUS_EXE_SandboxHookingDLL | sandbox evasion |
| vmdetect | VM / sandbox detection |
| INDICATOR_EXE_Packed_ASPack | packer (ASPack) |
| AutoIT_Compiled | wrapper (AutoIT) |
| shellcode_get_eip | shellcode pattern |
| shellcode_patterns | shellcode pattern |
| shellcode_stack_strings | shellcode pattern |
| embedded_pe | embedded payload |
| embedded_win_api | embedded API resolver |
| EXE_in_LNK | LNK execution chain |
| Execution_in_LNK | LNK execution chain |
| Script_in_LNK | LNK execution chain |
| MSOffice_in_LNK | LNK execution chain |
| Archive_in_LNK | LNK execution chain |
Verified to return current hits in metadata-* (April 2026). Common rule prefixes group related techniques: INDICATOR_SUSPICIOUS_* for suspicious-executable behaviors, shellcode_* for shellcode patterns, *_in_LNK for LNK-based execution chains. Prefix wildcards work on this field, e.g. cape_sandbox_v2.target.file.yara.name:INDICATOR_SUSPICIOUS_*.
Pivots:
All samples that triggered this YARA rule
cape_sandbox_v2.target.file.yara.name:"{value}"
Field is text; quote the rule name for exact match. Behavior-based: fires even when AV misses the sample.
Find every rule in a behavior class (wildcard prefix)
cape_sandbox_v2.target.file.yara.name:INDICATOR_SUSPICIOUS_*
Substitute the prefix you care about: `INDICATOR_SUSPICIOUS_*` (suspicious-EXE behaviors), `shellcode_*` (shellcode patterns), `*_in_LNK` (LNK execution chain). Useful for hunting a technique, not a single rule.
YARA rule + family: what families use this technique
cape_sandbox_v2.target.file.yara.name:"{value}" AND _exists_:polyunite.malware_family
Run as a metadata aggregation on polyunite.malware_family to map a behavior to its top families; a fast way to see who relies on a given evasion / packer / loader technique.
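When a server-side aggregation is not convenient, the same family breakdown can be approximated client-side over records already retrieved. This sketch is a stand-in for the aggregation, not a description of it: the function name and the nested record shape (`{"polyunite": {"malware_family": ...}}`) are hypothetical and should be adapted to whatever your result export looks like.

```python
from collections import Counter
from typing import Any, Iterable, List, Mapping, Tuple


def top_families(
    records: Iterable[Mapping[str, Any]],
    limit: int = 10,
) -> List[Tuple[str, int]]:
    """Count polyunite families across result records, most common first."""
    counts: Counter = Counter()
    for record in records:
        # Hypothetical record layout; missing labels are skipped.
        family = record.get("polyunite", {}).get("malware_family")
        if family:
            # Fold vendor casing so ShiFu and shifu count together.
            counts[family.lower()] += 1
    return counts.most_common(limit)
```

The same tally works for any pivot on this page that ends in `_exists_:polyunite.malware_family`.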
cape_sandbox_v2.ttp
MITRE ATT&CK technique IDs that the CAPE sandbox attributed to the sample's runtime behavior. Stored as an array — a single sample typically carries several T-codes (encryption + C2 + persistence + evasion). Lets you pivot from a behavior (e.g. T1486 ransomware encryption, T1497 sandbox evasion, T1055 process injection) to every sample exhibiting it, regardless of family or AV verdict. Subtechniques use dotted IDs (e.g. T1027.002 = software packing) — quote them.
- Type: text
- Normalizer: standard analyzer (lowercases tokens)
- Aggregatable: no
Example: cape_sandbox_v2.ttp:"T1486"
Example values (curated):
| Value | Notes |
|---|---|
| T1071 | Application Layer Protocol (C2) |
| T1027 | Obfuscated Files or Information |
| T1027.002 | Software Packing |
| T1497 | Virtualization / Sandbox Evasion |
| T1003 | OS Credential Dumping |
| T1055 | Process Injection |
| T1057 | Process Discovery |
| T1082 | System Information Discovery |
| T1112 | Modify Registry |
| T1547.001 | Persistence: Registry Run Keys |
| T1486 | Data Encrypted for Impact (ransomware) |
| T1485 | Data Destruction |
| T1573 | Encrypted Channel |
| T1562 | Impair Defenses |
| T1564.001 | Hide Artifacts: Hidden Files |
Verified live in metadata-* (April 2026). Subtechnique IDs (e.g. T1027.002) contain a dot — quote them so the query parser treats them as a single phrase. Refer to https://attack.mitre.org/ for current technique definitions.
Pivots:
All samples that exhibited this technique (CAPE)
cape_sandbox_v2.ttp:"{value}"
Quote the T-code; subtechniques (e.g. T1027.002) contain a dot and need phrase-quoting.
Multi-technique intersection: narrow to a behavior chain
cape_sandbox_v2.ttp:"{value}" AND cape_sandbox_v2.ttp:"T1071"
ttp is an array; AND across T-codes finds samples carrying both techniques. Common pairings: ransomware encryption + C2 (T1486 AND T1071), evasion + injection (T1497 AND T1055). Substitute the second T-code for the chain you care about.
Cross-sandbox corroboration: same technique seen by both
cape_sandbox_v2.ttp:"{value}" AND triage_sandbox_v0.ttp:"{value}"
Both sandboxes flagging the same technique is a stronger signal than one alone; useful when triaging whether a behavior is real vs. sandbox-specific noise.
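The quoting and AND-chaining rules above are simple to automate. The sketch below uses hypothetical helper names and a regex that assumes the common ATT&CK ID shape (T plus four digits, optionally a dot and three digits); it builds both the multi-technique and the cross-sandbox queries.

```python
import re

# Assumed ID shape: T1486 or T1027.002.
TTP_RE = re.compile(r"T\d{4}(?:\.\d{3})?")


def ttp_query(field: str, *techniques: str) -> str:
    """AND together quoted T-codes on a single ttp field."""
    if not techniques:
        raise ValueError("at least one technique ID is required")
    clauses = []
    for technique in techniques:
        code = technique.strip().upper()
        if not TTP_RE.fullmatch(code):
            raise ValueError(f"not an ATT&CK technique ID: {technique!r}")
        # Always quote, so dotted subtechniques stay a single phrase.
        clauses.append(f'{field}:"{code}"')
    return " AND ".join(clauses)


def corroborated_query(technique: str) -> str:
    """Require the same technique from both the CAPE and Triage sandboxes."""
    cape = ttp_query("cape_sandbox_v2.ttp", technique)
    triage = ttp_query("triage_sandbox_v0.ttp", technique)
    return f"{cape} AND {triage}"
```

For instance, `ttp_query("cape_sandbox_v2.ttp", "T1486", "T1071")` reproduces the ransomware encryption + C2 pairing mentioned above.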
Sandbox (Triage)
triage_sandbox_v0.extracted.dropper.urls.url
URL used by the dropper.
Example: triage_sandbox_v0.extracted.dropper.urls.url:*evilpath/evilbin.exe*
Pivots:
Search by URL fragment (substring match)
triage_sandbox_v0.extracted.dropper.urls.url:*evilpath/evilbin.exe*
The most common analyst pattern. A path or filename fragment finds every sample whose dropper fetched from any URL containing it, regardless of host rotation. Wrap the fragment in `*` on both sides.
Same host, any path: campaign sweep
triage_sandbox_v0.extracted.dropper.urls.url:*example.com*
Substitute the host you saw. Returns every sample whose dropper hit that host on any path. Pair with a recent artifact.created window to keep wildcards cheap.
Exact URL match
triage_sandbox_v0.extracted.dropper.urls.url:"https://example.com/path/payload.exe"
Quote the full URL. Use when you've already identified a specific staging URL and want only the samples that hit *exactly that one.*
Dropper URLs grouped by family
polyunite.malware_family:{family} AND _exists_:triage_sandbox_v0.extracted.dropper.urls.url
Pull every staging URL a given family is using right now.
triage_sandbox_v0.analysis.family
Malware family Triage's sandbox attributed to the sample based on runtime behavior. Independent from polyunite.malware_family (which is derived from per-engine AV verdicts) — comparing the two surfaces both high-confidence corroboration (sandbox + AV agree) and research candidates (sandbox identifies a family AV missed). Stored as an array, lowercase. Field is text and the standard analyzer lowercases tokens, so queries are case-insensitive.
- Type: text
- Normalizer: standard analyzer (lowercases tokens)
- Aggregatable: no
- Value casing: stored lowercase; case-insensitive at query time
Example: triage_sandbox_v0.analysis.family:"cobaltstrike"
Example values (curated):
| Value | Notes |
|---|---|
| cobaltstrike | C2 framework (offensive tooling) |
| metasploit | C2 framework (offensive tooling) |
| kawaiiunicorn | ransomware |
| blihanstealer | infostealer |
| cosmu | infostealer / virus |
Verified live in metadata-* (April 2026). Triage stores values lowercase; case-insensitive at query time. Cross-reference with polyunite.malware_family — agreement is a strong-confidence label, disagreement is a research signal.
Pivots:
All samples Triage attributes to this family
triage_sandbox_v0.analysis.family:"{value}"
Quote the family name. Field is text and the analyzer lowercases tokens, so case-insensitive at query time.
Sandbox + AV agree: high-confidence family label
triage_sandbox_v0.analysis.family:"{value}" AND polyunite.malware_family:{family}
Both Triage's runtime attribution and AV-derived polyunite labeling agree, which is the strongest family-attribution signal in the corpus. Substitute the matching polyunite family value for {family} (typically the same string).
Triage caught a family AV missed: research candidates
triage_sandbox_v0.analysis.family:"{value}" AND NOT _exists_:polyunite.malware_family
Triage's sandbox attributed a family but no AV engine produced a polyunite label. These are interesting research candidates; sandbox-only family identifications often surface novel or undertested variants.
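Once both labels are in hand for a sample, the agree / disagree / sandbox-only triage described above reduces to a small comparison. This is an illustrative sketch: the function name and the returned category strings are invented here, and it assumes plain string equality after lowercasing, so naming differences between the two sources will show up as "disagree".

```python
from typing import Iterable, Optional


def compare_family_labels(
    triage_families: Iterable[str],
    polyunite_family: Optional[str],
) -> str:
    """Classify how Triage's family array relates to the polyunite label."""
    triage = {name.lower() for name in triage_families if name}
    av = polyunite_family.lower() if polyunite_family else None
    if not triage and av is None:
        return "unlabeled"
    if av is None:
        return "sandbox-only"  # research candidate: AV produced no label
    if not triage:
        return "av-only"
    # triage_sandbox_v0.analysis.family is an array, so test membership.
    return "agree" if av in triage else "disagree"
```

Samples classified "agree" correspond to the high-confidence pivot, and "sandbox-only" to the research-candidate pivot.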
triage_sandbox_v0.ttp
MITRE ATT&CK technique IDs that the Triage sandbox attributed to the sample's runtime behavior. Stored as an array. Independent from cape_sandbox_v2.ttp (the two sandboxes don't always agree) — having both gives an analyst a way to corroborate behavior or surface sandbox-specific blind spots. Triage's TTP coverage tends to skew toward persistence (T1547.*) and host-discovery (T1614.001, T1082) techniques.
- Type: text
- Normalizer: standard analyzer (lowercases tokens)
- Aggregatable: no
Example: triage_sandbox_v0.ttp:"T1547.001"
Example values (curated):
| Value | Notes |
|---|---|
| T1614.001 | System Location Discovery: System Language |
| T1547.001 | Persistence: Registry Run Keys |
| T1547.004 | Persistence: Winlogon Helper DLL |
| T1547.014 | Persistence: Active Setup |
| T1112 | Modify Registry |
| T1082 | System Information Discovery |
| T1564.001 | Hide Artifacts: Hidden Files |
Verified live in metadata-* (April 2026). Triage's TTP coverage is narrower than CAPE's and skews toward persistence (T1547.*) and host-discovery (T1614.001, T1082). Cross-reference with cape_sandbox_v2.ttp to corroborate or spot sandbox-specific blind spots.
Pivots:
All samples that exhibited this technique (Triage)
triage_sandbox_v0.ttp:"{value}"
Quote the T-code; subtechniques (e.g. T1547.001) contain a dot and need phrase-quoting.
Persistence sweep: every Registry / Logon technique
triage_sandbox_v0.ttp:T1547.*
Triage's TTP coverage skews toward persistence; the T1547.* family covers Registry Run Keys, Winlogon Helper DLL, Active Setup, etc. Wildcard prefix on the T-code finds them all in one query.
Technique + family: what families use this technique (Triage)
triage_sandbox_v0.ttp:"{value}" AND _exists_:polyunite.malware_family
Run as a metadata aggregation on polyunite.malware_family to map a technique to its top families.
Static Tools
pefile.imphash
Hash of the PE import table. Identical imphashes across samples are a strong signal of shared compiler, packer, or family — useful for clustering unpacked PEs.
Example: pefile.imphash:5d6cad172c5535e4b6b6bbd246571621
Pivots:
Same import-table hash → likely same compiler / packer
pefile.imphash:{value}
imphash matches across samples are a stronger family signal than fuzzy hashes for unpacked PEs.
imphash + family: confirm the family is consistent across the cluster
pefile.imphash:{value} AND _exists_:polyunite.malware_family
pefile.resources.md5
The MD5 hash of a resource, used for integrity verification.
Example: pefile.resources.md5:e44e3eb91dbf2fde6d40b95f9f2a5f92
Pivots:
Shared PE resource → shared codebase / dropper template
pefile.resources.md5:{value}
Common in malware families that bundle a payload as a resource (e.g. RATs, droppers).
exiftool.mimetype
MIME type identified by ExifTool's parse of the file. Independent from scan.mimetype.mime (the scanner's bytes-based identification) — comparing the two surfaces samples where the container disagrees with the contents (a common file-masquerading signal). Field is text and tokenized on '/' — quote the value for exact-match queries.
- Type: text
- Normalizer: none (tokenized on '/')
- Aggregatable: no
Example: exiftool.mimetype:"application/pdf"
Pivots:
All samples with this MIME type (per ExifTool)
exiftool.mimetype:"{value}"
Field is text and tokenized on '/'; quote the value for exact MIME-type matches.
ExifTool / scanner mimetype mismatch: file-masquerading hunt
exiftool.mimetype:"{value}" AND NOT scan.mimetype.mime:"{value}"
Two independent mimetype views; mismatches surface samples where the container metadata disagrees with the bytes (e.g. a PDF wrapper hiding a non-PDF payload).
MIME type + family: what families ship as this filetype
exiftool.mimetype:"{value}" AND _exists_:polyunite.malware_family
Run as a metadata aggregation on polyunite.malware_family to see which families currently distribute this filetype.
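The masquerading pivot uses the same MIME value twice and depends on both copies being quoted, so a builder helps when sweeping several file types. The helper name and the loose `type/subtype` validation pattern are assumptions for this sketch; the two field names come from the pivot above.

```python
import re

# Loose type/subtype check; not a full media-type grammar.
_TOKEN = r"[A-Za-z0-9][A-Za-z0-9!#$&^_.+-]*"
MIME_RE = re.compile(rf"{_TOKEN}/{_TOKEN}")


def masquerade_query(mime: str) -> str:
    """ExifTool reports `mime` but the scanner's identification differs."""
    if not MIME_RE.fullmatch(mime):
        raise ValueError(f"not a type/subtype MIME value: {mime!r}")
    # Both values are quoted because the fields tokenize on '/'.
    return f'exiftool.mimetype:"{mime}" AND NOT scan.mimetype.mime:"{mime}"'
```

Looping this over a short list of commonly abused types (for example `application/pdf`) gives one mismatch query per type.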
Analyst Tags
tags
Analyst-applied labels on the artifact. Stored as an array, so a single sample can carry multiple labels (e.g. ["Ransomware", "Dropper", "Trojan", "Windows", "PE32"]). Three rough kinds of values: behavior class (Ransomware, Infostealer, RAT, Loader, Backdoor, Dropper, Trojan, Stealer, Downloader, Exploit), platform / format (Windows, Linux, MacOSX, PE32, ELF64), and structured key:value labels for sector / feed routing (e.g. sector:financial, feed:premium). Stored values are typically TitleCase but the field is text and the standard analyzer lowercases tokens at index time — tags:"ransomware" and tags:"Ransomware" both match. For the full live list of tags, run polyswarm tag list.
- Type: text
- Normalizer: standard analyzer (lowercases tokens)
- Aggregatable: no
- Value casing: stored TitleCase but case-insensitive at query time (analyzer lowercases)
Example: tags:"ransomware"
Example values (curated):
| Value | Notes |
|---|---|
| Ransomware | behavior class |
| Infostealer | behavior class |
| RAT | behavior class |
| Dropper | behavior class |
| Loader | behavior class |
| Backdoor | behavior class |
| Trojan | behavior class |
| Exploit | behavior class |
| Downloader | behavior class |
| Stealer | behavior class |
| Windows | platform |
| Linux | platform |
| MacOSX | platform |
| PE32 | file format |
| ELF64 | file format |
Verified to return current hits in metadata-* (April 2026). Stored TitleCase but case-insensitive at query time — both tags:"Ransomware" and tags:"ransomware" match. tags is an array, so a single sample commonly carries several labels (behavior + platform + format). Structured key:value labels (e.g. sector:financial, feed:premium) also exist for routing.
Pivots:
All samples carrying this tag
tags:"{value}"
Field is text and the standard analyzer lowercases tokens at index time, so both tags:"Ransomware" and tags:"ransomware" match. Quote multi-word values.
Multi-tag intersection: narrow to a behavior pair
tags:"{value}" AND tags:"backdoor"
tags is an array, so AND across tags finds samples wearing both labels (e.g. tags:"loader" AND tags:"backdoor" for dual-purpose families). Substitute the second tag for the pair you care about.
Tag + family: what families wear this label
tags:"{value}" AND _exists_:polyunite.malware_family
Run as a metadata aggregation on polyunite.malware_family to see which families currently carry this tag; a fast way to map a behavior class to its top families.