Alert search

Before the connector can produce any meaningful information to import into OpenCTI, it has to look up entity metadata in Wazuh’s OpenSearch database. A typical use case in an automated setup is to look up every new File and Artifact hash IoC imported into OpenCTI. Other common IoCs are domain names and IPv4/IPv6 addresses. The connector supports looking up all of these, along with many other observables.

It is also possible to look up alerts using less common metadata, like filenames, directory paths and process command lines. This can be helpful in an investigation if you want to see where files whose contents vary have spread across systems, or where a process may have been run. This essentially turns OpenCTI into a search interface for OpenSearch, allowing you to search for alerts without having to craft complicated DSL queries yourself. Be sure to read the usage documentation below before doing so, and beware that “simple” searches may produce a large number of results (sightings), which in turn may create incidents depending on your configuration.

Configuration

Use CONNECTOR_SCOPE to select which entities to search for. Use the various settings in SearchConfig to determine how searches are performed.

Observables that have been created by the connector through enrichment are not looked up by default (determined by label_ignore_list and enrich_labels). In order to look up these entities, simply remove the WAZUH_IGNORE label and run the enrichment again.
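The label check described above can be pictured roughly as follows. This is a minimal sketch, not the connector's code: the function name should_look_up, the default ignore list and the case-insensitive comparison are all assumptions made for illustration.

```python
def should_look_up(entity_labels, label_ignore_list=("wazuh_ignore",)):
    """Return False if the entity carries any label from the ignore list.

    Assumption: labels are compared case-insensitively; the real
    connector may compare them differently.
    """
    ignored = {label.lower() for label in label_ignore_list}
    return not any(label.lower() in ignored for label in entity_labels)
```

Removing the ignored label from the entity makes the check pass again, which corresponds to re-running the enrichment after removing WAZUH_IGNORE.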

Supported entities

The function documentation below describes how the various supported entities are looked up. Most observables, along with vulnerabilities and indicators, are supported:

Artifact
Directory
Domain-Name
Email-Addr
Hostname
IPv4-Addr
IPv6-Addr
Mac-Addr
Network-Traffic
Process
StixFile
Url
User-Account
User-Agent
Windows-Registry-Key
Windows-Registry-Value-Type
Vulnerability
Indicator

Indicators are special: There is no direct support for searching indicators, but they can be useful to include in the connector scope in the following situation:

An observable is created in the OpenCTI platform. It is automatically enriched by the connector, but because the configuration requires observables to have indicators based on them (see require_indicator_for_incidents, require_indicator_detection, ignore_revoked_indicators and indicator_score_threshold), sightings/incidents are not created.
An indicator is later created with a relationship indicating that it is “based on” an observable. The enrichment of the indicator will in turn run on the observable that it is based on, just as if the observable was enriched directly. The search will now produce events, because of the new indicator relationship.

Even if the indicator and the relationship are created or imported into OpenCTI at the same time as the observable, this information is not available at the time the connector is told to enrich the observable automatically. The separate enrichment of the indicator is therefore necessary.

Automation can be used to automatically create observables from a STIX-only indicator without any based-on relationship. See automatically create observables from indicators for details.

Note

The indicator pattern is not used at all.

Note

If the indicator is based on more than one observable, only the first relationship is taken into consideration.

class AlertSearcher.query_file(self, *, entity: dict, stix_entity: dict)

Search for File/Artifact hashes, filename/paths and/or size

If the file has a hash (SHA-256, MD5 or SHA-1), the hash will be looked up in any field with a matching name (for example *sha256*).
If the file also has a name, and if filesearch_options contains SearchNameAndHash, the name is included in the search.
If the file has no hash, a filename search is performed if filesearch_options contains SearchFilenameOnly.
If the file does not have hashes, but has a filename and a size, and filesearch_options contains SearchSize, the search looks for the exact size in syscheck.{size_before,size_after} along with the filenames.
If the file has additional names (x_opencti_additional_names) and filesearch_options contains SearchAdditionalFilenames, all filenames are included in the search.
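The rules above can be read as a small decision procedure. The sketch below is one possible reading, not the connector's implementation: the FileSearchOption flag class and plan_file_search are hypothetical names, and the way the options combine (for instance, that SearchSize only applies when a filename-only search is allowed) is an assumption.

```python
from enum import Flag, auto


class FileSearchOption(Flag):
    # Hypothetical stand-in for the values allowed in filesearch_options
    SearchNameAndHash = auto()
    SearchFilenameOnly = auto()
    SearchSize = auto()
    SearchAdditionalFilenames = auto()


def plan_file_search(file: dict, opts: FileSearchOption) -> dict:
    """Return the criteria a search would use, or {} if no search applies."""
    hashes = file.get("hashes") or {}
    names = [file["name"]] if file.get("name") else []
    if FileSearchOption.SearchAdditionalFilenames in opts:
        names += file.get("x_opencti_additional_names") or []

    if hashes:
        # Hash search, optionally narrowed by filename(s)
        plan = {"hashes": dict(hashes)}
        if names and FileSearchOption.SearchNameAndHash in opts:
            plan["names"] = names
        return plan

    if names and FileSearchOption.SearchFilenameOnly in opts:
        # No hash: fall back to filename search, optionally with exact size
        plan = {"names": names}
        if file.get("size") is not None and FileSearchOption.SearchSize in opts:
            plan["size"] = file["size"]
        return plan

    return {}
```

An empty result stands for "no search is performed", for example when a file has only a name and SearchFilenameOnly is not set.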

Filenames and paths

When searching for filenames, a number of settings dictate how to deal with paths. The filenames most likely do not contain a path, but if they do, the setting BasenameOnly removes this path before searching for the filename. Otherwise, the path, regardless of whether it is absolute, is included in the search.

If the file has a reference to a parent directory (parent_directory_ref), that directory’s path is included in the search if filesearch_options contains IncludeParentDirRef. If the filename already contains a path, it is removed and replaced with that of the parent directory.

If filesearch_options contains RequireAbsPath, the filename (including its parent directory’s path) must be absolute in order to run the search.
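The path handling described in this section could look roughly like the sketch below. The function resolve_search_path and its keyword arguments are hypothetical; in particular, the precedence between IncludeParentDirRef and BasenameOnly, and the heuristic for detecting Windows paths, are assumptions.

```python
from pathlib import PurePosixPath, PureWindowsPath


def _parse(path: str):
    # Assumption: a backslash or a drive letter means a Windows path
    windows = "\\" in path or (len(path) > 1 and path[1] == ":")
    return PureWindowsPath(path) if windows else PurePosixPath(path)


def resolve_search_path(name, parent_dir=None, *, basename_only=False,
                        include_parent_dir_ref=False, require_abs_path=False):
    """Return the path to search for, or None if no search should run."""
    path = _parse(name)
    if include_parent_dir_ref and parent_dir:
        # The parent directory replaces any path already in the filename
        path = _parse(parent_dir) / path.name
    elif basename_only:
        path = type(path)(path.name)
    if require_abs_path and not path.is_absolute():
        return None
    return str(path)
```

For example, resolve_search_path("evil.sh", require_abs_path=True) returns None, because a bare filename is not an absolute path.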

Matching

Regular expressions (Regexp) are used as long as filesearch_options contains AllowRegexp. This allows for flexible searching, like

Searching for filenames regardless of the path in alerts
Searching for paths with any backslash escaping pattern (\, \\, \\\\ etc.). Wazuh’s syscheck, for instance, uses no extra escaping, whereas sysmon and most other events use double escaping.
Ignoring case (CaseInsensitive)

However, regular expressions may be expensive or even disabled in your OpenSearch instance, so when not using Regexp, Match is used instead. This requires an exact match of both the filename and the path.
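To illustrate why regular expressions give this flexibility, the sketch below builds a pattern that accepts any number of backslashes between path components and, optionally, any leading directory. It uses Python's re module purely for demonstration; OpenSearch regexp queries use a different (Lucene) syntax, and path_pattern is a hypothetical helper, not part of the connector.

```python
import re


def path_pattern(path: str, *, any_directory: bool = False) -> str:
    """Build a regular expression matching `path` with any backslash escaping."""
    # Split on runs of backslashes and escape each component literally
    parts = [re.escape(part) for part in re.split(r"\\+", path)]
    # Rejoin with "one or more backslashes" between components
    pattern = r"\\+".join(parts)
    if any_directory:
        # Allow any leading directories before the filename
        pattern = r"(?:.*[\\/])?" + pattern
    return pattern


# Case-insensitive matching of a path with single or double escaping
pat = re.compile(path_pattern(r"C:\Temp\evil.exe"), re.IGNORECASE)
print(bool(pat.fullmatch(r"c:\\temp\\EVIL.exe")))
```

A plain match query cannot express either variation, which is why it requires the filename and path to match exactly.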

TODO: Mention IncludeRegValues if not moved to Analyse

class AlertSearcher.query_addr(self, *, entity: dict)

Search for IPv4/IPv6 addresses

If lookup_agent_ip is true, Wazuh agents’ IP addresses will also be looked up. This is probably not useful.

If ignore_private_addrs is true, no search is performed if the IP address is private (IPv4, IPv6).
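A check like ignore_private_addrs can be sketched with Python's standard ipaddress module. The function name should_search_addr is hypothetical, and the connector's exact definition of “private” is an assumption here: ipaddress also treats loopback and link-local ranges as private.

```python
import ipaddress


def should_search_addr(value: str, *, ignore_private_addrs: bool = True) -> bool:
    """Return False when the address is private and private addresses are ignored."""
    addr = ipaddress.ip_address(value)  # Accepts both IPv4 and IPv6
    return not (ignore_private_addrs and addr.is_private)
```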

class AlertSearcher.query_mac(self, *, entity: dict)

Search for MAC addresses

If lookup_mac_variants is true, various MAC address formats will be looked up. Otherwise, only lower-case, colon-separated MAC addresses will be looked up.
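As an illustration of what “various MAC address formats” might cover, the sketch below generates a few common notations from one address. Which formats the connector actually searches for is not stated here, so this particular list, and the helper name mac_variants, are assumptions.

```python
def mac_variants(mac: str) -> list[str]:
    """Return common notations of a MAC address, lower-case colon form first."""
    digits = "".join(c for c in mac.lower() if c in "0123456789abcdef")
    if len(digits) != 12:
        raise ValueError(f"not a MAC address: {mac!r}")
    octets = [digits[i:i + 2] for i in range(0, 12, 2)]
    groups = [digits[i:i + 4] for i in range(0, 12, 4)]
    lower = [
        ":".join(octets),  # 00:1a:2b:3c:4d:5e
        "-".join(octets),  # 00-1a-2b-3c-4d-5e
        ".".join(groups),  # 001a.2b3c.4d5e
        digits,            # 001a2b3c4d5e
    ]
    return lower + [variant.upper() for variant in lower]
```

With lookup_mac_variants disabled, only the first entry (lower-case, colon-separated) would be searched for.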

class AlertSearcher.query_traffic(self, *, stix_entity: dict)

Search for network traffic SCOs

The following properties in Network-Traffic are considered:

src_ref (MAC/IPv4/IPv6 addresses only, not domain names)
src_port
dst_ref (MAC/IPv4/IPv6 addresses only, not domain names)
dst_port
protocol

Support for domain names in sources and destinations, as well as support for the other properties, is not implemented, because no decoders seem to provide these kinds of fields.

If lookup_mac_variants is true, various MAC address formats will be looked up. Otherwise, only lower-case, colon-separated MAC addresses will be looked up.

Note that it is possible to add multiple addresses as sources/destinations in OpenCTI (despite the standard specifying that only one can be specified). However, only one is provided to the connector. The precedence is unknown.

class AlertSearcher.query_email(self, *, stix_entity: dict): Search for e-mail addresses

class AlertSearcher.query_domain(self, *, entity: dict)

Search for domain names and hostnames

If lookup_hostnames_in_cmd_line is enabled, command line alerts will also be searched.

class AlertSearcher.query_url(self, *, entity: dict)

Search for URLs

Some alerts, like logs from web servers, only contain the path from URLs (scheme, host etc. are not present). If lookup_url_without_host is enabled, these fields can still be matched. This is probably not useful for looking up IoCs unless you’re looking for malicious requests.

If lookup_url_ignore_trailing_slash is enabled, trailing slashes in the observable and in alert fields will be ignored.

If none of these settings are enabled, more fields are possibly searched.
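The effect of the two URL settings can be sketched with urllib.parse. The helper url_search_terms is hypothetical and only shows how the search terms might be derived from the observable; how the connector maps them onto alert fields is not covered.

```python
from urllib.parse import urlsplit


def url_search_terms(url, *, without_host=False, ignore_trailing_slash=False):
    """Return the strings that would be searched for, full URL first."""
    terms = [url]
    if without_host:
        # Keep only the path (and query), as found in web server logs
        parts = urlsplit(url)
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        terms.append(path)
    if ignore_trailing_slash:
        terms = [term.rstrip("/") or "/" for term in terms]
    return terms
```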

class AlertSearcher.query_directory(self, *, stix_entity: dict)

Search for Directory paths

Directory IoCs are most likely very uncommon, but extensive search support is still available. A number of options in dirsearch_options dictate how the search is performed:

MatchSubdirs will match parent directories in paths, like “/foo/bar” in “/foo/bar/baz”.
SearchFilenames will look for directories in filename fields as well. If disabled, fields that may contain either directories or absolute filename paths will still be searched.
CaseInsensitive ignores case when searching
RequireAbsPath requires the path in the observable to be absolute in order to perform a search
NormaliseBackslashes searches for several variations of backslash escaping if AllowRegexp is disabled. syscheck.path contains minimal escaping, whereas most other fields have twice the number of backslashes. When regexp is enabled, the number of backslashes in the observable and fields is completely ignored.
IgnoreTrailingSlash will ignore trailing slashes in both the observable and fields

AllowRegexp must be enabled for most of the search flexibility to work, and most of the other options require this option to be set. See DirSearchOption for details.
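Two of the options above can be illustrated without OpenSearch. The sketch below shows what NormaliseBackslashes might generate when regexp is unavailable, and what MatchSubdirs means for a POSIX path. Both helpers are hypothetical, and the chosen escaping levels (1, 2 and 4 backslashes) are an assumption.

```python
import re


def backslash_variants(path: str, levels=(1, 2, 4)) -> list[str]:
    """Return the path rewritten with 1, 2 and 4 backslashes per separator."""
    parts = re.split(r"\\+", path)
    return [("\\" * count).join(parts) for count in levels]


def is_subdir_match(directory: str, candidate: str) -> bool:
    """True if `candidate` is `directory` or lies below it (POSIX paths)."""
    directory = directory.rstrip("/")
    return candidate == directory or candidate.startswith(directory + "/")
```

Matching on a path-separator boundary is what keeps “/foo/bar” from matching “/foo/barn”.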

class AlertSearcher.query_reg_key(self, *, stix_entity: dict): Search for Windows registry keys

class AlertSearcher.query_reg_value(self, *, stix_entity: dict)

Search for Windows registry values

Wazuh’s FIM module only registers registry values’ hashes, not values. Also, it only supports REG_SZ, REG_EXPAND_SZ and REG_BINARY (i.e. not numeric values, like REG_DWORD).

This function will only search for registry values of type REG_{SZ, EXPAND_SZ, BINARY}, and it will only compare SHA-256 values (since that is what Wazuh’s FIM/syscheck module provides).

If the data type is REG_SZ/REG_EXPAND_SZ, a SHA-256 hash is computed from the value (data). If the data type is REG_BINARY, the content is expected to be a hex string, from which a SHA-256 hash is computed.
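The hashing rules can be sketched with hashlib. The function reg_value_sha256 is hypothetical, and the text encoding used for string values (UTF-8 here) is an assumption; the encoding that yields hashes matching Wazuh's may differ.

```python
import hashlib
from typing import Optional


def reg_value_sha256(data: str, data_type: str) -> Optional[str]:
    """Return the SHA-256 hex digest to search for, or None if unsupported."""
    if data_type in ("REG_SZ", "REG_EXPAND_SZ"):
        raw = data.encode("utf-8")  # Assumption: string values hashed as UTF-8
    elif data_type == "REG_BINARY":
        raw = bytes.fromhex(data)   # The observable holds the bytes as a hex string
    else:
        return None                 # E.g. REG_DWORD: no search is performed
    return hashlib.sha256(raw).hexdigest()
```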

class AlertSearcher.query_process(self, *, stix_entity: dict)

Search for process command lines

Command lines are hard to search, because arguments and options may be in any order. This function tries to match all of the words in the SCO’s command_line without being too inaccurate. Only the property command_line is supported.

Matching

First, the string is tokenised into a list of words separated by whitespace. Any sequence of words enclosed in non-escaped quotes is considered a single token, e.g.:

foo bar baz becomes [‘foo’, ‘bar’, ‘baz’]
foo ‘bar baz’ becomes [‘foo’, ‘bar baz’]

The first token is considered to be the command, and is treated differently: Any path is stripped (basename) in order to match commands both with and without a full path, and this token is searched for in command fields and not argument fields (where applicable).

The remaining tokens are considered arguments. First, any non-escaped quotes are removed from the beginning and end of each argument. In case of command line alerts without individual argument fields, a search is performed for each individual argument in command line fields on a non-whitespace boundary, e.g.:

“C:foobar baz ‘qux quux’” will search for “baz” and match “ baz”, “ baz” and “ baz ”, but not “bazaar”

For alerts with argument fields, like data.audit.execve, the arguments are matched as they are.
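The tokenising and matching steps can be approximated as in the sketch below. It is not the connector's tokeniser: it relies on shlex in non-POSIX mode, which keeps Windows backslashes intact but does not honour escaped quotes, and the names parse_command_line and argument_pattern are hypothetical. The pattern uses Python's re syntax, not OpenSearch's.

```python
import re
import shlex
from pathlib import PureWindowsPath


def strip_quotes(token: str) -> str:
    # Remove one pair of matching quotes around the token, if present
    if len(token) >= 2 and token[0] == token[-1] and token[0] in "'\"":
        return token[1:-1]
    return token


def parse_command_line(command_line: str):
    """Split into (command basename, [arguments]); quoted runs stay together."""
    tokens = shlex.split(command_line, posix=False)
    if not tokens:
        return None, []
    # PureWindowsPath accepts both "/" and "\" as separators
    command = PureWindowsPath(strip_quotes(tokens[0])).name
    return command, [strip_quotes(token) for token in tokens[1:]]


def argument_pattern(argument: str) -> re.Pattern:
    """Match the argument only when bounded by whitespace or string ends."""
    return re.compile(r"(?<!\S)" + re.escape(argument) + r"(?!\S)")
```

Here parse_command_line("foo 'bar baz'") yields the command foo and the single argument bar baz, and argument_pattern("baz") matches within “foo baz qux” but not within “foo bazaar”.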

Note

Regexp queries are needed to search for process command lines, which may be expensive, and may also be disabled in your OpenSearch installation if search.allow_expensive_queries is set to false. In order to disable regexp searches, disable process searching altogether by not specifying “Process” in the connector scope.

class AlertSearcher.query_vulnerability(self, *, stix_entity: dict)

Search for vulnerabilities

Results will typically contain an event when the vulnerability was first detected, then later when the vulnerability was “resolved” due to a package upgrade.

class AlertSearcher.query_account(self, *, stix_entity: dict): TODO: document

class AlertSearcher.query_user_agent(self, *, stix_entity: dict): Search for user agent strings