Query string syntax

The search engine uses the "Query string" DSL from Elasticsearch, so most of this information comes from the corresponding Elasticsearch documentation page.

The query string “mini-language” is used by the q query string parameter in the search API.
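Because the query travels in a URL parameter, characters such as +, : and " have to be URL-encoded before the request is sent; an unencoded + in particular would be decoded as a space and the "must be present" meaning would be lost. The following is a minimal sketch of that encoding step. The endpoint URL and the helper name are placeholders made up for illustration, not the real API address.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint used only for illustration; take the real one from the API page.
BASE_URL = "https://search.example.invalid/search"


def build_search_url(query: str, base_url: str = BASE_URL) -> str:
    """Return base_url with the query string expression URL-encoded as the q parameter."""
    return f"{base_url}?{urlencode({'q': query})}"


url = build_search_url('net:"Microsoft Corporation" +port:80')

# Decoding the URL again gives back the exact query that was passed in.
decoded = parse_qs(urlsplit(url).query)["q"][0]
print(url)
print(decoded)
```

The round trip through parse_qs is only there to show that nothing is lost by the encoding.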

The query string is parsed into a series of terms and operators.
A term can be a single word — quick or brown — or a phrase, surrounded by double quotes — "quick brown" — which searches for all the words in the phrase, in the same order.

Operators allow you to customize the search — the available options are explained below.

Field names

You can specify fields to search in the query syntax:

  • where the software.name field contains apache
    software.name:apache
  • where the software.name field contains apache or nginx
    software.name:(apache OR nginx)
  • where any of the fields software.name, software.version, software.modules[].name or software.modules[].version contains apache and php (note how we need to escape the * with a backslash):
    software.\*:(apache AND php)
  • where the net field contains the exact phrase "Microsoft Corporation"
    net:"Microsoft Corporation"
  • where the country field has any non-null value:
    _exists_:country
Available field names can be found on the API page.

Current field aliases:
  • net is an alias for network.organisation_name
  • country is an alias for geoip.country_name
  • asn is an alias for network.asn
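When queries are generated by a program, the field rules above can be wrapped in a small helper. The sketch below is only an illustration: the function names are invented here, the alias table simply mirrors the list above and may change, and the engine resolves aliases itself, so resolve_field is useful mainly for reading results.

```python
# Aliases copied from the list above; treat this table as a snapshot, not a reference.
FIELD_ALIASES = {
    "net": "network.organisation_name",
    "country": "geoip.country_name",
    "asn": "network.asn",
}


def resolve_field(name: str) -> str:
    """Return the full field name behind an alias, or the name itself."""
    return FIELD_ALIASES.get(name, name)


def field_clause(field: str, value: str) -> str:
    """Build a field:value clause, escaping * in the field name.

    Values containing spaces are wrapped in double quotes (phrase search),
    unless they are already a quoted phrase or a parenthesised group.
    """
    escaped_field = field.replace("*", r"\*")
    already_wrapped = value.startswith('"') or value.startswith("(")
    if " " in value and not already_wrapped:
        inner = value.replace("\\", "\\\\").replace('"', '\\"')
        value = f'"{inner}"'
    return f"{escaped_field}:{value}"


def exists_clause(field: str) -> str:
    """Build a clause matching documents where field has any non-null value."""
    return f"_exists_:{field}"


print(field_clause("software.*", "(apache AND php)"))
print(field_clause("net", "Microsoft Corporation"))
print(exists_clause("country"))
```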

Ranges

Ranges can be specified for date, numeric (including IP) or string fields.

Inclusive ranges are specified with square brackets [min TO max] and exclusive ranges with curly brackets {min TO max}.

  • All the ports between 3305 and 3308
    ports:[3305 TO 3308]
  • All the data before the 1st of May 2020
    timestamp:{* TO 2020-05-01}
  • Ranges with one side unbounded can use the following syntax
    timestamp:>2020-05-01
    timestamp:>=2020-05-01
    timestamp:<2020-05-01
    timestamp:<=2020-05-01
  • To combine an upper and lower bound with the simplified syntax, you would need to join two clauses with an AND operator
    ip:(>=212.0.0.0 AND <213.0.0.0)
  • A special kind of range is the IP range, which can be specified in CIDR notation as long as it is enclosed in double quotes (")
    ip:"212.0.0.0/8"
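The bracket syntax is regular enough to generate from code, and the relationship between a CIDR block and an explicit IP range can be checked with Python's standard ipaddress module. This is a minimal sketch; the helper names are made up for illustration and are not part of the search API.

```python
import ipaddress


def range_clause(field, low="*", high="*", inclusive=True) -> str:
    """Build field:[low TO high] (inclusive) or field:{low TO high} (exclusive).

    A bound left at "*" is unbounded on that side.
    """
    open_b, close_b = ("[", "]") if inclusive else ("{", "}")
    return f"{field}:{open_b}{low} TO {high}{close_b}"


def cidr_to_range(field: str, cidr: str) -> str:
    """Expand a CIDR block into the equivalent inclusive range clause."""
    network = ipaddress.ip_network(cidr)
    return range_clause(field, network.network_address, network.broadcast_address)


print(range_clause("ports", 3305, 3308))
print(range_clause("timestamp", high="2020-05-01", inclusive=False))
print(cidr_to_range("ip", "212.0.0.0/8"))
```

The last call shows that "212.0.0.0/8" covers the same addresses as ip:(>=212.0.0.0 AND <213.0.0.0): everything from 212.0.0.0 up to and including 212.255.255.255.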

Boolean operators

By default, all terms are optional, as long as one term matches.
A search for foo bar baz will find any document that contains one or more of foo or bar or baz. There are also boolean operators which can be used in the query string itself to provide more control.

The preferred operators are + (this term must be present) and - (this term must not be present).
All other terms are optional. For example, this query:

quick brown +fox -news

states that:

  • fox must be present
  • news must not be present
  • quick and brown are optional — their presence increases the relevance

The familiar boolean operators AND, OR and NOT (also written &&, || and !) are also supported but beware that they do not honor the usual precedence rules, so parentheses should be used whenever multiple operators are used together. For instance the previous query could be rewritten as:

((quick AND fox) OR (brown AND fox) OR fox) AND NOT news

This form now replicates the logic from the original query correctly, but the relevance scoring bears little resemblance to the original.
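The equivalence of the two forms can be checked by brute force over every combination of the four terms. The sketch below models only whether a document matches, not how it is scored, and the function names are invented for this illustration.

```python
from itertools import product

TERMS = ("quick", "brown", "fox", "news")


def matches_plus_minus(doc: set) -> bool:
    """quick brown +fox -news: fox is required, news is forbidden, the rest is optional."""
    return "fox" in doc and "news" not in doc


def matches_boolean(doc: set) -> bool:
    """((quick AND fox) OR (brown AND fox) OR fox) AND NOT news"""
    quick, brown, fox, news = (term in doc for term in TERMS)
    return ((quick and fox) or (brown and fox) or fox) and not news


# Every possible document, described by which of the four terms it contains.
all_docs = [
    {term for term, present in zip(TERMS, flags) if present}
    for flags in product([False, True], repeat=len(TERMS))
]

agree = all(matches_plus_minus(doc) == matches_boolean(doc) for doc in all_docs)
print(agree)  # True
```

Both forms select the same documents; what differs is the relevance score, since in the first form quick and brown only contribute to ranking.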

More valid search queries

  • country:germany
  • asn:12812
  • protocol:redis AND software.version:3.*
  • headers.Server:nginx AND port:80 searches for the word nginx in the Server header of services on port 80 (service scope only)