IronBee Reference Manual

Brian Rectanus

Ivan Ristic

William Metcalf


Table of Contents

Preface 1. Introduction
Configuration Data Fields
Addressing Fields Field Expansion Collection size
Rules
Metadata Phases
Events Request and Response Body Handling
Request body processing
2. Installation
Tested Operating Systems General Dependencies Resolving Dependencies
Installing LibHTP
Building, Testing and Installing IronBee
Initial Setup Build and Install IronBee Build C++ Wrapper and Utilities (Optional) Build and Run Unit Tests(Optional) Build Doxygen Documents(Optional) Build Docbook Manual(Optional)
3. Directives
AuditEngine AuditLogBaseDir AuditLogDirMode AuditLogFileMode AuditLogIndex AuditLogIndexFormat AuditLogParts AuditLogSubDirFormat DefaultBlockStatus GeoIPDatabaseFile Hostname Include InspectionEngine LoadModule Location Log LogHandler LogLevel ModuleBasePath PcreMatchLimit PcreMatchLimitRecursion RequestBodyBuffering RequestBodyBufferLimit RequestBodyBufferLimitAction ResponseBodyBuffering ResponseBodyBufferLimit ResponseBodyBufferLimitAction Rule RuleBasePath RuleDisable RuleEnable RuleEngineLogData RuleEngineLogLevel RuleExt RuleMarker SensorId Service Site SiteId StreamInspect
4. Data Fields
ARGS AUTH_PASSWORD AUTH_TYPE AUTH_USERNAME CAPTURE FIELD FIELD_NAME FIELD_NAME_FULL GEOIP HTP_REQUEST_FLAGS HTP_RESPONSE_FLAGS LAST_MATCHED REMOTE_ADDR REMOTE_PORT REQUEST_BODY REQUEST_BODY_PARAMS REQUEST_CONTENT_TYPE REQUEST_COOKIES REQUEST_FILENAME REQUEST_HEADERS REQUEST_HOST REQUEST_LINE REQUEST_METHOD REQUEST_PROTOCOL REQUEST_URI REQUEST_URI_FRAGMENT REQUEST_URI_HOST REQUEST_URI_PARAMS REQUEST_URI_PASSWORD REQUEST_URI_PATH REQUEST_URI_PORT REQUEST_URI_RAW REQUEST_URI_SCHEME REQUEST_URI_QUERY REQUEST_URI_USERNAME RESPONSE_BODY RESPONSE_CONTENT_TYPE RESPONSE_COOKIES RESPONSE_HEADERS RESPONSE_LINE RESPONSE_MESSAGE RESPONSE_PROTOCOL RESPONSE_STATUS SERVER_ADDR SERVER_PORT SITE_NAME TX UA
5. Operators
contains dfa eq ge gt ipmatch ipmatch6 le lt ne pm pmf rx
Using captured substrings to create variables
streq
6. Modifiers
allow logdata block capture chain confidence delRequestHeader delResponseHeader id msg phase rev setflag setRequestHeader setResponseHeader setvar severity status t tag
7. Transformation Functions
base64Decode compressWhitespace count htmlEntityDecode length lowercase removeWhitespace removeComments replaceComments trim trimLeft trimRight urlDecode min max normalizePath
8. Extending IronBee
Overview Execution Flow
Definitions Flow
Hooks Modules
Standalone Modules Provider Modules
Writing Modules in C
Anatomy of a C Module A Simple C Module Example
Writing Modules in Lua
A. Configuration Examples
Example IronBee Configuration Example Apache Configuration Example Trafficserver Configuration
B. Ideas For Future Improvements
Directive: AuditLogPart Directive: Include Directive: Hostname Directive: LoadModule Directive: Site Directive: DebugLogLevel Directive: SensorName Directive: RequestParamsExtra PersonalityAdd PersonalityAliasClear PersonalityAliasParam PersonalityClearAll PersonalityParam Variable: ARGS_URI Modules Variable length Special single-variable syntax Consolidation of @pm rules standalone sqlDecode
C. ModSecurity Migration Guide
Variables
Syntax Other changes
Rules Miscellaneous Features Not Supported (Yet)

Preface

...

Chapter 1. Introduction

...

Configuration

IronBee configuration consists of one or more configuration files. The following rules apply:

  • Escape sequences are as in JavaScript (section 7.8.4 in ECMA-262), except within PCRE regular expression patterns, where PCRE escaping is used

  • Lines that begin with # are comments

  • Lines are continued on the next line when \ is the last character on a line

Data Fields

IronBee supports two types of data fields:

  • Scalars, which can contain data of various types (strings or numbers)

  • Collections, which contain zero or more scalars

Fields that are created using external data (e.g., request parameters) can have any byte value used in their names, which are essentially binary strings. This is because we have to use whatever input data is provided to us.

Fields that are created by configuration must use names that begin with an underscore or a letter, and are followed by zero or more letter, digit, underscore, and dash characters. The only allowed letters are from the A-Z range (irrespective of the case, as these names are case-insensitive).
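
The naming rule for configuration-created fields can be written as a regular expression. The Python sketch below only illustrates the rule as stated above; it is not IronBee code, and the function names is_valid_config_field_name and same_field are hypothetical.

```python
import re

# A letter or underscore first, then any number of letters, digits,
# underscores, or dashes. Only the ASCII letters A-Z (either case) count.
_CONFIG_FIELD_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_-]*")


def is_valid_config_field_name(name):
    """Return True if name satisfies the naming rule described above."""
    return _CONFIG_FIELD_NAME.fullmatch(name) is not None


def same_field(a, b):
    """Names are case-insensitive, so compare them after lowercasing."""
    return a.lower() == b.lower()
```

Under this reading, names such as _score and my-field_2 are accepted, while 2fast and -x are rejected because of their first character.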

The names of built-in fields are written using all uppercase letters by convention.

Addressing Fields

How you address fields depends on the context. Rules, for example, typically accept one or more target fields, which means that you can give a rule a scalar field (e.g., REQUEST_URI) or a collection field (e.g., ARGS). When you give a rule a collection, it will extract all scalar fields out of it.

You can also choose to be more specific, either because the logic of the rule requires it, or because you are using a field in a context that does not accept collections.

To select the second field in a collection:

ARGS[2]

To select the last field in a collection:

ARGS[-1]

A collection can contain multiple scalar fields with the same name. You can use the colon operator to select only the named fields:

ARGS:username

If you wish, you can also select only a specific occurrence of the field. The following returns the first username parameter from the request parameters:

ARGS:username[1]

Sometimes you will need to deal with parameters whose full names you don't know in advance. For such cases we support selection using regular expression patterns. The following selects all fields whose names begin with "user":

ARGS:/^user/

If, on the other hand, you know the field name but it contains unusual characters (e.g., a colon, or a bracket), you can quote the entire name using single quotes and then use most characters without fear of breaking anything:

ARGS:'name[]'

Alternatively, you can use the escape syntax to "deactivate" unusual characters:

ARGS:name\x5b\x5d

Note

As of version 0.4.0, the [n] syntax described above is not implemented.
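
To make the selection semantics concrete, here is a small Python model of the addressing forms described above. It is a sketch under assumptions, not IronBee's implementation: a collection is modeled as a list of (name, value) pairs, name comparison is assumed to be exact, Python's re module stands in for PCRE, and the helper names by_index, by_name, and by_pattern are hypothetical.

```python
import re

# A collection is modeled as an ordered list of (name, value) pairs.


def by_index(fields, n):
    """1-based selection; a negative n counts from the end (-1 is the last)."""
    if n == 0:
        return None
    i = n - 1 if n > 0 else len(fields) + n
    return fields[i] if 0 <= i < len(fields) else None


def by_name(fields, name):
    """All fields with exactly this name, in original order."""
    return [f for f in fields if f[0] == name]


def by_pattern(fields, pattern):
    """All fields whose name matches the regular expression."""
    rx = re.compile(pattern)
    return [f for f in fields if rx.search(f[0])]


args = [("username", "alice"), ("user_id", "7"),
        ("username", "bob"), ("name[]", "x")]

second = by_index(args, 2)                 # models ARGS[2]
last = by_index(args, -1)                  # models ARGS[-1]
usernames = by_name(args, "username")      # models ARGS:username
first_username = by_index(usernames, 1)    # models ARGS:username[1]
user_fields = by_pattern(args, r"^user")   # models ARGS:/^user/
bracketed = by_name(args, "name[]")        # models ARGS:'name[]'
escaped = by_name(args, "name\x5b\x5d")    # models ARGS:name\x5b\x5d
```

In this model the quoted and escaped forms select the same field, because \x5b and \x5d decode to the bracket characters.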

Field Expansion

Sometimes there's a need to use a variable, or some value derived from a variable, in text strings. This can be achieved using field expansion. For example:

%{NAME}

If the expression resolves to only one variable, the entire %{NAME} expression will be replaced with the field value.
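
The substitution can be pictured with a short Python sketch. This is an illustrative model for scalar fields only, not IronBee code; the function name expand is hypothetical, and leaving references to unknown fields untouched is an assumption, since the text above does not say what happens in that case.

```python
import re

# Matches %{NAME} and captures NAME.
_EXPANSION = re.compile(r"%\{([^}]+)\}")


def expand(text, fields):
    """Replace each %{NAME} in text with the value of the scalar field NAME.

    Assumption: a reference to a field that does not exist is left as-is.
    """
    def substitute(match):
        name = match.group(1)
        if name in fields:
            return str(fields[name])
        return match.group(0)

    return _EXPANSION.sub(substitute, text)


message = expand("Request from %{REMOTE_ADDR} for %{REQUEST_URI}",
                 {"REMOTE_ADDR": "192.0.2.10", "REQUEST_URI": "/login"})
```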

Caution

What if the field is not a scalar? Perhaps the value could be JSON or similar format?

Collection size

To find out how many variables there are in a collection:

&ARGS

Rules

Metadata

Rule metadata is specified using the following modifiers:

  • id - globally unique identifier, in the form vendorPrefix/vendorRuleId. Example: ib/10001. Vendor prefix is a word that can contain any ASCII letter or digit, as well as the colon character. The colon should be used by vendors to differentiate among rule sets. For example: ib:default/10001 and ib:extra/10001 represent two separate rules, each belonging to a different set.

  • rev - revision (or serial), which is used to differentiate between two versions of the same rule; it defaults to 1 if not specified.

  • msg - message that will be used when the rule triggers. Rules that generate events must define a message.

  • ref - contains a URL that provides additional documentation for the rule; it is best practice to provide complete documentation for the rule on the publisher's web site; multiple references are allowed (NOTE: This is not yet implemented).

  • tag - assigns one or more tags to the rule; tags are used to classify rules and events (as events inherit all tags from the rule that generates them). Tag values must be constructed from ASCII letters, digits, and underscore characters. The forward slash character is used to establish a tag hierarchy.

  • phase - determines when the rule will run

  • severity - determines the seriousness of the finding (0-100)

  • confidence - determines the confidence the rule has in its logic (0-100)

Phases

Rule phase determines when a rule runs. IronBee understands the following phases:

REQUEST_HEADER

Invoked after all HTTP request headers have been read, but before reading the HTTP request body (if any). Most rules should not use this phase, opting for the REQUEST phase instead.

REQUEST

Invoked after receiving the entire HTTP request, which may involve request body and request trailers, but it will run even when neither is present.

RESPONSE_HEADER

Invoked after receiving the entire HTTP response header.

RESPONSE

Invoked after receiving the HTTP response body (if any) and response trailers (if any).

POSTPROCESS

Invoked after the entire transaction has been processed. This phase is for logging and tracking data between transactions, such as storing state. Actions cannot affect the transaction in this phase.

Rule ordering

The phase information, assigned to every rule, determines when a rule will run within the transaction lifecycle. Within a phase, configuration determines how rules are ordered. When a rule is read from the configuration files, it is appended to the list of rules in the desired phase. At run-time, the engine processes the rules of a phase one by one, in list order, until all of them have run or processing is interrupted.

The order in which rules run may differ from the order in which they appear in the configuration files, but only when the configuration explicitly manipulates the order.
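
The append-per-phase behavior can be modeled in a few lines of Python. This is a toy sketch, not the rule engine: the class name RuleSet and the rule identifiers are hypothetical, and neither interruption nor explicit reordering by the configuration is modeled.

```python
# Phases in the order in which they occur within a transaction.
PHASES = ["REQUEST_HEADER", "REQUEST", "RESPONSE_HEADER",
          "RESPONSE", "POSTPROCESS"]


class RuleSet:
    """Toy model: one ordered rule list per phase."""

    def __init__(self):
        self._rules = {phase: [] for phase in PHASES}

    def add(self, rule_id, phase):
        """Append a rule to its phase, as when the configuration is read."""
        if phase not in self._rules:
            raise ValueError("unknown phase: " + phase)
        self._rules[phase].append(rule_id)

    def execution_order(self):
        """Rules grouped by phase; configuration order kept within a phase."""
        return [(phase, rule_id)
                for phase in PHASES
                for rule_id in self._rules[phase]]


rules = RuleSet()
rules.add("ib/3", "RESPONSE")
rules.add("ib/1", "REQUEST")
rules.add("ib/2", "REQUEST_HEADER")
rules.add("ib/4", "REQUEST")
order = rules.execution_order()
```

Here ib/2 runs first even though it was read third, because its phase comes first; ib/1 and ib/4 share a phase and keep their configuration order.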

Standalone rules

Standalone rules are those that do not depend on other rules and which are safe to execute at any point within the phase. To mark a rule as standalone, apply the standalone modifier to it:

Rule ARGS @rx TEST phase:REQUEST standalone

When you mark a rule as standalone, the rule engine is free to move it around to optimize the execution. (NOTE: this is not yet implemented.)

Events

During a transaction, one or more events may be generated. Each event has the following mandatory attributes:

  • Event number; the first event in a transaction is assigned number 1, the second number 2, and so on...

  • Rule identifier

  • Message

  • Phase information (generated)

  • Severity

  • Suppression status (whether the event was suppressed)

Optional attributes:

  • Tags

  • Reference URLs

  • Data source (generated when applicable); parameter name

Request and Response Body Handling

Request and response headers are generally limited in size and thus easy to handle. This is especially true in a proxy deployment, where buffering is possible. Proxies will typically cache request and response headers, making it easy to perform inspection and reliably block when necessary.

The situation is different with request and response bodies, which can be quite big. For example, request bodies may carry one or more files; response bodies, too, often deliver files, and some HTML responses can get quite big as well. Even when sites do not normally have large request bodies, request bodies are under the control of attackers, who may intentionally submit large amounts of data in an effort to bypass inspection.

Let's look at what might be of interest here:

Inspection

Do we want to inspect a particular request or response body? Whereas it would be rare not to want to inspect a request body, it's quite common with response bodies, because many carry static files and images. We can decide by looking at the Content-Type header.

Processing

After we decide to inspect a body, we need to determine how to process it, after which inspection can take place. Only in the simplest case, when the body is treated as a continuous stream of bytes, is no processing needed. Content types such as application/x-www-form-urlencoded and multipart/form-data must be parsed before fine-grained analysis can be undertaken. In many cases we may need to process a body in more than one way to support all the desired approaches to analysis.

Buffering

Reliable blocking is possible only when all of the data is buffered: accumulate the entire request (or response) until the inspection is complete, and then release it all at once. Blocking without buffering can be effective, but such an approach is susceptible to evasion in edge cases. The comfort of reliable blocking comes at a price. End-user performance may degrade because, rather than passing data along as it becomes available, the proxy must wait to receive the last byte before letting any of it through. In some cases (e.g., WebSockets) there is an expectation that chunks of data travel across the wire without delay. And, of course, buffering increases the memory consumption required for inspection.

Logging

Finally, we wish to be able to log the entire transaction for post-processing or evidence. This is easy to do when all of the data is buffered, but it should also be possible even when buffering is not enabled.

Request body processing

IronBee comes with built-in logic that controls the default handling of request body data. It will correctly handle application/x-www-form-urlencoded and multipart/form-data requests. Other formats will be added as needed.

Chapter 2. Installation

...

Tested Operating Systems

The table below lists the operating systems that are officially supported by IronBee. Our definition of a tested operating system is one on which we perform build, functionality, and regression testing. This is not to say that IronBee will not work if your OS is not listed in this table; it probably will, as long as you can meet the general dependencies outlined in the section "General Dependencies".

Table 2.1. Tested Operating Systems

Operating System | Version(s) | Website
Red Hat Enterprise Linux | Current and previous version | http://www.redhat.com/rhel/
Fedora | Current version | http://fedoraproject.org/
Debian | Current stable version | http://www.debian.org/
Ubuntu-LTS | Current and previous version | https://wiki.ubuntu.com/LTS/
Ubuntu (non-LTS release) | Current version | http://www.ubuntu.com/
OS X | Lion | http://www.apple.com/

General Dependencies

...

Table 2.2. Build Tool Dependencies

Dependency | Version | Description | Website
C compiler | gcc 4.6+ or clang 3.0 | Currently gcc and clang have been tested. | http://gcc.gnu.org/ or http://clang.llvm.org/
GNU Build System | autoconf 2.9+ | Autotools (Automake, Autoconf, Libtool) | http://www.gnu.org/software/hello/manual/autoconf/The-GNU-Build-System.html
pkg-config | any | Helper tool used when compiling applications and libraries. | http://pkg-config.freedesktop.org/wiki/

Table 2.3. Software Version Control

Dependency | Version | Description | Website
Git | latest | Git is needed to access the IronBee source repository. | http://git-scm.com/

Table 2.4. Libraries for IronBee

Dependency | Version | Description | Website
PCRE | 8.0+ | Regular Expression Library. | http://www.pcre.org/
LibHTP | 0.3.0+ | Security-aware parser for the HTTP protocol. | https://github.com/ironbee/libhtp
PThread | NA | POSIX threads | NA
ossp-uuid | 1.6.2+ | OSSP UUID library. | http://www.ossp.org/pkg/lib/uuid/

Table 2.5. Libraries for IronBee C++ Wrapper and Utilities

Dependency | Version | Description | Website
C++ Compiler | g++ 4.6+ or clang++ 3.0 | Currently gcc and clang have been tested. | http://gcc.gnu.org/ or http://clang.llvm.org/
Boost | 1.46+ | General purpose C++ library. | http://www.boost.org/

Table 2.6. Libraries for IronBee C++ CLI (clipp)

Dependency | Version | Description | Website
protobuf-cpp | 2.4.1+ | Generic serialization library. | https://developers.google.com/protocol-buffers/
libpcap | 1.1.1+ | Packet capture library (optional). | http://www.tcpdump.org/
libnids | latest | TCP reassembly library (optional). | http://libnids.sourceforge.net/
libnet | latest | Generic networking library (optional). | http://libnet.sourceforge.net/
stringencoders | 3.10+ | String encoder library (optional). | https://code.google.com/p/stringencoders/

Table 2.7. Server

Dependency | Version | Description | Website
Apache Traffic Server | 3.1 | Apache foundation's Traffic Server. | http://trafficserver.apache.org/

Resolving Dependencies

Installing LibHTP

In the future, LibHTP will be integrated into the IronBee build process, but until that time you will have to download, build, and install LibHTP from its public git repository.

# Clone the repository
git clone git://github.com/ironbee/libhtp.git
cd libhtp

# Generate the autotools utilities
./autogen.sh

# Configure the build for the current platform
./configure

# Build and Install
make
sudo make install

Building, Testing and Installing IronBee

...

Initial Setup

# Clone the repository
git clone git://github.com/ironbee/ironbee.git
cd ironbee

# Generate the autotools utilities
./autogen.sh

Build and Install IronBee

# Configure the build for the current platform
./configure 

# Build and Install
make
sudo make install

Build C++ Wrapper and Utilities (Optional)

...

# Configure the build for the current platform, enabling C++
./configure --enable-cpp

# Build and Install
make
sudo make install

Build and Run Unit Tests (Optional)

...

make check

Build Doxygen Documents (Optional)

...

make doxygen

Build Docbook Manual (Optional)

...

make manual

Chapter 3. Directives

...

AuditEngine

Description: Configures the audit log engine.

Syntax: AuditEngine On|Off|RelevantOnly

Default: RelevantOnly

Context: Any

Cardinality: 0..1

Module: core

Version: 0.3

Setting AuditEngine to RelevantOnly, the default, does not by itself cause any transaction to be logged. Instead, further activity (e.g., a rule match) is required for a transaction to be recorded. Setting AuditEngine to On activates audit logging for all transactions, which may cause a large amount of data to be logged.

AuditLogBaseDir

Description: Configures the directory where individual audit log entries will be stored. This also serves as the base directory for AuditLogIndex if it uses a relative path.

Syntax: AuditLogBaseDir directory

Default: "/var/log/ironbee"

Context: Any

Cardinality: 0..1

Module: core

Version: 0.3

AuditLogDirMode

Description: Configures the directory mode that will be used for new directories created during audit logging.

Syntax: AuditLogDirMode octal-mode

Default: 0700

Context: Any

Cardinality: 0..1

Module: core

Version: 0.4

AuditLogFileMode

Description: Configures the file mode that will be used when creating individual audit log files.

Syntax: AuditLogFileMode octal-mode

Default: 0600

Context: Any

Cardinality: 0..1

Module: core

Version: Not Implemented Yet

AuditLogIndex

Description: Configures the location of the audit log index file.

Syntax: AuditLogIndex None|filename

Default: ironbee-index.log

Context: Any

Cardinality: 0..1

Module: core

Version: 0.4

Relative filenames are resolved against the AuditLogBaseDir directory; specifying None disables the index file entirely.

AuditLogIndexFormat

Description: Configures the format of the entries logged in the audit log index file.

Syntax: AuditLogIndexFormat format-string

Default: "%T %h %a %S %s %t %f"

Context: Any

Cardinality: 0..1

Module: core

Version: 0.4

  • %% The percent sign

  • %a Remote IP-address

  • %A Local IP-address

  • %h HTTP Hostname

  • %s Site ID

  • %S Sensor ID

  • %t Transaction ID

  • %T Transaction timestamp (YYYY-MM-DDTHH:MM:SS.ssss+/-ZZZZ)

  • %f Audit log filename (relative to AuditLogBaseDir)
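
As an illustration of how such a format string expands, the following Python sketch substitutes each specifier from a dictionary of values. It is a model only, not IronBee code: the function name format_index_line and all sample values are hypothetical, and leaving unrecognized specifiers untouched is an assumption.

```python
import re


def format_index_line(fmt, values):
    """Expand %-specifiers in fmt using values, keyed by specifier letter.

    "%%" yields a literal percent sign. Assumption: a specifier with no
    entry in values is copied to the output unchanged.
    """
    def substitute(match):
        code = match.group(1)
        if code == "%":
            return "%"
        return values.get(code, match.group(0))

    return re.sub(r"%(.)", substitute, fmt)


# Hypothetical sample values for one transaction.
sample = {
    "T": "2012-06-01T12:00:00.0000+0000",  # transaction timestamp
    "h": "www.example.com",                # HTTP hostname
    "a": "192.0.2.10",                     # remote IP address
    "A": "198.51.100.1",                   # local IP address
    "S": "sensor-1",                       # sensor ID
    "s": "site-1",                         # site ID
    "t": "tx-1",                           # transaction ID
    "f": "2012/tx-1.log",                  # audit log filename
}

line = format_index_line("%T %h %a %S %s %t %f", sample)
```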

AuditLogParts

Description: Configures which parts will be logged to the audit log.

Syntax: AuditLogParts [+|-]partType ...

Default: default

Context: Any

Cardinality: 0..n

Module: core

Version: 0.4

An audit log consists of many parts; AuditLogParts determines which parts are recorded by default. The parts are inherited into child contexts (Site, Location, etc.). Specifying a part with the + or - operator will add or remove the given part from the current set of parts. Specifying the first option without a + or - operator will cause all inherited options to be overridden, so that the listed options are the only ones set. Here is what your configuration might look like:

AuditLogParts minimal +request -requestBody +response -responseBody

The above first resets the list of parts to minimal, adds all the request parts except the requestBody, then adds all the response parts except the responseBody.

Later, in a sub-context, you may wish to enable response body logging and thus can just specify this part with the + operator:

<Location /some/path>
    AuditLogParts +responseBody
</Location>

If you already had response body logging enabled, but didn't want it any more, you would write:

<Location /some/path>
    AuditLogParts -responseBody
</Location>

Audit Log Part Names:

  • header: Audit Log header (required)

  • events: List of events that triggered

  • requestMetadata: Information about the request

  • requestHeaders: Raw request headers

  • requestBody: Raw request body

  • requestTrailers: Raw request trailers

  • responseMetadata: Information about the response

  • responseHeaders: Raw response headers

  • responseBody: Raw response body

  • responseTrailers: Raw response trailers

  • debugFields: Currently not implemented

Audit Log Part Group Names:

These are just aliases for multiple parts.

  • none: Removes all parts

  • minimal: Minimal parts (currently header and events parts)

  • default: Default parts (currently minimal and request/response parts without bodies)

  • request: All request related parts

  • response: All response related parts

  • debug: All debug related parts

  • all: All parts
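
The way parts and groups combine can be modeled with plain set operations. The Python sketch below is an illustration under assumptions, not IronBee's implementation: the function names resolve and apply_parts are hypothetical, an unprefixed option that is not first is assumed to behave like +, and the "required" status of the header part is not modeled.

```python
REQUEST = {"requestMetadata", "requestHeaders", "requestBody", "requestTrailers"}
RESPONSE = {"responseMetadata", "responseHeaders", "responseBody",
            "responseTrailers"}
DEBUG = {"debugFields"}
MINIMAL = {"header", "events"}
ALL_PARTS = MINIMAL | REQUEST | RESPONSE | DEBUG

# Group names are aliases for sets of parts, as listed above.
GROUPS = {
    "none": set(),
    "minimal": MINIMAL,
    "default": MINIMAL | (REQUEST - {"requestBody"}) | (RESPONSE - {"responseBody"}),
    "request": REQUEST,
    "response": RESPONSE,
    "debug": DEBUG,
    "all": ALL_PARTS,
}


def resolve(name):
    """Turn a part or group name into a fresh set of part names."""
    if name in GROUPS:
        return set(GROUPS[name])
    if name in ALL_PARTS:
        return {name}
    raise ValueError("unknown part or group: " + name)


def apply_parts(inherited, options):
    """Apply one AuditLogParts line to the set inherited from the parent."""
    parts = set(inherited)
    for position, option in enumerate(options):
        if option.startswith("+"):
            parts |= resolve(option[1:])
        elif option.startswith("-"):
            parts -= resolve(option[1:])
        elif position == 0:
            parts = resolve(option)   # first unprefixed option resets the set
        else:
            parts |= resolve(option)  # assumption: treated like "+"
    return parts


site = apply_parts(
    GROUPS["default"],
    "minimal +request -requestBody +response -responseBody".split())
location = apply_parts(site, ["+responseBody"])
```

In this model the example line from the text yields the same set as the default group, and the Location override adds only responseBody to what it inherits.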

AuditLogSubDirFormat

Description: Configures the directory structure created under the AuditLogBaseDir directory. This is a strftime(3) format string allowing the directory structure to be created based on date/time.

Syntax: AuditLogSubDirFormat format-string

Default: None (audit logs will be created in the base directory)

Context: Any

Cardinality: 0..1

Module: core

Version: 0.4

DefaultBlockStatus

Description: Configures the default HTTP status code used for blocking.

Syntax: DefaultBlockStatus http-status-code

Default: 403

Context: Any

Cardinality: 0..1

Module: core

Version: 0.4

GeoIPDatabaseFile

Description: Configures the location of the GeoIP database file.

Syntax: GeoIPDatabaseFile filename

Default: /usr/share/geoip/GeoLiteCity.dat

Context: Any

Cardinality: 0..1

Module: geoip

Version: 0.4

Hostname

Description: Maps hostnames to a Site.

Syntax: Hostname hostname

Default: * (any)

Context: Site

Cardinality: 0..n

Module: core

Version: 0.4

The Hostname directive establishes a mapping between a Site and one or more URL spaces.

In the simplest case, a site will occupy a single hostname:

Hostname www.ironbee.com

More often than not, however, several names will be used:

Hostname www.ironbee.com
Hostname ironbee.com

Wildcards are permitted when there are multiple names under a common domain. Only one wildcard character per hostname is allowed and it must currently be on the left-hand side:

Hostname ironbee.com
Hostname *.ironbee.com

Finally, to match any hostname on any IP address (which you will need to do in default sites), use a single asterisk, which is the default if no Hostname directive is specified for a site:

Hostname *
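
The matching rules above can be approximated with the following Python sketch. It is a model under assumptions, not IronBee code: the function names are hypothetical, comparison is assumed to be case-insensitive, and a leading wildcard is assumed to cover one or more labels but not the bare domain (which is why the example above lists ironbee.com separately).

```python
def hostname_matches(pattern, hostname):
    """Return True if hostname is covered by a Hostname pattern."""
    pattern = pattern.lower()
    hostname = hostname.lower()
    if pattern == "*":
        return True                      # a single asterisk matches anything
    if pattern.startswith("*"):
        suffix = pattern[1:]             # e.g. ".ironbee.com"
        return len(hostname) > len(suffix) and hostname.endswith(suffix)
    return pattern == hostname           # no wildcard: exact match


def site_matches(patterns, hostname):
    """A site with no Hostname directive behaves as if it had 'Hostname *'."""
    return any(hostname_matches(p, hostname) for p in (patterns or ["*"]))


names = ["ironbee.com", "*.ironbee.com"]
covers_www = site_matches(names, "www.ironbee.com")
covers_bare = site_matches(names, "ironbee.com")
covers_other = site_matches(names, "example.org")
```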

Include

Description: Includes an external file in the configuration.

Syntax: Include filename

Default: None

Context: Any

Cardinality: 0..n

Module: core

Version: Not Implemented Yet

Allows inclusion of another file into the current configuration file. The following line will include the contents of the file conf/sites.conf in the configuration:

Include conf/sites.conf

If you specify a relative path, the location of the current configuration file will be used to resolve it.

InspectionEngine

Description: Configures the inspection engine.

Syntax: InspectionEngine On|DetectionOnly|Off

Default: Off

Context: Any

Cardinality: 0..1

Module: core

Version: Not Implemented Yet

LoadModule

Description: Loads an external module into configuration.

Syntax: LoadModule module

Default: None

Context: Main

Cardinality: 0..n

Module: core

Version: 0.4

This directive will add an external module to the engine, potentially making new directives available to the configuration.

Location

Description: Creates a subcontext that can have a different configuration.

Syntax: <Location path>...</Location>

Default: None

Context: Site

Cardinality: 0..n

Module: core

Version: 0.4

A subcontext created by this directive initially has identical configuration to that of the site it belongs to. Further directives are required to introduce changes. Locations are evaluated in the order in which they appear in the configuration file. The first location that matches the request path will be used. This means that you should put the most specific location first, followed by the less specific ones.

Include rules.conf

<Site site1>
    Service *:80
    Service 10.0.1.2:443
    Hostname site1.example.com

    <Location /prefix/app1>
        RuleEnable all
    </Location>

    <Location /prefix>
        RuleEnable tag:GenericRules
    </Location>
</Site>
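
The first-match behavior described above can be modeled in a few lines of Python, which also shows why ordering matters. This is a sketch, not IronBee code: the function name select_location is hypothetical, and matching a location as a simple path prefix is an assumption.

```python
def select_location(locations, request_path):
    """Return the first configured location that matches, or None.

    locations is in configuration-file order. Assumption: a location
    matches when the request path starts with the location path.
    """
    for path in locations:
        if request_path.startswith(path):
            return path
    return None  # no location matched; the site configuration applies


# Most specific first, as in the example above.
good_order = ["/prefix/app1", "/prefix"]
app1 = select_location(good_order, "/prefix/app1/index.html")
generic = select_location(good_order, "/prefix/other")

# Less specific first: /prefix shadows /prefix/app1.
bad_order = ["/prefix", "/prefix/app1"]
shadowed = select_location(bad_order, "/prefix/app1/index.html")
```

With the less specific location listed first, the /prefix/app1 block is never selected, which is the situation the ordering advice is meant to prevent.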

Log

Description: Configures the location of the log file.

Syntax: Log default|filename

Default: default

Context: Any

Cardinality: 0..1

Module: core

Version: 0.4

LogHandler

Description: Configures the log handler.

Syntax: LogHandler name

Default: None

Context: Any

Cardinality: 0..1

Module: core

Version: 0.3

Note

The log handler allows the log to be handled by another facility (currently the server). For Apache Traffic Server, this should be set to "ironbee-ts" and for Apache Web Server, this should be set to "mod_ironbee". Using the log handler overrides the Log directive.

LogLevel

Description: Configures the detail level of the entries recorded to the log.

Syntax: LogLevel level

Default: 4

Context: Any

Cardinality: 0..1

Module: core

Version: 0.4

The following log levels are supported:

  • emergency - system unusable

  • alert - crisis happened

  • critical - crisis coming

  • error - error occurred

  • warning - error likely to occur

  • notice - something unusual happened

  • info - informational messages

  • debug - debugging: transaction state changes

  • debug2 - debugging: log of activities carried out

  • debug3 - debugging: activities, with more detail

  • trace - debugging: developer log messages

ModuleBasePath

Description: Configures the base path where IronBee modules are loaded.

Syntax: ModuleBasePath path

Default: The lib directory under the IronBee install prefix.

Context: Main

Cardinality: 0..1

Module: core

Version: 0.4

PcreMatchLimit

Description: Configures the PCRE library match limit.

Syntax: PcreMatchLimit limit

Default: 5000

Context: Main

Cardinality: 0..1

Module: pcre

Version: 0.4

From the pcreapi manual: The match_limit field provides a means of preventing PCRE from using up a vast amount of resources when running patterns that are not going to match, but which have a very large number of possibilities in their search trees. The classic example is a pattern that uses nested unlimited repeats.

PcreMatchLimitRecursion

Description: Configures the PCRE library match limit recursion.

Syntax: PcreMatchLimitRecursion limit

Default: 5000

Context: Main

Cardinality: 0..1

Module: pcre

Version: 0.4

From the pcreapi manual: The match_limit_recursion field is similar to match_limit, but instead of limiting the total number of times that match() is called, it limits the depth of recursion. The recursion depth is a smaller number than the total number of calls, because not all calls to match() are recursive. This limit is of use only if it is set smaller than match_limit.

RequestBodyBuffering

Description: Enable/disable request body buffering.

Syntax: RequestBodyBuffering On|Off

Default: Off

Context: Any

Cardinality: 0..1

Module: core

Version: Not implemented yet

RequestBodyBufferLimit

Description: Configures the size of the request body buffer.

Syntax: RequestBodyBufferLimit byte_limit

Default: None

Context: Any

Cardinality: 0..1

Module: core

Version: Not implemented yet

RequestBodyBufferLimitAction

Description: Configures what happens when the buffer is smaller than the request body.

Syntax: RequestBodyBufferLimitAction Reject|RollOver

Default: None

Context: Any

Cardinality: 0..1

Module: core

Version: Not implemented yet

When Reject is configured, the transaction with a body larger than the buffer will be blocked. With RollOver selected, the buffer will be used to keep as much data as possible, but any overflowing data will be allowed to the backend. Request headers will be sent before the first overflow batch. In detection-only mode, Reject is converted to RollOver.

ResponseBodyBuffering

Description: Enable/disable response body buffering.

Syntax: ResponseBodyBuffering On|Off

Default: Off

Context: Any

Cardinality: 0..1

Module: core

Version: Not implemented yet

ResponseBodyBufferLimit

Description: Configures the size of the response body buffer.

Syntax: ResponseBodyBufferLimit byte_limit

Default: None

Context: Any

Cardinality: 0..1

Module: core

Version: Not implemented yet

ResponseBodyBufferLimitAction

Description: Configures what happens when the buffer is smaller than the response body.

Syntax: ResponseBodyBufferLimitAction Reject|RollOver

Default: None

Context: Any

Cardinality: 0..1

Module: core

Version: Not implemented yet

When Reject is configured, the transaction with a body larger than the buffer will be blocked. With RollOver selected, the buffer will be used to keep as much data as possible, but any overflowing data will be passed through to the client. Response headers will be sent before the first overflow batch. In detection-only mode, Reject is converted to RollOver.

Rule

Description: Loads a rule and, in most contexts, enables the rule for execution in that context.

Syntax: Rule input [input2 ... inputN] @operator op_param [modifiers]

Default: None

Context: Any

Cardinality: 0..n

Module: rules

Version: 0.4

Note

Loading a rule will, in most contexts, also enable the rule to be executed in that context. However, the main configuration context is special. Loading a rule in the main configuration context will NOT enable the rule, but just load it into memory so that it can be shared by other contexts. You must explicitly use RuleEnable to enable the rule.

RuleBasePath

Description: Configures the base path where external IronBee rules are loaded.

Syntax: RuleBasePath path

Default: The lib directory under the IronBee install prefix.

Context: Main

Cardinality: 0..1

Module: core

Version: 0.4

RuleDisable

Description: Disables a rule from executing in the current configuration context.

Syntax: RuleDisable ["all" | "id:"id | "tag:"name] ...

Default: None

Context: Any

Cardinality: 0..n

Module: rules

Version: 0.4

Rules can be disabled by id or tag. Any number of id or tag modifiers can be specified per directive. All disables are processed after enables. See the RuleEnable directive for an example.

RuleEnable

Description: Enables a rule for execution in the current configuration context.

Syntax: RuleEnable ["all" | "id:"id | "tag:"name] ...

Default: None

Context: Any

Cardinality: 0..n

Module: rules

Version: 0.4

Rules can be enabled by id or tag. Any number of id or tag modifiers can be specified per directive. All enables are processed before disables. For example:

Include "rules/big_ruleset.conf"

<Site foo>
    Hostname foo.example.com
    RuleEnable id:1234
    RuleEnable id:3456 tag:SQLi
    RuleDisable id:5678 tag:experimental tag:heavyweight
</Site>
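
The "enables before disables" rule can be illustrated with a small Python model. This is a sketch, not the rule engine: the Rule type, the function names, and the sample rule set are hypothetical, and tags are assumed to match by exact name (tag hierarchies are not modeled).

```python
from typing import NamedTuple


class Rule(NamedTuple):
    id: str
    tags: frozenset


def selects(selector, rule):
    """Return True if a selector such as 'all', 'id:1234', or 'tag:SQLi' picks rule."""
    if selector == "all":
        return True
    kind, _, value = selector.partition(":")
    if kind == "id":
        return rule.id == value
    if kind == "tag":
        return value in rule.tags
    raise ValueError("unknown selector: " + selector)


def enabled_rules(loaded, enables, disables):
    """Apply every enable first, then every disable, whatever their order."""
    enabled = [r for r in loaded if any(selects(s, r) for s in enables)]
    return [r.id for r in enabled if not any(selects(s, r) for s in disables)]


# Hypothetical contents of a loaded rule set.
loaded = [
    Rule("1234", frozenset()),
    Rule("3456", frozenset()),
    Rule("5678", frozenset({"SQLi"})),
    Rule("7777", frozenset({"SQLi", "experimental"})),
    Rule("9999", frozenset({"XSS"})),
]

active = enabled_rules(
    loaded,
    enables=["id:1234", "id:3456", "tag:SQLi"],
    disables=["id:5678", "tag:experimental", "tag:heavyweight"],
)
```

In this model rules 5678 and 7777 are first enabled through the SQLi tag and then removed by the disables, so only 1234 and 3456 remain active; rule 9999 is never enabled.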

RuleEngineLogData

Description: Configures the data logged by the rule engine.

Syntax: RuleEngineLogData [+|-]option ...

Default: ruleExec

Context: Any

Cardinality: 0..n

Module: core

Version: 0.6

The following data type options are supported:

  • tx - Log the transaction:

    TX_START clientip:port site-hostname
    ...
    TX_END

  • requestLine - Log the HTTP request line:

    REQ_LINE method uri version-if-given

  • requestHeader - Log the HTTP request header:

    REQ_HEADER name: value

  • requestBody - Log the HTTP request body, possibly in multiple chunks:

    REQ_BODY size data

  • responseLine - Log the HTTP response line:

    RES_LINE version status message

  • responseHeader - Log the HTTP response header:

    RES_HEADER name: value

  • responseBody - Log the HTTP response body, possibly in multiple chunks:

    RES_BODY size data

  • phase - Log the phase about to execute:

    PHASE name

  • rule - Log the rule executing:

    RULE_START rule-type
    ...
    RULE_END

  • target - Log the target being inspected:

    TARGET full-target-name {NOT_FOUND|field-type field-name field-value}

  • transformation - Log the transformation being executed:

    TFN tfn-name(param) {ERROR error}

  • operator - Log the operator being executed:

    OP op-name(param) TRUE|FALSE {ERROR error}

  • action - Log the action being executed:

    ACTION action-name(param) {ERROR error}

  • event - Log the event being logged:

    EVENT rule-id type action [confidence/severity] [csv-tags] msg

  • audit - Log the audit log filename being written:

    AUDIT audit-log-filename

The following alias options are supported:

  • request - Alias for: requestLine, requestHeader, requestBody

  • response - Alias for: responseLine, responseHeader, responseBody

  • ruleExec - Alias for: phase, rule, target, transformation, operator, action, actionableRulesOnly

  • default - Alias for: ruleExec

  • all - Alias for all data options

The following filter options are supported:

  • actionableRulesOnly - Filter option indicating that only actionable rules (rules whose actions executed) are logged. Any rule-specific logging is delayed or suppressed until at least one action is executed.
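
The way aliases and the +/- prefixes combine can be sketched in Python. The option and alias names are taken from the lists above; everything else is an assumption made for illustration, in particular that an unprefixed option behaves like +option and that processing starts from the default (ruleExec) set.

```python
DATA_OPTIONS = [
    "tx", "requestLine", "requestHeader", "requestBody",
    "responseLine", "responseHeader", "responseBody",
    "phase", "rule", "target", "transformation",
    "operator", "action", "event", "audit",
]
FILTER_OPTIONS = ["actionableRulesOnly"]
ALIASES = {
    "request": ["requestLine", "requestHeader", "requestBody"],
    "response": ["responseLine", "responseHeader", "responseBody"],
    "ruleExec": ["phase", "rule", "target", "transformation",
                 "operator", "action", "actionableRulesOnly"],
    "all": DATA_OPTIONS,
}
ALIASES["default"] = ALIASES["ruleExec"]
KNOWN = set(DATA_OPTIONS) | set(FILTER_OPTIONS) | set(ALIASES)


def expand(name):
    """Expand an alias into its member options; plain options expand to themselves."""
    if name not in KNOWN:
        raise ValueError(f"unknown option: {name}")
    return set(ALIASES.get(name, [name]))


def apply_options(args, current=None):
    """Apply [+|-]option arguments to a set of active options.

    Assumption: processing starts from the default set and an
    unprefixed option is treated like '+option'.
    """
    active = expand("default") if current is None else set(current)
    for arg in args:
        if arg.startswith("-"):
            active -= expand(arg[1:])
        elif arg.startswith("+"):
            active |= expand(arg[1:])
        else:
            active |= expand(arg)
    return active
```

Under these assumptions, the arguments "+request -requestBody" would add request line and request header logging to the default set while leaving request bodies out.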

RuleEngineLogLevel

Description: Configures the logging level at which the rule engine writes logs.

Syntax: RuleEngineLogLevel level

Default: info

Context: Any

Cardinality: 0..1

Module: core

Version: 0.6

RuleExt

Description: Creates a rule implemented externally, either by loading the rule directly from a file, or referencing a rule that was previously declared by a module.

Syntax: RuleExt ruleLocation actions

Default: None

Context: Site, Location

Cardinality: 0..n

Module: rules

Version: 0.4

To load a Lua rule:

RuleExt lua:/path/to/rule.lua phase:REQUEST

RuleMarker

Description: Creates a rule marker (placeholder) which will not be executed, but instead should be overridden. The idea is that rule sets can include placeholders for optional custom rules which can be overridden while maintaining execution order.

Syntax: RuleMarker id:id phase:phase

Default: None

Context: Any

Cardinality: 0..n

Module: rules

Version: 0.5

To mark and later replace a rule:

Rule ARGS @rx foo id:1 rev:1 phase:REQUEST
RuleMarker id:2 phase:REQUEST
Rule #MY_VALUE @gt 0 id:3 rev:1 phase:REQUEST setRequestHeader:X-Foo:%{MY_VALUE}

<Site test>
    Hostname *

    Rule &ARGS @gt 5 id:2 phase:REQUEST setvar:MY_VALUE=5
    RuleEnable all
</Site>

In the above example, rule id:2 in the main context would be replaced by the rule id:2 in the site context, and the rules would then execute in the order id:1, id:2, id:3. If rule id:2 were not replaced in the site context, the rules would execute id:1 then id:3, because id:2 is only a marker (placeholder).
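
The replacement behavior described for the example can be modeled as follows. This Python sketch is an illustration only: merge_rules and the (id, body) tuples are hypothetical, a marker is represented by a body of None, and appending site rules that replace nothing at the end of the list is an assumption.

```python
def merge_rules(main_rules, site_rules):
    """Merge site rules into the main rule list, preserving main order.

    Each rule is an (id, body) tuple; a marker has body None.
    A site rule with the same id takes the marker's position.
    """
    overrides = dict(site_rules)
    merged = []
    for rule_id, body in main_rules:
        body = overrides.pop(rule_id, body)
        if body is not None:  # an unreplaced marker never executes
            merged.append((rule_id, body))
    # Assumption: site rules that replaced nothing run after the main rules.
    merged.extend((rid, b) for rid, b in site_rules if rid in overrides)
    return merged
```

With a main list of id:1, a marker for id:2, and id:3, supplying a site rule id:2 yields the order 1, 2, 3, while supplying no site rule yields 1, 3.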

SensorId

Description: Unique sensor identifier.

Syntax: SensorId sensor_id

Default: None

Context: Main

Cardinality: 0..1

Module: core

Version: 0.4

TODO: Can we make this directive so that, if not defined, we attempt to detect server hostname and use that as ID?

Service

Description: Maps IP address and port combinations to a Site.

Syntax: Service ip:port

Default: *:* (any ip:any port)

Context: Site

Cardinality: 0..n

Module: core

Version: 0.6

The Service directive establishes a mapping between a Site and one or more services (IP Address/Port combinations). A wildcard (*) may be used to mean "any".

A site may run on only a specific port:

Service *:80

More often than not, however, several ports will be used, and a port may have a specific IP address to bind to if the site is SSL/TLS enabled:

Service *:80
Service 10.0.1.2:443

The default, if no Service directive is specified for a site, is to match any IP address and port, which can also be explicitly defined as:

Service *:*

Site

Description: A site is one of the main concepts in the IronBee configuration. The idea is to have an element that corresponds to a real-life web site. With most web sites there is a one-to-one mapping to domain names, but our mapping mechanism is quite flexible: you can have one site per domain name, many domain names for a single site, or even have one domain name shared among several sites.

Syntax: <Site site_name>...</Site>

Default: None

Context: Main

Cardinality: 0..n

Module: core

Version: 0.1

At the highest level, a configuration will contain one or more sites. For example:

<Site site1>
    Service *:80
    Hostname site1.example.com
    Hostname site1-alternate.example.com
</Site>

<Site site2>
    Service *:80
    Service 10.0.1.2:443
    Hostname site2.example.com
</Site>

<Site default>
    Service *:*
    Hostname *
</Site>

Before it can process a transaction, IronBee will examine the current configuration looking for a site to which to assign the transaction. Sites are processed in the configured order, and the first matching site is chosen. A default site can be specified as the last site, using wildcards, to catch transactions that all previous sites fail to match. The Site directive only establishes configuration boundaries and assigns a unique handle to each site; the Service and Hostname directives are responsible for the mapping.
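
The first-match selection described above can be sketched in Python. This is a simplified model, not IronBee code: the Site class and select_site function are hypothetical, hostnames are compared exactly or against the * wildcard only, services are assumed to use IPv4 ip:port notation, and a site without any Hostname entry is assumed never to match.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """Hypothetical model of a <Site> block."""
    name: str
    services: list = field(default_factory=list)   # "ip:port" strings
    hostnames: list = field(default_factory=list)


def service_matches(service, ip, port):
    """Match one Service value such as '*:80' or '10.0.1.2:443'."""
    svc_ip, _, svc_port = service.rpartition(":")
    return svc_ip in ("*", ip) and svc_port in ("*", str(port))


def select_site(sites, ip, port, host):
    """Return the first site, in configured order, whose Service and Hostname both match."""
    for site in sites:
        services = site.services or ["*:*"]  # documented default when no Service is given
        if not any(service_matches(s, ip, port) for s in services):
            continue
        if any(h in ("*", host) for h in site.hostnames):
            return site
    return None
```

A return value of None corresponds to the case in which no site can be assigned, which is why a trailing default site with wildcard Service and Hostname values is recommended.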

Note

Every configuration should have a default site. IronBee will generate a run-time error if it is unable to find a site to assign to a transaction. TODO: Should we block if no site is chosen?

SiteId

Description: Unique site identifier.

Syntax: SiteId site_id

Default: None

Context: Site

Cardinality: 0..1

Module: core

Version: 0.4

TODO: Can we make this directive so that, if not defined, we attempt to detect site hostname and use that as ID?

StreamInspect

Description: Creates a streaming inspection rule, which inspects data as it becomes available, outside rule phases.

Syntax: StreamInspect INPUT @OPERATOR OP_PARAM [MODIFIERS]

Context: Site, Location

Cardinality: 0..n

Module: rules

Version: 0.4

Normally, rules run in one of the available phases, which happen at strategic points in transaction lifecycle. Phase rules are convenient to write, because all the relevant data is available for inspection. However, there are situations when it is not possible to have access to all of the data in a phase. This is the case, for example, when a request body is very large, or when buffering is not allowed.

Streaming rules are designed to operate in these circumstances. They are able to inspect data as it becomes available, be it a dozen bytes or a single byte.

The syntax of the StreamInspect directive is similar to that of Rule, but there are several restrictions:

  • Only one input can be used. This is because streaming rules attach to a single data source.

  • The phase modifier cannot be used, as streaming rules operate outside of phases.

  • Only REQUEST_BODY_STREAM and RESPONSE_BODY_STREAM can be used as inputs.

  • Only the pm and dfa operators can be used.

  • Transformation functions are not yet supported.
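
To illustrate why streaming inspection needs an operator that can carry state between chunks, the following Python sketch matches a set of patterns across chunk boundaries. It is not how the pm or dfa operators work internally: a naive substring search with a carry-over buffer is used here in place of a real automaton, and the StreamMatcher class is a hypothetical name.

```python
class StreamMatcher:
    """Find patterns in a stream delivered as arbitrary chunks.

    Keeps the last (longest pattern length - 1) characters so that a
    match split across two chunks is still found.
    """

    def __init__(self, patterns):
        self.patterns = list(patterns)
        self.keep = max(len(p) for p in self.patterns) - 1
        self.tail = ""    # carried-over end of the data seen so far
        self.offset = 0   # stream offset of the first character in self.tail

    def feed(self, chunk):
        """Process one chunk and return (stream_offset, pattern) for new matches."""
        window = self.tail + chunk
        hits = []
        for pattern in self.patterns:
            start = window.find(pattern)
            while start != -1:
                # Report only matches that end inside the new chunk;
                # anything ending earlier was reported by a previous call.
                if start + len(pattern) > len(self.tail):
                    hits.append((self.offset + start, pattern))
                start = window.find(pattern, start + 1)
        keep = min(self.keep, len(window))
        self.offset += len(window) - keep
        self.tail = window[len(window) - keep:]
        return hits
```

Feeding the chunks "some att", "ack and e" and "vil" to a matcher for the patterns attack and evil reports attack on the second call and evil on the third, even though neither pattern is contained in a single chunk.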

Chapter 4. Data Fields

...

ARGS

Description: All request parameters combined and normalized.

Type: Collection

Scope: Transaction (REQUEST_HEADERS, REQUEST_BODY)

Module: core

Version: 0.2

Note

The ARGS collection is currently the same as specifying REQUEST_URI_PARAMS REQUEST_BODY_PARAMS, but this will change in later releases to include normalization based on parser personalities. If you do not want normalization, then use REQUEST_URI_PARAMS REQUEST_BODY_PARAMS.

AUTH_PASSWORD

Description: Basic authentication password.

Type: String

Scope: Transaction

Module: core

Version: Not implemented yet

AUTH_TYPE

Description: Indicator of the authentication method used.

Type: Collection

Scope: Transaction

Module: core

Version: Not implemented yet

This field contains the first token extracted from the Authorization request header. Typical values are: Basic, Digest, and NTLM.

AUTH_USERNAME

Description: Basic or Digest authentication username.

Type: String

Scope: Transaction

Module: core

Version: Not implemented yet

CAPTURE

Description: Transaction collection.

Type: Collection

Scope: Transaction

Module: core

Version: 0.4

This collection contains information for the transaction. Currently, data captured by operators is stored here under the keys "0"-"9".

Note

The captures were previously stored in the generic TX collection, but were moved to their own collection in version 0.4.

FIELD

Description: An alias to the current field being inspected.

Type: Variable (same type as the aliased field)

Scope: Rule

Module: core

Version: 0.5

This field is useful only in field expansions within actions when you must have the original value of the field being inspected. For example:

# Log the field value with an event
Rule ARGS @contains attack_string id:123 phase:REQUEST logdata:%{FIELD} event

# Create a collection matching a pattern for later use
Rule REQUEST_HEADERS @rx pattern1 id:124 phase:REQUEST_HEADER setvar:NEW_COL:%{FIELD_NAME}=%{FIELD}
Rule ARGS @rx pattern2 id:125 phase:REQUEST setvar:NEW_COL:%{FIELD_NAME}=%{FIELD}
...
# Then perform further matches on the new collection in another phase, which
# is not possible via chaining.
Rule NEW_COL @rx some_other_patt id:126 phase:REQUEST "msg:Some msg" event block

FIELD_NAME

Description: An alias to the current field name being inspected, not including the collection name if it is a sub-field in a collection.

Type: Variable (same type as the aliased field)

Scope: Rule

Module: core

Version: 0.5

This field is useful only in field expansions within actions when you must have the name of the field being inspected. The collection name is not prepended, so if ARGS:foo is being inspected, the value will be foo, not ARGS:foo. If you want the full name with the collection prepended, then use FIELD_NAME_FULL.

FIELD_NAME_FULL

Description: An alias to the current field name being inspected, including the collection name if it is a sub-field in a collection.

Type: Variable (same type as the aliased field)

Scope: Rule

Module: core

Version: 0.5

This field is useful only in field expansions within actions when you must have the full name of the field being inspected. See FIELD_NAME.

GEOIP

Description: If the geoip module is loaded, then a lookup will be performed on the remote (client) address and the results placed in this collection.

Type: Collection

Scope: Transaction

Module: geoip

Version: 0.3

Note

The address used during lookup is the same as that stored in the REMOTE_ADDR field, which may be modified from the actual connection (TCP) level address by the user_agent module.

Sub-Fields (not all are available prior to GeoIP v1.4.6):

  • latitude: Numeric latitude rounded to nearest integral value (no floats yet).

  • longitude: Numeric longitude rounded to nearest integral value (no floats yet).

  • area_code: Numeric area code (US only).

  • charset: Numeric character set code.

  • country_code: Two character country code.

  • country_code3: Three character country code.

  • country_name: String country name.

  • region: String region name.

  • city: String city name.

  • postal_code: String postal code.

  • continent_code: String continent code.

  • accuracy_radius: Numeric accuracy radius (v1.4.6+).

  • metro_code: Numeric metro code (v1.4.6+).

  • country_conf: String country confidence (v1.4.6+).

  • region_conf: String region confidence (v1.4.6+).

  • city_conf: String city confidence (v1.4.6+).

  • postal_conf: String postal code confidence (v1.4.6+).

HTP_REQUEST_FLAGS

Description: Collection of LibHTP request parsing flags.

Type: Collection

Scope: Transaction

Module: htp

Version: 0.3

The LibHTP parser will set various flags while parsing. This is a collection of those flags for request parsing. The following flags may be set:

  • AMBIGUOUS_HOST: The host was specified in both the URI and in the Host header, but they do not match.

  • FIELD_INVALID: An invalid field was sent.

  • FIELD_LONG: A field length was longer than allowed.

  • FIELD_UNPARSEABLE: An unparseable field was given.

  • HOST_MISSING: The host was missing from a request in which it is normally sent.

  • INVALID_CHUNKING: Invalid chunking was used.

  • INVALID_FOLDING: Invalid header folding was used.

  • MULTI_PACKET_HEAD: The header was sent in more than one packet (buffer).

  • PATH_ENCODED_NUL: A NUL (zero) byte was sent, encoded, in the path.

  • PATH_ENCODED_SEPARATOR: An encoded path separator was sent in the path.

  • PATH_FULLWIDTH_EVASION: A full width character was used in the path.

  • PATH_INVALID_ENCODING: Invalid encoding was used in the path.

  • PATH_OVERLONG_U: An overlong Unicode encoding was used in the path.

  • PATH_UTF8_INVALID: An invalid UTF-8 encoding was used in the path.

  • PATH_UTF8_OVERLONG: An overlong UTF-8 encoding was used in the path.

  • PATH_UTF8_VALID: A UTF-8 character was used in the path.

  • REQUEST_SMUGGLING: An HTTP smuggling attack was detected.

HTP_RESPONSE_FLAGS

Description: Collection of LibHTP response parsing flags.

Type: Collection

Scope: Transaction

Module: htp

Version: 0.3

The LibHTP parser will set various flags while parsing. This is a collection of those flags for response parsing. The following flags may be set:

  • FIELD_INVALID: An invalid field was sent.

  • FIELD_LONG: A field length was longer than allowed.

  • FIELD_UNPARSEABLE: An unparseable field was given.

  • INVALID_CHUNKING: Invalid chunking was used.

  • INVALID_FOLDING: Invalid header folding was used.

  • MULTI_PACKET_HEAD: The header was sent in more than one packet (buffer).

  • STATUS_LINE_INVALID: An invalid HTTP status code was sent.

LAST_MATCHED

Description: An alias to the last matching field.

Type: Variable (same type as the aliased field)

Scope: Rule

Module: core

Version: Not implemented yet

This field is useful in a rule that inspects several fields when you need to know which of the fields matched. For example:

# Further inspect a matching field
Rule ARGS @contains attack_string chain
    Rule LAST_MATCHED "@rx some.*complex.*patt.*that.*may.*be.*expensive" block

# Log the value of the field that matched
Rule ARGS @rx pattern \
    "msg:'Test matched',logdata:%{LAST_MATCHED}"

REMOTE_ADDR

Description: Remote (client) IP address, extracted from the TCP connection. Can be in IPv4 or IPv6 format.

Type: String

Scope: Connection

Module: core

Version: 0.2

Note

If the user_agent module is also loaded, then the client address will be corrected using any available proxy headers (currently X-Forwarded-For).

REMOTE_PORT

Description: Remote (client) port, extracted from the TCP connection.

Type: Numeric

Scope: Connection

Module: core

Version: 0.2

REQUEST_BODY

Description: Request body data.

Type: Byte string

Scope: Transaction

Module: core

Version: Not implemented yet

Note

This field may not be supported by all operators (notably @rx) as it may not be a contiguous buffer, but rather spread across multiple buffers. In most cases, you will want to use the StreamInspect directive with the @pm, @pmf or @dfa operator.

REQUEST_BODY_PARAMS

Description: Request parameters transported in request body.

Type: String

Scope: Transaction

Module: core

Version: 0.4

REQUEST_CONTENT_TYPE

Description: Contains the normalized request content type.

Type: String

Scope: Transaction (REQUEST_HEADERS)

Module: core

Version: Not implemented yet

Request content type is constructed from the request Content-Type header. The value is first converted to contain only the content type (and exclude any character encoding information), then converted to lowercase.
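
The two normalization steps described above can be sketched in Python as follows. The function name is hypothetical, and trimming surrounding whitespace is an assumption that the text does not state.

```python
def normalize_content_type(header_value):
    """Keep only the media type of a Content-Type header value, in lowercase."""
    media_type = header_value.split(";", 1)[0]  # drop charset and other parameters
    return media_type.strip().lower()           # assumption: whitespace is trimmed
```

For instance, under these assumptions a header value of "Text/HTML; charset=UTF-8" would be normalized to "text/html".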

REQUEST_COOKIES

Description: Collection of request cookies (name/value pairs).

Type: Collection

Scope: Transaction (REQUEST_HEADERS)

Module: core

Version: 0.2

REQUEST_FILENAME

Description: Request filename, extracted from request URI and normalized according to the current personality.

Type: String

Scope: Transaction

Module: core

Version: Not implemented yet

Normalization algorithm, with all "features" enabled, is as follows:

  1. Decode URL-encoded characters (both %HH and %uHHHH formats), convert to lowercase, compress separators, convert backslashes, and terminate NUL.

  2. Convert UTF-8 to single-byte stream using best-fit mapping

  3. Perform RFC 3986 normalization

REQUEST_HEADERS

Description: Collection of request headers (name/value pairs).

Type: Collection

Scope: Transaction (REQUEST_HEADERS)

Module: core

Version: 0.2

REQUEST_HOST

Description: Request hostname information, extracted from the request and normalized.

Type: String

Scope: Transaction (REQUEST_HEADERS)

Module: core

Version: 0.2

The following rules apply:

  1. Use the hostname information if provided on the request line

  2. Alternatively, look up the HTTP Host request header

  3. If the hostname information is provided in both locations, the information in the HTTP Host request header is ignored

Normalization [TODO What RFC should we refer to?]:

  1. Lowercase

  2. Remove trailing dot [TODO What dot?]

  3. [TODO Remove port?]
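
The selection rules and the two normalization steps that are already settled can be sketched in Python. The function and parameter names are hypothetical; port removal is deliberately not modeled because it is still marked as a TODO above, and returning an empty string when no host is available is an assumption.

```python
def request_host(uri_host, host_header):
    """Pick the request hostname and normalize it.

    uri_host:    hostname from the request line, or None if absent
    host_header: value of the HTTP Host request header, or None if absent
    """
    # The request line takes precedence; the Host header is only a fallback.
    raw = uri_host if uri_host else host_header
    if not raw:
        return ""  # assumption: no hostname information yields an empty value
    host = raw.lower()
    if host.endswith("."):
        host = host[:-1]  # remove a single trailing dot
    return host
```

When both sources are present, the Host header is ignored, so a request line host of "WWW.Example.COM." wins over any header value and becomes "www.example.com".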

REQUEST_LINE

Description: Full, raw, request line.

Type: String

Scope: Transaction

Module: core

Version: 0.3

Example:

GET /path/to/page?a=5&q=This+is+a+test. HTTP/1.1

REQUEST_METHOD

Description: Request method.

Type: String

Scope: Transaction

Module: core

Version: 0.3

This field contains the HTTP method used for the request.

Example: GET

REQUEST_PROTOCOL

Description: Request protocol name and version.

Type: String

Scope: Transaction

Module: core

Version: 0.3

This field contains the HTTP protocol name and version, as specified on the request line. Transactions that do not specify the protocol (e.g., HTTP prior to 1.0) will have an empty string value.

REQUEST_URI

Description: Request URI, extracted from request and normalized according to the current personality (see REQUEST_FILENAME for more details).

Type: String

Scope: Transaction

Module: core

Version: 0.2

Default normalization:

  1. RFC normalization

  2. Convert to lowercase

  3. Reduce consecutive forward slashes to a single character

All normalization options:

  • RFC normalization

  • Convert to lowercase

  • Convert \ characters to /

  • Reduce consecutive forward slashes to a single character

REQUEST_URI_FRAGMENT

Description: Parsed fragment portion of the URI within the request line.

Type: String

Scope: Transaction

Module: core

Version: 0.3

REQUEST_URI_HOST

Description: Parsed host portion of the URI within the request line.

Type: String

Scope: Transaction

Module: core

Version: 0.3

This is the hostname specified in the URI. Note that this may be different from the normalized host, which is in REQUEST_HOST.

REQUEST_URI_PARAMS

Description: Request parameters transported in query string.

Type: Collection

Scope: Transaction (REQUEST_HEADERS)

Module: core

Version: 0.2

REQUEST_URI_PASSWORD

Description: Parsed password portion of the URI within the request line.

Type: String

Scope: Transaction

Module: core

Version: 0.3

REQUEST_URI_PATH

Description: Parsed path portion of the URI within the request line.

Type: String

Scope: Transaction

Module: core

Version: 0.3

REQUEST_URI_PORT

Description: Parsed port portion of the URI within the request line.

Type: String

Scope: Transaction

Module: core

Version: 0.3

REQUEST_URI_RAW

Description: Raw, unnormalized, request URI.

Type: String

Scope: Transaction

Module: core

Version: 0.2

REQUEST_URI_SCHEME

Description: Parsed scheme portion of the URI within the request line.

Type: String

Scope: Transaction

Module: core

Version: 0.3

REQUEST_URI_QUERY

Description: Parsed query portion of the URI within the request line.

Type: String

Scope: Transaction

Module: core

Version: 0.3

REQUEST_URI_USERNAME

Description: Parsed username portion of the URI within the request line.

Type: String

Scope: Transaction

Module: core

Version: 0.3

RESPONSE_BODY

Description: Response body data up to any imposed limits.

Type: Scalar

Scope: Transaction

Module: core

Version: 0.3

Note

This field may not be supported by all operators (notably @rx) as it may not be a contiguous buffer, but rather spread across multiple buffers. In most cases, you will want to use the StreamInspect directive with the @pm, @pmf or @dfa operator.

RESPONSE_CONTENT_TYPE

Description: Contains the normalized response content type.

Type: Scalar

Scope: Transaction (RESPONSE_HEADERS)

Module: core

Version: Not implemented yet

Response content type is constructed from the response Content-Type header. The value is first converted to keep only the content type part (and exclude character encoding information, if any), then converted to lowercase.

RESPONSE_COOKIES

Description: Collection of response cookies (name/value pairs).

Type: Collection

Scope: Transaction

Module: core

Version: Not implemented yet

RESPONSE_HEADERS

Description: Collection of response headers (name/value pairs).

Type: Collection

Scope: Transaction

Module: core

Version: 0.2

RESPONSE_LINE

Description: Full response line.

Type: String

Scope: Transaction

Module: core

Version: 0.3

Transactions that do not specify a response line (e.g., HTTP prior to 1.0) will have an empty string value.

Example:

HTTP/1.1 200 OK

RESPONSE_MESSAGE

Description: Response status message.

Type: String

Scope: Transaction

Module: core

Version: 0.3

This field contains the status message (text following the status code), as specified on the response line. Transactions that do not specify a response line (e.g., HTTP prior to 1.0) will have an empty string value.

RESPONSE_PROTOCOL

Description: Response protocol name and version.

Type: String

Scope: Transaction

Module: core

Version: 0.3

This field contains the protocol name and version, as specified on the response line. Transactions that do not specify a response line (e.g., HTTP prior to 1.0) will have an empty string value.

RESPONSE_STATUS

Description: Response status code.

Type: String

Scope: Transaction

Module: core

Version: 0.3

This field contains the status code, as specified on the response line. Transactions that do not specify a response line (e.g., HTTP prior to 1.0) will have an empty string value.

SERVER_ADDR

Description: Server IP address, extracted from the TCP connection. Can be in IPv4 or IPv6 format.

Type: String

Scope: Connection

Module: core

Version: 0.2

SERVER_PORT

Description: Server port, extracted from the TCP connection.

Type: Numeric

Scope: Connection

Module: core

Version: 0.2

SITE_NAME

Description: The name of the site to which the current transaction was assigned.

Type: String

Scope: Transaction

Module: core

Version: Not implemented yet

TX

Description: Transaction collection.

Type: Collection

Scope: Transaction

Module: core

Version: 0.3

This collection contains arbitrary information for the transaction. It is a generic place for rules to store transaction data that other rules can monitor.

UA

Description: User agent information extracted if the user_agent module is loaded.

Type: Collection

Scope: Transaction

Module: user_agent

Version: 0.3

Note

While the User-Agent HTTP request header may be used in generating these fields, the term "user agent" here refers to the client as a whole.

Sub-Fields:

  • agent: String name of the user agent.

  • product: String product deduced from the user agent data.

  • os: String operating system deduced from user agent data.

  • extra: Any extra string available after parsing the User-Agent HTTP request header.

  • category: String category deduced from user agent data.

Chapter 5. Operators

...

contains

Description: Returns true if the target contains the given sub-string.

Types: String

Module: core

Version: 0.3

dfa

Description: Deterministic finite automaton matching algorithm (PCRE's alternative matching algorithm).

Types: String

Module: pcre

Version: 0.4

The dfa operator implements the alternative matching algorithm in the PCRE regular expressions library. The parameter of the operator is a regular expression pattern that is passed to the PCRE library without modification. This alternative matching algorithm uses a similar syntax to PCRE regular expressions, except that backtracking is not available. The primary use of dfa is to allow a subset of regular expression matching in a streaming manner (see StreamInspect). TODO: Describe limits on regex syntax.

eq

Description: Returns true if the target is numerically equal to the given value.

Types: Numeric

Module: core

Version: 0.3

ge

Description: Returns true if the target is numerically greater than or equal to the given value.

Types: Numeric

Module: core

Version: 0.3

gt

Description: Returns true if the target is numerically greater than the given value.

Types: Numeric

Module: core

Version: 0.3

ipmatch

Description: Returns true if a target IPv4 address matches any given whitespace separated address in CIDR format.

Types: String

Module: core

Version: 0.3

ipmatch6

Description: Returns true if a target IPv6 address matches any given whitespace separated address in CIDR format.

Types: String

Module: core

Version: 0.3
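
The matching performed by ipmatch and ipmatch6 can be approximated with Python's standard ipaddress module. This is a sketch of the documented behavior only; the ipmatch function and its version parameter are hypothetical, and treating a malformed target as a non-match is an assumption.

```python
import ipaddress


def ipmatch(target, param, version=4):
    """Return True if target falls within any whitespace-separated CIDR block in param.

    version=4 models ipmatch, version=6 models ipmatch6.
    """
    try:
        addr = ipaddress.ip_address(target)
    except ValueError:
        return False  # assumption: an unparseable target never matches
    if addr.version != version:
        return False
    for item in param.split():
        # strict=False accepts entries whose host bits are set, e.g. 10.1.2.3/8
        network = ipaddress.ip_network(item, strict=False)
        if network.version == version and addr in network:
            return True
    return False
```

A bare address in the parameter is treated as a single-host network, so a list may mix individual addresses and CIDR blocks.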

le

Description: Returns true if the target is numerically less than or equal to the given value.

Types: Numeric

Module: core

Version: 0.3

lt

Description: Returns true if the target is numerically less than the given value.

Types: Numeric

Module: core

Version: 0.3

ne

Description: Returns true if the target is not numerically equal to the given value.

Types: Numeric

Module: core

Version: 0.3

pm

Description: Parallel matching using the Aho-Corasick algorithm.

Types: String

Module: ac

Version: 0.2

Implements a set-based (or parallel) matching function using the Aho-Corasick algorithm. The parameter of the operator contains one or more matching patterns, separated by whitespace. Set-based matching is capable of matching many patterns at the same time, making it efficient when the number of patterns is very large (hundreds or thousands).

Rule REQUEST_HEADERS:User-Agent @pm "one two three"

If the capture modifier is specified on a @pm rule, the TX.0 variable will contain the matched data fragment. Note that, because the pm operator can easily match many times per rule, the TX.0 value is valid only when used in the same rule. In any rule that follows, TX.0 will contain the data fragment of the last @pm match.
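
For readers unfamiliar with Aho-Corasick, the following Python sketch shows a textbook version of the algorithm: a trie of patterns with failure links, scanned in a single pass over the input. It is not the code of the ac module, and the function names are hypothetical.

```python
from collections import deque


def build_automaton(patterns):
    """Build the goto, failure and output tables for a set of patterns."""
    goto = [{}]     # goto[state][char] -> next state
    fail = [0]      # failure link of each state
    output = [[]]   # patterns that end at each state
    for pattern in patterns:
        state = 0
        for ch in pattern:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                output.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        output[state].append(pattern)
    # Breadth-first pass: compute failure links level by level.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            output[nxt] = output[nxt] + output[fail[nxt]]
    return goto, fail, output


def find_all(text, automaton):
    """Return (start_offset, pattern) for every match, scanning text once."""
    goto, fail, output = automaton
    state = 0
    hits = []
    for pos, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pattern in output[state]:
            hits.append((pos - len(pattern) + 1, pattern))
    return hits
```

The automaton is built once per pattern set, after which each input character is processed in amortized constant time regardless of how many patterns there are. This is what makes the approach attractive for large pattern lists.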

pmf

Description: Parallel matching with patterns from file.

Types: String

Module: ac

Version: 0.2

Same as pm, but instead of accepting parameters directly, it loads them from the file whose filename was supplied. The file is expected to contain one pattern per line. To convert a line into a pattern, whitespace from the beginning and the end is removed. Empty lines are ignored, as are comments, which are lines that begin with #. Relative filenames are resolved from the same directory as the configuration file.

Rule REQUEST_HEADERS:User-Agent @pmf bad_user_agents.dat
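
The file format rules described above can be sketched in Python. The function names are hypothetical, and one detail is an assumption: the sketch trims whitespace before checking for #, so an indented comment line is also skipped.

```python
import os


def parse_pattern_lines(lines):
    """Turn the lines of a pattern file into a list of patterns."""
    patterns = []
    for line in lines:
        pattern = line.strip()  # remove leading and trailing whitespace
        if not pattern or pattern.startswith("#"):
            continue            # skip empty lines and comments
        patterns.append(pattern)
    return patterns


def resolve_pattern_file(filename, config_file):
    """Resolve a relative pattern filename against the configuration file's directory."""
    if os.path.isabs(filename):
        return filename
    return os.path.join(os.path.dirname(config_file), filename)
```

A file containing the lines "  curl  ", an empty line, "# scanners" and "wget" would therefore produce the two patterns curl and wget.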

rx

Description: Regular expression (Perl-compatible regular expression) matching.

Types: String

Module: pcre

Version: 0.2

The rx operator implements PCRE regular expressions. The parameter of the operator is a regular expression pattern that is passed to the PCRE library without modification.

Rule ARGS:userId !@rx ^[0-9]+$

Patterns are compiled with the following settings:

  • Entire input is treated as a single buffer against which matching is done.

  • Patterns are case-sensitive by default.

  • Patterns are compiled with PCRE_DOTALL and PCRE_DOLLAR_ENDONLY set.

Using captured substrings to create variables

Regular expressions can be used to capture substrings. In IronBee, the captured substrings can be used to create new variables in the TX collection. To use this feature, specify the capture modifier in the rule.

Rule ARGS @rx (\d{13,16}) capture

When capture is enabled, IronBee will always create a variable TX.0, which will contain the entire matching area of the pattern. Anonymous capture groups will create up to 9 variables, from TX.1 to TX.9. These special TX variables will remain available until the next capture rule is run, when they will all be deleted.
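
The numbering of captured substrings can be illustrated with Python's re module. Python's regular expression engine is not PCRE, so this is only an approximation: re.DOTALL stands in for PCRE_DOTALL, PCRE_DOLLAR_ENDONLY is not modeled, and the capture function and its dictionary result are hypothetical.

```python
import re


def capture(pattern, value):
    """Return captured substrings keyed "0" to "9", or None if there is no match."""
    match = re.search(pattern, value, re.DOTALL)
    if match is None:
        return None
    captured = {"0": match.group(0)}  # key "0" holds the entire matched area
    # Anonymous groups fill keys "1" to "9"; any further groups are dropped.
    for index, group in enumerate(match.groups()[:9], start=1):
        captured[str(index)] = group
    return captured
```

Applied to the pattern from the example above, a value such as "card=4111111111111111&x=1" would store the 16-digit number under both key "0" (the whole match) and key "1" (the first group).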

streq

Description: Returns true if target exactly matches the given string.

Types: String

Module: core

Version: 0.3

Chapter 6. Modifiers

...

allow

Description: Mark a transaction as allowed to proceed to a given inspection point.

Type: Action

Syntax: allow[:phase | :request]

Cardinality: 0..1

Module: core

Version: 0.4

By default this allows the transaction to proceed without inspection until the postprocessing phase. This can be changed depending on the modifier used:

  • phase - Proceed to the end of the current phase without further rule execution.

  • request - Proceed to the end of the request processing phases without further rule execution.

logdata

Description: Add data to be logged with the event.

Type: Metadata

Syntax: logdata:data

Cardinality: 0..1

Module: core

Version: 0.2

Log a data fragment as part of the error message.

Rule ARGS @rx pattern \
    "msg:Test matched" logdata:%{MATCHED_VAR}

Note: Up to 128 bytes of data will be recorded.

block

Description: Mark a transaction to be blocked.

Type: Action

Syntax: block[:advisory | :phase | :immediate]

Cardinality: 0..1

Module: core

Version: 0.4

By default this marks the transaction with an advisory blocking flag. This can be changed depending on the modifier used:

  • advisory - Mark the transaction with an advisory blocking flag which further rules may take into account.

  • phase - Block the transaction at the end of the current phase.

  • immediate - Block the transaction immediately after rule execution.

capture

Description: Enable capturing the matching data.

Type: Modifier

Syntax: capture

Cardinality: 0..1

Module: core

Version: 0.4

Enabling capturing will populate the CAPTURE collection with data from the most recent matching operator. For most operators the CAPTURE:0 field will be set to the last matching value. Operators that support capturing multiple values may set other items in the CAPTURE collection. For example, the rx operator supports setting CAPTURE:1 - CAPTURE:9 via capturing parens in the regular expression.

chain

Description: Chains the next rule, so that the next rule will execute only if the current operator evaluates true.

Type: Modifier

Syntax: chain

Cardinality: 0..1

Module: core

Version: 0.4

Rule chains are essentially rules that are bound together by a logical AND with short circuiting. In a rule chain, each rule in the chain is executed in turn as long as the operators are evaluating true. If an operator evaluates to false, then no further rules in the chain will execute. This allows a rule to execute multiple operators.

All rules in the chain will still execute their actions before the next rule in the chain executes. If you want a rule that only executes an action if all operators evaluate true, then the action should be given on the final rule in the chain.

Requirements for chained rules:

  • Only the first rule in the chain may have an id or phase, which will be used for every rule in the chain.

  • A numeric chain ID will be assigned and appended to the rule ID, prefixed with a dash, to uniquely identify the rule.

  • Different metadata attributes (except id/phase) may be given for each chain, but the first rule's metadata will be the default.

  • Specifying one or more tag modifiers is allowed in any chain, but the tags will be bound to the entire rule chain so that RuleEnable and similar will act on the entire rule chain, not just an individual rule in the chain.

Example:

# Start a rule chain, which matches only POST requests. The implicit ID here
# will be set to "id:1-1".
Rule REQUEST_METHOD "@rx ^(?i:post)$" id:1 phase:REQUEST chain
# Only if the above rule's operator evaluates true, will the next rule in the
# chain execute. This rule checks to see if there are any URI based parameters
# which typically should not be there for POST requests. If the operator evaluates
# true, then the setvar action will execute, marking the transaction and an
# event will be generated with the given msg text. This rule will have the
# implicit ID set to "id:1-2".
Rule &REQUEST_URI_PARAMS @gt 0 "msg:POST with URI parameters." setvar:TX:uri_params_in_post=1 event chain
# Only if the above two rules' operators return true will the next rule in the
# chain execute. This rule checks that certain parameters are not used in
# the URI and if so, generates an event and blocks the transaction with the
# default status code at the end of the phase. This rule will have the implicit
# ID set to "id:1-3".
Rule &REQUEST_URI_PARAMS:/^(id|sess)$/ @gt 0 "msg:Sensitive parameters in URI." event block:phase
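The evaluation order described above (logical AND with short circuiting, where each matching rule runs its own action before the next rule is evaluated) can be sketched in C as follows. This is an illustration of the semantics only, not the engine's implementation; every sketch_ name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types for illustration only; not the IronBee API. */
typedef bool (*sketch_operator_fn)(const void *tx);
typedef void (*sketch_action_fn)(void *tx);

typedef struct {
    sketch_operator_fn op;      /* Operator: true means the rule matched */
    sketch_action_fn   action;  /* Optional action, may be NULL */
} sketch_rule_t;

/* Sample operators and action; tx is treated as an int hit counter. */
bool sketch_op_true(const void *tx)  { (void)tx; return true; }
bool sketch_op_false(const void *tx) { (void)tx; return false; }
void sketch_action_count(void *tx)   { ++*(int *)tx; }

/* Run a chain in order; returns how many rules matched before the
 * chain stopped. */
size_t sketch_run_chain(const sketch_rule_t *chain, size_t n, void *tx)
{
    size_t matched = 0;

    assert(chain != NULL || n == 0);
    for (size_t i = 0; i < n; ++i) {
        if (!chain[i].op(tx)) {
            break;                   /* Short circuit: skip the rest */
        }
        if (chain[i].action != NULL) {
            chain[i].action(tx);     /* Action runs before the next rule */
        }
        ++matched;
    }
    return matched;
}
```

In a three rule chain whose second operator evaluates false, only the first rule's action runs and the third rule is never evaluated, which is why an action that depends on the whole chain belongs on the final rule.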

confidence

Description: Numeric value indicating the confidence of the rule.

Type: Metadata

Syntax: confidence:integer (0-100)

Cardinality: 0..1

Module: core

Version: 0.4

Higher confidence rules should have a lower false positive rate.

delRequestHeader

Description: Delete an HTTP header from the request.

Type: Action

Syntax: delRequestHeader:header-name

Cardinality: 0..n

Module: core

Version: 0.4

delResponseHeader

Description: Delete an HTTP header from the response.

Type: Action

Syntax: delResponseHeader:header-name

Cardinality: 0..n

Module: core

Version: 0.4

id

Description: Unique identifier for a rule.

Type: Metadata

Syntax: id:name

Cardinality: 1

Module: core

Version: 0.4

Specifies a unique identifier for a rule. If a later rule re-uses the same identifier, then it will overwrite the previous rule.

TODO: Explain what the full unique id is (taking context and chains into account)

msg

Description: Message associated with the rule.

Type: Metadata

Syntax: msg:text

Cardinality: 0..1

Module: core

Version: 0.4

This message is used by the event action when logging the event.

phase

Description: The runtime phase at which the rule should execute.

Type: Metadata

Syntax: phase:REQUEST_HEADER|REQUEST|RESPONSE_HEADER|RESPONSE|POSTPROCESS

Cardinality: 1

Module: core

Version: 0.4

rev

Description: An integer rule revision.

Type: Metadata

Syntax: rev:integer (1-n)

Cardinality: 0..1

Module: core

Version: 0.4

TODO: Explain how this is used in RuleEnable and when overriding Rules in sub contexts.

setflag

Description: Set binary attributes (flags) in the transaction.

Type: Action

Syntax: setflag:[!]flag-name

Cardinality: 0..n

Module: core

Version: 0.4

TODO: Document specific flags which may have meaning to the inspection engine.

setRequestHeader

Description: Set the value of an HTTP request header.

Type: Action

Syntax: setRequestHeader:header-name=header-value

Cardinality: 0..n

Module: core

Version: 0.4

setResponseHeader

Description: Set the value of an HTTP response header.

Type: Action

Syntax: setResponseHeader:header-name=header-value

Cardinality: 0..n

Module: core

Version: 0.4

setvar

Description: Set a variable data field.

Type: Action

Syntax: setvar:[!]variable-field-name=[+|-]value

Cardinality: 0..n

Module: core

Version: 0.2

The setvar action is used for data field manipulation. To create a variable data field or change its value:

setvar:tx:score=1

To remove all instances of a named variable data field:

setvar:!tx:score

To increment or decrement a variable data field value:

setvar:tx:score=+5
setvar:tx:score=-5

An attempt to increment or decrement a non-numerical variable will assume the old value was zero. (NOTE: This should probably fail instead, logging that an attempt was made to modify a non-numerical value.)
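The numeric behavior described above can be sketched as a small C function that takes the old value and the text after the equals sign. This is a sketch under stated assumptions, not the engine's code: sketch_setvar_num and sketch_to_num are hypothetical names, and values are modeled as decimal strings.

```c
#include <assert.h>
#include <stdlib.h>

/* Convert a decimal string to a number; NULL, empty or non-numerical
 * input counts as zero, mirroring the documented behavior. */
static long sketch_to_num(const char *s)
{
    char *end;
    long v;

    if (s == NULL || *s == '\0') {
        return 0;
    }
    v = strtol(s, &end, 10);
    return (*end == '\0') ? v : 0;
}

/* Apply the part after '=' in setvar:name=expr to the old value.
 * "+N" increments, "-N" decrements, anything else sets the value. */
long sketch_setvar_num(const char *old_value, const char *expr)
{
    assert(expr != NULL);
    if (expr[0] == '+') {
        return sketch_to_num(old_value) + sketch_to_num(expr + 1);
    }
    if (expr[0] == '-') {
        return sketch_to_num(old_value) - sketch_to_num(expr + 1);
    }
    return sketch_to_num(expr);
}
```

One consequence of this syntax is that a leading minus sign always means "decrement", so a negative number cannot be assigned directly in this sketch.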

severity

Description: Numeric value indicating the severity of the issue this rule is trying to protect against.

Type: Metadata

Syntax: severity:integer (0-100)

Cardinality: 0..1

Module: core

Version: 0.4

The severity indicates how much impact a successful attack may have, but does not indicate the quality of protection this rule may provide. The severity is meant to be used as part of a "threat level" indicator. The "threat level" is essentially severity x confidence, which balances how severe the threat may be with how well this rule might be protecting against it.
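As a worked example of the "severity x confidence" idea, the sketch below multiplies the two 0-100 values and divides by 100 so the result stays on the same 0-100 scale. The rescaling and the function name sketch_threat_level are assumptions made for illustration; the manual does not define an exact formula.

```c
#include <assert.h>

/* Clamp a value into the documented 0-100 range. */
static int sketch_clamp_pct(int v)
{
    return v < 0 ? 0 : (v > 100 ? 100 : v);
}

/* Hypothetical threat level: severity x confidence, rescaled to 0-100. */
int sketch_threat_level(int severity, int confidence)
{
    int level = (sketch_clamp_pct(severity) * sketch_clamp_pct(confidence)) / 100;

    assert(level >= 0 && level <= 100);
    return level;
}
```

Under this assumption a rule with severity 80 and confidence 50 yields a threat level of 40, while a rule with confidence 0 never contributes, however severe the issue.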

status

Description: The HTTP status code to use for a blocking action.

Type: Modifier

Syntax: status:http-status-code

Cardinality: 0..1

Module: core

Version: 0.4

t

Description: Apply one or more named transformations to each of the targets in a rule.

Type: Modifier

Syntax: t:transformation-name

Cardinality: 0..n

Module: core

Version: 0.4

tag

Description: Apply an arbitrary tag name to a rule.

Type: Metadata

Syntax: tag:name

Cardinality: 0..n

Module: core

Version: 0.4

TODO: Describe where this is used, notably RuleEnable/RuleDisable and logged with events.

Chapter 7. Transformation Functions

...

base64Decode

Description:

Module: core

Version: Not implemented yet.

compressWhitespace

Description: Replaces one or more consecutive whitespace characters with a single space.

Module: core

Version: 0.3

Replaces various whitespace characters with spaces. In addition, consecutive whitespace characters will be reduced down to a single space. Whitespace characters are: 0x20, \f, \t, \n, \r, \v, 0xa0 (non-breaking white space).
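A minimal sketch of this behavior, operating in place on a byte buffer, is shown below. It is not the core module's implementation; sketch_is_ws and sketch_compress_whitespace are hypothetical names, and the whitespace set is taken from the list above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Whitespace set from the description: 0x20, \f, \t, \n, \r, \v, 0xa0. */
static bool sketch_is_ws(unsigned char c)
{
    return c == 0x20 || c == '\f' || c == '\t' || c == '\n' ||
           c == '\r' || c == '\v' || c == 0xa0;
}

/* Compress runs of whitespace in buf[0..len) to a single space.
 * Works in place and returns the new length. */
size_t sketch_compress_whitespace(char *buf, size_t len)
{
    size_t out = 0;
    bool in_ws = false;

    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; ++i) {
        unsigned char c = (unsigned char)buf[i];
        if (sketch_is_ws(c)) {
            if (!in_ws) {
                buf[out++] = ' ';  /* First whitespace of a run */
                in_ws = true;
            }
        }
        else {
            buf[out++] = (char)c;
            in_ws = false;
        }
    }
    return out;
}
```

Dropping the line that writes the single space would give the removal variant instead, where the whole run disappears.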

count

Description: Given a collection, returns the number of items in the collection. Given a scalar, returns 1.

Module: core

Version: 0.4

htmlEntityDecode

Description: Decodes HTML entities in the data.

Module: core

Version: Not implemented yet.

The following forms are supported:

  • &#DDDD; - Numeric code point, where DDDD represents a decimal number with any number of digits.

  • &#xHHHH; - Numeric code point, where HHHH represents a hexadecimal number with any number of digits.

  • &name; - Predefined XML named entities (currently: quot, amp, apos, lt, gt).

See https://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references.

length

Description: Returns the byte length of the value.

Module: core

Version: 0.4

lowercase

Description: Returns the input as all lower case characters.

Module: core

Version: 0.2

removeWhitespace

Description: Removes one or more consecutive whitespace characters.

Module: core

Version: 0.3

Similar to compressWhitespace, except removes the characters instead of replacing them with a single space.

removeComments

Description: Remove various types of code comments.

Module: core

Version: Not implemented yet.

The following style comments are removed:

  • /* ... */ - C style comments.

  • // ... - C++ style comments.

  • # ... - Shell style comments.

  • -- ... - SQL style comments.

replaceComments

Description: Replace various types of code comments with a single space character.

Module: core

Version: Not implemented yet.

This is similar to removeComments, but instead of removing, replaces with a single space character.

trim

Description: Removes consecutive whitespace from the beginning and end of the input.

Module: core

Version: 0.2

trimLeft

Description: Removes consecutive whitespace from the beginning of the input.

Module: core

Version: 0.2

trimRight

Description: Removes consecutive whitespace from the end of the input.

Module: core

Version: 0.2

urlDecode

Description: Decodes URL encoded values in the input.

Module: core

Version: Not implemented yet.

Implements decoding the encoding used in application/x-www-form-urlencoded values (percent encoding with additions).

  • %HH - Numeric code point, where HH represents a two digit hexadecimal number.

  • + - Represents an ASCII space character (equiv to %20).

Warning

Fields which are parsed from the URI and form parameters are already URL decoded. Do not apply this transformation to these fields unless you are trying to inspect multiple levels of encoding.
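Since this transformation is not implemented yet, the following is only a sketch of the decoding rules listed above, written as an in-place C function. The names sketch_hexval and sketch_url_decode are hypothetical, and leaving malformed percent sequences untouched is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Value of a hexadecimal digit, or -1 if c is not one. */
static int sketch_hexval(unsigned char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Decode %HH and '+' in buf[0..len) in place; returns the new length. */
size_t sketch_url_decode(char *buf, size_t len)
{
    size_t out = 0;

    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; ++i) {
        unsigned char c = (unsigned char)buf[i];
        if (c == '+') {
            buf[out++] = ' ';          /* '+' is an ASCII space */
            continue;
        }
        if (c == '%' && i + 2 < len) {
            int hi = sketch_hexval((unsigned char)buf[i + 1]);
            int lo = sketch_hexval((unsigned char)buf[i + 2]);
            if (hi >= 0 && lo >= 0) {
                buf[out++] = (char)((hi << 4) | lo);
                i += 2;                /* Skip the two hex digits */
                continue;
            }
        }
        buf[out++] = (char)c;          /* Copy everything else unchanged */
    }
    return out;
}
```

The warning above corresponds to running the function twice: one pass turns "%2527" into "%27", and only a second pass turns that into a single quote character.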

min

Description: Given a collection of numeric data, returns the minimum value.

Module: core

Version: 0.3

max

Description: Given a collection of numeric data, returns the maximum value.

Module: core

Version: 0.3

normalizePath

Description:

Module: core

Version: Not implemented yet.

Chapter 8. Extending IronBee

...

Overview

...

Warning

This documentation is currently out of date.

Execution Flow

Definitions

Engine

The framework that controls data flow, state and code execution.

Plugin

Server native code for embedding the engine into another software base (e.g. the Apache httpd server). The plugin is responsible for instantiating the engine, initiating the initial configuration process, feeding the engine with data and optionally implementing methods of blocking.

Hook

A hook is an execution point within the engine that allows external code to be registered and executed as if it were part of the engine. There are many builtin hooks in the IronBee engine and custom hooks can also be added. Hooks are typically leveraged by modules.

Module

Engine code that is not essential to the core engine, but rather extends what the engine can accomplish by hooking into it. Modules in IronBee are dynamically loadable files which can extend and alter how the engine executes. There are a number of different types of modules which will be explained in detail. Some examples of modules are HTTP parsers, matching algorithms, logging methods, rule languages/executors and specialized detection techniques. All IronBee features are essentially modules, which allows nearly every aspect of the engine to be extended.

Provider Definition

An abstract interface, optionally defining an API, which can be used to extend functionality of the engine. There are a number of providers defined by the core module for parsing, logging, data manipulation, matching, etc. Each definition includes a unique type, which must be implemented by a provider. The provider definition essentially describes the methods that must be implemented to extend the engine with the given type of functionality.

Provider

Code (normally a module) implementing an abstract interface described by the provider definition. This code is then registered with the provider definition type and a unique key for later lookup. For example, the core module defines a provider of type "matcher" which must have a set of methods implemented. The "pcre" module implements these methods and registers a PCRE matcher, which can then be used by other modules to perform PCRE based matches.

Provider Instance

Some providers may require multiple instances to be instantiated with differing scope or configurations. A provider instance is such an instantiation, which is associated with both a provider and a configuration. As an example, the data provider must manage data in various scopes (e.g. connection data, session data and transaction data). The core module creates a provider instance handling this data for each of the different scopes. Another example is the logger provider which must allow the same file based logger (a provider) to be configured differently per configuration context (e.g. log to different log files).

Flow

There are four main stages of execution detailed below.

Startup Stage

During startup, the plugin is instantiated by whatever server has loaded it, for example when the Apache httpd server loads/configures the plugin. During this stage, the plugin instantiates the engine and initiates the configuration stage.

  1. Server starts and instantiates/starts the plugin.

  2. Plugin is configured with native plugin configuration, which includes the location of the engine configuration.

  3. Utility libraries are initialized.

  4. Engine is instantiated.

    1. An engine configuration context is created.

    2. Static core module is loaded which defines builtin provider APIs.

  5. Plugin registers a native logging provider.

  6. Engine configuration stage is initiated based on initial plugin configuration.

Configuration Stage

During configuration, the configuration files/scripts are read/executed, engine modules are loaded/initialized and contexts are created/configured in preparation for the runtime stage. The following is an outline of what will happen during this stage.

  1. Configuration is read/executed.

  2. The main configuration context is created.

  3. Modules are loaded.

    1. Module global configuration data is copied to the global context as a base configuration.

    2. Module "init" function is called just after it is loaded to initialize any global module configuration.

    3. Modules may hook into the engine globally by registering to be called when certain events occur.

    4. If successfully initialized, a module is registered with the engine.

  4. Configuration contexts are created and registered along with a function which will be executed to determine if the context will be chosen at runtime.

  5. Modules register themselves with a configuration context if they are to be used in that context.

    1. Module "context init" function is called to initialize any context configuration.

    2. Modules may hook into the engine for the given context by registering to be called when certain events occur.

  6. The runtime stage is initiated.

Runtime Stage

During runtime all of the configuration has been finalized and the engine will now handle data passed to it by the plugin. Data is handled by the state machine, which essentially follows a four step process. First, a configuration context is chosen. Second, the request is handled. Third, the response is handled. Finally, any post processing is performed. Below is an outline of the flow.

  1. Raw connection HTTP data is received by the plugin and passed to the engine.

  2. [Need to add connection context here. Events could be: conn open, conn data (inbound/outbound), conn close. Configuration options include which protocol parser to use, default parser configuration, whether to decrypt SSL, private keys for decryption, etc.]

  3. If the connection is encrypted, SSL decryption takes place. This step is optional and will largely depend on how the plugin is designed. For example, the Apache plugin will always send decrypted data.

  4. The engine parses the data as a stream, buffering if configured to do so.

  5. The parser notifies the engine of various events (request headers available, request body, etc.)

  6. Any hooks associated with events are executed.

  7. Once enough data is available, the configuration context selection process is started. Each registered configuration context function is called in turn until one returns that it wants to be enabled.

    1. At this point all modules registered in the chosen context will have their "context activated" functions executed, allowing them to be prepared for executing in the context.

  8. Further events occur and associated hooks are executed, but now with the chosen configuration context instead of the global context.
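The context selection in step 7 can be sketched as a loop over the registered contexts, each carrying the function that decides whether it applies. This is an illustration only: all sketch_ names are hypothetical, selection by hostname is just one possible criterion, and falling back to the main context when no function accepts is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical selection callback: true if the context wants the data. */
typedef bool (*sketch_ctx_select_fn)(const char *hostname,
                                     const void *cbdata);

typedef struct {
    const char          *name;    /* Context name */
    sketch_ctx_select_fn select;  /* Registered with the context */
    const void          *cbdata;  /* Data passed to the callback */
} sketch_ctx_t;

/* Sample callback: select a context by exact hostname. */
bool sketch_select_by_host(const char *hostname, const void *cbdata)
{
    return strcmp(hostname, (const char *)cbdata) == 0;
}

/* Call each registered function in turn until one accepts. */
const sketch_ctx_t *sketch_choose_context(const sketch_ctx_t *ctxs, size_t n,
                                          const sketch_ctx_t *main_ctx,
                                          const char *hostname)
{
    assert(main_ctx != NULL && hostname != NULL);
    for (size_t i = 0; i < n; ++i) {
        if (ctxs[i].select != NULL &&
            ctxs[i].select(hostname, ctxs[i].cbdata)) {
            return &ctxs[i];  /* First context that wants to be enabled */
        }
    }
    return main_ctx;          /* Assumed fallback when none accepts */
}
```

Because the first accepting context wins, the order in which contexts are registered matters in this sketch.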

Reconfiguration Stage

During a reconfiguration, the engine has been notified that configuration changes are available and should be reloaded. This is very similar to startup, except that instead of the plugin initiating the configuration stage, the engine is initiating the configuration stage. During this time, any data currently being processed will continue to use the old configuration.

TODO

Hooks

TODO: Add description of each hook

Modules

Modules make up the majority of executed code in IronBee. Most features are built using modules. There are three primary reasons for this. First, it makes the code more readable and each feature more self contained. Second, it allows only features in use to be loaded into the executable. And last, since modules are shared libraries, it makes for easier upgrades as the engine only needs to unload the old code and reload the new.

Modules can interact with the engine in quite a few different ways. However, there are two primary module types: provider and standalone. The simplest is the provider module, which implements one of the many predefined (or even custom) provider definitions. As an example, a provider module can be used to extend logging capabilities, add a different type of pattern matcher or provide a different way of storing/accessing data. The provider module typically will define a configuration and a set of functions implementing a provider's well defined interface, then register these functions with the engine. A standalone module will typically be more involved. Standalone modules can hook into any part of the engine and are typically used to implement a new detection technique or provide some sort of filtering. To do this, the standalone module defines a configuration and registers functions with various hooks in the engine. While this is similar to a provider module, there is no well defined set of functions to implement in a standalone module.

Standalone Modules

Modules have three essential duties. A module must export a known symbol so that it can be loaded. A set of configuration parameters may be defined. And common module functions must be registered, which will be called at various initialization and cleanup points. With Lua, however, this is much simpler than in C.

Exporting a symbol is quite language specific and will not be discussed here.

Any number of configuration parameters are registered with the engine and their storage locations are then mapped by the engine both globally to the module as well as into each configuration context. As of this writing, there are two types of configuration parameters, numeric and string. Default values can be defined along with the configuration parameter definitions.
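One plausible way to map named parameters onto storage inside a module's configuration structure is to record each field's type, offset and length, together with a default. The sketch below illustrates that idea with offsetof; it is not the IronBee cfgmap API, and every sketch_ or SKETCH_ name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { SKETCH_FTYPE_NUM, SKETCH_FTYPE_STR } sketch_ftype_t;

/* One mapping entry: where a named parameter lives in the structure. */
typedef struct {
    const char    *name;
    sketch_ftype_t type;
    size_t         offset;
    size_t         length;
    long           default_num;
    const char    *default_str;
} sketch_cfgmap_entry_t;

#define SKETCH_CFGMAP_NUM(name, type, field, dflt) \
    { (name), SKETCH_FTYPE_NUM, offsetof(type, field), \
      sizeof(((type *)0)->field), (dflt), NULL }
#define SKETCH_CFGMAP_STR(name, type, field, dflt) \
    { (name), SKETCH_FTYPE_STR, offsetof(type, field), \
      sizeof(((type *)0)->field), 0, (dflt) }

/* Example module configuration with one numeric and one string field. */
typedef struct {
    long        trace;
    const char *label;
} sketch_cfg_t;

static const sketch_cfgmap_entry_t sketch_map[] = {
    SKETCH_CFGMAP_NUM("sketch.trace", sketch_cfg_t, trace, 0),
    SKETCH_CFGMAP_STR("sketch.label", sketch_cfg_t, label, "default")
};
#define SKETCH_MAP_LEN (sizeof(sketch_map) / sizeof(sketch_map[0]))

/* Write every default into the structure through the recorded offsets. */
void sketch_cfgmap_defaults(const sketch_cfgmap_entry_t *map, size_t n,
                            void *cfg)
{
    assert(map != NULL && cfg != NULL);
    for (size_t i = 0; i < n; ++i) {
        char *field = (char *)cfg + map[i].offset;
        if (map[i].type == SKETCH_FTYPE_NUM) {
            assert(map[i].length == sizeof(map[i].default_num));
            memcpy(field, &map[i].default_num, map[i].length);
        }
        else {
            assert(map[i].length == sizeof(map[i].default_str));
            memcpy(field, &map[i].default_str, map[i].length);
        }
    }
}

/* Set a numeric parameter by name; returns 0 on success, -1 if unknown. */
int sketch_cfgmap_set_num(const sketch_cfgmap_entry_t *map, size_t n,
                          void *cfg, const char *name, long value)
{
    assert(map != NULL && cfg != NULL && name != NULL);
    for (size_t i = 0; i < n; ++i) {
        if (map[i].type == SKETCH_FTYPE_NUM &&
            strcmp(map[i].name, name) == 0) {
            assert(map[i].length == sizeof(value));
            memcpy((char *)cfg + map[i].offset, &value, sizeof(value));
            return 0;
        }
    }
    return -1;
}
```

The benefit of this design is that generic code can read and write parameters by name without knowing the module's structure layout, while the module itself keeps direct access to plain C fields.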

The eventual goal of a module is to register functions to be called by the engine. Typically in a standalone module, this is done by registering functions to be called with hooks. Hooks allow executing at defined points in the connection/transaction lifecycle, which is documented with the state machine in the API documentation.

TODO: Need more on what a basic module will look like without going into language details.

Provider Modules

A provider module typically implements the core provider APIs -- allowing different methods of debug logging, matching, etc. -- but a provider module can also extend the core by adding its own custom interface. Providers are defined with a set of module configuration parameters, an API to allow calling the provider and an abstract interface which must be implemented by other modules. This abstract interface is what most provider modules will be implementing.

The provider API is the interface that the consumer calls to use the service. This is essentially the public calling interface. Typically this is already defined by the core provider definitions and is only used to define custom providers. As an example, the core logger API defines two functions to perform logging. The first is a vprintf like function with a va_list argument and the second a printf like function with a variable argument list.

The provider interface is what is called by the API to do the real work. This is a private interface that any module implementing this provider will need to implement. In the case of the logger provider, there is a single function named "logger" of type "qi_log_logger_fn_t". Both of the functions defined in the API will just call this function to do the actual logging. Note, as in this case, the public API does not have to match the private interface.
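The split between a two function public API and a single private interface function can be sketched as follows. This is an illustration of the pattern, not the core logger's actual declarations: the sketch_ names, the callback signature and the in-memory logger are all assumptions.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Private interface: the single function a logger provider implements. */
typedef void (*sketch_logger_fn)(void *cbdata, int level,
                                 const char *fmt, va_list ap);

typedef struct {
    sketch_logger_fn logger;  /* Implementation supplied by a module */
    void            *cbdata;  /* Implementation specific data */
} sketch_log_provider_t;

/* Public API 1: vprintf like, takes a va_list. */
void sketch_vlog(const sketch_log_provider_t *p, int level,
                 const char *fmt, va_list ap)
{
    assert(p != NULL && p->logger != NULL);
    p->logger(p->cbdata, level, fmt, ap);
}

/* Public API 2: printf like, takes a variable argument list. */
void sketch_log(const sketch_log_provider_t *p, int level,
                const char *fmt, ...)
{
    va_list ap;

    assert(p != NULL && p->logger != NULL);
    va_start(ap, fmt);
    p->logger(p->cbdata, level, fmt, ap);
    va_end(ap);
}

/* Example implementation: format the message into a memory buffer. */
typedef struct {
    char text[128];
} sketch_membuf_t;

void sketch_membuf_logger(void *cbdata, int level,
                          const char *fmt, va_list ap)
{
    sketch_membuf_t *mb = cbdata;
    size_t used;

    snprintf(mb->text, sizeof(mb->text), "[%d] ", level);
    used = strlen(mb->text);
    vsnprintf(mb->text + used, sizeof(mb->text) - used, fmt, ap);
}
```

A consumer only ever calls sketch_log or sketch_vlog. Swapping the buffer logger for one that writes to a file changes nothing on the calling side.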

To implement a defined provider interface, a module needs to register any required interface functions with the engine with a unique key. At this point the provider's register function is called. This function is defined with the provider and is typically used to validate the registered interface, allowing version checking as well as any initial setup (the "version" field is defined by the default interface header).
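The registration step can be sketched as a small registry: a provider definition holds a type name and a register function, and each provider is stored under a unique key once the register function has validated its interface. The sketch below is an assumption driven illustration, not the engine's provider API; all sketch_ names, the fixed size table and the substring matcher are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_IFACE_VERSION 1
#define SKETCH_MAX_PROVIDERS 8

/* Hypothetical matcher interface; "version" models the header field. */
typedef struct {
    int version;
    int (*match)(const char *patt, const char *data);
} sketch_matcher_iface_t;

/* Register function defined with the provider definition. */
typedef int (*sketch_register_fn)(const void *iface);

typedef struct {
    const char        *type;         /* e.g. "matcher" */
    sketch_register_fn on_register;  /* Validates a new interface */
    const char        *keys[SKETCH_MAX_PROVIDERS];
    const void        *ifaces[SKETCH_MAX_PROVIDERS];
    size_t             count;
} sketch_provider_def_t;

/* Look up a registered provider by key; NULL if not found. */
const void *sketch_provider_lookup(const sketch_provider_def_t *def,
                                   const char *key)
{
    assert(def != NULL && key != NULL);
    for (size_t i = 0; i < def->count; ++i) {
        if (strcmp(def->keys[i], key) == 0) {
            return def->ifaces[i];
        }
    }
    return NULL;
}

/* Register an interface under a unique key; 0 on success, -1 on error. */
int sketch_provider_register(sketch_provider_def_t *def, const char *key,
                             const void *iface)
{
    assert(def != NULL && key != NULL && iface != NULL);
    if (def->count >= SKETCH_MAX_PROVIDERS ||
        sketch_provider_lookup(def, key) != NULL) {
        return -1;  /* Table full or key already in use */
    }
    if (def->on_register != NULL && def->on_register(iface) != 0) {
        return -1;  /* Rejected by the definition's register function */
    }
    def->keys[def->count] = key;
    def->ifaces[def->count] = iface;
    def->count++;
    return 0;
}

/* Register function for matchers: check version and required function. */
int sketch_matcher_on_register(const void *iface)
{
    const sketch_matcher_iface_t *m = iface;
    return (m->version == SKETCH_IFACE_VERSION && m->match != NULL) ? 0 : -1;
}

/* Trivial matcher used as an example provider: substring search. */
int sketch_substr_match(const char *patt, const char *data)
{
    return strstr(data, patt) != NULL;
}
```

Other modules would then obtain the matcher by key through the lookup function instead of linking against the implementing module directly.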

Core Provider Interfaces

The engine core defines a number of standard provider interfaces which a module can implement. Each interface is implemented using the same concepts as above, however each has a different API.

Logging Interface

...

Event Interface

...

HTTP Parser Interface

...

Matcher Interface

...

Data Interface

...

Writing Modules in C

TODO: Some general description on why one would want to do this.

Anatomy of a C Module

A C module is built into a shared library. The shared library exposes a known structure (see Example 8.1) that IronBee uses to load the module.

Example 8.1. IronBee Module Structure

struct ib_module_t {
    /* Header */
    uint32_t                vernum;           /* Engine version number */
    uint32_t                abinum;           /* Engine ABI Number */
    const char             *version;          /* Engine version string */
    const char             *filename;         /* Module code filename */
    void                   *data;             /* Module data */
    ib_engine_t            *ib;               /* Engine */
    size_t                  idx;              /* Module index */

    /* Module Config */
    const char             *name;             /* Module name */
    const void             *gcdata;           /* Global config data */
    size_t                  gclen;            /* Global config data length */
    const ib_cfgmap_init_t *cm_init;          /* Module config mapping */
    const ib_dirmap_init_t *dm_init;          /* Module directive mapping */

    /* Functions */
    ib_module_fn_init_t     fn_init;          /* Module init */
    ib_module_fn_fini_t     fn_fini;          /* Module finish */
    ib_module_fn_ctx_init_t fn_ctx_init;      /* Module context init */
    ib_module_fn_ctx_fini_t fn_ctx_fini;      /* Module context fini */
};

A module must define and initialize this structure to be loadable in IronBee. This is done by defining a few functions and making a few macro calls. A minimal module example is given in Example 8.2.

Example 8.2. Minimal Module

#include <ironbee/engine.h>
#include <ironbee/debug.h>
#include <ironbee/module.h>

/* Declare the public module symbol. */
IB_MODULE_DECLARE();

/* Called when module is loaded. */
static ib_status_t exmin_init(ib_engine_t *ib, ib_module_t *m)
{
    IB_FTRACE_INIT();
    ib_log_debug(ib, 4, "Example minimal module loaded.");
    IB_FTRACE_RET_STATUS(IB_OK);
}

/* Called when module is unloaded. */
static ib_status_t exmin_fini(ib_engine_t *ib, ib_module_t *m)
{
    IB_FTRACE_INIT();
    ib_log_debug(ib, 4, "Example minimal module unloaded.");
    IB_FTRACE_RET_STATUS(IB_OK);
}

/* Initialize the module structure. */
IB_MODULE_INIT(
    IB_MODULE_HEADER_DEFAULTS,      /* Default metadata */
    "exmin",                        /* Module name */
    IB_MODULE_CONFIG_NULL,          /* Global config data */
    NULL,                           /* Configuration field map */
    NULL,                           /* Config directive map */
    exmin_init,                     /* Initialize function */
    exmin_fini,                     /* Finish function */
    NULL,                           /* Context init function */
    NULL                            /* Context fini function */
);

Example 8.2 shows a very minimalistic module that does nothing but log when the module loads and unloads. The module includes some standard IronBee headers, declares itself a module and defines two functions. The module structure is then initialized with these functions assigned to the fn_init and fn_fini fields. This results in the exmin_init and exmin_fini functions being called when the module is loaded and unloaded, respectively. Of course much more can be done with a module.

TODO: Describe what other things a module can do.

A Simple C Module Example

To better illustrate writing a C module we need a simple task to accomplish. Here we will define a minimalistic signature language. To keep things simple, the module will stick to IronBee built-in features and ignore any performance concerns. The module will simply allow a user to add signatures to IronBee. In this case a signature is defined as performing a PCRE based regular expression match on a given data field and triggering an event if there is a match.

To accomplish this task, we need to write a module that does the following:

  • Allow writing a signature within the configuration file that allows specifying when it should execute, what field it should match against, a regular expression and an event message that should be triggered on match.

  • Parse the signature into its various components.

  • Compile the PCRE and store the signature for later execution.

  • At runtime, execute the signatures at the specified time.

  • If a signature matches, generate an event.

The module begins the same as in Example 8.2, but with some additional type definitions which we will use to store our signatures.

Example 8.3. Signature Module Setup

#include <string.h>   /* strlen */
#include <strings.h>  /* strcasecmp */

#include <ironbee/engine.h>
#include <ironbee/debug.h>
#include <ironbee/mpool.h>
#include <ironbee/cfgmap.h>
#include <ironbee/module.h>
#include <ironbee/provider.h>

/* Define the module name as well as a string version of it. */
#define MODULE_NAME               pocsig
#define MODULE_NAME_STR           IB_XSTRINGIFY(MODULE_NAME)

/* Declare the public module symbol. */
IB_MODULE_DECLARE();

typedef struct pocsig_cfg_t pocsig_cfg_t;
typedef struct pocsig_sig_t pocsig_sig_t;

/* Signature Phases */
typedef enum {
    POCSIG_PRE,                   /* Pre transaction phase */
    POCSIG_REQHEAD,               /* Request headers phase */
    POCSIG_REQ,                   /* Request phase */
    POCSIG_RESHEAD,               /* Response headers phase */
    POCSIG_RES,                   /* Response phase */
    POCSIG_POST,                  /* Post transaction phase */

    /* Keep track of the number of defined phases. */
    POCSIG_PHASE_NUM
} pocsig_phase_t;

/* Signature Structure */
struct pocsig_sig_t {
    const char         *target;   /* Target name */
    const char         *patt;     /* Pattern to match in target */
    void               *cpatt;    /* Compiled PCRE regex */
    const char         *emsg;     /* Event message */
};

Configuration

Modules control their own configuration structure. Normally a module will use a simple C structure which it can reference directly. However, a module may also expose some or all of its configuration. Any exposed parameters can then be accessed by other modules and/or through the configuration language. In addition to exposing configuration parameters a module can register and expose new configuration directives for use in the configuration language.

In this example we will need to track multiple lists of signatures (one for each point of execution) and a handle to the PCRE pattern matcher. While these will not be exposed, we will expose a numeric parameter to toggle tracing signature execution. The configuration is defined and instantiated in a C structure shown in Example 8.4.

Example 8.4. Configuration Structure

/* Module Configuration Structure */
struct pocsig_cfg_t {
    /* Exposed as configuration parameters. */
    ib_num_t            trace;                   /* Log signature tracing */

    /* Private. */
    ib_list_t          *phase[POCSIG_PHASE_NUM]; /* Phase signature lists */
    ib_matcher_t       *pcre;                    /* PCRE matcher */
};

/* Instantiate a module global configuration. */
static pocsig_cfg_t pocsig_global_cfg;

We will then define a configuration directive to control tracing as well as signature directives for each phase of execution. Note that multiple signature directives are only used to simplify the example so that we do not have to write rule parsing code. The functions defined in Example 8.5 are used to handle the configuration directives, which we will define later on.

The pocsig_dir_trace function is a simple single parameter directive handler which parses the parameter for an "On" or "Off" value and sets a numeric parameter value in the configuration context. We will see how this parameter is exposed later on. The pocsig_dir_signature function is a directive handler that can handle an arbitrary number of parameters. Note that much of this function is described later on with pattern matchers.

Example 8.5. Configuration Directive Handlers

/* Handle PocSigTrace directive. */
static ib_status_t pocsig_dir_trace(ib_cfgparser_t *cp,
                                    const char *name,
                                    const char *p1,
                                    void *cbdata)
{
    IB_FTRACE_INIT();
    ib_engine_t *ib = cp->ib;
    ib_context_t *ctx = cp->cur_ctx ? cp->cur_ctx : ib_context_main(ib);
    ib_status_t rc;

    ib_log_debug(ib, 7, "%s: \"%s\" ctx=%p", name, p1, ctx);
    if (strcasecmp("On", p1) == 0) {
        rc = ib_context_set_num(ctx, MODULE_NAME_STR ".trace", 1);
        IB_FTRACE_RET_STATUS(rc);
    }
    else if (strcasecmp("Off", p1) == 0) {
        rc = ib_context_set_num(ctx, MODULE_NAME_STR ".trace", 0);
        IB_FTRACE_RET_STATUS(rc);
    }

    ib_log_error(ib, 1, "Failed to parse directive: %s \"%s\"", name, p1);
    IB_FTRACE_RET_STATUS(IB_EINVAL);
}

/* Handle all PocSig* signature directives. */
static ib_status_t pocsig_dir_signature(ib_cfgparser_t *cp,
                                        const char *name,
                                        ib_list_t *args,
                                        void *cbdata)
{
    IB_FTRACE_INIT();
    ib_engine_t *ib = cp->ib;
    ib_context_t *ctx = cp->cur_ctx ? cp->cur_ctx : ib_context_main(ib);
    const char *target;
    const char *op;
    const char *action;
    pocsig_cfg_t *cfg;
    pocsig_phase_t phase;
    pocsig_sig_t *sig;
    const char *errptr;
    int erroff;
    ib_status_t rc;

    /* Get the pocsig configuration for this context. */
    rc = ib_context_module_config(ctx, IB_MODULE_STRUCT_PTR, (void *)&cfg);
    if (rc != IB_OK) {
        ib_log_error(ib, 1, "Failed to fetch %s config: %d",
                     MODULE_NAME_STR, rc);
        /* Without a configuration, cfg must not be dereferenced below. */
        IB_FTRACE_RET_STATUS(rc);
    }

    /* Setup the PCRE matcher. */
    if (cfg->pcre == NULL) {
        rc = ib_matcher_create(ib, ib_engine_pool_config_get(ib),
                               "pcre", &cfg->pcre);
        if (rc != IB_OK) {
            ib_log_error(ib, 2, "Could not create a PCRE matcher: %d", rc);
            IB_FTRACE_RET_STATUS(rc);
        }
    }

    /* Determine phase and initialize the phase list if required. */
    if (strcasecmp("PocSigPreTx", name) == 0) {
        phase = POCSIG_PRE;
        if (cfg->phase[phase] == NULL) {
            rc = ib_list_create(cfg->phase + POCSIG_PRE,
                                ib_engine_pool_config_get(ib));
            if (rc != IB_OK) {
                IB_FTRACE_RET_STATUS(rc);
            }
        }
    }
    /* ... PocSigReqHead removed for brevity ... */
    /* ... PocSigReq removed for brevity ... */
    /* ... PocSigResHead removed for brevity ... */
    /* ... PocSigRes removed for brevity ... */
    /* ... PocSigPostTx removed for brevity ... */
    else {
        ib_log_error(ib, 2, "Invalid signature: %s", name);
        IB_FTRACE_RET_STATUS(IB_EINVAL);
    }

    /* Target */
    rc = ib_list_shift(args, &target);
    if (rc != IB_OK) {
        ib_log_error(ib, 1, "No PocSig target");
        IB_FTRACE_RET_STATUS(IB_EINVAL);
    }

    /* Operator */
    rc = ib_list_shift(args, &op);
    if (rc != IB_OK) {
        ib_log_error(ib, 1, "No PocSig operator");
        IB_FTRACE_RET_STATUS(IB_EINVAL);
    }

    /* Action */
    rc = ib_list_shift(args, &action);
    if (rc != IB_OK) {
        ib_log_debug(ib, 4, "No PocSig action");
        action = "";
    }

    /* Signature */
    sig = (pocsig_sig_t *)ib_mpool_alloc(ib_engine_pool_config_get(ib),
                                         sizeof(*sig));
    if (sig == NULL) {
        IB_FTRACE_RET_STATUS(IB_EALLOC);
    }

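    /* Copy the strings, including their terminating NUL. */
    sig->target = ib_mpool_memdup(ib_engine_pool_config_get(ib),
                                  target, strlen(target) + 1);
    sig->patt = ib_mpool_memdup(ib_engine_pool_config_get(ib),
                                op, strlen(op) + 1);
    sig->emsg = ib_mpool_memdup(ib_engine_pool_config_get(ib),
                                action, strlen(action) + 1);
    if (sig->target == NULL || sig->patt == NULL || sig->emsg == NULL) {
        IB_FTRACE_RET_STATUS(IB_EALLOC);
    }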

    /* Compile the PCRE patt. */
    if (cfg->pcre == NULL) {
        ib_log_error(ib, 2, "No PCRE matcher available (load the pcre module?)");
        IB_FTRACE_RET_STATUS(IB_EINVAL);
    }
    sig->cpatt = ib_matcher_compile(cfg->pcre, sig->patt, &errptr, &erroff);
    if (sig->cpatt == NULL) {
        ib_log_error(ib, 2, "Error at offset=%d of PCRE patt=\"%s\": %s",
                     erroff, sig->patt, errptr);
        IB_FTRACE_RET_STATUS(IB_EINVAL);
    }

    ib_log_debug(ib, 4, "POCSIG: \"%s\" \"%s\" \"%s\" phase=%d ctx=%p",
                 target, op, action, phase, ctx);

    /* Add the signature to the phase list. */
    rc = ib_list_push(cfg->phase[phase], sig);
    if (rc != IB_OK) {
        ib_log_error(ib, 1, "Failed to add signature");
        IB_FTRACE_RET_STATUS(rc);
    }

    IB_FTRACE_RET_STATUS(IB_OK);
}

Any configuration parameters and directives must be registered with the engine. This is accomplished through two mapping structures, as shown in Example 8.6. Each exposed configuration parameter is given a name, typically modulename.name, and the engine is told its type, offset, length and default value. A macro wraps these details to make registration easier. Configuration directives are registered in a similar fashion and mapped to handler functions.

Example 8.6. Registering the Configuration

/* Configuration parameter initialization structure. */
static IB_CFGMAP_INIT_STRUCTURE(pocsig_config_map) = {
    /* trace */
    IB_CFGMAP_INIT_ENTRY(
        MODULE_NAME_STR ".trace",
        IB_FTYPE_NUM,
        pocsig_cfg_t,
        trace,
        0
    ),

    /* End */
    IB_CFGMAP_INIT_LAST
};

/* Directive initialization structure. */
static IB_DIRMAP_INIT_STRUCTURE(pocsig_directive_map) = {
    /* PocSigTrace - Enable/Disable tracing */
    IB_DIRMAP_INIT_PARAM1(
        "PocSigTrace",
        pocsig_dir_trace,
        NULL
    ),

    /* PocSig* - Define a signature in various phases */
    IB_DIRMAP_INIT_LIST(
        "PocSigPreTx",
        pocsig_dir_signature,
        NULL
    ),
    IB_DIRMAP_INIT_LIST(
        "PocSigReqHead",
        pocsig_dir_signature,
        NULL
    ),
    IB_DIRMAP_INIT_LIST(
        "PocSigReq",
        pocsig_dir_signature,
        NULL
    ),
    IB_DIRMAP_INIT_LIST(
        "PocSigResHead",
        pocsig_dir_signature,
        NULL
    ),
    IB_DIRMAP_INIT_LIST(
        "PocSigRes",
        pocsig_dir_signature,
        NULL
    ),
    IB_DIRMAP_INIT_LIST(
        "PocSigPostTx",
        pocsig_dir_signature,
        NULL
    ),

    /* End */
    IB_DIRMAP_INIT_LAST
};
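To make the bookkeeping behind such a field map concrete, here is a minimal sketch of the general technique. It is not the IronBee implementation: every demo_ name, the two-field structure and the long-only field type are assumptions made for illustration. Each entry records a parameter's exposed name, offset, length and default, so that generic code can fill in a configuration structure without knowing its layout.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical module configuration; not the real pocsig_cfg_t. */
typedef struct {
    long trace;
    long limit;
} demo_cfg_t;

/* One map entry: exposed name, where the field lives, and its default. */
typedef struct {
    const char *name;
    size_t      offset;
    size_t      length;
    long        dflt;
} demo_cfgmap_t;

/* Capture offset and length at compile time, as a mapping macro might. */
#define DEMO_CFGMAP_ENTRY(name, type, field, dflt) \
    { (name), offsetof(type, field), sizeof(((type *)0)->field), (dflt) }
#define DEMO_CFGMAP_LAST { NULL, 0, 0, 0 }

static const demo_cfgmap_t demo_config_map[] = {
    DEMO_CFGMAP_ENTRY("demo.trace", demo_cfg_t, trace, 0),
    DEMO_CFGMAP_ENTRY("demo.limit", demo_cfg_t, limit, 100),
    DEMO_CFGMAP_LAST
};

/* Write every default into cfg using only the information in the map. */
static void demo_cfgmap_apply_defaults(const demo_cfgmap_t *map, void *cfg)
{
    for (; map->name != NULL; ++map) {
        assert(map->length == sizeof(long)); /* sketch handles long only */
        memcpy((char *)cfg + map->offset, &map->dflt, map->length);
    }
}

/* Set a parameter by its exposed name; 0 on success, -1 if unknown. */
static int demo_cfgmap_set(const demo_cfgmap_t *map, void *cfg,
                           const char *name, long value)
{
    for (; map->name != NULL; ++map) {
        if (strcmp(map->name, name) == 0) {
            memcpy((char *)cfg + map->offset, &value, map->length);
            return 0;
        }
    }
    return -1;
}
```

With this layout, calling demo_cfgmap_apply_defaults(demo_config_map, &cfg) populates a fresh demo_cfg_t, and a directive handler could later call demo_cfgmap_set with the exposed name to change a single value.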

Pattern Matchers

Pattern matchers are defined through the matcher provider interface. These matchers are typically loaded via modules. The PCRE matcher, for instance, is loaded through the pcre module, which must be loaded for our example module to work. A matcher provider exposes a common interface for calling any pattern matcher registered with the engine.

In Example 8.5 ib_matcher_create is used to fetch the PCRE pattern matcher. This matcher is used here to compile the patterns with ib_matcher_compile. The matcher is stored in the configuration context for later use in executing the signatures. The compiled pattern is stored in the signature structure which is added to a list for later execution.
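The provider idea described above can be sketched in plain C as a table of function pointers looked up by name. This is an illustration under stated assumptions, not the IronBee matcher API: the demo_ types, the lookup function and the toy "substr" provider are all hypothetical stand-ins for a real provider such as PCRE.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical provider interface: compile a pattern, then match data. */
typedef struct {
    const char *name;
    void *(*compile)(const char *patt);
    int   (*match)(void *cpatt, const char *data, size_t dlen);
} demo_matcher_provider_t;

/* Toy "substr" provider: the compiled form is just the pattern itself. */
static void *substr_compile(const char *patt)
{
    return (patt != NULL && *patt != '\0') ? (void *)patt : NULL;
}

/* Return 1 if the pattern occurs anywhere in data[0..dlen), else 0. */
static int substr_match(void *cpatt, const char *data, size_t dlen)
{
    const char *patt = (const char *)cpatt;
    size_t plen = strlen(patt);
    size_t i;

    for (i = 0; i + plen <= dlen; ++i) {
        if (memcmp(data + i, patt, plen) == 0) {
            return 1;
        }
    }
    return 0;
}

/* Providers registered with this sketch "engine". */
static const demo_matcher_provider_t demo_providers[] = {
    { "substr", substr_compile, substr_match },
    { NULL, NULL, NULL }
};

/* Look a provider up by name, much as a module would ask for "pcre". */
static const demo_matcher_provider_t *demo_matcher_lookup(const char *name)
{
    const demo_matcher_provider_t *p;

    assert(name != NULL);
    for (p = demo_providers; p->name != NULL; ++p) {
        if (strcmp(p->name, name) == 0) {
            return p;
        }
    }
    return NULL;
}
```

A caller fetches the provider once, compiles each pattern at configuration time, and keeps the compiled form for repeated calls to match at runtime, which mirrors the split between configuration and execution in the example module.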

Hooks

Up until now, we have been dealing with configuration time processing. In order to handle processing at runtime, we have to define a handler and register this handler to be executed at defined points. Since all signatures are executed in the same fashion, we can define a single handler and register it to be executed multiple times.

Example 8.7. Runtime Hook Handlers

static ib_status_t pocsig_handle_sigs(ib_engine_t *ib,
                                      ib_state_event_type_t event,
                                      ib_tx_t *tx,
                                      void *cbdata)
{
    IB_FTRACE_INIT();
    pocsig_cfg_t *cfg;
    pocsig_phase_t phase = (pocsig_phase_t)(uintptr_t)cbdata;
    ib_list_t *sigs;
    ib_list_node_t *node;
    int dbglvl;
    ib_status_t rc;

    /* Get the pocsig configuration for this context. */
    rc = ib_context_module_config(tx->ctx, IB_MODULE_STRUCT_PTR, (void *)&cfg);
    if (rc != IB_OK) {
        ib_log_error(ib, 1, "Failed to fetch %s config: %d",
                     MODULE_NAME_STR, rc);
        /* cfg is not valid past this point, so stop here. */
        IB_FTRACE_RET_STATUS(rc);
    }

    /* If tracing is enabled, lower the log level. */
    dbglvl = cfg->trace ? 4 : 9;

    /* Get the list of sigs for this phase. */
    sigs = cfg->phase[phase];
    if (sigs == NULL) {
        ib_log_debug(ib, dbglvl, "No signatures for phase=%d ctx=%p",
                     phase, tx->ctx);
        IB_FTRACE_RET_STATUS(IB_OK);
    }

    ib_log_debug(ib, dbglvl, "Executing %d signatures for phase=%d ctx=%p",
                 ib_list_elements(sigs), phase, tx->ctx);

    /* Run all the sigs for this phase. */
    IB_LIST_LOOP(sigs, node) {
        pocsig_sig_t *s = (pocsig_sig_t *)ib_list_node_data(node);
        ib_field_t *f;

        /* Fetch the field. */
        rc = ib_data_get(tx->dpi, s->target, &f);
        if (rc != IB_OK) {
            ib_log_error(ib, 4, "PocSig: No field named \"%s\"", s->target);
            continue;
        }

        /* Perform the match. */
        ib_log_debug(ib, dbglvl, "PocSig: Matching \"%s\" against field \"%s\"",
                     s->patt, s->target);
        rc = ib_matcher_match_field(cfg->pcre, s->cpatt, 0, f, NULL);
        if (rc == IB_OK) {
            ib_logevent_t *e;

            ib_log_debug(ib, dbglvl, "PocSig MATCH: %s at %s", s->patt, s->target);

            /* Create the event. */
            rc = ib_logevent_create(
                &e,
                tx->mp,
                "-",
                IB_LEVENT_TYPE_ALERT,
                IB_LEVENT_ACT_UNKNOWN,
                IB_LEVENT_PCLASS_UNKNOWN,
                IB_LEVENT_SCLASS_UNKNOWN,
                IB_LEVENT_SYS_UNKNOWN,
                IB_LEVENT_ACTION_IGNORE,
                IB_LEVENT_ACTION_IGNORE,
                90,
                80,
                s->emsg
            );
            if (rc != IB_OK) {
                ib_log_error(ib, 3, "PocSig: Error generating event: %d", rc);
                continue;
            }

            /* Log the event. */
            ib_event_add(tx->epi, e);
        }
        else {
            ib_log_debug(ib, dbglvl, "PocSig NOMATCH");
        }
    }

    IB_FTRACE_RET_STATUS(IB_OK);
}

Example 8.7 defines a handler for executing our signatures at runtime. In order to use this handler with each phase, we pass the phase number to the handler as callback data. Other than some casting trickery to pass the phase number, the function is fairly straightforward. It loops through a phase list, fetches the data field it will match against, matches the pre-compiled pattern against the field and then logs an event if there is a match.
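The casting used to carry the phase number through a void pointer can be shown in isolation. The sketch below is an assumption-laden miniature, not the engine's hook API: demo_hook_register, demo_fire and the enum values are hypothetical, and only one hook per event is supported to keep it short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { DEMO_PRE, DEMO_REQHEAD, DEMO_REQ, DEMO_PHASE_NUM } demo_phase_t;
typedef enum {
    DEMO_EV_CONTEXT, DEMO_EV_REQUEST_HEADERS, DEMO_EV_REQUEST, DEMO_EV_NUM
} demo_event_t;

typedef int (*demo_hook_fn)(void *tx, void *cbdata);

typedef struct {
    demo_hook_fn fn;
    void        *cbdata;
} demo_hook_t;

static demo_hook_t demo_hooks[DEMO_EV_NUM];   /* one hook per event */
static int demo_calls[DEMO_PHASE_NUM];        /* per-phase call counters */

/* Register fn for an event together with opaque callback data. */
static int demo_hook_register(demo_event_t event, demo_hook_fn fn, void *cbdata)
{
    if (event >= DEMO_EV_NUM || fn == NULL) {
        return -1;
    }
    demo_hooks[event].fn = fn;
    demo_hooks[event].cbdata = cbdata;
    return 0;
}

/* Fire an event: call its hook, if any, with the stored callback data. */
static int demo_fire(demo_event_t event, void *tx)
{
    if (event >= DEMO_EV_NUM || demo_hooks[event].fn == NULL) {
        return 0;
    }
    return demo_hooks[event].fn(tx, demo_hooks[event].cbdata);
}

/* Single handler shared by all phases; the phase arrives in cbdata. */
static int demo_handle_sigs(void *tx, void *cbdata)
{
    /* Go through uintptr_t so the integer survives the pointer round trip. */
    demo_phase_t phase = (demo_phase_t)(uintptr_t)cbdata;

    (void)tx;
    assert(phase < DEMO_PHASE_NUM);
    ++demo_calls[phase];
    return 0;
}
```

Registering the same function twice, for example with (void *)(uintptr_t)DEMO_REQHEAD for the request headers event and (void *)(uintptr_t)DEMO_REQ for the request event, lets one handler serve every phase while still knowing which signature list to run.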

All that is left in the module is to register the signature handler to be executed in the various phases. Example 8.8 shows the final module functions and registration required for this. Normally configuration data is exposed publicly where it is given a default value. Since some of our configuration is not exposed, we need to initialize the data ourselves. This is done through the module initialization function, pocsig_init. The context initialization function, pocsig_context_init, is called for each configuration context in which this module is configured. This is where we register our handler with the engine hooks and define the phase numbers that are passed to the handler. Finally, the module structure is initialized to point to the various configuration mapping structures and module initialization functions.

Example 8.8. Module Functions and Registration

static ib_status_t pocsig_init(ib_engine_t *ib,
                               ib_module_t *m)
{
    IB_FTRACE_INIT();

    /* Register hooks to handle the phases. */
    ib_hook_tx_register(ib, handle_context_tx_event,
                        pocsig_handle_sigs, (void *)POCSIG_PRE);
    ib_hook_tx_register(ib, handle_request_headers_event,
                        pocsig_handle_sigs, (void *)POCSIG_REQHEAD);
    ib_hook_tx_register(ib, handle_request_event,
                        pocsig_handle_sigs, (void *)POCSIG_REQ);
    ib_hook_tx_register(ib, handle_response_headers_event,
                        pocsig_handle_sigs, (void *)POCSIG_RESHEAD);
    ib_hook_tx_register(ib, handle_response_event,
                        pocsig_handle_sigs, (void *)POCSIG_RES);
    ib_hook_tx_register(ib, handle_postprocess_event,
                        pocsig_handle_sigs, (void *)POCSIG_POST);

    IB_FTRACE_RET_STATUS(IB_OK);
}

IB_MODULE_INIT(
    IB_MODULE_HEADER_DEFAULTS,           /* Default metadata */
    MODULE_NAME_STR,                     /* Module name */
    IB_MODULE_CONFIG(&pocsig_global_cfg),/* Global config data */
    pocsig_config_map,                   /* Configuration field map */
    pocsig_directive_map,                /* Config directive map */
    pocsig_init,                         /* Initialize function */
    NULL,                                /* Finish function */
    NULL,                                /* Context init function */
    NULL                                 /* Context fini function */
);

Events

TODO

Writing Modules in Lua

Lua modules are designed to be much easier to develop than a C equivalent. A Lua IronBee module is built like any other Lua module. Really all you need to do is to implement handlers which are executed when an event is triggered. These event handlers (prefixed with "onEvent") are automatically registered with the engine on load. Simply put the code you want executed in the appropriate handler and that is about it.

-- ===============================================
-- Define local aliases of any globals to be used.
-- ===============================================
local base = _G
local ironbee = require("ironbee-ffi")

-- ===============================================
-- Declare the rest of the file as a module and
-- register the module table with ironbee.
-- ===============================================
module(...)
_COPYRIGHT = "Copyright (C) 2010-2011 Qualys, Inc."
_DESCRIPTION = "IronBee example Lua module"
_VERSION = "0.1"

-- ===============================================
-- This is called to handle the
-- LuaExampleDirective directive.
--
-- ib: IronBee engine handle
-- cbdata: Callback data (from registration)
-- ...: Any arguments
-- ===============================================
function onDirectiveLuaExampleDirective(ib, cbdata, ...)
    ironbee.ib_log_debug(ib, 4, "%s.onDirectiveLuaExampleDirective ib=%p",
                       _NAME, ib.cvalue())
    return 0
end

-- ===============================================
-- This is called when the module loads
--
-- ib: IronBee engine handle
-- ===============================================
function onModuleLoad(ib)
    ironbee.ib_log_debug(ib, 4, "%s.onModuleLoad ib=%p",
                       _NAME, ib.cvalue())

    -- Register to handle a configuration directive
    ironbee.ib_config_register_directive(
        -- Engine handle
        ib,
        -- Directive
        "LuaExampleDirective",
        -- Directive Type (currently it MUST be 0 for directive or 1 for block)
        0,
        -- Full name of handler: modulename.funcname
        _NAME .. ".onDirectiveLuaExampleDirective",
        -- Block end function
        nil,
        -- Callback data (should be number, string or other C compat type)
        nil
    )

    return 0
end

-- ===============================================
-- ===============================================
-- Event Handlers
--
-- Normally only the onEventHandle* functions are
-- used for detection, but they are all listed
-- here.
--
-- NOTE: As a best practice, you should avoid
-- using the "onEvent" prefix in any public
-- functions that are NOT to be used as event
-- handlers as these may be treated specially
-- by the engine.
-- ===============================================
-- ===============================================

-- ===============================================
-- This is called when a connection context was
-- chosen and is ready to be handled.
--
-- ib: IronBee engine handle
-- conn: IronBee connection handle
-- ===============================================
function onEventHandleContextConn(ib, conn)
    ironbee.ib_log_debug(ib, 4, "%s.onEventHandleContextConn ib=%p conn=%p",
                       _NAME, ib.cvalue(), conn.cvalue())

    -- Create a pcre matcher for later use
    if pcre == nil then
        pcre = ironbee.ib_matcher_create(ib, conn.mp(), "pcre")
        ironbee.ib_log_debug(ib, 4, "Created PCRE matcher=%p", pcre)
    end

    return 0
end

-- ===============================================
-- This is called when the request headers are
-- available to inspect.
--
-- ib: IronBee engine handle
-- tx: IronBee transaction handle
-- ===============================================
function onEventHandleRequestHeaders(ib, tx)
    ironbee.ib_log_debug(ib, 4, "%s.onEventHandleRequestHeaders ib=%p tx=%s",
                       _NAME, ib.cvalue(), tx.id())

    -- Request line is a scalar value (a field object type)
    local req_line = ironbee.ib_data_get(tx.dpi(), "request_line")
    ironbee.ib_log_debug(ib, 4, "Request line is a field type: %d", req_line.type())

    -- The cvalue ("C" Value) is a pointer to the field structure, which is
    -- not very useful in Lua, but shows that you do have a direct access
    -- to the "C" inner workings:
    ironbee.ib_log_debug(ib, 4, "Request Line cvalue: %p", req_line.cvalue())

    -- The value is a Lua value (string) which can be used with other
    -- Lua functions. Be aware, however, that calling value() makes a
    -- copy of the underlying "C" representation to create the Lua version
    -- and you may not want the overhead of doing this (see PCRE matcher
    -- below for another option).
    ironbee.ib_log_debug(ib, 4, "Request Line value: %s", req_line.value())

    -- You can also request a transformed value
    local req_line_lower = ironbee.ib_data_tfn_get(tx.dpi(), "request_line", "lowercase")
    ironbee.ib_log_debug(ib, 4, "Lower case request line is a field type: %d", req_line_lower.type())
    ironbee.ib_log_debug(ib, 4, "Lower case Request Line value: %s", req_line_lower.value())

    -- Request headers are a collection (table of field objects)
    local req_headers = ironbee.ib_data_get(tx.dpi(), "request_headers")
    ironbee.ib_log_debug(ib, 4, "Request Headers is a field type: %d", req_headers.type())
    if req_headers.type() == ironbee.IB_FTYPE_LIST then
        for k,f in base.pairs(req_headers.value()) do
            if f.type() == ironbee.IB_FTYPE_LIST then
                ironbee.ib_log_debug(ib, 4, "Request Header value: %s=<list>", k)
            else
                ironbee.ib_log_debug(ib, 4, "Request Header value: %s=%s", k, f.value())
            end
        end
    end
    -- Or you can access individual subfields within collections directly
    -- via "name.subname" syntax:
    local http_host_header = ironbee.ib_data_get(tx.dpi(), "request_headers.host")
    ironbee.ib_log_debug(ib, 4, "HTTP Host Header is a field type: %d", http_host_header.type())
    ironbee.ib_log_debug(ib, 4, "HTTP Host Header value: %s", http_host_header.value())


    -- Request URI params are a collection (table of field objects)
    local req_uri_params = ironbee.ib_data_get(tx.dpi(), "request_uri_params")
    ironbee.ib_log_debug(ib, 4, "Request URI Params is a field type: %d", req_uri_params.type())
    if req_uri_params.type() == ironbee.IB_FTYPE_LIST then
        for k,f in base.pairs(req_uri_params.value()) do
            if f.type() == ironbee.IB_FTYPE_LIST then
                ironbee.ib_log_debug(ib, 4, "Request URI Param value: %s=<list>", k)
            else
                ironbee.ib_log_debug(ib, 4, "Request URI Param value: %s=%s", k, f.value())
            end
        end
    end

    -- Use the IronBee PCRE matcher directly
    --
    -- A benefit of doing this over using any builtin Lua matchers is that
    -- a Lua copy of the value is not required. Using the PCRE matcher passes
    -- the field value by reference (the cvalue) without the overhead of
    -- a copy. You should use this method for large values.
    --
    -- NOTE: The "pcre" variable used here was initialized in the
    --       onEventHandleContextConn() handler so that it can be used
    --       in any other handler following it.
    if pcre ~= nil then
        local patt = "(?i:foo)"
        local rc = ironbee.ib_matcher_match_field(pcre, patt, 0, req_line)
        if rc == ironbee.IB_OK then
            ironbee.ib_log_debug(ib, 4, "Request Line matches: %s", patt)
            -- Generate a test event (alert)
            ironbee.ib_clog_event(
                tx.ctx(), 
                ironbee.ib_logevent_create(
                    tx.mp(),
                    "-",
                    IB_LEVENT_TYPE_ALERT,
                    IB_LEVENT_ACT_ATTEMPTED_ATTACK,
                    IB_LEVENT_PCLASS_INJECTION,
                    IB_LEVENT_SCLASS_SQL,
                    90, 80,
                    IB_LEVENT_SYS_PUBLIC,
                    IB_LEVENT_ACTION_BLOCK,
                    IB_LEVENT_ACTION_IGNORED,
                    "[TEST Event] Request Line matches: %s", patt
                )
            )
        else
            ironbee.ib_log_debug(ib, 4, "Request Line does not match: %s", patt)
        end
    end

    return 0
end

Appendix A. Configuration Examples

...

Example IronBee Configuration

### Logging
LogLevel 4
LogHandler ironbee-ts

### Sensor Info
# Sensor ID, must follow UUID format
SensorId AAAABBBB-1111-2222-3333-FFFF00000023

### Load Modules

LoadModule "ibmod_htp.so"
LoadModule "ibmod_pcre.so"
LoadModule "ibmod_rules.so"

# Parse the user agent
#LoadModule "ibmod_user_agent.so"

# GeoIP lookup
#LoadModule "ibmod_geoip.so"
#GeoIPDatabaseFile /var/lib/GeoIP/GeoLiteCity.dat

# Enable audit engine
AuditEngine RelevantOnly
AuditLogIndex auditlog.log
AuditLogBaseDir /tmp/ironbee
AuditLogSubDirFormat "%Y%m%d-%H%M"
AuditLogDirMode 0755
#AuditLogFileMode 0644
AuditLogParts minimal request -requestBody response -responseBody

### Sites
Include "site-1.conf"
Include "site-2.conf"
Include "site-default.conf"

Example Apache Configuration

LoadFile /usr/local/lib/libhtp.so
LoadModule ironbee_module /usr/local/ironbee/lib/mod_ironbee.so
<IfModule ironbee_module>
    LogLevel debug
    IronBeeEnable on
    IronBeeConfig /usr/local/ironbee/etc/ironbee.conf
</IfModule>

Example Trafficserver Configuration

# Insert this into trafficserver's plugin.config.
# Adjust paths as appropriate for your installation.

# First we need to load libraries the Ironbee plugin relies on.
/usr/local/ironbee/lib/libloader.so /usr/local/libhtp/lib/libhtp.so /usr/local/ironbee/lib/libironbee.so

# Now we can load the ironbee plugin.  The argument to this is ironbee's
# configuration file: see ironbee-config.txt
/usr/local/ironbee/lib/ts_ironbee.so /usr/local/trafficserver/etc/ironbee-ts.conf

Appendix B. Ideas For Future Improvements

This document contains the list of things we want to look into.

Reminder (to ourselves): We will not add features unless we can demonstrate clear need.

Directive: AuditLogPart

Syntax if/when we want to support configurable parts, and multiple instances of the same part (with different names):

AuditLogPart [+|-]partType [partName][; param=value]

Directive: Include

  • Brian wants to support only Unix path separators (but why not just use whatever works on current platform?).

  • Ivan wants to consider syntax that would allow configuration to be retrieved from sources other than the filesystem (e.g., from a database).

  • Consider using optional parameters to restrict what can be in the included files

Directive: Hostname

Add support for URI-based mapping.

Need to validate domain names that we accept.

Our configuration files are in UTF-8 -- do we want to support international domain names (and convert them into punycode)?

Directive: LoadModule

Support many instances of the same module:

LoadModule /path/to/module.so moduleName

Module name is optional. When not provided, the filename with extension removed is used as the name.
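The default naming rule proposed above is easy to pin down in code. The following sketch is only one possible reading of it: the function name is hypothetical, "/" is assumed to be the only path separator, and only the last extension is removed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy the default module name for path into out (capacity outlen):
 * the final path component with its last extension removed.
 * Returns 0 on success, -1 if the name is empty or does not fit.
 */
static int demo_default_module_name(const char *path, char *out, size_t outlen)
{
    const char *base;
    const char *dot;
    size_t len;

    assert(path != NULL && out != NULL);

    /* Skip everything up to and including the last '/'. */
    base = strrchr(path, '/');
    base = (base != NULL) ? base + 1 : path;

    /* Cut at the last '.', unless the name starts with it. */
    dot = strrchr(base, '.');
    len = (dot != NULL && dot != base) ? (size_t)(dot - base) : strlen(base);

    if (len == 0 || len >= outlen) {
        return -1;
    }
    memcpy(out, base, len);
    out[len] = '\0';
    return 0;
}
```

Under these assumptions, LoadModule /path/to/module.so without an explicit name would register the module as "module".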

Some ideas to support module parameters, should we need to do it later on:

<LoadModule /path/to/module.so>
    Param paramName1 paramValue1
    Param paramName2 paramValue2

    <Param paramName3>
        # value3 here, free-form
    </Param>

    Param paramName4 @file:/path/to/file/with/paramValue4
</LoadModule>

Modules should be able to hook into the engine in the correct order relative to other modules, but should manual tweaking be needed, we could use the following:

<LoadModule /path/to/module.so>
    HookPriority hookName PRIORITY_FIRST "beforeModule1, beforeModule2" "afterModule1, afterModule2"
</LoadModule>

Directive: Site

Enable variable expansion in site names. The idea is to avoid overlap when managing multiple sensors. For example:

<Site %{sensorName}.default>
    # Site definition
</Site>

On the other hand, this type of site name manipulation can be performed in the management component. Why should a sensor care about what other sensors' sites are called?

Sites should be viewed primarily as a way of identifying (and mapping to) real-life entities. They should be used to reduce clutter by mapping multiple hostnames into a single name, and to apply different policies, with a potentially different group in charge of each entity.

Directive: DebugLogLevel

Extend DebugLogLevel to use different levels for different parts of the engine (e.g., on a per-module basis).

Directive: SensorName

Explicitly configure sensor name. If omitted, use hostname.

Directive: RequestParamsExtra

Extract parameters transported in the request URI. The parameter supplied to this directive should be a regular expression with named captures. On a match, the named captures will be placed in the ARGS_EXTRA collection. A new effective path will be constructed (using back references?).

PersonalityAdd

Description: Adds a personality to the current configuration, configuring all the parameters the alias contains

Syntax: PersonalityAdd alias

Default: None

Context: Site, Location

Version: Not Implemented Yet

The parameters extracted from the personality being added will overwrite the existing settings, but non-overlapping settings will be preserved. This allows multiple personalities to be stacked as needed.

PersonalityAliasClear

Description: Clears a previously configured personality alias

Syntax: PersonalityAliasClear alias

Default: None

Context: Main

Module: core

Version: Not Implemented Yet

PersonalityAliasParam

Description: Configures a parameter for the given alias

Syntax: PersonalityAliasParam alias paramName paramValue

Default: None

Context: Main

Module: core

Version: Not implemented yet

See PersonalityParam for the list of all parameters.

PersonalityClearAll

Description: Clears all personality configuration in the current context

Syntax: PersonalityClearAll

Default: None

Context: Site, Location

Module: core

Version: Not Implemented Yet

PersonalityParam

Description: Configures a specific personality parameter

Syntax: PersonalityParam paramName paramValue

Default: None

Context: Site, Location

Module: core

Version: Not implemented yet

Configuration of the multipart/form-data parser:

multipart.file_limit

Configures a hard limit for the number of files that will be accepted in one request. Default: 100

Request parameters personality parameters:

params.urlencoded_separator

Default separator: &

params.multi_retrieval

Possible values: first, last, combined_comma, combined_dbman, array_perl, array_python. No default setting; a configuration error is raised if an attempt to use a single variable is detected and the personality has not been configured.

params.name_transformation

Possible values: none (default), php.

params.source_order

Possible values: place source identifiers in the order in which you wish new parameters to be added to ARGS (G for query string parameters, P for request body parameters, C for cookies, and U for parameters extracted from request URLs). Default: GP.

params.parse_limit

Configures a hard limit for the number of parameters created during parsing. Default: 1000

Variable: ARGS_URI

Request parameters extracted from request URI.

Modules

Description: Establishes a configuration section for module loading (NOTE: Is this really needed???)

Syntax: <Modules>...</Modules>

Default: None

Context: Main

Version: Not Implemented Yet

Modules can be specified only from within the Modules container:

<Modules>
    LoadModule /path/to/module1.so
    LoadModule /path/to/module2.so
</Modules>

Variable length

To find out how many bytes there are in a variable:

#REQUEST_URI

Applied to a collection, the length operator will return the lengths of all variables in it.

Special single-variable syntax

In addition to the colon operator described in the previous section, collections support the comma operator, which returns exactly one variable or no variables at all. This is an advanced operator with dynamic behavior.

ARGS.name

In most cases, the comma operator will simply return the first matching variable in a collection (or nothing, if no matching variable can be found). However, when applied to ARGS, this operator may return a value that depends on the underlying platform that's being protected. More accurately, the functioning of the operator depends on the current parsing personality. Parsing personalities are designed to mimic the parsing implementation of backend systems in order to minimize a problem known as impedance mismatch. The issue is that, when faced with multiple parameters with the same name, some platforms return the first parameter, some platforms return the last parameter, and some platforms even return a combined value of all parameters with the same name.
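The three behaviours described above (first value, last value, combined value) can be sketched as a single lookup routine driven by a policy setting. This is a hypothetical illustration of the proposed params.multi_retrieval idea, covering only the first, last and combined_comma values; the type and function names are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *name;
    const char *value;
} demo_param_t;

typedef enum {
    DEMO_MULTI_FIRST,
    DEMO_MULTI_LAST,
    DEMO_MULTI_COMBINED_COMMA
} demo_multi_t;

/*
 * Resolve name to a single string in out (capacity outlen) according to
 * mode. Returns 0 on success, -1 if the name is absent or out is too small.
 */
static int demo_param_get(const demo_param_t *params, size_t n,
                          const char *name, demo_multi_t mode,
                          char *out, size_t outlen)
{
    size_t i;
    size_t used = 0;
    int found = 0;

    assert(out != NULL && outlen > 0);
    out[0] = '\0';

    for (i = 0; i < n; ++i) {
        size_t vlen;

        if (strcmp(params[i].name, name) != 0) {
            continue;
        }
        vlen = strlen(params[i].value);

        if (mode == DEMO_MULTI_COMBINED_COMMA && found) {
            /* Append ",value" to what has been collected so far. */
            if (used + 1 + vlen >= outlen) {
                return -1;
            }
            out[used++] = ',';
        }
        else {
            /* FIRST, LAST and the first COMBINED hit start afresh. */
            if (vlen >= outlen) {
                return -1;
            }
            used = 0;
        }
        memcpy(out + used, params[i].value, vlen + 1);
        used += vlen;
        found = 1;

        if (mode == DEMO_MULTI_FIRST) {
            break;
        }
    }
    return found ? 0 : -1;
}
```

For a request carrying a=1, a=2 and a=3, the same lookup yields "1", "3" or "1,2,3" depending on the mode, which is exactly the kind of divergence between backends that a parsing personality is meant to capture.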

Consolidation of @pm rules

With the pm operator, not only can you manually parallelize the matching of a large number of patterns, but the engine itself will seamlessly consolidate multiple parallel matching rules when they use identical inputs. The goal is to minimize the number of separate matching operations and thus increase performance.

This feature is very useful for rule qualification. Rule sets will often contain many rules with complex PCRE regular expressions, and running all of them often requires substantial effort. However, it is often possible to identify simpler patterns that are more efficient and which can be used to quickly determine if the complex PCRE pattern has a chance of matching. With seamless parallelization of @pm rules, rule qualification is even more efficient.

For example, let's start with two complex regular expression patterns that we want to speed up:

Rule ARGS "@rx complexPattern1" id:1
Rule ARGS "@rx complexPattern2" id:2

To add rule qualification, for each complex pattern we determine a simple pattern that can be fed to the pm operator. We then convert the standalone rules into chained rules, using the @pm operator with the simple pattern first, followed by the original rules:

Rule ARGS "@pm simplePattern1" chain,id:1
Rule ARGS "@rx complexPattern1"

Rule ARGS "@pm simplePattern2" chain,id:2
Rule ARGS "@rx complexPattern2"

In the above case, the engine will detect two @pm rules that apply to the same input (ARGS), and consolidate them into a single internal @pm rule. When this internal rule is run, its results will be saved and reused as needed. As a result, the complex patterns will be attempted only when there is a reason to believe they will match. If the simple patterns are well selected, that will happen only on a fraction of all transactions.
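The qualification flow described above can be approximated in a few lines of C. This sketch makes several simplifying assumptions: all names are hypothetical, the "simple" patterns are literal substrings checked with strstr in a loop (a real parallel matcher would scan the input once for all patterns), and the "complex" checks are ordinary functions standing in for @rx patterns.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*demo_check_fn)(const char *input);

typedef struct {
    const char   *simple;     /* cheap qualifier, stands in for @pm */
    demo_check_fn expensive;  /* costly check, stands in for @rx */
} demo_rule_t;

/* Counts how often any expensive check actually ran. */
static int demo_expensive_runs = 0;

static int demo_expensive1(const char *input)
{
    ++demo_expensive_runs;
    return strstr(input, "union select") != NULL;
}

static int demo_expensive2(const char *input)
{
    ++demo_expensive_runs;
    return strstr(input, "<script>") != NULL;
}

/* One consolidated pass: bit i is set if rules[i].simple occurs in input. */
static unsigned demo_qualify(const demo_rule_t *rules, size_t n,
                             const char *input)
{
    unsigned hits = 0;
    size_t i;

    assert(n <= sizeof(unsigned) * 8);
    for (i = 0; i < n; ++i) {
        if (strstr(input, rules[i].simple) != NULL) {
            hits |= 1u << i;
        }
    }
    return hits;
}

/* Run the expensive checks only for qualified rules; return match count. */
static int demo_run_rules(const demo_rule_t *rules, size_t n,
                          const char *input)
{
    unsigned hits = demo_qualify(rules, n, input); /* saved and reused */
    int matched = 0;
    size_t i;

    for (i = 0; i < n; ++i) {
        if (((hits >> i) & 1u) && rules[i].expensive(input)) {
            ++matched;
        }
    }
    return matched;
}
```

The counter demo_expensive_runs makes the benefit visible: for inputs that contain none of the simple patterns it does not change at all.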

Notes:

  • Consolidation operates on identical inputs, which means the same variables with the same transformation pipeline.

  • The MATCHED_VARS collection can be used to continue inspection only on the inputs where the initial patterns matched.

Warning

This is an experimental feature, which we may need to tweak as we gain better understanding of its advantages and disadvantages.

standalone

Description: Marks a rule as standalone so that the engine may reorder it within its phase.

Cardinality: 0..1

Module: core

Version: 0.2

Marks rule as standalone, which means that the engine is allowed to move it anywhere within the phase if it wishes to optimize execution.

Rule ARGS "@rx TEST" phase:REQUEST,standalone

sqlDecode

Decodes input in a way similar to how a DBMS would:

  • Decodes hexadecimal literals that use the SQL standard format x'HEX_ENCODED_VALUE' (case insensitive)

  • Decodes hexadecimal literals that use the ODBC format 0xHEX_ENCODED_VALUE (case insensitive)

  • Decodes backslash-escaped characters
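As a rough sketch of what such a transformation might look like, the routine below handles the three cases listed above. It is an illustration under assumptions, not a specification of sqlDecode: the function names are hypothetical, only hex runs with an even number of digits are decoded, no SQL tokenization is attempted, and the backslash handling covers just \n, \t, \r and \0, with any other escaped character passed through without its backslash.

```c
#include <assert.h>
#include <stddef.h>

/* Value of a hex digit, or -1 if c is not one. */
static int demo_hexval(unsigned char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Count the hex digits at the start of s, looking at no more than max. */
static size_t demo_hexrun(const unsigned char *s, size_t max)
{
    size_t n = 0;
    while (n < max && demo_hexval(s[n]) >= 0) ++n;
    return n;
}

/* Decode n hex digits (n even) into out; returns bytes written. */
static size_t demo_hexdecode(const unsigned char *s, size_t n,
                             unsigned char *out)
{
    size_t i;
    for (i = 0; i + 1 < n; i += 2) {
        out[i / 2] = (unsigned char)(demo_hexval(s[i]) * 16
                                     + demo_hexval(s[i + 1]));
    }
    return n / 2;
}

/* Map the character after a backslash to its decoded form. */
static unsigned char demo_unescape(unsigned char c)
{
    switch (c) {
    case 'n': return '\n';
    case 't': return '\t';
    case 'r': return '\r';
    case '0': return '\0';
    default:  return c; /* \\ \' \" and others: drop the backslash */
    }
}

/*
 * Decode src[0..len) into dst, which must hold at least len bytes
 * (the output is never longer than the input). Returns the output length.
 */
static size_t demo_sql_decode(const char *src, size_t len, char *dst)
{
    const unsigned char *in = (const unsigned char *)src;
    unsigned char *out = (unsigned char *)dst;
    size_t i = 0, o = 0;

    assert(src != NULL && dst != NULL);
    while (i < len) {
        size_t rest = len - i;
        size_t n;

        if ((in[i] == 'x' || in[i] == 'X') && rest >= 3 && in[i + 1] == '\'') {
            /* x'HEX': an even run of hex digits closed by a quote. */
            n = demo_hexrun(in + i + 2, rest - 2);
            if (n > 0 && n % 2 == 0 && n + 2 < rest && in[i + 2 + n] == '\'') {
                o += demo_hexdecode(in + i + 2, n, out + o);
                i += n + 3;
                continue;
            }
        }
        else if (in[i] == '0' && rest >= 4
                 && (in[i + 1] == 'x' || in[i + 1] == 'X')) {
            /* 0xHEX: an even run of hex digits. */
            n = demo_hexrun(in + i + 2, rest - 2);
            if (n > 0 && n % 2 == 0) {
                o += demo_hexdecode(in + i + 2, n, out + o);
                i += n + 2;
                continue;
            }
        }
        else if (in[i] == '\\' && rest >= 2) {
            out[o++] = demo_unescape(in[i + 1]);
            i += 2;
            continue;
        }
        out[o++] = in[i++]; /* anything else is copied unchanged */
    }
    return o;
}
```

Input that looks like a literal but is malformed, such as an odd number of hex digits, is copied through unchanged in this sketch; whether a production transformation should do the same is a separate design decision.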

References:

  • MySQL Reference: Hexadecimal Literals http://dev.mysql.com/doc/refman/5.6/en/hexadecimal-literals.html

  • String Literals http://dev.mysql.com/doc/refman/5.6/en/string-literals.html

Appendix C. ModSecurity Migration Guide

...

Variables

Syntax

  • Change to Unicode, with UTF-8 as the storage format

  • Use the same escaping syntax as in JavaScript

  • Generally tighten the syntax, specifying which characters can be used and where

  • Add syntax to specify n-th element in a collection, starting from the beginning or starting from the end

  • Allow the comma operator to be used anywhere to retrieve only one element from a collection

  • Make the comma operator smart so that it changes behavior based on the current context personality

Other changes

With IronBee, we are starting to build the list of variables from scratch, adding what is needed, as it is needed. We may support some of the missing variables in future releases, but there will remain a number of variables that only make sense in ModSecurity. In general, we are reusing the names of ModSecurity variables wherever possible.

In some cases, variables with the same name will behave differently. This happens for two reasons. First, whereas in ModSecurity httpd provided certain values, in IronBee we have to do it ourselves, possibly differently and from a different starting point. Second, in some cases we determined that there is room for improvement in how variables are constructed, and we wanted to do the right thing.

In this section we will highlight some of the major differences:

  • todo

Rules

  • The main rule directive is shortened, to just Rule. All other supported directives from ModSecurity also have the Sec prefix removed.

  • In chained rules, instructions (known as disruptive actions in ModSecurity) are specified on the last rule in the chain, not on the first; the rule metadata remains with the first rule in the chain.

  • There are no default transformations, which means that it is no longer necessary to begin rules with t:none.

  • The SecRuleScript functionality is now handled by RuleExt, which works as an interface between the rule language and externally-implemented security logic (for example, Lua rules, rules implemented natively, etc).

  • Run-time rule manipulation (using ctl:ruleRemoveById or ctl:ruleUpdateTargetById) is not currently supported. These features are slow in ModSecurity, and we wish to rethink them before deciding whether to implement them. At the very least we will want to provide a faster implementation.

  • Changing rule flow at run-time is not supported at this time. This means that the functionality of skip, skipAfter, and SecMarker is not supported. Neither is that of SecAction, which is commonly used with skipping.

  • IronBee uses a simplified configuration model in which site configuration always starts from scratch. Inheritance is used when location contexts are created, but, unlike in ModSecurity, locations always inherit configuration from their sites.

  • There is no ability to update rule targets and actions at configure-time, but we will probably implement a similar feature in the future.

  • All rules that generate events must use the msg modifier to provide a message. This is because IronBee does not use machine-generated rule messages. In ModSecurity, machine-generated messages have proven to have little value, especially as rules increase in complexity, and they are often a source of confusion.
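
The rule differences listed above can be turned into a simple checklist for reviewing existing ModSecurity rules. The following Python sketch is only an illustration: migration_notes and its constants are hypothetical names, not part of IronBee or ModSecurity, and the parsing is deliberately naive (it assumes the action list is the second double-quoted string on a single-line SecRule).

```python
import re

# Hypothetical migration aid; it is not part of IronBee or ModSecurity.
FLOW_DIRECTIVES = ("SecMarker", "SecAction")
FLOW_ACTIONS = ("skip", "skipAfter")
RUNTIME_CTL = ("ctl:ruleRemoveById", "ctl:ruleUpdateTargetById")


def migration_notes(line):
    """Return migration notes for one ModSecurity configuration line."""
    parts = line.strip().split(None, 1)
    if not parts:
        return []
    directive = parts[0]
    rest = parts[1] if len(parts) > 1 else ""

    if directive in FLOW_DIRECTIVES:
        return [f"{directive}: changing rule flow at run-time is not supported"]
    if directive == "SecRuleScript":
        return ["SecRuleScript: this functionality is handled by RuleRunExt"]

    notes = []
    if directive.startswith("Sec"):
        # Only directives that IronBee supports carry over under the short name.
        notes.append(f"{directive}: Sec prefix is removed ({directive[3:]}), if supported")
    if directive != "SecRule":
        return notes

    # Naive parse: treat the second double-quoted string as the action list.
    quoted = re.findall(r'"([^"]*)"', rest)
    actions = [a.strip() for a in quoted[1].split(",")] if len(quoted) >= 2 else []

    if "t:none" in actions:
        notes.append("t:none: not needed, there are no default transformations")
    for action in actions:
        name = action.split(":", 1)[0]
        if name in FLOW_ACTIONS:
            notes.append(f"{name}: changing rule flow at run-time is not supported")
        if action.startswith(RUNTIME_CTL):
            notes.append(f"{action}: run-time rule manipulation is not supported")
    if not any(a.startswith("msg:") for a in actions):
        notes.append("msg: required if the rule generates events")
    return notes


if __name__ == "__main__":
    for note in migration_notes('SecRule ARGS "@rx attack" "id:1,phase:2,t:none,deny"'):
        print(note)
```

The sketch does not check chained rules, where IronBee expects the instructions on the last rule of the chain; that would require reading several rules together.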

Miscellaneous

  • The audit log format has been redesigned to support new features.

  • SecArgumentSeparator is implemented as a personality parameter (see PersonalityParam for more information).

  • In IronBee, request and response body inspection is not tied to buffering. Disabling buffering will generally not affect inspection; it will only affect the ability to block attacks reliably.

  • In IronBee, as in ModSecurity, you can have a transaction blocked if the buffer limit is reached, but you can also choose to continue using the buffer in a circular fashion. In that case, IronBee will simply buffer as much data as it can, allowing any overflowing data to pass through.
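
The two buffer-limit behaviours described above can be modelled with a short Python sketch. This is a toy model rather than IronBee code: the BodyBuffer class and the "reject" and "rollover" labels are hypothetical, and it assumes that in circular mode the oldest buffered bytes are the ones released.

```python
class BodyBuffer:
    """Toy model of a size-limited body buffer (hypothetical, not IronBee code)."""

    def __init__(self, limit, on_limit="rollover"):
        if on_limit not in ("reject", "rollover"):
            raise ValueError("on_limit must be 'reject' or 'rollover'")
        self.limit = limit
        self.on_limit = on_limit
        self.held = bytearray()      # data held back, so it can still be blocked
        self.released = bytearray()  # data that has already passed through

    def write(self, chunk):
        """Add a chunk; return False if the transaction should be blocked."""
        if self.on_limit == "reject" and len(self.held) + len(chunk) > self.limit:
            return False
        self.held += chunk
        overflow = len(self.held) - self.limit
        if overflow > 0:
            # Circular use (assumed): release the oldest bytes to make room.
            self.released += self.held[:overflow]
            del self.held[:overflow]
        return True


if __name__ == "__main__":
    buf = BodyBuffer(limit=4, on_limit="rollover")
    buf.write(b"abc")
    buf.write(b"def")
    print(bytes(buf.held), bytes(buf.released))
```

In this model, data that has been released can still be inspected as it passes, but it can no longer be held back, which matches the point that disabling or overflowing the buffer affects only the ability to block reliably.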

Features Not Supported (Yet)

  • Content injection - will be added in the future

  • Guardian log - will not implement this obsolete feature

  • XML support - will be added in the future

  • Persistent storage - will be added in the future (with high priority)

  • Chroot support - will not implement this httpd-specific feature

  • File upload inspection and extraction - will be added in the future

  • Anti-DoS features are not supported, and we don't expect they will be in the future
