Defining Rules

A Rule is a condition you never want your application to break. It is composed of three ingredients:

  • A metric

  • An operator

  • A target value

Write each Rule so that it evaluates to False in the base case (normal, acceptable behavior) and to True in the unwanted scenario.

In the example above, the requirement "input/output shall never contain PII" is encoded into a Rule like the one below:

{
    "metric": "pii",
    "operator": "contains",
    "target_value": "ssn"
}

Or:

gp.Rule(
    metric=gp.RuleMetrics.pii,
    operator=gp.RuleOperator.contains,
    target_value="ssn"
)
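To make the three ingredients concrete, here is a minimal sketch of how a rule of this shape could be evaluated. It is not the Protect implementation: `SketchRule` and `evaluate` are hypothetical names, and it assumes the `pii` metric yields a list of detected categories.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class SketchRule:
    """Hypothetical stand-in for a rule: metric + operator + target value."""
    metric: str
    operator: str
    target_value: Any


def evaluate(rule: SketchRule, metric_values: Dict[str, List[str]]) -> bool:
    """Return True when the rule is broken (the unwanted scenario)."""
    detected = metric_values.get(rule.metric, [])
    if rule.operator == "contains":
        return rule.target_value in detected
    raise ValueError(f"operator not covered by this sketch: {rule.operator}")


rule = SketchRule(metric="pii", operator="contains", target_value="ssn")

# Base case: no SSN detected, so the rule evaluates to False.
base_case = evaluate(rule, {"pii": []})
# Unwanted scenario: an SSN was detected, so the rule evaluates to True.
unwanted = evaluate(rule, {"pii": ["ssn", "email"]})
print(base_case, unwanted)
```

The point of the sketch is the polarity: the rule stays False while nothing is wrong and flips to True only when the unwanted category shows up.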

Metrics and Operators supported

We support several metrics within Protect rules. Because each metric can produce a different kind of output (e.g. a float score or a set of categories), the Operators and Target Values differ by metric. Below is a list of all supported metrics and their available configurations:

Prompt Injection

Used to detect and stop prompt injections in the input.

Metric Constants:

  • gp.RuleMetrics.prompt_injection

Payload Field: input

Potential Categories:

  • impersonation

  • obfuscation

  • simple_instruction

  • few_shot

  • new_context

Operators and Target Value Supported:

Operator                                  | Target Value
Any (gp.RuleOperator.any)                 | A list of categories (e.g. ["obfuscation", "impersonation"])
All (gp.RuleOperator.all)                 | A list of categories (e.g. ["obfuscation", "impersonation"])
Contains (gp.RuleOperator.contains)       | A single category (e.g. "impersonation")
Equal (gp.RuleOperator.eq)                | A single category (e.g. "impersonation")
Not equal (gp.RuleOperator.neq)           | A single category (e.g. "impersonation")
Empty (gp.RuleOperator.empty)             | -
Not Empty (gp.RuleOperator.not_empty)     | -

Example:

gp.Rule(
    metric=gp.RuleMetrics.prompt_injection,
    operator=gp.RuleOperator.any,
    target_value=["impersonation", "obfuscation"]
)
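The difference between the list-based and single-category operators can be sketched in plain Python. The semantics below are assumptions made for illustration (for instance, that `any` triggers when at least one target category is detected and `all` only when every one is). `CATEGORY_OPERATORS` and `rule_triggers` are hypothetical names, and `eq`/`neq` are left out because their behavior on multi-category results is not specified here.

```python
from typing import Any, Callable, Dict, List

# Assumed semantics, for illustration only (not the library's implementation).
CATEGORY_OPERATORS: Dict[str, Callable[[List[str], Any], bool]] = {
    # At least one of the target categories was detected.
    "any": lambda detected, target: any(c in detected for c in target),
    # Every target category was detected.
    "all": lambda detected, target: all(c in detected for c in target),
    # The single target category was detected.
    "contains": lambda detected, target: target in detected,
    # Nothing was detected / something was detected; no target value needed.
    "empty": lambda detected, target: len(detected) == 0,
    "not_empty": lambda detected, target: len(detected) > 0,
}


def rule_triggers(operator: str, detected: List[str], target: Any = None) -> bool:
    """Return True when the (assumed) operator semantics match."""
    return CATEGORY_OPERATORS[operator](detected, target)


detected = ["obfuscation"]
any_hit = rule_triggers("any", detected, ["impersonation", "obfuscation"])
all_hit = rule_triggers("all", detected, ["impersonation", "obfuscation"])
print(any_hit, all_hit)
```

Under these assumptions, a single detected category is enough for `any` but not for `all`, which is why `any` is the more common choice for blocking rules.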

PII (Personally Identifiable Information)

Used to detect and stop Personally Identifiable Information (PII). When applied to the input, it can stop user or company PII from being included in API calls to external services. When applied to the output, it can prevent data leakage or PII from being shown back to the user.

Metric Constants:

  • gp.RuleMetrics.pii for output PII

  • gp.RuleMetrics.input_pii for input PII

Payload Field: input (for input PII) or output (for output PII)

Potential Categories:

  • address

  • date

  • email

  • financial_info

  • name

  • phone_number

  • ssn

  • username_password

Operators and Target Value Supported:

Operator                                  | Target Value
Any (gp.RuleOperator.any)                 | A list of categories (e.g. ["ssn", "address"])
All (gp.RuleOperator.all)                 | A list of categories (e.g. ["ssn", "address"])
Contains (gp.RuleOperator.contains)       | A single category (e.g. "ssn")
Equal (gp.RuleOperator.eq)                | A single category (e.g. "ssn")
Not equal (gp.RuleOperator.neq)           | A single category (e.g. "ssn")
Empty (gp.RuleOperator.empty)             | -
Not Empty (gp.RuleOperator.not_empty)     | -

Example:

gp.Rule(
    metric=gp.RuleMetrics.pii,
    operator=gp.RuleOperator.any,
    target_value=["ssn", "address"]
)
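Because target values are plain strings, a typo such as "snn" would be easy to miss. The sketch below checks target values against the categories listed above before building a rule dict. `pii_rule_payload` is a hypothetical helper, not part of the library, and the operator string "any" is an assumption modeled on the "contains" string shown in the dict form of a Rule.

```python
from typing import Dict, List, Union

# Categories copied from the "Potential Categories" list above.
PII_CATEGORIES = frozenset({
    "address", "date", "email", "financial_info",
    "name", "phone_number", "ssn", "username_password",
})


def pii_rule_payload(
    operator: str,
    target_value: Union[str, List[str]],
    metric: str = "pii",
) -> Dict[str, object]:
    """Build a rule dict after checking every target is a known PII category."""
    targets = [target_value] if isinstance(target_value, str) else list(target_value)
    unknown = sorted(set(targets) - PII_CATEGORIES)
    if unknown:
        raise ValueError(f"unknown PII categories: {unknown}")
    return {"metric": metric, "operator": operator, "target_value": target_value}


payload = pii_rule_payload("any", ["ssn", "address"])
print(payload)
```

Calling `pii_rule_payload("contains", "snn")` raises a `ValueError` instead of silently producing a rule that can never match.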

Context Adherence

Measures whether your model's response was purely based on the context provided. It can be used to stop hallucinations from reaching your end users. Powered by Context Adherence Luna.

Metric Constant: gp.RuleMetrics.context_adherence_luna

Payload Field: Both input and output must be included in the payload

Potential Values: 0.00 to 1.00.

Generally, we see 0.1 as a good threshold below which we're confident the response is not adhering to the context.

Operators Supported:

  • Greater than (gp.RuleOperator.gt)

  • Less than (gp.RuleOperator.lt)

  • Greater than or equal (gp.RuleOperator.gte)

  • Less than or equal (gp.RuleOperator.lte)

Example:

gp.Rule(
    metric=gp.RuleMetrics.context_adherence_luna,
    operator=gp.RuleOperator.lt,
    target_value=0.90
)
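As a rough illustration of how a threshold rule like the one above behaves, the sketch below maps the four numeric operators onto Python's standard comparison functions. `NUMERIC_OPERATORS` and `threshold_rule_triggers` are hypothetical names, and the scores are made-up inputs rather than real metric output.

```python
import operator
from typing import Callable, Dict

# Operator names mirror gp.RuleOperator.gt / lt / gte / lte.
NUMERIC_OPERATORS: Dict[str, Callable[[float, float], bool]] = {
    "gt": operator.gt,
    "lt": operator.lt,
    "gte": operator.ge,
    "lte": operator.le,
}


def threshold_rule_triggers(score: float, op: str, target_value: float) -> bool:
    """Return True when `score <op> target_value` holds."""
    return NUMERIC_OPERATORS[op](score, target_value)


# A response scoring 0.42 breaks an "lt 0.90" rule; one scoring 0.97 does not.
low = threshold_rule_triggers(0.42, "lt", 0.90)
high = threshold_rule_triggers(0.97, "lt", 0.90)
print(low, high)
```

Note the boundary: with `lt`, a score exactly equal to the target value does not trigger the rule, whereas with `lte` it does.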

Toxicity

Used to detect and stop toxic or foul language in the input (user query) or output (response shown to the user).

Metric Constants:

  • gp.RuleMetrics.toxicity for output Toxicity

  • gp.RuleMetrics.input_toxicity for input Toxicity

Payload Field: input or output

Potential Values: 0.00 to 1.00 (higher values indicate higher toxicity)

Operators Supported:

  • Greater than (gp.RuleOperator.gt)

  • Less than (gp.RuleOperator.lt)

  • Greater than or equal (gp.RuleOperator.gte)

  • Less than or equal (gp.RuleOperator.lte)

Example:

gp.Rule(
    metric=gp.RuleMetrics.toxicity,
    operator=gp.RuleOperator.gt,
    target_value=0.95
)

Sexism

Used to detect sexist or biased language. When applied to the input, it can detect sexist remarks in user queries. When applied to the output, it can prevent your application from making biased or sexist comments in its responses.

Metric Constants:

  • gp.RuleMetrics.sexist for output Sexism

  • gp.RuleMetrics.input_sexist for input Sexism

Payload Field: input or output

Potential Values: 0.00 to 1.00 (higher values indicate more sexist language)

Operators Supported:

  • Greater than (gp.RuleOperator.gt)

  • Less than (gp.RuleOperator.lt)

  • Greater than or equal (gp.RuleOperator.gte)

  • Less than or equal (gp.RuleOperator.lte)

Example:

gp.Rule(
    metric=gp.RuleMetrics.sexist,
    operator=gp.RuleOperator.gt,
    target_value=0.95
)

Tone

Classifies the primary tone detected in the text. When applied to the input, it can detect negative tones in user queries. When applied to the output, it can prevent your application from using an undesired tone in its responses.

Metric Constants:

  • gp.RuleMetrics.tone for output Tone

  • gp.RuleMetrics.input_tone for input Tone

Payload Field: input (for input Tone) or output (for output Tone)

Potential Categories:

  • anger

  • annoyance

  • confusion

  • fear

  • joy

  • love

  • sadness

  • surprise

  • neutral

Operators and Target Value Supported:

Operator                          | Target Value
Equal (gp.RuleOperator.eq)        | A single category (e.g. "anger")
Not equal (gp.RuleOperator.neq)   | A single category (e.g. "neutral")

Example:

gp.Rule(
    metric=gp.RuleMetrics.tone,
    operator=gp.RuleOperator.neq,
    target_value="neutral"
)
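Since Tone reports a single primary category, only equality checks apply. The sketch below shows what the `neq "neutral"` rule above amounts to under that reading. `tone_rule_triggers` is a hypothetical helper and the tone values passed in are illustrative inputs.

```python
# Categories copied from the "Potential Categories" list above.
TONES = (
    "anger", "annoyance", "confusion", "fear", "joy",
    "love", "sadness", "surprise", "neutral",
)


def tone_rule_triggers(primary_tone: str, op: str, target_value: str) -> bool:
    """Compare the single detected tone against the target category."""
    if primary_tone not in TONES or target_value not in TONES:
        raise ValueError("unknown tone category")
    if op == "eq":
        return primary_tone == target_value
    if op == "neq":
        return primary_tone != target_value
    raise ValueError(f"operator not supported for tone in this sketch: {op}")


# Any non-neutral tone breaks a 'neq "neutral"' rule.
angry = tone_rule_triggers("anger", "neq", "neutral")
calm = tone_rule_triggers("neutral", "neq", "neutral")
print(angry, calm)
```

A `neq "neutral"` rule is therefore an allow-list of one tone, while an `eq "anger"` rule blocks exactly one tone and lets the rest through.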
