FactionDocs
Technical Reference

Modules

Per-module deep dive across the four Faction services.

The four modules are independently callable. Each has well-defined inputs, outputs, edge-case behavior, performance targets, and configuration knobs.

Intent Classifier

Purpose. Determine whether an inbound case is a quote request, an order, an update, a status request, an RFQ, or something else, so the orchestrator can route it correctly.

Endpoint. POST /v1/intent/classify

Required inputs

| Field | Type | Notes |
| --- | --- | --- |
| case_id | string | Caller-owned identifier; opaque to Faction. |
| sender | string (email) | Used as a signal, not as authentication. |
| subject | string | Empty string allowed. |
| body | string | Plain text or HTML; HTML is sanitized server-side. |

Optional inputs

| Field | Type | Why it helps |
| --- | --- | --- |
| attachments[] | array | PDFs, images, spreadsheets. Improves classification when subject and body are sparse. |
| country | ISO-3166 alpha-2 | Enables country-scoped taxonomy and rules. |
| branch_id | string | Allows branch-tuned routing. |
| thread_history[] | array | Prior messages in the same email thread. |

Outputs

| Field | Type | Notes |
| --- | --- | --- |
| intent | enum | quote, order, update, status, rfq, other. Configurable per tenant. |
| confidence_score | float (0–1) | Calibrated; comparable across modules. |
| rationale | string | Natural-language justification with cited tokens. |
| secondary_intents[] | array | Ranked alternatives with confidence scores. |
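
As a minimal sketch of how a caller might assemble the request body for POST /v1/intent/classify from the tables above: the helper name `build_classify_request` and its validation rules are illustrative assumptions, not part of the Faction API. Only the field names are taken from the tables.

```python
from typing import Any, Optional


def build_classify_request(
    case_id: str,
    sender: str,
    subject: str,
    body: str,
    *,
    country: Optional[str] = None,
    branch_id: Optional[str] = None,
    thread_history: Optional[list] = None,
) -> dict:
    """Assemble a JSON-serialisable body for POST /v1/intent/classify.

    Hypothetical helper: only the field names come from the reference tables.
    """
    # Assumption: identifiers must be non-empty. The reference only states
    # that an empty subject is allowed.
    if not case_id or not sender:
        raise ValueError("case_id and sender must be non-empty")

    payload: dict[str, Any] = {
        "case_id": case_id,
        "sender": sender,
        "subject": subject,
        "body": body,
    }
    # Optional fields are included only when the caller has them.
    if country is not None:
        payload["country"] = country.upper()  # ISO-3166 alpha-2, e.g. "FR"
    if branch_id is not None:
        payload["branch_id"] = branch_id
    if thread_history:
        payload["thread_history"] = list(thread_history)
    return payload
```

Omitting absent optional fields, rather than sending nulls, keeps the request small and avoids guessing how the service treats null values.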

Edge cases

| Case | Behavior |
| --- | --- |
| Empty subject and body | Returns intent: "other" with low confidence; rationale flags missing content. |
| Attachment-only request | Engine reads attachments via the same pipeline used by the Info Extractor. Result is still a single intent. |
| Multi-intent message | Dominant intent in intent, secondary in secondary_intents. |
| Foreign language | Auto-detected. Supported: English, French, German, Spanish, Italian, Dutch. Other languages return language_unsupported: true. |
| Auto-reply / OOO | Detected; returns auto_reply: true to suppress downstream processing. |
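
The edge-case flags above suggest a small decision function on the caller's side. The sketch below is one possible reading: the action names (`suppress`, `manual_review`, `route:<intent>`) and the default threshold of 0.8 are assumptions made for illustration, while `auto_reply`, `language_unsupported`, `confidence_score`, and `intent` are the response fields described in this section.

```python
def next_step(response: dict, threshold: float = 0.8) -> str:
    """Map a classifier response to a hypothetical orchestrator action."""
    # Auto-replies and out-of-office messages should not be processed further.
    if response.get("auto_reply"):
        return "suppress"
    # An unsupported language means the intent cannot be trusted for routing.
    if response.get("language_unsupported"):
        return "manual_review"
    # Low confidence also covers the empty-subject-and-body case.
    if response.get("confidence_score", 0.0) < threshold:
        return "manual_review"
    return "route:" + response.get("intent", "other")
```

Checking the suppression flags before the confidence score means a confidently classified auto-reply is still dropped.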

Performance

| Metric | Target |
| --- | --- |
| p50 latency | Under 600 ms |
| p95 latency | Under 1.5 s |
| Max payload | 10 MB total. Larger payloads use the async pattern. |
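
A caller may want to decide up front whether a case fits the synchronous limit. The following sketch estimates payload size from the body and raw attachment sizes. It assumes "10 MB" means 10 × 1024 × 1024 bytes and that the limit applies to raw bytes before any transport encoding; neither detail is stated in this reference, and `needs_async` is a hypothetical helper.

```python
MAX_SYNC_BYTES = 10 * 1024 * 1024  # assumption: "10 MB" read as 10 MiB


def needs_async(body: str, attachment_sizes: list, limit: int = MAX_SYNC_BYTES) -> bool:
    """Return True when the estimated payload exceeds the synchronous limit."""
    # Body is measured in UTF-8 bytes; attachment sizes are given in bytes.
    total = len(body.encode("utf-8")) + sum(attachment_sizes)
    return total > limit
```

The same check can be reused for other modules by passing a different `limit`.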

Configuration knobs (caller-controlled)

  • Intent taxonomy (the enum values).
  • Confidence threshold per intent class.
  • Routing rules attached to thresholds.
  • Country-scoped overrides.
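
To illustrate how per-intent thresholds and country-scoped overrides could interact, here is a small lookup sketch. The data layout, the numeric values, and the function name `threshold_for` are all assumptions; the actual configuration format is not described in this reference.

```python
from typing import Optional

# Illustrative values only; real thresholds are tenant configuration.
DEFAULT_THRESHOLDS = {"quote": 0.80, "order": 0.90, "rfq": 0.80}
COUNTRY_OVERRIDES = {"DE": {"order": 0.95}}


def threshold_for(intent: str, country: Optional[str] = None, fallback: float = 0.75) -> float:
    """Resolve the confidence threshold for an intent, honouring country overrides."""
    overrides = COUNTRY_OVERRIDES.get((country or "").upper(), {})
    if intent in overrides:
        return overrides[intent]
    # Intents without an explicit threshold use the fallback.
    return DEFAULT_THRESHOLDS.get(intent, fallback)
```

Resolving the most specific scope first (country, then tenant default, then fallback) keeps override behavior predictable.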

Quote Info Extractor

Purpose. Extract structured quote fields (line items, delivery, urgency, special conditions) from the body and attachments of a quote-related case.

Endpoint. POST /v1/extract/quote

Required inputs

| Field | Type | Notes |
| --- | --- | --- |
| case_id | string | Caller-owned identifier. |
| body | string | Email body. |
| quote_schema | object | Target schema. Configured at tenant level; can be overridden per call. |

Optional inputs

| Field | Type | Why it helps |
| --- | --- | --- |
| attachments[] | array | PDFs, spreadsheets, images, scanned documents. |
| customer_id | string | If already resolved, allows customer-specific format conventions. |
| country, branch_id | strings | Locale defaults (date, number, currency). |

Outputs

| Field | Type | Notes |
| --- | --- | --- |
| line_items[] | array | Description, quantity, unit, requested delivery. Per-field confidence and source span. |
| delivery_requirements | object | Ship-to hint, requested-by date, urgency. |
| urgency_signals | object | Detected signals: explicit deadline, "ASAP" language, escalation tone. |
| payment_terms | object | If detectable. |
| special_conditions[] | array | Free-form (export controls, hazmat, certifications required). |
| attachments_processed[] | array | Which attachments were read; which were skipped and why. |

Schema-aware extraction

Extraction is schema-aware. If the tenant's quote_schema defines a field, the extractor will look for it. If it isn't in the schema, it isn't returned. This keeps outputs aligned with the caller's data model.
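
The schema-aware rule can be pictured as a projection of candidate fields onto the schema. The sketch below is a caller-side illustration, not the extractor's implementation. It assumes a schema shaped as `{"fields": {...}}` and assumes that a per-call schema fully replaces the tenant schema rather than merging with it; both are guesses, and the function names are hypothetical.

```python
from typing import Optional


def effective_schema(tenant_schema: dict, call_schema: Optional[dict] = None) -> dict:
    """Assumption: a per-call schema, when supplied, replaces the tenant-level one."""
    return call_schema if call_schema is not None else tenant_schema


def project_to_schema(candidates: dict, quote_schema: dict) -> dict:
    """Keep only the candidate fields that the schema defines."""
    defined = quote_schema.get("fields", {})
    return {name: value for name, value in candidates.items() if name in defined}
```

A field that was detected but is not defined in the schema simply disappears from the result, which mirrors the "if it isn't in the schema, it isn't returned" rule.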

Edge cases

| Case | Behavior |
| --- | --- |
| Handwritten note (photo / scan) | OCR pipeline runs first. Per-field confidence reflects OCR quality. |
| Multi-language document | Language detected per attachment. Mixed-language documents handled. |
| Spreadsheet attachment | Tabular extraction with header detection. Multi-sheet workbooks: relevant sheets retained, others skipped with reason. |
| Quantity / unit ambiguous | Field returned with low confidence and ambiguous: true flag. |
| Encoded / password-protected attachment | Skipped, reason recorded in attachments_processed[]. No exception thrown. |
| Email signature with line-item-like text | Filtered out using signature-block detection. |
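
Because skipped attachments and ambiguous fields are reported in the response rather than raised as errors, the caller has to look for them. The sketch below collects both. The response shape is an assumption: each line-item field is taken to be an object with `confidence` and optional `ambiguous` keys, and each `attachments_processed[]` entry is taken to have `name`, `status`, and `reason` keys. The 0.7 cutoff and the `review_items` name are illustrative.

```python
def review_items(response: dict, min_confidence: float = 0.7) -> dict:
    """Collect what a rep may need to check before a quote is built."""
    flagged_fields = []
    for index, item in enumerate(response.get("line_items", [])):
        for field_name, field in item.items():
            if not isinstance(field, dict):
                continue  # tolerate fields that carry no confidence metadata
            if field.get("ambiguous") or field.get("confidence", 1.0) < min_confidence:
                flagged_fields.append((index, field_name))
    # Skipped attachments are reported, not raised, so they must be checked explicitly.
    skipped = [
        (entry.get("name"), entry.get("reason"))
        for entry in response.get("attachments_processed", [])
        if entry.get("status") == "skipped"
    ]
    return {"flagged_fields": flagged_fields, "skipped_attachments": skipped}
```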

Performance

| Metric | Target |
| --- | --- |
| p50 (body only) | Under 1.5 s |
| p50 (one PDF, < 5 pages) | Under 4 s |
| p50 (multi-page scan, OCR required) | Under 12 s |
| Max payload | 50 MB. Larger sizes use async with callback. |

Configuration knobs

  • Quote schema (field definitions, types, required vs. optional).
  • Per-field confidence thresholds.
  • Locale defaults.
  • Customer-specific format hints.

Customer Matcher

Purpose. Resolve the inbound case to a customer ID, ship-to ID, and contact ID in master data.

Endpoint. POST /v1/match/customer

Required inputs

| Field | Type | Notes |
| --- | --- | --- |
| case_id | string | Caller-owned identifier. |
| sender | string (email) | Primary signal. |

Optional inputs

| Field | Type | Why it helps |
| --- | --- | --- |
| phone | string (E.164) | For WhatsApp / phone-originated cases. |
| body | string | Allows extraction of customer references. |
| signature_block | string | If extracted separately. |
| country, branch_id | strings | Scopes the match space. |

Outputs

| Field | Type | Notes |
| --- | --- | --- |
| customer_id | string | Empty if no match clears threshold. |
| ship_to_id | string | Best ship-to inferred from request or customer default. |
| contact_id | string | Best contact match for the sender. |
| confidence_score | float | Joint score across customer, ship-to, contact. |
| match_rationale | string | Which signals matched. |
| alternatives[] | array | Ranked candidates above floor threshold. |
| branch_hint | string | Inferred from customer's typical branch. |

Match strategy

The matcher combines three signal sources, weighted by tenant configuration:

  1. Deterministic (highest weight): exact email domain → account, phone → contact, prior-thread linkage.
  2. Structured: address normalization, name normalization with company-suffix handling.
  3. Behavioral: prior order patterns, typical branch, typical ship-to.
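
The reference does not say how the three signal sources are combined numerically. As one plausible illustration, the sketch below uses a weighted average over whichever signals are present. The formula, the weight values in the usage note, and the name `combine_signals` are assumptions, not the matcher's documented behavior.

```python
def combine_signals(scores: dict, weights: dict) -> float:
    """Weighted average of the signal scores that are present.

    Assumption: a simple weighted mean. The real scoring function is not
    specified in this reference.
    """
    total_weight = sum(weights.get(name, 0.0) for name in scores)
    if total_weight == 0:
        return 0.0
    weighted = sum(score * weights.get(name, 0.0) for name, score in scores.items())
    return weighted / total_weight
```

For example, with hypothetical weights `{"deterministic": 0.6, "structured": 0.3, "behavioral": 0.1}`, a strong deterministic hit dominates the joint score even when behavioral evidence is absent.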

Disambiguation policy

For customers with multiple subsidiaries on shared domains (e.g., a holding company), the matcher returns the most likely entity in customer_id and the rest in alternatives[] with reason codes. The orchestrator can choose to surface the disambiguation to the rep.
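
On the orchestrator side, the disambiguation policy might reduce to a choice between showing one candidate or several. This sketch assumes each `alternatives[]` entry carries a `customer_id` key and uses hypothetical policy names (`single_best`, `surface_alternatives`) that echo the configuration knob listed for this module.

```python
def candidates_for_rep(response: dict, policy: str = "surface_alternatives") -> list:
    """Return the customer IDs a rep should see under the given policy."""
    best = response.get("customer_id") or ""
    if not best:
        return []  # no match cleared the threshold
    if policy == "single_best":
        return [best]
    # Assumption: each alternative is an object with a customer_id key.
    alternatives = [alt["customer_id"] for alt in response.get("alternatives", [])]
    return [best] + [c for c in alternatives if c != best]
```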

Edge cases

| Case | Behavior |
| --- | --- |
| Generic email domain (gmail.com, hotmail.com) | Email-domain signal weighted near zero. Falls back to phone, signature, prior-thread, body extraction. |
| New customer (no master-data match) | Returns empty customer_id, confidence_score: 0, unmatched_reason. Orchestrator can route to onboarding. |
| Multiple ship-tos on one customer | Best ship-to inferred from request body; otherwise customer default with reduced confidence. |
| Stale contact (left the company) | Contact match drops; customer match still resolves via domain. |
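
Two of the edge cases above lend themselves to a short illustration: down-weighting generic email domains, and routing on an empty `customer_id`. The domain list contains only the two examples named in the table, the weight of 0.05 stands in for "near zero", and the route names are hypothetical. Treating an empty `contact_id` as the stale-contact case is also an assumption.

```python
GENERIC_DOMAINS = frozenset({"gmail.com", "hotmail.com"})


def email_domain_weight(sender: str, base_weight: float = 1.0, generic_weight: float = 0.05) -> float:
    """Down-weight the email-domain signal for generic mailbox providers."""
    domain = sender.rsplit("@", 1)[-1].strip().lower()
    return generic_weight if domain in GENERIC_DOMAINS else base_weight


def route_after_match(response: dict) -> str:
    """Pick a hypothetical orchestrator route from a matcher response."""
    if not response.get("customer_id"):
        return "onboarding"  # new customer: no master-data match
    if not response.get("contact_id"):
        return "confirm_contact"  # customer resolved, contact did not
    return "continue"
```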

Performance

| Metric | Target |
| --- | --- |
| p50 latency | Under 400 ms |
| p95 latency | Under 1.0 s |

Configuration knobs

  • Match thresholds per customer type.
  • Branch-aware matching rules.
  • Disambiguation policy (single best vs. surface alternatives).
  • Refresh cadence for master data.

Product Matcher

Purpose. Map extracted line-item descriptions to product IDs, with substitutes and rationale.

Endpoint. POST /v1/match/product

Required inputs

| Field | Type | Notes |
| --- | --- | --- |
| case_id | string | Caller-owned identifier. |
| line_items[] | array | From the Info Extractor, or constructed by the orchestrator. |

Optional inputs

| Field | Type | Why it helps |
| --- | --- | --- |
| customer_id | string | Enables customer-specific product history boost. |
| branch_id | string | Enables branch-level catalogue scoping. |
| country | string | Country-specific catalogue scoping. |
| match_strategy | enum | strict, balanced (default), permissive. |

Outputs

| Field | Type | Notes |
| --- | --- | --- |
| matches[] | array | One entry per input line item. |
| matches[].rubix_product_id | string | Empty if unmatched_flag: true. |
| matches[].confidence_score | float | Calibrated. |
| matches[].match_rationale | string | Which match path won. |
| matches[].alternatives[] | array | Substitutes / equivalents with reason codes. |
| matches[].unmatched_flag | bool | True if no candidate cleared threshold. |
| unmatched[] | array | Convenience list for orchestrator routing. |

Match paths

The matcher tries multiple paths and returns the strongest:

  1. Manufacturer + part number exact match (highest confidence path).
  2. Manufacturer cross-reference (competitor part to stocked equivalent).
  3. Semantic match (description embeddings against catalogue).
  4. Historical-pattern match (this customer ordered this SKU before).
  5. Branch-local override (a branch-supplied spreadsheet).

The path that won is reported in match_rationale.
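
The "try multiple paths and return the strongest" idea can be sketched as a loop over path functions that keeps the highest-confidence hit and remembers which path produced it. Everything below is illustrative: the `Candidate` type, the two toy path functions, the part description, the product ID, and the confidence values are made up, and the real matcher's internals are not described in this reference.

```python
from typing import NamedTuple, Optional


class Candidate(NamedTuple):
    product_id: str
    confidence: float
    path: str  # which match path produced this candidate


def strongest_match(description: str, paths: dict) -> Optional[Candidate]:
    """Run every path and keep the highest-confidence hit, if any."""
    best: Optional[Candidate] = None
    for name, matcher in paths.items():
        hit = matcher(description)  # (product_id, confidence) or None
        if hit is None:
            continue
        product_id, confidence = hit
        if best is None or confidence > best.confidence:
            best = Candidate(product_id, confidence, name)
    return best


# Toy stand-ins for two of the five paths; data is hypothetical.
def exact_part_number(description: str):
    table = {"ACME 6204 bearing": "P-1001"}
    return (table[description], 0.99) if description in table else None


def semantic(description: str):
    return ("P-1001", 0.72) if "bearing" in description.lower() else None


PATHS = {"exact_part_number": exact_part_number, "semantic": semantic}
```

Carrying the path name alongside the candidate is what would let a caller populate something like match_rationale.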

Edge cases

| Case | Behavior |
| --- | --- |
| Multiple SKUs at similar confidence | Best candidate if customer history indicates preference; otherwise unmatched_flag: true with all candidates in alternatives[]. |
| Discontinued SKU | Returns the successor with rationale, if mapped. |
| Competitor part with no equivalent | unmatched_flag: true with reason no_rubix_equivalent. |
| Quantity unit mismatch | Resolves match; unit_conversion_required: true flag added with proposed conversion. |
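
A caller will typically sort matcher results into lines that are ready, lines that need a rep, and lines that need a unit conversion confirmed. The sketch below does that by index. It assumes `unit_conversion_required` appears on the individual `matches[]` entry, which the table implies but does not state, and `partition_matches` is a hypothetical helper.

```python
def partition_matches(response: dict) -> dict:
    """Group line-item indices by what the orchestrator should do next."""
    resolved, needs_rep, needs_conversion = [], [], []
    for index, match in enumerate(response.get("matches", [])):
        if match.get("unmatched_flag"):
            needs_rep.append(index)  # no candidate cleared the threshold
            continue
        resolved.append(index)
        # A resolved match can still need a unit conversion confirmed.
        if match.get("unit_conversion_required"):
            needs_conversion.append(index)
    return {"resolved": resolved, "needs_rep": needs_rep, "needs_conversion": needs_conversion}
```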

Performance

| Metric | Target |
| --- | --- |
| p50 latency (10 line items) | Under 800 ms |
| p95 latency (10 line items) | Under 2.0 s |
| p50 latency (100 line items) | Under 4.0 s |

Configuration knobs

  • Match thresholds per product category.
  • Substitute / equivalent rules.
  • Country-specific catalogue scoping.
  • Per-customer override lists.
  • Branch-local-knowledge ingestion (SFTP, API, or scheduled file drop).

Modular vs. unified call patterns

Because each module is independently callable, the orchestrator can choose any of these patterns:

| Pattern | When to use | Notes |
| --- | --- | --- |
| Intent only | Routing decisions only. | Cheap, fast. |
| Intent + Extract | Extract content for non-quote intents (e.g., status requests with attached PDFs). | |
| Full pipeline (all four) | Standard quote handling. | Faction shares case context internally across the four calls when invoked within a short window with the same case_id and a shared correlation_id. |
| Single module reuse | Re-running just product matching after a rep edits a line. | Idempotent; safe to call repeatedly. |

There is no requirement to call modules in a specific order. The orchestrator decides. Modules do not call each other; the caller is always in charge of orchestration.
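
Since orchestration always sits with the caller, a full-pipeline run is simply four calls that reuse the same case_id and correlation_id. The sketch below shows one possible ordering with an injected `post` function, so it stays independent of any HTTP client. Sending `correlation_id` in the request body, continuing only for `quote` and `rfq` intents, and the `fake_post` canned responses are all assumptions for illustration; the endpoint paths and field names come from this page.

```python
import uuid
from typing import Callable

# Transport is injected: post(path, payload) -> parsed JSON response.
Post = Callable[[str, dict], dict]


def run_full_pipeline(case: dict, post: Post) -> dict:
    """Call the four modules in one common order, sharing case context."""
    # Assumption: correlation_id travels in the request body.
    common = {"case_id": case["case_id"], "correlation_id": str(uuid.uuid4())}

    intent = post("/v1/intent/classify", {
        **common,
        "sender": case["sender"],
        "subject": case.get("subject", ""),
        "body": case["body"],
    })
    # Assumption: only quote-like intents continue to the other modules.
    if intent.get("auto_reply") or intent.get("intent") not in ("quote", "rfq"):
        return {"intent": intent}

    extracted = post("/v1/extract/quote", {
        **common, "body": case["body"], "quote_schema": case["quote_schema"],
    })
    customer = post("/v1/match/customer", {**common, "sender": case["sender"]})

    product_request = {**common, "line_items": extracted.get("line_items", [])}
    if customer.get("customer_id"):
        product_request["customer_id"] = customer["customer_id"]
    products = post("/v1/match/product", product_request)

    return {"intent": intent, "extracted": extracted,
            "customer": customer, "products": products}


def fake_post(path: str, payload: dict) -> dict:
    """Canned responses so the sketch runs without a network."""
    canned = {
        "/v1/intent/classify": {"intent": "quote", "confidence_score": 0.9},
        "/v1/extract/quote": {"line_items": [{"description": "bearing"}]},
        "/v1/match/customer": {"customer_id": "C-1"},
        "/v1/match/product": {"matches": []},
    }
    return canned[path]
```

Calling `run_full_pipeline(case, fake_post)` exercises the flow end to end; in practice `post` would wrap whatever HTTP client and authentication the caller already uses.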
