Provn

    © 2026 Provn Inc. All rights reserved.

    Automate Compliance Triage With an AI Agent at a Financial Services Firm

    Tags: ai, agent, builder
    Estimated Time: 1 hour 15 minutes

    What You'll Be Doing

    You are an AI engineer at a mid-sized financial services firm. The Compliance team manually reviews hundreds of regulatory documents each week — policy updates, audit reports, and external regulatory filings — to identify action items, flag risk areas, and route tasks to the right internal owners.

    The process is slow, inconsistent, and increasingly unsustainable. A junior analyst currently spends 12–15 hours per week on initial triage alone. The Head of Compliance has asked your team to explore whether an AI agent could automate the first-pass review and routing — while keeping a human in the loop for final decisions.

    You have been asked to design and prototype a Document Intelligence Agent that can:

    • Ingest a regulatory document (PDF or structured text)
    • Extract key entities: risk areas, required actions, responsible parties, and deadlines
    • Classify the document by urgency and compliance domain (e.g. AML, data privacy, capital requirements)
    • Draft a structured summary and proposed routing decision for human review
    • Flag ambiguities that require human judgment before a routing decision is made
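
    One way to make the capabilities above concrete is a typed output record that the agent fills in for each document. The sketch below is illustrative only: the class names, field names, and enum values (`TriageProposal`, `ActionItem`, `Urgency`, `Domain`) are assumptions, not part of the challenge brief. It shows deadlines as dates, urgency and domain as closed enums, and ambiguities as a first-class field that blocks a routing proposal.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum
from typing import Optional


class Urgency(Enum):  # hypothetical urgency levels
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


class Domain(Enum):  # domains named in the brief, plus a catch-all
    AML = "aml"
    DATA_PRIVACY = "data_privacy"
    CAPITAL_REQUIREMENTS = "capital_requirements"
    OTHER = "other"


@dataclass
class ActionItem:
    description: str
    responsible_party: Optional[str]  # None when the document names no owner
    deadline: Optional[date]          # None when no deadline is stated
    source_page: int                  # where the evidence was found


@dataclass
class TriageProposal:
    document_id: str
    domain: Domain
    urgency: Urgency
    risk_areas: list[str]
    actions: list[ActionItem]
    summary: str
    proposed_owner: Optional[str]
    ambiguities: list[str] = field(default_factory=list)

    def needs_human_judgment(self) -> bool:
        """True when a reviewer must resolve something before routing is proposed."""
        missing_owner = any(a.responsible_party is None for a in self.actions)
        return bool(self.ambiguities) or missing_owner or self.proposed_owner is None


# Illustrative values only; none of this data comes from a real document.
proposal = TriageProposal(
    document_id="doc-001",
    domain=Domain.AML,
    urgency=Urgency.HIGH,
    risk_areas=["transaction monitoring"],
    actions=[ActionItem("Update screening thresholds", None, date(2026, 3, 31), 14)],
    summary="Regulator update on screening thresholds.",
    proposed_owner="AML team",
)
print(proposal.needs_human_judgment())
```

    In this sketch the action item has no responsible party, so the proposal is flagged for human judgment even though a proposed owner is set.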

    You will not be evaluated on whether your agent is production-complete. You will be evaluated on the quality of your design decisions, your reasoning about trade-offs, your approach to reliability and evaluation, and how you communicate the system to a non-technical stakeholder (the Head of Compliance) who will decide whether to fund the next phase.

    Honor these constraints in your design. Strong candidates explicitly acknowledge them. AI tools will typically ignore them — this is intentional.

    • Human-in-the-loop is non-negotiable: the agent may not autonomously route or act on a document. Every output must be a proposed action for a human reviewer to approve. Design for this from the start — not as an afterthought.

    • The documents are unstructured: you cannot assume clean headers, consistent terminology, or structured fields. Your design must handle documents where key information is buried in paragraph 14 of a 40-page PDF.

    • No greenfield infrastructure: the firm runs on standard cloud infrastructure (AWS or Azure) with an existing document storage system (S3 or Blob Storage). You cannot propose a new data warehouse or real-time streaming pipeline as a prerequisite.

    • Audit trail required: compliance workflows require a complete record of what the agent extracted, what it proposed, and why. Every agent decision must be logged and explainable to a regulator.

    • Scope for first phase: the first phase must be demonstrably useful within four weeks of build time with a team of two engineers. If your design requires six months to show value, it will not get funded.
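
    The human-in-the-loop and audit trail constraints can be sketched together as an approval gate that writes every step to an append-only log. This is a minimal in-memory sketch under stated assumptions: `AuditLog`, `ReviewQueue`, and the event names are hypothetical, and a real system would persist each log line to the firm's existing storage (S3 or Blob Storage) rather than a Python list.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class AuditEvent:
    timestamp: str
    document_id: str
    actor: str    # "agent" or a reviewer identifier
    event: str    # "proposed", "approved", or "rejected"
    detail: dict


class AuditLog:
    """Append-only log kept in memory; entries are never edited or removed."""

    def __init__(self) -> None:
        self._lines: list[str] = []

    def record(self, document_id: str, actor: str, event: str, detail: dict) -> None:
        now = datetime.now(timezone.utc).isoformat()
        entry = AuditEvent(now, document_id, actor, event, detail)
        self._lines.append(json.dumps(asdict(entry), sort_keys=True))

    def events(self, document_id: str) -> list[dict]:
        parsed = (json.loads(line) for line in self._lines)
        return [e for e in parsed if e["document_id"] == document_id]


class ReviewQueue:
    """The agent can only propose; routing happens when a reviewer approves."""

    def __init__(self, log: AuditLog) -> None:
        self.log = log
        self._pending: dict[str, dict] = {}

    def propose(self, document_id: str, route_to: str, rationale: str,
                evidence: list[str]) -> None:
        proposal = {"route_to": route_to, "rationale": rationale, "evidence": evidence}
        self._pending[document_id] = proposal
        self.log.record(document_id, "agent", "proposed", proposal)

    def decide(self, document_id: str, reviewer: str, approve: bool,
               note: str = "") -> Optional[str]:
        proposal = self._pending.pop(document_id)  # KeyError if nothing was proposed
        event = "approved" if approve else "rejected"
        detail = {"route_to": proposal["route_to"], "note": note}
        self.log.record(document_id, reviewer, event, detail)
        return proposal["route_to"] if approve else None


# Illustrative usage with made-up identifiers.
log = AuditLog()
queue = ReviewQueue(log)
queue.propose("doc-001", "AML team", "Mentions monitoring thresholds", ["p. 14, para. 3"])
routed_to = queue.decide("doc-001", "reviewer-1", approve=True)
print(routed_to, [e["event"] for e in log.events("doc-001")])
```

    The design choice worth noting is that `decide` is the only path that returns a route, so nothing is routed without a named reviewer appearing in the log next to the agent's rationale and evidence.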

    We expect you to use AI tools. We evaluate how you use them — not whether you use them. Evidence of iteration, redirection, and critical evaluation scores higher than a polished output with no process documentation.

    The single highest-signal indicator is your video answer to the mandatory AI question. If you cannot name a specific moment where you redirected AI output, evaluators will assume you did not redirect it.

    What strong AI usage looks like in this challenge:

    • You used AI to draft tool schemas, then caught a type error or missing field and corrected it — and you can describe what you caught

    • You prompted AI for an architecture, it gave you a generic LangChain tutorial structure, and you redirected it to handle the unstructured document constraint specifically

    • You used AI to draft the stakeholder brief, then rewrote the jargon-heavy parts because a compliance executive would not understand them

    • You independently designed the failure mode analysis because AI output was too generic ('hallucination is a risk') and you knew the specific failure modes from the scenario

    What weak AI usage looks like:

    • A polished submission with no Section C, or a Section C that says 'I used ChatGPT to help me write'

    • Tool schemas with string types on every field because the AI defaulted to strings and you did not review them

    • A stakeholder brief that reads like a technical spec written for engineers

    How Your Work Will Be Scored

    • Agent Architecture & System Design: 30%
    • Tool Integration & Orchestration Quality: 20%
    • Reliability, Evaluation & Production Readiness: 20%
    • Stakeholder Communication & Business Framing: 15%
    • AI Fluency: 15%

    What to Submit

    No submission guidelines provided.
