brookfields_engg

Inputs & Current Setup (PEP Checks):

Data Source: Acuris is used for PEP checks.
Current Environment: Checks currently run against the Acuris development (dev) API endpoint.
Cost Consideration: Once we switch from the dev API to the production API, each PEP check will cost approximately INR 106 (subject to change).

Currently Available:

PEP Level Identification:
- We have direct access to a PEP dataset through Acuris.
- The system can identify PEP levels (e.g., Tier 1, Tier 2, Tier 3 exposure).
Dev API Integration:
- The integration with the Acuris dev environment is functioning and allows us to perform PEP checks.

Required Work:

Production API Integration:
- Move from the Acuris dev environment to the production API endpoint.
- Validate authentication and request quotas, and implement cost tracking for the approximately INR 106 charged per check.
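
A minimal sketch of what per-check cost tracking could look like. The class name, the budget concept, and the method names are hypothetical; only the INR 106 figure comes from this note, and it is subject to change.

```python
from dataclasses import dataclass
from decimal import Decimal

# Assumption: per-check price quoted in this note; subject to change.
COST_PER_CHECK_INR = Decimal("106")


@dataclass
class PepCostTracker:
    """Counts production PEP checks against a hypothetical spend budget."""
    budget_inr: Decimal
    cost_per_check_inr: Decimal = COST_PER_CHECK_INR
    checks_made: int = 0

    @property
    def spent_inr(self) -> Decimal:
        # Total spend so far, derived from the number of recorded checks.
        return self.cost_per_check_inr * self.checks_made

    def can_afford(self, n_checks: int = 1) -> bool:
        # True if n_checks more checks would stay within the budget.
        projected = self.spent_inr + self.cost_per_check_inr * n_checks
        return projected <= self.budget_inr

    def record(self, n_checks: int = 1) -> None:
        # Call before (or right after) each batch of production checks.
        if not self.can_afford(n_checks):
            raise RuntimeError("PEP check budget exceeded")
        self.checks_made += n_checks


tracker = PepCostTracker(budget_inr=Decimal("1000"))
tracker.record(9)
print(tracker.spent_inr, tracker.can_afford())
```

Using Decimal rather than float keeps the running total exact, which matters if the figures are later reconciled against invoices.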
Refine Matching Logic for Connected Parties:
- Extend entity resolution so that PEP checks cover not only direct Targets and Directors but also all connected parties (e.g., shareholders, beneficiaries).
- Match on date of birth, nationality, and other attributes in addition to name to reduce false positives.
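
One way the attribute-based filtering could work, as a sketch only: a name match is kept unless a date of birth or nationality known on both sides conflicts. The Person type, field names, and exact-match rule are assumptions for illustration, not the Acuris data model.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Person:
    # Hypothetical record shape; real fields depend on the data source.
    name: str
    dob: Optional[date] = None
    nationality: Optional[str] = None  # assumed to be a country code


def normalize(name: str) -> str:
    # Lowercase and collapse whitespace so trivial differences do not matter.
    return " ".join(name.lower().split())


def is_plausible_match(party: Person, candidate: Person) -> bool:
    """Names must agree; known DOB/nationality on both sides must not conflict."""
    if normalize(party.name) != normalize(candidate.name):
        return False
    if party.dob and candidate.dob and party.dob != candidate.dob:
        return False
    if (party.nationality and candidate.nationality
            and party.nationality.upper() != candidate.nationality.upper()):
        return False
    return True


shareholder = Person("Jane  Doe", date(1970, 1, 1), "IN")
same_person = Person("jane doe", date(1970, 1, 1), "in")
namesake = Person("Jane Doe", date(1985, 6, 30), "IN")
print(is_plausible_match(shareholder, same_person))
print(is_plausible_match(shareholder, namesake))
```

Missing attributes are treated as "unknown" rather than as a mismatch, so sparse records are not silently dropped; whether that is the right trade-off is a policy decision.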
Risk Categorization & Reporting:
- Ensure that PEP risk levels (Tier 1, 2, 3) are clearly integrated into the final reports.
- Include logic to automatically flag high-risk PEPs and generate corresponding alerts or notifications.
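
The flagging step could be as simple as the sketch below. Which tiers count as high risk is a policy choice; Tier 1 is used here purely as an assumption, and the dictionary keys are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, List

# Assumption: only Tier 1 is treated as high risk; adjust per policy.
HIGH_RISK_TIERS: FrozenSet[int] = frozenset({1})


def flag_high_risk(
    hits: Iterable[Dict],
    high_risk_tiers: FrozenSet[int] = HIGH_RISK_TIERS,
) -> List[str]:
    """Return one alert message per PEP hit whose tier is high risk."""
    return [
        f"ALERT: {hit['name']} is a Tier {hit['tier']} PEP"
        for hit in hits
        if hit["tier"] in high_risk_tiers
    ]


pep_hits = [
    {"name": "Party A", "tier": 1},
    {"name": "Party B", "tier": 3},
]
alerts = flag_high_risk(pep_hits)
print(alerts)
```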

Inputs & Current Setup (Adverse Media Classification):

Classification Columns: A predefined set of columns represents various events and signals (e.g., acquisition-acquiree, legal-issue, regulatory-issue, user-growth).
Planned Expansion: The current classification setup includes both adverse and non-adverse categories across the 40+ signals currently listed. We need to add further signals, primarily related to adverse media.

Currently Available:

Base Classification Framework:
- A preliminary classification schema with columns like legal-issue, regulatory-issue, financial-challenge, and more.
- This framework can tag media articles based on keyword or pattern matches, aiding in initial filtering.
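
For illustration, keyword or pattern tagging of this kind can be sketched as follows. The column names come from this note, but the patterns themselves are made up for the example and are not the framework's actual rules.

```python
import re
from typing import List

# Hypothetical patterns; the real keyword lists are not shown in this note.
SIGNAL_PATTERNS = {
    "legal-issue": re.compile(r"\b(lawsuit|litigation|sued)\b", re.IGNORECASE),
    "regulatory-issue": re.compile(r"\b(regulator|penalty|fined)\b", re.IGNORECASE),
    "financial-challenge": re.compile(r"\b(insolvency|bankruptcy)\b", re.IGNORECASE),
}


def tag_article(text: str) -> List[str]:
    """Return the sorted list of signals whose pattern occurs in the text."""
    return sorted(
        signal for signal, pattern in SIGNAL_PATTERNS.items()
        if pattern.search(text)
    )


print(tag_article("The firm was sued after a regulator imposed a penalty."))
print(tag_article("Quarterly user numbers rose."))
```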

Required Work:

Additional Signal Integration:
- Incorporate new adverse media signals into the classification model.
- Update mapping logic to categorize media hits under newly added signals related to adverse events (e.g., bribery, corruption, fraud).
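
As a rough sketch of how new signals could slot into the existing column layout: each signal becomes one boolean column, so adding a signal means adding a column. The new column names below are taken from the examples above and are assumptions about naming, not agreed identifiers.

```python
from typing import Dict, Iterable, List

# Existing columns named in this note (a subset of the 40+).
BASE_COLUMNS: List[str] = [
    "acquisition-acquiree", "legal-issue", "regulatory-issue", "user-growth",
]
# Assumed names for the new adverse media signals.
NEW_ADVERSE_COLUMNS: List[str] = ["bribery", "corruption", "fraud"]
ALL_COLUMNS: List[str] = BASE_COLUMNS + NEW_ADVERSE_COLUMNS


def to_row(tags: Iterable[str], columns: List[str]) -> Dict[str, bool]:
    """Map an article's tags onto one boolean value per classification column."""
    tag_set = set(tags)
    unknown = tag_set - set(columns)
    if unknown:
        # Fail loudly so a signal is never dropped for lack of a column.
        raise ValueError(f"No column for signals: {sorted(unknown)}")
    return {column: column in tag_set for column in columns}


row = to_row({"fraud", "legal-issue"}, ALL_COLUMNS)
print(row)
```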
ML/NLP Model Improvements:
- Move from simple keyword-based rules to more advanced NLP techniques (e.g., transformer-based models) to improve classification accuracy.
- Fine-tune the model using historical datasets to reduce false positives and negatives.
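
To judge whether a new model actually reduces false positives and negatives, its predictions can be scored against labelled historical articles. The sketch below assumes one boolean label per article for a single signal; the function names and sample data are illustrative only.

```python
from typing import Dict, Sequence


def confusion_counts(y_true: Sequence[bool], y_pred: Sequence[bool]) -> Dict[str, int]:
    """Count true/false positives and negatives for one signal."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for truth, pred in zip(y_true, y_pred):
        if truth and pred:
            counts["tp"] += 1
        elif pred:
            counts["fp"] += 1  # flagged, but not actually the signal
        elif truth:
            counts["fn"] += 1  # missed a real occurrence
        else:
            counts["tn"] += 1
    return counts


def precision_recall(counts: Dict[str, int]) -> tuple:
    # Guard against division by zero when nothing was predicted or present.
    predicted = counts["tp"] + counts["fp"]
    actual = counts["tp"] + counts["fn"]
    precision = counts["tp"] / predicted if predicted else 0.0
    recall = counts["tp"] / actual if actual else 0.0
    return precision, recall


# Made-up labels for five historical articles and one signal.
labels = [True, True, False, False, True]
predictions = [True, False, True, False, True]
counts = confusion_counts(labels, predictions)
print(counts, precision_recall(counts))
```

Running the same evaluation for the keyword rules and for the candidate model gives a like-for-like comparison before switching.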
Semantic Search & Vector Database Integration:
- Fully integrate the vector database for semantic similarity searches, enabling quick retrieval of relevant media articles by topic.
- Ensure that the database can handle newly added signals efficiently.
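
The core operation behind semantic similarity search is a nearest-neighbour lookup over embedding vectors. The sketch below uses a plain dictionary and cosine similarity to show the idea; a real vector database would replace the in-memory index, and the three-dimensional vectors are toy values, not real embeddings.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: Sequence[float], index: Dict[str, Sequence[float]], k: int = 2) -> List[str]:
    """Return the ids of the k articles most similar to the query vector."""
    ranked = sorted(index, key=lambda doc_id: cosine(query, index[doc_id]), reverse=True)
    return ranked[:k]


# Toy index: article id -> embedding (illustrative values only).
article_index = {
    "article-1": [1.0, 0.0, 0.0],
    "article-2": [0.9, 0.1, 0.0],
    "article-3": [0.0, 1.0, 0.0],
}
print(top_k([1.0, 0.0, 0.0], article_index))
```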
Summarization & Timeline Visualization:
- Implement summarization algorithms to provide concise overviews of adverse media over time.
- Add timeline views to show when certain signals (e.g., "legal-issue", "regulatory-issue") appeared, providing historical context and trends.
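
The data behind a timeline view can be a simple count of signal occurrences per month. The sketch below assumes each media hit is a (publication date, signal) pair; that shape and the sample dates are made up for illustration.

```python
from collections import Counter
from datetime import date
from typing import Dict, Iterable, Tuple


def monthly_timeline(hits: Iterable[Tuple[date, str]]) -> Dict[Tuple[str, str], int]:
    """Count hits per (YYYY-MM, signal), ordered chronologically."""
    counts = Counter((published.strftime("%Y-%m"), signal) for published, signal in hits)
    return dict(sorted(counts.items()))


# Made-up sample hits.
media_hits = [
    (date(2024, 3, 2), "regulatory-issue"),
    (date(2024, 1, 5), "legal-issue"),
    (date(2024, 1, 20), "legal-issue"),
]
timeline = monthly_timeline(media_hits)
for (month, signal), count in timeline.items():
    print(month, signal, count)
```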