brookfields_engg
2. PEP (Politically Exposed Person) Checks
Inputs & Current Setup:
- Data Source: Using Acuris for PEP checks.
- Current Environment: Checks are currently being executed using the development (dev) API endpoint.
- Cost Consideration: Once we switch from the dev API to the production API, each PEP check will cost approximately INR 106 (subject to change).
Currently Available:
- PEP Level Identification:
- We have direct access to a PEP dataset through Acuris.
- The system can identify PEP levels (e.g., Tier 1, Tier 2, Tier 3 exposure).
- Dev API Integration:
- The integration with the Acuris dev environment is functioning and allows us to perform PEP checks.
Required Work:
- Production API Integration:
- Move from the Acuris dev environment to the production API endpoint.
- Validate authentication and request quotas, and implement cost tracking for the approximately INR 106 charged per check.
- Refine Matching Logic for Connected Parties:
- Enhance the entity resolution to apply PEP checks not only to direct Targets and Directors but also to all connected parties (e.g., shareholders, beneficiaries).
- Use date of birth, nationality, and other attributes in matching to reduce false positives.
- Risk Categorization & Reporting:
- Ensure that PEP risk levels (Tier 1, 2, 3) are clearly integrated into the final reports.
- Include logic to automatically flag high-risk PEPs and generate corresponding alerts or notifications.
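
The matching, flagging, and cost-tracking items above could be prototyped roughly as follows. This is a minimal sketch, not the Acuris integration: the `Party` and `PepRecord` classes, the `screen` function, the 0.85 name-similarity threshold, and the choice to treat Tier 1 as high risk are all assumptions made for illustration. The cost estimate assumes one billable check per screened party at the approximate INR 106 rate.

```python
from dataclasses import dataclass
from datetime import date
from difflib import SequenceMatcher
from typing import Optional

COST_PER_CHECK_INR = 106  # approximate production cost per check, subject to change
HIGH_RISK_TIERS = {1}     # assumption: only Tier 1 is treated as high risk


@dataclass
class Party:
    name: str
    role: str  # e.g. "target", "director", "shareholder", "beneficiary"
    dob: Optional[date] = None
    nationality: Optional[str] = None


@dataclass
class PepRecord:
    name: str
    tier: int
    dob: Optional[date] = None
    nationality: Optional[str] = None


def match_score(party: Party, record: PepRecord) -> float:
    """Name similarity, zeroed when a known attribute contradicts the record."""
    if party.dob and record.dob and party.dob != record.dob:
        return 0.0
    if party.nationality and record.nationality and party.nationality != record.nationality:
        return 0.0
    return SequenceMatcher(None, party.name.lower(), record.name.lower()).ratio()


def screen(parties, records, threshold=0.85):
    """Return (hits, estimated_cost_inr) for all parties, connected ones included."""
    hits = []
    for party in parties:
        for record in records:
            score = match_score(party, record)
            if score >= threshold:
                hits.append({
                    "party": party.name,
                    "role": party.role,
                    "tier": record.tier,
                    "score": round(score, 2),
                    "high_risk": record.tier in HIGH_RISK_TIERS,
                })
    # Assumes one billable check per party screened.
    return hits, len(parties) * COST_PER_CHECK_INR
```

In this sketch a mismatched date of birth or nationality rules a candidate out entirely. A softer penalty may be preferable if the source data is known to contain errors.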
4. Adverse Media and Non-Adverse Media Analysis
Inputs & Current Setup:
- Classification Columns: A predefined set of columns represents various events and signals (e.g., acquisition-acquiree, legal-issue, regulatory-issue, user-growth).
- Planned Expansion: The current classification setup includes both adverse and non-adverse categories across the 40+ signals currently listed. We need to add further signals, primarily related to adverse media.
Currently Available:
- Base Classification Framework:
- A preliminary classification schema with columns like legal-issue, regulatory-issue, financial-challenge, and more.
- This framework can tag media articles based on keyword or pattern matches, aiding in initial filtering.
Required Work:
- Additional Signal Integration:
- Incorporate new adverse media signals into the classification model.
- Update mapping logic to categorize media hits under newly added signals related to adverse events (e.g., bribery, corruption, fraud).
- ML/NLP Model Improvements:
- Move from simple keyword-based rules to more advanced NLP techniques (e.g., transformer-based models) to improve accuracy of classification.
- Fine-tune the model using historical datasets to reduce false positives and negatives.
- Semantic Search & Vector Database Integration:
- Fully integrate the vector database for semantic similarity searches, enabling quick retrieval of relevant media articles by topic.
- Ensure that the database can handle newly added signals efficiently.
- Summarization & Timeline Visualization:
- Implement summarization algorithms to provide concise overviews of adverse media over time.
- Add timeline views to show when certain signals (e.g., "legal-issue", "regulatory-issue") appeared, providing historical context and trends.
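
As a starting point for the signal mapping and timeline items above, the sketch below tags articles by keyword and counts signal occurrences per month. It reflects the simple keyword approach rather than the planned NLP models. The `SIGNAL_KEYWORDS` table, its keyword lists, and the `tag_signals` and `build_timeline` helpers are hypothetical; the real signal definitions are not specified in this document.

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical keyword lists for illustration only.
SIGNAL_KEYWORDS = {
    "bribery": ["bribe", "kickback"],
    "fraud": ["fraud", "embezzle"],
    "legal-issue": ["lawsuit", "indicted"],
}


def tag_signals(text):
    """Return the sorted signals whose keywords occur in the text (substring match)."""
    lowered = text.lower()
    return sorted(
        signal
        for signal, words in SIGNAL_KEYWORDS.items()
        if any(word in lowered for word in words)
    )


def build_timeline(articles):
    """Count signals per month from (published_date, text) pairs.

    Returns {"YYYY-MM": Counter({signal: count})}, ordered by month.
    Months in which no article matched any signal are omitted.
    """
    timeline = defaultdict(Counter)
    for published, text in articles:
        tags = tag_signals(text)
        if tags:
            timeline[published.strftime("%Y-%m")].update(tags)
    return dict(sorted(timeline.items()))
```

The per-month counts could feed a timeline view directly. Swapping `tag_signals` for a model-based classifier later would leave `build_timeline` unchanged.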