Litigation Sources in Data Warehouse and ETL Pipelines

Litigation sources in our data warehouse underpin due diligence and Third-Party Risk Management (TPRM) analyses. They comprise records from courts and tribunals across the country, covering both past and ongoing litigation. Our ETL pipelines keep this data continuously updated and available for analysis.

Litigation Sources

The `litigation_sources` schema in our data warehouse contains several tables, each dedicated to a different source of litigation data:

Delhi High Court Records (`delhi_hc_records`)

This table hosts data obtained from the Delhi High Court's publicly accessible records. Key data points include appeal number, case type, case status, filing date, appellants, respondents, and order information.

ECourts High Court Records (`ecourts_hc_records`)

The ecourts_hc_records table maintains data from the E-Courts services of India, which covers High Courts across the country. This table stores comprehensive case data, including case number, appellants, respondents, final order, interim orders, and relevant legal sections, among other details.

ITAT Records (`itat_records`)

The itat_records table holds data sourced from the Income Tax Appellate Tribunal (ITAT). It carries case-specific details like appeal type, bench name, filing date, assessment year, case status, order type, and result.

NCDRC Records (`ncdrc_records`)

The ncdrc_records table carries data from the National Consumer Disputes Redressal Commission (NCDRC). The key data points include case number, case type, filing date, appellants, respondents, and advocate details.

NCLAT Records (`nclat_records`) and NCLT Records (`nclt_records`)

These tables store data from the National Company Law Appellate Tribunal (NCLAT) and National Company Law Tribunal (NCLT), respectively. They hold crucial case data including filing number, case number, assessment year, appellants, respondents, hearing details, and case history.

ETL Pipelines for Litigation Sources

ETL pipelines for litigation sources follow the standard sequence of extraction, transformation, and loading, with each stage adapted to the characteristics of litigation data.

Extraction

Data is extracted from each litigation source by a custom scraper. Each scraper navigates the page structure of its source, identifies new and updated records, and extracts the raw data.
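One way to identify new and updated records is to compare a content fingerprint of each scraped record against the fingerprint stored from the previous run. The following is a minimal sketch of that idea, not a description of the actual scrapers. The function names, the `case_number` key field, and the in-memory `seen` mapping are all assumptions made for illustration.

```python
import hashlib
import json
from typing import Dict, Iterable, List, Tuple


def record_fingerprint(record: dict) -> str:
    """Return a stable SHA-256 hash of a record's content.

    Keys are sorted so that field order does not affect the hash.
    """
    payload = json.dumps(record, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def split_new_and_updated(
    scraped: Iterable[dict],
    seen: Dict[str, str],
    key_field: str = "case_number",  # hypothetical key field name
) -> Tuple[List[dict], List[dict]]:
    """Split scraped records into (new, updated).

    `seen` maps each known record key to the fingerprint stored on the
    previous run. Unchanged records are dropped.
    """
    new: List[dict] = []
    updated: List[dict] = []
    for record in scraped:
        key = record[key_field]
        fingerprint = record_fingerprint(record)
        if key not in seen:
            new.append(record)
        elif seen[key] != fingerprint:
            updated.append(record)
    return new, updated
```

In this sketch, only the records returned in `new` and `updated` would be passed on to the transformation stage, and `seen` would be refreshed with their fingerprints after a successful load.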

Transformation

Once extracted, the raw data is processed through transformation pipelines. Here, the data is cleaned (e.g., removing duplicates and correcting errors), normalized (e.g., ensuring consistent use of terminology and data formats), and structured into a format compatible with our data warehouse schema.
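The cleaning and normalization steps described above can be sketched as follows. This is an illustrative example under stated assumptions: the field names (`case_number`, `case_status`, `filing_date`), the status vocabulary, and the accepted date formats are hypothetical and would differ per source.

```python
from datetime import datetime
from typing import Dict, Iterable, List, Optional

# Hypothetical mapping from source-specific wording to one canonical term.
STATUS_ALIASES = {
    "disposed": "DISPOSED",
    "disposed of": "DISPOSED",
    "pending": "PENDING",
}

# Hypothetical set of date layouts seen across sources.
DATE_FORMATS = ("%d-%m-%Y", "%d/%m/%Y", "%Y-%m-%d")


def normalize_date(raw: Optional[str]) -> Optional[str]:
    """Parse a date in any known layout and return it as ISO 8601, else None."""
    if not raw:
        return None
    text = raw.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None


def normalize_record(raw: dict) -> dict:
    """Trim whitespace and apply consistent terminology and date formats."""
    status = (raw.get("case_status") or "").strip().lower()
    return {
        "case_number": (raw.get("case_number") or "").strip().upper(),
        "case_status": STATUS_ALIASES.get(status, status.upper() or None),
        "filing_date": normalize_date(raw.get("filing_date")),
    }


def transform(raw_records: Iterable[dict]) -> List[dict]:
    """Normalize records and remove duplicates by case number.

    When the same case number appears more than once, the later
    occurrence wins. Records without a case number are skipped.
    """
    by_key: Dict[str, dict] = {}
    for raw in raw_records:
        record = normalize_record(raw)
        if record["case_number"]:
            by_key[record["case_number"]] = record
    return list(by_key.values())
```

Keeping the last occurrence of a duplicate is a design choice made for this sketch. A real pipeline might instead prefer the record with the most recent scrape timestamp or merge fields from both.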

Loading

Finally, the transformed data is loaded into the corresponding table in the `litigation_sources` schema. Each field of a record is mapped to its matching column in the target table, which preserves data integrity.
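As a rough illustration of the loading step, the sketch below maps record fields to columns explicitly and writes them in a single transaction. It uses an in-memory SQLite database as a stand-in for the warehouse, which is an assumption made only so the example is runnable. The three columns shown for `ncdrc_records` are a hypothetical subset, and `INSERT OR REPLACE` is SQLite-specific syntax that a real warehouse would replace with its own upsert or merge statement.

```python
import sqlite3
from typing import Iterable

# Hypothetical subset of columns for the ncdrc_records table.
COLUMNS = ("case_number", "case_type", "filing_date")


def create_table(conn: sqlite3.Connection) -> None:
    """Create the target table if it does not already exist."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS ncdrc_records (
            case_number TEXT PRIMARY KEY,
            case_type   TEXT,
            filing_date TEXT
        )
        """
    )


def load_records(conn: sqlite3.Connection, records: Iterable[dict]) -> int:
    """Insert or replace records keyed by case_number; return rows written.

    Every record must provide all expected columns, so a field that was
    dropped or renamed upstream fails loudly instead of loading as NULL.
    """
    rows = []
    for record in records:
        missing = [col for col in COLUMNS if col not in record]
        if missing:
            raise ValueError(f"record is missing fields: {missing}")
        rows.append({col: record[col] for col in COLUMNS})

    # The connection context manager commits on success, rolls back on error.
    with conn:
        conn.executemany(
            "INSERT OR REPLACE INTO ncdrc_records "
            "(case_number, case_type, filing_date) "
            "VALUES (:case_number, :case_type, :filing_date)",
            rows,
        )
    return len(rows)
```

Validating all records before opening the transaction means a malformed batch is rejected as a whole, which avoids partially loaded data.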

Summary

The `litigation_sources` schema in our data warehouse provides a consolidated view of litigation history across courts and tribunals, supporting risk assessment and due diligence. Our ETL pipelines keep this data up to date and ready for analysis, which is what makes it usable in our TPRM processes.