Bombay High Court Scraper
Our Bombay High Court scraper is a Python script that uses the BeautifulSoup and Requests libraries to scrape litigation data from the Bombay High Court's website.
Source
The source of data is the Bombay High Court's official website, specifically the webpage dedicated to case status and case history.
Inputs
The script takes as inputs:
- Case Type: The type of case to look for (e.g., Civil or Criminal).
- Case Number: The specific number assigned to the case.
- Year: The year in which the case was registered.
These inputs are used to form the search query on the Bombay High Court's case status webpage.
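As a rough illustration of how the three inputs could be turned into a search query, here is a minimal sketch. The endpoint URL and the parameter names (case_type, case_number, year) are placeholders chosen for this example; the real names depend on the court website's search form and are not given in this document.

```python
from urllib.parse import urlencode

# Placeholder endpoint: the real case status URL is not given in this document.
CASE_STATUS_URL = "https://example.invalid/case-status"


def build_query(case_type: str, case_number: str, year: int) -> dict:
    """Map the three inputs to query parameters (parameter names are assumed)."""
    return {
        "case_type": case_type,
        "case_number": case_number,
        "year": str(year),
    }


def build_search_url(case_type: str, case_number: str, year: int) -> str:
    """Return the full search URL with the inputs URL-encoded."""
    query = urlencode(build_query(case_type, case_number, year))
    return f"{CASE_STATUS_URL}?{query}"


print(build_search_url("Civil", "123", 2020))
# prints https://example.invalid/case-status?case_type=Civil&case_number=123&year=2020
```

Keeping the query as a dictionary means the same mapping can also be passed directly to Requests through its params argument instead of building the URL by hand.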
Outputs
The scraper outputs a JSON file containing a list of case records, each with the following data fields:
- Case Number: The specific number assigned to the case.
- Case Type: The type of case.
- Year: The year in which the case was registered.
- Petitioner: The name of the petitioner in the case.
- Respondent: The name of the respondent in the case.
- Status: The current status of the case.
- Last Hearing Date: The date of the last hearing in the case.
- Next Hearing Date: The scheduled date of the next hearing in the case.
- Court Number: The court number where the hearings are held.
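One way to picture a single output record is as a typed dictionary with one key per field listed above. The snake_case key names and the choice of plain strings for every value are assumptions made for this sketch; the script's actual JSON keys may differ, and the values below are placeholders rather than real case data.

```python
from typing import TypedDict


class CaseRecord(TypedDict):
    """Assumed shape of one record; key names are hypothetical."""
    case_number: str
    case_type: str
    year: str
    petitioner: str
    respondent: str
    status: str
    last_hearing_date: str
    next_hearing_date: str
    court_number: str


# Placeholder values only, not data from a real case.
example: CaseRecord = {
    "case_number": "123",
    "case_type": "Civil",
    "year": "2020",
    "petitioner": "PLACEHOLDER",
    "respondent": "PLACEHOLDER",
    "status": "PLACEHOLDER",
    "last_hearing_date": "PLACEHOLDER",
    "next_hearing_date": "PLACEHOLDER",
    "court_number": "PLACEHOLDER",
}
```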
Mechanism
The script first makes a GET request to the Bombay High Court's case status webpage with the given inputs as query parameters. It then parses the HTML response using BeautifulSoup to find the table containing the case details.
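The request and table lookup described above might look roughly like the sketch below. The function names are hypothetical, and the code assumes the case details sit in the first table on the page; the real page may need a more specific selector such as an id or class.

```python
import requests
from bs4 import BeautifulSoup


def fetch_case_page(url: str, params: dict, timeout: float = 30.0) -> str:
    """GET the case status page with the inputs as query parameters."""
    response = requests.get(url, params=params, timeout=timeout)
    response.raise_for_status()  # fail early on HTTP error responses
    return response.text


def find_case_table(html: str):
    """Return the first <table> element in the page, or None if there is none.

    Assumption: the case details are in the first table of the document.
    """
    soup = BeautifulSoup(html, "html.parser")
    return soup.find("table")
```

Splitting the network call from the parsing step lets the parsing be exercised against a saved HTML file without contacting the court's server.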
Each row of the table represents a case, and each cell in the row represents a field of the case (e.g., case number or case type). The script extracts the text in these cells and stores it in a dictionary.
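A minimal sketch of the row-to-dictionary step is shown here. The column order in COLUMNS and the key names are assumptions; the real table layout on the court's page may order or label its columns differently.

```python
from bs4 import BeautifulSoup

# Assumed column order and key names; adjust to match the real table.
COLUMNS = (
    "case_number", "case_type", "year", "petitioner", "respondent",
    "status", "last_hearing_date", "next_hearing_date", "court_number",
)


def parse_rows(html: str) -> list:
    """Turn each data row of the first table into a dictionary."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table")
    if table is None:
        return []
    records = []
    for row in table.find_all("tr"):
        # Header rows use <th>, so they yield no <td> cells and are skipped.
        cells = [td.get_text(strip=True) for td in row.find_all("td")]
        if len(cells) != len(COLUMNS):
            continue  # skip header or malformed rows
        records.append(dict(zip(COLUMNS, cells)))
    return records
```

Rows whose cell count does not match the expected columns are skipped here rather than raising an error; a stricter script might log them instead.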
Once all the rows in the table have been processed, the script writes the list of dictionaries to a JSON file. This file can then be used for further analysis or integrated into a larger data pipeline.
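The final write step could be sketched as follows, using only the standard library. The function name, the output path, and the formatting options (two-space indentation, non-ASCII characters kept as-is) are illustrative choices, not details taken from the script.

```python
import json
from pathlib import Path


def write_records(records: list, path: str) -> Path:
    """Write the list of case dictionaries to a JSON file and return its path."""
    out_path = Path(path)
    with out_path.open("w", encoding="utf-8") as f:
        # ensure_ascii=False keeps party names in non-Latin scripts readable.
        json.dump(records, f, indent=2, ensure_ascii=False)
    return out_path
```

For example, write_records(records, "cases.json") would produce a file that a downstream pipeline can load back with json.load.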