US Accident Analysis
A machine-learning-powered traffic risk prediction platform that processes and analyzes more than 7 million historical accident records. The system uses Python, Pandas, and Scikit-Learn for large-scale data preprocessing, feature engineering, and predictive modeling, and reaches 74.14% classification accuracy. The backend is a FastAPI microservice that serves the trained model through RESTful endpoints for real-time traffic risk predictions. For frontend consumption, a Java + Spring Boot Backend-for-Frontend (BFF) securely orchestrates communication between services and delivers geospatial risk data to an interactive Leaflet-based map interface.
Phases:
[x] Phase 1: ETL, EDA, and ML Model Training
Cleaning and preprocessing the 7 million historical records, conducting Exploratory Data Analysis, engineering features (including One-Hot Encoding for weather conditions), training a Scikit-Learn Random Forest model that reaches 74.14% accuracy, and exporting the trained model and its feature structure to .pkl (Pickle) files.
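The encode, train, and export steps of Phase 1 can be sketched as follows. This is a minimal illustration, not the project's notebook code: the column names (`Temperature`, `Weather_Condition`, `Severity`), the output file names (`model.pkl`, `columns.pkl`), the hyperparameters, and the tiny synthetic frame are all assumptions made for the example.

```python
import pickle
import tempfile
from pathlib import Path

import pandas as pd
from sklearn.ensemble import RandomForestClassifier


def train_and_export(df, target, categorical, out_dir):
    """One-hot encode, fit a Random Forest, and pickle the model plus its column list."""
    # One-Hot Encoding: each category becomes its own indicator column.
    features = pd.get_dummies(df.drop(columns=[target]), columns=categorical)

    # Hyperparameters are placeholders, not the values used in the project.
    model = RandomForestClassifier(n_estimators=50, random_state=42)
    model.fit(features, df[target])

    # Persist the model and the exact column order it was trained on
    # (hypothetical file names).
    out_dir = Path(out_dir)
    with open(out_dir / "model.pkl", "wb") as fh:
        pickle.dump(model, fh)
    with open(out_dir / "columns.pkl", "wb") as fh:
        pickle.dump(list(features.columns), fh)
    return model, list(features.columns)


# Tiny synthetic frame standing in for the real accident records.
toy = pd.DataFrame({
    "Temperature": [30.0, 75.0, 28.0, 80.0, 33.0, 70.0],
    "Weather_Condition": ["Snow", "Clear", "Snow", "Clear", "Rain", "Clear"],
    "Severity": [3, 1, 3, 1, 2, 1],
})
export_dir = tempfile.mkdtemp()
model, columns = train_and_export(toy, "Severity", ["Weather_Condition"], export_dir)
print(columns)
```

Saving the column list next to the model matters because One-Hot Encoding makes the feature layout depend on the categories seen during training; any consumer of the model has to rebuild rows in that same order.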
[x] Phase 2: Data & AI Microservice with FastAPI
Building a high-performance Python server that loads the serialized .pkl files into memory on startup. It exposes a GET /data endpoint that serves the optimized sample of 4,000 records (accidentes_muestra.json), a GET /columns endpoint that shares the model's feature architecture, and a POST /predict endpoint that runs live inference on new accident data using Scikit-Learn.
[ ] Phase 3: BFF (Backend for Frontend) Orchestrator with Spring Boot
Developing a strongly-typed API Gateway layer using Java and Spring Boot. This service is the core orchestrator: it validates incoming UI requests against strict Jakarta validation schemas, groups and paginates data to protect frontend rendering, and acts as a secure reverse proxy that forwards clean payloads to the Python microservice, so the browser never calls the microservice directly and CORS conflicts are avoided.
[ ] Phase 4: Interactive Dashboard & Simulator with React & Mapbox
Designing a modern dark-mode user interface that consumes the structured data from the Spring Boot BFF. The frontend will feature an analytical dashboard with interactive charts and Mapbox-powered geospatial heatmaps of historical accidents, alongside a dedicated 'Risk Simulator' form where users enter live variables and see the risk alerts returned by the machine learning pipeline.
Tech Stack
Python
Jupyter Notebook