A poorly maintained ship engine in the supply chain industry can lead to inefficiency, increased fuel consumption, a higher risk of malfunction, and safety hazards. Engine failures also cause downtime and delayed deliveries, which can take a ship out of service and ultimately reduce revenue. Your challenge in this project is to apply critical thinking and ML concepts to design and implement a robust anomaly detection model.
- Project Definition
- Jupyter Notebook
- Report
Data-Driven Anomaly Detection in Maritime Engineering Systems
Project Summary
In this project, I developed an anomaly detection system to monitor the health of ship engines using real-world telemetry data. My goal was to identify potential engine malfunctions early, before they could cause costly downtime, safety risks, and delivery delays, and in doing so to support operational efficiency and predictive maintenance.
Business Context
Shipping companies rely on complex engine systems to power global logistics operations. Even minor undetected anomalies in engine performance can cascade into severe failures, increasing operational costs and risking cargo deadlines.
I was tasked with building an anomaly detection solution using sensor data collected from ship engines. The challenge: anomalies represented only a small fraction (~1–5%) of the dataset, requiring highly sensitive models capable of detecting subtle signals without overwhelming stakeholders with false positives.
Key Objectives
- Accurately identify abnormal engine behaviors that could indicate mechanical failure.
- Reduce false alarms by understanding the operational context of outliers.
- Deliver actionable insights that ship engineers could trust and act on.
Tools & Technologies
- Programming: Python
- Libraries: Pandas, NumPy, Scikit-learn, Matplotlib, Seaborn
- Modeling Techniques: Isolation Forest, One-Class SVM, IQR
- Dimensionality Reduction: PCA
- Data Processing: Feature scaling, normalization, flag engineering
- Collaboration & Delivery: Jupyter Notebook, PDF Report Presentation
Technical Approach
1. Data Exploration & Preprocessing
- Conducted deep exploratory data analysis (EDA) to understand six key features:
  - Engine RPM
  - Lubrication oil pressure
  - Lubrication oil temperature
  - Coolant pressure
  - Coolant temperature
  - Fuel pressure
- Addressed missing values, normalized scale differences, and smoothed noisy trends to preserve meaningful patterns.
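The preprocessing steps above could look roughly like the following. This is a minimal sketch, not the notebook's actual code: the column names, the median imputation, the rolling-median window, and the synthetic data are all assumptions made for illustration, and the smoothing step assumes the rows are in time order.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical column names; the real dataset's headers may differ.
FEATURES = [
    "engine_rpm",
    "lub_oil_pressure",
    "lub_oil_temp",
    "coolant_pressure",
    "coolant_temp",
    "fuel_pressure",
]


def preprocess(df: pd.DataFrame, window: int = 5) -> pd.DataFrame:
    """Impute, smooth, and standardize the six sensor features."""
    out = df[FEATURES].astype(float)
    # Median imputation is one simple choice for missing readings.
    out = out.fillna(out.median())
    # A centered rolling median damps isolated spikes; min_periods=1
    # keeps the first and last rows from becoming NaN.
    out = out.rolling(window, center=True, min_periods=1).median()
    # Standardize so every feature has zero mean and unit variance.
    scaled = StandardScaler().fit_transform(out)
    return pd.DataFrame(scaled, columns=FEATURES, index=df.index)


# Synthetic stand-in data, for illustration only.
rng = np.random.default_rng(0)
raw = pd.DataFrame(
    rng.normal(size=(200, 6)) * [250, 1, 3, 1, 6, 2.5] + [790, 3.3, 78, 2.3, 78, 6.6],
    columns=FEATURES,
)
raw.loc[3, "fuel_pressure"] = np.nan  # simulate a missing reading
clean = preprocess(raw)
```

A larger `window` smooths more aggressively but can also flatten short-lived anomalies, so it is worth checking the detectors with and without smoothing.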
2. Feature Interaction Analysis
- Recognized that single-variable anomalies were not always meaningful (e.g., high RPM during acceleration).
- Identified multi-feature combinations indicative of real issues (e.g., high RPM + high oil temp + abnormal coolant pressure = overheating risk).
- Built custom anomaly flags for dangerous feature interactions.
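One way to express such an interaction flag is sketched below. The column names, the `overheating_risk` flag name, and the quantile cut-offs are hypothetical choices for illustration; the project's actual flag definitions are not shown here.

```python
import pandas as pd


def add_overheating_flag(
    df: pd.DataFrame,
    rpm_q: float = 0.90,
    oil_temp_q: float = 0.90,
    coolant_q: tuple = (0.05, 0.95),
) -> pd.DataFrame:
    """Flag rows where three conditions hold at the same time."""
    # High RPM and high oil temperature, relative to the data itself.
    high_rpm = df["engine_rpm"] > df["engine_rpm"].quantile(rpm_q)
    high_oil = df["lub_oil_temp"] > df["lub_oil_temp"].quantile(oil_temp_q)
    # "Abnormal" coolant pressure: outside the central quantile band.
    low, high = df["coolant_pressure"].quantile(list(coolant_q))
    odd_coolant = ~df["coolant_pressure"].between(low, high)
    out = df.copy()
    # Only the combination is flagged, never a single feature alone.
    out["overheating_risk"] = high_rpm & high_oil & odd_coolant
    return out


# Tiny illustrative frame: only the last row is extreme on all three.
demo = pd.DataFrame({
    "engine_rpm": [700] * 9 + [1500],
    "lub_oil_temp": [75] * 9 + [95],
    "coolant_pressure": [2.0] * 9 + [6.0],
})
flagged = add_overheating_flag(demo)
```

Because the flag requires all three conditions, a row with high RPM but normal oil temperature and coolant pressure stays unflagged, which matches the idea that high RPM alone can be ordinary operation.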
3. Anomaly Detection Models
- Applied and compared multiple techniques:
  - IQR Method (for univariate detection and visualization)
  - One-Class SVM (for boundary-based detection)
  - Isolation Forest (for efficient and accurate multivariate outlier detection)
- Isolation Forest achieved the best balance of sensitivity and specificity, flagging ~2.16% of observations as anomalous (within the expected ~1–5% range) with few apparent false positives.
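The three techniques can be compared side by side along these lines. This is a hedged sketch on synthetic data: the `contamination` and `nu` values, the RBF kernel, the feature names, and the injected outliers are assumptions, not the settings or data used in the project.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM


def iqr_outliers(df: pd.DataFrame, k: float = 1.5) -> pd.DataFrame:
    """Per-feature mask: value outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = df.quantile(0.25), df.quantile(0.75)
    iqr = q3 - q1
    return (df < q1 - k * iqr) | (df > q3 + k * iqr)


def model_outliers(
    df: pd.DataFrame, contamination: float = 0.02, seed: int = 42
) -> pd.DataFrame:
    """Row-level flags from Isolation Forest and One-Class SVM."""
    X = StandardScaler().fit_transform(df)
    iso = IsolationForest(contamination=contamination, random_state=seed)
    # nu is an upper bound on the fraction of training errors.
    svm = OneClassSVM(kernel="rbf", gamma="scale", nu=contamination)
    # Both estimators label outliers as -1 and inliers as +1.
    return pd.DataFrame(
        {
            "isolation_forest": iso.fit_predict(X) == -1,
            "one_class_svm": svm.fit_predict(X) == -1,
        },
        index=df.index,
    )


# Synthetic stand-in: 980 ordinary rows plus 20 widely scattered ones.
rng = np.random.default_rng(0)
ordinary = rng.normal(size=(980, 6))
scattered = rng.normal(scale=6.0, size=(20, 6))
data = pd.DataFrame(
    np.vstack([ordinary, scattered]),
    columns=[f"sensor_{i}" for i in range(6)],
)

flags = model_outliers(data)
flags["iqr_any"] = iqr_outliers(data).any(axis=1)
share_flagged = flags.mean()  # fraction of rows flagged by each method
```

Comparing `share_flagged` across methods, and inspecting where the methods disagree, is one way to judge which detector fits the expected anomaly rate.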
4. Dimensionality Reduction & Visualization
- Used PCA (Principal Component Analysis) to reduce complexity and visualize anomaly clusters in 2D space.
- These plots were key in communicating insights to stakeholders without a technical background.
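A 2D projection of this kind can be produced roughly as follows. The data and the anomaly flags here are synthetic placeholders standing in for the scaled sensor features and the detector's output, and the helper names `project_2d` and `plot_anomalies` are hypothetical.

```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def project_2d(X):
    """Standardize, then project onto the first two principal components."""
    pca = PCA(n_components=2)
    coords = pca.fit_transform(StandardScaler().fit_transform(X))
    return coords, pca.explained_variance_ratio_


def plot_anomalies(coords, is_anomaly, ax=None):
    """Scatter normal and flagged points in the PC1/PC2 plane."""
    if ax is None:
        _, ax = plt.subplots(figsize=(6, 5))
    ax.scatter(coords[~is_anomaly, 0], coords[~is_anomaly, 1],
               s=8, alpha=0.4, label="normal")
    ax.scatter(coords[is_anomaly, 0], coords[is_anomaly, 1],
               s=20, color="red", label="flagged")
    ax.set_xlabel("PC 1")
    ax.set_ylabel("PC 2")
    ax.legend()
    return ax


# Synthetic stand-in: 500 rows, the first 10 shifted and marked as flagged.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 6))
X[:10] += 5.0
is_anomaly = np.zeros(500, dtype=bool)
is_anomaly[:10] = True

coords, variance_ratio = project_2d(X)
ax = plot_anomalies(coords, is_anomaly)
```

Outside a notebook, call `plt.show()` to display the figure. Reporting `variance_ratio` alongside the plot helps readers judge how much of the original variation the two axes actually capture.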
Key Insights Delivered
- Lubrication oil temperature emerged as a leading indicator of potential engine distress.
- Detected several patterns of co-occurring anomalies that strongly correlated with known failure conditions.
- Recommended real-time alerting thresholds (e.g., “flag RPM above 1300 only if coolant pressure > 4 bar and oil temp > 200°F”).
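The example threshold rule above can be written as a small per-reading check. The `Reading` type and `should_alert` function are hypothetical names, and the default limits simply restate the example values from the recommendation; a real deployment would tune them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    """One telemetry sample (hypothetical structure)."""
    rpm: float
    coolant_pressure_bar: float
    oil_temp_f: float


def should_alert(
    reading: Reading,
    rpm_limit: float = 1300.0,
    coolant_limit_bar: float = 4.0,
    oil_temp_limit_f: float = 200.0,
) -> bool:
    """Alert on high RPM only when coolant pressure and oil temp are also high."""
    return (
        reading.rpm > rpm_limit
        and reading.coolant_pressure_bar > coolant_limit_bar
        and reading.oil_temp_f > oil_temp_limit_f
    )


# High RPM alone is treated as normal operation; the combination is not.
accelerating = Reading(rpm=1450, coolant_pressure_bar=2.5, oil_temp_f=170)
overheating = Reading(rpm=1450, coolant_pressure_bar=4.6, oil_temp_f=215)
alerts = [should_alert(accelerating), should_alert(overheating)]
```

Keeping the rule as explicit comparisons makes each alert easy to explain to an engineer, since the triggering conditions can be read directly from the reading.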
Value to Stakeholders
- Reduced Downtime: By identifying anomalies early, the company could schedule maintenance before failures occurred.
- Improved Safety: Prevented in-operation engine failures, protecting both crew and cargo.
- Operational Trust: Designed models with interpretable logic, ensuring engineers could understand and trust alerts.
- Cost Savings: Enabled proactive parts replacement and repair planning, avoiding emergency costs and delivery penalties.
Core Skills Demonstrated
- Predictive Maintenance Modeling
- Anomaly Detection (Unsupervised ML)
- Statistical Outlier Detection (IQR)
- Multivariate Feature Engineering
- PCA & Data Visualization
- Python (Pandas, NumPy, Scikit-learn, Seaborn, Matplotlib)
- Business Analysis & Communication
Project Reflection
This project demonstrated that technical acumen and business understanding must go hand in hand. I learned that the best models are not just accurate; they are actionable and trusted by the people who use them. By blending data science with real-world reasoning, I built a system that goes beyond detecting anomalies to help prevent their costly consequences.