Training YOLOv11 for Smart Parking

I am deeply proud of Smart Park. It was my final-year project, and it represents the perfect intersection of sophisticated AI engineering and pragmatic product design.

The traditional approach to "smart parking" involves embedding physical sensors into every single parking bay—a solution that scales terribly and costs a fortune. I wanted to solve the same problem using an infrastructure-light approach: a single camera feed and a highly optimized object detection model.

The Vision Pipeline

The architecture is built for real-time inference on relatively standard cloud hardware. The pipeline begins with a rigorously annotated dataset of parking lots captured from various angles and times of day. This data was used to fine-tune the YOLOv11 model, prioritizing detection accuracy over raw confidence scores.
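Annotation tooling typically exports boxes as pixel-space corners, while YOLO expects normalized center coordinates. Below is a minimal sketch of that conversion step; the function name `to_yolo_label` is my own illustration, not part of the project's codebase, but the output matches the standard YOLO label format.

```python
def to_yolo_label(cls_id, x1, y1, x2, y2, img_w, img_h):
    """Convert a pixel-space box (top-left / bottom-right corners) into a
    YOLO label line: class x_center y_center width height, all normalized
    to [0, 1] by the image dimensions."""
    cx = (x1 + x2) / 2 / img_w
    cy = (y1 + y2) / 2 / img_h
    w = (x2 - x1) / img_w
    h = (y2 - y1) / img_h
    return f"{cls_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"
```

Each annotated frame then becomes one `.txt` file with a line like this per vehicle, which the training pipeline consumes directly.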

The inference step is wrapped in a Flask API, which processes frames, detects bounding boxes, and calculates the exact spatial availability of the lot. The resulting analytics are written to a MySQL database, with frame data archived in AWS S3, so the system can scale to hundreds of cameras without blocking the main thread.
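The core of the availability calculation is matching detected vehicle boxes against a predefined map of parking bays. Here is a simplified sketch of that idea, assuming bays are axis-aligned rectangles and using a plain IoU overlap test; the 0.3 threshold and the helper names are illustrative assumptions, not the project's actual values.

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def bay_occupancy(bays, detections, thresh=0.3):
    """Mark each bay occupied if any detected vehicle overlaps it enough.

    bays: {bay_id: (x1, y1, x2, y2)} static map of the lot.
    detections: list of (x1, y1, x2, y2) boxes from the model.
    """
    return {
        bay_id: any(iou(bay, det) >= thresh for det in detections)
        for bay_id, bay in bays.items()
    }
```

The free-bay count is then just the number of `False` values in that dictionary, which is what gets persisted per frame.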

The model itself wasn't the hard part. The real engineering challenge was predicting the physical world.

Designing for Edge Cases

Getting a computer vision model to recognize a cleanly parked sedan at 2:00 PM on a sunny day is trivial. The friction—where the actual engineering occurs—is in the edge cases.

How does the model react when heavy rain distorts the camera lens? What happens when a motorcycle parks horizontally across two spots? How do long dynamic shadows cast by trees at 5:00 PM affect the bounding boxes? Solving these issues required rigorous data augmentation and tweaking the confidence thresholds dynamically based on perceived ambient lighting.
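The dynamic thresholding can be pictured as a simple interpolation: estimate the frame's mean brightness, then relax the confidence cut-off as the scene darkens. This is a toy sketch of that idea only; the brightness cut-offs (40 and 160 on a 0-255 grayscale) and the confidence bounds are assumptions for illustration, not the tuned values from the project.

```python
def mean_brightness(gray_frame):
    """Average intensity of a 2-D grid of 0-255 grayscale pixel values."""
    pixels = [p for row in gray_frame for p in row]
    return sum(pixels) / len(pixels)

def adaptive_threshold(brightness, dark_conf=0.35, bright_conf=0.50,
                       dark_level=40.0, bright_level=160.0):
    """Linearly interpolate the detection confidence threshold between a
    permissive value for dark scenes and a stricter one for bright scenes."""
    t = (brightness - dark_level) / (bright_level - dark_level)
    t = max(0.0, min(1.0, t))  # clamp outside the calibration range
    return dark_conf + t * (bright_conf - dark_conf)
```

In this scheme, a well-lit afternoon frame is filtered at the strict threshold, while a dim evening frame keeps lower-confidence detections that would otherwise be dropped.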

Currently, the system sustains a steady 15 frames per second on a standard AWS EC2 instance, which is more than enough temporal resolution for a parking lot.

Building the model and the backend infrastructure was phase one. For V2, the goal isn't better AI—it's better UX. I plan to build out a rich frontend dashboard that visualizes this data, turning a complex tensor array into something a parking attendant can actually use.