State-of-the-art models for semantic segmentation are based on adaptations of convolutional networks originally designed for image classification. However, dense prediction problems like semantic segmentation differ structurally. This paper develops a new convolutional network module using dilated convolutions to systematically aggregate multi-scale contextual information without losing resolution. It supports exponential expansion of the receptive field and improves the accuracy of semantic segmentation systems. The authors also simplify classification networks adapted for dense prediction and demonstrate improved accuracy.