|Twitter generates a large volume of real-time data in the form of microblogs, which hold potential knowledge for applications such as traffic incident analysis and urban planning. Social media data represents unbiased, first-hand insight into citizens’ concerns and can be mined to help make cities smarter. This study proposes a computational framework that uses word embeddings and a machine learning model to detect traffic incidents from social media data. It examines the feasibility of using machine learning algorithms with different feature extraction and representation models to identify traffic incidents from Twitter interactions. The proposed approach combines four steps. First, a dictionary of traffic-related keywords is formed. Second, real-time Twitter data is collected using this dictionary. Third, the collected tweets are pre-processed and a feature generation model is applied to convert the dataset into a form suitable for a machine learning classifier. Finally, a machine learning model is trained and tested to identify tweets containing traffic incidents. The results show that machine learning models built on the right feature extraction strategy are very promising for identifying tweets containing traffic incidents from microblogs.|
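The four steps described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the keyword dictionary, the six labelled tweets, and all function names are hypothetical and do not come from the study. To keep the example dependency-free, bag-of-words counts stand in for the word embeddings and a plain perceptron stands in for the study's classifier, and tweets are read from an in-memory list rather than collected from Twitter.

```python
import re
from collections import Counter

# Step 1: hypothetical keyword dictionary (the study's actual list is not given).
TRAFFIC_KEYWORDS = {"accident", "crash", "traffic", "jam", "collision", "road"}


def preprocess(tweet):
    """Lowercase, drop URLs and @mentions, and keep alphabetic tokens only."""
    tweet = re.sub(r"https?://\S+|@\w+", " ", tweet.lower())
    return re.findall(r"[a-z]+", tweet)


def matches_dictionary(tweet):
    """Step 2 filter: keep tweets that mention at least one dictionary keyword."""
    return any(tok in TRAFFIC_KEYWORDS for tok in preprocess(tweet))


def featurize(tokens, vocab):
    """Step 3: count vector over a fixed vocabulary (stand-in for embeddings)."""
    counts = Counter(tokens)
    return [counts[word] for word in vocab]


def score(model, x):
    weights, bias = model
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def predict(model, x):
    """Return 1 (traffic incident) or 0 (no incident)."""
    return 1 if score(model, x) > 0 else 0


def train_perceptron(X, y, epochs=20):
    """Step 4: perceptron training; a simple stand-in classifier."""
    model = ([0.0] * len(X[0]), 0.0)
    for _ in range(epochs):
        for x, target in zip(X, y):
            err = target - predict(model, x)
            if err:
                weights, bias = model
                model = ([w + err * xi for w, xi in zip(weights, x)], bias + err)
    return model


# Made-up labelled tweets: 1 = traffic incident, 0 = keyword match but no incident.
LABELLED = [
    ("Major accident on highway 5, two cars crashed http://t.co/x", 1),
    ("Collision near the bridge, road blocked @citypolice", 1),
    ("Traffic jam after a crash on main road", 1),
    ("My internet traffic is so slow today", 0),
    ("Road trip playlist is ready", 0),
    ("Made strawberry jam with grandma", 0),
]

TOKENS = [preprocess(text) for text, _ in LABELLED]
VOCAB = sorted({tok for toks in TOKENS for tok in toks})
X = [featurize(toks, VOCAB) for toks in TOKENS]
y = [label for _, label in LABELLED]
MODEL = train_perceptron(X, y)


def classify(tweet):
    """Run the full pipeline on one tweet; non-matching tweets are labelled 0."""
    if not matches_dictionary(tweet):
        return 0
    return predict(MODEL, featurize(preprocess(tweet), VOCAB))
```

The negative examples deliberately contain dictionary keywords ("traffic", "road", "jam") used in a non-incident sense. This is the reason a classifier is needed after the keyword filter: keyword matching alone cannot separate such tweets from real incident reports.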
*** Title, author list and abstract as seen in the Camera-Ready version of the paper that was provided to the Conference Committee. Small changes that may have occurred during processing by Springer may not appear in this window.