Evaluating advanced Air Traffic Management (ATM) concepts is challenging because of limitations in existing scenario generation methodologies. Rigorous evaluation of these concepts on safety metrics, across a variety of complex scenarios, can provide insight into their performance and help improve them as new concepts are developed. In this work, I propose an air traffic simulation system, with a novel representation of airspace, which can prototype advanced ATM concepts. I then propose a novel evolutionary computation methodology to algorithmically generate conflict scenarios of increasing complexity in order to evaluate conflict detection algorithms. I illustrate the methodology through a quantitative evaluation of three conflict detection algorithms on safety metrics. I then propose the use of data mining techniques to discover interesting relationships that may exist implicitly in the algorithms' performance data. These relationships are expressed as a predictive model of an algorithm's vulnerability, which can then be included in an ensemble that can minimize the overall vulnerability of the system.