Perform data visualisation with multi-dimensional histograms and cluster plots to understand the behaviour (distribution) of the raw datasets.
Implement a range of machine learning algorithms on the extracted dataset for testing purposes, including SVM, CLUSEQ, HMMs, time series models and k-NN.
Adapt a new mathematical algorithm to fit the special data structure (series of dates and sequences) in order to optimise the insurance fraud detection methods.