Abstract:
According to some embodiments, a system includes: a communication device operative to communicate with a user to receive a data set including a plurality of samples; a clustering module to receive the data set, store the data set, and calculate one or more clusters of samples using a clustering strategy; an optimization module to receive and store the one or more clusters of samples from the clustering module and generate one or more samples from the one or more clusters of samples using an optimization strategy; a memory for storing program instructions; and at least one sample selection platform processor, coupled to the memory and in communication with the clustering module and the optimization module, operative to execute the program instructions to: calculate the one or more clusters of samples based on the clustering strategy by executing the clustering module; analyze data associated with the one or more clusters received from the clustering module, using the optimization strategy associated with the optimization module, to automatically select one or more samples from the one or more clusters; and provide the one or more samples generated by the optimization module for replication in a validation model. Numerous other aspects are provided.
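
The abstract does not name a particular clustering strategy or optimization strategy. As a purely illustrative sketch of the cluster-then-select flow, the following assumes k-means as the clustering strategy and "pick the sample nearest each cluster centroid" as the optimization strategy. Both choices, and the names `kmeans` and `select_representatives`, are hypothetical and are not taken from the source.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def kmeans(samples, k, iterations=20):
    """Plain Lloyd's k-means (assumed clustering strategy).

    Centroids are initialized to the first k samples so the run is deterministic.
    """
    centroids = [tuple(s) for s in samples[:k]]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each sample to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for s in samples:
            nearest = min(range(k), key=lambda i: dist(s, centroids[i]))
            clusters[nearest].append(s)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return clusters, centroids


def select_representatives(clusters, centroids):
    """Assumed optimization strategy: one sample per non-empty cluster,
    chosen as the member closest to that cluster's centroid."""
    return [
        min(members, key=lambda s: dist(s, c))
        for members, c in zip(clusters, centroids)
        if members
    ]


# Hypothetical data set: two well-separated groups of 2-D samples.
data = [(1.0, 1.0), (9.0, 9.0), (1.2, 0.8), (0.8, 1.2), (9.2, 8.8), (8.8, 9.2)]
clusters, centroids = kmeans(data, k=2)
selected = select_representatives(clusters, centroids)
print(selected)  # [(1.0, 1.0), (9.0, 9.0)]
```

In this sketch the selected samples stand in for the ones the system would pass on for replication in a validation model. A real embodiment could use any other clustering or selection rule in place of the two assumed here.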