It seems that classifiers trained on adversarial examples learn more conservative concept boundaries.
We also found that adversarial training changed the learned weights substantially: the weights of the adversarially trained model were markedly more localized and interpretable.