We typically think of machine learning models as fitting two different parts of the training data: the underlying generalizable truth (the signal) and the randomness specific to that particular dataset (the noise).
Fitting either part increases training-set accuracy, but fitting the signal also increases test-set accuracy (and real-world performance), while fitting the noise degrades both. So we use techniques like regularization and dropout to make the noise harder to fit, and the signal correspondingly more likely to be what gets learned.
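As a minimal sketch of how both pressures show up in practice (using PyTorch, which the original doesn't specify, and hyperparameters chosen only for illustration), each is roughly a one-line choice: dropout inside the model and weight decay on the optimizer.

```python
import torch
import torch.nn as nn

# A small MLP with dropout between layers; dropout randomly zeroes
# activations during training, which makes it harder for the network
# to memorize dataset-specific noise.
model = nn.Sequential(
    nn.Linear(20, 64),
    nn.ReLU(),
    nn.Dropout(p=0.5),   # illustrative rate; tune per problem
    nn.Linear(64, 2),
)

# weight_decay adds an L2 penalty on the weights (classic regularization),
# biasing the fit toward simpler functions that capture signal.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)

# Dropout is only active in training mode; switch it off for evaluation.
model.train()   # dropout on
model.eval()    # dropout off
```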
Just increasing the amount of noise in the training data is one such approach, but it seems unlikely to be as useful as more targeted techniques. Compare random jitter to adversarial training, for example: the former improves robustness slowly and indirectly, whereas the latter improves it directly and dramatically, because it perturbs inputs precisely where the model is currently weakest.
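To make that contrast concrete, here is a sketch (PyTorch again, with FGSM standing in for the adversarial approach; `model`, a batch `x`, and labels `y` are assumed from context): random jitter perturbs inputs blindly, while the adversarial step perturbs them in exactly the direction the current model gets most wrong.

```python
import torch
import torch.nn.functional as F

def jittered(x, sigma=0.1):
    # Random jitter: add isotropic Gaussian noise, unaware of the model.
    return x + sigma * torch.randn_like(x)

def fgsm_adversarial(model, x, y, eps=0.1):
    # FGSM: perturb each input in the direction that maximally increases
    # the loss for the current model, so training on it targets the
    # model's actual weaknesses rather than arbitrary directions.
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    return (x + eps * x.grad.sign()).detach()
```

Training on `fgsm_adversarial(model, x, y)` batches is the standard adversarial-training loop; training on `jittered(x)` is plain noise augmentation.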