Even a simple multilayer perceptron can sort input data to some extent, as you can see here and here.
However, neural networks for sequential data seem more appropriate, since they can handle sequences of variable length. Sorting has been demonstrated with an LSTM (Long Short-Term Memory), an LSTM+HAM (Hierarchical Attentive Memory), and an NTM (Neural Turing Machine).
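
To make the variable-length point concrete, here is a minimal sketch of how sorting might be framed as a supervised task for a sequence model: each input is a list of random numbers and the target is the same list sorted. The helper names (`make_example`, `make_batch`), the padding value, and the mask convention are assumptions for illustration, not taken from any of the works mentioned above.

```python
import random

PAD = 0.0  # hypothetical padding value; padded positions are flagged in the mask


def make_example(length, rng):
    """Return one (input, target) pair: random floats and the same floats sorted."""
    xs = [rng.random() for _ in range(length)]
    return xs, sorted(xs)


def make_batch(batch_size, min_len, max_len, rng):
    """Build a padded batch of variable-length sorting examples.

    Returns (inputs, targets, mask). Each is a list of batch_size rows with
    max_len entries; mask holds 1 for real elements and 0 for padding.
    """
    inputs, targets, mask = [], [], []
    for _ in range(batch_size):
        n = rng.randint(min_len, max_len)  # inclusive on both ends
        xs, ys = make_example(n, rng)
        pad = [PAD] * (max_len - n)
        inputs.append(xs + pad)
        targets.append(ys + pad)
        mask.append([1] * n + [0] * (max_len - n))
    return inputs, targets, mask


if __name__ == "__main__":
    rng = random.Random(0)
    inputs, targets, mask = make_batch(batch_size=3, min_len=2, max_len=5, rng=rng)
    for x, y, m in zip(inputs, targets, mask):
        print(x, y, m)
```

A fixed-size multilayer perceptron would need every example to have the same length, whereas a recurrent model can consume these rows step by step and use the mask to ignore padded positions when computing the loss.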