Simultaneous translation, which translates concurrently with the source-language speech, is widely used in many scenarios, including multilateral organizations. However, it is well known to be one of the most challenging tasks for humans, due to the simultaneous perception and production in two languages. Simultaneous translation is likewise notoriously difficult for machines, and has remained one of the holy grails of AI. The key challenge is the word-order difference between the source and target languages. There have been efforts towards genuine simultaneous translation, but all of them share the following major limitations: (a) none can achieve an arbitrary given latency; (b) their base translation models are still trained on full sentences; and (c) their systems are complicated, involving many components, and are difficult to train. In this thesis, we start by introducing several simultaneous translation approaches that differ along two orthogonal dimensions: fixed vs. adaptive latency policies, and whether the model is trained on full sentences. Then, we investigate how to improve simultaneous translation with beam search, which is universally used in full-sentence translation but non-trivial to apply in simultaneous translation. Finally, we explore speech-to-speech simultaneous interpretation by incorporating streaming ASR and incremental TTS.
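To make the notion of a fixed latency policy concrete, the sketch below simulates a wait-k style policy, the canonical fixed-latency scheme: the decoder first reads k source tokens, then alternates between emitting a target token and reading the next source token. The function names, the assumption of equal source and target lengths, and the value k=2 are illustrative choices for this demo, not details taken from the thesis itself.

```python
def waitk_action(k, src_read, tgt_written, src_finished):
    """Decide the next action under a fixed-latency wait-k policy:
    READ until k source tokens are ahead of the target, then WRITE."""
    if not src_finished and src_read < tgt_written + k:
        return "READ"
    return "WRITE"

# Simulate k=2 on a 5-token source sentence
# (assuming, for simplicity, a target of the same length).
src_len, k = 5, 2
src_read = tgt_written = 0
actions = []
while tgt_written < src_len:
    a = waitk_action(k, src_read, tgt_written, src_read >= src_len)
    actions.append(a)
    if a == "READ":
        src_read += 1
    else:
        tgt_written += 1

# After the initial k READs, the policy strictly alternates WRITE/READ,
# finishing with a tail of WRITEs once the source is exhausted.
print(actions)
```

Note how the latency is controlled directly by k: every target token is emitted exactly k source tokens behind, which is what allows a fixed policy to hit a prescribed lag, unlike adaptive policies that decide READ/WRITE dynamically from the model state.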