Automatic speaker verification (ASV) systems are exposed to spoofing attacks which may compromise their security. While anti-spoofing techniques have mainly been studied for clean scenarios, it has been shown that they perform poorly in noisy environments. In this work, we aim to improve the performance of spoofing detection for ASV in both clean and noisy scenarios. To achieve this, we first propose the use of Gated Recurrent Convolutional Neural Networks (GRCNNs) as a deep feature extractor to robustly represent speech signals as utterance-level embeddings, which are later used by a back-end recognizer for the final genuine/spoofed classification. Then, to enhance the robustness of the system in noisy conditions, we propose the use of signal-to-noise masks (SNMs) as new input features that inform the anti-spoofing system about the time-frequency regions of the input spectral features which are most affected by noise and, hence, should be neglected when computing the embeddings. To evaluate our proposals, experiments were carried out on the clean and noisy versions of the ASVspoof 2015 corpus for detecting logical access attacks, as well as on the ASVspoof 2017 database for detecting replay attacks. Additional results are provided for the ASVspoof 2019 corpus, covering both logical and physical access scenarios. The experimental results show that our proposal clearly outperforms well-known methods based on classical features as well as other similar deep feature-based systems in both clean and noisy conditions.
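To illustrate the idea behind the signal-to-noise masks, the following is a minimal sketch (not the paper's exact procedure) of how a binary time-frequency mask could be derived from a power spectrogram: noise power is estimated from a few leading frames assumed to be speech-free, a per-bin SNR is computed, and noise-dominated bins are zeroed out. The function name `snm_mask` and its parameters are hypothetical.

```python
import numpy as np

def snm_mask(power_spec, noise_frames=10, snr_threshold_db=0.0):
    """Sketch of a binary signal-to-noise mask (SNM) over a power
    spectrogram of shape (freq_bins, time_frames).

    Assumption (not from the paper): noise power is estimated from the
    first `noise_frames` frames; bins whose local SNR falls below the
    threshold are marked 0 (noise-dominated, to be neglected).
    """
    # per-bin noise floor estimated from the leading frames
    noise_power = power_spec[:, :noise_frames].mean(axis=1, keepdims=True)
    # per-bin SNR in dB, with small constants to avoid log(0)
    snr_db = 10.0 * np.log10(power_spec / (noise_power + 1e-10) + 1e-10)
    # 1 = signal-dominated bin, 0 = noise-dominated bin
    return (snr_db > snr_threshold_db).astype(np.float32)

# toy example: 4 frequency bins x 20 frames of noise plus a strong burst
rng = np.random.default_rng(0)
spec = rng.uniform(0.5, 1.5, size=(4, 20))
spec[1:3, 12:16] += 20.0  # inject high-energy "speech" bins
mask = snm_mask(spec)
```

In a setup like this, the mask would accompany the spectral features at the network input, letting the embedding extractor discount the zeroed regions.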