Distantly supervised relation extraction is an effective way to find relational facts in text. However, distant supervision inevitably produces wrongly labeled sentences, and these noisy sentences degrade the performance of relation extraction models. Although the existing piecewise convolutional neural network model with sentence-level attention (PCNN+ATT) is an effective way to reduce the effect of noisy sentences, it still has two limitations. First, it adopts a PCNN module as the sentence encoder, which captures only the local contextual features of words and might lose important information. Second, it neglects the fact that not all words contribute equally to the semantics of a sentence. To address these two issues, we propose a hierarchical attention-based bidirectional GRU (HA-BiGRU) model. For the first limitation, our model uses a BiGRU module in place of the PCNN so as to extract global contextual information. For the second limitation, our model combines word-level and sentence-level attention mechanisms, which help to obtain accurate sentence representations. To further alleviate the wrong-labeling problem, we first calculate the co-occurrence probabilities (CP) between the shortest dependency path (SDP) and the relation labels. Based on these co-occurrence probabilities, we propose two denoising strategies to reduce noise interference: one filters the labeled data, and the other integrates the CP information into the model. Experimental results on the Freebase and New York Times corpus (Freebase+NYT) show that the HA-BiGRU model outperforms the baseline models, and that the two CP-based denoising strategies improve the robustness of the HA-BiGRU model.
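To illustrate the filtering strategy, the following is a minimal sketch rather than the implementation used in this work. It assumes that the CP of an SDP and a relation label is estimated as the relative frequency of that label among all training instances sharing the SDP, that each SDP is represented as a plain string, and that instances whose CP falls below a fixed threshold are discarded. The estimator, the function names, the toy data, and the threshold value are all assumptions made for illustration.

```python
from collections import Counter


def cooccurrence_probabilities(examples):
    """Estimate CP(sdp, relation) as count(sdp, relation) / count(sdp).

    `examples` is a list of (sdp, relation) pairs. This relative-frequency
    estimator is an assumption; the abstract does not specify the formula.
    """
    pair_counts = Counter(examples)
    sdp_counts = Counter(sdp for sdp, _ in examples)
    return {
        (sdp, rel): n / sdp_counts[sdp]
        for (sdp, rel), n in pair_counts.items()
    }


def filter_by_cp(examples, cp, threshold):
    """Keep only the examples whose (sdp, relation) pair has CP >= threshold."""
    return [ex for ex in examples if cp[ex] >= threshold]


# Toy distantly labeled data (hypothetical SDP strings and relation labels).
examples = [
    ("born_in", "place_of_birth"),
    ("born_in", "place_of_birth"),
    ("born_in", "place_of_birth"),
    ("born_in", "nationality"),  # likely a wrong label for this path
    ("works_for", "employer"),
]

cp = cooccurrence_probabilities(examples)
kept = filter_by_cp(examples, cp, threshold=0.5)  # threshold is an assumption
```

In this toy setting the pair ("born_in", "nationality") receives a low CP and is filtered out, while the remaining instances are kept. The second strategy, which integrates CP information into the model, would instead pass such scores to the network rather than discard instances; it is not sketched here because the abstract does not describe how the integration is done.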