Research into RGB-D (color plus depth) salient object detection (SOD) has identified the challenging problem of how to exploit raw depth features and fuse cross-modal (CM) information. To address this problem, we propose an interactive nonlocal joint learning (INL-JL) network for high-quality RGB-D SOD. INL-JL benefits from three key components. First, we carry out joint learning to extract common features from RGB and depth images. Second, we adopt simple yet effective CM fusion blocks at lower levels while leveraging the proposed INL blocks at higher levels, aiming to purify the depth features and make CM fusion more efficient. Third, we utilize a dense multiscale transfer strategy to infer saliency maps. INL-JL outperforms state-of-the-art methods on five public datasets, demonstrating its power to improve the quality of RGB-D SOD.
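The abstract does not give the INL block's equations, but the general idea of nonlocal cross-modal fusion can be sketched. The following is a minimal NumPy illustration of generic nonlocal (attention-style) fusion, assuming queries from RGB features and keys/values from depth features so that RGB context re-weights the raw depth signal; the function name and the single-head, projection-free form are illustrative assumptions, not the paper's actual INL block.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def nonlocal_cm_fusion(f_rgb, f_depth):
    """Illustrative nonlocal cross-modal fusion (NOT the paper's exact INL block).

    f_rgb, f_depth: feature maps of shape (C, H, W).
    Queries come from RGB and keys/values from depth, so every RGB
    position aggregates depth features from all spatial positions,
    weighted by RGB-depth affinity ("purifying" the raw depth cues).
    """
    c, h, w = f_rgb.shape
    q = f_rgb.reshape(c, h * w)          # queries  (C, N)
    k = f_depth.reshape(c, h * w)        # keys     (C, N)
    v = f_depth.reshape(c, h * w)        # values   (C, N)
    # Affinity between each RGB position and every depth position.
    attn = softmax(q.T @ k / np.sqrt(c), axis=-1)   # (N, N), rows sum to 1
    fused = (v @ attn.T).reshape(c, h, w)           # depth aggregated per RGB query
    return f_rgb + fused                 # residual keeps the RGB identity path

# Usage sketch on random features:
rng = np.random.default_rng(0)
f_rgb = rng.standard_normal((8, 4, 4))
f_depth = rng.standard_normal((8, 4, 4))
out = nonlocal_cm_fusion(f_rgb, f_depth)
```

Real implementations would add learned 1x1 projections for q/k/v and typically apply such blocks only at higher (coarser) levels, where the N x N affinity matrix stays affordable.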