An Investigation into Incorporating Visual Information in Audio Processing