Most likely your C implementation was simply suboptimal.