Low-level vision tasks pose an outstanding computational challenge: pixel-wise operations require high-performance architectures to achieve real-time processing. Nowadays, diverse technologies offer a high degree of parallelism, enabling researchers to address increasingly complex on-chip low-level vision-feature extraction. In the state of the art, various architectures have been described that process single vision modalities in real time, but multiple computer vision modalities are seldom computed jointly on a single device to produce a general-purpose on-chip low-level vision system that could serve as the basis for mid-level or high-level vision tasks. We present here a novel architecture for multiple vision-feature extraction that includes multiscale optical flow, disparity, energy, orientation, and phase. Adopting phase-based models yields a high degree of robustness in real-life situations, at the cost of relatively high computing-resource requirements. The high flexibility of the reconfigurable devices used allows the exploration of different hardware configurations to meet final target and user requirements. Using this novel architecture and hardware-sharing techniques, we describe a co-processing-board implementation as a case study. It reaches an outstanding computing power of 92.3 GigaOPS at very low power consumption (approximately 12.9 GigaOPS/W).
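As a rough sanity check, and assuming the throughput and efficiency figures quoted above refer to the same operating point, the implied power draw of the board is
\[
P \approx \frac{92.3\ \text{GigaOPS}}{12.9\ \text{GigaOPS/W}} \approx 7.2\ \text{W}.
\]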