The continued quest for a low-power, high-performance hardware algorithm for signed-number multiplication led to the design of a simple and novel radix-8 signed-number multiplier that uses 3-bit grouping and performs partial product reduction on the magnitudes of the multiplicand and the multiplier. The pre-computation stage comprises magnitude calculation and the non-trivial computations required to generate partial products. A new partial product reduction strategy is deployed in the design to improve speed at low cost. Designs of size 8 × 8, 16 × 16, 32 × 32, and 64 × 64 are presented for the proposed architectures. Performance results include area, power, delay, and power-delay product of synthesized and post-layout designs in 32 nm CMOS technology with a 1.05 V supply voltage.
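
As a rough behavioral illustration of the idea, the following Python sketch multiplies two signed integers by working on their magnitudes and scanning the multiplier magnitude in 3-bit (radix-8) groups. It is a minimal software model under stated assumptions, not the hardware architecture itself: the function name `radix8_multiply`, the `width` parameter, the choice to treat 3x, 5x, and 7x as the "non-trivial" pre-computed multiples, and the plain sequential accumulation (standing in for the partial product reduction strategy, which is not detailed here) are all assumptions made for illustration.

```python
def radix8_multiply(a: int, b: int, width: int = 8) -> int:
    """Behavioral sketch: signed multiply via radix-8 digits of the magnitudes."""
    lo, hi = -(1 << (width - 1)), (1 << (width - 1)) - 1
    if not (lo <= a <= hi and lo <= b <= hi):
        raise ValueError(f"operands must fit in {width}-bit two's complement")

    # Pre-computation, part 1: sign of the result and operand magnitudes.
    negative = (a < 0) != (b < 0)
    mag_a, mag_b = abs(a), abs(b)

    # Pre-computation, part 2: multiples 0..7 of the multiplicand magnitude.
    # 2x and 4x are plain shifts; 3x, 5x, and 7x each need one add or subtract
    # (assumed here to be the "non-trivial" computations); 6x is 3x shifted.
    x3 = mag_a + (mag_a << 1)
    x5 = mag_a + (mag_a << 2)
    x7 = (mag_a << 3) - mag_a
    multiples = [0, mag_a, mag_a << 1, x3, mag_a << 2, x5, x3 << 1, x7]

    # Partial product generation: one partial product per 3-bit group.
    # Accumulation is a simple running sum in this sketch.
    product = 0
    shift = 0
    while mag_b:
        digit = mag_b & 0b111          # current radix-8 digit (0..7)
        product += multiples[digit] << shift
        mag_b >>= 3
        shift += 3

    # Restore the sign at the end.
    return -product if negative else product
```

For example, `radix8_multiply(-7, 5)` selects the pre-computed 5x multiple of 7 and negates the result; passing `width=16` models the 16 × 16 case in the same way.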