SPO 600 Lab 5 – SIMD Lab (Part 1)

In this lab, we used the same vol1.c file from the previous lab, this time on an AArch64 system.

Part 1: Auto-Vectorization

1. Modify the Makefile so that vol1.c is compiled with the option "-fopt-info-vec-all"

# list all binaries in this next line
BINARIES = vol1
CCOPTS = -g -O3 -fopt-info-vec-all

all:	${BINARIES}

vol1:	vol1.c vol.h
	gcc ${CCOPTS} vol1.c -o vol1

test1:	vol1
	bash -c "time ./vol1"

# target to test all binaries
test:	test1 

gdb1:	vol1
	gdb vol1

clean:	
	rm ${BINARIES} || true

2. Review the compiler output

Analyzing loop at vol1.c:32
vol1.c:32:2: note: ===== analyze_loop_nest =====
vol1.c:32:2: note: === vect_analyze_loop_form ===
vol1.c:32:2: note: === get_loop_niters ===
vol1.c:32:2: note: === vect_analyze_data_refs ===
vol1.c:32:2: note: got vectype for stmt: _12 = *_11;
vector(8) short int
vol1.c:32:2: note: got vectype for stmt: *_11 = _40;
vector(8) short int
vol1.c:32:2: note: === vect_analyze_scalar_cycles ===
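
The source of vol1.c is not reproduced in this post, so the following is only a sketch of the kind of loop that leads to a "vector(8) short int" report: every iteration loads one 16-bit sample, scales it, and stores it back, and no iteration depends on another. A 128-bit vector register holds eight such samples. The names scale_sample and scale_all, and the idea of a float volume factor, are assumptions for illustration and are not taken from the lab code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: scale one 16-bit sample by a volume factor. */
static inline int16_t scale_sample(int16_t sample, float volume)
{
	return (int16_t)(volume * (float)sample);
}

/* Each iteration reads and writes only data[x], so the iterations are
   independent and the compiler may process several samples at once. */
void scale_all(int16_t *data, size_t n, float volume)
{
	for (size_t x = 0; x < n; x++) {
		data[x] = scale_sample(data[x], volume);
	}
}
```

Compiling a file like this with -O3 -fopt-info-vec-all is one way to check whether GCC reports the loop as vectorized on a given system.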

Only one loop was vectorized; the rest were not. Here is an example of a loop that was not vectorized:

vol1.c:43:2: note: === vect_analyze_data_refs ===
vol1.c:43:2: note: not vectorized: not enough data-refs in basic block.

3. Examine the output to see which loop(s) are vectorized

vol1.c:32:2: note: add new stmt: MEM[(int16_t *)vectp_data.8_69] = vect__40.6_72;
vol1.c:32:2: note: ------>vectorizing statement: x_35 = x_52 + 1;
vol1.c:32:2: note: ------>vectorizing statement: # DEBUG x => x_35
vol1.c:32:2: note: ------>vectorizing statement: # DEBUG x => x_35
vol1.c:32:2: note: ------>vectorizing statement: ivtmp_74 = ivtmp_75 - 1;
vol1.c:32:2: note: ------>vectorizing statement: vectp_data.0_44 = vectp_data.0_45 + 16;
vol1.c:32:2: note: ------>vectorizing statement: vectp_data.8_70 = vectp_data.8_69 + 16;
vol1.c:32:2: note: ------>vectorizing statement: if (ivtmp_74 != 0)
vol1.c:32:2: note: New loop exit condition: if (ivtmp_64 < 625000)
vol1.c:32:2: note: LOOP VECTORIZED
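
The numbers in this log fit together. The vectp_data pointers advance by 16 per iteration, which matches eight 16-bit samples (the "vector(8) short int" type reported earlier) at two bytes each. The new exit condition of 625000 iterations would then cover 5,000,000 samples, assuming the sample count divides evenly by eight; the sample count itself is not stated in this post. The small helpers below are hypothetical and only spell out that arithmetic.

```c
#include <stddef.h>
#include <stdint.h>

#define VECTOR_BITS 128 /* width of an AArch64 Advanced SIMD register */

/* Number of int16_t lanes in one vector: 128 / 16 = 8. */
static inline size_t lanes_per_vector(void)
{
	return VECTOR_BITS / (8 * sizeof(int16_t));
}

/* Bytes consumed per vector iteration: 8 lanes * 2 bytes = 16. */
static inline size_t bytes_per_iteration(void)
{
	return lanes_per_vector() * sizeof(int16_t);
}

/* Samples covered by a given number of full vector iterations. */
static inline size_t samples_covered(size_t vector_iterations)
{
	return vector_iterations * lanes_per_vector();
}
```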

Resulting times

bash -c "time ./vol1"
Result: 94

real	0m0.527s
user	0m0.505s
sys	0m0.021s

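4. Modify the code so that one more loop is vectorized

The modified code was built as vol2. The compiler output shows that the loop at vol2.c:38 is now vectorized as well: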

vol2.c:38:2: note: ------>vectorizing statement: # DEBUG ttl => ttl_35
vol2.c:38:2: note: ------>vectorizing statement: # DEBUG x => x_36
vol2.c:38:2: note: ------>vectorizing statement: ivtmp_72 = ivtmp_73 - 1;
vol2.c:38:2: note: ------>vectorizing statement: vectp_data.1_45 = vectp_data.1_46 + 16;
vol2.c:38:2: note: ------>vectorizing statement: if (ivtmp_72 != 0)
vol2.c:38:2: note: New loop exit condition: if (ivtmp_23 < 625000)
vol2.c:38:2: note: LOOP VECTORIZED

Resulting times

bash -c "time ./vol2"
Result: -42223906

real	0m0.477s
user	0m0.466s
sys	0m0.011s
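
The change that produced vol2.c is not shown in this post. As a hedged illustration only: a summing loop that applies an extra operation to the running total on every step (a modulo, for instance) makes each iteration depend on the previous result, which typically keeps GCC from vectorizing it, while a plain accumulation is a reduction the vectorizer can usually handle. Both functions below are hypothetical and are not the lab's code. They also return different values for the same input, which would be consistent with vol1 and vol2 printing different Result lines.

```c
#include <stddef.h>
#include <stdint.h>

/* Running total reduced modulo 1000 on every step: each iteration needs
   the previous iteration's result, which typically blocks vectorization. */
int32_t sum_with_modulo(const int16_t *data, size_t n)
{
	int32_t ttl = 0;
	for (size_t x = 0; x < n; x++) {
		ttl = (ttl + data[x]) % 1000;
	}
	return ttl;
}

/* Plain accumulation: a simple reduction. For very large inputs a wider
   accumulator would be safer against overflow. */
int32_t sum_plain(const int16_t *data, size_t n)
{
	int32_t ttl = 0;
	for (size_t x = 0; x < n; x++) {
		ttl += data[x];
	}
	return ttl;
}
```

If the two versions are meant to be compared on speed alone, it is worth keeping in mind that they no longer compute the same value.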

Published by cindyledev

Full Stack Developer, Computer Programmer, and Analyst
