Sebastian [Tue, 12 Aug 2014 20:47:45 +0000 (20:47 +0000)]
remove useless delay(), since ping-pong works now
Sebastian [Tue, 12 Aug 2014 20:47:22 +0000 (20:47 +0000)]
Makefile: don't print $(EOBJS) in help screen
Sebastian [Wed, 30 Jul 2014 18:46:05 +0000 (18:46 +0000)]
shared: don't pack shm_t, but align it instead
Accesses to the poll flag or the iteration counter no longer
result in four single-byte accesses. Sigh.
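
A minimal sketch of the difference, assuming a GCC-compatible compiler. The struct below is hypothetical (the real shm_t has other fields); only the idea of replacing `packed` with `aligned` comes from the commit. With `packed`, the compiler has to assume byte alignment for every member and, on targets without unaligned loads, may split a 32-bit access into single-byte accesses. With natural alignment plus an explicit `aligned` attribute, each field can be read with one word access.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for shm_t; field names and sizes are assumptions. */
typedef struct {
    uint32_t pollflag;    /* handshake flag between host and target */
    uint32_t iteration;   /* iteration counter */
    float    payload[4];  /* placeholder for the actual data */
} __attribute__((aligned(8))) shm_demo_t;

/* Host and target must agree on the layout, so pin it down at compile time. */
_Static_assert(offsetof(shm_demo_t, pollflag) == 0, "pollflag offset");
_Static_assert(offsetof(shm_demo_t, iteration) == 4, "iteration offset");
_Static_assert(sizeof(shm_demo_t) % 8 == 0, "size must be a multiple of 8");
```

The static assertions replace the layout guarantee that `packed` used to provide: if host and target compilers ever disagree about padding, the build fails instead of silently corrupting data.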
Sebastian [Mon, 28 Jul 2014 15:33:49 +0000 (15:33 +0000)]
all: new polling architecture
Replaces states[][] with pollflag+iteration.
The host communicates to the target when it is done reading shared
memory. No need for delay() on the target anymore. Only core 0
writes the iteration counter, reducing traffic to shared memory.
Writing populations is still slow. But no data loss anymore.
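
The handshake described above can be sketched roughly as follows. This is not the project's code: the flag values, the struct and the function names are assumptions chosen for illustration, and a real host/target pair would additionally need to care about write ordering across the memory interface.

```c
#include <assert.h>
#include <stdint.h>

enum { POLL_BUSY = 0, POLL_READY = 1 };  /* hypothetical flag values */

/* Hypothetical synchronisation part of the shared memory block. */
typedef struct {
    volatile uint32_t pollflag;
    volatile uint32_t iteration;
} shm_sync_t;

/* Target side, core 0 only: publish a finished iteration.
 * Returns 1 on success, 0 if the host is still reading the previous one. */
int target_publish(shm_sync_t *shm, uint32_t iter)
{
    if (shm->pollflag != POLL_BUSY)
        return 0;                 /* host has not acknowledged yet */
    shm->iteration = iter;        /* write data first ... */
    shm->pollflag = POLL_READY;   /* ... then raise the flag */
    return 1;
}

/* Host side: returns 1 and stores the iteration number if new data is
 * available, then hands the buffer back to the target. */
int host_consume(shm_sync_t *shm, uint32_t *iter_out)
{
    if (shm->pollflag != POLL_READY)
        return 0;                 /* nothing new yet */
    *iter_out = shm->iteration;
    /* ... read the populations from shared memory here ... */
    shm->pollflag = POLL_BUSY;    /* tell the target we are done reading */
    return 1;
}
```

Because the target only overwrites shared memory after the host has reset the flag, no fixed delay is needed and no iteration can be overwritten while it is being read.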
Sebastian [Mon, 28 Jul 2014 14:35:05 +0000 (14:35 +0000)]
target: fail build on too small sizes
at least 1x1 cores and 3x3 blocks, please.
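
One common way to fail a build on invalid sizes is a preprocessor check. The sketch below assumes the sizes are plain macros; CORES_X and CORES_Y appear elsewhere in this log, while BLOCKS_X and BLOCKS_Y are hypothetical names used here for the per-core block size.

```c
#include <assert.h>

/* Defaults for illustration; a real build would pass these with -D. */
#ifndef CORES_X
#define CORES_X 4
#endif
#ifndef CORES_Y
#define CORES_Y 4
#endif
#ifndef BLOCKS_X          /* hypothetical name */
#define BLOCKS_X 26
#endif
#ifndef BLOCKS_Y          /* hypothetical name */
#define BLOCKS_Y 26
#endif

#if CORES_X < 1 || CORES_Y < 1
#error "need at least 1x1 cores"
#endif

#if BLOCKS_X < 3 || BLOCKS_Y < 3
#error "need at least 3x3 blocks"
#endif

/* Total lattice width and height spanned by all cores. */
unsigned grid_width(void)  { return CORES_X * BLOCKS_X; }
unsigned grid_height(void) { return CORES_Y * BLOCKS_Y; }
```

Compiling with, for example, `-DBLOCKS_X=2` would then stop with the second error message instead of producing a binary.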
Sebastian [Mon, 28 Jul 2014 14:33:53 +0000 (14:33 +0000)]
host: fix write_populations().
Sebastian [Fri, 25 Jul 2014 20:45:11 +0000 (20:45 +0000)]
timers: save timer values to timers.dat
it is practically impossible to calculate the standard deviation inside epiphany,
since there is no 128-bit datatype (as needed for storing the squares).
the required sqrtf will already overflow the internal code memory, adding
bignum stuff won't help. even float ceils.
so put the burden of statistics on the host.
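
On the host, the statistics can be computed in double precision without storing large sums of squares. The sketch below uses Welford's online algorithm, which is a swapped-in technique and not necessarily what the host code does; the type and function names are hypothetical. It returns the sample variance; the standard deviation is its square root (`sqrt` from `<math.h>`, link with `-lm`).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    double mean;
    double variance;   /* sample variance (n - 1 in the denominator) */
} timer_stats_t;

/* Mean and sample variance of n timer readings, computed in one pass. */
timer_stats_t timer_stats(const uint32_t *ticks, size_t n)
{
    timer_stats_t s = { 0.0, 0.0 };
    double m2 = 0.0;   /* running sum of squared deviations from the mean */

    for (size_t i = 0; i < n; i++) {
        double x = (double)ticks[i];
        double delta = x - s.mean;
        s.mean += delta / (double)(i + 1);
        m2 += delta * (x - s.mean);
    }
    if (n > 1)
        s.variance = m2 / (double)(n - 1);
    return s;
}
```

Since the running sum only holds squared deviations from the current mean, it stays small even for large tick counts, which avoids the wide integer type mentioned above.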
Sebastian [Fri, 25 Jul 2014 14:27:29 +0000 (14:27 +0000)]
all: beginning of time measurement infrastructure
measure time differences, write them to shm, print them on the host
Sebastian [Fri, 25 Jul 2014 14:22:00 +0000 (14:22 +0000)]
host: cleanup
Sebastian [Fri, 25 Jul 2014 14:07:09 +0000 (14:07 +0000)]
Makefile: run everything internally with float
- treat floating point constants as single precision
(this removes all dependencies for double)
- use internal.ldf linker script
(puts libc functions in local memory)
speedup is immense. running code from external memory
is extremely slow, especially when all cores fight for it.
Sebastian [Mon, 30 Jun 2014 22:08:10 +0000 (22:08 +0000)]
inner borders
- implement inner borders, still wrap around at the outer borders
- shm_t uses CORES_X, CORES_Y instead of linearized numbers
- change index order to be [y][x] everywhere
- maximum size now 104x104 (using 26x26 blocks in a 4x4 grid)
- compile-time bombs for anything larger
- finally supports non-square block and grid sizes
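
As an illustration of the [y][x] ordering and the block decomposition, a global lattice coordinate can be split into a core position and a local position as sketched below. All names (`BLOCK_X`, `GRID_X`, `cell_pos_t`, `locate`, `wrap`) are hypothetical, and the sizes are the 26x26 blocks in a 4x4 grid mentioned above.

```c
#include <assert.h>

enum { BLOCK_Y = 26, BLOCK_X = 26, GRID_Y = 4, GRID_X = 4 };

typedef struct {
    unsigned core_y, core_x;  /* which core owns the cell */
    unsigned y, x;            /* position inside that core's block */
} cell_pos_t;

/* Wrap a possibly out-of-range coordinate back into [0, n). */
unsigned wrap(int v, int n)
{
    return (unsigned)(((v % n) + n) % n);
}

/* Split a global coordinate into core and local parts, wrapping around
 * at the outer borders of the whole lattice. */
cell_pos_t locate(int gy, int gx)
{
    unsigned wy = wrap(gy, GRID_Y * BLOCK_Y);
    unsigned wx = wrap(gx, GRID_X * BLOCK_X);
    cell_pos_t p = { wy / BLOCK_Y, wx / BLOCK_X, wy % BLOCK_Y, wx % BLOCK_X };
    return p;
}
```

For example, `locate(-1, 0)` lands in the last row of the bottom-left core, which is the wrap-around behaviour at the outer border; a step across an inner border only changes the core index.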
Sebastian [Thu, 26 Jun 2014 01:44:39 +0000 (01:44 +0000)]
allow blocks of up to 24 KB (three banks)
Sebastian [Thu, 26 Jun 2014 01:43:21 +0000 (01:43 +0000)]
add compile-time error for oversized grids
Sebastian [Wed, 25 Jun 2014 22:05:27 +0000 (22:05 +0000)]
lb: D2Q9 working, single bank, single core
limited to 10x10 (double precision) or 15x15 (single precision)
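
These limits are consistent with simple arithmetic, assuming each cell stores exactly nine values (one per D2Q9 direction), a bank holds 8 KB (24 KB being three banks), and nothing else shares the bank. The helper names below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { Q = 9, BANK_BYTES = 8 * 1024 };  /* D2Q9 directions, bytes per bank */

/* Bytes needed for an nx-by-ny block of cells with Q values each. */
size_t block_bytes(size_t nx, size_t ny, size_t elem_size)
{
    return nx * ny * Q * elem_size;
}

/* Largest n such that an n-by-n block fits into the given number of banks. */
size_t max_square_block(size_t banks, size_t elem_size)
{
    size_t n = 0;
    while (block_bytes(n + 1, n + 1, elem_size) <= banks * BANK_BYTES)
        n++;
    return n;
}
```

With 4-byte floats, a 15x15 block needs 8100 bytes and a 16x16 block 9216 bytes, so 15x15 is the largest that fits into one 8192-byte bank. With 8-byte doubles, 10x10 needs 7200 bytes and 11x11 needs 8712. Under the same assumptions, three banks (24576 bytes) hold a 26x26 float block at 24336 bytes.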
Sebastian [Tue, 17 Jun 2014 16:17:28 +0000 (16:17 +0000)]
initial commit