I’m attending the Linux Cluster Institute’s Applications Module Program at UNM today and the rest of the week.
Today’s topics:
Doug Pase from IBM
o Performance Programming on Intel & AMD Architectures
Covered memory, register, and cache architecture, vectorization, etc. Some interesting things were discussed, but I was mostly familiar with the material. It was a good intro to cover before the rest of the day’s information.
o Performance Programming on Intel & AMD Architectures
Applied the material from the previous discussion to basic performance techniques: loop unrolling, memory access patterns, data alignment, FP multiplication by the reciprocal instead of division, subroutine inlining, etc. Interesting stuff. It clarified a lot of things that I was only vaguely aware of.
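For my own reference, here is a minimal sketch of the reciprocal trick from that list. It is not from the talk's slides; the function names and test data are made up for illustration. The idea is to pay for one division outside the loop and then multiply inside it. Python won't show the speedup (that matters in compiled C or Fortran inner loops), so this only shows the transformation and its one caveat: the results can differ from true division in the last bit.

```python
def scale_by_division(values, divisor):
    """Baseline: one floating-point division per element."""
    return [v / divisor for v in values]


def scale_by_reciprocal(values, divisor):
    """Same scaling, but with a single division hoisted out of the loop."""
    inv = 1.0 / divisor  # the only division performed
    return [v * inv for v in values]


if __name__ == "__main__":
    # Hypothetical test data, chosen only for illustration.
    data = [float(i) for i in range(1, 1001)]
    divided = scale_by_division(data, 3.0)
    multiplied = scale_by_reciprocal(data, 3.0)

    # Count elements where the two approaches are not bit-for-bit identical.
    mismatches = sum(1 for x, y in zip(divided, multiplied) if x != y)
    worst = max(abs(x - y) / abs(x) for x, y in zip(divided, multiplied))
    print("elements that differ:", mismatches)
    print("largest relative difference:", worst)
```

When the divisor is a power of two the reciprocal is exact, so both versions agree exactly. For other divisors the difference is tiny but real, which I assume is why compilers generally only make this substitution under relaxed floating-point flags.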
Ron Brightwell from Sandia
o Linux Cluster Programming
Overview of MPI. Again, stuff I knew a little bit about is now much clearer. Excellent info. Good speaker. Poor slides, though: lots of “Sorry, not my slides,” “that point is incorrect,” etc.
Luiz DeRose from Cray
o Debugging parallel applications with TotalView. Far too much basic information about how a debugger works in serial mode. I know that stuff. Skip to the parallel stuff, please. Why show PowerPoint slides of a program? Show the actual program in operation. Oh, and we don’t actually have a way to run TotalView in parallel, because LSF on the machine we’re using doesn’t support X11 tunneling in interactive sessions. Nice.
Federico Bassetti from NCSA
o Local Computing Environment. A discussion of how to use some of NCSA’s resources. Nice guy, but he didn’t seem to be an expert on what’s in NCSA’s machine rooms. Many incorrect slides. Again, why talk and point when we could be typing?