---+ Single Instruction Multiple Data

The biggest, baddest parallel clusters these days are SIMD clusters, where the workers do not communicate directly with other workers.

---++ Hardware

---+++ Google

   * ~300K machines available; jobs using 2,000 worker machines are reportedly not unusual
   * Typical apps: search, distributed sort, distributed grep, [[http://labs.google.com/papers/sawzall-20030814.gif][world map showing location of queries]]
   * Map-partition-reduce (a minimal sketch of this pattern appears at the end of this page)
   * Sawzall: users write code to manipulate _one_ record; Sawzall takes care of the rest
   * Distributed filesystem-based database
   * Code is a trade secret
   * Vandalism, malicious code: low concerns
   * Communication: workers write results to the distributed file system, where further workers "pick it up" (a sort of very slow shared memory!). Throughput is more important than latency.

---+++ Volunteer megacluster

   * Size varies; SETI@home claims 2M machines
   * Typical apps: biological simulations (protein folding, drug discovery)
   * Code varies from project to project and is pretty closely guarded to prevent vandalism
   * Vandalism, malicious code: high concerns
   * Communication: infrequent, small messages, presumably bottlenecked by the small number of servers (a sketch of this work cycle appears at the end of this page)

---+++ Botnets

   * 1.5M machines (Dutch), 400K (US)
   * Typical distributed apps: denial-of-service attacks, spam delivery, click fraud, subverting other computers (as well as mining the local hard drive for account information)
   * Code is apparently relatively easy to obtain; some is even GPLed! Legitimate sites are not eager to spread the knowledge.
   * High concerns about traceability and malicious code (i.e. other bot herders)
   * Communication: command-and-control frequently runs over IRC; bandwidth and computing power are stolen, so "free"; latency is not particularly important. There is concern about botnets jumping to P2P networks soon.

---+++ GPUs

   * 20ish cores on a GPU: tenish vertex processors and tenish fragment processors
   * Typical apps: graphics (duh), fluid flow, FFT
   * Higher-level languages (HLLs):
      * Cg (nVidia)
      * Brook (open source, written on top of Cg) -- strong map/reduce component
   * No possibility of external malicious code.
   * Communication: vertex processors and fragment processors communicate via registers. Vertex processors can't write to external memory; fragment processors can't read from external memory (except in a hack-y way through the texture buffer).
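
The map-partition-reduce pattern from the Google section above can be illustrated with a small, generic sketch. This is *not* Google's MapReduce or Sawzall code -- it is a minimal Python illustration, and every name in it (=map_phase=, =partition=, =reduce_phase=) is invented for this page. As with Sawzall, the user-written piece only ever touches one record at a time.

<verbatim>
# A minimal, generic sketch of the map-partition-reduce pattern (word count).
# Illustrative only: not Google's MapReduce or Sawzall; all names are invented.

from collections import defaultdict

def map_phase(record):
    """Emit (key, value) pairs from a single input record.
    Like Sawzall, the user code sees only one record at a time."""
    for word in record.split():
        yield (word, 1)

def partition(pairs, num_reducers):
    """Route each key to a reducer bucket, here by hashing the key."""
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for key, value in pairs:
        buckets[hash(key) % num_reducers][key].append(value)
    return buckets

def reduce_phase(bucket):
    """Combine all values emitted for each key in one bucket."""
    return {key: sum(values) for key, values in bucket.items()}

if __name__ == "__main__":
    records = ["the quick brown fox", "the lazy dog jumps", "the fox again"]
    pairs = [pair for record in records for pair in map_phase(record)]
    buckets = partition(pairs, num_reducers=2)
    results = [reduce_phase(bucket) for bucket in buckets]
    print(results)   # e.g. [{'the': 3, 'fox': 2, ...}, {...}]
</verbatim>

In the real cluster the intermediate pairs are not passed in memory: map workers write them to the distributed file system and reduce workers "pick them up" -- the very slow shared-memory style of communication noted in the Google bullets.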
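
The volunteer-megacluster communication pattern -- rare, small exchanges wrapped around a long local computation -- can be sketched the same way. This is *not* the actual SETI@home or BOINC protocol; the server is faked in memory, and all names here (=FakeProjectServer=, =crunch=, =volunteer_client=) are invented for illustration.

<verbatim>
# Generic sketch of the volunteer-computing work cycle: fetch a small work
# unit, crunch on it locally for a long time, send back a small result.
# Not the SETI@home/BOINC protocol; the "server" is faked in memory.

import hashlib

class FakeProjectServer:
    """Stands in for the project's small number of central servers."""
    def __init__(self, work_units):
        self.pending = list(work_units)
        self.results = {}

    def get_work_unit(self):
        return self.pending.pop() if self.pending else None

    def submit_result(self, unit_id, result):
        # Real projects validate results (e.g. by redundant computation),
        # since malicious or broken clients are a high concern.
        self.results[unit_id] = result

def crunch(unit):
    """Placeholder for hours of local computation on one work unit."""
    digest = hashlib.sha256(unit["data"].encode())
    for _ in range(100_000):            # simulate a compute-heavy loop
        digest = hashlib.sha256(digest.digest())
    return digest.hexdigest()

def volunteer_client(server):
    """Communicate rarely: one small download and one small upload per unit."""
    while (unit := server.get_work_unit()) is not None:
        server.submit_result(unit["id"], crunch(unit))

if __name__ == "__main__":
    server = FakeProjectServer([{"id": i, "data": f"signal-{i}"} for i in range(3)])
    volunteer_client(server)
    print(server.results)
</verbatim>

The point of the sketch is the shape of the traffic: a tiny download and a tiny upload bracket a huge amount of local work, which is why a small number of project servers can feed millions of volunteer machines.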