#### Introduction CPE380/CS380, Fall 2025 **Hank Dietz** http://aggregate.org/hankd/ #### **Course Overview** - You know how to write a simple program... from CS courses - You know how to build simple combinatorial and sequential logic circuits from ECE courses (especially CPE282 or EE280/EE281) - This course fills the gap between the two: - So you can better specify & use that stuff - So you can create the stuff in between - There will be implementations in Verilog ``` module mul(ready, c, a, b, reset, clk); parameter BITS = 32; input [BITS-1:0] a, b; input reset, clk; output reg [BITS*2-1:0] c; output reg ready; reg [BITS-1:0] d; reg [BITS-1:0] state; reg [BITS:0] sum; always @(posedge clk or posedge reset) begin if (reset) begin ready <= 0; state <= 1: d <= a; c \le \{\{BITS\{1'b0\}\}, b\}; end else begin if (state) begin sum = c[BITS*2-1:BITS] + d; c \le (c[0] ? \{sum, c[BITS-1:1]\} : (c >> 1)); state <= {state[BITS-2:0], 1'b0};</pre> end else begin ready <= 1; end end end endmodule ``` # Verilog 32-bit Multiplier #### Textbook - The text is: Computer Organization & Design, 5<sup>th</sup> Edition: The Hardware/Software Interface by Patterson & Hennessy - You can use any MIPS edition from 2<sup>nd</sup> 6<sup>th</sup>, but we'll reference sections from the 5<sup>th</sup> - We will not assign problems from the text - Lots of additional materials at the course URL and presented in class... text is reference only #### **Grading & Such** - One individual Verilog project, ~10% - Three team projects, ~10% each - Four homework assignments, ~10% each - In-person final exam, ~20% (course grade limited to 1 letter above final) - Material from lectures, the text as cited, canvas, or from the course URL: - http://aggregate.org/CPE380/ - You are expected to regularly attend class - I try not to curve much; always in your favor ## **Course Content** | Lectures | Topic | |----------|------------------------------------------------| | 1 | Introduction | | 3 | Verilog (individual project) | | 3 | Multi-cycle machine (team project) | | 3 | Machine & assembly languages (homework) | | 2 | Single-cycle machine (team project) | | 3 | Integer & float arithmetic (homework) | | 4 | Pipelined machine (team project) | | 4 | Memory hierarchy and I/O (homework) | | 3 | Parallel processing and performance (homework) | | 1 | reserved for schedule slippage | | 1 | A simple compiler | | 1 | Review for final exam | #### Schedule Notes - Projects are deliberately pushed as early as possible to reduce time pressure - Some topics may be given more or less time depending on how students are doing - I will be presenting research at IEEE/ACM SC (Supercomputing) conference, so we will not have regular class meetings 11/18 & 11/20 ### Me (and why I'm biased) - Hank Dietz, ECE Professor and James F. Hardymon Chair in Networking - Office: 203 Marksbury - Research in parallel compilers & architectures: - Built 1st Linux PC cluster supercomputer - Antlr, AFNs, SWAR, FNNs, MOG, ... - Various awards & world records for best price/performance in supercomputing - Lab: 108/108A Marksbury I have TOYS! # Let's Talk About Computers - Embedded computers, IoT (Internet of Things) - Personal Mobile Devices (PMDs)... usually "smart phones" and tablets - Personal Computers (PCs) - Servers - Supercomputers - Clusters, Farms, Grids, and Clouds (Warehouse Scale Computers – WSC, Software as a Service – SaaS) #### **U.S. Consumer Electronics Market** Size, by Product, 2020 - 2030 (USD Billion) https://www.grandviewresearch.com/industry-analysis/personal-consumer-electronics-market # What's Inside? Evaluating performance # **Processor Terminology** - CPU Central Processing Unit - PE, Core Processing Element - Processor CPU or chip containing PEs - "Computer Family" same ISA - x86, IA32, x64/AMD64 Intel 386-based ISAs - MIPS, ARM, SPARC other common ISAs - DSP Digital Signal Processor - GPU Graphics Processing Unit - Tensor Matrix support for neural networks - Quantum Combinatorial use of superposition # Complexity is Increasing! Lots of things you use every day have # BILLIONS of components! You don't live long enough to know it all El Capitan supercomputer: 11,039,616 cores, 2.746 Exaflop/s Cost approx. \$600M, 29.6 MW power # **Abstraction "Onion"** ### Software Layers - Applications... - Operating Systems (OS)... - High-Level Languages (HLLs) Aka, High Order Languages (HOLs) - Designed for humans to write & read - Modularity - Abstract data types, type checking - Assignment statements - Control constructs - I/O statements #### Instruction Set Architecture - ISA defines HW/SW interface - Assembly Language - Operations match hardware abilities - Relatively simple & limited operations - Mnemonic (human readable?) - Machine Language - Bit patterns 0s and 1s - Actually executed by the hardware ``` High-level swap(int v[], int k) language {int temp; temp = v[k]: program v[k] = v[k+1]; (in C) v[k+1] = temp; Compiler Assembly swap: muli $2, $5,4 language $2, $4,$2 program add (for MIPS) 1w $15, 0($2) 1w $16. 4($2) $16, 0($2) SW $15, 4($2) SW jr $31 Assembler 00000000101000010000000000011000 Binary machine language 0000000000110000001100000100001 program (for MIPS) 100011001111001000000000000000100 ``` # **Hardware Layers** - Function-block organization - Gates & Digital Logic (CPE282 stuff) - Transistors - Used as bi-level (saturated) devices - Amplifiers, not just on/off switches - Materials & Integrated Circuits - Implementation of transistors, etc. - Analog properties #### Who Does What? - Instruction Set Design, by Architect - Machine & Assembly Languages - "Computer Architecture" - Instruction Set Architecture / Processor - Computer Hardware Design, by Engineer - Logic Design & Machine Implementation - "Processor Architecture" - "Computer Organization" #### **How To Use Layers** - Things are too complex to "know everything" - Need to know only layers adjacent - Makes design complexity reasonable - Makes things reusable - Can tunnel to lower layers - For efficiency - For special capabilities #### 8 Great Ideas - Design for Moore's Law - **Abstraction** - Make the common case fast - Pipelining - Parallelism - Prediction - Hierarchy of memories - Dependability via redundancy # SI Terminology Of Scale ``` 1000^1 kilo k 1000^-1 milli m 1000^2 1000^-2 M micro mega U 1000^3 1000^-3 G giga nano n 1000^4 1000^-4 Т tera pico p 1000^-5 1000^5 femto P peta 1000^6 E exa ``` - 1000<sup>^</sup>x vs. 1024<sup>^</sup>x - 1 Byte (B) is 8-10 bits (b), 4 bits in a Nybble - Hertz (Hz) is frequency (vs. period) #### Conclusion - LOTS of stuff to know about... focus of this course is the basic stuff around the ISA and its implementation - A lot of computer system design is about how to build efficient systems despite incredibly high and rapidly increasing system complexity - Look at the history references on the WWW: not to memorize who, what, when, & where, but to see trends...