Last module: Synthesis and Verilog This module Memory arrays SRMs Serial Memories Dynamic memories Memory rrays Memory rrays Random ccess Memory Serial ccess Memory Content ddressable Memory (CM) Read/Write Memory Read Only Memory Shift Registers Queues (RM) (ROM) (Volatile) (Nonvolatile) Serial In Parallel In First In Last In Static RM Dynamic RM Parallel Out Serial Out First Out First Out (SRM) (DRM) (SIPO) (PISO) (FIFO) (LIFO) Mask ROM Programmable Erasable Electrically Flash ROM ROM Programmable Erasable (PROM) ROM Programmable (EPROM) ROM (EEPROM) D. Z. Pan D. Z. Pan 2 rray rchitecture 2 n s of 2 m s each If n >> m, fold by 2 k into fewer rows of more columns n-k n lines row decoder k column decoder line conditioning 2 m s lines memory cells: 2 n-k rows x 2 m+k columns column circuitry Good regularity easy to design Very high density if good cells are used D. Z. Pan 3 2T SRM Cell Basic building block: SRM Cell Holds one of information, like a latch Must be read and written 2-transistor (2T) SRM cell Use a simple latch connected to line 46 x 75 λ unit cell write write_b read read_b D. Z. Pan 4 6T SRM Cell Cell size accounts for most of array size Reduce cell size at expense of complexity 6T SRM Cell Used in most commercial chips Data stored in cross-coupled inverters Read: Precharge, _b Raise line Write: Drive data onto, _b Raise line _b D. Z. Pan 5 SRM Read Improve performance when -line cap. is high Precharge both lines high Then turn on line One of the two lines will be pulled down by the cell Ex: = 0, = discharges, _b stays high But bumps up slightly Read stability must not flip N >> N2.5.0 N2 _b P P2 N4 N N3 0.5 0.0 0 00 200 300 400 500 600 time (ps) D. Z. Pan 6 _b D. Z. Pan
SRM Write Drive one line high, the other low Then turn on line Bitlines overpower cell with new value Ex: = 0, =, =, _b = 0 Force low, then rises high Writability Must overpower feedback inverter N4 >> P2 N2 >> P (symmetry).5.0 0.5 _b N2 P N P2 N3 _b N4 SRM Sizing High lines must not overpower inverters during reads But low lines must write new value into cell _b med weak strong med 0.0 0 00 200 300 400 500 600 700 time (ps) D. Z. Pan 7 D. Z. Pan 8 _q SRM Column Example Read Write Bitline Conditioning _q Bitline Conditioning SRM Layout Cell size is critical: 26 x 45 λ (even smaller in industry) Tile cells sharing V DD, GND, line contacts _vf H SRM Cell H _b_vf write_q _vf SRM Cell _b_vf VDD GND BIT BIT_B GND out_b_vr out_vr data_s φ WORD _q _vf Cell boundary out_vr D. Z. Pan 9 D. Z. Pan 0 Decoders n:2 n decoder consists of 2 n n-input ND gates One needed for each row of memory Build ND from NND or NOR gates Static CMOS Pseudo-nMOS Decoder Layout Decoders must be pitch-matched to SRM cell Requires very skinny gates VDD 3 3 GND 0 2 8 4 0 /2 2 4 2 6 8 NND gate buffer inverter 3 3 D. Z. Pan D. Z. Pan 2 D. Z. Pan 2
Large Decoders For n > 4, NND gates become slow Break large gates into multiple smaller gates 3 0 Predecoding Many of these gates are redundant 3 Factor out common gates into predecoder Saves area Same path effort predecoders of 4 hot predecoded lines 0 2 3 2 3 5 5 D. Z. Pan 3 D. Z. Pan 4 Column Circuitry Some circuitry is required for each column Bitline conditioning Sense amplifiers Column multiplexing Bitline Conditioning Precharge lines high before reads φ _b Equalize lines to minimize voltage difference when using sense amplifiers φ _b D. Z. Pan 5 D. Z. Pan 6 Sense mplifiers Bitlines have many cells attached Ex: 32-k SRM has 256 rows x 28 cols 28 cells on each line t pd (C/I) ΔV Even with shared diffusion contacts, 64C of diffusion capacitance (big C) Discharged slowly through small transistors (small I) Sense amplifiers are triggered on small voltage swing (reduce ΔV) D. Z. Pan 7 Differential Pair mp Differential pair requires no clock But always dissipates static power sense_b P N N3 P2 sense N2 _b D. Z. Pan 8 D. Z. Pan 3
Clocked Sense mp Clocked sense amp saves power Requires sense_ after enough line swing Isolation transistors cut off large line capacitance sense b isolation transistors Twisted Bitlines Sense amplifiers also amplify noise Coupling noise is severe in modern processes Try to couple equally onto and _b Done by twisting lines b0 b0_b b b_b b2 b2_b b3 b3_b regenerative feedback sense sense_b D. Z. Pan 9 D. Z. Pan 20 Column Multiplexing Recall that array may be folded for good aspect ratio Ex: 2 k x 6 folded into 256 rows x 28 columns Must select 6 output s from the 28 columns Requires 6 8: column multiplexers Tree Decoder Mux Column mux can use pass transistors Use nmos only, precharge outputs One design is to use k series transistors for 2 k : mux No external decoder logic needed B0 B B2 B3 B4 B5 B6 B7 B0 B B2 B3 B4 B5 B6 B7 Y to sense amps and write circuits Y D. Z. Pan 2 D. Z. Pan 22 Single Pass-Gate Mux Ex: 2-way Muxed SRM Or eliminate series transistors with separate decoder _q B0 B B2 B3 write0_q write_q Y D. Z. Pan 23 data_v D. Z. Pan 24 D. Z. Pan 4
Multiple Ports We have considered single-ported SRM One read or one write on each cycle Multiported SRM are needed for register files Examples: Multicycle MIPS must read two sources or write a result on some cycles Pipelined MIPS must read two sources and write a third result each cycle Superscalar MIPS must read and write many sources and results each cycle D. Z. Pan 25 Dual-Ported SRM Simple dual-ported SRM Two independent single-ended reads Or one differential write B _b Do two reads and one write by time multiplexing Read during ph, write during ph2 D. Z. Pan 26 Multi-Ported SRM dding more access transistors hurts read stability Multi-ported SRM isolates reads from state node Single-ended design minimizes number of lines B C D E F G b bb bc bd be bf bg Serial ccess Memories Serial access memories do not use an address Shift Registers Tapped Delay Lines Serial In Parallel Out (SIPO) Parallel In Serial Out (PISO) Queues (FIFO, LIFO) write circuits read circuits D. Z. Pan 27 D. Z. Pan 28 Shift Register Shift registers store and delay data Simple design: cascade of registers Watch your hold times! Din 8 Dout Denser Shift Registers Flip-flops aren t very area-efficient For large shift registers, keep data in SRM instead Move R/W pointers to RM rather than data Initialize read address to first entry, write to last Increment address on each cycle 00...00... counter counter readaddr writeaddr Din dual-ported SRM D. Z. Pan 29 reset Dout D. Z. Pan 30 D. Z. Pan 5
Tapped Delay Line tapped delay line is a shift register with a programmable number of stages Set number of stages with delay controls to mux Ex: 0 63 stages of delay Serial In Parallel Out - shift register reads in serial data fter N steps, presents N- parallel output Sin Din SR32 SR6 SR8 SR4 SR2 SR Dout P0 P P2 P3 delay5 delay4 delay3 delay2 delay delay0 D. Z. Pan 3 D. Z. Pan 32 Parallel In Serial Out Load all N s in parallel when shift = 0 Then shift one out per cycle shift/load P0 P P2 P3 Sout Queues Queues allow data to be read and written at different rates. Read, Write each use their own clock, data Queue indicates whether it is full or empty Build with SRM and read/write counters (pointers) WriteClk ReadClk WriteData FULL Queue ReadData EMPTY D. Z. Pan 33 D. Z. Pan 34 FIFO, LIFO Queues First In First Out (FIFO) Initialize read and write pointers to first element Queue is EMPTY On write, increment write pointer If write almost catches read, Queue is FULL On read, increment read pointer Last In First Out (LIFO) lso called a stack Use a single stack pointer for read and write D. Z. Pan 35 4-Transistor Dynamic RM Cell Remove the two p-channel transistors from static RM cell, to get a four-transistor dynamic RM cell Data stored as charge on gate capacitors (complementary nodes) Data must be refreshed regularly Dynamic cells must be designed very carefully D. Z. Pan 36 D. Z. Pan 6
3-Transistor Dynamic RM Cell -Transistor Dynamic RM Cell Value of C B must be chosen very carefully; otherwise, voltage on -line will be affected by charge sharing Data stored on the gate of a transistor Need two additional transistors, one for write and the other for read control D. Z. Pan 37 Cannot get any smaller than this: data stored on a (trench) capacitor C, need a transistor to control data Bit line normally precharged to ½ V DD (need a sense amplifier) D. Z. Pan 38 Single-Event Upsets Particle produces electron-hole pairs in substrate; when collected at source and drain, will cause current pulse -flip can occur in the memory cell due to the charge generated by the particle called a single-event upset Seen in spacecraft electronics in the past Source: erospace Corporation D. Z. Pan 39 D. Z. Pan 7