#### cādence

#### **Applying Logic Synthesis for Speeding Up SAT**

Niklas Een Alan Mishchenko Niklas Sörensson

## **Problem Formulation**

• Given a (combinational) circuit, how do we best clausify it for the CNF-based SAT-solver?

Assume given on *And-Inverter Graph* (AIG) form:



**Contract:** Clausify logic such that the (functional) relation between PIs and POs is established.

#### ...gives the freedom to:

Omit internal signals
(don't give them a SAT variable)
Create new internal signals
(by circuit simplifying rewrites)

#### cādence<sup>°</sup>

## Clausification in the small ("easy")

- How to produce a small set of clauses for 1-output, kinput subgraphs ("super-gates") of the AIG for small k:s:
  - − k ≤ 4: Pre-compute and tabulate exact results
  - 4 < k ≤ 16: Use Minato's ISOP-algorithm</p>

#### Clausification in the large (hard!)

- How to partition AIG into super-gates?
- How to handle reconvergence? (the root of all evil)

**Proposal:** Use FPGA-style technology mapping!

## **Overview – "The three staged rocket"**

outpu

0xF7FB

**x7** 

**x**3

**Techmap for CNF** 

0x3133

<NOR>

0xC040

**x1** 

**x5** 

**x8** 

0xFD00

 $\mathbf{x2}$ 

xб

**x**0

- The AIG representing the SAT problem is minimized (DAG-aware).
- The minimized AIG is translated into k-input LUTs. "cost(LUT) = #clauses"
- Final CNF is optimized by variable elimination and subsumption.



11765

-11 -7

-11 -5

12 11 -2

12 -4 -2

12 -3 -2

-122

-12 -11 4 3

13 -3 -10 -4

-13 -11 3

-13 10

13 11 -10 -4 -15 -12 8

**CNF** minimization

-134

14 11 9

14 - 49

1439

-14 -9

-14 -11 4 -3

15 - 12 - 8 - 13 14

15 12 8 -13 14

-15 12 -8

-15 13

-15 -14

#### **Cut Enumeration**



### **DAG-Aware Minimization**

- Minimizes an AIG taking sharing into account.
- Compute "good" AIG representations for each 4-input function.
- Enumerate all cuts: see if cut cone can be replaced by node saving representation.
- If time-budget admits: perturb and repeat.

## **DAG-Aware Minimization: Example**



cādence<sup>®</sup>

## **DAG-Aware Minimization: Example**



## **DAG-Aware Minimization: Example**



# **Technology Mapping**

- Enumerate all *k*-input cuts (k=4 in example).
- Select a cut for each node (= potential LUT).
- Outputs from logic will recursively induce a subgraph corresponding to LUT representation.
- Area Flow: Estimate the area increase that would result from including a node:

<u>cost of node + cost of children</u> estimated number of fanouts

 "cost of node = 1" in example on following slides (for simplicity), but "#clauses" in real algorithm.

**c ā d e n c e**"

#### **Techmapping – enumerate cuts**



cādence<sup>™</sup>

#### **Techmapping – best cut for each node**



| 13: | { | <b>x</b> 5 | <b>x4</b>  |            |            | } |
|-----|---|------------|------------|------------|------------|---|
| 14: | { | <b>x5</b>  | <b>x4</b>  | <b>x</b> 3 |            | } |
| 15: | { | <b>x5</b>  | <b>x4</b>  | <b>x</b> 3 | <b>x1</b>  | } |
| 16: | { | <b>x5</b>  | <b>x4</b>  | <b>x</b> 3 | <b>x</b> 2 | } |
| 17: | { | 14         | <b>x</b> 2 | <b>x1</b>  |            | } |
| 18: | { | 14         | <b>x</b> 7 | <b>x</b> 2 | <b>x1</b>  | } |
| 19: | { | 14         | <b>x8</b>  | <b>x</b> 2 | <b>x1</b>  | } |
| 20: | { | 16         | <b>x1</b>  |            |            | } |
| 21: | { | 16         | <b>x1</b>  | <b>x</b> 0 |            | } |
| 22: | { | 16         | <b>x6</b>  | <b>x1</b>  | <b>x</b> 0 | } |
| 23: | { | 16         | <b>x6</b>  | <b>x1</b>  | <b>x</b> 0 | } |
| 24: | { | 14         | <b>x8</b>  | <b>x</b> 2 | <b>x1</b>  | } |
| 25: | { | <b>x5</b>  | <b>x4</b>  | <b>x</b> 3 | <b>x</b> 2 | } |
| 26: | { | 15         | <b>x8</b>  |            |            | } |
| 27: | { | 16         | <b>x6</b>  | <b>x1</b>  | <b>x</b> 0 | } |
| 28: | { | 18         | 15         | <b>x8</b>  | <b>x</b> 2 | } |
| 29: | { | 27         | 19         | 17         | <b>x</b> 7 | } |

### **Techmapping – induced subgraph**



| 13: | { | <b>x</b> 5 | <b>x4</b>  |            |            | } |
|-----|---|------------|------------|------------|------------|---|
| 14: | { | <b>x5</b>  | <b>x4</b>  | <b>x</b> 3 |            | } |
| 15: | { | <b>x5</b>  | <b>x4</b>  | <b>x</b> 3 | <b>x</b> 1 | } |
| 16: | { | <b>x</b> 5 | <b>x4</b>  | <b>x</b> 3 | <b>x</b> 2 | } |
| 17: | { | 14         | <b>x</b> 2 | <b>x1</b>  |            | } |
| 18: | { | 14         | <b>x</b> 7 | <b>x</b> 2 | <b>x</b> 1 | } |
| 19: | { | 14         | <b>x</b> 8 | <b>x</b> 2 | <b>x</b> 1 | } |
| 20: | { | 16         | <b>x1</b>  |            |            | } |
| 21: | { | 16         | <b>x1</b>  | <b>x</b> 0 |            | } |
| 22: | { | 16         | <b>x6</b>  | <b>x1</b>  | <b>x</b> 0 | } |
| 23: | { | 16         | <b>x6</b>  | <b>x1</b>  | <b>x</b> 0 | } |
| 24: | { | 14         | <b>x</b> 8 | <b>x</b> 2 | <b>x1</b>  | } |
| 25: | { | <b>x</b> 5 | <b>x4</b>  | <b>x</b> 3 | <b>x</b> 2 | } |
| 26: | { | 15         | <b>x</b> 8 |            |            | } |
| 27: | { | 16         | <b>x6</b>  | <b>x1</b>  | <b>x</b> 0 | } |
| 28: | { | 18         | 15         | <b>x</b> 8 | <b>x</b> 2 | } |
| 29: | { | 27         | 19         | 17         | <b>x</b> 7 | } |

#### **Techmapping – iterate procedure**



cādence<sup>®</sup>

# **Benchmark Results**

|                                  | <b>SAT Runtime</b> (sec) $-$ Cadence BMC |              |                  |                  |                  |       |       |                   |
|----------------------------------|------------------------------------------|--------------|------------------|------------------|------------------|-------|-------|-------------------|
| $\mathbf{Problem}$               | (orig)                                   | $\mathbf{S}$ | D                | DS               | Т                | TS    | DT    | DTS               |
| CIS-70 $u$                       | 21.9                                     | 12.3         | 3.6              | 3.1              | 2.5              | 4.1   | 1.2   | 1.3               |
| CIS-71 $s$                       | 15.2                                     | 8.8          | 7.7              | 3.9              | <b>2.1</b>       | 3.1   | 4.0   | 2.7               |
| olm-154 $u$                      | 116.4                                    | 48.3         | 41.1             | 37.7             | 11.6             | 34.4  | 15.6  | 9.3               |
| olm-155 $s$                      | 101.8                                    | 22.9         | 12.9             | 16.2             | 18.2             | 50.6  | 13.4  | 6.9               |
| r1-18 $u$                        | 1516.0                                   | 139.4        | 361.9            | 119.4            | 196.3            | 78.8  | 78.8  | <b>39.0</b>       |
| r1-19 $s$                        | 1788.2                                   | 276.7        | 535.0            | 154.8            | 317.8            | 137.1 | 131.9 | 42.5              |
| r2-19 $u$                        | 403.8                                    | 214.4        | 239.8            | 169.7            | 140.9            | 73.7  | 114.8 | 78.1              |
| r2-20s                           | 3066.1                                   | 893.4        | 1002.9           | 353.2            | 376.2            | 313.5 | 687.5 | 96.5              |
| r6-19 $u$                        | 316.1                                    | 225.6        | 133.9            | 104.7            | 107.9            | 107.6 | 53.2  | 55.0              |
| r6-20s                           | 2305.4                                   | 456.4        | 863.1            | 385.8            | 507.0            | 236.9 | 307.2 | 101.2             |
| Total speedup: 4.2x              |                                          |              | 3.0x             | 7.2x             | $5.7 \mathrm{x}$ | 9.3x  | 6.9 x | 22.3x             |
| Arithmetic average speedup: 3.9x |                                          |              | 3.6x             | $6.5 \mathrm{x}$ | 6.3x             | 7.6x  | 9.2x  | $19.7 \mathrm{x}$ |
| Harmonic average speedup: 2.7x   |                                          |              | $2.9 \mathrm{x}$ | 4.8x             | 5.3x             | 4.9x  | 6.6x  | 11.5x             |

S = cnf Simplification D = Dag shrink T = Techmap for cnf

cādence<sup>™</sup>

# Conclusions

- Two techniques from logic synthesis was used:
  - DAG-aware minimization
  - Technology mapping (adapted for CNF generation)
- Both techniques contributed to speed-ups
- Orthogonal to CNF-based preprocessing
- For BMC problems, the speedup was ~10x.