ASIC Implementation of RISC-V Processor Core in 28nm

INTRODUCTION:

        One of the primary memories used in today’s computers and smart devices is SRAM, which stands for Static Random Access Memory. This type of memory is very fast and retains data bits as long as power is supplied. In our course, we will be given the opportunity to design our own SRAM chip using EDA tools such as Synopsys Custom Compiler. The memory consists of various sub-blocks and interconnects that play specific roles in order for the SRAM to work as desired. These sub-blocks include the SRAM cell, SRAM array, pre-charge circuit, sense amplifier, write driver, row decoder, and controller (Figure 1). Each of these components is explained in more detail in the following sections. We will also discuss how we integrated the blocks and the results that we achieved. The goal of this project is to minimize layout area, access time, and/or active power consumption.

        Well-known processor architectures such as ARM, Intel x86, and SPARC have long been regarded as the best in the industry. However, a newer architecture called RISC-V is drawing a lot of attention due to its unique features. RISC stands for Reduced Instruction Set Computer, and the V marks the fifth generation of RISC designs from the University of California, Berkeley. It was first introduced there in 2010 with a mission to make the architecture available to the entire community. This ambition was then amplified by a non-profit corporation called the RISC-V Foundation, which hosts yearly global events to discuss current RISC-V projects, implementations, and future ideas.

        RISC-V is an Instruction Set Architecture (ISA): the set of instructions and machine-code encodings that software uses to control the processor. The ISA is based on RISC principles, which favor a small set of simple instructions that each complete in few clock cycles, emphasizing efficiency in cycles per instruction. The main feature that makes RISC-V so popular is that its ISA is open source, meaning that the specification is public and anyone can see exactly how each instruction is encoded and what it does. This freedom allows anyone to use it for any purpose and customize it to fit any market niche.

        In this course, we will design a 28 nm RISC-V processor from the system level to the physical layout using EDA tools such as Synopsys IC Compiler and PrimeTime. We begin by using Verilog testbenches to verify that the RTL code correctly executes RISC-V instructions such as ALU, shift, and jump operations. We then perform synthesis to generate a gate-level netlist. This is followed by the physical design steps: partitioning, floorplanning, placement, routing, compaction, and extraction. Lastly, we carry out a final timing verification using PrimeTime to ensure that all timing constraints are met. The goal of this project is to minimize layout area and power consumption.

DESIGN SPECIFICATIONS AND CONSTRAINTS:

        - Clock frequency: 200 MHz

        - Clock uncertainty: 0.5 ns

        - Input delay: 1 ns

        - Clock latency: 1 ns

        - Output delay: 1 ns
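
        The constraints listed above imply a rough setup-time budget for each class of path. The Python sketch below is illustrative arithmetic only, not output from any tool. It assumes an ideal clock, so the 1 ns latency applies equally to the launch and capture edges and cancels out, and it ignores flip-flop setup time and library margins. All variable names are illustrative.

```python
# Rough setup-time budget derived from the listed constraints.
# Assumptions (not taken from any tool report): ideal clock, so clock
# latency cancels between launch and capture; register setup time ignored.

CLOCK_FREQ_MHZ = 200.0
CLOCK_UNCERTAINTY_NS = 0.5
INPUT_DELAY_NS = 1.0
OUTPUT_DELAY_NS = 1.0

# Period in ns = 1000 / frequency in MHz.
clock_period_ns = 1000.0 / CLOCK_FREQ_MHZ

# Time left for combinational logic within one clock period.
reg_to_reg_budget_ns = clock_period_ns - CLOCK_UNCERTAINTY_NS
input_to_reg_budget_ns = reg_to_reg_budget_ns - INPUT_DELAY_NS
reg_to_output_budget_ns = reg_to_reg_budget_ns - OUTPUT_DELAY_NS

print(f"Clock period:              {clock_period_ns:.2f} ns")
print(f"Register-to-register path: {reg_to_reg_budget_ns:.2f} ns")
print(f"Input-to-register path:    {input_to_reg_budget_ns:.2f} ns")
print(f"Register-to-output path:   {reg_to_output_budget_ns:.2f} ns")
```

        Under these assumptions, the input and output delays each consume 1 ns of the clock period, so paths that touch the chip boundary have less time available than internal register-to-register paths.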

​

RTL CODE AND TESTBENCHES:

        For this project we used a modified version of a RISC-V processor core called PicoRV32. We were provided with RTL code that performs ALU, shift, and jump operations. To ensure that this code was running correctly, we ran three Verilog testbenches: testbench_RS.sv (ALU), testbench_Stype.sv (Shift), and testbench_Jtype.sv (Jump). These files (RTL code and testbenches) can be found in the Source Code and Testbenches folder submitted with this report.

        To run these testbenches, we created a script named Logical_verification_script.tcl, which can be found in the Project Scripts folder. Figure 1 below shows a snippet of the output of one of the testbenches used to verify the functionality of the RTL code. The testbench shown is testbench_Jtype.sv, which verifies that the processor jumps to the intended locations. In our case, it runs correctly, as indicated by the “ok, Error= 0” results. The other testbenches also passed, which confirms that the RTL code works properly.

Figure 1: Snippet of testbench output of jump instruction.

SYNTHESIS OF DESIGN:

        After verifying that our design was functionally correct, we synthesized the RTL code to obtain a netlist for the physical implementation of our design. To do this, we wrote a script called Synthesis_script.tcl and used Design Compiler to synthesize the netlist and generate reports about our design. This script is located in the Project Scripts folder and the reports are found in the synth_reports folder.

        At this stage of the project, our design had no setup violations, but it did contain numerous hold violations. We still proceeded with our design because hold violations are typically fixed later, during the clock tree synthesis portion of physical design, once the actual clock skew is known. Figure 2 shows the timing report on setup time and Figure 3 shows the timing report on hold time, both generated in PrimeTime. The script Pre-Layout_PT_script.tcl was used to generate these reports and can be found in the Project Scripts folder.

Figure 2: PrimeTime Setup Timing Report.

Figure 3: PrimeTime Hold Timing Report.
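
        Setup and hold reports both come down to a slack value, with negative slack meaning a violation. The Python sketch below shows the usual sign convention: for setup, data must arrive before the required time, and for hold, data must arrive after it. The path timings are hypothetical numbers chosen for illustration and are not taken from Figures 2 or 3.

```python
def setup_slack_ns(data_required_ns: float, data_arrival_ns: float) -> float:
    """Setup check: data must arrive no later than the required time."""
    return data_required_ns - data_arrival_ns


def hold_slack_ns(data_arrival_ns: float, data_required_ns: float) -> float:
    """Hold check: data must arrive no earlier than the required time."""
    return data_arrival_ns - data_required_ns


def status(slack_ns: float) -> str:
    """Non-negative slack means the check is met."""
    return "MET" if slack_ns >= 0.0 else "VIOLATED"


# Hypothetical path timings in ns (illustrative only).
example_setup = setup_slack_ns(data_required_ns=4.5, data_arrival_ns=3.2)
example_hold = hold_slack_ns(data_arrival_ns=0.05, data_required_ns=0.12)

print(f"setup slack = {example_setup:+.2f} ns ({status(example_setup)})")
print(f"hold slack  = {example_hold:+.2f} ns ({status(example_hold)})")
```

        In this made-up example the same design meets setup but violates hold, which is the situation described for the pre-layout netlist.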

PHYSICAL DESIGN:

        We used IC Compiler to create a physical layout of our design using standard cells, driven by a script called Physical_design_script.tcl. The script first sets up a Milkyway database that contains our design. We then imported the netlist along with the logical libraries needed to implement our design and linked them together.

        Our script set up the design floorplan, followed by placement and clock tree synthesis. After clock tree synthesis, we used IC Compiler to report the setup and hold timing. At this stage, our design met both the setup and hold timing requirements. Figure 4 shows the setup timing report and Figure 5 shows the hold timing report. Next we moved on to routing, and then we extracted the parasitic information and the final netlist.

Figure 4: IC Compiler Setup Timing Report.

Figure 5: IC Compiler Hold Timing Report.

TIMING VERIFICATION RESULTS:

        After extracting parasitics and exporting the final netlist, we used PrimeTime to verify that our design met setup and hold time. Figure 6 shows that our design met the setup time constraint and Figure 7 shows that our design met the hold time constraint. These reports can be found in the post_extract_reports folder.

Figure 6: PrimeTime Post Extraction Setup Timing Report.

Figure 7: PrimeTime Post Extraction Hold Timing Report.

POWER AND AREA RESULTS:

        Our RISC-V processor consumes a total power of 1.136 mW, made up of 787.6 μW of cell leakage power and 348.4 μW of switching power. Figure 8 shows the power report from IC Compiler. Our layout has an area of 41,892 μm². Figure 9 shows the area report from IC Compiler and Figure 10 displays an image of our final layout. These reports can be found in the post_extract_reports folder.

Figure 8: Power Report of Final Design.

Figure 9: Area Report of Final Design.

Figure 10: Physical Layout of Design.
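
        The reported power and area figures can be cross-checked with a few lines of arithmetic. The Python sketch below uses only the numbers quoted in this section; the derived quantities (leakage share, area in mm², and the side length of an equivalent square) are illustrative conversions rather than values from the tool reports, and the layout is not necessarily square.

```python
import math

# Values quoted in this section of the report.
CELL_LEAKAGE_UW = 787.6
SWITCHING_POWER_UW = 348.4
AREA_UM2 = 41_892.0

# Total power is the sum of the two reported components, converted to mW.
total_power_uw = CELL_LEAKAGE_UW + SWITCHING_POWER_UW
total_power_mw = total_power_uw / 1000.0

# Fraction of the total that is leakage.
leakage_share = CELL_LEAKAGE_UW / total_power_uw

# Unit conversions: 1 mm^2 = 1e6 um^2.
area_mm2 = AREA_UM2 / 1e6

# Side length if the same area were a perfect square (illustration only).
equivalent_square_side_um = math.sqrt(AREA_UM2)

print(f"Total power:    {total_power_mw:.3f} mW")
print(f"Leakage share:  {leakage_share:.1%}")
print(f"Area:           {area_mm2:.6f} mm^2")
print(f"Square side:    {equivalent_square_side_um:.1f} um")
```

        The two components sum to the 1.136 mW total stated above, with leakage accounting for the larger share.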

DISCUSSION:

        During this project we noticed that the synthesis step was an iterative process. For example, we originally set the max_area constraint to zero and compiled, which produced a compact design with many timing violations. We then compiled with max_area set to 200, which resulted in a larger area but no timing errors. After a few iterations we converged on setting max_area to 70.

        We also took an iterative approach to the physical design portion of the project. When we set the core utilization to 80% or above, we got many Design Rule Check (DRC) errors. We achieved a good design with the core utilization set to 70%, but we continued to get max transition errors. In the last few iterations we managed to fix the timing errors, leaving only 3 to 4 max transition errors. We attempted alternative routing options, but unfortunately none of them resolved the remaining DRC errors.
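
        Core utilization is the ratio of standard-cell area to core area, so lowering it leaves more free space for routing. The Python sketch below illustrates that trade-off. It assumes, purely for illustration, that the 41,892 μm² figure is the total standard-cell area; the report does not say whether that number is cell area or core area, and the function name is hypothetical.

```python
def core_area_um2(cell_area_um2: float, utilization: float) -> float:
    """Core area needed to hold the cells at a given utilization (0 < u <= 1)."""
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in the range (0, 1]")
    return cell_area_um2 / utilization


# Assumption: treat the reported area as standard-cell area (illustration only).
ASSUMED_CELL_AREA_UM2 = 41_892.0

for utilization in (0.80, 0.70):
    core = core_area_um2(ASSUMED_CELL_AREA_UM2, utilization)
    free = core - ASSUMED_CELL_AREA_UM2
    print(f"utilization {utilization:.0%}: core {core:,.0f} um^2, "
          f"free space {free:,.0f} um^2")
```

        Under this assumption, dropping from 80% to 70% utilization increases the core area and leaves more unoccupied space, which is consistent with fewer DRC errors at the lower setting.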

​

        Overall, this project introduced us to the RISC-V processor and all of its benefits. We gained an understanding of how to take a design from a system-level description to physical layout. We learned how to use Synopsys EDA tools to synthesize a netlist from Verilog RTL code and how to use IC Compiler to create the physical layout. We also learned how to use PrimeTime to generate timing reports for our design. Furthermore, we grasped the power of creating scripts that allow us to iterate through multiple designs quickly. We are very glad that we were given the opportunity to work on this project as it can definitely be useful in the real world.

USEFUL LINKS:

https://riscv.org/

https://github.com/cliffordwolf/picorv32
