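Following the development of RISC-V, Chisel, and open-source IC and EDA in China
Headlines:
Marcelo Samsoniuk has released on GitHub, under the BSD License, a RISC-V processor that he designed in a single night, which says a lot about how lean the standard itself is.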
The main problem around the picorv32 is that most instructions requires 3 or 4 clocks per instruction, which resembles the 68020 in some ways, but running at 150MHz. Anyway, with 3 clocks per instruction, the peak performance is around 50MIPS only. As long I had some good experience with experimental RISC cores, I started code the darkriscv only to check the level of complexity. For my surprise, in the first night I mapped almost all instructions of the RV32I specification and the darkriscv started to execute the first instructions correctly at 75MHz and with one clock per instruction, which resembles a fast and nice 68040! wow! :)
Github Repo: darklife/darkriscv
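Hangzhou C-SKY (中天微), now an Alibaba subsidiary, recently announced a RISC-V processor aimed at IoT security. The processor optionally supports a TEE environment, and the core is small and lean.
Technical features:
Technical advantages: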
Link: C-SKY releases the world's first RISC-V processor with IoT security support (中天微发布全球首款支持物联网安全的RISC-V处理器)
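Microsoft and Google have both recently begun planning to improve the security of their services at the chip level. At the recent Hot Chips conference, Spectre/Meltdown remained a focus of attention, and there is still no really good solution.
Google previously announced its Titan security chip to improve the security of its services. Google has also announced that next year's Titan chip will use a RISC-V processor, possibly an open-source implementation.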
Google is forming an industry group to launch, probably next year, work on an open source implementation of Titan, possibly based on a 32-bit version of a RISC-V core. The open variant is geared for broad use in any embedded or consumer product.
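Editor's note: as things stand, Titan does not address problems of the Spectre/Meltdown kind. It is a solution for trusted boot and secure attestation, replacing the closed-source TPM of the past. The Titan Security Key, by contrast, is a Yubikey-like product; the two are easy to confuse.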
Link: Microsoft and Google Planning Silicon-Level Security
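On sw-dev, Pierre G. asked why the mscratch register is needed when pushing to and popping from the stack, and whether using it is mandatory.
The Privileged ISA specification describes the role of the mscratch register as follows: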
The mscratch register is an XLEN-bit read/write register dedicated for use by machine mode. Typically, it is used to hold a pointer to a machine-mode hart-local context space and swapped with a user register upon entry to an M-mode trap handler.
Samuel Falvo II and Michael Clark replied.
Samuel Falvo II's answer:
The problem is that the stack pointer represents the value of SP as it was just before the trap occurred. So if you’re implementing an interrupt handler, and interrupts are vectored, then sure, you can make the reasonable argument that SP is valid, and you can probably use it.
However, if you’re handling interrupts in a non-vectored manner, OR, if you’re handling a trap such as a page fault or other such thing, can you really trust the value of SP to be valid? Consider, maybe the SP register got corrupted and now contains an odd address; or, maybe it points into ROM; etc. OR, maybe the SP register is completely valid, but now points into a page which is not mapped (e.g., you’ve exceeded the OS-allocated stack space, and now a page fault handler has to expand the stack by mapping a new page). In pretty much any of these circumstances, a reference through SP will incur another trap, and will do so at precisely the worst possible time – between the trap having been taken and the time when user-state has been preserved for later restoration.
The use of the scratch register is important because it’s the only way you can statically guarantee a buffer large enough to hold user state while taking a trap and figuring out what to do about it. Maybe, after having decided it was a plain interrupt, you switch over to using the user-mode stack. Or, maybe for a page fault, you need to put the thread to sleep while the fault handler waits on I/O to page in a needed block of memory. Etc.
On other processor architectures, like ARM, 680x0, or x86, each privilege mode has its own stack pointer, which the supervisor-mode code can depend on to always be correct. The SP register gets reloaded with a constant every time that privilege mode is entered. RISC-V doesn’t have this mechanism in hardware; relying on the scratch register is the closest analog it has.
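In summary: on ARM, 680x0, or x86 processors, each privilege mode has its own stack pointer, so SP holds a correct value every time supervisor mode is entered, which guarantees correct operation. RISC-V has no such mechanism in hardware and can only emulate it through the mscratch register.
Michael Clark's answer:
It is needed when you implement U mode so that user mode and machine mode can have separate stacks. Arguably it could be made optional for M-mode only systems. A separate interrupt stack doesn’t seem like it should necessarily be mandatory.
In other words: when user mode and machine mode need separate stacks, mscratch is required; an M-mode-only system that handles interrupts on a single stack can do without it.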
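The swap described in the specification text can be pictured with a very small C model. This is only a sketch under stated assumptions: `struct hart`, `trap_entry`, `trap_exit`, and the addresses are hypothetical names and values chosen for illustration, and a real handler would perform the exchange with a single `csrrw sp, mscratch, sp` instruction rather than in C.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified model of one hart: only the two registers
   that matter for this discussion. */
struct hart {
    uint32_t sp;        /* general-purpose stack pointer (x2) */
    uint32_t mscratch;  /* machine-mode scratch CSR */
};

/* Models `csrrw sp, mscratch, sp`: exchange the two values. */
static void swap_sp_mscratch(struct hart *h)
{
    uint32_t old_sp = h->sp;
    h->sp = h->mscratch;
    h->mscratch = old_sp;
}

/* On entry, sp becomes the trusted M-mode pointer; the possibly
   invalid user sp is parked in mscratch without being dereferenced. */
void trap_entry(struct hart *h) { swap_sp_mscratch(h); }

/* On exit, the same exchange restores both values. */
void trap_exit(struct hart *h) { swap_sp_mscratch(h); }

void check_trap_roundtrip(void)
{
    struct hart h = {
        .sp = 0x00000003u,        /* corrupted, odd user stack pointer */
        .mscratch = 0x80010000u   /* hypothetical M-mode context pointer */
    };

    trap_entry(&h);
    assert(h.sp == 0x80010000u);  /* handler works from a known-good area */
    assert(h.mscratch == 0x00000003u);

    trap_exit(&h);
    assert(h.sp == 0x00000003u);  /* user state is returned untouched */
    assert(h.mscratch == 0x80010000u);
}
```

The point of the model is that nothing between entry and exit reads memory through the user's sp, which is the situation Samuel Falvo II describes as the worst possible time to take a second trap.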
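RISC-V currently has two code models: medlow and medany.
In both code models, lui and auipc set the upper 20 bits, [31:12], of a global symbol's address. In the medlow model, the lui/ld instruction pair addresses a 32-bit range with 0x00000000 as the base; when several global accesses occur close together, they can share the upper address set by a single lui. In the medany model, lui is replaced by auipc, which uses pc as the base. Because every global access is pc-relative, medany code can be placed anywhere in the address space, lifting medlow's address-range restriction. medany can also support load-time relocation, that is, the generation of position independent code (PIC).
In a Linux environment, every program runs in virtual memory with a virtual address space of its own. When load-time relocation is not needed (that is, for anything other than a dynamically loaded library), GCC can generate better-performing code with medlow's lui/ld pair, so RISC-V GCC on Linux defaults to the medlow code model. When compiling a dynamically loaded library, the medany code model has to be selected by hand.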
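As a rough illustration of how the two models form an address, the sketch below splits a 32-bit value into the 20-bit upper part carried by lui or auipc and the signed 12-bit lower part carried by the load. The function names are hypothetical, and the example addresses are borrowed from the disassembly quoted further down in this issue; the arithmetic itself follows the usual rule of adding 0x800 before taking the upper bits so that the sign-extended lower part lands on the target.

```c
#include <assert.h>
#include <stdint.h>

/* %hi / %pcrel_hi: the 20-bit field placed in lui or auipc.
   Adding 0x800 first compensates for the sign-extended low part. */
static uint32_t hi20(uint32_t value)
{
    return ((value + 0x800u) >> 12) & 0xfffffu;
}

/* %lo / %pcrel_lo: the low 12 bits, read as a signed immediate. */
static int32_t lo12(uint32_t value)
{
    int32_t low = (int32_t)(value & 0xfffu);
    return low >= 0x800 ? low - 0x1000 : low;
}

/* medlow: lui reg, %hi(sym) followed by a load at %lo(sym)(reg). */
static uint32_t medlow_address(uint32_t sym)
{
    uint32_t reg = hi20(sym) << 12;            /* base is 0x00000000 */
    return reg + (uint32_t)lo12(sym);
}

/* medany: auipc reg, %pcrel_hi(sym) followed by a load at
   %pcrel_lo(label)(reg). */
static uint32_t medany_address(uint32_t pc, uint32_t sym)
{
    uint32_t offset = sym - pc;                /* pc-relative distance */
    uint32_t reg = pc + (hi20(offset) << 12);  /* base is pc */
    return reg + (uint32_t)lo12(offset);
}

/* Example values: an auipc at 0x101ac reaching a symbol at 0x129a8. */
void check_split_example(void)
{
    assert(medlow_address(0x129a8u) == 0x129a8u);
    assert(medany_address(0x101acu, 0x129a8u) == 0x129a8u);
    assert(hi20(0x129a8u - 0x101acu) == 0x2u);   /* auipc a5,0x2 */
    assert(lo12(0x129a8u - 0x101acu) == 2044);   /* lw a0,2044(a5) */
}
```

In the medlow case the upper part depends only on the symbol, which is why neighbouring accesses can reuse one lui; in the medany case it depends on the pc of each auipc as well.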
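Why does medany produce somewhat slower code than medlow?
Because pc is not a constant, letting several nearby global accesses share the upper address set by a single auipc could produce wrong addresses.
As a result, GCC cannot optimize the auipc/ld pair well and emits a large number of auipc instructions.
In addition, current GCC can even generate incorrect code with the option combination -O -mcmodel=medany -mexplicit-relocs.
Use the medlow model unless there is a reason not to. If you need to generate relocatable code, avoid the -O -mcmodel=medany -mexplicit-relocs combination for now.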
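/* The array size is chosen so that the pc-relative offset to ll
   ends up just below the 12-bit immediate limit. */
int array[995] = { [10] = 10, [99] = 99 };
long long ll = 100;

long long
sub (void)
{
  return ll;
}

int
main (void)
{
  return sub ();
}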
Compiled with a riscv GNU toolchain configured for rv32i/newlib and the options -O -mcmodel=medany -mexplicit-relocs, this code produces the following assembly:
sub:
.LA0: auipc a5,%pcrel_hi(ll)
lw a0,%pcrel_lo(.LA0)(a5)
lw a1,%pcrel_lo(.LA0+4)(a5)
ret
This looks reasonable, although it arguably should be %pcrel_lo(.LA0)+4, because the +4 belongs after the address of ll, not after .LA0. But disassembling a.out shows:
000101ac <sub>:
101ac: 00002797 auipc a5,0x2
101b0: 7fc7a503 lw a0,2044(a5) # 129a8 <ll>
101b4: 8007a583 lw a1,-2048(a5)
101b8: 00008067 ret
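The +4 offset overflowed, and incorrect code was generated without any warning.
I chose the array size carefully to force the error to reproduce.
The cause is that although the variable ll is 8-byte aligned, the auipc is not, and medany uses the offset from the auipc instruction to ll, so that offset is not a multiple of 8. auipc is only guaranteed to be 4-byte aligned (when the compressed C extension is not enabled), or 2-byte aligned when it is. GCC assumes that any offset smaller than the variable's alignment is safe, which is clearly not the case in this example.
The same error can occur with long double and int128_t on rv32 and rv64, because they require 16-byte alignment.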
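The wrap-around in the disassembly can be replayed with a few lines of C. This is a sketch only: the helper names are hypothetical, and the numbers are taken from the disassembly above (auipc a5,0x2 at 0x101ac, ll at 0x129a8). It models what happens when an offset is squeezed into the signed 12-bit immediate of a load.

```c
#include <assert.h>
#include <stdint.h>

/* What a 12-bit I-type immediate holds after encoding: only the
   low 12 bits survive, and they are sign-extended when used. */
static int32_t imm12_wrap(int32_t offset)
{
    int32_t low = (int32_t)((uint32_t)offset & 0xfffu);
    return low >= 0x800 ? low - 0x1000 : low;
}

/* Nonzero when the offset is representable without wrapping. */
static int fits_imm12(int32_t offset)
{
    return offset >= -2048 && offset <= 2047;
}

/* Address touched by `lw rd, offset(base)` once offset is encoded. */
static uint32_t load_address(uint32_t base, int32_t offset)
{
    return base + (uint32_t)imm12_wrap(offset);
}

void check_overflow_example(void)
{
    uint32_t a5 = 0x101acu + (0x2u << 12);       /* auipc a5,0x2 */
    assert(a5 == 0x121acu);

    /* First word of ll: offset 2044 still fits. */
    assert(fits_imm12(2044));
    assert(load_address(a5, 2044) == 0x129a8u);

    /* Second word: 2044 + 4 = 2048 does not fit and wraps to -2048,
       so the load lands 4 KiB below the intended 0x129ac. */
    assert(!fits_imm12(2044 + 4));
    assert(imm12_wrap(2044 + 4) == -2048);
    assert(load_address(a5, 2044 + 4) == 0x119acu);
}
```

Had the auipc been 8-byte aligned like ll, the low part of the offset would itself be a multiple of 8, at most 2040, and adding 4 could not cross the 2047 limit; that is the assumption GCC makes and the one this test case breaks.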
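Unfortunately, there is no good solution at present. Disallowing offsets on pcrel_lo would require extra address-generating instructions, making the code less efficient than medlow. Forcing auipc to be suitably aligned could mean inserting several nops before an auipc, which hurts both code size and performance.
Using -mcmodel=medany alone is safe and reliable, but combining -mcmodel=medany with -mexplicit-relocs causes the problem.
For now, making this gcc option a no-op is probably the best choice, because renaming it or emitting a warning would break builds and draw complaints. If the option is simply made ineffective, few people will notice.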
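Replying to an earlier message from Tommy Murphy, Jim Wilson explained the following. First, for building gdb and binutils:
Binutils can be built from mainline. gdb can be built from mainline or from the gdb-8.2-branch (not yet officially released), or from the gdb branch of the riscv-gnu-toolchain repository.
Separate riscv-binutils and riscv-gdb trees have been created in the riscv-gnu-toolchain repository, where:
For setting up an environment to run the gcc testsuite, see:
Reference documents for testing the various FSF tools: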
The riscv-binutils-gdb repository is explained as follows:
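Replying to Tommy Murphy's earlier question about toolchain component versions, Karsten Merker explained the following:
Because the RISC-V toolchain used to be in constant flux, rocket-chip had to include a git submodule link to one specific version of riscv-tools. A recent pull request shows that the developers are removing the riscv-tools submodule, because the toolchain is now stable enough that pinning a specific version is no longer necessary.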
The software ecosystem is now mostly stable. There are work-arounds to avoid bringing the giant riscv-tools into the tree for most projects using rocket. This PR makes their life easier.
Github PR: https://git.io/fA3h0
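刘鹏 (Liu Peng) is translating and organizing the SiFive TileLink Specification. Part of it has been translated so far, and the rest will be released over time. Once the translation is complete, it will be reorganized and proofread. The document translates and explains the TileLink protocol, focusing on the signal transfer rules within each TileLink channel, and is suited to beginners who want to understand the characteristics and design ideas of TileLink. The translation is linked below, followed by the original English document.
Richard W.M. Jones has published on GitHub the complete set of scripts he uses to build Fedora RISC-V Linux. Anyone interested in the bootstrap process of building a Linux platform should not miss this project.
Rambus recently published a blog post laying out their attitude toward adopting an open-standard ISA.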
With companies like Apple, Facebook, Google, and Samsung building their own processors instead of relying on Intel, Qualcomm, or others, there is major interest in RISC-V. The open source, free approach could potentially lower risks associated with building custom chips. Because of the low cost and the open source nature of the architecture, manufacturers are free to design a chip without expending the amount of resources usually associated with designing a chip. Companies like Nvidia and Western Digital have signed on to use RISC-V in their own silicon. The former is using RISC-V for a governing microcontroller that manages its graphics cards while the latter plans to unveil a new RISC-V processor for its cores in its hard drives for 2019-2020.
Link: Lowering Risks with RISC-V
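CNRV publishes one-line job postings for companies in the industry as a free public service. If you have openings in computer architecture, IC design, or software development, please contact us!
Compiled and edited by: 宋威, 黄柏玮, 汪平, 林容威, 傅炜, 巍巍, 郭雄飞, 黄玮
Special thanks: 刘鹏 (Liu Peng)
Follow the WeChat public account CNRV to receive the latest and most fashionable RISC-V news!
RISC-V Day Tokyo will be held at Keio University, and the call for presentations is now open.
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 China Mainland License.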