A20 issue

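I have recently been reading 「30天自制操作系统」 (Japanese original: 「三十日でできる！OS自作入門」). It mentions that when an address is formed from ES and BX, it is computed as ES * 16 + BX. I was very confused about why the multiplier is 16. I asked about it on Stack Overflow, but I still could not quite follow the discussion there.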

However, the first answer mentioned a key term: the A20 issue. I then found the following passage on Wikipedia, and things finally started to make sense:

The early Intel 8086, Intel 8088, and Intel 80186 processors had 20 address lines, numbered A0 to A19; with these, the processor can access 2^20 bytes or 1 megabyte. Internal address registers of these processors only had 16 bits. To access a 20-bit address space, an external memory reference was made up of a 16-bit Offset address added to a 16-bit Segment number, shifted 4 bits so as to produce a 20-bit physical address. The resulting address is equal to Segment * 16 + Offset. There are many combinations of segment and offset that produce the same 20-bit physical address. In consequence there were various ways to address the same byte in memory. For example, here are four of the 4096 different segment:offset combinations, all referencing the byte whose physical address is 0x000FFFFF (the last byte in 1 MB-memory space):

F000:FFFF
FFFF:000F
F555:AAAF
F800:7FFF

Referenced the last way, an increase of one in the offset yields F800:8000, which is a proper address for the processor, but since it translates to the physical address 0x00100000 (the first byte over 1 MB) the processor would need another address-line to actually access this byte. Since such a line doesn't exist on the 8086 line of processors, the 21st bit above, while set, gets dropped, causing the address F800:8000 to "wrap around" and to actually point to the physical address 0x00000000.
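
The quoted passage can be checked with a few lines of Python. This is only a minimal sketch, not an emulator: the function name `physical_address` and its `address_lines` parameter are names I made up for illustration, and the bit mask is my way of modelling the dropped 21st bit on an 8086.

```python
def physical_address(segment: int, offset: int, address_lines: int = 20) -> int:
    """Translate a real-mode segment:offset pair to a physical address.

    Hypothetical helper for illustration. The result is truncated to
    `address_lines` bits, which models the wrap-around on an 8086.
    """
    linear = (segment << 4) + offset            # same as segment * 16 + offset
    return linear & ((1 << address_lines) - 1)  # drop bits above the last address line


# The four pairs from the quote all land on the last byte of the 1 MB space.
pairs = [(0xF000, 0xFFFF), (0xFFFF, 0x000F), (0xF555, 0xAAAF), (0xF800, 0x7FFF)]
for seg, off in pairs:
    print(f"{seg:04X}:{off:04X} -> {physical_address(seg, off):#07x}")

# Count every segment whose offset can still reach 0xFFFFF without wrapping.
combos = sum(1 for seg in range(0x10000) if 0 <= 0xFFFFF - (seg << 4) <= 0xFFFF)
print(combos)  # 4096

# One byte further wraps to 0 with 20 address lines...
print(hex(physical_address(0xF800, 0x8000)))                    # 0x0
# ...but reaches 0x100000 if a 21st line (A20) is available.
print(hex(physical_address(0xF800, 0x8000, address_lines=21)))  # 0x100000
```

Writing the multiplication as `segment << 4` makes the "shifted 4 bits" wording in the quote visible: 16 is simply 2^4, the factor that stretches a 16-bit segment into a 20-bit base.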

Reference: Wikipedia

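My own impression is that there is nothing deep about the 16; it was chosen simply because it works. Multiplying by 16 (a 4-bit shift) is just enough for a 16-bit segment plus a 16-bit offset to cover everything that 20 address lines can reach, with only a little left over, so it meets the need without wasting much address space.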

To make this concrete: the original 8086 has 20 address lines, which gives:

2 ^ 20 / 1024 = 1024 KB

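The largest value the formula ES * 16 + BX can produce (FFFF:FFFF) is:

(65535 * 16 + 65535) / 1024 ≈ 1087.98 KB

So the formula reaches just under 1088 KB, slightly more than 1024 KB: enough to cover the need without wasting many addresses.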
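
The two figures above can be recomputed directly. This is a small sketch of the arithmetic only; the variable names are mine.

```python
ADDRESS_LINES = 20

# Bytes reachable through 20 address lines.
bus_bytes = 2 ** ADDRESS_LINES

# Highest linear address segment:offset can form (FFFF:FFFF), before any wrap-around.
max_linear = 0xFFFF * 16 + 0xFFFF

print(bus_bytes / 1024)             # 1024.0
print(hex(max_linear))              # 0x10ffef
print(round(max_linear / 1024, 2))  # 1087.98

# Number of addresses the formula can form beyond the 1 MB boundary.
overshoot = max_linear + 1 - bus_bytes
print(overshoot)                    # 65520
```

The overshoot is just under 64 KB. On an 8086 those addresses are the ones that wrap around to the bottom of memory, as the Wikipedia passage describes.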
