Logical address

Introduction

In a computer that performs address translation, the address (operand) given by a memory-access instruction is called a logical address, or relative address. The actual effective address in main memory, that is, the physical address, is obtained by calculating or transforming the logical address according to the addressing mode.

The addressing modes (address translation mechanisms) differ from one computer to another. When writing a program in assembly language, you must first be familiar with the instruction set of the target machine.

Reference book explanation

1. In a computer that performs address translation, the address (operand) given by a memory-access instruction is called the logical address, also called the relative address. The physical address in main memory is obtained by calculating or transforming it according to the addressing mode.

2. The address used in the user program is called the relative address or logical address.

3. The logical address is composed of two 16-bit address components: the segment base value and the offset. Both components are encoded as unsigned numbers.

Academic literature explanation

1. In this way, the address of a storage unit can be represented by a segment base address (segment address) and an offset within the segment (offset address). The base address determines where the segment lies in the entire storage space, and the offset determines the unit's location within the segment. This address representation is called a logical address, and is usually written in the form segment address:offset address.

2. The so-called logical address refers to the disk location given by the logical block number of the data (1 block = 512 words, 1 word = 64 bits), while the physical address is the address determined by physical positions such as the disk's cylinder, head and sector.

Logical address

Background

Going back to the origin, Intel's 8-bit 8080 CPU has an 8-bit data bus (DB) and a 16-bit address bus (AB). The 16-bit address information must also travel over the 8-bit data bus, and it is held in the temporary registers of the data path as well as in the CPU's registers and in memory. Because the width of AB is exactly an integer multiple of the width of DB, this causes no conflict.

But when the design moved up to a 16-bit machine, the Intel 8086/8088 CPU could not exceed 40 pins because of the limits of IC integration, packaging and pin technology at the time. The original addressing capability of the 8-bit machine, 2^16 = 64 KB, was felt to be too small, yet widening the address bus directly to an integer multiple of 16, that is AB = 32 bits, was not achievable. Therefore AB could only be widened by 4 bits for the time being, to 20 bits. The resulting addressing capacity of 2^20 = 1 MB is 16 times larger. But this created a mismatch between the 20-bit AB and the 16-bit DB: a 20-bit address cannot be transmitted over the DB in one piece, nor stored in a 16-bit CPU register or memory unit. This is how the segmented structure of the CPU came into being.
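To make the 16-bit versus 20-bit mismatch concrete, here is a minimal sketch of how the 8086 forms a 20-bit address from two 16-bit values: the segment value is shifted left by 4 bits and the offset is added. The function name real_mode_physical is hypothetical and used only for illustration.

```python
def real_mode_physical(segment: int, offset: int) -> int:
    """Combine a 16-bit segment and a 16-bit offset into a 20-bit address."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be unsigned 16-bit values")
    # Shift the segment left by 4 bits (multiply by 16), add the offset,
    # and keep only 20 bits, since the 8086 has 20 address lines.
    return ((segment << 4) + offset) & 0xFFFFF


if __name__ == "__main__":
    print(hex(real_mode_physical(0x1234, 0x0010)))  # 0x12350
```

Two 16-bit registers are enough to reach any location in the 1 MB space, which is why the segment:offset form fits a CPU whose registers are only 16 bits wide.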

Linear address

A logical address consists of two parts: the segment identifier and the offset within the segment. The segment identifier is a 16-bit field called the segment selector. Its upper 13 bits are an index. An index can be understood as the subscript of an array, so there must be a corresponding array. An index into what? Into "segment descriptors". A segment descriptor describes a segment (on the word "segment": we can think of virtual memory as being divided into segments. For example, a memory of 1024 bytes can be divided into 4 segments of 256 bytes each). Many segment descriptors are grouped into an array called the "segment descriptor table". A specific segment descriptor can therefore be found directly in the segment descriptor table through the upper 13 bits of the segment selector.

The descriptor describes a segment. The picture of a segment given above was not precise, because it is by looking at what is in the descriptor, that is, how the segment is described, that you can understand what a segment really is. Each segment descriptor is composed of 8 bytes, as shown in Figure 1:

The layout is quite complicated. A data structure could be used to define it, but we only care about one thing here, the Base field, which gives the linear address of the start of a segment.

The original intention of Intel's design was that global segment descriptors are placed in the "Global Segment Descriptor Table (GDT)", and local ones, such as each process's own, are placed in the so-called "Local Segment Descriptor Table (LDT)". So when is the GDT used and when the LDT? This is indicated by the TI (table indicator) field in the segment selector: TI=0 means the GDT is used, and TI=1 means the LDT is used.

The address and size of the GDT in memory are stored in the CPU's gdtr register, while those of the LDT are in the ldtr register.
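Since only the Base field matters here, the following sketch shows how it could be pulled out of an 8-byte descriptor. It assumes the standard 32-bit x86 layout, in which the base is scattered over bytes 2, 3, 4 and 7 of the descriptor as stored in memory. The function name descriptor_base and the sample bytes are hypothetical.

```python
def descriptor_base(desc: bytes) -> int:
    """Reassemble the 32-bit Base field from an 8-byte segment descriptor."""
    if len(desc) != 8:
        raise ValueError("a segment descriptor is exactly 8 bytes")
    # Base bits 0-15 are in bytes 2-3, bits 16-23 in byte 4,
    # and bits 24-31 in byte 7 (bytes 0-1, 5 and 6 hold limit and flags).
    return desc[2] | (desc[3] << 8) | (desc[4] << 16) | (desc[7] << 24)


if __name__ == "__main__":
    # Illustrative descriptor whose base is 0x12345678.
    sample = bytes([0xFF, 0xFF, 0x78, 0x56, 0x34, 0x9A, 0xCF, 0x12])
    print(hex(descriptor_base(sample)))  # 0x12345678
```

The other fields (limit, access rights, flags) are deliberately ignored in this sketch.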

There are many concepts here, and they read like a tongue twister. Figure 2 is more intuitive:

First, given a complete logical address [segment selector: offset within the segment]:

1. Look at the TI (table indicator) bit of the segment selector, 0 or 1, to know whether the segment to be translated is in the GDT or the LDT, then get that table's address and size from the corresponding register. We now have an array.

2. Take the upper 13 bits of the segment selector and use them to find the corresponding segment descriptor in this array. This gives its Base, that is, the base address.

3. Base + offset is the linear address we are looking for.
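The three steps above can be sketched in a few lines. This is a simplified model rather than what the hardware does in full: the two tables are plain Python lists, each descriptor carries only its Base, and privilege and limit checks are omitted. The names Descriptor and logical_to_linear are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Descriptor:
    base: int  # linear address at which the segment starts


def logical_to_linear(selector: int, offset: int,
                      gdt: List[Descriptor], ldt: List[Descriptor]) -> int:
    """Translate [selector:offset] into a 32-bit linear address."""
    # Step 1: bit 2 of the selector (TI) picks the table.
    table = ldt if (selector >> 2) & 1 else gdt
    # Step 2: the upper 13 bits of the selector index into that table.
    index = selector >> 3
    if index >= len(table):
        raise IndexError("selector index is outside the descriptor table")
    # Step 3: Base + offset is the linear address.
    return (table[index].base + offset) & 0xFFFFFFFF


if __name__ == "__main__":
    gdt = [Descriptor(0), Descriptor(0x00100000)]
    ldt = [Descriptor(0x00400000)]
    print(hex(logical_to_linear(0x08, 0x1234, gdt, ldt)))  # 0x101234
    print(hex(logical_to_linear(0x04, 0x0010, gdt, ldt)))  # 0x400010
```

Selector 0x08 has index 1 and TI=0, so it picks the second GDT entry. Selector 0x04 has index 0 and TI=1, so it picks the first LDT entry.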

Related Differences

Logical Address refers to the offset part of an address, relative to a segment, that is generated by the program. For example, in C pointer programming you can read the value of a pointer variable itself (such as one produced by the & operator). That value is in fact a logical address: it is relative to the start of the data segment of your current process and has nothing to do with the absolute physical address. Only in Intel real mode does the logical address lead to the physical address without any table-based translation (real mode has no descriptor-based segmentation and no paging, so the CPU simply adds the offset to the segment value shifted left by 4 bits). In Intel protected mode, the logical address is the offset within the limit of the code segment the program is executing in (assuming that the code segment and data segment are exactly the same). Application programmers only need to deal with logical addresses. The segmentation and paging mechanisms are completely transparent to them and concern only system programmers. Although application programmers can manipulate memory directly, they can only operate on the memory segments the operating system has allocated to them.

Linear Address is the intermediate layer in the translation from logical address to physical address. Program code generates a logical address, that is, an offset within a segment, and adding the base address of the corresponding segment produces a linear address. If paging is enabled, the linear address is translated once more to produce a physical address. If paging is not enabled, the linear address is the physical address. The linear address space of the Intel 80386 is 4 GB (2^32, that is, addressing with a 32-bit address bus).

Physical Address (Physical Address) refers to the address signal that appears on the CPU external address bus to address physical memory, and is the final address of the address conversion. If the paging mechanism is enabled, the linear address will be converted into a physical address using the entries in the page directory and page table. If the paging mechanism is not enabled, the linear address becomes the physical address directly.
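As a rough illustration of the paged case, the sketch below walks a page directory and a page table. It assumes the classic 80386 scheme with 4 KB pages, in which a 32-bit linear address splits into a 10-bit directory index, a 10-bit table index and a 12-bit page offset. The tables are modelled as Python dictionaries mapping an index to the next level or to a physical frame number, and the name linear_to_physical is hypothetical.

```python
from typing import Dict

PAGE_SIZE = 4096  # 4 KB pages, assumed for this sketch


def linear_to_physical(linear: int,
                       page_directory: Dict[int, Dict[int, int]]) -> int:
    """Translate a 32-bit linear address through a two-level page table."""
    dir_index = (linear >> 22) & 0x3FF    # top 10 bits
    table_index = (linear >> 12) & 0x3FF  # middle 10 bits
    page_offset = linear & 0xFFF          # low 12 bits
    page_table = page_directory.get(dir_index)
    if page_table is None:
        raise LookupError("page directory entry not present")
    frame = page_table.get(table_index)
    if frame is None:
        raise LookupError("page table entry not present")
    # The frame number selects a physical page; the offset is unchanged.
    return frame * PAGE_SIZE + page_offset


if __name__ == "__main__":
    directory = {1: {2: 0x1234}}  # one mapped page, for illustration
    print(hex(linear_to_physical(0x00402567, directory)))  # 0x1234567
```

A missing entry raises LookupError here, standing in for the page fault a real CPU would raise.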

Virtual Memory means that the amount of memory the computer presents to programs is much larger than the memory it actually has. It therefore allows programmers to write and run programs that need much more memory than the system physically provides, so that many large projects can be implemented on systems with limited memory resources. A fitting analogy: you don't need a full-length track to run a train from Shanghai to Beijing. Rails that are just long enough (say 3 kilometers) will do. The method is to take the rails behind the train and lay them in front of it at once. As long as this is done fast enough, the train runs as if on a complete track. This is the task that virtual memory management has to accomplish. In the Linux 0.11 kernel, each program (process) is allotted a virtual memory space with a total capacity of 64 MB. Therefore, the logical address range of a program is 0x0000000 to 0x4000000.

Sometimes we also refer to logical addresses as virtual addresses, because, like the concept of a virtual memory space, a logical address is independent of the actual physical memory capacity.

The "gap" between the logical address and the physical address is 0xC0000000, because the virtual address -> linear address -> physical address mapping differs by exactly this value. This value is specified by the operating system.
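If the mapping really is a constant offset as described, converting between the two kinds of address is a single subtraction or addition. The sketch below assumes exactly that. The names GAP, virt_to_phys and phys_to_virt are hypothetical, and a real system uses such a fixed offset only for the address ranges the operating system maps this way.

```python
GAP = 0xC0000000  # offset between virtual and physical addresses, per the text


def virt_to_phys(virtual: int) -> int:
    """Map a virtual address at or above GAP to its physical address."""
    if virtual < GAP:
        raise ValueError("address is below the constant-offset region")
    return virtual - GAP


def phys_to_virt(physical: int) -> int:
    """Inverse mapping: physical address to virtual address."""
    return physical + GAP


if __name__ == "__main__":
    print(hex(virt_to_phys(0xC0100000)))  # 0x100000
    print(hex(phys_to_virt(0x00100000)))  # 0xc0100000
```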