Creation history
In the spring of 1991, Intel completed the first prototype of the PCI bus. The engineers had been tasked with developing an inexpensive, high-performance solution that would unlock the capabilities of the 486, Pentium, and Pentium Pro processors. They also had to take into account the mistakes VESA had made in designing the VLB bus (its electrical load did not allow more than three expansion boards to be connected) and to implement automatic device configuration, modeled on the Autoconfig protocol of Amiga computers. The marketing mistakes made with MCA, which had led to the "gang of nine" confrontation and to EISA, were also taken into account.

In 1992 the first version of the PCI bus appeared. Intel announced that it would be an open standard and formed the PCI Special Interest Group, so any interested developer could create PCI devices without purchasing a license. The first version of the bus was clocked at 33 MHz, could be 32 or 64 bits wide, and supported devices using either 5 V or 3.3 V signaling. The theoretical bandwidth of the 32-bit bus was 133 Mbytes/s, but in practice it was about 90 Mbytes/s.
In mid-1993, Intel withdrew from VESA and began actively marketing the PCI bus. Criticism on Usenet and from competitors (the bus shared many characteristics with Zorro III, and articles were published alleging that the bus design was faulty) was answered with PCI 2.0.
PCI 2.1 (also called the "parallel PCI bus") was introduced in 1995. It offered a bus clock of 66 MHz and a maximum transfer rate of 533 Mbytes/s (for the 64-bit, 66 MHz version). The bus was also supported at the operating system level by Windows 95 (Plug and Play technology). PCI 2.1 proved so popular that it was soon adopted on platforms with Alpha, MIPS, PowerPC, SPARC, and other processors.
In 1997, as computer graphics developed and the AGP bus appeared, PCI ceased to be used for graphics cards because it could no longer meet their higher requirements.
In the late 2000s and early 2010s, the PCI interface was gradually superseded by PCI Express and USB. On consumer motherboards the number of PCI connectors fell to one or two at most, compared with the three, four, or more common in the early 2000s. Some motherboards (especially compact ones such as the mATX form factor) have no PCI connector at all.
Architecture
The bus originally had 32 address/data conductors and ran at 33 MHz. Later versions added 64 conductors (using an additional connector section) and a 66 MHz clock.
The bus is decentralized: there is no single master device, and any device can initiate a transaction. The initiator is selected by arbitration, carried out by separate arbiter logic. Arbitration is "hidden" and costs no bus time, because the next initiator is selected while the previous initiator's transaction is still in progress.
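The arbitration scheme can be illustrated with a small model. The text does not say which selection policy the arbiter uses, so the round-robin policy below is an assumption chosen for simplicity, and the class and method names are hypothetical.

```python
class RoundRobinArbiter:
    """Toy central arbiter: one request per device, one grant at a time."""

    def __init__(self, device_count):
        self.device_count = device_count
        self.last_granted = device_count - 1  # so device 0 is checked first

    def next_grant(self, requests):
        """Pick the next initiator from the set of requesting device numbers.

        In this model the method is called while the current transaction is
        still running, which is what makes the arbitration "hidden".
        """
        for offset in range(1, self.device_count + 1):
            candidate = (self.last_granted + offset) % self.device_count
            if candidate in requests:
                self.last_granted = candidate
                return candidate
        return None  # nobody is requesting the bus


arbiter = RoundRobinArbiter(4)
first = arbiter.next_grant({1, 3})   # device 1 is granted the bus
second = arbiter.next_grant({1, 3})  # device 3 goes next, chosen during device 1's transfer
```

With round-robin selection no requesting device can be starved, but a real arbiter may just as well use fixed or weighted priorities.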
A transaction consists of one or two address cycles followed by one or more data cycles. Two address cycles are used to transfer a 64-bit address; this is not supported by all devices, and it allows DMA to memory above 4 GB. A transaction with many data cycles is called a "burst". It reads or writes consecutive addresses and is faster, because one address cycle serves several data cycles rather than each one, and there are no idle cycles (to let the conductors "settle") between transactions.
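The benefit of a burst can be estimated with a rough clock count. The sketch below is a simplified model rather than a timing-accurate one: it assumes one clock per address cycle, one clock per data cycle with no wait states, and one idle turnaround clock after each transaction. The function names are illustrative.

```python
def transaction_clocks(data_cycles, address_cycles=1, turnaround=1):
    """Clocks used by one transaction in this simplified model."""
    if address_cycles not in (1, 2):
        raise ValueError("a transaction has 1 or 2 address cycles")
    if data_cycles < 1:
        raise ValueError("a transaction has at least one data cycle")
    return address_cycles + data_cycles + turnaround


def efficiency(words, burst):
    """Fraction of bus clocks that carry data when moving `words` words."""
    if burst:
        total = transaction_clocks(words)      # one address cycle for all words
    else:
        total = words * transaction_clocks(1)  # one address cycle per word
    return words / total


burst_share = efficiency(8, burst=True)    # 8 data clocks out of 10
single_share = efficiency(8, burst=False)  # 8 data clocks out of 24
```

Under these assumptions an eight-word burst keeps the bus busy with data for 80% of the clocks, against roughly a third for eight single-word transactions.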
Special transaction types are used to access the configuration space of a device.
"Burst" transactions can be temporarily suspended by either device when data is missing from a buffer or a buffer is full.
"Split" transactions are supported: the target device responds with an "in progress" state, and the initiator must release the bus for other devices, reacquire it through arbitration, and repeat the transaction. This continues until the target device responds "done". The mechanism is used to interface buses with different speeds (PCI itself and the CPU's front-side bus) and to prevent deadlocks in configurations with many inter-bus bridges.
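The repeat-until-done behavior can be sketched as a loop on the initiator side. This is a toy model: the `SlowTarget` class, the status strings, and the `max_attempts` limit are assumptions made for illustration, and releasing and reacquiring the bus is represented only by a comment.

```python
class SlowTarget:
    """Hypothetical target that needs several attempts before data is ready."""

    def __init__(self, attempts_needed, value):
        self.remaining = attempts_needed
        self.value = value

    def read(self):
        """Return ("done", value) when ready, otherwise ("in progress", None)."""
        if self.remaining > 1:
            self.remaining -= 1
            return "in progress", None
        return "done", self.value


def read_with_retry(target, max_attempts=16):
    """Repeat the read until the target answers "done".

    Returns the value and the number of attempts that were needed.
    """
    for attempt in range(1, max_attempts + 1):
        status, value = target.read()
        if status == "done":
            return value, attempt
        # A real initiator would release the bus here and arbitrate for it
        # again, so other devices can run their transactions in between.
    raise TimeoutError("target never completed the transaction")


value, attempts = read_with_retry(SlowTarget(attempts_needed=3, value=0xAB))
```

Because the bus is free between attempts, a slow target delays only its own initiator rather than every device on the bus.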
Inter-bus bridges are extensively supported, including bridge caching modes such as:
- posted write - the bridge accepts the write data immediately and responds "done" at once, and only afterwards attempts to perform the write on the secondary bus;
- write combining - several posted writes to consecutive addresses are combined in the bridge into one "burst" transaction on the secondary bus;
- prefetching - used in read transactions; the bridge fetches a large range of addresses into its cache with a single "burst" transaction, and subsequent requests are served by the bridge itself without any operations on the bus behind it.
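Write combining, as described in the list above, amounts to merging queued writes whose addresses follow one another. The following is a minimal sketch of that idea, assuming 4-byte words and a queue of (address, word) pairs; the function name and data layout are hypothetical and do not reflect how any particular bridge is built.

```python
def combine_posted_writes(writes, word_size=4):
    """Merge posted writes to consecutive addresses into bursts.

    `writes` is a list of (address, word) pairs in arrival order.
    Returns a list of (start_address, [words]) bursts. Order is preserved;
    only directly adjacent, ascending writes are merged.
    """
    bursts = []
    for address, word in writes:
        if bursts:
            start, words = bursts[-1]
            if address == start + len(words) * word_size:
                words.append(word)  # extends the current burst
                continue
        bursts.append((address, [word]))  # starts a new burst
    return bursts


queued = [(0x100, 1), (0x104, 2), (0x108, 3), (0x200, 4), (0x204, 5)]
bursts = combine_posted_writes(queued)
# Five posted writes become two bursts, starting at 0x100 and 0x200.
```

Only writes that arrive back to back are merged here, so the order in which data reaches the device stays the same as the order in which it was posted.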
Interrupts are supported either as Message Signaled Interrupts (the newer mechanism) or in the classic way, using the INTA#-INTD# lines. The interrupt lines operate independently of the rest of the bus, and one line can be shared by many devices.
Configuration
PCI devices are self-configuring (Plug and Play) from the user's point of view. After the computer starts, the system software examines the PCI configuration space of each device connected to the bus and allocates resources.
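On x86 PCs, system software commonly reaches configuration space through what the PCI specification calls configuration mechanism #1: a 32-bit address is written to I/O port 0xCF8 and the data is transferred through port 0xCFC. The sketch below only builds that address and walks the device numbers of one bus; it performs no port I/O, and `read_config_dword` is a hypothetical accessor supplied by the caller.

```python
def config_address(bus, device, function, register):
    """Build the 32-bit address used by x86 configuration mechanism #1."""
    if not (0 <= bus < 256 and 0 <= device < 32 and 0 <= function < 8):
        raise ValueError("bus/device/function out of range")
    if not (0 <= register < 256 and register % 4 == 0):
        raise ValueError("register must be a dword-aligned offset below 256")
    # Bit 31 is the enable bit; the other fields select the register.
    return 0x80000000 | (bus << 16) | (device << 11) | (function << 8) | register


def scan_bus(read_config_dword, bus=0):
    """List (device, vendor_id, device_id) for function 0 of each device."""
    found = []
    for device in range(32):
        ids = read_config_dword(config_address(bus, device, 0, 0x00))
        vendor_id = ids & 0xFFFF
        if vendor_id != 0xFFFF:  # 0xFFFF means no device responded
            found.append((device, vendor_id, ids >> 16))
    return found


# Usage with a fake configuration space holding one device in slot 3.
fake_space = {config_address(0, 3, 0, 0x00): 0x12348086}
devices = scan_bus(lambda address: fake_space.get(address, 0xFFFFFFFF))
```

A full enumeration would also probe functions 1 to 7 of multifunction devices and follow bridges to further buses; both are omitted here.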
Each device can claim up to six ranges in the PCI memory address space or in the PCI I/O address space.
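Each range is described by a 32-bit base address register (BAR) in the device's configuration header. As a hedged illustration of the standard encoding, the sketch below decodes one BAR value: bit 0 selects I/O or memory space, and for memory ranges bits 2:1 give the address width and bit 3 marks the range as prefetchable. The function name and the returned dictionary layout are assumptions.

```python
def decode_bar(value):
    """Decode one 32-bit base address register value."""
    if value & 0x1:
        # I/O space: the two low bits are flags, the rest is the base address.
        return {"space": "io", "base": value & 0xFFFFFFFC}
    return {
        "space": "memory",
        "base": value & 0xFFFFFFF0,  # the four low bits are flags
        "is_64bit": ((value >> 1) & 0x3) == 0x2,
        "prefetchable": bool(value & 0x8),
    }


io_range = decode_bar(0x0000E001)   # I/O range based at 0xE000
mem_range = decode_bar(0xFEB0000C)  # 64-bit prefetchable memory range
```

A 64-bit memory range occupies two consecutive registers, the second holding the upper half of the address; the sketch decodes only the first of the pair.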
In addition, a device can have a ROM that contains executable code for x86 or PA-RISC processors, Open Firmware code (system software for SPARC and PowerPC-based computers), or an EFI driver.
Interrupts are also configured by the system software (unlike on the ISA bus, where interrupts were configured by switches on the card). Interrupt requests on the PCI bus are made by changing the signal level on one of the IRQ lines, so it is possible to have several devices share the same interrupt request line; normally the system software tries to assign a separate interrupt to each device to maximize performance.
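The cost of sharing a line can be seen in a small model: when a shared line is asserted, the software cannot tell which device raised it and has to ask each registered handler in turn. The class below is a toy illustration with hypothetical names, not the interrupt API of any operating system.

```python
class SharedInterruptLine:
    """Toy model of one interrupt line shared by several devices."""

    def __init__(self):
        self.handlers = []

    def register(self, handler):
        """`handler()` returns True if its device was requesting service."""
        self.handlers.append(handler)

    def dispatch(self):
        """Poll every handler; return how many devices were serviced."""
        return sum(1 for handler in self.handlers if handler())


line = SharedInterruptLine()
polled = []
line.register(lambda: polled.append("network") or False)  # device was idle
line.register(lambda: polled.append("disk") or True)      # device needed service
serviced = line.dispatch()
```

Both handlers run even though only one device requested service, which is the overhead that assigning a separate interrupt to each device avoids.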
Specification
- bus frequency - 33.33 or 66.66 MHz, synchronous transmission;
- bus width - 32 or 64 bits; the bus is multiplexed (address and data are sent over the same lines);
- peak throughput of the 32-bit version at 33.33 MHz - 133 Mbytes/s;
- memory address space - 32 bits (4 bytes);
- I/O port address space - 32 bits (4 bytes);
- configuration address space (for one function) - 256 bytes;
- voltage - 3.3 or 5 V.
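The peak throughput figures follow directly from the bus width and clock, assuming one full-width transfer on every clock, which is an upper bound that real transfers do not reach. A short sketch of the arithmetic (the function name is illustrative):

```python
def peak_throughput_mb_s(bus_width_bits, clock_mhz):
    """Peak rate in Mbytes/s (10**6 bytes/s): one full-width transfer per clock."""
    return bus_width_bits / 8 * clock_mhz


for width in (32, 64):
    for clock in (33.33, 66.66):
        rate = peak_throughput_mb_s(width, clock)
        print(f"{width}-bit at {clock} MHz: {rate:.0f} Mbytes/s")
```

For the 32-bit, 33.33 MHz bus this gives about 133 Mbytes/s, and for the 64-bit, 66.66 MHz bus about 533 Mbytes/s.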