Selected Readings in Computing (960): Speed Is Not Everything
Over the years, the clock frequency became the key measure of processor performance. In parallel with Moore's Law, which predicts that the number of transistors on a chip increases exponentially, the clock frequency has done the same, doubling roughly every 18 months from thousands of ticks per second in 1977, to millions in the 1980s, to billions today. But while optimists believe that this process will continue, chip developers across the industry now agree that clock frequency will no longer be the key metric of processor performance, for several reasons.
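The doubling claim above can be checked with a little arithmetic. The following is a minimal sketch, assuming an idealized, perfectly steady doubling every 1.5 years; the function names are illustrative and not taken from the article.

```python
import math


def projected_frequency(f0_hz: float, years_elapsed: float,
                        doubling_period_years: float = 1.5) -> float:
    """Frequency after `years_elapsed`, assuming steady doubling (an idealization)."""
    return f0_hz * 2 ** (years_elapsed / doubling_period_years)


def years_to_grow(factor: float, doubling_period_years: float = 1.5) -> float:
    """Years needed for the frequency to grow by `factor` under the same assumption."""
    return math.log2(factor) * doubling_period_years


if __name__ == "__main__":
    # Going from thousands to billions of ticks per second is a factor of one million.
    print(f"{years_to_grow(1e6):.1f} years")   # roughly 20 doublings
    # A hypothetical 1 kHz clock after two doubling periods (3 years).
    print(projected_frequency(1_000, 3))
```

Under this assumption, a millionfold increase takes about 20 doublings, or roughly 30 years, which is consistent with the span the passage describes.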
The first is the growth of parallelism: the practice of getting a chip to execute many different operations simultaneously. In the past, this was confined to high-end supercomputers, as a way of improving their performance. But it is now becoming common in personal computers, and is bound to become more so.
A driving factor behind this parallelism is the fact that, while processor speed has increased with such remarkable rapidity, the speed of memories has lagged. What's more, the gap between processor speed and memory speed is likely to grow. Parallelism within a single chip allows several different processing units to share the same memory, so the memory's slowness is not such a problem.
This is because the limiting factor is not so much the throughput of memory chips (the rate at which data can be moved in and out of them) as the administrative overhead associated with moving information in and out of the processor. Because of this, chip designers can gain by putting several distinct processors on the same chip and having them share a fast, local memory inside the chip itself. This approach is known as multiple cores, or multi-core for short. A related approach is known as simultaneous multi-threading. It involves modifying a single processor to enable it to switch quickly between several distinct tasks. While one task is waiting for data to arrive from the main memory, another can continue to execute, so a single processor can, in effect, do the work of many.
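The task-switching idea can be illustrated with a toy timing model. This is only a sketch under stated assumptions: every task alternates a fixed compute burst with a fixed memory wait, switching between tasks costs nothing, and the function names and cycle counts are hypothetical. It models switching on a memory stall, not the details of any real processor.

```python
def run_sequential(n_tasks: int, bursts: int, compute: int, mem_wait: int) -> int:
    """Cycles needed when the core sits idle during every memory wait."""
    return n_tasks * bursts * (compute + mem_wait)


def run_multithreaded(n_tasks: int, bursts: int, compute: int, mem_wait: int) -> int:
    """Cycles needed when the core switches to another task during memory waits."""
    ready_at = [0] * n_tasks        # cycle at which each task's data has arrived
    remaining = [bursts] * n_tasks  # compute bursts left per task
    clock = 0                       # current cycle on the single core
    finish = 0                      # cycle at which the last memory wait ends
    while any(remaining):
        # Pick the unfinished task whose data arrives earliest.
        task = min((t for t in range(n_tasks) if remaining[t]),
                   key=lambda t: ready_at[t])
        clock = max(clock, ready_at[task])  # idle only if no task is ready
        clock += compute                    # run one compute burst
        remaining[task] -= 1
        ready_at[task] = clock + mem_wait   # the task now waits on memory
        finish = max(finish, ready_at[task])
    return finish


if __name__ == "__main__":
    # Hypothetical workload: 4 tasks, 3 bursts each, 2 compute cycles, 6 wait cycles.
    print(run_sequential(4, 3, 2, 6))     # 96
    print(run_multithreaded(4, 3, 2, 6))  # 30
```

With these numbers the memory waits of one task are fully overlapped by the compute bursts of the other three, so the core is almost never idle. With a single task there is nothing to switch to, and the two functions return the same result.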
A second reason why clock frequency will no longer be an accurate measure of performance is that distributing the clock's signal to all the different parts of a chip is more difficult than it sounds. Reducing the “skew” on a chip (the amount by which clock signals might be out of sync) takes a very skillful chip designer, and it is becoming more difficult as chips get larger and more complex.
That is why “asynchronous” technology, which involves getting rid of the clock entirely, is being explored aggressively. This approach has costs as well as benefits, since miniature circuits known as “rendezvous circuits” must be placed at circuit junctions to co-ordinate the flow of data. It is rather like replacing a city-wide network of traffic lights with policemen at every corner. In one recent experiment with a test chip that could run in both synchronous and asynchronous modes, the asynchronous mode won out. That is because in a synchronous design, every operation must wait for the slowest one to complete, while in an asynchronous one, a laggard only delays the local part of the calculation.
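The trade-off between the two modes can be sketched with a simple timing model. It assumes a chain of operations executed one after another, a synchronous clock period set by the slowest operation, and a fixed per-operation handshake cost standing in for the rendezvous circuits. The delays and the overhead value are hypothetical, chosen only for illustration.

```python
def synchronous_time(delays: list[float]) -> float:
    """Every operation takes one clock period, sized for the slowest operation."""
    return len(delays) * max(delays)


def asynchronous_time(delays: list[float], handshake_overhead: float = 0.0) -> float:
    """Each operation takes its own delay plus a co-ordination cost."""
    return sum(d + handshake_overhead for d in delays)


if __name__ == "__main__":
    uneven = [1.0, 1.0, 4.0, 1.0]   # one laggard among fast operations (ns)
    uniform = [2.0, 2.0, 2.0, 2.0]  # all operations equally fast (ns)

    print(synchronous_time(uneven))          # 16.0
    print(asynchronous_time(uneven, 0.5))    # 9.0
    print(synchronous_time(uniform))         # 8.0
    print(asynchronous_time(uniform, 0.5))   # 10.0
```

In this model the asynchronous mode wins when delays are uneven, because the laggard is paid for only once. It loses when all operations take the same time, because the handshake overhead then buys nothing.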
Clockless chips also have the added benefit of emitting far less radio interference. So asynchronous circuits could be particularly useful in devices such as mobile phones, where radio interference is a substantial concern.
Finally, getting chips to run at higher clock frequencies is diminishing in importance because another problem is becoming more pressing: getting them to consume less power. Power consumption is now the biggest problem in chip design.