[Repost] We finally learned that the PS3 doing Real Time Radiosity is insanely awesome! (not the author's original title)

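Quote:
Originally posted by zhangjingy on 2007-8-10 16:53
The up/down link from C1 to the CPU is only 10GB/s, far short of RSX's 15GB/s write and 20GB/s read.
The question is, with a raw 20GB/s of bandwidth, can your processing speed even keep up...? Why not look at how big RSX's internal bandwidth bottleneck is first.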


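Quote:
Originally posted by zhangjingy on 2007-8-10 16:53
If you don't believe it, go ask some professionals.
Please bring those professionals out, or shall I bring out the NV bigwigs to speak for themselves?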



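Quote:
Originally posted by zhangjingy on 2007-8-10 16:51
The data in that chart was wrong to begin with and was corrected long ago.

c1: 500MHz x 48 x (4x2+1) = 216GFLOPS (the earlier 240GFLOPS figure was wrong; the xps_3_0 scalar unit has no mad instruction at all, only add/mul. For details see MSDN: http://msdn2.microsoft.c ...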
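
As a rough check of the arithmetic in the quoted post, here is a minimal Python sketch of the peak-FLOPS formula it uses. The function name `peak_gflops` is made up for illustration. Reading "4x2" as a 4-wide vector mad (2 FLOPs per lane) and "+1" as a scalar add or mul is an assumption, as is the guess that the older 240GFLOPS figure came from counting the scalar unit as 2 FLOPs.

```python
def peak_gflops(clock_mhz: float, alus: int, flops_per_alu_per_cycle: int) -> float:
    """Theoretical peak = clock (MHz) x ALU count x FLOPs per ALU per cycle, in GFLOPS."""
    return clock_mhz * alus * flops_per_alu_per_cycle / 1000.0


# Assumption: 4-wide vector mad = 4 lanes x 2 FLOPs, plus a scalar add OR mul = 1 FLOP.
corrected = peak_gflops(500, 48, 4 * 2 + 1)

# Assumption: the older figure counted the scalar unit as a mad (2 FLOPs).
earlier = peak_gflops(500, 48, 4 * 2 + 2)

print(corrected, earlier)
```

These are theoretical peaks only; they say nothing about sustained throughput, which is what the rest of the thread is arguing about.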
From PS3DevForum

-problems with Cell
"There are some challenges involving the architecture of the Cell, which consists of PPE and SPE cores. A developer states, "It is impossible to extract the full performance of the Cell on launch titles; it will take time to get familiar with it". Another developer states that they are having difficulties with the 256KB memory of each SPE core. The actual usable area of the 256KB is closer to 128KB once buffering for access to external DRAM is taken into account. "It would have been a much different situation if there was 1MB of local memory". There is, however, a benefit to these restrictions on local memory: latency can be reduced, and latency cycles are more easily predicted. This is an advantage for real-time gaming applications.

Until now, developers have been frugal with programming and memory usage, and this will not change with the PS3; in fact, the Cell will reward developers who put more effort into programming. While that may not be a negative, it is a hurdle that will take more developer resources and time. For the Cell, the focus has shifted from extracting performance from the hardware toward multi-threaded performance, which takes a different skill set than before."
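
To illustrate why buffering eats into the 256KB local store, here is a minimal Python sketch of double buffering: one buffer is processed while the other is being refilled from main memory. The 64KB buffer size, the resulting 128KB split, and the names `stream_sum`, `BUFFER_SIZE` and `REMAINING` are hypothetical choices for illustration, not figures from the article. Python also runs the "transfer" and the "compute" one after the other, whereas on real hardware the transfer would be an asynchronous DMA that overlaps the compute.

```python
LOCAL_STORE = 256 * 1024             # bytes per SPE local store (from the quoted article)
BUFFER_SIZE = 64 * 1024              # hypothetical size of each transfer buffer
REMAINING = LOCAL_STORE - 2 * BUFFER_SIZE  # what is left for code and working data


def stream_sum(data: bytes, buffer_size: int = BUFFER_SIZE) -> int:
    """Sum every byte of `data`, streaming it through two alternating buffers."""
    buffers = [b"", b""]
    current = 0
    offset = 0
    total = 0

    # Prefetch the first chunk before the loop starts.
    buffers[current] = data[offset:offset + buffer_size]
    offset += buffer_size

    while buffers[current]:
        nxt = 1 - current
        # Start "fetching" the next chunk into the idle buffer
        # (stands in for an asynchronous DMA from main memory).
        buffers[nxt] = data[offset:offset + buffer_size]
        offset += buffer_size
        # Process the chunk that is already resident.
        total += sum(buffers[current])
        current = nxt

    return total


print(REMAINING // 1024, "KB left after reserving two buffers")
```

Under these assumed sizes, two in-flight buffers reserve half of the local store, which is in line with the "closer to 128KB" complaint in the quote.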

-problems with RSX [7600GT bandwidth]
"Developers were using the 7800GTX for development. The RSX uses Nvidia's G70, and its programmable shader performance is very high. But the memory interface is 128-bit, and it has only 8 ROPs (raster operation units). It can be said that the RSX has the shader performance of a high-end PC with mid-range memory bandwidth. For that reason, given the GPU's high shader performance, the ROPs and memory become the bottleneck. "For lower resolutions it is a fantastic GPU, but it gets difficult for high-end HDTV resolutions", says a developer."

-not finished SDK
"The article goes into some hurdles PS3 developers had to go through. SDK 1.00 was supposed to be released by TGS, but the latest release before TGS is 0.93. What is shown at TGS is still being developed with SDKs that are not final."

- RSX still downgraded
"All this takes into account the rumored RSX clock speed downgrade to 500/650. This is not going to affect the quality of the games we have witnessed, as the dev kits were running at lower clock speeds of 450. From another point of view, it can be looked at as an upgrade. Also, there is still nothing to say that, come the final retail PS3, the clocks are actually 550/700."
See also the earlier confirmation from the same page that the RSX was downgraded to 500/650:


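Quote:
Originally posted by zhangjingy on 2007-8-10 16:57

What bottleneck is there in the internal bandwidth? Is it as big as the C1-CPU bottleneck?
So tell me, does C1 send raw, unprocessed data to the CPU?

Does the CPU need to process texture data and shaders? No, which is why the bandwidth between C1 and the CPU is simply not a bottleneck.

RSX's 7600-class internal bandwidth has to handle huge amounts of texture, shader, physics and other data. Now that is a bottleneck.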

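Quote:
Originally posted by zhangjingy on 2007-8-10 16:57

Go find him. You should believe what he says.
Sorry, but what he said is that RSX is just an enhanced 7600. (A certain Mr. Zhang of NV Asia, in July, in the meeting room of a certain U company in Shanghai.)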

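Quote:
Originally posted by zhangjingy on 2007-8-10 17:01
C1 doesn't need to send data to the CPU? Good heavens.
Good heavens, C1 needs to send textures, shading, physics and all the rest of the data to the CPU!

Then what is C1 even for? Tianshi, do you know how data flows through a game console?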

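Quote:
Originally posted by zhangjingy on 2007-8-10 17:02

Didn't I tell you to ask about the data? What class RSX belongs to is no secret anymore. Too bad the PS3 has more than just RSX.
Right, the SPEs have cache, so they can be used as a graphics card, is that it?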

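Quote:
Originally posted by zhangjingy on 2007-8-10 17:05

Does RSX need to send it, then? You didn't read the article in the opening post: the SPEs generate the textures and RSX reads them. "Compared with other systems, which need the GPU to do RoT and then read it back, this would take twice the bandwidth and time and would also cut into the GPU's available PS resources; even if offloaded to the CPU, it would additionally need enough I/O bandwidth to carry ...
Doesn't RSX need to transfer data internally? And when it does, isn't it subject to that 7600-class bandwidth?

Tianshi, you disappoint me so much.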

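Quote:
Originally posted by amego on 2007-8-10 17:07

I'd like to ask Tianshi for a parallel algorithm for computing an integer sequence. No googling, and don't give me the most commonplace algorithm either. I just want to learn from Tianshi's experience in distributed development, so don't be stingy with it.
Even those of us in MKT can work this one out, haha. Tianshi, over to you.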

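Quote:
Originally posted by zhangjingy on 2007-8-10 17:08

Who said it doesn't? What's funny is that whenever you can't win the argument you fall back on "so disappointed".

"The PS3 has the SPEs for processing and FlexIO to provide ample bandwidth, and RSX has been improved for textures held in main memory, which makes running Enlighten on the PS3 very efficient. Naturally, this lets Geom ...
What, may I ask, does FlexIO have to do with RSX's internal bandwidth? Isn't FlexIO the external communication bandwidth between the CPU and the GPU?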

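Quote:
Originally posted by zhangjingy on 2007-8-10 17:11

I can't do it. Teach me, MKT person!
This is in the first-year university Linear Algebra textbook. I failed the exam, but I did pass the retake.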

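Quote:
Originally posted by zhangjingy on 2007-8-10 17:13

How could RSX's internal bandwidth be a problem? Please explain in detail so I can learn. The C1-to-CPU bandwidth is indeed a bit small.
The bandwidth between the CPU and the GPU doesn't need to be huge at all. Does the CPU have to do everything the GPU does on its behalf? The so-called huge external bus is nothing but a gimmick!

Look at RSX's internal bandwidth: 128bit!

Look at C1: 512bit! And that's not even counting the eDRAM's 4096bit!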

Quote:
Originally posted by zhangjingy on 2007-8-10 17:17

C1 is 128bit too, all right? Why is there nothing but rumors these days.
http://www.anandtech.com/video/showdoc.aspx?i=2453&p=7

At this point in time, much of the bandwidth generated by graphics hardware is required to handle color and z data moving to the framebuffer. ATI hopes to eliminate this as a bottleneck by moving this processing and the back framebuffer off the main memory bus. The bus to main memory is 512MB of 128-bit 700MHz GDDR3 (which results in just over 22GB/sec of bandwidth). This is less bandwidth than current desktop graphics cards have available, but by offloading work and bandwidth for color and z to the daughter die, ATI saves themselves a good deal of bandwidth. The 22GB/sec is left for textures and the rest of the system (the Xbox implements a single pool of unified memory).
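
The "just over 22GB/sec" figure in the AnandTech quote follows from bus width and memory clock, which can be sketched in a few lines of Python. The function name `memory_bandwidth_gb_s` is made up for illustration. GDDR3 transferring twice per clock is standard double-data-rate behaviour. Applying the same formula to the rumored 650 figure from earlier in the thread assumes that number is RSX's GDDR3 clock on the 128-bit interface mentioned there.

```python
def memory_bandwidth_gb_s(bus_width_bits: int, clock_mhz: float,
                          transfers_per_clock: int = 2) -> float:
    """Peak bandwidth = bytes per transfer x clock x transfers per clock, in GB/s."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * clock_mhz * 1e6 * transfers_per_clock / 1e9


# Xbox 360 main memory as described in the quote: 128-bit, 700MHz GDDR3.
xenos_main = memory_bandwidth_gb_s(128, 700)

# Assumption: the rumored 650 figure is RSX's GDDR3 clock on a 128-bit bus.
rsx_local = memory_bandwidth_gb_s(128, 650)

print(round(xenos_main, 1), round(rsx_local, 1))
```

Both results are peak numbers for the external memory bus only; the eDRAM daughter die described in the quote is a separate path and is not modelled here.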

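Sorry, my data came from a different source. Going with the AATC figure from now on: 128bit...

That said, ATI manages it very well: this bandwidth is used only for texture storage. Because of the unified shader architecture, the shader buffer pool is separate, and on this transfer alone it beats RSX's discrete shader architecture.

[ Last edited by 上海恐龙 on 2007-8-10 17:26 ]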

Quote:
Originally posted by shiningfire on 2007-8-10 17:27
I'm just waiting for TS to say "returned victorious once again" and so on :D :D   Too classic
17:30 is almost here

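Quote:
Originally posted by zhangjingy on 2007-8-10 17:28
Damn, I lost track of the time. I have to go. FSF, don't take this chance to say I chickened out; people would laugh at you.
Returning victorious once again?

SPE is faster than C1?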
