CUDA 4.0 Released

 

CUDA Toolkit 4.0 RC (March 2011)

For older releases, see the CUDA Toolkit Release Archive

Release Highlights

Easier Application Porting

  • Share GPUs across multiple threads
  • Use all GPUs in the system concurrently from a single host thread
  • No-copy pinning of system memory, a faster alternative to cudaMallocHost()
  • C++ new/delete and support for virtual functions
  • Support for inline PTX assembly
  • Thrust library of templated performance primitives such as sort, reduce, etc.
  • NVIDIA Performance Primitives (NPP) library for image/video processing
  • Layered Textures for working with sets of same-size, same-format textures at larger sizes and higher performance
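
As an illustration of the "no-copy pinning" item in the list above, here is a minimal sketch that pins an existing `malloc` buffer in place with `cudaHostRegister()` instead of allocating a separate pinned buffer with `cudaMallocHost()`. The helper name `pinned_roundtrip` and the buffer size are assumptions made for this example only. Early toolkit versions are also understood to require the registered pointer and size to be page-aligned, so treat the plain `malloc` here as a simplification.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                \
    do {                                                           \
        cudaError_t err_ = (call);                                 \
        if (err_ != cudaSuccess) {                                 \
            std::fprintf(stderr, "%s failed: %s\n", #call,         \
                         cudaGetErrorString(err_));                \
            std::exit(EXIT_FAILURE);                               \
        }                                                          \
    } while (0)

// Hypothetical helper: pins an existing malloc'd buffer, copies it to the
// GPU and back, and returns true if the round trip preserved the data.
bool pinned_roundtrip(size_t n) {
    const size_t bytes = n * sizeof(float);
    float *host = static_cast<float *>(std::malloc(bytes));
    float *back = static_cast<float *>(std::malloc(bytes));
    if (!host || !back) {
        std::free(host);
        std::free(back);
        return false;
    }
    for (size_t i = 0; i < n; ++i) host[i] = static_cast<float>(i);

    // Pin the existing buffer in place: no second allocation, no extra copy.
    CHECK(cudaHostRegister(host, bytes, cudaHostRegisterDefault));

    float *dev = nullptr;
    CHECK(cudaMalloc(&dev, bytes));
    // The source is now page-locked, so this copy can run asynchronously.
    CHECK(cudaMemcpyAsync(dev, host, bytes, cudaMemcpyHostToDevice, 0));
    CHECK(cudaDeviceSynchronize());
    CHECK(cudaMemcpy(back, dev, bytes, cudaMemcpyDeviceToHost));

    bool ok = true;
    for (size_t i = 0; i < n; ++i) {
        if (back[i] != host[i]) ok = false;
    }

    CHECK(cudaFree(dev));
    CHECK(cudaHostUnregister(host));  // unpin before freeing
    std::free(host);
    std::free(back);
    return ok;
}

int main() {
    std::printf("round trip %s\n",
                pinned_roundtrip(1 << 20) ? "ok" : "mismatch");
    return 0;
}
```

The design point is that the buffer stays owned by the existing host code; only its page-locked status changes between `cudaHostRegister()` and `cudaHostUnregister()`.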

Faster Multi-GPU Programming

  • Unified Virtual Addressing
  • GPUDirect v2.0 support for Peer-to-Peer Communication

New & Improved Developer Tools

  • Automated Performance Analysis in Visual Profiler
  • C++ debugging in cuda-gdb
  • GPU binary disassembler for Fermi architecture (cuobjdump)

Please refer to the Release Notes and Getting Started Guides for more information.

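Judging by the feature list, this is not simply an update tied to new hardware; it brings improvements to the graphics cards already in use.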

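Especially welcome is that multiple cards can now exchange data directly over PCIe, so many applications will no longer be held back by routing every transfer through host memory.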

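Devices such as PCIe capture cards should also, in the near future, be able to exchange data with the GPU directly over PCIe instead of passing it through main memory. That is a great step forward!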

Unified Virtual Addressing:
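
A minimal sketch of what unified addressing looks like from the runtime API, assuming a 64-bit platform and a GPU that reports `unifiedAddressing`. With a single address space, `cudaMemcpyDefault` lets the runtime infer the copy direction from the pointers themselves, and `cudaPointerGetAttributes()` can report which device owns an address. The helper name `uva_copy_demo` is hypothetical.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                \
    do {                                                           \
        cudaError_t err_ = (call);                                 \
        if (err_ != cudaSuccess) {                                 \
            std::fprintf(stderr, "%s failed: %s\n", #call,         \
                         cudaGetErrorString(err_));                \
            std::exit(EXIT_FAILURE);                               \
        }                                                          \
    } while (0)

// Hypothetical helper: copies host -> device -> host using only
// cudaMemcpyDefault. Returns false if device 0 lacks unified addressing
// or the data does not survive the round trip.
bool uva_copy_demo() {
    cudaDeviceProp prop;
    CHECK(cudaGetDeviceProperties(&prop, 0));
    if (!prop.unifiedAddressing) {
        std::puts("device 0 does not report unified addressing");
        return false;
    }
    CHECK(cudaSetDevice(0));

    const size_t n = 256;
    const size_t bytes = n * sizeof(float);
    float *h_in = nullptr, *h_out = nullptr, *d_buf = nullptr;
    CHECK(cudaMallocHost(&h_in, bytes));   // pinned host memory
    CHECK(cudaMallocHost(&h_out, bytes));
    CHECK(cudaMalloc(&d_buf, bytes));
    for (size_t i = 0; i < n; ++i) h_in[i] = static_cast<float>(i);

    // No explicit direction: the runtime works it out from the addresses.
    CHECK(cudaMemcpy(d_buf, h_in, bytes, cudaMemcpyDefault));
    CHECK(cudaMemcpy(h_out, d_buf, bytes, cudaMemcpyDefault));

    // Ask the runtime which device a pointer belongs to.
    cudaPointerAttributes attr;
    CHECK(cudaPointerGetAttributes(&attr, d_buf));
    std::printf("d_buf belongs to device %d\n", attr.device);

    bool ok = true;
    for (size_t i = 0; i < n; ++i) {
        if (h_out[i] != h_in[i]) ok = false;
    }

    CHECK(cudaFree(d_buf));
    CHECK(cudaFreeHost(h_in));
    CHECK(cudaFreeHost(h_out));
    return ok;
}

int main() {
    std::printf("UVA copy %s\n", uva_copy_demo() ? "ok" : "not verified");
    return 0;
}
```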

New in GPUDirect v2.0: data can be transferred directly between GPUs over PCIe instead of being relayed through host memory:
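
The peer-to-peer path can be sketched roughly as follows, assuming a machine with at least two GPUs. The code asks whether the devices can address each other, enables peer access where possible, and copies a buffer from one GPU to the other with `cudaMemcpyPeer()`. The helper name `peer_copy_matches`, the device indices, and the fill pattern are illustrative assumptions. When peer access is unavailable, `cudaMemcpyPeer()` still works, but the runtime stages the copy through host memory.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                \
    do {                                                           \
        cudaError_t err_ = (call);                                 \
        if (err_ != cudaSuccess) {                                 \
            std::fprintf(stderr, "%s failed: %s\n", #call,         \
                         cudaGetErrorString(err_));                \
            std::exit(EXIT_FAILURE);                               \
        }                                                          \
    } while (0)

// Hypothetical helper: fills a buffer on GPU `src`, copies it to GPU `dst`,
// and returns true if GPU `dst` ends up with the same bytes.
bool peer_copy_matches(int src, int dst, size_t bytes) {
    int can_access = 0;
    CHECK(cudaDeviceCanAccessPeer(&can_access, src, dst));
    std::printf("GPU %d %s access GPU %d directly\n", src,
                can_access ? "can" : "cannot", dst);

    unsigned char *buf_src = nullptr, *buf_dst = nullptr;
    CHECK(cudaSetDevice(src));
    CHECK(cudaMalloc(&buf_src, bytes));
    CHECK(cudaMemset(buf_src, 0x2A, bytes));
    if (can_access) {
        // Let `src` address memory that lives on `dst`.
        cudaError_t e = cudaDeviceEnablePeerAccess(dst, 0);
        if (e == cudaErrorPeerAccessAlreadyEnabled) {
            cudaGetLastError();  // clear the sticky "already enabled" status
        } else {
            CHECK(e);
        }
    }

    CHECK(cudaSetDevice(dst));
    CHECK(cudaMalloc(&buf_dst, bytes));

    // GPU-to-GPU copy; no host buffer appears in the application code.
    CHECK(cudaMemcpyPeer(buf_dst, dst, buf_src, src, bytes));

    std::vector<unsigned char> host(bytes);
    CHECK(cudaMemcpy(host.data(), buf_dst, bytes, cudaMemcpyDeviceToHost));
    bool ok = true;
    for (size_t i = 0; i < bytes; ++i) {
        if (host[i] != 0x2A) ok = false;
    }

    CHECK(cudaFree(buf_dst));
    CHECK(cudaSetDevice(src));
    CHECK(cudaFree(buf_src));
    return ok;
}

int main() {
    int count = 0;
    CHECK(cudaGetDeviceCount(&count));
    if (count < 2) {
        std::puts("this sketch needs at least two GPUs");
        return 0;
    }
    std::printf("peer copy %s\n",
                peer_copy_matches(0, 1, 1 << 20) ? "ok" : "mismatch");
    return 0;
}
```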

C++ template support
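
One place where the template support shows up is Thrust, which the release highlights list as a library of templated primitives such as sort and reduce. The sketch below is only an illustration of that style, not code from the release materials. The helper name `sort_and_sum` and the sample values are assumptions.

```cuda
#include <cstdio>
#include <thrust/device_vector.h>
#include <thrust/host_vector.h>
#include <thrust/reduce.h>
#include <thrust/sort.h>

// Hypothetical helper: sorts `values` on the GPU (the result is written
// back into `values`) and returns the sum of its elements.
int sort_and_sum(thrust::host_vector<int> &values) {
    thrust::device_vector<int> d = values;            // host -> device copy
    thrust::sort(d.begin(), d.end());                 // templated sort on the GPU
    int sum = thrust::reduce(d.begin(), d.end(), 0);  // templated reduction
    values = d;                                       // device -> host copy
    return sum;
}

int main() {
    int raw[] = {5, 3, 8, 1, 9, 2};
    thrust::host_vector<int> values(raw, raw + 6);

    int sum = sort_and_sum(values);

    std::printf("sum = %d, sorted:", sum);  // the six inputs add up to 28
    for (size_t i = 0; i < values.size(); ++i) std::printf(" %d", values[i]);
    std::printf("\n");
    return 0;
}
```

The same calls work for other element types, for example `thrust::device_vector<float>`, without changes to the algorithm code, which is the practical benefit of the templated interface.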

NVIDIA's hardware has not changed much this year, but the arrival of CUDA 4.0 is bound to breathe new life into existing hardware.

Many applications that move large volumes of data can now be supported well!
