112 records found
Scalable parallel programming applied to H.264/AVC decoding
A predictor-based power-saving policy for DRAM memories
An Instruction to Accelerate Software Caches
Nexus: hardware support for task-based programming
Composable Local Memory Organisation for Streaming Applications on Embedded MPSoCs
Instruction precomputation with memoization for fault detection
Extending the Cell SPE with energy efficient branch prediction
Protective redundancy overhead reduction using instruction vulnerability factor
A case for hardware task management support for the StarSS programming model
A multidimensional software cache for scratchpad-based systems
The SARC architecture
Evaluation of parallel H.264 decoding strategies for the Cell Broadband Engine
Scalar processing overhead on SIMD-only architectures
Performance evaluation of macroblock-level parallelization of H.264 decoding on a cc-NUMA multiprocessor architecture
Limiting the number of dirty cache lines
Energy efficient branch prediction on the Cell SPE
Intra-vector SIMD instructions for core specialization
Specialization of the Cell SPE for media applications
Performance improvement of multimedia kernels by alleviating overhead instructions on SIMD devices