Products

Architecture

Since the GPU emerged as a dedicated graphics-processing component in computers near the start of this century, it has followed its own path of development and evolution, distinct from that of the CPU, which is mainly responsible for processing data. The mainstream processor companies offer products that incorporate both a CPU and a GPU as distinct components, and the growing trend is to integrate them into the same SoC. But no matter how tightly the CPU and GPU are integrated, they remain separate processing units because of their development history and their need to maintain upward compatibility. In the last few years, as GPUs have become programmable and more “general purpose”, and CPUs have become multi-core and multi-threaded, they have taken on characteristics that make them resemble each other. ICube has seized on the opportunity of this inevitable convergence and created the first “Unified” processor architecture, in which a single processing unit (the UPU) performs the functions of both CPU and GPU. The UPU is the ultimate result of the convergence between CPU and GPU. This new processor innovation forms the basis of ICube’s technology offerings.

As a pioneer of this new approach to processor architecture, ICube has developed a brand-new instruction set architecture (ISA) called MVP for implementing the UPU. MVP stands for Multi-thread Virtual Pipeline, and is the name of both the UPU core and the ISA.

The first generation of the MVP ISA, called MVP-I, has fewer than 150 instructions. It is a RISC (reduced instruction set computer) style architecture, elegantly designed and friendly to compilers. The ISA combines characteristics of both CPU and GPU, and adds some that are unique to the UPU. Performance and efficiency are achieved through a combined hardware and software approach.

ICube’s world-class hardware and software team has built systems based on real silicon to prove the UPU concept. The first SoC is called IC1, which contains two MVP cores plus peripherals. A full compilation toolchain that complies with industry-standard APIs is provided for the MVP architecture. To date, Linux (SMP enabled) and Android 4.2 have been ported to MVP and are running on systems based on IC1.

ICube understands that a new architecture needs to be open in order to be widely accepted. MVP is an open architecture, and ICube plans to collaborate with industry partners in standardizing and extending MVP at the architecture level. ICube believes the emergence of “processor abstraction” along the lines of OpenCL, LLVM, etc. will make cross-platform computing prevalent and make the software ecosystem less of an issue. ICube embraces this software trend, as it will eventually lead to more and more architecture innovation.

Processor

The UPU core that ICube designed, called MVP, is a parallel, high-performance, low-power and low-cost processor. Each MVP core provides computation resources that can execute 4 threads simultaneously, out of a total of 8 threads managed by the hardware. To maximize throughput across the threads and minimize idle time in the computation resources, waiting threads are continuously swapped out while ready-to-execute threads are swapped in. Under this automatic load balancing, concurrent CPU and/or GPU tasks are assigned automatically to up to four thread resources. An inter-thread communication mechanism is provided to shorten latency and minimize data movement between the processor and external memory.
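The swap-in/swap-out behaviour described above can be illustrated with a toy cycle-by-cycle model. The sketch below is not ICube's scheduler: only the figures of 8 hardware-managed threads and 4 simultaneous execution slots come from the text, while the round-robin policy, the `simulate` function and the stall parameters (`stall_every`, `stall_cycles`) are assumptions made purely for illustration.

```python
from collections import deque

HW_THREADS = 8   # threads managed by the hardware per core (from the text)
ISSUE_SLOTS = 4  # threads that can execute simultaneously (from the text)


def simulate(work, stall_every=3, stall_cycles=2):
    """Toy model: thread i needs work[i] cycles of execution.

    Hypothetical stall model: after every `stall_every` executed cycles a
    thread waits `stall_cycles` cycles (for example on memory) and is
    swapped out. Returns (total_cycles, slot_utilization).
    """
    assert len(work) <= HW_THREADS
    remaining = list(work)
    run_count = [0] * len(work)      # cycles executed since the last stall
    wake_at = [0] * len(work)        # cycle at which a waiting thread is ready
    ready = deque(range(len(work)))  # round-robin queue of ready threads
    waiting = set()
    cycle = 0
    busy_slots = 0
    while any(r > 0 for r in remaining):
        # Swap in threads whose wait has finished.
        for t in sorted(waiting):
            if wake_at[t] <= cycle:
                waiting.remove(t)
                ready.append(t)
        # Fill up to ISSUE_SLOTS execution slots with ready threads.
        running = [ready.popleft() for _ in range(min(ISSUE_SLOTS, len(ready)))]
        for t in running:
            remaining[t] -= 1
            run_count[t] += 1
            busy_slots += 1
            if remaining[t] == 0:
                continue                  # finished: leaves the pool
            if run_count[t] == stall_every:
                run_count[t] = 0
                wake_at[t] = cycle + 1 + stall_cycles
                waiting.add(t)            # swapped out while it waits
            else:
                ready.append(t)           # still ready: back of the queue
        cycle += 1
    return cycle, busy_slots / (cycle * ISSUE_SLOTS)


if __name__ == "__main__":
    for n in (4, 8):
        cycles, util = simulate([6] * n)
        print(f"{n} threads: {cycles} cycles, slot utilization {util:.2f}")
```

With the default stall parameters and six cycles of work per thread, the model leaves fewer execution slots idle when eight threads are available than when only four are, which is the latency-hiding effect the paragraph describes.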

Key features of the MVP processor:

  • MVP-I instruction set architecture (ISA)
  • 4-issue, 7-stage superscalar pipeline
  • Integrated CPU and GPU instruction micro-architecture
  • 4-way simultaneous multi-threading (SMT)
  • Single-precision floating-point unit
  • 64KB I-cache and 64KB D-cache, plus 64KB local SRAM
  • MMU
  • Integrated DMA and interrupt controller
  • Harmony thread scheduler and management
  • Dynamic load balancing with latency-hiding capability
  • AXI/AHB bus interface

Since each MVP core is in effect a 4-way SMP, the number of effective processors grows four times as fast as the number of cores. The threads in each core share the primary cache and built-in local memory, while multiple cores share the secondary cache and use it to communicate with one another.
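As a small worked example of this scaling, the sketch below counts the logical processors an operating system would see for a given number of cores. The function name `logical_processors` is hypothetical; the factor of 4 simultaneous threads per core and the core counts of the two SoCs are taken from this page.

```python
THREADS_PER_CORE = 4  # simultaneous threads per MVP core (from the text)


def logical_processors(cores: int) -> int:
    """Number of effective processors for `cores` MVP cores."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return cores * THREADS_PER_CORE


if __name__ == "__main__":
    # Core counts as stated on this page: IC3138 has one core, IC3228 has two.
    for name, cores in (("IC3138", 1), ("IC3228", 2)):
        print(name, logical_processors(cores))
```

The results agree with the thread counts quoted in the specifications on this page: 8 simultaneous threads for the dual-core IC3228 and 4 for the single-core IC3138.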

ICube’s MVP core is available for licensing as a processor IP.  MVP’s first external licensee has already used it to design an SoC for use in an automobile infotainment system.  Please contact us for more information about our processor licensing model.

IC3228 SoC

IC3228 is ICube’s first SoC product based on the MVP core. It is a dual-core UPU chip with tight integration of many industry-standard peripherals. It is fabricated by TSMC on its 65nm process and runs at 600 MHz. The detailed specification of IC3228 follows.

CPU function

  • 4-way simultaneous multi-threading (SMT) in each core
  • Symmetric-multi-processing (SMP), dual MVP cores
  • 64KB I-cache, 64KB D-cache and 64KB local memory in each core, 256KB shared L2 cache
  • Homogeneous parallel programs
  • Support Pthread, OpenMP

GPU function

  • Data parallel, Task parallel, and/or Function parallel computing
  • Multi-standard media processor
  • Programmable unified shader
  • Support OpenGL ES 2.0
  • 70 million triangles / sec, 300 million pixels / sec

System Clock

600MHz (TSMC 65nm)

Multi-thread Processing

Simultaneous 8 threads (4 threads x dual core) and 8 hybrid threads

Processing Power

5160 DMIPS (equivalent to 4.3 DMIPS/MHz per core)
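The arithmetic behind this figure can be checked with a short sketch. It assumes the chip total is simply clock frequency times DMIPS/MHz per core times the number of cores; the function name `total_dmips` is hypothetical, and the input values are the ones quoted on this page.

```python
def total_dmips(clock_mhz: float, dmips_per_mhz: float, cores: int) -> float:
    """Aggregate DMIPS, assuming linear scaling with clock and core count."""
    return clock_mhz * dmips_per_mhz * cores


if __name__ == "__main__":
    # IC3228: 600 MHz, 4.3 DMIPS/MHz per core, 2 cores.
    print(round(total_dmips(600, 4.3, 2)))
    # IC3138 (specified later on this page): 350 MHz, 4.3 DMIPS/MHz, 1 core.
    print(round(total_dmips(350, 4.3, 1)))
```

Under this assumption the IC3228 figure of 5160 DMIPS is reproduced exactly, while the same formula gives 1505 for IC3138, which the page quotes rounded to 1500 DMIPS.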

Display System

LCD maximum pixel clock: 165MHz@16.7M (24-bit) true color, HDMI/DVI output capable

Camera

8/10 bit camera data interface

Video

Support HD 720p H.264 decoding via pure software

Audio

Max. 5.1 channel audio

Memory

Support SD, SDHC, MMC card, USB mass storage device, Nand flash, NOR flash, DDR3 SDRAM

Power Control

10 independent power domains, 3 low power modes

OS Support

Android 4.2, Linux 3.4

Supported Connectivity

USB host/slave, UART, WiFi (external), 3G modem (external), GPS (external)

Keypad

12 keypad I/O for Qwerty keyboard

I/O

UART x 4; I2C x 2; I2S x 3; SPI 4x slave; GPIO x 9; PWM x 3

Timer

Watchdog; RTC

Linux and Android are already up and running on evaluation boards based on IC3228. Please contact us for more information about IC3228, including demos and chip samples.

IC3138 SoC

IC3138 is ICube’s second SoC based on the MVP core. It is a cost-effective SoC consisting of a single MVP core with 4 hardware threads that can execute simultaneously. It provides the following features:

  • General parallel computing capabilities
  • High performance in spite of low clock frequencies
  • Dynamic load balancing capability between the CPU and GPU tasks running on the unified core
  • Flexibility in controlling the trade-off between performance and power

IC3138 is suitable for applications that require elements of graphical display, touch user interface or multimedia support through soft codec.  Its application areas include:

  • Cost effective application specific tablets and portable devices
  • HMI (Human Machine Interface) with touch panel support
  • User interfaces in Perceptual Computing
  • Computer vision
  • Low resolution video codec
  • Connectivity and smart modules for Microcontroller Unit (MCU) applications

IC3138 is fabricated in Fujitsu Semiconductor’s 55nm low-power CMOS technology and operates at frequencies up to 350MHz. The detailed specification of IC3138 follows:

CPU function

  • 4-way simultaneous multi-threading (SMT)
  • Single MVP core; its 4 hardware threads appear to software as a 4-way SMP
  • 32KB I-cache, 32KB D-cache and 32KB local memory
  • Homogeneous parallel programs
  • Support Pthread, OpenMP

GPU function

  • Data parallel, Task parallel, and/or Function parallel computing
  • Multi-standard media processor
  • Programmable unified shader
  • Support OpenGL ES 2.0
  • 10 million triangles / sec, 50 million pixels / sec

System Clock

350MHz (Fujitsu 55nm)

Multi-thread Processing

Simultaneous 4 hybrid threads

Processing Power

1500 DMIPS (equivalent to 4.3 DMIPS/MHz per core)

Display System

LCD maximum pixel clock: 165MHz@16.7M (24-bit) true color

Imaging

YUV to RGB conversion; gamma correction

Video

Support 480p@25fps H.264/MPEG4/RMVB/H.263 decoding via pure software

Audio

Max. 5.1 channel audio

Memory

Support SD, SDHC, MMC card, USB mass storage device, DDR3 SDRAM

Power Control

4 independent power domains, clock gating, dynamic frequency scaling and sleep modes

OS Support

Android 4.2, Linux 3.4

Supported Connectivity

USB host/slave, UART, WiFi (external), 3G modem (external), GPS (external)

Keypad

8-pin keypad I/O for Qwerty keyboard

I/O

UART x 2; I2C x 1; I2S x 1; SPI 4x slave; GPIO; PWM x 3

Timer

Watchdog; RTC

Please contact us for more information about the shipment schedule of IC3138.

Evaluation boards & reference design

ICube has designed and manufactured an evaluation board based on the IC3228 SoC.  It uses 4.3″ or 7″ 800×480 LCD touch panels.

The evaluation board deliverables include the following:

  • Main board with power supply unit
  • 4.3″ or 7″ LCD touch panel
  • HDMI output
  • UART daughter board (for debug functions)
  • USB cable (for download and debug functions)
  • Tool chain and utility programs
  • Linux SMP kernel 3.4
  • Android 4.2 (with SDK and NDK)
  • BSPs
  • Documentation

Demos based on this evaluation board are available for viewing.  Please contact us for more information.

CPU function

  • Simultaneous multi-threading (SMT) to efficiently accelerate
  • Symmetric-multi-processing (SMP), dual MVP cores
  • 64KB I-cache, 64KB D-cache and 64KB local memory in each core, 256KB shared L2 cache
  • Homogeneous parallel programs
  • Support Pthread, OpenMP

GPU function

Data-parallel, task-parallel and/or function-parallel computing as a programmable unified shader, a multi-standard media processor, and for heterogeneous GPGPU applications.

  • Support OpenGL ES 2.0, OpenCL
  • 70 million triangles / sec, 300 million pixels / sec

System Clock

600MHz (TSMC 65nm)

Multi-thread Processing

Simultaneous 8 threads (4 threads x dual core) and 8 hybrid threads

Processing Power

5160 DMIPS (equivalent to 4.3 DMIPS/MHz per core)

Display System

LCD: Maximum pixel clock: 165MHz@16.7M (24-bit) true color, HDMI/DVI output capable

Camera

8/10 bit camera data interface

Video

Support HD 720p H.264 decoding (soft decoding)

Audio

Max. 5.1 channel audio

Memory

Support SD, SDHC, MMC card, USB mass storage device, Nand flash, NOR flash, DDR2 SDRAM

Power Control

10 independent power domains, 3 low power modes

OS Support

Android, Linux

Supported Connectivity

USB host/slave, UART, WiFi (external), 3G modem (external), GPS (external)

Keypad

12 keypad I/O for Qwerty keyboard

I/O

UART x 4; I2C x 2; I2S x 3; SPI 4x slave; GPIO x 9; PWM x 3

Timer

Watchdog; RTC