Networks-on-Chip: From Implementations to Programming Paradigms
By Sheng Ma, Libo Huang, Mingche Lai and Wei Shi
About this ebook
Networks-on-Chip: From Implementations to Programming Paradigms provides a thorough and bottom-up exploration of the whole NoC design space in a coherent and uniform fashion, from low-level router, buffer and topology implementations, to routing and flow control schemes, to co-optimizations of NoC and high-level programming paradigms.
This textbook is intended for an advanced course on computer architecture, suitable for graduate students or senior undergraduates who want to specialize in computer architecture and networks-on-chip. It is also intended for industry practitioners in microprocessor design, especially many-core processor design with a network-on-chip. Students can draw many practical and theoretical lessons from this material and may be motivated to delve further into the ideas and designs proposed in this book. Industrial engineers can refer to this book to make practical tradeoffs as well. Engineers and researchers who focus on off-chip network design can also consult this book for deadlock-free routing algorithm designs.
- Provides a thorough and insightful exploration of the NoC design space, from low-level logic implementations to co-optimizations of high-level programming paradigms and NoCs.
- The coherent and uniform format offers readers a clear, quick, and efficient exploration of the NoC design space.
- Covers many novel and exciting research ideas, encouraging researchers to delve further into these topics.
- Presents both engineering and theoretical contributions. The detailed descriptions, comparisons, and analyses of router, buffer, and topology implementations are of high engineering value.
Sheng Ma
Sheng Ma received the B.S. and Ph.D. degrees in computer science and technology from the National University of Defense Technology (NUDT) in 2007 and 2012, respectively. He visited the University of Toronto from Sept. 2010 to Sept. 2012. He is currently an Assistant Professor of the College of Computer, NUDT. His research interests include on-chip networks, SIMD architectures and arithmetic unit designs.
Part I
Prologue
Chapter 1
Introduction
Abstract
The continual advancement of semiconductor technology and the unconquerable obstacles to designing efficient single-core processors are together rapidly driving computer architecture into the many-core era. Mitigating the development challenges of many-core processors requires a communication-centric cross-layer optimization method. The traditional bus and crossbar communication structures have several shortcomings, including poor scalability, low bandwidth, large latency, and high power consumption. To address these limitations, the network-on-chip (NoC) introduces a packet-switched fabric for on-chip communication, and it has become the de facto many-core interconnection mechanism. The baseline NoC design exploration mainly consists of the design of the network topology, routing algorithm, flow control mechanism, and router microarchitecture. This chapter reviews the significant research progress in all these design aspects. The research progress mainly focuses on optimizing pure NoC-layer performance, including the zero-load latency and saturation throughput. We also discuss the NoC design trends of several commercial and prototype processors. The NoCs of these real processors cover a significant design space, since their functionalities and requirements are rich and varied. Finally, we provide an overview of the book's content.
Keywords
Many-Core Era
Communication-Centric Cross-Layer Optimization
Networks-on-Chip
Chapter outline
1.1 The Dawn of the Many-Core Era
1.2 Communication-Centric Cross-Layer Optimizations
1.3 A Baseline Design Space Exploration of NoCs
1.3.1 Topology
1.3.2 Routing Algorithm
1.3.3 Flow Control
1.3.4 Router Microarchitecture
1.3.5 Performance Metric
1.4 Review of NoC Research
1.4.1 Research on Topologies
1.4.2 Research on Unicast Routing
1.4.3 Research on Supporting Collective Communications
1.4.4 Research on Flow Control
1.4.5 Research on Router Microarchitecture
1.5 Trends of Real Processors
1.5.1 The MIT Raw Processor
1.5.2 The Tilera TILE64 Processor
1.5.3 The Sony/Toshiba/IBM Cell Processor
1.5.4 The U.T. Austin TRIPS Processor
1.5.5 The Intel Teraflops Processor
1.5.6 The Intel SCC Processor
1.5.7 The Intel Larrabee Processor
1.5.8 The Intel Knights Corner Processor
1.5.9 Summary of Real Processors
1.6 Overview of the Book
References
1.1 The dawn of the many-core era
The development of the semiconductor industry has followed the self-fulfilling Moore's law [108] for the past half century: to maintain a competitive advantage for products, the number of transistors per unit area on integrated circuits doubles every two years. Currently, a single chip can integrate several billion transistors [114, 128], and Moore's law is expected to remain valid for the next decade [131]. The steadily increasing number of transistors, on the one hand, offers the computer architecture community enormous opportunities to design and discover innovative techniques to scale processor performance [54]. On the other hand, the huge number of transistors exacerbates several issues, such as power consumption [42, 111], resource utilization [38], and reliability [14], which pose severe design challenges for computer architects.
Until the beginning of this century, the primary performance scaling methods focused on improving the sequential computation ability of single-core processors [110]. Since the birth of the first processor, the processor's sequential performance-price ratio has improved by over 10 billion times [117]. Some of the most effective performance enhancement techniques include leveraging deep pipelines to boost the processor frequency [56, 60], utilizing sophisticated superscalar techniques to squeeze out instruction-level parallelism [73, 83], and deploying large on-chip caches to exploit temporal and spatial locality [70, 121]. Yet, since switching power is proportional to the processor frequency, sophisticated superscalar techniques involve complex logic, and large caches require huge memory arrays, these traditional performance-scaling techniques induce a sharp increase in processor complexity [3] and power consumption [134]. In addition, performance returns are diminishing, as the exploitation of instruction-level parallelism and temporal/spatial locality has nearly reached the intrinsic limits of sequential programs [120, 143]. These unconquerable obstacles, including the drastically rising power consumption and diminishing performance gains of single-core processors, have triggered a profound revolution in computer architecture.
To efficiently reap the benefits of billions of transistors, computer architects are actively pursuing multicore designs to maintain sustained performance growth [50]. Instead of increasing the frequency or using sophisticated superscalar techniques to improve sequential performance, multicore designs scale the computation throughput by packing multiple cores into a single chip. Different threads of a single application or different concurrent applications can be assigned to the multiple cores to exploit thread-level and task-level parallelism respectively. This can achieve significantly higher performance than a single-core processor. Multicore processors also reduce engineering effort for producers, since different numbers of the same core can be stamped out to form a family of processors. Since Intel's cancellation of the single-core processors Tejas and Jayhawk in 2004 and its switch to developing dual-core processors to compete with AMD, the community has rapidly embraced the multicore design method [45].
Generally speaking, when the processor integrates eight or fewer cores, it is called a multicore processor. A processor with more cores is called a many-core processor. Current processors have already integrated several tens or hundreds of cores, and a new version of Moore's law uses the core count as the exponentially increasing parameter [91, 135]. Intel and IBM announced their own 80-core and 75-core prototype processors, the Teraflops [141] and Cyclops-64 [29], in 2006. Tilera released the 64-core TILE64 [11] and the 100-core TILE-Gx100 [125] processors in 2007 and 2009 respectively. In 2008, ClearSpeed launched the 192-core CSX700 processor [105]. In 2011, Intel announced its 61-core Knights Corner coprocessor [66], which is largely and successfully deployed in the currently most powerful supercomputer, the TianHe-2 [138]. Intel plans to release its next-generation acceleration co-processor, the 72-core Knights Landing co-processor, in 2015 [65]. Other typical many-core processors include Intel's Single-chip Cloud Computer (SCC) [58], Intel's Larrabee [130], and AMD's and NVIDIA's general purpose computing on graphics processing units (GPGPUs) [41, 90, 114, 146]. The many-core processors are now extensively utilized in several computation fields, ranging from scientific supercomputing to desktop graphic applications; the computer architecture and its associated computation techniques are entering the many-core era.
1.2 Communication-centric cross-layer optimizations
Although there have already been great breakthroughs in the design of many-core processors in both theory and practice, many critical problems remain to be solved. Unlike in single-core processors, the large number of cores increases the design and deployment difficulties of many-core processors; the design of an efficient many-core architecture faces several challenges, ranging from high-level parallel programming paradigms to low-level logic implementations [2, 7]. These challenges are also opportunities. Whether they can be successfully and efficiently solved may largely determine the future development of the entire computer architecture community. Figure 1.1 shows the three key challenges for the design of a many-core processor.
Figure 1.1 Key challenges for the design of a many-core processor.
The first challenge lies in the parallel programming paradigm layer [7, 31]. For many years, only a small number of researchers used parallel programming to conduct large-scale scientific computations. Nowadays, the widely available multicore and many-core processors make parallel programming essential even for desktop users [103]. The programming paradigm acts as a bridge connecting parallel applications with parallel hardware; it is one of the most significant challenges for the development of many-core processors [7, 31]. Application developers desire an opaque paradigm that hides the underlying architecture, to ease programming effort and improve application portability. Architecture designers hope for a visible paradigm that exposes hardware features for high performance. An efficient parallel programming paradigm should strike an excellent tradeoff between these two requirements.
The second challenge is the interconnection layer. The traditional bus and crossbar structures have several shortcomings, including poor scalability, low bandwidth, large latency, and high power consumption. To address these limitations, the network-on-chip (NoC) introduces a packet-switched fabric for on-chip communication [25], and it has become the de facto many-core interconnection mechanism. Although significant progress has been made in NoC research [102, 115], most works have focused on optimizing pure network-layer performance, such as the zero-load latency and saturation throughput. These works do not sufficiently consider the upper programming paradigms and the lower logic implementations. Indeed, cross-layer optimization is regarded as an efficient way to extend Moore's law for the whole information industry [2]. Future NoC designs should meet the requirements of both upper programming paradigms and lower logic implementations.
The third challenge appears in the logic implementation layer. Although multicore and many-core processors temporarily mitigate the problem of sharply rising power consumption, computer architecture will again face the power consumption challenge as the transistor count steadily increases under Moore's law. The evaluation results of Esmaeilzadeh et al. [38] show that power limitations will force approximately 50% of the transistors to be off in processors based on 8 nm technology in 2018, and thus the emergence of dark silicon. The dark silicon phenomenon strongly calls for innovative low-power logic designs and implementations for many-core processors.
The aforementioned three key challenges directly determine whether many-core processors can be successfully developed and widely used. Among them, the design of an efficient interconnection layer is one of the most critical challenges, for the following reasons. First, many-core processor design has already evolved from the computation-centric method into the communication-centric method [12, 53]. With abundant computation resources, the efficiency of the interconnection layer largely determines the performance of the many-core processor. Second, and more importantly, mitigating the challenges of the programming paradigm layer and the logic implementation layer requires optimizing the design of the interconnection layer. The opaque programming paradigm requires the interconnection layer to intelligently manage the application traffic and hide the communication details. The visible programming paradigm requires the interconnection layer to leverage the hardware features for low communication latency and high network throughput. In addition, the interconnection layer induces significant power consumption. In the Intel SCC [59], Sun Niagara [81], Intel Teraflops [57], and MIT Raw [137] processors, the interconnection layer accounts for 10%, 17%, 28%, and 36% of the processor's overall power consumption respectively. The design of low-power processors must optimize the power consumption of the interconnection layer.
On the basis of the communication-centric cross-layer optimization method, this book explores the NoC design space in a coherent and uniform fashion, from the low-level logic implementations of the router, buffer, and topology, to network-level routing and flow control schemes, to the co-optimizations of the NoC and high-level programming paradigms. In Section 1.3, we first conduct a baseline design space exploration for NoCs, and then in Section 1.4 we review the current NoC research status. Section 1.5 summarizes NoC design trends for several real processors, including academic prototypes and commercial products. Section 1.6 briefly introduces the main content of each chapter and gives an overview of this book.
1.3 A baseline design space exploration of NoCs
Figure 1.2 illustrates a many-core platform with an NoC interconnection layer. The 16 processing cores are organized in a 4 × 4 mesh network. Each network node includes a processing node and a network router. The processing node may be replaced by other hardware units, such as special accelerators or memory controllers. The network routers and the associated links form the NoC. The router is composed of the buffer, crossbar, allocator, and routing unit. The baseline NoC design exploration mainly consists of the design of the network topology, routing algorithm, flow control mechanism, and router microarchitecture [26, 35]. In addition, we also discuss the baseline performance metrics for NoC evaluations.
Figure 1.2 A many-core platform with an NoC.
1.3.1 Topology
The topology is the first fundamental aspect of NoC design, and it has a profound effect on the overall network cost and performance. The topology determines the physical layout and the connections between nodes and channels. The number of hops a message traverses, and the channel length of each hop, also depend on the topology. Thus, the topology significantly influences the latency and power consumption. Furthermore, since the topology determines the number of alternative paths between nodes, it affects the network traffic distribution, and hence the network bandwidth and achieved performance.
Since most current integrated circuits use 2D integration technology, an NoC topology must map well onto a 2D circuit surface. This requirement excludes several excellent topologies proposed for high-performance parallel computers; the hypercube is one such example. Therefore, NoCs generally apply regular and simple topologies. Figure 1.3 shows the structure of three typical NoC topologies: a ring, a 2D mesh, and a 2D torus.
Figure 1.3 Three typical NoC topologies. (a) Ring. (b) 2D mesh. (c) 2D torus.
The ring topology is widely used with a small number of network nodes, including the six-node Ivy Bridge [28] and the 12-node Cell [71] processors. The ring network has a very simple structure, and it is easy to implement the routing and flow control, or even apply a centralized control mechanism [28, 71]. Also, since Intel has much experience in implementing the cache coherence protocols on the ring networks, it prefers to utilize the ring topology, even in the current 61-core Knights Corner processor [66]. Yet, owing to the high average hop count, the scalability of a ring network is limited. Multiple rings may mitigate this limitation. For example, to improve the scalability, Intel's Knights Corner processor leverages ten rings, which is about two times more rings than the eight-core Nehalem-EX processor [22, 30, 116]. Another method to improve the scalability is to use 2D mesh or torus networks.
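The hop-count limitation of rings can be made concrete with a short sketch (a hypothetical illustration, not from the book; the function name `ring_avg_hops` is ours):

```python
def ring_avg_hops(n):
    """Average minimal hop count over all distinct source/destination
    pairs of an n-node bidirectional ring."""
    total = 0
    for s in range(n):
        for d in range(n):
            if s != d:
                # shortest way around: clockwise or counterclockwise
                total += min((d - s) % n, (s - d) % n)
    return total / (n * (n - 1))

# The average grows roughly linearly with n (about n/4),
# which is why large single-ring NoCs scale poorly.
for n in (8, 16, 64):
    print(n, round(ring_avg_hops(n), 2))
```

Quadrupling the node count roughly quadruples the average hop count, matching the text's scalability concern.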
The 2D mesh is quite suitable for wire routing in a multimetal-layer CMOS technology, and it is also very easy to achieve deadlock freedom. Moreover, its scalability is better than that of the ring. These factors make the 2D mesh widely used in several academic and industrial chips, including the U.T. Austin TRIPS [48], Intel Teraflops [140], and Tilera TILE64 [145] chips. The 2D mesh is also the most widely assumed topology in the research community. A limitation of this topology is that its center portion is easily congested and may become a bottleneck as core counts increase. Designing asymmetric topologies, which assign more wiring resources to the center portion, may be an appropriate way to address this limitation [107].
The node symmetry of a torus network helps to balance the network utilization, and the torus supports a much higher bisection bandwidth than the mesh. Thus, 2D or higher-dimensional torus topologies are widely used in the off-chip networks of several supercomputers, including the Cray T3E [129], Fujitsu K computer [6], and IBM Blue Gene series [1, 16]. Also, the torus's wraparound links convert plentiful on-chip wires into bandwidth, and reduce the hop count and latency. Yet, the torus needs additional effort to avoid deadlock. Utilizing two virtual channels (VCs) for the dateline scheme [26] and utilizing bubble flow control (BFC) [18, 19, 100, 123] are the two main deadlock avoidance methods for torus networks.
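The hop-count benefit of wraparound links can be checked with a small sketch (our own hypothetical illustration; `avg_hops` is an assumed helper name). Per dimension, a torus distance can wrap around, so it is at most half the mesh distance:

```python
from itertools import product

def avg_hops(k, torus=False):
    """Average minimal hop count over all distinct node pairs of a
    k x k 2D mesh, or a k x k 2D torus when torus=True."""
    def dim_dist(a, b):
        d = abs(a - b)
        # wraparound links let the packet go the short way around
        return min(d, k - d) if torus else d
    nodes = list(product(range(k), repeat=2))
    total = sum(dim_dist(sx, dx) + dim_dist(sy, dy)
                for (sx, sy) in nodes for (dx, dy) in nodes)
    return total / (len(nodes) * (len(nodes) - 1))

print(round(avg_hops(8), 2), round(avg_hops(8, torus=True), 2))
```

For an 8 x 8 network, the torus average is noticeably below the mesh average, at the cost of the deadlock issues discussed above.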
1.3.2 Routing algorithm
Once the NoC topology has been determined, the routing algorithm calculates the traverse paths for packets from the source nodes to the destination nodes. This section leverages the 2D mesh as the platform to introduce routing algorithms.
There are several classification criteria for routing algorithms. As shown in Figure 1.4, according to the traverse path length, the routing algorithms can be classified as minimal routing or nonminimal routing. The paths of nonminimal routing may be outside the minimal quadrant defined by the source node and the destination node. The Valiant routing algorithm is a typical nonminimal routing procedure [139]. It firstly routes the packet to a random intermediate destination (the inter. dest node shown in Figure 1.4a), and then routes the packet from this intermediate destination to the final destination.
Figure 1.4 Nonminimal routing and minimal routing. (a) Nonminimal routing. (b) Minimal routing.
In contrast, the packet traverse path of minimal routing is always inside the minimal quadrant defined by the source node and the destination node. Figure 1.4b shows two minimal routing paths for one source and destination node pair. Since minimal routing traverses fewer hops than nonminimal routing, its power consumption is generally lower. Yet, nonminimal routing is more appropriate for achieving global load balance and fault tolerance.
The routing algorithms can be divided into deterministic routing and nondeterministic routing on the basis of the path count provided. Deterministic routing offers only one path for each source and destination node pair. Dimensional order routing (DOR) is a typical deterministic routing algorithm. As shown in Figure 1.5a, the XY DOR offers only one path between nodes (3,0) and (1,2); the packet is firstly forwarded along the X direction to the destination's X position, and then it is forwarded along the Y direction to the destination node.
Figure 1.5 DOR, O1Turn routing, and adaptive routing. (a) XY DOR. (b) O1Turn routing. (c) Adaptive routing.
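The XY DOR procedure can be sketched in a few lines (a hypothetical illustration; the function name `xy_dor_path` and the (x, y) coordinate convention are our assumptions, chosen to match the Figure 1.5a example):

```python
def xy_dor_path(src, dst):
    """XY dimension-order routing on a 2D mesh: exhaust the X offset
    first, then the Y offset. Nodes are (x, y) coordinates."""
    (x, y), (dx, dy) = src, dst
    path = [(x, y)]
    while x != dx:                      # route along X first
        x += 1 if dx > x else -1
        path.append((x, y))
    while y != dy:                      # then along Y
        y += 1 if dy > y else -1
        path.append((x, y))
    return path

# The single deterministic path between nodes (3,0) and (1,2):
print(xy_dor_path((3, 0), (1, 2)))
```

Because the path is a pure function of source and destination, XY DOR offers exactly one path per node pair, as the text describes.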
In contrast, nondeterministic routing may offer more than one path. According to whether it considers the network status, nondeterministic routing can be classified as oblivious routing or adaptive routing. Oblivious routing does not consider the network status when calculating the routing paths. As shown in Figure 1.5b, O1Turn routing is an oblivious routing algorithm [132]. It offers two paths for each source and destination pair: one is the XY DOR path, and the other one is the YX DOR path. The injected packet randomly chooses one of them. Adaptive routing considers the network status when calculating the routing paths. As shown in Figure 1.5c, if the algorithm detects there is congestion between nodes (3,1) and (3,2), adaptive routing tries to avoid the congestion by forwarding the packet to node (2,1) rather than node (3,2). Adaptive routing has some abilities to avoid network congestion; thus, it generally supports higher saturation throughput than deterministic routing.
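A minimal adaptive routing step can be sketched as follows (our hypothetical illustration: the function name, the (x, y) convention, and the use of free credit counts as the congestion signal are assumptions, not the book's specific design):

```python
def adaptive_next_hop(cur, dst, free_credits):
    """One minimal adaptive routing step: among the productive
    directions (those that reduce the X or Y offset), pick the
    neighbor whose downstream buffer reports the most free credits.
    free_credits maps neighbor coordinates to free buffer slots."""
    (x, y), (dx, dy) = cur, dst
    candidates = []
    if x != dx:
        candidates.append((x + (1 if dx > x else -1), y))
    if y != dy:
        candidates.append((x, y + (1 if dy > y else -1)))
    # prefer the least congested productive neighbor
    return max(candidates, key=lambda n: free_credits.get(n, 0))

# Congestion toward (3,2): the packet at (3,1) detours via (2,1) instead.
print(adaptive_next_hop((3, 1), (1, 2), {(3, 2): 0, (2, 1): 4}))
```

When both productive neighbors have free buffers, either may be chosen; the congestion-avoidance behavior only appears under load, which is why adaptive routing mainly helps saturation throughput.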
One critical issue in routing algorithm design is achieving deadlock freedom. Deadlock freedom requires that there be no cyclic dependency among the network resources, typically the buffers and channels [24, 32]. Some proposals eliminate cyclic resource dependencies by forbidding certain turns during packet routing [20, 43, 46]. These routing algorithms are generally classified as partially adaptive routing, since they do not allow packets to utilize all the paths between the source and the destination. Other proposals [32, 33, 97, 99], named fully adaptive routing, allow the packet to utilize all minimal paths between the source and the destination. They achieve deadlock freedom by utilizing VCs [24] to form acyclic dependency graphs for routing subfunctions, which act as a backup deadlock-free guarantee.
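The turn-forbidding idea can be illustrated with XY DOR itself, the most restrictive case (a hypothetical sketch of ours, not the book's code): of the eight 90-degree turns in a 2D mesh, XY order permits only the four X-to-Y turns, which leaves no cycle in the turn dependencies.

```python
X_DIRS, Y_DIRS = {"E", "W"}, {"N", "S"}
REVERSE = {"E": "W", "W": "E", "N": "S", "S": "N"}

def xy_allows(incoming, outgoing):
    """True if XY DOR permits turning from direction `incoming`
    to direction `outgoing` (all Y-to-X turns are forbidden)."""
    return not (incoming in Y_DIRS and outgoing in X_DIRS)

# Enumerate the eight 90-degree turns: exclude going straight and U-turns.
turns = [(i, o) for i in "EWNS" for o in "EWNS"
         if i != o and o != REVERSE[i]]
print(sum(xy_allows(i, o) for i, o in turns), "of", len(turns), "turns allowed")
```

Partially adaptive turn models forbid fewer turns than this (typically two of the eight), trading some path diversity for the same acyclic-turn guarantee.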
1.3.3 Flow control
Flow control allocates the network resources, such as the channel bandwidth, buffer capacity, and control state, to traversing packets. Store-and-forward (SAF) [26], virtual cut-through (VCT) [72], and wormhole [27] flow control are three primary flow control types.
SAF flow control applies for the buffers and channels of the next hop only after receiving the entire packet. Figure 1.6a illustrates an SAF example, where a five-flit packet is routed from router 0 to router 4. At each hop, the packet is forwarded only after all five flits have been received. Thus, SAF flow control induces serialization latency at each packet-forwarding hop.
Figure 1.6 Examples of SAF, VCT, and wormhole flow control. H, head flit; B, body flit; T, tail flit. (a) An SAF example. (b) A VCT example. (c) A wormhole example.
To reduce the serialization latency, VCT flow control applies for the buffers and channels of the next hop after receiving the head flit of a packet, as shown in Figure 1.6b. There is no contention at routers 1, 2, and 3; thus, these routers forward the packet immediately after receiving the head flit. VCT flow control allocates buffers and channels at the packet granularity; it requires the downstream router to have enough buffers for the entire packet before the head flit is forwarded. In cycle 3 in Figure 1.6b, there are only two free buffer slots at router 3, which is not enough for the five-flit packet. Thus, the packet waits at router 2 for three more cycles before being forwarded. VCT flow control requires each input port to have enough buffers for an entire packet.
In contrast to the SAF and VCT mechanisms, wormhole flow control allocates buffers at the flit granularity. It can send out a packet even if there is only one free buffer slot downstream. When there is no network congestion, wormhole flow control performs the same as VCT flow control, as illustrated for routers 0, 1, and 2 in Figure 1.6b and c. When congestion occurs, however, VCT and wormhole flow control perform differently. Wormhole flow control requires only one empty downstream slot. Thus, as shown in Figure 1.6c, router 2 can send out the head flit and the first body flit at cycle 3, even though the two free buffer slots at router 3 are not enough for the entire packet. Wormhole flow control still allocates channels at the packet granularity. Although the channel between routers 2 and 3 is idle from cycle 5 to cycle 7 in Figure 1.6c, it cannot be allocated to another packet until the tail flit of the current packet is sent out at cycle 10.
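The buffer condition each scheme checks before forwarding a head flit can be summarized in a few lines (a hypothetical sketch with names of our choosing; note it captures only the downstream-buffer test, not SAF's additional requirement of having received the whole packet upstream):

```python
def can_send_head_flit(scheme, free_slots, packet_len):
    """Downstream-buffer condition for forwarding a head flit.
    SAF and VCT allocate buffers at packet granularity; wormhole
    allocates at flit granularity, so one free slot suffices."""
    if scheme in ("SAF", "VCT"):
        return free_slots >= packet_len
    if scheme == "wormhole":
        return free_slots >= 1
    raise ValueError(f"unknown scheme: {scheme}")

# Figure 1.6 scenario: five-flit packet, two free downstream slots.
print(can_send_head_flit("VCT", 2, 5))       # VCT must wait
print(can_send_head_flit("wormhole", 2, 5))  # wormhole proceeds
```

This single predicate explains the divergence in Figure 1.6: with two free slots at router 3, VCT stalls the whole packet while wormhole streams flits forward.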
The VC technique [24] can be combined with all three flow control types. A VC is an independent FIFO buffer queue. VCs can serve many design purposes, including mitigating head-of-line blocking, improving the physical channel utilization of wormhole flow control, avoiding deadlock, and supporting quality of service (QoS). For example, to improve the physical channel utilization, multiple VCs can be configured to share the same physical channel, allowing packets to bypass a currently blocked packet. The router allocates the link bandwidth to different VCs and packets at the flit granularity. When the downstream VC of one packet does not have enough buffers, other packets can still use the physical channel through other VCs. Table 1.1 summarizes the properties of these flow control techniques [35].
Table 1.1
Summary of Flow Control Mechanisms
1.3.4 Router microarchitecture
The router microarchitecture directly determines the router delay, area overhead, and power consumption. Figure 1.7 illustrates the structure of a canonical VC NoC router. The router has five ports, with one port for each direction of a 2D mesh network and another port for the local node. A canonical NoC router is mainly composed of the input units, routing computation (RC) logic, VC allocator, switch allocator, crossbar, and output units [26]. The features and functionality of these components are as