Embedded Systems: ARM Programming and Optimization

Ebook · 446 pages · 3 hours
About this ebook

Embedded Systems: ARM Programming and Optimization combines an exploration of the ARM architecture with an examination of the facilities offered by the Linux operating system to explain how various features of program design can influence processor performance. It demonstrates methods by which a programmer can optimize program code in a way that does not impact its behavior but improves its performance. Several applications, including image transformations, fractal generation, image convolution, and computer vision tasks, are used to describe and demonstrate these methods. From this, the reader will gain insight into computer architecture and application design, as well as practical knowledge of embedded software design for modern embedded systems.

  • Covers two ARM instruction set architectures, ARMv6 and ARMv7-A, as well as three ARM cores: the ARM11 on the Raspberry Pi, the Cortex-A9 on the Xilinx Zynq 7020, and the Cortex-A15 on the NVIDIA Tegra K1
  • Describes how to fully leverage the facilities offered by the Linux operating system, including the Linux GCC compiler toolchain and debug tools, performance monitoring support, the OpenMP multicore runtime environment, video frame buffer, and video capture capabilities
  • Designed to accompany and work with most of the low-cost Linux/ARM embedded development boards currently available
Language: English
Release date: Sep 3, 2015
ISBN: 9780128004128
Author

Jason D. Bakos

Jason D. Bakos is an associate professor of Computer Science and Engineering at the University of South Carolina. He received a BS in Computer Science from Youngstown State University in 1999 and a PhD in Computer Science from the University of Pittsburgh in 2005. Dr. Bakos’s research focuses on mapping data- and compute-intensive codes to high-performance, heterogeneous, reconfigurable, and embedded computer systems. His group works closely with FPGA-based computer manufacturers Convey Computer Corporation, GiDEL, and Annapolis Micro Systems, as well as GPU and DSP manufacturers NVIDIA, Texas Instruments, and Advantech. Dr. Bakos holds two patents, has published over 30 refereed publications in computer architecture and high-performance computing, was a winner of the ACM/DAC student design contest in 2002 and 2004, and received the US National Science Foundation (NSF) CAREER award in 2009. He is currently serving as associate editor for ACM Transactions on Reconfigurable Technology and Systems.


    Embedded Systems

    ARM® Programming and Optimization

    First Edition

    Jason D. Bakos

    Department of Computer Science and Engineering, University of South Carolina, Columbia, SC


    Table of Contents

    Cover image

    Title page

    Copyright

    Dedication

    Preface

    Using this Book

    Instructor Support

    Acknowledgments

    Chapter 1: The Linux/ARM embedded platform

    Abstract

    1.1 Performance-Oriented Programming

    1.2 ARM Technology

    1.3 Brief History of ARM

    1.4 ARM Programming

    1.5 ARM Instruction Set Architecture

    1.6 Assembly Optimization #1: Sorting

    1.7 Assembly Optimization #2: Bit Manipulation

    1.8 Code Optimization Objectives

    1.9 Runtime Profiling with Performance Counters

    1.10 Measuring Memory Bandwidth

    1.11 Performance Results

    1.12 Performance Bounds

    1.13 Basic ARM Instruction Set

    1.14 Chapter Wrap-Up

    Exercises

    Chapter 2: Multicore and data-level optimization: OpenMP and SIMD

    Abstract

    2.1 Optimization Techniques Covered by this Book

    2.2 Amdahl's Law

    2.3 Test Kernel: Polynomial Evaluation

    2.4 Using Multiple Cores: OpenMP

    2.5 Performance Bounds

    2.6 Performance Analysis

    2.7 Inline Assembly Language in GCC

    2.8 Optimization #1: Reducing Instructions per Flop

    2.9 Optimization #2: Reducing CPI

    2.10 Optimization #3: Multiple Flops per Instruction with Single Instruction, Multiple Data

    2.11 Chapter Wrap-Up

    Chapter 3: Arithmetic optimization and the Linux Framebuffer

    Abstract

    3.1 The Linux Framebuffer

    3.2 Affine Image Transformations

    3.3 Bilinear Interpolation

    3.4 Floating-Point Image Transformation

    3.5 Analysis of Floating-Point Performance

    3.6 Fixed-Point Arithmetic

    3.7 Fixed-Point Performance

    3.8 Real-Time Fractal Generation

    3.9 Chapter Wrap-Up

    Chapter 4: Memory optimization and video processing

    Abstract

    4.1 Stencil Loops

    4.2 Example Stencil: The Mean Filter

    4.3 Separable Filters

    4.4 Memory Access Behavior of 2D Filters

    4.5 Loop Tiling

    4.6 Tiling and the Stencil Halo Region

    4.7 Example 2D Filter Implementation

    4.8 Capturing and Converting Video Frames

    4.9 Video4Linux Driver and API

    4.10 Applying the 2D Tiled Filter

    4.11 Applying the Separated 2D Tiled Filter

    4.12 Top-Level Loop

    4.13 Performance Results

    4.14 Chapter Wrap-Up

    Chapter 5: Embedded heterogeneous programming with OpenCL

    Abstract

    5.1 GPU Microarchitecture

    5.2 OpenCL

    5.3 OpenCL Programming Model, Idioms, and Abstractions

    5.4 Kernel Workload Distribution

    5.5 OpenCL Implementation of Horner's Method: Device Code

    5.6 Performance Results

    5.7 Chapter Wrap-Up

    Appendix A: Adding PMU support to Raspbian for the Generation 1 Raspberry Pi

    A.1 Download the Linux Kernel and Cross-Compiler Tools

    A.2 Kernel Modifications

    A.3 Building the Kernel

    A.4 Installing the Kernel

    Appendix B: NEON intrinsic reference

    B.1 Vector Data Types

    B.2 Reading and Writing Vector Variables

    B.3 Vector Element Manipulation

    B.4 Optimizing Floating-Point Code with NEON Intrinsics

    B.5 Summary of NEON Intrinsics

    Appendix C: OpenCL reference

    C.1 Platform Layer

    C.2 Memory Types

    C.3 Buffer Management

    C.4 Programs and Compiling

    C.5 Kernel Functions

    C.6 Command Queue Functions

    C.7 Vector and Image Data Types

    C.8 Attributes

    C.9 Constants

    C.10 Built-in Functions

    Index

    Copyright

    Acquiring Editor: Steve Merken

    Editorial Project Manager: Nathaniel McFadden

    Project Manager: Sujatha Thirugnana Sambandam

    Designer: Mark Rogers

    Morgan Kaufmann is an imprint of Elsevier

    225 Wyman Street, Waltham, MA 02451, USA

    Copyright © 2016 Elsevier Inc. All rights reserved.

    No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher’s permissions policies and our arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions.

    This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein).

    Notices

    Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods, professional practices, or medical treatment may become necessary.

    Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information, methods, compounds, or experiments described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility.

    To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein.

    Library of Congress Cataloging-in-Publication Data

    A catalog record for this book is available from the Library of Congress

    British Library Cataloguing-in-Publication Data

    A catalogue record for this book is available from the British Library

    ISBN: 978-0-12-800342-8

    For information on all Morgan Kaufmann publications visit our website at http://store.elsevier.com/


    Dedication

    For Lumi, Jade, and Justin

    Preface

    For many years I have worked in the area of reconfigurable computing, whose goal is to develop tools and methodologies to facilitate the use of field programmable gate arrays (FPGAs) as co-processors for high-performance computer systems.

    One of the main challenges in this discipline is the programming problem, in which the practical application of FPGAs is fundamentally limited by their tedious and error-prone programming model. This is of particular concern because the problem is a consequence of the technology's strengths: FPGAs operate with fine-grained concurrency, where the programmer can control the simultaneous behavior of every circuit on the chip. Unfortunately, this control also requires that the programmer manage fine-grained constraints such as on-chip memory usage and routing congestion. The CPU programmer, on the other hand, need only consider the potential state of the CPU at each line of code, while on-chip resources are automatically managed by the hardware at runtime.

    I recently realized that modern embedded systems may soon face a similar programming problem. Battery technology continues to remain relatively stagnant, and the slowing of Moore’s Law became painfully evident after the nearly 6-year gap between 65 and 28 nm fabrication technology. At the same time, consumers have come to expect the continued advancement of embedded system capabilities, such as being able to run real-time augmented reality software on a processor that fits in a pair of eyeglasses.

    Given these demands for energy efficiency and performance, many embedded processor vendors are seeking more energy-efficient approaches to microarchitecture, often targeting the types of parallelism that cannot be automatically extracted from software. This will require the cooperation of programmers to write parallel code. This is a lot to ask of programmers, who will need to juggle both functionality and performance on a resource- and power-constrained platform that includes a wide range of potential sources of parallelism, from multicores to GPU shader units.

    Many universities have developed unified parallel programming courses that cover the spectrum of parallel programming from distributed systems to manycore processors. However, the topic is most often taught from the perspective of high-performance computing as opposed to embedded computing.

    With the recent explosion of advanced embedded platforms such as the Raspberry Pi, I saw a need to develop a curriculum that combines topics from computer architecture and parallel programming for performance-oriented programming of embedded systems. I also wanted to include interesting and relevant projects and case studies for the course, to avoid the traditional types of dull course projects associated with embedded systems courses (e.g., blink the light) and parallel programming courses (e.g., write and optimize a Fast Fourier Transform).

    While using these ideas in my own embedded systems course, I often find the students competing among themselves to achieve the fastest image rotation or the fastest Mandelbrot set generator. This type of collegial competition cultivates excitement for the material.

    Using this Book

    This book is intended for use in a junior- or senior-level undergraduate course in a computer science or computer engineering curriculum. Although a course in embedded systems may focus on subtopics such as control theory, robotics, low power design, real-time systems, or other related topics, this book is intended as an introduction to performance-oriented programming for lightweight system-on-chip embedded processors.

    This book should accompany an embedded design platform such as a Raspberry Pi, on which the student can evaluate the practices and methodologies described.

    When using this text, students are expected to know the C programming language, have a basic knowledge of the Linux operating system, and understand basic concurrency such as task synchronization.

    Instructor Support

    Lecture slides, exercise solutions, and errata are provided at the companion website: textbooks.elsevier.com/9780128003428

    Acknowledgments

    Several students assisted me in the development of this book.

    During spring and summer 2013, undergraduate students Benjamin Morgan, Jonathan Kilby, Shawn Weaver, Justin Robinson, and Amadeo Bellotti evaluated the DMA controller and performance monitoring unit on the Raspberry Pi’s Broadcom BCM2835 and the Xilinx Zynq 7020.

    During summer 2014, undergraduate student Daniel Clements helped develop a uniform approach for using the Linux perf_event on the ARM11, ARM Cortex A9, and ARM Cortex A15. Daniel also evaluated Imagination Technologies' OpenCL runtime and characterized its performance on the PowerVR 544 GPU on our ODROID XU Exynos 5 platform.

    During summer 2015, undergraduate student Friel Scottie Scott helped evaluate the Mali T628 GPU on the ODROID-XU3 platform and proofread Chapter 5.

    Much of my insight about memory optimizations for computer vision algorithms was an outgrowth of my graduate student Fan Zhang's dissertation on auto-optimization of stencil loops on the Texas Instruments Keystone Digital Signal Processor architecture.

    I would like to thank the following reviewers, who provided feedback, insight, and helpful suggestions at multiple points throughout the development of the book:

    ■ Miriam Leeser, Northeastern University

    ■ Larry D. Pyeatt, South Dakota School of Mines and Technology

    ■ Andrew N. Sloss, University of Washington, Consulting Engineer at ARM Inc.

    ■ Amr Zaky, Santa Clara University

    I would like to thank Morgan Kaufmann, and specifically Nate McFadden, for his constant encouragement and limitless patience throughout the writing. I am especially grateful for Nate's open-mindedness and flexibility with regard to the content, which continually evolved to keep current with the new ARM-based embedded development platforms released while I was developing the book. I also wish to thank Sujatha Thirugnana Sambandam for her detail-oriented editing and Mark Rogers for designing the cover.

    Chapter 1

    The Linux/ARM embedded platform

    Abstract

    ARM processors are embedded in virtually all mobile consumer electronic devices, but mobility has a price: reliance on battery power limits their performance, making them less capable than desktop processors. Despite this limitation, users of mobile computing devices still expect desktop-class performance. To deliver real-time, immersive application performance, embedded system programmers must adopt a performance-driven approach to programming. In other words, programs must not only be functionally correct but must also deliver the highest possible performance.

    This chapter introduces ARM processor technology, ARM assembly language, Linux-based software development tools, and methods for evaluating the runtime performance of software. Included are several worked-out examples for writing code in ARM assembly language and combining C and assembly-language subprograms into a single executable.

    Keywords

    ARM instruction set architecture

    Linux

    Performance counters

    Inline assembly language

    Code optimization

    Linux kernel

    GCC compiler

    Chapter Outline

    1.1 Performance-Oriented Programming

    1.2 ARM Technology

    1.3 Brief History of ARM

    1.4 ARM Programming

    1.5 ARM Instruction Set Architecture

    1.5.1 ARM General Purpose Registers

    1.5.2 Status Register

    1.5.3 Memory Addressing Modes

    1.5.4 GNU ARM Assembler

    1.6 Assembly Optimization #1: Sorting

    1.6.1 Reference Implementation

    1.6.2 Assembly Implementation

    1.6.3 Result Verification

    1.6.4 Analysis of Compiler-Generated Code

    1.7 Assembly Optimization #2: Bit Manipulation

    1.8 Code Optimization Objectives

    1.8.1 Reducing the Number of Executed Instructions

    1.8.2 Reducing Average CPI

    1.9 Runtime Profiling with Performance Counters

    1.9.1 ARM Performance Monitoring Unit

    1.9.2 Linux Perf_Event

    1.9.3 Performance Counter Infrastructure

    1.10 Measuring Memory Bandwidth

    1.11 Performance Results

    1.12 Performance Bounds

    1.13 Basic ARM Instruction Set

    1.13.1 Integer Arithmetic Instructions

    1.13.2 Bitwise Logical Instructions

    1.13.3 Shift Instructions

    1.13.4 Movement Instructions

    1.13.5 Load and Store Instructions

    1.13.6 Comparison Instructions

    1.13.7 Branch Instructions

    1.13.8 Floating-Point Instructions

    1.14 Chapter Wrap-Up

    Exercises

    Our culture is becoming increasingly inundated with mobile and wearable devices that can deliver rich multimedia, high-speed wireless broadband connectivity, and high-resolution sensing and signal processing. These devices owe their existence to two crucial technologies.

    The first is ARM processor technology, which powers virtually all of these devices. ARM processors were introduced into consumer electronics in the 1980s and have grown to become the de facto embedded processor technology.

    Unlike desktop and server processors, ARM processors are never manufactured as standalone chips, but rather as integrated components of a diverse and heterogeneous embedded system-on-chip (SoC) that is customized for each specific product.

    An embedded SoC is a collection of interconnected hardware modules on a single chip. A typical SoC will include one or more ARM processor cores. These processors serve as the host, or central controller, of the whole SoC, and are responsible for the user interface and maintaining control of the peripherals and special-purpose coprocessors.

    In addition to the processor cores, a typical SoC will contain a set of different types of memory interfaces (for SDRAM, Flash memory, etc.), a set of communications interfaces (for USB, Bluetooth, WiFi, etc.), and special-purpose processors for graphics and video, such as graphics processing units (GPUs).

    Figure 1.1 shows a photo of the Apple A8 chip, the processor inside Apple’s iPhone 6. The area identified as CPU comprises a dual-core ARM processor and the areas identified as GPU comprise a set of special-purpose processors used for video and graphics. The large unlabeled area contains additional modules used for various other peripherals to support the iPhone’s functionality.


    Figure 1.1 The processor integrated circuit inside the Apple A8 chip package. It has two ARM processor cores, four PowerVR G6450 cores, four DRAM interfaces (connections to off-chip memory), an LCD interface, a camera interface, and a USB interface. (Photo from ChipWorks.)

    In the sense of being capable of executing any program written in any programming language, the ARM processors comprise the computer part of the SoC. By the standards set by modern desktop and server computers, however, ARM processors are low-performing, because their design places a higher emphasis on energy efficiency than on performance.

    The second technology that enables modern consumer electronics is the Linux operating system (OS), which is run on nearly all ARM processors. Due to its free availability, success, and widespread adoption on desktop and server computers, Linux has grown to become the universal standard embedded OS. As an embedded OS, Linux greatly facilitates code development and code reuse across different embedded platforms.

    1.1 Performance-Oriented Programming

    In general, embedded processors are designed with different objectives than desktop and server processors. Mobile embedded processors in particular emphasize energy efficiency above all other goals, and writing highly performing code for embedded systems generally requires more effort on the part of the programmer as compared to writing code for desktop and server processors.

    This is because desktop and server processors are designed to extract maximum performance from both legacy code (code written and compiled long ago for a previous version of the processor) and processor-agnostic code (code that was not written with any specific target processor in mind). They do this by devoting much of their real estate and energy consumption to two specific features:

    1. Extraction of instruction-level parallelism: The processor attempts to (1) execute as many instructions as possible per clock cycle, (2) change the execution order of instructions to reduce waiting time between dependent pairs of instructions, and (3) predict execution behavior and execute instructions before knowing if they are needed, throwing away unneeded results when necessary. This allows the processor to achieve maximum instruction execution rate even when instructions are not ordered in the program code in any particular way.

    2. Large, complex caches: The processor attempts to maximize memory system performance even when the program's memory access pattern is poorly suited to the attached memory. This involves intelligent data prefetching, memory access scheduling, and high associativity, allowing the cache to avoid repeated off-chip accesses to the same memory locations.
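    To make the second point concrete, the C sketch below (array sizes arbitrary, not from the book's examples) contrasts a traversal order that cooperates with the cache against one that fights it. Both functions compute the same sum, but on an embedded core with a small, simple cache the performance gap between them is far larger than on a desktop processor with aggressive prefetching:

```c
#include <stddef.h>

#define ROWS 512
#define COLS 512

/* Cache-friendly: visits elements in the order C stores them (row-major),
   so every byte of each fetched cache line is used before eviction. */
long sum_row_major(int m[ROWS][COLS])
{
    long s = 0;
    for (size_t i = 0; i < ROWS; i++)
        for (size_t j = 0; j < COLS; j++)
            s += m[i][j];
    return s;
}

/* Cache-hostile: jumps COLS * sizeof(int) bytes between consecutive
   accesses, touching a different cache line on nearly every iteration. */
long sum_col_major(int m[ROWS][COLS])
{
    long s = 0;
    for (size_t j = 0; j < COLS; j++)
        for (size_t i = 0; i < ROWS; i++)
            s += m[i][j];
    return s;
}
```

    A desktop processor's prefetcher can often hide much of the column-major penalty; the simpler memory systems discussed in this chapter cannot, which is exactly why access order matters more on embedded cores.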

    Embedded processors, on the other hand, typically forego most of these features, which makes embedded processors exhibit a wider performance gap between programs that are performance tuned and those that are not.

    Keep in mind, though, that this difference is only noticeable for compute bound programs, which are programs whose speed or response time is determined by the processor or memory speed as opposed to input and output. For example, an embedded processor will certainly require more time than a server processor to compress a video, since this waiting time is determined by the processor and memory speed.

    On the other hand, the performance of an I/O bound program is determined by the speed of communication channels or other peripherals. For example, a program that forces the user to wait for data to be downloaded would not be any faster regardless of the processor technology. In this case the response time is determined solely by how fast the device can complete the download.

    So how many mobile embedded programs are compute bound, as opposed to I/O bound? Most image and video encoding tasks are compute bound, but since these are generally based on rarely changing standards, most SoC vendors cheat by offloading them to special-purpose hardware as opposed to processing them in software running on ARM processor cores.

    However, next-generation embedded applications such as computer vision will require rapidly evolving algorithms, and most of these are compute bound. Examples of these include programs that stitch together individual images to form a panoramic image, facial detection and recognition, augmented reality, and object recognition.

    This textbook provides a general overview of methods by which program design can influence processor performance: methods that allow a programmer to make changes to code without changing program semantics while improving its performance.

    These techniques generally require that programmers write their code in a way that exploits specific features of the underlying microarchitecture. In other words, the programmer must write code at a low abstraction level, aware of the underlying processor technology. This is often referred to as performance tuning or code optimization.
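    As a small illustration of this kind of semantics-preserving rewrite (a hypothetical kernel, not one of the book's examples), the two C functions below compute identical results for all inputs, but the second hoists a loop-invariant multiply out of the loop and replaces it with a shift, reducing the work executed per iteration:

```c
#include <stdint.h>

/* Naive version: recomputes the loop-invariant factor on every iteration. */
uint32_t weighted_sum(const uint8_t *buf, int n, uint32_t scale)
{
    uint32_t sum = 0;
    for (int i = 0; i < n; i++)
        sum += buf[i] * scale * 2;   /* scale * 2 never changes in the loop */
    return sum;
}

/* Tuned version: identical results, but the invariant product is computed
   once and the multiply-by-two becomes a single left shift. */
uint32_t weighted_sum_tuned(const uint8_t *buf, int n, uint32_t scale)
{
    uint32_t k = scale << 1;         /* scale * 2, hoisted out of the loop */
    uint32_t sum = 0;
    for (int i = 0; i < n; i++)
        sum += buf[i] * k;
    return sum;
}
```

    An optimizing compiler will often perform this particular rewrite itself; the later chapters focus on the transformations that compilers cannot safely make on their own.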

    These ideas are not new; many of these techniques are common in the area of high-performance computing. However, this book presents them in the context of embedded processors, and ARM processors in particular. From this, the reader will gain insight into computer architecture, application design, and embedded systems, as well as practical knowledge of embedded software design for modern embedded systems.

    In describing these methodologies, the textbook will use several example applications, including image transformations, fractal generation, image convolution, and several computer vision tasks. These examples will target the ARMv6 and ARMv7-A instruction set architectures used on three ARM cores:

    ■ the ARM11, used in the Generation 1 Raspberry Pi,

    ■ the Cortex-A9, used in the Xilinx Zynq (the device used in the Avnet ZedBoard, Adapteva Parallella, and National Instruments myRIO), and

    ■ the Cortex-A15, used in the NVIDIA Tegra K1.

    This book also introduces methodologies for programming mobile general-purpose GPUs (GPGPU) using the OpenCL programming model.

    This textbook will take advantage of the system facilities offered by the Linux OS, including Linux’s GCC compiler toolchain and debug tools, performance monitoring support, OpenMP multicore runtime environment, video frame buffer, and video capture capabilities.

    This textbook is designed to accompany and work with any of the many low-cost Linux/ARM embedded development boards currently on the market. Examples of these include the following:

    ■ $35 Raspberry Pi with a 700 MHz ARM11 processor, and the newer Generation 2 Raspberry Pi with a quad-core Cortex-A7 processor,

    ■ $65 ODROID-U3 with a 1.7 GHz quad-core ARM Cortex A9 processor,

    ■ $90 BeagleBone Black with a 1 GHz ARM Cortex A8 processor,

    ■ $99 Parallella Platform with a 1 GHz dual-core ARM Cortex A9 processor,

    ■ $169 ODROID-XU3 with a 1.6 GHz quad-core ARM Cortex A15 processor,

    ■ $182 PandaBoard with a 1.2 GHz dual-core ARM Cortex A9 processor,

    ■ $192 NVIDIA Jetson TK1 with a 2.23 GHz quad-core ARM Cortex A15 processor,

    ■ $199 Arndale Octa with a 1.8 GHz quad-core ARM Cortex A15 processor,

    ■ $199 Avnet MicroZed with a 666 MHz dual-core ARM Cortex A9 processor.

    Each of these platforms allows the programmer to use an interactive login session to compile, debug, and characterize code, eliminating the need for complex cross-compilers, remote debuggers, and/or architectural simulators.

    1.2 ARM Technology

    ARM processor technology is controlled by ARM Holdings. ARM stands for Advanced RISC Machine, and RISC stands for Reduced Instruction Set Computer. RISC is a design philosophy in which the native language of the processor, or instruction set, is deliberately designed as a repertoire of extremely simple instructions, which requires that the processor execute a
